ARC206 – Scaling on AWS for the First 10
Million Users
Simon Elisha, Principal Solutions Architect, Amazon Web Services
Scaling on AWS for the First 10 Million Users (ARC206) | AWS re:Invent 2013


Cloud computing gives you a number of advantages in being able to scale on demand, easily replace whole parts of your infrastructure, and much more. As a new business looking to use the cloud, you inevitably ask yourself, "Where do I start?" Join us at this session to understand some of the common patterns and recommended areas of focus you can expect to work through while scaling an infrastructure to handle going from zero to millions of users. From leveraging highly scalable AWS services to making smart decisions on building out your application, you'll learn a number of best practices for scaling your infrastructure in the cloud. The patterns and practices reviewed in this session will get you there.

Published in: Technology

Transcript of "Scaling on AWS for the First 10 Million Users (ARC206) | AWS re:Invent 2013"

  1. ARC206 – Scaling on AWS for the First 10 Million Users. Simon Elisha, Principal Solutions Architect, Amazon Web Services. November 13, 2013. © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  2. • ME: Simon Elisha – Principal Solutions Architect – Amazon Web Services – @simon_elisha
     • YOU: Here to learn more about scaling infrastructure on AWS
     • TODAY: About best practices and things to think about when building for large scale
  3. So how do we scale?
  4. Hi, I have NO IDEA what I am doing!!
  5. a lot of things to read
  6. a lot of things to read: not where we want to start
  7. Auto Scaling is a tool and a destination. It’s not the single thing that fixes everything.
  8. What do we need first?
  9. Some basics
  10. Regions
      [Map: US West (Oregon), US West (N. California), US East (Virginia), AWS GovCloud (US), South America (Sao Paulo), EU (Ireland), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney)]
  11. Availability Zones
      [Map: same regions as slide 10]
  12. Edge Locations
  13. Service Reference Model
      Layers: Deployment & Administration; App Services; Compute; Storage & Content Delivery; Database; Networking; AWS Global Infrastructure
  14. [Diagram: services placed on the Service Reference Model]
      Services: AWS OpsWorks, Amazon SNS, Amazon CloudSearch, Amazon SES, Amazon SWF, Amazon SQS, Amazon CloudWatch, Amazon EMR, Amazon Route 53, AWS Direct Connect, AWS CloudFormation, AWS Data Pipeline, Amazon ElastiCache, Amazon RDS, Amazon DynamoDB, Amazon Redshift, Amazon Elastic Transcoder, Amazon VPC, AWS IAM, Amazon EC2, AWS Elastic Beanstalk, Amazon S3, Amazon CloudFront, AWS Storage Gateway, Amazon Glacier
      Layers: Deployment & Administration; App Services; Compute; Storage & Content Delivery; Database; Networking; AWS Global Infrastructure
  15. So let’s start from day one, user one (you)
  16. Day One, User One
      • A single EC2 instance
        – With full stack on this host: web app, database, management, etc.
      • A single Elastic IP
      • Route 53 for DNS
      [Diagram: User, Amazon Route 53, Elastic IP, EC2 Instance]
  17. “We’re gonna need a bigger box”
      • Simplest approach
      • Can now leverage PIOPS (Provisioned IOPS)
      • High I/O instances
      • High memory instances
      • High CPU instances
      • High storage instances
      • Easy to change instance sizes
      • Will hit an endpoint eventually
      [Diagram: instance sizes m1.small, m2.4xlarge, hi1.4xlarge]
  18. “We’re gonna need a bigger box” (repeat of slide 17)
  19. Day One, User One
      • We could potentially get to a few hundred to a few thousand users, depending on application complexity and traffic
      • No failover
      • No redundancy
      • Too many eggs in one basket
      [Diagram: User, Amazon Route 53, Elastic IP, EC2 Instance]
  20. Day One, User One (repeat of slide 19)
  21. Day Two, User > 1
      First let’s separate out our single host into more than one.
      • Web
      • Database
        – Make use of a database service?
      [Diagram: User, Amazon Route 53, Elastic IP, Web Instance, Database Instance]
  22. Database Options
      Self-managed:
      • Database Server on Amazon EC2: your choice of database running on Amazon EC2; Bring Your Own License (BYOL)
      Fully managed:
      • Amazon RDS: Microsoft SQL Server, Oracle or MySQL as a managed service; flexible licensing (BYOL or license included)
      • Amazon DynamoDB: managed NoSQL database service using SSD storage; seamless scalability; zero administration
      • Amazon Redshift: massively parallel, petabyte-scale data warehouse service; fast, powerful and easy to scale
  23. But how do I choose what DB technology I need? SQL? NoSQL?
  24. Not a binary decision!
  25. Blended approach can reduce technical debt
  26. Start with SQL databases where it makes sense
  27. Why start with SQL?
      • Established and well-worn technology
      • Lots of existing code, communities, books, background, tools, etc.
      • You aren’t going to break SQL DBs in your first 10 million users. But you might break parts of them (hence the blended approach)
      • Clear patterns to scalability
  28. If your usage is such that you will be generating several TB (>5) of data in the first year, OR you have an incredibly data-intensive workload, you might need NoSQL
  29. Why else might you need NoSQL?
      • Super low latency applications
      • Metadata-driven datasets
      • Highly unrelational data
      • Need schema-less data constructs*
      • Massive amounts of data (again, in the TB range)
      • Rapid ingest of data (thousands of records/sec)
      *Need != “it’s easier to do dev without schemas”
  30. So decide wisely. Look for the key points of scale.
  31. User > 100
      First let’s separate out our single host into more than one
      • Web
      • Database
        – Use RDS to make your life easier
      [Diagram: User, Amazon Route 53, Elastic IP, Web Instance, RDS DB Instance]
  32. User > 1000
      Next let’s address our lack of failover and redundancy
      • Elastic Load Balancing
      • Another web instance
        – In another Availability Zone
      • Enable Amazon RDS Multi-AZ
      [Diagram: User, Amazon Route 53, Elastic Load Balancing, two Web Instances, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ), two Availability Zones]
  33. Elastic Load Balancing
      • Create highly scalable applications
      • Distribute load across EC2 instances in multiple Availability Zones
      Feature: Details
      • Available: Load balance across instances in multiple Availability Zones
      • Health checks: Automatically checks health of instances and takes them in or out of service
      • Session stickiness: Route requests to the same instance
      • Secure sockets layer: Supports SSL offload from web and application servers with flexible cipher support
      • Monitoring: Publishes metrics to CloudWatch
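The health-check row on slide 33 can be pictured with a small toy model. This is only a sketch of the idea (rotate across instances, skip the ones marked unhealthy), not a description of how Elastic Load Balancing is implemented; the class name `RoundRobinBalancer` and the instance names are hypothetical.

```python
class RoundRobinBalancer:
    """Toy balancer: rotate over instances, skipping unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)  # everything starts in service
        self._next = 0

    def mark(self, instance, healthy):
        """Record the result of a (hypothetical) health check."""
        if healthy:
            self.healthy.add(instance)
        else:
            self.healthy.discard(instance)

    def route(self):
        """Return the next healthy instance in rotation."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next % len(self.instances)]
            self._next += 1
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances in service")


lb = RoundRobinBalancer(["web-a", "web-b"])  # one instance per AZ, say
first, second = lb.route(), lb.route()
lb.mark("web-a", healthy=False)              # failed health check
after_failure = lb.route()                   # traffic now avoids web-a
```

With a second web instance in another Availability Zone, losing one instance no longer takes the site down, which is the point of the "User > 1000" step.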
  34. Scaling this horizontally and vertically will get us pretty far (10s–100s of thousands)
  35. User > 10Ks–100Ks
      [Diagram: User, Amazon Route 53, Elastic Load Balancing, multiple Web Instances, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ), multiple RDS DB Instance Read Replicas, spread across two Availability Zones]
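The read replicas in the slide 35 diagram only help if the application sends reads to them. A minimal sketch of that routing decision follows, assuming reads can tolerate slightly stale data (replicas usually lag the primary a little); `ReplicaRouter` and the endpoint names are hypothetical, and the "starts with SELECT" test is a deliberate simplification.

```python
import random


class ReplicaRouter:
    """Send reads to a replica and everything else to the primary."""

    def __init__(self, primary, replicas, rng=None):
        self.primary = primary
        self.replicas = list(replicas)
        self.rng = rng or random.Random()

    def endpoint_for(self, sql):
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self.replicas:
            return self.rng.choice(self.replicas)
        return self.primary


router = ReplicaRouter("primary-db", ["replica-1", "replica-2"])
read_target = router.endpoint_for("SELECT name FROM users WHERE id = 1")
write_target = router.endpoint_for("UPDATE users SET name = 'x' WHERE id = 1")
```

Reads that must see the latest write (for example, right after a user updates a profile) would still need to go to the primary.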
  36. This will take us pretty far honestly, but we care about performance and efficiency, so let’s clean this up a bit
  37. Shift Some Load Around
      Let’s lighten the load on our web and database instances:
      • Move static content from the web instance to Amazon S3 and CloudFront
      • Move dynamic content from Elastic Load Balancing to CloudFront
      • Move session/state and DB caching to ElastiCache or Amazon DynamoDB
      [Diagram: User, Amazon Route 53, Amazon CloudFront, Elastic Load Balancing, Amazon S3, Web Instance, ElastiCache, RDS DB Instance Active (Multi-AZ), Amazon DynamoDB, Availability Zone]
      Check out Session: ARC309 – Dynamic Content Acceleration: Lightning Fast Web Apps with Amazon CloudFront and Amazon Route 53
  38. Working with S3 – Amazon Simple Storage Service
      • Object-based storage for the web
      • 11 9s of durability
      • Good for things like:
        – Static assets (CSS, JS, images, videos)
        – Backups
        – Logs
        – Ingest of files for processing
      • “Infinitely scalable”
      • Supports fine-grained permission control
      • Ties in well with CloudFront
      • Ties in with Amazon EMR
      • Acts as a logging endpoint for Amazon S3, CloudFront, Billing
      • Supports encryption in transit and at rest
      • Reduced Redundancy Storage is 1/3 cheaper
      • Amazon Glacier for super long term storage
  39. Amazon CloudFront
      Amazon CloudFront is a web service for scalable content delivery.
      • Cache static content at the edge for faster delivery
      • Helps lower load on origin infrastructure
      • Dynamic and static content
      • Streaming video
      • Zone apex support
      • Custom SSL certificates
      • Low TTLs (as short as 0 seconds)
      • Lower costs for origin fetches (between Amazon S3/EC2 and CloudFront)
      • Optimized to work with EC2, Amazon S3, Elastic Load Balancing, and Route 53
      [Chart: Server Load and Response Time for No CDN, CDN for Static Content, and CDN for Static & Dynamic Content]
      [Chart: Volume of Data Delivered (Gbps), 8:00 AM to 9:00 PM]
  40. Shift Some Load Around (repeat of slide 37, without the session reference)
  41. Shift Some Load Around (repeat of slide 37, without the session reference)
  42. Amazon DynamoDB
      • Provisioned throughput NoSQL database
      • Fast, predictable performance
      • Fully distributed, fault-tolerant architecture
      • Considerations for non-uniform data
      Feature: Details
      • Provisioned throughput: Dial up or down provisioned read/write capacity
      • Predictable performance: Average single-digit millisecond latencies from SSD-backed infrastructure
      • Strong consistency: Be sure you are reading the most up-to-date values
      • Fault tolerant: Data replicated across Availability Zones
      • Monitoring: Integrated with CloudWatch
      • Secure: Integrates with AWS Identity and Access Management (IAM)
      • Elastic MapReduce: Integrates with Amazon Elastic MapReduce for complex analytics on large datasets
  43. ElastiCache
      • Hosted Memcached & Redis
        – Speaks the same API as traditional open source Memcached and Redis
      • Scale from one to many nodes
      • Self-healing (replaces dead instance)
      • Very fast (single-digit ms speeds usually)
      • Local to a single AZ for Memcached, with no persistence or replication
      • With Redis, can put a replica in a different AZ with persistence
      • Use AWS’s Auto Discovery client to simplify clusters growing and shrinking without affecting your application
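"Move DB caching to ElastiCache" on slide 37 usually means some form of cache-aside: look in the cache first and fall back to the database on a miss. The sketch below uses an in-process dictionary as a stand-in for Memcached or Redis, so it shows the pattern only; `CacheAside`, `load_user` and the 60-second TTL are hypothetical names and values, not anything from the talk.

```python
import time


class CacheAside:
    """Cache-aside with a TTL; a dict stands in for Memcached/Redis."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self.loader = loader      # called on a cache miss (e.g. a DB query)
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}          # key -> (value, expiry time)

    def get(self, key):
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and entry[1] > now:
            return entry[0]       # cache hit, database untouched
        value = self.loader(key)  # miss or expired: go to the source
        self._store[key] = (value, now + self.ttl)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._store.pop(key, None)


db_calls = []


def load_user(user_id):
    """Hypothetical stand-in for a query against the RDS instance."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = CacheAside(load_user, ttl_seconds=60.0)
cache.get(42)
cache.get(42)  # served from cache; db_calls still has one entry
```

Every hit is a query the database instance never sees, which is what lets the same RDS instance carry more users.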
  44. Now that our Web tier is much more lightweight, we can revisit the beginning of our talk…
  45. Auto Scaling!
  46. Auto Scaling
      Automatic resizing of compute clusters based on demand
      [Diagram: Amazon CloudWatch triggers the Auto Scaling policy]
      Feature: Details
      • Control: Define minimum and maximum instance pool sizes and when scaling and cool down occur
      • Integrated with Amazon CloudWatch: Use metrics gathered by CloudWatch to drive scaling
      • Instance types: Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC
      Example CLI call:
      aws autoscaling create-auto-scaling-group --auto-scaling-group-name MyGroup --launch-configuration-name MyConfig --min-size 4 --max-size 200 --availability-zones us-west-2c
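The "Control" row on slide 46 (minimum and maximum pool sizes, metric-driven scaling) boils down to a clamped step function. The sketch below illustrates that logic only; it is not the Auto Scaling service's algorithm. The bounds 4 and 200 come from the slide's CLI example, while the CPU thresholds and the step size are assumptions chosen for illustration.

```python
def desired_capacity(current, cpu_percent, min_size=4, max_size=200,
                     scale_out_above=70.0, scale_in_below=30.0, step=2):
    """Return the instance count a simple step policy would ask for."""
    if cpu_percent > scale_out_above:
        target = current + step   # busy: add instances
    elif cpu_percent < scale_in_below:
        target = current - step   # idle: remove instances
    else:
        target = current          # inside the comfort band: do nothing
    # Never leave the [min_size, max_size] range of the group.
    return max(min_size, min(max_size, target))


busy = desired_capacity(current=10, cpu_percent=85.0)
idle_at_floor = desired_capacity(current=4, cpu_percent=5.0)
```

A real policy would also honour a cool-down period between adjustments so that one traffic spike does not trigger several scale-outs in a row.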
  47. Typical Weekly Traffic to Amazon.com
      [Chart: traffic by day, Sunday to Saturday]
  48. Typical Weekly Traffic to Amazon.com
      [Chart: traffic by day, Sunday to Saturday, with Provisioned Capacity line]
  49. November Traffic to Amazon.com
      [Chart: traffic across November]
  50. November Traffic to Amazon.com
      [Chart: traffic across November, with Provisioned Capacity line]
  51. November Traffic to Amazon.com
      [Chart: traffic across November, with Provisioned Capacity line; areas labeled 76% and 24%]
  52. November Traffic to Amazon.com
      [Chart: traffic across November]
  53. Auto Scaling lets you do this!
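The percentages on slide 51 compare traffic against a flat provisioned-capacity line. The arithmetic behind that kind of comparison can be sketched as follows; the demand figures and the 25% headroom are made up for illustration and are not the Amazon.com data behind the charts.

```python
def utilization(demand, capacity):
    """Fraction of provisioned capacity actually used over the period."""
    used = sum(min(d, c) for d, c in zip(demand, capacity))
    return used / sum(capacity)


# Hypothetical load per period, with one large spike.
demand = [20, 25, 30, 100, 35, 30]

# Option 1: provision for the peak the whole time.
fixed = [max(demand)] * len(demand)

# Option 2: capacity follows demand with 25% headroom.
tracking = [d * 1.25 for d in demand]

fixed_util = utilization(demand, fixed)        # 240 / 600
tracking_util = utilization(demand, tracking)  # 240 / 300
```

In this made-up series, peak provisioning leaves most of the capacity idle, while capacity that tracks demand stays mostly used. That gap is what Auto Scaling is meant to close.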
  54. User > 500K+
      [Diagram: User, Amazon Route 53, Amazon CloudFront, Elastic Load Balancing, Amazon S3, DynamoDB, and two Availability Zones, each with Web Instances, ElastiCache, and RDS DB instances (Active or Standby (Multi-AZ) plus a Read Replica)]
  55. A Pause to Think
  56. “Give me six hours to chop down a tree and I will spend the first four sharpening the axe.” – Abraham Lincoln
  57. “World of Hurt” If You Are Missing These
      • Metrics & alarming
      • Automated builds
      • Automated deployment
      • Centralized logging
      Check out Session: ARC306 – Lumberjacking on AWS: Cutting Through Logs to Find What Matters
      Check out Session: ARC307 – Continuous Integration and Deployment Best Practices on AWS
  58. Host-level Metrics; Aggregate-level Metrics; Log Analysis; External Site Performance
  59. Not having proper monitoring or metrics is like flying a plane with an eye mask on in a thunderstorm. Oh, and your wing is on fire.
  60. AWS Marketplace & Partners Can Help
      • Customers can find, research, and buy software
      • Simple pricing aligns with EC2 usage model
      • Launch in minutes
      • Marketplace billing integrated into your AWS account
      • 1,000+ products across 20+ categories
      Learn more at: aws.amazon.com/marketplace
  61. Spend Your Time Wisely
      Managing your infrastructure will take up an increasingly large part of your time. Use tools to automate repetitive tasks:
      • Tools to manage AWS resources
      • Tools to manage software on and configuration of your instances
      • Automated data analysis of logs and user actions
  62. AWS Application Management Solutions
      [Diagram: a spectrum from higher-level services (convenience) to do-it-yourself (control): Elastic Beanstalk, AWS OpsWorks, AWS CloudFormation, EC2]
  63. Host-based Configuration Management
      Two big players
        – Opscode Chef
        – PuppetLabs Puppet
      • Both do more or less the same thing
      • They have similar syntax
      • Both work well with the tools from the previous slide
      • Both require some learning time
      • You can’t scale easily without this kind of capability
  64. From 500K to 1 Million Users
      • Getting serious now
      • Significant user base
      • Plenty of attention if things go wrong
      • Interesting phase for startups with funding rounds
  65. Time to make some radical improvements at the web & app layers
  66. SOA = Service-oriented Architecture
  67. SOAing
      Move services into their own tiers or modules. Treat each of these as 100% separate pieces of your infrastructure and scale them independently. Amazon.com and AWS do this extensively! It offers flexibility and greater understanding of each component.
  68. Loose Coupling Sets You Free!
      • The looser they're coupled, the bigger they scale
        – Use independent components
        – Design everything as a black box
        – Decouple interactions
        – Favor services with built-in redundancy and scalability rather than building your own
      [Diagram: tight coupling (Controller A connected directly to Controller B) vs. loose coupling (Controller A and Controller B connected through queues, Q); caption: Use Amazon SQS as buffers]
      Check out Session: ARC301 – Controlling the Flood: Massive Message Processing with Amazon SQS & Amazon DynamoDB
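The "queues as buffers" idea on slide 68 can be tried out locally. The sketch below uses Python's in-process `queue.Queue` as a stand-in for Amazon SQS: the producer only knows about the queue, and the workers pull from it at their own pace. `run_pipeline` and the squaring handler are hypothetical and exist only for illustration.

```python
import queue
import threading


def run_pipeline(jobs, handler, workers=2):
    """Push jobs through a buffer queue to a pool of worker threads."""
    buffer = queue.Queue()        # stand-in for an SQS queue
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = buffer.get()
            if job is None:       # sentinel: no more work for this worker
                buffer.task_done()
                break
            out = handler(job)
            with lock:
                results.append(out)
            buffer.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job in jobs:              # the producer never calls a worker directly
        buffer.put(job)
    for _ in threads:
        buffer.put(None)
    buffer.join()                 # wait until every item has been processed
    for t in threads:
        t.join()
    return results


squares = sorted(run_pipeline(range(5), lambda n: n * n))
```

Because the two sides share only the queue, the worker count can change, or a worker can fail and be replaced, without the producer noticing.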
  69. Loose Coupling + SOA = Winning
      In the early days, if someone has a service for it already, use that instead of building it yourself. Don’t reinvent the wheel.
      Examples:
      • Email
      • Queuing
      • Transcoding
      • Search
      • Databases
      • Monitoring
      • Metrics
      • Logging
      [Services shown: Amazon SNS, Amazon SES, Amazon CloudSearch, Amazon SQS, Amazon SWF, Amazon Elastic Transcoder]
  70. On reinventing the wheel: If you find yourself writing your own queue, DNS server, database, storage system, monitoring tool …
  71. Take a deep breath and stop it. Now.
  72. Back to SOA
  73. Imagine we let our users upload photos
  74. [Diagram: photo upload pipeline. Components: User; Instances; Amazon S3 Bucket for Ingest; Amazon SNS Topic; three SQS queues (Size for Thumbnail, Size Image for Mobile, Size Image for Web); three Auto Scaling groups of instances; Amazon S3 Bucket for Originals; Amazon S3 (RRS) Bucket to Serve Content to CloudFront; CloudFront Download Distribution]
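Slide 74 lists one Amazon SNS topic and three SQS queues. Assuming the topic fans each upload event out to every queue (the usual reason to pair the two), the shape of that flow can be sketched with plain Python objects. `Topic`, the queue variables and the message fields are hypothetical stand-ins, not AWS APIs.

```python
from collections import deque


class Topic:
    """Toy publish/subscribe topic: every subscriber gets every message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, q):
        self.subscribers.append(q)

    def publish(self, message):
        for q in self.subscribers:
            q.append(dict(message))  # each queue gets its own copy


uploads = Topic()
thumbnail_q, mobile_q, web_q = deque(), deque(), deque()
for q in (thumbnail_q, mobile_q, web_q):
    uploads.subscribe(q)

# One upload event results in three independent resize jobs.
uploads.publish({"bucket": "ingest", "key": "photos/cat.jpg"})

# Each resize tier drains only its own queue, at its own pace.
job = thumbnail_q.popleft()
```

Each queue's depth can then drive its own Auto Scaling group, so a backlog of thumbnail jobs does not hold up the web-sized images.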
  75. Amazon Simple Workflow Service (SWF)
      • Provides an orchestration tool across your infrastructure
      • Can act as a middle layer to pass messages and set up tasks
      • Lets you break down individual tasks into different workers
      • Lets you define logic between workers
      • Lets you make a worker task from anything that can be scripted
      • Includes built-in retries, timeouts, logging
      • Features built-in reliability, scalability, and low cost
      Your code = Deciders & Workers
  76. (Repeat of slide 74.)
  77. [Diagram: photo upload pipeline using SWF. Components: User; Instances; Amazon S3 Bucket for Ingest; SWF; Instance Running Decider; three Auto Scaling groups of instances; Amazon S3 Bucket for Originals; Amazon S3 (RRS) Bucket to Serve Content to CloudFront; CloudFront Download Distribution]
  78. Users > 1 Million
      Reaching a million and above is going to require some of all the previous things:
      • Multi-AZ
      • Elastic Load Balancing between tiers
      • Auto Scaling
      • Service-oriented architecture
      • Serving content smartly (S3/CloudFront)
      • Caching off DB
      • Moving state off tiers that autoscale
  79. Users > 1 Million
      [Diagram: User, Amazon Route 53, Amazon CloudFront, Elastic Load Balancing, Amazon SQS, Web Instances, Worker Instances, Internal App Instances, Amazon DynamoDB, ElastiCache, RDS DB Instance Active (Multi-AZ), RDS DB Instance Read Replicas, Amazon S3, Amazon CloudWatch, Amazon SES, Availability Zone]
  80. The next big steps
  81. From 5 to 10 Million Users
      You may start to run into issues with your database around contention on the write master. How can you solve it?
      • Federation (splitting into multiple DBs based on function)
      • Sharding (splitting one data set up across multiple hosts)
      • Moving some functionality to other types of DBs (NoSQL)
  82. Database Federation
      • Split up databases by function or purpose
      • Harder to do cross-function queries
      • Essentially delays the need for something like sharding or NoSQL until much further down the line
      • Won’t help with single huge functions or tables
      [Diagram: ForumsDB, UsersDB, ProductsDB]
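Federation as described on slide 82 amounts to a lookup from function to database. A minimal sketch follows: the database names ForumsDB, UsersDB and ProductsDB come from the slide, while the table names and the `database_for` helper are hypothetical.

```python
# Which database serves which function (database names from slide 82).
FUNCTION_TO_DATABASE = {
    "forums": "ForumsDB",
    "users": "UsersDB",
    "products": "ProductsDB",
}

# Hypothetical mapping of tables to the function that owns them.
TABLE_TO_FUNCTION = {
    "threads": "forums",
    "posts": "forums",
    "accounts": "users",
    "profiles": "users",
    "catalog": "products",
    "prices": "products",
}


def database_for(table):
    """Return the federated database that owns the given table."""
    try:
        return FUNCTION_TO_DATABASE[TABLE_TO_FUNCTION[table]]
    except KeyError:
        raise ValueError(f"no database registered for table {table!r}")


def same_database(*tables):
    """True if a query touching all these tables can run on one database."""
    return len({database_for(t) for t in tables}) == 1


posts_db = database_for("posts")
joinable = same_database("posts", "accounts")  # spans ForumsDB and UsersDB
```

The `same_database` check makes the slide's second bullet concrete: a join across "posts" and "accounts" now spans two databases and has to be done in the application.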
  83. Sharded Horizontal Scaling
      • More complex at the application layer
      • ORM support can help
      • No practical limit on scalability
      • Operational complexity and sophistication
      • Shard by function or key space
      • RDBMS or NoSQL
      User: ShardID
      • 002345: A
      • 002346: B
      • 002347: C
      • 002348: B
      • 002349: A
      [Diagram: shards A, B, C]
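The user-to-shard table on slide 83 is a directory lookup. The sketch below reproduces that table and, as an added assumption that is not on the slide, falls back to a stable hash for users missing from the directory. `shard_for` and the hashing scheme are illustrative choices.

```python
import hashlib

SHARDS = ["A", "B", "C"]

# Directory taken from the table on slide 83.
DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_for(user_id):
    """Return the shard holding a user's rows."""
    if user_id in DIRECTORY:
        return DIRECTORY[user_id]
    # Assumption: unknown users are placed by a stable hash of the ID.
    # hashlib is used because Python's built-in hash() varies per process.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


known = shard_for("002348")    # found in the directory
unknown = shard_for("777777")  # placed by hash
```

A directory keeps the freedom to move a single user between shards, whereas pure hashing needs no lookup table but makes adding a shard harder. This is one form of the application-layer complexity the slide mentions.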
  84. Shifting Functionality to NoSQL
      • Similar in a sense to federation
      • Again, think about the earlier points for when you need NoSQL vs SQL
      • Leverage hosted services like Amazon DynamoDB
      • Consider these use cases:
        – Leaderboards and scoring
        – Rapid ingest of clickstream or log data
        – Temporary data needs (cart data)
        – “Hot” tables
        – Metadata or lookup tables
  85. From 5 to 10 Million Users
      You may start to run into issues with speed and performance of your applications
      • Make sure you have monitoring, metrics, & logging in place
        – If you can’t build it internally, outsource it! (third-party SaaS)
      • Pay attention to what customers are saying works well vs. what doesn’t, and use this as direction
      • Try to squeeze as much performance as possible out of each service or component
  86. A quick review
  87. • Use Multi-AZ for your infrastructure
      • Make use of self-scaling services (Elastic Load Balancing, Amazon S3, Amazon SNS, Amazon SQS, Amazon SES, etc.)
      • Build in redundancy at every level
      • Blend SQL & NoSQL wisely
      • Cache data both inside and outside your infrastructure
      • Split tiers into individual services (SOA)
      • Use Auto Scaling once you’re ready for it
      • Use automation tools in your infrastructure
      • Make sure you have good metrics, monitoring, and logging tools in place
      • Don’t reinvent the wheel
  88. Putting all this together means we should now easily be able to handle 10+ million users!
  89. Users > 10 Million
      Iterating on top of the patterns seen here will get you up and over 100 million users.
  90. Users > 10 Million
      • More fine tuning of your application
      • More SOA of features and functionality
      • Going from Multi-AZ to multi-region
      • Needing to start building custom solutions
      • Deep analysis of your whole stack
      Check out Session: ARC305 – How Netflix Leverages Multiple Regions to Increase Availability
  91. One More Thing
      • A fantastic amount of FINANCIAL ENGINEERING to do as well
      • Reserved Instances
      • Spot Instances
      • Correct use of storage
      • Scaling driven by queues
      • Correct instance sizes
      • Etc.
      Check out Session: ARC313 – Running Lean and Mean: Designing Cost-Efficient Architectures on AWS
  92. Next steps?
      Read!
      • aws.amazon.com/documentation
      • aws.amazon.com/architecture
      • aws.amazon.com/start-ups
      Listen!
      • aws.amazon.com/podcast
  93. Next steps?
      Ask for help!
      • forums.aws.amazon.com
      • aws.amazon.com/support
      • Your local account manager & solutions architect
  94. AWS re:Invent Pub Crawl
      Join the AWS Startup Team this evening at the AWS Pub Crawl
      When: Wednesday November 13, 5:30pm - 7:30pm
      Where: Canaletto at The Venetian, 2nd Floor
      Who Will Be There: Startups, The AWS Startup Team, Startup Launch Companies and AWS re:Invent Hackathon winners
  95. Startup Spotlight Sessions with Dr. Werner Vogels
      Thurs. Nov 14, Marcello Room 4406
      SPOT 203 - Fireside Chats – Startup Founders, 1:30-2:30pm
        – Eliot Horowitz, CTO of MongoDB
        – Jeff Lawson, CEO of Twilio
        – Valentino Volonghi, Chief Architect of AdRoll
      SPOT 204 - Fireside Chats – Startup Influencers, 3:00-4:00pm
        – Albert Wenger, Managing Partner at Union Square Ventures
        – David Cohen, Founder and CEO of TechStars
      SPOT 101 - Startup Launches, 4:15-5:15pm
        – 5 companies powered by AWS launching at AWS re:Invent 2013
  96. Please give us your feedback on this presentation: ARC206. As a thank you, we will select prize winners daily for completed surveys!