Large-Scale Load Testing
Amazon.com’s Traffic on AWS
Carlos Arguelles, Amazon.com
November 15, 2013

© 2013 Amazon.com, In...
Large Scale Load Testing Amazon.com’s Traffic on AWS (CPN102) | AWS re:Invent 2013
It’s 4 a.m. and you don’t know it, but you’re about to get three times the traffic you were expecting. Is your service ready to handle it? Systems are only as scalable as their weakest component. Large-scale load testing in production is the best (and surest) way to ensure that services can truly scale to the unexpected. But the load generator itself can be difficult to scale, expensive to run on hundreds or thousands of hosts, challenging to keep the data secure, and time-consuming to develop. The Amazon.com retail site is one of the most heavily used sites in the world, and has to be ready for anything, at any time. How do you design a load test for this in record time while keeping it cost-effective? Well, you use AWS! Come learn best practices on how you can use Amazon SQS, Amazon S3, Amazon EC2, Amazon CloudWatch, Auto Scaling, and Amazon DynamoDB to design horizontally scalable large-scale load tests that can simulate the load that millions of users are putting onto your site. We met a tight schedule and did it under budget thanks to AWS, and you can too!


  1. Large-Scale Load Testing Amazon.com’s Traffic on AWS. Carlos Arguelles, Amazon.com. November 15, 2013. © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  2. What I’d like you to get out of this: Load and performance issues cost
  3. What I’d like you to get out of this
  4. What I’d like you to get out of this: How you can leverage AWS for load and stress tests
  5. About me
  6. Amazon.com retail site: Amazon.com receives a LOT of traffic
  7. Amazon.com retail site: Significant fluctuation throughout the day (chart, not to scale)
  8. Amazon.com retail site: Significant fluctuation throughout the year (chart, not to scale)
  9. Amazon.com retail site: Significant growth year to year (chart, not to scale)
  10. Some load-related issues can
  11. Chart: CPU usage on our fleet. Series: 1st test (cancelled), 2nd test (successful), regular day (off-peak). Levels marked: 100%, 85%, 50%.
  12. Some load-related issues can only
  13. Diagram: Ingestion Fleet, Amazon S3, Hadoop, Database, Amazon DynamoDB, Output Fleet
  14. Some load-related issues cannot
  15. Chart: Disk usage. Labels: Start load…, 5%, 20%, 5 hours.
  16. What do you really want to do? Performance Testing, Load Testing, Resilience Testing, Stress Testing
  17. Load Testing
  18. Stress Testing
  19. Resilience Testing
  20. Performance Testing
  21. How does AWS help us?
  22. Generating load: Replays from real-world traffic; Artificial rate, blend of operations
  23. Most useful AWS design pattern, ever
  24. Distributing load, the hard way. Diagram: a Master drives four Slaves at 3000 TPS each (12,000 TPS total); when one Slave drops to 0 TPS, the remaining three each show 4000 in place of 3000 TPS.
  25. Distributing load, the easy way. Diagram: Controllers, a row of Jobs, a row of Workers.
  26. Replaying traffic to generate load. Diagram: Test Data Repository, Controllers, Jobs, Workers, Metrics & Dashboards, Service under test.
  27. The same diagram annotated with AWS services: Amazon S3 for storing data and Amazon DynamoDB for indexing (Test Data Repository); Amazon SQS for state, resilience (Jobs); Amazon EC2 & Auto Scaling for hardware (Workers); Amazon CloudWatch (Metrics & Dashboards); reactive auto scaling based on queue size.
  28. Generating load: Replays from real-world traffic; Artificial rate, blend of operations
  29. Artificial traffic to generate load. Why? You do not have real-world data; you expect a change in traffic. How? Control rate; control blend; control duration.
  30. Artificial traffic to generate load. Three phases: 50,000 TPS for 20 minutes (99% Read, 1% Writes); 85,000 TPS for 45 minutes (90% Read, 10% Writes); 95,000 TPS for 3 hours (80% Read, 20% Writes). The first phase is split into Minute#1 … Minute#20, each 50,000 TPS at 99% / 1%. Minute#1 is split into jobs 1, 2, … 5000, each “10 TPS for 1 minute, 99% R 1% W”.
  31. Artificial traffic to generate load. Diagram: Controllers, a row of Jobs, a row of Workers.
  32. Amazon EC2 Spot Instances. A great way to inexpensively test: up to 90% off regular price (name your price); interruption-tolerant, time-flexible tasks. Approaches: combine with on-demand instances (burst); try Spot Instances first, then fall back to on-demand.
  33. Takeaways
  34. Please give us your feedback on this presentation (CPN102). As a thank you, we will select prize winners daily for completed surveys!
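
The job breakdown on slide 30 and the Controller / Job / Worker layout on slides 25 and 31 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling used in the talk: `Job`, `split_phase`, and `drain` are hypothetical names, an in-memory `deque` stands in for the Amazon SQS queue, and the 10 TPS job size is the figure shown on the slide.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """One small unit of load: a fixed rate and read/write blend for one minute."""
    minute: int          # which minute of the phase this job belongs to (1-based)
    tps: int             # transactions per second this job should generate
    read_pct: int        # percentage of reads; the remainder is writes
    duration_s: int = 60


def split_phase(total_tps, minutes, read_pct, job_tps=10):
    """Controller side: split 'total_tps for N minutes' into one-minute jobs."""
    if total_tps % job_tps:
        raise ValueError("total_tps must be a multiple of job_tps")
    jobs_per_minute = total_tps // job_tps
    return [
        Job(minute=m, tps=job_tps, read_pct=read_pct)
        for m in range(1, minutes + 1)
        for _ in range(jobs_per_minute)
    ]


def drain(jobs, worker_names):
    """Worker side: each worker pulls the next job whenever it is free.

    The deque is a stand-in (assumption) for a real queue such as Amazon SQS.
    Returns how many jobs each worker ended up taking.
    """
    if not worker_names:
        raise ValueError("need at least one worker")
    queue = deque(jobs)
    taken = {name: 0 for name in worker_names}
    while queue:
        for name in worker_names:
            if not queue:
                break
            queue.popleft()      # a real worker would generate the load here
            taken[name] += 1
    return taken


if __name__ == "__main__":
    # First phase from slide 30: 50,000 TPS for 20 minutes, 99% reads.
    phase = split_phase(total_tps=50_000, minutes=20, read_pct=99)
    print(len(phase), "jobs")
    print(drain(phase, ["worker-1", "worker-2", "worker-3"]))
```

Because workers pull jobs instead of being assigned a fixed per-host rate, no per-host target has to be recomputed when the number of workers changes, which is the contrast with the fixed 3000 TPS split on slide 24.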