What makes AWS invincible? from JAWS Days 2014



Published in: Technology


  1. 1. What makes AWS invincible? Haruka Iwao, 2014/03/15
  2. 2. Before talking about AWS
  3. 3. About myself Haruka Iwao (@Yuryu) DevOps Engineer at FreakOut, Inc. Lived in Osaka, Tsukuba, Yokohama. Now in Tokyo. Playing FFXIV ARR
  4. 4. Me Final Fantasy XIV ARR Status: Cleared the Coil Turn 5. Got my Allagan Weapon.
  5. 5. Kindle Publishing Publishing Kindle books about the Linux Kernel Search “Yuryu Linux”
  6. 6. About FreakOut, Inc. Not about freaking out :p Advertisement company Established in 2010 “Real-time Bidding”
  7. 7. Real-time bidding One SSP (Supply-side Platform) and three DSPs (Demand-side Platforms). Flow: Request a page; Read an ad tag; Call for bids; DSP decides the best ad for the user and page
  8. 8. Real-time bidding (2) One SSP (Supply-side Platform) and three DSPs (Demand-side Platforms). Flow: Bid; Auction; Return the winning ad
  9. 9. Real-time bidding (3) http://londoncreative.com/real-time-bidding-spending-to-significantly-increase/
  10. 10. Our motto 50ms or die. Return a response within 50ms or lose an auction automatically. Latency matters. Literally.
  11. 11. How we use AWS
  12. 12. Our system at a glance http://aws.amazon.com/jp/solutions/case-studies/freakout/
  13. 13. Mix of on-premise and AWS On-premise in Japan AWS in North America Starting small Scaling well No need to visit a DC
  14. 14. Latency matters
  15. 15. Latency matters Latency is important for our service 1ms = 1/50 of processing time
  16. 16. Latency between servers Freedom to build an arbitrary network ... Gives you an arbitrary latency
  17. 17. Longer latency in AWS On-premise: time=0.063 ms, time=0.083 ms, time=0.077 ms, time=0.070 ms, time=0.092 ms, time=0.069 ms, time=0.077 ms. AWS, extreme case: time=1.88 ms, time=1.96 ms, time=2.60 ms, time=3.72 ms, time=2.46 ms, time=1.05 ms, time=2.37 ms
  18. 18. Longer latency in AWS (2) Hard to see? Let’s make a graph...
  19. 19. Longer latency, illustrated [Bar chart: RTT (ms), On-premise vs. AWS; axis from 0 to 2.5 ms]
  20. 20. Longer latency in AWS (3) This is not always true Just an extreme case This applies to intra-AZ An “option” to group servers in nearby racks would be great
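
As a rough illustration of what the ping samples on slide 17 mean against the 50 ms deadline from slide 10, the following sketch averages both sample sets and expresses one round trip as a share of the budget. The sample values are copied from the slide; the names `budget_share` and `BUDGET_MS` are made up for this example.

```python
from statistics import mean

# Ping round-trip times in milliseconds, copied from slide 17.
ON_PREMISE_RTT_MS = [0.063, 0.083, 0.077, 0.070, 0.092, 0.069, 0.077]
AWS_RTT_MS = [1.88, 1.96, 2.60, 3.72, 2.46, 1.05, 2.37]

# The 50 ms response deadline from slide 10.
BUDGET_MS = 50.0


def budget_share(rtt_ms: float, budget_ms: float = BUDGET_MS) -> float:
    """Fraction of the response budget consumed by one round trip."""
    return rtt_ms / budget_ms


on_premise_mean = mean(ON_PREMISE_RTT_MS)
aws_mean = mean(AWS_RTT_MS)

print(f"on-premise mean RTT: {on_premise_mean:.3f} ms")
print(f"AWS mean RTT:        {aws_mean:.3f} ms")
print(f"ratio:               {aws_mean / on_premise_mean:.1f}x")
print(f"budget per AWS round trip: {budget_share(aws_mean):.1%}")
```

Every sequential call to another server costs one such round trip, so the share multiplies with the number of internal hops a single bid response needs.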
  21. 21. Placement groups Placement groups are not enough Only available to cluster compute instances Guarantees bandwidth, not latency
  22. 22. Possible workarounds Assume the latency Design your app accordingly Use persistent connections Keep hot data local Still, lower latency gives “extra” room
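
The "use persistent connections" workaround can be sketched with the Python standard library alone. This is a minimal local demonstration, not the system described in the talk: it starts a throwaway HTTP/1.1 server on loopback and sends several requests over one `http.client.HTTPConnection`, so the TCP handshake round trip is paid only once. The `Handler` class and `fetch_many` helper are hypothetical names for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    seen_ports = []  # client source port observed for each request

    def do_GET(self):
        Handler.seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_many(host, port, n):
    """Send n GET requests over a single, reused TCP connection."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        statuses = []
        for _ in range(n):
            conn.request("GET", "/")
            resp = conn.getresponse()
            resp.read()  # drain the body before reusing the connection
            statuses.append(resp.status)
        return statuses
    finally:
        conn.close()


server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
statuses = fetch_many("127.0.0.1", server.server_address[1], 3)
server.shutdown()
server.server_close()

# One distinct source port means all requests shared one connection.
print(statuses, len(set(Handler.seen_ports)))
```

Opening a new connection per request would add at least one extra round trip each time, which matters when a single round trip can already take a few percent of the deadline.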
  23. 23. Infrastructure as Code
  24. 24. The “Awesome” Console
  25. 25. ... So awesome that it is easy to make mistakes...
  26. 26. AWS is Programmable.
  27. 27. Thou hast SDK. Python
  28. 28. Thou hast CLI. CLI
  29. 29. Thou hast CloudFormation. AWS CloudFormation
  30. 30. SDK + CLI + CloudFormation You can “code” your infrastructure Infrastructure becomes “reproducible” and “reusable”
  31. 31. Always use CLI Always use CLI to make changes “Review” the commands Less chance of “oops” But...
  32. 32. CLI is hard to understand!
  33. 33. VS aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type t1.micro --key-name MyKeyPair --security-group-ids sg-xxxxxxxx --subnet-id subnet-xxxxxxxx
  34. 34. Easy enough?
  35. 35. No way...
  36. 36. Record & Play “Record” instructions on the Web Console “Playback” them using CLI In other words...
  37. 37. Converted to aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type t1.micro --key-name MyKeyPair --security-group-ids sg-xxxxxxxx --subnet-id subnet-xxxxxxxx
  38. 38. With “playback” You could review changes beforehand You could record changes and reuse them Easier than writing CLI commands by hand
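
To make the "record and playback" idea concrete, here is a hedged sketch of the playback half: it turns a recorded set of launch parameters into the reviewable `aws ec2 run-instances` command shown on the slides. The recording format (a plain dict keyed by option name) and the `to_cli` helper are assumptions for illustration; the slides propose the feature and do not describe any real recording format.

```python
import shlex

# Hypothetical "recording" of a console launch; the dict layout is an
# assumption for illustration, not an AWS format. Values are the
# placeholders used on the slides.
recorded_launch = {
    "image-id": "ami-xxxxxxxx",
    "count": 1,
    "instance-type": "t1.micro",
    "key-name": "MyKeyPair",
    "security-group-ids": "sg-xxxxxxxx",
    "subnet-id": "subnet-xxxxxxxx",
}


def to_cli(service: str, operation: str, params: dict) -> str:
    """Render recorded parameters as a shell-quoted AWS CLI command."""
    argv = ["aws", service, operation]
    for name, value in params.items():
        argv += [f"--{name}", str(value)]
    return shlex.join(argv)  # requires Python 3.8 or later


command = to_cli("ec2", "run-instances", recorded_launch)
print(command)
```

The generated string can be pasted into a review request or a change log before anyone runs it, which is the point of the "review the commands" advice.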
  39. 39. A very famous quote about “code”
  40. 40. All your code are belong to test
  41. 41. Testing is Important Every program has bugs “Infrastructure as Code” is no exception
  42. 42. How do you test?
  43. 43. Bugs can be fatal A bug can destroy your whole system What if you accidentally Terminate an instance Set a wrong route table Delete RR from Route53
  44. 44. “Sandbox” for testing VPC is (sometimes) not enough Test 100% bootstrap in a safe environment Register IAM accounts Add Route53 zones Set up S3 buckets, etc…
  45. 45. Framework for testing Test-kitchen to test your Chef cookbooks Serverspec to test your server setups How do you verify your changes to AWS?
  46. 46. Possible workarounds Use a separate account Maybe we need more environments in the future? Costs money CloudFormer converts environments to configuration
  47. 47. Scenario #1 You add a new rule to your security group aws ec2 authorize-security-… You want to make sure a port is open or closed between particular hosts How?
  48. 48. Workaround #1 Create a new VPC Apply the new rule Launch two instances Check connectivity
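
The last step of this workaround, checking connectivity, could look like the sketch below when run from one of the two test instances. It is only an assumption about how one might script the check: `port_is_open` is a hypothetical helper, and the demonstration at the bottom uses a loopback listener instead of real instances so that it runs anywhere.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        # A port blocked by a security group usually shows up as a timeout.
        return False


# Local demonstration: listen on an ephemeral loopback port, then close it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

open_result = port_is_open("127.0.0.1", port)
listener.close()
closed_result = port_is_open("127.0.0.1", port)

print(open_result, closed_result)
```

In the real scenario the host would be the private address of the second instance, and the expected result depends on whether the new rule is meant to open or close the port.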
  49. 49. Scenario #2 You set up Route53 Health Checks Now you want to test if it actually fails over How?
  50. 50. Workaround #2 Set up two ELBs / instances Stop instances registered to one ELB Query R53 until it fails over
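
The "query until it fails over" step can be sketched as a polling loop. This is a minimal sketch under stated assumptions: `wait_for_failover` and `resolve_ipv4` are hypothetical helpers, the hostname and addresses are documentation placeholders, and the demonstration uses a fake resolver so it runs without any AWS resources. Against a real record, the system resolver's caching and the record TTL would affect how quickly a change becomes visible.

```python
import socket
import time


def resolve_ipv4(name: str) -> set:
    """Resolve a hostname to its set of IPv4 addresses via the system resolver."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def wait_for_failover(name, primary_addrs, resolve=resolve_ipv4,
                      timeout_s=300.0, interval_s=5.0, sleep=time.sleep):
    """Poll until `name` no longer resolves to any primary address.

    Returns the new address set, or None if the timeout expires first.
    """
    primary = set(primary_addrs)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            current = resolve(name)
        except OSError:
            current = set()  # treat a failed lookup as "no answer yet"
        if current and not (current & primary):
            return current
        sleep(interval_s)
    return None


# Demonstration with a fake resolver: two primary answers, then the secondary.
fake_answers = iter([{"192.0.2.1"}, {"192.0.2.1"}, {"198.51.100.7"}])
result = wait_for_failover(
    "app.example.com",            # placeholder name
    {"192.0.2.1"},                # placeholder primary address
    resolve=lambda name: next(fake_answers),
    interval_s=0,
    sleep=lambda seconds: None,   # skip real waiting in the demo
)
print(result)
```

Making the resolver and sleep function injectable keeps the loop itself testable, which fits the talk's argument that infrastructure checks need tests of their own.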
  51. 51. Need a solution! A “common language” to verify AWS configuration Want to run tests cheaper, quicker and safer Even the requirements are not yet clear…
  52. 52. At the end of the presentation…
  53. 53. What makes AWS invincible? Lower latency Giving options or hints to EC2 “Playback” feature Generate CLI commands using simple UI Testing methodology
  54. 54. Thank you!
