Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Amazon DynamoDB is a fully managed, zero-admin, high-speed NoSQL database service. Amazon DynamoDB was built to support applications at any scale. With the click of a button, you can scale your database capacity from a few hundred I/Os per second to hundreds of thousands of I/Os per second or more. You can dynamically scale your database to keep up with your application's requirements while minimizing costs during low-traffic periods. The service has no limit on storage. In this session, you also learn about Amazon DynamoDB's design principles and history.



  1. DAT101 - Production NoSQL in an Hour: Introduction to Amazon DynamoDB © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  2. Introduction to Amazon DynamoDB
  3. “Amazon DynamoDB is the result of everything we’ve learned from building large-scale, non-relational databases for Amazon.com and building highly scalable and reliable cloud computing services at AWS.”
  4. What is Amazon DynamoDB?
  5. Design Philosophy
  6. Design Philosophy
  7. Design Philosophy
  8. Design Philosophy
  9. Design Philosophy
  10. Flexible Data Model
  11. Access and Query Model • Two primary key options – Hash key: key lookups (“Give me the status for user abc”) – Composite key (hash with range): “Give me all the status updates for user ‘abc’ that occurred within the past 24 hours” • Support for multiple data types – string, number, binary… or sets of strings, numbers, or binaries • Supports both strong and eventual consistency – choose your consistency level when you make the API call; different parts of your app can make different choices • Local secondary indexes
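The two access patterns on slide 11 map directly onto the GetItem and Query APIs. Below is a minimal sketch using boto3 (the talk shows no code); the table and attribute names (UserStatus, StatusUpdates, user_id, timestamp) are illustrative assumptions.

```python
# Sketch of slide 11's two key options with boto3; names are assumptions.
import time
import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb")

# Hash key only: "give me the status for user abc", read with strong consistency.
user_status = dynamodb.Table("UserStatus")
status = user_status.get_item(Key={"user_id": "abc"}, ConsistentRead=True)

# Hash + range key: "all status updates for user abc from the past 24 hours",
# using the default eventually consistent read.
status_updates = dynamodb.Table("StatusUpdates")
day_ago = int(time.time()) - 24 * 3600
recent = status_updates.query(
    KeyConditionExpression=Key("user_id").eq("abc") & Key("timestamp").gte(day_ago)
)
```

Note that the consistency choice is made per API call, so different parts of an application can make different trade-offs against the same table.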
  12. High Availability and Durability
  13. I want to build a production-ready database…
  14. This used to be the only way… You choose: • Memory • CPU • Hard drive specs • Software • … to get the database performance you want: • Throughput rate • Latency • …
  15. This used to be the only way… You choose: • Memory • CPU • Hard drive specs • Software • … to get the database performance you want: • Throughput rate • Latency • …
  16. Provisioned Throughput Model – Tell us the performance you want; let us handle the rest
  17. Provisioned Throughput Model – Every DynamoDB table has: • Provisioned write capacity • Provisioned read capacity • No limit on storage
  18. Provisioned Throughput Model
  19. Provisioned Throughput Model – Change your throughput capacity as needed; pay for throughput capacity and storage used
  20. Seamless Scalability – Change scale with the click of a button
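Slides 16–20 describe the provisioned throughput model in words only; the sketch below shows what it looks like with boto3. The table name, key schema, and capacity numbers are assumptions for illustration.

```python
# Sketch of provisioning and re-provisioning throughput; names/numbers are assumptions.
import boto3

client = boto3.client("dynamodb")

# Create a table with explicit read/write capacity; storage is never provisioned.
client.create_table(
    TableName="GameScores",
    AttributeDefinitions=[{"AttributeName": "player_id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "player_id", "KeyType": "HASH"}],
    ProvisionedThroughput={"ReadCapacityUnits": 100, "WriteCapacityUnits": 50},
)

# Later, scale the same table up (or down) to match demand.
client.update_table(
    TableName="GameScores",
    ProvisionedThroughput={"ReadCapacityUnits": 1000, "WriteCapacityUnits": 500},
)
```

The point of the model is the second call: re-provisioning is a single API request, not a hardware project.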
  21. Capacity Forecasting is Hard – When you run your own database, you need to: • Try to forecast the scale you need • Invest time and money learning how to scale your database • React quickly if you get it wrong
  22. Timid Forecasting: plan for a lot more capacity than you probably need. Benefits: • Safety – you know you’re ready. Risks: • Buy too much capacity • Lose development resources to scale testing/planning • Do more work than necessary
  23. Aggressive Forecasting: cut it close! Plan for less capacity and hope you don’t need more… Benefits: • Lower costs if all goes well. Risks: • Last-minute scaling emergencies • How does your database behave at an unexpected scale?
  24. Typical Holiday Season Traffic at Amazon (chart: provisioned capacity vs. actual traffic)
  25. Unused Capacity (chart: 76% vs. 24%)
  26. Reduce Costs by Matching Capacity to Your Needs (chart: capacity we needed before DynamoDB vs. actual traffic vs. capacity we can provision with DynamoDB)
  27. Reduce Forecasting Risk by using DynamoDB
  28. Reduce Forecasting Risk by using DynamoDB
  29. What does DynamoDB handle for me?
  30. Focus on building your app, not running your database
  31. Try it out! aws.amazon.com/dynamodb
  32. Production NoSQL in an hour – David R. Albrecht, Senior Engineer in Operations, Crittercism – November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  33. Mobile application performance management
  34. HTTP Requests
  35. 600 million devices
  36. None of this adds differentiating business value.
  37. Metadata: session id via usernames #import "Crittercism.h" [Crittercism enableWithAppID:@"<YOUR_CRITTERCISM_APP_ID>"]; [Crittercism setUsername:(NSString *)username];
  38. we tried a lot of things; most of them failed
  39. Our first attempt: sharded MongoDB on EC2 (diagram: shards “orange”, “apple”, “durian” spread across AZ 1 and AZ 2) • Each shard: 2x m2.4xlarge, EBS optimized • Gross: 2x 3200 GB; net: 1.6 TB, RAID 10 • Cost: EBS standard $704/mo, EC2 compute $2650/mo • Price floor: $1.45/GB-mo
  40. But storage capacity wasn’t the problem!
  41. Second attempt: Redis ring (diagram: master/slave pairs on a consistent-hashing ring, per Karger et al.) • Each shard: 2x m2.4xlarge • Gross: 2x 64 GB RAM; net: 64 GB RAM • O(10k) iops performance • Cost: EC2 compute $2650/mo • Price floor: $41.45/GB-mo, but it is an ops nightmare.
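For readers unfamiliar with the consistent hashing (Karger et al.) behind the Redis ring on slide 41, here is a minimal, illustrative ring in Python; it is a sketch of the general technique, not Crittercism's implementation.

```python
# Minimal consistent-hashing ring: keys map to the first virtual node clockwise.
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, replicas=100):
        # Each physical node gets `replicas` virtual points on the ring.
        self.ring = sorted(
            (self._hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first virtual node at or after hash(key), wrapping around.
        idx = bisect.bisect(self.keys, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["redis-1", "redis-2", "redis-3"])
print(ring.node_for("user:abc"))  # stable shard assignment as the key space grows
```

Adding or removing a shard only remaps roughly 1/N of the keys, but keeping a ring like this healthy in production is exactly the operational burden the speaker calls an ops nightmare.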
  42. Lesson: db scaling is 2D (chart: iops vs. capacity, with RAM, SSD, and HDD plotted along the tradeoff)
  43. A horizontally-scalable, tabular, indexed database with user-defined consistency semantics.
  44. Benefit: Pay only for consumed capacity
  45. Benefit: load spike insurance
  46. Benefit: application-appropriate scaling (chart: iops vs. capacity)
  47. Benefit: no operational burden
  48. Lessons learned • Database scaling is a 2D problem • Don't try to roll your own sharding scheme • Dynamo works for us.
  49. david@crittercism.com
  50. 100 Billion (with a B) Requests a Day with Amazon DynamoDB – Valentino Volonghi, AdRoll Chief Architect – November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  51. Pixel “fires”
  52. Pixel “fires” Serve ad?
  53. Ad served Pixel “fires” Serve ad?
  54. If you can’t reply in 100ms… it doesn’t matter anymore! (latency budget: network 40ms + buffer 20ms + processing 40ms) But you really only get 40ms!
  55. Data must be available all over the world! (big-picture slide)
  56. 7/2011 - ~50GB/day → 4/2013 - ~5TB/day → 10/2013 - ~20TB/day
  57. What were our requirements?
  58. Key-Value Store Requirements • <10ms random key lookup with 100-byte values • 5-10B items stored • Scale up and down without performance hit • ~100% uptime – this is money for us • Consistent and sustained read/write throughput
  59. Why DynamoDB instead of… • HBase: hbck like rain, really hard to manage • Cassandra: still immature when we needed it • Redis: limited by available memory, no clustering • Riak: great product, not fast enough for us • MongoDB: inconsistent write throughput
  60. But the real reason… They all require people to manage them! And they’re all hard to run in the cloud!
  61. DynamoDB by Our Numbers • 4 regions in use with live traffic replication • 120B+ key fetches worldwide per day • 1.5TB of data stored per region • 30B+ items stored in each region • <3ms uniform query latency, <10ms at 99.95%
  62. What did we learn after all?
  63. Batch operations as much as possible!
  64. Two key-value table designs: • HashKey table – query with GetItem, update with UpdateItem; drawbacks: low write throughput, key splitting when exceeding max size, write contention • HashAndRangeKey table – query with Query, update with BatchPutItem
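Slide 63's advice to batch operations can be illustrated with boto3's batch_writer, which wraps the BatchWriteItem API. This is a hedged sketch; the table and attribute names are assumptions, not AdRoll's schema.

```python
# Batching writes: one HTTP round trip per 25 items instead of one per item.
import boto3

table = boto3.resource("dynamodb").Table("StatusUpdates")

with table.batch_writer() as batch:
    for i in range(1000):
        batch.put_item(Item={"user_id": "abc", "timestamp": i, "status": "ok"})
```

batch_writer also resubmits any unprocessed items for you, which is most of the bookkeeping that hand-rolled batching tends to get wrong.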
  65. Properly balance your structures!
  66. Tips for Optimum Performance • Evenly distribute keys in hash range • All values should be about the same size • Cache reads for a few seconds • Buffer writes, when necessary • Exponential back-off retries
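The last tip on slide 66, exponential back-off retries, might look like the sketch below. Current AWS SDKs already retry throttled requests for you; this just makes the idea explicit, and the table and function names are illustrative.

```python
# Sketch of exponential back-off around a write that may be throttled.
import random
import time
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("StatusUpdates")

def put_with_backoff(item, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return table.put_item(Item=item)
        except ClientError as err:
            if err.response["Error"]["Code"] != "ProvisionedThroughputExceededException":
                raise
            # Sleep roughly 2^attempt * 100ms with jitter, then try again.
            time.sleep((2 ** attempt) * 0.1 * random.uniform(0.5, 1.5))
    raise RuntimeError("write failed after retries")
```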
  67. What do you mean you don’t care about the money?
  68. Snacks vs. DynamoDB (cost chart) – Why do we pay so much for snacks again?
  69. We have this huge database, pretty much always available, and we barely know it’s there
  70. Please give us your feedback on this presentation (DAT101). As a thank you, we will select prize winners daily for completed surveys!
