Webinar | Introduction to Amazon DynamoDB


This webinar discusses Amazon DynamoDB, a NoSQL, highly scalable, SSD-based, zero administration database service in the AWS Cloud. We explain how DynamoDB works and also walk through some best practices and tips to get the most out of the service.

Published in: Technology, Business

  1. 1. Introducing DynamoDB. 20th March, 2012. Dr. Matt Wood - matthew@amazon.com
  2. 2. Storage Tools & Compute Support Databases
  3. 3. Storage Databases Tools & Compute Support
  4. 4. Databases: Relational databases, “NoSQL” databases
  5. 5. Any database on Amazon EC2: MySQL, DB2, Oracle, PostgreSQL...
  6. 6. Relational Database Service: Managed MySQL and Oracle databases
  7. 7. Rapid provisioning. High availability. Scalable storage. Scalable compute. Relational Database Service: Managed MySQL and Oracle databases
  8. 8. High performance databases: Increase throughput. Increase availability. Reduce latency.
  9. 9. High performance databases: Increase throughput (Read replicas, Push-button scaling, ElastiCache). Increase availability. Reduce latency.
  10. 10. High performance databases: Increase throughput. Increase availability (Multi-AZ). Reduce latency.
  11. 11. High performance databases: Increase throughput. Increase availability. Reduce latency (ElastiCache).
  12. 12. Rich query semantics: Joins, transactions, query optimisation
  13. 13. Problem: Complexity. Performance decreases at scale.
  14. 14. Chart of performance against scale: Predictable, consistent
  15. 15. Chart of performance against scale: Predictable, consistent. Degraded performance with scale.
  17. 17. = more problems
  18. 18. Data caching. Provisioning! Data sharding. Cluster management. Fault management. = more problems
  19. 19. Undifferentiated heavy lifting
  20. 20. DynamoDB
  21. 21. Fully managed NoSQL database service
  22. 22. Offload admin and operational burden
  23. 23. Extremely fast performance
  24. 24. Seamless scalability
  25. 25. Focus on your stuff
  26. 26. AGENDA: Getting to know DynamoDB. Guided tour of service highlights. Provisioned throughput. Data model. DynamoDB in practice. Analytics with Elastic MapReduce.
  27. 27. HIGHLIGHTS: Low latency. Flexible. Large scale. Durable storage. Seamless scaling. Zero admin. Predictable performance.
  28. 28. HIGHLIGHTS: SSD backed. Low latency. Single digit millisecond: < 5 ms reads, < 10 ms writes.
  29. 29. HIGHLIGHTS: Massive scale. No table size limits. Unlimited storage.
  30. 30. HIGHLIGHTS: Seamless scale. Live repartitioning. Zero admin.
  31. 31. HIGHLIGHTS: Flexible data model. Key/attribute store for evolving models.
  32. 32. HIGHLIGHTS: Predictable performance. Provisioned throughput.
  33. 33. HIGHLIGHTS: Durable and available. Consistent, disk-only writes.
  34. 34. HIGHLIGHTS: Zero administration
  35. 35. What is provisioned throughput?
  36. 36. Reserve required IOPS: Per table. Set at creation. Scale via API.
  37. 37. Scale at any time. No downtime.
  38. 38. Pay for throughput
  39. 39. Per 1 KB item: $0.01 per hour for every 10 writes/second. $0.01 per hour for every 50 strongly consistent reads/second.
  40. 40. Per 1 KB item: $0.28 per million writes. $0.056 per million strongly consistent reads.
  41. 41. Pay for storage: $1.00 per GB per month of indexed storage
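The two ways of quoting the throughput price on slides 39 and 40 are the same figure seen per hour and per million operations. A minimal sketch of that arithmetic, assuming the March 2012 prices shown on the slides and ignoring any rounding to billing increments; the function names are illustrative and not part of any AWS API.

```python
# Prices quoted on the slides (March 2012); current pricing will differ.
WRITE_PRICE_PER_HOUR = 0.01   # USD per hour for every 10 writes/second
WRITES_PER_PRICE_UNIT = 10
READ_PRICE_PER_HOUR = 0.01    # USD per hour for every 50 strongly consistent reads/second
READS_PER_PRICE_UNIT = 50

SECONDS_PER_HOUR = 3600


def price_per_million(price_per_hour, ops_per_second):
    """Effective cost of one million operations at full utilisation."""
    ops_per_hour = ops_per_second * SECONDS_PER_HOUR
    return price_per_hour / ops_per_hour * 1_000_000


def hourly_throughput_cost(writes_per_second, reads_per_second):
    """Hourly cost of provisioned throughput (hypothetical helper)."""
    write_cost = writes_per_second / WRITES_PER_PRICE_UNIT * WRITE_PRICE_PER_HOUR
    read_cost = reads_per_second / READS_PER_PRICE_UNIT * READ_PRICE_PER_HOUR
    return write_cost + read_cost


write_price = price_per_million(WRITE_PRICE_PER_HOUR, WRITES_PER_PRICE_UNIT)
read_price = price_per_million(READ_PRICE_PER_HOUR, READS_PER_PRICE_UNIT)
```

Rounded to the precision used on slide 40, `write_price` and `read_price` match the per-million figures, which only hold if the provisioned capacity is fully used.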
  42. 42. Data model: Flexible. Schema-less.
  43. 43. Simple key/value pairs: title => “Introduction to DynamoDB”, date => “20120320”
  44. 44. Associative array, or Hash: [ title => “Introduction to DynamoDB”, date => “20120320” ]
  45. 45. Attributes: [ title => “Introduction to DynamoDB”, date => “20120320” ]
  46. 46. Attributes: [ title => “Disaster Recovery with AWS”, date => “20120320”, format => “webinar”, presenter => “Jeff Barr” ], [ title => “Introduction to DynamoDB”, date => “20120320” ]
  47. 47. Items: [ title => “Disaster Recovery with AWS”, date => “20120320”, format => “webinar”, presenter => “Jeff Barr” ], [ title => “Introduction to DynamoDB”, date => “20120320” ]
  48. 48. Table: [ title => “Disaster Recovery with AWS”, date => “20120328”, format => “webinar”, presenter => “Jeff Barr” ], [ title => “Introduction to DynamoDB”, date => “20120320” ]
  49. 49. Table
  50. 50. Item: “ImageID” = “1”, “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  51. 51. (“ImageID” = “1”, “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”); (“ImageID” = “2”, “Date” = “20100916”, “Title” = “ferrari”, “Tags” = “car”, “italian”); (“ImageID” = “3”, “Date” = “20100917”, “Title” = “coffee”, “Tags” = “drink”, “delicious”)
  52. 52. “ImageID” = “1” (Primary or hash key), “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  53. 53. “ImageID” = “1” (Primary or hash key), “Date” = “20100915” (Composite or range key), “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  54. 54. “ImageID” = “1” (Primary or hash key), “Date” = “20100915” (Composite or range key), “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white” (Sets of strings or numbers)
  55. 55. Best practice: Well balanced, fine grained hash keys. Customer, order, item, etc. rather than store_id.
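The hash key advice on slide 55 can be illustrated with a small simulation. This is a sketch under stated assumptions: the partition count and the use of MD5 are stand-ins chosen for the example, since the slides do not describe how DynamoDB maps hash keys to partitions.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical; the slides give no partition count


def partition_for(hash_key):
    """Map a hash key to a partition using MD5 as a stand-in hash."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def busiest_share(hash_keys):
    """Fraction of requests that land on the most heavily loaded partition."""
    load = Counter(partition_for(key) for key in hash_keys)
    return max(load.values()) / len(hash_keys)


# 10,000 requests keyed by one of three stores versus by individual customer.
by_store = [f"store-{i % 3}" for i in range(10_000)]
by_customer = [f"customer-{i}" for i in range(10_000)]

store_share = busiest_share(by_store)
customer_share = busiest_share(by_customer)
```

With only three distinct store keys, at least a third of the traffic falls on one partition however the hash behaves, while the fine grained customer keys spread close to evenly across all eight.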
  56. 56. Simple API: Only 12 operations.
  57. 57. Consistency: Writes are always consistent. Reads are consistent or eventually consistent.
  58. 58. Durability: Writes occur to disk, not memory. Writes are acknowledged once they have been made in two physical data centres.
  59. 59. Availability: Region specific (not AZ). Continuously replicated across multiple AZs.
  60. 60. Let’s take a look! Building a simple DynamoDB powered web application
  61. 61. NP-Complete.me: Book reviews for programmers. Threaded discussions. Page view counts. Tagging.
  62. 62. np-complete.me
  64. 64. np-complete.me/asin
  65. 65. np-complete.me/discuss
  67. 67. Book Thread Thread Thread Reply Reply
  68. 68. Book table: Book metadata. Hash key: asin => 0980576830
  69. 69. Book table: Book metadata. asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”
  70. 70. Book table: Book metadata, page views. asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”, views => 145
  71. 71. Book table: Book metadata, page views, book tags. asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”, views => 145, tags => [“php”, “aws”]
  72. 72. Thread table: Conversation thread. Hash key: asin => 0980576830. Range key: subject => “Very informative”
  73. 73. Thread table: Conversation thread. Hash key: asin => 0980576830. Range key: subject => “Very informative”. content => “This is a first class book...”, name => “Matt Wood”
  74. 74. Reply table: Conversation replies. Hash key: id => 0980576830:very-informative. Range key: datetime => “20120320”
  75. 75. Reply table: Conversation replies. Hash key: id => 0980576830:very-informative. Range key: datetime => “20120320”. reply => “I agree!”, name => “Werner Vogels”
  76. 76. DynamoDB tables: Books (asin), Threads (asin, subject), Replies (id, datetime)
  78. 78. Logical model: Book (asin). Thread (asin, subject), Thread (asin, subject), Thread (asin, subject). Reply (id, datetime), Reply (id, datetime).
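In the table design above, the Reply table's hash key (`0980576830:very-informative`) appears to be derived from the parent thread's (asin, subject) key. The slides show only the resulting id, so the slug rule below is an assumption made for illustration, and both function names are hypothetical.

```python
import re


def slugify(subject):
    """Lower-case a subject and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", subject.lower()))


def reply_hash_key(asin, subject):
    """Build the Reply table hash key from a thread's (asin, subject) key."""
    return f"{asin}:{slugify(subject)}"


example_id = reply_hash_key("0980576830", "Very informative")
```

One consequence of this particular rule is that two subjects differing only in case or punctuation map to the same id, so a real application would need to decide whether that is acceptable.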
  79. 79. Conditional writes (timeline). Client #1. DynamoDB: asin => 1934356, pages => 384. Client #2.
  80. 80. Conditional writes (timeline). Client #1: asin => 1934356, pages => 384, then pages => 502. DynamoDB: asin => 1934356, pages => 384, then pages => 502. Client #2: asin => 1934356, pages => 384.
  81. 81. Conditional writes (timeline). Client #1: asin => 1934356, pages => 384, then pages => 502. DynamoDB: asin => 1934356, pages => 384, then pages => 502, then ? Client #2: asin => 1934356, pages => 384, then pages => 450.
  82. 82. Conditional writes (timeline). Client #1: asin => 1934356, pages => 384, then pages => 502. DynamoDB: asin => 1934356, pages => 384, then pages => 502. Client #2: asin => 1934356, pages => 384, then pages => 450: Failed condition.
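The conditional write sequence on slides 79 to 82 can be replayed with a toy in-memory table. This is a sketch of the idea only: `ToyTable` and `update_if` are hypothetical names, not the DynamoDB API, and the example is single-threaded, whereas a real service has to perform the check and the write as one atomic step.

```python
class ConditionFailed(Exception):
    """Raised when the stored value no longer matches the expected one."""


class ToyTable:
    """In-memory stand-in for a table; not the DynamoDB API."""

    def __init__(self):
        self._items = {}

    def put(self, key, attributes):
        self._items[key] = dict(attributes)

    def get(self, key):
        return dict(self._items[key])

    def update_if(self, key, name, expected, new_value):
        """Set attribute `name` only if it still equals `expected`."""
        item = self._items[key]
        if item.get(name) != expected:
            raise ConditionFailed(f"{name} is {item.get(name)!r}, expected {expected!r}")
        item[name] = new_value


books = ToyTable()
books.put("1934356", {"pages": 384})

# Both clients read the same starting value.
seen_by_client_1 = books.get("1934356")["pages"]
seen_by_client_2 = books.get("1934356")["pages"]

# Client #1 writes first and succeeds.
books.update_if("1934356", "pages", expected=seen_by_client_1, new_value=502)

# Client #2's write is rejected because the value changed underneath it.
try:
    books.update_if("1934356", "pages", expected=seen_by_client_2, new_value=450)
    client_2_outcome = "written"
except ConditionFailed:
    client_2_outcome = "failed condition"
```

After the rejection, Client #2 would typically re-read the item and decide whether its update still makes sense.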
  83. 83. Atomic increment/decrement. Before: asin => 0980576830, views => 145. tables['books'].items['0980576830'].attributes.add(:views => 1). After: asin => 0980576830, views => 146.
  84. 84. Tagging: many to many. Book (asin, tags = [“php”, “aws”]). Query by key, retrieve tag collection. Add tags conditionally. No secondary indexes. Retrieve all books by tag.
  85. 85. Tagging: many to many. Book (asin, tags = [“php”, “aws”]). Tag (tag, asin = [“1449393683”, “0596515812”]). Query by book, retrieve tag collection. Query by tag, retrieve book collection.
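The two-table tagging layout on slide 85 amounts to maintaining the relationship in both directions, so that either lookup is a read by key. A minimal in-memory sketch, with plain dictionaries standing in for the Book and Tag tables; the function names are hypothetical, and which book carries which tag is made up for the example.

```python
from collections import defaultdict

# Two in-memory "tables" keyed the way the slide describes (illustrative only).
book_tags = defaultdict(set)   # Book table: asin -> set of tags
tag_books = defaultdict(set)   # Tag table:  tag  -> set of asins


def add_tag(asin, tag):
    """Record the tag in both directions so either lookup is a key read."""
    book_tags[asin].add(tag)
    tag_books[tag].add(asin)


def tags_for_book(asin):
    """Query by book, retrieve tag collection."""
    return set(book_tags.get(asin, ()))


def books_for_tag(tag):
    """Query by tag, retrieve book collection."""
    return set(tag_books.get(tag, ()))


add_tag("1449393683", "aws")
add_tag("0596515812", "aws")
add_tag("1449393683", "php")
```

The cost of this design is that every tag change is two writes, and in a real deployment those two writes would not be applied atomically together.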
  86. 86. Autoscaling via SNS

            // Assumes $ddb is an already-constructed DynamoDB client.
            // Read the table's current provisioned throughput.
            $Res = $ddb->describe_table(array('TableName' => 'books'));
            $Read = (int) $Res->body->Table->ProvisionedThroughput->ReadCapacityUnits;
            $Write = (int) $Res->body->Table->ProvisionedThroughput->WriteCapacityUnits;

            // Double both capacities and apply the change.
            $Read *= 2;
            $Write *= 2;
            $PT = array('ReadCapacityUnits' => (string) $Read, 'WriteCapacityUnits' => (string) $Write);
            $Res = $ddb->update_table(array('TableName' => 'books', 'ProvisionedThroughput' => $PT));
  91. 91. Considerations: Limited index and query model. Throughput is provisioned in 1K operations. Maximum 64K item size. Backup and restore via Elastic MapReduce.
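The "1K operations" and "64K item size" considerations translate into a simple sizing rule. The sketch below assumes that item sizes round up to the next whole kilobyte when throughput is consumed; that rounding rule is an assumption not stated on the slide, and the function names are hypothetical.

```python
import math

UNIT_SIZE_BYTES = 1024           # "Throughput is provisioned in 1K operations"
MAX_ITEM_SIZE_BYTES = 64 * 1024  # "Maximum 64K item size"


def capacity_units(item_size_bytes):
    """Units consumed by one operation, assuming sizes round up to whole KB."""
    if item_size_bytes > MAX_ITEM_SIZE_BYTES:
        raise ValueError("item exceeds the 64K limit quoted on the slide")
    return max(1, math.ceil(item_size_bytes / UNIT_SIZE_BYTES))


def required_throughput(item_size_bytes, operations_per_second):
    """Throughput to provision for a steady rate of same-sized operations."""
    return capacity_units(item_size_bytes) * operations_per_second
```

Under this assumption, ten writes per second of 3,000-byte items would need 30 units of write throughput, which is one reason to keep items small.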
  92. 92. Elastic MapReduce: Built for data. Designed for humans.
  93. 93. Collection, Computation, Collaboration
  94. 94. Collection, Computation, Collaboration: DynamoDB, Elastic MapReduce, Amazon S3, Amazon EC2
  95. 95. Diagram: DynamoDB, Data
  96. 96. Diagram: DynamoDB, Data, Code, Elastic MapReduce
  97. 97. Diagram: DynamoDB, Data, Code, Elastic MapReduce, Name node
  98. 98. Diagram: DynamoDB, Data, Code, Elastic MapReduce, Name node, Elastic cluster
  99. 99. Diagram: DynamoDB, Data, Code, Elastic MapReduce, Name node, HDFS, Elastic cluster
  100. 100. Diagram: DynamoDB, Data, Code, Elastic MapReduce, Name node, Output, S3, HDFS, Elastic cluster
  101. 101. Diagram: DynamoDB, Data, Output, S3
  102. 102. Export to S3

            -- External table backed by S3, partitioned by year and month.
            CREATE EXTERNAL TABLE orders_s3_new_export (
              order_id string, customer_id string, order_date int, total double )
            PARTITIONED BY (year string, month string)
            ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
            LOCATION 's3://export_bucket';

            -- Copy one month of DynamoDB data into the matching partition.
            INSERT OVERWRITE TABLE orders_s3_new_export
            PARTITION (year='2012', month='01')
            SELECT * from orders_ddb_2012_01;

  103. 103. Live data in DynamoDB

            -- Top five customers by spend in the first week of January 2012.
            SELECT customer_id, sum(total) spend, count(*) order_count
            FROM orders_ddb_2012_01
            WHERE order_date >= unix_timestamp('2012-01-01', 'yyyy-MM-dd')
            AND order_date < unix_timestamp('2012-01-08', 'yyyy-MM-dd')
            GROUP BY customer_id
            ORDER BY spend desc
            LIMIT 5;
  104. 104. Live and archive data
  105. 105. AGENDA: Getting to know DynamoDB. Guided tour of service highlights. Provisioned throughput. Data model. DynamoDB in practice. Analytics with Elastic MapReduce. Slides available shortly.
  106. 106. DynamoDB free tier: 5 writes/second, 10 consistent reads/second, 100 MB storage
  107. 107. Developer Guide: aws.amazon.com/documentation/dynamodb
  108. 108. Drop us a line! aws.amazon.com/contact-us
  109. 109. Thank you!
  110. 110. Q&A: matthew@amazon.com
  111. 111. SimpleDB: Zero maintenance, NoSQL datastore
  112. 112. SimpleDB: Zero maintenance, NoSQL datastore. Flexible queries. 10 GB / 1 billion attributes per table. No native data sharding.
