Webinar | Introduction to Amazon DynamoDB
This webinar discusses Amazon DynamoDB, a NoSQL, highly scalable, SSD-based, zero administration database service in the AWS Cloud. We explain how DynamoDB works and also walk through some best practices and tips to get the most out of the service.

Published in: Technology, Business
Webinar | Introduction to Amazon DynamoDB

  1. Introducing DynamoDB. 20th March, 2012. Dr. Matt Wood - matthew@amazon.com
  2-3. Storage. Compute. Databases. Tools & Support.
  4. Databases: relational databases and “NoSQL” databases.
  5. Any database on Amazon EC2: MySQL, DB2, Oracle, PostgreSQL...
  6. Relational Database Service: managed MySQL and Oracle databases.
  7. Relational Database Service: managed MySQL and Oracle databases. Rapid provisioning. High availability. Scalable storage. Scalable compute.
  8. High performance databases. Increase throughput. Increase availability. Reduce latency.
  9. High performance databases. Increase throughput: read replicas, push-button scaling, ElastiCache. Increase availability. Reduce latency.
  10. High performance databases. Increase throughput. Increase availability: Multi-AZ. Reduce latency.
  11. High performance databases. Increase throughput. Increase availability. Reduce latency: ElastiCache.
  12. Rich query semantics: joins, transactions, query optimisation.
  13. Problem: complexity. Performance decreases at scale.
  14. Performance against scale: predictable, consistent.
  15-16. Performance against scale: predictable, consistent; degraded performance with scale.
  17. = more problems
  18. Data caching, provisioning, data sharding, cluster management, fault management = more problems.
  19. Undifferentiated heavy lifting
  20. DynamoDB
  21. Fully managed NoSQL database service
  22. Offload admin and operational burden
  23. Extremely fast performance
  24. Seamless scalability
  25. Focus on your stuff
  26. Agenda: Getting to know DynamoDB. Guided tour of service highlights. Provisioned throughput. Data model. DynamoDB in practice. Analytics with Elastic MapReduce.
  27. Highlights: low latency, flexible, large scale, durable storage, seamless scaling, zero admin, predictable performance.
  28. Highlights: low latency. SSD backed. Single-digit millisecond: < 5 ms reads, < 10 ms writes.
  29. Highlights: massive scale. No table size limits. Unlimited storage.
  30. Highlights: seamless scale. Live repartitioning. Zero admin.
  31. Highlights: flexible data model. Key/attribute store for evolving models.
  32. Highlights: predictable performance. Provisioned throughput.
  33. Highlights: durable and available. Consistent, disk-only writes.
  34. Highlights: zero administration.
  35. What is provisioned throughput?
  36. Reserve required IOPS. Per table. Set at creation. Scale via API.
  37. Scale at any time. No downtime.
  38. Pay for throughput
  39. Per 1 KB item: $0.01 per hour for every 10 writes/second; $0.01 per hour for every 50 strongly consistent reads/second.
  40. Per 1 KB item: $0.28 per million writes; $0.056 per million strongly consistent reads.
  41. Pay for storage: $1.00 per GB per month of indexed storage.
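
The per-million figures on slide 40 follow from the hourly prices on slide 39. The sketch below works through that arithmetic and extends it to a rough monthly estimate. It assumes cost scales linearly with provisioned rate and uses a 730-hour month; neither assumption comes from the slides, and the function names are illustrative only.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_MONTH = 730  # assumption: average month length, not from the slides

# Prices quoted on the slides (March 2012), per 1 KB item.
PRICE_PER_HOUR = 0.01            # dollars per block of provisioned throughput
WRITES_PER_BLOCK = 10            # writes/second bought by one block
READS_PER_BLOCK = 50             # strongly consistent reads/second per block
STORAGE_PRICE_PER_GB_MONTH = 1.00


def price_per_million(price_per_hour, ops_per_second):
    """Cost of one million operations when the provisioned rate is fully used."""
    ops_per_hour = ops_per_second * SECONDS_PER_HOUR
    return price_per_hour / ops_per_hour * 1_000_000


def monthly_cost(writes_per_second, reads_per_second, storage_gb):
    """Rough monthly bill, assuming cost is linear in the provisioned rates."""
    blocks = writes_per_second / WRITES_PER_BLOCK + reads_per_second / READS_PER_BLOCK
    throughput = blocks * PRICE_PER_HOUR * HOURS_PER_MONTH
    storage = storage_gb * STORAGE_PRICE_PER_GB_MONTH
    return throughput + storage


if __name__ == "__main__":
    print(round(price_per_million(PRICE_PER_HOUR, WRITES_PER_BLOCK), 2))  # writes
    print(round(price_per_million(PRICE_PER_HOUR, READS_PER_BLOCK), 3))   # reads
    print(round(monthly_cost(100, 500, 20), 2))
```

Ten writes per second for an hour is 36,000 writes for $0.01, which is where the roughly $0.28 per million comes from; the read figure is derived the same way from 180,000 reads per hour.
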
  42. Data model: flexible, schema-less.
  43. Simple key/value pairs: title => “Introduction to DynamoDB”, date => “20120320”
  44. Associative array, or hash: [ title => “Introduction to DynamoDB”, date => “20120320” ]
  45. Attributes: [ title => “Introduction to DynamoDB”, date => “20120320” ]
  46. Attributes: [ title => “Disaster Recovery with AWS”, date => “20120320”, format => “webinar”, presenter => “Jeff Barr” ] and [ title => “Introduction to DynamoDB”, date => “20120320” ]
  47. Items: [ title => “Disaster Recovery with AWS”, date => “20120320”, format => “webinar”, presenter => “Jeff Barr” ] and [ title => “Introduction to DynamoDB”, date => “20120320” ]
  48. Table: [ title => “Disaster Recovery with AWS”, date => “20120328”, format => “webinar”, presenter => “Jeff Barr” ] and [ title => “Introduction to DynamoDB”, date => “20120320” ]
  49. Table
  50. Item: “ImageID” = “1”, “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  51. Three items:
      “ImageID” = “1”, “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
      “ImageID” = “2”, “Date” = “20100916”, “Title” = “ferrari”, “Tags” = “car”, “italian”
      “ImageID” = “3”, “Date” = “20100917”, “Title” = “coffee”, “Tags” = “drink”, “delicious”
  52. “ImageID” = “1” (primary or hash key), “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  53. “ImageID” = “1” (primary or hash key), “Date” = “20100915” (composite or range key), “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  54. “ImageID” = “1” (primary or hash key), “Date” = “20100915” (composite or range key), “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white” (sets of strings or numbers)
  55. Best practice: well balanced, fine grained hash keys. Customer, order, item, etc. rather than store_id.
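
The best-practice advice on slide 55 can be illustrated with a small simulation. The slides do not describe how DynamoDB maps hash keys to partitions, so the MD5 hash and the count of eight partitions below are stand-ins chosen purely for illustration, as are the key names.

```python
import hashlib
from collections import Counter

PARTITIONS = 8  # hypothetical partition count, for illustration only


def partition_for(hash_key, partitions=PARTITIONS):
    """Map a hash key to a partition using MD5 as a stand-in hash function."""
    digest = hashlib.md5(hash_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def load_by_partition(hash_keys):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key) for key in hash_keys)


# 10,000 requests keyed by a fine grained attribute (one key per customer)...
customer_keys = [f"customer-{i}" for i in range(10_000)]
# ...versus the same number of requests keyed by one of only three stores.
store_keys = [f"store-{i % 3}" for i in range(10_000)]

if __name__ == "__main__":
    print("by customer:", sorted(load_by_partition(customer_keys).items()))
    print("by store:   ", sorted(load_by_partition(store_keys).items()))
```

With only three distinct store keys, at most three partitions can ever receive traffic, however many are available; keying by customer spreads the same request volume across all of them.
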
  56. Simple API: only 12 operations.
  57. Consistency: writes are always consistent. Reads are consistent or eventually consistent.
  58. Durability: writes occur to disk, not memory. Writes are acknowledged once they have been made in two physical data centres.
  59. Availability: region specific (not AZ). Continuously replicated across multiple AZs.
  60. Let’s take a look! Building a simple DynamoDB powered web application.
  61. NP-Complete.me: book reviews for programmers. Threaded discussions. Page view counts. Tagging.
  62-63. np-complete.me
  64. np-complete.me/asin
  65-66. np-complete.me/discuss
  67. Book. Thread, Thread, Thread. Reply, Reply.
  68. Book table (book metadata). Hash key: asin => 0980576830
  69. Book table (book metadata). asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”
  70. Book table (book metadata, page views). asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”, views => 145
  71. Book table (book metadata, page views, book tags). asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”, views => 145, tags => [“php”, “aws”]
  72. Thread table (conversation thread). Hash key: asin => 0980576830. Range key: subject => “Very informative”
  73. Thread table (conversation thread). Hash key: asin => 0980576830. Range key: subject => “Very informative”. content => “This is a first class book...”, name => “Matt Wood”
  74. Reply table (conversation replies). Hash key: id => 0980576830:very-informative. Range key: datetime => “20120320”
  75. Reply table (conversation replies). Hash key: id => 0980576830:very-informative. Range key: datetime => “20120320”. reply => “I agree!”, name => “Werner Vogels”
  76-77. DynamoDB tables: Books (asin), Threads (asin, subject), Replies (id, datetime).
  78. Logical model: Book (asin). Thread (asin, subject), Thread (asin, subject), Thread (asin, subject). Reply (id, datetime), Reply (id, datetime).
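
The Replies table on slides 74 and 75 uses a hash key that appears to combine the book's asin with a slug of the thread subject. A minimal sketch of how such keys might be built follows; the slug rule (lowercase, hyphens for runs of other characters) is inferred from the single example on the slide, and the helper names are hypothetical.

```python
import re


def slugify(text):
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def thread_key(asin, subject):
    """Threads table: hash key is the asin, range key is the subject."""
    return {"asin": asin, "subject": subject}


def reply_key(asin, subject, when):
    """Replies table: hash key is 'asin:slug', range key is the datetime."""
    return {"id": f"{asin}:{slugify(subject)}", "datetime": when}


if __name__ == "__main__":
    print(thread_key("0980576830", "Very informative"))
    print(reply_key("0980576830", "Very informative", "20120320"))
```

Because every reply to a thread shares the same hash key, the datetime range key keeps the replies for one thread together and ordered by time.
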
  79. Conditional writes. Timeline with Client #1, DynamoDB and Client #2. DynamoDB holds asin => 1934356, pages => 384.
  80. Conditional writes. Client #1 reads asin => 1934356, pages => 384, then writes pages => 502. DynamoDB moves from pages => 384 to pages => 502. Client #2 has read asin => 1934356, pages => 384.
  81. Conditional writes. Client #2, still holding pages => 384, attempts to write pages => 450 after DynamoDB has already moved to pages => 502. Outcome: ?
  82. Conditional writes. Client #2's write of pages => 450 is rejected: failed condition. DynamoDB keeps asin => 1934356, pages => 502.
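
The sequence on slides 79 to 82 can be sketched with an in-memory stand-in for a table. This is not the DynamoDB API: the `Table` class, its `expected` parameter and the `ConditionFailed` exception are hypothetical names used only to show the compare-then-write idea.

```python
class ConditionFailed(Exception):
    """Raised when the stored item no longer matches what the writer expected."""


class Table:
    """In-memory stand-in for a table that supports conditional puts."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return dict(self._items[key])

    def put(self, key, item, expected=None):
        """Store item; if expected is given, every listed attribute must match."""
        current = self._items.get(key)
        if expected is not None:
            for attribute, value in expected.items():
                if current is None or current.get(attribute) != value:
                    raise ConditionFailed(f"{attribute} is no longer {value!r}")
        self._items[key] = dict(item)


def run_scenario():
    books = Table()
    books.put("1934356", {"pages": 384})

    # Both clients read the same starting state.
    seen_by_client_1 = books.get("1934356")
    seen_by_client_2 = books.get("1934356")

    # Client #1 writes first; its condition (pages == 384) still holds.
    books.put("1934356", {"pages": 502}, expected={"pages": seen_by_client_1["pages"]})

    # Client #2 writes second; the item has changed, so its condition fails.
    try:
        books.put("1934356", {"pages": 450}, expected={"pages": seen_by_client_2["pages"]})
        outcome = "written"
    except ConditionFailed:
        outcome = "failed condition"

    return outcome, books.get("1934356")


if __name__ == "__main__":
    print(run_scenario())
```

Without the condition, Client #2's write would silently overwrite Client #1's update; with it, Client #2 learns that its view was stale and can re-read before retrying.
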
  83. Atomic increment/decrement. Before: asin => 0980576830, views => 145. Call: tables['books'].items['0980576830'].attributes.add(:views => 1). After: asin => 0980576830, views => 146.
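
To show why an atomic add suits a page-view counter, the sketch below has several threads increment the same attribute concurrently. The `CounterTable` class is a hypothetical in-memory stand-in; its lock plays the role that the service plays when it applies the increment on the server side.

```python
import threading


class CounterTable:
    """In-memory stand-in whose add() is atomic with respect to other calls."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put(self, key, item):
        with self._lock:
            self._items[key] = dict(item)

    def add(self, key, attribute, delta):
        """Increment (or decrement, with a negative delta) in a single step."""
        with self._lock:
            item = self._items.setdefault(key, {})
            item[attribute] = item.get(attribute, 0) + delta
            return item[attribute]

    def get(self, key):
        with self._lock:
            return dict(self._items[key])


def simulate(threads=8, views_each=1000):
    """Start from 145 views and record page views from several threads."""
    books = CounterTable()
    books.put("0980576830", {"views": 145})

    def record_views():
        for _ in range(views_each):
            books.add("0980576830", "views", 1)

    workers = [threading.Thread(target=record_views) for _ in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return books.get("0980576830")["views"]


if __name__ == "__main__":
    print(simulate())
```

Because each increment happens in one step, no view is lost even when many requests arrive at once; a separate read followed by a write would need the conditional-write pattern to stay correct.
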
  84. Tagging: many to many. Book (asin, tags = [“php”, “aws”]). Query by key, retrieve tag collection. Add tags conditionally. No secondary indexes. Retrieve all books by tag.
  85. Tagging: many to many. Book (asin, tags = [“php”, “aws”]). Tag (tag, asin = [“1449393683”, “0596515812”]). Query by book, retrieve tag collection. Query by tag, retrieve book collection.
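
Slide 85 answers the missing-secondary-index problem with a second table keyed by tag. A minimal in-memory sketch of keeping the two collections in step is shown below; the `TagIndex` class and its method names are hypothetical, and the dictionaries stand in for the Book and Tag tables.

```python
from collections import defaultdict


class TagIndex:
    """Two lookups kept in step: tags per book and books per tag."""

    def __init__(self):
        self.books = defaultdict(set)  # asin -> set of tags (Book table)
        self.tags = defaultdict(set)   # tag -> set of asins (Tag table)

    def add_tag(self, asin, tag):
        """Record the tag on the book and the book under the tag."""
        self.books[asin].add(tag)
        self.tags[tag].add(asin)

    def tags_for_book(self, asin):
        return sorted(self.books.get(asin, ()))

    def books_for_tag(self, tag):
        return sorted(self.tags.get(tag, ()))


if __name__ == "__main__":
    index = TagIndex()
    index.add_tag("1449393683", "php")
    index.add_tag("0596515812", "php")
    index.add_tag("1449393683", "aws")
    print(index.tags_for_book("1449393683"))
    print(index.books_for_tag("php"))
```

In a real deployment the two updates would be separate writes to separate tables, so an application following this pattern would need to handle a failure between them, for example by retrying the second write.
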
  86-90. Autoscaling via SNS:
      // Read the current provisioned throughput for the books table.
      $Res = $DDB->describe_table(array('TableName' => 'books'));
      $Read = (int) $Res->body->Table->ProvisionedThroughput->ReadCapacityUnits;
      $Write = (int) $Res->body->Table->ProvisionedThroughput->WriteCapacityUnits;
      // Double both values and apply them to the table.
      $Read *= 2;
      $Write *= 2;
      $PT = array('ReadCapacityUnits' => (string) $Read, 'WriteCapacityUnits' => (string) $Write);
      $Res = $DDB->update_table(array('TableName' => 'books', 'ProvisionedThroughput' => $PT));
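
The doubling step in the PHP snippet can be isolated as a pure function, which makes it easy to test separately from any service call. The sketch below assumes the table description is a nested mapping shaped like the response the PHP code reads from; the `factor` and `max_units` parameters are hypothetical additions, and the code that receives the SNS notification and calls the service is deliberately left out.

```python
def scaled_throughput(description, factor=2, max_units=None):
    """Return new read/write capacity values scaled from a table description.

    description is assumed to look like:
    {"Table": {"ProvisionedThroughput": {"ReadCapacityUnits": ..., "WriteCapacityUnits": ...}}}
    """
    current = description["Table"]["ProvisionedThroughput"]
    scaled = {}
    for name in ("ReadCapacityUnits", "WriteCapacityUnits"):
        units = int(current[name]) * factor
        if max_units is not None:
            units = min(units, max_units)  # optional ceiling to bound spend
        scaled[name] = units
    return scaled


if __name__ == "__main__":
    description = {
        "Table": {
            "ProvisionedThroughput": {
                "ReadCapacityUnits": "10",
                "WriteCapacityUnits": "5",
            }
        }
    }
    print(scaled_throughput(description))
    print(scaled_throughput(description, max_units=15))
```

A ceiling such as `max_units` is one way to stop repeated notifications from doubling capacity, and cost, without limit.
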
  91. Considerations: limited index and query model. Throughput is provisioned in 1K operations. Maximum 64K item size. Backup and restore via Elastic MapReduce.
  92. Elastic MapReduce. Built for data. Designed for humans.
  93. Collection. Computation. Collaboration.
  94. Collection: DynamoDB, Amazon S3. Computation: Elastic MapReduce, Amazon EC2. Collaboration.
  95. DynamoDB: data.
  96. DynamoDB: data. Code. Elastic MapReduce.
  97. DynamoDB: data. Code. Elastic MapReduce: name node.
  98. DynamoDB: data. Code. Elastic MapReduce: name node, elastic cluster.
  99. DynamoDB: data. Code. Elastic MapReduce: name node, elastic cluster, HDFS.
  100. DynamoDB: data. Code. Elastic MapReduce: name node, elastic cluster, HDFS. Output: S3.
  101. DynamoDB: data. Output: S3.
  102. Export to S3:
      CREATE EXTERNAL TABLE orders_s3_new_export (
        order_id string, customer_id string, order_date int, total double )
      PARTITIONED BY (year string, month string)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      LOCATION 's3://export_bucket';
      INSERT OVERWRITE TABLE orders_s3_new_export
      PARTITION (year='2012', month='01')
      SELECT * from orders_ddb_2012_01;
  103. Live data in DynamoDB:
      SELECT customer_id, sum(total) spend, count(*) order_count
      FROM orders_ddb_2012_01
      WHERE order_date >= unix_timestamp('2012-01-01', 'yyyy-MM-dd')
      AND order_date < unix_timestamp('2012-01-08', 'yyyy-MM-dd')
      GROUP BY customer_id
      ORDER BY spend desc
      LIMIT 5;
  104. Live and archive data
  105. Agenda: Getting to know DynamoDB. Guided tour of service highlights. Provisioned throughput. Data model. DynamoDB in practice. Analytics with Elastic MapReduce. Slides available shortly.
  106. DynamoDB free tier: 5 writes/second, 10 consistent reads/second, 100 MB storage.
  107. Developer Guide: aws.amazon.com/documentation/dynamodb
  108. Drop us a line! aws.amazon.com/contact-us
  109. Thank you!
  110. Q&A. matthew@amazon.com
  111. SimpleDB: zero maintenance, NoSQL datastore.
  112. SimpleDB: zero maintenance, NoSQL datastore. Flexible queries. 10 GB / 1 billion attributes per table. No native data sharding.
