Webinar | Introduction to Amazon DynamoDB
 

This webinar discusses Amazon DynamoDB, a NoSQL, highly scalable, SSD-based, zero administration database service in the AWS Cloud. We explain how DynamoDB works and also walk through some best practices and tips to get the most out of the service.

Webinar | Introduction to Amazon DynamoDB Presentation Transcript

  • 1. Introducing DynamoDB. 20th March, 2012. Dr. Matt Wood - matthew@amazon.com
  • 2. Storage, Tools & Support, Compute, Databases
  • 3. Storage, Databases, Tools & Support, Compute
  • 4. Databases: relational databases, “NoSQL” databases
  • 5. Any database on Amazon EC2: MySQL, DB2, Oracle, PostgreSQL...
  • 6. Relational Database Service: managed MySQL and Oracle databases
  • 7. Rapid provisioning. High availability. Scalable storage. Scalable compute. Relational Database Service: managed MySQL and Oracle databases
  • 8. High performance databases: increase throughput, increase availability, reduce latency
  • 9. High performance databases. Increase throughput: read replicas, push-button scaling, ElastiCache. Increase availability. Reduce latency.
  • 10. High performance databases. Increase throughput. Increase availability: Multi-AZ. Reduce latency.
  • 11. High performance databases. Increase throughput. Increase availability. Reduce latency: ElastiCache.
  • 12. Rich query semantics: joins, transactions, query optimisation
  • 13. Problem: complexity. Performance decreases at scale.
  • 14. Performance vs. scale: predictable, consistent
  • 15. Performance vs. scale: predictable, consistent; degraded performance with scale
  • 17. = more problems
  • 18. Data caching, provisioning, data sharding, cluster management, fault management = more problems
  • 19. Undifferentiated heavy lifting
  • 20. DynamoDB
  • 21. Fully managed NoSQL database service
  • 22. Offload admin and operational burden
  • 23. Extremely fast performance
  • 24. Seamless scalability
  • 25. Focus on your stuff
  • 26. AGENDA: Getting to know DynamoDB. Guided tour of service highlights. Provisioned throughput. Data model. DynamoDB in practice. Analytics with Elastic MapReduce.
  • 27. HIGHLIGHTS: low latency, flexible, large scale, durable storage, seamless scaling, zero admin, predictable performance
  • 28. HIGHLIGHTS: Low latency. SSD backed. Single digit millisecond: < 5 ms reads, < 10 ms writes.
  • 29. HIGHLIGHTS: Massive scale. No table size limits. Unlimited storage.
  • 30. HIGHLIGHTS: Seamless scale. Live repartitioning. Zero admin.
  • 31. HIGHLIGHTS: Flexible data model. Key/attribute store for evolving models.
  • 32. HIGHLIGHTS: Predictable performance. Provisioned throughput.
  • 33. HIGHLIGHTS: Durable and available. Consistent, disk-only writes.
  • 34. HIGHLIGHTS: Zero administration
  • 35. What is provisioned throughput?
  • 36. Reserve required IOPS. Per table. Set at creation. Scale via API.
  • 37. Scale at any time. No downtime.
  • 38. Pay for throughput
  • 39. Per 1 KB item: $0.01 per hour for every 10 writes/second; $0.01 per hour for every 50 strongly consistent reads/second
  • 40. Per 1 KB item: $0.28 per million writes; $0.056 per million strongly consistent reads
  • 41. Pay for storage: $1.00 per GB per month of indexed storage
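The per-million figures on slide 40 follow from the hourly rates on slide 39, assuming the provisioned throughput is fully used for the whole hour. A minimal sketch of that arithmetic, using the 2012 prices quoted on the slides; the function names are illustrative and not part of any AWS tool.

```python
SECONDS_PER_HOUR = 3600


def cost_per_million(price_per_hour: float, ops_per_second: int) -> float:
    """Cost of one million operations when `price_per_hour` buys
    `ops_per_second` of sustained, fully used throughput."""
    ops_per_hour = ops_per_second * SECONDS_PER_HOUR
    return price_per_hour / ops_per_hour * 1_000_000


def monthly_storage_cost(gigabytes: float, price_per_gb_month: float = 1.00) -> float:
    """Storage charge for one month of indexed storage."""
    return gigabytes * price_per_gb_month


# $0.01 per hour for every 10 writes/second (slide 39)
write_cost = cost_per_million(0.01, 10)
# $0.01 per hour for every 50 strongly consistent reads/second (slide 39)
read_cost = cost_per_million(0.01, 50)

print(f"writes: ${write_cost:.2f} per million")   # writes: $0.28 per million
print(f"reads:  ${read_cost:.3f} per million")    # reads:  $0.056 per million
```

Unused provisioned capacity is still paid for, so the effective cost per operation rises when a table is over-provisioned.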
  • 42. Data model: flexible, schema-less.
  • 43. Simple key/value pairs: title => “Introduction to DynamoDB”, date => “20120320”
  • 44. Associative array, or hash: [ title => “Introduction to DynamoDB”, date => “20120320” ]
  • 45. Attributes: [ title => “Introduction to DynamoDB”, date => “20120320” ]
  • 46. Attributes: [ title => “Disaster Recovery with AWS”, date => “20120320”, format => “webinar”, presenter => “Jeff Barr” ] and [ title => “Introduction to DynamoDB”, date => “20120320” ]
  • 47. Items: [ title => “Disaster Recovery with AWS”, date => “20120320”, format => “webinar”, presenter => “Jeff Barr” ] and [ title => “Introduction to DynamoDB”, date => “20120320” ]
  • 48. Table: [ title => “Disaster Recovery with AWS”, date => “20120328”, format => “webinar”, presenter => “Jeff Barr” ] and [ title => “Introduction to DynamoDB”, date => “20120320” ]
  • 49. Table
  • 50. Item: “ImageID” = “1”, “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  • 51. Three items:
    “ImageID” = “1”, “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
    “ImageID” = “2”, “Date” = “20100916”, “Title” = “ferrari”, “Tags” = “car”, “italian”
    “ImageID” = “3”, “Date” = “20100917”, “Title” = “coffee”, “Tags” = “drink”, “delicious”
  • 52. “ImageID” = “1” (primary or hash key), “Date” = “20100915”, “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  • 53. “ImageID” = “1” (primary or hash key), “Date” = “20100915” (composite or range key), “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white”
  • 54. “ImageID” = “1” (primary or hash key), “Date” = “20100915” (composite or range key), “Title” = “flower”, “Tags” = “flower”, “jasmine”, “white” (sets of strings or numbers)
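The item on slides 50 to 54 can be pictured as a plain dictionary: the hash key and range key together identify the item, and every other attribute is optional. The sketch below uses ordinary Python data structures rather than a DynamoDB client, and the `primary_key` helper is hypothetical.

```python
from typing import Any, Dict, Tuple

# One item from the image table: a flat set of attributes.
item: Dict[str, Any] = {
    "ImageID": "1",                           # hash key
    "Date": "20100915",                       # range key
    "Title": "flower",
    "Tags": {"flower", "jasmine", "white"},   # a set of strings
}


def primary_key(item: Dict[str, Any], hash_key: str, range_key: str) -> Tuple[Any, Any]:
    """Return the (hash, range) pair that identifies an item."""
    return (item[hash_key], item[range_key])


# A table modelled as a mapping from primary key to item.
table = {primary_key(item, "ImageID", "Date"): item}

# Items in the same table need not share attributes beyond the key.
other = {"ImageID": "2", "Date": "20100916", "Title": "ferrari"}
table[primary_key(other, "ImageID", "Date")] = other
```

Because only the key attributes are fixed, the second item can omit `Tags` without any schema change.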
  • 55. Best practice: well balanced, fine grained hash keys. Customer, order, item, etc. rather than store_id.
  • 56. Simple API: only 12 operations.
  • 57. Consistency: writes are always consistent. Reads are consistent or eventually consistent.
  • 58. Durability: writes occur to disk, not memory. Writes are acknowledged once they have been made in two physical data centres.
  • 59. Availability: region specific (not AZ). Continuously replicated across multiple AZs.
  • 60. Let’s take a look! Building a simple DynamoDB powered web application
  • 61. NP-Complete.me: book reviews for programmers. Threaded discussions. Page view counts. Tagging.
  • 62. np-complete.me
  • 64. np-complete.me/asin
  • 65. np-complete.me/discuss
  • 67. Book > Thread, Thread, Thread > Reply, Reply
  • 68. Book table (book metadata). Hash key: asin => 0980576830
  • 69. Book table (book metadata). asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”
  • 70. Book table (book metadata, page views). asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”, views => 145
  • 71. Book table (book metadata, page views, book tags). asin => 0980576830, title => “Host Your Website on the Cloud”, pages => “364”, list-price => “£31.49”, views => 145, tags => [“php”, “aws”]
  • 72. Thread table (conversation thread). Hash key: asin => 0980576830. Range key: subject => “Very informative”
  • 73. Thread table (conversation thread). Hash key: asin => 0980576830. Range key: subject => “Very informative”. content => “This is a first class book...”, name => “Matt Wood”
  • 74. Reply table (conversation replies). Hash key: id => 0980576830:very-informative. Range key: datetime => “20120320”
  • 75. Reply table (conversation replies). Hash key: id => 0980576830:very-informative. Range key: datetime => “20120320”. reply => “I agree!”, name => “Werner Vogels”
  • 76. DynamoDB tables: Books (asin), Threads (asin, subject), Replies (id, datetime)
  • 78. Logical model: Book (asin) > Thread (asin, subject), Thread (asin, subject), Thread (asin, subject) > Reply (id, datetime), Reply (id, datetime)
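The Reply table's hash key on slide 74, `0980576830:very-informative`, looks like the book's ASIN joined to a slug of the thread subject. A small sketch of one way to build such an id; the slug rule is an assumption, since the slides show only the resulting value, and both function names are hypothetical.

```python
import re


def thread_slug(subject: str) -> str:
    """Lower-case the subject and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", subject.lower()))


def reply_id(asin: str, subject: str) -> str:
    """Combine a thread's hash key and range key into one Reply hash key."""
    return f"{asin}:{thread_slug(subject)}"


print(reply_id("0980576830", "Very informative"))  # 0980576830:very-informative
```

Folding the parent's two key parts into one string lets all replies to a thread share a hash key, with `datetime` as the range key ordering them.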
  • 79. Conditional writes. Timeline for Client #1, DynamoDB and Client #2. DynamoDB holds asin => 1934356, pages => 384.
  • 80. Conditional writes. Client #1 reads asin => 1934356, pages => 384 and writes pages => 502. DynamoDB moves from pages => 384 to pages => 502. Client #2 has read asin => 1934356, pages => 384.
  • 81. Conditional writes. Client #1 reads asin => 1934356, pages => 384 and writes pages => 502. DynamoDB moves from pages => 384 to pages => 502. Client #2 has read pages => 384 and now writes pages => 450: ?
  • 82. Conditional writes. Client #1 reads asin => 1934356, pages => 384 and writes pages => 502. DynamoDB moves from pages => 384 to pages => 502. Client #2 has read pages => 384 and its write of pages => 450 is rejected: failed condition.
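The timeline on slides 79 to 82 can be replayed with a small in-memory model: each client remembers the value it read, and a write succeeds only if the stored value still matches. This is a sketch of the idea only; the `Table` class and `update_if` method are hypothetical and do not mirror the DynamoDB API.

```python
from typing import Any, Dict


class ConditionFailed(Exception):
    """Raised when the stored value no longer matches the expected value."""


class Table:
    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, attrs: Dict[str, Any]) -> None:
        self._items[key] = dict(attrs)

    def get(self, key: str) -> Dict[str, Any]:
        return dict(self._items[key])

    def update_if(self, key: str, name: str, expected: Any, new: Any) -> None:
        """Set `name` to `new` only if it currently equals `expected`."""
        if self._items[key].get(name) != expected:
            raise ConditionFailed(f"{name} is no longer {expected!r}")
        self._items[key][name] = new


books = Table()
books.put("1934356", {"pages": 384})

# Both clients read the same starting value.
seen_by_client_1 = books.get("1934356")["pages"]
seen_by_client_2 = books.get("1934356")["pages"]

# Client #1 writes first; its expectation (384) still holds.
books.update_if("1934356", "pages", seen_by_client_1, 502)

# Client #2 still expects 384, but the stored value is now 502.
try:
    books.update_if("1934356", "pages", seen_by_client_2, 450)
    outcome = "written"
except ConditionFailed:
    outcome = "failed condition"

print(outcome, books.get("1934356")["pages"])  # failed condition 502
```

On a failed condition the usual response is for the client to re-read the item and decide whether its update still makes sense.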
  • 83. Atomic increment/decrement. Before: asin => 0980576830, views => 145. Call: tables['books'].items['0980576830'].attributes.add(:views => 1). After: asin => 0980576830, views => 146.
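An atomic add matters because a read followed by a separate write can lose updates when many page views arrive at once. The sketch below models the counter in memory, with a lock standing in for the atomicity the service provides; it illustrates the behaviour, not how DynamoDB implements it, and the `Counters` class is hypothetical.

```python
import threading
from collections import defaultdict


class Counters:
    """In-memory stand-in for a table with numeric attributes."""

    def __init__(self) -> None:
        self._values = defaultdict(int)
        self._lock = threading.Lock()

    def add(self, key: str, delta: int) -> int:
        # Read, add and store under one lock so that no update is lost.
        with self._lock:
            self._values[key] += delta
            return self._values[key]


views = Counters()
views.add("0980576830", 145)            # starting value from the slide
after_one = views.add("0980576830", 1)  # 146, as on the slide

# One hundred concurrent page views, each adding 1.
threads = [
    threading.Thread(target=views.add, args=("0980576830", 1))
    for _ in range(100)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = views.add("0980576830", 0)
print(after_one, total)  # 146 246
```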
  • 84. Tagging: many to many. Book (asin, tags = [“php”, “aws”]). Query by key, retrieve tag collection. Add tags conditionally. No secondary indexes. Retrieve all books by tag.
  • 85. Tagging: many to many. Book (asin, tags = [“php”, “aws”]). Tag (tag, asin = [“1449393683”, “0596515812”]). Query by book, retrieve tag collection. Query by tag, retrieve book collection.
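With no secondary indexes, slide 85 answers "all books for a tag" by keeping a second table keyed by tag. A minimal in-memory sketch of that two-table pattern; the `TagIndex` class is hypothetical, the tag assignments are made up for illustration, and a real application would also have to handle a failure between the two writes.

```python
from collections import defaultdict
from typing import Dict, Set


class TagIndex:
    """Keeps the Book view and the Tag view in step on every write."""

    def __init__(self) -> None:
        self.tags_by_book: Dict[str, Set[str]] = defaultdict(set)  # Book (asin, tags)
        self.books_by_tag: Dict[str, Set[str]] = defaultdict(set)  # Tag (tag, asin)

    def add_tag(self, asin: str, tag: str) -> None:
        # Two writes: one per table. Sets make repeated tagging harmless.
        self.tags_by_book[asin].add(tag)
        self.books_by_tag[tag].add(asin)

    def tags_for(self, asin: str) -> Set[str]:
        """Query by book, retrieve tag collection."""
        return set(self.tags_by_book.get(asin, set()))

    def books_for(self, tag: str) -> Set[str]:
        """Query by tag, retrieve book collection."""
        return set(self.books_by_tag.get(tag, set()))


index = TagIndex()
index.add_tag("0980576830", "php")
index.add_tag("0980576830", "aws")
index.add_tag("1449393683", "aws")

print(sorted(index.tags_for("0980576830")))  # ['aws', 'php']
print(sorted(index.books_for("aws")))        # ['0980576830', '1449393683']
```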
  • 86. Autoscaling via SNS
    // Read the current provisioned throughput of the books table and double it.
    $Res = $ddb->describe_table(array('TableName' => 'books'));
    $Read = (int) $Res->body->Table->ProvisionedThroughput->ReadCapacityUnits;
    $Write = (int) $Res->body->Table->ProvisionedThroughput->WriteCapacityUnits;
    $Read *= 2;
    $Write *= 2;
    $PT = array('ReadCapacityUnits' => (string) $Read, 'WriteCapacityUnits' => (string) $Write);
    $Res = $ddb->update_table(array('TableName' => 'books', 'ProvisionedThroughput' => $PT));
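The PHP on the autoscaling slide reads the table's current throughput and asks for twice as much. The same calculation can be isolated from the service calls, as in the sketch below. The dictionary mirrors only the fields the slide's code reads, the sample values are hypothetical, and passing the result to an update call is deliberately left out.

```python
from typing import Any, Dict


def doubled_throughput(table_description: Dict[str, Any]) -> Dict[str, int]:
    """Return a ProvisionedThroughput setting with both capacities doubled."""
    current = table_description["Table"]["ProvisionedThroughput"]
    return {
        "ReadCapacityUnits": int(current["ReadCapacityUnits"]) * 2,
        "WriteCapacityUnits": int(current["WriteCapacityUnits"]) * 2,
    }


# Hypothetical describe-table response, reduced to the fields used above.
description = {
    "Table": {
        "TableName": "books",
        "ProvisionedThroughput": {"ReadCapacityUnits": 50, "WriteCapacityUnits": 10},
    }
}

new_setting = doubled_throughput(description)
print(new_setting)  # {'ReadCapacityUnits': 100, 'WriteCapacityUnits': 20}
```

Keeping the calculation separate from the API call makes it easy to test and to add a ceiling before the request is sent.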
  • 91. Considerations: limited index and query model. Throughput is provisioned in 1K operations. Maximum 64K item size. Backup and restore via Elastic MapReduce.
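Two of these considerations interact when sizing a table: items are capped at 64K, and throughput is counted in 1K operations. The sketch below estimates the capacity a workload needs, assuming each operation is rounded up to the next whole 1K. That rounding rule is an assumption, not something the slides state, and the function names are illustrative.

```python
import math

MAX_ITEM_BYTES = 64 * 1024   # maximum item size from the slide
UNIT_BYTES = 1024            # throughput is provisioned in 1K operations


def units_per_operation(item_bytes: int) -> int:
    """Capacity units consumed by one operation on an item of this size."""
    if not 0 < item_bytes <= MAX_ITEM_BYTES:
        raise ValueError(f"item size must be between 1 and {MAX_ITEM_BYTES} bytes")
    return math.ceil(item_bytes / UNIT_BYTES)


def required_capacity(item_bytes: int, ops_per_second: int) -> int:
    """Capacity units to provision for a steady rate of operations."""
    return units_per_operation(item_bytes) * ops_per_second


# A 3,500 byte item rounds up to 4 units, so 20 operations/second need 80 units.
print(required_capacity(3500, 20))  # 80
```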
  • 92. Elastic MapReduce: built for data, designed for humans.
  • 93. Collection, Computation, Collaboration
  • 94. Collection, Computation, Collaboration: DynamoDB, Elastic MapReduce, Amazon S3, Amazon EC2
  • 95. DynamoDB. Data.
  • 96. DynamoDB. Data. Code. Elastic MapReduce.
  • 97. DynamoDB. Data. Code. Elastic MapReduce. Name node.
  • 98. DynamoDB. Data. Code. Elastic MapReduce. Name node. Elastic cluster.
  • 99. DynamoDB. Data. Code. Elastic MapReduce. Name node. Elastic cluster. HDFS.
  • 100. DynamoDB. Data. Code. Elastic MapReduce. Name node. Elastic cluster. HDFS. Output. S3.
  • 101. DynamoDB. Data. Output. S3.
  • 102. Export to S3
    CREATE EXTERNAL TABLE orders_s3_new_export (
      order_id string, customer_id string, order_date int, total double )
    PARTITIONED BY (year string, month string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 's3://export_bucket';
    INSERT OVERWRITE TABLE orders_s3_new_export
    PARTITION (year='2012', month='01')
    SELECT * FROM orders_ddb_2012_01;
  • 103. Live data in DynamoDB
    SELECT customer_id, sum(total) spend, count(*) order_count
    FROM orders_ddb_2012_01
    WHERE order_date >= unix_timestamp('2012-01-01', 'yyyy-MM-dd')
    AND order_date < unix_timestamp('2012-01-08', 'yyyy-MM-dd')
    GROUP BY customer_id
    ORDER BY spend desc
    LIMIT 5;
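To make the logic of the slide 103 query concrete, the sketch below runs the same steps (filter by date window, group by customer, sum and count, order by spend, keep the top rows) over a small in-memory list. The sample orders are made up for illustration, and timestamps are taken as UTC, which is an assumption about the cluster's time zone.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Tuple


def unix_ts(day: str) -> int:
    """Seconds since the epoch for midnight UTC on a yyyy-MM-dd date."""
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def top_spenders(orders: Iterable[Dict], start: str, end: str,
                 limit: int = 5) -> List[Tuple[str, float, int]]:
    """Return (customer_id, spend, order_count) rows, highest spend first."""
    lo, hi = unix_ts(start), unix_ts(end)
    spend: Dict[str, float] = defaultdict(float)
    count: Dict[str, int] = defaultdict(int)
    for order in orders:
        if lo <= order["order_date"] < hi:            # WHERE
            spend[order["customer_id"]] += order["total"]   # GROUP BY + sum
            count[order["customer_id"]] += 1                # count(*)
    ranked = sorted(spend, key=lambda c: spend[c], reverse=True)[:limit]
    return [(c, spend[c], count[c]) for c in ranked]


# Made-up sample rows with the same columns as the orders table.
orders = [
    {"customer_id": "c1", "order_date": unix_ts("2012-01-02"), "total": 20.0},
    {"customer_id": "c2", "order_date": unix_ts("2012-01-03"), "total": 50.0},
    {"customer_id": "c1", "order_date": unix_ts("2012-01-05"), "total": 15.0},
    {"customer_id": "c3", "order_date": unix_ts("2012-01-09"), "total": 99.0},
]

result = top_spenders(orders, "2012-01-01", "2012-01-08")
print(result)  # [('c2', 50.0, 1), ('c1', 35.0, 2)]
```

The order from customer c3 falls outside the one-week window, so it does not appear in the result.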
  • 104. Live and archive data
  • 105. AGENDA: Getting to know DynamoDB. Guided tour of service highlights. Provisioned throughput. Data model. DynamoDB in practice. Analytics with Elastic MapReduce. Slides available shortly.
  • 106. DynamoDB free tier: 5 writes/second, 10 consistent reads/second, 100 MB storage
  • 107. Developer Guide: aws.amazon.com/documentation/dynamodb
  • 108. Drop us a line! aws.amazon.com/contact-us
  • 109. Thank you!
  • 110. Q&A: matthew@amazon.com
  • 111. SimpleDB: zero maintenance, NoSQL datastore
  • 112. SimpleDB: zero maintenance, NoSQL datastore. Flexible queries. 10 GB / 1 billion attributes per table. No native data sharding.