
Gluecon 2012 - DynamoDB



Wrestle your NoSQL Data with Amazon DynamoDB

Published in: Technology


  1. 1. ¡Ay, caramba! Wrestle Your NoSQL Data with DynamoDB. Jeff Douglas, @jeffdonthemic, CloudSpokes Community Architect
  2. 2. Rambling Talk Roadmap. Short NoSQL overview (thanks Max @ 10gen!). Why NoSQL databases are like Mexican wrestlers. Amazon DynamoDB in depth. Amazon DynamoDB demo and code. CloudSpokes challenge submissions for “Build an #Awesome Demo with Amazon DynamoDB”.
  3. 3. Times they are a-changin’. Cloud applications and APIs need to be fast, flexible and scalable. RDBMSs typically do not scale well for certain data-intensive applications. NoSQL is cloud friendly. “NoSQL is a rebellion against the DBAs who prevent us from doing shit.” - James Governor, Gluecon 2012
  4. 4. Why is NoSQL #awesome? Developed to manage large volumes of data that do not necessarily follow a fixed schema. Great for heavy read/write workloads. Simple to set up, configure and administer. Distributed, fault-tolerant architecture. Scale out, not up. Specialized database for the right task.
  5. 5. Key NoSQL differences. Do not use SQL as a query language. Dynamic & schema-less. Non-relational, no JOIN operations. No complex transactions. May not give full ACID guarantees; eventually consistent instead. Performance and real-time nature are more important than consistency.
  6. 6. NoSQL databases are “different”
  7. 7. NoSQL database types. Document store (MongoDB, CouchDB): a document-oriented database that stores, retrieves, and manages semi-structured data including XML, YAML, JSON and binary (PDF, DOC). Key-value store (Cassandra, Redis): stores schema-less data referenced by a simple key. Graph database (Neo4j, FlockDB): stores the relationships in data as a graph (social relations, network topologies).
  8. 8. How to choose? With all of the different NoSQL database types, how do you choose the “best” one?
  9. 9. El Toro Más Macho: MongoDB. Stores structured data as JSON-like documents. Ad hoc queries, indexing, master-slave replication, sharding, server-side JavaScript execution. All the “cool kids” are using it. Node.js + MongoDB = WINNING!
  10. 10. Muy Guapo: Couchbase. JSON document store. Embedded CouchDB with caching, clustering and high-performance storage management components. JavaScript as its query language and HTTP for an API. Serve HTML and JavaScript-based “CouchApps”.
  11. 11. El Matador Misterio: Redis. What exactly is Redis? MAGIC! By definition, it’s an in-memory, key-value data store with optional durability. Data model includes lists of strings, sets of strings, sorted sets of strings & hashes. Awesome at doing set comparisons.
  12. 12. Comando Loco: Apache Hadoop. Fast, reliable analysis of both structured data and complex data. Derived from Google’s MapReduce and Google File System (GFS) papers. Yahoo is one of the main contributors. Reliable data storage using the Hadoop Distributed File System (HDFS) and high-performance parallel data processing using MapReduce.
  13. 13. El Jefe Supremo: Apache Cassandra. Massively scalable key-value store initially developed by Facebook. BigTable data model (nested hashes) running on an Amazon Dynamo-like infrastructure. Has some RDBMS “feel” with column families that make it a hybrid column/row store. No single point of failure, fault-tolerant multi-data-center replication, MapReduce support. CQL (Cassandra Query Language).
  14. 14. Introducing...
  15. 15. La Amazon DynamoDB
  16. 16. ¡Hola DynamoDB! Amazon DynamoDB is a fast, fully managed key-value database service that scales seamlessly with extremely low latency and predictable performance. Store and retrieve any amount of data. Serve any level of request traffic. Hands-off administration. Pay for throughput and not storage.
  17. 17. ¡No! administración. No hardware or software provisioning, setup and configuration, software patching, or partitioning data over multiple instances and regions. Specify the request throughput for your table and, in the background, Amazon handles the provisioning of resources to meet the requested throughput rate. Automatically partitions/re-partitions data and provisions additional server capacity based upon table size & throughput. Synchronously replicates data across multiple facilities in an AWS Region, giving you high availability and data durability.
  18. 18. Muy rápido. Consistent, predictable performance. Runs on a new solid state disk (SSD) architecture for low-latency response times. Read latencies average less than 5 milliseconds, and write latencies average less than 10 milliseconds.
  19. 19. Muy Escalable. No table size limits (adiós SimpleDB?). No downtime when scaling up or down. Unlimited storage. Automatically scale machine resources in response to increases in database traffic without the need for client-side partitioning.
  20. 20. Modelo de datos flexible. Flexible data model with familiar tables, items and key-value pairs. Schema-less document storage. Each item can have different attributes. Easy to create and modify documents. Simple API. No cross-table joins. Use composite keys to model relationships.
  21. 21. Duradero. Consistent, disk-only writes. Atomic increment/decrement (with a single API call). Optimistic concurrency control (aka conditional writes & updates). Item-level transactions (even in bulk). Automatic and synchronous replication across data centers and availability zones.
  22. 22. Costos? Pay for throughput and not storage. Priced per hour of provisioned read/write throughput. Scales up and down well, with a free tier.
  23. 23. Write throughput. Write throughput unit = size of item × writes/second. $0.01 per hour for 10 write units.
  24. 24. Read throughput. Strongly consistent reads (mucho dinero). Eventually consistent reads. See Amazon’s site for read throughput pricing!
  25. 25. Other features. Integrates with Amazon Elastic MapReduce and Hadoop. Libraries, mappers and mocks for Django, Erlang, Java, .NET, Node.js, Perl, PHP, Python & Ruby. Session-based authentication using Amazon Security Token Service. Monitoring via CloudWatch.
  26. 26. DynamoDB Semantics. Tables, items & attributes. Items are indexed by primary key (single hash keys and composite keys). Items are a collection of attributes, and each attribute has a key and a value. Unlimited number of attributes, up to 64 KB total.
  27. 27. Simple API calls: CreateTable, UpdateTable, DeleteTable, DescribeTable, ListTables, PutItem, GetItem, UpdateItem, DeleteItem, Query, Scan, BatchGetItem, BatchWriteItem.
  28. 28. Kiva loan browser
  29. 29. CRUD items
  30. 30. Connect to DynamoDB
  31. 31. New Loan
  32. 32. Show Loan
  33. 33. All/Filter Loans
  34. 34. CloudSpokes Challenge
  35. 35. Flickr on DynamoDB. Wcheung (Canada) submitted a Grails application that caches Flickr photos in Amazon DynamoDB. You can then search for cached feed entries by primary key (author + published date/time range) or by table scan. You can also “like” a photo, which increments the atomic “like” counter for the item in DynamoDB.
  36. 36. Posterity. Mbleigh (US) submitted a simple, barebones Twitter-esque service created in Ruby using Sinatra. It is far from complete but uses a number of DynamoDB’s key features, including Hash/Range Keys and Atomic Set Push Operations.
  37. 37. DynamoDB Task Manager. Darthdeus (Czech Republic) wrote his app in Ruby using Sinatra. It uses a custom ORM he wrote called DynamoRecord to access DynamoDB. His main idea was to bring at least some of the ActiveRecord-ish API to DynamoDB using some basic metaprogramming.
  38. 38. Simple Survey. Peakpado (US) created an application using Ruby on Rails. For each table he created a sophisticated hash/range key model class, which resulted in an API very similar to ActiveRecord for DynamoDB.
  39. 39. Data Sets for Mumbai. Romin (India) developed an API that exposes data sets of Mumbai city in JSON format. The solution uses Amazon DynamoDB for storing the data and a Node.js application that exposes the REST interface and talks to Amazon DynamoDB via a backend Java application.
  40. 40. Thanks! Jeff Douglas, CloudSpokes Community
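
The throughput slides (23 and 24) define a write unit as item size times writes per second and quote $0.01 per hour for 10 write units. The arithmetic can be sketched in Python roughly as follows. This is only an illustration: the function names are hypothetical, rounding item size up to whole 1 KB blocks is an assumption about how the service metered capacity at the time, and so is the rule that an eventually consistent read consumes half of a strongly consistent one. Read pricing is left out because the slide defers to Amazon's site.

```python
import math

# Price taken from slide 23: $0.01 per hour buys 10 write units.
WRITE_PRICE_PER_10_UNITS = 0.01


def capacity_units(item_size_bytes, ops_per_second, block_bytes=1024):
    """Units = item size (in whole blocks) x operations per second.

    ASSUMPTION: item size is rounded up to whole 1 KB blocks.
    """
    blocks = max(1, math.ceil(item_size_bytes / block_bytes))
    return blocks * ops_per_second


def read_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Read units needed for a given read rate."""
    units = capacity_units(item_size_bytes, reads_per_second)
    if strongly_consistent:
        return units
    # ASSUMPTION: an eventually consistent read consumes half as much.
    return math.ceil(units / 2)


def hourly_write_cost(write_units):
    """Hourly cost in dollars for the provisioned write units."""
    return write_units / 10 * WRITE_PRICE_PER_10_UNITS


if __name__ == "__main__":
    units = capacity_units(item_size_bytes=1500, ops_per_second=20)
    print("write units:", units)
    print("hourly write cost in dollars:", hourly_write_cost(units))
```

Under these assumptions, a 1,500-byte item written 20 times per second occupies two blocks per write and therefore needs 40 write units.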
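
Slide 21 mentions atomic increment/decrement and optimistic concurrency control through conditional writes, and slide 35 relies on an atomic “like” counter. The idea can be illustrated with a small in-memory model. This is not the DynamoDB API: `ToyTable`, `ConditionFailed`, and the `version` attribute are hypothetical names chosen for the sketch, and the "atomicity" here only holds because the toy is single-threaded.

```python
class ConditionFailed(Exception):
    """Raised when a conditional write finds an unexpected stored value."""


class ToyTable:
    """In-memory stand-in for a table (hypothetical, not the real service)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        # Return a copy so callers cannot mutate stored state directly.
        return dict(self._items.get(key, {}))

    def put(self, key, item, expected=None):
        """Write an item; if `expected` is given, check it against the store first."""
        current = self._items.get(key, {})
        if expected is not None:
            for attr, value in expected.items():
                if current.get(attr) != value:
                    raise ConditionFailed(attr)
        self._items[key] = dict(item)

    def add(self, key, attr, delta=1):
        """Counter update in one call, with no client-side read-modify-write."""
        item = self._items.setdefault(key, {})
        item[attr] = item.get(attr, 0) + delta
        return item[attr]


if __name__ == "__main__":
    table = ToyTable()
    table.put("loan-1", {"status": "new", "version": 1})

    # Two writers both read version 1; only the first conditional write wins.
    table.put("loan-1", {"status": "funded", "version": 2}, expected={"version": 1})
    try:
        table.put("loan-1", {"status": "expired", "version": 2}, expected={"version": 1})
    except ConditionFailed:
        print("second writer must re-read and retry")

    table.add("photo-1", "likes")
    print(table.get("photo-1"))
```

The design point is that the writer states what it believes the stored value to be, and the store rejects the write if that belief is stale, so no lock is held between the read and the write.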
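
Slide 26 says items are indexed by a single hash key or a composite key, and slide 35 searches by author plus a published date/time range. A minimal in-memory sketch of that lookup pattern might look like this. The class and method names are hypothetical and only echo the API call names listed on slide 27; the real service works over the network and behaves differently in many details.

```python
from collections import defaultdict


class HashRangeTable:
    """Toy table keyed by (hash key, range key); hypothetical, for illustration."""

    def __init__(self):
        # hash key -> {range key: attributes}
        self._partitions = defaultdict(dict)

    def put_item(self, hash_key, range_key, attributes):
        self._partitions[hash_key][range_key] = dict(attributes)

    def get_item(self, hash_key, range_key):
        # Exact lookup needs both parts of the composite key.
        return self._partitions.get(hash_key, {}).get(range_key)

    def query(self, hash_key, range_from=None, range_to=None):
        """Return (range key, attributes) pairs for one hash key, ordered by range key."""
        rows = self._partitions.get(hash_key, {})
        result = []
        for range_key in sorted(rows):
            if range_from is not None and range_key < range_from:
                continue
            if range_to is not None and range_key > range_to:
                continue
            result.append((range_key, rows[range_key]))
        return result


if __name__ == "__main__":
    photos = HashRangeTable()
    # ISO 8601 timestamps sort correctly as plain strings.
    photos.put_item("alice", "2012-05-01T10:00:00", {"title": "ring"})
    photos.put_item("alice", "2012-05-20T09:30:00", {"title": "mask"})
    photos.put_item("bob", "2012-05-10T12:00:00", {"title": "cape"})

    for published, attrs in photos.query("alice", "2012-05-15T00:00:00"):
        print(published, attrs["title"])
```

A query touches only the items that share one hash key, which is why the slide suggests composite keys for modeling relationships instead of cross-table joins.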