DynamoDB Gluecon 2012

  1. 1. ¡Ay, caramba! Wrestle Your NoSQL Data with DynamoDB Jeff Douglas @jeffdonthemic CloudSpokes Community Architect
  2. 2. Rambling Talk Roadmap Short NoSQL overview (thanks Max @ 10gen!) Why NoSQL databases are like Mexican Wrestlers Amazon DynamoDB in depth Amazon DynamoDB demo and code CloudSpokes challenge submissions for “Build an #Awesome Demo with Amazon DynamoDB”
  3. 3. Times they are a-changin’ Cloud applications and APIs need to be fast, flexible and scalable. RDBMSs typically do not scale well for certain data-intensive applications. NoSQL is cloud friendly. “NoSQL is a rebellion against the DBAs who prevent us from doing shit.” - James Governor, Gluecon 2012
  4. 4. Why is NoSQL #awesome? Developed to manage large volumes of data that do not necessarily follow a fixed schema Great for heavy read/write workloads Simple to set up, configure and administer Distributed, fault tolerant architecture Scale out not up Specialized database for the right task
  5. 5. Key NoSQL differences Do not use SQL as a query language Dynamic & schema-less Non-relational, no JOIN operations No complex transactions May not give full ACID guarantees; eventually consistent instead. Performance and real-time behavior are more important than consistency.
  6. 6. NoSQL databases are “different”
  7. 7. NoSQL database types Document store (MongoDB, CouchDB) A document-oriented database that stores, retrieves, and manages semi-structured data including XML, YAML, JSON and binary (PDF, DOC) Key-value store (Cassandra, Redis) Stores schema-less data referenced by a simple key value Graph database (Neo4j, FlockDB) Stores the relationship of data as a graph (social relations, network topologies)
  8. 8. How to choose? With all of the different NoSQL database types, how do you choose the “best” one?
  9. 9. El Toro Más Macho MongoDB Stores structured data as JSON-like documents. Ad hoc queries, indexing, master-slave replication, sharding, server-side JavaScript execution All the “cool kids” are using it. Node.js + MongoDB = WINNING!
  10. 10. Muy Guapo Couchbase JSON Document store Embedded CouchDB with caching, clustering and high-performance storage management components. JavaScript as its query language and HTTP for an API Serve HTML and JavaScript-based “CouchApps”
  11. 11. El Matador Misterio Redis What exactly is Redis? MAGIC! By definition, it’s an in-memory, key-value data store with optional durability. Data model includes lists of strings, sets of strings, sorted sets of strings & hashes. Awesome at doing set comparisons.
  12. 12. Comando Loco Apache Hadoop Fast, reliable analysis of both structured data and complex data. Derived from Google's MapReduce and File System (GFS) papers. Yahoo is one of the main contributors. Reliable data storage using the Hadoop Distributed File System (HDFS) and high-performance parallel data processing using MapReduce.
  13. 13. El Jefe Supremo Apache Cassandra Massively scalable key-value store initially developed by Facebook. BigTable data model (nested hashes) running on an Amazon Dynamo-like infrastructure. Has some RDBMS “feel” with column families that make it a hybrid column/row store. No single point of failure, fault-tolerant multi data center replication, MapReduce support. CQL (Cassandra Query Language)
  14. 14. Introducing...
  15. 15. La Amazon DynamoDB
  16. 16. ¡Hola DynamoDB! Amazon DynamoDB is a fast, fully managed key-value database service that scales seamlessly with extremely low latency and predictable performance. Store and retrieve any amount of data Serve any level of request traffic Hands off administration Pay for throughput and not storage
  17. 17. ¡No! administración No hardware or software provisioning, setup and configuration, software patching, or partitioning data over multiple instances and regions. Specify the request throughput for your table and in the background, Amazon handles the provisioning of resources to meet the requested throughput rate. Automatically partitions/re-partitions data and provisions additional server capacity based upon table size & throughput. Synchronously replicates data across multiple facilities in an AWS Region giving you high availability and data durability.
  18. 18. Muy rápido Consistent, predictable performance Runs on a new solid state disk (SSD) architecture for low-latency response times. Read latencies average less than 5 milliseconds, and write latencies average less than 10 milliseconds.
  19. 19. Muy Escalable No table size limits (adiós SimpleDB?) No downtime when scaling up or down Unlimited storage Automatically scales machine resources in response to increases in database traffic without the need for client-side partitioning.
  20. 20. Modelo de datos flexible Flexible data model with familiar tables, items and key-value pairs. Schema-less document storage. Each item can have different attributes. Easy to create and modify documents. Simple API. No cross-table joins. Use composite keys to model relationships.
  21. 21. Duradero Consistent, disk-only writes Atomic increment/decrement (w/single API call) Optimistic concurrency control (aka conditional writes & updates) Item level transactions (even in bulk) Automatic and synchronous replication across data centers and availability zones.
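The atomic counter and conditional-write ideas on the "Duradero" slide can be sketched in plain Python. This is an in-memory illustration only, not the DynamoDB API: `TinyTable`, `update_if`, `add` and `ConditionalCheckFailed` are hypothetical names, and a lock stands in for whatever the service does internally.

```python
import threading


class ConditionalCheckFailed(Exception):
    """Raised when the stored value does not match the expected one."""


class TinyTable:
    """Hypothetical in-memory table used only to illustrate the semantics."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put(self, key, item):
        with self._lock:
            self._items[key] = dict(item)

    def get(self, key):
        with self._lock:
            return dict(self._items[key])

    def update_if(self, key, attr, expected, new_value):
        """Conditional write: set attr only if it still holds `expected`."""
        with self._lock:
            item = self._items[key]
            if item.get(attr) != expected:
                raise ConditionalCheckFailed(
                    f"{attr} is {item.get(attr)!r}, expected {expected!r}"
                )
            item[attr] = new_value

    def add(self, key, attr, delta):
        """Atomic increment/decrement in one call; returns the new value."""
        with self._lock:
            item = self._items[key]
            item[attr] = item.get(attr, 0) + delta
            return item[attr]


table = TinyTable()
table.put("photo-1", {"likes": 0, "version": 1})
table.add("photo-1", "likes", 1)             # no read-modify-write race
table.update_if("photo-1", "version", 1, 2)  # succeeds: version was still 1
```

A second writer that still believes `version` is 1 would get `ConditionalCheckFailed` and would need to re-read the item before retrying, which is the essence of optimistic concurrency control.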
  22. 22. Costos? Pay for throughput and not storage. Priced per hour of provisioned read/write throughput Scales up and down well with a free tier
  23. 23. Write throughput Write throughput Unit = size of item x writes/second $0.01 per hour for 10 write units
  24. 24. Read throughput Strongly consistent reads (mucho dinero) Eventually consistent reads See Amazon’s site for read throughput pricing!
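The throughput arithmetic from the two pricing slides can be worked through in a few lines. This is a rough sketch under stated assumptions: item sizes are rounded up to a 1 KB unit (`UNIT_SIZE_KB` is an assumed parameter), an eventually consistent read is assumed to consume half the capacity of a strongly consistent one, and the cost is treated as linear in the $0.01 per hour per 10 write units quoted on the slide. Check Amazon's site for actual unit sizes and prices.

```python
import math

UNIT_SIZE_KB = 1                   # assumption: one unit covers up to 1 KB
WRITE_PRICE_PER_10_UNITS = 0.01    # from the slide: $0.01/hour for 10 write units


def write_units(item_size_kb, writes_per_second):
    """Write units = size of item (rounded up to whole units) x writes/second."""
    return math.ceil(item_size_kb / UNIT_SIZE_KB) * writes_per_second


def read_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """Read units; eventually consistent reads are assumed to cost half."""
    units = math.ceil(item_size_kb / UNIT_SIZE_KB) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)


def hourly_write_cost(units):
    """Linear approximation of the hourly price for provisioned write units."""
    return units / 10 * WRITE_PRICE_PER_10_UNITS


units = write_units(item_size_kb=1.5, writes_per_second=100)
print(units, hourly_write_cost(units))
print(read_units(1.5, 100), read_units(1.5, 100, strongly_consistent=False))
```

The halved figure for eventually consistent reads is why the slide tags strongly consistent reads as "mucho dinero".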
  25. 25. Other features Integrates with Amazon Elastic MapReduce and Hadoop. Libraries, mappers and mocks for Django, Erlang, Java, .NET, Node.js, Perl, PHP, Python & Ruby. Session based authentication using Amazon Security Token Service Monitoring via CloudWatch
  26. 26. DynamoDB Semantics Tables, items & attributes Items are indexed by primary key (single hash and composite keys) Items are a collection of attributes and attributes have a key and value. Unlimited number of attributes up to 64 KB total.
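To make the tables / items / attributes vocabulary concrete, here is a small in-memory model of a table with a composite (hash + range) primary key. It is a sketch of the data model only, not of DynamoDB itself; `CompositeKeyTable` and its methods are hypothetical names, and the sample photo data is invented for illustration.

```python
from bisect import insort
from collections import defaultdict


class CompositeKeyTable:
    """Hypothetical in-memory model of a hash + range keyed table."""

    def __init__(self):
        self._ranges = defaultdict(list)  # hash key -> sorted range keys
        self._items = {}                  # (hash key, range key) -> attributes

    def put_item(self, hash_key, range_key, attributes):
        if (hash_key, range_key) not in self._items:
            insort(self._ranges[hash_key], range_key)
        self._items[(hash_key, range_key)] = dict(attributes)

    def get_item(self, hash_key, range_key):
        return self._items.get((hash_key, range_key))

    def query(self, hash_key, low, high):
        """Items for one hash key with range key in [low, high], in key order."""
        return [
            self._items[(hash_key, r)]
            for r in self._ranges.get(hash_key, [])
            if low <= r <= high
        ]


photos = CompositeKeyTable()
# Items in the same table may carry different attributes (schema-less).
photos.put_item("alice", "2012-05-01T10:00", {"title": "Sunrise"})
photos.put_item("alice", "2012-05-20T18:30", {"title": "Harbor", "likes": 3})
photos.put_item("bob", "2012-05-02T09:15", {"title": "Street", "tags": {"city"}})

early_may = photos.query("alice", "2012-05-01", "2012-05-10")
```

The hash key picks the group of items and the range key orders them within that group, which is what lets a composite key stand in for a simple one-to-many relationship without a join.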
  27. 27. Simple API calls CreateTable PutItem UpdateTable GetItem DeleteTable UpdateItem DescribeTable DeleteItem ListTables Query BatchGetItem Scan BatchWriteItem
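As an illustration of what a call such as PutItem carries, the sketch below builds a request body in the typed attribute format used by the low-level JSON API, where strings are tagged `S`, numbers `N` (sent as strings), and sets `SS` / `NS`. It only constructs the payload; signing and sending the HTTP request are left out. The helper names, the `Loans` table and its attributes are assumptions made for the example.

```python
import json


def to_attribute_value(value):
    """Wrap a Python value in a typed attribute descriptor."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool)
               for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
    raise TypeError(f"unsupported attribute value: {value!r}")


def put_item_request(table_name, item):
    """Build the body of a PutItem request for a flat dict of attributes."""
    return {
        "TableName": table_name,
        "Item": {name: to_attribute_value(v) for name, v in item.items()},
    }


# Hypothetical loan item, loosely modeled on the Kiva demo that follows.
body = put_item_request(
    "Loans",
    {"id": "42", "amount": 500, "sectors": {"Retail", "Food"}},
)
print(json.dumps(body, indent=2, sort_keys=True))
```

In practice one of the language libraries mentioned on the "Other features" slide would normally handle this wrapping, along with authentication.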
  28. 28. Kiva loan browser http://kivabrowser.elasticbeanstalk.com
  29. 29. CRUD items
  30. 30. Connect to DynamoDB
  31. 31. New Loan
  32. 32. Show Loan
  33. 33. All/Filter Loans
  34. 34. CloudSpokes Challenge
  35. 35. Flickr on DynamoDB Wcheung (Canada) submitted a Grails application that caches Flickr photos in Amazon DynamoDB. You can then search for cached feed entries by primary key (author + published date/time range) or by table scan. You can also “like” a photo, resulting in the atomic “like” counter for the item in DynamoDB getting incremented. http://screencast.com/t/MAVgm7xeqDpr
  36. 36. Posterity Mbleigh (US) submitted a simple, barebones Twitter-esque service created in Ruby using Sinatra. It is far from complete but uses a number of DynamoDB's key features including Hash/Range Keys and Atomic Set Push Operations. http://www.screencast.com/t/me8hW27MYs3x
  37. 37. DynamoDB Task Manager Darthdeus (Czech Republic) wrote his app in Ruby using Sinatra. It uses a custom ORM he wrote called DynamoRecord to access DynamoDB. His main idea was to get at least some of the ActiveRecord-ish API to DynamoDB using some basic metaprogramming http://www.youtube.com/watch?v=9tOzaDPP39I
  38. 38. Simple Survey Peakpado (US) created an application using Ruby on Rails. For each table he created a sophisticated hash/range key model class which resulted in an API very similar to ActiveRecord for DynamoDB. http://screencast.com/t/ri1XkMxGcpnS
  39. 39. Data Sets for Mumbai Romin (India) developed an API that exposes data sets of Mumbai city in JSON format. The solution uses Amazon DynamoDB for storing the data and a Node.js application that exposes the REST interface and talks to Amazon DynamoDB via a backend Java application.
  40. 40. Thanks! Jeff Douglas CloudSpokes Community Architect @jeffdonthemic jeff@cloudspokes.com http://www.cloudspokes.com http://blog.jeffdouglas.com
