DevNation Atlanta

A talk given at DevNation Atlanta, 4/3/2010


  1. NOSQL, COUCHDB AND THE CLOUD - Brad Anderson, Cloudant
  2. BRAD ANDERSON • BS Hotel Management • Restaurant Chain Data - econometric modeling, BI/DW • Open Source - trac, dsource.org, couchdb • NOSQLEast 2009 • Cloudant • http://twitter.com/boorad
  3. AGENDA • NOSQL • COUCHDB • Erlang • Cloud • Dynamo • MapReduce
  4. IF YOU • don’t have ‘medium data’ or ‘big data’ • are cool with 25K-LOC object-relational mappers • love an ops challenge • are okay paying Uncle Larry
  6. IF NOT http://www.bigfatmoneybags.com/blog/wp-content/uploads/2009/12/screwed.jpg
  7. RELATIONAL DATABASES • RDBMS, 1970-2010 • Rigid Schema / ORM fun • Scale Up • Everything is a Nail http://www.flickr.com/photos/36041246@N00/3419197777/
  8. SCALING RDBMS • Replication Sucks • master-slave • master-master • Partitioning Sucks • vertical (by functional area) • horizontal (by some key, say time) • Caching sort of works
  9. OKAY, NOT SCREWED http://www.bigfatmoneybags.com/blog/wp-content/uploads/2009/12/screwed.jpg
  10. RELATIONAL DATABASES • RDBMS, 1970- • Not Dead • Just have a ‘smell’ for certain tasks http://www.flickr.com/photos/36041246@N00/3419197777/
  11. NOSQL - NOT ONLY SQL • A moniker for a range of data storage systems that solve very different problems, all in cases where a relational database is not the right fit.
  12. RIGHT FIT • Google indexes 400 PB / day (2007) • CERN’s LHC generates 100 PB / sec • Unique data created each year (IDC, 2007): • 2007: 40 EB • 2010: 988 EB (exponential growth) • Flightcaster
  13. FOUR CATEGORIES • Key/Value Stores • Dynomite, Voldemort, Tokyo • Document Stores • CouchDB, MongoDB • Column Stores / BigTable • HBase, Hypertable, Cassandra • Graph Databases • Neo4j, AllegroGraph, VertexDB
  14.-15. BIG TAKEAWAY (diagram, built over two slides, of functions being moved out to where the data lives) • Bring the function to the data
  17. HUH? ERLANG? • Programming language created at Ericsson (20 years old now) • Designed for scalable, long-lived systems • Compiled, Functional, Dynamically Typed, Open Source
  18. 3 BIGGIES • Massively Concurrent • green threads, very lightweight != OS threads • Seamlessly Distributed • a node = one OS process running the VM; processes can live anywhere • Fault Tolerant • 99.9999999% availability = 32 ms downtime per year - AXD301
  19. Official Beta! Apache CouchDB
  20. COUCHDB • Schema-free document database server • Robust, highly concurrent, fault-tolerant • RESTful JSON API • Futon web admin console • MapReduce system for generating custom views • Bi-directional incremental replication • couchapp: lightweight HTML+JavaScript apps served directly from CouchDB, using views to transform JSON
  21. FROM INTEREST TO ADOPTION • 100+ production users • Active commercial development • 3 books being written • Rapidly maturing • Vibrant, open community
  22. OF THE WEB • “Django may be built for the Web, but CouchDB is built of the Web. I've never seen software that so completely embraces the philosophies behind HTTP ... this is what the software of the future looks like.” - Jacob Kaplan-Moss, October 17, 2007 http://jacobian.org/writing/of-the-web/
  23. DOCUMENTS • Documents are JSON objects • Underscore-prefixed fields are reserved • Documents can have binary attachments • MVCC: the _rev is deterministically generated from the doc content
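
  For illustration, a document like the ones this slide describes might look as follows; the ID, revision, field names, and attachment are all invented for this sketch (only the underscore-prefixed names are CouchDB's reserved fields):

      const doc = {
        _id: "talk-devnation-atlanta",                  // any string; CouchDB can also assign one
        _rev: "1-23202479633c2b380f79507a776743d5",     // MVCC revision, derived from the doc content
        author: "boorad",                               // ordinary application fields are plain JSON
        title: "NOSQL, CouchDB and the Cloud",
        tags: ["nosql", "couchdb", "erlang"],
        _attachments: {                                 // binary attachments hang off a reserved field
          "slides.pdf": { content_type: "application/pdf", length: 1048576, stub: true }
        }
      };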
  24. ROBUST • Never overwrite previously committed data • In the event of a server crash or power failure, just restart CouchDB -- there is no “repair” • Take snapshots with “cp” • Configurable levels of durability: can choose to fsync after every update, or less often to gain better throughput
  25. CONCURRENT • Erlang approach: lightweight processes to model the natural concurrency in a problem • For CouchDB that means one process per TCP connection • Lock-free architecture; each process works with an MVCC snapshot of a DB • Performance degrades gracefully under heavy concurrent load
  26. REST API • Create: PUT /mydb/mydocid • Retrieve: GET /mydb/mydocid • Update: PUT /mydb/mydocid • Delete: DELETE /mydb/mydocid
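
  Sketched with JavaScript's fetch API against a local CouchDB (the database name, document ID, and fields are made up; run as an ES module), the four calls look roughly like this:

      const base = "http://127.0.0.1:5984/mydb";        // assumes a local server and a database "mydb"

      // Create a document at a known ID
      await fetch(`${base}/mydocid`, {
        method: "PUT",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ author: "boorad", title: "hello couch" })
      });

      // Retrieve it; the response carries _id and the current _rev
      const doc = await (await fetch(`${base}/mydocid`)).json();

      // Update: send the whole document back with its current _rev
      doc.title = "hello again";
      await fetch(`${base}/mydocid`, { method: "PUT", body: JSON.stringify(doc) });

      // Delete: the current revision must accompany the DELETE
      await fetch(`${base}/mydocid?rev=${doc._rev}`, { method: "DELETE" });

  Note how updates and deletes carry the document's current _rev; this is where the MVCC model from the previous slide surfaces in the API.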
  28. VIEWS • Custom, persistent representations of document data • “Close to the metal” -- no dynamic queries in production, so you know exactly what you’re getting • Generated using MapReduce functions written in JavaScript (and other languages) • Each view must have a map function and may also have a reduce function • Leverages view collation, rich view query API
  29. DOCUMENTS BY AUTHOR
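
  A minimal sketch of a documents-by-author view like the one shown here; the design document name, view name, and author field are assumptions, and emit() is the function CouchDB's JavaScript view server supplies to map functions:

      const designDoc = {
        _id: "_design/talks",
        views: {
          by_author: {
            // map: one row per document, keyed by author
            map: "function(doc) { if (doc.author) { emit(doc.author, doc.title); } }",
            // reduce: the built-in _count gives the number of documents per author
            reduce: "_count"
          }
        }
      };

  PUT this design document into the database, then query it with GET /mydb/_design/talks/_view/by_author (adding ?group=true returns one reduced row per author).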
  30. INCREMENTAL • Computing a view can be expensive, so CouchDB saves the result in a B-tree and keeps it up to date • Leaf nodes store map results, inner nodes store reductions of their children http://horicky.blogspot.com/2008/10/couchdb-implementation.html
  31. REPLICATION • Peer-based, bi-directional replication using normal HTTP calls • Mediated by a replicator process which can live on the source, the target, or somewhere else entirely • Replicate a subset of the documents in a DB meeting criteria defined in a custom filter function (coming soon) • Applications (_design documents) replicate along with the data • Ideal for offline applications -- “ground computing”
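
  Replication is driven through the same HTTP API. A minimal sketch (server addresses and database names are invented) of asking a local CouchDB's _replicate endpoint to pull a remote database into a local one:

      await fetch("http://127.0.0.1:5984/_replicate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          source: "http://example.com:5984/talks",   // remote peer to pull from
          target: "talks",                           // local database to write into
          continuous: true                           // keep applying changes as they arrive
        })
      });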
  32. CLOUD
  33.-34. SHOWROOM • A cluster of couches (“Help me, this name sucks!”)
  35. ARCHITECTURE • Each cluster is a ring of nodes (Dynamo, Dynomite) • Any node can handle a request (consistent hashing) • O(1), with a hop • Nodes own partitions (the ring is divided) • Data are distributed evenly across partitions and replicas • MapReduce functions are passed to nodes for execution
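
  A toy sketch of the routing idea, not Showroom's actual code: hash the document ID to a partition and look up the node that owns it (the partition count, node names, and round-robin ownership are all invented for the example):

      import { createHash } from "node:crypto";

      const NUM_PARTITIONS = 8;                              // i.e. 2^Q with Q = 3 in this toy ring
      const nodes = ["node1", "node2", "node3", "node4"];

      // In this sketch partitions are dealt out round-robin; a real ring keeps a partition map.
      const ownerOf = (partition) => nodes[partition % nodes.length];

      // Any stable hash works for the illustration.
      function partitionFor(docId) {
        const h = createHash("md5").update(docId).digest();
        return h.readUInt32BE(0) % NUM_PARTITIONS;
      }

      const p = partitionFor("blah");
      console.log(`doc "blah" -> partition ${p} -> ${ownerOf(p)}`);

  Because every node knows the partition map, any node can do this lookup and forward the request, which is the “O(1), with a hop” point above.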
  36. RESEARCH • Google’s MapReduce, http://bit.ly/bJbyq5 • Amazon’s Dynamo, http://bit.ly/b7FlsN • CAP theorem, http://bit.ly/bERr2H
  37. CLUSTER CONTROLS • N - Replication • Q - Partitions = 2^Q • R - Read Quorum • W - Write Quorum • These constants define the cluster
  38. N (Consistency / Throughput / Durability) • N = Number of replicas per item stored in the cluster
  39. Q (Throughput / Scalability) • 2^Q = Number of partitions (shards) in the cluster • T = Number of nodes in the cluster • 2^Q / T = Number of partitions per node
  40. R (Latency / Consistency) • R = Number of successful reads before returning value(s) to the client
  41. W (Latency / Durability) • W = Number of successful writes before returning ‘success’ to the client
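
  Putting the four constants together in a small worked example (Q and T are made up here; N, R, and W match the values shown on the request slides that follow):

      const Q = 3;                     // 2^Q = 8 partitions in the cluster
      const T = 4;                     // nodes in the cluster
      const N = 3, R = 2, W = 2;       // replicas, read quorum, write quorum

      console.log("partitions:", 2 ** Q);                   // 8
      console.log("partitions per node:", (2 ** Q) / T);    // 2
      // With R + W > N, every read quorum overlaps the last successful write quorum,
      // so an acknowledged write is visible to subsequent quorum reads.
      console.log("quorums overlap:", R + W > N);            // true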
  42.-48. (Ring diagram, built up over several slides: a load balancer in front of nodes 1-4, each node owning a set of partitions. A client issues PUT http://boorad.cloudant.com/dbname/blah?w=2; with the cluster constants N=3, W=2, R=2, the key is hashed, hash(blah) = E, and the write is routed to the nodes holding that partition. The final build shows the same request still succeeding with one node down.)
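
  To make the w=2 walkthrough concrete, here is a toy coordinator, purely a simulation and not Showroom code, that sends a write to every replica but answers the client as soon as W acknowledgements arrive:

      // Each simulated replica "write" resolves after a random delay.
      const replicaWrite = (node) =>
        new Promise((resolve) => setTimeout(() => resolve(node), Math.random() * 100));

      // Resolve once w replica writes have succeeded; the remaining writes finish in the background.
      function quorumWrite(nodes, w) {
        let acks = 0;
        return new Promise((resolve) => {
          for (const node of nodes) {
            replicaWrite(node).then(() => {
              acks += 1;
              if (acks === w) resolve(`ok after ${w} of ${nodes.length} acks`);
            });
          }
        });
      }

      console.log(await quorumWrite(["node2", "node3", "node4"], 2));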
  49. RESULT • For standalone or cluster: • one REST API • one URL • For cluster: • redundant data • distributed queries • scale out
  50. DEMO?
  51. QUESTIONS?
  52. CREDITS • Emil Eifrem, http://bit.ly/5D40WQ • Sergio Bossa, http://bit.ly/c9UoRZ • Cliff Moon, http://bit.ly/bX887c
