Non Relational Databases And World Domination

Apparently NoSQL is all the rage these days, but what does it really mean and what technologies are out there? When to use a non-relational database? How to decide which one to use to achieve world domination? How do I use CouchDB with Ruby on Rails?

  1. Non-Relational Databases and World Domination
     Jason Davies
     Thursday, 3 December 2009
  2. Overview
     • Relational vs. Non-Relational
     • Why Switch?
     • Non-Relational Solutions
     • Document Databases
     • Key Value Stores
     • CouchDB Features
  3. Relational Databases
  4. Relational Databases
     • Relational algebra: union, intersection, difference, cartesian product
     • Easy to perform dynamic queries
     • Fixed Schemas
     • Normalisation
  5. Non-Relational Databases
     • Everything else!
     • Myriad of features, including:
       • Key-Value stores with external indexers
       • Schemaless
       • RESTful APIs
  6. CAP Theorem
     • Three requirements for applications in a distributed environment:
       • Consistency
       • Availability
       • Partition tolerance
     • Pick two
  7. Why Switch?
     • Data structure
     • Scalability
     • The New Cool
  8. Data Structure Symptoms
  9. Sparse Data
     • Tables with many columns, only a few being used by any particular row
  10. Attribute Tables
      • Each row is (fkey, att_name, att_value)
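To make the attribute-table symptom concrete, here is a minimal Ruby sketch that folds (fkey, att_name, att_value) rows back into one hash per entity, which is roughly the shape a document database would store directly. The sample rows and the `fold_attributes` helper are invented for illustration and are not from the talk.

```ruby
# Hypothetical attribute-table rows: [fkey, att_name, att_value].
ATTRIBUTE_ROWS = [
  [1, "name",  "Darth Vader"],
  [1, "age",   "63"],
  [2, "name",  "Yoda"],
  [2, "color", "green"]
].freeze

# Group rows by foreign key and merge each group's attributes into one hash.
def fold_attributes(rows)
  rows.each_with_object({}) do |(fkey, name, value), docs|
    (docs[fkey] ||= {})[name] = value
  end
end

# One hash per entity, keyed by fkey; each entity carries only the
# attributes it actually has.
folded = fold_attributes(ATTRIBUTE_ROWS)
```

Each entity ends up with only its own attributes, so the sparse-column problem disappears once the rows are regrouped.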
  11. Data Dumps
      • Given up on using columns for structured data
      • Instead simply serialising it (JSON, YAML, XML, etc.) and dumping strings to database
  12. Too Many Joins
      • Schemas involving large numbers of many-to-many join tables or tree-like structures
  13. Frequent Schema Changes
      • May be fine for small databases
      • Can be tedious
      • Rebuilding indexes is slow for millions of rows
  14. Scalability
  17. Write Capacity
      • If read capacity is the problem, then set up master-slave replication
  18. Too Much Data
      • Too much for one server to hold
      • Hard to shard the data sensibly
  19. Non-Relational Solutions
  20. Diverse Ecosystem
      • Column-oriented databases
      • Document-oriented databases
      • Key value stores
      • Graph-oriented databases
      • Distributed databases
      • MapReduce
  21. BigTable
      • “a sparse, distributed multi-dimensional sorted map”
      • Designed to scale into the petabyte range
      • HBase (Java, Hadoop)
      • Hypertable
      • Cassandra (Facebook, based on Amazon’s Dynamo)
  22. Document Databases
      • Arbitrary number of “sparse” attributes per document
      • Documents often map well to JSON e.g. in CouchDB
      • Cons: usually can’t perform joins or transactions spanning multiple documents
  23. Graph Databases
      • Good for highly interconnected data
      • Focus on the relationships between items
      • Optimised for querying transitive relationships i.e. variable length chains of joins
      • Neo4J, AllegroGraph, Sesame
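As a rough illustration of a transitive query, the Ruby sketch below walks a "follows" graph breadth-first to find everyone reachable through a chain of any length. In a relational schema each extra hop would typically be another self-join. The graph data and the `reachable` helper are made up for this example and do not use any of the graph databases named above.

```ruby
require "set"

# Hypothetical "follows" edges stored as an adjacency list.
FOLLOWS = {
  "alice" => ["bob"],
  "bob"   => ["carol"],
  "carol" => ["dave"],
  "dave"  => [],
  "erin"  => ["alice"]
}.freeze

# Breadth-first search: everyone reachable from `start` through a chain of
# edges of any length.
def reachable(graph, start)
  seen = Set.new
  queue = [start]
  until queue.empty?
    node = queue.shift
    graph.fetch(node, []).each do |neighbour|
      next if neighbour == start || seen.include?(neighbour)
      seen << neighbour
      queue << neighbour
    end
  end
  seen.to_a.sort
end
```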
  24. Distributed K-V Stores
      • Giant hash table/dictionary
      • Mainly solve data scalability problems
      • Transparently partition and replicate data
      • Cons:
        • eventual consistency or other distributed transaction protocols
        • hard to do integrity constraints, hard to catch application bugs
  25. Distributed K-V Stores
      • Scalaris, Dynomite, Ringo: data consistency
      • MemcacheDB, Tokyo Cabinet: low latency
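One simple way to picture "transparently partition and replicate" is to hash each key to a starting node and store copies on the next nodes in order. The Ruby sketch below does exactly that with plain modulo hashing. It is an illustrative assumption, not the scheme used by any of the stores listed above, and the node names and `nodes_for` helper are hypothetical.

```ruby
require "zlib"

# Hypothetical node names; a real store would discover these dynamically.
KV_NODES = %w[node-a node-b node-c].freeze

# Pick `replicas` nodes for a key: hash the key to a starting node, then take
# the following nodes in ring order as replicas.
def nodes_for(key, nodes = KV_NODES, replicas = 2)
  start = Zlib.crc32(key) % nodes.size
  (0...replicas).map { |i| nodes[(start + i) % nodes.size] }
end
```

The client only ever supplies a key; which servers hold the value is decided by the hash. Plain modulo hashing reshuffles most keys when a node is added or removed, which is one reason production systems tend to prefer consistent hashing.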
  26. Apache CouchDB
  27. CouchDB and Ruby
      require 'couchrest'

      # with !, it creates the database if it doesn't already exist
      @db = CouchRest.database!("http://127.0.0.1:5984/couchrest-test")
      response = @db.save_doc({ :key => 'value', 'another key' => 'another value' })
      # fetch the saved document back by its id
      doc = @db.get(response['id'])
      puts doc.inspect
  28. CouchDB and Ruby
      @db.bulk_save([
        {"wild" => "and random"},
        {"mild" => "yet local"},
        {"another" => ["set","of","keys"]}
      ])
      # returns ids and revs of the current docs
      puts @db.documents.inspect
  29. CouchDB and Ruby
      # design document with one view: emits every non-underscore field of each doc
      @db.save_doc({
        "_id" => "_design/first",
        :views => {
          :test => {
            :map => "function(doc){for(var w in doc) { if(!w.match(/^_/))emit(w,doc[w])}}"
          }
        }
      })
      puts @db.view('first/test')['rows'].inspect
  30. CouchDB and Ruby
      • Read more about CouchRest on GitHub
      • Also check out newcomer RubyAqua
  31. Features
      • Schema-Free (JSON)
      • Document-Oriented, Not Relational
      • Highly Concurrent
      • RESTful HTTP API
      • JavaScript-Powered Map/Reduce
      • N-Master Replication
      • Robust Storage
  33. Documents
      Photo: http://www.flickr.com/photos/stilleben2001/223243329/
  34. Schema-Free (JSON)
      {
        "_id": "BCCD12CBB",
        "_rev": "AB764C",
        "type": "person",
        "name": "Darth Vader",
        "age": 63,
        "headware": ["Helmet", "Sombrero"],
        "dark_side": true
      }
  39. Document-Oriented, Not Relational
      • Documents in the Real World™
      • Bills, letters, tax forms…
      • Same type != same structure
      • Can be out of date (so what?)
      • No references
      Callout: Natural Data Behaviour
  42. Highly Concurrent
      • Functional languages highly appropriate for parallelism
      • Lightweight “processes” and message-passing; “shared-nothing”
      • Easy to create fault-tolerant systems
  46. MVCC
      • Multiversion Concurrency Control
      • Reads: lock-free; never block
      • Potential for massive horizontal scaling
      • Writes: all-or-nothing
        • Success
        • Fail: conflict error, fetch and try again
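The "fetch and try again" write path can be sketched with a toy in-memory store. This is a simulation under stated assumptions, not CouchDB or CouchRest: `TinyStore`, `ConflictError`, and `update_with_retry` are hypothetical names, and revisions are plain integers rather than CouchDB's revision strings.

```ruby
# Toy in-memory store that mimics revision checking (not CouchDB itself).
class ConflictError < StandardError; end

class TinyStore
  def initialize
    @docs = {}
  end

  # Reads never block: they just return a copy of the current version.
  def get(id)
    @docs.fetch(id).dup
  end

  # A write succeeds only if the caller holds the current revision.
  def put(id, doc)
    current = @docs[id]
    if current && current["_rev"] != doc["_rev"]
      raise ConflictError, "stale revision for #{id}"
    end
    rev = current ? current["_rev"] + 1 : 1
    @docs[id] = doc.merge("_rev" => rev)
  end
end

# Fetch, modify, and retry on conflict.
def update_with_retry(store, id, attempts = 3)
  attempts.times do
    doc = store.get(id)
    yield doc
    begin
      return store.put(id, doc)
    rescue ConflictError
      next # someone else wrote first: fetch the new revision and try again
    end
  end
  raise ConflictError, "gave up after #{attempts} attempts"
end
```

A caller would write something like `update_with_retry(store, "counter") { |doc| doc["n"] += 1 }`. The block is re-run against the freshly fetched document after each conflict, so it should be safe to repeat.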
  48. RESTful CRUD
      • Create: HTTP PUT /db/mydocid
      • Read: HTTP GET /db/mydocid
      • Update: HTTP PUT /db/mydocid
      • Delete: HTTP DELETE /db/mydocid
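The four operations map directly onto plain HTTP requests, so they can be built with Ruby's standard library alone. The sketch below only constructs the requests; the server URL and the helper names (`put_request`, `read_request`, `delete_request`, `send_request`) are assumptions for illustration.

```ruby
require "net/http"
require "json"
require "uri"

# Assumed local CouchDB URL and document id, matching /db/mydocid above.
DOC_URI = URI("http://127.0.0.1:5984/db/mydocid")

# Create and update are the same PUT; an update must carry the current
# "_rev" in the document body.
def put_request(doc)
  req = Net::HTTP::Put.new(DOC_URI)
  req["Content-Type"] = "application/json"
  req.body = JSON.generate(doc)
  req
end

def read_request
  Net::HTTP::Get.new(DOC_URI)
end

# CouchDB needs the current revision to delete; it is passed as ?rev=...
def delete_request(rev)
  uri = DOC_URI.dup
  uri.query = URI.encode_www_form(rev: rev)
  Net::HTTP::Delete.new(uri)
end

# Send a request; call this only with a CouchDB instance running.
def send_request(req)
  Net::HTTP.start(DOC_URI.host, DOC_URI.port) { |http| http.request(req) }
end
```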
  49. RESTful Example
      require 'couchrest'
      require 'net/http'
      require 'json'

      couch = CouchRest.database!("http://127.0.0.1:5984/tweets")
      tweets_url = "http://twitter.com/statuses/user_timeline.json"
      # fetch the timeline and parse the JSON array of tweets
      tweets = JSON.parse(Net::HTTP.get(URI(tweets_url)))
      couch.bulk_save(tweets)
  50. Cacheability
      • Both documents and views return ETags
      • Clients send If-None-Match
      • CouchDB responds with 304 Not Modified and bypasses potentially expensive lookup
      • Can use Varnish/Squid as caching proxy
      • Proxy-friendly
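The ETag handshake is ordinary HTTP and can be simulated without a server. In this minimal sketch the ETag is a content hash chosen purely for illustration, and `conditional_get` is a hypothetical helper; it shows the general mechanism rather than how CouchDB computes its ETags.

```ruby
require "digest"

# Toy handler: returns [status, headers, body] for a GET of `body`.
def conditional_get(body, request_headers = {})
  etag = %("#{Digest::MD5.hexdigest(body)}")
  if request_headers["If-None-Match"] == etag
    # The client's copy is current: skip sending the body entirely.
    [304, { "ETag" => etag }, ""]
  else
    [200, { "ETag" => etag }, body]
  end
end
```

A client keeps the ETag from the first response and sends it back as If-None-Match on the next request; a caching proxy such as Varnish or Squid can play the same role on behalf of many clients.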
  52. JavaScript-Powered Map/Reduce
      • Map functions extract data from your documents
      • Reduce functions aggregate intermediate values
      • The kicker: Incremental B-tree storage
  53. http://horicky.blogspot.com/2008/10/couchdb-implementation.html
  54. Map/Reduce Views
      Docs:
        {"user": "Chris", "points": 3}
        {"user": "Joe", "points": 10}
        {"user": "Alice", "points": 5}
        {"user": "Mary", "points": 9}
        {"user": "Bob", "points": 7}
      Map:
        function(doc) {
          if (doc.user && doc.points) {
            emit(doc.user, doc.points);
          }
        }
      Map output:
        {"key": "Alice", "value": 5}
        {"key": "Bob", "value": 7}
        {"key": "Chris", "value": 3}
        {"key": "Joe", "value": 10}
        {"key": "Mary", "value": 9}
      Reduce:
        function(keys, values, rereduce) {
          return sum(values);
        }
      Reduce output:
        Alice ... Chris: 15
        Everyone: 34
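The same view can be traced in plain Ruby to check the numbers on the slide. This is a simulation of the map and reduce steps over the five documents shown, not CouchDB's view engine; `map_doc`, `build_view`, and `reduce_sum` are illustrative names.

```ruby
# The five documents from the slide.
SCORE_DOCS = [
  { "user" => "Chris", "points" => 3 },
  { "user" => "Joe",   "points" => 10 },
  { "user" => "Alice", "points" => 5 },
  { "user" => "Mary",  "points" => 9 },
  { "user" => "Bob",   "points" => 7 }
].freeze

# Map step: emit [key, value] pairs, mirroring emit(doc.user, doc.points).
def map_doc(doc)
  return [] unless doc["user"] && doc["points"]
  [[doc["user"], doc["points"]]]
end

# Run the map over every document and sort the rows by key.
def build_view(docs)
  docs.flat_map { |doc| map_doc(doc) }.sort_by { |key, _| key }
end

# Reduce step: sum the values of all rows whose key falls in an optional range.
def reduce_sum(rows, startkey: nil, endkey: nil)
  in_range = rows.select do |key, _|
    (startkey.nil? || key >= startkey) && (endkey.nil? || key <= endkey)
  end
  in_range.sum { |_, value| value }
end
```

With these definitions, `reduce_sum(build_view(SCORE_DOCS), startkey: "Alice", endkey: "Chris")` gives 15 and `reduce_sum(build_view(SCORE_DOCS))` gives 34, matching the "Alice ... Chris" and "Everyone" totals above.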
  57. Render Views as HTML
      lists/index.js
      /drl/_list/sofa/index/recent-posts?descending=true&limit=8
  58. Server-Side JavaScript
      • _show for transforming documents
      • _list for transforming views
      • _update for transforming PUTs/POSTs
      • Code-sharing between client and server
      • Easy deployment
  60. Replication
      • Incremental
      • Near-real-time
      • Clustered mirrors
      • Scheduled
      • Ad-hoc
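Replication in CouchDB is triggered over HTTP by posting a source and a target to the server's `_replicate` resource. The Ruby sketch below builds such a request without sending it. The server URL and database names are placeholders, `replication_request` is a hypothetical helper, and treating `continuous` as the near-real-time mode is an interpretation of the bullets above.

```ruby
require "net/http"
require "json"
require "uri"

# Build the POST /_replicate request that asks `server` to copy `source`
# into `target`. The names passed in are placeholders chosen by the caller.
def replication_request(server, source, target, continuous: false)
  req = Net::HTTP::Post.new(URI.join(server, "/_replicate"))
  req["Content-Type"] = "application/json"
  body = { "source" => source, "target" => target }
  # A continuous replication keeps running and pushes changes as they happen.
  body["continuous"] = true if continuous
  req.body = JSON.generate(body)
  req
end
```

An ad-hoc replication would be a single call such as `replication_request("http://127.0.0.1:5984", "tweets", "http://example.org:5984/tweets")`, while a scheduled one would issue the same request from a cron job.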
  61. “Ground Computing” (@jhuggins)
      Photo: http://www.flickr.com/photos/mcpig/872293700/
  62. Photo: http://www.flickr.com/photos/hercwad/2290378571/
  63. Latency Sucks
  64. Stuart Langridge - Canonical
  75. Conflicts
  76. Conflict resolution by example
      [Diagram sequence, slides 76-86: two nodes, A and B]
  88. Robust Storage
      • Append-Only File Structure
      • Designed to Crash
      • Instant-On
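The idea behind an append-only file can be shown with a toy line-based log: writes only ever add to the end, so a crash can damage at most the last record. This Ruby sketch is a simplified illustration, not CouchDB's on-disk format; `AppendOnlyLog` is a hypothetical class, and a `StringIO` stands in for the file.

```ruby
require "json"
require "stringio"

# Toy append-only store: every write adds a line; nothing is overwritten.
class AppendOnlyLog
  def initialize(io = StringIO.new)
    @io = io
  end

  def put(key, value)
    @io.seek(0, IO::SEEK_END)
    @io.write(JSON.generate([key, value]) + "\n")
  end

  # Rebuild the current state by replaying the log from the start.
  def snapshot
    state = {}
    @io.rewind
    @io.each_line do |line|
      begin
        key, value = JSON.parse(line)
      rescue JSON::ParserError
        break # torn final record from a crash: stop and keep what we have
      end
      state[key] = value
    end
    state
  end
end
```

Because older records are never modified, reopening after a crash needs no repair step beyond ignoring the incomplete tail.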
  89. Robust
  91. Thanks!
      www.jasondavies.com
      @jasondavies
