NoSQL and CouchDB

Published in: Technology

  1. 1. Who am I?
      -> My name: João Cerdeira
      -> Team Leader
      -> An Agile enthusiast: Scrum / Kanban / Lean
      -> A true believer in open source
      http://twitter.com/jacerdeira cerdeira@gmail.com
  2. 2. Disclaimer
      -> I understand your questions, but sometimes I don't have answers
      -> I'm not a NoSQL dogmatist, just an enthusiast about the new ways of storing information
      -> I have worked with RDBMS for 12 years
  3. 3. Everyone has their preferences
  4. 4. I don't care whether you or I use SQL or NoSQL. I just want to deliver better services/applications to the clients/users.
  5. 5. Concepts & Theory
  6. 6. Scale Up vs Scale Out
  7. 7. Performance vs Scalability, Latency vs Throughput, Availability vs Consistency
  8. 8. Brewer's CAP Theorem
  9. 9. Choose only 2: Consistency, Availability, Partition Tolerance (at a given time, in a given environment)
  10. 10. [Diagram: CAP triangle with corners Consistency, Availability and Partition Tolerance, labelled with RDBMS and NoSQL]
  11. 11. Centralized System: in a centralized system (RDBMS) we don't have network partitions (the P in CAP), so we get both Availability and Consistency
  12. 12. -> Atomicity -> Consistency -> Isolation -> Durability
  13. 13. Distributed System: in a distributed system we (might) have network partitions (the P in CAP), so you can only pick one: Availability or Consistency
  14. 14. CAP in practice: we have only two types of systems, CP (very similar to CA) and AP. So in a network partition we have only one choice to make: Consistency or Availability
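A toy sketch of that choice, not taken from the slides: the `Replica` class, its `mode` flag and the `partitioned` switch are all hypothetical names. During a partition the CP node refuses writes to stay consistent, while the AP node keeps accepting them and may diverge from its peer.

```javascript
// Toy model (hypothetical names): one replica that may lose contact with its peer.
class Replica {
  constructor(mode) {
    this.mode = mode;          // "CP" or "AP" (assumed labels)
    this.partitioned = false;  // true while the peer is unreachable
    this.data = new Map();
  }

  write(key, value) {
    if (this.partitioned && this.mode === "CP") {
      // Consistency first: refuse the write rather than diverge from the peer.
      return { ok: false, reason: "unavailable during partition" };
    }
    // Availability first (or no partition at all): accept the write locally.
    this.data.set(key, value);
    return { ok: true };
  }
}

const cp = new Replica("CP");
const ap = new Replica("AP");
cp.partitioned = true;
ap.partitioned = true;

console.log(cp.write("x", 1)); // rejected while partitioned
console.log(ap.write("x", 1)); // accepted, to be reconciled later
```

The AP replica's accepted write is exactly what eventual consistency has to reconcile once the partition heals.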
  15. 15. -> Basically Available -> Soft state -> Eventually consistent
  16. 16. Eventual Consistency
  17. 17. How to Scale Out an RDBMS? http://capellaniaprimaria.blogspot.com/2011/02/concurso-deportivo-4-pregunta.html
  18. 18. Partition
  19. 19. Partition + Replication
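A minimal sketch of partitioning plus replication, assuming a simple string hash and a "next N nodes" placement rule. Both are illustrative choices, and `shardFor` and `nodesFor` are hypothetical helpers, not part of any database.

```javascript
// Hypothetical helper: map a key to a shard with a simple string hash.
function shardFor(key, shardCount) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return hash % shardCount;
}

// Each key lives on `replicas` consecutive nodes (assumed placement rule).
function nodesFor(key, nodes, replicas) {
  const first = shardFor(key, nodes.length);
  const placement = [];
  for (let i = 0; i < replicas; i++) {
    placement.push(nodes[(first + i) % nodes.length]);
  }
  return placement;
}

const nodes = ["db1", "db2", "db3"];
console.log(nodesFor("joaocerdeira", nodes, 2)); // two distinct nodes holding this key
```

Partitioning spreads the keys over the nodes, and replication keeps a second copy so that a read can still be served when one node is down.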
  20. 20. ORM Problems
  21. 21. ORM Problems: What do you want? To find/read a record/object
  22. 22. ORM Problems: What do you want? To find/read a record/object. What do you get? A huge amount of hidden complexity
  23. 23. Let's Validate Our Thoughts
  24. 24. Let's Validate Our Thoughts: Do we need ACID for all solutions?
  25. 25. Let's Validate Our Thoughts: Do we need ACID for all solutions? When is eventual consistency enough?
  26. 26. Let's Validate Our Thoughts: Do we need ACID for all solutions? When is eventual consistency enough? Different needs require different solutions
  27. 27. Why did NoSQL appear? Because new drivers appeared (business or technical demands)
  28. 28. New Drivers Behind NoSQL
      -> Large amounts of data
      -> Commodity hardware
      -> Scale fast and cheaply
      -> Constantly changing requirements (data)
  29. 29. Why aren't RDBMS good enough?
  30. 30. Why aren't RDBMS good enough? Scaling reads in an RDBMS is hard
  31. 31. Why aren't RDBMS good enough? Scaling reads in an RDBMS is hard. Scaling writes is impossible
  32. 32. Think again: Do we really need an RDBMS?
  33. 33. Think again: Do we really need an RDBMS? Sometimes!
  34. 34. Think again: Do we really need an RDBMS? Sometimes! But a lot of the time we don't!
  35. 35. NoSQL
  36. 36. How did NoSQL start?
      Google: Bigtable
      Amazon: Dynamo
      Facebook: Cassandra
      LinkedIn: Voldemort
      Yahoo: HBase (Hadoop)
  37. 37. Origins
      Google: "How can we build a DB on top of Google File System?" Paper: Bigtable: A Distributed Storage System for Structured Data, 2006
      Amazon: "How can we build a distributed hash table for the data center?" Paper: Dynamo: Amazon's Highly Available Key-value Store
  38. 38. Different Types of NoSQL: Key-Value Stores, Document Databases, Column Databases, Graph Databases
  39. 39. Key-Value Stores
      Origin: Amazon's Dynamo paper
      Data model: Collections of KV pairs
      Implementations: Dynamo, Voldemort, Membase, Riak, Redis
      Good for: large amounts of data; scaling writes and reads; fast; programmer friendly
  40. 40. Document Databases
      Origin: Lotus Notes
      Data model: Collections of documents
      Implementations: CouchDB, MongoDB, Amazon SimpleDB
      Good for: human data structures; programmer friendly; rapid development; web friendly; CRUD
  41. 41. Column Databases
      Origin: Google's BigTable paper
      Data model: Column family; each row (at least in theory) can have a different configuration
      Implementations: BigTable, HBase, Cassandra
      Good for: large amounts of data; scaling writes like no other; high availability
  42. 42. Graph Databases
      Origin: Graph theory
      Data model: Nodes and relations, both of which can have KV pairs
      Implementations: Neo4j, FlockDB
      Good for: solving graph problems; fast
  43. 43. Why would I choose CouchDB?
      -> Easy-to-understand documents
      -> Uses standard web technologies
      -> Simple to install and configure
      -> Small footprint (works on mobile platforms)
      -> Scales well (not for huge amounts of data)
      -> Replication in the core
  44. 44. CouchDB Main Principles
      -> Document-oriented database
      -> No rows or columns
      -> Collection of JSON documents
      -> Schema-free
  45. 45. In CouchDB, HTTP Rules
      -> Everything is an HTTP request
      -> We are used to GET and POST
      -> But there are others: PUT, DELETE, COPY
      -> RESTful HTTP API
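A sketch of how document operations map onto those HTTP methods. `documentRequest` and its action names are hypothetical; the helper only builds a plain description of the request and sends nothing. It assumes simple document ids, so design documents such as `_design/...` are not handled.

```javascript
// Hypothetical helper: describe the HTTP request for a document operation.
const BASE_URL = "http://127.0.0.1:5984"; // local address used throughout the slides

function documentRequest(action, db, id, options = {}) {
  const url = `${BASE_URL}/${db}/${encodeURIComponent(id)}`;
  switch (action) {
    case "read":
      return { method: "GET", url };
    case "save": // create, or update when options.body carries the current _rev
      return { method: "PUT", url, body: JSON.stringify(options.body || {}) };
    case "delete": // the revision to delete travels as a query parameter
      return { method: "DELETE", url: `${url}?rev=${encodeURIComponent(options.rev)}` };
    case "copy": // COPY is a non-standard HTTP method; the target id goes in a header
      return { method: "COPY", url, headers: { Destination: options.destination } };
    default:
      throw new Error(`unknown action: ${action}`);
  }
}

console.log(documentRequest("copy", "contacts", "joaocerdeira", { destination: "batatinha" }));
```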
  46. 46. Why JSON?
      -> Light, text-based data format
      -> Simple to parse
      -> Not verbose (compared to XML)
      -> Suitable for JavaScript frameworks (jQuery)
      -> Parsers available in almost all programming languages
  47. 47. JSON Example
      {
        "make": "Ford",
        "model": "Mustang",
        "year": 2009,
        "body": "Coupe",
        "color": "Red",
        "engine": { "gas_type": "Petrol", "cubic_capacity": 4600 },
        "previous_owners": [
          { "name": "John Smith", "mileage": 1000 },
          { "name": "Jane Hunt", "mileage": 2500 }
        ]
      }
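A short sketch of the "simple to parse" point, using the car document from the slide. Strict JSON needs double-quoted property names, so the text is written that way here. The variable names are illustrative.

```javascript
// The car document as strict JSON text (property names in double quotes).
const text = `{
  "make": "Ford",
  "model": "Mustang",
  "year": 2009,
  "engine": { "gas_type": "Petrol", "cubic_capacity": 4600 },
  "previous_owners": [
    { "name": "John Smith", "mileage": 1000 },
    { "name": "Jane Hunt", "mileage": 2500 }
  ]
}`;

// One built-in call turns the text into nested objects and arrays.
const car = JSON.parse(text);
const totalMileage = car.previous_owners.reduce((sum, owner) => sum + owner.mileage, 0);

console.log(car.engine.cubic_capacity); // 4600
console.log(totalMileage);              // 3500
```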
  51. 51. Example
  52. 52. Create / Delete Database
      $ curl http://127.0.0.1:5984
      {"couchdb":"Welcome","version":"1.0.1"}
      $ curl -X PUT http://127.0.0.1:5984/contacts
      {"ok":true}
      $ curl -X GET http://127.0.0.1:5984/_all_dbs
      ["contacts","_users"]
      $ curl -X DELETE http://127.0.0.1:5984/contacts
      {"ok":true}
  53. 53. Manage Documents
      $ curl -X PUT http://127.0.0.1:5984/contacts/joaocerdeira -d '{}'
      {"ok":true,"id":"joaocerdeira","rev":"1-967a00dff5e02add41819138abb3284d"}
      $ curl -X GET http://127.0.0.1:5984/contacts/joaocerdeira
      {"_id":"joaocerdeira","_rev":"1-967a00dff5e02add41819138abb3284d"}
      $ curl -X DELETE "http://127.0.0.1:5984/contacts/joaocerdeira?rev=1-967a00dff5e02add41819138abb3284d"
      {"ok":true,"id":"joaocerdeira","rev":"2-eec205a9d413992850a6e32678485900"}
  54. 54. Manage Documents
      $ curl -X PUT http://127.0.0.1:5984/contacts/joaocerdeira -d '{"firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com"}'
      {"ok":true,"id":"joaocerdeira","rev":"1-186fe12b748c40559e8f234d8e566c18"}
      $ curl -X GET http://127.0.0.1:5984/contacts/joaocerdeira
      {"_id":"joaocerdeira","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com"}
  55. 55. Copy Documents
      $ curl -X COPY http://127.0.0.1:5984/contacts/joaocerdeira -H "Destination: batatinha"
      {"id":"batatinha","rev":"1-186fe12b748c40559e8f234d8e566c18"}
      $ curl -X GET http://127.0.0.1:5984/contacts/batatinha
      {"_id":"batatinha","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com"}
  56. 56. Changing Documents
      $ curl -X PUT http://127.0.0.1:5984/contacts/batatinha -d '{"_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"],"phone":"93 1234567"}'
      {"ok":true,"id":"batatinha","rev":"2-b7079a6d71179b1571652059355d84c3"}
      $ curl -X GET http://127.0.0.1:5984/contacts/batatinha
      {"_id":"batatinha","_rev":"2-b7079a6d71179b1571652059355d84c3","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"],"phone":"93 1234567"}
  57. 57. MVCC: CouchDB never blocks. Append-only mode.
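A toy in-memory sketch of the revision check behind the `_rev` field, under loud assumptions: `TinyStore` is a hypothetical class, its revision suffix is a plain counter rather than the hex digest seen in the slides, and it is in no way CouchDB's real implementation. A writer holding a stale revision is rejected instead of being made to wait, and every accepted revision is appended to a log.

```javascript
// Toy in-memory model of optimistic revision checks (hypothetical, simplified).
class TinyStore {
  constructor() {
    this.docs = new Map(); // id -> latest { _id, _rev, ...fields }
    this.log = [];         // append-only history of every accepted revision
  }

  put(id, doc) {
    const current = this.docs.get(id);
    const currentRev = current ? current._rev : undefined;
    if (doc._rev !== currentRev) {
      // The writer worked from a stale (or missing) revision: reject, never block.
      return { error: "conflict" };
    }
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    // Assumption: a counter stands in for the digest-like suffix of real revisions.
    const stored = { ...doc, _id: id, _rev: `${generation}-${this.log.length}` };
    this.log.push(stored);
    this.docs.set(id, stored);
    return { ok: true, id, rev: stored._rev };
  }
}

const store = new TinyStore();
const first = store.put("batatinha", { firstName: "Joao" });
const stale = store.put("batatinha", { firstName: "Clown" });                  // no _rev given
const fresh = store.put("batatinha", { firstName: "Clown", _rev: first.rev }); // current _rev
console.log(first, stale, fresh);
```

This mirrors the "Changing Documents" slide, where the update had to carry the `_rev` returned by the previous write.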
  58. 58. Designing Documents
      {
        "_id": "joaocerdeira",
        "_rev": "1-186fe12b748c40559e8f234d8e566c18",
        "doctype": "contact",
        "firstName": "Joao",
        "lastName": "Cerdeira",
        "company": "MULTICERT",
        "emails": [
          { "type": "personal", "email": "cerdeira@gmail.com" },
          { "type": "business", "email": "joao.cerdeira@multicert.com" }
        ],
        "phones": [
          { "type": "personal", "phone": "93 1234567" },
          { "type": "business", "phone": "93 7654321" }
        ]
      }
  62. 62. Futon Web Interface
  63. 63. Views
  64. 64. Querying CouchDB
      -> Queries are written in JavaScript
      -> Use Map/Reduce for querying
      -> For simple queries Map/Reduce isn't needed
      -> No joins (but you can get something similar)
  65. 65. Simple Views
      List all documents:
      function(doc) {
        emit(doc._id, doc);
      }
      List all documents of type vip:
      function(doc) {
        if (doc.type == "vip") {
          emit(doc._id, doc);
        }
      }
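A sketch of what a map function does, run in plain JavaScript outside CouchDB. `runView` is a hypothetical helper. It passes `emit` in as a second argument, whereas in CouchDB `emit` is supplied by the view server, and it sorts keys with a plain comparison rather than CouchDB's collation rules.

```javascript
// Toy view runner (hypothetical): calls the map function once per document and
// collects whatever it emits, ordered by key as a view index would be.
function runView(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const docs = [
  { _id: "joaocerdeira", type: "vip" },
  { _id: "batatinha", type: "regular" },
];

// Same logic as the "type vip" view, with emit passed explicitly.
const vipOnly = (doc, emit) => {
  if (doc.type === "vip") {
    emit(doc._id, doc);
  }
};

console.log(runView(docs, vipOnly).map((row) => row.key)); // [ 'joaocerdeira' ]
```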
  66. 66. Temp Views
      $ curl -X POST -H "Content-type: application/json" http://127.0.0.1:5984/contacts/_temp_view -d '{"map":"function(doc){emit(doc._id,doc);}"}'
      {"total_rows":2,"offset":0,"rows":[
      {"id":"batatinha","key":"batatinha","value":{"_id":"batatinha","_rev":"2-b7079a6d71179b1571652059355d84c3","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"],"phone":"93 1234567"}},
      {"id":"joaocerdeira","key":"joaocerdeira","value":{"_id":"joaocerdeira","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com","_deleted_conflicts":["2-eec205a9d413992850a6e32678485900"]}}
      ]}
  67. 67. Normal Views
      {
        "_id": "_design/example",
        "views": {
          "foo": {
            "map": "function(doc){emit(doc._id,doc);}"
          }
        }
      }
      $ curl -X PUT -H "Content-type: application/json" http://127.0.0.1:5984/contacts/_design/example -d @design_simple1.json
  68. 68. Normal Views
      $ curl -X GET http://127.0.0.1:5984/contacts/_design/example/_view/foo
      {"total_rows":2,"offset":0,"rows":[
      {"id":"batatinha","key":"batatinha","value":{"_id":"batatinha","_rev":"2-b7079a6d71179b1571652059355d84c3","firstName":"Clown","lastName":"Batatinha","email":["batatinha@bataton.pt","batatinha@first.to.exit@rtp.pt"],"phone":"93 1234567"}},
      {"id":"joaocerdeira","key":"joaocerdeira","value":{"_id":"joaocerdeira","_rev":"1-186fe12b748c40559e8f234d8e566c18","firstName":"Joao","lastName":"Cerdeira","email":"cerdeira@gmail.com","_deleted_conflicts":["2-eec205a9d413992850a6e32678485900"]}}
      ]}
  69. 69. Map/Reduce
      Google patent, from the paper: http://labs.google.com/papers/mapreduce.html
      Image source: http://map-reduce.wikispaces.asu.edu/
  70. 70. Map/Reduce Views
      {
        "_id": "_design/example",
        "views": {
          ...
          "bar": {
            "map": "function(doc){emit(doc,1);}",
            "reduce": "function(keys, values, rereduce) { return sum(values); }"
          }
        }
      }
      $ curl -X GET http://127.0.0.1:5984/contacts/_design/example/_view/bar
      {"rows":[{"key":null,"value":7}]}
  71. 71. Map/Reduce Views
      {
        "_id": "_design/example",
        "views": {
          ...
          "aggreg": {
            "map": "function(doc){if(doc.country){emit(doc.country,1);}}",
            "reduce": "function(keys, values, rereduce) { return sum(values); }"
          }
        }
      }
      $ curl -X GET "http://127.0.0.1:5984/contacts/_design/example/_view/aggreg?group=true"
      {"rows":[{"key":"England","value":1},{"key":"Portugal","value":2},{"key":"US","value":2}]}
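A plain-JavaScript sketch of what the grouped count view computes. `mapReduceGrouped` is a hypothetical helper and the sample contacts are made up for illustration. It produces one row per distinct key, in the spirit of querying with `group=true`, and it skips the incremental re-reduce step that a real view index performs.

```javascript
// Toy map/reduce with grouping (hypothetical helper, no CouchDB involved).
function mapReduceGrouped(docs, mapFn, reduceFn) {
  const groups = new Map(); // key -> array of emitted values
  for (const doc of docs) {
    mapFn(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  // One output row per distinct key, sorted by key.
  return [...groups.keys()].sort().map((key) => ({
    key,
    value: reduceFn([key], groups.get(key), false),
  }));
}

// Made-up sample data: five contacts with a country, one without.
const contacts = [
  { country: "Portugal" }, { country: "US" }, { country: "England" },
  { country: "Portugal" }, { country: "US" }, { name: "no country" },
];

const rows = mapReduceGrouped(
  contacts,
  (doc, emit) => { if (doc.country) emit(doc.country, 1); },
  (keys, values, rereduce) => values.reduce((a, b) => a + b, 0)
);

console.log(rows); // England: 1, Portugal: 2, US: 2
```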
  72. 72. Replication
  73. 73. Write, Read
  74. 74. Write, Read, Read
  75. 75. Write, Read, Read, Read
  76. 76. One Time Replication
      $ curl -H "Content-type: application/json" -X POST http://127.0.0.1:5984/_replicate -d '{"source":"contacts","target":"contacts-replica"}'
      {"ok":true,
       "session_id":"00872a440fdda973d6a9a18f2f571bb8",
       "source_last_seq":19,
       "history":[{"session_id":"00872a440fdda973d6a9a18f2f571bb8",
         "start_time":"Tue, 05 Jul 2011 23:03:32 GMT",
         "end_time":"Tue, 05 Jul 2011 23:03:32 GMT",
         "start_last_seq":0, "end_last_seq":19, "recorded_seq":19,
         "missing_checked":0, "missing_found":8,
         "docs_read":12, "docs_written":12, "doc_write_failures":0}]}
  77. 77. Write, Write
  78. 78. Continuous Replication
      $ curl -vX POST http://127.0.0.1:5984/_replicate -d '{"source":"http://127.0.0.1:5984/contacts","target":"http://127.0.0.1:5984/contacts-replica","continuous":true}'
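A sketch of building the same replication request from JavaScript. `replicationRequest` is a hypothetical helper that only assembles the method, URL, headers and JSON body used by the curl commands above. It does not contact a server.

```javascript
// Hypothetical helper: describe a POST to the _replicate endpoint.
function replicationRequest(server, source, target, continuous = false) {
  const body = { source, target };
  if (continuous) {
    body.continuous = true; // keep replicating as new changes arrive
  }
  return {
    method: "POST",
    url: `${server}/_replicate`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

const request = replicationRequest(
  "http://127.0.0.1:5984",
  "http://127.0.0.1:5984/contacts",
  "http://127.0.0.1:5984/contacts-replica",
  true
);
console.log(request);
```

Leaving out the last argument gives the one-time form, where the body carries only `source` and `target`.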
  79. 79. Read, Write, Write, Write, Write, Write, Read
  80. 80. Load Balancing / Caching: It's HTTP, so use the tools you know
      -> NGINX
      -> Squid
      -> Apache mod_proxy
      -> ...
  81. 81. Conflict Resolution Library http://thetowersofjacksonville.com/photogallery/photo12411/real.html http://thetowersofjacksonville.com/photogallery/photo12411/real.htm
  82. 82. Conflict Resolution
      function(doc) {
        if (doc._conflicts) {
          emit(doc._conflicts, null);
        }
      }
      {"total_rows":1,"offset":0,"rows":[{"id":"identifier","key":["2-7c971bb974251ae8541b8fe045964219"],"value":null}]}
      $ curl -X DELETE "$HOST/db-replica/identifier?rev=2-de0ea16f8621cbac506d23a0fbbde08a"
      {"ok":true,"id":"identifier","rev":"3-bfe83a296b0445c4d526ef35ef62ac14"}
      $ curl -X PUT $HOST/db-replica/identifier -d '{"count":3,"_rev":"2-7c971bb974251ae8541b8fe045964219"}'
      {"ok":true,"id":"identifier","rev":"3-5d0319b075a21b095719bc561def7122"}
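A sketch of the same two-step resolution as a list of planned requests: delete the revisions you discard, then write the merged content on top of the revision you keep. `planResolution` is a hypothetical helper, and the base URL stands in for `$HOST/db-replica` from the slide. Nothing is sent.

```javascript
// Hypothetical planner: list the requests that resolve one conflicted document.
function planResolution(baseUrl, id, keepRev, discardRevs, mergedBody) {
  const docUrl = `${baseUrl}/${encodeURIComponent(id)}`;
  // One DELETE per revision that loses.
  const requests = discardRevs.map((rev) => ({
    method: "DELETE",
    url: `${docUrl}?rev=${encodeURIComponent(rev)}`,
  }));
  // The final update must name the revision it builds on.
  requests.push({
    method: "PUT",
    url: docUrl,
    body: JSON.stringify({ ...mergedBody, _rev: keepRev }),
  });
  return requests;
}

const plan = planResolution(
  "http://127.0.0.1:5984/db-replica", // assumed value of $HOST/db-replica
  "identifier",
  "2-7c971bb974251ae8541b8fe045964219",
  ["2-de0ea16f8621cbac506d23a0fbbde08a"],
  { count: 3 }
);
console.log(plan);
```

How the merged body is computed (here simply `count: 3`) is an application decision; the database only records which revisions exist.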
  83. 83. Library http://thetowersofjacksonville.com/photogallery/photo12411/real.htm
  84. 84. Clients
      JavaScript: jQuery CouchDB library
      .Net: Relax
      Java: CouchDB4J
      Perl: CouchDB::Client, Net::CouchDb
      Ruby: CouchRest
      Python: couchdb-python
      Scala: scouchdb
      And so much more ...
  85. 85. CouchDB on Mobile http://www.digitaljournal.com/article/261153
  86. 86. Mobile Platforms Supported
  87. 87. Simply Works
  88. 88. PhoneGap, Lawnchair
  89. 89. Own Your Data: I like services like Google, but what about my privacy?! I think CouchDB is the way to own my data
  90. 90. Partition with Cluster http://thetowersofjacksonville.com/photogallery/photo12411/real.htm
  91. 91. Solutions
  92. 92. “CouchDB is built of the Web to the Web” – Jacob Kaplan-Moss
  93. 93. We need a mindset change: stop seeing all the data in the world as relational data
  94. 94. Don't trust me ... or others. Try it!
  95. 95. And the future… will probably be polyglot: using an RDBMS and more than one NoSQL database per solution
  96. 96. Success Stories
