Your SlideShare is downloading. ×
CouchDB at New York PHP
Upcoming SlideShare
Loading in...5

Thanks for flagging this SlideShare!

Oops! An error has occurred.


Saving this for later?

Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime - even offline.

Text the download link to your phone

Standard text messaging rates apply

CouchDB at New York PHP


Published on

This is a presentation on CouchDB that I gave at the New York PHP User Group. I talked about the basics of CouchDB, its JSON documents, its RESTful API, writing and querying MapReduce views, using …

This is a presentation on CouchDB that I gave at the New York PHP User Group. I talked about the basics of CouchDB, its JSON documents, its RESTful API, writing and querying MapReduce views, using CouchDB from within PHP, and scaling.

Published in: Technology, News & Politics

  • Be the first to comment

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

No notes for slide


  • 1. CouchDB: JSON,HTTP & MapReduce Bradley Holt ( (
  • 2. About Me
  • 3. Co-Founder andTechnical Director
  • 4. from Vermont
  • 5. Organizer BTV
  • 6. Contributor
  • 7. Author
  • 8. CouchDB Basics
  • 9. Cluster Of UnreliableCommodity HardwareDocument-oriented, schema-lessShared nothing, horizontally scalableRuns on the Erlang OTP platformPeer-based, bi-directional replicationRESTful HTTP APIQueries are done against MapReduce “views”, or indexes
  • 10. When You MightConsider CouchDBYou’ve found yourself denormalizing your SQL database for betterperformance.Your domain model is a “ t” for documents (e.g. a CMS).Your application is read-heavy.You need a high level of concurrency and can give up consistencyin exchange.You need horizontal scalability.You want your database to be able to run anywhere, even onmobile devices, and even when disconnected from the cluster.
  • 11. Trade-OffsNo ad hoc queries. You need to know what you’re going to want toquery ahead of time. For example, SQL would be a better t forbusiness intelligence reporting.No concept of “joins”. You can relate data, but watch out forconsistency issues.Transactions are limited to document boundaries.CouchDB trades storage space for performance.
  • 12. Other Alternatives to SQLMongoDB (a database for Hadoop)
  • 13. Don’t be so quick to get rid of SQL!There are many problems for which anSQL database is a good t. SQL is verypowerful and exible query language.
  • 14. JSON Documents{"title":"CouchDB: The De nitive Guide"}
  • 15. JSON (JavaScript Object Notation) is ahuman-readable and lightweight datainterchange format.Data structures from manyprogramming languages can beeasily converted to and from JSON.
  • 16. JSON ValuesA JSON object is a collection of key/value pairs.JSON values can be strings, numbers, booleans (false or true),arrays (e.g. ["a", "b", "c"]), null, or another JSON object.
  • 17. A “Book” JSON Object{ "_id":"978-0-596-15589-6", "title":"CouchDB: The De nitive Guide", "subtitle":"Time to Relax", "authors":[ "J. Chris Anderson", "Jan Lehnardt", "Noah Slater" ], "publisher":"OReilly Media", "released":"2010-01-19", "pages":272}
  • 18. RESTful HTTP APIcurl -iX PUT http://localhost:5984/mydb
  • 19. Representational State Transfer (REST)is a software architecture style thatdescribes distributed hypermediasystems such as the World Wide Web.
  • 20. HTTP is distributed,scalable, and cacheable.
  • 21. Everyone speaks HTTP.
  • 22. Create a Database$ curl -iX PUT http://localhost:5984/mydbHTTP/1.1 201 CreatedLocation: http://localhost:5984/mydb{"ok":true}
  • 23. Create a Document$ curl -iX POST http://localhost:5984/mydb-H "Content-Type: application/json"-d {"_id":"42621b2516001626"}HTTP/1.1 201 CreatedLocation: http://localhost:5984/mydb/42621b2516001626{ "ok":true, "id":"42621b2516001626", "rev":"1-967a00dff5e02add41819138abb3284d"}
  • 24. Read a Document$ curl -iX GET http://localhost:5984/mydb/42621b2516001626HTTP/1.1 200 OKEtag: "1-967a00dff5e02add41819138abb3284d"{ "_id":"42621b2516001626", "_rev":"1-967a00dff5e02add41819138abb3284d"}
  • 25. When updating a document, CouchDBrequires the correct document revisionnumber as part of its Multi-VersionConcurrency Control (MVCC). This formof optimistic concurrency ensures thatanother client hasnt modi ed thedocument since it was last retrieved.
  • 26. Update a Document$ curl -iX PUT http://localhost:5984/mydb/42621b2516001626-H "Content-Type: application/json"-d { "_id":"42621b2516001626", "_rev":"1-967a00dff5e02add41819138abb3284d", "title":"CouchDB: The De nitive Guide"}HTTP/1.1 201 Created{ "ok":true, "id":"42621b2516001626", "rev":"2-bbd27429fd1a0daa2b946cbacb22dc3e"}
  • 27. Conditional GET$ curl -iX GET http://localhost:5984/mydb/42621b2516001626-H If-None-Match: "2-bbd27429fd1a0daa2b946cbacb22dc3e"HTTP/1.1 304 Not ModifiedEtag: "2-bbd27429fd1a0daa2b946cbacb22dc3e"Content-Length: 0
  • 28. Delete a Document$ curl -iX DELETE http://localhost:5984/mydb/42621b2516001626-H If-Match: "2-bbd27429fd1a0daa2b946cbacb22dc3e"HTTP/1.1 200 OK{ "ok":true, "id":"42621b2516001626", "rev":"3-29d2ef6e0d3558a3547a92dac51f3231"}
  • 29. Read a Deleted Document$ curl -iX GET http://localhost:5984/mydb/42621b2516001626HTTP/1.1 404 Object Not Found{ "error":"not_found", "reason":"deleted"}
  • 30. Read a Deleted Document$ curl -iX GET http://localhost:5984/mydb/42621b2516001626?rev=3-29d2ef6e0d3558a3547a92dac51f3231HTTP/1.1 200 OKEtag: "3-29d2ef6e0d3558a3547a92dac51f3231"{ "_id":"42621b2516001626", "_rev":"3-29d2ef6e0d3558a3547a92dac51f3231", "_deleted":true}
  • 31. Fetch Revisions$ curl -iX GET http://localhost:5984/mydb/42621b2516001626?rev=3-29d2ef6e0d3558a3547a92dac51f3231&revs=trueHTTP/1.1 200 OK{ // … "_revisions":{ "start":3, "ids":[ "29d2ef6e0d3558a3547a92dac51f3231", "bbd27429fd1a0daa2b946cbacb22dc3e", "967a00dff5e02add41819138abb3284d" ] }}
  • 32. Do not rely on older revisions.Revisions are used for concurrencycontrol and for replication. Oldrevisions are removed during databasecompaction.
  • 33. MapReduce Viewsfunction(doc) { if (doc.title) { emit(doc.title); } }
  • 34. MapReduce consists of Map andReduce steps which can be distributedin a way that takes advantage of themultiple processor cores found inmodern hardware.
  • 35. Futon
  • 36. Create a Database
  • 37. Map and Reduce are written asJavaScript functions that are de nedwithin views. Temporary views can beused during development but shouldbe saved permanently to designdocuments for production. Temporaryviews can be very slow.
  • 38. Map
  • 39. Add the First Document{ "_id":"978-0-596-15589-6", "title":"CouchDB: The De nitive Guide", "subtitle":"Time to Relax", "authors":[ "J. Chris Anderson", "Jan Lehnardt", "Noah Slater" ], "publisher":"OReilly Media", "released":"2010-01-19", "pages":272}
  • 40. One-To-One Mapping
  • 41. Map Book Titlesfunction(doc) { // JSON object representing a doc to be mapped if (doc.title) { // make sure this doc has a title emit(doc.title); // emit the doc’s title as the key }}
  • 42. The emit function accepts twoarguments. The rst is a key and thesecond a value. Both are optional anddefault to null.
  • 43. “Titles” Temporary View
  • 44. “Titles” Temporary View key id value "CouchDB: The "978-0-596-15589-6" nullDefinitive Guide"
  • 45. Add a Second Document{ "_id":"978-0-596-52926-0", "title":"RESTful Web Services", "subtitle":"Web services for the real world", "authors":[ "Leonard Richardson", "Sam Ruby" ], "publisher":"OReilly Media", "released":"2007-05-08", "pages":448}
  • 46. “Titles” Temporary View
  • 47. “Titles” Temporary View key id value "CouchDB: The "978-0-596-15589-6" nullDefinitive Guide" "RESTful Web "978-0-596-52926-0" null Services"
  • 48. One-To-Many Mapping
  • 49. Add a “formats” Field toBoth Documents{ // … "formats":[ "Print", "Ebook", "Safari Books Online" ]}
  • 50. Add a Third Document{ "_id":"978-1-565-92580-9", "title":"DocBook: The De nitive Guide", "authors":[ "Norman Walsh", "Leonard Muellner" ], "publisher":"OReilly Media", "formats":[ "Print" ], "released":"1999-10-28", "pages":648}
  • 51. Map Book Formatsfunction(doc) { // JSON object representing a doc to be mapped if (doc.formats) { // make sure this doc has a formats eld for (var i in doc.formats) { emit(doc.formats[i]); // emit each format as the key } }}
  • 52. “Formats” Temporary View key id value "Ebook" "978-0-596-15589-6" null "Ebook" "978-0-596-52926-0" null "Print" "978-0-596-15589-6" null "Print" "978-0-596-52926-0" null "Print" "978-1-565-92580-9" null"Safari Books Online" "978-0-596-15589-6" null"Safari Books Online" "978-0-596-52926-0" null
  • 53. When querying a view, the “key” and“id” elds can be used to select a row orrange of rows. Optionally, rows can begrouped by “key” elds or by levels of“key” elds.
  • 54. Map Book Authorsfunction(doc) { // JSON object representing a doc to be mapped if (doc.authors) { // make sure this doc has an authors eld for (var i in doc.authors) { emit(doc.authors[i]); // emit each author as the key } }}
  • 55. “Authors” Temporary View key id value "J. Chris Anderson" "978-0-596-15589-6" null "Jan Lehnardt" "978-0-596-15589-6" null "Leonard Muellner" "978-1-565-92580-9" null"Leonard Richardson" "978-0-596-52926-0" null "Noah Slater" "978-0-596-15589-6" null "Norman Walsh" "978-1-565-92580-9" null "Sam Ruby" "978-0-596-52926-0" null
  • 56. Reduce
  • 57. Built-in Reduce Functions Function Output_count Returns the number of mapped values in the set_sum Returns the sum of the set of mapped values Returns numerical statistics of the mapped values in_stats the set including the sum, count, min, and max
  • 58. Count
  • 59. Format Count, Not Grouped
  • 60. Format Count, Not Grouped key value null 7
  • 61. Format Count, Grouped
  • 62. Format Count, Grouped key value "Ebook" 2 "Print" 3 "Safari Books Online" 2
  • 63. Sum
  • 64. Sum of Pages by Format
  • 65. Sum of Pages by Format key value "Ebook" 720 "Print" 1368 "Safari Books Online" 720
  • 66. Statssum, count, minimum, maximum,sum over all square roots
  • 67. Stats of Pages by Format
  • 68. Stats of Pages by Format key value {"sum":720,"count":2,"min":272, "Ebook" "max":448,"sumsqr":274688} {"sum":1368,"count":3,"min":272, "Print" "max":648,"sumsqr":694592} {"sum":720,"count":2,"min":272,"Safari Books Online" "max":448,"sumsqr":274688}
  • 69. Custom Reduce Functions
  • 70. The built-in Reduce functions shouldserve your needs most, if not all, of thetime. If you nd yourself writing acustom Reduce function, then take astep back and make sure that one ofthe built-in Reduce functions wontserve your needs better.
  • 71. Reduce Function Skeletonfunction(keys, values, rereduce) {}
  • 72. ParametersKeys: An array of mapped key and document IDs in the form of[key,id] where id is the document ID.Values: An array of mapped values.Rereduce: Whether or not the Reduce function is being calledrecursively on its own output.
  • 73. Count Equivalentfunction(keys, values, rereduce) { if (rereduce) { return sum(values); } else { return values.length; }}
  • 74. Sum Equivalentfunction(keys, values, rereduce) { return sum(values);}
  • 75. MapReduce LimitationsFull-text indexing and ad hoc searching • couchdb-lucene • ElasticSearch and CouchDB indexing and search (two dimensional) • GeoCouch • Geohash (e.g. dr5rusx1qkvvr)
  • 76. Querying Views
  • 77. You can query for all rows, a singlecontiguous range of rows, or even rowsmatching a speci ed key.
  • 78. Add a Fourth Document{ "_id":"978-0-596-80579-1", "title":"Building iPhone Apps with HTML, CSS, and JavaScript", "subtitle":"Making App Store Apps Without Objective-C or Cocoa", "authors":[ "Jonathan Stark" ], "publisher":"OReilly Media", "formats":[ "Print", "Ebook", "Safari Books Online" ], "released":"2010-01-08", "pages":192}
  • 79. Map Book Releasesfunction(doc) { if (doc.released) { emit(doc.released.split("-"), doc.pages); }}
  • 80. Save the “Releases” View
  • 81. “Releases”, Exact Grouping
  • 82. “Releases”, Exact Grouping key value {"sum":648,"count":1,"min":648,["1999","10","28"] "max":648,"sumsqr":419904} {"sum":448,"count":1,"min":448,["2007","05","08"] "max":448,"sumsqr":200704} {"sum":192,"count":1,"min":192,["2010","01","08"] "max":192,"sumsqr":36864} {"sum":272,"count":1,"min":272,["2010","01","19"] "max":272,"sumsqr":73984}
  • 83. “Releases”, Level 1 Grouping
  • 84. “Releases”, Level 1 Grouping key value {"sum":648,"count":1,"min":648, ["1999"] "max":648,"sumsqr":419904} {"sum":448,"count":1,"min":448, ["2007"] "max":448,"sumsqr":200704} {"sum":464,"count":2,"min":192, ["2010"] "max":272,"sumsqr":110848}
  • 85. “Releases”, Level 2 Grouping
  • 86. “Releases”, Level 2 Grouping key value {"sum":648,"count":1,"min":648, ["1999","10"] "max":648,"sumsqr":419904} {"sum":448,"count":1,"min":448, ["2007","05"] "max":448,"sumsqr":200704} {"sum":464,"count":2,"min":192, ["2010","01"] "max":272,"sumsqr":110848}
  • 87. PHP Libraries for CouchDBPHPCouch CouchDB Extension (PECL) ODM
  • 88. Do-It-YourselfZend_Http_Client + Zend_CachePEAR’s HTTP_ClientPHP cURLJSON maps nicely to and from PHP arrays using Zend_Json orPHP’s json_ functions
  • 89. Scaling
  • 90. Load BalancingSend POST, PUT, and DELETE requests to a write-only master nodeSetup continuous replication from the master node to multipleread-only nodesLoad balance GET, HEAD, and OPTIONS requests amongstread-only nodes • Apache HTTP Server (mod_proxy) • nginx • HAProxy • Varnish • Squid
  • 91. Clustering(Partitioning/Sharding)Lounge • Proxy, partitioning, and shardingBigCouch • Clusters modeled after Amazon’s Dynamo approachPillow • “…a combined router and rereducer for CouchDB.”
  • 92. What else?
  • 93. AuthenticationBasic AuthenticationCookie AuthenticationOAuth
  • 94. AuthorizationServer AdminDatabase ReaderDocument Update Validation
  • 95. Hypermedia ControlsShow FunctionsList FunctionsSee:
  • 96. ReplicationPeer-based & bi-directionalDocuments and modi ed elds are incrementally replicated.All data will be eventually consistent.Con icts are agged and can be handled by application logic.Partial replicas can be created via JavaScript lter functions.
  • 97. HostingCouchOne (now CouchBase)
  • 98. DistributingCouchApp • Applications built using JavaScript, HTML5, CSS and CouchDBMobile • Android
  • 99. CouchDB ResourcesCouchDB: The De nitive Guide CouchDB Wikiby J. Chris Anderson, Jan, and Noah Slater(O’Reilly) Beginning CouchDB978-0-596-15589-6 by Joe Lennon (Apress) 978-1-430-27237-3Writing and Querying MapReduceViews in CouchDBby Bradley Holt (O’Reilly)978-1-449-30312-9Scaling CouchDBby Bradley Holt (O’Reilly)063-6-920-01840-7
  • 100. Questions?
  • 101. Thank You Bradley Holt ( @BradleyHolt ( © 2011 Bradley Holt. All rights reserved.