Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

CouchDB at New York PHP


Published on

This is a presentation on CouchDB that I gave at the New York PHP User Group. I talked about the basics of CouchDB, its JSON documents, its RESTful API, writing and querying MapReduce views, using CouchDB from within PHP, and scaling.

Published in: Technology, News & Politics
  • Be the first to comment

CouchDB at New York PHP

  1. CouchDB: JSON,HTTP & MapReduce Bradley Holt ( (
  2. About Me
  3. Co-Founder andTechnical Director
  4. from Vermont
  5. Organizer BTV
  6. Contributor
  7. Author
  8. CouchDB Basics
  9. Cluster Of UnreliableCommodity HardwareDocument-oriented, schema-lessShared nothing, horizontally scalableRuns on the Erlang OTP platformPeer-based, bi-directional replicationRESTful HTTP APIQueries are done against MapReduce “views”, or indexes
  10. When You MightConsider CouchDBYou’ve found yourself denormalizing your SQL database for betterperformance.Your domain model is a “ t” for documents (e.g. a CMS).Your application is read-heavy.You need a high level of concurrency and can give up consistencyin exchange.You need horizontal scalability.You want your database to be able to run anywhere, even onmobile devices, and even when disconnected from the cluster.
  11. Trade-OffsNo ad hoc queries. You need to know what you’re going to want toquery ahead of time. For example, SQL would be a better t forbusiness intelligence reporting.No concept of “joins”. You can relate data, but watch out forconsistency issues.Transactions are limited to document boundaries.CouchDB trades storage space for performance.
  12. Other Alternatives to SQLMongoDB (a database for Hadoop)
  13. Don’t be so quick to get rid of SQL!There are many problems for which anSQL database is a good t. SQL is verypowerful and exible query language.
  14. JSON Documents{"title":"CouchDB: The De nitive Guide"}
  15. JSON (JavaScript Object Notation) is ahuman-readable and lightweight datainterchange format.Data structures from manyprogramming languages can beeasily converted to and from JSON.
  16. JSON ValuesA JSON object is a collection of key/value pairs.JSON values can be strings, numbers, booleans (false or true),arrays (e.g. ["a", "b", "c"]), null, or another JSON object.
  17. A “Book” JSON Object{ "_id":"978-0-596-15589-6", "title":"CouchDB: The De nitive Guide", "subtitle":"Time to Relax", "authors":[ "J. Chris Anderson", "Jan Lehnardt", "Noah Slater" ], "publisher":"OReilly Media", "released":"2010-01-19", "pages":272}
  18. RESTful HTTP APIcurl -iX PUT http://localhost:5984/mydb
  19. Representational State Transfer (REST)is a software architecture style thatdescribes distributed hypermediasystems such as the World Wide Web.
  20. HTTP is distributed,scalable, and cacheable.
  21. Everyone speaks HTTP.
  22. Create a Database$ curl -iX PUT http://localhost:5984/mydbHTTP/1.1 201 CreatedLocation: http://localhost:5984/mydb{"ok":true}
  23. Create a Document$ curl -iX POST http://localhost:5984/mydb-H "Content-Type: application/json"-d {"_id":"42621b2516001626"}HTTP/1.1 201 CreatedLocation: http://localhost:5984/mydb/42621b2516001626{ "ok":true, "id":"42621b2516001626", "rev":"1-967a00dff5e02add41819138abb3284d"}
  24. Read a Document$ curl -iX GET http://localhost:5984/mydb/42621b2516001626HTTP/1.1 200 OKEtag: "1-967a00dff5e02add41819138abb3284d"{ "_id":"42621b2516001626", "_rev":"1-967a00dff5e02add41819138abb3284d"}
  25. When updating a document, CouchDBrequires the correct document revisionnumber as part of its Multi-VersionConcurrency Control (MVCC). This formof optimistic concurrency ensures thatanother client hasnt modi ed thedocument since it was last retrieved.
  26. Update a Document$ curl -iX PUT http://localhost:5984/mydb/42621b2516001626-H "Content-Type: application/json"-d { "_id":"42621b2516001626", "_rev":"1-967a00dff5e02add41819138abb3284d", "title":"CouchDB: The De nitive Guide"}HTTP/1.1 201 Created{ "ok":true, "id":"42621b2516001626", "rev":"2-bbd27429fd1a0daa2b946cbacb22dc3e"}
  27. Conditional GET$ curl -iX GET http://localhost:5984/mydb/42621b2516001626-H If-None-Match: "2-bbd27429fd1a0daa2b946cbacb22dc3e"HTTP/1.1 304 Not ModifiedEtag: "2-bbd27429fd1a0daa2b946cbacb22dc3e"Content-Length: 0
  28. Delete a Document$ curl -iX DELETE http://localhost:5984/mydb/42621b2516001626-H If-Match: "2-bbd27429fd1a0daa2b946cbacb22dc3e"HTTP/1.1 200 OK{ "ok":true, "id":"42621b2516001626", "rev":"3-29d2ef6e0d3558a3547a92dac51f3231"}
  29. Read a Deleted Document$ curl -iX GET http://localhost:5984/mydb/42621b2516001626HTTP/1.1 404 Object Not Found{ "error":"not_found", "reason":"deleted"}
  30. Read a Deleted Document$ curl -iX GET http://localhost:5984/mydb/42621b2516001626?rev=3-29d2ef6e0d3558a3547a92dac51f3231HTTP/1.1 200 OKEtag: "3-29d2ef6e0d3558a3547a92dac51f3231"{ "_id":"42621b2516001626", "_rev":"3-29d2ef6e0d3558a3547a92dac51f3231", "_deleted":true}
  31. Fetch Revisions$ curl -iX GET http://localhost:5984/mydb/42621b2516001626?rev=3-29d2ef6e0d3558a3547a92dac51f3231&revs=trueHTTP/1.1 200 OK{ // … "_revisions":{ "start":3, "ids":[ "29d2ef6e0d3558a3547a92dac51f3231", "bbd27429fd1a0daa2b946cbacb22dc3e", "967a00dff5e02add41819138abb3284d" ] }}
  32. Do not rely on older revisions.Revisions are used for concurrencycontrol and for replication. Oldrevisions are removed during databasecompaction.
  33. MapReduce Viewsfunction(doc) { if (doc.title) { emit(doc.title); } }
  34. MapReduce consists of Map andReduce steps which can be distributedin a way that takes advantage of themultiple processor cores found inmodern hardware.
  35. Futon
  36. Create a Database
  37. Map and Reduce are written asJavaScript functions that are de nedwithin views. Temporary views can beused during development but shouldbe saved permanently to designdocuments for production. Temporaryviews can be very slow.
  38. Map
  39. Add the First Document{ "_id":"978-0-596-15589-6", "title":"CouchDB: The De nitive Guide", "subtitle":"Time to Relax", "authors":[ "J. Chris Anderson", "Jan Lehnardt", "Noah Slater" ], "publisher":"OReilly Media", "released":"2010-01-19", "pages":272}
  40. One-To-One Mapping
  41. Map Book Titlesfunction(doc) { // JSON object representing a doc to be mapped if (doc.title) { // make sure this doc has a title emit(doc.title); // emit the doc’s title as the key }}
  42. The emit function accepts twoarguments. The rst is a key and thesecond a value. Both are optional anddefault to null.
  43. “Titles” Temporary View
  44. “Titles” Temporary View key id value "CouchDB: The "978-0-596-15589-6" nullDefinitive Guide"
  45. Add a Second Document{ "_id":"978-0-596-52926-0", "title":"RESTful Web Services", "subtitle":"Web services for the real world", "authors":[ "Leonard Richardson", "Sam Ruby" ], "publisher":"OReilly Media", "released":"2007-05-08", "pages":448}
  46. “Titles” Temporary View
  47. “Titles” Temporary View key id value "CouchDB: The "978-0-596-15589-6" nullDefinitive Guide" "RESTful Web "978-0-596-52926-0" null Services"
  48. One-To-Many Mapping
  49. Add a “formats” Field toBoth Documents{ // … "formats":[ "Print", "Ebook", "Safari Books Online" ]}
  50. Add a Third Document{ "_id":"978-1-565-92580-9", "title":"DocBook: The De nitive Guide", "authors":[ "Norman Walsh", "Leonard Muellner" ], "publisher":"OReilly Media", "formats":[ "Print" ], "released":"1999-10-28", "pages":648}
  51. Map Book Formatsfunction(doc) { // JSON object representing a doc to be mapped if (doc.formats) { // make sure this doc has a formats eld for (var i in doc.formats) { emit(doc.formats[i]); // emit each format as the key } }}
  52. “Formats” Temporary View key id value "Ebook" "978-0-596-15589-6" null "Ebook" "978-0-596-52926-0" null "Print" "978-0-596-15589-6" null "Print" "978-0-596-52926-0" null "Print" "978-1-565-92580-9" null"Safari Books Online" "978-0-596-15589-6" null"Safari Books Online" "978-0-596-52926-0" null
  53. When querying a view, the “key” and“id” elds can be used to select a row orrange of rows. Optionally, rows can begrouped by “key” elds or by levels of“key” elds.
  54. Map Book Authorsfunction(doc) { // JSON object representing a doc to be mapped if (doc.authors) { // make sure this doc has an authors eld for (var i in doc.authors) { emit(doc.authors[i]); // emit each author as the key } }}
  55. “Authors” Temporary View key id value "J. Chris Anderson" "978-0-596-15589-6" null "Jan Lehnardt" "978-0-596-15589-6" null "Leonard Muellner" "978-1-565-92580-9" null"Leonard Richardson" "978-0-596-52926-0" null "Noah Slater" "978-0-596-15589-6" null "Norman Walsh" "978-1-565-92580-9" null "Sam Ruby" "978-0-596-52926-0" null
  56. Reduce
  57. Built-in Reduce Functions Function Output_count Returns the number of mapped values in the set_sum Returns the sum of the set of mapped values Returns numerical statistics of the mapped values in_stats the set including the sum, count, min, and max
  58. Count
  59. Format Count, Not Grouped
  60. Format Count, Not Grouped key value null 7
  61. Format Count, Grouped
  62. Format Count, Grouped key value "Ebook" 2 "Print" 3 "Safari Books Online" 2
  63. Sum
  64. Sum of Pages by Format
  65. Sum of Pages by Format key value "Ebook" 720 "Print" 1368 "Safari Books Online" 720
  66. Statssum, count, minimum, maximum,sum over all square roots
  67. Stats of Pages by Format
  68. Stats of Pages by Format key value {"sum":720,"count":2,"min":272, "Ebook" "max":448,"sumsqr":274688} {"sum":1368,"count":3,"min":272, "Print" "max":648,"sumsqr":694592} {"sum":720,"count":2,"min":272,"Safari Books Online" "max":448,"sumsqr":274688}
  69. Custom Reduce Functions
  70. The built-in Reduce functions shouldserve your needs most, if not all, of thetime. If you nd yourself writing acustom Reduce function, then take astep back and make sure that one ofthe built-in Reduce functions wontserve your needs better.
  71. Reduce Function Skeletonfunction(keys, values, rereduce) {}
  72. ParametersKeys: An array of mapped key and document IDs in the form of[key,id] where id is the document ID.Values: An array of mapped values.Rereduce: Whether or not the Reduce function is being calledrecursively on its own output.
  73. Count Equivalentfunction(keys, values, rereduce) { if (rereduce) { return sum(values); } else { return values.length; }}
  74. Sum Equivalentfunction(keys, values, rereduce) { return sum(values);}
  75. MapReduce LimitationsFull-text indexing and ad hoc searching • couchdb-lucene • ElasticSearch and CouchDB indexing and search (two dimensional) • GeoCouch • Geohash (e.g. dr5rusx1qkvvr)
  76. Querying Views
  77. You can query for all rows, a singlecontiguous range of rows, or even rowsmatching a speci ed key.
  78. Add a Fourth Document{ "_id":"978-0-596-80579-1", "title":"Building iPhone Apps with HTML, CSS, and JavaScript", "subtitle":"Making App Store Apps Without Objective-C or Cocoa", "authors":[ "Jonathan Stark" ], "publisher":"OReilly Media", "formats":[ "Print", "Ebook", "Safari Books Online" ], "released":"2010-01-08", "pages":192}
  79. Map Book Releasesfunction(doc) { if (doc.released) { emit(doc.released.split("-"), doc.pages); }}
  80. Save the “Releases” View
  81. “Releases”, Exact Grouping
  82. “Releases”, Exact Grouping key value {"sum":648,"count":1,"min":648,["1999","10","28"] "max":648,"sumsqr":419904} {"sum":448,"count":1,"min":448,["2007","05","08"] "max":448,"sumsqr":200704} {"sum":192,"count":1,"min":192,["2010","01","08"] "max":192,"sumsqr":36864} {"sum":272,"count":1,"min":272,["2010","01","19"] "max":272,"sumsqr":73984}
  83. “Releases”, Level 1 Grouping
  84. “Releases”, Level 1 Grouping key value {"sum":648,"count":1,"min":648, ["1999"] "max":648,"sumsqr":419904} {"sum":448,"count":1,"min":448, ["2007"] "max":448,"sumsqr":200704} {"sum":464,"count":2,"min":192, ["2010"] "max":272,"sumsqr":110848}
  85. “Releases”, Level 2 Grouping
  86. “Releases”, Level 2 Grouping key value {"sum":648,"count":1,"min":648, ["1999","10"] "max":648,"sumsqr":419904} {"sum":448,"count":1,"min":448, ["2007","05"] "max":448,"sumsqr":200704} {"sum":464,"count":2,"min":192, ["2010","01"] "max":272,"sumsqr":110848}
  87. PHP Libraries for CouchDBPHPCouch CouchDB Extension (PECL) ODM
  88. Do-It-YourselfZend_Http_Client + Zend_CachePEAR’s HTTP_ClientPHP cURLJSON maps nicely to and from PHP arrays using Zend_Json orPHP’s json_ functions
  89. Scaling
  90. Load BalancingSend POST, PUT, and DELETE requests to a write-only master nodeSetup continuous replication from the master node to multipleread-only nodesLoad balance GET, HEAD, and OPTIONS requests amongstread-only nodes • Apache HTTP Server (mod_proxy) • nginx • HAProxy • Varnish • Squid
  91. Clustering(Partitioning/Sharding)Lounge • Proxy, partitioning, and shardingBigCouch • Clusters modeled after Amazon’s Dynamo approachPillow • “…a combined router and rereducer for CouchDB.”
  92. What else?
  93. AuthenticationBasic AuthenticationCookie AuthenticationOAuth
  94. AuthorizationServer AdminDatabase ReaderDocument Update Validation
  95. Hypermedia ControlsShow FunctionsList FunctionsSee:
  96. ReplicationPeer-based & bi-directionalDocuments and modi ed elds are incrementally replicated.All data will be eventually consistent.Con icts are agged and can be handled by application logic.Partial replicas can be created via JavaScript lter functions.
  97. HostingCouchOne (now CouchBase)
  98. DistributingCouchApp • Applications built using JavaScript, HTML5, CSS and CouchDBMobile • Android
  99. CouchDB ResourcesCouchDB: The De nitive Guide CouchDB Wikiby J. Chris Anderson, Jan, and Noah Slater(O’Reilly) Beginning CouchDB978-0-596-15589-6 by Joe Lennon (Apress) 978-1-430-27237-3Writing and Querying MapReduceViews in CouchDBby Bradley Holt (O’Reilly)978-1-449-30312-9Scaling CouchDBby Bradley Holt (O’Reilly)063-6-920-01840-7
  100. Questions?
  101. Thank You Bradley Holt ( @BradleyHolt ( © 2011 Bradley Holt. All rights reserved.