Advanced CouchDB

http://joind.in/2495

PHPBenelux conference January 2011

Transcript

  • 1. CouchDB relax
  • 2. CouchDB relax. Sander van de Graaf, @svdgraaf. Focus -> practical usage examples
  • 3. http://joind.in/talk/view/2495 (second talk ever, please provide feedback)
  • 4. CONTENTS: Introduction • PHP Usage • Replication/Scalability • Backend usage • Couchapps • Other stuff
  • 5. CouchDB relax
  • 6. NOSQL
  • 7. IT’S A MOVEMENT. Definitions vary
  • 8. 1998. Back in the day...
  • 9. Lame movie 1
  • 10. Another one
  • 11. And then some more...
  • 12. XML was introduced
  • 13. Some game was published
  • 14. McDonald’s Happy Meal
  • 15. Carlo Strozzi released NoSQL, an open source DB
  • 16. NOSQL == Not Only SQL
  • 17. “[The NoSQL movement] departs from the relational model altogether, it should therefore have been called more appropriately ‘NoREL’, or something to that effect.” - Carlo Strozzi
  • 18. CouchDB relax
  • 19. Ubuntu One, contacts sync
  • 20. NUTSHELL
  • 21. SPEED. Speed, not disk space (see cleanup)
  • 22. APPEND ONLY. Append-only storage, happy cup of coffee!
  • 23. NO REPAIR NEEDED
  • 24. COMPACTING
  • 25. HTTP SERVER. Caching, load balancing, without extra costs :D
  • 26. CAP
  • 27. CouchDB CAP
  • 28. CouchDB EVENTUALLY CONSISTENT. CouchDB focuses on Availability + Partition tolerance, and becomes consistent after replication.
  • 29. FULL REST API
  • 30. REST: GET -> SELECT • PUT -> UPDATE • POST -> INSERT • DELETE -> DELETE • COPY -> ...
  • 31. JSON
        {
          "total_rows": 2,
          "offset": 0,
          "rows": [
            { "id": "_design/foobar", "key": "_design/foobar", "value": { "rev": "5-982b2fc36835715b2aae54609b5d5f1e" } },
            { "id": "f0e1fd96eb6e094f74dda8d949000a6a", "key": "f0e1fd96eb6e094f74dda8d949000a6a", "value": { "rev": "1-86bca407fce8234a63c90ff549b56b10" } }
          ]
        }
        Javascript == awesome! :D
  • 32. REPLICATION. Key feature, relaxed about replication issues and version conflicts
  • 33. Welcome to Futon, I prefer a UI
  • 34. http-console rocks the socks out of telnet
  • 35. Berkeley
  • 36. CONTENTS: Introduction • PHP Usage • Replication/Scalability • Backend usage • Couchapps • Other stuff
  • 37. PHP USAGE
  • 38. PHP LIBRARIES: PHPillow (LGPL) • PHP Object Freezer (BSD) • PHP On Couch (GPL 2 / 3) • PHP CouchDB Extension (PHP license) • SAG for CouchDB (Apache) • Doctrine 2 CouchDB ODM. All are quite nice; Doctrine has some rough edges. I use PHP On Couch with a custom patch for Zend autoloader ease
  • 39. PHP On Couch with small ZF autoloader fix
        <?php
        // set up the connection to CouchDB
        $client = new Couchdb_Client('http://ponies.couchone.com:5984', 'rainbows');
        // fetch a document
        $doc = $client->getDoc('awesome_pony');
        // update the document
        $doc->newproperty = array("type", "awesome");
        try {
            $client->storeDoc($doc);
        } catch (Exception $e) {
            echo "Document storage failed : " . $e->getMessage();
        }
  • 40. CONTENTS: Introduction • PHP Usage • Replication/Scalability • Backend usage • Couchapps • Other stuff
  • 41. REPLICATION
  • 42. DEFINITION: “Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.” Source: Wikipedia
  • 43. CouchDB relax
  • 44. CouchDB relax CouchDB relax
  • 45. CouchDB
  • 46. CouchDB relax CouchDB relax CouchDB relax CouchDB relax. MySQL can do this
  • 47. CouchDB relax CouchDB relax. Master-master replication
  • 48. CouchDB relax CouchDB relax CouchDB relax
  • 49. US, NL, BE: CouchDB relax CouchDB relax CouchDB relax. Not only locally
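Master-master replication can be pictured as two one-way replications, one in each direction. The following is a minimal sketch, assuming CouchDB's `POST /_replicate` endpoint with a JSON body containing `source`, `target` and `continuous`. The host names and the helper names `replicationRequest` and `masterMaster` are hypothetical, and the code only builds the requests without sending them.

```javascript
// Hypothetical node URLs, not taken from the talk.
const nl = "http://nl.example.com:5984/rainbows";
const be = "http://be.example.com:5984/rainbows";

// Build (but do not send) the request for a one-way replication.
function replicationRequest(source, target, continuous) {
  return {
    method: "POST",
    path: "/_replicate",
    body: JSON.stringify({ source, target, continuous }),
  };
}

// Master-master: one replication per direction.
function masterMaster(a, b) {
  return [replicationRequest(a, b, true), replicationRequest(b, a, true)];
}

const requests = masterMaster(nl, be);
console.log(requests);
```

Adding a third location would mean adding more pairs in the same way; which pairs to connect is a topology choice this sketch does not make.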
  • 50. P2P WEB
  • 51. “World Domination”
  • 52. CLUSTERING “The fun stuff”
  • 53. CouchDB doesn’t support partitioning (sharding) itself. CouchDB -> HTTP based -> lots of possibilities
  • 54. loadbalancer ...n CouchDB relax CouchDB relax. The basics are all the same: easy => CouchDB instances 1..n => loadbalancer
  • 55. CHALLENGES: Large amounts of data • Large views (with big/long map/reduce queries) • LOTS of traffic • Location based partitions • For fun and profit
  • 56. MAP/REDUCE
  • 57. INPUT (Map/Reduce example)
        IP              Bytes
        212.122.174.13  18271
        212.122.174.13  191726
        212.122.174.13  198
        74.119.8.111    91272
        74.119.8.111    8371
        212.122.174.13  43
  • 58. MAPPER => REDUCER
        IP              Bytes
        212.122.174.13  18271, 191726, 198, 43
        74.119.8.111    91272, 8371
  • 59. AFTER REDUCE
        IP              Bytes
        212.122.174.13  210238
        74.119.8.111    99643
  • 60. PARTITION INPUT (Map/Reduce example)
        Partition  IP              Bytes
        0          212.122.174.13  18271
        0          212.122.174.13  191726
        0          212.122.174.13  198
        1          74.119.8.111    91272
        1          74.119.8.111    8371
        0          212.122.174.13  43
  • 61. MAPPER => REDUCER. If the data is big enough, you could even need a re-re-re-reducer
        Partition  IP              Bytes
        0          212.122.174.13  18271, 191726, 198, 43
        1          74.119.8.111    91272, 8371
  • 62. AFTER REDUCE
        IP              Bytes
        212.122.174.13  210238
        74.119.8.111    99643
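The partitioned map/reduce on slides 60 to 62 can be sketched in plain JavaScript. This is an illustration of the idea only, not how CouchDB or any clustering layer runs views internally; the function names `mapRow`, `reduceByKey` and `reducePartition` are made up for this example, and the rows are the ones from the PARTITION INPUT slide.

```javascript
// Rows from the PARTITION INPUT slide: [partition, ip, bytes].
const rows = [
  [0, "212.122.174.13", 18271],
  [0, "212.122.174.13", 191726],
  [0, "212.122.174.13", 198],
  [1, "74.119.8.111", 91272],
  [1, "74.119.8.111", 8371],
  [0, "212.122.174.13", 43],
];

// Map step: emit one (key, value) pair per input row.
function mapRow([, ip, bytes]) {
  return { key: ip, value: bytes };
}

// Reduce step: sum the values per key. The output has the same shape as
// the input, so the same function can also act as the rereducer.
function reduceByKey(pairs) {
  const sums = new Map();
  for (const { key, value } of pairs) {
    sums.set(key, (sums.get(key) || 0) + value);
  }
  return [...sums].map(([key, value]) => ({ key, value }));
}

// Each partition maps and reduces only its own rows.
function reducePartition(partition) {
  return reduceByKey(rows.filter((r) => r[0] === partition).map(mapRow));
}

// Rereduce: merge the partial results from all partitions.
const partials = [0, 1].flatMap(reducePartition);
const totals = reduceByKey(partials);
console.log(totals);
```

The totals come out as 210238 bytes for 212.122.174.13 and 99643 bytes for 74.119.8.111, matching the AFTER REDUCE slide. Summing is safe to rereduce because it is associative; a reduce that is not (an average, for instance) would need to carry extra state such as a count.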
  • 63. CLUSTERING OPTIONS: CouchDB Lounge • Pillow • BigCouch
  • 64. LOUNGE: partitioning/clustering • Nginx module • meebo.com • ‘easy’ • http://tilgovi.github.com/couchdb-lounge/
  • 65. LOUNGE: dumb_proxy => proxy for simple PUT/GETs • smart_proxy => proxy for map/reduce over shards • replicator => updates all copies, redundantly. It can make sure that there are N copies of a document at every moment
  • 66. nginx dumb_proxy ...n CouchDB relax CouchDB relax. dumb_proxy == ONLY GET/PUT
  • 67. nginx smart_proxy ...n CouchDB relax CouchDB relax. smart_proxy takes care of the map/reduce and re-reducers over multiple nodes
  • 68. Bonus: other nginx modules work too (mod_cache, mod_expire, etc.)
  • 69. PILLOW: Erlang based • router/rereducer (map/reduce over multiple systems) • In development (but promising!) • https://github.com/khellan/Pillow
  • 70. BIGCOUCH: Fork • 100% API compatible • Open Source/Commercial • https://cloudant.com/#!/solutions/bigcouch
  • 71. CONTENTS: Introduction • PHP Usage • Replication/Scalability • Backend usage • Couchapps • Other stuff
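A proxy that routes simple GET/PUT requests to shards needs some stable rule for picking a shard from a document id. The sketch below shows one such rule, a string hash taken modulo the number of shards. It is an assumption for illustration and not Lounge's actual algorithm; the shard URLs and the names `hashId` and `shardFor` are hypothetical.

```javascript
// Hypothetical shard URLs, not taken from the talk.
const shards = [
  "http://127.0.0.1:5984/rainbows0",
  "http://127.0.0.1:5985/rainbows1",
];

// Simple 32-bit string hash (illustrative; any stable hash would do).
function hashId(id) {
  let h = 0;
  for (const ch of id) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep within unsigned 32 bits
  }
  return h;
}

// The same id always maps to the same shard.
function shardFor(id) {
  return shards[hashId(id) % shards.length];
}

console.log(shardFor("awesome_pony"));
```

Because the mapping depends on `shards.length`, adding a shard moves many ids to a different node; that resharding cost is one of the reasons a dedicated clustering layer is attractive.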
  • 72. BACKEND USAGE
  • 73. PROXIED CouchDB relax. Proxied via middleware, or via mod_proxy or similar
  • 74. DIRECT CouchDB relax. Or direct: because CouchDB is HTTP based, content is directly available in JavaScript
  • 75. NOSQL && SQL HYBRID: onSave, onCommit hooks are available in every major framework • onSave -> make a JSON representation of your object, and PUT it to CouchDB (#protip: only ‘public’ data) • the SQL DB is leading, you don’t care about versioning in CouchDB • you can use your data directly from CouchDB within your frontend JavaScript
  • 76. MODEL
        <?php
        class Pony extends Application_models
        {
            public function toArray()
            {
                $data = $this->_getData();
                // strip private fields before publishing
                unset($data['created_on']);
                unset($data['created_by']);
                unset($data['access_level']);
                unset($data['private_data']);
                $data['tags'] = $this->getTags();
                $data['categories'] = $this->getCategories();
                $data['rainbows'] = 'double';
                return $data;
            }
        }
  • 77. AFTER_SAVE
        <?php
        class article_module extends admin_module
        {
            public function after_save()
            {
                parent::after_save();
                $data = $this->toJson();
                $res = CouchDB::put($data);
                // remember the id and revision CouchDB returned
                $this->_id = $res->_id;
                $this->_rev = $res->_rev;
            }
        }
  • 78. PROXY. Proxy the calls (works around the sandbox/other-domain error), or use JSONP
        RewriteEngine On
        RewriteRule /data/(.*) http://127.0.0.1:5984/db/$1 [P,L]
  • 79. JAVASCRIPT
        <script type="text/javascript">
        $.getJSON("/db/ponies/_design/ponies/_view/best-ponies?include_docs=true", function (res) {
            for (var i = 0; i < res.rows.length; i++) {
                var doc = res.rows[i].doc;
                // do stuff
            }
        });
        </script>
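The hybrid flow on slides 75 to 77 (strip private fields, then PUT the JSON to CouchDB) can be sketched as two small functions. This is a rough JavaScript rendering of the idea, not the speaker's PHP code: the base URL, the sample record and the names `toPublicDoc` and `putRequest` are hypothetical, and the request is only built, not sent.

```javascript
// Fields to keep private, mirroring the unset() calls on the MODEL slide.
const PRIVATE_FIELDS = ["created_on", "created_by", "access_level", "private_data"];

// Copy a record, leaving out everything that must not be public.
function toPublicDoc(record) {
  const doc = {};
  for (const [field, value] of Object.entries(record)) {
    if (!PRIVATE_FIELDS.includes(field)) doc[field] = value;
  }
  return doc;
}

// Build (but do not send) the PUT for one document.
function putRequest(baseUrl, id, record, rev) {
  const doc = toPublicDoc(record);
  if (rev) doc._rev = rev; // an update of an existing doc must name its current revision
  return {
    method: "PUT",
    url: `${baseUrl}/${encodeURIComponent(id)}`,
    body: JSON.stringify(doc),
  };
}

const req = putRequest("http://127.0.0.1:5984/db", "awesome_pony", {
  name: "Awesome Pony",
  created_by: "admin",
  private_data: "secret",
});
console.log(req);
```

Storing the `_rev` returned by CouchDB, as the AFTER_SAVE slide does, is what lets the next save pass it back in as `rev`.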
  • 80. CONTENTS: Introduction • PHP Usage • Replication/Scalability • Backend usage • Couchapps • Other stuff
  • 81. COUCHAPP. CouchDB has its own structure for “distributed, scalable web applications” called couchapps
  • 82. “Distributed, scalable, web applications you say? omgwtfbbq!?!1!!!11!1!eleven”
  • 83. _attachments. The magic is in _attachments
  • 84. CouchDB relax CouchDB relax CouchDB relax. Distribution via replication
  • 85. INSTALLATION: Couchapp 0.7.0. Installation is easy
  • 86. $ couchapp init  # init a project
  • 87. LAYOUT. Creates a default folder
  • 88. $ couchapp push http://ponies.couchone.com:5984/rainbows
  • 89. https://github.com/brandon-beacher/couchapp-tmbundle (couchapp push on save -> TextMate)
  • 90. CONTENTS: Introduction • PHP Usage • Replication/Scalability • Backend usage • Couchapps • Other stuff
  • 91. OTHER STUFF
  • 92. REWRITES
  • 93. _REWRITE
  • 94. $ curl 'http://ponies.couchone.com/rainbows/_design/ponies/_view/best-ponies?descending=true&limit=5&key="foobar"'
  • 95. such urls make us a sad panda
  • 96. {
          ....
          "rewrites": [
            {
              "from": "/best-5-ponies",
              "to": "ponies/_view/best-ponies",
              "method": "GET",
              "query": { "descending": true, "limit": 5, "key": "foobar" }
            }
          ]
        }
  • 97. Rewrite this: $ curl 'http://ponies.couchone.com/rainbows/_design/ponies/_view/best-ponies?descending=true&limit=5&key="foobar"'
  • 98. To this: $ curl "http://ponies.couchone.com/rainbows/_design/ponies/_rewrite/best-5-ponies"
  • 99. [vhosts]
        awesomeponies.com = /rainbows/_design/ponies/_rewrite
  • 100. Rewrite this: $ curl "http://ponies.couchone.com/rainbows/_design/ponies/_rewrite/best-5-ponies"
  • 101. To this: $ curl "http://awesomeponies.com/best-5-ponies"
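To see why the rewrite rule on slide 96 stands in for the long URL on slide 94, it helps to expand the rule's `query` object back into a query string. The sketch below does that in plain JavaScript. It assumes view parameters are JSON-encoded and then URL-escaped, and it is not CouchDB's own rewriter; `viewQuery` and `matchRule` are names invented for this example.

```javascript
// Rule copied from the rewrites slide.
const rule = {
  from: "/best-5-ponies",
  to: "ponies/_view/best-ponies",
  method: "GET",
  query: { descending: true, limit: 5, key: "foobar" },
};

// Serialize view parameters: JSON-encode each value, then URL-escape it.
function viewQuery(query) {
  return Object.entries(query)
    .map(([name, value]) => `${name}=${encodeURIComponent(JSON.stringify(value))}`)
    .join("&");
}

// Find the first rule matching method and path; undefined if none matches.
function matchRule(rules, method, path) {
  return rules.find((r) => r.method === method && r.from === path);
}

const hit = matchRule([rule], "GET", "/best-5-ponies");
const queryString = viewQuery(hit.query);
console.log(queryString); // descending=true&limit=5&key=%22foobar%22
```

The `%22` pairs are the escaped double quotes around `foobar`, the same quotes that make the hand-written curl URL awkward to type.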
  • 102. _CHANGES
  • 103. $ curl -X GET "http://ponies.couchone.com/rainbows/_changes"
  • 104. {"results":[],"last_seq":0}
  • 105. curl -X PUT http://ponies.couchone.com/rainbows/foobar -d '{"type":"awesome"}'
  • 106. {"results":[{"seq":1,"id":"foobar","changes":[{"rev":"1-aaa8e2a031bca334f50b48b6682fb486"}]}],"last_seq":1}
  • 107. {"results":[{"seq":1,"id":"foobar","changes":[{"rev":"1-aaa8e2a031bca334f50b48b6682fb486"}]},{"seq":2,"id":"foobar2","changes":[{"rev":"1-e18422e6a82d0f2157d74b5dcf457997"}]}],"last_seq":2}
  • 108. _CHANGES OPTIONS: ?since • Longpolling • Continuous
  • 109. $ curl -X GET "http://ponies.couchone.com/rainbows/_changes?since=20"
  • 110. curl -X GET "http://ponies.couchone.com/rainbows/_changes?feed=longpoll&since=2"
         Longpolling: good for infrequent updates. The connection stays open until a change arrives, then gets closed and you need to reconnect, so lots of updates mean lots of reconnects
  • 111. curl -X GET "http://ponies.couchone.com/rainbows/_changes?feed=continuous&since=2"
         The connection stays open, and you get updates on the fly!
  • 112. FILTERS. Filters can be used to filter documents from the output
  • 113. function(doc, req) { if (doc.priority == "high") { return true; } return false; }
         We only want high priority documents
  • 114. curl -X GET "http://ponies.couchone.com/rainbows/_changes?feed=continuous&filter=app/important"
  • 115. function(doc, req) { if (doc.name == req.query.name) { return true; } return false; }
         You can use req for request based filters
  • 116. curl -X GET "http://ponies.couchone.com/rainbows/_changes?feed=continuous&filter=app/name&name=foobar"
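A client that polls `_changes` has to remember `last_seq` and pass it back as `?since` on the next request. The sketch below shows that bookkeeping on the response from slide 107. It does no networking, and `applyChanges` and `nextUrl` are hypothetical helper names.

```javascript
// Response body copied from slide 107.
const response = {
  results: [
    { seq: 1, id: "foobar", changes: [{ rev: "1-aaa8e2a031bca334f50b48b6682fb486" }] },
    { seq: 2, id: "foobar2", changes: [{ rev: "1-e18422e6a82d0f2157d74b5dcf457997" }] },
  ],
  last_seq: 2,
};

// Fold one _changes response into the client's state:
// the latest known revision per document, plus where to resume.
function applyChanges(state, feed) {
  const seen = new Map(state.seen);
  for (const row of feed.results) {
    seen.set(row.id, row.changes[0].rev);
  }
  return { since: feed.last_seq, seen };
}

// URL for the next poll, resuming after the last sequence we processed.
function nextUrl(base, state) {
  return `${base}/_changes?since=${state.since}`;
}

const start = { since: 0, seen: new Map() };
const state = applyChanges(start, response);
console.log(nextUrl("http://ponies.couchone.com/rainbows", state));
```

With `feed=longpoll` the same loop applies, only each request blocks until there is something to report; with `feed=continuous` the rows arrive one per line on a single open connection instead.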
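Filter functions are plain JavaScript, so they can be tried out locally before being stored in a design document. The sketch below runs the two filters from slides 113 and 115 against a few made-up documents. The sample documents and the `filterChanges` helper are hypothetical, and `req` is reduced to the `query` part the filters actually read.

```javascript
// Filters as on slides 113 and 115 (string literal quoted).
const important = function (doc, req) {
  return doc.priority === "high";
};
const byName = function (doc, req) {
  return doc.name === req.query.name;
};

// Made-up documents for the dry run.
const docs = [
  { _id: "a", priority: "high", name: "foobar" },
  { _id: "b", priority: "low", name: "foobar" },
  { _id: "c", priority: "high", name: "baz" },
];

// Apply a filter the way the feed would: keep docs where it returns true.
function filterChanges(allDocs, filter, query) {
  const req = { query };
  return allDocs.filter((doc) => filter(doc, req)).map((doc) => doc._id);
}

console.log(filterChanges(docs, important, {}));             // [ 'a', 'c' ]
console.log(filterChanges(docs, byName, { name: "foobar" })); // [ 'a', 'b' ]
```

The `query` object plays the role of the extra URL parameters, such as `&name=foobar` on slide 116.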
  • 117. SHOWS
  • 118. function(doc, req) { return { body: "Hello World" }}
  • 119. curl -X GET "http://ponies.couchone.com/rainbows/_design/foobar/_show/showfunction/docid"
  • 120. function(doc) { return { "code": 302, "body": "See other", "headers": { "Location": doc.target } }; }
         You can also define HTTP headers. We used this for translating public ids into private storage ids. In this way, CouchDB took care of all the headers and HTTP stuff, and we could use a regular nginx proxy module
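The redirect show function from slide 120 can be exercised outside CouchDB by calling it with a document by hand. The sketch below adds one guard that is not on the slide: a 404 when no document is passed, on the assumption that CouchDB hands the function `null` for an unknown id. The sample document and its `target` value are made up.

```javascript
// Show function as on slide 120, plus a guard for a missing document
// (the guard is an addition for this sketch, not part of the talk).
function redirectShow(doc, req) {
  if (!doc) {
    return { code: 404, body: "Unknown id" };
  }
  return {
    code: 302,
    body: "See other",
    headers: { Location: doc.target },
  };
}

// Hypothetical document mapping a public id to a private storage location.
const publicDoc = { _id: "public-123", target: "/storage/private-456" };

const found = redirectShow(publicDoc, {});
const missing = redirectShow(null, {});
console.log(found, missing);
```

A fronting proxy only needs to follow the `Location` header, so the public-to-private id mapping stays entirely inside the database.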
  • 121. LUCENE
  • 122. [external]
         fti=/path/to/python /path/to/couchdb-lucene/tools/couchdb-external-hook.py
         [httpd_db_handlers]
         _fti = {couch_httpd_external, handle_external_req, <<"fti">>}
  • 123. function(doc) { var ret=new Document(); ret.add(doc.message); ret.add(new Date(doc.datetime)); return ret;}
  • 124. curl -X GET "http://ponies.couchone.com/rainbows/_fti/_design/unicorns/by-query?q=foobar"
  • 125. GEOCOUCH https://github.com/vmx/couchdb
  • 126. See Derick’s talk yesterday
  • 127. GEOCOUCH: Supports bbox • fork • outputs via lists, GeoRSS possible • directly usable by Google Maps • can read GIS data • combined with _changes makes an interesting use case
         - bbox => all items within a certain bounding box; polygon is in the works
         - currently a fork of CouchDB, in the works as an external module
         - output can be set up separately
         - Google Maps can use GeoRSS
         - GIS: Geographic Information System (used worldwide?)
  • 128. SPATIAL INDEX in spatial/points.js
         function(doc) {
           if (doc.geo && doc.geo.latitude != "" && doc.geo.longitude != "") {
             emit(
               { type: "Point", coordinates: [parseFloat(doc.geo.latitude), parseFloat(doc.geo.longitude)] },
               [doc._id, doc]
             );
           }
         }
  • 129. Worldwide search
         http://ponies.couchone.com/rainbows/_design/unicorns/_spatial/points?bbox=0,0,180,90
         {"update_seq":3,"rows":[{"id":"augsburg","bbox":[10.898333,48.371667,10.898333,48.371667],"value":["augsburg",[10.898333,48.371667]]}]}
  • 130. GEORSS && GOOGLE MAPS
         if (GBrowserIsCompatible()) {
           map = new GMap2(document.getElementById("map"));
           var geoXML = new GGeoXml("http://ponies.couchone.com/rainbows/url-to-georss-view");
           map.addOverlay(geoXML);
         }
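A bounding box query boils down to a range check on both coordinates. The sketch below parses a `bbox` parameter and tests the Augsburg point from slide 129 against it. It assumes the order is west, south, east, north, which is consistent with that response but is not taken from GeoCouch's code; `parseBbox` and `inBbox` are hypothetical names.

```javascript
// Parse "west,south,east,north" (assumed order) into named bounds.
function parseBbox(text) {
  const [west, south, east, north] = text.split(",").map(Number);
  return { west, south, east, north };
}

// A point [x, y] is inside when both coordinates fall within the bounds.
function inBbox(bbox, [x, y]) {
  return x >= bbox.west && x <= bbox.east && y >= bbox.south && y <= bbox.north;
}

const world = parseBbox("0,0,180,90");   // bbox from slide 129
const augsburg = [10.898333, 48.371667]; // point from the same response

console.log(inBbox(world, augsburg)); // true
```

A spatial index exists so that this check does not have to run against every document; the sketch only shows what the index is answering.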
  • 131. curl -X GET "http://ponies.couchone.com/rainbows/_design/alarmeringen/_spatial/points?bbox=51.711369,4.218407,52.136520,4.745740";
  • 132. Q?
  • 133. http://www.couchone.com/get
  • 134. http://joind.in/talk/view/2495 (second talk ever, please provide feedback)