Embracing Constraints With CouchDB

Presentation given at Dutch PHP Conference 2010.

http://joind.in/1651

Published in: Technology

Transcript

  • 1. EMBRACING CONSTRAINTS WITH COUCHDB
  • 2. David Zülke
  • 3. David Zuelke
  • 4. http://en.wikipedia.org/wiki/File:München_Panorama.JPG
  • 5. Founder
  • 6. Lead Developer
  • 7. @dzuelke #dpc10
  • 8. http://joind.in/1651
  • 9. A DISCLAIMER FIRST Before You All Figure This Out Yourselves...
  • 10.
  • 11. NEIN NEIN NEIN NEIN DAS IST BETRUG
  • 12. This talk is not really about embracing constraints
  • 13. I’ll tell you what it’s really about when we’re finished
  • 14. I’ll also apologize to you for lying at that point
  • 15. (it’s always easier to apologize than to ask for permission)
  • 16. COUCHDB IN THREE SLIDES Full Of DIS IS SRS BSNS Bullet Points
  • 17. COUCHDB STORES DOCUMENTS • CouchDB stores documents with arbitrary keys and values • Each document is identified by an ID and has a revision • Documents can have file attachments • Stored as JSON, so it’s easy to interface with
  • 18. COUCHDB SPEAKS HTTP • CouchDB uses HTTP to communicate with clients & servers • That means scalability • That means a lot of kick ass stuff totally for free • Caching • Load Balancing • Content Negotiation
  • 19. COUCHDB USES MVCC • Multiversion Concurrency Control • When updating, you must supply a revision number • Your change will be rejected if the revision is not the latest • All writes are serialized • No need for locks, but puts some responsibility on developers
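The revision rule from the MVCC slide can be sketched with a small in-memory model. This is not how CouchDB is implemented: `TinyStore` and its integer revisions are hypothetical and exist only for illustration (CouchDB's own revisions are strings, and it rejects a stale write with an HTTP 409 Conflict response).

```javascript
// In-memory sketch of revision-checked writes (hypothetical TinyStore class).
// Assumption: revisions are plain integers here; CouchDB uses string revisions.
class TinyStore {
  constructor() {
    this.docs = new Map(); // id -> { rev, body }
  }

  // Create or update a document. An update must name the current revision.
  put(id, body, rev) {
    const current = this.docs.get(id);
    const currentRev = current ? current.rev : undefined;
    if (rev !== currentRev) {
      // Stale or missing revision: reject instead of silently overwriting.
      return { ok: false, error: 'conflict' };
    }
    const newRev = (currentRev || 0) + 1;
    this.docs.set(id, { rev: newRev, body });
    return { ok: true, id, rev: newRev };
  }
}

const store = new TinyStore();
// First write creates the document, so no revision is supplied.
const created = store.put('talk-1', { title: 'Embracing Constraints' });
// Second write supplies the latest revision and is accepted.
const updated = store.put('talk-1', { title: 'Embracing Awesomeness' }, created.rev);
// Third write reuses the old revision and is rejected.
const stale = store.put('talk-1', { title: 'Too late' }, created.rev);

console.log(created, updated, stale);
```

No lock is taken anywhere: the writer holding the stale revision has to fetch the document again and retry, which is the responsibility the slide places on developers.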
  • 20. SOME DETAILS An In-Depth Look At What Makes CouchDB Different
  • 21. CAP: availability, partition tolerance, consistency (consistency marked with an X)
  • 22. “So, CouchDB does not have consistency of CAP?”
  • 23. “Booh, that means my data will be inconsistent. Fail!”
  • 24. psssshhh
  • 25. YOUR MOM IS INCONSISTENT
  • 26. CouchDB is eventually consistent
  • 27. When replicating, conflicting revisions will be marked as such
  • 28. These conflicts can then be resolved (users, daemons,...)
  • 29. and everything will be fine o/
  • 30. which brings us to...
  • 31. REPLICATION • You can do Master-Master replication • Conflicts are detected and marked automatically • Conflicts are supposed to be resolved by applications • Or by users, who usually know best what to do!
  • 32. CouchDB is Ground Computing
  • 33. Imagine a world where every computer runs CouchDB
  • 34. Ubuntu One already does, to sync bookmarks etc!
  • 35. MAP/REDUCE
  • 36. BASIC PRINCIPLE: MAPPER • The Mapper reads records and emits <key, value> pairs • Example: Apache access.log • Each line is a record • Extract client IP address and number of bytes transferred • Emit IP address as key, number of bytes as value • For hourly rotating logs, the job can be split across 24 nodes* (* In practice, it’s a lot smarter than that)
  • 37. BASIC PRINCIPLE: REDUCER • A Reducer is given a key and all values for this specific key • Even if there are many Mappers on many computers, the results are aggregated before they are handed to Reducers • Example: Apache access.log • The Reducer is called once for each client IP (that’s our key), with a list of values (transferred bytes) • We simply sum up the bytes to get the total traffic per IP!
  • 38. EXAMPLE OF MAPPED INPUT
        IP              Bytes
        212.122.174.13  18271
        212.122.174.13  191726
        212.122.174.13  198
        74.119.8.111    91272
        74.119.8.111    8371
        212.122.174.13  43
  • 39. REDUCER WILL RECEIVE THIS
        IP              Bytes
        212.122.174.13  18271, 191726, 198, 43
        74.119.8.111    91272, 8371
  • 40. AFTER REDUCTION
        IP              Bytes
        212.122.174.13  210238
        74.119.8.111    99643
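The three slides above can be replayed in a few lines of JavaScript. This is a minimal single-process sketch of the principle, not a distributed MapReduce implementation; the helper names `groupByKey` and `sumBytes` are made up for the example, and the data is the mapped input from the slides.

```javascript
// Mapped input from the example: one [ip, bytes] pair per access.log line.
const mapped = [
  ['212.122.174.13', 18271],
  ['212.122.174.13', 191726],
  ['212.122.174.13', 198],
  ['74.119.8.111', 91272],
  ['74.119.8.111', 8371],
  ['212.122.174.13', 43],
];

// Grouping step: collect all values that share a key, so that each key
// reaches the reducer exactly once with its full list of values.
function groupByKey(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reducer: total bytes transferred for one client IP.
function sumBytes(key, values) {
  return values.reduce((total, bytes) => total + bytes, 0);
}

const totals = {};
for (const [ip, values] of groupByKey(mapped)) {
  totals[ip] = sumBytes(ip, values);
}

console.log(totals);
// { '212.122.174.13': 210238, '74.119.8.111': 99643 }
```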
  • 41. COUCHDB INCREMENTAL MAPREDUCE
  • 42. THE KEY DIFFERENCE • Maps and Reduces are incremental: • If one document changes, only that one document needs: • mapping • reduction • Then a few new reduce runs are performed to compute the final result
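A heavily simplified sketch of the incremental idea: cache what each document emitted, and when one document changes, run the map function for that document only. The `IncrementalView` class and the `(doc, emit)` calling convention are assumptions made for this example. CouchDB itself keeps views in a B-tree and also caches partial reduce results, which this sketch does not attempt; here the reduce simply runs over the cached rows at query time.

```javascript
// Simplified model of an incrementally maintained view (hypothetical class).
class IncrementalView {
  constructor(mapFn, reduceFn) {
    this.mapFn = mapFn;
    this.reduceFn = reduceFn;
    this.rowsByDoc = new Map(); // doc id -> rows emitted for that document
    this.mapCalls = 0;          // counts how often the map function ran
  }

  // Called for a new or changed document: only this document is re-mapped,
  // and its previously cached rows are replaced.
  update(doc) {
    const rows = [];
    this.mapFn(doc, (key, value) => rows.push({ key, value }));
    this.mapCalls += 1;
    this.rowsByDoc.set(doc._id, rows);
  }

  // Reduce the cached rows for one key. No map function runs here.
  query(key) {
    const values = [];
    for (const rows of this.rowsByDoc.values()) {
      for (const row of rows) {
        if (row.key === key) values.push(row.value);
      }
    }
    return this.reduceFn(values);
  }
}

const view = new IncrementalView(
  (doc, emit) => {
    if (doc.type === 'talk') (doc.tags || []).forEach((tag) => emit(tag, 1));
  },
  (values) => values.reduce((a, b) => a + b, 0)
);

view.update({ _id: 'a', type: 'talk', tags: ['couchdb', 'php'] });
view.update({ _id: 'b', type: 'talk', tags: ['couchdb'] });
// Document 'a' changes: one more map call, document 'b' is not touched.
view.update({ _id: 'a', type: 'talk', tags: ['php'] });

console.log(view.query('couchdb'), view.query('php'), view.mapCalls);
```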
  • 43. MAPPER: DOCS BY TAGS
        function(doc) {
          if(doc.type == 'talk') {
            (doc.tags || []).forEach(function(tag) {
              emit(tag, doc);
            });
          }
        }
  • 44. MAPREDUCE: COUNT TAGS
        function(doc) {
          if(doc.type == 'talk') {
            (doc.tags || []).forEach(function(tag) {
              emit(tag, 1);
            });
          }
        }
        function(key, values) {
          return sum(values);
        }
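To try the tag-counting functions outside CouchDB, a small local harness is enough. Inside CouchDB, `emit()` and `sum()` are provided by the view server; the stand-in versions below, the sample documents, and the grouping loop are assumptions made for this sketch.

```javascript
// Stand-ins for the helpers CouchDB provides to view functions.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }
function sum(values) { return values.reduce((a, b) => a + b, 0); }

// Map and reduce functions as on the slide.
const map = function (doc) {
  if (doc.type == 'talk') {
    (doc.tags || []).forEach(function (tag) {
      emit(tag, 1);
    });
  }
};
const reduce = function (key, values) {
  return sum(values);
};

// Hypothetical sample documents.
const docs = [
  { _id: 't1', type: 'talk', tags: ['couchdb', 'nosql'] },
  { _id: 't2', type: 'talk', tags: ['couchdb'] },
  { _id: 'u1', type: 'speaker' }, // skipped: not a talk
  { _id: 't3', type: 'talk' },    // no tags: emits nothing
];
docs.forEach((doc) => map(doc));

// Group the emitted rows by key, then reduce each group.
const grouped = {};
rows.forEach(({ key, value }) => {
  (grouped[key] = grouped[key] || []).push(value);
});
const counts = {};
Object.keys(grouped).forEach((tag) => {
  counts[tag] = reduce(tag, grouped[tag]);
});

console.log(counts);
// { couchdb: 2, nosql: 1 }
```

The `doc.tags || []` guard is what lets the document without tags pass through without an error.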
  • 45. LUCENE INTEGRATION Full Control Over What Is Indexed, And How
  • 46. COUCHAPP Python Tool For Development And Deployment
  • 47. DEMO TIME Let’s Relax On The Couch
  • 48. The End
  • 49. FURTHER READING • http://books.couchdb.org/ • http://couchdb.apache.org/ • http://github.com/couchapp/couchapp • http://github.com/rnewson/couchdb-lucene/ • http://janl.github.com/couchdbx/ • http://j.mp/oqbQs (E4X in CouchDB for XML parsing)
  • 50. DID YOU SEE THE HEAD FAKE? This Talk Was Not About Embracing Constraints. It Was About Embracing Awesomeness.
  • 51. Questions?
  • 52. THANK YOU! This was http://joind.in/1651 by @dzuelke
