Embracing Constraints With CouchDB

Presentation given at Dutch PHP Conference 2010.

http://joind.in/1651

Transcript

  • 1. EMBRACING CONSTRAINTS WITH COUCHDB
  • 2. David Zülke
  • 3. David Zuelke
  • 4. http://en.wikipedia.org/wiki/File:München_Panorama.JPG
  • 5. Founder
  • 6. Lead Developer
  • 7. @dzuelke #dpc10
  • 8. http://joind.in/1651
  • 9. A DISCLAIMER FIRST Before You All Figure This Out Yourselves...
  • 10.
  • 11. NEIN NEIN NEIN NEIN DAS IST BETRUG
  • 12. This talk is not really about embracing constraints
  • 13. I’ll tell you what it’s really about when we’re finished
  • 14. I’ll also apologize to you for lying at that point
  • 15. (it’s always easier to apologize than to ask for permission)
  • 16. COUCHDB IN THREE SLIDES Full Of DIS IS SRS BSNS Bullet Points
  • 17. COUCHDB STORES DOCUMENTS • CouchDB stores documents with arbitrary keys and values • Each document is identified by an ID and has a revision • Documents can have file attachments • Stored as JSON, so it’s easy to interface with
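To make the document model on this slide concrete, here is a minimal sketch of what such a document could look like. The ID, the revision string, and the fields `type`, `title` and `tags` are made up for illustration; `_id`, `_rev` and `_attachments` are the names CouchDB reserves for the ID, the revision and file attachments.

```javascript
// A hypothetical talk document. Only _id, _rev and _attachments are
// CouchDB's own field names; everything else is invented for this example.
const talk = {
  _id: "talk-dpc10-couchdb",                  // document ID (chosen by us)
  _rev: "1-967a00dff5e02add41819138abb3284d", // revision (illustrative value)
  type: "talk",
  title: "Embracing Constraints With CouchDB",
  tags: ["couchdb", "php", "dpc10"],
  _attachments: {
    "slides.pdf": {
      content_type: "application/pdf",
      // Inline attachments are sent as base64 text; this is not a real PDF.
      data: Buffer.from("not a real PDF").toString("base64"),
    },
  },
};

// Documents travel as JSON text, so any language with a JSON library can
// read and write them.
const wire = JSON.stringify(talk);
const roundTripped = JSON.parse(wire);
console.log(roundTripped.tags);
```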
  • 18. COUCHDB SPEAKS HTTP • CouchDB uses HTTP to communicate with clients & servers • That means scalability • That means a lot of kick ass stuff totally for free • Caching • Load Balancing • Content Negotiation
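Because everything is plain HTTP, talking to CouchDB amounts to building ordinary requests. The sketch below only constructs request descriptions and sends nothing. The server address and the database name `talks` are assumptions, and the conditional GET relies on CouchDB reporting a document's revision as its ETag, which is what lets standard HTTP caches work.

```javascript
// Hypothetical server address and database name.
const base = "http://localhost:5984/talks";

// Describe a GET for one document. If we already hold a revision, ask the
// server (or any HTTP cache in between) to answer 304 when it is still current.
function readRequest(id, knownRev) {
  const headers = { Accept: "application/json" };
  if (knownRev) headers["If-None-Match"] = `"${knownRev}"`;
  return { method: "GET", url: `${base}/${encodeURIComponent(id)}`, headers };
}

// Describe a PUT that creates or updates a document. For updates the body
// must carry the current _rev.
function writeRequest(doc) {
  return {
    method: "PUT",
    url: `${base}/${encodeURIComponent(doc._id)}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(doc),
  };
}

console.log(readRequest("my talk", "2-abc"));
console.log(writeRequest({ _id: "my talk", title: "Draft" }));
```

Any HTTP client, proxy or load balancer can handle these requests without knowing anything about CouchDB.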
  • 19. COUCHDB USES MVCC • Multiversion Concurrency Control • When updating, you must supply a revision number • Your change will be rejected if the revision is not the latest • All writes are serialized • No need for locks, but puts some responsibility on developers
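The revision check can be illustrated with a toy in-memory store. This is a model of the rule on the slide, not CouchDB's implementation: `TinyStore` is a hypothetical class, and revisions are plain counters here rather than CouchDB's revision strings. A real CouchDB server answers a stale update with HTTP status 409 Conflict.

```javascript
// Toy in-memory store that mimics the revision check described above.
class TinyStore {
  constructor() {
    this.docs = new Map(); // id -> { rev: number, body: object }
  }

  // Returns the new revision, or throws if `rev` is not the latest one.
  put(id, body, rev) {
    const current = this.docs.get(id);
    const latest = current ? current.rev : undefined;
    if (rev !== latest) {
      throw new Error(`conflict: expected rev ${latest}, got ${rev}`);
    }
    const next = (latest || 0) + 1;
    this.docs.set(id, { rev: next, body });
    return next;
  }
}

const store = new TinyStore();
const rev1 = store.put("talk", { title: "Draft" });       // create: no rev yet
const rev2 = store.put("talk", { title: "Final" }, rev1); // update with latest rev

let rejected = false;
try {
  store.put("talk", { title: "Stale edit" }, rev1);       // rev1 is outdated
} catch (err) {
  rejected = true;
}
console.log(rev1, rev2, rejected); // 1 2 true
```

The responsibility that lands on developers is the `catch` branch: fetch the latest revision, reapply or merge the change, and try again.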
  • 20. SOME DETAILS An In-Depth Look At What Makes CouchDB Different
  • 21. CAP: availability, partition tolerance, consistency (consistency crossed out)
  • 22. “So, CouchDB does not have consistency of CAP?”
  • 23. “Booh, that means my data will be inconsistent. Fail!”
  • 24. psssshhh
  • 25. YOUR MOM IS INCONSISTENT
  • 26. CouchDB is eventually consistent
  • 27. When replicating, conflicting revisions will be marked as such
  • 28. These conflicts can then be resolved (users, daemons,...)
  • 29. and everything will be fine o/
  • 30. which brings us to...
  • 31. REPLICATION • You can do Master-Master replication • Conflicts are detected and marked automatically • Conflicts are supposed to be resolved by applications • Or by users, who usually know best what to do!
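How a conflict can be detected is easiest to see in a toy model. The sketch below assumes each replica keeps a revision history per document and flags a conflict when neither history extends the other. The data layout and the function names are invented for illustration and are much simpler than what CouchDB actually stores.

```javascript
// Toy model: each replica maps doc id -> { history: [rev ids], body }.
// `history` lists revision ids from oldest to newest.
function isPrefix(a, b) {
  return a.length <= b.length && a.every((rev, i) => rev === b[i]);
}

// Copy documents from source to target, marking divergent edits as conflicts.
function replicate(source, target) {
  for (const [id, incoming] of source) {
    const local = target.get(id);
    if (!local || isPrefix(local.history, incoming.history)) {
      // Target lacks the document or is simply behind: take the newer version.
      target.set(id, { ...incoming, history: [...incoming.history] });
    } else if (!isPrefix(incoming.history, local.history)) {
      // Neither history extends the other: both sides edited independently.
      local.conflicts = (local.conflicts || []).concat([incoming]);
    }
  }
}

const laptop = new Map([["talk", { history: ["1-a"], body: { title: "Draft" } }]]);
const server = new Map();
replicate(laptop, server); // server now holds revision 1-a

// Both sides edit the same document while disconnected.
laptop.get("talk").history.push("2-b");
laptop.get("talk").body = { title: "Laptop title" };
server.get("talk").history.push("2-c");
server.get("talk").body = { title: "Server title" };

replicate(laptop, server);
console.log(server.get("talk").conflicts.length); // 1
```

Nothing is lost: the server keeps its own version and the conflicting one side by side until an application or a user picks the winner.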
  • 32. CouchDB is Ground Computing
  • 33. Imagine a world where every computer runs CouchDB
  • 34. Ubuntu One already does, to sync bookmarks etc!
  • 35. MAP/REDUCE
  • 36. BASIC PRINCIPLE: MAPPER • The Mapper reads records and emits <key, value> pairs • Example: Apache access.log • Each line is a record • Extract client IP address and number of bytes transferred • Emit IP address as key, number of bytes as value • For hourly rotating logs, the job can be split across 24 nodes* * In practice, it’s a lot smarter than that
  • 37. BASIC PRINCIPLE: REDUCER • A Reducer is given a key and all values for this specific key • Even if there are many Mappers on many computers, the results are aggregated before they are handed to Reducers • Example: Apache access.log • The Reducer is called once for each client IP (that’s our key), with a list of values (transferred bytes) • We simply sum up the bytes to get the total traffic per IP!
  • 38. EXAMPLE OF MAPPED INPUT
        IP               Bytes
        212.122.174.13   18271
        212.122.174.13   191726
        212.122.174.13   198
        74.119.8.111     91272
        74.119.8.111     8371
        212.122.174.13   43
  • 39. REDUCER WILL RECEIVE THIS
        IP               Bytes
        212.122.174.13   18271, 191726, 198, 43
        74.119.8.111     91272, 8371
  • 40. AFTER REDUCTION
        IP               Bytes
        212.122.174.13   210238
        74.119.8.111     99643
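The three steps from these slides (map, group by key, reduce) fit into a few lines of plain JavaScript. This is a single-process sketch of the principle, not a distributed implementation. The `"<ip> <bytes>"` line format is a simplification of a real access.log, and the function names are made up.

```javascript
// Sample records in a simplified, hypothetical format: "<ip> <bytes>".
const lines = [
  "212.122.174.13 18271",
  "212.122.174.13 191726",
  "212.122.174.13 198",
  "74.119.8.111 91272",
  "74.119.8.111 8371",
  "212.122.174.13 43",
];

// Mapper: one record in, one <key, value> pair out.
function mapper(line) {
  const [ip, bytes] = line.split(" ");
  return [ip, Number(bytes)];
}

// Group all values by key before any reducer runs.
function groupByKey(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reducer: called once per key with all values for that key.
function reducer(key, values) {
  return values.reduce((total, n) => total + n, 0);
}

const totals = new Map();
for (const [ip, values] of groupByKey(lines.map(mapper))) {
  totals.set(ip, reducer(ip, values));
}
// Totals per IP: 212.122.174.13 -> 210238, 74.119.8.111 -> 99643
console.log(Object.fromEntries(totals));
```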
  • 41. COUCHDB INCREMENTAL MAPREDUCE
  • 42. THE KEY DIFFERENCE • Maps and Reduces are incremental: • If one document changes, only that one document needs: • mapping • reduction • Then a few new reduce runs are performed to compute the final result
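A rough way to picture the incremental part: cache the mapped rows per document, and when a document changes, re-map only that document. The class below is a hypothetical sketch under that assumption. Unlike CouchDB, it passes `emit` as an argument, and it recomputes the reduce from the cached rows on every query, whereas CouchDB also keeps intermediate reduce results in its view index.

```javascript
// Simplified incremental view: mapped rows are cached per document, so a
// change re-maps only the changed document.
class IncrementalView {
  constructor(mapFn, reduceFn) {
    this.mapFn = mapFn;
    this.reduceFn = reduceFn;
    this.rowsByDoc = new Map(); // doc id -> [[key, value], ...]
    this.mapCalls = 0;          // counts how often the map function ran
  }

  // Call whenever a document is created or updated.
  update(doc) {
    const rows = [];
    this.mapCalls += 1;
    this.mapFn(doc, (key, value) => rows.push([key, value]));
    this.rowsByDoc.set(doc._id, rows); // replaces the rows of the old version
  }

  // Reduce all cached rows for one key.
  query(key) {
    const values = [];
    for (const rows of this.rowsByDoc.values()) {
      for (const [k, v] of rows) if (k === key) values.push(v);
    }
    return this.reduceFn(values);
  }
}

const view = new IncrementalView(
  (doc, emit) => (doc.tags || []).forEach((tag) => emit(tag, 1)),
  (values) => values.reduce((a, b) => a + b, 0)
);
view.update({ _id: "a", tags: ["couchdb", "php"] });
view.update({ _id: "b", tags: ["couchdb"] });
view.update({ _id: "b", tags: ["php"] }); // only "b" is mapped again
console.log(view.query("couchdb"), view.query("php"), view.mapCalls); // 1 2 3
```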
  • 43. MAPPER: DOCS BY TAGS
        function(doc) {
          if(doc.type == 'talk') {
            (doc.tags || []).forEach(function(tag) {
              emit(tag, doc);
            });
          }
        }
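A map function like this lives in a design document and is queried over HTTP. The sketch below shows one way to package it. The design document name `talks`, the view name `by_tag`, and the server URL are assumptions. The code only builds the document and the query URL and does not contact a server.

```javascript
// The mapper from the slide. `emit` is provided by CouchDB when the view
// runs; it is never called in this script.
const byTag = function (doc) {
  if (doc.type == 'talk') {
    (doc.tags || []).forEach(function (tag) {
      emit(tag, doc);
    });
  }
};

// Hypothetical design document; view functions are stored as source text.
const designDoc = {
  _id: "_design/talks",
  language: "javascript",
  views: {
    by_tag: { map: byTag.toString() },
  },
};

// View keys are JSON values, so the key is JSON-encoded, then URL-encoded.
function viewUrl(db, key) {
  return `${db}/_design/talks/_view/by_tag?key=${encodeURIComponent(JSON.stringify(key))}`;
}

console.log(viewUrl("http://localhost:5984/talks", "couchdb"));
// http://localhost:5984/talks/_design/talks/_view/by_tag?key=%22couchdb%22
```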
  • 44. MAPREDUCE: COUNT TAGS
        function(doc) {
          if(doc.type == 'talk') {
            (doc.tags || []).forEach(function(tag) {
              emit(tag, 1);
            });
          }
        }

        function(key, values) {
          return sum(values);
        }
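The two functions can be tried without a CouchDB server by supplying stand-ins for `emit` and `sum`, which CouchDB normally provides inside view functions. The sample documents are made up, and the grouping loop imitates what a grouped view query (`group=true`) would return.

```javascript
// Sample documents, invented for illustration.
const docs = [
  { _id: "1", type: "talk", tags: ["couchdb", "nosql"] },
  { _id: "2", type: "talk", tags: ["couchdb"] },
  { _id: "3", type: "talk" },                       // no tags: emits nothing
  { _id: "4", type: "speaker", tags: ["couchdb"] }, // not a talk: ignored
];

// Stand-ins for the helpers CouchDB provides inside view functions.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }
function sum(values) { return values.reduce((a, b) => a + b, 0); }

// Map and reduce functions as on the slide.
function map(doc) {
  if (doc.type == 'talk') {
    (doc.tags || []).forEach(function (tag) {
      emit(tag, 1);
    });
  }
}
function reduce(key, values) {
  return sum(values);
}

docs.forEach(map);

// Group the emitted rows by key, then reduce each group.
const grouped = {};
for (const { key, value } of rows) {
  (grouped[key] = grouped[key] || []).push(value);
}
const counts = {};
for (const key of Object.keys(grouped)) {
  counts[key] = reduce(key, grouped[key]);
}
console.log(counts); // { couchdb: 2, nosql: 1 }
```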
  • 45. LUCENE INTEGRATION Full Control Over What Is Indexed, And How
  • 46. COUCHAPP Python Tool For Development And Deployment
  • 47. DEMO TIME Let’s Relax On The Couch
  • 48. The End
  • 49. FURTHER READING • http://books.couchdb.org/ • http://couchdb.apache.org/ • http://github.com/couchapp/couchapp • http://github.com/rnewson/couchdb-lucene/ • http://janl.github.com/couchdbx/ • http://j.mp/oqbQs (E4X in CouchDB for XML parsing)
  • 50. DID YOU SEE THE HEAD FAKE? This Talk Was Not About Embracing Constraints It Was About Embracing Awesomeness
  • 51. Questions?
  • 52. THANK YOU! This was http://joind.in/1651 by @dzuelke