An Introduction to CouchDB (IPC11SE 2011-06-01)
Presentation given at International PHP Conference Spring Edition 2011.

An Introduction to CouchDB (IPC11SE 2011-06-01) Presentation Transcript

  • 1. AN INTRODUCTION TO COUCHDB
  • 2. David Zülke
  • 3. David Zuelke
  • 4. http://en.wikipedia.org/wiki/File:München_Panorama.JPG
  • 5. Founder
  • 6. Lead Developer
  • 7. @dzuelke
  • 8. COUCHDB IN THREE SLIDES Full Of DIS IS SRS BSNS Bullet Points
  • 9. COUCHDB STORES DOCUMENTS
      • CouchDB stores documents with arbitrary keys and values
      • Each document is identified by an ID and has a revision
      • Documents can have file attachments
      • Stored as JSON, so it’s easy to interface with
  • 10. COUCHDB SPEAKS HTTP
      • CouchDB uses HTTP to communicate with clients & servers
      • That means scalability
      • That means a lot of kick ass stuff totally for free
          • Caching
          • Load Balancing
          • Content Negotiation
  • 11. COUCHDB USES MVCC
      • Multiversion Concurrency Control
      • When updating, you must supply a revision number
      • Your change will be rejected if the revision is not the latest
      • All writes are serialized
      • No need for locks, but puts some responsibility on developers
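The revision rule on the MVCC slide can be sketched with a toy in-memory store. This is not CouchDB code: the `ToyStore` class and its integer revisions are assumptions made for illustration (real CouchDB revisions are strings such as "5-a72"). The only behaviour mirrored here is that a write carrying a stale revision is rejected, which CouchDB reports as HTTP 409 Conflict.

```javascript
// Toy in-memory model of CouchDB-style revision checks (hypothetical, not CouchDB itself).
class ToyStore {
  constructor() {
    this.docs = new Map(); // id -> { rev: number, body: object }
  }

  // Stores the body and returns the new revision.
  // Throws if the supplied revision is not the latest one.
  put(id, body, rev) {
    const current = this.docs.get(id);
    const currentRev = current ? current.rev : undefined;
    if (rev !== currentRev) {
      const err = new Error("conflict: revision is not the latest");
      err.status = 409; // CouchDB answers such writes with HTTP 409 Conflict
      throw err;
    }
    const nextRev = (currentRev || 0) + 1;
    this.docs.set(id, { rev: nextRev, body });
    return nextRev;
  }
}

const store = new ToyStore();
const rev1 = store.put("laser", { name: "Laser Beam" });          // create: no revision yet
const rev2 = store.put("laser", { name: "Laser Beam 2" }, rev1);  // update with the latest revision

let staleWriteStatus = null;
try {
  store.put("laser", { name: "stale edit" }, rev1); // rev1 is no longer the latest
} catch (e) {
  staleWriteStatus = e.status;
}
```

No lock is taken anywhere: the client that loses the race gets an error and has to re-read the document and retry, which is the "responsibility on developers" the slide mentions.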
  • 12. THE DETAILS: An In-Depth Look At What Makes CouchDB Different
  • 13. Do you know the CAP theorem? [Diagram labels: availability, partition tolerance, consistency; one is marked with an X]
  • 14. “So, CouchDB does not have consistency of CAP?”
  • 15. “Booh, that means my data will be inconsistent. Fail!”
  • 16. psssshhh
  • 17. YOUR MOM IS INCONSISTENT
  • 18. CouchDB is eventually consistent
  • 19. When replicating, conflicting revisions will be marked as such
  • 20. These conflicts can then be resolved (users, daemons,...)
  • 21. and everything will be fine o/
  • 22. which brings us to...
  • 23. REPLICATION
      • You can do Master-Master replication
      • Conflicts are detected and marked automatically
      • Conflicts are supposed to be resolved by applications
          • Or by users, who usually know best what to do!
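How an application might resolve a marked conflict can be sketched as follows. The helper names (`resolveConflict`, `unionTags`) and the merge policy are assumptions for illustration, not something shown in the talk. The sketch relies on one CouchDB convention: a conflict is settled by writing a new version on top of the revision you keep and deleting the other conflicting revisions, for example in a single `_bulk_docs` request.

```javascript
// Hypothetical helper: merge conflicting revisions of one document and build
// a payload in the shape CouchDB's _bulk_docs endpoint accepts.
function resolveConflict(winner, losers, merge) {
  const merged = losers.reduce((acc, loser) => merge(acc, loser), winner);
  return {
    docs: [
      // Update on top of the revision we keep...
      { ...merged, _id: winner._id, _rev: winner._rev },
      // ...and delete every other conflicting revision.
      ...losers.map((doc) => ({ _id: doc._id, _rev: doc._rev, _deleted: true })),
    ],
  };
}

// Example merge policy (an assumption): union of tags, other fields from the kept revision.
function unionTags(a, b) {
  const tags = Array.from(new Set([...(a.tags || []), ...(b.tags || [])])).sort();
  return { ...a, tags };
}

const payload = resolveConflict(
  { _id: "123", _rev: "2-aaa", name: "Laser Beam", tags: ["evil"] },
  [{ _id: "123", _rev: "2-bbb", name: "Laser Beam", tags: ["shiny"] }],
  unionTags
);
```

The merge function is where the application's (or the user's) knowledge goes; CouchDB only detects and marks the conflict.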
  • 24. CouchDB is Ground Computing
  • 25. Imagine a world where every computer runs CouchDB
  • 26. Ubuntu One already does, to sync bookmarks etc!
  • 27. MAP/REDUCE
  • 28. BASIC PRINCIPLE: MAPPER
      • The Mapper reads records and emits <key, value> pairs
          • Example: Apache access.log
          • Each line is a record
          • Extract client IP address and number of bytes transferred
          • Emit IP address as key, number of bytes as value
      • For hourly rotating logs, the job can be split across 24 nodes*
      * In practice, it’s a lot smarter than that
  • 29. BASIC PRINCIPLE: REDUCER
      • A Reducer is given a key and all values for this specific key
          • Even if there are many Mappers on many computers, the results are aggregated before they are handed to Reducers
      • Example: Apache access.log
          • The Reducer is called once for each client IP (that’s our key), with a list of values (transferred bytes)
          • We simply sum up the bytes to get the total traffic per IP!
  • 30. EXAMPLE OF MAPPED INPUT
      IP               Bytes
      212.122.174.13   18271
      212.122.174.13   191726
      212.122.174.13   198
      74.119.8.111     91272
      74.119.8.111     8371
      212.122.174.13   43
  • 31. REDUCER WILL RECEIVE THIS
      IP               Bytes
      212.122.174.13   18271, 191726, 198, 43
      74.119.8.111     91272, 8371
  • 32. AFTER REDUCTION
      IP               Bytes
      212.122.174.13   210238
      74.119.8.111     99643
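The access.log example from the slides above can be run end to end in a few lines. This is a minimal single-process sketch, not a distributed job: the log lines are made up to match the byte counts on the slides, and the mapper assumes Apache common log format, where the client IP is the first field and the byte count is the last.

```javascript
// Sample lines in Apache common log format (made up for this sketch).
const logLines = [
  '212.122.174.13 - - [01/Jun/2011:10:00:01 +0200] "GET / HTTP/1.1" 200 18271',
  '212.122.174.13 - - [01/Jun/2011:10:00:02 +0200] "GET /a.png HTTP/1.1" 200 191726',
  '212.122.174.13 - - [01/Jun/2011:10:00:03 +0200] "GET /b.css HTTP/1.1" 200 198',
  '74.119.8.111 - - [01/Jun/2011:10:00:04 +0200] "GET / HTTP/1.1" 200 91272',
  '74.119.8.111 - - [01/Jun/2011:10:00:05 +0200] "GET /c.js HTTP/1.1" 200 8371',
  '212.122.174.13 - - [01/Jun/2011:10:00:06 +0200] "GET /d.ico HTTP/1.1" 200 43',
];

// Mapper: one record (line) in, one [key, value] pair out.
function mapper(line) {
  const fields = line.trim().split(/\s+/);
  return [fields[0], Number(fields[fields.length - 1])];
}

// Grouping step: collect all values that share a key.
function groupByKey(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reducer: called once per key with all values for that key.
function reducer(key, values) {
  return values.reduce((total, bytes) => total + bytes, 0);
}

const totals = {};
for (const [ip, values] of groupByKey(logLines.map(mapper))) {
  totals[ip] = reducer(ip, values);
}
// totals: { "212.122.174.13": 210238, "74.119.8.111": 99643 }
```

Because the mapper looks at one line at a time, the lines could be mapped on different machines; only the grouping step needs to see all pairs for a key.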
  • 33. COUCHDB INCREMENTAL MAPREDUCE
  • 34. THE KEY DIFFERENCE
      • Maps and Reduces are incremental:
          • If one document changes, only that one document needs:
              • mapping
              • reduction
          • Then a few new reduce runs are performed to compute the final result
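The map side of that idea can be sketched with a toy index that caches the rows each document emitted, so that a changed document is the only one mapped again. `ToyIncrementalView` is a hypothetical name, and the sketch deliberately simplifies the reduce side: it sums over the cached rows on demand, whereas CouchDB keeps intermediate reduce results inside its index so that only a few reduce runs are needed.

```javascript
// Toy model of an incrementally updated view (hypothetical, not CouchDB's implementation).
class ToyIncrementalView {
  constructor(mapFn) {
    this.mapFn = mapFn;          // called as mapFn(doc, emit); in CouchDB, emit is a global instead
    this.rowsByDoc = new Map();  // doc id -> rows emitted for that doc
    this.mapCalls = 0;           // counts how often the map function actually ran
  }

  // Called for a new or changed document: only this document is re-mapped.
  update(doc) {
    const rows = [];
    this.mapCalls += 1;
    this.mapFn(doc, (key, value) => rows.push([key, value]));
    this.rowsByDoc.set(doc._id, rows); // replaces the stale rows of this document
  }

  // Simplified reduce: sum the cached values per key.
  reduceSum() {
    const totals = {};
    for (const rows of this.rowsByDoc.values()) {
      for (const [key, value] of rows) {
        totals[key] = (totals[key] || 0) + value;
      }
    }
    return totals;
  }
}

const view = new ToyIncrementalView((doc, emit) => {
  (doc.tags || []).forEach((tag) => emit(tag, 1));
});

view.update({ _id: "a", tags: ["evil", "shiny"] });
view.update({ _id: "b", tags: ["evil"] });
view.update({ _id: "c", tags: ["cool"] });
const before = view.reduceSum();

view.update({ _id: "a", tags: ["cool"] }); // one document changes: one extra map call
const after = view.reduceSum();
```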
  • 35. MAPPER: DOCS BY TAGS
      function(doc) {
        // only index product documents
        if(doc.type == 'product') {
          // emit one row per tag; documents without tags emit nothing
          (doc.tags || []).forEach(function(tag) {
            emit(tag, doc);
          });
        }
      }
  • 36. MAPREDUCE: COUNT TAGS
      function(doc) {
        if(doc.type == 'product') {
          // emit 1 per tag so the reduce step can count occurrences
          (doc.tags || []).forEach(function(tag) {
            emit(tag, 1);
          });
        }
      }
      function(key, values) {
        return sum(values);
      }
      _sum: built-in CouchDB function, very efficient
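Where these functions live may be worth spelling out: CouchDB reads views from design documents, in which the map function is stored as a string and the reduce can be the name of a built-in such as `_sum`. The sketch below builds such a document; the names `_design/products` and `count_by_tag` are assumptions chosen for illustration.

```javascript
// The map function from the slide. It is only serialized here, never called,
// so the missing global emit() is not a problem in this sketch.
const mapFn = function (doc) {
  if (doc.type == 'product') {
    (doc.tags || []).forEach(function (tag) {
      emit(tag, 1);
    });
  }
};

// Hypothetical design document holding one view.
const designDoc = {
  _id: "_design/products",
  language: "javascript",
  views: {
    count_by_tag: {
      map: mapFn.toString(), // CouchDB stores the function source as a string
      reduce: "_sum",        // built-in reduce instead of a hand-written sum
    },
  },
};

const body = JSON.stringify(designDoc);
```

Once stored in a database, such a view would typically be queried at `/<db>/_design/products/_view/count_by_tag?group=true` to get one count per tag.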
  • 37. BUT WAIT! There are no tables :(
  • 38. so... how do you join data from related documents?
  • 39. JOIN PRODUCTS WITH THEIR CATEGORIES
      function(doc) {
        if(doc.type == 'product') {
          // row 0: the product itself
          emit([doc._id, 0], doc);
          // row 1: a pointer to the product's category
          emit([doc._id, 1], { _id: doc.category_id });
        }
      }
      Resulting rows:
      ["123", 0]    {_id: "123", _rev: "5-a72", type: "product", "name": "Laser Beam"}
      ["123", 1]    {_id: "est", _rev: "2-9af", type: "category", "name": "Evil Stuff"}
      ["817", 0]    {_id: "817", _rev: "2-aa8", type: "product", "name": "Rocketship"}
      ["817", 1]    {_id: "cst", _rev: "3-d8a", type: "category", "name": "Cool Stuff"}
      ["441", 0]    {_id: "441", _rev: "19-fdf", type: "product", "name": "Sharks"}
      ["441", 1]    {_id: "est", _rev: "2-9af", type: "category", "name": "Evil Stuff"}
  • 40. JOIN CATEGORIES WITH ALL THEIR PRODUCTS
      function(doc) {
        if(doc.type == 'category') {
          // the category row sorts first within its own key prefix
          emit([doc._id, 0], doc);
        } else if(doc.type == 'product') {
          // products follow, keyed under their category
          emit([doc.category_id, doc._id], doc);
        }
      }
      Resulting rows:
      ["est", 0]        {_id: "est", _rev: "2-9af", type: "category", "name": "Evil Stuff"}
      ["est", "123"]    {_id: "123", _rev: "5-a72", type: "product", "name": "Laser Beam"}
      ["est", "441"]    {_id: "441", _rev: "19-fdf", type: "product", "name": "Sharks"}
      ["cst", 0]        {_id: "cst", _rev: "3-d8a", type: "category", "name": "Cool Stuff"}
      ["cst", "817"]    {_id: "817", _rev: "2-aa8", type: "product", "name": "Rocketship"}
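Reading one category together with its products out of such a view comes down to a key range query. The sketch below only builds the request URL; the function name and the database, design document, and view names are assumptions. It relies on CouchDB's view collation, where a shorter array sorts before a longer one with the same prefix, numbers sort before strings, and objects sort after everything else, so `[id]` to `[id, {}]` spans the category row and all of its product rows.

```javascript
// Hypothetical helper: URL for "one category plus all its products".
function categoryWithProductsQuery(db, designDoc, view, categoryId) {
  const params = new URLSearchParams({
    startkey: JSON.stringify([categoryId]),     // sorts before [categoryId, 0]
    endkey: JSON.stringify([categoryId, {}]),   // {} sorts after any product id string
  });
  return `/${db}/_design/${designDoc}/_view/${view}?${params}`;
}

// Database, design document and view names are made up for this example.
const url = categoryWithProductsQuery("shop", "catalog", "category_products", "est");
```

For the products-with-categories view on the previous slide, which emits `{ _id: doc.category_id }` as the value, the query would additionally need `include_docs=true` so that CouchDB returns the referenced category document with each row.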
  • 41. BUT... BUT... WAIT! How to guarantee a document’s structure if it’s all schema-less?
  • 42. VALIDATE DOCUMENTS
      function (newDoc, savedDoc, userCtx) {
        // on updates, created_at must not change
        if(savedDoc && savedDoc.created_at != newDoc.created_at) {
          throw({forbidden: 'created_at is immutable'});
        }
        // products need a price
        if(newDoc.type == 'product') {
          if(!newDoc.price) {
            throw({forbidden: 'product must have a price'});
          }
        }
      }
  • 43. VALIDATE DOCUMENTS
      function (newDoc, savedDoc, userCtx) {
        // rejects the write unless the condition holds
        function require(beTrue, message) {
          if(!beTrue) throw({forbidden: message});
        }
        // new documents pass; on updates, created_at must be unchanged
        require(!savedDoc || savedDoc.created_at == newDoc.created_at,
          'created_at is immutable'
        );
        if(newDoc.type == 'product') {
          require(newDoc.price,
            'product must have a price'
          );
        }
      }
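A validation function like the ones on these two slides can be exercised outside CouchDB by calling it directly with a new and a saved document. The harness below is a sketch under assumptions: `check` is a hypothetical helper, the helper inside the validator is named `ensure` rather than `require` to avoid confusion with Node's module loader, and the sample documents are made up.

```javascript
// Validation function in the style of the slides (runs as plain JavaScript here).
function validate(newDoc, savedDoc, userCtx) {
  function ensure(beTrue, message) {
    if (!beTrue) throw { forbidden: message };
  }
  // New documents pass; on updates, created_at must be unchanged.
  ensure(!savedDoc || savedDoc.created_at == newDoc.created_at, "created_at is immutable");
  if (newDoc.type == "product") {
    ensure(newDoc.price, "product must have a price");
  }
}

// Hypothetical harness: returns "ok" or the rejection message.
function check(newDoc, savedDoc) {
  try {
    validate(newDoc, savedDoc, {});
    return "ok";
  } catch (e) {
    return e.forbidden;
  }
}

const saved = { type: "product", price: 10, created_at: "2011-06-01" };

const results = [
  check({ type: "product", price: 10, created_at: "2011-06-01" }, null),  // valid new doc
  check({ type: "product", created_at: "2011-06-01" }, null),             // price missing
  check({ type: "product", price: 12, created_at: "2011-06-02" }, saved), // created_at changed
  check({ type: "product", price: 12, created_at: "2011-06-01" }, saved), // valid update
];
```

In CouchDB itself, such a function is stored in a design document under `validate_doc_update`, and a thrown `forbidden` object causes the write to be rejected.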
  • 44. LUCENE INTEGRATION: Full Control Over What Is Indexed, And How
  • 45. COUCHAPP: Python Tool For Development And Deployment
  • 46. DEMO TIME: Let’s Relax On The Couch
  • 47. The End
  • 48. FURTHER READING
      • http://guide.couchdb.org/
      • http://couchdb.apache.org/
      • http://github.com/couchapp/couchapp
      • http://github.com/rnewson/couchdb-lucene/
      • http://www.couchbase.com/downloads/
      • http://j.mp/oqbQs (E4X in CouchDB for XML parsing)
  • 49. Questions?
  • 50. THANK YOU! This was http://joind.in/3521 by @dzuelke