JAOO October 5th, 2009




   Apache
CouchDB
      Jason Davies
About Jason
•   Cambridge University (2002-2005)
•   Director, Jason Davies Ltd (2005-present)
•   Web Developer: Python, ...
CouchDB
  from
3,333km
CouchDB Features
•   “Cluster of Unreliable Commodity Hardware”
•   Schema-free (JSON)
•   Document-oriented, distributed ...
Schema-Free (JSON)
•

                           Features
•   Document Oriented,
    Not Relational

•   Highly Concurrent...
Schema-Free (JSON)
•

                           Features
•   Document Oriented,
    Not Relational

•   Highly Concurrent...
http://www.flickr.com/photos/stilleben2001/223243329/




Documents
Schema-Free ( JSON)
{
    "_id": "BCCD12CBB",
    "_rev": "AB764C",

    "type": "person",
    "name": "Darth Vader",
    ...
Schema-Free ( JSON)
{
    "_id": "BCCD12CBB",
    "_rev": "AB764C",

    "type": "person",
    "name": "Darth Vader",
    ...
Schema-Free ( JSON)
{
    "_id": "BCCD12CBB",
    "_rev": "AB764C",

    "type": "person",
    "name": "Darth Vader",
    ...
Schema-Free ( JSON)
{
    "_id": "BCCD12CBB",
    "_rev": "AB764C",

    "type": "person",
    "name": "Darth Vader",
    ...
Schema-Free (JSON)
•

                           Features
•   Document-Oriented,
    Not Relational

•   Highly Concurrent...
Document-Oriented
                                   Not Relational

•   Documents in the Real World™
    •   Bills, lette...
Document-Oriented
                                   Not Relational
•   Documents in the Real World™
        Bills, letter...
Schema-Free (JSON)
•

                           Features
•   Document-Oriented,
    Not Relational

•   Highly Concurrent...
Highly Concurrent
                             Erlang FTW


•   Functional languages highly appropriate
    for parallelli...
MVCC
•   Multiversion Concurrency Control
•   Reads: lock-free; never block
    •   Potential for massive horizontal scali...
Schema-Free (JSON)
•

                           Features
•   Document-Oriented,
    Not Relational

•   Highly Concurrent...
ful
                              CRUD
•   Create
    HTTP PUT /db/mydocid
•   Read
    HTTP GET /db/mydocid
•   Update
  ...
ful
                                Example
couch = CouchRest.database!("http://
127.0.0.1:5984/tweets")


tweets_url = "h...
Cacheability
•   Both documents and views return ETags
•   Clients send If-None-Match
    •   CouchDB responds with 304 No...
Schema-Free (JSON)
•

                           Features
•   Document-Oriented,
    Not Relational

•   Highly Concurrent...
JavaScript-Powered
           Map/Reduce
•   Map functions extract data from your
    documents
•   Reduce functions aggre...
http://horicky.blogspot.com/2008/10/couchdb-implementation.html
Map/Reduce Views
     Docs
                                 Map
{"user" : "Chris",
                      function(doc) {  ...
Map/Reduce Views
     Docs
                                 Map
{"user" : "Chris",
                      function(doc) {  ...
Map/Reduce Views
      Docs
                                 Map
{"user" : "Chris",
                      function(doc) { ...
Render Views as HTML
lists/index.js            /drl/_list/sofa/index/recent-posts?descending=true&limit=8
Server-Side JavaScript
•   _show for transforming documents
•   _list for transforming views
•   _update for transforming ...
Schema-Free (JSON)
•

                           Features
•   Document-Oriented,
    Not Relational

•   Highly Concurrent...
Replication
•   Incremental
•   Near-real-time
    •   Clustered mirrors
•   Scheduled
•   Ad-hoc
“Ground Computing”
                              @jhuggins




         http://www.flickr.com/photos/mcpig/872293700/
http://www.flickr.com/photos/hercwad/2290378571/
Latency Sucks
Stuart Langridge - Canonical




!   !
Con icts
Con ict resolution by
     example


 A               B


 ❦
Con ict resolution by
     example


 A               B


 ❦
Con ict resolution by
     example


 A               B


 ❦              ✿
                ❦
Con ict resolution by
     example


 A               B


                ✿
Con ict resolution by
     example


 A               B


                     ✿
Schema-Free (JSON)
•

                           Features
•   Document-Oriented,
    Not Relational

•   Highly Concurrent...
Robust Storage

Append-Only
File Structure

Designed to
Crash

Instant-On
Robust
No Silver Bullet!
•   Read-optimised; not so much for writes
    •   Use ?batch=ok for “don’t care” logging
•   Views must...
Experiments!




http://www.flickr.com/photos/seanstayte/378461237/
`couchapp`
• Scripts written in Python to make
  developing pure CouchDB applications
  easier
• sudo easy_install couchap...
Directory Structure
Resulting Design Doc
_list
• Arbitrary JS transformation for views
• http://127.0.0.1:5984/mydb/_design/app/
  _list/myview?startkey=...&endkey...
_show

• Arbitrary transformation for documents
• http://127.0.0.1:5984/mydb/_design/app/
  _show/mydoc
• function (doc, r...
JavaScript Templating
•   EmbeddedJS (EJS)

    •   <% /* execute arbitrary JS */ %>

    •   <%= /* execute and include r...
Push Helper Macros
• Simple macros to facilitate code re-use
• Insert code directly
 • // !code path/to/code.js
• Encode fi...
Secure Cookie Authentication
• Reasonable performance/simplicity of
  JavaScript implementation
• Mutual authentication
• ...
Tamper-Proof Cookies


Timestamp + signature => limited forward-security
        (outside of timestamp window)
Secure Remote Password Protocol (SRP)

• Zero-Knowledge Password Proof
• Simple to implement in Erlang using BigInt
  and ...
@couchdbinaction




http://books.couchdb.org/relax
Thank you
for listening!
www.jasondavies.com

   @jasondavies
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
CouchDB at JAOO Århus 2009
Upcoming SlideShare
Loading in …5
×

CouchDB at JAOO Århus 2009

1,119 views

Published on

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,119
On SlideShare
0
From Embeds
0
Number of Embeds
134
Actions
Shares
0
Downloads
25
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

CouchDB at JAOO Århus 2009

  1. 1. JAOO October 5th, 2009 Apache CouchDB Jason Davies
  2. 2. About Jason • Cambridge University (2002-2005) • Director, Jason Davies Ltd (2005-present) • Web Developer: Python, Django, JavaScript, jQuery, Erlang, &c. • Apache CouchDB committer (2009)
  3. 3. CouchDB from 3,333km
  4. 4. CouchDB Features • “Cluster of Unreliable Commodity Hardware” • Schema-free (JSON) • Document-oriented, distributed database • RESTful HTTP API • JavaScript-powered Map/Reduce • N-Master Replication
  5. 5. Schema-Free (JSON) • Features • Document Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  6. 6. Schema-Free (JSON) • Features • Document Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  7. 7. http://www.flickr.com/photos/stilleben2001/223243329/ Documents
  8. 8. Schema-Free ( JSON) { "_id": "BCCD12CBB", "_rev": "AB764C", "type": "person", "name": "Darth Vader", "age": 63, "headware": ["Helmet", "Sombrero"], "dark_side": true }
  9. 9. Schema-Free ( JSON) { "_id": "BCCD12CBB", "_rev": "AB764C", "type": "person", "name": "Darth Vader", "age": 63, "headware": ["Helmet", "Sombrero"], "dark_side": true }
  10. 10. Schema-Free ( JSON) { "_id": "BCCD12CBB", "_rev": "AB764C", "type": "person", "name": "Darth Vader", "age": 63, "headware": ["Helmet", "Sombrero"], "dark_side": true }
  11. 11. Schema-Free ( JSON) { "_id": "BCCD12CBB", "_rev": "AB764C", "type": "person", "name": "Darth Vader", "age": 63, "headware": ["Helmet", "Sombrero"], "dark_side": true }
  12. 12. Schema-Free (JSON) • Features • Document-Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  13. 13. Document-Oriented Not Relational • Documents in the Real World™ • Bills, letters, tax forms… • Same type != same structure • Can be out of date (so what?) • No references
  14. 14. Document-Oriented Not Relational • Documents in the Real World™ Bills, letters, tax forms… Natural Data • • Same type != same structure • Behaviour Can be out of date (so what?) • No references
  15. 15. Schema-Free (JSON) • Features • Document-Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  16. 16. Highly Concurrent Erlang FTW • Functional languages highly appropriate for parallellism • Lightweight “processes” and message- passing; “shared-nothing” • Easy to create fault-tolerant systems
  17. 17. MVCC • Multiversion Concurrency Control • Reads: lock-free; never block • Potential for massive horizontal scaling • Writes: all-or-nothing • Success • Fail: conflict error, fetch and try again
  18. 18. Schema-Free (JSON) • Features • Document-Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  19. 19. ful CRUD • Create HTTP PUT /db/mydocid • Read HTTP GET /db/mydocid • Update HTTP PUT /db/mydocid • Delete HTTP DELETE /db/mydocid
  20. 20. ful Example couch = CouchRest.database!("http:// 127.0.0.1:5984/tweets") tweets_url = "http://twitter.com/statuses/ user_timeline.json" tweets = http.get(tweets_url) couch.bulk_save(tweets)
  21. 21. Cacheability • Both documents and views return ETags • Clients send If-None-Match • CouchDB responds with 304 Not Modified and bypasses potentially expensive lookup • Can use Varnish/Squid as caching proxy • Proxy- friendly
  22. 22. Schema-Free (JSON) • Features • Document-Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  23. 23. JavaScript-Powered Map/Reduce • Map functions extract data from your documents • Reduce functions aggregate intermediate values • The kicker: Incremental B-tree storage
  24. 24. http://horicky.blogspot.com/2008/10/couchdb-implementation.html
  25. 25. Map/Reduce Views Docs Map {"user" : "Chris", function(doc) { {"key": "Alice", "value": 5} "points" : 3 } if (doc.user && doc.points) { {"key": "Bob", "value": 7} {"user": "Joe", emit(doc.user, doc.points); {"key": "Chris", "value": 3} "points" : 10 } } {"key": "Joe", "value": 10} {"user": "Alice", } {"key": "Mary", "value": 9} "points" : 5 } {"user": "Mary", "points" : 9} {"user": "Bob", Reduce "points": 7} function(keys, values, rereduce) { Alice ... Chris: 15 return sum(values); Everyone: 34 }
  26. 26. Map/Reduce Views Docs Map {"user" : "Chris", function(doc) { {"key": "Alice", "value": 5} "points" : 3 } if (doc.user && doc.points) { {"key": "Bob", "value": 7} {"user": "Joe", emit(doc.user, doc.points); {"key": "Chris", "value": 3} "points" : 10 } } {"key": "Joe", "value": 10} {"user": "Alice", } {"key": "Mary", "value": 9} "points" : 5 } {"user": "Mary", "points" : 9} {"user": "Bob", Reduce "points": 7} function(keys, values, rereduce) { Alice … Chris: 15 return sum(values); Everyone: 34 }
  27. 27. Map/Reduce Views Docs Map {"user" : "Chris", function(doc) { {"key": "Alice", "value": 5} "points" : 3 } if (doc.user && doc.points) { {"key": "Bob", "value": 7} {"user": "Joe", emit(doc.user, doc.points); {"key": "Chris", "value": 3} "points" : 10 } } {"key": "Joe", "value": 10} {"user": "Alice", } {"key": "Mary", "value": 9} "points" : 5 } {"user": "Mary", "points" : 9} {"user": "Bob", Reduce "points": 7} function(keys, values, rereduce) { Alice … Chris: 15 return sum(values); Everyone: 34 }
  28. 28. Render Views as HTML lists/index.js /drl/_list/sofa/index/recent-posts?descending=true&limit=8
  29. 29. Server-Side JavaScript • _show for transforming documents • _list for transforming views • _update for transforming PUTs/POSTs • Code-sharing between client and server • Easy deployment
  30. 30. Schema-Free (JSON) • Features • Document-Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  31. 31. Replication • Incremental • Near-real-time • Clustered mirrors • Scheduled • Ad-hoc
  32. 32. “Ground Computing” @jhuggins http://www.flickr.com/photos/mcpig/872293700/
  33. 33. http://www.flickr.com/photos/hercwad/2290378571/
  34. 34. Latency Sucks
  35. 35. Stuart Langridge - Canonical ! !
  36. 36. Con icts
  37. 37. Con ict resolution by example A B ❦
  38. 38. Con ict resolution by example A B ❦
  39. 39. Con ict resolution by example A B ❦ ✿ ❦
  40. 40. Con ict resolution by example A B ✿
  41. 41. Con ict resolution by example A B ✿
  42. 42. Schema-Free (JSON) • Features • Document-Oriented, Not Relational • Highly Concurrent • RESTful HTTP API • JavaScript-Powered Map/Reduce • N-Master Replication • Robust Storage
  43. 43. Robust Storage Append-Only File Structure Designed to Crash Instant-On
  44. 44. Robust
  45. 45. No Silver Bullet! • Read-optimised; not so much for writes • Use ?batch=ok for “don’t care” logging • Views must be specified beforehand • Relational databases are better at highly dynamic queries • But: use externals e.g. Lucene, SQLite
  46. 46. Experiments! http://www.flickr.com/photos/seanstayte/378461237/
  47. 47. `couchapp` • Scripts written in Python to make developing pure CouchDB applications easier • sudo easy_install couchapp • couchapp generate relax && cd relax • couchapp push http://127.0.0.1:5984/mydb
  48. 48. Directory Structure
  49. 49. Resulting Design Doc
  50. 50. _list • Arbitrary JS transformation for views • http://127.0.0.1:5984/mydb/_design/app/ _list/myview?startkey=...&endkey=... • JSON -> HTML, JSON -> XML, ... • E4X nice for XML generation • Iteratively call getRow() and use send(...)
  51. 51. _show • Arbitrary transformation for documents • http://127.0.0.1:5984/mydb/_design/app/ _show/mydoc • function (doc, req) { return “foo”; }
  52. 52. JavaScript Templating • EmbeddedJS (EJS) • <% /* execute arbitrary JS */ %> • <%= /* execute and include result */ %> • new EJS({ text: mytemplate }).render(doc); • John Resig’s Micro-Templating • new template(mytemplate)(doc); • Doesn’t preserve whitespace or LaTeX backslashes
  53. 53. Push Helper Macros • Simple macros to facilitate code re-use • Insert code directly • // !code path/to/code.js • Encode file as JSON: path/to/test.html • // !json path.to.test • // !json _attachments/test.html
  54. 54. Secure Cookie Authentication • Reasonable performance/simplicity of JavaScript implementation • Mutual authentication • Resistance to off-line dictionary attacks based on passive eavesdropping • Passwords stored in a form that is not plaintext-equivalent • Limited resistance to replay attacks
  55. 55. Tamper-Proof Cookies Timestamp + signature => limited forward-security (outside of timestamp window)
  56. 56. Secure Remote Password Protocol (SRP) • Zero-Knowledge Password Proof • Simple to implement in Erlang using BigInt and crypto libraries • JavaScript too slow: over 5s for 1024 bits • Vulnerable to active injection attacks • There are simpler protocols that can be used to give equivalent security • Just add SSL for protection from active attacks (or lobby for TLS-SRP/J-PAKE!)
  57. 57. @couchdbinaction http://books.couchdb.org/relax
  58. 58. Thank you for listening! www.jasondavies.com @jasondavies

×