Couch Foo: CouchDB on rails

Published in: Technology

  1. 1. CouchDB and Ruby George Palmer
  2. 2. “Spending more time on the couch: Using CouchDB to avoid migrations, create offline applications and scale with greater ease”
  3. 3. What is CouchDB?
  4. 4. Database Landscape
  5. 5. Database Landscape MySQL
  6. 6. Database Landscape MySQL Postgres
  7. 7. Database Landscape MySQL Postgres SQLite
  8. 8. Database Landscape MySQL Postgres SQLite
  9. 9. Database Landscape MySQL Postgres SQLite
  10. 10. Database Landscape Relational Database MySQL Postgres SQLite
  11. 11. Database Landscape Relational Database MySQL Postgres Tokyo Cabinet SQLite
  12. 12. Database Landscape Relational Database MySQL Postgres Tokyo Cabinet SQLite Redis
  13. 13. Database Landscape Relational Database MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  14. 14. Database Landscape Relational Database MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  15. 15. Database Landscape Relational Database MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  16. 16. Database Landscape Relational Database Key-Value Databases MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  17. 17. Database Landscape Amazon SimpleDB Relational Database Key-Value Databases MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  18. 18. Database Landscape Amazon SimpleDB MongoDB Relational Database Key-Value Databases MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  19. 19. Database Landscape Amazon SimpleDB MongoDB CouchDB Relational Database Key-Value Databases MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  20. 20. Database Landscape Amazon SimpleDB MongoDB CouchDB Relational Database Key-Value Databases MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  21. 21. Database Landscape Amazon SimpleDB MongoDB CouchDB Relational Database Key-Value Databases MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  22. 22. Database Landscape Amazon SimpleDB MongoDB CouchDB Document Orientated Databases Relational Database Key-Value Databases MySQL Postgres Tokyo Cabinet SQLite Redis Memcachedb
  23. 23. Document Orientated Databases • Data is stored in documents • As in real life, there is no limit on how the information is stored • i.e. schema-free • Documents are stored and accessed by an identifier
  24. 24. Stored as JSON { "_id": "A312C72B", "_rev": "AB746C", "name": "Rails Underground", "start-date": "July 24th", "tags": ["ruby", "rails", "conference"] }
  25. 25. REST interface • Create: HTTP POST /db or PUT /db/docid • Read: HTTP GET /db/docid • Destroy: HTTP DELETE /db/docid • Update: HTTP PUT /db/docid
  26. 26. HTTP REST Benefits • Load Balancing • Caching • ...
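The REST operations on slide 25 can be sketched in Ruby with the standard library's Net::HTTP. This is only an illustration: the `couch_request` helper, the `event-1` document id and the server address are assumptions, not part of the talk. The requests are built but not sent, so no running CouchDB is needed. Two details go beyond the slide: an update has to carry the document's current `_rev` in its body, and a delete has to name the revision being deleted.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical helper: builds (but does not send) the HTTP request for one
# of the four operations on /db/docid.
def couch_request(action, db, doc_id, body: nil, rev: nil)
  path = "/#{db}/#{doc_id}"
  request =
    case action
    when :create, :update
      Net::HTTP::Put.new(path)
    when :read
      Net::HTTP::Get.new(path)
    when :destroy
      # CouchDB needs to know which revision is being deleted.
      Net::HTTP::Delete.new("#{path}?rev=#{URI.encode_www_form_component(rev.to_s)}")
    else
      raise ArgumentError, "unknown action: #{action}"
    end
  if body
    request["Content-Type"] = "application/json"
    request.body = JSON.generate(body)
  end
  request
end

create = couch_request(:create, "rails-underground", "event-1",
                       body: { "name" => "Rails Underground" })
read = couch_request(:read, "rails-underground", "event-1")
# An update is a PUT that includes the current _rev in the document body.
update = couch_request(:update, "rails-underground", "event-1",
                       body: { "_rev" => "AB746C", "name" => "Rails Underground 2009" })
destroy = couch_request(:destroy, "rails-underground", "event-1", rev: "AB746C")

# To send one against a local server (assumed to listen on localhost:5984):
# Net::HTTP.start("localhost", 5984) { |http| puts http.request(read).body }
```

Because every operation is a plain HTTP request, standard load balancers and caches can sit in front of the database, which is the point of slide 26.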
  27. 27. Introducing Views [diagram: four documents stored in CouchDB: { "type": "car", "colour"... } { "type": "van", "colour"... } { "type": "lorry", "colour"... } { "type": "van", "colour"... }]
  28. 28. Introducing Views [diagram: four documents stored in CouchDB: { "type": "car", "colour"... } { "type": "van", "colour"... } { "type": "lorry", "colour"... } { "type": "van", "colour"... }]
  29. 29. Introducing Views [diagram: four documents stored in CouchDB: { "type": "car", "colour"... } { "type": "van", "colour"... } { "type": "lorry", "colour"... } { "type": "van", "colour"... }] We need to look at a subset of the database documents - or a ‘view’ of the database
  30. 30. Views • These views are saved in the database itself under a special identifier • they must start: _design/
  31. 31. Simple View function(doc) { if (doc.type == "van") { emit(null, doc); } }
  32. 32. Executing the query http://localhost:5984/rails-underground/_view/vans/all {"total_rows":2, "offset":0, "rows":[ {"id":"1", "key":null, "value": {"_id":"1","_rev":"4062949995","type":"van","name":"Transit"}}, {"id":"3", "key":null, "value": {"_id":"3","_rev":"1728947327","type":"van","name":"Transporter"}} ] }
  33. 33. Executing the query http://localhost:5984/rails-underground/_view/vans/all?limit=1 {"total_rows":1, "offset":0, "rows":[ {"id":"1", "key":null, "value": {"_id":"1","_rev":"4062949995","type":"van","name":"Transit"}} ] }
  34. 34. Simple View function(doc) { if (doc.type == "van") { emit(doc.name, doc); } }
  35. 35. Executing the query http://localhost:5984/rails-underground/_view/vans/all?startkey=Trans {"total_rows":1, "offset":0, "rows":[ {"id":"3", "key":"Transporter", "value": {"_id":"3","_rev":"1728947327","type":"van","name":"Transporter"}} ] }
  36. 36. Reduce Query • map: function(doc) {if (doc.type == "van") {emit(null, doc);}} // previous example • reduce: function(keys, values) { return keys.length } • This will return a result of 2 • You can optionally rereduce the result set by having a third argument on the reduce function
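To make the map and reduce steps from slides 31 to 36 concrete, here is a small in-memory simulation in Ruby. It is a sketch of the idea, not how CouchDB is implemented: `run_view` is a hypothetical helper, the car and lorry names are made up, and real CouchDB passes richer `keys` to the reduce function and builds its views incrementally.

```ruby
# Example documents; ids 1 and 3 are the vans from the slides, the other two
# names are invented for illustration.
DOCS = [
  { "_id" => "1", "type" => "van",   "name" => "Transit" },
  { "_id" => "2", "type" => "car",   "name" => "Example Car" },
  { "_id" => "3", "type" => "van",   "name" => "Transporter" },
  { "_id" => "4", "type" => "lorry", "name" => "Example Lorry" }
].freeze

# Map step: returns zero or more [key, value] pairs per document, which is
# the job emit(key, value) does inside a CouchDB view function.
VANS_BY_NAME = lambda do |doc|
  doc["type"] == "van" ? [[doc["name"], doc]] : []
end

# Reduce step: collapses all emitted rows into one value, here a count.
COUNT = lambda { |keys, _values| keys.length }

# Hypothetical helper: runs map over every document, sorts the rows by key,
# and applies reduce when one is given.
def run_view(docs, map, reduce = nil)
  rows = docs.flat_map do |doc|
    map.call(doc).map { |key, value| { "id" => doc["_id"], "key" => key, "value" => value } }
  end
  rows = rows.sort_by { |row| row["key"].to_s }
  return rows unless reduce

  reduce.call(rows.map { |row| row["key"] }, rows.map { |row| row["value"] })
end

van_rows  = run_view(DOCS, VANS_BY_NAME)
van_count = run_view(DOCS, VANS_BY_NAME, COUNT)
puts van_rows.map { |row| row["key"] }.inspect
puts van_count
```

Only the two van documents produce rows, so the reduce step counts two keys, matching the result quoted on slide 36.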
  37. 37. Inline Associations { "user" : "George", "roles" : ["supervisor", "teamleader"], ... }
  38. 38. Associations in 1 query function(doc) { if (doc.type == "post") { emit([doc._id, 0], doc); } else if (doc.type == "comment") { emit([doc.post, 1], doc); } }
  39. 39. Associations in 1 query http://localhost:5984/rails-underground/_view/posts_comments/all "key":["1",0], "value":{"_id":"1", "type":"post", "text":"My Blog Post"} "key":["2",0], "value":{"_id":"2", "type":"post", "text":"My 2nd Blog Post"} "key":["3",0], "value":{"_id":"3", "type":"post", "text":"My 3rd Blog Post"} "key":["1",1], "value":{"_id":"3", "type":"comment", "text":"You rock dude", "post":"1"} "key":["2",1], "value":{"_id":"3", "type":"comment", "text":"Man you suck", "post":"2"}
  40. 40. Associations in 1 query http://localhost:5984/rails-underground/_view/posts_comments/all?startkey=["1"]&endkey=["1",2] "key":["1",0], "value":{"_id":"1", "type":"post", "text":"My Blog Post"} "key":["2",0], "value":{"_id":"2", "type":"post", "text":"My 2nd Blog Post"} "key":["3",0], "value":{"_id":"3", "type":"post", "text":"My 3rd Blog Post"} "key":["1",1], "value":{"_id":"3", "type":"comment", "text":"You rock dude", "post":"1"} "key":["2",1], "value":{"_id":"3", "type":"comment", "text":"Man you suck", "post":"2"}
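The composite-key trick on slides 38 to 40 relies on rows being sorted by key, so that a post (`[post_id, 0]`) is immediately followed by its comments (`[post_id, 1]`). The Ruby sketch below simulates that ordering and the `startkey`/`endkey` range in memory. The comment ids `c1` and `c2` and the `rows_between` helper are assumptions for illustration, and Ruby's array comparison only stands in for CouchDB's collation, which it happens to match for keys of this shape.

```ruby
POSTS_AND_COMMENTS = [
  { "_id" => "1",  "type" => "post",    "text" => "My Blog Post" },
  { "_id" => "2",  "type" => "post",    "text" => "My 2nd Blog Post" },
  { "_id" => "3",  "type" => "post",    "text" => "My 3rd Blog Post" },
  { "_id" => "c1", "type" => "comment", "text" => "You rock dude", "post" => "1" },
  { "_id" => "c2", "type" => "comment", "text" => "Man you suck",  "post" => "2" }
].freeze

# Same logic as the view function: posts get [id, 0], comments get [post, 1].
def composite_key(doc)
  case doc["type"]
  when "post"    then [doc["_id"], 0]
  when "comment" then [doc["post"], 1]
  end
end

# Build the view rows and sort them by key, as a view index would be.
SORTED_ROWS = POSTS_AND_COMMENTS
  .map { |doc| { "key" => composite_key(doc), "value" => doc } }
  .sort_by { |row| row["key"] }

# Hypothetical helper: keeps the rows whose key lies in [startkey, endkey].
def rows_between(rows, startkey, endkey)
  rows.select do |row|
    (row["key"] <=> startkey) >= 0 && (row["key"] <=> endkey) <= 0
  end
end

post_one = rows_between(SORTED_ROWS, ["1"], ["1", 2])
puts post_one.map { |row| row["value"]["text"] }.inspect
```

The range `["1"]` to `["1", 2]` selects post 1 followed by its comment, so one request returns a post together with its associated records.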
  41. 41. Indexes • Relational databases typically pay the indexing cost at the point of insertion • CouchDB pays the cost when the view is queried • This is expensive if you write a lot and read infrequently • You can create an update script that CouchDB calls when the database is updated
  42. 42. Replication
  43. 43. Replication
  44. 44. Replication
  45. 45. Replication
  46. 46. Replication Nodes that go down catch up
  47. 47. Replication Partial Replicas
  48. 48. Replication Partial Replicas Offline copies
  49. 49. Conflict Management • As well as an _id field, each document has a _rev • Each time a document is updated, so is the _rev • Used to help determine the ‘winning’ document when conflicts occur • Old revisions are available until the database is compacted
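The role of `_rev` on slide 49 can be illustrated with a toy in-memory store that rejects writes based on a stale revision, much as CouchDB answers such a write with a conflict error. Everything here is hypothetical: `TinyStore` and `StaleRevision` are invented names, and the integer revision counter only stands in for CouchDB's real revision strings.

```ruby
# Raised when a write is based on an out-of-date revision.
class StaleRevision < StandardError; end

# Toy document store: every successful write bumps the document's _rev.
class TinyStore
  def initialize
    @docs = {}
  end

  # Returns a copy so callers cannot modify the stored document directly.
  def get(id)
    @docs.fetch(id).dup
  end

  # Accepts the write only if the caller saw the latest revision.
  def put(id, doc)
    current = @docs[id]
    if current && current["_rev"] != doc["_rev"]
      raise StaleRevision, "document #{id} is at rev #{current['_rev']}, got #{doc['_rev'].inspect}"
    end
    next_rev = current ? current["_rev"] + 1 : 1
    @docs[id] = doc.merge("_id" => id, "_rev" => next_rev)
    next_rev
  end
end

store = TinyStore.new
store.put("event", { "name" => "Rails Underground" })

# Two clients read the same revision of the document.
alice = store.get("event")
bob   = store.get("event")

# Alice saves first, which moves the document on to revision 2.
store.put("event", alice.merge("tags" => ["ruby"]))

# Bob's copy still carries revision 1, so his write is rejected.
bob_rejected =
  begin
    store.put("event", bob.merge("tags" => ["rails"]))
    false
  rescue StaleRevision
    true
  end
puts bob_rejected
```

The losing writer has to fetch the latest revision and reapply its change. Conflicts that arise between replicas are handled differently, by picking a winning revision as the slide describes.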
  50. 50. When to use CouchDB (and also when not)
  51. 51. Scenario I: Schema-less databases • FriendFeed moved to using MySQL in a schema-less fashion because: • Adding new features had become so difficult. In particular, adding new indexes became too expensive for tables with 10-20 million rows
  52. 52. Rules of Database degradation • Fields become optional • Relationships become Many-to-Many [ Taken from http://push.cx/2009/rules-of-database-app-aging ]
  53. 53. Scenario II: Real world models “The real world doesn’t map to a relational database very well”
  54. 54. 5ft Shelf - Envisaged Database Design [diagram: Shelf M:M Book; Book linked to Recommendations, Price and Category]
  55. 55. 5ft Shelf - Actual Database Design [diagram: Book 1:M Recommendations; Book 1:M Book Edition; Shelf M:M Book Edition; Book Edition linked to ISBN, Price and Category]
  56. 56. 5ft Shelf - CouchDB Database Design [diagram: Shelf M:M Book; Book 1:M Recommendations; Prices, ISBNs and Categories held as inline associations on Book]
  57. 57. Scenario III: Using replication/sharding • Offline Capability: • Satellite Office • Desktop Apps • ‘Edge’ Databases • Large number of databases needed • Even 1 database per user
  58. 58. When not to use CouchDB • Estate Agent application • Each house has x bedrooms, y bathrooms... • Financial Applications • (Generally stuff that’s very fixed)
  59. 59. Introducing couch_foo (An exercise in learning how ActiveRecord really works)
  60. 60. Why? • I had used all the existing Ruby libraries at the time and was frustrated by how far they departed from the ActiveRecord API • Opinion is split on whether providing a full API for CouchDB is a good thing or whether more of the logic should be in the application
  61. 61. CouchFoo Models class Address < CouchFoo::Base property :number, Integer property :street, String property :postcode end # Generic types are fine if to_json and from_json are defined on the object
  62. 62. CouchFoo Models (2) • The _id field is automatically assigned by CouchFoo and is guaranteed to be a UUID • View generation is handled automatically so you can just call the finders • Nearly all AR finders are available • Address.all • User.find_by_login, ... • (Obviously #find_by_sql isn’t)
  63. 63. CouchFoo Models (3) • Associations • Validations • Callbacks • Inheritance • Calculations • ...
  64. 64. How finders work • Each model stores its name in a ruby_class attribute in the document. • This makes it easy to filter out in a view • And then regenerate the model on retrieval • Views are generated automatically with one for each attribute lookup on a given model • Users can define their own custom view in the model as well
  65. 65. Finders • User.find(:first, :conditions => {:login => "george"}) • User.find(:first, :conditions => {:login => "george"}, :update => false) • User.find(:all, :limit => 10, :offset => 3) • User.find(:all, :use_key => [:login], :startkey => "fred", :endkey => "george")
  66. 66. Ordering Confusion • Relational databases order records by id which normally keeps incrementing • Thus newer records have a higher id • In CouchDB inserting a record may have a key starting ‘a126’ whereas the next insertion may have a key starting ‘3a82’ • Thus find(:all) operations aren’t always in the order you expect • This can confuse the user too - “Why is my newest photo in the middle of the list?”
  67. 67. Ordering Confusion (2) class Address < CouchFoo::Base property :number, Integer property :street, String property :postcode property :created_at, DateTime default_sort :created_at end
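The ordering problem on slide 66, and why sorting on `created_at` fixes it, can be shown with plain Ruby hashes. The photo titles, the third id and the timestamps are invented for illustration. Only the ids `a126` and `3a82` come from the slide.

```ruby
# Documents in the order they were inserted. The ids are arbitrary strings,
# so they say nothing about insertion order.
PHOTOS = [
  { "_id" => "a126", "title" => "first upload",  "created_at" => Time.utc(2009, 7, 24, 9, 0) },
  { "_id" => "3a82", "title" => "second upload", "created_at" => Time.utc(2009, 7, 24, 9, 5) },
  { "_id" => "f00d", "title" => "third upload",  "created_at" => Time.utc(2009, 7, 24, 9, 10) }
].freeze

# Ordering by id, which is what a lookup keyed on _id gives you.
by_id = PHOTOS.sort_by { |photo| photo["_id"] }.map { |photo| photo["title"] }

# Ordering by an explicit timestamp, the effect of default_sort :created_at.
by_created_at = PHOTOS.sort_by { |photo| photo["created_at"] }.map { |photo| photo["title"] }

puts by_id.inspect
puts by_created_at.inspect
```

Sorted by id, the second upload comes before the first. Sorted by timestamp, the photos appear in the order the user uploaded them.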
  68. 68. Gotchas • To keep views (indexes) to a minimum, ordering is done by CouchDB if the attribute isn’t exposed in the key • This means using :order and :limit together may not provide the results expected • You can get round this by adding extra attributes to the query, e.g. User.find(:all, :use_key => [:name, :admin], :conditions => {:admin => true}, :order => :name, :limit => 10)
  69. 69. Bulk Saving CouchFoo::Base.bulk_save_default = true User.all.size # => 27 User.create(:name => "george") User.all.size # => 27 User.database.commit User.all.size # => 28 # It’s the developer’s responsibility to commit # In Rails the after_filter can be used to avoid forgetting
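The deferred-write behaviour on slide 69 can be approximated with a small buffer that collects documents and turns them into one bulk request on commit. This sketch does not show CouchFoo's internals: `BufferedWriter` is a hypothetical class, and the request is built for CouchDB's `_bulk_docs` endpoint but never sent.

```ruby
require "json"
require "net/http"

# Hypothetical write buffer: save only queues, commit flushes everything.
class BufferedWriter
  attr_reader :pending

  def initialize(db)
    @db = db
    @pending = []
  end

  # Queue a document. Nothing reaches the database yet.
  def save(doc)
    @pending << doc
    self
  end

  # Build one POST carrying every queued document, then clear the queue.
  def commit_request
    request = Net::HTTP::Post.new("/#{@db}/_bulk_docs")
    request["Content-Type"] = "application/json"
    request.body = JSON.generate({ "docs" => @pending })
    @pending = []
    request
  end
end

writer = BufferedWriter.new("rails-underground")
writer.save({ "ruby_class" => "User", "name" => "george" })
writer.save({ "ruby_class" => "User", "name" => "fred" })
queued_before_commit = writer.pending.size

bulk = writer.commit_request
puts bulk.path
puts JSON.parse(bulk.body)["docs"].length
```

Until the commit happens the new documents exist only in the buffer, which is why the count on the slide stays at 27 until `User.database.commit` is called.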
  70. 70. Performance • For updates will perform worse when not in memory - eg update_all • With bulk_save can perform much better than relational databases • CouchDB 0.9 offers a few performance enhancements over 0.8 • Performance implications fully documented in CouchFoo#Base
  71. 71. Bonus
  72. 72. ActiveRecord tips • model.column_names # => ["id", "login", ...] • also columns and columns_hash • instance.attributes # => {"name" => "George", ...} • model.with_scope User.with_scope(:find => { :conditions => "state = 3" }) do find(1) # => SELECT * from users WHERE state = 3 AND id = 1 end
  73. 73. Resources • Rails Freelancer • @Georgio_1999 • http://rowtheboat.com • http://github.com/georgepalmer
