Putting Rails and CouchDB on the Cloud - IndicThreads Cloud Computing Conference 2011



Session presented at the 2nd IndicThreads.com Conference on Cloud Computing held in Pune, India on 3-4 June 2011.


Session Abstract: Apache CouchDB is a document-oriented NoSQL database that can be queried and indexed in a MapReduce fashion using JavaScript. CouchDB offers an easy introduction to the world of NoSQL. In this session we will learn how to work with CouchDB, how to install it on an Amazon EC2 instance, and how to insert and query data in it. We will then create a Ruby on Rails application, host it on the cloud through Heroku, and integrate it with our CouchDB instance.

After this session, the audience will be able to work with CouchDB, understand its strengths, and run it on an EC2 instance. The audience will also appreciate how easy it is to host Rails applications on Heroku, and how quickly one can launch and scale applications on the cloud by combining these two technologies.

Rocky Jaiswal is a Software Architect at McKinsey & Company and has more than 8 years of experience in software analysis, design and programming. His primary area of expertise is application development using Java/JEE/Spring & Hibernate. He has worked as a consultant for major investment banks like Goldman Sachs and Morgan Stanley. He has extensive international experience and has worked in the UK, USA, Netherlands, Japan and Mexico. Rocky is a strong believer in Agile methodologies for software development, particularly Scrum and XP.

Published in: Technology, Business

  1. 06/06/11
  2. ABOUT ME
     Rocky Jaiswal
     Daytime job: Software Architect at McKinsey & Co
     Programmer / Agilist at heart
     Have been programming for almost 9 years, plan to do it for a long, loooong time
     I love Java/Ruby/JRuby/jQuery and anything to do with web application development
     Want to build good-looking, scalable and performant websites that help people
     Blog: www.rockyj.in
     Twitter: www.twitter.com/whatsuprocky
  3. WHY WE NEED THE CLOUD
     I am a developer. I don't want the hassle of maintaining infrastructure.
     We are a small organization. We want cheap and flexible infrastructure.
     I / we want to scale easily, be it scaling up or scaling down.
     * Choice of technology also determines how easily you can scale, e.g. use of NoSQL instead of an RDBMS.
  4. A WORKING EXAMPLE
     http://biblefind.in
  5. HELLO COUCHDB
     Apache CouchDB is a document-oriented database that can be queried and indexed in a MapReduce fashion using JavaScript.
     CouchDB also offers incremental replication with bi-directional conflict detection and resolution.
     CouchDB provides a RESTful JSON API that can be accessed from any environment that allows HTTP requests.
     + It offers an easy introduction to the world of NoSQL.
  6. HELLO COUCHDB: DOCUMENT ORIENTED
     A different way to model your data.
     Data is stored in documents.
     Think of a document as a de-normalized table row.
  7. HELLO COUCHDB: MAPREDUCE
     MapReduce: Divide and Rule for programmers.
     How would you count the occurrences of each word in a book, given a group of helpers?
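
The word-count question on this slide can be sketched in plain Ruby. This is only an illustration of the map and reduce steps, not how CouchDB runs them (CouchDB views are written in JavaScript). The sample pages and the method names `map_page` and `reduce_counts` are made up for the example.

```ruby
# Map step: each helper takes one page and emits a [word, 1] pair per word.
def map_page(page)
  page.downcase.scan(/[a-z']+/).map { |word| [word, 1] }
end

# Reduce step: add up all the counts emitted for the same word.
def reduce_counts(pairs)
  pairs.each_with_object(Hash.new(0)) { |(word, count), totals| totals[word] += count }
end

# Hypothetical sample "book", split into pages that could go to different helpers.
pages = [
  "In the beginning was the Word",
  "and the Word was with God"
]

emitted = pages.flat_map { |page| map_page(page) }
word_counts = reduce_counts(emitted)

# Print each word with its total count.
word_counts.each { |word, count| puts "#{word}: #{count}" }
```

Because each page is mapped independently, the map step could run on many helpers at once; only the reduce step needs to see the pairs for a given word together.
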
  8. HELLO COUCHDB: JSON/HTTP
     When we talk of databases we talk of drivers. CouchDB's protocol is HTTP.
     The data exchange + storage language is JSON:
       {
         "Subject" : "I like JSON",
         "Author" : "Rocky",
         "Tags" : [ "Web", "Programming", "Data Exchange" ]
       }
     And the queries are written in JavaScript.
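
Since the protocol is plain HTTP, storing the document from this slide needs nothing beyond Ruby's standard library. The sketch below only builds the request. The host, port, database name `blog` and document id `i-like-json` are assumptions for illustration and do not come from the talk.

```ruby
require "json"
require "net/http"
require "uri"

# The document from the slide, as a plain Ruby hash.
post = {
  "Subject" => "I like JSON",
  "Author"  => "Rocky",
  "Tags"    => ["Web", "Programming", "Data Exchange"]
}

# Assumed values: a local CouchDB on its default port, a database called
# "blog" and a document id "i-like-json". None of these come from the talk.
uri = URI("http://localhost:5984/blog/i-like-json")

# PUT /<database>/<doc id> with a JSON body creates the document.
request = Net::HTTP::Put.new(uri)
request["Content-Type"] = "application/json"
request.body = JSON.generate(post)

# Sending is left commented out so the sketch runs without a CouchDB server.
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.body

puts "#{request.method} #{request.path}"
puts request.body
```
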
  10. CREATING VIEWS IN COUCHDB
      Views are like queries. Hmmm… more like stored procedures.
      Views are expressed as Map + Reduce functions written in JavaScript.
      For example, my view to query all the verses:
        {
          "lookup": {
            "map": "function(doc){ if (doc.book && doc.chapter && doc.verse && doc.text){ var key = [doc.book, doc.chapter, doc.verse]; emit(key, doc.text); } }"
          }
        }
      CouchDB runs the function for every document in the DB and stores the results in a B-tree.
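
To see what the map function on this slide produces, here is a plain Ruby stand-in for it. It is a sketch under assumptions: the sample documents are invented, a sorted array stands in for the B-tree, and Ruby's array ordering is used in place of CouchDB's own key collation.

```ruby
# Hypothetical sample documents; field names follow the map function on the slide.
docs = [
  { "book" => "Mark", "chapter" => 2, "verse" => 3,  "text" => "verse text B" },
  { "book" => "John", "chapter" => 3, "verse" => 16, "text" => "verse text A" },
  { "title" => "not a verse" } # skipped: lacks the required fields
]

# Ruby stand-in for the JavaScript map function: emit [key, value] rows.
def map_verse(doc)
  return [] unless doc["book"] && doc["chapter"] && doc["verse"] && doc["text"]
  [[[doc["book"], doc["chapter"], doc["verse"]], doc["text"]]]
end

# Run the map over every document and keep the rows sorted by key.
index = docs.flat_map { |doc| map_verse(doc) }.sort_by { |key, _value| key }

# Binary search over the sorted rows, a rough analogue of a B-tree lookup.
def lookup(index, key)
  row = index.bsearch { |row_key, _value| (row_key <=> key) >= 0 }
  row && row[0] == key ? row[1] : nil
end

puts lookup(index, ["Mark", 2, 3])
```

The point of the sorted index is that a lookup by `[book, chapter, verse]` never has to rescan the documents.
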
  11. THE COUCHREST GEM
      CouchRest lightly wraps CouchDB's HTTP API, managing JSON serialization and remembering the URI paths to CouchDB's API endpoints so you don't have to.
        @db = CouchRest.database!("")
        @db.save_doc({:key => "value", :another_key => "another value"})
  12. THE RAILS APPLICATION
      So our back-end is set. We only need a front-end now.
      Nothing much needs to be done:
      - 1 Controller
      - 1 View
      - Some jQuery for autocomplete
  13. REGEX NIGHTMARES
      Matthew 1: one whole chapter
      Mark 2:3: one verse
      Psalms 23:1-4: a set of verses
      For God so loved the world: free text
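
One way to tell the four input shapes on this slide apart is a single regular expression with optional verse parts. This is a sketch, not the pattern used by the application. The constant `REFERENCE`, the method `parse_query` and the support for numbered book names such as "1 John" are assumptions made for the example.

```ruby
# One pattern for "Book chapter", "Book chapter:verse" and "Book chapter:from-to".
# Book names may start with a digit (e.g. "1 John"); that is an assumption,
# the slide only lists the four input shapes.
REFERENCE = /\A\s*((?:[1-3]\s+)?[A-Za-z]+(?:\s+[A-Za-z]+)*)\s+(\d+)(?::(\d+)(?:\s*-\s*(\d+))?)?\s*\z/

# Classify a search box entry as a chapter, a verse, a verse range or free text.
def parse_query(input)
  match = REFERENCE.match(input)
  return { type: :free_text, text: input.strip } unless match

  book, chapter, from, to = match.captures
  result = { book: book, chapter: chapter.to_i }
  if from.nil?
    result.merge(type: :chapter)
  elsif to.nil?
    result.merge(type: :verse, verse: from.to_i)
  else
    result.merge(type: :range, from: from.to_i, to: to.to_i)
  end
end

# The four inputs from the slide.
["Matthew 1", "Mark 2:3", "Psalms 23:1-4", "For God so loved the world"].each do |query|
  p parse_query(query)
end
```

Anything that does not look like a reference falls through to free text, which is the case a full-text index is meant to handle.
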
  14. LUCENE INTEGRATION
      couchdb-lucene (https://github.com/rnewson/couchdb-lucene), a Java project
      CouchDB view:
        @db.save_doc({
          "_id" => "_design/lucene",
          :fulltext => {
            :by_text => {
              :index => "function(doc) { var ret=new Document(); ret.add(doc.text); return ret }"
            }
          }
        })
  15. LUCENE INTEGRATION (CONTD.)
      CouchDB config:
        [external]
        fti=python /home/rocky/Apps/couchdb-lucene-0.7-SNAPSHOT/tools/couchdb-external-hook.py
        [httpd_db_handlers]
        _fti = {couch_httpd_external, handle_external_req, <<"fti">>}
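
With the `_fti` handler registered per database as in the config on this slide, a free-text search becomes an ordinary GET request. The sketch below only builds the URL. The path layout `/<database>/_fti/_design/lucene/by_text` is my understanding of how couchdb-lucene exposes an index through the external hook and should be checked against the README of the installed version. The host, port and database name `bible` are hypothetical.

```ruby
require "uri"

# Build the search URL for the "by_text" index in the "_design/lucene" document.
# Assumed: host, port and database name; the path layout is also an assumption.
def fulltext_search_uri(query, host: "localhost", port: 5984, database: "bible")
  URI::HTTP.build(
    host: host,
    port: port,
    path: "/#{database}/_fti/_design/lucene/by_text",
    query: URI.encode_www_form(q: query)
  )
end

puts fulltext_search_uri("For God so loved the world")
```
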
  16. USING HEROKU
      Rails application hosting provider
      Free for 1 "Dyno" + 1 shared database
  17. USING HEROKU (CONTD.)
  18. USING SLICEHOST
  19. PUTTING IT OUT IN THE BIG BAD WORLD
      Buy a domain name from http://www.godaddy.com, or a domain provider of your choice.
      In Heroku, add the domain name.
      Add the Zerigo add-on in Heroku.
      In GoDaddy's admin console, point your name servers to Zerigo's name servers.
      Wait …
  20. SCALING
      Heroku makes scaling dead easy
  21. SCALING
      Heroku makes scaling dead easy (if we were using SQL)
  22. SCALING
      In the NoSQL world, replication is a first-class citizen:
        POST /_replicate
        { "source": "a", "target": "b", "continuous": true }
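
The replication request on this slide can be built with the standard library alone. The sketch below prepares one request per direction, since CouchDB replication is one-way per request and a two-way setup needs both. The helper name `replication_request` and the server address are assumptions; "a" and "b" are the database names from the slide.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed: a CouchDB server on localhost at the default port.
REPLICATE_URI = URI("http://localhost:5984/_replicate")

# Build (but do not send) a continuous replication request from source to target.
def replication_request(source, target)
  request = Net::HTTP::Post.new(REPLICATE_URI)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(
    "source"     => source,
    "target"     => target,
    "continuous" => true # a JSON boolean, not the string "true"
  )
  request
end

# One request per direction gives two-way replication between "a" and "b".
requests = [replication_request("a", "b"), replication_request("b", "a")]

requests.each { |request| puts "#{request.method} #{request.path} #{request.body}" }

# To actually send them against a running server:
# Net::HTTP.start(REPLICATE_URI.host, REPLICATE_URI.port) do |http|
#   requests.each { |request| puts http.request(request).body }
# end
```
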
  23. SCALING (HOT BACKUP)
  25. SCALING WITH COUCHDB LOUNGE
      Have a look at CouchDB Lounge. It consists of:
      - a dumb proxy that is a module for nginx
      - a smart proxy that distributes work
      All in all, make a cluster:
      - Have continuous replication
      - Use Lounge to distribute load
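
The general idea behind a proxy that spreads documents over several CouchDB nodes can be sketched as hashing the document id to pick a node. This is not Lounge's code or configuration. The node URLs, the constant `SHARDS`, the method `shard_for` and the choice of CRC32 are all assumptions for illustration.

```ruby
require "zlib"

# Hypothetical CouchDB nodes behind the proxy.
SHARDS = ["http://couch1:5984", "http://couch2:5984", "http://couch3:5984"].freeze

# Pick a node for a document id: the same id always maps to the same node.
def shard_for(doc_id, shards = SHARDS)
  shards[Zlib.crc32(doc_id) % shards.length]
end

["John-3-16", "Mark-2-3", "Psalms-23-1"].each do |doc_id|
  puts "#{doc_id} -> #{shard_for(doc_id)}"
end
```

Reads and writes for a given document then always go to one node, while a view query has to be sent to every node and the results merged.
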
  26. WHEN NOT TO USE COUCHDB
      When the number of writes far exceeds the number of reads (and the data volume is very high)
      - This would create a bottleneck for replication
      - And you may encounter more conflicts
      When you need ad-hoc queries
      - You cannot use the power of views in this case
      Use CouchDB's brother MongoDB in this case.
  27. THANKS