Getting Distributed
               With Ruby (On Rails)

                  by Martin Sadler


Implementing distributed processing at Working With Rails

Published in: Technology
Getting Distributed (With Ruby On Rails)

  1. Getting Distributed With Ruby (On Rails)
     by Martin Sadler
     Implementing distributed processing at Working With Rails
  2. dsc.net
  3. DSC
     • Hosting, web application development, and consultancy
     • Hosts the crew email system and carried out the intranet integration for Virgin Atlantic
     • Runs several large forums, e.g. pprune.org (150k members, 2600 at one time)
     • Also AskDirect, Mumsnet
  4. http://www.workingwithrails.com
  5. Working With Rails
     • Largest index of Ruby on Rails developers in the world
     • Over 7000 people listed
     • From 104 countries
     • Find out who's who
     • Connect with others
     • Find a developer for a project / employment
     • Also lists groups, companies, and sites
  6. Why Distributed?
  7. Working With Rails
  8. Distributed Ruby?
  9. Some background
  11. Current
      • Uses FeedTools (nice lib!)
      • Great at parsing feed formats
      • Good for small sites
  16. Issues
      • No longer supported by author
      • Feeds fetched at the user's expense (and therefore tying up Mongrels)
      • Feeds are cached locally but are parsed on request
      • Known problems when scaling (search on Google)
  21. The Result
      • Occasional slow-loading pages that include third-party feeds
      • Stale feed items
      • Inconsistent feed items
      • No good!
  22. The Challenge
      • To keep content fresh, push traffic to WWR and out to the blog owners
      • Different feeds and sources to consider: Flickr, Twitter, Blog, Delicious
      • Each needs to display in multiple places in many ways
      • But also want to do some funkier stuff (as you'll see a bit later)
  23. Ruby on Rails distributed processing choices
      RingyDingy, AP4R, Rinda, DRb, Starfish, BackgroundRB, Reliable-Message
  24. DRb
      Basic building block of all other Ruby distributed libs.
      “DRb literally stands for "Distributed Ruby". It is a library that allows you to send and receive messages from remote Ruby objects via TCP/IP. Sound kind of like RPC, CORBA or Java's RMI? Probably so. This is Ruby's simple as dirt answer to all of the above.”
      http://chadfowler.com/ruby/drb.html
  27. Quick DRb Example

      Server (server.rb):

          require 'drb'

          class TestServer
            def doit
              "Hello, Distributed World"
            end
          end

          aServerObject = TestServer.new
          DRb.start_service('druby://localhost:9000', aServerObject)
          DRb.thread.join # Don't exit just yet!

      Client (client.rb):

          require 'drb'

          DRb.start_service()
          obj = DRbObject.new(nil, 'druby://localhost:9000')
          # Now use obj
          p obj.doit

      Running it:

          > ruby server.rb
          > ruby client.rb
          "Hello, Distributed World"
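For trying the same idea without two terminals, here is a minimal single-file sketch. It is not from the talk: it assumes the server and the client can share one process, binds to 127.0.0.1 rather than localhost (see the Pitfalls slide), and uses port 0 so the OS picks a free port.

```ruby
require 'drb'

# Same front object as the slide's server.
class TestServer
  def doit
    "Hello, Distributed World"
  end
end

# Port 0 asks the OS for any free port; server.uri then reports the real one.
server = DRb.start_service('druby://127.0.0.1:0', TestServer.new)

# Client side: a proxy object for the front object behind that URI.
# From another process the call would travel over TCP; inside the
# server's own process DRb may short-circuit it to a direct call.
obj = DRbObject.new_with_uri(server.uri)
greeting = obj.doit
puts greeting

DRb.stop_service
```

A real deployment would keep the two halves in separate processes, as on the slide, and replace `DRb.stop_service` on the server side with `DRb.thread.join`.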
  28. Basics
      • Server
      • Clients / Workers
      • Communicate via messages
      http://en.wikipedia.org/wiki/Distributed_computing
  29. BackgroundRB
      • Ruby job server and scheduler
      • Integrates with Rails
      • Quite complex
      • Some issues between versions, but many favor it above the other libs
      • Best known
      http://backgroundrb.rubyforge.org/
  30. Starfish
      • Inspired by Google's MapReduce
      • Easy to understand code
      • Stability?
      • No longer supported by author?
      http://rufy.com/starfish/doc/
  31. reliable-message
      • Solid library
      • Easy to understand API
      • Bit more involved to set up
      • Can be integrated with Rails
      • Ongoing development
      http://trac.labnotes.org/cgi-bin/trac.cgi/wiki/Ruby/ReliableMessaging
  32. AP4R
      • Asynchronous Processing for Ruby
      • Lesser known lib from Japan (new kid on the block)
      • Integrates with Rails
      • Built on top of reliable-message
  33. AP4R
      • AP4R, Asynchronous Processing for Ruby, is an implementation of reliable asynchronous message processing. It provides message queuing and message dispatching.
      • Using asynchronous processing, we can cut down the turnaround time of web applications by queuing, or utilize more machine power by load balancing.
  34. AP4R Features
      • Business logic can be implemented as simple web applications, or Ruby code, whether it's called asynchronously or synchronously.
      • Asynchronous messaging is made reliable by RDBMS persistence (now MySQL only) or file persistence, with the help of reliable-msg.
      • Load balancing over multiple AP4R processes on single/multiple servers is supported.
      • Asynchronous logic is called via various protocols, such as XML-RPC, SOAP, HTTP PUT, and more.
      • Using the store-and-forward function, an at-least-once QoS level is provided.
  35. AP4R Process Flow
      • A client (e.g. a web browser) makes a request to a web server (Apache, Lighttpd, etc.).
      • A Rails application (the synchronous logic) is executed on Mongrel via mod_proxy or something similar.
      • At the end of the synchronous logic, message(s) are put to AP4R (AP4R provides a helper).
      • Once the synchronous logic is done, the client receives a response immediately.
      • AP4R queues the message and sends it to the web server as an asynchronous request.
      • The asynchronous logic, implemented as a usual Rails action, is executed.
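The store-and-forward idea behind this flow can be sketched in plain Ruby. This is an illustration only and does not use AP4R's API: the `StoreAndForward` class, its methods, and the `feed_id` messages are hypothetical, and the in-memory Array stands in for AP4R's MySQL or file persistence (so it is not durable).

```ruby
# Hypothetical stand-in for a persistent message store.
class StoreAndForward
  attr_reader :pending, :delivered

  def initialize
    @pending = []     # messages stored but not yet acknowledged
    @delivered = []   # record of successful deliveries
  end

  # The synchronous action stores a message and returns straight away.
  def put(message)
    @pending << message
    :accepted
  end

  # Later, try each stored message; drop it only when the handler succeeds.
  def forward
    @pending.reject! do |message|
      begin
        yield message
        @delivered << message
        true             # acknowledged, remove from the store
      rescue StandardError
        false            # keep it for the next attempt
      end
    end
  end
end

queue = StoreAndForward.new
queue.put(feed_id: 1)
queue.put(feed_id: 2)

attempts = Hash.new(0)
handler = lambda do |message|
  attempts[message[:feed_id]] += 1
  # Simulate a failure the first time feed 2 is processed.
  raise 'temporary failure' if message[:feed_id] == 2 && attempts[2] == 1
end

queue.forward(&handler)   # feed 1 succeeds, feed 2 fails and stays stored
queue.forward(&handler)   # feed 2 is retried and succeeds
```

Because a message is only removed after its handler finishes, a handler can run more than once for the same message. That is what "at least once" means in practice, so the asynchronous logic should tolerate repeats.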
  36. AP4R example
      A Hello World app comes with AP4R to get you started.
      Nice guide also here:
      http://rubyforge.org/frs/download.php/13312/AP4R_Users_Guide_EN.pdf
  37. Complementary Services
  38. Rinda
      • Rinda::Ring allows DRb services and clients to automatically find each other without knowing where they live.
      • DRb servers register themselves with a RingServer which allows clients to find the servers they need. Many servers may register themselves with the RingServer. The DRb servers don't need to run on the same machine.
      http://segment7.net/projects/ruby/drb/rinda/ringserver.html
  39. RingyDingy
      • RingyDingy automatically registers a service with a RingServer. If communication between the RingServer and the RingyDingy is lost, RingyDingy will re-register its service with the RingServer when it reappears.
      http://seattlerb.rubyforge.org/RingyDingy/
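Registration and lookup can be sketched with a bare tuple space, which is what a RingServer is built around. This is a simplified, in-process illustration: it leaves out the UDP discovery and the separate server process that Rinda::Ring and RingyDingy provide, the `FeedFetcher` class is hypothetical, and the tuple layout is assumed to mirror the one Rinda's ring tools use.

```ruby
require 'rinda/tuplespace'

# Hypothetical service used only for this sketch.
class FeedFetcher
  def fetch(url)
    "fetched #{url}"
  end
end

# In-process tuple space standing in for the one a RingServer exposes.
registry = Rinda::TupleSpace.new

# Register: [:name, service_name, object, description]
# (assumed to match the layout used by Rinda's ring tools).
registry.write([:name, :FeedFetcher, FeedFetcher.new, 'fetches feeds'])

# Look up: nil acts as a wildcard in a tuple template.
_, _, service, description = registry.read([:name, :FeedFetcher, nil, nil])
result = service.fetch('http://example.com/feed')
puts "#{description}: #{result}"
```

Across machines, the registered object would be a DRb reference and clients would locate the tuple space through the ring rather than holding it directly.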
  40. Feeds in WWR
      [Diagram: an AP4R Server holding @queue, a Feed Queue process, and Feed Fetcher workers 1, 2, ... N]
  41. Running the code
      ruby script/ap4r_start -c config/queues_mysql.cfg
      rake background:feed_queue
      rake background:feed_retrieve
  42. Key points
      • The Feed Queue fetches the URLs of stale feeds
      • Each worker (client) has the Rails environment loaded
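The division of labour between the Feed Queue and the Feed Fetchers might look roughly like this. It is a sketch under assumptions, not the WWR code: the `Feed` struct, the one-hour staleness threshold, and the example URLs are made up, Ruby's in-memory `Queue` stands in for the AP4R queue, and threads stand in for separate worker processes.

```ruby
# Hypothetical feed record; in a Rails app this would be a model.
Feed = Struct.new(:url, :fetched_at)

STALE_AFTER = 3600 # seconds; an assumed threshold

now = Time.now
feeds = [
  Feed.new('http://example.com/a.rss', now - 7200),  # stale
  Feed.new('http://example.com/b.rss', now - 60),    # fresh
  Feed.new('http://example.com/c.rss', nil)          # never fetched
]

queue = Queue.new   # in-memory stand-in for the AP4R queue

# "Feed Queue" role: enqueue only the URLs of stale feeds.
feeds.each do |feed|
  stale = feed.fetched_at.nil? || now - feed.fetched_at > STALE_AFTER
  queue << feed.url if stale
end

fetched = Queue.new # thread-safe collection of results

# "Feed Fetcher" role: N workers drain the queue independently.
workers = 3.times.map do |n|
  Thread.new do
    loop do
      url = begin
        queue.pop(true)      # non-blocking pop
      rescue ThreadError
        break                # queue is empty, worker exits
      end
      fetched << [n, url]    # a real worker would download and parse here
    end
  end
end
workers.each(&:join)

urls = []
urls << fetched.pop.last until fetched.empty?
puts urls.sort
```

Adding capacity then means starting more fetchers; the queueing side does not change.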
  43. With this solution
      • Can scale as demand grows
      • Flexible for any type of feed data
      • Still - room for improvement
  44. Possible Improvements
      • Automatic spawning and killing of workers as queue size grows or decreases
      • Better handling of feed errors
      • Dynamic polling intervals based on user-defined prefs or some intelligent logic
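One simple form the "intelligent logic" for polling intervals could take is sketched below. The halve-or-double rule, the 15-minute floor, and the one-day ceiling are assumptions chosen for illustration, not something the talk specifies.

```ruby
MIN_INTERVAL = 15 * 60        # assumed floor: 15 minutes
MAX_INTERVAL = 24 * 60 * 60   # assumed ceiling: 1 day

# Halve the interval when the last fetch found new items, double it otherwise,
# and keep the result between the floor and the ceiling.
def next_interval(current, new_items)
  proposed = new_items ? current / 2 : current * 2
  proposed.clamp(MIN_INTERVAL, MAX_INTERVAL)
end

interval = 60 * 60
interval = next_interval(interval, true)    # active feed: 30 minutes
interval = next_interval(interval, true)    # 15 minutes
interval = next_interval(interval, true)    # stays at the 15 minute floor
quiet = next_interval(interval, false)      # quiet feed: back off to 30 minutes
```

The interval would be stored per feed, so busy blogs are polled often while dormant ones drift towards the ceiling.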
  45. When to go distributed?
      • Long running process or task
        • Fetching external data
        • Complex computations
        • ... that can be broken into chunks of work
      • You care about the user experience
  46. Pitfalls
      • Dependencies
      • `connection closed' errors on Mac (IPv6): change all refs of localhost to 127.0.0.1 to avoid them (had to patch reliable-message)
      • Terminology to understand
      • Memory requirements
  47. Do you need distributed?
      • Maybe you would be better scheduling instead?
      • http://www.igvita.com/blog/2007/03/29/scheduling-tasks-in-ruby-rails/
  48. So where is all this leading us to?
  49. Contextual Feed Aggregation
  50. Group feed aggregation
  53. Group blog posts
      Twitters
      and so on...
  54. Thanks!
      http://www.dsc.net
      http://www.workingwithrails.com
      Blog: http://beyondthetype.com
      Enjoyed the talk? Recommend me on WWR:
      http://workingwithrails.com/person/5152-martin-sadler
