
Getting Distributed (With Ruby On Rails)

From martinbtt, 2 years ago

Implementing distributed processing at Working With Rails

8228 views | 0 comments | 24 favorites | 0 downloads | 0 embeds (Stats)

Categories

Technology

Groups/Events

ROR

Slideshow Transcript

  1. Slide 1: Getting Distributed With Ruby (On Rails), by Martin Sadler. Implementing distributed processing at Working With Rails
  2. Slide 2: dsc.net
  3. Slide 3: DSC • Hosting, web application development, and consultancy • Hosts the crew email system and carried out the intranet integration for Virgin Atlantic • Runs several large forums, e.g. pprune.org - 150k members, 2600 online at one time • Also AskDirect, Mumsnet
  4. Slide 4: http://www.workingwithrails.com
  5. Slide 5: Working With Rails • Largest index of Ruby on Rails in the world • Over 7000 people listed • From 104 countries • Find out who’s who? • Connect with others • Find a developer for a project / employment • Also lists groups, companies, and sites
  6. Slide 6: Why Distributed?
  7. Slide 7: Working With Rails
  8. Slide 9: Distributed Ruby?
  9. Slide 10: Some background
  10. Slide 11: Some background
  11. Slide 12: Current • Uses FeedTools (nice lib!) • Great at parsing feed formats • Good for small sites
  12. Slides 13-17: Issues • No longer supported by author • Feeds fetched at the user's expense (and therefore at the Mongrels' expense) • Feeds are cached locally but are parsed on request • Known problems when scaling (search on Google)
  17. Slides 18-22: The Result • Occasional slow-loading pages that include third-party feeds • Stale feed items • Inconsistent feed items • No good!
  22. Slide 23: The Challenge • To keep content fresh, push traffic to WWR and out to the blog owners • Different feeds and sources to consider: Flickr, Twitter, Blog, Delicious • Each needs to be displayed in multiple places in many ways • But we also want to do some funkier stuff (as you'll see a bit later)
  23. Slide 24: Ruby on Rails distributed processing choices • RingyDingy • AP4R • Rinda • DRb • Starfish • BackgroundRB • Reliable-Message
  24. Slide 25: DRb • Basic building block of all other Ruby distributed libs. • “DRb literally stands for "Distributed Ruby". It is a library that allows you to send and receive messages from remote Ruby objects via TCP/IP. Sound kind of like RPC, CORBA or Java's RMI? Probably so. This is Ruby's simple as dirt answer to all of the above.” http://chadfowler.com/ruby/drb.html
  25. Slides 26-28: Quick DRb Example
      Server (server.rb):
          require 'drb'
          class TestServer
            def doit
              "Hello, Distributed World"
            end
          end
          aServerObject = TestServer.new
          DRb.start_service('druby://localhost:9000', aServerObject)
          DRb.thread.join # Don't exit just yet!
      Client (client.rb):
          require 'drb'
          DRb.start_service()
          obj = DRbObject.new(nil, 'druby://localhost:9000')
          # Now use obj
          p obj.doit
      Running it:
          > ruby server.rb
          > ruby client.rb
          "Hello, Distributed World"
  28. Slide 29: Basics • Server • Clients / Workers • Communicate via messages http://en.wikipedia.org/wiki/Distributed_computing
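The three roles on this slide can be sketched with nothing but DRb and Ruby's built-in `Queue`. This is only an illustration: `MessageServer`, its `push`/`pop` methods, and the example messages are hypothetical names, and the server and worker share one process here so the file runs on its own (in practice they would be separate scripts, as in the quick example above).

```ruby
require 'drb'

# Hypothetical server object: holds messages until a worker asks for one.
class MessageServer
  def initialize
    @queue = Queue.new
  end

  # Producers call this to hand a message to the server.
  def push(message)
    @queue.push(message)
    nil
  end

  # Workers call this; returns nil when no message is waiting.
  def pop
    @queue.pop(true) # non-blocking pop raises ThreadError when empty
  rescue ThreadError
    nil
  end
end

# Server: port 0 asks the OS for any free port; DRb.uri reports the real one.
DRb.start_service('druby://127.0.0.1:0', MessageServer.new)
server_uri = DRb.uri

# Client / worker: only ever talks to the server through the DRb proxy.
server = DRbObject.new_with_uri(server_uri)
server.push('fetch http://example.com/feed.xml')
server.push('fetch http://example.org/atom.xml')

processed = []
while (message = server.pop)
  processed << message # stand-in for real work on the message
end

DRb.stop_service
```

Using 127.0.0.1 rather than localhost is deliberate; it sidesteps name-resolution surprises on machines where localhost resolves to an IPv6 address.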
  29. Slide 30: BackgroundRB • Ruby job server and scheduler. • Integrates with Rails • Quite complex • Some issues between versions but many favor it above the other libs • Most well known http://backgroundrb.rubyforge.org/
  30. Slide 31: Starfish • Inspired by Google’s MapReduce • Easy to understand code • Stability? • No longer supported by author? http://rufy.com/starfish/doc/
  31. Slide 32: reliable-message • Solid library • Easy to understand API • Bit more involved to set up • Can be integrated with Rails • Ongoing development http://trac.labnotes.org/cgi-bin/trac.cgi/wiki/Ruby/ReliableMessaging
  32. Slide 33: AP4R • Asynchronous Processing for Ruby • Lesser known lib from Japan (new kid on the block) • Integrates with Rails • Built on top of reliable-message
  33. Slide 34: AP4R • AP4R, Asynchronous Processing for Ruby, is the implementation of reliable asynchronous message processing. It provides message queuing, and message dispatching. • Using asynchronous processing, we can cut down turn-around-time of web applications by queuing, or can utilize more machine power by load-balancing.
  34. Slide 35: AP4R Features • Business logic can be implemented as simple Web applications, or ruby code, whether it's called asynchronously or synchronously. • Asynchronous messaging is reliable by RDBMS persistence (now MySQL only) or file persistence, under the favor of reliable-msg. • Load balancing over multiple AP4R processes on single/multiple servers is supported. • Asynchronous logics are called via various protocols, such as XML-RPC, SOAP, HTTP PUT, and more. • Using store and forward function, at-least-once QoS level is provided.
  35. Slide 36: AP4R Process Flow • A client (e.g. a web browser) makes a request to a web server (Apache, Lighttpd, etc.). • A Rails application (the synchronous logic) is executed on Mongrel via mod_proxy or similar. • At the end of the synchronous logic, message(s) are put to AP4R (AP4R provides a helper). • Once the synchronous logic is done, the client receives a response immediately. • AP4R queues the message, and requests it to the web server asynchronously. • The asynchronous logic, implemented as a usual Rails action, is executed.
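The shape of this flow can be simulated in plain Ruby. The sketch below does not use AP4R or its helper at all; the lambdas, the in-memory `Queue`, and the `:shutdown` marker are assumptions standing in for the Rails actions, the persistent AP4R queue, and the dispatcher.

```ruby
# Plain-Ruby simulation of the flow; none of these names are AP4R's API.
queue   = Queue.new # stands in for the persistent message queue
results = Queue.new # collects what the asynchronous logic produced

# Hypothetical asynchronous logic (in AP4R this would be a Rails action).
async_action = lambda do |params|
  "welcome mail sent to #{params[:email]}"
end

# Dispatcher: takes queued messages and invokes the asynchronous logic.
dispatcher = Thread.new do
  while (message = queue.pop) != :shutdown
    results << async_action.call(message)
  end
end

# Hypothetical synchronous logic: queue a message as the last step, then
# return so the client gets its response without waiting for the slow part.
sync_action = lambda do |params|
  queue << { email: params[:email] }
  "signup accepted for #{params[:email]}"
end

response = sync_action.call({ email: 'user@example.com' })

queue << :shutdown # let the dispatcher drain the queue and stop
dispatcher.join
async_result = results.pop
```

The point of the split is that `response` is available as soon as the message is queued; the slow work happens later, on the dispatcher's side.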
  36. Slide 37: AP4R example Hello World app comes with AP4R to get you started. Nice guide also here http://rubyforge.org/frs/download.php/13312/AP4R_Users_Guide_EN.pdf
  37. Slide 38: Complementary Services
  38. Slide 39: Rinda • Rinda::Ring allows DRb services and clients to automatically find each other without knowing where they live. • DRb servers register themselves with a RingServer which allows clients to find the servers they need. Many servers may register themselves with the RingServer. The DRb servers don't need to run on the same machine. http://segment7.net/projects/ruby/drb/rinda/ringserver.html
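The register-then-look-up idea can be sketched with `Rinda::TupleSpace`, the structure a RingServer wraps. To stay runnable without any network, this sketch keeps the tuple space in-process and skips the broadcast discovery step that Rinda::Ring adds on top; `FeedCounter` and its description are hypothetical, and the four-element tuple layout is the one Rinda's ring provider is understood to use.

```ruby
require 'rinda/tuplespace'

# Hypothetical service that clients want to find by name.
class FeedCounter
  def count
    42
  end
end

# In real use this tuple space would live inside a RingServer process
# and be reached over DRb; here it is local so the sketch is self-contained.
registry = Rinda::TupleSpace.new

# Server side: register the service under a name, with a description.
registry.write([:name, :FeedCounter, FeedCounter.new, 'counts feeds'])

# Client side: look the service up with a template; nil matches anything.
_, _, service, description = registry.read([:name, :FeedCounter, nil, nil])
answer = service.count
```

The client never needs to know where the service lives, only the name it was registered under.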
  39. Slide 40: RingyDingy • RingyDingy automatically registers a service with a RingServer. If communication between the RingServer and the RingyDingy is lost, RingyDingy will re-register its service with the RingServer when it reappears. http://seattlerb.rubyforge.org/RingyDingy/
  40. Slide 41: Feeds in WWR [diagram: Feed Queue, AP4R Server, @queue, Feed Fetchers 1, 2, ... N]
  41. Slide 42: Running the code
      ruby script/ap4r_start -c config/queues_mysql.cfg
      rake background:feed_queue
      rake background:feed_retrieve
  42. Slide 43: Key points • The Feed Queue fetches the urls of stale feeds • Each worker (client) has the Rails environment loaded
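A rough sketch of the queue-plus-workers arrangement, in plain Ruby threads rather than AP4R and separate Rails processes. Everything here is an assumption made for illustration: the `Feed` struct, the one-hour `STALE_AFTER` window, and the block standing in for the real feed fetch.

```ruby
# Hypothetical feed record: a URL plus the time it was last fetched.
Feed = Struct.new(:url, :fetched_at)

STALE_AFTER = 3600 # seconds; an assumed freshness window

# The "feed queue" step: pick out the URLs of feeds that have gone stale.
def stale_urls(feeds, now)
  feeds.select { |feed| now - feed.fetched_at > STALE_AFTER }.map(&:url)
end

# The "feed retrieve" step: worker_count workers drain a shared queue.
# The block is the fetch itself, so the sketch needs no network access.
def run_fetchers(urls, worker_count, &fetch)
  queue = Queue.new
  urls.each { |url| queue << url }
  queue.close # once drained, pop returns nil and the workers stop

  results = Queue.new
  workers = Array.new(worker_count) do
    Thread.new do
      while (url = queue.pop)
        results << [url, fetch.call(url)]
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }.to_h
end

now = Time.at(10_000)
feeds = [
  Feed.new('http://a.example/rss',  Time.at(9_900)), # fresh
  Feed.new('http://b.example/rss',  Time.at(1_000)), # stale
  Feed.new('http://c.example/atom', Time.at(0))      # stale
]

fetched = run_fetchers(stale_urls(feeds, now), 2) { |url| "body of #{url}" }
```

Adding capacity is then a matter of raising the worker count, which is the scaling property the next slide points to.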
  43. Slide 44: With this solution • Can scale as demand grows • Flexible for any type of feed data • Still - room for improvement
  44. Slide 45: Possible Improvements • Automatic spawning and killing of workers as queue size grows or decreases • Better handling of feed errors • Dynamic polling intervals based on user defined prefs or some intelligent logic.
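Two of these improvements come down to small policy functions. The sketch below is one possible shape, not what WWR does: the function names, the 50-jobs-per-worker ratio, the worker limits, and the halve-or-double polling rule are all assumptions chosen for illustration.

```ruby
# How many workers should be running for a given queue size?
# Assumes one worker per jobs_per_worker queued jobs, within [min, max].
def desired_workers(queue_size, jobs_per_worker: 50, min: 1, max: 10)
  needed = (queue_size.to_f / jobs_per_worker).ceil
  needed.clamp(min, max)
end

# How long to wait before polling a feed again (in seconds)?
# Assumed rule: poll twice as often after a change, half as often otherwise.
def next_poll_interval(current, changed, min: 300, max: 86_400)
  proposed = changed ? current / 2 : current * 2
  proposed.clamp(min, max)
end
```

A supervisor loop could call `desired_workers` with the current queue size and spawn or kill workers to match, while each feed stores its own interval and updates it with `next_poll_interval` after every fetch.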
  45. Slide 46: When to go distributed? • Long running process or task • Fetching external data • Complex computations • .... that can be broken into chunks of work • You care about the user experience
  46. Slide 47: Pitfalls • Dependencies • 'connection closed' errors on Mac (IPv6) - change all references to localhost to 127.0.0.1 to avoid them (had to patch reliable-message) • Terminology to understand • Memory requirements
  47. Slide 48: Do you need distributed? • Maybe you would be better scheduling instead? • http://www.igvita.com/blog/2007/03/29/scheduling-tasks-in-ruby-rails/
  48. Slide 49: So where is all this leading us to?
  49. Slide 50: Contextual Feed Aggregation
  50. Slide 52: Group feed aggregation
  51. Slides 54-56: Group blog posts, Twitters, and so on...
  54. Slide 57: Thanks! http://www.dsc.net http://www.workingwithrails.com Blog: http://beyondthetype.com Enjoyed the talk? Recommend me on WWR http://workingwithrails.com/person/5152-martin-sadler