The Current State of Asynchronous Processing With Ruby



A small overview of current technologies

Published in: Technology, Business


  1. 1. The Current State of Asynchronous Processing in Ruby Mathias Meyer, Peritor GmbH
  2. 2. self • Software-Developer for Peritor GmbH in Berlin • Code-Reviews, Scaling, Performance-Reviews and -Tuning, Refactorings • Maintainer: acts_as_solr, run_later, heaps of stuff for Capistrano/Webistrano
  3. 3. What? • Asynchronous Processing • Just a fancy word for...
  4. 4. “Dude, this request is getting too long!”
  5. 5. “Okay, let’s move stuff into the background.”
  6. 6. In other words: What? • A process puts a job into some queue • Another process picks up the job and processes it
  7. 7. Why? • Requests are taking too long • Image uploads to S3 • Full-text search updates • Generating PDFs, reports, etc. • Sending emails (newsletters, etc.)
  8. 8. Why? • Your app requires longer running tasks • Compute daily statistics data • Create reports • Data crunching
  9. 9. Why? • You need to run tasks at a certain time • Scheduled tasks like invoicing, billing • Sending out reminder emails
  10. 10. Why? • Longer tasks • Are harder to debug and monitor • Block user and the application
  11. 11. Example [diagram: Client, Rails, Upload to S3 (!), Response]
  12. 12. Example - Much Better [diagram: Client, Rails, Response; Worker, Upload to S3]
  13. 13. The simplest Thing that could possibly work
      Thread.new do
        AccountMailer.deliver_signup(@user)
      end
  14. 14. Done.
  15. 15. Not so fast... • Problems • Not easy on resources • Unreliable • Not reproducible
  16. 16. A little better...
      run_later do
        AccountMailer.deliver_signup(@user)
      end
  17. 17. run_later • Borrowed from Merb, available as a Rails plugin • Uses a worker thread and a queue • Simple solution for simple tasks
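The "worker thread and a queue" idea can be sketched with Ruby's built-in `Queue`. This is only an illustration of the pattern, not the plugin's code: `BackgroundRunner`, its `run_later` method and `shutdown` are hypothetical names, and the jobs here are plain blocks rather than mailer calls.

```ruby
# Minimal in-process background runner: one worker thread draining a queue.
# BackgroundRunner is a hypothetical name, not the run_later plugin's API.
class BackgroundRunner
  def initialize
    @queue = Queue.new
    @worker = Thread.new do
      # Queue#pop blocks until a job arrives; the :stop marker ends the loop.
      while (job = @queue.pop) != :stop
        begin
          job.call
        rescue StandardError => e
          # A failing job is reported and skipped; the worker keeps going.
          warn "background job failed: #{e.message}"
        end
      end
    end
  end

  # Enqueue a block and return immediately.
  def run_later(&block)
    @queue << block
    self
  end

  # Let the worker finish everything queued so far, then stop it.
  def shutdown
    @queue << :stop
    @worker.join
  end
end

runner = BackgroundRunner.new
results = Queue.new
runner.run_later { results << "signup mail for user 1" }
runner.run_later { raise "SMTP down" }
runner.run_later { results << "signup mail for user 2" }
runner.shutdown
puts results.size
```

Because the queue lives in the web process's memory, queued jobs are lost when that process dies, which is one reason this approach suits only simple tasks.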
  18. 18. What you really want • Reliable messaging • Durability • Scheduling • Scalable processing • Not necessarily all at the same time
  19. 19. Options, options • Messaging Queues • Polling • Schedulers • Oh my!
  20. 20. Protocols, oh my! • AMQP • STOMP • JMS • XMPP • RestMS
  21. 21. Message Queues
  22. 22. Message Queues • Publish/subscribe mechanism • Usually require more than one new component in your infrastructure • Broker middleware • Subscribers, message listeners
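To make the publish/subscribe idea concrete, here is a toy, in-process sketch. A real broker runs as separate middleware and delivers messages across processes; `ToyBroker` and the topic name are hypothetical and exist only to show how publishers and listeners are decoupled.

```ruby
# ToyBroker is a hypothetical in-process stand-in for broker middleware.
class ToyBroker
  def initialize
    # Each topic maps to the list of listeners subscribed to it.
    @subscribers = Hash.new { |hash, topic| hash[topic] = [] }
  end

  # Register a listener block for a topic.
  def subscribe(topic, &listener)
    @subscribers[topic] << listener
  end

  # Deliver the message to every listener of the topic.
  # Returns the number of listeners that received it.
  def publish(topic, message)
    @subscribers[topic].each { |listener| listener.call(message) }.size
  end
end

broker = ToyBroker.new
received = []
broker.subscribe("signups") { |msg| received << "mailer got #{msg}" }
broker.subscribe("signups") { |msg| received << "search index got #{msg}" }
broker.publish("signups", "user 42")
puts received
```

The publisher never names its listeners, so a new consumer can be added without touching the code that publishes.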
  23. 23. Message Queues • ActiveMQ • RabbitMQ • ActiveMessaging • Amazon SQS • Usually more code involved
  24. 24. Pollers • Polling a database for new jobs • Simple and domain-specific • Only one new component • Usually requires more effort to make them less prone to errors
  25. 25. Pollers • Roll Your Own w/ or w/o daemons gem • delayed_job - adds some infrastructure • background_job
  26. 26. Pollers - RYO
      create_table :jobs do |t|
        t.string :klazz, :method
        t.string :obj_id
      end
  27. 27. Pollers - RYO
      class Job < ActiveRecord::Base
        def self.schedule(obj, method)
          create(:klazz => obj.class.name,
                 :obj_id => obj.id,
                 :method => method.to_s)
        end

        def run
          obj = klazz.constantize.find(obj_id)
          obj.send(self[:method])
        end
      end

      Job.schedule(@user, :notify_signup)
  28. 28. Pollers - RYO
      class JobPoller
        def run
          loop do
            job = Job.find(:first, :lock => true)
            next unless job
            job.run
            job.delete
          end
        end
      end

      task :poller do
        JobPoller.new.run
      end
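A poller that finds no job and immediately loops again keeps querying the database at full speed. One possible refinement, sketched here without ActiveRecord, is to sleep briefly when idle and to make the loop stoppable. `MemoryJobStore` and `PoliteJobPoller` are hypothetical names, the in-memory store merely stands in for the jobs table, and unlike the slide's poller it removes a job before running it.

```ruby
# In-memory stand-in for the jobs table (hypothetical, for illustration only).
class MemoryJobStore
  def initialize
    @jobs = []
    @mutex = Mutex.new
  end

  def push(job)
    @mutex.synchronize { @jobs << job }
  end

  # Remove and return the oldest job, or nil when the store is empty.
  def shift
    @mutex.synchronize { @jobs.shift }
  end
end

# Poller that backs off while idle and can be asked to stop.
class PoliteJobPoller
  def initialize(store, idle_sleep: 0.01)
    @store = store
    @idle_sleep = idle_sleep
    @running = true
  end

  def stop
    @running = false
  end

  def run
    while @running
      job = @store.shift
      if job
        job.call
      else
        sleep @idle_sleep # back off instead of hammering the store
      end
    end
  end
end

store = MemoryJobStore.new
done = []
poller = PoliteJobPoller.new(store)
store.push(-> { done << :notify_signup })
store.push(-> { done << :send_reminder })
store.push(-> { poller.stop }) # last job ends the loop for this demo
poller.run
p done
```

The idle sleep trades a little latency for far fewer empty queries; the right interval depends on how quickly jobs need to be picked up.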
  29. 29. Pollers - delayed_job
      User.find(params[:id]).send_later(:notify_signup)
      rake jobs:work
  30. 30. Pollers - delayed_job
      class SignupNotifier < Struct.new(:name)
        def perform
          user = User.find_by_name(name)
          user.notify_signup
        end
      end

      Delayed::Job.enqueue SignupNotifier.new("david")
  31. 31. Message Queues, Pollers • Usually don’t directly support scheduling
  32. 32. Schedulers
  33. 33. Schedulers • cron, cron, cron (oh, and rake) • rufus-scheduler • BackgrounDRb • Quartz with JRuby (if you’re into that sort of thing) • Roll Your Own
  34. 34. rufus-scheduler
      scheduler = Rufus::Scheduler.start_new
      scheduler.cron '0 22 * * 1-5' do
        Job.new.run
      end
      scheduler.in '20m' do
        puts "Get a flat white"
      end
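For readers less familiar with cron syntax: the five fields of `'0 22 * * 1-5'` are minute, hour, day of month, month and day of week, so the job runs at 22:00 on Monday through Friday. The plain-Ruby check below spells that out. `weekday_evening_run?` is a hypothetical helper written for this illustration, not part of rufus-scheduler.

```ruby
# True when +time+ matches the cron line '0 22 * * 1-5':
# minute 0, hour 22, any day of month, any month, Monday (1) to Friday (5).
def weekday_evening_run?(time)
  time.min == 0 && time.hour == 22 && (1..5).cover?(time.wday)
end

puts weekday_evening_run?(Time.new(2024, 1, 5, 22, 0)) # a Friday
puts weekday_evening_run?(Time.new(2024, 1, 6, 22, 0)) # a Saturday
```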
  35. 35. Playing with the Big Guys • Nanite - Jack of all trades • Uses RabbitMQ, Erlang-based messaging server, AMQP implementation • Distributes work across a network of workers
  36. 36. Nanite • A self-assembling fabric of Ruby daemons • Uses mappers and agents • Mappers route message requests • Agents handle messages
  37. 37. Nanite [diagram: Agents and Mappers connected through RabbitMQ]
  38. 38. Nanite
      class Reactor
        include Nanite::Actor
        expose :react

        def react(payload)
          "reacting to message with payload: #{payload}"
        end
      end
  39. 39. Nanite
      Nanite.request('/reactor/react', 'good acting is reacting') do |r|
        p r
      end
  40. 40. Nanite • Agents register themselves, constantly pinging the mappers • Mappers remove timed-out agents • Work distributed based on agent load • Unfortunately: Poor documentation
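The registration, time-out and load-based routing described above can be illustrated with a small bookkeeping class. This is a sketch of the idea only and does not reflect Nanite's actual implementation or API: `AgentRegistry`, its method names, the load figures and the 15-second timeout are all assumptions made for the example.

```ruby
# AgentRegistry is a hypothetical stand-in for a mapper's view of its agents.
class AgentRegistry
  def initialize(timeout:)
    @timeout = timeout
    @agents = {} # name => { load:, last_seen: }
  end

  # Called whenever an agent pings; registers the agent on first contact.
  def ping(name, load:, now: Time.now)
    @agents[name] = { load: load, last_seen: now }
  end

  # Drop agents that have not pinged within the timeout.
  def prune(now: Time.now)
    @agents.delete_if { |_, info| now - info[:last_seen] > @timeout }
  end

  # Name of the live agent reporting the lowest load, or nil if none is left.
  def least_loaded(now: Time.now)
    prune(now: now)
    name, _info = @agents.min_by { |_, info| info[:load] }
    name
  end
end

t0 = Time.at(0)
registry = AgentRegistry.new(timeout: 15)
registry.ping("agent-a", load: 0.2, now: t0)
registry.ping("agent-b", load: 0.9, now: t0 + 10)
p registry.least_loaded(now: t0 + 12) # both alive, agent-a has less load
p registry.least_loaded(now: t0 + 20) # agent-a has timed out
```

Passing `now` explicitly keeps the example deterministic; a real mapper would use the wall clock.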
  41. 41. Other Stuff • BackgrounDRb • AP4R • job_fu • spawn • Starling/Workling • Daemons • beanstalkd
  42. 42. Careful now! Pitfalls
  43. 43. Race Conditions
  44. 44. Race Conditions • Workers try accessing the same data • Workers update data that is updated heavily from other layers
  45. 45. Race Conditions • Make jobs repeatable • Reduce proneness to errors coming from the database • Let your broker take care of that
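"Repeatable" here means a job can safely run twice, for example after a worker crash and retry, without doing its work twice. A minimal sketch of that idea follows; `SignupNotification`, the sent log and the outbox are hypothetical stand-ins for a model, a "notified" column and a mail delivery.

```ruby
require "set"

# Hypothetical job that records which users were already notified, so that
# running it again for the same user does not send a second mail.
class SignupNotification
  def initialize(sent_log, outbox)
    @sent_log = sent_log # stands in for a "notified" column in the database
    @outbox = outbox     # stands in for actual mail delivery
  end

  def perform(user_id)
    return :skipped if @sent_log.include?(user_id)

    @outbox << "welcome mail for user #{user_id}"
    @sent_log << user_id
    :sent
  end
end

sent_log = Set.new
outbox = []
job = SignupNotification.new(sent_log, outbox)
p job.perform(7) # first run sends the mail
p job.perform(7) # retry is a no-op
p outbox.size
```

With several workers the check and the update would have to happen atomically in the database; this single-threaded sketch only shows the repeatability part.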
  46. 46. Queue Congestion
  47. 47. Queue Congestion http://www.flickr.com/photos/lasgalletas/263909727/
  48. 48. Queue Congestion • Workers can’t keep up with queue • More work coming in than being processed
  49. 49. Queue Congestion • Make it easy to add more workers • Ensure they don’t steal each other’s work or do it twice (row-level locking, lock on fields)
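The locking point can be illustrated in plain Ruby: each job must be claimed atomically so that two workers never pick up the same one. In the sketch below a `Mutex` plays the role that row-level locking or a lock column would play in the database; `ClaimableJobs` and the worker names are hypothetical.

```ruby
# Hypothetical job list where claiming is atomic, so no job is handed out twice.
class ClaimableJobs
  def initialize(ids)
    @claimed_by = ids.map { |id| [id, nil] }.to_h # job id => owning worker
    @lock = Mutex.new # stands in for a row-level lock
  end

  # Atomically hand one unclaimed job id to +worker+, or nil if none is left.
  def claim(worker)
    @lock.synchronize do
      id, _owner = @claimed_by.find { |_, owner| owner.nil? }
      @claimed_by[id] = worker if id
      id
    end
  end
end

jobs = ClaimableJobs.new((1..20).to_a)
processed = Queue.new
workers = 4.times.map do |n|
  Thread.new do
    # Each worker keeps claiming until nothing is left.
    while (id = jobs.claim("worker-#{n}"))
      processed << id
    end
  end
end
workers.each(&:join)
puts processed.size
```

Adding workers then only means starting more threads or processes against the same claim step.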
  50. 50. Stalled Workers
  51. 51. http://www.flickr.com/photos/freeurmind/2941677586/
  52. 52. Stalled Workers • Workers stopped processing jobs • Got stuck on a particular task • Choked on an error
  53. 53. Stalled Workers • Monitor your workers and queues • Build in error reporting, report exceptions like in the rest of your code • Make jobs repeatable • Use Nanite (self-healing)
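One way to keep a worker from stalling silently is to wrap every job in a time limit and an error reporter. The sketch below uses Ruby's standard `Timeout` module; `run_guarded` and the `reporter` callable are hypothetical names, and in a real application the reporter would presumably forward to whatever exception notification the rest of the code already uses.

```ruby
require "timeout"

# Run a job with a time limit and report failures instead of dying silently.
# run_guarded and the reporter callable are hypothetical names.
def run_guarded(job, limit:, reporter:)
  Timeout.timeout(limit) { job.call }
  :ok
rescue Timeout::Error
  # Must come before StandardError, since Timeout::Error is a subclass of it.
  reporter.call("job exceeded #{limit}s")
  :timed_out
rescue StandardError => e
  reporter.call("job failed: #{e.class}: #{e.message}")
  :failed
end

reports = []
reporter = ->(msg) { reports << msg }
p run_guarded(-> { 1 + 1 }, limit: 1, reporter: reporter)
p run_guarded(-> { raise ArgumentError, "bad user id" }, limit: 1, reporter: reporter)
p run_guarded(-> { sleep 1 }, limit: 0.05, reporter: reporter)
puts reports
```

`Timeout` interrupts the running block from another thread, so it fits jobs that are safe to abort midway; jobs that are not should be made repeatable first.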
  54. 54. Recommendations • Simple jobs • Roll Your Own • delayed_job
  55. 55. Recommendations • Distributed • Amazon SQS w/ ActiveMessaging or custom worker • Works best on EC2
  56. 56. Recommendations • Heavy load, scalable • Nanite/RabbitMQ
  57. 57. The End • Questions? • @roidrage • http://www.paperplanes.de • http://github.com/mattmatt
