Migrating from Backgroundrb to Resque


We upgraded from Backgroundrb to Resque. The pagers have stopped buzzing, and we are very pleased with the migration.

The last 5% of the Resque setup was a little tricky to get right. This presentation shares some of the implementation details (code and config files) to help others make their Resque setup rock solid.

Published in: Technology

  1. to: Resque, from: Backgroundrb (@kbrock, 2010-10-12)
  2. Summary
     - Background queues let us defer logic outside the browser request and response.
     - Backgroundrb was crashing often for us. We moved to Resque and it hasn't crashed since.
     - Backgroundrb is easier to run out of the box.
     - Adding just a little code makes Resque just as easy, without sacrificing any of the added flexibility.
  3. Why did we upgrade?
     - bdrb paged the boss 4 times my first weekend
     - memory leaks caused crashes
     - monit can't restart workers in Backgroundrb
     - move to an active project (à la heroku, github, redis)
  4. What does each bring to the table?

                                      bdrb    resque
     adhoc (out of request)           ✓       ✓
     delay (run/remind)               ✓       resque-scheduler
     schedule (cron)                  ✓       resque-scheduler
     mail (invisible/out of req)      code    resque_mailer
     status reporting                 code    resque-meta, web

     - Backgroundrb does most of what we need out of the box.
     - Resque has plugins to make up the difference.
  5. Bdrb Components
     [Diagram labels: scheduler, workers, rails, main queue, work, enqueue, queue manager, mailer, bdrb yml. Legend: we started, monitored, data]
     - simple, with 1 queue (add started_at for delayed jobs)
     - the scheduler is a special worker, managed by 1 process (it is a runner/worker)
  6. Resque Components
     [Diagram labels, with callouts 1-6: delayed queue, scheduler, schedule, rails, enqueue, workers, rake, resque web, main queues, mailer. Legend: we started, monitored, data]
     - many moving parts
     - simplified in that all workers are the same
     - the scheduler simply adds entries to the queue (instead of a MetaWorker running jobs)
     - the web UI is a nice touch
  7. 1. Ad-hoc Enqueuing (columns: bdrb / resque; marks as they appear on the slide)
     - args: hash / ruby, checked
     - enqueue AR objects: ✓
     - mail (invisible): ✓ / ✓
     Notes:
     - AR objects crept into the action_mailer deliver calls
     - Looks like bdrb wins here, but not enqueuing AR objects is best practice
  8. Ad-hoc/Delayed (bdrb)

         class JobWorker < BackgrounDRb::MetaWorker
           set_worker_name :job_worker

           def purge_job_logs
             JobLog.purge_expired!
             persistent_job.finish!
           end

           def self.perform_later(*args)
             MiddleMan.worker(:job_worker).enq_purge_job_logs(
               :job_key => new_job_key, :arg => args)
           end

           # First argument is the time to run; the rest are passed to the job.
           def self.perform_at(*args)
             time = args.shift
             MiddleMan.worker(:job_worker).enq_purge_job_logs(
               :job_key => new_job_key, :arg => args, :scheduled_at => time)
           end

           def self.new_job_key
             "purge_job_logs_#{ActiveSupport::SecureRandom.hex(8)}"
           end
         end

     - don't need to do a command pattern (our code didn't)
     - scheduled_at = beauty of SQL
     - parent class enqueue knows the queue name (code not loaded)
  9. Ad-hoc/Delayed (resque)

         class PurgeJobLogs
           @queue = :job_worker

           # Resque workers call self.perform on the job class
           # (the slide named this method process).
           def self.perform(*args)
             JobLog.purge_expired!
           end

           def self.perform_later(*args)
             Resque.enqueue(self, *args)
           end

           # First argument is the time to run; the rest are passed to the job.
           def self.perform_at(*args)
             time = args.shift
             Resque.enqueue_at(time, self, *args)
           end
         end

     - Enqueue needs the worker class to know the name of the queue (even if called directly into Resque)
     - interface only (perform_{at,later}) -> abstracted out to a parent?
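The closing question on slide 9 (could perform_later and perform_at be abstracted out to a parent?) can be sketched as a small mixin. This is a minimal sketch, not code from the slides: the `Deferrable` module, its injectable `backend`, and the `RecordingBackend` stand-in are hypothetical names introduced here. The only Resque calls assumed are `Resque.enqueue` and `Resque.enqueue_at` (the latter provided by resque-scheduler), both of which appear on the slide.

```ruby
# Hypothetical mixin (not from the slides): shares perform_later/perform_at
# across job classes, so each job only defines @queue and self.perform.
module Deferrable
  # The backend must respond to enqueue and enqueue_at, as Resque does once
  # resque-scheduler is loaded. It is injectable so tests can avoid Redis.
  attr_writer :backend

  def backend
    @backend ||= Resque
  end

  def perform_later(*args)
    backend.enqueue(self, *args)
  end

  def perform_at(time, *args)
    backend.enqueue_at(time, self, *args)
  end
end

# Stand-in backend that records calls instead of talking to Redis.
class RecordingBackend
  attr_reader :calls

  def initialize
    @calls = []
  end

  def enqueue(klass, *args)
    @calls << [:enqueue, klass, args]
  end

  def enqueue_at(time, klass, *args)
    @calls << [:enqueue_at, time, klass, args]
  end
end

class PurgeJobLogs
  extend Deferrable
  @queue = :job_worker

  def self.perform(*_args)
    # JobLog.purge_expired! would go here in the real application.
  end
end

PurgeJobLogs.backend = RecordingBackend.new
PurgeJobLogs.perform_later("nightly")
PurgeJobLogs.perform_at(Time.at(0), "nightly")
```

With the mixin in place, a new job class would only need `extend Deferrable`, a `@queue`, and a `self.perform`. Whether a module or a parent class fits better depends on how the existing jobs are organized.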
  10. 2. Scheduled Enqueuing (columns: bdrb / resque; marks as they appear on the slide)
     - sched any method: ✓ x2 / command
     - scheduler: ✓ / ✓+
     - adhoc jobs: ✓
     Notes:
     - bdrb: the schedule needs to be defined in 2 places, yml and ruby. We ran into a case where this caused a problem.
     - resque: the web UI makes it easy to kick off commands ad hoc (very useful in staging)
  11. Scheduled (bdrb)

         :backgroundrb:
           :ip:
           :port: 11006
           :environment: development
         :schedules:
           :scheduled_worker:
             :purge_job_logs:
               :trigger_args: 0 */5 * * * *

     - Evidence of framework: scheduled_worker is defined here, and needs a meta worker (so it can be run)
  12. Scheduled (bdrb)

         class ScheduledWorker < BackgrounDRb::MetaWorker
           extend BdrbUtils::CronExtensions
           set_worker_name :scheduled_worker

           threaded_cron_job(:purge_job_logs) { JobLog.purge_expired! }
         end

     - scheduler = MetaWorker
     - defined 2 times
     - it calls your code, so it can call "any static method"
  13. Scheduled (resque)

         ---
         clear_logs:
           cron: "*/10 * * * *"
           class: PurgeJobLogs
           queue: job_worker
           description: Remove old logs

     - queue name is given here (so the scheduler does not need to load the worker into memory to enqueue)
     - cron is the standard format (remove 'seconds')
     - commands
     - the scheduler runs in a separate process (it can run when workers are stopped or changed), with a minimal env
     - the scheduler injects into the queue (vs. running jobs), so jobs can also be injected ad hoc via the web
     - no ruby code for this
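Because the Resque schedule on slide 13 is plain YAML with no Ruby behind it, it can be sanity-checked before deploying. The sketch below is an assumption-laden illustration rather than part of the deck: `schedule_problems` and `REQUIRED_KEYS` are hypothetical names, requiring a `queue` key follows the slide's note about not loading the worker class, and the 5-field cron check follows the slide's note that the seconds field is dropped.

```ruby
require 'yaml'

# The schedule from slide 13, inlined so the sketch is self-contained.
SCHEDULE_YAML = <<~YAML
  clear_logs:
    cron: "*/10 * * * *"
    class: PurgeJobLogs
    queue: job_worker
    description: Remove old logs
YAML

# Assumption: we want every entry to name its cron, class, and queue.
REQUIRED_KEYS = %w[cron class queue].freeze

# Hypothetical helper: returns a list of human-readable problems,
# or an empty list when the schedule looks well formed.
def schedule_problems(schedule)
  schedule.flat_map do |name, entry|
    problems = (REQUIRED_KEYS - entry.keys).map { |key| "#{name}: missing #{key}" }
    fields = entry.fetch('cron', '').split
    if entry.key?('cron') && fields.size != 5
      problems << "#{name}: cron needs 5 fields, got #{fields.size}"
    end
    problems
  end
end

schedule = YAML.load(SCHEDULE_YAML)
puts schedule_problems(schedule).inspect
```

A bdrb-style trigger such as `0 */5 * * * *` (six fields, with seconds) would be flagged by this check, which is the kind of slip that is easy to make when porting schedules.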
  14. 3. Processes/Worker management (columns: bdrb / resque; marks as they appear on the slide)
     - knows queues: ✓ / us, command, web
     - pids: ✓ / us+
     - mem leak resistant: ✓
     - workers/queue: 1 / <1 - ∞
     - pause workers: ✓
     Notes:
     - resque: discover previous queues (not all) via 'resque list' or the web UI
     - bdrb: creates 1 worker per queue (creates a pid file, plus 1 pid for backgroundrb); monit can't restart it
     - resque: we manage the processes; 1+ workers per queue, 1+ queues per worker; can pause/restart workers
  15. worker list (resque)

         primary:
           queues: background,mail
         secondary:
           queues: mail,background

     - can have multiple workers running the same queues
     - can have multiple queues in 1 worker
     - the worker pool can be generalized, response focused, schedule focused, or changed at runtime
     - inverted priority list prevents starvation
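The "inverted priority list" on slide 15 relies on Resque workers checking their queues in the order listed and taking work from the first non-empty one. The simulation below is a sketch of that ordering only; it does not talk to Redis, and `next_job` and the sample job names are hypothetical.

```ruby
require 'yaml'

# The worker list from slide 15, inlined so the sketch is self-contained.
WORKERS_YAML = <<~YAML
  primary:
    queues: background,mail
  secondary:
    queues: mail,background
YAML

# Hypothetical helper that mimics order-based polling: walk the worker's
# queue list in order and take a job from the first queue that has one.
def next_job(queue_list, pending)
  queue_list.split(',').each do |queue|
    job = pending.fetch(queue, []).shift
    return [queue, job] if job
  end
  nil
end

workers = YAML.load(WORKERS_YAML)
pending = { 'background' => ['report'], 'mail' => ['welcome_email'] }

primary_pick   = next_job(workers['primary']['queues'], pending)
secondary_pick = next_job(workers['secondary']['queues'], pending)
idle_pick      = next_job(workers['primary']['queues'], pending)

puts primary_pick.inspect
puts secondary_pick.inspect
puts idle_pick.inspect
```

Each worker favors a different queue, yet either one will drain the other's queue when its own is empty, so neither queue starves if one worker is busy or down.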
  16. 4. Running Workers

         namespace :resque do
           desc 'start all background resque daemons'
           task :start_daemons do
             mrake_start "resque_scheduler resque:scheduler"
             workers_config.each do |worker, config|
               mrake_start "resque_#{worker} resque:work QUEUE=#{config['queues']}"
             end
           end

           desc 'stop all background resque daemons'
           task :stop_daemons do
             sh "./script/monit_rake stop resque_scheduler"
             workers_config.each do |worker, config|
               sh "./script/monit_rake stop resque_#{worker} -s QUIT"
             end
           end

           def self.workers_config
             YAML.load(File.open(ENV['WORKER_YML'] || 'config/resque_workers.yml'))
           end

           def self.mrake_start(task)
             sh "nohup ./script/monit_rake start #{task} RAILS_ENV=#{ENV['RAILS_ENV']} >> log/daemons.log &"
           end
         end
  17. Deploying (cap)

         namespace :resque do
           desc "Stop the resque daemon"
           task :stop, :roles => :resque do
             run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:stop_daemons; true"
           end

           desc "Start the resque daemon"
           task :start, :roles => :resque do
             run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:start_daemons"
           end
         end
  18. 5. Monitoring Workers (monit.erb)

         check process resque_scheduler
           with pidfile <%= @rails_root %>/tmp/pids/resque_scheduler.pid
           group resque
           alert errors@domain.com
           start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_scheduler resque:scheduler'"
           stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_scheduler'"

         <% YAML.load(File.open(Rails.root+'/config/production/resque/resque_workers.yml')).each_pair do |worker, config| %>
         check process resque_<%=worker%>
           with pidfile <%= @rails_root %>/tmp/pids/resque_<%=worker%>.pid
           group resque
           alert errors@domain.com
           start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_<%=worker%> resque:work QUEUE=<%=config['queues']%>'"
           stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_<%=worker%>'"
         <% end %>

     - use a template to generate the monit file
  19. Monitoring Rake Processes

         #!/bin/sh
         # wrapper to daemonize rake tasks: see also http://mmonit.com/wiki/Monit/FAQ#pidfile

         usage() {
           echo "usage: ${0} [start|stop] name target [arguments]"
           echo "\tname is used to create or read the log and pid file names"
           echo "\tfor start: target and arguments are passed to rake"
           echo "\tfor stop: target and arguments are passed to kill (e.g.: -n 3)"
           exit 1
         }

         [ $# -lt 2 ] && usage

         cmd=$1
         name=$2
         shift ; shift

         pid_file=./tmp/pids/${name}.pid
         log_file=./log/${name}.log
         # ...
  20. Monitoring Processes

         case $cmd in
           start)
             if [ ${#} -eq 0 ] ; then
               echo -e "\nERROR: missing target\n"
               usage
             fi
             # refuse to start if the pid file points at a live process
             pid=`cat ${pid_file} 2> /dev/null`
             if [ -n "${pid}" ] ; then
               ps ${pid}
               if [ $? -eq 0 ] ; then
                 echo "ensure process ${name} (pid: ${pid_file}) is not running"
                 exit 1
               fi
             fi
             # record our own pid, then replace this shell with rake
             echo $$ > ${pid_file}
             exec 2>&1 rake $* 1>> ${log_file}
             ;;
           stop)
             pid=`cat ${pid_file} 2> /dev/null`
             [ -n "${pid}" ] && kill $* ${pid}
             rm -f ${pid_file}
             ;;
           *)
             usage
             ;;
         esac
  21. Monitoring Web
  22. 6. Running Web

         namespace :resque do
           task :setup => :environment

           desc 'kick off resque-web'
           task :web => :environment do
             $stdout.sync = true
             $stderr.sync = true
             puts `env RAILS_ENV=#{RAILS_ENV} resque-web #{RAILS_ROOT}/config/initializers/resque.rb`
           end
         end
  23. initializer

         # this runs in sinatra and rails - so don't use Rails.env
         rails_env  = ENV['RAILS_ENV'] || 'development'
         rails_root = ENV['RAILS_ROOT'] || File.join(File.dirname(__FILE__), '../..')

         redis_config = YAML.load_file(rails_root + '/config/redis.yml')
         Resque.redis = redis_config[rails_env]

         require 'resque_scheduler'
         require 'resque/plugins/meta'
         require 'resque_mailer'

         Resque.schedule = YAML.load_file(rails_root + '/config/resque_schedule.yml')
         Resque::Mailer.excluded_environments = [:test, :cucumber]
  24. 5. Monitoring Work (columns: bdrb / resque; marks as they appear on the slide)
     - ad-hoc queries: SQL / redis query
     - did it run?: custom / resque-meta
     - did it fail?: hoptoad / ✓
     - rerun: ✓
     - have id: ✓ / resque-meta
     - queue health: sample controller / ✓
     Notes:
     - Did the job run? Resque assumes everything worked and only tells you about failures. That was not good enough for us.
  25. Pausing Workers

         signal       what happens                      when to use
         quit         wait for child & exit             gracefully shutdown
         term / int   immediately kill child & exit     shutdown now
         usr1         immediately kill child            stale child
         usr2         don't start any new jobs
         cont         start to process new jobs
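The signal table on slide 25 can be wrapped in a small helper so that scripts name the intent rather than the signal. This is a minimal sketch under stated assumptions: `WORKER_SIGNALS` and `signal_worker` are hypothetical names, the action names are ours, and the only real API used is Ruby's `Process.kill(signal, pid)`. The `process:` keyword exists only so the helper can be exercised without a live worker.

```ruby
# Intent -> signal, following the table on slide 25.
WORKER_SIGNALS = {
  graceful_shutdown: 'QUIT', # wait for child, then exit
  shutdown_now:      'TERM', # immediately kill child, then exit
  kill_stale_child:  'USR1', # immediately kill child, keep the worker
  pause:             'USR2', # don't start any new jobs
  resume:            'CONT'  # start to process new jobs again
}.freeze

# Hypothetical helper: sends the signal for the given action to a worker pid
# and returns the signal name. `process` defaults to Ruby's Process module.
def signal_worker(pid, action, process: Process)
  signal = WORKER_SIGNALS.fetch(action) do
    raise ArgumentError, "unknown action: #{action}"
  end
  process.kill(signal, pid)
  signal
end

# Example (not run here): pause the worker whose pid file the monit_rake
# wrapper wrote, assuming a worker named "primary".
#   pid = File.read('tmp/pids/resque_primary.pid').to_i
#   signal_worker(pid, :pause)
```

Pausing with `:pause` before a deploy and sending `:resume` afterwards lets in-flight jobs finish without new ones starting.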
  26. Testing Worker (columns: bdrb / resque; marks as they appear on the slide)
     - testing queue: mid-easy / resque_unit
     - testing command: ✓
     - all workers same: ✓
     - interface only: ✓
  27. Mail

         Resque::Mailer.excluded_environments = [:test, :cucumber]
  28. Extending with Hooks

         resque hooks
         around_enqueue     ✗
         after_enqueue      ✓
         before_perform     ✓
         around_perform     ✓/✗
         after_perform      ✓

     - all plugins want to extend enqueue, so they are not compatible with each other
     - need to be able to alter arguments (e.g.: add an id for meta plugins)
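For the perform-side hooks listed on slide 28, Resque looks for class methods on the job whose names start with the hook name (for example `around_perform_...`), so a plugin is typically a module that the job class extends. The sketch below is illustrative only: `TimingHooks`, `timings`, and `ExampleJob` are hypothetical names, and the hook is driven by hand at the end because no Resque worker is running here.

```ruby
# Hypothetical plugin (not from the slides): records how long each job ran.
module TimingHooks
  def timings
    @timings ||= []
  end

  # Resque calls around_perform_* hooks with the job arguments and expects
  # the hook to yield in order to run the job itself.
  def around_perform_record_timing(*args)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    timings << { args: args, seconds: elapsed }
  end
end

class ExampleJob
  extend TimingHooks
  @queue = :job_worker

  def self.perform(*_args)
    # real work would go here
  end
end

# Outside Resque, call the hook directly to see what a worker would do.
ExampleJob.around_perform_record_timing('nightly') do
  ExampleJob.perform('nightly')
end
puts ExampleJob.timings.size
```

Because each plugin contributes its own distinctly named method, several perform hooks can coexist on one job, which is the contrast with the enqueue-side conflict the slide describes.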
  29. Conclusion
     - The boss got no pages in the first month after the migration
     - no memory leaks, great uptime (don't need monit...)
     - Fast
     - generalized workers increase throughput (nightly vs 1 hour)
     - minimal custom code
     - still some intimidation
     - Eating the flavor of the month
  30. References
     - coders: @kbrock and @wpeterson
     - great company: PatientsLikeMe (encouraged sharing this)
     - resque_mailer
     - resque-scheduler
     - resque-meta
     - monit, hoptoad, rpm_contrib