Job queues with Gearman
  • Hi Mariano, sorry to bother you. I'm trying to get the result of a background task and can't get it to work. Do you know if this is possible, and how it would have to be done?
  • * PRO TIP: If you are doing signup forms, integrate with Facebook and LinkedIn. Both provide emails that you know are already validated
  • * Job queues are not only about performance. They are also about stopping BAD IDEAS * For example: signup form, on $_POST, sends email. Good idea or bad idea? * There are several things that can go wrong: timeouts, errors while connecting, errors while rendering email, errors in email component (I KNOW ;)
  • * Performance is not just about caching, it's about the right architecture * Surprisingly, the right architectures are not that hard to build
  • * Communism is a good thing on the Internet * Outsource anything you can to Workers (ehem ehem Workana)
  • * I think of controllers as sexy secretaries. * When looking for pictures of “Hot secretaries”, Justin Bieber showed up. Can somebody please explain that?
  • * In essence you could do this with RabbitMQ, ActiveMQ, Redis pub/sub, heck even MySQL ** Yes, MySQL (robot plugin) * CakeResque: CakePHP 2.0+, uses Redis
  • * Remember when Harry was told TOM MARVOLO RIDDLE == LORD VOLDEMORT?
  • * Multi-language: PHP, Perl, Java, Ruby, Python * Multi-platform: Linux, OSX, Windows (cygwin) * Synchronous execution: execute, and get the result, waiting for it * Job status callbacks: register callbacks for when the task is completed, or when the job sends a status update (using the worker API)
  • * Why not multiple workers on same machine on multi-core CPUs? Supervisord
  • * addServer(): you can attach your worker to multiple servers * PRIORITIES: doLowBackground(), doBackground(), doHighBackground()
  • * PRIORITIES: doLow(), doNormal(), doHigh()
  • * What happens if your worker fatals? You got it, the whole worker goes bananas. * We need to ensure there's always a worker active
  • * You send status through sendStatus() * You can also send data through sendData()
  • * NOT UNUSUAL: if they can be done independently, they can be done in parallel. EG: send email / pre-create profile page / update stat counters * Resource usage optimization: some “expensive” machines for heavy tasks, others for easier tasks. But we still need to know when all the tasks are DONE * Map & Reduce: Mapper on one Gearman server, reducer on the other addTask: has priorities (addTask, addTaskLow, addTaskHigh)
  • * worker3.php: we use the same exact worker from before, no need to change it for switching to a parallel work mode
  • * Workers KILLED: use monit to issue a dummy ping task to each worker, synchronously * PROCESS switches: do a usleep() between loop iterations. It won't kill you.
  • * wait(): wait for activity from server * PHP BUG: #60764

Presentation Transcript

  • CakeFest 2012 – Manchester, UK. Job Queues: Letting your servers breathe. @mgiglesias
  • So you build a signup
        class UsersController extends AppController {
            public function add() {
                if ($this->request->is('post')) {
                    $this->User->save($this->request->data);
                    $email = new CakeEmail();
                    $email->to($this->request->data['User']['email'])
                        ->subject("Validate em emails")
                        ->template('activate')
                        ->viewVars(array('id' => $this->User->id))
                        ->send();
                    $this->Session->setFlash("Giddy up!");
                    return $this->redirect(array('action' => 'index'));
                } else {
                    $this->Session->setFlash("Fix em typos");
                }
            }
        }
  • Bad ideas
      ● Email sending
      ● Notifications
      ● Messaging
          – To multiple users
      ● Imports
          – Oh yes, imports
      ● Counter caches
  • The #1 Rule of Fight Club: YOU DO NOT MAKE THE USER WAIT
  • The solution
  • Show me how, comrade
      Controller: "Could someone please send this email?"
        { to:, subject: FYI, body: Unicorns sux }
        { to:, subject: Yo, dawg, body: Whatup }
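The hand-off on this slide can be sketched as follows: rather than sending the activation email inside the request, the controller only describes the email and passes that description to a dispatcher. This is a minimal sketch under stated assumptions. The helper `queueActivationEmail()`, the function name `send_email`, and the stand-in dispatcher are hypothetical and not part of the talk. With the gearman extension loaded, the dispatcher would wrap `GearmanClient::doBackground()`.

```php
<?php
// Hypothetical helper: build the JSON payload a worker would need to send
// the activation email, and hand it to a dispatcher instead of sending inline.
function queueActivationEmail(callable $dispatch, int $userId, string $email): string
{
    $payload = json_encode(array(
        'to' => $email,
        'subject' => 'Validate em emails',
        'template' => 'activate',
        'viewVars' => array('id' => $userId),
    ));
    // 'send_email' is an assumed name; use whatever name the worker registers.
    return $dispatch('send_email', $payload);
}

// With the gearman extension, the dispatcher could be:
// $client = new GearmanClient();
// $client->addServer();
// $dispatch = function ($fn, $payload) use ($client) {
//     return $client->doBackground($fn, $payload);
// };

// Stand-in dispatcher so the sketch runs without a job server.
$sent = array();
$dispatch = function (string $function, string $payload) use (&$sent): string {
    $sent[] = array($function, $payload);
    return 'fake-handle-' . count($sent);
};

$handle = queueActivationEmail($dispatch, 42, 'user@example.com');
echo $handle, "\n";
```

Passing the dispatcher in as a callable keeps the controller free of any queue-specific code, so the same action can be exercised in tests without a running job server.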
  • Ingredients
      ● Channel for dispatching tasks
      ● Job queue
      ● Workers that wait for jobs
      ● Beer
      ● CakeResque: github://kamisama/Cake-Resque
  • A Harry Potter moment: GEARMAN MANAGER
  • Break it down
      ● Client API to dispatch tasks
      ● Job server (Gearmand)
      ● Worker API to, well, work on jobs
      ● Multi-language
      ● Multi-platform
      ● Synchronous execution
      ● Job status callbacks
  • Super scalable
      Job servers: Gearmand #1, Gearmand #2
      Workers: Worker #1, Worker #2, Worker #2, Worker #2, Worker #2
  • Client & Worker
        // Client
        $client = new GearmanClient();
        $client->addServer(); // no arguments: default server (localhost:4730)
        $result = $client->doBackground('hello', 'Mariano');
        var_dump($result);

        // Worker
        $worker = new GearmanWorker();
        $worker->addServer(); // no arguments: default server (localhost:4730)
        $worker->addFunction('hello', function($job) {
            $who = $job->workload();
            sleep(1);
            echo "Hello {$who}!" . "\n";
            sleep(1);
            return true;
        });
        while ($worker->work()) {
            usleep(50000);
        }
  • Getting jiggy with it
      ● Start: $ gearmand --verbose DEBUG
      ● $ telnet 4730 and type in STATUS
          – Gives you registered functions, number of workers and items in queue
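The STATUS reply can also be read from a script. The sketch below parses a reply held in a string, assuming the usual gearmand admin format: one tab-separated line per function (name, jobs in queue, jobs running, available workers), ending with a line containing a single dot. The function name `parseGearmanStatus()` and the sample text are illustrative assumptions, not output captured from the talk.

```php
<?php
// Parse the text returned by gearmand's admin "status" command.
// Assumed format per line: FUNCTION \t QUEUED \t RUNNING \t WORKERS
function parseGearmanStatus(string $raw): array
{
    $functions = array();
    foreach (preg_split('/\r?\n/', trim($raw)) as $line) {
        if ($line === '.' || $line === '') {
            continue; // a lone "." terminates the listing
        }
        $parts = explode("\t", $line);
        if (count($parts) !== 4) {
            continue; // skip anything that does not look like a status row
        }
        list($name, $queued, $running, $workers) = $parts;
        $functions[$name] = array(
            'queued' => (int) $queued,
            'running' => (int) $running,
            'workers' => (int) $workers,
        );
    }
    return $functions;
}

// Made-up sample reply for illustration.
$sample = "hello\t3\t1\t4\nprocess\t0\t0\t2\n.\n";
$status = parseGearmanStatus($sample);
echo $status['hello']['queued'], " queued for hello\n";
```

A monitoring script could compare `queued` against `workers` to spot a queue that is growing faster than it is being drained.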
  • Show me the money
  • Get the money back
        // Client
        $client = new GearmanClient();
        $client->addServer(); // no arguments: default server (localhost:4730)
        $result = $client->doNormal('hello', 'Mariano');
        var_dump($result);

        // Worker
        $worker = new GearmanWorker();
        $worker->addServer(); // no arguments: default server (localhost:4730)
        $worker->addFunction('hello', function($job) {
            $who = $job->workload();
            return "Hello {$who}!";
        });
        while ($worker->work()) {
            usleep(50000);
        }
  • Show me the money
  • I got 99 problems
  • <3 supervisor
        [supervisord]
        logfile=/var/log/supervisor/supervisord.log
        logfile_maxbytes = 50MB
        pidfile=/var/run/
        [inet_http_server]
        port =
        [supervisorctl]
        serverurl = unix:///var/run/supervisor.sock
        prompt = supervisor
        [program:worker]
        command=/usr/bin/php -f /home/mariano/worker.php
        process_name=%(program_name)s #%(process_num)s
        numprocs=4
        autostart=true
        autorestart=true
        stdout_logfile=/var/log/supervisor/worker-%(process_num)s.log
        stderr_logfile=/var/log/supervisor/worker-%(process_num)s-error.log
  • <3 supervisor
        $ sudo supervisord -c supervisor.conf
  • Job statuses (diagram): the Client sends [file1.jpg, file2.jpg, file3.jpg] through Gearman to the Worker; the Worker reports (1, 3), (2, 3), (3, 3) back through Gearman to the Client
  • Job statuses
        $worker = new GearmanWorker();
        $worker->addServer(); // no arguments: default server (localhost:4730)
        $worker->addFunction('process', function($job) {
            $files = json_decode($job->workload(), true);
            $i = 0;
            $count = count($files);
            foreach ($files as $file) {
                echo "Processing {$file}... ";
                // Report progress as (numerator, denominator)
                $job->sendStatus(++$i, $count);
                echo "DONE\n";
                sleep(1);
            }
        });
        while ($worker->work()) {
            usleep(50000);
        }
  • Job statuses
        $client = new GearmanClient();
        $client->addServer(); // no arguments: default server (localhost:4730)
        $handle = $client->doBackground('process', json_encode(array(
            'file1.jpg', 'file2.jpg', 'file3.jpg'
        )));
        $done = 0;
        do {
            list($queued, $running, $processed, $total) = $client->jobStatus($handle);
            if (!$queued) {
                break; // the server no longer knows the job, so it has finished
            }
            if ($processed != $done) {
                $done = $processed;
                echo "PROCESSED {$done} of {$total}\n";
            }
            usleep(50000);
        } while (true);
        echo 'DONE';
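The four values returned by `jobStatus()` are easy to mix up, so here is a small sketch that turns them into a readable line. It assumes the array layout documented for the PHP gearman extension: whether the job is known to the server, whether it is running, then the numerator and denominator sent by the worker. The helper name `describeJobStatus()` is hypothetical.

```php
<?php
// Turn a jobStatus()-style array into a human-readable description.
// Assumed layout: [known, running, numerator, denominator].
function describeJobStatus(array $status): string
{
    list($known, $running, $numerator, $denominator) = $status;
    if (!$known) {
        // The server forgets a job once it completes.
        return 'finished or unknown';
    }
    if (!$running) {
        return 'queued';
    }
    if ($denominator > 0) {
        $percent = (int) floor($numerator * 100 / $denominator);
        return "running: {$numerator} of {$denominator} ({$percent}%)";
    }
    return 'running';
}

// Hand-written sample values, not captured from a real server.
echo describeJobStatus(array(true, true, 2, 3)), "\n";
```

Note that a job that has not reported any status yet shows a denominator of 0, which is why the percentage is only computed when the denominator is positive.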
  • Parallel tasks
      ● They are not as unusual as you think
      ● Resource usage optimization
      ● Divide and conquer <BuzzWord>Map & Reduce</BuzzWord>
      ● Can even prioritize the tasks
      ● It's parallel, so it's faster :)
  • Parallel tasks
        $client = new GearmanClient();
        $client->addServer(); // no arguments: default server (localhost:4730)
        // Called every time a worker sends a status update
        $client->setStatusCallback(function($task) {
            echo "PROCESSED {$task->taskNumerator()} of {$task->taskDenominator()}\n";
        });
        // Called once per task when it completes
        $client->setCompleteCallback(function($task) {
            echo "COMPLETED: {$task->unique()}: {$task->data()}\n";
        });
        $client->addTask('process', json_encode(array(
            'file1.jpg', 'file2.jpg', 'file3.jpg'
        )));
        $client->addTask('process', json_encode(array(
            'a1.jpg', 'b.jpg'
        )));
        // Blocks until all added tasks have finished
        $client->runTasks();
  • Parallel tasks
      worker-1.log
        Processing a1.jpg... DONE
        Processing b.jpg... DONE
      worker-2.log
        Processing file1.jpg... DONE
        Processing file2.jpg... DONE
        Processing file3.jpg... DONE
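The divide-and-conquer step, splitting one list of files into several task payloads, can be sketched on its own. This is an illustration under assumptions: the helper name `splitIntoTaskPayloads()` and the batch size are hypothetical, and each returned payload is meant to be passed to `$client->addTask('process', $payload)` before calling `runTasks()`.

```php
<?php
// Split a list of files into JSON payloads, one payload per parallel task.
function splitIntoTaskPayloads(array $files, int $perTask): array
{
    if ($perTask < 1) {
        throw new InvalidArgumentException('perTask must be at least 1');
    }
    $payloads = array();
    // array_chunk() reindexes each chunk, so every chunk encodes as a JSON list.
    foreach (array_chunk($files, $perTask) as $chunk) {
        $payloads[] = json_encode($chunk);
    }
    return $payloads;
}

$payloads = splitIntoTaskPayloads(
    array('file1.jpg', 'file2.jpg', 'file3.jpg', 'a1.jpg', 'b.jpg'),
    2
);
foreach ($payloads as $payload) {
    // With a real client: $client->addTask('process', $payload);
    echo $payload, "\n";
}
```

The batch size is a trade-off: smaller batches spread work across more workers, while larger ones reduce per-task overhead.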
  • What about CakePHP?
  • Bake it!
        class JobsController extends AppController {
            public function trigger() {
                $client = new GearmanClient();
                $client->addServer(); // no arguments: default server (localhost:4730)
                $client->doNormal('run', json_encode(array(
                    'action' => get_called_class() . '::hello',
                    'params' => array('Mariano')
                )));
                echo 'DONE';
                exit;
            }

            public static function hello($who) {
                echo "\t\tHello {$who}!\n";
            }
        }
  • Bake it!
        class WorkerShell extends AppShell {
            public function start() {
                $this->out('Connecting to Gearman... ', 0, Shell::VERBOSE);
                $worker = new GearmanWorker();
                $worker->addServer(); // no arguments: default server (localhost:4730)
                $worker->addFunction('run', array($this, '_run'));
                $this->out('DONE', 1, Shell::VERBOSE);
                $this->out('Waiting for jobs', 1, Shell::VERBOSE);
                while ($worker->work()) {
                    usleep(50000);
                }
            }
        }
  • Bake it!
        public function _run(GearmanJob $job) {
            $params = json_decode($job->workload(), true);
            $this->out('Got new job: ' . $job->workload(), 1, Shell::VERBOSE);
            if (empty($params) || !isset($params['action'])) {
                throw new InvalidArgumentException('Invalid job request');
            } else if (
                preg_match('/^(.+?Controller)::/', $params['action'], $match) &&
                !class_exists($match[1])
            ) {
                // Let CakePHP lazy-load the controller class named in the job
                App::uses($match[1], 'Controller');
            }
            $result = call_user_func_array($params['action'], $params['params']);
            $this->out('Job done', 1, Shell::VERBOSE);
            return $result;
        }
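The decode, validate and call sequence in `_run()` can be tried outside CakePHP. The sketch below mirrors that flow with plain PHP and adds an `is_callable()` check before calling, which is an extra guard assumed here and not something the slides include. The `Greeter` class and the `runJob()` function are hypothetical stand-ins for the controller and the shell method.

```php
<?php
// Stand-in for a controller with a static action.
class Greeter
{
    public static function hello($who)
    {
        return "Hello {$who}!";
    }
}

// Decode a job workload of the form {"action": "Class::method", "params": [...]}
// and call the named action with its parameters.
function runJob(string $workload)
{
    $params = json_decode($workload, true);
    if (!is_array($params) || !isset($params['action']) || !is_string($params['action'])) {
        throw new InvalidArgumentException('Invalid job request');
    }
    // Assumed extra guard: refuse actions that do not resolve to a callable.
    if (!is_callable($params['action'])) {
        throw new InvalidArgumentException('Unknown action: ' . $params['action']);
    }
    // array_values() keeps the arguments positional.
    $args = isset($params['params']) ? array_values((array) $params['params']) : array();
    return call_user_func_array($params['action'], $args);
}

$workload = json_encode(array(
    'action' => 'Greeter::hello',
    'params' => array('Mariano'),
));
echo runJob($workload), "\n";
```

Because the action name comes from the job payload, a production worker would probably also restrict it to a known list of classes; that policy is left out of this sketch.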
  • Lessons learned
      ● Processes stay alive, which means resource timeouts
          – Check your DB for reconnection!
      ● Workers can be killed, easily
      ● Be mindful of your memory usage
      ● Account for process switches
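One common way to act on the memory and long-lived-process points is to let a worker exit after a number of jobs or once it passes a memory threshold, and rely on supervisord (`autorestart=true` in the config above) to start a fresh process. The sketch below shows only the decision; the function name `shouldRecycleWorker()` and both default limits are assumptions chosen for illustration, not values from the talk.

```php
<?php
// Decide whether a long-running worker should exit so a fresh one can replace it.
// Defaults are illustrative: 500 jobs or 64 MB of memory.
function shouldRecycleWorker(
    int $jobsDone,
    int $memoryBytes,
    int $maxJobs = 500,
    int $maxBytes = 67108864
): bool {
    return $jobsDone >= $maxJobs || $memoryBytes >= $maxBytes;
}

// Inside a worker loop (requires the gearman extension), roughly:
// $jobsDone = 0;
// while ($worker->work()) {
//     $jobsDone++;
//     if (shouldRecycleWorker($jobsDone, memory_get_usage(true))) {
//         exit(0); // supervisord is expected to restart the process
//     }
//     usleep(50000);
// }

var_dump(shouldRecycleWorker(10, 1024));
```

Exiting between jobs, never in the middle of one, means no work is lost when the process is recycled.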
  • I can haz non-blocking! (Sort of)
        $worker->addOptions(GEARMAN_WORKER_NON_BLOCKING);
        while (
            $worker->work() || (
                $worker->returnCode() == GEARMAN_IO_WAIT ||
                $worker->returnCode() == GEARMAN_NO_JOBS
            )
        ) {
            if ($worker->returnCode() == GEARMAN_SUCCESS) {
                continue;
            }
            if (!$worker->wait()) {
                if ($worker->returnCode() == GEARMAN_NO_ACTIVE_FDS) {
                    echo "Got disconnected, so waiting for server...\n";
                    sleep(5);
                    continue;
                }
                break;
            }
        }
  • Questions? Beers? @mgiglesias