Work Queues
Usage Rights

© All Rights Reserved

Work Queues Presentation Transcript

  • 1. Work Queues with Gearman and CodeIgniter
    CI Conf ’12 - San Francisco
    Erik Giberti - JD Supra, Senior Developer
    @giberti
    erik@jdsupra.com
  • 2. Who Am I?
  • 6. What’s a Work Queue Anyway?
    “A sequence of stored data or programs awaiting processing.” ~ American Heritage Dictionary
    “[F]or storing messages as they travel between computers.” ~ Amazon SQS site
    “[I]t’s the nervous system for how distributed processing communicates” ~ Gearman site
  • 7. The Client Worker Pattern
  • 9. You might be using them already...
        <?php
        // filename: do_some_work.php
        $conn = mysqli_connect($server, $user, $pass, $database);
        $select = "SELECT * FROM things WHERE status = 'not done' ORDER BY timestamp ASC";
        $res = $conn->query($select);
        while ($work = $res->fetch_assoc()) {
            //
            // do work
            //
            // mark the row we just fetched ($work) as done
            $update = "UPDATE things SET status = 'done' WHERE id = {$work['id']}";
            $conn->query($update);
        }
  • 14. It works!
    It’s simple to understand
        Get work -> do work -> done.
    Can be implemented in other languages
        This job/worker could have been created in Python, Java... whatever!
    Can be deployed on a different server
        So long as it can talk to the DB
    It’s persistent
        MySQL is pretty good at keeping data
  • 18. What’s wrong with that?
    Runs at the frequency that cron fires it off
        Can only run once a minute
    Single threaded
        Race condition if you start two overlapping threads
        Mitigation strategies
            Have workers kill themselves every minute
            Use modulus to only do certain jobs in certain threads
            Create different DB pools for each worker
    Hits the database at the predefined frequency
        Wasting DB resources
        Hardest tech in most stacks to scale...
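One way to read the modulus mitigation on this slide, as a minimal sketch: each worker gets an index and only takes rows whose id maps to that index, so two overlapping workers never pick the same row. The function names, the worker numbering and the `MOD(id, N)` clause are assumptions for illustration; only the `things` table and its columns come from the slides.

```php
<?php
// Hypothetical helper: does this row belong to this worker?
// Workers are numbered 0 .. $workerCount - 1.
function job_belongs_to_worker(int $jobId, int $workerIndex, int $workerCount): bool
{
    if ($workerCount < 1 || $workerIndex < 0 || $workerIndex >= $workerCount) {
        throw new InvalidArgumentException('workerIndex must be in [0, workerCount)');
    }
    return $jobId % $workerCount === $workerIndex;
}

// Hypothetical helper: the same rule pushed into the SELECT, so each
// worker only ever fetches its own share of the pending rows.
function select_for_worker(int $workerIndex, int $workerCount): string
{
    return sprintf(
        "SELECT * FROM things WHERE status = 'not done' AND MOD(id, %d) = %d ORDER BY timestamp ASC",
        $workerCount,
        $workerIndex
    );
}
```

With three workers, row 7 would be handled by worker 1 only. The trade-off is that a stopped worker leaves its share of rows unprocessed until it comes back.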
  • 20. Worker Take 2
        // filename: do_some_work_two.php
        $conn = mysqli_connect($server, $user, $pass, $database);
        $select = "SELECT * FROM things WHERE status = 'not done' ORDER BY timestamp ASC LIMIT 1";
        $done = false;
        while (!$done) {
            $res = $conn->query($select);
            if ($res->num_rows == 1) {
                $work = $res->fetch_assoc();
                //
                // do some stuff
                //
                // mark the row we just fetched ($work) as done
                $update = "UPDATE things SET status = 'done' WHERE id = {$work['id']}";
                $conn->query($update);
            } else {
                // nothing to do, wait a second before polling again
                sleep(1);
            }
        }
  • 26. A little better?
    It’s simple to understand
        Get work -> do work -> repeat
    Can be implemented in other languages
        This job/worker could have been created in Python, Java... whatever!
    Can be deployed on a different server
        So long as it can talk to the DB
    It’s persistent
        MySQL is pretty good at keeping data
    It’s near real time
        Maximum delay of 1 second between jobs
  • 30. What’s wrong with that?
    Runs at the frequency of the sleep() delay
        1 second waits are pretty good for offloaded tasks
        Can tweak the timing delay to our level of tolerance
        usleep() gives finer control of the interval, so intervals of less than 1 second are possible!
    Less likely to hit a race condition
        Same mitigation strategies apply
    Hits the database even more!
        MySQL can cache queries pretty well with MyISAM tables...
        ... but you’re using InnoDB ... right?
  • 34. Gearman - a quick history lesson
    Originally developed by Danga Interactive to solve specific issues in the building and hosting of LiveJournal.com. It was originally announced in 2005 and was written in Perl.
    It was ported to C by a number of big companies, like Google, and enjoys wide support from the community; not that much really goes wrong with it.
    Incidentally, Danga Interactive also created Memcached.
    Danga Interactive exists no more; it was sold to Six Apart, then LiveJournal was sold to SUP. Six Apart was acquired by SAY Media.
  • 41. How Gearman Work Queues Stack Up
    They’re simple to understand
        Get work -> do work -> repeat
    They can work across languages
        Jobs are simple strings, so they can pass whatever you want
        Additionally, tools exist for things like MySQL to trigger new jobs
    They can work across servers
        Workers don’t need to live where clients do
    Persistent
        This requires additional configuration
    Real Time
    Bonus: Can be asynchronous and synchronous!
  • 43. But I don’t have this sort of workload... O’RLY?
  • 44. Have you seen this pattern?
        class Upload extends CI_Controller {
            function index(){
                $this->load->helper(array('url'));
                $ul_config = array( /* ... */ );
                $this->load->library('upload', $ul_config);
                if($this->upload->do_upload()){
                    $file_info = $this->upload->data();
                    $img_config = array( /* ... */);
                    $this->load->library('image_lib', $img_config);
                    $this->image_lib->resize();
                    // watermark too?
                    // write to the database
                    // move to web accessible directory
                    redirect(site_url('controller/next'));
                } else { /* error handling code */ }
            }
        }
  • 47. It Works
    Reasonably fast in many circumstances
    Most modern servers can handle a few of these calls at the same time.
  • 51. But there are limitations...
    Cameras take big images
        The iPhone 4 is a 5 megapixel camera, most files are about 2 MB
        Entry level DSLR cameras are 14 megapixel with 5 MB files
    What about multiple uploads
        Present the form, upload one image, process, repeat?
        Provide a block of upload fields and hope it’s enough?
        Use HTML5 / Flash plugin to send multiple files?
        Hope you have enough CPU & RAM to crunch them in parallel.
    Processing may delay the client longer than the default timeout.
        What happens to your user’s confidence in your product if things are slow?
  • 52. Okay, I’m interested. How do I add it?
  • 56. Gearmand runs as another daemon/service in your stack.
        Can be a different server or run alongside everything else
    Gearman’s only job is to facilitate the handling of these messages
        It can optionally store them in a persistent store to recover from system reboots, service restarts, etc.
    PHP talks to the Gearmand server via an API extension
        Just like MySQL, Memcached, PostgreSQL, APC and other tools
    Your client and worker code pass messages back and forth to Gearman
        You’ll use a predefined set of PHP function calls to do this
  • 60. Getting Gearmand running - installers for many systems
    Linux: apt-get, yum etc
        sudo apt-get install gearman-job-server
        sudo yum install gearmand
    Windows: cygwin ...
    OS X: ...
        Install a package manager: MacPorts, Homebrew
            sudo port install gearmand
            brew install gearman
        Add a Virtual Machine that has an installer...
  • 65. Getting Gearmand running - compile your own
    Get your dependencies
        sudo yum install gpp gcc-c++ boost boost-devel libevent libevent-devel libuuid libuuid-devel
    Get the latest stable source from gearman.org (Launchpad)
        wget https://launchpad.net/gearmand/trunk/0.34/+download/gearmand-0.34.tar.gz
    Unpack, Compile and Install
        tar -xzf gearmand-0.34.tar.gz
        cd gearmand-0.34
        ./configure
        make
        sudo make install
    Add appropriate system hooks to start the service on reboot
        chkconfig systemctl ?
  • 68. Configuring Gearmand
    Default install is pretty good
        Load gearmand as a background service
        No persistence
        Listens on all available interfaces at port 4730
    Persistence
        Easily enabled using
            MySQL based databases (Drizzle, MySQL, MariaDB etc)
            PostgreSQL
            SQLite
            Memcached
        Add the appropriate flags to your init script; for details on each, see the docs
        Docs: http://gearman.org/index.php?id=manual:job_server#persistent_queues
        Example: MySQL
            /sbin/gearmand -q libdrizzle --libdrizzle-host=127.0.0.1 --libdrizzle-user=gearman --libdrizzle-password=secret --libdrizzle-db=queue --libdrizzle-table=gearman --libdrizzle-mysql
  • 71. Configuring Gearmand
    Defaults to single thread (mostly)
        Some versions default to use more than one...
        Non-blocking I/O which works very fast with a single thread
        To give each of the internals a dedicated thread use
            /sbin/gearmand -d -t 3
        Additional threads are then used for client worker connections
    Security
        Lock gearmand to a single IP
            /sbin/gearmand -d -L 127.0.0.1
        Change the port from the default 4730
            /sbin/gearmand -d -L 127.0.0.1 -p 7003
    HTTP Access
        Gearmand supports a pluggable interface architecture and can use HTTP for communication.
        Requests are sent using GET and POST data.
  • 75. Configuring Gearmand for high availability
    Gearmand doesn’t replicate data
        May be a weak point depending on your use case
    Redundancy without a load balancer
        In the client logic, add multiple servers and let the driver sort it out
        Workers register themselves with each gearmand server
        Done.
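A minimal sketch of the "add multiple servers" step, assuming the pecl gearman extension, whose `addServers()` method takes a comma-separated `host:port` list. The helper name and the addresses are placeholders for illustration, not hosts from the talk.

```php
<?php
// Hypothetical helper: build the comma-separated "host:port" list that
// GearmanClient::addServers() and GearmanWorker::addServers() accept.
function gearman_server_list(array $hosts, int $defaultPort = 4730): string
{
    $servers = array();
    foreach ($hosts as $host) {
        // Append the default port when the entry does not name one.
        $servers[] = strpos($host, ':') === false ? $host . ':' . $defaultPort : $host;
    }
    return implode(',', $servers);
}

// Placeholder addresses for two gearmand servers.
$list = gearman_server_list(array('10.0.0.1', '10.0.0.2:7003'));

// Only touch the extension when it is actually installed.
if (class_exists('GearmanClient')) {
    $gm = new GearmanClient();
    $gm->addServers($list);
}
```

A worker would pass the same list to its own `addServers()` call so that it registers with every gearmand server.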
  • 77. Configuring Gearmand for high availability
    Easy to load balance
        Put the two servers behind a load balancer
        Each client connects to the load balancer
        Workers register themselves with each gearmand server
        Done.
  • 82. Adding Gearman API to PHP
    Package managers to the rescue again!
        sudo apt-get install php-pecl-gearman
        sudo yum install php-pecl-gearman
    Pecl
        sudo pecl install gearman
        echo "extension=gearman.so" | sudo tee /etc/php.d/gearman.ini
    Quick test
        php -i | grep "gearman support"
        gearman support => enabled
    Restart any services with PHP resident in memory
        Apache, Nginx, etc
  • 83. Can we get to the code already?
  • 88. Gearman in PHP - The Client
    Clients create the jobs and tasks for the workers to do
    Create a client object
        $gm = new GearmanClient();
    Define the server(s) to use
        $gm->addServer(); // defaults to localhost:4730
    Create the job and wait for a response
        $jobdata = "/* any valid string */";
        do {
            $res = $gm->do('image_resize', $jobdata);
        } while ($gm->returnCode() != GEARMAN_SUCCESS);
    Or don’t
        $jobid = $gm->doBackground('image_resize', $jobdata);
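The loop on this slide keeps calling the server until it reports success, which can spin forever if the job keeps failing. A hedged variant that gives up after a few attempts might look like the sketch below. The helper name and the attempt limit are assumptions; `doNormal()` is the name newer releases of the pecl extension use in place of the deprecated `do()`.

```php
<?php
// Sketch only: submit a foreground job with a bounded number of attempts.
// $client is expected to be a GearmanClient; any object that offers
// doNormal() and returnCode() works, which keeps the helper easy to test.
function submit_with_retries($client, string $function, string $workload, int $maxAttempts = 3)
{
    // GEARMAN_SUCCESS comes from the extension; 0 is used as a fallback
    // so the sketch can run without it (assumption for illustration).
    $success = defined('GEARMAN_SUCCESS') ? GEARMAN_SUCCESS : 0;

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $result = $client->doNormal($function, $workload);
        if ($client->returnCode() === $success) {
            return $result;
        }
    }
    return null; // the caller decides what to do after giving up
}
```

Usage would be along the lines of `submit_with_retries($gm, 'image_resize', $jobdata)`, with a `null` result treated as a failed job.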
  • 95. Gearman in PHP - The Worker
    Workers do the jobs and tasks the clients request
    Define the callbacks
        function callback_resize($job){ /* do stuff */ }
        function callback_watermark($job){ /* do stuff */ }
    Create a worker object
        $gm = new GearmanWorker();
    Define the server(s) to respond to
        $gm->addServer(); // defaults to localhost:4730
    Tell the server which callbacks to use for each task
        $gm->addFunction('resize', 'callback_resize');
        $gm->addFunction('watermark', 'callback_watermark');
    Wait for work to do
        while($gm->work());
    Callback functions can post status updates on their progress back to the client
        $job->sendStatus($numerator, $denominator);
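As a rough illustration of the status updates mentioned on this slide, a callback could walk through a list of steps and report progress after each one. The function name and the idea of passing the steps in as callables are assumptions for the sketch; `workload()` and `sendStatus()` are the `GearmanJob` methods it relies on.

```php
<?php
// Sketch only: run a list of steps and report progress after each one.
// $job is expected to be a GearmanJob; $steps is a hypothetical list of
// callables that each receive the workload string.
function run_steps_with_status($job, array $steps): int
{
    $workload = $job->workload();
    $total = count($steps);
    $done = 0;
    foreach ($steps as $step) {
        $step($workload);
        $done++;
        // numerator and denominator, e.g. 1 of 2, then 2 of 2
        $job->sendStatus($done, $total);
    }
    return $done;
}
```

Status updates are most useful for background jobs, where the client polls for progress instead of waiting on the result.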
  • 96. Example Clients in CodeIgniter
        class Photos extends CI_Model {
            public function resize_sync($filename){
                $gm = new GearmanClient();
                $gm->addServer('127.0.0.1', 4730);
                do {
                    $res = $gm->do('image_resize', $filename);
                    switch($gm->returnCode()){
                        case GEARMAN_WORK_FAIL:
                            return FALSE;
                        case GEARMAN_SUCCESS:
                            return TRUE;
                    }
                } while ($gm->returnCode() != GEARMAN_SUCCESS);
                return TRUE;
            }
            public function resize_async($filename){
                $gm = new GearmanClient();
                $gm->addServer('127.0.0.1', 4730);
                $res = $gm->doBackground('image_resize', $filename);
                if(!$res){ return FALSE; }
                return TRUE;
            }
        }
  • 97. The Callback Handlers
        class Photos extends CI_Model {
            public function resize_sync($filename){ /* ... */ }
            public function resize_async($filename){ /* ... */ }
            static public function image_resize($job){
                $filename = $job->workload();
                $CI =& get_instance();
                // image_lib has to be loaded before it can be used
                $CI->load->library('image_lib');
                $config = array(
                    'source_image' => $filename,
                    'create_thumb' => TRUE,
                    'maintain_ratio' => TRUE,
                    'width' => 800,
                    'height' => 600,
                );
                $CI->image_lib->initialize($config);
                $CI->image_lib->resize();
                $CI->image_lib->clear();
                return true;
            }
            static public function image_watermark($job){ /* ... */ }
        }
  • 98. The Worker
        class Photos extends CI_Model {
            public function resize_sync($filename){ /* ... */ }
            public function resize_async($filename){ /* ... */ }
            static public function image_resize($job){ /* ... */ }
            static public function image_watermark($job){ /* ... */ }
            public function gearman_worker()
            {
                $gm = new GearmanWorker();
                $gm->addServer('127.0.0.1', 4730);
                // static methods are registered as 'Class::method' callables
                $gm->addFunction('image_resize', 'Photos::image_resize');
                $gm->addFunction('image_watermark', 'Photos::image_watermark');
                while($gm->work());
            }
        }
  • 99. Bringing it all together
        class Upload extends CI_Controller {
            function index(){
                $this->load->model('photos');
                $this->load->helper(array('url'));
                $ul_config = array( /* ... */ );
                $this->load->library('upload', $ul_config);
                if($this->upload->do_upload()){
                    $file_info = $this->upload->data();
                    $this->photos->resize_async($file_info['full_path']);
                    redirect(site_url('controller/next'));
                } else { /* error handling code */ }
            }
            function worker(){
                $this->load->model('photos');
                $this->photos->gearman_worker();
            }
        }
  • 100. Running your workers
        #!/bin/bash
        SCRIPT="index.php upload worker"   # this is your CI controller/method
        WORKDIR=/var/www/html/             # this is your CI app root
        MAX_WORKERS=5                      # number of workers
        PHP=/usr/bin/php                   # location of PHP on your system
        COUNT=0                            # internal use variable
        # count the worker processes that are already running
        for i in `ps -afe | grep "$SCRIPT" | grep -v grep | awk '{print $2}'`
        do
            COUNT=$((COUNT+1))
        done
        if test $COUNT -lt $MAX_WORKERS
        then
            cd $WORKDIR
            $PHP $SCRIPT
        else
            echo "There are enough workers running already."
        fi
  • 101. Demo Time: Image Resizing
  • 104. About the job data
    Job data is always a string
    You can serialize data to pass more complex objects
        $data = array(
            'filename' => $filename,
            'methods' => array('resize', 'watermark'),
            'user_id' => $user_id,
        );
        $jobdata = json_encode($data);
        $gm->doBackground('processor', $jobdata);
    And then deserialize it in the worker
        $jobdata = $job->workload();
        $data = json_decode($jobdata, true);
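To make the decode side a little more concrete, here is a minimal sketch of what a `processor` worker might do with that payload: validate it, then run each named method in order. The function name, the handler map and the choice to skip unknown method names are assumptions for illustration; only the `filename` and `methods` keys come from the slide.

```php
<?php
// Sketch only: decode a JSON workload and dispatch to the named methods.
// $handlers is a hypothetical map of method name => callable(string $filename).
function process_workload(string $jobdata, array $handlers): array
{
    $data = json_decode($jobdata, true);
    if (!is_array($data)
        || !isset($data['filename'], $data['methods'])
        || !is_array($data['methods'])) {
        throw new InvalidArgumentException('Malformed job data');
    }

    $ran = array();
    foreach ($data['methods'] as $method) {
        // Unknown or non-string method names are skipped in this sketch.
        if (!is_string($method) || !isset($handlers[$method])) {
            continue;
        }
        $handlers[$method]($data['filename']);
        $ran[] = $method;
    }
    return $ran; // names of the methods that actually ran
}
```

Inside a real callback the first argument would come from `$job->workload()`.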
  • 107. About the job data
    In theory, you can pass about 2 GB per message
        Limited by the server protocol which uses a 32 bit integer to define message size.
    You can pass binary data by encoding it
        $binary_base64 = base64_encode($binary);
    But use it sparingly
        Any data passed is stored in memory and your persistent store (if enabled) and is likely on disk already
        Base 64 encoded data is roughly 1 1/3 times the size of the source data too
        Compressing the binary data first can help
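The size overhead is easy to check: base64 emits 4 output bytes for every 3 input bytes. The sketch below works that out and also shows the compress-first idea; the helper name and the sample data are made up for illustration, and the compression step assumes the zlib extension is available.

```php
<?php
// Base64 emits 4 output bytes per 3 input bytes, padding the last group.
function base64_size(int $rawBytes): int
{
    return 4 * intdiv($rawBytes + 2, 3);
}

// 3,000 bytes of repetitive sample data (not real image data).
$binary = str_repeat("\x00\xff\x10", 1000);
$encoded = base64_encode($binary);

// Compressing first shrinks repetitive data before the base64 overhead
// is applied; fall back to plain encoding when zlib is missing.
$packed = function_exists('gzcompress')
    ? base64_encode(gzcompress($binary))
    : $encoded;

printf("raw: %d, base64: %d, compressed + base64: %d\n",
    strlen($binary), strlen($encoded), strlen($packed));
```

A worker receiving the compressed form would reverse the steps with `base64_decode()` followed by `gzuncompress()`. Already compressed formats such as JPEG gain little from the extra compression pass.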
  • 111. About the job data
    Don’t pass 2 GB objects
    Keep your message data small, less than 64K is perfect
        Amazon’s SQS has a hard limit of 64K
    Pass pointers instead
        Pass paths and filenames instead of files
        Pass cache keys instead of complex class objects that won’t deserialize right anyway
    Gearman is not a data store!
  • 112. Another Example
  • 122. Analytics: Problem
      • We wanted to provide near-real-time analytics to our clients
      • We liked the data we could scrape out of Google
          • Networks
          • Referrers
          • Location
      • so we did... for a while... and it was good!
      • But Google has API limits for getting data out of Analytics
          • Only a few thousand requests per day
          • Batching requests was good, but not ideal
          • Data was only monthly; no daily or weekly breakdowns
          • Still ended up taking weeks to gather a full month of data
      • We had already scraped a significant amount of Google data, so any future collection would need to play nice with the existing data
  • 131. Analytics: Solution
      • Build our own platform that gathered exactly what we wanted, modeled after Google Analytics
      • How we did our tracking pixel
          • The page includes a small JavaScript file that captures the current URL, user agent, referrer and so on, and embeds it as an image tag with a carefully crafted URL for the 1x1 tracking pixel
              • /statistics/pixel/?ua=Mozilla%2F...
          • The pixel request is handled by CodeIgniter, which returns the image, but not before we look up the country, state and city of the originating IP and pass that data into Gearman for future processing
          • The Gearman worker then parses and normalizes the data and performs a host lookup on the source IP (which is often very slow), all before recording the result in our datastore
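The split between the fast pixel request and the slow worker might look something like the sketch below. It is written in Python with an in-process `queue.Queue` standing in for the Gearman job server; `lookup_location`, `resolve_host`, the query parameters and the IP address are all hypothetical stubs so the example runs offline, and none of it is the talk's actual code.

```python
import json
import queue
from urllib.parse import parse_qs, urlsplit

job_queue = queue.Queue()  # stand-in for the Gearman job server


def lookup_location(ip):
    """Hypothetical geo lookup; a real one would query a GeoIP database."""
    return {"country": "US", "state": "CA", "city": "San Francisco"}


def handle_pixel_request(url, remote_ip):
    """Front-end side: do the cheap work, queue the rest, return at once."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    job = {"ip": remote_ip, "geo": lookup_location(remote_ip), "params": params}
    job_queue.put(json.dumps(job))
    return "1x1.gif"  # placeholder for the image response


def resolve_host(ip):
    """Hypothetical slow reverse-DNS lookup, stubbed to stay offline."""
    return "host-" + ip.replace(".", "-") + ".example.net"


def run_worker(datastore):
    """Worker side: normalize, do the slow lookup, then record the hit."""
    while not job_queue.empty():
        job = json.loads(job_queue.get())
        job["params"]["ua"] = job["params"].get("ua", "").strip().lower()
        job["host"] = resolve_host(job["ip"])
        datastore.append(job)


hits = []  # stand-in for the datastore
handle_pixel_request(
    "/statistics/pixel/?ua=Mozilla%2F5.0&ref=example.com", "203.0.113.7"
)
run_worker(hits)
```

The point of the split is that the visitor's browser only waits for `handle_pixel_request`; the slow host lookup happens later, inside the worker.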
  • 132. Demo Time: Analytics
  • 133. Patterns and Recipes
  • 135. Patterns and Recipes
      • One to one
          • Workers can run in optimized environments for specific tasks
          • Example: Using R, Matlab etc. to run mathematical analysis and still use CI for the front end
          • Example: Run code on a different server to avoid CPU/disk/memory contention with Apache etc.
          • Example: Run tools on Windows platforms, like .NET components for generating Word files
      • One to many
          • A single client can farm out jobs to multiple workers
          • Example: Performing TF*IDF analysis on documents to find keywords
          • Example: Handling image manipulations in parallel, resizing 2 or 3 new thumbnails at a time
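As a rough sketch of the one-to-many pattern, the TF*IDF example could be split so that each worker counts terms for one document and the client combines the results. This Python version uses a thread pool as a stand-in for Gearman workers; the function names and sample documents are made up for illustration.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def term_frequencies(text):
    """Worker job: relative frequency of each word in one document."""
    words = text.lower().split()
    counts = Counter(words)
    return {word: n / len(words) for word, n in counts.items()}


def top_keywords(documents, workers=3):
    """Client: fan out one job per document, then combine the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tfs = list(pool.map(term_frequencies, documents))
    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter(word for tf in tfs for word in tf)
    total = len(documents)
    keywords = []
    for tf in tfs:
        scores = {w: f * math.log(total / doc_freq[w]) for w, f in tf.items()}
        keywords.append(max(scores, key=scores.get))
    return keywords


docs = [
    "the queue stores the queue jobs",
    "the worker runs the worker jobs",
    "the client sends the client jobs",
]
keywords = top_keywords(docs)
```

Words that appear in every document ("the", "jobs") score zero, so the top keyword per document is the word that is frequent there and rare elsewhere.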
  • 137. Patterns and Recipes
      • Many to one
          • Multiple clients utilizing a single worker thread
          • Share memory across jobs. Arrays are faster than APC, Memcached, MySQL
          • Example: Database write buffering (for non-critical data only)
          • Example: Perform database writes across shards. A MySQL UDF inserts a record into Gearman that’s then rewritten out to the appropriate user shards
      • Optimize front end displays
          • Example: Pagination optimization
              • User requests the first page of data
              • Kick off a background task to pre-cache the next page before it’s requested
          • Example: Dashboards
              • User logs into your site; you fire off background tasks to generate the dashboard information that user will need
              • Workers begin crunching the data and store the results in a cache
              • Meanwhile, the user is served a page with spaces allocated for each widget, which are then loaded via AJAX
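The write-buffering example for the many-to-one pattern could be sketched like this. It is a Python illustration under assumptions: `BufferedWriter`, `flush_size` and the `write_batch` callback are hypothetical names, and a plain list stands in for the database.

```python
class BufferedWriter:
    """Single worker that batches records arriving from many clients."""

    def __init__(self, flush_size, write_batch):
        self.flush_size = flush_size
        self.write_batch = write_batch  # callable that does one bulk write
        self.buffer = []  # in-process list shared across jobs

    def handle_job(self, record):
        """Called once per job; writes only when the buffer is full."""
        self.buffer.append(record)
        if len(self.buffer) >= self.flush_size:
            self.flush()

    def flush(self):
        """Write the buffered records as one batch and empty the buffer."""
        if self.buffer:
            self.write_batch(list(self.buffer))
            self.buffer.clear()


batches = []  # stand-in for the database: each entry is one bulk write
writer = BufferedWriter(flush_size=3, write_batch=batches.append)
for page_view in range(7):
    writer.handle_job({"page_view": page_view})
writer.flush()  # write whatever is left, e.g. on shutdown
```

Seven jobs result in three writes instead of seven. Anything still in the buffer is lost if the worker dies, which is why the slide limits this to non-critical data.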
  • 138. Patterns and Recipes
      • Delayed/Deferred Processing
          • Use whenever tasks would run long or are potentially unreliable
          • Preparing images
          • Preparing video
          • Remote service calls
              • Sending content to Twitter, Facebook, LinkedIn...
              • Triggering other remote services: faxes, emails etc.
          • Prefetching data
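One way deferring an unreliable remote call might play out is sketched below in Python. The retry policy is an assumption added for illustration (the slides do not prescribe one), `flaky_send` is a hypothetical remote call that fails once, and `queue.Queue` again stands in for the job server.

```python
import queue


def process_deferred(jobs, send, max_attempts=3):
    """Run queued jobs, re-queueing failures until max_attempts is reached."""
    done, failed = [], []
    while not jobs.empty():
        payload, attempt = jobs.get()
        try:
            send(payload)
            done.append(payload)
        except ConnectionError:
            if attempt + 1 < max_attempts:
                jobs.put((payload, attempt + 1))  # try again later
            else:
                failed.append(payload)
    return done, failed


calls = {"count": 0}


def flaky_send(payload):
    """Hypothetical remote call that fails on its first invocation only."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("remote service unavailable")


jobs = queue.Queue()
jobs.put(("post status update", 0))
jobs.put(("send email", 0))
done, failed = process_deferred(jobs, flaky_send)
```

Because the failure happens in the worker, the user who triggered the job never sees the error or waits for the retry.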
  • 139. Other Solutions
  • 141. Other Solutions
      • Most alternative solutions have APIs for PHP
          • Some interface over other protocols like HTTP(S) or memcache, which can ease deploying new servers
      • Alternatives
          • ActiveMQ
          • Amazon’s SQS
          • Beanstalkd
          • Microsoft Message Queuing
          • RabbitMQ
          • Others: check for activity before adopting
  • 151. Some helpful tips
      • Not everything belongs in a work queue
          • Queue it if...
              • the process is intensive on any subsystem: CPU, RAM etc.
              • the process is slow, taking 1/2 second or longer to run (profile your app with CodeIgniter’s built-in profiler)
              • you want the benefit of running in parallel
      • Don’t use your database server as your work queue
          • It might work short term, but it’s not scalable
      • Persistence comes with a cost
          • Writes to the datastore are never free and will slow down the queue
          • Adds an additional layer of stuff to maintain
  • 152. Questions?
  • 153. Thank You!