Distributed Queue System using Gearman

We use Gearman to manage our queue system. This talk covers why a queue is useful in many situations, both in web-based interfaces and in server-side applications.

Published in: Technology

Transcript of "Distributed Queue System using Gearman"

  1. Distributed Queue System using Gearman
     Taehun Cho, CTO @ IMcompany
     http://iamcompany.net/
  2. What is a Queue?
     (image copyright www.mathworks.com)
  3. Multiple Queues
     With multiple tellers, customers form multiple queues to be served.
  4. Some Situations on the Web
     You want to send an email, push notifications or messages, import a big file, etc.
  5. The problem is that...
     DO NOT LET YOUR USERS WAIT
  6. Run Asynchronously
  7. Running a Task in a Single Thread?
     A long-running load may affect the main GUI.
  8. Run in Background!
     The user can move on to the next action while the task is done in the backend, in a distributed manner.
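The idea on slide 8 can be sketched in a single process before any job server is involved. This is only an illustration using Python's standard library, not Gearman: the helper name `start_background_worker` and the job strings are hypothetical, and one thread stands in for the distributed backend.

```python
import queue
import threading

def start_background_worker(handler):
    """Start one daemon thread that runs queued jobs; return the job queue."""
    jobs = queue.Queue()

    def loop():
        while True:
            job = jobs.get()
            if job is None:          # sentinel value: stop the worker
                jobs.task_done()
                break
            try:
                handler(job)
            finally:
                jobs.task_done()

    threading.Thread(target=loop, daemon=True).start()
    return jobs

sent = []
jobs = start_background_worker(sent.append)  # stand-in for "send an email"
jobs.put("welcome-mail:alice")   # returns immediately; the caller is not blocked
jobs.put("welcome-mail:bob")
jobs.join()                      # only for the demo: wait until both are handled
```

The caller only pays for `put`, which is the point of the slide: the slow work happens elsewhere. A job server such as Gearman moves the same pattern across processes and machines.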
  9. Job Queue Systems
     ● Celery (http://www.celeryproject.org/)
     ● RabbitMQ (http://www.rabbitmq.com/)
     ● Zend Server Job Queue (http://www.zend.com/)
     ● ZeroMQ (http://www.zeromq.org/)
     ● Beanstalkd
     ● Peafowl, Starling
     ● Apache ActiveMQ
     ● and many others
  10. Introducing Gearman
      LiveJournal
  11. Introducing Gearman
      At LiveJournal, many photos were uploaded every day, which led to a heavy image-processing load. That was the motivation for building such a queue system.
      ● Yahoo!: 120+ servers, 12M jobs/day
      ● Digg: 45+ servers, 400K jobs/day
      ● LiveJournal, SixApart, DealNews, Xing.com, and many others
        (Expert PHP and MySQL, Andrew et al., 2010, Wrox)
      ● Grooveshark, GoDaddy.com, IMcompany
  12. Features of Gearman
      ● Open Source
      ● Simple & Fast (rewritten in C)
      ● Supports a variety of languages: build the Worker in Python, the Client in PHP
      ● Flexible
      ● Load Balancing
      ● Failover
  13. Example of Architectures (from http://gearman.org/#what_is_gearman)
  14. Architecture
      ● Client: connects and submits a job
      ● Gearman Job Server: acks the job, finds all sleeping workers, and sends a noop command to wake them up
      ● Worker: wakes up and asks the server for jobs
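The messages named on slide 14 (submit a job, noop, ask for a job) travel as small binary packets. As a rough sketch, assuming the packet layout in the Gearman protocol documentation (a 4-byte magic, a 4-byte big-endian type, a 4-byte big-endian body size, then NUL-separated arguments), a request can be framed like this. The function name `resize_image` and its payload are hypothetical.

```python
import struct

# Packet type numbers as listed in the Gearman binary protocol.
PRE_SLEEP, NOOP, SUBMIT_JOB, GRAB_JOB = 4, 6, 7, 9

def pack_request(packet_type, *args):
    """Frame a client/worker request: magic, type, body size, NUL-joined args."""
    body = b"\0".join(args)
    return b"\0REQ" + struct.pack(">II", packet_type, len(body)) + body

def unpack_header(packet):
    """Return (magic, type, size) read from the 12-byte packet header."""
    magic = packet[:4]
    packet_type, size = struct.unpack(">II", packet[4:12])
    return magic, packet_type, size

# Client side: SUBMIT_JOB carries function name, unique id (empty here), data.
submit = pack_request(SUBMIT_JOB, b"resize_image", b"", b"photo-42.jpg")

# Worker side: after being woken by the server, ask for a job (no arguments).
grab = pack_request(GRAB_JOB)
```

The noop itself comes from the server, so on the wire it carries the response magic `\0RES` rather than `\0REQ`. Client libraries hide all of this; the sketch is only meant to show how little is exchanged per step.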
  15. Installation
      ● Compile
          tar xzvf gearmand-X.Y.tar.gz
          cd gearmand-X.Y
          ./configure
          make
          make install
      ● Start Server
          $ gearmand -d
      (for PHP APIs)
      ● Pecl Extension
          sudo pecl install gearman
      ● Add below to php.ini
          extension="gearman.so"
  16. Use Cases
      - Crawling a website
      - Image Manipulation
      - Push Notification
      - Sending Email/Messages
      - File verification/compressing
      - Fetching RSS Feeds
      - Indexing on Search Engine
  17. Samples - Worker
  18. Samples - Client
  19. Samples - Monitoring
      A good tool for monitoring Gearman is available at https://github.com/yugene/Gearman-Monitor
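Monitoring tools typically read gearmand's plain-text `status` admin command, whose reply has one tab-separated line per function (name, jobs in queue, jobs running, available workers) and ends with a line containing only a dot. A minimal parser for that reply might look like the following; the sample text and function names are made up for illustration, and fetching the reply from the server is left out.

```python
def parse_status(text):
    """Parse the reply to gearmand's `status` admin command.

    Each line is: function<TAB>total<TAB>running<TAB>available workers.
    A line holding only "." ends the reply.
    """
    rows = {}
    for line in text.splitlines():
        if line == ".":
            break
        if not line.strip():
            continue
        name, total, running, workers = line.split("\t")
        rows[name] = {
            "total": int(total),
            "running": int(running),
            "workers": int(workers),
        }
    return rows

# Hypothetical reply, as it would arrive over the admin connection.
sample = "send_email\t12\t2\t4\ncrawl_school\t0\t0\t3\n.\n"
status = parse_status(sample)

# Jobs still waiting per function: queued total minus those already running.
backlog = {name: row["total"] - row["running"] for name, row in status.items()}
```

A growing backlog with idle workers, or a function with zero workers, is usually the first thing worth alerting on.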
  20. Result
      Worker #1, Worker #2
      An incomplete job is re-queued to an available worker for fault tolerance.
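The re-queue behavior on slide 20 can be imitated in a few lines. This is a toy simulation, not Gearman's implementation: workers are plain functions, a crash is modeled as a `RuntimeError`, and all names are hypothetical.

```python
from collections import deque

def run_with_requeue(jobs, workers):
    """Hand jobs to workers in turn; a job whose worker dies goes back in the queue."""
    pending = deque(jobs)
    alive = list(workers)
    done = []
    while pending and alive:
        job = pending.popleft()
        worker = alive[0]
        try:
            done.append((worker.__name__, worker(job)))
            alive.append(alive.pop(0))   # rotate to the next worker
        except RuntimeError:
            alive.pop(0)                 # this worker is gone
            pending.appendleft(job)      # the incomplete job is re-queued
    return done, list(pending)

def worker_1(job):
    raise RuntimeError("worker #1 crashed")

def worker_2(job):
    return job.upper()

done, left = run_with_requeue(["a", "b"], [worker_1, worker_2])
```

Here job `"a"` is first given to `worker_1`, which fails, so the job returns to the front of the queue and `worker_2` finishes both jobs.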
  21. Motivation
      ● In the beginning, we ran 3 computers to crawl each school's information (articles, school schedules).
      ● They did one job at a time, took too long to finish everything, and sometimes machines did the same job as the others.
      ● That was the motivation to build a job queue system that could run jobs in parallel. And we found Gearman!
  22. Gearman in IMcompany
      But there were some challenges!
      ● How many workers should run on a server? (How do we use the load efficiently?)
      ● How can we handle unexpected termination of workers?
      ● What if the server's resources are exhausted by the jobs the workers run? (Then the server would not respond to other requests/connections related to WEB, SVN, MySQL.)
  23. Exceptional Case #1
  24. Reported bugs when using PHP
      ● Bug #63041: "Failed to set exception option" on connect when any gearman server is down
        https://bugs.php.net/bug.php?id=63041
      ● Bug #63648: Gearman worker stops with segfault after 1-2 hour of working
        https://bugs.php.net/bug.php?id=63648
  25. Supervisord for sanity
      "PHP was not built for long running request"
      "Sometimes it occurs memory leaks"
      Supervisord helps in the cases above!
      - Automatically restarts processes based on custom configuration
      * Installation guide: http://www.masnun.com/2011/11/02/gearman-php-and-supervisor-processing-background-jobs-with-sanity.html
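What a process supervisor does for a crashing worker can be shown with a small restart loop. This is a simplified stand-in written for illustration, not how Supervisord is used in practice (it is driven by a configuration file); the function `supervise` and the fake worker command are assumptions.

```python
import subprocess
import sys

def supervise(command, max_restarts):
    """Run command; restart it after each non-zero exit, up to max_restarts times."""
    exit_codes = []
    for _ in range(max_restarts + 1):
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        if result.returncode == 0:      # clean exit: stop supervising
            break
    return exit_codes

# Stand-in for a crashing worker: a child process that exits with status 1.
crashing_worker = [sys.executable, "-c", "import sys; sys.exit(1)"]
codes = supervise(crashing_worker, max_restarts=2)
```

A real supervisor adds what this loop lacks: backoff between restarts, log capture, and control over many worker processes at once.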
  26. Exceptional Case #2
      "PHP sometimes slows down after hundreds of executions, kill it off if you know this will happen."
      - Mike Willbanks, "Gearman: A Job Server made for Scale"
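One common way to act on this advice is to let each worker process handle a fixed number of jobs and then exit, so that a supervisor starts a fresh one. The loop below is a language-neutral sketch in Python; `fetch_job`, `handle` and the limit of 3 are placeholders, not part of any Gearman API.

```python
def work_until_limit(fetch_job, handle, max_jobs):
    """Process at most max_jobs jobs, then return so the process can exit."""
    handled = 0
    while handled < max_jobs:
        job = fetch_job()
        if job is None:        # nothing left to do in this demo
            break
        handle(job)
        handled += 1
    return handled

incoming = iter(range(5))      # five fake jobs: 0..4
results = []
count = work_until_limit(lambda: next(incoming, None), results.append, max_jobs=3)
```

After the call returns, the worker script would simply end, and the restart is left to the supervisor. The two remaining jobs stay in the queue for the next process.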
  27. Server Seems Fine for Now
  28. What We Learned
      ● Gearman's queue list is unstable, so persistent queueing was badly needed in our system
      ● Integrating MySQL with Gearman failed in both 1.0.2 and 0.34
      ● Tried SQLite, but performance was very poor
      Do NOT Reserve Too Many Jobs in a Queue
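One way to cope with a queue that can lose jobs is to record each job on the client side before submitting it, and to resubmit whatever was never marked done. The sketch below uses Python's built-in `sqlite3` only because it needs no setup; it is not the gearmand persistent-queue integration the slide refers to, and the table layout and function names are assumptions.

```python
import sqlite3

def open_job_log(path=":memory:"):
    """Open (or create) the table that remembers submitted jobs."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY, func TEXT, payload TEXT, done INTEGER DEFAULT 0)"
    )
    return db

def record_job(db, func, payload):
    """Store the job before handing it to the queue; return its id."""
    cur = db.execute("INSERT INTO jobs (func, payload) VALUES (?, ?)", (func, payload))
    db.commit()
    return cur.lastrowid

def mark_done(db, job_id):
    """Called once the worker reports the job as complete."""
    db.execute("UPDATE jobs SET done = 1 WHERE id = ?", (job_id,))
    db.commit()

def unfinished(db):
    """Jobs to resubmit after the job server has lost its in-memory queue."""
    return db.execute(
        "SELECT func, payload FROM jobs WHERE done = 0 ORDER BY id"
    ).fetchall()

db = open_job_log()
first = record_job(db, "crawl_school", "school-1")
record_job(db, "crawl_school", "school-2")
mark_done(db, first)
to_resubmit = unfinished(db)
```

Any durable store would serve here. The design choice is simply that the record of what must happen lives outside the job server.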
  29. Also We've Tried...
      ● Firing queueing jobs over HTTP requests sometimes does not work and may eventually freeze the server
      ● It doesn't support additional functions for the HTTP connection, such as authentication
      ● It is not customizable
      Gearman Seems Too Young at This Moment
  30. Limitations
      ● The queue makes no guarantees: use MySQL, memcached, Redis, PostgreSQL, etc.
      ● There are few administration tools
      ● Jobs don't expire
      ● If a job is dropped, the client is never notified
      (from http://inside.godaddy.com/cloud-processing-with-gearman/)
  31. Join Community!
      http://gearman.org/
      http://groups.google.com/group/gearman/
  32. We're hiring!
      ● Work in Daejeon, Korea
      ● Flexible, Small Company
      ● Excellent Benefits
      ● We Need Senior Hackers
      Find more information at http://iamcompany.net/
      Thank you! Any questions?