High gear PHP with Gearman




                        Felix De Vliegher
         14/05/2010 - phpDay Italia 2010
Whoami?

 • Consultant and software engineer at Ibuildings NL

 • Long time PHP developer

 • PHPBenelux

 • Interest in P...
What is Gearman?
And why should I care about it?




                                  5
GEARMAN!




           6
A grain of truth




          The name is an anagram for
          “Manager,” since it dispatches jobs to
          be do...
What is Gearman?


 It’s an

 Application framework
 to

 distribute work




                         8
Distribute work?



•   Actual work is farmed out
•   Handled by a number of nodes
•   RPC
•   Distributed parallel proces...
Traditional architecture

        Application

                        View


                      Controller
           ...
New architecture
Model                    Client application
                        Gearman Client API



               ...
Terminology



                Create jobs to be run and send
     Client
                them to a job server


         ...
Application areas

• Image   resizing / generating

• Log   analysis and aggregation

• Asynchronous    queues

• Map/Redu...
Multiple client & worker APIs




                                 14
Advantages

• Speed up work

• Parallel and asynchronous work

• Doesn’t block your apache processes

• Scales well

• Arc...
Once upon a time
Gearman history




                   16
Once upon a time




• 2005:   http://brad.livejournal.com/2106943.html
• Originally   a Perl implementation
• Created   b...
Once upon a time



• 2008:   Rewrite in C by Brian Aker
• PHP   Extension by James Luedke
• Gearman     powers some of th...
Installing Gearman

• Job   Server: gearmand


• Get   it from https://launchpad.net/gearmand/
• extract,   configure, mak...
Gearmand usage

/usr/local/sbin/gearmand -d -u <user> -L 127.0.0.1 -p 7003



        -d         Start as daemon in backgr...
Commandline gearman

•    Client mode
     •   ls | gearman -f processFiles
     •   gearman -f processFiles < file
     •...
PHP Interface




                22
PHP: 2 options

•   Pecl extension:

$ pecl install channel://pecl.php.net/gearman-0.6.0

$ php -i | grep "gearman support...
Simplest example

                   Worker:




                   Client:




                             24
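The worker and client listings from this slide are not preserved in the text. A minimal sketch of what such a pair could look like with the PECL gearman extension follows. The function name `reverse`, the helper names, and the host and port defaults are assumptions for illustration, not the speaker's code.

```php
<?php
// Pure job logic, kept separate from Gearman so it can be exercised on its own.
function reverse_workload(string $workload): string
{
    return strrev($workload);
}

// Worker side: register "reverse" with the job server and process jobs forever.
// Host and port are assumptions; use whatever gearmand was started with.
function run_reverse_worker(string $host = '127.0.0.1', int $port = 4730): void
{
    $worker = new GearmanWorker();
    $worker->addServer($host, $port);
    $worker->addFunction('reverse', function (GearmanJob $job): string {
        // The return value is sent back to the client as the job result.
        return reverse_workload($job->workload());
    });
    while ($worker->work()) {
        // work() blocks until a job arrives, runs the callback, then returns.
    }
}

// Client side: submit one foreground job and wait for its result.
function request_reverse(string $text, string $host = '127.0.0.1', int $port = 4730): string
{
    $client = new GearmanClient();
    $client->addServer($host, $port);
    // doNormal() is available in pecl gearman >= 1.0; older releases used do().
    return $client->doNormal('reverse', $text);
}
```

Run `run_reverse_worker()` in one terminal and call `request_reverse('Hello phpDay!')` from another script; both need a running gearmand and the gearman extension.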
Client API

Setting up the client:




                         25
Client API

Job priorities and synchronous vs asynchronous:




                                                  26
GearmanClient::jobStatus()



 array(4) {
   [0]=>        Job is known?
   bool(true)
   [1]=>        Job still running?
 ...
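The four-element array above is easier to work with once its positions are named. The following is a sketch under assumptions: the helper names and the polling interval are hypothetical, and only `doBackground()` and `jobStatus()` come from the extension.

```php
<?php
// Turn the positional array from GearmanClient::jobStatus() into named fields.
function describe_job_status(array $status): array
{
    [$known, $running, $numerator, $denominator] = $status;
    return [
        'known'       => (bool) $known,
        'running'     => (bool) $running,
        'numerator'   => (int) $numerator,
        'denominator' => (int) $denominator,
        // Avoid dividing by zero before the worker has reported any progress.
        'percent'     => $denominator > 0 ? (int) round(100 * $numerator / $denominator) : 0,
    ];
}

// Submit a background job, then poll until the job server no longer knows it.
function wait_for_background_job(GearmanClient $client, string $function, string $workload): void
{
    $handle = $client->doBackground($function, $workload);
    do {
        $info = describe_job_status($client->jobStatus($handle));
        printf(
            "Running: %s, numerator: %d, denominator: %d\n",
            $info['running'] ? 'true' : 'false',
            $info['numerator'],
            $info['denominator']
        );
        usleep(250000); // polling interval is an arbitrary choice
    } while ($info['known']);
}
```

Polling only makes sense for background jobs, because a foreground call blocks until the result is back.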
Notifying the client

Client receiving the status notifications:




   dev:~/gearman# php -q client.php
   Running: true,...
Worker API

Possible to add multiple servers:




                                    29
Worker API

Registering functions using a callback:




Pass application data to the functions:




# php -q client.php
Se...
Worker API
Put it all together:




                       31
Notifying the client
Status notifications:
GearmanJob::sendStatus(int $numerator, int $denominator)
GearmanJob::sendWarnin...
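A worker could report progress with `sendStatus()` roughly as follows. This is a sketch, not the speaker's code: the function name `shout`, the helper names, and the space-separated workload format are assumptions.

```php
<?php
// Process a list of items and report progress after each one.
// $job only needs a sendStatus() method; in a real worker it is a GearmanJob.
function process_items(object $job, array $items): string
{
    $items = array_values($items); // make sure indices run 0..n-1
    $total = count($items);
    $done = [];
    foreach ($items as $i => $item) {
        $done[] = strtoupper($item);
        // Numerator and denominator end up in GearmanClient::jobStatus().
        $job->sendStatus($i + 1, $total);
    }
    return implode(' ', $done);
}

// Wire the logic into a worker under the hypothetical function name "shout".
function register_shout(GearmanWorker $worker): void
{
    $worker->addFunction('shout', function (GearmanJob $job): string {
        return process_items($job, explode(' ', $job->workload()));
    });
}
```

Keeping the job logic in a plain function means it can be tested with a stub object, without a job server.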
Jobs vs Tasks




                33
Callbacks

Provide feedback on different moments in the process:

GearmanClient::setDataCallback

GearmanClient::setComple...
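One way these callbacks might be combined with tasks is sketched below: several tasks are queued, run in parallel, and collected through the complete and fail callbacks. The helper name and the way results are keyed are assumptions for illustration.

```php
<?php
// Queue one task per workload, run them in parallel, and collect the results.
// Returns an array keyed like $workloads; failed tasks map to null.
function run_parallel_tasks(GearmanClient $client, string $function, array $workloads): array
{
    $results = [];
    $keys = []; // maps each task's unique id back to the caller's key

    $client->setCompleteCallback(function (GearmanTask $task) use (&$results, &$keys): void {
        $results[$keys[$task->unique()]] = $task->data();
    });
    $client->setFailCallback(function (GearmanTask $task) use (&$results, &$keys): void {
        $results[$keys[$task->unique()]] = null;
    });

    foreach ($workloads as $key => $workload) {
        // A fresh unique id per task keeps the server from coalescing tasks.
        $unique = uniqid('task-', true);
        $keys[$unique] = $key;
        $client->addTask($function, $workload, null, $unique);
    }

    // Blocks until every queued task has completed or failed.
    $client->runTasks();
    return $results;
}
```

The tasks only run in parallel if enough workers are registered for the function; with a single worker they are processed one after another.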
What about persistence?

 By default, jobs are stored in memory (fast)


 gearmand --queue-type (-q)
 • libdrizzle
 • post...
Sqlite3 example

 gearmand
   --queue-type=libsqlite3
   --libsqlite3-db=/tmp/jobs.sqlite
   --libsqlite3-table=gearman_jo...
How to do storage?

 Distributed

 Need to share storage (most of the time)

 Some options:

 • NFS

 • MogileFS

 • DR:BD...
So, can I kick out my crons?

 Not quite :)
 Scheduled execution (cron)
 delayed execution (at)


 */15 * * * * /usr/local...
HTTP protocol

 Start gearmand with -r http

            POST /reverse HTTP/1.1
            Content-Length: 13

          ...
Application areas




                    40
Image resizing: Client




                         41
Image resizing: worker




                         42
Map / Reduce

                              Client


                        Gearman Job server


                        ...
Apache logging




                 44
Monitoring

 GearUp: Monitoring service for gearman
   ‣ No code yet, but looks promising
   ‣ http://launchpad.net/gearup...
Monitoring

 Supervisord: can manage workers
 [program:gearman-foobar-worker]
 command=/usr/local/bin/php -q /home/foo/wor...
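Because supervisord restarts a worker that exits, a long-running PHP worker can simply stop itself once its memory use grows too large. A minimal sketch follows; the threshold and the helper names are assumptions.

```php
<?php
// Arbitrary threshold for this sketch; tune it to the worker's real footprint.
const MEMORY_LIMIT_BYTES = 64 * 1024 * 1024;

// Decide whether the worker has grown past the limit.
function should_restart(int $usedBytes, int $limitBytes = MEMORY_LIMIT_BYTES): bool
{
    return $usedBytes >= $limitBytes;
}

// Process jobs until memory use crosses the limit, then exit.
function run_worker_until_bloated(GearmanWorker $worker): void
{
    while ($worker->work()) {
        // Check between jobs so that a running job is never interrupted.
        if (should_restart(memory_get_usage(true))) {
            // With autorestart=true, supervisord starts a fresh process.
            exit(0);
        }
    }
}
```

Checking after each job, rather than on a timer, keeps the restart from cutting a job short.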
Alternatives

 Most similar: Beanstalkd
 • PHP Client: pheanstalk
 • Web interface in Django
 • http://kr.github.com/beans...
Questions ?




Feedback: http://joind.in/1470


                                 49
Ibuildings challenge!

 http://www.ibuildings.com/challenge

 “The Test Driven Challenge”

 Win an iPad! (when they’re ava...
Links & sources
Credits:
- Gear man: http://agearman.com/
- Distributed computing: http://www.theleadblog.com/2009/06/
lea...
Thank you!
           Contact details:

            Felix De Vliegher
Email: felix@ibuildings.com
            Twitter: @fe...
Gearman is an application framework for distributing work to other machines and processors which are better suited for the job. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates. With things like scalability and distributed computing becoming more and more important to today’s web applications, Gearman and its PHP interface can prove quite useful to us in a variety of situations. In this talk, we’ll first have a look at what distributed processing exactly means, then look at what Gearman actually is and does, and how it can power up your application using the Gearman PHP extension. By showing different examples and application areas, you’ll get a good feeling of what Gearman is capable of and why it can be a valuable asset to your next PHP project.





  • Asking who knows gearman, uses gearman. Who uses PHP.













  • Brad Fitzpatrick

  • also packages available
  • Listening and management thread - only one
    I/O thread - can have many
    Processing thread - only one






  • jobstatus slide next
  • $status returns an array containing status information for the job corresponding to the supplied job handle.




  • end with jobs (single) vs tasks (parallel)


  • sqlite3 example next







  • http protocol next
  • X-Gearman-Unique: <uniqueid>
    X-Gearman-Priority: high|low
    X-Gearman-Background: true
    xmpp jabber protocol

  • telling about how we use gearman:
    project where we transform manuscripts -> pdf -> images


















  • High gear PHP with Gearman (phpDay 2010)

    1. 1. High gear PHP with Gearman Felix De Vliegher 14/05/2010 - phpDay Italia 2010
    2. 2. Whoami? • Consultant and software engineer at Ibuildings NL • Long time PHP developer • PHPBenelux • Interest in PHP QA, high performance & scalability • Belgium 2
    3. 3. Whoami? • Consultant and software engineer at Ibuildings NL • Long time PHP developer • PHPBenelux • Interest in PHP QA, high performance & scalability • Belgium (beer)! 3
    5. 5. What is Gearman? And why should I care about it? 5
    6. 6. GEARMAN! 6
    7. 7. A grain of truth The name is an anagram for “Manager,” since it dispatches jobs to be done, but does not do anything useful itself. - From Gearman website 7
    8. 8. What is Gearman? It’s an Application framework to distribute work 8
    9. 9. Distribute work? • Actual work is farmed out • Handled by a number of nodes • RPC • Distributed parallel processing • Shared nothing 9
    10. 10. Traditional architecture Application View Controller f Model 10
    11. 11. New architecture Model Client application Gearman Client API Gearman Job server (gearmand) Gearman Worker API Gearman Worker API Gearman Worker API Worker application Worker application Worker application 11
    12. 12. Terminology 12
    13. 13. Terminology Create jobs to be run and send Client them to a job server Register with a job server and Worker grab jobs to run Coordinates assignment from Job Server clients to workers, handles restarts 12
    16. 16. Application areas • Image resizing / generating • Log analysis and aggregation • Asynchronous queues • Map/Reduce • URL processing • Cache warm-up 13
    17. 17. Multiple client & worker API’s 14
    19. 19. Advantages • Speed up work • Parallel and asynchronous work • Doesn’t block your apache processes • Scales well • Architecture-based workload distributing • Legacy code 15
    20. 20. Once upon a time Gearman history 16
    21. 21. Once upon a time • 2005: http://brad.livejournal.com/2106943.html • Originally a Perl implementation • Created by Danga Interactive • Guys behind Memcached and MogileFS 17
    22. 22. Once upon a time • 2008: Rewrite in C by Brian Aker • PHP Extension by James Luedke • Gearman powers some of the largest sites around: • Digg: 45+ servers, 400K jobs/day • Yahoo: 60+ servers, 6M jobs/day • Netlog.com • Xing.com 18
    23. 23. Installing Gearman • Job Server: gearmand • Get it from https://launchpad.net/gearmand/ • extract, configure, make, make install: dev:/usr/local/src# wget http://launchpad.net/gearmand/trunk/0.10/+download/gearmand-0.10.tar.gz dev:/usr/local/src# tar -xzvf gearmand-0.10.tar.gz dev:/usr/local/src# cd gearmand-0.10/ dev:/usr/local/src/gearmand-0.10# ./configure --prefix=/usr/local/ dev:/usr/local/src/gearmand-0.10# make dev:/usr/local/src/gearmand-0.10# make install dev:/usr/local/src/gearmand-0.10# /usr/local/sbin/gearmand --help 19
    24. 24. Gearmand usage /usr/local/sbin/gearmand -d -u <user> -L 127.0.0.1 -p 7003 -d Start as daemon in background -u <user> Run as the specified user -L <host> Only listen on the specified host or IP -p <port> Listen on the specified port -t <threads> Number of threads to use -v(vv) Verbose (useful for debugging) 20
    25. 25. Commandline gearman • Client mode • ls | gearman -f processFiles • gearman -f processFiles < file • gearman -f processFiles "foo data" • Worker mode • gearman -w -f lineCount -- wc -l • gearman -w -c 100 -f doStuff ./script.sh • Example: dev:~/gearman# gearman -w -f foo -- grep GearmanClient & dev:~/gearman# cat demo.php | gearman -f foo $client= new GearmanClient(); 21
    26. 26. PHP Interface 22
    27. 27. PHP: 2 options • Pecl extension: $ pecl install channel://pecl.php.net/gearman-0.6.0 $ php -i | grep "gearman support" gearman support => enabled • Net_Gearman PEAR Library: $ pear install Net_Gearman • Net_Gearman_Job • Net_Gearman_Worker • Net_Gearman_Task • Net_Gearman_Set • Net_Gearman_Client 23
    28. 28. Simplest example Worker: Client: 24
    29. 29. Client API Setting up the client: 25
    30. 30. Client API Job priorities and synchronous vs asynchronous: 26
    31. 31. GearmanClient::jobStatus() array(4) { [0]=> Job is known? bool(true) [1]=> Job still running? bool(true) [2]=> Numerator int(2) [3]=> Denominator int(5) } 27
    32. 32. Notifying the client Client receiving the status notifications: dev:~/gearman# php -q client.php Running: true, numerator: 0, denominator: 2 Running: true, numerator: 1, denominator: 2 Running: false, numerator: 0, denominator: 0 28
    33. 33. Worker API Possible to add multiple servers: 29
    34. 34. Worker API Registering functions using a callback: Pass application data to the functions: # php -q client.php Sending job Count: 1: HELLO! Count: 2: WORLD! 30
    35. 35. Worker API Put it all together: 31
    36. 36. Notifying the client Status notifications: GearmanJob::sendStatus(int $numerator, int $denominator) GearmanJob::sendWarning(string $warning) GearmanJob::sendComplete(string $result) GearmanJob::sendFail(void) GearmanJob::sendException(string $exception) 32
    37. 37. Jobs vs Tasks 33
    38. 38. Callbacks Provide feedback on different moments in the process: GearmanClient::setDataCallback GearmanClient::setCompleteCallback GearmanClient::setCreatedCallback GearmanClient::setExceptionCallback GearmanClient::setFailCallback GearmanClient::setStatusCallback GearmanClient::setWorkloadCallback 34
    39. 39. What about persistence? By default, jobs are stored in memory (fast) gearmand --queue-type (-q) • libdrizzle • postgresql • libmemcached • libsqlite3 Only for background jobs 35
    40. 40. Sqlite3 example gearmand --queue-type=libsqlite3 --libsqlite3-db=/tmp/jobs.sqlite --libsqlite3-table=gearman_jobs Table structure: sqlite> CREATE TABLE gearman_jobs ( ...> unique_key TEXT PRIMARY KEY, ...> function_name TEXT, ...> priority INTEGER, ...> data BLOB ...> ); 36
    47. 47. How to do storage? Distributed Need to share storage (most of the time) Some options: • NFS • MogileFS • DRBD 37
    54. 54. So, can I kick out my crons? Not quite :) Scheduled execution (cron) delayed execution (at) */15 * * * * /usr/local/bin/gearman -f cronTask < /tmp/input.txt at functionality is considered for inclusion See http://groups.google.com/group/gearman/browse_thread/thread/b9891649fb08d16b# 38
    55. 55. HTTP protocol Start gearmand with -r http POST /reverse HTTP/1.1 Content-Length: 13 Hello phpDay! HTTP/1.0 200 OK X-Gearman-Job-Handle: H:gman:4 Content-Length: 13 Server: Gearman/0.9 !yaDphp olleH Use X-Gearman-* headers to modify job 39
    56. 56. Application areas 40
    57. 57. Image resizing: Client 41
    58. 58. Image resizing: worker 42
    59. 59. Map / Reduce Client Gearman Job server Map/Reduce Worker Client Client Client Gearman Job server Worker Worker Worker Worker Worker 43
    60. 60. Apache logging 44
    62. 62. Monitoring 45
    65. 65. Monitoring GearUp: Monitoring service for gearman ‣ No code yet, but looks promising ‣ http://launchpad.net/gearup Telnet monitoring: 46
    66. 66. Monitoring Supervisord: can manage workers [program:gearman-foobar-worker] command=/usr/local/bin/php -q /home/foo/workers/foobar.php process_name=%(program_name)s_%(process_num)02d autostart=true autorestart=true numprocs=100 redirect_stderr=true stdout_logfile=/var/log/gearman-foobar-worker.log stdout_logfile_maxbytes=5MB stdout_logfile_backups=10 Combine with memory_get_usage() to restart workers http://supervisord.org/ Alternative: http://github.com/brianlmoon/GearmanManager 47
    67. 67. Alternatives Most similar: Beanstalkd • PHP Client: pheanstalk • Web interface in Django • http://kr.github.com/beanstalkd/ Zend Server job queue • Has job scheduling and dependencies built in • Not free • http://www.zend.com/en/products/server/zend-server-job-queue 48
    68. 68. Questions ? Feedback: http://joind.in/1470 49
    73. 73. Ibuildings challenge! http://www.ibuildings.com/challenge “The Test Driven Challenge” Win an iPad! (when they’re available) We’re also hiring! 50
    74. 74. Links & sources Credits: - Gear man: http://agearman.com/ - Distributed computing: http://www.theleadblog.com/2009/06/lead-distribution-creating-right-sales.html - Old gears: http://decorate.pebblez.com - Army of elephpants: http://www.flickr.com/people/dragonbe/ Gearman online: - http://www.danga.com/gearman/ - http://gearman.org/ - http://pecl.php.net/package/gearman - IRC: #gearman on irc.freenode.net - ML: http://groups.google.com/group/gearman 51
    75. 75. Thank you! Contact details: Felix De Vliegher Email: felix@ibuildings.com Twitter: @felixdv