Gearman
From the Worker's Perspective
/usr/bin/whoami
• Brian Aker
• HP Fellow
• Previously MySQL, Slashdot, Sun Microsystems
What is Gearman?
“The way I like to think of Gearman is a massively distributed fork mechanism”
          -Joe Stump, Digg

“The Not Mechanical Turk”
          -Don MacAskill, SmugMug
do("resize_image")   ->   resize_image() { … return $image; }
(Livejournal)
Server
Provides Asynchronous and Synchronous Requests
Restarts Work
Durable Requests (MySQL, Postgres,...)
Gearman Protocol/HTTP
Epoch Scheduling
Logging
(also available, native Servers in Java, Erlang, and Perl)
Client

# Create our client object.
$gmclient= new GearmanClient();

# Add default server (localhost).
$gmclient->addServer();

# Run the "reverse" function synchronously and print its result.
$result= $gmclient->do("reverse", "Hello!");
echo "Success: $result\n";
Worker
# Create our worker object.
$gmw= new GearmanWorker();

# Add default server (localhost).
$gmw->addServer();

# Register the function, then wait for jobs.
$gmw->addFunction("reverse", "reverse_fn");
while ($gmw->work()) {…}
Worker Function


function reverse_fn($job)
{
  $workload= $job->workload();
  $result= strrev($workload);
  return $result;
}
Lots of functions...

$gmw->addFunction("resize", "resize_fn");
$gmw->addFunction("grep", "grep_fn");

$gmw->addFunction("fetch_url", "fetch_url");
Function
gearman_return_t fetch_url(gearman_job_st *job, void*)
{
  const char *workload= static_cast<const char *>(gearman_job_workload(job));
  size_t workload_size= gearman_job_workload_size(job);

  // Report progress: 0 of 100.
  gearman_job_send_status(job, 0, 100);

  …
  // Stream a partial result back to the client.
  gearman_job_send_data(job, chunk, sizeofchunk);

  …
  // Report progress: 50 of 100.
  gearman_job_send_status(job, 50, 100);

  …
  if (issue_warning)
    gearman_job_send_warning(job, "I'm sorry, Dave. I'm afraid I can't do that.", size);

  return GEARMAN_SUCCESS;
}
Worker Return

  GEARMAN_SUCCESS

  GEARMAN_FATAL

  GEARMAN_ERROR

  GEARMAN_SHUTDOWN
Your Client
  request()
      |
  Client API
  (C, PHP, Perl, Python, Java, Drizzle, ...)
      |
  Server
  (Network, Highly Available, Fault Tolerant)
      |
  Worker API
  (C, PHP, Perl, Python, Java, ...)
      |
Your Worker
  function()
Your Client
  request()                                      resize_image("…");
      |
  Client API                                     [provided by Gearman]
  (C, PHP, Perl, Python, Java, Drizzle, ...)
      |
  Server                                         [provided by Gearman]
  (Network, Highly Available, Fault Tolerant)
      |
  Worker API                                     [provided by Gearman]
  (C, PHP, Perl, Python, Java, ...)
      |
Your Worker
  function()                                     resize_image() { …; return resized; }
CPU?   event()
multiple languages
Connectors?
• C/C++
• PHP
• Python
• Java
• MySQL and Postgres
• ....
Other items of interest?
• Work status requests.
• Chunked data.
• Exception handling.
• Message sizes up to 4 GB.
• Threaded server.
• Coalescence (the stealth killer feature)
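
Coalescence is easiest to see with a toy model. The sketch below assumes the usual description of the feature: requests that carry the same unique key are merged, the work runs once, and every waiting client receives the same result. `CoalescingQueue`, `submit`, and `run_all` are hypothetical names invented for this illustration, not part of gearmand or libgearman.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory model of request coalescing; not Gearman's server code.
class CoalescingQueue {
public:
  // Register a request under a unique key. Returns true if this request
  // created a new job, false if it was attached to one already queued.
  bool submit(const std::string& unique_key, int client_id) {
    std::vector<int>& clients = waiting_[unique_key];
    const bool created = clients.empty();
    clients.push_back(client_id);
    return created;
  }

  // Run the work once per key and hand the same result to every client
  // waiting on that key. Returns a map of client id to result.
  std::map<int, std::string> run_all(
      const std::function<std::string(const std::string&)>& work) {
    std::map<int, std::string> delivered;
    for (const auto& entry : waiting_) {
      const std::string result = work(entry.first);
      ++executions_;
      for (int client : entry.second) delivered[client] = result;
    }
    waiting_.clear();
    return delivered;
  }

  // Number of times the work function actually ran.
  std::size_t executions() const { return executions_; }

private:
  std::map<std::string, std::vector<int>> waiting_;
  std::size_t executions_ = 0;
};
```

If two clients submit the key "thumb:42" and a third submits "thumb:43", the work function runs twice rather than three times, which is the saving the bullet alludes to.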
Namespaces
Foo::resize_image()   Acme::resize_image()
  {                     {
    …                     …
    return $image;        return $image;
}                     }
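
One way to picture the namespace slide is a lookup table in which the namespace is part of the key, so `Foo` and `Acme` can each register their own `resize_image`. The sketch below is only a local model under that assumption; `FunctionRegistry`, `add`, and `call` are hypothetical names and do not describe how Gearman implements namespaces.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

using Handler = std::function<std::string(const std::string&)>;

// Hypothetical registry: the namespace is folded into the lookup key, so two
// teams can both register "resize_image" without colliding.
class FunctionRegistry {
public:
  void add(const std::string& ns, const std::string& name, Handler handler) {
    handlers_[ns + "::" + name] = std::move(handler);
  }

  // Dispatch to the handler registered under ns::name, or throw if none exists.
  std::string call(const std::string& ns, const std::string& name,
                   const std::string& workload) const {
    const auto it = handlers_.find(ns + "::" + name);
    if (it == handlers_.end()) {
      throw std::out_of_range("no such function: " + ns + "::" + name);
    }
    return it->second(workload);
  }

private:
  std::map<std::string, Handler> handlers_;
};
```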
Better Map Reduce?

map(list[…], reduce());

                 ->   reduce() {…}
map() {…}        ->   reduce() {…}
                 ->   reduce() {…}
Map @#$@# ?
Partitioning

  {A...K}   ->   find()
  {L...Q}   ->   find()
  {R...Z}   ->   find()
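
The three ranges on this slide can be sketched as a routing function that picks a partition from the first letter of a key. This is a minimal illustration, not code from the talk: `partition_for`, `partition_keys`, and the "other" bucket for keys outside A to Z are assumptions.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: pick a partition label from the first letter of a key,
// using the three ranges shown on the slide ({A...K}, {L...Q}, {R...Z}).
std::string partition_for(const std::string& key) {
  if (key.empty()) return "other";
  const char c =
      static_cast<char>(std::toupper(static_cast<unsigned char>(key[0])));
  if (c >= 'A' && c <= 'K') return "A-K";
  if (c >= 'L' && c <= 'Q') return "L-Q";
  if (c >= 'R' && c <= 'Z') return "R-Z";
  return "other";  // Assumption: anything outside A-Z goes to a catch-all.
}

// Group keys by partition so that each group could be handed to its own
// find() worker.
std::map<std::string, std::vector<std::string>> partition_keys(
    const std::vector<std::string>& keys) {
  std::map<std::string, std::vector<std::string>> groups;
  for (const std::string& key : keys) {
    groups[partition_for(key)].push_back(key);
  }
  return groups;
}
```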
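Partitioning

#include <libgearman/gearman.h>

gearman_return_t split_worker(gearman_job_st *job, void* /* context */)
{
  const char *workload= static_cast<const char *>(gearman_job_workload(job));
  size_t workload_size= gearman_job_workload_size(job);

  // Send each space- or NUL-delimited word back as its own chunk.
  const char *chunk_begin= workload;
  for (size_t x= 0; x < workload_size; x++)
  {
    if (workload[x] == 0 or workload[x] == ' ')
    {
      gearman_job_send_data(job, chunk_begin, workload +x -chunk_begin);
      chunk_begin= workload +x +1;
    }
  }

  // Send whatever follows the last delimiter.
  if (chunk_begin < workload +workload_size)
  {
    gearman_job_send_data(job, chunk_begin, workload +workload_size -chunk_begin);
  }

  return GEARMAN_SUCCESS;
}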
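
To see what chunks such a splitter produces without a running server, the loop can be modeled on plain strings. The sketch below is a stand-alone approximation: `split_chunks` is a hypothetical name, it collects chunks in a vector instead of calling `gearman_job_send_data`, and it also keeps any text after the last delimiter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-alone model of the splitting loop: collect chunks instead of sending
// them to a client.
std::vector<std::string> split_chunks(const std::string& workload) {
  std::vector<std::string> chunks;
  std::size_t chunk_begin = 0;
  for (std::size_t x = 0; x < workload.size(); ++x) {
    if (workload[x] == '\0' || workload[x] == ' ') {
      // A delimiter closes the current chunk (which may be empty).
      chunks.push_back(workload.substr(chunk_begin, x - chunk_begin));
      chunk_begin = x + 1;
    }
  }
  // Keep whatever follows the last delimiter.
  if (chunk_begin < workload.size()) {
    chunks.push_back(workload.substr(chunk_begin));
  }
  return chunks;
}
```

Note that two delimiters in a row yield an empty chunk, which mirrors the zero-length send the loop on the slide would perform in the same situation.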
Aggregation

  $result   ->   + result
  $result   ->   + result
  $result   ->   + result
                 = sum result
Aggregation

#include <string>
#include <libgearman/gearman.h>

gearman_return_t cat_aggregator(gearman_aggregator_st *,
                                gearman_task_st *task,
                                gearman_result_st *result)
{
  std::string string_value;

  // Append every task's result, following the task list to its end.
  do
  {
    gearman_result_st *result_ptr= gearman_task_result(task);
    string_value.append(gearman_result_value(result_ptr),
                        gearman_result_size(result_ptr));
  } while ((task= gearman_next(task)));

  // Store the concatenation as the aggregate result.
  gearman_result_store_value(result, string_value.c_str(),
                             string_value.size());

  return GEARMAN_SUCCESS;
}
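
The concatenating aggregator boils down to walking a linked list of task results and appending each one. A minimal stand-alone sketch of that idea follows; `Task` and `concatenate_results` are hypothetical stand-ins for the libgearman types and are not part of its API. Unlike the `do`/`while` form, this version also accepts an empty list.

```cpp
#include <string>

// Hypothetical stand-in for a task that holds one result and links to the next.
struct Task {
  std::string result;
  const Task* next;
};

// Walk the task list and concatenate every result, in list order.
std::string concatenate_results(const Task* task) {
  std::string combined;
  for (; task != nullptr; task = task->next) {
    combined.append(task->result);
  }
  return combined;
}
```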
Do we have to partition?
(What other tricks exist!)
Pipeline
Store()        Resize()   Publish()
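
A pipeline chains stages so that the output of one becomes the input of the next. The sketch below shows only that chaining, locally and synchronously; in a Gearman deployment each stage would presumably be a separate worker function. `Stage`, `run_pipeline`, and the three stage bodies are invented for illustration.

```cpp
#include <functional>
#include <string>
#include <vector>

using Stage = std::function<std::string(const std::string&)>;

// Feed the output of each stage into the next, in order.
std::string run_pipeline(const std::vector<Stage>& stages, std::string payload) {
  for (const Stage& stage : stages) {
    payload = stage(payload);
  }
  return payload;
}

// Placeholder stages named after the slide; the bodies are made up and only
// tag the payload so the order of execution is visible.
std::string store_image(const std::string& image) { return "stored(" + image + ")"; }
std::string resize_image(const std::string& image) { return "resized(" + image + ")"; }
std::string publish_image(const std::string& image) { return "published(" + image + ")"; }
```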
Future
0.32 Released
Custom Logging Plugins
Client/Worker Configuration
Extended Administrative Commands
SSL
Status lookup via Unique Identifier
Job Result Cache
Uplift!
Gearman.info
• Gearman.org (...)
• http://launchpad.net/gearmand/
• twitter: brianaker
• blog: blog.krow.net

Editor's Notes

  • #24 It makes no difference whether a client or worker runs a different operating system from the server. Clients can even benefit from this architecture by requesting tasks that are unavailable on their own operating system but easy to perform on the OS of one of the workers.
  • #25 Language, the greatest divider in recent technology, is no longer an obstacle. Clients can keep coding in the language they know best, while workers use libraries and classes that are available only in specific languages, without the clients having to deal with the details of those systems.