Writing Prefork Workers / Servers Cybozu Labs, Inc. Kazuho Oku
Transcript of "Writing Prefork Workers / Servers"

  1. Writing Prefork Workers / Servers (Cybozu Labs, Inc., Kazuho Oku)
  2. Job workers: the application area
      - Essential for large-scale webapps
          - to communicate with other services
          - for synchronizing data between storages
          - for resizing images, …
      (slide footer: Oct 15 2010, Writing Prefork Workers / Servers)
  3. Job workers: prefork vs. event-driven
      - Prefork-based
          - good for writing application servers, transcoders (image resizing, etc.), DB-based job workers
          - + easy to write and to maintain
          - − consumes more memory (mitigated by copy-on-write)
  4. Job workers: prefork vs. event-driven (2)
      - Event-driven-based
          - good for writing chat servers / clients (IRC), Comet-based applications
          - + easy to implement interaction between the tasks (connections)
          - + consumes less memory
          - − difficult to write and to maintain
          - − most modules cannot be called asynchronously
  5. Job workers: prefork vs. event-driven (3)
      - Starting with prefork is generally a good choice
          - switch to an event-driven approach later if performance matters
          - the exception: a server that implements interaction between the clients
  6. Agenda
      - Parallel::Prefork
          - for managing prefork'ed processes
      - Server::Starter
          - for hot-deploying servers
      - Parallel::Scoreboard
          - for monitoring prefork-based workers / servers
  7. Parallel::Prefork
  8. Parallel::ForkManager – the good old way
      # taken from the POD
      my $pm = new Parallel::ForkManager($MAX_PROCESSES);
      foreach $data (@all_data) {
          # Forks and returns the pid for the child:
          $pm->start and next;
          ... do some work with $data in the child process ...
          $pm->finish; # Terminates the child process
      }
  9. Parallel::ForkManager – the problem
      - No support for shutdown
          - designed for operating on already-existing data in parallel, not for receiving and handling data concurrently
          - child processes are not killed by parent
          - no clean shutdown or restart
  10. Parallel::Prefork – a signal-savvy manager
      my $pm = Parallel::Prefork->new({
          max_workers  => $MAX_PROCESSES,
          trap_signals => {
              TERM => 'TERM', # send TERM to children when parent gets TERM
          },
      });
      while ($pm->signal_received ne 'TERM') {
          $pm->start and next;
          ... do some work within the child process ...
          $pm->finish;
      }
      $pm->wait_all_children(); # wait for all children to exit
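To make the mechanics behind slide 10 concrete, here is a minimal sketch of such a manager loop using only core Perl. It is not Parallel::Prefork's implementation; `run_prefork` and its arguments are hypothetical names. It keeps a fixed number of children alive, forwards TERM to them, and then reaps them.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper, not part of Parallel::Prefork: keep $max children
# running $work until the parent receives TERM, then forward TERM to the
# children and reap them. Returns the number of children reaped.
sub run_prefork {
    my ($max, $work) = @_;
    my $parent = $$;
    my (%children, $got_term);
    my $reaped = 0;

    local $SIG{TERM} = sub {
        POSIX::_exit(0) if $$ != $parent;   # child signalled before resetting its handler
        $got_term = 1;
        kill 'TERM', keys %children;        # lets the blocking waitpid below return
    };

    until ($got_term) {
        # refill the pool
        while (!$got_term && keys(%children) < $max) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {                # child process
                $SIG{TERM} = 'DEFAULT';
                $work->();
                POSIX::_exit(0);
            }
            $children{$pid} = 1;
        }
        last if $got_term;
        my $pid = waitpid(-1, 0);           # block until any child exits
        if ($pid > 0) { delete $children{$pid}; $reaped++ }
    }

    # cover children forked just before TERM arrived, then reap everything
    kill 'TERM', keys %children;
    while (%children) {
        my $pid = waitpid(-1, 0);
        last if $pid <= 0;
        delete $children{$pid};
        $reaped++;
    }
    return $reaped;
}
```

A call such as `run_prefork(4, sub { ... })` would block until the process receives TERM. The sketch ignores the small race between checking the flag and entering `waitpid`, which a production manager has to handle.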
  11. Parallel::Prefork – writing a Gearman worker
      my $pm = Parallel::Prefork->new(...);
      my $worker = Gearman::Worker->new;
      ...
      while ($pm->signal_received ne 'TERM') {
          $pm->start and next;
          # gracefully exit the child process when parent delegates TERM
          my $stop_worker = undef;
          local $SIG{TERM} = sub { $stop_worker = 1 };
          $worker->work(stop_if => sub { $stop_worker });
          $pm->finish;
      }
      $pm->wait_all_children(); # wait for all children to exit
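The graceful-exit idiom on slide 11 (the handler only sets a flag, and the loop checks it between jobs) can be tried without Gearman. The sketch below assumes caller-supplied callbacks in place of a real job queue; `work_until_term`, `$next_job` and `$handle` are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical worker loop: $next_job returns the next job (or undef when
# there is none), $handle processes one job. TERM never interrupts a job
# half-way; it only prevents the next one from starting.
sub work_until_term {
    my ($next_job, $handle) = @_;
    my $stop = 0;
    local $SIG{TERM} = sub { $stop = 1 };   # record the request, nothing else
    my $done = 0;
    until ($stop) {
        my $job = $next_job->();
        last unless defined $job;
        $handle->($job);                    # runs to completion even if TERM arrives
        $done++;
    }
    return $done;                           # number of jobs completed
}
```

Because the handler does no work itself, the job being processed always finishes, which is what makes the shutdown graceful.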
  12. Parallel::Prefork – writing a prefork server
      my $pm = Parallel::Prefork->new(...);
      my $listen_sock = IO::Socket::INET->new(
          LocalAddr => $hostport,
          Listen    => Socket::SOMAXCONN,
          ReuseAddr => 1,
      );
      while ($pm->signal_received ne 'TERM') {
          $pm->start and next;
          while (my $socket = $listen_sock->accept) {
              ... communicate with the client ...
          }
          $pm->finish;
      }
      $pm->wait_all_children(); # wait for all children to exit
  13. Parallel::Prefork – advanced topics
      - Graceful reconfiguration
          - possible, but often unnecessary, since…
          - for workers, short downtime is acceptable
              - …, or we can run multiple instances of worker processes to hide the downtime
          - for servers, we need hot-deploy to update code
              - and the same technique can be used for changing configuration
      - Changing # of worker processes
          - in general a wrong idea :-p
  14. Parallel::Prefork – graceful reconfiguration
      my $pm = Parallel::Prefork->new({
          max_workers  => $MAX_PROCESSES,
          trap_signals => {
              TERM => 'TERM', # send TERM to children when parent gets TERM
              HUP  => 'TERM', # send TERM to children on graceful reconf.
          },
      });
      while ($pm->signal_received ne 'TERM') {
          reload_config() if $pm->signal_received eq 'HUP';
          $pm->start and next;
          ... do some work within the child process ...
          $pm->finish;
      }
      $pm->wait_all_children(); # wait for all children to exit
  15. Parallel::Prefork – dynamic scaling
      my $pm = Parallel::Prefork::SpareWorkers->new({
          max_workers       => 40,
          min_spare_workers => 5,
          max_spare_workers => 10,
          ...
      });
      while ($pm->signal_received ne 'TERM') {
          $pm->start and next;
          ... # setup signal handlers, etc.
          while (my $sock = $listener->accept()) {
              $pm->set_status('A'); # set status of the worker proc. to active
              ...
              $pm->set_status(Parallel::Prefork::SpareWorkers::STATUS_IDLE);
          }
          $pm->finish();
      }
      ...
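The three limits on slide 15 suggest a simple policy: add workers when too few are idle, retire workers when too many are idle, and never exceed the total cap. The function below is an illustrative sketch of that general policy only. It is an assumption, not the algorithm Parallel::Prefork::SpareWorkers actually uses, and `spare_worker_delta` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical policy function: given the current worker counts and the
# limits, return how many workers to add (positive) or retire (negative).
sub spare_worker_delta {
    my (%arg) = @_;
    my ($total, $idle) = @arg{qw(total idle)};

    if ($idle < $arg{min_spare_workers}) {
        # too few idle workers: grow, but never beyond max_workers
        my $want = $arg{min_spare_workers} - $idle;
        my $room = $arg{max_workers} - $total;
        $room = 0 if $room < 0;
        return $want < $room ? $want : $room;
    }
    if ($idle > $arg{max_spare_workers}) {
        # too many idle workers: shrink down to the upper spare limit
        return -($idle - $arg{max_spare_workers});
    }
    return 0;   # idle count is within [min_spare, max_spare]
}
```

With the limits from the slide (40 / 5 / 10), 38 workers of which 1 is idle would yield +2, since only two slots remain below the cap.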
  16. Server::Starter
  17. Hot deployment
      - what is it?
          - upgrading web application without restarting the application server
      - the goals
          - no downtime
          - no resource leaks
          - fail-safe
  18. Old techniques to restart a webapp. server
      - restart the interpreter (mod_perl)
          - pros: graceful
          - cons: XS may cause resource leaks, service-down on deployment failure, cannot implement in pure-perl
      - bind to unix socket (FastCGI)
          - pros: graceful, fail-safe
          - cons: only useful for local-machine communication
  19. Old techniques to restart a webapp. server (2)
      - exec(myself) (Net::Server)
          - pros: graceful, pure-perl
          - cons: file descriptor leaks, service-down on deployment failure
  20. The restart method of Server::Starter
      - a superdaemon for hot-deploying TCP servers
          - superdaemon binds to TCP ports, then spawns the application server
      [Diagram: the superdaemon listens and spawns app. servers (fork & exec); each app. server runs accept and app. logic; on SIGHUP a new app. server is spawned (fork & exec) and the old one receives SIGTERM]
  21. Server::Starter – no downtime
      - listening socket shared by old and new generation app. servers
      - old app. servers receive SIGTERM after new servers start
      [Diagram: same as slide 20]
  22. Server::Starter – no resource leaks
      - no chance of resource leaks
          - every generation of app. servers spawned from superdaemon
      [Diagram: same as slide 20]
  23. Server::Starter – fail safe
      - old app. server retired if and only if the new app. server starts up successfully
          - service continues even if the updated app. server fails to start, in cases like missing modules, etc.
          - a good practice is to do self-testing in the app. server before starting to serve client connections
              - is also an efficient way to preload modules
      [Diagram: same as slide 20]
  24. Server::Starter – the code
      # only change the code that listens to a port
      + if (defined $ENV{SERVER_STARTER_PORT}) {
      +     ($hostport, my $fd) = %{Server::Starter::server_ports()};
      +     $hostport = "0.0.0.0:$hostport"
      +         unless $hostport =~ /:/;
      +     $listen_sock = IO::Socket::INET->new(Proto => 'tcp');
      +     $listen_sock->fdopen($fd, 'w')
      +         or die "failed to bind to listening socket:$!";
      + } else {
            $listen_sock = IO::Socket::INET->new(
                LocalAddr => $hostport,
                Listen    => Socket::SOMAXCONN,
                ReuseAddr => 1,
            );
      + }
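The superdaemon hands the inherited listening sockets to the app. server through the SERVER_STARTER_PORT environment variable. As far as the Server::Starter documentation describes it, the value is a semicolon-separated list of `port=fd` or `host:port=fd` pairs. The parser below is a sketch under that assumption; `parse_server_starter_port` is a hypothetical stand-in for what `Server::Starter::server_ports()` returns, not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical parser: turn a value such as "0.0.0.0:80=3;8080=4" into a
# hashref mapping "host:port" (or bare "port") to a file descriptor number.
sub parse_server_starter_port {
    my ($value) = @_;
    my %ports;
    for my $pair (split /;/, $value) {
        my ($hostport, $fd) = split /=/, $pair, 2;
        die "malformed entry: $pair" unless defined $fd && $fd =~ /\A\d+\z/;
        $ports{$hostport} = $fd;
    }
    return \%ports;
}
```

The app. server then wraps each descriptor in a socket object with `fdopen`, as on slide 24, instead of binding the port itself. That is why old and new generations can share one listening socket.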
  25. Server::Starter – integration w. daemontools
      The "run" script:
          #! /bin/sh
          exec start_server --port=80 -- my_server.pl
      To start the server:
          svc -u /service/my_server
      To stop the server:
          svc -d /service/my_server
      To gracefully restart the updated version of the server:
          svc -h /service/my_server
  26. Parallel::Scoreboard
  27. Monitoring the workers / servers
      - is essential for…
          - resource provisioning
          - fault management / fixing bugs
      - but how?
          - load average is not an answer
              - the bottleneck might not be CPU or disk I/O
          - logging is good for tracking down bugs, but difficult to use for monitoring
  28. Parallel::Scoreboard – the caveats
      - a building block for monitoring processes
          - for visualization (like the mod_status of Apache)
          - for creating automated monitoring systems
      - flexible
          - any number of processes can be monitored
          - any information can be stored for monitoring
          - any process can monitor any set of processes
              - no relationship between the monitoring processes and monitored processes is necessary
  29. Parallel::Scoreboard – under the hood
      - one file per monitored process
          - <base_dir>/status_<pid>
      - uses flock for GC
          - monitored process locks its status file while alive
          - monitoring process removes unlocked status files
      - uses checksum for detecting r/w collision
          - monitoring process retries on collision
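The flock-based garbage collection on slide 29 can be sketched with core Perl alone. This is an illustration of the technique, not Parallel::Scoreboard's code: both function names are hypothetical, the checksum step is left out, and it assumes Perl's `flock` maps to flock(2), as it does on Linux and the BSDs.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use IO::Handle;

# Monitored side (sketch): write the status to <base_dir>/status_<id> and
# keep the file locked. The caller must keep the returned handle open for
# as long as it is alive. $id would normally be the pid ($$).
sub open_status_file {
    my ($base_dir, $id, $status) = @_;
    my $path = "$base_dir/status_$id";
    open my $fh, '>', $path or die "cannot open $path: $!";
    flock $fh, LOCK_EX      or die "cannot lock $path: $!";
    $fh->autoflush(1);
    print {$fh} $status;
    return $fh;
}

# Monitoring side (sketch): a status file that can be locked without
# blocking has no living owner, so it is removed. Returns removed paths.
sub remove_stale_status_files {
    my ($base_dir) = @_;
    my @removed;
    opendir my $dh, $base_dir or die "cannot open $base_dir: $!";
    for my $name (sort grep { /^status_/ } readdir $dh) {
        my $path = "$base_dir/$name";
        open my $fh, '+<', $path or next;
        if (flock $fh, LOCK_EX | LOCK_NB) {   # lock obtained: owner is gone
            unlink $path;
            push @removed, $path;
        }
        close $fh;
    }
    closedir $dh;
    return @removed;
}
```

The kernel releases the lock when the owning process exits for any reason, including a crash, so stale files are detected without heartbeats or pid checks.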
  30. Parallel::Scoreboard – monitored process
      # prepare
      my $scoreboard = Parallel::Scoreboard->new(
          base_dir => '/tmp/scoreboard',
      );
      # set the initial status. Note that the "update" method should only
      # be called from worker processes
      $scoreboard->update('initializing');
      # the main loop
      while (1) {
          $scoreboard->update('waiting for task');
          my $task = get_task();
          $scoreboard->update('handling ' . $task->{id});
          handle_task($task);
      }
      $scoreboard->update('exiting');
  31. Parallel::Scoreboard – integrating with P::Prefork
      my $pm = Parallel::Prefork->new(...);
      my $scoreboard = Parallel::Scoreboard->new(...);
      my $worker = Gearman::Worker->new;
      $worker->register_function(handle_job => sub {
          $scoreboard->update('handling ' . ...);
          try {
              ... handle the job ...
          } finally {
              $scoreboard->update('idle');
          }
      });
      while ($pm->signal_received ne 'TERM') {
          $pm->start and next;
          $scoreboard->update('idle'); # just entered the child process
          ...
  32. Parallel::Scoreboard – monitoring process
      # prepare
      my $scoreboard = Parallel::Scoreboard->new(
          base_dir => '/tmp/scoreboard',
      );
      # read the status, and print
      my $stats = $scoreboard->read_all();
      for my $pid (sort { $a <=> $b } keys %$stats) {
          print "PID:$pid is $stats->{$pid}\n";
      }
  33. Parallel::Scoreboard – monitoring by HTTP
      - Parallel::Scoreboard::PSGI::App
          - mod_status-like display
      - Parallel::Scoreboard::PSGI::App::JSON
          - scoreboard sent out in JSON
              - expects the status to be a string or a JSON array / object
          - useful for gathering the scoreboards of many machines
          - also useful for auto-scaling
              - automatically power-up / down the app. servers depending on the output of the scoreboards
  34. Parallel::Scoreboard – monitoring by HTTP (2)
      plackup --port 8080 -e 'use Parallel::Scoreboard::PSGI::App::JSON;
          Parallel::Scoreboard::PSGI::App::JSON->new(
              scoreboard => Parallel::Scoreboard->new(
                  base_dir => "/tmp/my_scoreboard",
              ),
          )->to_app'
  35. Summary
  36. Summary
      - use Parallel::Prefork when writing job workers / servers
      - use Server::Starter when hot-deployment is necessary
      - use Parallel::Scoreboard to create monitors for your job workers / servers