Multi-tasking in PHP



Wednesday, December 5, 12
About Me
                   •        Formerly CTO for
                            Company52

                   •        Currently work at
                            Brandmovers on
                            Northside Drive

                   •        Self-taught

                   •        Full-time geek since
                            2007




Wednesday, December 5, 12
Wednesday, December 5, 12
Use Cases
                •      System resources are
                       not the bottleneck
                •      Batch processing
                       involving an API:
                      •     E-Mail
                      •     Geocoding
                      •     Electronic billing
                •      Daemons


Wednesday, December 5, 12
Alternatives

                   • Gearman
                   • curl_multi_*
                   • Other scripting languages


Wednesday, December 5, 12
Theory



Wednesday, December 5, 12
Two ways to multi-task
                             Multi-processing        Multi-threading

                             Separate memory          Shared memory

                             Errors are isolated   Errors are not isolated

                            Separate permissions     Same permissions

                                Linux/UNIX               Windows


Wednesday, December 5, 12
Multiprocessing                                     Multithreading

             Process 1        program
                                                                Process 1          program



                                                                  Thread 1
                 state:         virtual
                                              memory
                PC, stack     processor                              state:          virtual
                                                                    PC, stack      processor


             Process 2        program                             Thread 2
                                                                     state:          virtual
                                                                    PC, stack      processor
                                                                                                 memory
                 state:         virtual
                                              memory
                PC, stack     processor


                                                                  Thread m
                                                                     state:          virtual
             Process n        program
                                                                    PC, stack      processor




                 state:         virtual
                                              memory
                PC, stack     processor
                                                                Process n


                                               Courtesy www.fmc-modeling.org

Wednesday, December 5, 12
Multiprocessing

                   • “The simultaneous execution of two or
                            more programs by separate CPUs under
                            integrated control.”
                   • Clones the entire process, except resources
                   • Copy-on-write memory

Wednesday, December 5, 12
Forking




                            Diagram courtesy cnx.org



Wednesday, December 5, 12
Child Process

                   • A cloned copy of a parent process
                   • Receives a new process ID and a parent
                            process ID
                   • Does some work
                   • Dies

Wednesday, December 5, 12
...sort of.




                            Photo Credit: Christopher Brian (2011 Toronto Zombie Walk)



Wednesday, December 5, 12
Parent Responsibilities

                   • Reproduction
                   • Monitors child process status
                   • “Reap” zombie processes


Wednesday, December 5, 12
Process Signals
                             Signal     Description
                            SIGCHLD   Child process died

                            SIGINT     User Interrupt

                            SIGTERM       Terminate

                            SIGKILL   Forcibly terminate



Wednesday, December 5, 12
PHP Implementation



Wednesday, December 5, 12
Requirements
                   •        Unix-like operating system
                   •        PHP 4.1+
                   •        PHP PCNTL extension
                            (compile with --enable-pcntl)
                   •        PHP Semaphore extension, optional
                            (--enable-sysvsem, --enable-sysvshm, --enable-sysvmsg)
                   •        Plenty of memory
                   •        Multiple CPU cores



Wednesday, December 5, 12
Overview
                   1. Define signal handlers
                   2. Fetch a dataset
                   3. Fork off one child process for each item
                   4. Stop forking when a threshold is reached, and sleep
                   5. Reap a child process whenever SIGCHLD is received
                   6. If there’s more work to do, fork more processes
                   7. When all child processes have been reaped,
                      terminate


Wednesday, December 5, 12
declare(ticks = 1);

                 // Setup our signal handlers
                 pcntl_signal(SIGTERM, "signal_handler");
                 pcntl_signal(SIGINT,  "signal_handler");
                 pcntl_signal(SIGCHLD, "signal_handler");




Wednesday, December 5, 12
function signal_handler($signal)
                 {
                 	

 switch ($signal)
                 	

 {
                 	

 	

 case SIGINT:
                 	

 	

 case SIGTERM:
                 	

 	

 	

 // kill all child processes
                 	

 	

 	

 exit(0);
                 	

 	

 case SIGCHLD:
                 	

 	

 	

 // reap a child process
                 	

 	

 	

 reap_child();
                 	

 	

 break;
                 	

 }
                 }


Wednesday, December 5, 12
$pid = pcntl_fork();

                  switch($pid)
                  {
                  	

 case 0:
                  	

 	

 // Child process
                  	

 	

 call_user_func($callback, $data);
                  	

 	

 posix_kill(posix_getppid(), SIGCHLD);
                  	

 	

 exit;
                  	

 case -1:
                  	

 	

 // Parent process, fork failed
                  	

 	

 throw new Exception("Out of memory!");
                  	

 default:
                  	

 	

 // Parent process, fork succeeded
                  	

 	

 $processes[$pid] = TRUE;
                  }

Wednesday, December 5, 12
Repeat for
                            each unit of work


Wednesday, December 5, 12
function reap_child()
             {
             	

 // Check if any child process has terminated,
             	

 // and if so remove it from memory
             	

 $pid = pcntl_wait($status, WNOHANG);
             	

 if ($pid < 0)
             	

 {
             	

 	

 throw new Exception("Out of memory");
             	

 }
             	

 elseif ($pid > 0)
             	

 {
             	

 	

 unset($processes[$pid]);
             	

 }	

             }

Wednesday, December 5, 12
Demo Time!
                            http://gist.github.com/4212160




Wednesday, December 5, 12
Wednesday, December 5, 12
DON’T:
                   •        DON’T fork a web process (CLI only!)
                   •        DON’T overload your system
                   •        DON’T open resources before forking
                   •        DO respect common POSIX signals
                   •        DO remove zombie processes
                   •        DO force new database connections in children
                            mysql_reconnect($s, $u, $p, TRUE);


Wednesday, December 5, 12
Challenges



Wednesday, December 5, 12
Race Conditions
                   •        A logic bug where the result is affected by the
                            sequence or timing of uncontrollable events
                   •        Adding debug logic can change timing
                   •        Dirty reads
                   •        Lost data
                   •        Unpredictable behavior
                   •        Deadlocks, hanging, crashing


Wednesday, December 5, 12
Wednesday, December 5, 12
Wednesday, December 5, 12
Solutions

                        • Handle I/O in the parent process
                            exclusively
                        • Manage resources with semaphores
                            and/or mutexes




Wednesday, December 5, 12
Semaphores

                   • Semaphore = atomically updated counter
                   • Mutex = binary semaphore with ownership
                   • PHP: sem_get(), sem_release()
                   • Transactional databases use semaphores

Wednesday, December 5, 12
Deadlocks




                                        Image credit: csunplugged.org




Wednesday, December 5, 12
Bonus Slides



Wednesday, December 5, 12
Shared Memory
                   •        Advanced inter-process communication

                   •        Pass data back to the parent process

                   •        PHP Shared Memory extension
                            (--enable-shmop)

                   •        PHP System V Shared Memory extension
                            (--enable-sysvshm)

                        •     More robust

                        •     Compatible with other languages


Wednesday, December 5, 12
Daemonization

                   • Fork, kill parent
                   • Orphaned child process continues running
                   • Signal and error handling are critical
                   • Server daemons usually fork child
                            processes to handle requests



Wednesday, December 5, 12
Wednesday, December 5, 12
Thank You!


                              @compwright
                            http://bit.ly/atlphpm




Wednesday, December 5, 12

Multi-tasking in PHP

  • 1.
  • 2.
    About Me • Formerly CTO for Company52 • Currently work at Brandmovers on Northside Drive • Self-taught • Full-time geek since 2007 Wednesday, December 5, 12
  • 3.
  • 4.
    Use Cases • System resources are not the bottleneck • Batch processing involving an API: • E-Mail • Geocoding • Electronic billing • Daemons Wednesday, December 5, 12
  • 5.
    Alternatives • Gearman • curl_multi_* • Other scripting languages Wednesday, December 5, 12
  • 6.
  • 7.
    Two ways tomulti-task Multi-processing Multi-threading Separate memory Shared memory Errors are isolated Errors are not isolated Separate permissions Same permissions Linux/UNIX Windows Wednesday, December 5, 12
  • 8.
    Multiprocessing Multithreading Process 1 program Process 1 program Thread 1 state: virtual memory PC, stack processor state: virtual PC, stack processor Process 2 program Thread 2 state: virtual PC, stack processor memory state: virtual memory PC, stack processor Thread m state: virtual Process n program PC, stack processor state: virtual memory PC, stack processor Process n Courtesy www.fmc-modeling.org Wednesday, December 5, 12
  • 9.
    Multiprocessing • “The simultaneous execution of two or more programs by separate CPUs under integrated control.” • Clones the entire process, except resources • Copy-on-write memory Wednesday, December 5, 12
  • 10.
    Forking Diagram courtesy cnx.org Wednesday, December 5, 12
  • 11.
    Child Process • A cloned copy of a parent process • Receives a new process ID and a parent process ID • Does some work • Dies Wednesday, December 5, 12
  • 12.
    ...sort of. Photo Credit: Christopher Brian (2011 Toronto Zombie Walk) Wednesday, December 5, 12
  • 13.
    Parent Responsibilities • Reproduction • Monitors child process status • “Reap” zombie processes Wednesday, December 5, 12
  • 14.
    Process Signals Signal Description SIGCHLD Child process died SIGINT User Interrupt SIGTERM Terminate SIGKILL Forcibly terminate Wednesday, December 5, 12
  • 15.
  • 16.
    Requirements • Unix-like operating system • PHP 4.1+ • PHP PCNTL extension (compile with --enable-pcntl) • PHP Semaphore extension, optional (--enable-sysvsem, --enable-sysvshm, --enable-sysvmsg) • Plenty of memory • Multiple CPU cores Wednesday, December 5, 12
  • 17.
    Overview 1. Define signal handlers 2. Fetch a dataset 3. Fork off one child process for each item 4. Stop forking when a threshold is reached, and sleep 5. Reap a child process whenever SIGCHLD is received 6. If there’s more work to do, fork more processes 7. When all child processes have been reaped, terminate Wednesday, December 5, 12
  • 18.
    declare(ticks = 1); // Setup our signal handlers pcntl_signal(SIGTERM, "signal_handler"); pcntl_signal(SIGINT,  "signal_handler"); pcntl_signal(SIGCHLD, "signal_handler"); Wednesday, December 5, 12
  • 19.
    function signal_handler($signal) { switch ($signal) { case SIGINT: case SIGTERM: // kill all child processes exit(0); case SIGCHLD: // reap a child process reap_child(); break; } } Wednesday, December 5, 12
  • 20.
    $pid = pcntl_fork(); switch($pid) { case 0: // Child process call_user_func($callback, $data); posix_kill(posix_getppid(), SIGCHLD); exit; case -1: // Parent process, fork failed throw new Exception("Out of memory!"); default: // Parent process, fork succeeded $processes[$pid] = TRUE; } Wednesday, December 5, 12
  • 21.
    Repeat for each unit of work Wednesday, December 5, 12
  • 22.
    function reap_child() { // Check if any child process has terminated, // and if so remove it from memory $pid = pcntl_wait($status, WNOHANG); if ($pid < 0) { throw new Exception("Out of memory"); } elseif ($pid > 0) { unset($processes[$pid]); } } Wednesday, December 5, 12
  • 23.
    Demo Time! http://gist.github.com/4212160 Wednesday, December 5, 12
  • 24.
  • 25.
    DON’T: • DON’T fork a web process (CLI only!) • DON’T overload your system • DON’T open resources before forking • DO respect common POSIX signals • DO remove zombie processes • DO force new database connections in children mysql_reconnect($s, $u, $p, TRUE); Wednesday, December 5, 12
  • 26.
  • 27.
    Race Conditions • A logic bug where the result is affected by the sequence or timing of uncontrollable events • Adding debug logic can change timing • Dirty reads • Lost data • Unpredictable behavior • Deadlocks, hanging, crashing Wednesday, December 5, 12
  • 28.
  • 29.
  • 30.
    Solutions • Handle I/O in the parent process exclusively • Manage resources with semaphores and/or mutexes Wednesday, December 5, 12
  • 31.
    Semaphores • Semaphore = atomically updated counter • Mutex = binary semaphore with ownership • PHP: sem_get(), sem_release() • Transactional databases use semaphores Wednesday, December 5, 12
  • 32.
    Deadlocks Image credit: csunplugged.org Wednesday, December 5, 12
  • 33.
  • 34.
    Shared Memory • Advanced inter-process communication • Pass data back to the parent process • PHP Shared Memory extension (--enable-shmop) • PHP System V Shared Memory extension (--enable-sysvshm) • More robust • Compatible with other languages Wednesday, December 5, 12
  • 35.
    Daemonization • Fork, kill parent • Orphaned child process continues running • Signal and error handling are critical • Server daemons usually fork child processes to handle requests Wednesday, December 5, 12
  • 36.
  • 37.
    Thank You! @compwright http://bit.ly/atlphpm Wednesday, December 5, 12