Scaling PHP in the real world!


PHP is used by the likes of Facebook, Yahoo!, Zynga, Tumblr, Etsy, and Wikipedia. How do the largest internet companies scale PHP to meet their demand? Join this session and find out how to use the latest tools in PHP for developing high performance applications. We’ll take a look at common techniques for scaling PHP applications and best practices for profiling and optimizing performance. After this session, you’ll leave prepared to tackle your next enterprise PHP project.

Published in: Technology

  1. 1. Scaling PHP in the real world!
  2. 2. PHP is used by the likes of Facebook, Yahoo, Zynga, Tumblr, Etsy, and Wikipedia. How do the largest internet companies scale PHP to meet their demand? Join this session and find out how to use the latest tools in PHP for developing high performance applications. We’ll take a look at common techniques for scaling PHP applications and best practices for profiling and optimizing performance. After this session, you’ll leave prepared to tackle your next enterprise PHP project.
  3. 3. Agenda
      • Why does performance matter?
      • The problems with PHP
      • Opcode caches
      • Best practice designs
      • Doing work in the background with queues
      • Fronting with HTTP caching (Varnish/Squid) and a reverse proxy cache
      • Distributed data caches with Redis and Memcached
      • Using the right tool for the job
      • Tools of the trade
      • Xdebug + Valgrind + WebGrind
      • AppDynamics
      • Architecture, not applications

      We all know performance is important, but performance tuning is too often an afterthought. As a result, taking on a performance tuning project for a slow application can be pretty intimidating: where do you even begin? In this series I’ll tell you about the strategies and technologies that (in my experience) have been the most successful in improving PHP performance. To start off, however, we’ll talk about some of the easy wins in PHP performance tuning. These are the things you can do that’ll get you the most performance bang for your buck, and you should be sure you’ve checked off all of them before you take on any of the more complex stuff.
  4. 4. Who am I?
      • Dustin Whittle
      • @dustinwhittle
      • Technologist, Pilot, Skier, Diver, Sailor
  5. 5. What I have worked on
      • Developer Evangelist @
      • Consultant & Trainer @
      • Developer Evangelist @
  6. 6. Did you know Facebook, Yahoo, Zynga, Tumblr, Etsy, and Wikipedia were all built on PHP?
  7. 7. Why does performance matter?
  8. 8. Microsoft found that Bing searches that were 2 seconds slower resulted in a 4.3% drop in revenue per user
  9. 9. When Mozilla shaved 2.2 seconds off their landing page, Firefox downloads increased 15.4%
  10. 10. Shopzilla saw conversion rates increase by over 7% as a result of optimizing their performance
  11. 11. Making Barack Obama’s website 60% faster increased donation conversions by 14%
  12. 12. Performance affects the bottom line
  13. 13. PHP is slower than Java, C++, Erlang, and Go!
  14. 14. ...but there are ways to scale to handle high traffic applications
  15. 15. PHP is not your problem!
  16. 16. how many issues are getting resolved as the PHP team iterates and releases.
  17. 17. What version of PHP do you run?
  18. 18. Upgrade your PHP environment to 2013!
      One of the easiest ways to improve performance and stability is to upgrade your version of PHP. PHP 5.3.x was released in 2009. If you haven’t migrated to PHP 5.4, now is the time! Not only do you benefit from bug fixes and new features, but you will also see faster response times immediately. Installation guides are available for the latest PHP on Linux, OS X, and Windows. Once you’ve finished upgrading PHP, be sure to disable any unused extensions in production such as Xdebug or XHProf.
  19. 19. Nginx + PHP-FPM
  20. 20. Use an opcode cache!
      PHP is an interpreted language, which means that every time a PHP page is requested, the server parses the PHP file and compiles it into intermediate instructions (opcodes) that the PHP engine executes. Opcode caches preserve this generated code in a cache so that it only needs to be compiled on the first request. If you aren’t using an opcode cache you’re missing out on a very easy performance gain. Pick your flavor: APC, Zend Optimizer+, XCache, or eAccelerator. I highly recommend APC, which counts the creator of PHP, Rasmus Lerdorf, among its maintainers.
  21. 21. APC
  22. 22. Zend Optimizer+
  23. 23. XCache
  24. 24. PHP 5.5 bundles Zend OPcache (formerly Zend Optimizer+)
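Before tuning anything else, it can help to confirm whether an opcode cache is actually loaded. The following is a minimal sketch, assuming the extension names `Zend OPcache` (the cache bundled with PHP 5.5 and later) and `apc`; other caches register under their own names, and the function name `detectOpcodeCaches` is made up for this illustration.

```php
<?php
// Report which opcode cache extensions are loaded in the current runtime.
// Only two extension names are checked here; this list is an assumption
// and is not exhaustive.
function detectOpcodeCaches(): array
{
    $candidates = [
        'Zend OPcache' => 'Zend OPcache',
        'APC'          => 'apc',
    ];

    $loaded = [];
    foreach ($candidates as $label => $extensionName) {
        // extension_loaded() returns true when the extension is active.
        $loaded[$label] = extension_loaded($extensionName);
    }

    return $loaded;
}

$opcodeCaches = detectOpcodeCaches();
foreach ($opcodeCaches as $label => $isLoaded) {
    echo $label, ': ', $isLoaded ? 'loaded' : 'not loaded', PHP_EOL;
}
```

Note that a loaded extension is not necessarily an enabled one, so the relevant ini settings for your cache are still worth checking.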
  25. 25. Use autoloading!
      Many developers writing object-oriented applications create one PHP source file per class definition. One of the biggest annoyances in writing PHP is having to write a long list of needed includes at the beginning of each script (one for each class). PHP re-evaluates these require/include expressions over and over during the evaluation period each time a file containing one or more of these expressions is loaded into the runtime. Using an autoloader will enable you to remove all of your require/include statements and benefit from a performance improvement. You can even cache the class map of your autoloader in APC for a small performance improvement.
  26. 26. Check out the Symfony2 ClassLoader component
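The idea behind a class-map autoloader can be sketched with PHP's built-in `spl_autoload_register()`. This is an illustration only, not the Symfony2 ClassLoader API: the `Greeter` class, the temporary directory, and the hand-written map are all made up for the example, and a real map would normally be generated by a tool.

```php
<?php
// Write a throwaway class file so the example is self-contained.
$classDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($classDir);
file_put_contents(
    $classDir . '/Greeter.php',
    '<?php class Greeter { public function greet(string $name): string { return "Hello, $name"; } }'
);

// Hypothetical class map: class name => file path.
$classMap = ['Greeter' => $classDir . '/Greeter.php'];

// The callback runs only when an unknown class is first used, so no
// require/include statements are needed at the top of each script.
spl_autoload_register(function (string $class) use ($classMap): void {
    if (isset($classMap[$class])) {
        require $classMap[$class];
    }
});

$greeter = new Greeter();   // triggers the autoloader
$greeting = $greeter->greet('PHP');
echo $greeting, PHP_EOL;
```

Because the map is a plain array, it is the kind of structure that could be stored in a shared cache between requests instead of being rebuilt each time.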
  27. 27. Scaling beyond a single server
  28. 28. Optimize your sessions!
      While HTTP is stateless, most real life web applications require a way to manage user data. In PHP, application state is managed via sessions. The default configuration for PHP is to persist session data to disk. This is extremely slow and not scalable beyond a single server. A better solution is to store your session data in a database and front with an LRU (Least Recently Used) cache with Memcached or Redis. If you are super smart you will realize you should limit your session data size (4096 bytes) and store all session data in a signed or encrypted cookie.
  29. 29. PHP default is to persist sessions to disk
  30. 30. It is better to store in a database
  31. 31. Even better is to store in a database with a cache in front
  32. 32. The best solution is to limit session size and store in a signed or encrypted cookie
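A signed session cookie can be sketched with PHP's `hash_hmac()` and `hash_equals()`. This is a minimal illustration under stated assumptions: the function names, the `payload.signature` layout, and the secret are invented for the example, and the cookie is signed but not encrypted, so the client can still read its contents.

```php
<?php
// Size limit taken from the 4096-byte figure mentioned above.
const MAX_COOKIE_BYTES = 4096;

// Serialize the session data and append an HMAC so tampering is detectable.
function encodeSessionCookie(array $data, string $secret): string
{
    $payload = base64_encode(json_encode($data));
    $signature = hash_hmac('sha256', $payload, $secret);
    $cookie = $payload . '.' . $signature;

    if (strlen($cookie) > MAX_COOKIE_BYTES) {
        throw new LengthException('Session data exceeds the cookie size limit');
    }

    return $cookie;
}

// Return the session data, or null when the cookie is malformed or forged.
function decodeSessionCookie(string $cookie, string $secret): ?array
{
    $parts = explode('.', $cookie, 2);
    if (count($parts) !== 2) {
        return null;
    }

    [$payload, $signature] = $parts;
    $expected = hash_hmac('sha256', $payload, $secret);

    // hash_equals() compares in constant time to avoid timing leaks.
    if (!hash_equals($expected, $signature)) {
        return null;
    }

    $data = json_decode(base64_decode($payload), true);
    return is_array($data) ? $data : null;
}

$secret = 'replace-with-a-long-random-secret';   // placeholder value
$cookie = encodeSessionCookie(['user_id' => 42], $secret);
$session = decodeSessionCookie($cookie, $secret);

// Swap in a different payload while keeping the old signature.
$forgedPayload = base64_encode(json_encode(['user_id' => 1]));
$forgedCookie = $forgedPayload . '.' . explode('.', $cookie, 2)[1];
$forgedSession = decodeSessionCookie($forgedCookie, $secret);
```

With this approach any app server holding the secret can validate the session, which is what removes the shared session store from the request path.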
  33. 33. Leverage an in-memory data cache
      Applications usually require data. Data is usually structured and organized in a database. Depending on the data set and how it is accessed it can be expensive to query. An easy solution is to cache the result of the first query in a data cache like Memcached or Redis. If the data changes, you invalidate the cache and make another SQL query to get the updated result set from the database.

      I highly recommend the Doctrine ORM for PHP which has built-in caching support for Memcached or Redis.

      There are many use cases for a distributed data cache from caching web service responses and app configurations to entire rendered pages.
  34. 34. Memcached
  35. 35. Redis
  36. 36. I highly recommend the Doctrine ORM for PHP which has built-in caching support for Memcached or Redis.
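The cache-then-invalidate flow described for the data cache can be sketched as follows. To keep the example runnable without a cache server, it uses a hypothetical `DataCache` interface with an in-memory `ArrayCache`; these names are assumptions for illustration and are not the Memcached, Redis, or Doctrine APIs. A real deployment would back the same interface with one of those stores.

```php
<?php
// Minimal cache contract for this sketch (hypothetical, not a library API).
interface DataCache
{
    /** Returns the cached value, or null on a miss. */
    public function get(string $key);
    public function set(string $key, $value, int $ttlSeconds): void;
    public function delete(string $key): void;
}

// In-memory stand-in so the example runs without Memcached or Redis.
final class ArrayCache implements DataCache
{
    private $entries = [];

    public function get(string $key)
    {
        if (!isset($this->entries[$key])) {
            return null;
        }
        [$value, $expiresAt] = $this->entries[$key];
        if ($expiresAt <= time()) {
            unset($this->entries[$key]);   // entry has expired
            return null;
        }
        return $value;
    }

    public function set(string $key, $value, int $ttlSeconds): void
    {
        $this->entries[$key] = [$value, time() + $ttlSeconds];
    }

    public function delete(string $key): void
    {
        unset($this->entries[$key]);
    }
}

// Serve from the cache when possible; otherwise run the query and store it.
function cachedQuery(DataCache $cache, string $key, int $ttlSeconds, callable $runQuery)
{
    $cached = $cache->get($key);
    if ($cached !== null) {
        return $cached;
    }
    $result = $runQuery();
    $cache->set($key, $result, $ttlSeconds);
    return $result;
}

$cache = new ArrayCache();
$queryCount = 0;
$loadUsers = function () use (&$queryCount): array {
    $queryCount++;   // stands in for an expensive SQL query
    return [['id' => 1, 'name' => 'Ada']];
};

$first = cachedQuery($cache, 'users:all', 60, $loadUsers);    // miss: runs the query
$second = cachedQuery($cache, 'users:all', 60, $loadUsers);   // hit: served from cache
$cache->delete('users:all');                                   // data changed: invalidate
$third = cachedQuery($cache, 'users:all', 60, $loadUsers);    // miss: runs the query again
echo 'Queries executed: ', $queryCount, PHP_EOL;
```

Treating `null` as a miss is a simplification of this sketch; real cache clients signal misses in their own ways, so the check would need adapting.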
  37. 37. Do blocking work in the background
      Web applications often have to run tasks that can take a while to complete. In most cases there is no good reason to force the end-user to wait for the job to finish. The solution is to queue blocking work to run in background jobs. Background jobs are jobs that are executed outside the main flow of your program, and usually handled by a queue or message system. There are a lot of great solutions that can help with running background jobs. The benefits come in terms of both end-user experience and scaling by writing and processing long running jobs from a queue. I am a big fan of Resque for PHP, which is a simple toolkit for running tasks from queues. There are a variety of tools that provide queuing or messaging systems that work well with PHP:
  38. 38.
      • Resque
      • Gearman
      • RabbitMQ
      • Kafka
      • Beanstalkd
      • ZeroMQ
      • ActiveMQ
  39. 39. Resque
  40. 40.
      • Sending notifications + posting to social accounts
      • Analytics + Instrumentation
      • Updating profiles and discovering friends from social accounts
      • Consuming web services like Twitter Streaming API
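The producer/worker split behind background jobs can be sketched in a few lines. This is a simplified illustration, not the Resque API: an in-process `SplQueue` stands in for the external queue, and the function names, job types, and handlers are made up. In production the worker would be a separate long-running process reading from a real queue backend.

```php
<?php
// Producer side: the web request only records what needs to be done.
function enqueueJob(SplQueue $queue, string $type, array $args): void
{
    // Jobs are serialized so they could cross a process boundary.
    $queue->enqueue(json_encode(['type' => $type, 'args' => $args]));
}

// Worker side: pull jobs off the queue and dispatch them to handlers.
function runWorker(SplQueue $queue, array $handlers): array
{
    $results = [];
    while (!$queue->isEmpty()) {
        $job = json_decode($queue->dequeue(), true);
        $handler = $handlers[$job['type']] ?? null;
        if ($handler === null) {
            $results[] = 'skipped ' . $job['type'];   // unknown job type
            continue;
        }
        $results[] = $handler($job['args']);
    }
    return $results;
}

// Hypothetical handlers for two of the use cases listed above.
$handlers = [
    'send_notification' => function (array $args): string {
        return 'notified ' . $args['user'];
    },
    'track_event' => function (array $args): string {
        return 'tracked ' . $args['event'];
    },
];

$queue = new SplQueue();
enqueueJob($queue, 'send_notification', ['user' => 'dustin']);
enqueueJob($queue, 'track_event', ['event' => 'signup']);

$results = runWorker($queue, $handlers);
print_r($results);
```

The request path only pays for the enqueue call; the slow work happens whenever the worker gets to it.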
  45. 45. Leverage HTTP caching
      HTTP caching is one of the most misunderstood technologies on the Internet. Go read the HTTP caching specification. Don’t worry, I’ll wait. Seriously, go do it! They solved all of these caching design problems a few decades ago. It boils down to expiration or invalidation and when used properly can save your app servers a lot of load. Please read the excellent HTTP caching guide from Mark Nottingham. I highly recommend using Varnish as a reverse proxy cache to alleviate load on your app servers.
  46. 46. RTFM
  47. 47. Expiration or Invalidation
  48. 48.
      • Varnish
      • Squid
      • Nginx Proxy Cache
      • Apache Proxy Cache
  49. 49. I highly recommend using Varnish as a reverse proxy cache to alleviate load on your app servers.
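A reverse proxy cache can only help if the application sends cacheable responses. The sketch below shows the two standard HTTP mechanisms: expiration via `Cache-Control: max-age` and revalidation via `ETag` with a `304 Not Modified` reply. The function `buildCacheResponse` and its return shape are made up for this example, and it handles only a single strong validator in `If-None-Match`, which is a simplification of the specification.

```php
<?php
// Build the status, headers, and body for a cacheable response.
// $ifNoneMatch is the value of the request's If-None-Match header, if any.
function buildCacheResponse(string $body, int $maxAgeSeconds, ?string $ifNoneMatch): array
{
    // A content hash works as a validator: it changes when the body changes.
    $etag = '"' . sha1($body) . '"';
    $headers = [
        'Cache-Control' => 'public, max-age=' . $maxAgeSeconds,
        'ETag'          => $etag,
    ];

    // The client (or proxy) already holds this exact version.
    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) {
        return ['status' => 304, 'headers' => $headers, 'body' => ''];
    }

    return ['status' => 200, 'headers' => $headers, 'body' => $body];
}

$page = '<h1>Hello</h1>';

// First request: full response, cacheable for five minutes.
$firstResponse = buildCacheResponse($page, 300, null);

// Later revalidation: the cache sends back the ETag it stored.
$secondResponse = buildCacheResponse($page, 300, $firstResponse['headers']['ETag']);

echo $firstResponse['status'], ' then ', $secondResponse['status'], PHP_EOL;
```

In a real application the headers would be sent with `header()` and the status with `http_response_code()`; they are returned as data here so the logic is easy to inspect.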
  50. 50. Optimize your framework!
      Deep diving into the specifics of optimizing each framework is outside of the scope of this post, but these principles apply to every framework:
      • Stay up-to-date with the latest stable version of your favorite framework
      • Disable features you are not using (I18N, Security, etc)
      • Enable caching features for view and result set caching
  51. 51.
      • Stay up-to-date with the latest stable version of your favorite framework
      • Disable features you are not using (I18N, Security, etc)
      • Enable caching features for view and result set caching
  52. 52. Sharding
  53. 53. I see many Service Oriented Architectures with Java/Scala/C++/Erlang backends with PHP or JavaScript frontend
  54. 54. Companies of great scale move away from PHP or create their own variant
  55. 55. Yahoo! & yPHP
  56. 56. Facebook & HipHop
      • HipHop 1.0 - HPHPc - Transformed a subset of PHP code to C++ for performance
      • HipHop 2.0 - HHVM - Virtual machine, runtime, and JIT for PHP
  57. 57. Learn how to profile code for PHP performance
      Xdebug is a PHP extension for powerful debugging. It supports stack and function traces, profiling information, memory allocation, and script execution analysis. It allows developers to easily profile PHP code.

      WebGrind is an Xdebug profiling web frontend in PHP5. It implements a subset of the features of kcachegrind, installs in seconds, and works on all platforms. For quick-and-dirty optimizations it does the job.

      XHProf is a function-level hierarchical profiler for PHP with a reporting and UI layer. XHProf is capable of reporting function-level inclusive and exclusive wall times, memory usage, CPU times, and number of calls for each function. Additionally, it supports the ability to compare two runs (hierarchical DIFF reports) or aggregate results from multiple runs. See also: XHProf, XHGui.

      AppDynamics is application performance management software designed to help dev and ops troubleshoot problems in complex production apps.
  58. 58. Xdebug + WebGrind
  61. 61. Don’t forget to optimize the client side
      PHP application performance is only part of the battle. Now that you have optimized the server-side, you can spend time improving the client side! In modern web applications most of the end-user experience time is spent waiting on the client side to render. Google has dedicated many resources to helping developers improve client side performance.
  62. 62. Google PageSpeed
  63. 63. Scalability is about the entire architecture, not some minor code optimizations.
  64. 64. Questions?
  65. 65. Please give any feedback on Joind.in
  66. 66. Find these slides on SpeakerDeck
  67. 67. See you next year!