
Varnish Cache - International PHP Conference Fall 2012


An introduction to varnish cache for PHP developers.


  1. Varnish Cache – Mike Willbanks | Barnes & Noble
  2. Housekeeping…
     •  Talk
        –  Slides will be posted after the talk.
     •  Me
        –  Sr. Web Architect Manager at NOOK Developer
        –  Prior MNPHP Organizer
        –  Open Source Contributor
        –  Where you can find me:
           •  Twitter: mwillbanks   G+: Mike Willbanks
           •  IRC (freenode): mwillbanks   Blog:
           •  GitHub:
  3. Agenda
     •  Varnish?
     •  The Good : Getting Started
     •  The Awesome : General Usage
     •  The Crazy : Advanced Usage
     •  Gotchas
  4. WHAT IS VARNISH?
     •  Official Statement
     •  What it does
     •  General use case
  5. Official Statement: “Varnish is a web application accelerator. You install it in front of your web application and it will speed it up significantly.”
  6. You can cache… both dynamic and static files and content.
  7. A Scenario
     •  System Status Server
        –  Mobile apps check the current status.
        –  If the system is down, do we communicate?
        –  If there are problems, do we communicate?
        –  The apps and mobile site rely on an API
           •  Trouble in paradise? Few and far between.
  8. The Graph – AWS benchmark charts comparing Req/s, Peak Load, Time, and Requests across Small, X-Large, and Small + Varnish instances (raw numbers on the next slide).
  9. The Raw Data
                     Small     X-Large    Small + Varnish
     Concurrency     10        150        150
     Requests        5000      55558      75000
     Time            438       347        36
     Req/s           11.42     58         585
     Peak Load       11.91     8.44       0.35
     Comments: the X-Large run had 19,442 failed requests.
  10. Traditional LAMP Stack: Load Balancer → HTTP Server Cluster → Database
  11. LAMP + Varnish: Load Balancer → Varnish Cache → cache hit? Yes: serve from cache; No: HTTP Server Cluster → Database. (* Varnish can act as a load balancer.)
  12. THE GOOD – JUMP START
      •  Installation
      •  General Information
      •  Default VCL
  13. Installation
      # Debian/Ubuntu (repository URLs elided in the slides)
      curl … | sudo apt-key add -
      echo "deb … lucid varnish-3.0" | sudo tee -a /etc/apt/sources.list
      sudo apt-get update
      sudo apt-get install varnish

      # RPM-based
      rpm --nosignature -i …
      yum install varnish

      # From source
      git clone git://… varnish-cache
      sh && make install
  14. Varnish Daemon
      •  varnishd
         –  -a address[:port]   listen address for client requests
         –  -b address[:port]   backend address for requests
         –  -T address[:port]   administration HTTP interface
         –  -s type[,options]   storage type (malloc, file, persistent)
         –  -P /path/to/file    PID file
         –  Many others; these are generally the most important. The defaults
            will usually do, with just modification of the default VCL
            (more on that later).
  15. General Configuration
      •  varnishd -a :80 -T localhost:6082 -f /path/to/default.vcl -s malloc,512mb
      •  Configure the web server to listen on port 8080
  16. Setup a backend!
      backend default {
          .host = "";
          .port = "8080";
      }
  17. So what’s actually caching?
      •  Any request containing
         –  GET / HEAD
         –  TTL > 0
      •  What causes it to miss?
         –  Cookies
         –  Authentication headers
         –  Vary “*”
         –  Cache-Control: private
  18. Request flow diagram: a request enters vcl_recv and is hashed in vcl_hash; from there it hits (vcl_hit), misses (vcl_miss), passes (vcl_pass), or pipes (vcl_pipe). Misses and passes fetch from the backend via vcl_fetch, and every response leaves through vcl_deliver.
  19. HTTP Caching
      •  RFC 2616 HTTP/1.1 Headers
         –  Expiration
            •  Cache-Control
            •  Expires
         –  Validation
            •  Last-Modified
            •  If-Modified-Since
            •  ETag
            •  If-None-Match
  20. TTL Priority
      •  VCL
         –  beresp.ttl
      •  Headers
         –  Cache-Control: s-maxage
         –  Cache-Control: max-age
         –  Expires
         –  Validation
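The priority above can be applied and overridden from vcl_fetch. A minimal sketch (the 120s fallback TTL and the debug header are assumptions, not from the slides):

```vcl
sub vcl_fetch {
    # beresp.ttl has already been derived from s-maxage / max-age / Expires;
    # if the backend sent nothing cacheable, force a short TTL of our own.
    if (beresp.ttl <= 0s) {
        set beresp.ttl = 120s;                 # assumed fallback value
        set beresp.http.X-Cache-Forced = "1";  # hypothetical debug header
    }
}
```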
  21. Use WordPress?
      backend default {
          .host = "";
          .port = "8080";
      }
      sub vcl_recv {
          if (!(req.url ~ "wp-(login|admin)")) {
              unset req.http.cookie;
          }
      }
      sub vcl_fetch {
          if (!(req.url ~ "wp-(login|admin)")) {
              unset beresp.http.set-cookie;
          }
      }
  23. Varnish Configuration Language
      •  VCL State Engine
         –  Each request is processed separately & independently
         –  States are isolated but related
         –  Return statements exit one state and start another
         –  The VCL defaults are ALWAYS appended below your own VCL
      •  VCL can be complex, but…
         –  Two main subroutines: vcl_recv and vcl_fetch
         –  Common actions: pass, hit_for_pass, lookup, pipe, deliver
         –  Common variables: req, beresp and obj
         –  More subroutines, functions and complexity can arise depending on
            your conditions.
  24. Request flow diagram (repeated): vcl_recv → vcl_hash → vcl_hit / vcl_miss / vcl_pass / vcl_pipe → vcl_fetch → vcl_deliver.
  25. VCL – Process
      vcl_init      Startup routine (VCL loaded, VMOD init)
      vcl_recv      Beginning of request; req is in scope
      vcl_pipe      Client & backend data passed unaltered
      vcl_pass      Request goes to backend and is not cached
      vcl_hash      Creates the cache hash; call hash_data for custom hashes
      vcl_hit       Called when the hash is found in cache
      vcl_miss      Called when the hash is not found in cache
      vcl_fetch     Called to fetch data from the backend
      vcl_deliver   Called prior to delivery of the response (excluding pipe)
      vcl_error     Called when an error occurs
      vcl_fini      Shutdown routine (VCL unload, VMOD cleanup)
  26. VCL – Variables
      •  Always Available
         –  now – epoch time
      •  Backend Declarations
         –  .host – hostname / IP
         –  .port – port number
      •  Request Processing
         –  client – IP & identity
         –  server – IP & port
         –  req – request information
      •  Backend
         –  bereq – backend request
         –  beresp – backend response
      •  Cached Object
         –  obj – cached object; you can only change .ttl
      •  Response
         –  resp – response information
  27. VCL – Functions
      hash_data(string)               Adds a string to the hash input
      regsub(string, regex, sub)      Substitution on the first occurrence
      regsuball(string, regex, sub)   Substitution on all occurrences
      ban(expression)                 Ban all items that match the expression
      ban(regex)                      Ban all items that match the regular expression
  28. DEFAULT VCL – Walking through the noteworthy items (request flow diagram repeated: vcl_recv → vcl_hash → vcl_hit / vcl_miss / vcl_pass / vcl_pipe → vcl_fetch → vcl_deliver).
  29. vcl_recv
      •  Received request
      •  Only GET & HEAD by default
         –  Safest way to cache!
      •  Will use HTTP cache headers.
      •  Cookies or authentication headers will bust out of the cache.
  30. vcl_hash
      •  The hash is what we look for in the cache.
      •  Default is URL + Host
         –  The server IP is used if the Host header was not set; in a load
            balanced environment, ensure you set this header!
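For reference, the default behaviour described above corresponds to a vcl_hash along these lines (a sketch matching the stock Varnish 3 default):

```vcl
sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);  # URL + Host
    } else {
        hash_data(server.ip);      # fall back to the server IP
    }
    return (hash);
}
```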
  31. vcl_fetch
      •  Fetch retrieves the response from the backend.
      •  No cache if…
         –  The TTL is not set or is not greater than 0.
         –  Vary headers exist.
      •  Hit-For-Pass means we cache the decision to pass through.
  32. GENERAL ADJUSTMENTS – Common adjustments to make.
  33. Cache Static Content – there is no reason that static content should not be cached.
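A minimal sketch of how this might look in vcl_recv (the extension list is an assumption; tailor it to your site):

```vcl
sub vcl_recv {
    # Static assets: drop cookies so the request can be served from cache.
    if (req.url ~ "\.(css|js|png|gif|jpe?g|ico|swf|woff)$") {
        unset req.http.cookie;
        return (lookup);
    }
}
```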
  34. Remove GA Cookies – Google Analytics cookies will cause a miss; remove them prior to going to the backend.
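One way to strip them in vcl_recv; a sketch assuming the common __utm* and _ga cookie names (adjust the regex as needed):

```vcl
sub vcl_recv {
    if (req.http.cookie) {
        # Remove Google Analytics cookies (__utma, __utmb, …, _ga*).
        set req.http.cookie =
            regsuball(req.http.cookie, "(^|; ) *(__utm[a-z]+|_ga[^=]*)=[^;]*", "\1");
        # If nothing meaningful remains, drop the header so the request can hit.
        if (req.http.cookie ~ "^ *$") {
            unset req.http.cookie;
        }
    }
}
```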
  35. Allow Purging – only allow it from localhost or a trusted server network.
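A sketch of the ACL-guarded PURGE pattern in Varnish 3 syntax (the ACL entries are assumptions for your trusted network):

```vcl
acl purgers {
    "localhost";
    # "10.0.0.0"/8;   # e.g. your trusted server network
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!(client.ip ~ purgers)) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
```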
  36. DIRECTORS – leveraging backend servers.
  37. Directors – The Types
      Random        Picks based on random and weight.
      Client        Picks based on client identity.
      Hash          Picks based on hash value.
      Round Robin   Goes in order and starts over.
      DNS           Picks based on the incoming DNS host; random OR round robin.
      Fallback      Picks the first “healthy” server.
  38. Director – Probing
      •  Backend Probing
      •  Variables
         –  .url
         –  .request
         –  .window
         –  .threshold
         –  .initial
         –  .expected_response
         –  .interval
         –  .timeout
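A probe attached to a backend might look like this sketch (the health-check URL and thresholds are assumptions):

```vcl
backend web1 {
    .host = "";         # backend address elided
    .port = "8080";
    .probe = {
        .url = "/health";    # hypothetical health-check endpoint
        .interval = 5s;      # probe every 5 seconds
        .timeout = 1s;
        .window = 5;         # look at the last 5 probes…
        .threshold = 3;      # …at least 3 must succeed to count as healthy
    }
}
```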
  39. Load Balancing
      •  Implementing a simple Varnish load balancer.
      •  Varnish does not handle SSL termination.
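A simple round-robin director is enough for this; a sketch assuming two backends named web1 and web2 are already declared:

```vcl
director www round-robin {
    { .backend = web1; }
    { .backend = web2; }
}

sub vcl_recv {
    set req.backend = www;  # route all requests through the director
}
```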
  40. Grace Mode
      •  A request for the object is already pending an update: serve grace (stale) content.
      •  The backend is unhealthy: serve grace content.
      •  The probes shown earlier must be implemented for health-based grace.
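The usual Varnish 3 pattern keeps objects around past their TTL and serves them stale for longer when the backend is sick. A sketch (the durations are assumptions):

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        set req.grace = 30s;   # healthy: accept slightly stale content
    } else {
        set req.grace = 1h;    # sick backend: serve stale for up to an hour
    }
}

sub vcl_fetch {
    set beresp.grace = 1h;     # keep objects an hour past their TTL
}
```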
  41. Saint Mode
      •  A backend may be sick for a particular piece of content.
      •  Saint mode makes sure that the backend will not be asked for that
         object again for a specific period of time.
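In Varnish 3 this is the beresp.saintmode variable; a sketch (the status check and duration are assumptions):

```vcl
sub vcl_fetch {
    if (beresp.status >= 500) {
        # Blacklist this object on this backend for 20s and restart the
        # request; another backend or grace content may serve it meanwhile.
        set beresp.saintmode = 20s;
        return (restart);
    }
}
```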
  42. Purging
      •  The various ways of purging
         –  varnishadm – command line utility
         –  Sockets (port 6082)
         –  HTTP – now that is the sexiness
  43. Purging Examples
      # via varnishadm
      varnishadm -T localhost:6082 purge req.url == "/foo/bar"

      # via the admin socket
      telnet localhost 6082
      purge req.url == "/foo/bar"

      # via HTTP
      telnet localhost 80
      PURGE /foo/bar HTTP/1.0
      Host: bacon.org

      curl -X PURGE …
  44. Distributed Purging
      •  curl multi-request (in PHP)
      •  Use a message queue
         –  Use workers to do the leg work for you
      •  You will need to store a list of servers “somewhere”
  45. Logging
      •  Many times people want to log the requests to a file
         –  By default Varnish only stores these in shared memory.
         –  Apache-style logs:
            •  varnishncsa -D -a -w log.txt
         –  This will run as a daemon and log all of your requests on a
            separate thread.
  46. Logging – Apache-style logging using: varnishncsa -D -a -w log.txt
  47. VERIFY YOUR VCL
      You likely want to ensure that your cache is:
      1.  Working properly
      2.  Caching effectively
  48. What is Varnish doing…
      varnishtop will show you real-time information on your system.
      •  Use -i to filter on specific tags.
      •  Use -x to exclude specific tags.
  49. Checking Statistics… varnishstat will give you the statistics you need to know how you’re doing.
  50. THE CRAZY
      •  ESI – Edge Side Includes
      •  Varnish Administration
      •  VMOD
  51. ESI – Edge Side Includes
      •  ESI is a small markup language, much like SSI (server side includes),
         for including fragments (or dynamic content for that matter).
      •  Think of it as replacing regions inside of a page as if you were using
         XHR (AJAX), but single threaded.
      •  Three statements can be utilized:
         –  esi:include – include a page
         –  esi:remove – remove content
         –  <!--esi …--> – ESI disabled, execute normally
  52. ESI Diagram: the backend returns the page content containing <esi:include src="header.php" />; Varnish detects the ESI tag and either requests the fragment from the backend or serves its cached state.
  53. Using ESI
      •  In vcl_fetch, you must turn ESI on:
         –  set beresp.do_esi = true;
         –  Varnish refuses to parse content for ESI if it does not look like XML
            •  This is the default, so check varnishstat and varnishlog to ensure
               that it is functioning normally.
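Putting that together in vcl_fetch; a sketch in which the URL match is an assumption (enable ESI only for pages that actually emit ESI tags):

```vcl
sub vcl_fetch {
    # Only parse documents that may contain <esi:include> tags.
    if (req.url == "/" || req.url ~ "\.(php|html)$") {
        set beresp.do_esi = true;
    }
}
```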
  54. ESI Usage
      <html>
        <head><title>Rock it with ESI</title></head>
        <body>
          <header>
            <esi:include src="header.php" />
          </header>
          <section id="main">...</section>
          <footer></footer>
        </body>
      </html>
  55. Embedding C in VCL
      •  Before getting into VMODs: did you know you can embed C into the VCL
         for Varnish?
      •  Want to do something crazy fast, or leverage a C library for pre- or
         post-processing?
      •  I know… you’re thinking that’s useless…
         –  On to the example; and a good one from the Varnish wiki!
  56. Embedded C for syslog
      C{
          #include <syslog.h>
      }C

      sub vcl_something {
          C{
              syslog(LOG_INFO, "Something happened at VCL line XX.");
          }C
      }

      # Example using Varnish variables
      C{
          syslog(LOG_ERR,
              "Spurious response from backend: xid %s request %s %s \"%s\" %d \"%s\" \"%s\"",
              VRT_r_req_xid(sp), VRT_r_req_request(sp),
              VRT_GetHdr(sp, HDR_REQ, "\005host:"), VRT_r_req_url(sp),
              VRT_r_obj_status(sp), VRT_r_obj_response(sp),
              VRT_GetHdr(sp, HDR_OBJ, "\011Location:"));
      }C
  57. Varnish Modules / Extensions
      •  Taking VCL-embedded C to the next level
      •  Allows you to extend Varnish and create new functions
      •  You can link to libraries to provide additional functionality
  58. VMOD – std
      •  toupper
      •  tolower
      •  set_ip_tos
      •  random
      •  log
      •  syslog
      •  fileread
      •  duration
      •  integer
      •  collect
  59. ADMINISTERING VARNISH
      •  Management Console
      •  Cache Warmup
  60. Management Console
      •  varnishadm -T localhost:6082
         –  vcl.list – see all loaded configurations
         –  vcl.load – load a new configuration
         –  vcl.use – select the configuration to use
         –  vcl.discard – remove a configuration
  61. Cache Warmup
      •  Need to warm up your cache before putting a server into rotation or
         load testing an environment?
         –  varnishreplay -r log.txt
  62. GOTCHAS
      •  Having Keep-Alive off
      •  No SSL termination
      •  No persistent cache
      •  ESI multiple fragments
      •  Cookies*
  63. These slides will be posted to SlideShare & SpeakerDeck.
      SpeakerDeck:                   Slideshare:
      Twitter: mwillbanks            G+: Mike Willbanks
      IRC (freenode): mwillbanks     Blog:
      GitHub: