June8 presentation


  1. 1. Varnish: A brief introduction
      Nicolas A. Bérard-Nault
      June 15, 2011
  2. 2. Regular page view
  3. 3. Reverse proxy cached page view
  4. 4. So what is Varnish?
      • Reverse proxy cache
  5. 5. • Designed from the ground up to be an HTTP accelerator solution
      We will cover:
      • Default configuration and options
  6. 6. • ESI
  7. 7. • HTTP headers
  8. 8. • Keezmovies.com
          - Benchmarks
          - Use case
          - Problems & solutions
  9. 9. Configuring Varnish
      Varnish uses a configuration file that is compiled to C on the fly and loaded as a shared library. The configuration format is called VCL (Varnish Configuration Language), a domain-specific language reminiscent of Perl.
      If VCL is not enough, you can configure Varnish using inline C and the VRT (Varnish Run Time) library.
      For a full reference:
      http://www.varnish-cache.org/docs/2.1/tutorial/vcl.html
  10. 10. Step by step through the configuration
      Backend definitions

          backend www {
              .host = "www.example.com";
              .port = "http";
              .connect_timeout = 1s;
              .first_byte_timeout = 5s;
              .between_bytes_timeout = 2s;
              # Health check: poll /test.jpg and judge the most recent probes.
              .probe = {
                  .url = "/test.jpg";
                  .timeout = 0.3 s;
                  .window = 8;     # number of recent probes considered
                  .threshold = 3;  # good probes needed to count as healthy
              }
          }

      You can have as many backends as you want.
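
The `.window` and `.threshold` probe settings can be read as a sliding-window vote: the backend counts as healthy when at least `threshold` of the last `window` probes succeeded. A minimal Python sketch of that rule follows. The class name `ProbeWindow` is hypothetical, and every other probe option is ignored.

```python
from collections import deque


class ProbeWindow:
    """Sliding-window health vote modeled on .window / .threshold."""

    def __init__(self, window=8, threshold=3):
        self.results = deque(maxlen=window)  # keeps only the last `window` probes
        self.threshold = threshold

    def record(self, ok):
        """Store the outcome of one probe (True = good response in time)."""
        self.results.append(bool(ok))

    def healthy(self):
        """Healthy when enough of the remembered probes succeeded."""
        return sum(self.results) >= self.threshold


probe = ProbeWindow(window=8, threshold=3)
for outcome in [True, True, False, True]:
    probe.record(outcome)
print(probe.healthy())
```

With these values a backend can fail five of its last eight probes and still receive traffic, so the pair is worth tuning to how quickly a sick backend should be dropped.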
  11. 11. Step by step through the configuration
      Director definitions

          director www_director random {
              { .backend = www1; .weight = 2; }
              { .backend = www2; .weight = 1; }
          }

          director www_director round-robin {
              { .backend = www1; }
              { .backend = www2; }
          }

      You can have as many directors as you want.
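
To make the difference between the two director types concrete, here is a small Python sketch of the selection policies only. It assumes every backend is healthy, and the function names are hypothetical, not part of Varnish.

```python
import itertools
import random


def random_director(backends, rng=random):
    """Pick one backend at random, proportionally to its weight."""
    names = [name for name, _ in backends]
    weights = [weight for _, weight in backends]
    return rng.choices(names, weights=weights, k=1)[0]


def round_robin_director(names):
    """Return an iterator that yields the backends in turn, forever."""
    return itertools.cycle(names)


rng = random.Random(42)  # seeded so the demo is repeatable
picks = [random_director([("www1", 2), ("www2", 1)], rng) for _ in range(3000)]
print(picks.count("www1"), picks.count("www2"))  # roughly a 2:1 split

rr = round_robin_director(["www1", "www2"])
print([next(rr) for _ in range(4)])
```

With weights 2 and 1, `www1` should receive about two thirds of the requests over time, while round-robin alternates strictly.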
  12. 12. Highly simplified flow chart of Varnish operations
  13. 13. Step by step through the configuration
      recv: connection is received

          sub vcl_recv {
              if (req.restarts == 0) {
                  if (req.http.x-forwarded-for) {
                      set req.http.X-Forwarded-For =
                          req.http.X-Forwarded-For + ", " + client.ip;
                  } else {
                      set req.http.X-Forwarded-For = client.ip;
                  }
              }
              if (req.http.Authorization || req.http.Cookie) {
                  /* Not cacheable by default */
                  return (pass);
              }
              if (req.request != "GET" && req.request != "HEAD") {
                  /* We only deal with GET and HEAD by default */
                  return (pass);
              }
              return (lookup);
          }
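
The decision made by this `vcl_recv` can be traced with a plain Python function. This is only a model of the logic on the slide: the function name and the dict-based headers are assumptions, and header names are treated as case-sensitive for brevity, unlike real HTTP.

```python
def vcl_recv(method, headers, client_ip, restarts=0):
    """Return (action, headers) following the vcl_recv logic on the slide."""
    headers = dict(headers)  # work on a copy, leave the caller's dict alone
    if restarts == 0:
        previous = headers.get("X-Forwarded-For")
        headers["X-Forwarded-For"] = (
            f"{previous}, {client_ip}" if previous else client_ip
        )
    if "Authorization" in headers or "Cookie" in headers:
        return "pass", headers  # not cacheable by default
    if method not in ("GET", "HEAD"):
        return "pass", headers  # only GET and HEAD go to cache lookup
    return "lookup", headers


print(vcl_recv("GET", {}, "192.0.2.7"))
print(vcl_recv("GET", {"Cookie": "session=abc"}, "192.0.2.7")[0])
```

Note that any `Cookie` header at all forces a `pass`, which is why the cookie handling later in the deck matters so much for the hit rate.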
  14. 14. Be careful with your HTTP verbs…
      But we always cheat…
  15. 15. vcl_hash
  16. 16. vcl_hash: create object hash for request

          sub vcl_hash {
              hash_data(req.url);
              if (req.http.host) {
                  hash_data(req.http.host);
              } else {
                  hash_data(server.ip);
              }
              return (hash);
          }
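
The effect of this `vcl_hash` is that two requests share a cached object only when both the URL and the Host header (or, failing that, the server IP) match. A rough Python sketch of that idea follows. The use of SHA-256 and the function name `cache_key` are illustrative assumptions, not a description of Varnish internals.

```python
import hashlib


def cache_key(url, host=None, server_ip="192.0.2.10"):
    """Digest of the URL plus the Host header, or the server IP as fallback."""
    digest = hashlib.sha256()
    digest.update(url.encode("utf-8"))
    digest.update((host or server_ip).encode("utf-8"))
    return digest.hexdigest()


print(cache_key("/index.html", "www.example.com"))
print(cache_key("/index.html", "static.example.com"))  # different key
```

Anything not fed into the hash (cookies, user agent, country) does not distinguish cached objects, which is what the geolocation change later in the deck addresses.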
  17. 17. vcl_hit, vcl_miss
  18. 18. vcl_pass: request not cacheable

          sub vcl_pass {
              return (pass);
          }

      vcl_hit: post-lookup, object exists in cache

          sub vcl_hit {
              return (deliver);
          }

      vcl_miss: post-lookup, object does not exist in cache

          sub vcl_miss {
              return (fetch);
          }

  19. 19. vcl_fetch
  20. 20. vcl_fetch: after the object is fetched from the backend

          sub vcl_fetch {
              if (beresp.ttl <= 0s ||
                  beresp.http.Set-Cookie ||
                  beresp.http.Vary == "*") {
                  set beresp.ttl = 120 s;
                  return (hit_for_pass);
              }
              return (deliver);
          }
  21. 21. vcl_fetch
  22. 22. Step by step through the configuration
      vcl_deliver: object is to be delivered to the client

          sub vcl_deliver {
              return (deliver);
          }
  23. 23. ESI (edge-side include)
      Invented by Akamai; only a subset is supported by Varnish.
      Varnish supports include:

          <div>
          Hello:
          <esi:include src="/getname.php" />
          </div>

      Will be processed into:

          <div>
          Hello:
          Roger Cyr
          </div>
  24. 24. ESI (edge-side include)
      To enable ESI processing, use the esi keyword in vcl_fetch.
      ESI and gzip
      Varnish WILL NOT be able to do ESI processing on gzipped backend responses. It will also not be able to ungzip an ESI response.
      In all cases, ESI and gzip are not a good mix. Better support is planned for Varnish 3.0.
  25. 25. HTTP headers
      Varnish relies on HTTP headers to know what to cache and for how long.
      This is done through the Cache-Control HTTP header.

          Cache-Control: 30
          Cache-Control: max-age=900
          Cache-Control: no-cache
          Cache-Control: must-revalidate

      Read the HTTP RFC!
      http://tools.ietf.org/html/rfc2616#section-14.9
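
As a rough illustration of how a cache might read these header values, the Python sketch below splits a `Cache-Control` value into directives and extracts `max-age`. It is a simplified parser written for this example (it does not handle commas inside quoted strings) and is not how Varnish itself is implemented. The function names are hypothetical.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def max_age_seconds(value):
    """Return max-age as an int, or None when it is absent or malformed."""
    arg = parse_cache_control(value).get("max-age")
    return int(arg) if arg is not None and arg.isdigit() else None


for header in ("30", "max-age=900", "no-cache", "must-revalidate"):
    print(header, "->", max_age_seconds(header))
```

Of the four example values on the slide, only `max-age=900` yields a lifetime under this parser. A bare number such as `30` is not a directive defined in RFC 2616 section 14.9.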
  26. 26. keezmovies.com
  27. 27. keezmovies.com
      • Average of 13 million hits per day (~150 queries per second)
  28. 28. • Homepage gets a large part of the hits (~35%, ~53 queries per second)
  29. 29. • Logged-in traffic is a very, very, very small minority
      Perfect candidate for full-page caching
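
The per-second figures follow directly from the daily total. A quick check in Python, assuming traffic is spread evenly over the day (real traffic will peak well above this average):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

hits_per_day = 13_000_000
homepage_share = 0.35

total_qps = hits_per_day / SECONDS_PER_DAY
homepage_qps = total_qps * homepage_share

print(f"{total_qps:.1f} queries/s overall")
print(f"{homepage_qps:.1f} queries/s on the homepage")
```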
  30. 30. Some results for KM
      Tested four configurations:
      - Apache + PHP
      - Apache + PHP + APC
      - Lighttpd + PHP + APC
      - Varnish
      Test setup:
      - Homepage (size = 90k, gzipped = 10k).
      - Tested using Apache Benchmark with increasing concurrency.
  34. 34. But…
      - Content differs slightly for certain countries (notably, Germany)
      - Google Analytics cookies
      - And of course, not all GET requests are nullipotent
      The good news is, two of these three problems are easy to tackle!
  35. 35. Problem #1: Geolocalization
      Essentially, each page has 2 versions:
      - German visitor & disclaimer not accepted
      - Rest of the world & German visitor who accepted the disclaimer

          C{
          __attribute__((constructor)) void
          load_module()
          {
              /* … */
              /* Load the GeoIP helper library once, when the VCL is loaded. */
              handle = dlopen("/usr/lib/varnish/geoip.so", RTLD_NOW);
              if (handle != NULL) {
                  get_country_code = dlsym(handle, "get_country_code");
              }
          }
          }C
  36. 36. The following code is added to vcl_recv

          sub vcl_recv {
              C{
                  char *cc = (*get_country_code)(VRT_IP_string(sp, VRT_r_client_ip(sp)));
                  /* \017 is the octal length (15) of "X-Country-Code:" */
                  VRT_SetHdr(sp, HDR_REQ, "\017X-Country-Code:", cc,
                             vrt_magic_string_end);
              }C
              if (req.http.Cookie ~ "age_verified.*") {
                  set req.http.X-Age-Verified = "1";
              } else {
                  set req.http.X-Age-Verified = "0";
              }
          }

      The PHP page is responsible for setting the age_verified cookie once the disclaimer is accepted.
  37. 37. The following code is added to vcl_hash

          sub vcl_hash {
              if (req.http.x-country-code == "DE" &&
                  req.http.x-age-verified == "0") {
                  set req.hash += req.http.x-age-verified;
                  set req.hash += req.http.x-country-code;
              }
          }

      You can download the Varnish GeoIP library here:
      http://www.varnish-cache.org/trac/wiki/GeoipUsingInlineC
      It uses the MaxMind GeoIP library.
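
The net effect of this `vcl_hash` addition is that each URL has exactly two cached variants. A small Python sketch of the extra key component is shown below. The function name `page_variant` is hypothetical, and the two arguments stand for the `X-Country-Code` and `X-Age-Verified` header values set in `vcl_recv`.

```python
def page_variant(country_code, age_verified):
    """Extra cache-key component: only unverified German visitors differ."""
    if country_code == "DE" and age_verified == "0":
        # Mirrors the two `set req.hash +=` lines, in the same order.
        return age_verified + country_code
    return ""  # everyone else shares the default cached page


for visitor in [("DE", "0"), ("DE", "1"), ("FR", "0"), ("US", "1")]:
    print(visitor, "->", repr(page_variant(*visitor)))
```

Keeping the number of variants at two, rather than hashing on the raw country code for everyone, avoids splitting the cache into one copy per country.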
  38. 38. Problem #2: Google Analytics cookie

          sub vcl_recv {
              if (req.http.Cookie) {
                  if (req.http.Cookie ~ "user_cookie.*") {
                      return (pass);
                  }
                  remove req.http.Cookie;
              }
          }

      This removes all cookies except the ones we know to be useful.
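
Read literally, this snippet passes the request untouched when a `user_cookie` is present and otherwise drops the whole `Cookie` header, so that analytics cookies no longer prevent a cache lookup. A Python model of that branch is sketched below. The function name and the `"continue"` label (meaning "fall through to the rest of `vcl_recv`") are assumptions made for the example.

```python
import re


def handle_cookies(headers):
    """Return (action, headers) for the cookie-handling branch on the slide."""
    headers = dict(headers)  # work on a copy
    cookie = headers.get("Cookie")
    if cookie is not None:
        if re.search(r"user_cookie.*", cookie):
            return "pass", headers  # logged-in user: bypass the cache
        del headers["Cookie"]  # only tracking cookies: strip them
    return "continue", headers


print(handle_cookies({"Cookie": "__utma=1.2.3; __utmz=4.5.6"}))
print(handle_cookies({"Cookie": "__utma=1.2.3; user_cookie=abc"}))
```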
  39. 39. Problem #3: GET requests with side effects
      JSON UDP packets
  40. 40. Stats server
      • Node.js server, communicating with the database directly (could be communicating with the website through an API)
  41. 41. • Does batch queries
  42. 42. • Can handle and aggregate requests from many Varnish servers at the same time
  43. 43. • Bonus: can be used for many, many, many other things…
      Core: http://github.com/nicobn/AlysObserver
      Varnish module: http://github.com/nicobn/AlysVarnish
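
The idea of reporting side effects as JSON over UDP can be sketched in a few lines of Python. The slides do not describe the actual packet format, so the event fields below are invented for illustration, and the "server" is just a local socket standing in for the stats server.

```python
import json
import socket


def send_stat(sock, address, event):
    """Fire-and-forget: serialize one event as JSON and send it as a UDP datagram."""
    sock.sendto(json.dumps(event).encode("utf-8"), address)


if __name__ == "__main__":
    # Stand-in for the stats server: a UDP socket bound to a free local port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Hypothetical event: the field names are not taken from the slides.
    send_stat(client, server.getsockname(), {"event": "view", "url": "/video/123"})

    payload, _ = server.recvfrom(4096)
    print(json.loads(payload))

    client.close()
    server.close()
```

UDP suits this job because the sender never waits for the stats server: a lost packet costs one count, while a slow or dead stats server cannot stall page delivery.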
  44. 44. Side note: Your TTL is too high
      KeezMovies: 53 qps on home page
      Rapidly decreasing marginal utility
      Dr. Strangelove or how I learned to stop worrying and love low TTLs
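
The "rapidly decreasing marginal utility" point can be made with a back-of-the-envelope model: for one page receiving a steady stream of requests, roughly one request per TTL period reaches the backend and the rest are cache hits. The Python sketch below uses that simplification. It ignores traffic bursts, purges and cache eviction, and the function name is hypothetical.

```python
def hit_ratio(qps, ttl_seconds):
    """Approximate cache hit ratio for one page under steady traffic.

    With a TTL of `ttl_seconds`, about one request per TTL period goes to
    the backend; the remaining requests in that period are served from cache.
    """
    requests_per_ttl = qps * ttl_seconds
    if requests_per_ttl <= 1:
        return 0.0
    return 1.0 - 1.0 / requests_per_ttl


for ttl in (1, 5, 30, 300):
    print(f"TTL {ttl:>3}s -> hit ratio {hit_ratio(53, ttl):.4%}")
```

At 53 queries per second, a one-second TTL already serves about 98% of homepage requests from cache under this model, so raising the TTL to minutes buys very little while making the content staler.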
  45. 45. Questions?
