Varnish: A brief introduction
Nicolas A. Bérard-Nault
June 15, 2011

June8 presentation

  1. Varnish: A brief introduction
      Nicolas A. Bérard-Nault
      June 15, 2011
  2. Regular page view
  3. Reverse proxy cached page view
  4. So what is Varnish?
      - Reverse proxy cache
      - Designed from the ground up to be an HTTP accelerator solution
      We will cover
      - Default configuration and options
      - ESI
      - HTTP headers
      - Keezmovies.com
          - Benchmarks
          - Use case
          - Problems & solutions
  9. Configuring Varnish
      Varnish uses a configuration file that is compiled to C on the fly and loaded as a shared library. The configuration format is called the VCL (Varnish Configuration Language), a domain-specific language reminiscent of Perl.
      If the VCL is not enough, you can configure using inline C and the VRT (Varnish Run Time) library.
      For a full reference:
      http://www.varnish-cache.org/docs/2.1/tutorial/vcl.html
  10. Step by step through the configuration
      Back end definitions

          backend www {
              .host = "www.example.com";
              .port = "http";
              .connect_timeout = 1s;
              .first_byte_timeout = 5s;
              .between_bytes_timeout = 2s;
              .probe = {
                  .url = "/test.jpg";
                  .timeout = 0.3 s;
                  .window = 8;
                  .threshold = 3;
              }
          }

      You can have as many backends as you want
  11. Step by step through the configuration
      Director definitions

          director www_director random {
              { .backend = www1; .weight = 2; }
              { .backend = www2; .weight = 1; }
          }

          director www_director round-robin {
              { .backend = www1; }
              { .backend = www2; }
          }

      You can have as many directors as you want
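      A backend or director only takes effect once a request is routed to it. The sketch below shows one way to do that in vcl_recv, assuming Varnish 2.1/3.0 syntax and the `www` and `www_director` names defined above; the `/static/` prefix is a hypothetical example, not something from the talk.

      ```vcl
      sub vcl_recv {
          if (req.url ~ "^/static/") {
              # Hypothetical rule: send static assets to the single backend.
              set req.backend = www;
          } else {
              # Everything else is load-balanced by the director.
              set req.backend = www_director;
          }
      }
      ```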
  12. Highly simplified flow chart of Varnish operations
  13. Step by step through the configuration
      vcl_recv: connection is received

          sub vcl_recv {
              if (req.restarts == 0) {
                  if (req.http.x-forwarded-for) {
                      set req.http.X-Forwarded-For =
                          req.http.X-Forwarded-For + ", " + client.ip;
                  } else {
                      set req.http.X-Forwarded-For = client.ip;
                  }
              }
              if (req.http.Authorization || req.http.Cookie) {
                  /* Not cacheable by default */
                  return (pass);
              }
              if (req.request != "GET" && req.request != "HEAD") {
                  /* We only deal with GET and HEAD by default */
                  return (pass);
              }
              return (lookup);
          }
  14. Be careful with your HTTP verbs…
      But we always cheat…
  15. vcl_hash
  16. vcl_hash: create object hash for request

          sub vcl_hash {
              hash_data(req.url);
              if (req.http.host) {
                  hash_data(req.http.host);
              } else {
                  hash_data(server.ip);
              }
              return (hash);
          }
  17. vcl_hit, vcl_miss
  18. vcl_pass: request not cacheable

          sub vcl_pass {
              return (pass);
          }

      vcl_hit: post-lookup, object exists in cache

          sub vcl_hit {
              return (deliver);
          }

      vcl_miss: post-lookup, object does not exist in cache

          sub vcl_miss {
              return (fetch);
          }
  19. vcl_fetch
  20. vcl_fetch: after the object is fetched from the back end

          sub vcl_fetch {
              if (beresp.ttl <= 0s ||
                  beresp.http.Set-Cookie ||
                  beresp.http.Vary == "*") {
                  set beresp.ttl = 120 s;
                  return (hit_for_pass);
              }
              return (deliver);
          }
  21. vcl_fetch
  22. Step by step through the configuration
      vcl_deliver: object is to be delivered to client

          sub vcl_deliver {
              return (deliver);
          }
  23. ESI (edge-side include)
      Invented by Akamai; only a subset is supported by Varnish
      Varnish supports include:

          <div>
          Hello:
          <esi:include src="/getname.php" />
          </div>

      Will be processed into:

          <div>
          Hello:
          Roger Cyr
          </div>
  24. ESI (edge-side include)
      To enable ESI processing, use the esi keyword in vcl_fetch.
      ESI and gzip
      Varnish WILL NOT be able to do ESI processing on gzip'ed backend responses. It will also not be able to un-gzip an ESI response.
      In all cases, ESI and gzip are not a good mix. Better support is planned for Varnish 3.0.
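      As a minimal sketch of the two points on this slide, assuming Varnish 2.1 syntax: the `esi` keyword is called in vcl_fetch, and the Accept-Encoding request header is dropped for the same pages so the backend answers uncompressed. The `\.html$` pattern is a hypothetical way of selecting the pages that contain ESI tags, and stripping Accept-Encoding is a common workaround rather than something shown in the talk.

      ```vcl
      sub vcl_recv {
          if (req.url ~ "\.html$") {
              # Ask the backend for an uncompressed response so the
              # ESI tags can be parsed.
              remove req.http.Accept-Encoding;
          }
      }

      sub vcl_fetch {
          if (req.url ~ "\.html$") {
              # Varnish 2.1 keyword. In Varnish 3.0 the equivalent is:
              #   set beresp.do_esi = true;
              esi;
          }
      }
      ```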
  25. HTTP headers
      Varnish relies on HTTP headers to know what to cache and for how long.
      This is done through the Cache-Control HTTP header.

          Cache-Control: 30
          Cache-Control: max-age=900
          Cache-Control: no-cache
          Cache-Control: must-revalidate

      Read the HTTP RFC!
      http://tools.ietf.org/html/rfc2616#section-14.9
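      When a backend sends no Cache-Control header at all, a TTL can be assigned in VCL instead. This is a minimal sketch in Varnish 2.1/3.0 syntax; the 30-second value is an arbitrary placeholder, not a setting from the talk.

      ```vcl
      sub vcl_fetch {
          if (!beresp.http.Cache-Control) {
              # The backend gave no caching instructions: fall back to a
              # short, explicit TTL (placeholder value).
              set beresp.ttl = 30s;
          }
      }
      ```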
  26. keezmovies.com
  27. keezmovies.com
      - Average of 13 million hits per day (~150 queries per second)
      - Homepage gets a large part of the hits (~35%, ~53 queries per second)
      - Logged-in traffic is a very, very, very small minority
      Perfect candidate for full page caching
  30. Some results for KM
      Tested four configurations:
      - Apache + PHP
      - Apache + PHP + APC
      - Lighttpd + PHP + APC
      - Varnish
      Target: homepage (size = 90k, gzipped = 10k).
      Tested using Apache Benchmark with increasing concurrency.
  34. But…
      - Content differs slightly for certain countries (notably Germany)
      - Google Analytics cookies
      - And of course, not all GET requests are nullipotent
      The good news is, two of these three problems are easily tackled!
  35. Problem #1: Geolocalization
      Essentially, each page has 2 versions:
      - German visitor & disclaimer not accepted
      - Rest of the world & German visitor who accepted disclaimer

          C{
              #include <dlfcn.h>

              /* The declarations are not shown on the slide; these are
                 inferred from how the names are used here and in vcl_recv. */
              static void *handle = NULL;
              static char *(*get_country_code)(const char *ip) = NULL;

              /* Runs once, when the compiled VCL is loaded. */
              __attribute__((constructor)) void
              load_module()
              {
                  /* … */
                  handle = dlopen("/usr/lib/varnish/geoip.so", RTLD_NOW);
                  if (handle != NULL) {
                      get_country_code = dlsym(handle, "get_country_code");
                  }
              }
          }C
  36. The following code is added to vcl_recv

          sub vcl_recv {
              C{
                  char *cc = (*get_country_code)(VRT_IP_string(sp, VRT_r_client_ip(sp)));
                  /* \017 is the octal length (15) of "X-Country-Code:" */
                  VRT_SetHdr(sp, HDR_REQ, "\017X-Country-Code:", cc,
                      vrt_magic_string_end);
              }C
              if (req.http.Cookie ~ "age_verified.*") {
                  set req.http.X-Age-Verified = "1";
              } else {
                  set req.http.X-Age-Verified = "0";
              }
          }

      The PHP page is responsible for setting the age_verified cookie once the disclaimer is accepted
  37. The following code is added to vcl_hash

          sub vcl_hash {
              if (req.http.x-country-code == "DE" &&
                  req.http.x-age-verified == "0") {
                  set req.hash += req.http.x-age-verified;
                  set req.hash += req.http.x-country-code;
              }
          }

      You can download the Varnish GeoIP library here:
      http://www.varnish-cache.org/trac/wiki/GeoipUsingInlineC
      It uses the MaxMind GeoIP library.
  38. Problem #2: Google Analytics cookie

          sub vcl_recv {
              if (req.http.Cookie) {
                  if (req.http.Cookie ~ "user_cookie.*") {
                      return (pass);
                  }
                  remove req.http.Cookie;
              }
          }

      Requests carrying a cookie we know to be useful (user_cookie) are passed to the backend; for every other request the whole Cookie header is removed, so Google Analytics cookies no longer prevent caching
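      A finer-grained variant would strip only the Google Analytics cookies and leave the rest of the Cookie header in place. The sketch below is not from the talk; it assumes Varnish 2.1/3.0 syntax and that the Analytics cookies are the ones whose names start with `__utm`.

      ```vcl
      sub vcl_recv {
          if (req.http.Cookie) {
              # Remove every __utm* cookie, together with its separator.
              set req.http.Cookie =
                  regsuball(req.http.Cookie, "(^|; ?)__utm[a-z]+=[^;]*", "");
              # Drop a leading separator left behind by the substitution.
              set req.http.Cookie = regsub(req.http.Cookie, "^; ?", "");
              if (req.http.Cookie == "") {
                  # Nothing useful is left: the request becomes cacheable.
                  remove req.http.Cookie;
              }
          }
      }
      ```

      With this variant a request whose only cookies are Analytics cookies ends up with no Cookie header, while any other cookie survives and still triggers the default pass.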
  39. Problem #3: GET requests with side effects
      JSON UDP packets
  40. Stats server
      - Node.js server, communicating with database directly (could be communicating with website through API)
      - Does batch queries
      - Can handle and aggregate requests from many Varnish servers at the same time
      - Bonus: can be used for many, many, many other things…
      Core: http://github.com/nicobn/AlysObserver
      Varnish module: http://github.com/nicobn/AlysVarnish
  44. Side note: Your TTL is too high
      KeezMovies: 53 qps on home page
      Rapidly decreasing marginal utility
      Dr. Strangelove or how I learned to stop worrying and love low TTLs
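      A rough worked example of the diminishing returns, assuming steady traffic at r = 53 requests per second and exactly one backend fetch per TTL window of length T (both are simplifications, not measurements from the talk):

      ```latex
      \[
        \text{hit ratio}(T) \approx 1 - \frac{1}{rT}, \qquad r = 53\ \text{req/s}
      \]
      \[
      \begin{aligned}
        T = 1\,\text{s}:  &\quad 1 - \tfrac{1}{53}   \approx 98.11\% \\
        T = 10\,\text{s}: &\quad 1 - \tfrac{1}{530}  \approx 99.81\% \\
        T = 60\,\text{s}: &\quad 1 - \tfrac{1}{3180} \approx 99.97\%
      \end{aligned}
      \]
      ```

      Under these assumptions a one-second TTL already absorbs about 98% of home page requests, and raising it to a full minute gains less than two percentage points.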
  45. Questions?
