Varnish: A brief introduction
Nicolas A. Bérard-Nault
June 15, 2011
Regular page view
Reverse proxy cached page view
So what is Varnish?
- Reverse proxy cache
- Designed from the ground up to be an HTTP accelerator solution

We will cover
- Default configuration and options
- ESI
- HTTP headers
- Keezmovies.com: benchmarks, use case, problems & solutions
Configuring Varnish
Varnish uses a configuration file that is compiled to C on the fly and loaded as a shared library.
The configuration format is called VCL (Varnish Configuration Language), a domain-specific language reminiscent of Perl.
If VCL is not enough, you can configure using inline C and the VRT (Varnish Run Time) library.
For a full reference: http://www.varnish-cache.org/docs/2.1/tutorial/vcl.html
Step by step through the configuration
Backend definitions

    backend www {
        .host = "www.example.com";
        .port = "http";
        .connect_timeout = 1s;
        .first_byte_timeout = 5s;
        .between_bytes_timeout = 2s;
        .probe = {
            .url = "/test.jpg";
            .timeout = 0.3 s;
            .window = 8;
            .threshold = 3;
        }
    }

You can have as many backends as you want
Step by step through the configuration
Director definitions

    director www_director random {
        { .backend = www1; .weight = 2; }
        { .backend = www2; .weight = 1; }
    }

    director www_director round-robin {
        { .backend = www1; }
        { .backend = www2; }
    }

You can have as many directors as you want
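A director only takes effect once a request is routed to it. The fragment below is a minimal sketch of that step, assuming the `www_director` name from the slide and the `req.backend` variable used by Varnish 2.1/3.0. The slide shows two alternative definitions under the same name; a real configuration would keep only one of them, or give each a distinct name.

```vcl
sub vcl_recv {
    # Route every incoming request through the director; the director
    # then picks www1 or www2 according to its policy (random or round-robin).
    set req.backend = www_director;
}
```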
Highly simplified flow chart of Varnish operations
Step by step through the configuration
vcl_recv: connection is received

    sub vcl_recv {
        if (req.restarts == 0) {
            if (req.http.x-forwarded-for) {
                set req.http.X-Forwarded-For =
                    req.http.X-Forwarded-For + ", " + client.ip;
            } else {
                set req.http.X-Forwarded-For = client.ip;
            }
        }
        if (req.http.Authorization || req.http.Cookie) {
            /* Not cacheable by default */
            return (pass);
        }
        if (req.request != "GET" && req.request != "HEAD") {
            /* We only deal with GET and HEAD by default */
            return (pass);
        }
        return (lookup);
    }
Be careful with your HTTP verbs…
But we always cheat…
vcl_hash
vcl_hash: create object hash for request

    sub vcl_hash {
        hash_data(req.url);
        if (req.http.host) {
            hash_data(req.http.host);
        } else {
            hash_data(server.ip);
        }
        return (hash);
    }
vcl_hit, vcl_miss
vcl_pass: request not cacheable

    sub vcl_pass {
        return (pass);
    }

vcl_hit: post-lookup, object exists in cache

    sub vcl_hit {
        return (deliver);
    }

vcl_miss: post-lookup, object does not exist in cache

    sub vcl_miss {
        return (fetch);
    }
vcl_fetch
vcl_fetch: after the object is fetched from the backend

    sub vcl_fetch {
        if (beresp.ttl <= 0s ||
            beresp.http.Set-Cookie ||
            beresp.http.Vary == "*") {
            set beresp.ttl = 120 s;
            return (hit_for_pass);
        }
        return (deliver);
    }
vcl_fetch
Step by step through the configuration
vcl_deliver: object is to be delivered to the client

    sub vcl_deliver {
        return (deliver);
    }
ESI (edge-side include)
Invented by Akamai; only a subset is supported by Varnish.
Varnish supports include:

    <div>Hello: <esi:include src="/getname.php" /></div>

Will be processed into:

    <div>Hello: Roger Cyr</div>
ESI (edge-side include)
To enable ESI processing, use the esi keyword in vcl_fetch.

ESI and gzip
Varnish WILL NOT be able to do ESI processing on gzipped backend responses. It will also not be able to gunzip an ESI response.
In all cases, ESI and gzip are not a good mix. Better support is planned for Varnish 3.0.
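As a minimal sketch of the esi keyword in use, the fragment below turns on ESI processing only for HTML pages. The URL pattern is an assumption chosen for illustration, and the `esi;` statement is the Varnish 2.1 form the slide refers to; Varnish 3.0 replaces it with `set beresp.do_esi = true;`.

```vcl
sub vcl_fetch {
    # Hypothetical rule: only parse pages that may contain <esi:include> tags.
    if (req.url ~ "\.html$") {
        esi;  /* Varnish 2.1 syntax: process ESI tags in this response */
    }
}
```

Restricting ESI to the pages that need it avoids parsing every response body, images included.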
HTTP headers
Varnish relies on HTTP headers to know what to cache and for how long.
This is done through the Cache-Control HTTP header.

    Cache-Control: 30
    Cache-Control: max-age=900
    Cache-Control: no-cache
    Cache-Control: must-revalidate

Read the HTTP RFC!
http://tools.ietf.org/html/rfc2616#section-14.9
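When a backend sends no caching headers at all, Varnish falls back to its configured default TTL. The sketch below shows one way to set that fallback explicitly per response in vcl_fetch; the 30-second value is an arbitrary assumption, not a figure from this presentation.

```vcl
sub vcl_fetch {
    # Assumption for illustration: responses without Cache-Control or
    # Expires get a short, explicit lifetime instead of the global default.
    if (!beresp.http.Cache-Control && !beresp.http.Expires) {
        set beresp.ttl = 30s;
    }
}
```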
keezmovies.com
- Average of 13 million hits per day (~150 queries per second)
- Homepage gets a large part of the hits (~35%, ~53 queries per second)
- Logged-in traffic is a very, very, very small minority

Perfect candidate for full page caching
Some results for KM
Tested four configurations:
- Apache + PHP
- Apache + PHP + APC
- Lighttpd + PHP + APC
- Varnish

- Homepage (size = 90k, gzipped = 10k).
- Tested using Apache Benchmark with increasing concurrency.
But…
- Content differs slightly for certain countries (notably, Germany)
- Google Analytics cookies
- And of course, not all GET requests are nullipotent

The good news is, two of these three problems are easy to tackle!
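The Google Analytics cookies matter because the default vcl_recv passes any request that carries a Cookie header, so every tracked visitor would bypass the cache. A common approach, sketched here as an illustration rather than as the configuration used on the site, is to strip the `__utm*` cookies before the lookup, since only client-side JavaScript reads them.

```vcl
sub vcl_recv {
    if (req.http.Cookie) {
        # Drop Google Analytics cookies (__utma, __utmb, __utmz, ...).
        set req.http.Cookie =
            regsuball(req.http.Cookie, "(^|;\s*)__utm[a-z]+=[^;]*", "");
        # Remove a leading ";" left behind by the substitution.
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
        # If nothing else remains, remove the header so the request can be cached.
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}
```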
Problem #1: Geolocation
Essentially, each page has 2 versions:
- German visitor & disclaimer not accepted
- Rest of the world & German visitor who accepted the disclaimer

    C{
        __attribute__((constructor)) void load_module()
        {
            /* … */
            handle = dlopen("/usr/lib/varnish/geoip.so", RTLD_NOW);
            if (handle != NULL) {
                get_country_code = dlsym(handle, "get_country_code");
            }
        }
    }C
The following code is added to vcl_recv

    sub vcl_recv {
        C{
            char *cc = (*get_country_code)(VRT_IP_string(sp, VRT_r_client_ip(sp)));
            VRT_SetHdr(sp, HDR_REQ, "\017X-Country-Code:", cc, vrt_magic_string_end);
        }C
        if (req.http.Cookie ~ "age_verified.*") {
            set req.http.X-Age-Verified = "1";
        } else {
            set req.http.X-Age-Verified = "0";
        }
    }

The PHP page is responsible for setting the age_verified cookie once the disclaimer is accepted
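For the two page versions to be cached separately, the two headers set above have to influence the cache key. The sketch below is one possible way to do that, not necessarily what the site did: it adds an extra hash component only for German visitors who have not accepted the disclaimer. The "DE" country code value and the "de-unverified" token are assumptions.

```vcl
sub vcl_hash {
    # Hypothetical variant key: unverified German visitors get their own
    # cached copy; everyone else shares the regular one.
    if (req.http.X-Country-Code == "DE" && req.http.X-Age-Verified == "0") {
        hash_data("de-unverified");
    }
    # No return here: the built-in vcl_hash runs next and adds the URL and host.
}
```

Hashing on the derived two-way variant rather than on the raw country code keeps the cache at two copies per page instead of one per country.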
