Varnish Resources✤ Varnish Cache Website http://www.varnish-cache.org/✤ The Varnish Book https://www.varnish-software.com/static/book/ Designed as a classroom-led official training manual. It’s written like a schoolbook, but it’s good and up-to-date in a way that a lot of Varnish documentation out there isn’t.
Where We Left Off✤ At the last meetup I introduced Varnish and explained its architecture and how to install and monitor it.✤ Tonight we’ll dive into Varnish Configuration Language and discuss how VCL defines Varnish’s policy for handling HTTP requests.✤ I’ll be discussing Varnish 3.0 specific syntax. Anyone using 2.1.5 or earlier should be aware that VCL syntax changed somewhat between v2 and v3.
Hits & Misses, Passes & Pipes✤ A hit happens when a request comes in and the hash matches a response in the cache. The response is sent to the client and the backend never knows about it.✤ A miss happens when the requested object is not present in the cache, or is present but expired or banned. The request is sent to the backend and its response is saved in the cache.✤ A pass happens when Varnish is configured to bypass the cache for certain requests. They are never cached and don’t figure in hit rates.✤ A pipe grants a direct passthrough to the backend. Used for media streams.
What Is VCL?✤ A domain-specific language used to define the way Varnish handles & caches HTTP requests✤ Has if statements & regexes but no user functions or loops✤ C-like syntax✤ Compiled to C by the server and linked dynamically
The VCL Engine✤ VCL defines how each HTTP request is processed. Each request is processed independently.✤ Varnish parses and verifies requests, but all policy decisions about how requests are cached and routed depend on VCL code.✤ Each predefined VCL function handles a particular phase of the request lifecycle, and ends with a return() statement that forwards the workflow to the next phase.
Built-In Functions & Keywords✤ VCL includes some built-in functions to be used during request processing.✤ regsub() and regsuball() allow a user to modify headers based on regex matching. There are some simpler methods using VMODs but it’s good to be comfortable writing complex regexes when implementing complicated logic in VCL.✤ The set keyword is used to set headers on a request or response, and the remove or unset keywords can remove them. Setting headers is the only way to pass information between VCL subroutines.✤ The ban() function allows you to remove entries from the cache.
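The built-ins above can be sketched in a few lines of Varnish 3.0 VCL. This is a fragment meant to be merged into a VCL file that already defines a backend. The header names (X-Device, X-Debug) and the User-Agent pattern are illustrative assumptions, not part of the original talk.

```vcl
sub vcl_recv {
    # regsub() replaces only the first match: drop a leading "www." from the Host header.
    set req.http.Host = regsub(req.http.Host, "^www\.", "");

    # regsuball() replaces every match: collapse runs of slashes in the URL.
    set req.url = regsuball(req.url, "//+", "/");

    # set stores a value in a header; later subroutines can read it back.
    # X-Device is a hypothetical header name chosen for this sketch.
    if (req.http.User-Agent ~ "(?i)iphone|android") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }

    # unset (or its alias remove) deletes a header.
    unset req.http.X-Debug;
}
```

Because this vcl_recv never calls return(), the default vcl_recv logic still runs afterwards.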
Writing VCL✤ Each VCL function has a default that runs unless your own code returns first. The default is appended to the end of any VCL function you include.✤ If you include the same VCL function twice, the bodies are appended in the order read.✤ It’s possible to include other files using the include statement. Statements can be defined as subroutines for convenience and readability and executed later with the call statement.
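A minimal sketch of include, custom subroutines, and call in Varnish 3.0 syntax. The file name backends.vcl and the subroutine name normalize_host are hypothetical; the included file is assumed to contain the backend definitions.

```vcl
# Pull in another file at this point (hypothetical file name).
include "backends.vcl";

# A custom subroutine. Its name must not start with "vcl_".
sub normalize_host {
    # Strip an explicit port such as ":80" from the Host header.
    set req.http.Host = regsub(req.http.Host, ":[0-9]+$", "");
}

sub vcl_recv {
    call normalize_host;
    # No return() here, so the default vcl_recv is appended and runs next.
}

# A second definition of vcl_recv is appended after the first one.
sub vcl_recv {
    if (req.url ~ "^/healthcheck$") {
        return (pass);
    }
}
```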
Return Values in VCL✤ Every VCL function has a set of possible return values that determine a request’s handling. error and restart are available in most functions.✤ return(error) passes control to vcl_error().✤ return(restart) increments the restart counter and begins again at vcl_recv().✤ The default VCL code will run for a function if the user code does not return().
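As one illustration of these return values, the fragment below restarts a request when the backend answers with a server error. The retry limit of 2 is an arbitrary assumption for this sketch, and a backend is assumed to be defined elsewhere in the file.

```vcl
sub vcl_fetch {
    # req.restarts counts how many times this request has been restarted.
    if (beresp.status >= 500 && req.restarts < 2) {
        # Start over at vcl_recv, for example to try another backend.
        return (restart);
    }
    # No return() on the normal path, so the default vcl_fetch code runs.
}
```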
Data Objects in VCL✤ VCL exposes global objects representing HTTP requests or responses that can be read and modified. Different objects are available in different VCL functions, e.g. beresp cannot be modified in vcl_recv because we don’t have a backend response yet.✤ The five main objects are the request object (req), response object (resp), the backend request and response objects (bereq and beresp), and the cache object (obj).✤ client and server expose data about the client and the Varnish server itself respectively. They are read-only, apart from client.identity, which can be set to steer the client director.
Defining Backends & Probes✤ Varnish forwards uncached requests to defined backends✤ Probes ensure a backend is healthy✤ To be considered healthy, a backend must pass at least .threshold of its last .window probes.
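A backend with a health probe might look like the sketch below in Varnish 3.0 syntax. The names web1 and healthcheck, the address 192.0.2.10, the port, and all timing values are placeholders chosen for illustration.

```vcl
probe healthcheck {
    .url = "/";          # path requested on each probe
    .interval = 5s;      # how often to probe
    .timeout = 1s;       # how long to wait for a reply
    .window = 5;         # look at the last 5 probes...
    .threshold = 3;      # ...and require at least 3 of them to succeed
}

backend web1 {
    .host = "192.0.2.10";
    .port = "8080";
    .probe = healthcheck;
}
```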
Directors ✤ Directors are collections of backends ✤ They define which backends are chosen for a request ✤ Director types include random, client, hash, round-robin, DNS, and fallback.
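A round-robin director over two backends could be declared roughly as follows. The backend names, director name, and addresses are assumptions made for the example.

```vcl
backend app1 { .host = "192.0.2.11"; .port = "8080"; }
backend app2 { .host = "192.0.2.12"; .port = "8080"; }

# Alternate between the two backends on successive requests.
director pool round-robin {
    { .backend = app1; }
    { .backend = app2; }
}

sub vcl_recv {
    # Send the request to the director instead of a single backend.
    set req.backend = pool;
}
```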
Access Control Lists ✤ ACLs consist of lists of IP addresses ✤ ACLs can be matched against in VCL code to restrict access
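A sketch of an ACL and a match against it, following the pattern used in the Varnish 3.0 documentation. The ACL name internal, the address range, and the /admin path are illustrative assumptions; a backend is assumed to be defined elsewhere.

```vcl
acl internal {
    "localhost";
    "192.0.2.0"/24;   # example range, replace with your own network
}

sub vcl_recv {
    if (req.url ~ "^/admin") {
        # Reject anyone whose address is not in the ACL.
        if (!client.ip ~ internal) {
            error 403 "Forbidden";
        }
    }
}
```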
Receiving Requests with vcl_recv✤ vcl_recv happens to every request.✤ Use it to check and add headers, perform bans and redirects, forward to proxies, and manage cookies.✤ The req object is available here.✤ Possible return values are pass, lookup, pipe, and error.
Default vcl_recv✤ The vcl_recv default is important to understand in order to get good cache hit ratios.✤ It won’t cache in the presence of cookies or authorization headers.✤ You’ll generally need to use your own vcl_recv to strip cookies you don’t care about to have any caching at all.
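One way to strip cookies you don’t care about is sketched below. It assumes that static assets never depend on cookies and that Google Analytics style __utm* cookies are the only ones to discard; both assumptions need checking against your own application.

```vcl
sub vcl_recv {
    # Static assets: assumed to be identical for every visitor.
    if (req.url ~ "\.(css|js|png|gif|jpe?g|ico)(\?.*)?$") {
        unset req.http.Cookie;
    }

    if (req.http.Cookie) {
        # Remove each __utm* cookie, wherever it sits in the header.
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)__utm[a-z]+=[^;]*", "");
        # Tidy up a leading separator left behind by the removal.
        set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");

        # If nothing is left, drop the header so the default vcl_recv can cache.
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
    # No return(): the default vcl_recv still applies its cookie and auth checks.
}
```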
Proper Passing via vcl_pass✤ vcl_pass happens when a request is passed in vcl_recv, vcl_hit, or vcl_miss.✤ A return(pass) in vcl_pass sends control to vcl_fetch.✤ The default content is to simply return(pass).✤ The req and the bereq objects are available.✤ Possible return values are pass, restart, and error.
Heavenly Hashes with vcl_hash✤ vcl_hash happens to any request calling return(lookup) in vcl_recv.✤ Use it to add values to the hash. Any aspects of the request that affect the returned content should be included in the hash by calling hash_data().✤ The req object is available.✤ The only possible return value is hash.
Default vcl_hash✤ The vcl_hash default includes the URL and the host or IP address.✤ Anything that changes the HTML returned from the backend needs to be hashed on. Depending on your application that may be device types, login cookies, or referers.✤ It’s a good idea to strip any utm_source or other tracking query string elements to prevent unnecessarily fragmenting your cache.
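The two ideas above, hashing on extra request properties and stripping tracking parameters, could be sketched like this. X-Device is a hypothetical header assumed to have been set earlier in vcl_recv, and the regexes are a simple approximation rather than a complete query string parser.

```vcl
sub vcl_recv {
    if (req.url ~ "[?&]utm_[a-z]+=") {
        # Remove each utm_* parameter but keep the separator that preceded it.
        set req.url = regsuball(req.url, "([?&])utm_[a-z]+=[^&]*", "\1");
        # Clean up the separators left behind.
        set req.url = regsuball(req.url, "&&+", "&");
        set req.url = regsub(req.url, "\?&", "?");
        set req.url = regsub(req.url, "[?&]$", "");
    }
}

sub vcl_hash {
    # Cache a separate variant per device class.
    if (req.http.X-Device) {
        hash_data(req.http.X-Device);
    }
    # No return(): the default vcl_hash adds the URL and host, then returns hash.
}
```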
Handling Hits in vcl_hit✤ vcl_hit is executed when a hashed request is found in the cache.✤ A return(pass) in vcl_hit sends control to vcl_pass.✤ The default content is to simply return(deliver), which sends control to vcl_deliver().✤ req and obj are exposed to vcl_hit.✤ Possible return values are deliver, pass, restart, and error.
Manipulating Misses with vcl_miss✤ vcl_miss is executed when a hashed request is not found in the cache.✤ A return(pass) in vcl_miss sends control to vcl_pass.✤ The default content is return(fetch), which sends control to vcl_fetch().✤ req and bereq are exposed to vcl_miss.✤ Possible return values are fetch, pass, restart, and error.
Fantastic Fetching in vcl_fetch✤ vcl_fetch is executed after a backend request is made but before that response is stored in the cache. It happens either after vcl_miss returns fetch or vcl_pass returns pass.✤ The req, bereq, and beresp objects are accessible in vcl_fetch.✤ Possible return values are deliver, hit_for_pass, restart, and error.
Default vcl_fetch✤ The vcl_fetch default is to deliver unless the backend has set a cookie, sent a Vary: * header, or the TTL is 0 or less. In those cases it returns hit_for_pass.✤ hit_for_pass is a special condition that stores an object in the cache but sets a flag marking it as content that should be fetched fresh from the backend for as long as the flag exists.
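A custom vcl_fetch that works with these defaults might look like the following sketch. The URL patterns and TTL values are assumptions for illustration, and a backend is assumed to be defined elsewhere.

```vcl
sub vcl_fetch {
    # Static assets: drop any Set-Cookie so the default logic will cache them.
    if (req.url ~ "\.(css|js|png|gif|jpe?g|ico)(\?.*)?$") {
        unset beresp.http.Set-Cookie;
        set beresp.ttl = 1h;
    }

    # Per-user pages (hypothetical path): remember for two minutes
    # that this URL should always go to the backend.
    if (req.url ~ "^/account") {
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
    # Everything else falls through to the default vcl_fetch.
}
```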
Dynamic Delivery in vcl_deliver✤ vcl_deliver is executed when Varnish returns content to the client.✤ The resp object is accessible, and req can still be read. The backend objects (bereq and beresp) are not in scope.✤ The default is to return(deliver).✤ Possible return values are deliver and restart.
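A common use of vcl_deliver is to adjust response headers just before they go out. The sketch below adds a debugging header based on obj.hits, which Varnish 3.0 also exposes in this function. The header name X-Cache is a convention, not something Varnish requires.

```vcl
sub vcl_deliver {
    # obj.hits is the number of times this cached object has been delivered.
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }

    # Hide implementation details from clients.
    unset resp.http.X-Varnish;
    unset resp.http.Via;
}
```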
Expectorate Errors using vcl_error✤ vcl_error is executed when any VCL function returns error.✤ The default uses the synthetic keyword to deliver a Guru Meditation error to the client.✤ Possible return values are deliver and restart.
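To replace the Guru Meditation page with your own, a vcl_error along these lines could be used. The HTML is a placeholder; obj.status and obj.response carry the status code and reason passed to error.

```vcl
sub vcl_error {
    set obj.http.Content-Type = "text/html; charset=utf-8";

    # synthetic builds the response body; {" ... "} is a long string literal.
    synthetic {"<!DOCTYPE html>
<html>
  <head><title>"} + obj.status + " " + obj.response + {"</title></head>
  <body><h1>Sorry, something went wrong.</h1></body>
</html>
"};

    return (deliver);
}
```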
Other VCL functions✤ vcl_pipe is executed when vcl_recv returns pipe. It’s used for streaming media and tells Varnish to pipe that client directly to the backend for the duration of the HTTP connection. If you use this, you should explicitly close the connection from the backend when done piping with a Connection: close header.✤ vcl_init and vcl_fini are functions that get called upon the loading and unloading of a VCL script, respectively. They’re used for initializing and cleaning up VMODs during startup and shutdown.
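The piping advice above can be sketched as follows. The /stream/ path is a hypothetical example of where streamed media might live, and a backend is assumed to be defined elsewhere.

```vcl
sub vcl_recv {
    # Hand long-lived streams straight to the backend.
    if (req.url ~ "^/stream/") {
        return (pipe);
    }
}

sub vcl_pipe {
    # Ask the backend to close the connection when the piped exchange ends,
    # so later requests on it are not piped blindly.
    set bereq.http.Connection = "close";
    return (pipe);
}
```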
Using VMODs✤ VMODs are Varnish modules. They’re initialized with the import keyword and they export functions into VCL.✤ vmod_std is packaged with Varnish and provides small useful functions.✤ Other Varnish modules are available, including GeoIP lookups and Redis and Memcached clients, and custom VMODs can be written in C for use with Varnish.
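A short sketch of importing vmod_std and calling two of its functions, std.tolower() and std.log(). The log message text is arbitrary, and a backend is assumed to be defined elsewhere.

```vcl
import std;

sub vcl_recv {
    # Normalize the Host header so "Example.com" and "example.com" share cache entries.
    if (req.http.Host) {
        set req.http.Host = std.tolower(req.http.Host);
    }

    # Write a line to the shared memory log, visible with varnishlog.
    std.log("vcl_recv saw " + req.url);
}
```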
Banning Basics✤ You can use the built-in VCL ban() function to perform bans of cached content.✤ Calling the ban() function adds your expression to the ban list, which is checked after a cache object is found. If the object matches a ban on the ban list, it’s considered a miss rather than a hit.✤ ban("req.url == " + req.url); would ban any content from a URL matching req.url. Bans can be set to match on any request header, not just URL.
Active Banning✤ It’s possible to write VCL that accepts ban requests from your backends. This lets a backend actively ban changed content from the Varnish cache, ensuring clients get the most up-to-date version of the content.✤ This example bans based on the URL, but it could have a more complex rule that matches a header set by the backend.
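The slide’s original example is not reproduced in this transcript. A sketch of what such a handler might look like, following the pattern in the Varnish 3.0 documentation, is shown below. The BAN request method, the ACL name bansources, and the address in it are assumptions for illustration.

```vcl
acl bansources {
    "localhost";
    "192.0.2.10";   # example backend address allowed to issue bans
}

sub vcl_recv {
    if (req.request == "BAN") {
        # Only trusted hosts may invalidate content.
        if (!client.ip ~ bansources) {
            error 405 "Not allowed.";
        }
        # Ban every cached object for this host and URL.
        ban("req.http.host == " + req.http.host + " && req.url == " + req.url);
        error 200 "Ban added";
    }
}
```

A backend could then trigger the ban by sending a BAN request for the changed URL to Varnish, with the Host header set to the site in question.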
VCL Takeaways✤ The Varnish website contains a lot of great examples of handy tricks you can use in your own VCL code, but it’s important to understand what you’re doing before you implement them.✤ Know your HTTP protocol. Varnish is tightly tied to HTTP so it’s very helpful to understand the HTTP protocol and its intricacies, especially how basic HTTP caching headers are interpreted by browsers and other clients.✤ Keep it simple at first, and iterate improvements. It’s tempting to write a huge VCL policy that has all the bells and whistles but a complex VCL can be difficult to debug. Start small and add complexity as you go.
A Closing Testimonial From Jay-Z✤ “If you’re having scaling problems, I feel bad for you son... Clients sent 99 requests but my backend got one.” Photo by flickr user matthew_harrison
Til Next Time...✤ Come back next month (date TBD) for another exciting adventure with Varnish... Advanced VCL Tricks
Sources & Links✤ Detailed VCL flowchart https://www.varnish-cache.org/trac/wiki/VCLExampleDefault✤ VMOD Library https://www.varnish-cache.org/vmods✤ Upgrading from Varnish 2.1 to Varnish 3.0 https://www.varnish-cache.org/docs/3.0/installation/upgrade.html