Varnish Cache Plus. Random notes for wise web developers

Varnish Cache Plus
Random notes for wise web developers
Carlos Abalde, Roberto Moreda
{cabalde, moreda}@allenta.com
October 2014

Agenda
1. Introduction
2. Varnish 101
3. Invalidations
4. HTTP headers
5. Content composition
6. VAC
7. VCS
8. Device detection
9. Varnish Plus 4.x
10. Q&A

Disclaimer
๏ General understanding of ‘The Varnish Book’ is assumed
‣ This is not the official Varnish Cache training
‣ This is not a Varnish Cache internals course
‣ This is not a Varnish module development course
‣ This is a collection of random notes for web developers
willing to make the most of Varnish Cache Plus
๏ OSS Varnish Cache vs. Varnish Cache Plus
‣ 3.x vs. 4.x

Varnish Cache 3.x
What everybody should know
๏ The Varnish Book
‣ https://www.varnish-software.com/static/book/
๏ The Varnish Reference Manual
‣ https://www.varnish-cache.org/docs/.../index.html
๏ Default VCL
‣ https://www.varnish-cache.org/trac/.../default.vcl

Varnish Cache Plus 3.x
Components I
๏ Support, advice & training
๏ Varnish Enhanced Cache Invalidation
‣ Hash Two, Hash Ninja…
๏ Varnish Administration Console (VAC)
๏ Varnish Custom Statistics (VCS)
๏ Device detection

Components II
๏ Varnish Tuner
๏ Enhanced HTTP streaming
๏ Packaged binary VMODs
๏ Varnish Paywall
๏ … and more to come shortly!

Supported platforms
๏ 64 bits
๏ Distributions
‣ RedHat Enterprise Linux 5 & 6
‣ Ubuntu Linux 12.04 LTS (precise)
‣ Ubuntu Linux 14.04 LTS (trusty)
‣ Debian Linux 7 (wheezy)

Caching policy
๏ Varnish Cache Plus would require zero configuration
in a perfect world with perfect HTTP citizens
‣ Correct HTTP caching headers
‣ Vary HTTP header used wisely
‣ HTTP cookies used conservatively
๏ By default Varnish Cache Plus will not cache
anything marked as private, carrying a cookie or
including a '*' Vary HTTP header
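
The default behaviour described above can be approximated in VCL 3.x roughly as follows. This is a simplified sketch modelled on the stock default VCL, not a copy of it; the explicit Cache-Control check for private is an assumption written out here to mirror the bullet. No return is issued on the cacheable path, so the built-in default logic still runs afterwards.

```vcl
sub vcl_recv {
    # Requests carrying credentials or cookies bypass the cache.
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
}

sub vcl_fetch {
    # Uncacheable responses are remembered as hit-for-pass objects.
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*" ||
        beresp.http.Cache-Control ~ "private") {  # assumption, see lead-in
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
}
```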

VCL
Overview
๏ Varnish Configuration Language
‣ Domain specific state engine
‣ No loops, variables, functions…
‣ Command line configuration & Tunable parameters
๏ Translated to C code
๏ Loaded as a dynamically generated shared library
‣ Zero downtime & Blazingly fast

VCL
vcl_recv I
๏ Normalize client-input
๏ Pick a backend / director
๏ Re-write / extend client-input
๏ Decide caching policy based on client-input
๏ Access control
๏ Security barriers

VCL
vcl_recv II
sub vcl_recv {
    # Backend selection & URL normalization.
    if (req.http.host ~ "^blogs\.") {
        set req.backend = blogs;
        set req.http.host = regsub(req.http.host, "^blogs\.", "");
        set req.url = regsub(req.url, "^", "/blogs");
    } else {
        set req.backend = default;
    }

    # Poor man's device detection.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}

VCL
vcl_fetch I
๏ Sanitize / extend backend response
๏ Override cache duration
‣ beresp.ttl
- s-maxage & max-age in Cache-Control HTTP header
- Expires HTTP header
- Default TTL
‣ Beware of the TTL of hitpass objects

VCL
vcl_fetch II
sub vcl_fetch {
    # Override caching TTL.
    if (beresp.http.Cache-Control !~ "s-maxage") {
        set beresp.ttl = 0s;
        if (bereq.url ~ "\.jpg(\?|$)") {
            set beresp.ttl = 30s;
        }
    }

    # Never cache a Set-Cookie header.
    if (beresp.ttl > 0s) {
        unset beresp.http.Set-Cookie;
    }

    # Create ban-lurker friendly objects.
    set beresp.http.X-Url = bereq.url;
}

VMODs
๏ Shared libraries extending the VCL core
‣ std VMOD
- std.toupper(), std.log(), std.fileread()…
‣ ABI (Application Binary Interface) mismatches
๏ cookie, header, var, curl, digest, geoip, boltsort,
memcached, redis, dns…
๏ https://www.varnish-cache.org/vmods

Backends
๏ Multiple backends
‣ Selected at request time based on any request property
๏ Probes
‣ Per-backend periodic health checks
- Interval, timeout, expected response…
๏ Directors
‣ Load balanced backend groups

Error handling
Saint mode
๏ A backend may be sick for a particular object
‣ Other objects from the same backend can still be accessed
- Unless more than a set number of objects have been added to
the saint mode blacklist for that backend
๏ The object is not requested from that backend again for a
period of time
‣ Grace mode is used when all possible backends for the
requested object have been blacklisted
๏ Complements backend probes
Error handling
Grace mode
๏ A graced object is an object that has expired, but is still
kept in cache
‣ beresp.ttl vs. beresp.grace
๏ Graced objects are used to
‣ Serve outdated content if the backend is down
- Probes or saint mode are required for this
‣ Serve slightly stale content while fresh versions are

Beyond caching policy
๏ Why restrict VCL / VMODs to implementing the
caching policy?
๏ Any logic modeled in VCL / VMODs is compiled,
embedded & executed in the caching edge layer
‣ 1000x faster than typical Java / PHP apps
- Strong restrictions
‣ Accounting, paywalling, A/B testing…

varnishtest
๏ Powerful Varnish-specific testing tool
‣ Mocked clients & backends executing /
processing HTTP requests against real Varnish
Cache Plus instances
‣ http://www.clock.co.uk/...varnishtest
๏ Essential when implementing complex VCL logic
๏ Easily integrable in any CI infrastructure

FAQ
๏ When will SSL support be implemented?
‣ "[...] huge waste of time and effort to even think about it."
๏ When will SPDY support be implemented?
‣ "[...] Varnish is not speedy, Varnish is fast! [...]"
๏ What is the recommended value for this bizarre kernel /
varnishd parameter I found in some random blog?
‣ Use Varnish Tuner + Fine tune based on necessity
‣ Pay attention to workspaces & syslog messages

Overview
๏ Updated objects may be available before TTL
expiration
‣ Purges
‣ Forced misses
‣ Bans
‣ Hash Two / Hash Ninja / …

Purges
Overview
๏ VCL
๏ Eagerly discards an object along with all its variants
acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ internal) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

Purges
Downsides I
๏ What if the new object cannot be fetched after the
invalidation?
‣ Soft-purges VMOD
‣ Forced misses
๏ What if multiple objects need to be invalidated? What
if objects need to be invalidated too frequently?
‣ Bans
‣ Hash Two

Purges
Downsides II
๏ How to invalidate hitpass objects?
‣ Not possible in Varnish Cache Plus 3.x
- Redesigned in Varnish Cache Plus 4.x
- https://www.varnish-cache.org/trac/.../1033
‣ return(pass); during vcl_recv is preferred
when possible
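
A short sketch of the preferred approach: content known in advance to be uncacheable is passed in vcl_recv, so no hitpass object is ever created for it. The URL prefix is a hypothetical example.

```vcl
sub vcl_recv {
    # Hypothetical prefix for per-user, uncacheable pages.
    if (req.url ~ "^/account/") {
        return (pass);
    }
}
```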

Forced misses
Overview
๏ VCL
๏ Forces a cache miss for the request
‣ Useful for cache priming scripts
sub vcl_recv {
    if (req.http.X-Priming-Script) {
        ...
        set req.hash_always_miss = true;
    }
    ...
}

Forced misses
Behavior
๏ Object will always be (re)fetched from the backend
๏ New object is put into cache and used from that point
onward
‣ Old object is not evicted until it’s safe to do so
‣ Controls who takes the penalty of waiting for an
updated object
๏ Old objects are not freed up until expiration
‣ This is considered a flaw and a fix is expected

Bans
Overview
๏ VCL or CLI
๏ Lazily discards multiple objects matching an expression
‣ Logical operators + Object attributes + Regular expressions
‣ Only works on objects already in the cache
๏ Ban lurker
‣ Frees up memory + Keeps the ban list at a manageable size
‣ obj.* based expressions

Bans
Example
sub vcl_recv {
    if (req.request == "BAN") {
        ...
        if (!req.http.X-Ban-Url-Regexp) {
            error 400 "Empty URL regexp.";
        }
        ban("obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
    }
}

sub vcl_fetch {
    set beresp.http.X-Url = req.url;
}

sub vcl_deliver {
    unset resp.http.X-Url;
}

Hash Two
Overview
๏ VCL + VMOD
๏ Works around the scalability limits of bans

HTTP/1.x 200 OK
Transfer-Encoding: chunked
...
X-Tags: C10 P42 P236 P857
...

ban obj.http.X-Tags ~ "(\s|^)P42(\s|$)"

Hash Two
Example
import hashtwo;

sub vcl_recv {
    if (req.request == "PURGE") {
        ...
        if (hashtwo.purge(req.http.X-Tag) != 0) {
            error 200 "Purged.";
        } else {
            error 404 "Not found.";
        }
    }
}

sub vcl_fetch {
    set beresp.http.X-HashTwo = beresp.http.X-Tags;
}

Cache related headers
๏ Expires
๏ Cache-Control
๏ Last-Modified
๏ If-Modified-Since
๏ If-None-Match
๏ ETag
๏ Pragma
๏ Vary
๏ Age

Cache-Control
Overview
๏ Specifies directives that must be applied by all
caching mechanisms (from Varnish Cache Plus to
browser cache)
‣ public | private
‣ no-store
‣ no-cache
‣ max-age
‣ s-maxage
‣ must-revalidate
‣ no-transform
‣ …

Cache-Control
beresp.ttl
๏ Ignored in incoming client HTTP requests
๏ Only s-maxage & max-age used in backend HTTP
responses to calculate default TTL
‣ Always overrides Expires header
‣ Beware of Age header in client responses
- Objects not cached client side
- https://www.varnish-cache.org/...Caching

Vary
๏ Indicates the response returned by the backend
server may vary depending on headers received in
the request
๏ Object variants & Hit ratio
‣ Vary: Accept-Encoding
- Normalization of Accept-Encoding header is
not required
‣ Vary: User-Agent
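
Varying on the raw User-Agent creates one variant per browser string and ruins the hit ratio. A possible mitigation, sketched in VCL 3.x, is to vary on a normalized device class instead. The X-Device header follows the device detection example earlier in these notes; the three-step Vary rewrite is an assumption about how one might wire it up, not a prescribed recipe.

```vcl
sub vcl_recv {
    # Collapse the User-Agent into a small number of classes.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}

sub vcl_fetch {
    # Store variants per device class rather than per raw User-Agent.
    if (beresp.http.Vary ~ "User-Agent") {
        set beresp.http.Vary = regsub(beresp.http.Vary, "User-Agent", "X-Device");
    }
}

sub vcl_deliver {
    # Clients never send X-Device, so advertise User-Agent to them again.
    if (resp.http.Vary ~ "X-Device") {
        set resp.http.Vary = regsub(resp.http.Vary, "X-Device", "User-Agent");
    }
}
```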

Overview
๏ Break objects into smaller fragments
‣ Separate cache policy for each fragment
‣ Increase hit ratio
๏ Tools
‣ Edge Side Includes (ESI)
‣ AJAX
- Beware of RTT & Cross domain policy

Edge Side Includes
๏ Subset of ESI Language Specification 1.0
‣ <esi:include src="<URL>" />
‣ <esi:remove>...</esi:remove>
‣ <!--esi ...-->
๏ set beresp.do_esi = true;
‣ Separate Varnish requests
๏ Testing ESI in dev environment
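
A minimal VCL 3.x sketch of enabling ESI only where it is likely to be needed. Restricting parsing to HTML responses and the marker header on fragment requests are illustrative assumptions.

```vcl
sub vcl_recv {
    # Each include is processed as a separate request;
    # req.esi_level is greater than 0 for fragments.
    if (req.esi_level > 0) {
        set req.http.X-Esi-Fragment = "1";  # hypothetical marker header
    }
}

sub vcl_fetch {
    # Only parse HTML responses for ESI tags.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.do_esi = true;
    }
}
```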

Overview
๏ Central control of Varnish Cache Plus servers
‣ Web UI + RESTful API
- Super Fast Purger
๏ Cache group management
‣ Real time statistics, VCL editor, ban submission…
๏ Varnish Agent 2

Super Fast Purger
๏ High performance intermediary distributing
invalidation requests to groups of Varnish
Cache Plus servers
‣ Leverages speed & flexibility of VCL
‣ Keep-alive workaround
๏ Part of the VAC RESTful API
‣ Trivially integrable in existing applications

Change management
๏ Easily integrable using the VAC RESTful API
‣ git, Mercurial… hooks
‣ Jenkins, Travis, GitLab… CI scripts
๏ Manual VCL bundle generation
๏ Orchestrated / programmed deployments,
rollbacks, etc.

Overview
๏ Real-time aggregated statistics
‣ Multiple vstatdprobe daemons
‣ One vstatd daemon
‣ JSON + Time series API
๏ VSM log based
‣ Efficient circular in-memory data structure
‣ std.log("vcs-key:" + <key suffix>);

Some ideas
๏ Trending articles or sale products
๏ Cache hits and cache misses
๏ URLs with long load times
๏ URLs with the most 5xx response codes
๏ Where traffic is coming from
๏ …

Example
import std;

sub vcl_deliver {
    std.log("vcs-key:" + req.http.host);
    std.log("vcs-key:" + req.http.host + req.url);
    std.log("vcs-key:TOTAL");
    if (obj.hits == 0) {
        std.log("vcs-key:MISS");
    }
}

API I
๏ Stats (#requests, #misses, avg ttfb, acc body bytes, #2xx,
#3xx…) for the key named "example.com" during the last
time window
‣ GET /key/example.com
๏ Keys that produced the most 5xx responses during the
last time window
‣ GET /all/top_5xx
๏ Top 5 requested keys during the last time window
‣ GET /all/top/5?verbose=1

API II
๏ Top 10 most requested keys ending with '.gif'
during the last time window
‣ GET /match/(.*)%5C.gif$/top
๏ Top 50 slowest backend requests aggregating
the last 20 time windows
‣ GET /all/top_ttfb/50?b=20

Overview
๏ VMOD
๏ DeviceAtlas
‣ https://deviceatlas.com
‣ Locally deployed database, updated daily
๏ OSS alternatives
‣ https://github.com/serbanghita/Mobile-Detect
‣ …

Example
import deviceatlas;

sub vcl_recv {
    if (deviceatlas.lookup(req.http.User-Agent, "isMobilePhone") == "1") {
        set req.http.X-Device = "mobile";
    } elsif (deviceatlas.lookup(req.http.User-Agent, "isTablet") == "1") {
        set req.http.X-Device = "tablet";
    } else {
        set req.http.X-Device = "desktop";
    }
}

Some ideas
๏ Redirections based on device properties
๏ Backend selection based on device properties
๏ Normalization of the UA header
‣ Caching different versions (i.e. Vary header) of
the same object based on normalized UAs
๏ …
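
The first idea, redirections based on device properties, could look like this in VCL 3.x. It is a sketch under several assumptions: req.http.X-Device has already been set earlier in vcl_recv, the host names are hypothetical, and 750 is an arbitrary internal status used only to reach vcl_error.

```vcl
sub vcl_recv {
    if (req.http.X-Device == "mobile" && req.http.host == "www.example.com") {
        # Remember the target and jump to vcl_error.
        set req.http.X-Redirect-To = "http://m.example.com" + req.url;
        error 750 "Redirect";
    }
}

sub vcl_error {
    if (obj.status == 750) {
        # Turn the internal status into a real redirect.
        set obj.http.Location = req.http.X-Redirect-To;
        set obj.status = 302;
        set obj.response = "Found";
        return (deliver);
    }
}
```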

Highlights
๏ Client / backend thread split
‣ Background content refreshing
๏ Redesigned purges
‣ return(purge); during vcl_recv
๏ Directors implemented as VMODs
‣ Consistent hashing director
๏ Distinction between error & synthetic responses
