Linux Performance Analysis: New Tools and Old Secrets – Brendan Gregg
Talk for USENIX/LISA 2014 by Brendan Gregg, Netflix. At Netflix performance is crucial, and we use many tools, from high level to low level, to analyze our stack in different ways. In this talk, I will introduce new system observability tools we are using at Netflix, which I've ported from my DTraceToolkit and which are intended for our Linux 3.2 cloud instances. These show that Linux can do more than you may think, by using creative hacks and workarounds with existing kernel features (ftrace, perf_events). While these solve issues on current versions of Linux, I'll also briefly summarize the future in this space: eBPF, ktap, SystemTap, sysdig, etc.
Talk for PerconaLive 2016 by Brendan Gregg. Video: https://www.youtube.com/watch?v=CbmEDXq7es0 . "Systems performance provides a different perspective for analysis and tuning, and can help you find performance wins for your databases, applications, and the kernel. However, most of us are not performance or kernel engineers, and have limited time to study this topic. This talk summarizes six important areas of Linux systems performance in 50 minutes: observability tools, methodologies, benchmarking, profiling, tracing, and tuning. Included are recipes for Linux performance analysis and tuning (using vmstat, mpstat, iostat, etc), overviews of complex areas including profiling (perf_events), static tracing (tracepoints), and dynamic tracing (kprobes, uprobes), and much advice about what is and isn't important to learn. This talk is aimed at everyone: DBAs, developers, operations, etc, and in any environment running Linux, bare-metal or the cloud."
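The recipe style mentioned in the abstract (vmstat, mpstat, iostat, etc.) can be sketched as a short script. This is my own sketch built around the tools the abstract names; the extra tools (pidstat, free, sar) and the exact flags are my additions, not taken from the talk:

```python
import shutil
import subprocess

# A "first 60 seconds"-style checklist of Linux observability tools;
# the flag choices below are illustrative, not prescribed by the talk.
CHECKLIST = [
    ["uptime"],                        # load averages
    ["vmstat", "1", "5"],              # run queue, memory, system-wide CPU
    ["mpstat", "-P", "ALL", "1", "5"], # per-CPU balance
    ["pidstat", "1", "5"],             # per-process CPU
    ["iostat", "-xz", "1", "5"],       # disk I/O latency and utilization
    ["free", "-m"],                    # memory, including file system cache
    ["sar", "-n", "DEV", "1", "5"],    # network interface throughput
]

def run_checklist():
    """Run each tool if installed, skipping any that are missing."""
    for cmd in CHECKLIST:
        if shutil.which(cmd[0]) is None:
            print(f"# {cmd[0]} not installed, skipping")
            continue
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=False)
```

Running `run_checklist()` on a Linux host prints each command before executing it, giving a quick system-wide triage pass before deeper profiling or tracing.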
The biggest headline at the 2009 Oracle OpenWorld was Larry Ellison's announcement that Oracle was entering the hardware business with a pre-built database machine, engineered by Oracle. Since then, businesses around the world have started to use these engineered systems. This beginner/intermediate-level session will take you through my first 100 days of administering an Exadata machine, and all the roadblocks and successes I had along this new path.
Using eBPF for High-Performance Networking in Cilium – ScyllaDB
The Cilium project is a popular networking solution for Kubernetes, based on eBPF. This talk uses eBPF code and demos to explore the basics of how Cilium makes network connections, and manipulates packets so that they can avoid traversing the kernel's built-in networking stack. You'll see how eBPF enables high-performance networking as well as deep network observability and security.
MySQL Load Balancers - Maxscale, ProxySQL, HAProxy, MySQL Router & nginx - A ... – Severalnines
This presentation by Krzysztof Książek at Percona Live 2017 in Santa Clara, California gives detailed descriptions and comparisons of the leading open source database load balancing technologies
PerconaLive 2016 Santa Clara presentation on HashiCorp Vault with CTO Armon Dadgar
https://www.percona.com/live/data-performance-conference-2016/sessions/using-vault-decouple-secrets-applications
Building Event Driven Architectures with Kafka and Cloud Events (Dan Rosanova... – confluent
Apache Kafka is changing the way we build scalable and highly available software systems. By providing a simplified path to eventual consistency and event sourcing, Kafka gives us the platform to make these patterns a reality for a much broader segment of applications and customers than was possible in the past. CloudEvents is an interoperable specification for eventing that is part of the CNCF. This session will combine open source and open standards to show you how you can build highly reliable applications that scale linearly, provide interoperability, and are easily extensible, leveraging both push and pull semantics. Concrete real-world examples will show how Kafka makes event sourcing more approachable and how streams and events complement each other, including the difference between business events and technical events.
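To illustrate the interoperability side of the abstract, here is a minimal CloudEvents 1.0 structured-mode envelope in Python. This is my own sketch; the event type and source values are hypothetical examples, not taken from the talk:

```python
import datetime
import json
import uuid

def make_cloudevent(event_type, source, data):
    """Build a minimal CloudEvents 1.0 structured-mode envelope.

    id, source, specversion and type are the four required attributes."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "type": event_type,
        "source": source,
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }

# A Kafka producer would then publish json.dumps(event) to a topic,
# so that any CloudEvents-aware consumer can interpret the payload.
event = make_cloudevent("com.example.order.created", "/orders", {"order_id": 42})
```

Because the envelope is plain JSON, the same event can flow through Kafka, HTTP webhooks, or any other transport with a CloudEvents binding.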
Understanding my database through SQL*Plus using the free tool eDB360 – Carlos Sierra
This session introduces eDB360 - a free tool that is executed from SQL*Plus and generates a set of reports providing a 360-degree view of an Oracle database; all without installing anything on the database.
If using Oracle Enterprise Manager (OEM) is off-limits for you or your team, and you can only access the database through a SQL*Plus connection with no direct access to the database server, then this tool is a perfect fit to provide you with a broad overview of the database configuration, performance, top SQL and much more. You only need a SQL*Plus account with read access to the data dictionary, and common Oracle licenses like the Diagnostics or the Tuning Pack.
Typical uses of this eDB360 tool include: database health checks, performance assessments, pre- or post-upgrade verifications, snapshots of the environment for later use, comparisons between two similar environments, and documenting the state of a database when taking ownership of it.
Once you learn how to use eDB360 and get to appreciate its value, you may want to execute this tool on all your databases on a regular basis, so you can keep track of things for long periods of time. This tool is becoming part of a large collection of goodies many DBAs use today.
During this session you will learn the basics about the free eDB360 tool, plus some cool tricks. The target audience is: DBAs, developers and consultants (some managers could also benefit).
PostgreSQL is a very popular and feature-rich DBMS. At the same time, PostgreSQL has a set of annoying wicked problems, which haven't been resolved in decades. Miraculously, with just a small patch to PostgreSQL core extending this API, it appears possible to solve wicked PostgreSQL problems in a new engine made within an extension.
CDC Stream Processing with Apache Flink – Timo Walther
An instant world requires instant decisions at scale. This includes the ability to digest and react to changes in real-time. Thus, event logs such as Apache Kafka can be found in almost every architecture, while databases and similar systems still provide the foundation. Change Data Capture (CDC) has become popular for propagating changes. Nevertheless, integrating all these systems, which often have slightly different semantics, can be a challenge.
In this talk, we highlight what it means for Apache Flink to be a general data processor that acts as a data integration hub. Looking under the hood, we demonstrate Flink's SQL engine as a changelog processor that ships with an ecosystem tailored to processing CDC data and maintaining materialized views. We will discuss the semantics of different data sources and how to perform joins or stream enrichment between them. This talk illustrates how Flink can be used with systems such as Kafka (for upsert logging), Debezium, JDBC, and others.
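The idea of a changelog processor maintaining a materialized view can be illustrated with a toy reducer. The +I/-U/+U/-D opcodes mirror Flink's changelog row kinds, but this simplified keyed-map view is my own sketch, not Flink's implementation:

```python
def apply_changelog(view, events):
    """Maintain a keyed materialized view from a stream of change events.

    Each event is (op, key, value), where op is one of +I (insert),
    -U (retract old value), +U (emit new value), -D (delete), mirroring
    the row kinds in Flink's changelog model."""
    for op, key, value in events:
        if op in ("+I", "+U"):
            view[key] = value
        elif op in ("-U", "-D"):
            view.pop(key, None)
        else:
            raise ValueError(f"unknown op {op!r}")
    return view
```

Feeding this function the change events emitted by a CDC source (e.g. Debezium) keeps the dictionary consistent with the upstream table, which is essentially what Flink does at scale with state backends and exactly-once checkpoints.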
Apache Iceberg - A Table Format for Huge Analytic Datasets – Alluxio, Inc.
Data Orchestration Summit
www.alluxio.io/data-orchestration-summit-2019
November 7, 2019
Apache Iceberg - A Table Format for Huge Analytic Datasets
Speaker:
Ryan Blue, Netflix
For more Alluxio events: https://www.alluxio.io/events/
The Parquet Format and Performance Optimization Opportunities – Databricks
The Parquet format is one of the most widely used columnar storage formats in the Spark ecosystem. Given that I/O is expensive and that the storage layer is the entry point for any query execution, understanding the intricacies of your storage format is important for optimizing your workloads.
As an introduction, we will provide context around the format, covering the basics of structured data formats and the underlying physical data storage model alternatives (row-wise, columnar and hybrid). Given this context, we will dive deeper into specifics of the Parquet format: representation on disk, physical data organization (row-groups, column-chunks and pages) and encoding schemes. Now equipped with sufficient background knowledge, we will discuss several performance optimization opportunities with respect to the format: dictionary encoding, page compression, predicate pushdown (min/max skipping), dictionary filtering and partitioning schemes. We will learn how to combat the evil that is ‘many small files’, and will discuss the open-source Delta Lake format in relation to this and Parquet in general.
This talk serves both as an approachable refresher on columnar storage as well as a guide on how to leverage the Parquet format for speeding up analytical workloads in Spark using tangible tips and tricks.
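Two of the optimizations named above, dictionary encoding and predicate pushdown via min/max statistics, can be modeled in a few lines. This is a conceptual sketch of the ideas, not the actual Parquet on-disk format:

```python
def dictionary_encode(values):
    """Replace repeated values with small integer indexes into a dictionary,
    the core idea behind Parquet's dictionary encoding."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]

def row_groups_to_scan(row_group_stats, lo, hi):
    """Predicate pushdown (min/max skipping): given per-row-group (min, max)
    statistics, keep only row groups whose range can overlap lo <= x <= hi."""
    return [i for i, (mn, mx) in enumerate(row_group_stats)
            if not (mx < lo or mn > hi)]
```

With well-clustered data, the min/max check lets a reader skip entire row groups without touching their pages, which is where much of the I/O saving in real Parquet scans comes from.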
Linux Kernel vs DPDK: HTTP Performance Showdown – ScyllaDB
In this session I will use a simple HTTP benchmark to compare the performance of the Linux kernel networking stack with userspace networking powered by DPDK (kernel-bypass).
It is said that kernel-bypass technologies avoid the kernel because it is "slow", but in reality, a lot of the performance advantages that they bring just come from enforcing certain constraints.
As it turns out, many of these constraints can be enforced without bypassing the kernel. If the system is tuned just right, one can achieve performance that approaches kernel-bypass speeds, while still benefiting from the kernel's battle-tested compatibility, and rich ecosystem of tools.
SOSCON 2019.10.17
What are the methods for packet processing on Linux, and how fast is each of them? In this presentation, we will learn how to handle packets on Linux (user space, socket filter, netfilter, tc) and compare performance, with analysis of where in the network stack (hook point) each kind of packet processing is done. We will also discuss packet processing using XDP, an in-kernel fast path recently added to the Linux kernel. eXpress Data Path (XDP) is a high-performance programmable network data path within the Linux kernel. XDP is located at the lowest software-accessible level of the network stack, the point at which the driver receives the packet. By using the eBPF infrastructure at this hook point, the network stack can be extended without modifying the kernel.
Daniel T. Lee (Hoyeon Lee)
@danieltimlee
Daniel T. Lee currently works as a software engineer at Kosslab and contributes to the Linux kernel BPF project. He is interested in cloud, Linux networking, and tracing technologies, and likes to analyze the kernel's internals using BPF technology.
Marrying CDNs with Front-End Optimization – Strangeloop
Slide deck from Strangeloop president Joshua Bixby's presentation at the 2012 Content Delivery Summit.
Many content owners are already using a content delivery network (CDN) to cache content closer to their visitors, but CDNs don't reduce the number of requests required to render each page, and they have no impact on browser efficiency. Front-end optimization (FEO) picks up where CDNs leave off, transforming the content itself so that it renders as quickly as possible in the browser.
In this presentation, attendees will see real-world examples of how leading e-commerce sites have combined CDN and FEO forces to reach new levels of performance for content-rich pages. Get real numbers on how quickly content-rich sites loaded pre-acceleration, then with just a CDN, then with a combined CDN/FEO solution.
Modern B2B Marketing in the Era of the Empowered Buyer – Scott Levine
In the past 20 years, B2B marketing has changed more than it did in the previous 100. Today, buyers possess infinite choice and infinite power.
We live in an “instant” world. Whether it’s instant access, information, gratification, justification or rationalization, all of these “instants” have impacted the way that businesses who market to other businesses think, act, and react.
Modern Business-to-Business Marketing in the Era of the Empowered Buyer is causing many organizations to rethink their strategies. We are hopeful that some of the information we have shared with you here will enable you to understand the current and possible future state of B-to-B marketing, and will help you to put your organization in a position to best deal with complexities caused by the Modern Empowered B-to-B Buyer.
A particle filter based scheme for indoor tracking on an Android Smartphone – Divye Kapoor
A particle filter based scheme for indoor tracking on an Android Smartphone.
These are slides accompanying the Master's thesis of the same name, presented as part of the graduation requirements at IIT Roorkee. They detail a sensor-fusion-based approach to indoor tracking on smartphones.
These are the questions asked in the prelims round of Cybermania - a computer quiz held at Loyola School, Jamshedpur, India. The quiz was for students of standard 6 to 10.
If you have any doubts or further questions, contact the quizmaster at Twitter: @divyekapoor or on Google+ at http://gplus.to/divyekapoor
I'll be happy to share the slides with you if you ask me politely on any of these social networks. :)
Kernel Recipes 2015: The stable Linux Kernel Tree - 10 years of insanity – Anne Nicolas
The Linux kernel gets a stable release about once every week.
This talk will go into the process of getting a patch accepted into the stable releases, how the release process works, and how Greg does a review and release cycle. It will consist of live examples of patches submitted to be added to the stable releases, as well as doing a release “live” on stage.
Greg KH, Linux Foundation
These are the questions asked in the main round of Cybermania - a computer quiz held at Loyola School, Jamshedpur, India. The quiz was for students of standard 6 to 10.
If you have any doubts or further questions, contact the quizmaster at Twitter: @divyekapoor or on Google+ at http://gplus.to/divyekapoor
I'll be happy to share the slides with you if you ask me politely on any of these social networks. :)
The Linux Kernel Implementation of Pipes and FIFOs – Divye Kapoor
A walkthrough of the code structure used in the Linux kernel to implement pipes and FIFOs.
This was presented to a senior-level class at the Indian Institute of Technology, Roorkee.
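The user-facing behavior that the kernel code implements can be demonstrated from Python with `os.pipe()`. This short sketch shows the bounded-buffer and EOF-on-close semantics a pipe walkthrough covers:

```python
import os

# A pipe is a bounded in-kernel buffer with a read end and a write end.
r, w = os.pipe()
os.write(w, b"hello through the kernel pipe buffer\n")
os.close(w)                      # closing the write end lets the reader see EOF
data = os.read(r, 4096)          # drains the buffered bytes
assert os.read(r, 4096) == b""   # EOF: all writers are gone
os.close(r)
print(data.decode(), end="")
```

A named FIFO behaves the same way once opened, except the endpoints are obtained by opening a filesystem path (`os.mkfifo`) instead of inheriting descriptors.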
Part 04 Creating a System Call in Linux – Tushar B Kute
Presentation on "System Call creation in Linux".
Presented at Army Institute of Technology, Pune for FDP on "Basics of Linux Kernel Programming". by Tushar B Kute (http://tusharkute.com).
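Once a new system call exists, it can be invoked by number before libc grows a wrapper. The sketch below exercises an existing syscall (getpid) that way via `ctypes`; the syscall numbers are architecture-specific assumptions covering x86-64 and arm64 Linux only:

```python
import ctypes
import platform

libc = ctypes.CDLL(None, use_errno=True)

# getpid's syscall number differs per architecture; these two entries
# are assumptions for x86-64 and arm64 Linux.
SYS_getpid = {"x86_64": 39, "aarch64": 172}.get(platform.machine())

def raw_getpid():
    """Invoke the getpid system call directly by number, bypassing the
    libc wrapper, the same way a freshly added syscall would be tested."""
    if SYS_getpid is None:
        raise OSError(f"no syscall number known for {platform.machine()}")
    return libc.syscall(SYS_getpid)
```

On a matching architecture, `raw_getpid()` returns the same value as `os.getpid()`, confirming the call went through the kernel's syscall table.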
(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014 – Amazon Web Services
Tuning your EC2 web server will help you to improve application server throughput and cost-efficiency as well as reduce request latency. In this session we will walk through tactics to identify bottlenecks using tools such as CloudWatch in order to drive the appropriate allocation of EC2 and EBS resources. In addition, we will also be reviewing some performance optimizations and best practices for popular web servers such as Nginx and Apache in order to take advantage of the latest EC2 capabilities.
Denial of Service Mitigation Tactics in FreeBSD – Steven Kreuzer
Protecting your servers, workstations and networks can only go so far. Attacks which consume your available Internet-facing bandwidth, or overpower your CPU, can still take you offline. This presentation will discuss techniques for mitigating the effects of such attacks on servers designed to provide network-intensive services such as HTTP or routing.
1 Million Writes per second on 60 nodes with Cassandra and EBS – Jim Plush
EBS has long been taboo in the Cassandra world for high-performance workloads. That line of thinking has started to change with the introduction of EBS GP2 and the recent stability improvements made by the EBS team, which is why we have multiple petabytes of data relying on EBS every day. Running Cassandra on EBS will now let you run denser, cheaper Cassandra clusters with just as much availability as ephemeral-storage instances. This talk will walk through a highly detailed use case and configuration guide for a multi-petabyte, million-write-per-second cluster that needs to be highly performant and cost-efficient. We will dive into the instance type choices, configuration, and low-level tuning that allowed us to hit 1.3 million writes per second with a replication factor of 3 on just 60 nodes. We will go into the details of why we chose the latest DateTieredCompactionStrategy and why it's a perfect fit for high-volume time-series workloads.
Many applications are network I/O bound, including common database-based applications and service-based architectures. But operating systems and applications are often untuned to deliver high performance. This session uncovers hidden issues that lead to low network performance, and shows you how to overcome them to obtain the best network performance possible.
HBaseCon2017 gohbase: Pure Go HBase Client – HBaseCon
gohbase is an implementation of an HBase client in pure Go: https://github.com/tsuna/gohbase. In this presentation we'll talk about its architecture, compare its performance against the native Java HBase client as well as AsyncHBase (http://opentsdb.github.io/asynchbase/), and discuss some nice characteristics of Go that resulted in a simpler implementation.
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a... – Flink Forward
Let’s be honest: running a distributed stateful stream processor that is able to handle terabytes of state and tens of gigabytes of data per second, while being highly available and correct (in an exactly-once sense), does not work without planning, configuration and monitoring. While the Flink developer community tries to make everything as simple as possible, it is still important to be aware of all the requirements and implications. In this talk, we will provide some insights into the greatest operations mysteries of Flink from a high-level perspective:
- Capacity and resource planning: understand the theoretical limits.
- Memory and CPU configuration: distribute resources according to your needs.
- Setting up high availability: planning for failures.
- Checkpointing and state backends: ensure correctness and fast recovery.
For each of these topics, we will introduce the concepts of Flink and provide some best practices we have learned over the past years supporting Flink users in production.
In-memory processing has started to become the norm in large-scale data handling. This is a close-to-the-metal analysis of highly important but often neglected aspects of memory access times and how they impact big data and NoSQL technologies. We cover aspects such as the TLB, transparent huge pages, the QPI link, hyperthreading, and the impact of virtualization on high-memory-footprint applications. We present benchmarks of various technologies ranging from Cloudera's Impala to Couchbase and how they are impacted by the underlying hardware. The key takeaway is a better understanding of how to size a cluster, how to choose a cloud provider and an instance type for big data and NoSQL workloads, and why not every core or GB of RAM is created equal.
We start by looking at distributed database features that impact latency. Then we take a deeper look at the HBase read and write paths with a focus on request latency. We examine the sources of latency and how to minimize them.
DPDK Summit 2015 - Aspera - Charles Shiflett – Jim St. Leger
DPDK Summit 2015 in San Francisco.
Presentation by Charles Shiflett, Aspera.
For additional details and the video recording please visit www.dpdksummit.com.
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO... – ScyllaDB
Outbrain is the world's largest content discovery program. Learn about their use case with Scylla where they lowered latency while doing 20X IOPS of Cassandra.
Bridging the Digital Gap Brad Spiegel Macon, GA Initiative.pptxBrad Spiegel Macon GA
Brad Spiegel Macon GA’s journey exemplifies the profound impact that one individual can have on their community. Through his unwavering dedication to digital inclusion, he’s not only bridging the gap in Macon but also setting an example for others to follow.
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC
Ellisha Heppner, Grant Management Lead, presented an update on APNIC Foundation to the PNG DNS Forum held from 6 to 10 May, 2024 in Port Moresby, Papua New Guinea.
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfFlorence Consulting
Quattordicesimo Meetup di Milano, tenutosi a Milano il 23 Maggio 2024 dalle ore 17:00 alle ore 18:30 in presenza e da remoto.
Abbiamo parlato di come Axpo Italia S.p.A. ha ridotto il technical debt migrando le proprie APIs da Mule 3.9 a Mule 4.4 passando anche da on-premises a CloudHub 1.0.
Ready to Unlock the Power of Blockchain!Toptal Tech
Imagine a world where data flows freely, yet remains secure. A world where trust is built into the fabric of every transaction. This is the promise of blockchain, a revolutionary technology poised to reshape our digital landscape.
Toptal Tech is at the forefront of this innovation, connecting you with the brightest minds in blockchain development. Together, we can unlock the potential of this transformative technology, building a future of transparency, security, and endless possibilities.
3. Varnish Software
• Contributes a lot to the Varnish Cache project
• Not the Varnish Cache project
• Support and add-on software for Varnish Cache
• Media, e-commerce, API and CDN workloads
4. What is this Varnish?
Client → Varnish → Backend
TTFB (Varnish): 30 microseconds TTFB (backend): 150 milliseconds
5. Varnish Cache: 30s primer
• High performance HTTP Caching reverse proxy
• 10 years old
• Policy-driven configuration language
• Massively threaded - event driven programming is a fad :-P
• Super easy to write modules (no event loop, see)
6. VCL Example
sub vcl_recv {
if (req.http.host == "www.example.com" &&
req.url ~ "^/fun/" &&
(req.http.referer && req.http.referer !~ "^http://www.example.com/")) {
return (synth(403, "No hotlinking please"));
}
}
7. So? What is Varnish?
Client → Varnish → Backend
Run high speed logic here.
10. What to tune
• Linux IP stack & Netfilter
• Linux ethernet - we’ll skip this for now. Most of you don’t have ethernet interfaces anymore. :-)
• Varnish Cache
21. A suitable backend
• https://github.com/espebra/dummy-api
• Perfect for ad hoc testing
• Object size, latencies (ttfb, ttb) are all dynamic (from URL)
• Really fast (100K+ RPS)
• http://target:1337/?header-delay=50&body-delay=100&predictable-content=10
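Assuming the dummy-api server is running on port 1337, an ad hoc test could look like this (the hostname `target` and the timing values are illustrative; `curl`'s `-w` format variables report the measured latencies):

```shell
# Ask the backend for a 50 ms header delay, a 100 ms body delay,
# and predictable content, then print the observed timings.
curl -s -o /dev/null \
  -w "ttfb: %{time_starttransfer}s total: %{time_total}s\n" \
  "http://target:1337/?header-delay=50&body-delay=100&predictable-content=10"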
27. Calculating BDP
• Max Bandwidth per flow x Delay
• 1000 Mbps x 0.1 seconds = 100 megabits = 12.5 megabytes
• Default: ~3.7 megabytes - only ~300 megabits per flow @ 100ms latency
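The arithmetic above can be sketched in a few lines of shell (values from the slide: a 1000 Mbps flow with 100 ms of latency):

```shell
# Bandwidth-Delay Product: how many bytes can be "in flight" on one flow.
bw_bits_per_s=1000000000   # 1000 Mbps
rtt_ms=100                 # 100 ms of latency
bdp_bits=$(( bw_bits_per_s * rtt_ms / 1000 ))
bdp_bytes=$(( bdp_bits / 8 ))
echo "BDP: $bdp_bits bits = $bdp_bytes bytes"
```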
28. BDP Tuning
• Kernel autotunes the details - we just give it more room
• /proc/sys/net/core/(r|w)mem_max can be ignored
• /proc/sys/net/ipv4/tcp_(r|w)mem should be lifted - 10240 87380 16777216 is the usual recommendation
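As a sketch, the recommendation above could go into a sysctl drop-in (the file name is an assumption; the values are the ones from the slide: min, default, and a 16 MB max):

```shell
# /etc/sysctl.d/90-bdp.conf (hypothetical file name)
net.ipv4.tcp_rmem = 10240 87380 16777216
net.ipv4.tcp_wmem = 10240 87380 16777216
```

Apply with `sysctl --system`.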
32. Playing with initcwnd
• Initial congestion window is now 10
• Increasing might break stuff
• Some CDNs increase initcwnd and show some improvement
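On Linux the initial congestion window is set per route with `ip route`; a hedged example (the gateway address, interface name, and window size are placeholders - and, as the slide warns, increasing it might break stuff):

```shell
# Show the current default route, then raise initcwnd on it (needs root).
ip route show default
ip route change default via 192.0.2.1 dev eth0 initcwnd 10
```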
33. accept()
• System call used by an application to accept a socket from the kernel
• Multiple threads in Varnish issue accept() calls - one per thread pool
34. somaxconn
• Global limit on listen_depth
• Default is silly (128)
• Adds a 3s/1s delay to incoming connections (the initial SYN gets discarded)
• Increase it to somewhere in the 1K - 16K range
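A sketch of lifting the limit on both sides - the kernel's global cap and Varnish's own `listen_depth` parameter (16K is chosen from the upper end of the range above):

```shell
# Kernel side: global cap on listen() backlogs (needs root).
sysctl -w net.core.somaxconn=16384

# Varnish side: request a matching backlog; this parameter
# only takes effect after a restart of varnishd.
varnishadm param.set listen_depth 16384
```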
35. tcp_max_syn_backlog
• Threshold for SYN Flood detection
• Limits the number of TCP connections being established
• When exhausted - SYN Cookies are sent
• Do not rely on SYN Cookies
36. Local TCP ports
• Varnish will need local sockets in order to talk to backends
• Busy servers might run low on sockets
• Default: net.ipv4.ip_local_port_range = 32768 61000
• Can safely be increased to “2000 65500”
37. TIME_WAIT
• Socket is kept around after it is closed
• Linux uses 2x the FIN timeout
• Default is 60 seconds (no packet should be older than 60s)
• I’ve never seen a packet older than 10s
• net.ipv4.tcp_fin_timeout can be set to 10
38. More TIME_WAIT
• tcp_tw_recycle is dangerous (unbuckles seat belt)
• tcp_tw_reuse can cause problems with clients behind NAT - makes sense on a LAN without NAT
• tcp_max_tw_buckets can mitigate TIME_WAIT attacks by destroying sockets in TIME_WAIT state
• Increase tcp_max_tw_buckets to 256K or more
39. Connection tracking
• Linux firewall tracks connections
• Loaded implicitly when using certain iptables rules
• [11864.342438] nf_conntrack version 0.5.0 (3917 buckets, 15668 max)
• New connections are rejected when conntrack is full
• Set parameters when loading the module (options nf_conntrack hashsize=XXXXX)
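For example (the file name and sizes are illustrative, not recommendations from the slide):

```shell
# /etc/modprobe.d/nf_conntrack.conf - read when the module loads
options nf_conntrack hashsize=131072

# Or raise the table limit at runtime (needs root):
sysctl -w net.netfilter.nf_conntrack_max=1048576
```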
40. Linux tuning - summing up
• Leave most things as they are
• Increase somaxconn, tcp_max_syn_backlog
• Increase local_port range
• Decrease tcp_fin_timeout to ~10
• Increase tcp_max_tw_buckets to ~256K
• Increase BDP buffer limit
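Collected into a single sketch of a sysctl drop-in (the file name is an assumption; every value mirrors a recommendation from the slides above):

```shell
# /etc/sysctl.d/90-varnish.conf (hypothetical file name)
net.core.somaxconn = 16384
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_local_port_range = 2000 65500
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_max_tw_buckets = 262144
net.ipv4.tcp_rmem = 10240 87380 16777216
net.ipv4.tcp_wmem = 10240 87380 16777216
```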
42. Varnish Cache threads
• Number of pools: always 2
• thread_pool_max
• thread_pool_min
• You need ~ 1 thread per RPS
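Assuming a target of roughly 5000 requests per second, the thread parameters could be set like this (the numbers are illustrative, derived from the rule of thumb above):

```shell
# Size each of the two pools' thread range at startup...
varnishd ... -p thread_pool_min=200 -p thread_pool_max=5000

# ...or adjust it live via the management interface:
varnishadm param.set thread_pool_max 5000
```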
43. Workspace Tuning
• Varnish pre-allocates memory for the threads
• When it runs out of memory - it crashes
44. VSL Tuning
• /var/lib/varnish contains the VSL.
• Linux will try to sync the VSL to disk
• On busy servers: put VSL on RAMDISK
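A sketch of putting the VSL on a RAM disk via tmpfs (the 256 MB size is an assumption; size it to fit the shared memory log):

```shell
# /etc/fstab entry - mount /var/lib/varnish as tmpfs
tmpfs  /var/lib/varnish  tmpfs  rw,nodev,nosuid,size=256M  0  0

# Or mount it immediately (needs root, Varnish stopped):
mount -t tmpfs -o size=256M tmpfs /var/lib/varnish
```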
45. Keepalives
• 3 way handshake on long latency is expensive
• TLS handshake is worse
• idle_send_timeout (frontend) and backend_idle_timeout (backend)
47. Increasing cache hit rate
• Prolong TTLs - invalidate on change
• Normalize request headers when using Vary
48. Summing up: Varnish Cache
• Threads are in pools (you need two)
• Make sure there are enough threads
• Make sure there is enough memory
• Try to tune your cache hit ratio
49. Preemptive answers
• TLS is not in Varnish Cache due to OpenSSL QA issues
• H/2 support is experimental in Varnish Cache 5.0
• Full H/2 support in Varnish Cache 5.1 (with Hitch)