Varnish Configuration Step by Step


Improving Site Response Time, Part 3

Published in: Technology

  1. Improving Site Response Time, Part 3: Varnish Configuration Step by Step. Kim Stefan Lindholm, 19.2.2012
  2. Instructions are for Amazon Linux AMI (64-bit) and compatible systems.
  3. Varnish Configuration - Requirements
       1. All traffic to the origin server is routed through the Incapsula firewall
       2. If the origin server or the Incapsula proxy is down, Varnish serves a cached copy for 6 hours
       3. Varnish restarts automatically upon critical failure or server reboot and notifies the administrator by e-mail
       4. Varnish hit rate, CPU load and memory usage can be monitored at a specific URL
  4. CACHE SIZE
       • Most files are served from CDN, so the dataset should be small
       • VCL file: Normalization of hostname, gzip/deflate, etc.
       • VCL file: No caching for logged in users / administrator back-end
       • VCL file: Only one version of a page is cached regardless of cookies
       • 3 variants of style sheets: standard, IE8 or later, ≤IE7
  5. ESTIMATING SIZE
       • Test site has 27 pages, sized 2.5 - 5.3 kB according to Firefox
       • We didn’t optimize for IE7, thus page sizes are 88 - 129 kB
       • For a very static site, let’s say we’ll reach a maximum of 100 pages
           • Firefox, Chrome, Opera & Safari: 100 x 5.5 kB = 0.6 MB
           • IE 8 or later: 100 x 120 kB = 12 MB
           • IE7 & IE6: 100 x 130 kB = 13 MB
  6. ESTIMATING SIZE
       • Depending on the number of unique (= browser x encoding etc.) cached pages needed, the entire real dataset will probably be 10-50 MB in size
       • An Amazon EC2 micro instance has 613 MB of RAM, and running “vmstat” or “free -m” shows that ~170 MB of that is available
       • As a rule of thumb (which doesn’t always work) you can allocate 80 % of free memory to the Varnish cache
       • We’ll allocate 130 MB and the hot dataset should still be a fraction of that
       • Had we not pushed 250 MB of files to CDN, we’d need at least 1 GB of RAM
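
The sizing arithmetic on the two slides above can be sketched in a few lines of bash. This is only an illustration: the function names `estimate_kb` and `cache_budget_mb` are made up for this example, and page sizes are rounded up to whole kB because bash arithmetic is integer-only.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original presentation.

# Sum the cache footprint in kB: <pages> cached once per page-size variant.
# usage: estimate_kb <pages> <size_kb> [<size_kb> ...]
estimate_kb() {
  local pages=$1 total=0 size
  shift
  for size in "$@"; do
    total=$(( total + pages * size ))
  done
  echo "$total"
}

# Rule of thumb from the slide: give Varnish a percentage (default 80)
# of the memory reported as free.
# usage: cache_budget_mb <free_mb> [percent]
cache_budget_mb() {
  echo $(( $1 * ${2:-80} / 100 ))
}
```

With the slide's figures, `estimate_kb 100 6 120 130` gives the total for the three browser variants in kB, and `cache_budget_mb 170` gives the 80 % budget for the ~170 MB reported as available. The 130 MB used in the following slides stays just under that budget.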
  7. Installing Varnish
  8. • Our VCL files and tools use the syntax of Varnish 3.0 and don’t work with older versions
     • Building Varnish from source:
         sudo su -
         yum install -y gcc make pkgconfig pcre-devel ncurses-devel
         cd /usr/src
         wget http://repo.varnish-cache.org/source/varnish-3.0.2.tar.gz -O - | tar xz
         cd varnish-3.0.2
         ./configure
         make && make install
         exit
     • Copy the VCL configuration file to /etc/varnish and try invoking Varnish:
         sudo /usr/local/sbin/varnishd -V
  9. • Running with 130 MB of memory:
         sudo /usr/local/sbin/varnishd -s malloc,130M \
           -f /etc/varnish/<your_config>.vcl \
           -T 127.0.0.1:2000 -a 0.0.0.0:80
     • Only if out of memory, try with disk:
         sudo /usr/local/sbin/varnishd -s file,/<path>/<file>,3G \
           -f /etc/varnish/<your_config>.vcl \
           -T 127.0.0.1:2000 -a 0.0.0.0:80
     • Stopping Varnish:
         sudo pkill varnishd
  10. • Useful commands for monitoring Varnish:
          varnishstat
          varnishtop
          varnishhist
          varnishsizes
          varnishlog
      • Purging main page / all pages from cache:
          varnishadm -T localhost:2000 ban.url "^/$"
          varnishadm -T localhost:2000 ban.url "^/.*"
      • Further performance tuning:
          sudo /usr/local/sbin/varnishd -s malloc,130M -u nobody -g nobody \
            -p cli_timeout=30 -p thread_pool_add_delay=2 \
            -p thread_pool_min=400 -p thread_pool_max=4000 -p session_linger=100 \
            -f /etc/varnish/<your_config>.vcl -T 127.0.0.1:2000 -a 0.0.0.0:80
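
To read a hit rate off the monitoring tools listed above, one option is to pipe the one-shot output of `varnishstat -1` through a small filter. The sketch below assumes the Varnish 3.0 counter names `cache_hit` and `cache_miss` appear in the first column with their values in the second. The helper name `hit_ratio` is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: reads "varnishstat -1" style lines on stdin and
# prints the cache hit ratio as a percentage with one decimal.
hit_ratio() {
  awk '
    $1 == "cache_hit"  { hit  = $2 }
    $1 == "cache_miss" { miss = $2 }
    END {
      if (hit + miss > 0) printf "%.1f\n", 100 * hit / (hit + miss)
      else                print "n/a"
    }'
}

# Intended usage on the Varnish host (not executed here):
#   varnishstat -1 | hit_ratio
```

Reading from stdin keeps the filter easy to try out with saved output before pointing it at a live instance.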
  11. Installing Security.VCL
  12. • Varnish makes a nice first line of defense against web attacks
      • Security.VCL is a web application firewall similar to Apache mod_security but faster
          # If not yet installed: sudo yum install -y make
          wget https://github.com/comotion/security.vcl/tarball/master -O - | tar xz
          cd <comotion-security-dir>/vcl/
          sudo make
          cd ..
          sudo ln -s $PWD/vcl/ /etc/varnish/security
      • Edit your VCL file and add this line near the top:
          include "/etc/varnish/security/main.vcl";
  13. • Edit file vcl/config.vcl and comment out some rules:
          #include "/etc/varnish/security/modules/robots.vcl";
          #include "/etc/varnish/security/modules/cloak.vcl";
      • Finally, reload your Varnish configuration and test the firewall. Visiting these URLs must return “Error 403 Naughty, not nice!”
          • example.com/exploit/foo/bar:bla
          • example.com/index.old
          • example.com/SELECT FROM
          • example.com/javascript:
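
The firewall test above can also be run from the command line instead of a browser. A minimal sketch, assuming curl is installed: the helper names `http_status` and `check_blocked` are invented for this example, and the space in "SELECT FROM" has to be written as %20 on the command line.

```shell
#!/bin/bash
# Hypothetical helpers for checking that Security.VCL answers with 403.

# Print only the HTTP status code for a URL.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# usage: check_blocked <base_url> <path> [<path> ...]
# Returns non-zero if any path is NOT answered with 403.
check_blocked() {
  local base=$1 path status failed=0
  shift
  for path in "$@"; do
    status=$(http_status "$base$path")
    if [ "$status" = "403" ]; then
      echo "OK   $path"
    else
      echo "FAIL $path (got $status)"
      failed=1
    fi
  done
  return $failed
}

# Intended usage (not executed here):
#   check_blocked http://example.com /exploit/foo/bar:bla /index.old \
#     /SELECT%20FROM /javascript:
```

Keeping the status lookup in its own function makes it easy to swap in a different client, or a stub when trying the loop without a running Varnish.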
  14. Installing New Relic
  15. • New Relic allows tracking request queue times, which is a relevant metric when load testing Varnish
      • Sign up for a free account at http://newrelic.com/, subscribe to a weekly performance summary and write down your license key
          sudo rpm -Uvh http://yum.newrelic.com/pub/newrelic/el5/x86_64/newrelic-repo-5-3.noarch.rpm
          sudo yum install -y newrelic-sysmond
          sudo nrsysmond-config --set license_key=<your_license_key>
          sudo /etc/init.d/newrelic-sysmond start
  16. • Create file /etc/varnish/newrelic.h
          /*
           * Add X-Request-Start header so we can track queue times in New Relic RPM
           */
          #include <stdio.h>
          #include <sys/time.h>
          struct timeval detail_time;
          gettimeofday(&detail_time, NULL);
          char start[20];
          sprintf(start, "t=%lu%06lu", detail_time.tv_sec, detail_time.tv_usec);
          /* "\020" is the octal length (16) of the header name "X-Request-Start:" */
          VRT_SetHdr(sp, HDR_REQ, "\020X-Request-Start:", start, vrt_magic_string_end);
      • Add the following to your VCL file, inside vcl_recv:
          C{
            #include </etc/varnish/newrelic.h>
          }C
  17. After a while, the request queuing parameter should appear in New Relic RPM.
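
The header written by newrelic.h has the form "t=" followed by the Unix time in seconds and a six-digit microsecond part. When debugging, it can help to split such a value back into its two parts. A small bash sketch, where the function name `parse_request_start` is made up for this example:

```shell
#!/bin/bash
# Hypothetical helper: split an X-Request-Start value of the form
# t=<seconds><6-digit microseconds> into "<seconds> <microseconds>".
parse_request_start() {
  local value=${1#t=}                  # drop the leading "t="
  echo "${value%??????} ${value: -6}"  # everything but the last 6 digits, then the last 6
}

# Example call with an arbitrary value:
#   parse_request_start t=1329638400123456
```

The split relies on the "%06lu" padding in the sprintf call, which guarantees that the microsecond part is always exactly six digits long.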
  18. Installing Munin
  19. • Install Munin and Varnish plugins:
          sudo su -
          yum install -y munin-node munin
          cd /usr/share/munin/plugins/
          wget https://raw.github.com/munin-monitoring/contrib/master/plugins/varnish/varnish_allocated
          wget https://raw.github.com/munin-monitoring/contrib/master/plugins/varnish/varnish_cachehitratio
          wget https://raw.github.com/munin-monitoring/contrib/master/plugins/varnish/varnish_healthy_backends
          wget https://raw.github.com/munin-monitoring/contrib/master/plugins/varnish/varnish_hitrate
          wget https://raw.github.com/munin-monitoring/contrib/master/plugins/varnish/varnish_total_objects
          chmod a+x /usr/share/munin/plugins/varnish_*
          ln -s /usr/share/munin/plugins/varnish_* /etc/munin/plugins/
          exit
      • Command “munin-node-configure” lists installed plugins
  20. • Add to /etc/munin/plugin-conf.d/munin-node
          [varnish*]
          user root
          # Uncomment to set network traffic warning at 400K
          # [if_*]
          # env.warning 400000
      • Edit e-mail settings in /etc/munin/munin.conf
          contact.me.command mail -s "Munin notification" admin@example.com
          contact.me.always_send warning critical
      • Start Munin: “sudo service munin-node start”
  21. Munin notifications can be tested by setting a low threshold for traffic.
  22. Cloud platforms allow setting useful alerts as well, for example Amazon CloudWatch.
  23. ...or sign up for a free RevealCloud account at https://app.copperegg.com/signup/free
  24. Weekly E-mail from Munin
  25. • Create file /etc/varnish/email_varnish_reports.sh
          #!/bin/bash
          # Send Munin generated Varnish statistics by e-mail
          VARNISH_LOCATION="Tokyo"
          REPORT_PATH=/var/www/html/munin/localhost/localhost
          EMAIL_RECIPIENT="admin@example.com"
          EMAIL_SUBJECT="Varnish Weekly Statistics"
          EMAIL_BODY="Weekly statistics attached."
          hash mutt 2>&- || { echo -e >&2 "\nMutt not installed, aborting.\n"; exit 1; }
          echo $EMAIL_BODY | mutt -s "$EMAIL_SUBJECT ($VARNISH_LOCATION)" \
            -a $REPORT_PATH/varnish_cachehitratio-week.png \
            -a $REPORT_PATH/varnish_hitrate-week.png \
            -a $REPORT_PATH/varnish_total_objects-week.png \
            -a $REPORT_PATH/varnish_allocated-week.png \
            -a $REPORT_PATH/df-week.png \
            -a $REPORT_PATH/threads-week.png \
            -a $REPORT_PATH/cpu-week.png \
            -a $REPORT_PATH/memory-week.png \
            -- $EMAIL_RECIPIENT
  26. • Edit file /etc/crontab to send a report every Monday
          MAILTO=admin@example.com
          00 08 * * Mon root /etc/varnish/email_varnish_reports.sh
      • Make sure Mutt is installed and restart the cron daemon
          sudo yum install -y mutt
          sudo service crond restart
  27. Limited Browser Access to Munin Graphs
  28. • Install lighttpd
          sudo yum install -y lighttpd
      • Edit file /etc/lighttpd/lighttpd.conf
          server.port = 8081
          server.document-root = server_root + "/html"
          $HTTP["remoteip"] !~ "127.0.0.1" {
            url.access-deny = ( "" )
          }
      • Start lighttpd
          sudo service lighttpd start
  29. • Now you can let Varnish control access to Munin graphs. Benefit: maintain ACLs in one configuration file only.
      • Edit your VCL file and add this line near the top:
          backend monitoring { .host = "127.0.0.1"; .port = "8081"; }
      • Add the following in the beginning of vcl_recv:
          if (req.url ~ "^/munin" && client.ip ~ internal
              && (req.url ~ "\?your-secret-token"
              || req.http.referer ~ "(www.)?example.com")) {
            set req.backend = monitoring;
            return (pipe);
          }
  30. Automatic Restarting
  31. • Install daemontools and create Varnish service directory
          sudo su -
          mkdir -p /package
          cd /package
          wget http://cr.yp.to/daemontools/daemontools-0.76.tar.gz
          tar zxpf daemontools-0.76.tar.gz
          rm -f daemontools-0.76.tar.gz
          cd admin/daemontools-0.76
          # Comment out "extern int errno" and include <errno.h> instead
          sed -i '/extern int errno/{s/^/\/* /;s/$/ *\//;G;s/$/#include <errno.h>/;}' src/error.h
          package/install
          mkdir /var/service
          mkdir -m 1755 /var/service/varnish
  32. • Stop Varnish in case it’s running and create an executable script /var/service/varnish/run:
          #!/bin/sh
          # Daemontools run script for starting Varnish
          exec 2>&1
          exec echo | mail -s "Varnish in Tokyo restarting" admin@example.com
          exec varnishd -F -s malloc,130M -u nobody -g nobody \
            -p cli_timeout=30 -p thread_pool_add_delay=2 \
            -p thread_pool_min=400 -p thread_pool_max=4000 -p session_linger=100 \
            -f /etc/varnish/varnish.tokyo.vcl -T 127.0.0.1:2000 -a 0.0.0.0:80
      • Note that the script may get called multiple times during reboot, thus sending several e-mails
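
If the repeated restart e-mails are a nuisance, one possible workaround (not part of the original run script) is to remember when the last notification went out and skip the mail if it was too recent. A sketch under that assumption: the function name `should_notify` and the stamp file path in the usage comment are made up for this example.

```shell
#!/bin/bash
# Hypothetical throttle: succeed (and record the time) only if the last
# notification is older than <min_interval_seconds>.
# usage: should_notify <stamp_file> <min_interval_seconds>
should_notify() {
  local stamp=$1 interval=$2 now last
  now=$(date +%s)
  if [ -f "$stamp" ]; then
    last=$(cat "$stamp" 2>/dev/null)
    # Too soon since the previous notification: tell the caller to skip.
    [ $(( now - ${last:-0} )) -lt "$interval" ] && return 1
  fi
  echo "$now" > "$stamp"
  return 0
}

# Possible use inside a run script (not executed here):
#   should_notify /var/run/varnish-restart.stamp 600 && \
#     echo | mail -s "Varnish in Tokyo restarting" admin@example.com
```

The stamp file holds a single epoch timestamp, so a missing or empty file simply counts as "never notified".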
  33. • Create a log script and add symbolic link:
          mkdir -m 755 /var/service/varnish/log
          cd /var/service/varnish/log
          wget http://qmail.jms1.net/scripts/service-any-log-run
          mv service-any-log-run run
          chmod 755 run
          ln -s /var/service/varnish /service/varnish
      • Confirm that the services are running:
          svstat /service/varnish /service/varnish/log
      • If daemontools is not running, type "sudo /command/svscanboot &". Varnish is stopped by typing "svc -d /service/varnish" and started with "svc -u /service/varnish".
  34. • Reboot the system to check that everything works fine. You might have to take two more steps. Comment out this line from file /etc/inittab:
          #SV:12345:respawn:/command/svscanboot
      • Create file /etc/init/svscan.conf:
          start on runlevel [12345]
          stop on runlevel [^12345]
          respawn
          exec /command/svscanboot
      • Add similar scripts /service/<your-service>/run for all services you need to manage, e.g. Munin and lighttpd.
  35. • Finally, you can create a swap file in case Varnish needs it:
          #!/bin/bash
          # Create swapfile if not already present. Default size is 2 GB.
          if [ ${SWAP_SIZE_MEGABYTES:=2048} -eq 0 ];then
            echo No swap size given, skipping.
          else
            if [ -e /swapfile ];then
              echo /swapfile already exists, skipping.
            else
              echo Creating /swapfile of $SWAP_SIZE_MEGABYTES MB
              dd if=/dev/zero of=/swapfile bs=1024 count=$(($SWAP_SIZE_MEGABYTES*1024))
              mkswap /swapfile
            fi
            swapon /swapfile
            echo Swap Status:
            swapon -s
          fi
  36. DOWNLOADS
      • varnish.tokyo.vcl (VCL example) - https://gist.github.com/1754248
      • newrelic.h - https://gist.github.com/1817420
      • email_varnish_reports.sh - https://gist.github.com/1817400
      • run (daemontools) - https://gist.github.com/1818886
      • create_swapfile.sh - https://gist.github.com/1817411
      • daemontools log script - http://qmail.jms1.net/scripts/service-any-log-run
      • Other code snippets of this presentation - https://gist.github.com/1819143
  37. VCL Configuration File
  38. (VCL pages 1-2)
      # VCL configuration file for Varnish

      # Define which IP addresses or hosts have access to files that are
      # blocked from the public internet
      acl internal {
        "localhost";
      }

      # Define origin servers (your origin server here)
      backend web { .host = "1.2.3.4"; .port = "80"; }
      backend web_ssl { .host = "1.2.3.4"; .port = "443"; }

      # Uncomment to support Munin graphs
      # backend monitoring { .host = "127.0.0.1"; .port = "8081"; }

      # Uncomment to include Security.VCL module
      # @see: https://github.com/comotion/security.vcl
      # include "/etc/varnish/security/main.vcl";

      # Respond to incoming requests
      sub vcl_recv {
        # Uncomment to support Munin graphs. Access is granted if visitor
        # is coming from a whitelisted IP address and secret token is
        # provided.
        # e.g. http://www.example.com/munin?your-secret-token
        # if (req.url ~ "^/munin" && client.ip ~ internal
        #     && (req.url ~ "\?your-secret-token"
        #     || req.http.referer ~ "(www.)?example.com")) {
        #   set req.backend = monitoring;
        #   return (pipe);
        # }

        # Uncomment to have New Relic track queue times
        # C{
        #   #include </etc/varnish/newrelic.h>
        # }C

        # Handle HTTPS connection
        if (server.port == 443) {
          set req.backend = web_ssl;
        } else {
          set req.backend = web;
        }

        if (req.restarts == 0) {
          if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For =
              req.http.X-Forwarded-For + ", " + client.ip;
          } else {
            set req.http.X-Forwarded-For = client.ip;
          }
        }

        # Normalize requests sent via curl's -X mode and LWP
        if (req.url ~ "^http://") {
          set req.url = regsub(req.url, "http://[^/]*", "");
        }

        # Normalize hostname to avoid double caching
        set req.http.host = regsub(req.http.host,
          "^example.com$", "www.example.com");

        # Uncomment to support shared hosting when testing through staging
        # server
        # set req.http.host = regsub(req.http.host, "^cache.example.com$",
        #   "www.example.com");

        # Use anonymous, cached pages if all backends are down
        if (!req.backend.healthy) {
          unset req.http.Cookie;
        }
  39. (VCL pages 3-4)
        # Allow the backend to serve up stale content if it is
        # responding slowly
        set req.grace = 6h;

        # Do not cache these paths
        if (req.url ~ "^/status.php$" ||
            req.url ~ "^/administrator") {
          return (pass);
        }

        # Do not cache authenticated sessions
        if (req.http.Cookie && req.http.Cookie ~ "authtoken=") {
          return (pipe);
        }

        # Do not allow outside access to configuration.php
        if (req.url ~ "^/configuration.php$" && !client.ip ~ internal) {
          # Have Varnish throw the error directly
          # error 404 "Page not found.";
          # Use a custom error page
          set req.url = "/";
        }

        # Allow purge only from internal users
        if (req.request == "PURGE") {
          if (!client.ip ~ internal) {
            error 405 "Not allowed.";
          }
          return (lookup);
        }

        # Handle compression correctly. Different browsers send
        # different "Accept-Encoding" headers, even though they
        # mostly all support the same compression mechanisms. By
        # consolidating these compression headers into a consistent
        # format, we can reduce the size of the cache and get more hits.
        # @see: http://varnish.projects.linpro.no/wiki/FAQ/Compression
        if (req.http.Accept-Encoding) {
          if (req.http.Accept-Encoding ~ "gzip") {
            # If the browser supports it, we'll use gzip.
            set req.http.Accept-Encoding = "gzip";
          } else if (req.http.Accept-Encoding ~ "deflate") {
            # Next, try deflate if it is supported.
            set req.http.Accept-Encoding = "deflate";
          } else {
            # Unknown algorithm. Remove it and send unencoded.
            unset req.http.Accept-Encoding;
          }
        }

        # Always cache the following file types for all users
        if (req.url ~
            "(?i)\.(png|gif|jpeg|jpg|ico|swf|pdf|txt|css|js|html|htm|gz|xml)(\?[a-z0-9]+)?$") {
          unset req.http.Cookie;
        }

        if (req.request != "GET" &&
            req.request != "HEAD" &&
            req.request != "PUT" &&
            req.request != "POST" &&
            req.request != "TRACE" &&
            req.request != "OPTIONS" &&
            req.request != "DELETE") {
          /* Non-RFC2616 or CONNECT which is weird. */
          return (pipe);
        }
  40. (VCL pages 5-6)
        if (req.request != "GET" && req.request != "HEAD") {
          return (pass);
        }

        # We cache requests with cookies too (e.g. Google Analytics)
        # Original: if (req.http.Authenticate || req.http.Authorization
        #               || req.http.Cookie) {
        if (req.http.Authenticate || req.http.Authorization) {
          return (pass);
        }

        return (lookup);
      }

      # sub vcl_pipe {
      #   # Note that only the first request to the backend will have
      #   # X-Forwarded-For set. If you use X-Forwarded-For and want to
      #   # have it set for all requests, make sure to have:
      #   # set bereq.http.connection = "close";
      #   # here. It is not set by default as it might break some
      #   # broken web applications, like IIS with NTLM authentication.
      #   return (pipe);
      # }

      # sub vcl_pass {
      #   return (pass);
      # }

      # Determine the cache key when storing/retrieving a cached page
      sub vcl_hash {
        hash_data(req.url);
        if (req.http.host) {
          hash_data(req.http.host);
        } else {
          hash_data(server.ip);
        }
        # Don't include cookie in hash
        # if (req.http.Cookie) {
        #   hash_data(req.http.Cookie);
        # }
        return (hash);
      }

      sub vcl_hit {
        if (req.request == "PURGE") {
          purge;
          error 200 "Purged.";
        }
        if (obj.ttl <= 0s) {
          return (pass);
        }
        return (deliver);
      }

      sub vcl_miss {
        if (req.request == "PURGE") {
          error 404 "Not in cache.";
        }
        return (fetch);
      }

      # Called when the requested object has been retrieved from the backend,
      # or the request to the backend has failed; "beresp" stands for
      # back-end response
      sub vcl_fetch {
  41. (VCL pages 7-8)
        # Don't allow static files to set cookies
        if (req.url ~
            "(?i)\.(png|gif|jpeg|jpg|ico|swf|pdf|txt|css|js|html|htm|gz|xml)(\?[a-z0-9]+)?$") {
          unset beresp.http.Set-cookie;
        }

        # Allow items to be stale if needed
        set beresp.grace = 6h;

        if (beresp.ttl <= 0s) {
          set beresp.http.X-Cacheable = "NO:Not Cacheable";
          return (hit_for_pass);
        } else if (req.http.Cookie ~ "(UserID|_session)") {
          # Don't cache content for logged in users
          set beresp.http.X-Cacheable = "NO:Got Session";
          return (hit_for_pass);
        } else if (beresp.http.Cache-Control ~ "private") {
          # Respect the Cache-Control=private header from the backend
          set beresp.http.X-Cacheable = "NO:Cache-Control=private";
          return (hit_for_pass);
        } else if (beresp.ttl < 1s) {
          # Extend the lifetime of the object artificially
          set beresp.ttl = 300s;
          set beresp.grace = 300s;
          set beresp.http.X-Cacheable = "YES:Forced";
        } else {
          # Varnish determined the object was cacheable
          set beresp.http.X-Cacheable = "YES";
          # Uncomment to have Varnish cache objects longer than the clients
          # do. Cache must be purged manually when the site changes, so don't
          # use with frequently changing content - comments, visitor counters
          # etc.
          # @see: https://www.varnish-cache.org/trac/wiki/VCLExampleLongerCaching
          # unset beresp.http.expires;
          # set beresp.ttl = 1w;
          # set beresp.http.magicmarker = "1";
        }

        return (deliver);
      }

      sub vcl_deliver {
        # Uncomment to add hostname to headers
        # set resp.http.X-Served-By = server.hostname;

        # Identify which Varnish handled the request (your location here)
        if (obj.hits > 0) {
          set resp.http.X-Cache = "HIT from Tokyo";
          set resp.http.X-Cache-Hits = obj.hits;
        } else {
          set resp.http.X-Cache = "MISS from Tokyo";
        }

        # Remove version number sometimes set by CMS
        if (resp.http.X-Content-Encoded-By) {
          unset resp.http.X-Content-Encoded-By;
        }

        if (resp.http.magicmarker) {
          # Remove the magic marker, see vcl_fetch
          unset resp.http.magicmarker;
          # By definition we have a fresh object
          set resp.http.Age = "0";
        }

        return (deliver);
      }
  42. (VCL pages 9-10)
      sub vcl_error {
        # Redirect to some other URL in case of root page failure
        # if (req.url ~ "^/?$") {
        #   set obj.status = 302;
        #   set obj.http.Location = "http://backup.example.com/";
        # }

        # Otherwise redirect to root, which will likely be in the cache
        set obj.http.Content-Type = "text/html; charset=utf-8";
        synthetic {"<html><head>
          <title>Page Unavailable</title>
          <style>
            body { background: #efefef; text-align: center; color: white; font-family: Trebuchet MS, sans-serif; }
            #page { width: 500px; margin: 100px auto 0; padding: 30px; background: #888888; border-radius: 14px; -moz-border-radius: 14px; -webkit-border-radius: 14px; border: 0 }
            a, a:link, a:visited { color: #cccccc; }
            .error { color: #222222; }
          </style>
        </head>
        <body onload="setTimeout(function() { window.location = '/' }, 3000)">
          <div id="page">
            <h1 class="title">Page Unavailable</h1>
            <p>The page you requested is temporarily unavailable.</p>
            <p>We're redirecting you to the <a href="/">homepage</a> in 3 seconds.</p>
            <div class="error">(Error "} + obj.status + " " + obj.response + {")</div>
          </div>
        </body></html>
        "};
        return (deliver);
      }
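
The Accept-Encoding normalization in vcl_recv is what keeps the number of cached variants per page small. The same decision logic, rewritten as a bash function purely for illustration (the name `normalize_encoding` is invented, and this is not how Varnish evaluates VCL):

```shell
#!/bin/bash
# Illustration only: mirror the gzip / deflate / none decision from the VCL.
# usage: normalize_encoding "<Accept-Encoding header value>"
normalize_encoding() {
  case "$1" in
    *gzip*)    echo "gzip" ;;     # gzip wins whenever it is offered
    *deflate*) echo "deflate" ;;  # otherwise fall back to deflate
    *)         echo "" ;;         # unknown or empty: serve unencoded
  esac
}

# Example calls:
#   normalize_encoding "gzip, deflate"
#   normalize_encoding "identity"
```

Whatever header a browser sends, the result is one of three values, so each page is stored in at most three encodings.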
  43. ONE MORE THING...
      Incapsula needs your A record and CNAME record (www) to point to their servers. This is obviously not the case when you send visitors to Varnish instead.
      If your DNS settings don’t match Incapsula’s instructions, you’ll see an error message in the control panel and the service might be disabled - not sure about the latter.
      A quick fix is to always send visitors from Ireland and Israel to the default address, as these are the locations of the Incapsula Site Helper bot. If this seems too hackish, more elegant solutions probably exist.
  44. GEODNS SETTINGS
        Record            Area                       Data
        example.com       Europe, Africa, Global     <Varnish Ireland IP>
        example.com       Americas, Global           <Varnish California IP>
        example.com       Asia, Australia, Global    <Varnish Tokyo IP>
        example.com       Ireland, Israel, Global    <Incapsula IP>
        www.example.com   Global                     example.com
        www.example.com   Ireland, Israel            <Incapsula CNAME>
      Matches are made from smallest to largest qualifying records, so Ireland takes precedence over Europe, which in turn precedes the global record. Geo-targeting is never 100% accurate.
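
The "smallest to largest" matching rule can be pictured as a two-step lookup: country first, then region, then the global pool. The bash sketch below only illustrates that precedence for the example.com records. It is not how any particular GeoDNS provider is configured, and the function name `resolve_target` and the target labels are made up.

```shell
#!/bin/bash
# Illustration of record precedence: country beats region beats global.
# usage: resolve_target <country> <region>
resolve_target() {
  # Smallest area first: the two countries routed to Incapsula.
  case "$1" in
    Ireland|Israel) echo "incapsula"; return ;;
  esac
  # Next, the regional records.
  case "$2" in
    Europe|Africa)  echo "varnish-ireland" ;;
    Americas)       echo "varnish-california" ;;
    Asia|Australia) echo "varnish-tokyo" ;;
    *)              echo "global-pool" ;;  # no regional match
  esac
}

# Example calls:
#   resolve_target Ireland Europe
#   resolve_target Finland Europe
```

A visitor from Ireland is matched by the country record before the Europe record is ever considered, which is the behaviour the quick fix on the previous slide relies on.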
  45. DNS FAILOVER
      • Wondered why area Global was set for so many records?
      • This means that when one edge server is down, requests will be balanced to all remaining servers marked as global
      • As a result, a potential DDoS attack will have to take down 4 destinations instead of one. An alternative would be having only Incapsula as backup and keeping other Varnish boxes (and visitors) oblivious to regional attacks.
      • When a failed edge server is back up again, it will start receiving requests as usual
  46. RESOURCES - VARNISH
      • https://www.varnish-cache.org/trac/wiki/VCLExamples
      • https://www.varnish-cache.org/docs/trunk/installation/upgrade.html
      • http://nwlinux.com/varnish-caching-proxy-configuration-resources/
      • http://nwlinux.com/mitigating-ddos-attacks-with-varnish-proxy/
      • https://github.com/comotion/security.vcl
  47. RESOURCES - MONITORING
      • http://newrelic.com/docs/server/server-monitor-installation-redhat-and-centos
      • http://munin-monitoring.org/
      • http://engineering.gomiso.com/2011/01/04/easy-monitoring-of-varnish-with-munin/
      • http://waste.mandragor.org/munin_tutorial/munin.html
      • http://www.copperegg.com/product/pricing/
  48. RESOURCES
      • http://www.productionmonkeys.net/guides/web-server/varnish
      • http://cr.yp.to/daemontools.html
      • http://www.webperformance.com/library/tutorials/CalculateNumberOfLoadtestUsers/
