Varnish - PLNOG 4

The unabridged version of the presentation on Varnish for PLNOG 4.

Published in: Technology

  1. 1. A modern HTTP accelerator for content providers ● Leszek Urbański, Trader Media East Competence Center ● unabridged version ● PLNOG 4 – Warsaw, 2010-03-05
  2. 2. Modern web apps ● you know this picture...
  3. 3. Modern web apps ● you know this picture...
  4. 4. Modern web apps ● a nice setup...
  5. 5. Modern web apps ● ...but the application is still slow. ● you need efficient web caching, before the traffic hits your app ● CDNs? Hardware? Squid?
  6. 6. Web caching ● CDNs ● expensive ● you are completely dependent on a CDN's service ● hardware ● nice, but... ● $30,000 for one BIG-IP 3600, without redundancy or support, and with only 50 Mbps of compression under the standard licence
  7. 7. Web caching ● Squid ● a forward proxy/cache with optional reverse proxying (HTTP acceleration) ● huge config files full of forward proxy options ● it's slow ● “1970s programming”
  8. 8. http://varnish-cache.org/
  9. 9. Varnish ● a state-of-the-art reverse proxy and cache ● open source, initially developed for a Norwegian tabloid “Verdens Gang” in 2006 ● Poul-Henning Kamp – architect and lead developer ● Linpro AS
  10. 10. Varnish ● used by TOP100 sites ● Twitter ● Photobucket ● weather.com ● answers.com ● Hulu ● Wikia ● source: Ingvar Hagelund http://users.linpro.no/ingvar/varnish/stats-2010-01-18.txt
  11. 11. Varnish ● used by only one Alexa TOP100 site in Poland ● Gadu-Gadu
  12. 12. Architecture ● Varnish does not fight the OS kernel! ● uses virtual memory, two main stevedores: ● mmap() ● malloc() ● scales well in SMP environments ● event-based acceptor ● multi-threaded worker model
  13. 13. Architecture ● avoids expensive memory operations ● workers used in the MRU order, session lingering ● a worker has a private set of variables on the stack ● static buffers – reused ● uses jemalloc library. No noticeable difference with Google's tcmalloc
  14. 14. Architecture ● workspaces ● operate on pointers, do not copy data ● malloc() only for the workspaces ● obj_workspace – per object, for request/response headers and metadata. Watch out for very large headers/cookies! ● sess_workspace – per thread, for request processing ● shm_workspace – for SHM logging
  15. 15. Architecture ● SHM logging ● an mmap()ed file shared by all threads and logging programs ● logging without syscalls!
          memcpy(p + SHMLOG_DATA, t.b, l);
          /* or */
          vsnprintf((char *)(p + SHMLOG_DATA), mlen + 1, fmt, ap);
  16. 16. Architecture ● object eviction from a LRU list ● the list requires locking for writes ● an object is only moved in the LRU list if it hasn't been moved for the last lru_interval seconds ● hitpass objects
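The lru_interval rule on this slide can be modelled in a few lines. The following is a minimal Python sketch, not Varnish's C implementation: the class name `LazyLRU`, the `moves` counter and the explicit `now` arguments are all hypothetical, and the lock that guards the real list is not modelled. It only shows why most cache hits never need to write to the list.

```python
from collections import OrderedDict


class LazyLRU:
    """Toy LRU list that skips reordering for recently moved objects."""

    def __init__(self, lru_interval: float):
        self.lru_interval = lru_interval
        # key -> time of the last move; the oldest entry comes first
        self._order = OrderedDict()
        self.moves = 0  # number of list writes caused by hits

    def insert(self, key, now: float) -> None:
        self._order[key] = now
        self._order.move_to_end(key)

    def touch(self, key, now: float) -> bool:
        """Record a hit; return True only if the list had to be rewritten."""
        if now - self._order[key] < self.lru_interval:
            return False  # moved recently: no list write, so no lock needed
        self._order[key] = now
        self._order.move_to_end(key)
        self.moves += 1
        return True

    def evict(self):
        """Remove and return the least recently moved key."""
        key, _ = self._order.popitem(last=False)
        return key


lru = LazyLRU(lru_interval=2.0)
for name in ("a", "b", "c"):
    lru.insert(name, now=0.0)
lru.touch("a", now=1.0)   # within the interval: ignored
lru.touch("a", now=2.5)   # interval elapsed: "a" moves to the tail
```

The trade-off is that the list order is only approximately "least recently used", in exchange for far fewer writes on hot objects.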
  17. 17. Architecture ● efficient object purging - “ban list” ● need to purge 200,000 objects from the cache without overloading the server? ● Varnish keeps a list of purges ● every object is tested against the list, but only if requested by a client ● if it matches, it is refreshed from a backend ● its “last tested against” pointer is updated
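The lazy evaluation described on this slide can be sketched as follows. This is an illustrative Python model under stated assumptions, not Varnish code: `Cache`, `CachedObject`, `bans_tested` and `backend_fetches` are hypothetical names, bans are reduced to URL regular expressions, and the "last tested against" pointer is modelled as an index into the ban list.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CachedObject:
    url: str
    body: str
    # Index into the ban list: bans before this index were already tested.
    bans_tested: int = 0


@dataclass
class Cache:
    bans: list = field(default_factory=list)     # compiled URL patterns
    objects: dict = field(default_factory=dict)  # url -> CachedObject
    backend_fetches: int = 0

    def ban(self, pattern: str) -> None:
        """Adding a ban is cheap: no cached object is touched here."""
        self.bans.append(re.compile(pattern))

    def _fetch(self, url: str) -> CachedObject:
        self.backend_fetches += 1
        # A freshly fetched object is newer than every existing ban.
        return CachedObject(url, f"fresh:{url}", bans_tested=len(self.bans))

    def get(self, url: str) -> CachedObject:
        obj = self.objects.get(url)
        if obj is not None:
            # Test only against bans added since the object was last checked.
            pending = self.bans[obj.bans_tested:]
            if any(p.search(obj.url) for p in pending):
                obj = None  # banned: refresh from the backend
            else:
                obj.bans_tested = len(self.bans)
        if obj is None:
            obj = self._fetch(url)
            self.objects[url] = obj
        return obj


cache = Cache()
cache.get("/news/1")
cache.get("/img/a.png")
cache.ban(r"^/news/")       # no backend traffic yet
cache.get("/img/a.png")     # tested, not banned, served from cache
cache.get("/news/1")        # banned, fetched again
```

Purging 200,000 objects this way costs one list append up front; the backend load is spread over the subsequent client requests.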
  18. 18. Architecture ● results? ● microsecond-level response for cached objects ● good even for static content ● performance limit currently unknown :-) ● 75,000 reqs/s achieved at TMECC ● 143,000 reqs/s achieved by Kristian from Redpill-Linpro
  19. 19. Architecture ● serving a request from cache:
          <... futex resumed> ) = 0 <0.629910>
          futex(0x7f2a577fe2e8, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000011>
          ioctl(9, FIONBIO, [0]) = 0 <0.000011>
          read(9, "GET /logo.png HTTP/1.0\r\n (...) 8191) = 177 <0.000016>
          clock_gettime(CLOCK_REALTIME, {1265632945, 828835974}) = 0 <0.000011>
          clock_gettime(CLOCK_REALTIME, {1265632945, 828986444}) = 0 <0.000011>
          clock_gettime(CLOCK_REALTIME, {1265632945, 829032564}) = 0 <0.000010>
          writev(9, [{"HTTP/1.1"..., 8}, (...) 12912}], 32) = 13227 <0.000039>
          clock_gettime(CLOCK_REALTIME, {1265632945, 830411262}) = 0 <0.000011>
          close(9) = 0 <0.000019>
          futex(0x44884bf4, FUTEX_WAIT_PRIVATE, 239, NULL <unfinished ...>
      ● 10 system calls, 4 for clock
  20. 20. Features ● run-time management and reconfiguration
          $ telnet localhost 6082
          Trying 127.0.0.1...
          Connected to localhost.
          Escape character is '^]'.
          vcl.list
          200 23
          active 7 boot
          vcl.load new1 /etc/varnish/default.vcl
          200 13
          VCL compiled.
          vcl.use new1
          200 0
  21. 21. Features ● comprehensive logging and management ● varnishadm ● varnishlog ● varnishncsa ● varnishtop ● varnishstat ● varnishhist ● varnishreplay ● varnishtest
  22. 22. Features ● logging examples ● tags
          varnishtop -i RxURL
          varnishtop -i TxURL
          varnishtop -i RxHeader -I '^User-Agent'
          varnishlog -c -o ReqStart 10.0.0.1
          varnishlog -b -o TxHeader '^X-Forwarded-For: .*10.0.0.1'
  23. 23. Features ● varnishstat – real time statistics
          client_conn          87737603     99.74  Client connections accepted
          client_req          335496200    381.40  Client requests received
          cache_hit           307936704    350.07  Cache hits
          cache_hitpass          811746      0.92  Cache hits for pass
          backend_conn         12311926     14.00  Backend conn. success
          n_object               549675       .    N struct object
          n_wrk                     100       .    N worker threads
          n_expired            23826372       .    N expired objects
          n_lru_nuked                 0       .    N LRU nuked objects
          n_wrk_failed                0      0.00  N worker threads not created
          s_req               335510357    381.41  Total Requests
          s_pass                2947900      3.35  Total pass
          s_fetch              27317481     31.05  Total fetch
          sma_nbytes         6661407561       .    SMA outstanding bytes
          sma_balloc      2173616292374       .    SMA bytes allocated
          sma_bfree       2166954884813       .    SMA bytes free
          backend_req          27318738     31.06  Backend requests made
          esi_parse                   0      0.00  Objects ESI parsed (unlock)
          esi_errors                  0      0.00  ESI parse errors (unlock)
  24. 24. Features ● timing information
          830 ReqEnd c 877345549 1233949945.075706005 1233949945.075754881 0.017112017 0.000022888 0.000025988
      ● fields, in order: type (ReqEnd), XID (877345549), start time, end time, accept()-processing, processing-delivery, delivery time
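To make the field order of the ReqEnd record concrete, here is a small Python sketch that parses the payload shown on this slide. The names `ReqEnd` and `parse_req_end` are hypothetical helpers, not part of any Varnish tool; `Decimal` is used because epoch timestamps with nanosecond digits do not fit in a float without losing precision.

```python
from decimal import Decimal
from typing import NamedTuple


class ReqEnd(NamedTuple):
    xid: int
    start: Decimal
    end: Decimal
    accept_to_processing: Decimal
    processing_to_delivery: Decimal
    delivery: Decimal


def parse_req_end(payload: str) -> ReqEnd:
    """Parse the six space-separated fields that follow the ReqEnd tag."""
    xid, *times = payload.split()
    if len(times) != 5:
        raise ValueError(f"expected 5 timing fields, got {len(times)}")
    return ReqEnd(int(xid), *(Decimal(t) for t in times))


record = parse_req_end(
    "877345549 1233949945.075706005 1233949945.075754881 "
    "0.017112017 0.000022888 0.000025988"
)
# Time spent between the start and the end of request handling.
elapsed = record.end - record.start
```

For this record, `elapsed` equals the sum of the processing-delivery and delivery times, which is a quick sanity check that the columns were read in the right order.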
  25. 25. Features ● backend load balancing – directors ● round-robin ● random ● backend health polling – using new connections ● grace ● URL serialization ● IPv6 support
  26. 26. Features ● no forward-proxy support – can be done, but only with a huge amount of configuration magic ● flexible purging
          purge req.http.host == foobar.com && req.url ~ ^/directory/.*$
          purge obj.http.Cookie ~ example=true
      ● ESI support
  27. 27. Features ● Edge Side Includes ● a markup language for dynamic content assembly ● used by Akamai, IBM WebSphere, F5, Varnish ● without ESI: page-level caching decisions ● with ESI: a page can be split into separate blocks and assembled by the cache server
  28. 28. Features ● Edge Side Includes ● Varnish implements a small subset of ESI ● no compression support yet ● no If-Modified-Since support yet
          <esi:include src="/esi/hot_news.html"/>
          <esi:remove>
            <a href="/something">something</a>
          </esi:remove>
          <!--esi
            <p>Hot news:<esi:include src="/hot_news.html"/></p>
          -->
  29. 29. Features ● VCL – Varnish Configuration Language ● a domain-specific language ● translated to C and compiled ● dynamically loaded ● similar to C, Perl ● = == ! && || ~ !~ ● character escaping like in URLs: %nn ● no user-defined variables, use HTTP headers:
          set req.http.something = "";
          unset req.http.something;
  30. 30. Features ● VCL – Varnish Configuration Language ● "normal" "concatenated" "strings" or {"string string "} ● synthetic { "string" } ● if () {} elsif {} ● no loops ● include "file.vcl"; ● regsub(), regsuball()
  31. 31. Features ● VCL – Varnish Configuration Language ● user-defined subroutines
          sub f { do_magic; }
          call f;
      ● no arguments / return values in subs ● return(); exclusive to internal VCL functions ● special variables: now (unix time), client.ip, server.ip, server.port, server.identity
  32. 32. Features ● VCL – Varnish Configuration Language ● ACLs
          acl localnet {
              "localhost";
              "10.0.0.0/24";
              ! "10.0.0.1";
          }
          if (client.ip ~ localnet) { do_magic; }
      ● security.vcl ● if everything else fails... embedded C!
  33. 33. Features ● embedded C in VCL ● example: syslog logging from VCL (don't :-)
          C{
          #include <syslog.h>
          }C
          C{
              syslog(LOG_INFO, "Something happened at VCL line XX.");
              syslog(LOG_ERR, "Response from backend: XID %s request %s %s \"%s\" %d \"%s\" \"%s\"",
                  VRT_r_req_xid(sp), VRT_r_req_request(sp),
                  VRT_GetHdr(sp, HDR_REQ, "\005Host:"),
                  VRT_r_req_url(sp), VRT_r_obj_status(sp), VRT_r_obj_response(sp),
                  VRT_GetHdr(sp, HDR_OBJ, "\011Location:"));
          }C
  34. 34. VCL ● request path through VCL ● vcl_recv ● vcl_pipe ● vcl_pass ● vcl_hash ● vcl_{hit,miss} ● vcl_fetch ● vcl_deliver ● http://varnish-cache.org/wiki/VCLExampleDefault ● this graph is oversimplified!
  35. 35. VCL ● vcl_recv ● called at the beginning, after the request has been received ● possible returns: error, pass, pipe, lookup ● example variables: req.request, req.url, req.proto, req.backend, req.backend.healthy, req.http.Header
          if (req.http.host == "static.foo.com" && req.url ~ "^/static/.*") {
              set req.backend = cluster1;
          } else {
              error 404 "Unknown virtual host";
          }
  36. 36. VCL ● vcl_pipe ● called when entering pipe mode ● shifts bytes back and forth (client ↔ backend) ● possible returns: error, pipe ● example variables: bereq.request, bereq.url, bereq.proto, bereq.http.Header ● timeouts
  37. 37. VCL ● vcl_pass ● called when entering pass mode ● the request is passed to the backend without caching ● possible returns: error, pass ● example variables: bereq.*
  38. 38. VCL ● vcl_hash ● called on object lookup ● generates a user-configurable object hash ● possible returns: hash ● example variables: req.hash
          sub vcl_hash {
              if (req.url ~ "^/content" && req.http.Cookie ~ "adult=true") {
                  set req.hash += "adultContent";
                  set req.http.X-Adult-Content = "1";
              }
          }
  39. 39. VCL ● vcl_hit ● called after lookup when hit ● possible returns: error, pass, deliver ● example variables: obj.hits, obj.ttl ● caveat: do not modify the object here! ● example: adaptive TTLs:
          if (req.http.host ~ "^images.") {
              if (obj.hits > 5 && obj.hits < 10) {
                  set obj.ttl = 8h;
              } elsif (obj.hits >= 10) {
                  set obj.ttl = 2d;
              }
          }
  40. 40. VCL ● vcl_miss ● called after lookup when missed ● possible returns: error, pass, fetch ● example variables: bereq.*
  41. 41. VCL ● vcl_fetch ● called after the object has been fetched ● possible returns: error, pass, deliver, esi ● example variables: obj.hits, obj.proto, obj.status, obj.response, obj.cacheable, obj.ttl, obj.lastuse ● obj.cacheable means: obj.status is 200, 203, 300, 301, 302, 410 or 404 ● forced obj.ttl first set here ● obj. called beresp. in trunk
  42. 42. VCL ● vcl_fetch ● ESI processing takes place here
          <!-- /esi/example.html -->
          <esi:include src="/counter.cgi"/>

          sub vcl_fetch {
              if (req.url ~ "/esi/" || obj.http.X-ESI) {
                  esi;
                  set obj.ttl = 1d;
              } elsif (req.url == "/counter.cgi") {
                  set obj.ttl = 1m;
              }
          }
  43. 43. VCL ● vcl_deliver ● called before delivery to the client ● possible returns: error, deliver ● example variables: resp.proto, resp.status, resp.response, resp.http.HEADER ● modify headers for the client here
          set resp.http.X-Served-By = server.identity;
          if (obj.hits > 0) {
              set resp.http.X-Varnish-Hit = "HIT";
              set resp.http.X-Varnish-Hits = obj.hits;
          } else {
              set resp.http.X-Varnish-Hit = "MISS";
          }
  44. 44. VCL ● vcl_error ● called on errors ● possible returns: deliver ● example variables: req.*, obj.* ● customizing error pages:
          sub vcl_error {
              if (req.url ~ "^/MONITOR.txt$") {
                  synthetic {"MONITOR "};
                  deliver;
              }
          }
  45. 45. VCL ● restarts ● the "restart" keyword sends the request all the way back to vcl_recv; it is available everywhere
          sub vcl_fetch {
              if (obj.status >= 500) {
                  restart;
              }
          }
          sub vcl_error {
              if (obj.status == 500 && req.restarts < 4) {
                  restart;
              }
          }
  46. 46. VCL ● restarts ● the "restart" keyword – you can even try another data center
          sub vcl_recv {
              if (req.restarts == 0) {
                  set req.backend = data_center_1;
              } elsif (req.restarts == 1) {
                  set req.backend = data_center_2;
              }
          }
  47. 47. VCL ● things to remember ● req. data structure available throughout the VCL (except in vcl_deliver) ● do not modify objects in vcl_hit (except for TTL) ● if unsure, translate the VCL to C varnishd -C -f file.vcl ● look for VRT_count(sp, X) for ordering ● if you don't return in a vcl_*, default VCL for that function is appended
  48. 48. VCL examples ● purging, "the squid way"
          sub vcl_recv {
              if (req.request == "PURGE") {
                  if (!client.ip ~ purge) {
                      error 405 "Not allowed";
                  }
                  lookup;
              }
          }
          sub vcl_hit {
              if (req.request == "PURGE") {
                  set obj.ttl = 0s;
                  error 200 "Purged";
              }
          }
          sub vcl_miss {
              if (req.request == "PURGE") {
                  error 404 "Not found";
              }
          }
  49. 49. VCL examples ● saint mode (trunk only) ● do not send errors to clients
          sub vcl_fetch {
              if (beresp.status >= 500) {
                  set beresp.saintmode = 20s;
                  restart;
              }
              set beresp.grace = 30m;
          }
      ● saint mode will disable a backend for a specified period of time ● if all backends are unavailable - grace
  50. 50. VCL examples ● force grace on error
          sub vcl_error {
              if (req.restarts == 0) {
                  set req.http.X-Serve-Graced = "1";
                  restart;
              }
          }
          sub vcl_recv {
              if (req.http.X-Serve-Graced && req.restarts == 1) {
                  set req.backend = dead;
              }
          }
      ● define a "dead" backend with health polling ● when restarted from vcl_error, graced content will be served
  51. 51. VCL examples ● URL rewriting
          if (req.http.host ~ "^(www.)?foo" && req.url ~ "^/images/") {
              set req.http.host = "images.foo";
              set req.url = regsub(req.url, "^/images/", "/");
          }
      ● redirects (a bit of a hack)
          sub vcl_recv {
              if (req.http.host ~ "^(www.)?foo.com" && req.http.User-Agent ~ "iPhone|Nokia|Motorola") {
                  error 701 "Moved temporarily";
              }
          }
          sub vcl_error {
              if (obj.status == 701) {
                  set obj.http.Location = "http://m.foo.com/";
                  set obj.status = 302;
                  deliver;
              }
          }
  52. 52. VCL examples ● caching publicly available authorized pages
          sub vcl_fetch {
              if (obj.http.Authorization && !obj.http.Cache-Control ~ "public") {
                  pass;
              }
          }
      ● caching logged in users (be careful!) ● http://varnish-cache.org/wiki/VCLExampleCachingLoggedInUsers ● possible with per-user caching and careful use of ESI ● separate "(not) logged in" objects from "logged in as..."
  53. 53. VCL examples ● cookie based hashing
          sub vcl_hash {
              if (req.http.Cookie ~ "language=esperanto") {
                  set req.hash += "LangEsperanto";
              }
          }
      ● result: a separate cached version of the object for requests with Cookie: language=esperanto; ● extracting the value of a cookie ● nothing more than a regexp
          regsub(req.http.Cookie, "^.*?cookie=([^;]*);*.*$", "\1");
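The cookie-extraction regexp can be tried outside Varnish. The sketch below is Python, not VCL: `regsub_style` mimics a single regsub() substitution with the slide's pattern, where `cookie` is the slide's placeholder cookie name. `strict_lookup` and its pattern are hypothetical additions that show one way to avoid matching a cookie whose name merely ends in the same word.

```python
import re
from typing import Optional

# Pattern from the slide; "cookie" stands in for the real cookie name.
SLIDE_PATTERN = re.compile(r"^.*?cookie=([^;]*);*.*$")
# Hypothetical stricter variant: the name must start the header or follow ";".
STRICT_PATTERN = re.compile(r"(?:^|;\s*)cookie=([^;]*)")


def regsub_style(header: str) -> str:
    """Replace the whole header with the captured value, like regsub()."""
    return SLIDE_PATTERN.sub(r"\1", header, count=1)


def strict_lookup(header: str) -> Optional[str]:
    """Return the cookie value, or None when the cookie is absent."""
    match = STRICT_PATTERN.search(header)
    return match.group(1) if match else None


value = regsub_style("lang=pl; cookie=abc; x=1")          # "abc"
unchanged = regsub_style("lang=pl")                        # no match: input returned
shadowed = regsub_style("mycookie=zzz; cookie=abc")        # picks up "zzz"
strict = strict_lookup("mycookie=zzz; cookie=abc")         # "abc"
```

Two properties are worth noting: when the cookie is missing, the substitution returns the whole header unchanged, and a lazy `.*?` prefix will happily stop at a longer cookie name that ends with the same text.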
  54. 54. VCL examples ● serving synthetic responses
          if (req.url ~ "^/MONITOR.txt") {
              error 200 "OK";
          }
      ● allowing reloads from browsers without purging
          if (req.http.Cache-Control ~ "(no-cache|no-store|private)") {
              pass;
          }
      ● watch out for nasty bots! ● passing everything for secure URLs
          if (req.url ~ "^/secure") {
              pass;
          }
  55. 55. VCL examples ● normalizing Accept-Encoding headers for compression
          if (req.http.Accept-Encoding) {
              if (req.http.Accept-Encoding ~ "gzip") {
                  set req.http.Accept-Encoding = "gzip";
              } elsif (req.http.Accept-Encoding ~ "deflate") {
                  set req.http.Accept-Encoding = "deflate";
              } else {
                  remove req.http.Accept-Encoding;
              }
              if (req.url ~ ".(css|js)$" && req.http.User-Agent ~ "MSIE 6") {
                  remove req.http.Accept-Encoding;
              }
          }
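The effect of this normalization is easiest to see as a pure function. The following Python sketch mirrors the decision logic of the VCL on this slide; it is not Varnish code, the function name `normalize_accept_encoding` is hypothetical, and the URL pattern escapes the dot (`\.`), which is an assumption about what the slide intended.

```python
import re
from typing import Optional


def normalize_accept_encoding(
    accept_encoding: Optional[str], url: str, user_agent: str = ""
) -> Optional[str]:
    """Collapse any Accept-Encoding header to "gzip", "deflate" or None."""
    if not accept_encoding:
        return None
    # Old IE mishandles compressed CSS/JS: drop the header entirely.
    if re.search(r"\.(css|js)$", url) and "MSIE 6" in user_agent:
        return None
    if "gzip" in accept_encoding:
        return "gzip"
    if "deflate" in accept_encoding:
        return "deflate"
    return None


variants = {
    normalize_accept_encoding(header, "/index.html")
    for header in (
        "gzip, deflate",
        "gzip,deflate,sdch",
        "deflate;q=1.0, identity",
        "identity",
        None,
    )
}
```

However many spellings browsers send, `variants` can only ever contain "gzip", "deflate" and None, so a response that varies on Accept-Encoding is stored at most three times.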
  56. 56. Best practices ● RFC 2616! ● not really for reverse proxies... ● Varnish is both a client and a server ● TTLs
  57. 57. Best practices ● object TTL control – headers from backend ● considered in the following order: ● Cache-Control: s-maxage=<relative time> ● Cache-Control: max-age=<relative time> ● Varnish ignores all other Cache-Control headers (unless told otherwise in VCL) ● Expires: absolute time, requires synced clocks ● Expires is an HTTP/1.0 header ● Varnish will try to compensate for clock skew
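The header precedence listed on this slide can be written down as a short function. This is a Python sketch under stated assumptions, not Varnish's implementation: `ttl_seconds` and `_directive` are hypothetical names, the 120-second fallback stands in for the default_ttl parameter, and clock-skew compensation is reduced to measuring Expires against the backend's own Date header.

```python
import re
from email.utils import parsedate_to_datetime
from typing import Optional


def _directive(cache_control: str, name: str) -> Optional[int]:
    """Return the integer value of a Cache-Control directive, if present."""
    pattern = rf"(?:^|[,\s]){re.escape(name)}\s*=\s*(\d+)"
    match = re.search(pattern, cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def ttl_seconds(headers: dict, default_ttl: int = 120) -> int:
    """Pick a TTL: s-maxage, then max-age, then Expires, then the default."""
    cache_control = headers.get("Cache-Control", "")
    for name in ("s-maxage", "max-age"):
        value = _directive(cache_control, name)
        if value is not None:
            return value
    expires, date = headers.get("Expires"), headers.get("Date")
    if expires and date:
        # Expires is absolute, so compare it with the backend's own clock.
        delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        return max(0, int(delta.total_seconds()))
    return default_ttl


shared = ttl_seconds({"Cache-Control": "public, max-age=60, s-maxage=300"})
absolute = ttl_seconds({
    "Date": "Sun, 28 Feb 2010 21:48:14 GMT",
    "Expires": "Sun, 28 Feb 2010 22:18:14 GMT",
})
```

Here `shared` is 300 because s-maxage wins over max-age, and `absolute` is 1800 because only the Expires/Date pair is available.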
  58. 58. Best practices ● object TTL control – VCL ● set obj.ttl = x; - takes precedence over headers ● default_ttl configuration parameter ● Varnish sets the Age: header ● if in doubt, check varnishlog ● TTL tag
  59. 59. Best practices ● TTL tag in varnishlog
          509 TTL - 1850178309 RFC 1798 1267393695 1267393694 1267395494 0 0
      ● fields: XID (1850178309), then after "RFC": TTL, time, Date, Expires, max-age, age
          242 TTL c 1416303904 VCL 86400 1267393696
      ● fields: XID (1416303904), then after "VCL": TTL, time
  60. 60. Best practices ● caching policy ● Last-Modified / If-Modified-Since ● ETag / If-None-Match ● Vary
  61. 61. Best practices ● compression ● Varnish leaves compression up to the backends ● gzip, deflate, none – data set * 3 ● Vary: Accept-Encoding ● normalize Accept-Encoding from browsers
  62. 62. Best practices ● sanitize request headers ● we've had requests coming in to "http://our.com/http://another.com/.*"
          if (req.url ~ "^/?http://") {
              set req.url = regsub(req.url, "^/?http://.*", "");
          }
      ● cache hit ratio went from 92% to 94% ● normalize vhosts
          if (req.http.host ~ "^(www.)?example.com") {
              set req.http.host = "example.com";
          }
      ● hit ratio and backend requests: 1% is half of 2%!
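The closing remark, "1% is half of 2%", is about the miss ratio rather than the hit ratio. A minimal Python sketch of the arithmetic follows; the function name `backend_reduction` is hypothetical, and the model assumes that only cache misses reach the backend (passed requests are ignored).

```python
def backend_reduction(old_hit_pct: float, new_hit_pct: float) -> float:
    """Fraction of backend requests saved when the hit ratio improves."""
    old_miss = 100 - old_hit_pct
    new_miss = 100 - new_hit_pct
    return (old_miss - new_miss) / old_miss


# The slide's own numbers: misses fall from 8% to 6% of all requests.
slide_case = backend_reduction(92, 94)
# The "1% is half of 2%" case: misses fall from 2% to 1%.
halving_case = backend_reduction(98, 99)
```

Under that assumption `slide_case` is 0.25 and `halving_case` is 0.5: a two-point gain in hit ratio removed a quarter of the backend traffic, and the closer the hit ratio is to 100%, the more each extra point matters.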
  63. 63. Best practices ● set Content-Length on the backends ● static files: multiple backends = multiple VFS caches ● serving large objects a.k.a. My Own YouTube ● objects are fully fetched before delivery ● use pipe ● ranges not supported ● not really suitable for serving video content
  64. 64. Best practices ● forced TTLs ● on heavily loaded sites – force TTLs to a few seconds on all pages (but pass secure content) ● purging ● entries on the ban list accumulate – consider forcing expiry by PURGE requests ● when forcing expiry, purge all Vary versions! ● include purging in your application design
  65. 65. Best practices ● debugging ● add an X-Served-By header ● add other headers along the way ● beware of header traffic! ● X-Varnish header
  66. 66. Best practices ● HTTPS – use pound or perlbal ● in vcl_pipe:
          set bereq.http.Connection = "close";
      ● drain connections quickly before restarting varnishd
          sub vcl_recv {
              if (req.http.Connection != "close") {
                  set req.http.Connection = "close";
                  restart;
              }
          }
  67. 67. Best practices ● graph everything, ask questions later ● YMMV: what is good for the big guys from TOP100 may not be as good for you (e.g. stevedore choice) ● test ● wget --save-headers ● curl -i ● LWP: GET -USsed ● caveat: lwp-request does: “GET http://foo/bar”
  68. 68. Best practices ● if everything else fails...
          $ gdb /usr/sbin/varnishd core
          GNU gdb 6.8-debian
          This GDB was configured as "x86_64-linux-gnu"...
          (gdb) bt
          (…)
          (gdb) frame 3
          #3 0x000000000042ef64 in mgt_cli_vlu (priv=0x7fb112813c00, p=0x7fb1128d3000 "debug.health") at mgt_cli.c:270
          270 xxxassert(i == strlen(p));
      ● don't strip Varnish binaries ● compile with --enable-debugging-symbols --enable-diagnostics
  69. 69. Configuration ● object hash table ● Varnish 2.0: -h classic,N ● N hash buckets – objects / 10 ● a prime number ● Varnish trunk: -h critbit ● Patricia Tree
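The sizing rule for the classic hash (one bucket per ten objects, a prime number) is simple enough to compute. The Python sketch below is only an illustration: `classic_buckets` and `is_prime` are hypothetical helpers, and rounding up to the next prime is an assumed reading of "a prime number", not something the slide specifies.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for bucket-count sizes."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def classic_buckets(expected_objects: int) -> int:
    """Smallest prime that is at least expected_objects / 10."""
    n = max(2, expected_objects // 10)
    while not is_prime(n):
        n += 1
    return n


# For a cache expected to hold about 1,000 objects:
option = f"-h classic,{classic_buckets(1000)}"
```

With 1,000 expected objects the starting point is 100 and the next prime is 101, so `option` becomes `-h classic,101`.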
  70. 70. Configuration ● run-time parameters (can be set from CLI) ● obj_workspace=Nbytes (dynamic in trunk) – headers, per object overhead ● sess_workspace=Nbytes – entire header and all edits done in VCL, per thread ● shm_workspace=Nbytes – for the log, per thread ● shm_reclen=Nbytes – max SHM log record length ● session_linger=Nms – time before a worker thread is returned to its pool ● sess_timeout=Ns – persistent session timeout
  71. 71. Configuration ● thread_pools=N – set to the number of CPU cores ● thread_pool_add_delay=Nms – default may be too high ● thread_pool_max and _min – a bit confusing ● max – the limit for all thread pools ● min – the limit for one thread pool ● do not set too high
  72. 72. OS environment ● forget about 32-bit ● malloc() better than mmap() for in-memory cache sets ● also better for larger-than-memory cache sets on Linux (YMMV) ● Varnish on virtualized guests? ● slight latency difference ● can be an issue for on-line auction sites
  73. 73. OS environment ● Virtualization ● Varnish on a standalone system
  74. 74. OS environment ● Virtualization ● Varnish on a Xen domU with pinned vcpus
  75. 75. OS environment ● I/O related tuning on Linux ● set vm.swappiness to 0 ● /var/lib/varnish/$HOSTNAME/_.vsl – the SHM log ● put the SHM log on tmpfs ● anticipatory elevator best on HDDs, noop on SSDs ● use ext2 ● noatime ● swap striping ● iSCSI is great for logs
  76. 76. OS environment ● network tuning ● run NTP ● check if your load balancer uses keep-alive ● /proc/sys/net – don't tune if you don't know what you're doing ● don't use net.ipv4.tcp_tw_reuse ● tcp_tw_recycle is even worse ● normally the socket waits 2 * MSL ● reusing causes problems with NAT routers
  77. 77. New features in trunk ● upcoming release: Varnish 2.1 ● persistent storage (without LRU support) ● URL hashing director ● client hashing director ● critbit by default ● saint mode ● obj_workspace allocated dynamically ● req.* in vcl_deliver ● obj.* in vcl_fetch is now beresp.*
  78. 78. Shopping list ● http://varnish-cache.org/wiki/PostTwoShoppingList ● ESI enhancements (304s, gzip, etc.) ● compression support ● streaming in pass / fetch ● Content-Range support ● file upload buffering ● VCL cookie handling (req.cookie.foo) ● custom formats in varnishlog and varnishncsa
  79. 79. Shopping list ● expiry randomization ● “lemming effect”
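Expiry randomization is a wish-list item here, so the following is only a Python sketch of the general idea and not a Varnish feature or API. The function `jittered_ttl` and its `spread` parameter are hypothetical: each object's TTL is nudged by a random factor so that objects cached at the same moment do not all expire, and hit the backend, at the same moment.

```python
import random
from typing import Optional


def jittered_ttl(
    base_ttl: float, spread: float = 0.1, rng: Optional[random.Random] = None
) -> float:
    """Return base_ttl scaled by a random factor in [1 - spread, 1 + spread]."""
    if not 0.0 <= spread < 1.0:
        raise ValueError("spread must be in [0, 1)")
    source = rng if rng is not None else random
    return base_ttl * (1.0 + source.uniform(-spread, spread))


# 1,000 objects fetched together with a nominal 60 s TTL.
rng = random.Random(42)  # seeded only to make the sketch repeatable
ttls = [jittered_ttl(60.0, spread=0.1, rng=rng) for _ in range(1000)]
```

With a 10% spread the expiries are smeared over roughly 54 to 66 seconds instead of landing in the same instant, which is the "lemming effect" the slide refers to.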
  80. 80. Support & development ● commercial support offered by Redpill-Linpro ● community support #varnish on irc.linpro.no ● VML – Varnish Moral License ● http://phk.freebsd.dk/VML/ ● not a support contract! ● help pay for Varnish development
  81. 81. Thank you Questions?
  82. 82. Sources ● http://varnish-cache.org/ ● TMECC VCL configs ● Wikia VCL configs ● http://kristian.blog.linpro.no/ ● http://ingvar.blog.linpro.no/ ● #varnish
