VUG5: Varnish at Opera Software

How we use Varnish at Opera Software, from the beginning (2009) to now.

Presentation given at the 5th Varnish Users Group meeting (VUG5), held in Paris on March 22nd, 2012.

  1. 1. Varnish @ Opera
        Varnish Users Group Meeting, Paris, 22nd March 2012
        Cosimo Streppone <cosimo@opera.com>
  2. 2. 1st Varnish deployment: My Opera
        • October 2009
        • 1 old recycled machine, 2 GB of disk allocated
        • Started serving static pictures (1M+ req/day)
        • Then more...
        • Even more...
        • ...
        • ~15% of all My Opera requests were «varnished»
        • Around 8M req/day
  3. 3. My Opera – The start
        • Still using Debian Etch
          First Varnish instance was running v1.x from Etch: several years old, not good
        • Experienced VIPs – “Very Interesting Problems”
          – User X getting User Y's session
          – Random users getting admin powers. Nightmare!
        • Theory: Varnish was caching responses that contained
          Set-Cookie: opera_session=<session_id>
  4. 4. My Opera – The start
        if (req.url ~ "^/community/users/avatar.pl/[0-9]+$" ||
            req.url ~ "^/.+/avatar.pl$" ||
            req.url ~ "^/.+/picture.pl\?xscale=100$" ||
            req.url ~ "^/desktopteam/xml/atom/blog/?$" ||
            req.url ~ "^/desktopteam/xml/rss/blog/?$" ||
            req.url ~ "^/community/api/users/friends.pl\?user=.+$" ||
            req.url ~ "^/community/api/users/groups.pl\?user=.+$") {
            unset req.http.Cookie;
            unset req.http.Authorization;
            lookup;
        }
  5. 5. My Opera – Pass logged in users
            ...
            # Check for cookie only after always-cache URLs
            if (req.http.Cookie ~ "(opera_session|opera_persistent_)") {
                pass;
            }

            # DANGER, Will Robinson! Caching the front-page
            # At this point, lots of Google Analytics cookies will go in.
            # No problem. It's stuff used by Javascript
            if (req.url ~ "^/community/$") {
                lookup;
            }

            pass;
        }
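The order of the checks on slides 4 and 5 matters: always-cache URLs are handled before the session-cookie test, and the front page is only cached for requests that carry no session cookie. A minimal Python sketch of that decision order follows. The function name `recv_decision` is hypothetical, only a subset of the slide's URL patterns is reproduced, and the real logic lives in VCL, not Python.

```python
import re

# A subset of the always-cache URL patterns shown on slide 4.
ALWAYS_CACHE = [re.compile(p) for p in (
    r"^/community/users/avatar.pl/[0-9]+$",
    r"^/.+/avatar.pl$",
    r"^/desktopteam/xml/atom/blog/?$",
    r"^/desktopteam/xml/rss/blog/?$",
)]
SESSION_COOKIE = re.compile(r"(opera_session|opera_persistent_)")
FRONT_PAGE = re.compile(r"^/community/$")


def recv_decision(url, cookie=None):
    """Return "lookup" (try the cache) or "pass" (go to the backend)."""
    # 1. Always-cache URLs: cookies are dropped, so login state is irrelevant.
    if any(p.search(url) for p in ALWAYS_CACHE):
        return "lookup"
    # 2. Requests carrying a session cookie are never served from cache.
    if cookie and SESSION_COOKIE.search(cookie):
        return "pass"
    # 3. The front page is cached for everyone else (analytics cookies are fine).
    if FRONT_PAGE.search(url):
        return "lookup"
    # 4. Anything not explicitly whitelisted goes to the backend.
    return "pass"
```

With this ordering, a logged-in user still gets cached avatars but always gets a fresh front page.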
  6. 6. My Opera: testing Varnish setup
        ...
        ok 289 - Got response from backend for /community/ (from ...)
        ok 290 - Correct status line
        # Adding header [Cookie] => [language=it]
        # ----------
        # GET http://cache01.my.opera.com:6081/community/
        # Host: my.opera.com
        # ------------
        ok 291 - 2nd request: got response from backend for /community/ (from...)
        ok 292 - Correct status line
        # X-Varnish: 1211283813 1211283812
        # X-Varnish-Status: hit
        # X-Varnish-Cacheable: yes, language cookie
        # X-Varnish-URL: /community/
        ok 293 - URL /community/ was handled correctly by varnish
        # cookie_header:
        ok 294 - URL /community/ has correct cookies (or no cookies)
        1..294
        All tests successful.
  7. 7. My Opera – Next steps
  8. 8. My Opera – Next steps
        ● Front page caching
        ● Static assets and UGC
        ● On-the-fly thumbnails
        ● “Shields-up” configuration
  9. 9. Front page caching
        Problem
          • Very dynamic, i18n
          • Accept-Language header variation
          • Vary: Accept-Language sub-optimal
        Solution
          • varnish-accept-language “extension”
  10. 10. Front page caching - Accept-Language
          SUPPORTED_LANGUAGES = ":de:es:it:ru:"
          DEFAULT_LANGUAGE = "en"

          Client sends                         Backend receives
          Accept-Language: ru, uk;q=0.9        Accept-Language: ru
          Accept-Language: es-ES, es;q=0.8     Accept-Language: es
          Accept-Language: fr, it;q=0.7        Accept-Language: it
          Accept-Language: fr                  Accept-Language: en
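The mapping on this slide can be reproduced with a small Python sketch. It is not the varnish-accept-language code itself (that is embedded C); the function name `normalize_accept_language` and the fallback from a regional tag such as `es-ES` to its primary subtag `es` are assumptions made to match the rows shown above.

```python
SUPPORTED_LANGUAGES = {"de", "es", "it", "ru"}
DEFAULT_LANGUAGE = "en"


def normalize_accept_language(header):
    """Reduce a client Accept-Language header to one supported language."""
    acceptable = SUPPORTED_LANGUAGES | {DEFAULT_LANGUAGE}
    candidates = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort key: highest quality first, then original header order.
            candidates.append((-quality, position, tag))
    for _neg_quality, _position, tag in sorted(candidates):
        if tag in acceptable:
            return tag
        # Assumption: fall back to the primary subtag ("es-ES" -> "es").
        primary = tag.split("-")[0]
        if primary in acceptable:
            return primary
    return DEFAULT_LANGUAGE
```

Because the backend then sees only a handful of distinct values, one cached copy of the front page per supported language is enough, instead of one per distinct client header.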
  11. 11. Front page caching
  12. 12. Static assets and UGC servers
          Problem
            • One central location
            • SPOF
            • High latency US -> NO
          Solution
            • Decentralized varnish servers in multiple DC
            • Talking to 1 backend
            • Very long TTL
            • Health probes
            • Cache invalidation API
            • Built our GeoDNS
  13. 13. Thumbnail generation and caching
          Problem
            • Change of Design™ made our millions of pre-generated thumbnails useless
          Solution
            • Switch to on-the-fly generation model
            • Used mod_dims (AOL)
            • Varnish on :80
            • 2 backends
          300k objects
          95% hit rate avg
          800 req/s/backend peak
  14. 14. Thumbnail generation and caching
          How it works
            http://localhost/dims/
                crop/472x360/
                contrast/+1/
                quality/90/
                /actual/picture/url.jpg (remote too!)
          Using rewrite rules
            http://localhost/tn/small/
                /actual/picture/url.jpg
  15. 15. Thumbnail generation and caching
          ● Recognize mobile/non-mobile
          ● Scale thumbnails on the fly
          ● Reduce JPEG quality
          Ex.: /thumb/small/quality/80/some/path/pic.jpg
  16. 16. Shields-up configuration
          Problem
            • Original setup too specific to My Opera
            • Long tail of non-popular content “unprotected”
            • Can we find some more generic setup?
          Solution
            • DDoS
            • Varnish in front, rather than after frontends
            • Cache most logged out requests with lower TTL
            • Compromise solution, but generic enough
  17. 17. Sitecheck
  18. 18. Sitecheck – Malware, fraud protection
          • Used by the Opera browser
          • Must work! Failure not an option
          • ~8k req/s/backend peak
          • 2 varnish boxes, 16k req/s, 20k peak
          • 85% hit rate
          • TTL 10
  19. 19. Opera TV Store
  20. 20. Country-level ban
          • Contract mandates that TV Store shouldn't be available in specific countries
          • Country check in the backend means no caching is possible
          • Implemented with varnish-geoip
  21. 21. Country-level ban
          sub country_ban_list_check {
              # Allow testing of country ban
              if (req.http.Cookie ~ "x_geo_ip_forced\s*=\s*country:..") {
                  set req.http.X-Geo-IP = regsuball(
                      req.http.Cookie,
                      "^.*x_geo_ip_forced\s*=\s*(country:..).*$", "\1"
                  );
                  log "Forced X-Geo-IP to " req.http.X-Geo-IP "";
              }
              # Block access to tvstore in these countries
              if (req.http.X-Geo-IP &&
                  req.http.X-Geo-IP ~ "^country:(C1|C2|C3|...)$") {
                  log "Country ban";
                  error 750 "tvstore is not available in your country";
              }
          }

          sub vcl_recv {
              C{
                  vcl_geoip_country_set_header_xff(sp);
              }C
              call country_ban_list_check;
          }
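The control flow of `country_ban_list_check` can be sketched in Python as below. The banned codes `C1`, `C2`, `C3` are the slide's own placeholders (the real list is elided), and `effective_country` and `handle_request` are hypothetical names used only for illustration.

```python
import re

# Placeholder codes taken from the slide; the real country list is elided there.
BANNED_COUNTRIES = {"C1", "C2", "C3"}
FORCED_COOKIE = re.compile(r"x_geo_ip_forced\s*=\s*country:(..)")


def effective_country(geoip_country, cookie=None):
    """Country used for the ban check; a test cookie overrides the GeoIP result."""
    if cookie:
        match = FORCED_COOKIE.search(cookie)
        if match:
            return match.group(1)
    return geoip_country


def handle_request(geoip_country, cookie=None):
    """Return a (status, message) pair, mimicking `error 750` for banned countries."""
    country = effective_country(geoip_country, cookie)
    if country in BANNED_COUNTRIES:
        return 750, "tvstore is not available in your country"
    return 200, "ok"
```

Doing this check at the cache layer is what keeps the pages cacheable: the backend never has to vary its response by visitor country.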
  22. 22. Brand + device TV detection
          • Analyze User-Agent header
          • Regex the hell out of it
          • Send X-Brand, X-Device header to backend
          • Fallback Device detection in the backend (for development, test setups, ...)
  23. 23. VCL library
  24. 24. accept-encoding.vcl
          # STD: Deal with different Accept-Encoding formats
          sub accept_encoding_normalize {
              if (req.http.Accept-Encoding) {
                  if (req.http.Accept-Encoding ~ "gzip") {
                      set req.http.Accept-Encoding = "gzip";
                  }
                  elsif (req.http.Accept-Encoding ~ "deflate") {
                      set req.http.Accept-Encoding = "deflate";
                  }
                  else {
                      unset req.http.Accept-Encoding;
                  }
              }
          }
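To illustrate what `accept_encoding_normalize` buys, here is the same rule as a Python sketch (the function name is hypothetical). However many different `Accept-Encoding` strings browsers send, only three outcomes remain: `gzip`, `deflate`, or no header at all.

```python
def normalize_accept_encoding(value):
    """Collapse an Accept-Encoding header to "gzip", "deflate", or None (header removed)."""
    if not value:
        return None
    # Same precedence as the VCL: gzip wins over deflate.
    if "gzip" in value:
        return "gzip"
    if "deflate" in value:
        return "deflate"
    return None


# Several real-world looking header values collapse to very few variants.
seen = {
    normalize_accept_encoding(v)
    for v in ("gzip, deflate", "gzip,deflate,sdch", "deflate, gzip;q=0.5", "identity")
}
```

When the backend answers with `Vary: Accept-Encoding`, this keeps the number of cached copies per URL small.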
  25. 25. accept-language.vcl
          C{
          /*
           * Accept-language header normalization
           *
           * - Parses client Accept-Language HTTP header
           * - Tries to find the best match with the supported languages
           * - Writes the best match as req.http.X-Varnish-Accept-Language
           *
           * http://github.com/cosimo/varnish-accept-language
           */
          #include <ctype.h>  /* isupper */
          #include <stdio.h>
          #include <stdlib.h> /* qsort */
          #include <string.h>

          #define DEFAULT_LANGUAGE "en"
          #define SUPPORTED_LANGUAGES ":de:en:es-la:fr:fy:hu:ja:no:pl:pt-br:ru:sk:sq:sr:tr:uk:vn:xx-lol:zh-tw:"
          …
  26. 26. maintenance.vcl + {up,down}.sh
          include "/etc/varnish/accept-encoding.vcl";

          backend oopsy {
              .host = "10.20.21.22";
              .port = "80";
          }

          sub vcl_recv {
              set req.backend = oopsy;

              # Serve page from within Varnish. See vcl_error()
              if (req.url == "/ping.html") {
                  error 700;
              }

              call accept_encoding_normalize;

              # Collapse URLs, so that we have just one cached object
              set req.url = "/maintenance-down";
              remove req.http.Cookie;
              remove req.http.Authorization;
              return (lookup);
          }
  27. 27. purge.vcl
          acl purge { … }

          sub vcl_recv {
              if (req.request == "PURGE") {
                  if (! (client.ip ~ purge)) {
                      error 405 "Not allowed.";
                  }
                  purge("req.url == " req.url);
                  error 200 "Purged.";
              }
              else if (req.request == "PURGE_SUFFIX") {
                  set req.http.X-URL = regsuball(req.url, "\[|\]|[\^.$|()*+?{}]", "\\\0") "$";
                  purge_url(req.http.X-URL);
                  unset req.http.X-URL;
                  error 200 "Purged suffix.";
              }
              else if (req.request == "PURGE_PREFIX") { … }
          }
          Ugly!
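The idea behind `PURGE_SUFFIX` is to escape every regex metacharacter in the requested URL and anchor the result with `$`, so that the URL is matched literally as a suffix. A Python sketch of that technique follows, using `re.escape` in place of the hand-written `regsuball` and a plain dict as a stand-in for the cache. Both function names are hypothetical.

```python
import re


def suffix_purge_pattern(url_suffix):
    """Build a regex that matches any URL ending in the literal suffix."""
    # re.escape neutralizes ".", "?", "(", etc., like the regsuball in the VCL.
    return re.compile(re.escape(url_suffix) + "$")


def purge_suffix(cache, url_suffix):
    """Remove every cached URL ending in url_suffix; return the purged URLs, sorted."""
    pattern = suffix_purge_pattern(url_suffix)
    purged = [url for url in cache if pattern.search(url)]
    for url in purged:
        del cache[url]
    return sorted(purged)


cache = {"/a/pic.jpg": 1, "/b/pic.jpg": 2, "/a/picXjpg": 3, "/a/pic.jpg.bak": 4}
purged = purge_suffix(cache, "/pic.jpg")
```

Without the escaping step, the `.` in `/pic.jpg` would also match `/a/picXjpg`, which is exactly the kind of accidental purge the slide's regex is guarding against.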
  28. 28. X-forwarded-for.vcl
          # See http://www.varnish-cache.org/trac/ticket/540
          sub inject_forwarded_for {
              # Rename the incoming XFF header to work around a Varnish bug
              if (req.http.X-Forwarded-For) {
                  # Append the client IP
                  set req.http.X-Real-Forwarded-For =
                      req.http.X-Forwarded-For ", " regsub(client.ip, ":.*", "");
              }
              else {
                  # Simply use the client IP
                  set req.http.X-Real-Forwarded-For = regsub(client.ip, ":.*", "");
              }
          }
          Wat!?
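What `inject_forwarded_for` computes can be shown in a few lines of Python. This is a sketch with a hypothetical function name. Like the `regsub(client.ip, ":.*", "")` call in the VCL, it cuts the address at the first colon, so it assumes IPv4 client addresses.

```python
def real_forwarded_for(client_addr, incoming_xff=None):
    """Value for X-Real-Forwarded-For: the incoming chain plus the client IP."""
    # Drop everything from the first ":" on (the port), as the VCL regsub does.
    client_ip = client_addr.split(":", 1)[0]
    if incoming_xff:
        # Append the client IP to the chain received from upstream proxies.
        return incoming_xff + ", " + client_ip
    # No upstream proxies: the chain is just the client IP.
    return client_ip
```

For example, a request arriving from `10.0.0.1:51234` with `X-Forwarded-For: 1.2.3.4` yields `1.2.3.4, 10.0.0.1`.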
  29. 29. Testing VCLs – http-cuke
  30. 30. http-cuke – csrf.test
          Feature: TVStore uses cookies to protect against CSRF attacks
            In order to protect the users from CSRF attacks
            As a TV Store developer
            I want to verify that some pages send out a CSRF cookie token
            to the browser or device

          Scenario: Accessing the Backgammon application URL
            Given a "Opera/9.80 (Linux … Opera TV Store)" user agent
            When I go to "https://tvstore.server/store/app/backgammon"
            Then the final HTTP status code should be "200"
            Then the page should contain "A board game for one player"
            Then the page should not be cached by varnish
            Then the server should send a CSRF token
  31. 31. http-cuke – prove-like output
          $ http-cuke --test ./csrf.test
          $ http-cuke --test-dir ./some-dir
  32. 32. http-cuke – a sample test run
          # ============================================================
          # FEATURE: TV Store uses cookies to protect against CSRF attacks
          # ============================================================
          # ------------------------------------------------------------
          # SCENARIO: Accessing the Backgammon application URL
          # ------------------------------------------------------------
          ok 1 - Given a "Opera/9.80 (Linux...)" user agent
          ok 2 - When I go to "https://tvstore.server/app/backgammon"
          ok 3 - Status code is 200 (expected 200)
          ok 4 - Then the final HTTP status code should be "200"
          ok 5 - String A board game for one player was found in the page
          ok 6 - Then the page should contain "A board game for one player"
          ok 7 - X-Varnish header contains only current XID (523289525)
          ok 8 - Age of cached resource is zero
          ok 9 - Then the page should not be cached by varnish
          ok 10 - CSRF token was found (49a0da1b2758bf62a028072e4f7f36dc)
          ok 11 - Then the server should send a CSRF token
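Steps 7 to 9 of this run rely on two response headers: `X-Varnish` carries a single transaction ID on a miss and two IDs on a hit (as in the earlier My Opera test output), and `Age` is zero for a freshly fetched object. A minimal Python sketch of that check follows. It is not http-cuke's own code, and `served_from_cache` is a hypothetical name.

```python
def served_from_cache(headers):
    """Guess from response headers whether Varnish answered from its cache."""
    # A hit carries two IDs: the current request and the one that filled the cache.
    xids = headers.get("X-Varnish", "").split()
    # A cached object has been sitting in the cache for Age seconds.
    age = int(headers.get("Age", "0"))
    return len(xids) > 1 or age > 0
```

A "should not be cached by varnish" step then simply asserts that this function returns `False` for the response.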
  33. 33. Puppet module
  34. 34. varnish/manifests/init.pp
          class varnish {
              package { "varnish": ensure => "installed" }
              file { "/etc/init.d/varnish": … }
              file { "/etc/sysctl.conf": … }
              exec { "update-sysctl": … }
              file { "/usr/share/varnish/purge-cache": … }
              service { "varnish": ensure => "running", … }
              munin::plugin::custom { "varnish_": }
              munin::plugin { [
                  "varnish_backend_traffic",
                  "varnish_expunge",
                  …
              }
          }
  35. 35. Custom init script
          # Lower stack limit demand for every Varnish thread
          # http://projects.linpro.no/pipermail/varnish-misc/2009-August/002977.html
          # Still relevant for Varnish 3 ??
          ulimit -s 256

          # Startup with custom cc_command fails
          # Filed Debian bug #659005
          if bash -c "start-stop-daemon --start --quiet --pidfile ${PIDFILE} --exec ${DAEMON} -- -P ${PIDFILE} ${DAEMON_OPTS} > ${output} 2>&1"; then
              log_end_msg 0
          else
              …
  36. 36. Custom sysctl settings
          # From http://varnish.projects.linpro.no/wiki/Performance
          # + our own tweaking and tuning
          net.ipv4.ip_local_port_range = 1024 65536
          net.core.rmem_max = 16777216
          net.core.wmem_max = 16777216
          net.ipv4.tcp_rmem = 4096 87380 16777216
          net.ipv4.tcp_wmem = 4096 65536 16777216
          net.ipv4.tcp_fin_timeout = 30
          net.core.netdev_max_backlog = 30000
          net.ipv4.tcp_no_metrics_save = 1
          net.core.somaxconn = 262144
          net.ipv4.tcp_syncookies = 1
          net.ipv4.tcp_max_orphans = 262144
          net.ipv4.tcp_max_syn_backlog = 262144
          net.ipv4.tcp_synack_retries = 2
          net.ipv4.tcp_syn_retries = 2
  37. 37. Purge cache script
          Modeled after Debian vcl-reload script
          $ purge-cache -a
          $ purge-cache -u http://some.url
          $ purge-cache -r '^/(home|user)/'
  38. 38. varnish/manifests/init.pp – 2
          define varnish::config (
              $vcl_conf="default.vcl",
              $listen_address="",
              $listen_port=6081,
              $thread_min=400,
              $thread_max=5000,
              $thread_timeout=30,
              $storage_type="malloc",
              $storage_size="12G",
              $ttl=60,
              $thread_pools=$processorcount,
              $sess_workspace=131072,
              $cc_command="",
              $sess_timeout=3
          ) {
              file { "/etc/default/varnish":
                  ensure  => "present",
                  owner   => "root",
                  group   => "root",
                  mode    => 644,
                  content => template("varnish/debian-defaults.erb"),
                  require => Package["varnish"],
                  notify  => Service["varnish"],
              }
          }
  39. 39. Example of varnish::config
          varnish::config { "cache-varnish-config":
              vcl_conf       => "cache.vcl",
              storage_type   => "malloc",
              storage_size   => "20G",
              listen_port    => 80,
              sess_workspace => 131072,
              ttl            => 86400,
              thread_pools   => 8,
              thread_min     => 800,
              thread_max     => 10000,
              # Necessary for GeoIP
              cc_command     => "exec cc -fpic -shared -Wl,-x -L/usr/include/GeoIP.h -lGeoIP -o %o %s",
          }
  40. 40. varnish/manifests/init.pp – 3
          define varnish::vcl ($source) {
              file { "/etc/varnish/${name}.vcl":
                  ensure  => "present",
                  owner   => "root",
                  group   => "root",
                  mode    => 644,
                  source  => $source,
                  require => Package["varnish"],
                  notify  => Service["varnish"],
              }
          }
  41. 41. Migration to Varnish 3
  42. 42. Following Debian stable
          Not there yet. Still anchored to 2.1
          Migration 2.0 → 2.1 was relatively painless
  43. 43. Embedded C code?
          Migrate accept-language and geoip extensions to VMODs
  44. 44. 2.1 → 3.0 syntax changes?
          Test our VCLs
  45. 45. Customizations
          Are they still relevant for 3.0? (ulimit -s 256, etc...)
  46. 46. I'd like to see in varnish...
  47. 47. Easier VMODs
          Ideally, as easy as embedded C!
  48. 48. varnishtop -t10s
          Collect traffic for 10s and then report
          Bonus feature: tags, AKA group-by
          varnishtop -i RxURL -g RxHeader.Referer -t 60s
  49. 49. Use headers as vars less than ideal
          Introduce variables or registers to avoid
          set req.http.X-Var = regsuball( req.http.Some-Header, …, 1 );
  50. 50. Better Cookies inspection
          Avoid regsuball() mess on req.http.Cookie
          set req.http.Cookie.SomeName = "xxx";
          set req.http.X-Var1 = req.http.Cookie.sessionid;
          and...
          set var.SessionID = req.http.Cookie.sessionid;
  51. 51. file storage?
          malloc works fine for us, but we always had problems with file storage
  52. 52. Better SSL handling
          Not really. nginx works fine.
  53. 53. Questions!
  54. 54. opera.com/jobs
