Saving The World From Guaranteed APOCALYPSE* Using Varnish and Memcached

From guaranteed APOCALYPSE*
using varnish, memcached, and some other stuff

From PHP Bulgaria User Group Meeting: 23.11.2013

Transcript

  • 1. SAVING THE WORLD From guaranteed APOCALYPSE* using varnish, memcached, and some other stuff * apocalypse not really guaranteed
  • 2. WHAT IS CACHING?
  • 3. WHY IS NOT CACHING BAD?
      • You keep executing the same code with the same data
      • You waste computing power getting the same result
      • That power is probably generated by burning coal*
      • Burning stuff produces tons of CO2**
      * it most likely is not
      ** probably a smaller unit of mass
  • 4. Too much CO2 will make THE EARTH EXPLODE* * based on pure speculation
  • 5. WHY SHOULD YOU CARE?
      • Your web apps will become WAY faster
      • Users and search engines will like you MORE
      • You will use A LOT less hardware resources and/or save $$$
      • You will generate LESS CO2
      • The Earth will NOT explode and/or you’ll have more $$$
      • Women like people who save the world and/or have $$$
      • And lots of other stuff*
      * 0 or greater amount of other stuff
  • 6. ABOUT TTL
  • 7. WHY YOU SHOULD AVOID USING TTL
      • You might serve obsolete data
      • Your server might get a cache stampede and go down
      • You should PUSH fresh data into your cache as soon as you have it, BEFORE the old data has expired from the cache
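
One way to follow the last point on slide 7 is to write to the cache in the same code path that changes the data, so readers never depend on an expiry. This is a minimal sketch, not the presenter's code: `WriteThroughCache`, `saveArticle` and the `article:<id>` key format are hypothetical names, and the class is an in-memory stand-in that only mimics the `set`/`get` shape of a Memcached client so the example runs without a server.

```php
<?php
// Hypothetical in-memory stand-in exposing the two Memcached-style
// methods used below, so the sketch runs without a Memcached server.
final class WriteThroughCache
{
    private array $items = [];

    public function set(string $key, $value, int $expiration = 0): bool
    {
        $this->items[$key] = $value; // the expiration is ignored by this stand-in
        return true;
    }

    public function get(string $key)
    {
        return $this->items[$key] ?? false; // false signals a miss
    }
}

// Persist the new data, then push the final rendered result straight into
// the cache, replacing the old entry instead of waiting for it to expire.
function saveArticle(object $cache, array &$database, int $id, string $title): void
{
    $database[$id] = $title;                                  // 1. update the source of truth
    $rendered = '<h1>' . htmlspecialchars($title) . '</h1>';  // 2. compute the final result
    $cache->set("article:$id", $rendered);                    // 3. overwrite the cached copy
}

$cache = new WriteThroughCache();
$database = [];
saveArticle($cache, $database, 7, 'Old title');
saveArticle($cache, $database, 7, 'New title');
echo $cache->get('article:7'), PHP_EOL; // prints <h1>New title</h1>
```

With a real client, the same `saveArticle` body would be called with a `Memcached` instance in place of the stand-in.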
  • 8. WAIT, WHAT IS A CACHE STAMPEDE? [chart: requests over seconds]
  • 9. WAIT, WHAT IS A CACHE STAMPEDE? 1. A critical piece of your cached data expires through TTL (or is evicted)
  • 10. WAIT, WHAT IS A CACHE STAMPEDE? 2. A client requests a service which relies on that data
  • 11. WAIT, WHAT IS A CACHE STAMPEDE? 3. That data takes a relatively long time to compute
  • 12. WAIT, WHAT IS A CACHE STAMPEDE? 4. Other requests arrive that need the same data
  • 13. WAIT, WHAT IS A CACHE STAMPEDE? 5. A lot of them stack up on the server before the first one has even finished
  • 14. 503 SERVICE UNAVAILABLE
  • 15. I DON'T WANT THAT! No, you don’t!
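
A common way to blunt the stampede described in slides 8 to 14 is to let only one request rebuild the expired entry while the others are served a stale copy. The sketch below is one possible approach, not something shown in the slides. It relies on the fact that a Memcached-style `add()` fails when the key already exists, which makes it usable as a lock. `LockDemoCache`, `getOrRecompute` and the `lock:`/`stale:` key prefixes are hypothetical, and the class is an in-memory stand-in so the example runs without a server.

```php
<?php
// Hypothetical in-memory stand-in with Memcached-style get/set/add/delete.
final class LockDemoCache
{
    private array $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? false; // false signals a miss
    }

    public function set(string $key, $value, int $expiration = 0): bool
    {
        $this->items[$key] = $value; // the expiration is ignored by this stand-in
        return true;
    }

    public function add(string $key, $value, int $expiration = 0): bool
    {
        if (array_key_exists($key, $this->items)) {
            return false; // like Memcached::add(), refuse to overwrite
        }
        $this->items[$key] = $value;
        return true;
    }

    public function delete(string $key): bool
    {
        unset($this->items[$key]);
        return true;
    }
}

function getOrRecompute(object $cache, string $key, callable $compute, int $ttl = 60)
{
    $value = $cache->get($key);
    if ($value !== false) {
        return $value; // cache hit, nothing to do
    }
    // Only the first request wins the lock and pays for the recomputation.
    if ($cache->add("lock:$key", 1, 30)) {
        try {
            $value = $compute();
            $cache->set($key, $value, $ttl);
            $cache->set("stale:$key", $value); // long-lived copy for the losers
        } finally {
            $cache->delete("lock:$key");
        }
        return $value;
    }
    // Someone else is rebuilding: serve the stale copy, or null if there is none.
    $stale = $cache->get("stale:$key");
    return $stale !== false ? $stale : null;
}

$cache = new LockDemoCache();
$calls = 0;
$compute = function () use (&$calls) {
    $calls++;
    return 'report v' . $calls;
};

$first = getOrRecompute($cache, 'report', $compute);  // miss: computes the value
$cache->delete('report');                             // simulate the TTL expiring
$cache->add('lock:report', 1);                        // another request is already rebuilding
$second = getOrRecompute($cache, 'report', $compute); // served stale, no second computation
echo $second, ' after ', $calls, ' computation(s)', PHP_EOL;
```

Note that a real `Memcached::get()` also returns `false` for a stored `false`, so production code would check the result code rather than the return value alone.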
  • 16. MEMCACHED
  • 17. HOW DO I CACHE THINGS?
      1. Create a Memcached instance
          $memcached = new Memcached;
          $memcached->addServers( $memcachedServers );
      2. Put data in
          $memcached->set( $key, $value, $expireAt );
      3. Get data out
          $memcached->get( $key );
  • 18. A SIMPLE BENCHMARK
  • 19. HELPFUL TIPS
      • It’s best to cache the final result of an operation rather than its input data
      • You should always have a fallback for when you get a cache miss
      • Try to avoid flushing the entire cache; use clever key names instead
      • Use Memcached::getAllKeys() to help you manage/release/update data
      • Use Memcached::getStats() to help you improve efficiency
      • Have a warmup script!
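
Two of these tips, the fallback on a miss and the "clever key names", can be sketched together. The idea shown here is one possible reading of the tip, not the presenter's code: every key carries a per-namespace version number, so bumping the version makes a whole group of entries unreachable without flushing anything else. `VersionedDemoCache`, `namespacedKey`, `invalidateNamespace` and `remember` are hypothetical names, and the class is an in-memory stand-in for a Memcached-style client.

```php
<?php
// Hypothetical in-memory stand-in with Memcached-style get/set.
final class VersionedDemoCache
{
    private array $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? false; // false signals a miss
    }

    public function set(string $key, $value, int $expiration = 0): bool
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Build a key that embeds the current version of its namespace.
function namespacedKey(object $cache, string $namespace, string $key): string
{
    $version = $cache->get("ns:$namespace");
    if ($version === false) {
        $version = 1;
        $cache->set("ns:$namespace", $version);
    }
    return "$namespace:v$version:$key";
}

// Bump the version: old entries are simply never asked for again.
function invalidateNamespace(object $cache, string $namespace): void
{
    $version = $cache->get("ns:$namespace");
    $cache->set("ns:$namespace", ($version === false ? 1 : $version) + 1);
}

// Read through the cache, falling back to the slow path on a miss.
function remember(object $cache, string $key, callable $fallback)
{
    $value = $cache->get($key);
    if ($value === false) {
        $value = $fallback();
        $cache->set($key, $value);
    }
    return $value;
}

$cache = new VersionedDemoCache();
$loads = 0;
$loadPrice = function () use (&$loads) {
    $loads++;   // counts how often the slow path runs
    return 42;
};

$before = namespacedKey($cache, 'products', 'price:15');
remember($cache, $before, $loadPrice);   // miss: loads from the source
remember($cache, $before, $loadPrice);   // hit: served from the cache
invalidateNamespace($cache, 'products'); // "flush" only the products group
$after = namespacedKey($cache, 'products', 'price:15');
remember($cache, $after, $loadPrice);    // new key, so a miss again
echo $before, ' -> ', $after, ', slow loads: ', $loads, PHP_EOL;
```

The orphaned `v1` entries are left for the cache to evict on its own, which is the trade-off of this approach.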
  • 20. WHAT TO CHECK IN getStats()
      …
      ["get_hits"]=>int(110825125)
      ["get_misses"]=>int(17396765)
      ["evictions"]=>int(0)
      …
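
The three counters on slide 20 boil down to two questions: how often does a lookup hit, and is the cache throwing data away for lack of memory? A small sketch of that arithmetic follows, using the numbers from the slide. `cacheHealth` is a hypothetical helper; with a real client, `Memcached::getStats()` returns one such array per server, keyed by `host:port`.

```php
<?php
// Summarise one server's statistics array into a hit ratio and an
// eviction flag. Expects the get_hits, get_misses and evictions counters.
function cacheHealth(array $serverStats): array
{
    $hits = $serverStats['get_hits'];
    $misses = $serverStats['get_misses'];
    $lookups = $hits + $misses;

    return [
        // Share of lookups answered from the cache (0.0 when there were none).
        'hit_ratio' => $lookups > 0 ? $hits / $lookups : 0.0,
        // Any eviction means the cache ran out of room for live data.
        'evicting'  => $serverStats['evictions'] > 0,
    ];
}

// The values shown on the slide.
$stats = ['get_hits' => 110825125, 'get_misses' => 17396765, 'evictions' => 0];
$health = cacheHealth($stats);

printf(
    "hit ratio: %.1f%%, evicting: %s\n",
    $health['hit_ratio'] * 100,
    $health['evicting'] ? 'yes' : 'no'
); // prints "hit ratio: 86.4%, evicting: no"
```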
  • 21. VARNISH
  • 22. VARNISH IS:
      • A caching HTTP reverse proxy
      • Really, really, really FAST
      • Usually limited only by the speed of the network
      • Decently flexible, thanks to the VCL configuration language
  • 23. A SIMPLE BENCHMARK
  • 24. NICE SPEED Now let’s see how to use Varnish effectively on my very dynamic site
  • 25. COMMON PROBLEMS TO OVERCOME
      • My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches whole pages
      • I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/assaulting the datacenter, and I prefer to do it from within my app
      • My visitors have unique stuff
      • Sessions
      • Cookies
      • Statistics and tracking visitors
  • 26. ABOUT ESI
      • Edge Side Includes (ESI) is a small markup language for edge-level dynamic web content assembly. The purpose of ESI is to tackle the problem of web infrastructure scaling.
      <HTML>
        <BODY>
          …
          <esi:include src="/esi/private/recentproducts"/>
          …
        </BODY>
      </HTML>
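
On the backend side, the page that Varnish assembles is ordinary HTML with placeholders where the dynamic fragments go. The sketch below shows how a PHP backend might emit such a page. `esiInclude` and `buildProductPage` are hypothetical helpers, and the `X-Varnish-Esi: 1` response header is an assumption taken from the vcl_fetch code on slide 41, which turns on ESI processing when the backend sends it.

```php
<?php
// Render one ESI placeholder; Varnish replaces it with the fragment at $src.
function esiInclude(string $src): string
{
    return '<esi:include src="' . htmlspecialchars($src, ENT_QUOTES) . '"/>';
}

// Build the cacheable page shell. The headers are returned instead of sent
// so the sketch also runs from the command line.
function buildProductPage(string $staticHtml): array
{
    $body = "<html><body>\n"
          . $staticHtml . "\n"
          . esiInclude('/esi/private/recentproducts') . "\n"
          . "</body></html>\n";

    return [
        'headers' => ['X-Varnish-Esi: 1'], // asks the VCL to parse the body for ESI tags
        'body'    => $body,
    ];
}

$page = buildProductPage('<h1>Catalogue</h1>');
echo $page['body'];
```

In a web request the backend would pass each entry of `headers` to `header()` before echoing the body. The shell and the `/esi/private/recentproducts` fragment are then cached separately, each with its own lifetime.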
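  • 27. [Page layout diagram with each region labelled by how often it changes. Labels, in order: doesn't change at all; session specific; 1 minute; session specific; 24h; doesn't change at all; 2-4 minutes; 1 hour]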
  • 28. SETTING UP BACKENDS
      backend www {
          .host = "192.168.0.2";
          .port = "81";
          .connect_timeout = 1s;
          .first_byte_timeout = 5s;
          .between_bytes_timeout = 2s;
      }
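  • 29. HOW DOES IT WORK? [Flow diagram: a client request enters vcl_recv, which chooses pipe (vcl_pipe), pass (vcl_pass) or lookup (vcl_hash, then vcl_hit or vcl_miss); fetches go to Backend1 or Backend2 and come back through vcl_fetch; responses leave through vcl_deliver or vcl_error]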
  • 30. vcl_recv
      • First checkpoint when a request arrives and is parsed
      • We must decide whether to lookup, pass or pipe the request
      • We can choose a backend to use
      • We have the req object
      • PURGE-, BAN- or REFRESH-like requests are defined here
      • We can set a header in the req object to tell our backend the request comes from Varnish
  • 31.
      sub vcl_recv {
          set req.backend = default;
          set req.http.X-Varnish-Handshake = "1";
          set req.http.X-Forwarded-For = client.ip;

          if (req.url ~ "/esi/") {
              # Keep the ESI section name, e.g. "private" from /esi/private/...
              set req.http.X-Varnish-Esi = regsub(req.url, ".esi/(\w+)/.*", "\1");
              remove req.http.Accept-Encoding;
          }
          if (req.request != "GET" && req.request != "HEAD") {
              # We only deal with GET and HEAD by default
              return (pass);
          }
          if (req.http.Cookie !~ "PHPSESSID=") {
              # No session yet: make one up in Varnish
              call generate_session;
          }
          return (lookup);
      }
  • 32. WAIT, WHAT?
      sub generate_session {
          C{
              char uuid_buf[50];
              generate_uuid(uuid_buf);
              /* The leading octal byte is the length of the header name,
                 colon included: "X-Varnish-Fake-Session:" is 23 = \027 */
              VRT_SetHdr(sp, HDR_REQ, "\027X-Varnish-Fake-Session:", uuid_buf, vrt_magic_string_end);
          }C

          if (req.http.Cookie) {
              set req.http.Cookie = req.http.X-Varnish-Fake-Session + "; " + req.http.Cookie;
          } else {
              set req.http.Cookie = req.http.X-Varnish-Fake-Session;
          }
      }
  • 33. WAIT, WHAT? [the generate_session code from slide 32, shown again]
  • 34.
      C{
          #include <stdlib.h>
          #include <stdio.h>
          #include <time.h>
          #include <pthread.h>

          static pthread_mutex_t lrand_mutex = PTHREAD_MUTEX_INITIALIZER;

          /* Writes "PHPSESSID=" plus 32 hex digits into buf (at least 43 bytes).
             lrand48() is not a cryptographically secure generator. */
          void generate_uuid(char* buf) {
              /* lrand48() uses shared state, so serialise access between threads */
              pthread_mutex_lock(&lrand_mutex);
              long a = lrand48();
              long b = lrand48();
              long c = lrand48();
              long d = lrand48();
              pthread_mutex_unlock(&lrand_mutex);
              sprintf(buf, "PHPSESSID=%08lx%04lx%04lx%04lx%04lx%08lx",
                  a,
                  b & 0xffff,
                  (b & ((long)0x0fff0000) >> 16) | 0x4000,
                  (c & 0x0fff) | 0x8000,
                  (c & (long)0xffff0000) >> 16,
                  d
              );
              return;
          }
      }C
  • 35. HOW DOES IT WORK? [the flow diagram from slide 29, shown again]
  • 36. vcl_hash
      • Generates the hash through which Varnish looks up an object
      • We have the req object
      • We can make certain objects unique in the cache based on something more than just the URL, like a session cookie
  • 37.
      sub vcl_hash {
          hash_data(req.url);
          if (req.http.host) {
              hash_data(req.http.host);
          } else {
              hash_data(server.ip);
          }

          if (req.http.Accept-Encoding) {
              hash_data(req.http.Accept-Encoding);
          }

          # Private ESI fragments get one cached copy per session
          if (req.http.X-Varnish-Esi == "private" && req.http.Cookie ~ "PHPSESSID=") {
              hash_data(regsub(req.http.Cookie, "^.*?PHPSESSID=([^;]*);*.*$", "\1"));
          }

          return (hash);
      }
  • 38. HOW DOES IT WORK? [the flow diagram from slide 29, shown again]
  • 39. vcl_fetch
      • Takes control when a response from the backend is fetched and parsed
      • We have the req and beresp objects
      • A good place to sanitise the backend response and control TTL
      • Removing the Set-Cookie header is good practice here
      • Add helper headers to the cached object for the ban lurker
      • We can choose to deliver or hit_for_pass here
  • 40. beresp.ttl
      Before Varnish runs vcl_fetch, the beresp.ttl variable has already been set to a value. It will use the first value it finds among:
      • The s-maxage variable in the Cache-Control response header
      • The max-age variable in the Cache-Control response header
      • The Expires response header
      • The default_ttl parameter
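
The precedence list on slide 40 can be modelled in a few lines, which is handy when deciding which headers the PHP backend should send. This is a simplified illustration of the rule as stated on the slide, not Varnish's actual implementation: `initialTtl` is a hypothetical function, the Expires header is measured against a caller-supplied "now", and the default of 120 seconds is assumed to match Varnish's stock `default_ttl`.

```php
<?php
// Return the TTL (in seconds) that the first matching rule yields:
// s-maxage, then max-age, then Expires, then the default.
function initialTtl(array $headers, int $now, int $defaultTtl = 120): int
{
    $cacheControl = $headers['Cache-Control'] ?? '';

    foreach (['s-maxage', 'max-age'] as $directive) {
        // Match the directive at the start or after a comma or whitespace.
        if (preg_match('/(?:^|[\s,])' . $directive . '=(\d+)/i', $cacheControl, $m)) {
            return (int) $m[1];
        }
    }

    if (isset($headers['Expires'])) {
        $expires = strtotime($headers['Expires']);
        if ($expires !== false) {
            return max(0, $expires - $now); // an already expired date gives 0
        }
    }

    return $defaultTtl;
}

$now = strtotime('Sat, 23 Nov 2013 11:00:00 GMT');

echo initialTtl(['Cache-Control' => 'public, s-maxage=600, max-age=60'], $now), PHP_EOL; // 600
echo initialTtl(['Expires' => 'Sat, 23 Nov 2013 12:00:00 GMT'], $now), PHP_EOL;          // 3600
echo initialTtl([], $now), PHP_EOL;                                                      // 120
```

Whatever value comes out of this step is only a starting point, since vcl_fetch can still overwrite beresp.ttl.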
  • 41.
      sub vcl_fetch {
          # Helper headers stored with the object, used later by bans
          set beresp.http.X-Url = req.url;
          set beresp.http.X-Host = req.http.host;
          set beresp.http.X-Varnish-Session = regsub(req.http.Cookie, "^.*?PHPSESSID=([^;]*);*.*$", "\1");

          if (beresp.status != 200 && beresp.status != 404) {
              set beresp.ttl = 15s;
              return (hit_for_pass);
          }
          if (beresp.http.Set-Cookie) {
              remove beresp.http.Set-Cookie;
          }
          if (beresp.http.X-Varnish-Esi == "1") {
              set beresp.do_esi = true;
          }
          if (req.url ~ "\.(jpg|jpeg|gif|otf|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|scripts)$") {
              set beresp.ttl = 180m;
          }
          return (deliver);
      }
  • 42. HOW DOES IT WORK? [the flow diagram from slide 29, shown again]
  • 43. vcl_deliver
      • Takes control just before a response is sent to the client
      • We have the req and resp objects
      • Executes after hit, miss and fetch, hit_for_pass or pass (but not pipe)
      • Removing all the headers we set during the VCL flow is a good idea here
      • We can also add headers here that should go to the client but shouldn’t be in the cache
  • 44.
      sub vcl_deliver {
          if (req.http.X-Varnish-Fake-Session) {
              # Hand the session generated in Varnish to the client as a cookie
              call generate_session_expires;
              set resp.http.Set-Cookie = req.http.X-Varnish-Fake-Session + "; expires=" + resp.http.X-Varnish-Cookie-Expires + "; path=/";
              if (req.http.Host) {
                  # Strip the port from the Host header
                  set resp.http.Set-Cookie = resp.http.Set-Cookie + "; domain=" + regsub(req.http.Host, ":\d+$", "");
              }
              set resp.http.Set-Cookie = resp.http.Set-Cookie + "; httponly";
              unset resp.http.X-Varnish-Cookie-Expires;
          }
          if (!client.ip ~ debug) {
              unset resp.http.X-Host;
              unset resp.http.X-Url;
              unset resp.http.X-Varnish-Session;
          } else {
              if (obj.hits > 0) {
                  set resp.http.X-Cache = "HIT";
              } else {
                  set resp.http.X-Cache = "MISS";
              }
          }

          return (deliver);
      }
  • 45. ACLs
      acl purge {
          "localhost";
          "127.0.0.1";
      }

      acl debug {
          "192.168.0.128";
      }
  • 46. INVALIDATING CACHED OBJECTS
      • We can control cached objects through HTTP requests to Varnish with some clever VCL-ing
      • PURGE - we can purge a single object from the cache
      • BAN - we can ban a selection of matching objects from the cache
      • REFRESH - we can fetch a new copy of an object while the old one is still served in the meantime
  • 47.
      sub vcl_recv {
          if (req.request == "PURGE") {
              if (!client.ip ~ purge) {
                  error 405 "Not allowed.";
              }
              return (lookup);
          }
      }

      sub vcl_hit {
          if (req.request == "PURGE") {
              purge;
              error 200 "Purged";
          }
      }

      sub vcl_miss {
          if (req.request == "PURGE") {
              error 404 "Not in cache";
          }
      }
  • 48.
      $cacheServerSocket = fsockopen($varnishHostname, 80, $errno, $errstr, 2);
      if ($cacheServerSocket === false) {
          die("Could not reach Varnish: $errstr ($errno)");
      }

      $request = "PURGE /something.htm HTTP/1.0\r\n";
      $request .= "Host: www.varnished-site.com\r\n";
      $request .= "Connection: Close\r\n\r\n";

      fwrite($cacheServerSocket, $request);
      $response = fgets($cacheServerSocket); // status line, e.g. "HTTP/1.1 200 Purged"
      fclose($cacheServerSocket);
  • 49.
      sub vcl_recv {
          if (req.request == "BAN") {
              if (!client.ip ~ purge) {
                  error 405 "Not allowed.";
              }
              ban("obj.http.X-Host ~ " + req.http.host + " && obj.http.X-Url ~ " + req.url);
              error 200 "Banned";
          }
      }
  • 50.
      sub vcl_recv {
          if (req.request == "REFRESH") {
              if (!client.ip ~ purge) {
                  error 405 "Not allowed.";
              }
              set req.request = "GET";
              set req.hash_always_miss = true;
          }
      }
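
Since PURGE, BAN and REFRESH differ only in the request method, the PHP snippet from slide 48 generalises naturally into one helper. This is a sketch under that assumption: `buildInvalidationRequest` and `sendToVarnish` are hypothetical names, and the three methods only work if the VCL from slides 47, 49 and 50 is in place and the caller's IP is in the `purge` ACL.

```php
<?php
// Build the raw HTTP request for one of the three custom methods.
function buildInvalidationRequest(string $method, string $host, string $path): string
{
    $allowed = ['PURGE', 'BAN', 'REFRESH'];
    if (!in_array($method, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported method: $method");
    }

    return "$method $path HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "Connection: Close\r\n\r\n";
}

// Send the request and return the status line, or null if Varnish is unreachable.
function sendToVarnish(string $varnishHost, string $request, int $port = 80): ?string
{
    $socket = @fsockopen($varnishHost, $port, $errno, $errstr, 2);
    if ($socket === false) {
        return null;
    }
    fwrite($socket, $request);
    $statusLine = fgets($socket);
    fclose($socket);

    return $statusLine === false ? null : trim($statusLine);
}

// Only build and print the request here; nothing is sent over the network.
$request = buildInvalidationRequest('PURGE', 'www.varnished-site.com', '/something.htm');
echo $request;
```

For a BAN, the path is used as a regular expression against the stored X-Url header, so `buildInvalidationRequest('BAN', 'www.varnished-site.com', '/products/')` would target every cached URL containing `/products/`.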
  • 51. VMODs
  • 52. COMMON PROBLEMS TO OVERCOME
      • My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches whole pages => Use ESI
      • I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/assaulting the datacenter, and I prefer to do it from within my app => Set up PURGE/BAN/REFRESH in the VCL
      • My visitors have unique stuff => Use the session cookie in vcl_hash to keep a unique copy
      • Sessions => Use the "generate the session in Varnish" trick
      • Cookies => Uhhh, don't use 'em?
      • Statistics and tracking visitors => Use the memcached VMOD and process stuff asynchronously on the backend
  • 53. OTHER STUFF
  • 54. QUESTIONS?* * answers not guaranteed to be available and/or true