Saving The World From Guaranteed APOCALYPSE* Using Varnish and Memcached
From guaranteed APOCALYPSE*
using varnish, memcached, and some other stuff

From PHP Bulgaria User Group Meeting: 23.11.2013

Published in: Technology
Saving The World From Guaranteed APOCALYPSE* Using Varnish and Memcached

  1. SAVING THE WORLD
     From guaranteed APOCALYPSE*
     using Varnish, Memcached, and some other stuff
     * apocalypse not really guaranteed
  2. WHAT IS CACHING?
  3. WHY IS NOT CACHING BAD?
     • You keep executing the same code with the same data
     • You waste computing power getting the same result
     • That power is probably generated by burning coal*
     • Burning stuff produces tons of CO2**
     * it most likely is not
     ** probably a smaller unit of mass
  4. Too much CO2 will make THE EARTH EXPLODE*
     * based on pure speculation
  5. WHY SHOULD YOU CARE?
     • Your web apps will become WAY faster
     • Users and search engines will like you MORE
     • You will use A LOT less hardware resources and/or save $$$
     • You will generate LESS CO2
     • The Earth will NOT explode and/or you’ll have more $$$
     • Women like people who save the world and/or have $$$
     • And lots of other stuff*
     * 0 or greater amount of other stuff
  6. ABOUT TTL
  7. WHY YOU SHOULD AVOID USING TTL
     • You might use obsolete data
     • Your server might get a cache stampede and go down
     • You should PUSH the fresh data into your cache as soon as you have it, BEFORE the old data has expired from the cache
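The "push, don't expire" advice on slide 7 could look roughly like the sketch below. It is only an illustration under assumptions: `ArrayCache` is a hypothetical in-memory stand-in that mimics the `get`/`set` calls of `Memcached`, and `saveProduct`, the key name, and the rendering step are made up for the example.

```php
<?php
// Hypothetical in-memory stand-in for Memcached. Like Memcached::get(),
// get() returns false on a miss. Use a real Memcached object in production.
class ArrayCache
{
    private $data = [];

    public function get($key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }

    public function set($key, $value, $expiration = 0)
    {
        $this->data[$key] = $value; // expiration 0 means "no TTL"
        return true;
    }
}

// Hypothetical write path: persist the change, then immediately overwrite
// the cached copy, so readers never wait for a TTL to run out.
function saveProduct($cache, array &$database, $id, array $product)
{
    $database[$id] = $product;                                     // 1. persist
    $rendered = strtoupper($product['name']) . ' - ' . $product['price'];
    $cache->set("product:$id:rendered", $rendered);                // 2. push to cache
    return $rendered;
}
```

Because the write path refreshes the entry itself, the cached value never has to expire in order to become fresh.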
  8. WAIT, WHAT IS A CACHE STAMPEDE?
     [Chart: requests (vertical axis) over seconds (horizontal axis)]
  9. WAIT, WHAT IS A CACHE STAMPEDE?
     1. A critical piece of your cached data expires through TTL (or is evicted)
  10. WAIT, WHAT IS A CACHE STAMPEDE?
      2. A client requests a service which relies on that data
  11. WAIT, WHAT IS A CACHE STAMPEDE?
      3. That data takes a relatively long time to compute
  12. WAIT, WHAT IS A CACHE STAMPEDE?
      4. Other requests arrive that need the same data
  13. WAIT, WHAT IS A CACHE STAMPEDE?
      5. A lot of them stack up on the server before the first one has even finished
  14. 503 SERVICE UNAVAILABLE
  15. I DON’T WANT THAT!
      No, you don’t!
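One common way to soften the stampede described in steps 1 to 5 is to let only one request recompute the missing value. The sketch below is not taken from the talk: `ArrayCache` is a hypothetical in-memory stand-in whose `add()` behaves like `Memcached::add()` (it stores only when the key is absent), and `getOrRecompute`, the `lock:` key prefix, and the 30-second lock lifetime are assumptions.

```php
<?php
// Hypothetical in-memory stand-in for Memcached (get/set/add/delete only).
class ArrayCache
{
    private $data = [];

    public function get($key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }

    public function set($key, $value, $expiration = 0)
    {
        $this->data[$key] = $value;
        return true;
    }

    // Stores the value only if the key does not exist yet.
    public function add($key, $value, $expiration = 0)
    {
        if (array_key_exists($key, $this->data)) {
            return false;
        }
        $this->data[$key] = $value;
        return true;
    }

    public function delete($key)
    {
        unset($this->data[$key]);
        return true;
    }
}

// Returns the cached value, or lets exactly one caller recompute it.
// Callers that lose the race receive $stale instead of hitting the backend.
// Note: a computed value of false cannot be told apart from a miss.
function getOrRecompute($cache, $key, callable $compute, $stale = null)
{
    $value = $cache->get($key);
    if ($value !== false) {
        return $value;                       // cache hit
    }
    if (!$cache->add("lock:$key", 1, 30)) {  // someone else holds the lock
        return $stale;
    }
    try {
        $value = $compute();                 // the slow part runs once
        $cache->set($key, $value);
    } finally {
        $cache->delete("lock:$key");
    }
    return $value;
}
```

What to return to the callers that lose the race (an older copy, a placeholder, or a short wait and retry) is a design choice that depends on the application.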
  16. MEMCACHED
  17. HOW DO I CACHE THINGS?
      1. Create a Memcached instance
          $memcached = new Memcached;
          $memcached->addServers( $memcachedServers );
      2. Put data in
          $memcached->set( $key, $value, $expireAt );
      3. Get data out
          $memcached->get( $key );
  18. A SIMPLE BENCHMARK
  19. HELPFUL TIPS
      • It’s best to cache the final result of an operation rather than its input data
      • You should always have a fallback for when you get a cache miss
      • Try to avoid flushing the entire cache; use clever key names instead
      • Use Memcached::getAllKeys() to help you manage/release/update data
      • Use Memcached::getStats() to help you improve efficiency
      • Have a warmup script!
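Two of the tips above, "have a fallback on a miss" and "use clever key names instead of flushing", can be sketched together. This is one possible approach rather than the speaker's implementation: `ArrayCache` is a hypothetical in-memory stand-in for `Memcached`, and `namespacedKey`, `invalidateNamespace`, `fetchWithFallback`, and the `nsver:` prefix are invented names.

```php
<?php
// Hypothetical in-memory stand-in for Memcached (get/set only).
class ArrayCache
{
    private $data = [];

    public function get($key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }

    public function set($key, $value, $expiration = 0)
    {
        $this->data[$key] = $value;
        return true;
    }
}

// Builds keys such as "products:v1:42". The version number of each
// namespace is itself stored in the cache.
function namespacedKey($cache, $namespace, $id)
{
    $version = $cache->get("nsver:$namespace");
    if ($version === false) {
        $version = 1;
        $cache->set("nsver:$namespace", $version);
    }
    return "$namespace:v$version:$id";
}

// "Flushes" a single namespace by bumping its version. Old keys are never
// read again and are left for the cache to evict.
function invalidateNamespace($cache, $namespace)
{
    $version = $cache->get("nsver:$namespace");
    $cache->set("nsver:$namespace", ($version === false ? 1 : $version) + 1);
}

// Fallback on a miss: compute the value, store it, return it.
function fetchWithFallback($cache, $key, callable $fallback)
{
    $value = $cache->get($key);
    if ($value === false) {
        $value = $fallback();
        $cache->set($key, $value);
    }
    return $value;
}
```

Bumping one version number invalidates a whole group of keys while the rest of the cache stays warm.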
  20. WHAT TO CHECK IN getStats()
      …
      ["get_hits"]=> int(110825125)
      ["get_misses"]=> int(17396765)
      ["evictions"]=> int(0)
      …
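The three counters on slide 20 are usually read as a hit ratio plus an eviction check. A small sketch, assuming `$stats` holds one server's entry from the statistics array; `hitRatio` is a made-up helper name.

```php
<?php
// Share of get requests that were answered from the cache (0.0 to 1.0).
function hitRatio(array $stats)
{
    $hits = $stats['get_hits'];
    $misses = $stats['get_misses'];
    $total = $hits + $misses;
    return $total > 0 ? $hits / $total : 0.0;
}

// The numbers shown on the slide.
$stats = ['get_hits' => 110825125, 'get_misses' => 17396765, 'evictions' => 0];
printf("hit ratio: %.1f%%, evictions: %d\n", hitRatio($stats) * 100, $stats['evictions']);
// prints "hit ratio: 86.4%, evictions: 0"
```

An eviction count of 0 means Memcached has not had to drop live items to make room. A growing count suggests the cache is short of memory.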
  21. VARNISH
  22. VARNISH IS:
      • A caching HTTP reverse proxy
      • Really, really, really FAST
      • Usually limited by the speed of the network
      • Decently flexible thanks to the VCL configuration language
  23. A SIMPLE BENCHMARK
  24. NICE SPEED
      Now let’s see how to use Varnish effectively on my very dynamic site
  25. COMMON PROBLEMS TO OVERCOME
      • My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches whole pages
      • I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/assaulting the datacenter, and I prefer to do it from within my app
      • My visitors have unique stuff
      • Sessions
      • Cookies
      • Statistics and tracking visitors
  26. ABOUT ESI
      • Edge Side Includes, or ESI, is a small markup language for edge-level dynamic web content assembly. The purpose of ESI is to tackle the problem of web infrastructure scaling.
          <HTML>
          <BODY>
          …
          <esi:include src="/esi/private/recentproducts"/>
          …
          </BODY>
          </HTML>
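To make the idea of "content assembly" concrete, here is a toy ESI processor. It only illustrates what happens to an `<esi:include>` tag and is not how Varnish implements ESI; `assembleEsi` and the fragment callback are hypothetical.

```php
<?php
// Replaces every <esi:include src="..."/> tag with the fragment returned by
// $fetchFragment($src). In a real setup the edge cache does this, and each
// fragment can have its own cache lifetime.
function assembleEsi($html, callable $fetchFragment)
{
    return preg_replace_callback(
        '#<esi:include\s+src="([^"]+)"\s*/>#',
        function (array $match) use ($fetchFragment) {
            return $fetchFragment($match[1]);
        },
        $html
    );
}
```

The page shell and each included fragment are separate cacheable objects, which is what lets a mostly static page carry a session-specific block.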
  27. [Page layout diagram: regions of a page labelled with their cache lifetime: doesn't change at all, session specific, 1 minute, 2-4 minutes, 1 hour, 24 h]
  28. SETTING UP BACKENDS
          backend www {
              .host = "192.168.0.2";
              .port = "81";
              .connect_timeout = 1s;
              .first_byte_timeout = 5s;
              .between_bytes_timeout = 2s;
          }
  29. HOW DOES IT WORK?
      [Request flow diagram. Nodes: Client request, vcl_recv, vcl_pipe, vcl_pass, vcl_hash, vcl_hit, vcl_miss, vcl_fetch, vcl_deliver, vcl_error, Backend1, Backend2. Edge labels: pipe, pass, lookup, fetch]
  30. vcl_recv
      • First checkpoint when a request arrives and is parsed
      • We must decide whether to lookup, pass or pipe the request
      • We can choose a backend to use
      • We have the req object
      • PURGE-, BAN- or REFRESH-like requests are defined here
      • We can set a header in the req object to tell our backend that the request comes from Varnish
  31.     set req.backend = default;
          set req.http.X-Varnish-Handshake = "1";
          set req.http.X-Forwarded-For = client.ip;

          if (req.url ~ "/esi/") {
              set req.http.X-Varnish-Esi = regsub(req.url, ".esi/(\w+)/.*", "\1");
              remove req.http.Accept-Encoding;
          }
          if (req.request != "GET" && req.request != "HEAD") {
              # We only deal with GET and HEAD by default
              return (pass);
          }
          if (req.http.Cookie !~ "PHPSESSID=") {
              call generate_session;
          }
          return (lookup);
  32. WAIT, WHAT?
          sub generate_session {
              C{
                  char uuid_buf [50];
                  generate_uuid(uuid_buf);
                  /* The leading octal byte is the length of the header name,
                     colon included: "X-Varnish-Fake-Session:" is 23 = \027 */
                  VRT_SetHdr(sp, HDR_REQ,
                      "\027X-Varnish-Fake-Session:",
                      uuid_buf,
                      vrt_magic_string_end
                  );
              }C

              if (req.http.Cookie) {
                  set req.http.Cookie = req.http.X-Varnish-Fake-Session + "; " + req.http.Cookie;
              } else {
                  set req.http.Cookie = req.http.X-Varnish-Fake-Session;
              }
          }
  33. WAIT, WHAT?
      [Same code as slide 32]
  34.     C{
              #include <stdlib.h>
              #include <stdio.h>
              #include <time.h>
              #include <pthread.h>

              static pthread_mutex_t lrand_mutex = PTHREAD_MUTEX_INITIALIZER;

              void generate_uuid(char* buf) {
                  /* lrand48() is not guaranteed to be thread-safe, so serialise access */
                  pthread_mutex_lock(&lrand_mutex);
                  long a = lrand48();
                  long b = lrand48();
                  long c = lrand48();
                  long d = lrand48();
                  pthread_mutex_unlock(&lrand_mutex);
                  sprintf(buf, "PHPSESSID=%08lx%04lx%04lx%04lx%04lx%08lx",
                      a,
                      b & 0xffff,
                      /* mask first, then shift: >> binds tighter than & */
                      ((b & (long)0x0fff0000) >> 16) | 0x4000,
                      (c & 0x0fff) | 0x8000,
                      (c & (long)0xffff0000) >> 16,
                      d
                  );
                  return;
              }
          }C
  35. HOW DOES IT WORK?
      [Same request flow diagram as slide 29]
  36. vcl_hash
      • Generates the hash through which Varnish looks up an object
      • We have the req object
      • We can make certain objects unique in the cache based on something more than just the URL, such as a session cookie
  37.     hash_data(req.url);
          if (req.http.host) {
              hash_data(req.http.host);
          } else {
              hash_data(server.ip);
          }

          if (req.http.Accept-Encoding) {
              hash_data(req.http.Accept-Encoding);
          }

          if (req.http.X-Varnish-Esi == "private" && req.http.Cookie ~ "PHPSESSID=") {
              hash_data(regsub(req.http.Cookie, "^.*?PHPSESSID=([^;]*);*.*$", "\1"));
          }

          return (hash);
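The `regsub()` call in the vcl_hash code on slide 37 reduces the Cookie header to the bare session ID. The same pattern can be tried out in PHP, since both use PCRE-style expressions. This is a simplified sketch: `sessionIdFromCookie` and `cacheKeyParts` are hypothetical helpers, and the Accept-Encoding input from the slide is left out.

```php
<?php
// Extracts the PHPSESSID value from a Cookie header, or null if absent.
function sessionIdFromCookie($cookieHeader)
{
    if (strpos($cookieHeader, 'PHPSESSID=') === false) {
        return null;
    }
    // Same pattern and backreference as the regsub() on the slide.
    return preg_replace('/^.*?PHPSESSID=([^;]*);*.*$/', '\1', $cookieHeader);
}

// Mirrors the hash inputs chosen in vcl_hash: URL and host for everything,
// plus the session ID for "private" ESI fragments only.
function cacheKeyParts($url, $host, $esiType, $cookieHeader)
{
    $parts = [$url, $host];
    $sessionId = sessionIdFromCookie($cookieHeader);
    if ($esiType === 'private' && $sessionId !== null) {
        $parts[] = $sessionId; // one cached copy per session
    }
    return $parts;
}
```

Only the private fragments vary by session, so public pages still share a single cached copy across all visitors.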
  38. HOW DOES IT WORK?
      [Same request flow diagram as slide 29]
  39. vcl_fetch
      • Takes control when a response from the backend is fetched and parsed
      • We have the req and beresp objects
      • A good place to sanitise the backend response and control TTL
      • Removal of the Set-Cookie header is a good practice here
      • Add helper headers to the cached object for the ban lurker
      • We can choose to deliver or hit_for_pass here
  40. beresp.ttl
      Before Varnish runs vcl_fetch, the beresp.ttl variable has already been set to a value. It will use the first value it finds among:
      • The s-maxage variable in the Cache-Control response header
      • The max-age variable in the Cache-Control response header
      • The Expires response header
      • The default_ttl parameter
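The lookup order on slide 40 can be written out as a small function. This is a simplified sketch rather than Varnish's actual logic: `pickTtl` is a hypothetical name, the default of 120 seconds is an assumed value for `default_ttl`, and the Expires header is measured against a caller-supplied `$now`.

```php
<?php
// Returns a TTL in seconds using the first source that is present:
// s-maxage, then max-age, then Expires, then the default.
function pickTtl(array $headers, $now, $defaultTtl = 120)
{
    $cacheControl = isset($headers['Cache-Control']) ? $headers['Cache-Control'] : '';

    if (preg_match('/(?:^|[\s,])s-maxage=(\d+)/i', $cacheControl, $m)) {
        return (int) $m[1];
    }
    if (preg_match('/(?:^|[\s,])max-age=(\d+)/i', $cacheControl, $m)) {
        return (int) $m[1];
    }
    if (isset($headers['Expires'])) {
        $expires = strtotime($headers['Expires']);
        if ($expires !== false) {
            return max(0, $expires - $now); // already expired means 0
        }
    }
    return $defaultTtl;
}
```

A backend that wants a shorter lifetime in Varnish than in browsers can therefore send both directives, for example `Cache-Control: max-age=3600, s-maxage=60`.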
  41.     set beresp.http.X-Url = req.url;
          set beresp.http.X-Host = req.http.host;
          set beresp.http.X-Varnish-Session = regsub(req.http.Cookie, "^.*?PHPSESSID=([^;]*);*.*$", "\1");
          if (beresp.status != 200 && beresp.status != 404) {
              set beresp.ttl = 15s;
              return (hit_for_pass);
          }
          if (beresp.http.Set-Cookie) {
              remove beresp.http.Set-Cookie;
          }
          if (beresp.http.X-Varnish-Esi == "1") {
              set beresp.do_esi = true;
          }
          if (req.url ~ "\.(jpg|jpeg|gif|otf|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|scripts)$") {
              set beresp.ttl = 180m;
          }
          return (deliver);
  42. HOW DOES IT WORK?
      [Same request flow diagram as slide 29]
  43. vcl_deliver
      • Takes control just before a response is sent to the client
      • We have the req and resp objects
      • Executes after hit, miss and fetch, hit_for_pass or pass (but not pipe)
      • Removal of all headers we set during the VCL flow is a good idea here
      • We can also add headers here that should go to the client, but shouldn’t be in the cache
  44.     if (req.http.X-Varnish-Fake-Session) {
              call generate_session_expires;
              set resp.http.Set-Cookie = req.http.X-Varnish-Fake-Session
                  + "; expires=" + resp.http.X-Varnish-Cookie-Expires + "; path=/";
              if (req.http.Host) {
                  set resp.http.Set-Cookie = resp.http.Set-Cookie
                      + "; domain=" + regsub(req.http.Host, ":\d+$", "");
              }
              set resp.http.Set-Cookie = resp.http.Set-Cookie + "; httponly";
              unset resp.http.X-Varnish-Cookie-Expires;
          }
          if (!client.ip ~ debug) {
              unset resp.http.X-Host;
              unset resp.http.X-Url;
              unset resp.http.X-Varnish-Session;
          } else {
              if (obj.hits > 0) {
                  set resp.http.X-Cache = "HIT";
              } else {
                  set resp.http.X-Cache = "MISS";
              }
          }

          return (deliver);
  45. ACLs
          acl purge {
              "localhost";
              "127.0.0.1";
          }

          acl debug {
              "192.168.0.128";
          }
  46. INVALIDATING CACHED OBJECTS
      • We can control cached objects through HTTP requests to Varnish with some clever VCL-ing
      • PURGE - we can purge a single object from the cache
      • BAN - we can ban a selection of matching objects from the cache
      • REFRESH - we can fetch a new copy of an object while the old one is still served in the meantime
  47.     sub vcl_recv {
              if (req.request == "PURGE") {
                  if (!client.ip ~ purge) {
                      error 405 "Not allowed.";
                  }
                  return (lookup);
              }
          }

          sub vcl_hit {
              if (req.request == "PURGE") {
                  purge;
                  error 200 "Purged";
              }
          }

          sub vcl_miss {
              if (req.request == "PURGE") {
                  error 404 "Not in cache";
              }
          }
  48.     $cacheServerSocket = fsockopen($varnishHostname, 80, $errno, $errstr, 2);

          $request  = "PURGE /something.htm HTTP/1.0\r\n";
          $request .= "Host: www.varnished-site.com\r\n";
          $request .= "Connection: Close\r\n\r\n";

          fwrite($cacheServerSocket, $request);
          $response = fgets($cacheServerSocket);
          fclose($cacheServerSocket);
  49.     sub vcl_recv {
              if (req.request == "BAN") {
                  if (!client.ip ~ purge) {
                      error 405 "Not allowed.";
                  }
                  ban("obj.http.X-Host ~ " + req.http.host
                      + " && obj.http.X-Url ~ " + req.url);
                  error 200 "Banned";
              }
          }
  50.     sub vcl_recv {
              if (req.request == "REFRESH") {
                  if (!client.ip ~ purge) {
                      error 405 "Not allowed.";
                  }
                  set req.request = "GET";
                  set req.hash_always_miss = true;
              }
          }
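The PHP snippet on slide 48 only sends PURGE. The BAN and REFRESH handlers on slides 49 and 50 can be driven the same way. A minimal sketch, assuming the VCL from those slides is loaded; `buildVarnishRequest` and `sendToVarnish` are hypothetical helper names.

```php
<?php
// Builds a raw HTTP/1.0 request for one of the custom methods handled in
// vcl_recv. For BAN, the VCL matches obj.http.X-Url ~ req.url, so $path is
// treated as a regular expression by Varnish.
function buildVarnishRequest($method, $path, $host)
{
    $allowed = ['PURGE', 'BAN', 'REFRESH'];
    if (!in_array($method, $allowed, true)) {
        throw new InvalidArgumentException("Unsupported method: $method");
    }
    return "$method $path HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "Connection: Close\r\n\r\n";
}

// Sends the request and returns the status line, or null when the cache
// server cannot be reached or does not answer.
function sendToVarnish($varnishHostname, $request, $port = 80, $timeout = 2)
{
    $socket = @fsockopen($varnishHostname, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return null;
    }
    fwrite($socket, $request);
    $statusLine = fgets($socket);
    fclose($socket);
    return $statusLine === false ? null : rtrim($statusLine);
}
```

For example, `sendToVarnish($varnishHostname, buildVarnishRequest('BAN', '^/products/', 'www.varnished-site.com'))` would ask Varnish to ban every cached object of that host whose URL starts with `/products/`. The request must come from an address in the `purge` ACL.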
  51. VMODs
  52. COMMON PROBLEMS TO OVERCOME
      • My pages are a mix of highly dynamic sections and mostly static stuff, and Varnish supposedly only caches whole pages
        => Use ESI
      • I need to control/flush/refresh the cache without stopping/starting/killing/rebooting/pulling the cord/assaulting the datacenter, and I prefer to do it from within my app
        => Set up PURGE/BAN/REFRESH in the VCL
      • My visitors have unique stuff
        => Use the session cookie in vcl_hash to keep a unique copy
      • Sessions
        => Use the "generate the session in Varnish" trick
      • Cookies
        => Uhhh, don't use 'em?
      • Statistics and tracking visitors
        => Use the memcached VMOD and process stuff asynchronously on the backend
  53. OTHER STUFF
  54. QUESTIONS?*
      * answers not guaranteed to be available and/or true