Caching on the web

This talk covers one of the secrets of high scalability and performance.

Published in: Technology
  1. 1. Caching on the web @jcemer
  2. 2. jcemer.com twitter.com/jcemer
  3. 3. Cache is a hardware or software component that stores data so future requests for that data can be served faster - Wikipedia
  4. 4. Caching and RAM is the answer to everything - about Flickr Architecture
  5. 5. Caching is one of the secrets of high scalability and performance
  6. 6. Where could cache be present on the web? • Browser • Network • Server • Application
  7. 7. Browser caching
  8. 8. Caching would be useless if it did not significantly improve performance - HTTP/1.1 specification
  9. 9. The user might reuse assets during navigation
  10. 10. What could be cached? • Documents • Images • Scripts and CSS • Asynchronous requests
  11. 11. HTTP headers define whether a response can be cached and for how long
  12. 12. GET /main.css HTTP/1.1 Host: jcemer.com
  13. 13. GET /main.css HTTP/1.1 Host: jcemer.com HTTP/1.1 200 Date: Tue, 13 Sep 2016 13:32:50 GMT Cache-Control: max-age=604800 <Response Data>
  14. 14. Response header: Cache-Control: max-age=604800 allows the response to be stored in the cache for 604800 seconds (7 days)
  15. 15. CSS is requested once during navigation 😄 😄
  16. 16. The browser requests it again only if the cache expires or if the user force refreshes the page
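The freshness rule behind `max-age` can be sketched in a few lines of Ruby. This is a simplified illustration, not how any particular browser implements it: the helper names `max_age` and `fresh?` are made up for this example, and the age of the response is measured from the `Date` header only.

```ruby
require "time"

# Extracts max-age (in seconds) from a Cache-Control value, or nil if absent.
def max_age(cache_control)
  match = cache_control.to_s.match(/(?:\A|,)\s*max-age=(\d+)/i)
  match && match[1].to_i
end

# A stored response is fresh while its age is below max-age.
# Simplification: the age is computed from the Date header alone.
def fresh?(headers, now: Time.now)
  limit = max_age(headers["Cache-Control"])
  return false if limit.nil?

  (now - Time.httpdate(headers["Date"])) < limit
end

headers = {
  "Date" => "Tue, 13 Sep 2016 13:32:50 GMT",
  "Cache-Control" => "max-age=604800"
}
one_day_later    = Time.httpdate("Wed, 14 Sep 2016 13:32:50 GMT")
eight_days_later = Time.httpdate("Wed, 21 Sep 2016 13:32:50 GMT")

fresh?(headers, now: one_day_later)    # within the 7-day window
fresh?(headers, now: eight_days_later) # past the 7-day window
```

While the response is fresh the browser can reuse it without contacting the server at all.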
  17. 17. Response headers that add more information about the resource: Last-Modified: Mon, 12 Sep 2016 22:06:39 GMT and ETag: W/"337e7-8HrLmYe6UGIUDolQeGLoyw"
  18. 18. GET /main.css HTTP/1.1 Host: jcemer.com HTTP/1.1 200 OK Date: Tue, 13 Sep 2016 13:32:50 GMT Last-Modified: Mon, 12 Sep 2016 15:23:17 GMT Cache-Control: max-age=604800 new
  19. 19. Request headers that allow the cached resource to be reused if it didn’t change: If-Modified-Since: Mon, 12 Sep 2016 15:23:17 GMT and If-None-Match: W/"337e7-8HrLmYe6UGIUDolQeGLoyw"
  20. 20. GET /main.css HTTP/1.1 Host: jcemer.com If-Modified-Since: Mon, 12 Sep 2016 15:23:17 GMT
  21. 21. GET /main.css HTTP/1.1 Host: jcemer.com If-Modified-Since: Mon, 12 Sep 2016 15:23:17 GMT The response becomes HTTP/1.1 304 Not Modified Date: Tue, 13 Sep 2016 13:32:50 GMT Cache-Control: max-age=604800 <No Response Data>, instead of the full HTTP/1.1 200 OK Date: Tue, 13 Sep 2016 13:32:50 GMT Last-Modified: Mon, 12 Sep 2016 15:23:17 GMT <Response Data>
  22. 22. A website with wisely defined HTTP headers provides a better experience for its users
  23. 23. Network caching
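The revalidation exchange from the slides above can be sketched from the server's side. This is a minimal illustration under stated assumptions: `Resource`, `not_modified?` and `respond` are hypothetical names, and the entity tag comparison is plain string equality rather than the full comparison rules of the HTTP specification.

```ruby
require "time"

# Hypothetical in-memory representation of a served file.
Resource = Struct.new(:body, :etag, :last_modified)

# Decides whether the client's cached copy is still valid.
def not_modified?(resource, request_headers)
  if (tags = request_headers["If-None-Match"])
    tags.split(",").map(&:strip).include?(resource.etag)
  elsif (since = request_headers["If-Modified-Since"])
    resource.last_modified <= Time.httpdate(since)
  else
    false
  end
end

# Returns [status, headers, body]; a 304 carries no body.
def respond(resource, request_headers)
  validators = {
    "ETag" => resource.etag,
    "Last-Modified" => resource.last_modified.httpdate
  }
  return [304, validators, nil] if not_modified?(resource, request_headers)

  [200, validators, resource.body]
end

css = Resource.new(
  "body { margin: 0 }",
  'W/"337e7-8HrLmYe6UGIUDolQeGLoyw"',
  Time.httpdate("Mon, 12 Sep 2016 15:23:17 GMT")
)

first_visit = respond(css, {})
revalidated = respond(css, "If-Modified-Since" => "Mon, 12 Sep 2016 15:23:17 GMT")
```

The first visit receives the full body; the revalidation only receives headers, which saves the transfer of the stylesheet.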
  24. 24. Content Delivery Network (CDN) is a globally distributed network of proxy servers - Wikipedia
  25. 25. Cache-Control: public allows proxy servers to cache the content
  26. 26. Server caching
  27. 27. Finally, control is entirely in your hands, developer!
  28. 28. Cache server
  29. 29. The cache server sits between the user and the application or other servers
  30. 30. What could be cached? • Shared documents • Images • Scripts and CSS • Asynchronous requests
  31. 31. Tools • Varnish (https://varnish-cache.org) • Squid (http://www.squid-cache.org) • nginx (https://www.nginx.com)
  32. 32. nginx acts as an intermediary for the client's requests: location / { proxy_pass http://otherserver; }
  33. 33. nginx cache server
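  34. 34. caching! proxy_cache_path /path/to/cache keys_zone=cache:10m;
 location / { proxy_pass http://otherserver; proxy_cache cache; } (keys_zone names the zone that proxy_cache refers to; the 10m size is an example value)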
  35. 35. 😄 nginx cache server
  36. 36. The proxy caches content relying only on the application's HTTP headers
  37. 37. t1 t2 t1 request at a different time
  38. 38. 😓 Three similar requests arriving at the same time (t1, t1, t1) all reach the application: x3
  39. 39. proxy_cache_lock on; proxy_cache_lock_timeout 180; allows the proxy to pass only the first of several similar requests to the application at a time
  40. 40. 😄 Similar requests arriving at the same time (t1, t1) reach the application only once: x1
  41. 41. All the other clients wait and receive the response once the first request returns
  42. 42. What happens in case of failure? 🤒
  43. 43. proxy_cache_use_stale timeout error http_500; allows expired content to be delivered in case of failure
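The cache lock idea can be illustrated inside a single Ruby process. This is a sketch of the technique, not of nginx internals: `LockedCache` is a hypothetical class, entries never expire, and the per-key mutex plays the role of the lock that holds back the waiting requests.

```ruby
# Only the first caller for a key runs the expensive block;
# concurrent callers for the same key wait and reuse its result.
class LockedCache
  def initialize
    @store = {}
    @locks = {}
    @guard = Mutex.new # protects the table of per-key locks
  end

  def fetch(key)
    lock = @guard.synchronize { @locks[key] ||= Mutex.new }
    lock.synchronize do
      # Simplification: relies on MRI's global lock for Hash safety across keys.
      @store.fetch(key) { @store[key] = yield }
    end
  end
end

cache = LockedCache.new
origin_calls = 0

threads = 3.times.map do
  Thread.new do
    cache.fetch("/main.css") do
      origin_calls += 1 # stands in for a request to the application
      sleep 0.05
      "body { margin: 0 }"
    end
  end
end

results = threads.map(&:value)
```

All three threads end up with the same body, while the block that stands in for the application runs a single time.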
  44. 44. 😄 🤒
  45. 45. The proxy can improve the fault tolerance of the application
  46. 46. proxy_cache_use_stale updating; delivers expired content to subsequent similar requests while the cached entry is being updated
  47. 47. App caching
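Serving stale content on failure can be sketched as follows. This is an illustration of the idea in plain Ruby under stated assumptions, not a model of nginx: `StaleOnErrorCache` is a hypothetical class, and any `StandardError` raised by the block counts as a failure of the origin.

```ruby
class StaleOnErrorCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(ttl:)
    @ttl = ttl # lifetime of an entry in seconds
    @entries = {}
  end

  def fetch(key, now: Time.now)
    entry = @entries[key]
    return entry.value if entry && now < entry.expires_at

    begin
      value = yield # ask the origin for fresh content
      @entries[key] = Entry.new(value, now + @ttl)
      value
    rescue StandardError
      raise if entry.nil? # nothing cached: the failure reaches the client
      entry.value         # expired but usable: serve the stale copy
    end
  end
end

cache = StaleOnErrorCache.new(ttl: 60)
t0 = Time.at(0)

first = cache.fetch("/main.css", now: t0) { "v1" }
# Two minutes later the entry has expired and the origin is failing.
stale = cache.fetch("/main.css", now: t0 + 120) { raise IOError, "origin down" }
```

The client still gets the last known content during the outage; only requests for content that was never cached see the error.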
  48. 48. Caching on the app reduces the time of specific operations
  49. 49. What could be cached? • Complex computations • Data shared across requests
  50. 50. def price; @price ||= Price.new(unit_price, category); end
  51. 51. Memoization stores the results to avoid future calculations
  52. 52. A global memoization stays in memory for the entire lifetime of the application process
  53. 53. $price = Price.new(unit_price, category) 😓
  54. 54. ActiveSupport::Cache::Store: cache.fetch("cat#{cat_id}", expires_in: 1.minute) do CategoryTax.new(cat_id) end http://api.rubyonrails.org/classes/ActiveSupport/Cache/Store.html
  55. 55. The Rails caching API lets you store data and reuse it across requests for a set amount of time
  56. 56. https://github.com/ptarjan/node-cache/issues/77 😓
  57. 57. 🔥
  58. 58. Rails Memory Caching wisely prunes the cached data when it exceeds the allotted memory size
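Expiry and pruning can be illustrated with a small self-contained cache. This is not the Rails implementation: `BoundedCache` is a hypothetical class that limits the number of entries rather than the memory size, and it drops the least recently used entry when the limit is exceeded.

```ruby
class BoundedCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(max_entries:)
    @max_entries = max_entries
    @entries = {} # Ruby hashes keep insertion order, used here as recency order
  end

  def fetch(key, expires_in:, now: Time.now)
    entry = @entries.delete(key)
    if entry.nil? || now >= entry.expires_at
      entry = Entry.new(yield, now + expires_in) # miss or expired: recompute
    end
    @entries[key] = entry # re-insert as the most recently used entry
    @entries.shift while @entries.size > @max_entries # prune the oldest entries
    entry.value
  end

  def key?(key)
    @entries.key?(key)
  end
end

cache = BoundedCache.new(max_entries: 2)
t0 = Time.at(0)

cache.fetch("a", expires_in: 60, now: t0) { "tax A" }
cache.fetch("b", expires_in: 60, now: t0) { "tax B" }
cache.fetch("a", expires_in: 60, now: t0) { "never computed" } # hit, "a" becomes most recent
cache.fetch("c", expires_in: 60, now: t0) { "tax C" }          # exceeds the limit, "b" is pruned
```

Without a bound of this kind, an in-memory cache keeps growing for as long as the process runs.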
  59. 59. One way to scale an application is to add more application instances (horizontal scaling) 😓
  60. 60. Load balancer → App instances
  61. 61. Never assume that anything cached in memory or on disk will be available on a future request https://12factor.net/processes
  62. 62. Any storage can be used to share cached data between instances, but key-value stores are the most common
  63. 63. Key-value stores • Redis (http://redis.io) • Memcached (http://memcached.org)
  64. 64. These tools have different policies for pruning cached data
  65. 65. Cache stores can also be made fault tolerant through replication and persistence http://redis.io/topics/persistence
 http://redis.io/topics/replication
 http://redis.io/topics/sentinel
  66. 66. A race condition happens when different application instances fetch the same uncached data at the same time 😓
  67. 67. expired data! 😁
  68. 68. cache.fetch(key, race_condition_ttl: 10.seconds) do
 heavy_db_computation
 end http://api.rubyonrails.org/classes/ActiveSupport/Cache/Store.html ActiveSupport::Cache::Store
  69. 69. 😁 😳
  70. 70. It is difficult to completely eliminate cache-update race conditions with multiple application instances
  71. 71. One solution is to update the cached data outside the application flow and only consume cached data in the application
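Keeping the update outside the request flow can be sketched like this. Everything here is a hypothetical stand-in: `SharedStore` replaces what would be Redis or Memcached in a multi-instance deployment, `refresh_totals` represents a scheduled job, and `heavy_db_computation` is a placeholder for the expensive work.

```ruby
# Placeholder for an expensive query or computation.
def heavy_db_computation
  (1..1_000).sum
end

# In-process stand-in for a shared key-value store.
class SharedStore
  def initialize
    @data = {}
    @mutex = Mutex.new
  end

  def write(key, value)
    @mutex.synchronize { @data[key] = value }
  end

  def read(key)
    @mutex.synchronize { @data[key] }
  end
end

# Runs outside the request flow, for example from a scheduled job.
def refresh_totals(store)
  store.write("totals", heavy_db_computation)
end

# The request handler only reads; it never triggers the computation.
def handle_request(store)
  store.read("totals") || "not ready yet"
end

store = SharedStore.new
before_refresh = handle_request(store)
refresh_totals(store)
after_refresh = handle_request(store)
```

Because only one writer ever computes the value, application instances cannot race each other to rebuild it. The trade-off is that requests may see a placeholder or slightly old data between refreshes.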
  72. 72. 1. Obey the HTTP headers
  73. 73. 2. Caching is important
  74. 74. 3. Measure the hits and misses of the caching strategy
  75. 75. 4. Carefully evaluate the caching strategy options
  76. 76. Thanks! @jcemer
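Measuring hits and misses can start with a thin wrapper around the cache. This is a minimal sketch: `MeasuredCache` is a hypothetical class, and a real setup would more likely read these counters from the statistics that the cache server or store already exposes.

```ruby
class MeasuredCache
  attr_reader :hits, :misses

  def initialize
    @store = {}
    @hits = 0
    @misses = 0
  end

  def fetch(key)
    if @store.key?(key)
      @hits += 1
      @store[key]
    else
      @misses += 1
      @store[key] = yield
    end
  end

  # Fraction of lookups served from the cache (0.0 when nothing was looked up).
  def hit_ratio
    total = @hits + @misses
    total.zero? ? 0.0 : @hits.to_f / total
  end
end

cache = MeasuredCache.new
3.times { cache.fetch("cat1") { "tax for category 1" } }
cache.fetch("cat2") { "tax for category 2" }
```

A low hit ratio suggests that the keys are too specific or that entries expire before they are reused.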