Caching Up and Down
the Stack
Long Island/Queens Django Meetup 5/20/14
Hi, I’m Dan Kuebrich
● Software engineer, Python fan
● Web performance geek
● Founder of Tracelytics, now part of AppNeta
...
DJANGO
What is “caching”?
● Caching is avoiding doing expensive work
o by doing cheaper work
● Common examples?
o On repeat visit...
What is “caching”?
Uncached
Client
Data Source
What is “caching”?
Client
Data Source
Uncached Cached
Cache Intermediary
Client
Data Source
What is “caching”?
Client
Data Source
Uncached Cached
Cache Intermediary
Client
Data Source
Fast!
Slow...
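The cached vs. uncached picture above can be sketched in a few lines. This is only an illustration: `SlowDataSource` and `CacheIntermediary` are hypothetical names, and the 10 ms sleep stands in for whatever expensive work the real data source does.

```python
import time


class SlowDataSource:
    """Hypothetical stand-in for a database or remote service."""

    def __init__(self):
        self.calls = 0

    def fetch(self, key):
        self.calls += 1
        time.sleep(0.01)  # simulate the expensive work
        return f"value-for-{key}"


class CacheIntermediary:
    """Answers from memory when it can, asks the data source otherwise."""

    def __init__(self, source):
        self.source = source
        self.store = {}

    def fetch(self, key):
        if key in self.store:            # hit: the fast path
            return self.store[key]
        value = self.source.fetch(key)   # miss: the slow path
        self.store[key] = value
        return value


source = SlowDataSource()
cached = CacheIntermediary(source)
first = cached.fetch("home")    # goes to the data source
second = cached.fetch("home")   # served from the cache
```

The second call never reaches the data source, which is the whole point: the cheap dictionary lookup replaces the expensive fetch.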
“Latency Numbers Every Programmer Should Know”
Systems Performance: Enterprise and the Cloud by Brendan Gregg
http://books...
A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
o Full-page
o Fragment
o Object cac...
Caching in Django apps: Frontend
● Client-side assets
● Full pages
Client-side assets
Client-side assets
Client-side assets
● Use HTTP caches!
o Browser
o CDN
o Intermediate proxies
● Set policy with cache headers
o Cache-Contr...
HTTP Cache-Control and Expires
● Stop the browser from even asking for it
● Expires
o Pick a date in the future, good til ...
HTTP Cache-Control and Expires
dan@JLTM21:~$ curl -I https://login.tv.appneta.com/cache/tl-layouts_base_unauth-compiled-1...
HTTP Cache Control in Django
https://docs.djangoproject.com/en/dev/topics/cache/
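In Django, the `cache_control` decorator in `django.views.decorators.cache` sets this header per view. To show what such a policy amounts to on the wire, here is a framework-free sketch using only the standard library. The helper name `cache_headers` is made up for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Build matching Cache-Control and Expires headers (illustrative helper)."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # HTTP/1.1 clients use max-age, a lifetime in seconds
        "Cache-Control": f"max-age={max_age_seconds}",
        # Older clients fall back to an absolute date in HTTP date format
        "Expires": format_datetime(expires, usegmt=True),
    }


example = cache_headers(
    3600, now=datetime(2014, 5, 20, 23, 12, 16, tzinfo=timezone.utc)
)
```

Sending both headers is a common belt-and-braces choice; when both are present, HTTP/1.1 caches prefer `max-age`.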
ETag + Last-Modified
ETag + Last-Modified
dan@JLTM21:~$ curl -I www.appneta.com/stylesheets/styles.css
HTTP/1.1 200 OK
Last-Modified: Tue, 20 M...
ETag + Last-Modified
dan@JLTM21:~$ curl -I www.appneta.com/stylesheets/styles.css --header 'If-None-Match: "30854c-1c3d3-...
ETag vs Last-Modified
● Last-Modified is date-based
● ETag is content-based
● Most webservers generate both
● Some webserv...
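A content-based ETag can be sketched as a hash of the response body, which sidesteps the local-state problem because every server computes the same tag for the same bytes. This is a simplified illustration, not how any particular webserver does it: `make_etag` and `respond` are hypothetical names, and real `If-None-Match` handling also covers tag lists and weak validators.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content alone."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""   # client copy is still good, send no body
    return 200, {"ETag": etag}, body


css = b"body { color: #333; }"
status1, headers1, body1 = respond(css, {})
status2, headers2, body2 = respond(css, {"If-None-Match": headers1["ETag"]})
status3, headers3, body3 = respond(css + b" ", {"If-None-Match": headers1["ETag"]})
```

The first request gets the full body, the revalidation gets an empty 304, and a changed file produces a new tag and a fresh 200.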
A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
o Full-page
o Fragment
o Object cac...
CDNs
● Put content closer to your end-users
o and offload HTTP requests from
your servers
● Best for static assets
● Same ...
Full-page caching
Client
Data Source
Varnish
No internet
standards
necessary!
Full-page caching: mod_pagespeed
Client
Data Source
mod_pagespeed
● Dynamically rewrites
pages with frontend
optimizations...
A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
o Full-page
o Fragment
o Object cac...
Full-page caching in Django
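Django offers this through the `cache_page(timeout)` view decorator in `django.views.decorators.cache`. As a rough, framework-free picture of the idea (not Django's implementation), the sketch below stores rendered HTML per path with a time-to-live. `PageCache`, `render`, and the injected fake clock are all assumptions made for the example.

```python
import time


class PageCache:
    """Keeps rendered pages per path until their time-to-live runs out."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.pages = {}  # path -> (expires_at, html)

    def get_or_render(self, path, render):
        entry = self.pages.get(path)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]                       # still fresh, skip rendering
        html = render(path)                       # missing or expired
        self.pages[path] = (now + self.ttl, html)
        return html


renders = []


def render(path):
    """Hypothetical expensive view; records each real render."""
    renders.append(path)
    return f"<html><body>{path} rendered {len(renders)}</body></html>"


fake_now = [0.0]  # controllable clock so the example does not need to sleep
cache = PageCache(ttl_seconds=60, clock=lambda: fake_now[0])
a = cache.get_or_render("/about/", render)
b = cache.get_or_render("/about/", render)   # served from cache
fake_now[0] = 61.0
c = cache.get_or_render("/about/", render)   # expired, rendered again
```

The timeout is the only invalidation here, so a page can be up to 60 seconds stale. That trade is usually fine for pages that look the same to every visitor.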
Wait, where is this getting cached?
● Django makes it easy to configure
o In-memory
o File-based
o Memcached
o etc.
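Where the cache lives is chosen with the `CACHES` setting. A sketch of a settings fragment covering the three options listed above follows; the alias names other than `default`, the file path, the memcached address, and the timeout are placeholders, and `PyMemcacheCache` assumes a recent Django release (older releases shipped a differently named memcached backend).

```python
# settings.py fragment (illustrative values only)
CACHES = {
    # In-memory: per-process, simplest to set up, not shared between workers
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "TIMEOUT": 300,  # seconds
    },
    # File-based: survives restarts, shared by processes on one machine
    "files": {
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        "LOCATION": "/var/tmp/django_cache",  # placeholder directory
    },
    # Memcached: shared across machines
    "memcached": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",  # placeholder address
    },
}
```

Swapping backends is a configuration change only, so application code that talks to the cache does not need to change.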
Full-page caching: dynamic pages?
Full-page caching: dynamic pages?
Fragment caching
Full-page caching: dynamic pages?
Full-page caching: the AJAX solution
Object caching
def get_item_by_id(key):
# Look up the item in our database
return session.query(User)
.filter_by(id=key)
....
Object caching
def get_item_by_id(key):
# Check in cache
val = mc.get(key)
# If exists, return it
if val:
return val
# If ...
Object caching
@decorator
def cache(expensive_func, key):
# Check in cache
val = mc.get(key)
# If exists, return it
if val...
Object caching
@cache
def get_item_by_id(key):
# Look up the item in our database
return session.query(User)
.filter_by(id...
Object caching in Django
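Django exposes the same get/set pattern through its low-level cache API, `django.core.cache.cache`. The sketch below shows the decorator idea in plain Python so it runs without a cache server: `FakeMemcache` is a hypothetical in-process stand-in for a memcached client, and `cached` is one possible shape for the decorator, not the one from the slides.

```python
import functools


class FakeMemcache:
    """In-process stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


mc = FakeMemcache()


def cached(func):
    """Cache func's result per argument, namespaced by function name."""

    @functools.wraps(func)
    def wrapper(key):
        cache_key = f"{func.__name__}:{key}"
        val = mc.get(cache_key)
        if val is not None:        # hit
            return val
        val = func(key)            # miss: do the expensive work once
        mc.set(cache_key, val)
        return val

    return wrapper


lookups = []


@cached
def get_item_by_id(key):
    lookups.append(key)  # records real lookups, standing in for a database query
    return {"id": key, "name": f"user-{key}"}


first = get_item_by_id(7)
second = get_item_by_id(7)
```

Prefixing the key with the function name keeps two cached functions from colliding on the same id. Note that a function that legitimately returns `None` would be recomputed every time with this check.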
A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
o Full-page
o Fragment
o Object cac...
Query caching
Client
Actual tables
Database
Query
Cache
Cached?
Query caching
mysql> select SQL_CACHE count(*) from traces;
+----------+
| count(*) |
+----------+
| 3135623 |
+----------...
Query caching
Query caching
Uncached
Cached
Denormalization
mysql> select table1.x, table2.y from table1 join table2 on table1.z = table2.q
where table1.z > 100;
mysq...
A whole mess of caching:
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
o Full-page
o Fragment
o Object cac...
Caching: what can go wrong?
● Invalidation
● Fragmentation
● Stampedes
● Complexity
Invalidation
Client
Data Source
Cache Intermediary
Update!
Write
Invalidate
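The write-then-invalidate flow in the diagram can be sketched as follows. The class names are hypothetical, and the last two lines show the failure mode: a write that bypasses the cache leaves a stale entry behind.

```python
class DataSource:
    """Hypothetical backing store."""

    def __init__(self):
        self.rows = {}

    def read(self, key):
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class InvalidatingCache:
    """Caches reads and drops the cached entry on every write it sees."""

    def __init__(self, source):
        self.source = source
        self.store = {}

    def read(self, key):
        if key not in self.store:
            self.store[key] = self.source.read(key)
        return self.store[key]

    def write(self, key, value):
        self.source.write(key, value)   # write to the data source
        self.store.pop(key, None)       # invalidate the cached copy


db = DataSource()
cache = InvalidatingCache(db)
cache.write("title", "v1")
before = cache.read("title")        # caches "v1"
cache.write("title", "v2")          # update goes through the cache
after = cache.read("title")         # sees "v2"
db.write("title", "v3")             # update bypasses the cache
stale = cache.read("title")         # still "v2"
```

Invalidation only works if every writer goes through it, which is why it gets harder the further the cache sits from the data.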
Invalidation on page-scale
● Browser cache
● CDN
● Proxy / optimizer
● Application-based
o Full-page
o Fragment
o Object c...
Fragmentation
● What if I have a lot of different things to
cache?
o More misses
o Potential cache eviction
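A small experiment makes the miss and eviction effect concrete. It uses the standard library's `functools.lru_cache` as the cache, and the access pattern and sizes are invented for illustration: three pages are requested in rotation, first with room for all three and then with room for only two.

```python
from functools import lru_cache


def hit_rate(cache_size, accesses):
    """Replay an access pattern against an LRU cache and report its hit rate."""

    @lru_cache(maxsize=cache_size)
    def load(key):
        return f"page-{key}"

    for key in accesses:
        load(key)
    info = load.cache_info()
    return info.hits / (info.hits + info.misses)


accesses = [1, 2, 3] * 10           # three distinct pages, visited in rotation
roomy = hit_rate(3, accesses)       # everything fits
cramped = hit_rate(2, accesses)     # one slot short
```

With one slot too few, least-recently-used eviction always throws out the page that is needed next, so the hit rate drops from 90% to zero rather than degrading gently.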
Fragmentation
Your pages / objects
Frequency of Access
Fragmentation
Your pages / objects
Frequency of Access
Stampedes
● On a cache miss extra work is done
● The result is stored in the cache
● What if multiple simultaneous misses?
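One common answer is to let only a single caller recompute a missing key while the others wait for its result. The sketch below does that with a per-key lock inside one process. `StampedeSafeCache` is a hypothetical name, and a multi-server setup would need a shared lock or a similar mechanism instead.

```python
import threading
import time


class StampedeSafeCache:
    """Only one thread computes a missing key; the rest wait and reuse it."""

    def __init__(self):
        self._values = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table

    def get_or_compute(self, key, compute):
        if key in self._values:
            return self._values[key]
        with self._guard:
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have filled it while we waited
            if key not in self._values:
                self._values[key] = compute(key)
            return self._values[key]


calls = []


def expensive(key):
    calls.append(key)
    time.sleep(0.05)  # simulate slow work so the threads overlap
    return f"result-{key}"


cache = StampedeSafeCache()
results = []
threads = [
    threading.Thread(
        target=lambda: results.append(cache.get_or_compute("profile", expensive))
    )
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight simultaneous misses produce one call to `expensive` instead of eight, at the cost of the waiting threads being blocked for the duration of that one call.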
Stampedes
http://allthingsd.com/20080521/stampede-facebook-opens-its-profile-doors/
Complexity
● How much caching do I need, and where?
● What is the invalidation process
o on data update? on release?
● Wha...
Takeaways
● The ‘how’ of caching:
o What are you caching?
o Where are you caching it?
o How bad is a cache miss?
o How and...
Takeaways
● The ‘why’ of caching:
o Did it actually get faster?
o Is speed worth extra complexity?
o Don’t guess – measure...
Questions?
?
Thanks!
● Interested in measuring your Django app’s
performance?
o Free trial of TraceView:
www.appneta.com/products/trace...
Resources
● Django documentation on caching: https://docs.djangoproject.com/en/dev/topics/cache/
● Varnish caching, via Di...
Whether you're looking to make your web app run faster or scale better, one great way to achieve both is to simply do less work. How? By using caches, the data hidey-holes which generations of engineers have thoughtfully left at key junctures in computing infrastructure, from your CPU to the backbone of the internet. Requests into web applications, which span great distances and often involve expensive frontend and backend lifting, are great candidates for caching of all types. We'll discuss the benefits and tradeoffs of caching at different layers of the stack and how to find low-hanging cacheable fruit, with a particular focus on server-side improvements.

Transcript of "Caching Up and Down the Stack"

  1. 1. Caching Up and Down the Stack Long Island/Queens Django Meetup 5/20/14
  2. 2. Hi, I’m Dan Kuebrich ● Software engineer, Python fan ● Web performance geek ● Founder of Tracelytics, now part of AppNeta ● Once (and future?) Queens resident
  3. 3. DJANGO
  4. 4. What is “caching”? ● Caching is avoiding doing expensive work o by doing cheaper work ● Common examples? o On repeat visits, your browser doesn’t download images that haven’t changed o Your CPU caches instructions, data so it doesn’t have to go to RAM… or to disk!
  5. 5. What is “caching”? Uncached Client Data Source
  6. 6. What is “caching”? Client Data Source Uncached Cached Cache Intermediary Client Data Source
  7. 7. What is “caching”? Client Data Source Uncached Cached Cache Intermediary Client Data Source Fast! Slow...
  8. 8. “Latency Numbers Every Programmer Should Know” Systems Performance: Enterprise and the Cloud by Brendan Gregg http://books.google.com/books?id=xQdvAQAAQBAJ&pg=PA20&lpg=PA20&source=bl&ots=hlTgyxdrnR&sig=CCjddHrY1H6muMVW9BFcbdO7DDo&hl=en&sa=X&ei=dS7oUquhOYr9oAT9oYGoDw&ved=0CCkQ6AEwAA#v=onepage&q&f=false
  9. 9. A whole mess of caching: ● Browser cache ● CDN ● Proxy / optimizer ● Application-based o Full-page o Fragment o Object cache ● Database o Query cache o Denormalization Closer to the user Closer to the data
  10. 10. Caching in Django apps: Frontend ● Client-side assets ● Full pages
  11. 11. Client-side assets
  12. 12. Client-side assets
  13. 13. Client-side assets ● Use HTTP caches! o Browser o CDN o Intermediate proxies ● Set policy with cache headers o Cache-Control / Expires o ETag / Last-Modified
  14. 14. HTTP Cache-Control and Expires ● Stop the browser from even asking for it ● Expires o Pick a date in the future, good til then ● Cache-Control o More flexible o Introduced in HTTP/1.1 o Use this one
  15. 15. HTTP Cache-Control and Expires
        dan@JLTM21:~$ curl -I https://login.tv.appneta.com/cache/tl-layouts_base_unauth-compiled-162c2ceecd9a7ff1e65ab460c2b99852a49f5a43.css
        HTTP/1.1 200 OK
        Accept-Ranges: bytes
        Cache-Control: max-age=315360000
        Content-length: 5955
        Content-Type: text/css
        Date: Tue, 20 May 2014 23:12:16 GMT
        Expires: Thu, 31 Dec 2037 23:55:55 GMT
        Last-Modified: Fri, 16 May 2014 20:51:19 GMT
        Server: nginx
        Connection: keep-alive
  16. 16. HTTP Cache Control in Django https://docs.djangoproject.com/en/dev/topics/cache/
  17. 17. ETag + Last-Modified
  18. 18. ETag + Last-Modified
        dan@JLTM21:~$ curl -I www.appneta.com/stylesheets/styles.css
        HTTP/1.1 200 OK
        Last-Modified: Tue, 20 May 2014 05:52:50 GMT
        ETag: "30854c-1c3d3-4f9ce7d715080"
        Vary: Accept-Encoding
        Content-Type: text/css
        ...
  19. 19. ETag + Last-Modified
        dan@JLTM21:~$ curl -I www.appneta.com/stylesheets/styles.css --header 'If-None-Match: "30854c-1c3d3-4f9ce7d715080"'
        HTTP/1.1 304 Not Modified
        Last-Modified: Tue, 20 May 2014 05:52:50 GMT
        ETag: "30854c-1c3d3-4f9ce7d715080"
        Vary: Accept-Encoding
        Content-Type: text/css
        Date: Tue, 20 May 2014 23:21:12 GMT
        ...
  20. 20. ETag vs Last-Modified ● Last-Modified is date-based ● ETag is content-based ● Most webservers generate both ● Some webservers (Apache) generate etags that depend on local state o If you have a load-balanced pool of servers working here, they might not be using the same etags!
  21. 21. A whole mess of caching: ● Browser cache ● CDN ● Proxy / optimizer ● Application-based o Full-page o Fragment o Object cache ● Database o Query cache o Denormalization
  22. 22. CDNs ● Put content closer to your end-users o and offload HTTP requests from your servers ● Best for static assets ● Same cache control policies apply
  23. 23. Full-page caching Client Data Source Varnish No internet standards necessary!
  24. 24. Full-page caching: mod_pagespeed Client Data Source mod_pagespeed ● Dynamically rewrites pages with frontend optimizations ● Caches rewritten pages
  25. 25. A whole mess of caching: ● Browser cache ● CDN ● Proxy / optimizer ● Application-based o Full-page o Fragment o Object cache ● Database o Query cache o Denormalization
  26. 26. Full-page caching in Django
  27. 27. Wait, where is this getting cached? ● Django makes it easy to configure o In-memory o File-based o Memcached o etc.
  28. 28. Full-page caching: dynamic pages?
  29. 29. Full-page caching: dynamic pages?
  30. 30. Fragment caching
  31. 31. Full-page caching: dynamic pages?
  32. 32. Full-page caching: the AJAX solution
  33. 33. Object caching
        def get_item_by_id(key):
            # Look up the item in our database
            return (session.query(User)
                    .filter_by(id=key)
                    .first())
  34. 34. Object caching
        def get_item_by_id(key):
            # Check in cache
            val = mc.get(key)
            # If it exists, return it
            if val is not None:
                return val
            # If not, get the val and store it in the cache
            val = (session.query(User)
                   .filter_by(id=key)
                   .first())
            mc.set(key, val)
            return val
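  35. 35. Object caching
        # Assumption: @decorator comes from the third-party "decorator" package,
        # whose wrapped callers receive (func, *args)
        from decorator import decorator

        @decorator
        def cache(expensive_func, key):
            # Check in cache
            val = mc.get(key)
            # If it exists, return it
            if val is not None:
                return val
            # If not, get the val and store it in the cache
            val = expensive_func(key)
            mc.set(key, val)
            return val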
  36. 36. Object caching
        @cache
        def get_item_by_id(key):
            # Look up the item in our database
            return (session.query(User)
                    .filter_by(id=key)
                    .first())
  37. 37. Object caching in Django
  38. 38. A whole mess of caching: ● Browser cache ● CDN ● Proxy / optimizer ● Application-based o Full-page o Fragment o Object cache ● Database o Query cache o Denormalization
  39. 39. Query caching Client Actual tables Database Query Cache Cached?
  40. 40. Query caching
        mysql> select SQL_CACHE count(*) from traces;
        +----------+
        | count(*) |
        +----------+
        |  3135623 |
        +----------+
        1 row in set (0.56 sec)

        mysql> select SQL_CACHE count(*) from traces;
        +----------+
        | count(*) |
        +----------+
        |  3135623 |
        +----------+
        1 row in set (0.00 sec)
  41. 41. Query caching
  42. 42. Query caching Uncached Cached
  43. 43. Denormalization
        mysql> select table1.x, table2.y from table1 join table2 on table1.z = table2.q where table1.z > 100;
        mysql> select table1.x, table1.y from table1 where table1.z > 100;
  44. 44. A whole mess of caching: ● Browser cache ● CDN ● Proxy / optimizer ● Application-based o Full-page o Fragment o Object cache ● Database o Query cache o Denormalization
  45. 45. Caching: what can go wrong? ● Invalidation ● Fragmentation ● Stampedes ● Complexity
  46. 46. Invalidation Client Data Source Cache Intermediary Update! Write Invalidate
  47. 47. Invalidation on page-scale ● Browser cache ● CDN ● Proxy / optimizer ● Application-based o Full-page o Fragment o Object cache ● Database o Query cache o Denormalization More savings, generally more invalidation... Smaller savings, generally less invalidation
  48. 48. Fragmentation ● What if I have a lot of different things to cache? o More misses o Potential cache eviction
  49. 49. Fragmentation Your pages / objects Frequency of Access
  50. 50. Fragmentation Your pages / objects Frequency of Access
  51. 51. Stampedes ● On a cache miss extra work is done ● The result is stored in the cache ● What if multiple simultaneous misses?
  52. 52. Stampedes http://allthingsd.com/20080521/stampede-facebook-opens-its-profile-doors/
  53. 53. Complexity ● How much caching do I need, and where? ● What is the invalidation process o on data update? on release? ● What happens if the caches fall over? ● How do I debug it?
  54. 54. Takeaways ● The ‘how’ of caching: o What are you caching? o Where are you caching it? o How bad is a cache miss? o How and when are you invalidating?
  55. 55. Takeaways ● The ‘why’ of caching: o Did it actually get faster? o Is speed worth extra complexity? o Don’t guess – measure! o Always use real-world conditions.
  56. 56. Questions? ?
  57. 57. Thanks! ● Interested in measuring your Django app’s performance? o Free trial of TraceView: www.appneta.com/products/traceview ● See you at Velocity NYC this fall? ● Twitter: @appneta / @dankosaur
  58. 58. Resources
        ● Django documentation on caching: https://docs.djangoproject.com/en/dev/topics/cache/
        ● Varnish caching, via Disqus: http://blog.disqus.com/post/62187806135/scaling-django-to-8-billion-page-views
        ● Django cache option comparisons: http://codysoyland.com/2010/jan/17/evaluating-django-caching-options/
        ● More Django-specific tips: http://www.slideshare.net/csky/where-django-caching-bust-at-the-seams
        ● Guide to cache-related HTTP headers: http://www.mobify.com/blog/beginners-guide-to-http-cache-headers/
        ● Google PageSpeed: https://developers.google.com/speed/pagespeed/module