Using HTTP Link: Header for
Gateway Cache Invalidation

  Mike Kelly <mike@mykanjo.co.uk>
  Michael Hausenblas <michael.ha...
What is a gateway cache?

"reverse proxy cache"
A layer between all clients and destination server




Objective:
Minimize...
How do they work?

They can leverage the 3 principal caching mechanisms:


• Expiration
• Validation
• Invalidation


HTTP...
Expiration-based caching

< 200 OK
< Content-Type: text/html
< Cache-Control: public, s-maxage=600
< ....

Pros:
+ Simple
...
Validation-based caching

< 200 OK
< ETag: "686897696a7c876b7e"
> GET /example
> If-None-Match: "686897696a7c876b7e"
< 304...
Expiration+Validation caching

< 200 OK
< ETag: "686897696a7c876b7e"
< Cache-Control: public, s-maxage=600

Pros:
+ Expira...
Invalidation-based caching

             - Responses fresh until invalidated

             (by non-safe requests)

       ...
How is this possible?


Product of adhering to constraints of REST, particularly:


  Uniform Interface
    + Self-descrip...
Invalidation-based caching

Pros:
+ Caches have self-control
+ "Best case" efficiency
+ Ensured freshness*


Cons:
- Only ...
Cache invalidation in practice

Two main problems for cache invalidation arise from
pragmatism and trade-offs in resource ...
Composite Problem

Perfect World:
<collection>
   <item rel="item" href="/items/123" />
   <item rel="item" href="/items/a...
Composite Problem

What effect would the following interaction have on the
composite collection it belongs to?




> PUT /...
The Split-resource Problem

Given /document resource with representations:

/document.html
/document.xml
/document.json


...
What's the Problem?
.. The Solution

Beef up the uniform interface:

Express these common types of resource dependency as
control data using L...
LHIC-I

Express dependency in response to an invalidating request

> PUT /composite-collection/item123

< 200 OK
< Link: <...
LHIC-II

Express dependencies in initial cacheable responses
> GET /document.html
< 200 OK
< Link: </document>;
< rel="htt...
Comparison

LHIC-I
+ More dynamic control of invalidation

- DoS risk
- Invalidation does not cascade

LHIC-II
+ No DoS ri...
Conclusion

LHIC injects lost visibility. Resulting mechanism:

+ Very efficient
+ Ensures freshness
+ Easily managed
+ Le...
Considerations

Resource state altered outside of uniform interface
- Don't do that
- Reintroduce expiration and validatio...
Upcoming SlideShare
Loading in …5
×

Link Header-based Invalidation of Caches

3,210 views

Published on

Gateway caches are intermediary components for reducing demands on destination servers, and therefore operational costs of a system. At scale, particularly with the advent of on-demand infrastructures such as EC2, etc., maximising cache efficiency translates into cost efficiency, resulting in a competitive advantage. In this position paper, we initially discuss advantages and limitations of HTTP caching mechanisms. We then propose to use HTTP Link: headers to maximise the efficiency of gateway (or reverse proxy) caching mechanisms and discuss early findings.

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
3,210
On SlideShare
0
From Embeds
0
Number of Embeds
1,271
Actions
Shares
0
Downloads
14
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Link Header-based Invalidation of Caches

  1. 1. Using HTTP Link: Header for Gateway Cache Invalidation Mike Kelly <mike@mykanjo.co.uk> Michael Hausenblas <michael.hausenblas@deri.org>
  2. 2. What is a gateway cache? "reverse proxy cache" A layer between all clients and destination server Objective: Minimize demand on destination server Not so concerned with reducing bandwith
  3. 3. How do they work? They can leverage the 3 principal caching mechanisms: • Expiration • Validation • Invalidation HTTP has mechanisms for each of these
  4. 4. Expiration-based caching < 200 OK < Content-Type: text/html < Cache-Control: public, s-maxage=600 < .... Pros: + Simple + No contact with server until expiration Cons: - Inefficient - Difficult to manage
  5. 5. Validation-based caching < 200 OK < ETag: "686897696a7c876b7e" > GET /example > If-None-Match: "686897696a7c876b7e" < 304 Not Modified Pros: + Reduces bandwidth + Ensures freshness Cons: - Server handling every request - Generating 304 still costs processing and I/O
  6. 6. Expiration+Validation caching < 200 OK < ETag: "686897696a7c876b7e" < Cache-Control: public, s-maxage=600 Pros: + Expiration reduces contact with server + Validation reduces bandwidth Cons: - "Worst case" inefficiency - Still managing caching rules
  7. 7. Invalidation-based caching - Responses fresh until invalidated (by non-safe requests) In HTTP: PUT POST PATCH DELETE (PURGE?)
  8. 8. How is this possible? Product of adhering to constraints of REST, particularly: Uniform Interface + Self-descriptive messages Intermediaries can make assertions about client-server interactions.
  9. 9. Invalidation-based caching Pros: + Caches have self-control + "Best case" efficiency + Ensured freshness* Cons: - Only reliable for gateway caches - Impractical* * (sort of)
  10. 10. Cache invalidation in practice Two main problems for cache invalidation arise from pragmatism and trade-offs in resource granularity and identification: • The "Composite Problem" • The "Split-resource Problem"
  11. 11. Composite Problem Perfect World: <collection> <item rel="item" href="/items/123" /> <item rel="item" href="/items/asdf" /> <item rel="item" href="/items/foobar" /> </collection> Real World: <collection> <item rel="item" href="/items/123"> <title>Item 123</title> <content>Content for item 123 - an example of embedded state</content> </item> <item rel="item" href="/items/asdf"> <title>Item asdf</title> <content>This state is also embedded</content> </item> <item rel="item" href="/items/foobar"> <title>FooBar</title> <content>Yet more embedded state!! :(</content> </item> </collection>
  12. 12. Composite Problem What effect would the following interaction have on the composite collection it belongs to? > PUT /composite-collection/item123 < 200 OK
  13. 13. The Split-resource Problem Given /document resource with representations: /document.html /document.xml /document.json When a client does this: PUT /document Then invalidation of each representation is invisible to intermediaries
  14. 14. What's the Problem?
  15. 15. .. The Solution Beef up the uniform interface: Express these common types of resource dependency as control data using Link header and standard link relations This increases: - Self-descriptiveness of messages - Visibility "Link Header-based Invalidation of Caches" (LHIC)
  16. 16. LHIC-I Express dependency in response to an invalidating request > PUT /composite-collection/item123 < 200 OK < Link: </composite-collection>; < rel="http://example.org/rels/dependant"
  17. 17. LHIC-II Express dependencies in initial cacheable responses > GET /document.html < 200 OK < Link: </document>; < rel="http://example.org/rels/dependsOn" > GET /document.xml < 200 OK < Link: </document>; < rel="http://example.org/rels/dependsOn" > GET /document.json < 200 OK < Link: </document>; < rel="http://example.org/rels/dependsOn" > PUT /document < 200 OK
  18. 18. Comparison LHIC-I + More dynamic control of invalidation - DoS risk - Invalidation does not cascade LHIC-II + No DoS risk + Cascading invalidation - Complexity
  19. 19. Conclusion LHIC injects lost visibility. Resulting mechanism: + Very efficient + Ensures freshness + Easily managed + Leverages existing specs - Only for gateway caching + Combine Invalidation (gateway) & Validation (client)
  20. 20. Considerations Resource state altered outside of uniform interface - Don't do that - Reintroduce expiration and validation Peering - Further research Size limits for HTTP headers

×