Under the Covers with the Web


Walks through the basics of the HTTP protocol, URLs, cookies and caching, with tricks and tips that can be used by web developers. From a Geek.class I did on Oct 6, 2011 for Meet the Geeks.

  • Comic: How a Web Browser Works http://www.labnol.org/internet/comic-how-browser-works/18086/
  • RFC 3986: Uniform Resource Identifier (URI): Generic Syntax – Current Standard http://tools.ietf.org/html/rfc3986 RFC 1738: Uniform Resource Locators http://tools.ietf.org/html/rfc1738 RFC 1630: Universal Resource Identifiers in WWW http://tools.ietf.org/html/rfc1630
  • Maximum URL Lengths http://www.boutell.com/newfaq/misc/urllength.html
  • Wikipedia: SPDY Protocol http://en.wikipedia.org/wiki/SPDY SPDY: An experimental protocol for a faster web http://dev.chromium.org/spdy/spdy-whitepaper
  • Wikipedia: Domain Name System http://en.wikipedia.org/wiki/Domain_Name_System
  • Wikipedia: Transmission Control Protocol (TCP) http://en.wikipedia.org/wiki/Transmission_Control_Protocol
  • RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1 http://tools.ietf.org/html/rfc2616 RFC 1945: Hypertext Transfer Protocol -- HTTP/1.0 http://tools.ietf.org/html/rfc1945
  • PHP Charset FAQ http://kore-nordmann.de/blog/php_charset_encoding_FAQ.html PHP: Character Sets / Character Encoding Issues http://www.phpwact.org/php/i18n/charsets RFC 6265: HTTP State Management Mechanism http://tools.ietf.org/html/rfc6265
  • How To Optimize Your Site With GZIP Compression http://betterexplained.com/articles/how-to-optimize-your-site-with-gzip-compression/
  • Request Processing in Apache http://www.apachetutor.org/dev/request
  • PHP Equivalents for ASP Objects http://phplens.com/phpeverywhere/node/view/32
  • RFC 6265: HTTP State Management Mechanism http://tools.ietf.org/html/rfc6265
  • Caching Tutorial http://www.mnot.net/cache_docs/
  • Under the Covers with the Web

    1. 1. Under the Covers with the Web by Trevor Lohrbeer @LabEscape labescape.com @FastFedora fastfedora.com
    2. 2. Simple Request Walkthrough <ul><li>Enter URL in browser </li></ul><ul><li>Browser sends request to server </li></ul><ul><li>Server sends response to browser </li></ul><ul><li>Browser renders page </li></ul>
    3. 3. Making the Request <ul><li>Parse URL </li></ul><ul><li>Resolve domain to IP address </li></ul><ul><li>Open TCP/IP connection to server </li></ul><ul><li>Use HTTP to send a request </li></ul>
    4. 4. URL: Uniform Resource Locators <ul><li>About URLs: </li></ul><ul><ul><li>Not web-specific. Internet standard. </li></ul></ul><ul><ul><li>Defined by Request For Comments (RFCs) </li></ul></ul><ul><ul><li>Current Standard is RFC 3986 </li></ul></ul><ul><ul><li>Unlike URIs, URLs provide an access mechanism </li></ul></ul>
    5. 5. URL Format <ul><li>Format: <protocol>://<user>:<password>@<host>:<port>/<url-path> </li></ul><ul><li>Examples: http://myserver.com/some/search.php?query=help#ch5 http://admin:password@myserver.com:8000/some/path.php/info#ch5 http://[2001:4860:0:2001::68]/ </li></ul><ul><li>Parsing – Regular Expression ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))? </li></ul><ul><li>Where: scheme = $2, authority = $4, path = $5, query = $7, fragment = $9 </li></ul><ul><li>Details </li></ul><ul><ul><li>Valid characters: a-z A-Z 0-9 - . _ ~ </li></ul></ul><ul><ul><li>Special characters for HTTP: / ? & = , + # </li></ul></ul><ul><ul><li>Recommend % encoding other characters (eg: %20 for a space) </li></ul></ul><ul><ul><li>Avoid URLs longer than 2,000 characters </li></ul></ul>
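As a rough illustration of the parsing rules on this slide, the sketch below splits a URL two ways: with the regular expression from RFC 3986 and with Python's standard `urllib.parse.urlsplit`. The helper names `split_with_regex` and `split_with_stdlib` are made up for this example and are not part of the original talk.

```python
import re
from urllib.parse import urlsplit

# Regular expression from RFC 3986, Appendix B. Every group is optional,
# so the pattern matches any string and never returns None.
URI_PATTERN = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_with_regex(url):
    """Return the five generic URI components using the RFC 3986 regex."""
    match = URI_PATTERN.match(url)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


def split_with_stdlib(url):
    """Return the same components, plus user, host and port, via urlsplit."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
        "username": parts.username,
        "hostname": parts.hostname,
        "port": parts.port,
    }
```

The regex only separates the five top-level components; breaking the authority into user, password, host and port is left to `urlsplit` here.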
    6. 6. Parsing an HTTP URL <ul><li><protocol>://<host>:<port>/<path>?<query>#<fragment> </li></ul><ul><li>Protocol Either http or https </li></ul><ul><li>Host & Port (Authority) Location of resource (default port 80 for http, 443 for https) </li></ul><ul><li>Path Hierarchical string to resource </li></ul><ul><li>Query Parameters to pass to resource </li></ul><ul><li>Fragment Identifies subset of resource </li></ul>
    7. 7. Protocols <ul><li>HTTP Original web protocol. Currently version 1.1. </li></ul><ul><li>HTTPS HTTP tunneled within an SSL/TLS connection on port 443 </li></ul><ul><li>S-HTTP Older secure protocol – used port 80 </li></ul><ul><li>SPDY Faster version of HTTP used by Google Chrome </li></ul>
    8. 8. Hosts <ul><li>Hosts can be: </li></ul><ul><ul><li>DNS Names: localhost, www.fastfedora.com </li></ul></ul><ul><ul><li>WINS (NetBIOS) Names: WEBSERVER </li></ul></ul><ul><ul><li>IPv4 Addresses: </li></ul></ul><ul><ul><li>IPv6 Addresses: [2001:0db8:85a3::8a2e:0370:7334] </li></ul></ul><ul><ul><li>Other “Registered” Name </li></ul></ul><ul><li>Tricks: </li></ul><ul><ul><li>Hardcode name to IP address in hosts / lmhosts Windows: C:\WINDOWS\system32\drivers\etc Linux: /etc/hosts </li></ul></ul><ul><ul><li>On Windows, use “nbtstat -R” to purge and reload the NetBIOS name cache </li></ul></ul>
    9. 9. Paths <ul><li>Hierarchical, separating path components by / </li></ul><ul><li>Can be empty (eg: http://example.com?id=5) </li></ul><ul><li>Start with first / </li></ul><ul><li>End with ?, # or end of URL </li></ul><ul><li>Absolute paths start with leading / </li></ul><ul><li>Relative paths start with . or .. to refer to current location or parent location, respectively </li></ul><ul><li>For application server, does not have to end with the application end-point, can have “path info” which extends past resource, eg: /book/search.php/My%20Book/Paperback </li></ul>
    10. 10. Queries <ul><li>Can be any text, eg: ?myQuery </li></ul><ul><li>Or uses name=value pairs separated by &, eg: ?query=myQuery </li></ul><ul><li>Names do not have to be unique, eg: ?stock=MSFT&stock=SUN is valid </li></ul><ul><li>Multiple values can be comma-separated for some application engines, eg: ?stock=MSFT,SUN </li></ul><ul><li>When using ampersands: </li></ul><ul><ul><li>Use &amp; when using the URL in HTML / XML </li></ul></ul><ul><ul><li>Use & when using the URL elsewhere </li></ul></ul><ul><li>Search engines used to skip URLs with query strings; they index them now, but still not as well as plain paths </li></ul><ul><li>Always encode =, & and # in any names or values </li></ul>
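The query rules above can be tried out with Python's standard library. This is a minimal sketch; the function names `parse_query`, `build_query` and `query_for_html` are illustrative only.

```python
from html import escape
from urllib.parse import parse_qs, urlencode


def parse_query(query):
    """Parse name=value pairs; repeated names collect into a list."""
    return parse_qs(query, keep_blank_values=True)


def build_query(pairs):
    """Encode (name, value) pairs, percent-encoding =, & and # in values."""
    return urlencode(pairs)


def query_for_html(url):
    """Escape a URL for use inside HTML or XML, turning & into &amp;."""
    return escape(url)
```

For example, `parse_query("stock=MSFT&stock=SUN")` keeps both values under the single name `stock`, and `build_query([("q", "a=b&c#d")])` encodes the reserved characters so they cannot be mistaken for separators.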
    11. 11. Fragments <ul><li>Refers to a subset or view of web page </li></ul><ul><li>Not indexed by search engines </li></ul><ul><li>Used to reference anchors in web pages, eg: #chapter2 links to <a name="chapter2"></a> </li></ul><ul><li>Used by AJAX to store state using JavaScript without refreshing the browser page – helps support bookmarks & browser history </li></ul><ul><ul><li>Obsoleted by pushState in HTML 5 </li></ul></ul><ul><li>Used to increase SEO by canonicalizing URLs </li></ul>
    12. 12. Resolve Domain to IP address <ul><li>DNS </li></ul><ul><ul><li>Resolves names to IP addresses & vice versa </li></ul></ul><ul><ul><li>Not a 1-to-1 mapping between IPs & names </li></ul></ul><ul><ul><li>Results cached at many levels: </li></ul></ul><ul><ul><ul><li>Application (web browser, e-mail) </li></ul></ul></ul><ul><ul><ul><li>Local OS resolver </li></ul></ul></ul><ul><ul><ul><li>DNS server </li></ul></ul></ul><ul><ul><ul><li>Authoritative nameserver </li></ul></ul></ul><ul><li>When moving hosting: </li></ul><ul><ul><li>Reduce time-to-live (TTL) on DNS record to 1 day </li></ul></ul><ul><ul><li>Wait until old TTL expires (usually 1 week) </li></ul></ul><ul><ul><li>Move hosting to new IP address </li></ul></ul><ul><ul><li>Increase TTL back to 1 week </li></ul></ul>
    13. 13. Open TCP/IP Connection <ul><li>About TCP </li></ul><ul><ul><li>Sits atop Internet Protocol (IP) </li></ul></ul><ul><ul><li>Provides reliable connection </li></ul></ul><ul><ul><li>Optimized for accuracy, not timeliness </li></ul></ul><ul><ul><li>Takes time to establish </li></ul></ul><ul><ul><li>Uses resources on the server & client to maintain </li></ul></ul><ul><ul><li>Can use a telnet client to establish </li></ul></ul><ul><li>Connection </li></ul><ul><ul><li>Requires a three-way handshake </li></ul></ul><ul><ul><ul><li>Client SYN, Server SYN-ACK, Client ACK </li></ul></ul></ul><ul><ul><li>Uses an IP address + port for each endpoint (client & server) </li></ul></ul><ul><ul><li>Server port generally 80, client port generally > 1024 </li></ul></ul>
    14. 14. The HTTP Protocol <ul><li>About HTTP </li></ul><ul><ul><li>Plain-text format for sending & retrieving web content from a server </li></ul></ul><ul><ul><li>Defined by RFC 2616 (v1.1) and RFC 1945 (v1.0) </li></ul></ul><ul><li>Response Format </li></ul><ul><ul><li>Status Line </li></ul></ul><ul><ul><li>Headers </li></ul></ul><ul><ul><li>Blank Line </li></ul></ul><ul><ul><li>Response Body </li></ul></ul><ul><li>Request Format </li></ul><ul><ul><li>Request Line </li></ul></ul><ul><ul><li>Headers </li></ul></ul><ul><ul><li>Blank Line </li></ul></ul><ul><ul><li>Optional Data </li></ul></ul>
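To make the response format concrete, here is a small sketch that splits raw response bytes into the status line, headers and body. The function name `parse_response` is hypothetical, and the parser is deliberately simplified: it assumes a well-formed status line with a reason phrase and keeps only the last value of any repeated header.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into status line, headers and body."""
    # The blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line, eg: "HTTP/1.1 200 OK" (assumes a reason phrase is present).
    version, status, reason = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()

    return {
        "version": version,
        "status": int(status),
        "reason": reason,
        "headers": headers,
        "body": body,
    }
```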
    15. 15. Debugging HTTP <ul><li>Browser Plugins </li></ul><ul><ul><li>Firefox: LiveHeaders </li></ul></ul><ul><ul><li>Chrome: Go to chrome://net-internals/#events & filter by URL_REQUEST </li></ul></ul><ul><ul><li>Internet Explorer: IEWatch, ieHTTPHeaders </li></ul></ul><ul><li>Web Debugging Proxies </li></ul><ul><ul><li>Fiddler </li></ul></ul><ul><ul><li>Charles </li></ul></ul><ul><li>Network Packet Analyzer </li></ul><ul><ul><li>Wireshark </li></ul></ul><ul><li>Telnet Client </li></ul><ul><li>Custom Code </li></ul>
    16. 16. HTTP Request Methods <ul><li>Read-Only Methods </li></ul><ul><ul><li>GET </li></ul></ul><ul><ul><li>HEAD </li></ul></ul><ul><ul><li>OPTIONS </li></ul></ul><ul><ul><li>TRACE </li></ul></ul><ul><li>Idempotent Methods </li></ul><ul><ul><li>PUT </li></ul></ul><ul><ul><li>DELETE </li></ul></ul><ul><li>Write Methods </li></ul><ul><ul><li>POST </li></ul></ul>
    17. 17. Common Request Headers <ul><li>Host Name of the host the request is being sent to </li></ul><ul><li>Cache-Control / Pragma Determines caching behavior </li></ul><ul><li>Accept / Accept-Charset / Accept-Encoding / Accept-Language Which MIME types, character sets, encodings (compression techniques) and languages to use in the response </li></ul><ul><li>If-Modified-Since / If-None-Match Only send full response if the resource has been modified since this date or no longer matches this entity tag </li></ul><ul><li>Referer The URL that pointed to this URL </li></ul><ul><li>User-Agent The identifier for the browser or program accessing the URL. </li></ul><ul><li>Cookie Sends any cookies previously set to server </li></ul>
    18. 18. Basic GET Request <ul><li>GET /path/myPage.php?id=15 HTTP/1.1 </li></ul><ul><li>Host: www.mysite.com </li></ul>
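A request this small can be typed by hand into a telnet client, or sent from code. The sketch below builds the same kind of minimal GET request and sends it over a plain TCP socket; `build_get_request` and `fetch` are names invented for this example, and the added `Connection: close` header is an assumption made so the server closes the connection once the response is complete.

```python
import socket


def build_get_request(host, path="/"):
    """Build a minimal HTTP/1.1 GET request as bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",           # Host is required in HTTP/1.1
        "Connection: close",       # assumption: ask the server to close when done
        "",                        # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host, path="/", port=80, timeout=10.0):
    """Send the request over TCP and return the raw response bytes."""
    request = build_get_request(host, path)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:          # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```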
    19. 19. Typical GET Request <ul><li>GET / HTTP/1.1 </li></ul><ul><li>Host: www.yahoo.com </li></ul><ul><li>User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 </li></ul><ul><li>Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 </li></ul><ul><li>Accept-Language: en-us,en;q=0.5 </li></ul><ul><li>Accept-Encoding: gzip, deflate </li></ul><ul><li>Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 </li></ul><ul><li>Keep-Alive: 115 </li></ul><ul><li>Connection: keep-alive </li></ul><ul><li>Cookie: B=5uh7jf9466q3e&b=4&d=zrFdfgYnwtpYEIRQwQyo3Yh8W.6mk7zDmSWeQ--&s=fs&i=Yq8FH2OqgDl7KB7VK8; YLS==1&p=0&n=0; s_vsn_yahoogroupsygprod_1=687374583019; FPS=dl; F=a=e92100wTFmxh9hOe34539Z6Yp2uybfZ.8pX0oke5pYRVFcz4dYgR74..vuqgLRABfa5K8IfW3tJLSP3LyikNIAO2234sAesM7spi4di.8BaE&b=vZaW; ALP=bTowJmw6ZW5fVVMm; F=a=e92100wTF; Qur=RAB234fa5K8IfW3tJLSP3L; yikNI23AOwVsAesM7spi4di.8BaE&b=vZaW; ALP=bTowJmw2346ZW5fVVMm; </li></ul>
    20. 20. Basic Request Notes <ul><li>Determine locale using Accept-Language </li></ul><ul><ul><li>PHP: Locale::acceptFromHttp() </li></ul></ul><ul><ul><li>ASP.Net: Request.UserLanguages </li></ul></ul><ul><ul><li>J2EE: ServletRequest.getLocales() </li></ul></ul><ul><li>Use separate domain for static content to avoid overhead of sending cookies </li></ul><ul><ul><li>For instance: static.example.com </li></ul></ul><ul><ul><li>Or use separate domain entirely: example-images.com </li></ul></ul><ul><ul><li>When setting cookies, use a specific domain, eg: www.example.com not example.com </li></ul></ul><ul><li>Turn on gzip compression </li></ul><ul><ul><li>Always for uncompressed static files </li></ul></ul><ul><ul><li>Never for compressed static files (images, PDFs) </li></ul></ul><ul><ul><li>Sometimes for dynamic responses </li></ul></ul>
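The framework calls listed above all do roughly the same job: read Accept-Language, order the tags by their q-values and pick the best supported match. A simplified Python sketch of that idea follows; `parse_accept_language` and `pick_locale` are hypothetical helpers, and the matching is only exact-tag or primary-subtag, not full RFC language-range matching.

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value (q=0 is dropped)."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by quality first, then by original position for ties.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_locale(header, supported, default):
    """Pick the first supported locale matching the client's preferences."""
    by_lower = {name.lower(): name for name in supported}
    for tag in parse_accept_language(header):
        tag = tag.lower()
        primary = tag.split("-")[0]
        if tag in by_lower:
            return by_lower[tag]
        if primary in by_lower:
            return by_lower[primary]
    return default
```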
    21. 21. Typical POST Request <ul><li>POST /data-generator/DataGenerator/Projects/Edit.aspx?ID=1 HTTP/1.1 </li></ul><ul><li>Host: localhost </li></ul><ul><li>User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 </li></ul><ul><li>Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 </li></ul><ul><li>Accept-Language: en-us,en;q=0.5 </li></ul><ul><li>Accept-Encoding: gzip, deflate </li></ul><ul><li>Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 </li></ul><ul><li>Keep-Alive: 115 </li></ul><ul><li>Connection: keep-alive </li></ul><ul><li>Referer: http://localhost/data-generator/DataGenerator/Projects/Edit.aspx?ID=1 </li></ul><ul><li>Content-Type: application/x-www-form-urlencoded </li></ul><ul><li>Content-Length: 411 </li></ul><ul><li>__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=%2FwEPDwUKMTE3MDA1NzgzNWRkD7Ixn245wh8y%2BUk486haBrWD82I%3D&__EVENTVALIDATION=%2FwEWBgKe67rCCwLCqe1mAubp%2BvAPAtn1w6cBArrGt4UFAsjVgtcCuHxmC3pWm2jBT5ZlGoNznlygJIk%3D&ctl00%24MainContent%24ID=1&ctl00%24MainContent%24Name=Project+Portfolio+Management&ctl00%24MainContent%24Description=A+general+project+portfolio+management+data+set.&ctl00%24MainContent%24SaveButton=Save </li></ul>
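The body of a form POST like the one above is just the form fields URL-encoded, with Content-Length giving the body size in bytes. A minimal sketch of assembling such a request, using a hypothetical `build_form_post` helper:

```python
from urllib.parse import urlencode


def build_form_post(host, path, fields):
    """Build a form-encoded POST request (headers plus body) as bytes."""
    # urlencode turns spaces into + and percent-encodes characters such as $.
    body = urlencode(fields).encode("ascii")
    head = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",  # length of the body in bytes
        "",                              # blank line ends the header section
        "",
    ]
    return "\r\n".join(head).encode("ascii") + body
```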
    22. 22. XML POST Request <ul><li>POST /data-generator/DataGenerator/Projects/Edit.aspx?ID=1 HTTP/1.1 </li></ul><ul><li>Host: localhost </li></ul><ul><li>User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 </li></ul><ul><li>Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 </li></ul><ul><li>Accept-Language: en-us,en;q=0.5 </li></ul><ul><li>Accept-Encoding: gzip, deflate </li></ul><ul><li>Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 </li></ul><ul><li>Keep-Alive: 115 </li></ul><ul><li>Connection: keep-alive </li></ul><ul><li>Content-Type: text/xml </li></ul><ul><li>Content-Length: 60 </li></ul><ul><li><xmlRoot> </li></ul><ul><li><sample message="This is a test" /> </li></ul><ul><li></xmlRoot> </li></ul>
    23. 23. More Request Notes <ul><li>POST requests rarely, if ever, cached </li></ul><ul><ul><li>Some clients refuse to cache POST requests even if headers are sent </li></ul></ul><ul><li>GET requests better for caching </li></ul><ul><ul><li>Some proxies won’t cache GET requests if they contain cookies though </li></ul></ul><ul><li>For AJAX, don’t send content, just the URL with query parameters (eg: GET or POST with no data) </li></ul><ul><li>Any content can be sent in GET or POST requests, just specify a content-type </li></ul>
    24. 24. Server Processes Request <ul><li>Parses input stream </li></ul><ul><li>Creates request & response objects </li></ul><ul><li>Sends request through pipeline </li></ul><ul><li>Maps to content handler </li></ul><ul><ul><li>Requests handled internally or through CGI, ISAPI or other module </li></ul></ul><ul><ul><li>IIS 7 uses integrated pipeline for all requests </li></ul></ul><ul><li>Receives content from handler (stream or fixed) </li></ul><ul><li>Sends response to client </li></ul><ul><li>Logs request </li></ul>
    25. 25. Create Request & Response Objects <ul><li>PHP: $_SERVER, $_REQUEST, $_GET, $_POST, $_COOKIE, $_FILES, $_SESSION, $_ENV, $HTTP_RAW_POST_DATA </li></ul><ul><li>ASP.Net: Request (.Params, .ServerVariables, .QueryString, .Form, .Cookies, .Files, .Headers, .BinaryRead()), Application, Session, Response </li></ul><ul><li>J2EE: HttpServletRequest (.getParameterMap(), .getQueryString(), .getCookies(), .getSession(), .getHeaderNames(), .getInputStream()), HttpServletResponse </li></ul>
    26. 26. Common Response Headers <ul><li>Server The identifier for the server sending the response </li></ul><ul><li>Date The date of this response as represented by the server </li></ul><ul><li>Cache-Control How this response should be cached </li></ul><ul><li>Last-Modified The date the resource was last modified. Used for caching </li></ul><ul><li>ETag The entity tag. Uniquely identifies a version of the resource </li></ul><ul><li>Expires The date when this response expires, or 0 to expire immediately </li></ul><ul><li>Content-Type The MIME type of the content being sent. May include the character set </li></ul><ul><li>Content-Length The length of the content being sent. Optional & may not be sent </li></ul><ul><li>Location Redirect the client to this new URL. Can be a permanent or temporary redirect </li></ul><ul><li>Set-Cookie Sets a cookie on the client </li></ul>
    27. 27. Sample Response <ul><li>HTTP/1.1 200 OK </li></ul><ul><li>Server: Microsoft-IIS/5.1 </li></ul><ul><li>Date: Thu, 06 Oct 2011 15:43:53 GMT </li></ul><ul><li>X-Powered-By: ASP.NET </li></ul><ul><li>X-AspNet-Version: 2.0.50727 </li></ul><ul><li>Cache-Control: private </li></ul><ul><li>Content-Type: text/html; charset=utf-8 </li></ul><ul><li>Content-Length: 8313 </li></ul><ul><li><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> </li></ul><ul><li><html> </li></ul><ul><li>… </li></ul><ul><li></html> </li></ul>
    28. 28. Cookies <ul><li>About Cookies </li></ul><ul><li>Defined by RFC 6265 </li></ul><ul><li>Used to provide state to HTTP </li></ul><ul><li>Often used to track session IDs </li></ul><ul><li>Multiple Set-Cookie headers can be sent in a response </li></ul><ul><li>Can be sent with any response code, though can be ignored when sending 1xx response codes </li></ul><ul><li>Not supposed to affect caching, but may </li></ul><ul><li>Sending a Cookie </li></ul><ul><ul><li>Set-Cookie: <name>=<value>; <attribute>; <attribute> </li></ul></ul><ul><ul><li>Set-Cookie: <name2>=<value2>; <attribute> </li></ul></ul><ul><li>Receiving a Cookie </li></ul><ul><ul><li>Cookie: <name>=<value>; <name2>=<value2> </li></ul></ul>
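The two header shapes on this slide can be reproduced with Python's standard `http.cookies` module. In this sketch, `parse_cookie_header` and `set_cookie_lines` are illustrative names; the first reads a request's Cookie header, the second emits one Set-Cookie line per cookie for a response.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header):
    """Parse a request Cookie header into a {name: value} dictionary."""
    jar = SimpleCookie()
    jar.load(header)
    return {name: morsel.value for name, morsel in jar.items()}


def set_cookie_lines(cookies):
    """Build one Set-Cookie response header line per cookie."""
    jar = SimpleCookie()
    for name, value in cookies.items():
        jar[name] = value
    # Each cookie needs its own Set-Cookie header in the response.
    return [f"Set-Cookie: {morsel.OutputString()}" for morsel in jar.values()]
```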
    29. 29. Cookie Attributes <ul><li>Expires </li></ul><ul><li>The date upon which the cookie expires. Cookies may be expunged before this date by the client </li></ul><ul><li>Max-Age </li></ul><ul><li>The number of seconds until the cookie expires. Max-Age overrides the Expires attribute </li></ul><ul><li>Domain </li></ul><ul><li>The hosts for which the cookie will be sent. Must be second-level or greater & include the originating server in definition </li></ul><ul><li>Path </li></ul><ul><li>The path for which the cookie will be sent. Defaults to path of requested page </li></ul><ul><li>Secure </li></ul><ul><li>Only transmit the cookie for a secure protocol, eg: https </li></ul><ul><li>HttpOnly </li></ul><ul><li>Prevent the cookie from being accessible to scripts, applets & plugins </li></ul>
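As a sketch of how these attributes end up in a Set-Cookie value, the example below sets Domain, Path, Max-Age, Secure and HttpOnly on a single cookie using `http.cookies`. The helper name `make_session_cookie` and the choice of `/` as the path are assumptions made for illustration.

```python
from http.cookies import SimpleCookie


def make_session_cookie(name, value, domain, max_age):
    """Return a Set-Cookie value with the common security attributes set."""
    jar = SimpleCookie()
    jar[name] = value
    morsel = jar[name]
    morsel["domain"] = domain    # hosts the cookie will be sent to
    morsel["path"] = "/"         # assumption: cookie applies to the whole site
    morsel["max-age"] = max_age  # lifetime in seconds; overrides Expires
    morsel["secure"] = True      # only send over a secure protocol such as https
    morsel["httponly"] = True    # hide the cookie from scripts
    return morsel.OutputString()
```

The attribute order in the resulting string is decided by the library and carries no meaning.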
    30. 30. Cookie Issues <ul><li>Cookies get sent in clear text. Use HTTPS with Secure flag to make them secure </li></ul><ul><li>If not using HttpOnly, cookies can be read and sent by JavaScript </li></ul><ul><li>Cookies can be cached by proxy servers, allowing users to access others’ sessions </li></ul><ul><li>Separation of authentication from destination allows an attacker to redirect a user to a web site and let the browser send the right cookie </li></ul><ul><li>When using wildcard domains, cookies can be overwritten by applications using another sub-domain </li></ul>
    31. 31. Disable HttpOnly for Session Cookies <ul><li>ASP.Net </li></ul><ul><li>Add to Global.asax: </li></ul><ul><li>protected void Application_EndRequest(object s, EventArgs e) { </li></ul><ul><li>if (Response.Cookies.Count > 0) { </li></ul><ul><li>foreach (string name in Response.Cookies.AllKeys) { </li></ul><ul><li>if (name == FormsAuthentication.FormsCookieName || </li></ul><ul><li>name.ToLower() == "asp.net_sessionid") { </li></ul><ul><li>Response.Cookies[name].HttpOnly = false; </li></ul><ul><li>} </li></ul><ul><li>} </li></ul><ul><li>} </li></ul><ul><li>} </li></ul><ul><li>J2EE </li></ul><ul><li>Add to web.xml: </li></ul><ul><li><session-config> </li></ul><ul><li><cookie-config> </li></ul><ul><li><http-only>true</http-only> </li></ul><ul><li></cookie-config> </li></ul><ul><li></session-config> </li></ul>
    32. 32. Caching <ul><li>Who caches? </li></ul><ul><ul><li>Browsers </li></ul></ul><ul><ul><li>Plug-ins </li></ul></ul><ul><ul><li>Proxies </li></ul></ul><ul><ul><li>Gateways </li></ul></ul><ul><ul><li>Reverse Proxies </li></ul></ul><ul><li>Why cache? </li></ul><ul><ul><li>Reduce latency </li></ul></ul><ul><ul><li>Reduce network traffic </li></ul></ul><ul><ul><li>Reduce server load </li></ul></ul>
    33. 33. Routing: A Straight Connection
    34. 34. Routing: Web Proxy
    35. 35. Routing: Web Proxy Revalidates
    36. 36. Routing: Reverse Proxy
    37. 37. Basics of Caching <ul><li>Key Questions </li></ul><ul><ul><li>Should I cache this? </li></ul></ul><ul><ul><li>Where should I cache this? </li></ul></ul><ul><ul><li>How long should I cache it for? </li></ul></ul><ul><ul><li>Can I check to see if it’s still valid? </li></ul></ul><ul><ul><li>When should I get rid of it? </li></ul></ul><ul><li>Concepts </li></ul><ul><ul><li>Directives </li></ul></ul><ul><ul><li>Freshness </li></ul></ul><ul><ul><li>Validation </li></ul></ul><ul><ul><li>Invalidation </li></ul></ul>
    38. 38. Should I Cache? <ul><li>Caching enabled by default </li></ul><ul><ul><li>Status codes 200, 203, 206, 300, 301, 410 </li></ul></ul><ul><ul><li>Not enabled if Authorization header present </li></ul></ul><ul><li>Cache-Control controls caching in HTTP 1.1 </li></ul><ul><ul><li>public, private, no-cache </li></ul></ul><ul><ul><li>no-store, no-transform </li></ul></ul><ul><ul><li>max-age, min-fresh, max-stale, s-maxage </li></ul></ul><ul><ul><li>only-if-cached </li></ul></ul><ul><ul><li>must-revalidate, proxy-revalidate </li></ul></ul><ul><li>Pragma controls caching in HTTP 1.0 </li></ul><ul><ul><li>no-cache </li></ul></ul><ul><li>HTML META element generally only works for browsers </li></ul>
    39. 39. How long to cache for? <ul><li>Last-Modified </li></ul><ul><ul><li>Indicates the date a resource was last modified </li></ul></ul><ul><li>Expires </li></ul><ul><ul><li>Indicates the date when the response expires. Implies the response is cacheable, unless overridden by a Cache-Control header. The max-age Cache-Control directive overrides this value when determining how long to cache </li></ul></ul><ul><li>Age header </li></ul><ul><ul><li>Indicates how long a response has been in the cache. Sent by a web cache when sending a response </li></ul></ul>
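The precedence described on this slide (max-age wins over Expires) can be sketched as a small freshness calculation. The function names `freshness_lifetime` and `is_fresh` are invented here, header names are assumed to be passed in their canonical capitalization, and heuristics based on Last-Modified are left out.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return how many seconds a response stays fresh, or None if unknown."""
    # The max-age directive overrides Expires when both are present.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            try:
                return max(int(value.strip('"')), 0)
            except ValueError:
                break  # malformed max-age: fall back to Expires

    expires = headers.get("Expires")
    date = headers.get("Date")
    if expires and date:
        try:
            delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        except (TypeError, ValueError):
            # An invalid date such as "0" means the response already expired.
            return 0
        return max(int(delta.total_seconds()), 0)
    return None


def is_fresh(headers, age_seconds):
    """Compare the response's age (as in the Age header) with its lifetime."""
    lifetime = freshness_lifetime(headers)
    return lifetime is not None and age_seconds < lifetime
```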
    40. 40. Revalidating Caches <ul><li>ETag </li></ul><ul><li>If-Match </li></ul><ul><li>If-None-Match </li></ul><ul><li>If-Modified-Since </li></ul><ul><li>If-Unmodified-Since </li></ul>
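A sketch of how revalidation with ETag and If-None-Match might look on the server side: if the client's cached entity tag still matches, the server answers 304 Not Modified with no body. The helpers `make_etag` and `conditional_get` are hypothetical, the tag is simply a truncated hash of the body, and weak validators (`W/"..."`) are not handled.

```python
import hashlib


def make_etag(body):
    """Derive a quoted entity tag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(request_headers, body):
    """Return (status, headers, body), using 304 when the client copy is current."""
    etag = make_etag(body)
    header = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in header.split(",")]

    if "*" in candidates or etag in candidates:
        # The cached copy is still valid: send headers only, no body.
        return 304, {"ETag": etag}, b""

    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body
```

A client would store the ETag from the first 200 response and send it back in If-None-Match on the next request for the same URL.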
    41. 41. Other Tricks <ul><li>Content-Disposition </li></ul><ul><ul><li>Force a Save As dialog to appear </li></ul></ul><ul><ul><li>Content-Transfer-Encoding: binary </li></ul></ul><ul><ul><li>Content-Disposition: attachment; filename=data.csv </li></ul></ul><ul><li>Server Push </li></ul><ul><ul><li>Use multipart/x-mixed-replace MIME type when pipelining </li></ul></ul><ul><ul><li>Not supported by Internet Explorer </li></ul></ul>
    42. 42. Thanks for attending! Slides: blog.fastfedora.com Trevor Lohrbeer @LabEscape labescape.com @FastFedora fastfedora.com