24. Caching in the Web (diagram labels: User Agent, Origin Server, Proxies, Gateways; C marks a cache)
25. Caching in the Web (diagram labels: User Agent, Origin Server, Proxies, Gateways; C marks a cache), with sample response headers:
...
Server: Apache
ETag: "85a1b765e8c01dbf872651d7a5"
Content-Type: text/html
Cache-Control: max-age=3600
...
26. Real World of the Web (diagram labels: Client, Cache, Router, Firewall, ISP, Proxy Server, Firewall, Web Server, Resources; Firewall, Web Server, Reverse Proxy, Resources)
44. Cache Example 1 http://tomayko.com/writings/things-caches-do
58. From Open API to Next Generation Open Business Model
Editor's Notes
Any simple sample:
REST is not a standard. You will not see the W3C putting out a REST specification. You will not see IBM or Microsoft or Sun selling a REST developer's toolkit. REST is just a design pattern. You can't bottle up a pattern. You can only understand it and design your Web services to it. REST does prescribe the use of standards: HTTP; URL; XML, HTML, GIF, JPEG, etc. (resource representations); text/xml, text/html, image/gif, image/jpeg, etc. (resource types, MIME types).
Principles of REST: Resource-centric approach. All relevant resources are addressable via URIs. Uniform access via HTTP: GET, POST, PUT, DELETE. Content type negotiation allows retrieving alternative representations from the same URI. REST-style services are easy to access from code running in web browsers, from any other client, or from servers. A service can serve multiple representations of the same resource.
03/04/09 Get to know your HTTP statuses. Do not add semantics that they were not intended for.
If you follow these semantics through, algorithmic resources are the ones that at first seem to force you to introduce a new verb or dump REST. In actuality, using REST for a Façade makes you think about auditing right from the beginning. Consider a Transfer. If you make the Transfer process a resource, then you have to follow the HTTP semantics. POST executes the Transfer, and GET returns records of transfers. DELETE and PUT can have effects on transfers that are in progress.
A Resource can be anything from a piece of data in a database to an algorithm. You need to determine the type of resource you have in your design. This will help you formulate the correct URI patterns.
Simple resources are easy. You can have a resource like instructions. You can filter the instructions down further, into chapters. This provides a more complex type with specialized views.
A Collection is made up of one or more members. An example of a collection (container) is a BookShelf, which has Books as members. You may choose to have an HTTP GET to the URI /bookshelf serve up a list of books. In most cases, though, a container type is named after its members, so /book or /books could also be valid URIs for collections of books. Ordered collections are ordered by index. When you access an ordered collection without any modifiers, its members are ordered by id. Ordered collections can be sparse: there is no guarantee that a particular index entry is present, only that the entries that are present are in order. When a collection is not sparse, you are guaranteed to have all the indexes (e.g., 1, 2, 3, 4, 5). A sparse collection allows gaps (e.g., 1, 2, 5, 7, 8). Collections can also be unordered. In this case, there is no guarantee of order when you access the collection without any modifiers. For example, my bookshelf at home has my books in no particular order.
Several development groups had a notion of a Filter; others didn't have anything (but passed filters as query parameters). If you are going to do this in your organization, you should standardize on a common approach. Here's one example of a common approach. Sometimes your implementation platform will make the decision for you. For example, a JPA-based back end may allow you to pass a JPA expression, and ADO for .NET provides a REST layer for its back end.
Paging an ordered collection by member works well if the keys are integers that are not sparse. Most of the time, however, the client will not be able to specify exact ranges. Start and count (or end) works well on any type of collection. For ordered collections, the start query parameter must correspond to the id. For unordered collections, it can be mapped to any field or any algorithm you specify. For example, you may use it to set the cursor position in an arbitrary result set of a query. HTTP Range headers may be tempting, but these work on bytes and are used by network layers to chunk data optimally.
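As a rough illustration of start-and-count paging over a sparse ordered collection, here is a minimal Python sketch. The function names, the default page size of 10, and the sample bookshelf are assumptions made for the example; only the `start` and `count` query parameters come from the note above.

```python
from urllib.parse import parse_qs


def page_by_id(members, start, count):
    """Return up to `count` (id, item) pairs whose id is >= `start`.

    `members` maps integer ids to items; ids may be sparse.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    ids = sorted(i for i in members if i >= start)
    return [(i, members[i]) for i in ids[:count]]


def page_from_query(members, query):
    """Read the start and count query parameters (defaults are assumptions)."""
    params = parse_qs(query)
    start = int(params.get("start", ["0"])[0])
    count = int(params.get("count", ["10"])[0])
    return page_by_id(members, start, count)


# Hypothetical sparse collection: ids 3, 4 and 6 are missing.
shelf = {1: "Dune", 2: "Emma", 5: "Ulysses", 7: "Walden", 8: "Beloved"}
page = page_from_query(shelf, "start=2&count=3")
print(page)
```

Because the page is defined by a starting id and a count rather than an id range, the gaps in the collection do not matter to the client.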
You can pass a comma-separated list of columns to sort on more than one field. This can be used by both ordered and unordered collections. In this slide we see the use of a sort and a sortBy parameter.
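The slide itself is not reproduced here, so the exact meaning of the two parameters is an assumption. The sketch below assumes that `sortBy` carries the comma-separated field list and `sort` carries the direction (`asc` or `desc`); the function name and the sample data are hypothetical.

```python
from urllib.parse import parse_qs


def sort_members(members, query):
    """Sort a list of dicts by the fields named in sortBy (assumed semantics)."""
    params = parse_qs(query)
    raw_fields = params.get("sortBy", [""])[0]
    fields = [f for f in raw_fields.split(",") if f]
    descending = params.get("sort", ["asc"])[0] == "desc"
    if not fields:
        return list(members)  # no sort requested: keep the given order
    return sorted(
        members,
        key=lambda m: tuple(m[f] for f in fields),
        reverse=descending,
    )


books = [
    {"author": "B", "title": "Z"},
    {"author": "A", "title": "Y"},
    {"author": "A", "title": "X"},
]
ordered = sort_members(books, "sortBy=author,title&sort=asc")
print([b["title"] for b in ordered])
```

A single direction applies to all fields in this sketch; a per-field direction would need a richer parameter convention.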
In content negotiation, the resource representation returned is selected by the client through the use of several "Accept" headers, as defined in the HTTP specification.
The dot notation works well for static resources. For example, if you have a document.html and a document.pdf stored on your Web server, these are two different resources. However, using the dot notation to ask for a different format of the same resource can cause confusion if the resource is dynamic, because in the static case the dot notation means two different resources, while in the dynamic case it is the same resource. In addition, media type is just one kind of content negotiation. If you combine it with the others discussed earlier, you could end up with funny-looking URLs. For example, if you wanted an ISO-8859-5 JSON version of a resource in English, you would end up with a URI like this: /document.json.en.iso-8859-5.
Query parameters are often used for content negotiation. This happens because developers often use the browser to quickly test and see results, and adding query parameters provides a quick way to do this. Query parameters do not give the illusion that you are dealing with a different resource. However, dynamic RESTful services are usually built to be used by client applications, either from a browser using Ajax or from another server application.
Altering the URI in order to accomplish content negotiation is convenient for sharing a link to a specific resource representation or for testing in a browser. However, applications should rely on standard HTTP headers first and selected URI conventions second. Query parameters are usually meant to provide input to a service, such as filtering criteria, sorting, and other business-level details. Using headers for content negotiation separates IT concerns from business ones. In addition, requests usually pass through firewalls, proxies, and other servers. These HTTP proxies often understand the standard HTTP headers and might provide caching and other nonfunctional capabilities. As a compromise, you can provide a query parameter at development time and then disable it when deploying the application.
In general, application clients should use the simplest technique. Many of the HTTP headers were designed with networking in mind, to help browsers and intermediate proxies automate the exchange of information. Business applications often do not need this level of sophistication. The HTTP specification defines several techniques for content negotiation. This presentation addresses what is referred to as server-driven content negotiation. There are other types meant more for network proxies.
The Accept request-header field can be used to specify the media types that are acceptable to the client in the response. Accept headers can be used to indicate that the request is specifically limited to a small set of desired types, as in the case of a request for an in-line image. Examples include: application/json, application/atom+xml, and text/html. The Accept-Charset request-header field can be used to indicate which character sets are acceptable to the client for the response. This field enables clients capable of understanding more comprehensive or special-purpose character sets to signal that capability to a server that is capable of representing documents in those character sets. The Accept-Encoding request-header field is similar to Accept, but restricts the content-codings. For example, you can use this field to indicate compression. Example values: compress;q=0.5, gzip;q=1.0. The Accept-Language request-header field is similar to Accept, but restricts the set of natural languages that are preferred in the response. Examples include: en for English, es for Spanish.
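To make server-driven negotiation on the Accept header concrete, here is a simplified Python sketch. It is an illustration only: the function names are hypothetical, and it orders media ranges by q-value alone, ignoring the specificity rules and parameter matching that the HTTP specification defines.

```python
def parse_accept(header):
    """Return (media_range, q) pairs ordered by descending q-value."""
    ranges = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        if not pieces[0]:
            continue
        q = 1.0  # q defaults to 1 when absent
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as unacceptable
        ranges.append((pieces[0], q))
    ranges.sort(key=lambda r: r[1], reverse=True)  # stable sort keeps ties in order
    return ranges


def choose(header, available):
    """Pick the first available media type matching the client's preferences."""
    for media, q in parse_accept(header):
        if q <= 0:
            continue
        for candidate in available:
            if media in ("*/*", candidate):
                return candidate
            if media.endswith("/*") and candidate.startswith(media[:-1]):
                return candidate
    return None  # a real service would typically answer 406 Not Acceptable


offered = ["text/html", "application/json"]
print(choose("text/html;q=0.5, application/json", offered))
```

The same q-value mechanism applies to Accept-Charset, Accept-Encoding, and Accept-Language, so a similar parser could be reused for those headers.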
This is a simple example of an HTTP request without caching. Here requests and responses are simply forwarded.
Here we have a simple example using the Cache-Control HTTP header. The service provider adds a Cache-Control header to the response and sends it back to the client.
Since the cached copy is fresh (it hasn't expired), the data is returned from the cache and the request never hits the server. Here we are relying on the age of the resource.
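The freshness check a cache performs can be sketched roughly as follows. This is a minimal illustration that looks only at the max-age directive and takes timestamps as plain seconds; the function names are hypothetical, and a real cache also considers Expires, the Age header, and directives such as no-store.

```python
import re


def max_age(cache_control):
    """Extract the max-age value in seconds, or None if it is absent."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def is_fresh(stored_at, cache_control, now):
    """A stored response is fresh while its age is below max-age."""
    limit = max_age(cache_control)
    if limit is None:
        return False  # assumption: no explicit lifetime means revalidate
    return (now - stored_at) < limit


stored_at = 1000  # seconds; the moment the cache stored the response
print(is_fresh(stored_at, "max-age=3600", stored_at + 1800))  # half an hour later
print(is_fresh(stored_at, "max-age=3600", stored_at + 3600))  # one hour later
```

With the max-age=3600 value from the sample response headers, the cache can answer from its stored copy for up to an hour without contacting the server.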
The Last-Modified and ETag headers are a more robust solution. Here we get back an ETag and a Last-Modified header in the response to an initial GET.
Here caching means the full response does not have to be sent, since the requesting client (i.e., the cache) already has information that is still relevant. The cache checks with the provider, and the provider returns a simple HTTP status code (304 Not Modified) instead of the data. We will see examples later where these same headers are used for optimistic concurrency control.
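The validation exchange can be sketched from the provider's side as below. This is an illustrative sketch: deriving the ETag from a hash of the body is just one possible choice, the function names are hypothetical, and the comparison handles a single ETag value rather than the lists or `*` that If-None-Match may carry.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the representation bytes (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body, if_none_match=None):
    """Return (status, headers, payload) for a GET with an optional validator."""
    etag = make_etag(body)
    if if_none_match is not None and if_none_match == etag:
        return 304, {"ETag": etag}, b""  # cached copy is still valid
    return 200, {"ETag": etag}, body


# Initial GET: the cache stores the body together with its ETag.
status, headers, payload = conditional_get(b"<html>v1</html>")

# Revalidation: the cache sends the ETag back in If-None-Match.
revalidated = conditional_get(b"<html>v1</html>", headers["ETag"])
changed = conditional_get(b"<html>v2</html>", headers["ETag"])
print(status, revalidated[0], changed[0])
```

The 304 response carries no body, so the only cost of revalidation is the round trip.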
Caching should only be used on GET. POST, PUT, or DELETE would invalidate the cache and start the process again. Calculating ETags may not be easy, but mapping them to something like a JPA version number may be ideal.
Because resources are communicated over HTTP in a stateless fashion, optimistic concurrency patterns should be employed for resources that you plan to update or where having the latest copy is essential. Optimistic concurrency at the REST level does not live alone. Because you may be RESTifying physical underlying artifacts, you have to work with their underlying mechanisms. For example, if you are using database or JPA objects in your application and you are releasing this data via REST, then you have to work both with the optimistic concurrency of that layer, such as a version column or the JPA @Version annotation, and with the REST layer. You can pass this information as part of the payload, but the downside is that only consumers and producers know about it. As we learned in the caching section, HTTP headers are used to enable caching patterns. If consumers use these same headers, then they can employ optimistic concurrency and stay in sync with the caches in between.
This slide shows an interaction between a consumer and a provider. When working with shared resources that need updating, a consumer should first issue a GET. The provider should populate the ETag header with a version number or timestamp. When the consumer updates the resource, it should populate the If-Match header with the value it received in the ETag header. The provider should check that version number against that of the physical data (column, JPA version). If they match, it performs the update; if not, it returns the HTTP response code 412 (Precondition Failed), which means that the condition failed. Additional headers such as If-Modified-Since can be used for a conditional GET on a resource you have had for a while. If-None-Match can be used for conditional updates, though that is not typical of a consumer.
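The consumer and provider interaction described above can be sketched in a few lines of Python. The `Resource` class, its integer version counter, and the sample account data are assumptions for illustration; in a real service the version would come from the underlying layer, such as a version column or a JPA @Version field.

```python
class Resource:
    """In-memory stand-in for a provider-side resource with a version number."""

    def __init__(self, data):
        self.data = data
        self.version = 1

    def _etag(self):
        return f'"{self.version}"'

    def get(self):
        return 200, {"ETag": self._etag()}, self.data

    def put(self, data, if_match):
        # Reject the update when the consumer's copy is out of date.
        if if_match != self._etag():
            return 412, {}, None  # Precondition Failed
        self.data = data
        self.version += 1
        return 200, {"ETag": self._etag()}, self.data


account = Resource({"balance": 100})

# Two consumers read the same version of the resource.
_, headers_a, _ = account.get()
_, headers_b, _ = account.get()

# The first update succeeds; the second carries a stale ETag and is refused.
first = account.put({"balance": 80}, if_match=headers_a["ETag"])
second = account.put({"balance": 50}, if_match=headers_b["ETag"])
print(first[0], second[0])
```

The second consumer would then issue a fresh GET, reapply its change to the current data, and retry with the new ETag.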
The beauty of REST is that it sits on top of the existing HTTP infrastructure. So follow all the same HTTP security rules and lock down all URLs. Because URLs map one to one with services, ACLs can be applied to finer-grained things. Make sure you go to Keys Security Lecture.
REST services themselves may not be the only concern; so are their consumers and the behavior REST can lead to. REST is often used to mash up data. Mashups access many REST resources and combine them on the glass. This can lead to cross-site scripting and malicious content from third parties. Make sure you use server proxies to tunnel all Ajax behavior through. Something like DataPower or the Ajax Proxy in WAS can provide a white list and a black list of third parties. It can also do things like inspect content. Browser applications should not eval third-party JSON, but parse it.
Essentially Web 2.0 = new ways of using the web. New, easy to use technologies enabling more people to participate. Enterprises are starting to undergo the same shift and we can accelerate that by embracing these new technologies. This flexibility helps our clients adopt new business models. There are challenges, but IBM is in a unique position to
Ecommerce as the sample
Resources that explain how web caching works (I've added links), apart from RFC 2616 (the HTTP 1.1 specification, which I posted earlier). Responses to GET requests are usually the only ones that can be cached, because the other methods are not safe: they may change state on the server. HTTP headers can contain caching directives that tell caches how to interpret the data in a request or response. Interesting headers to note are: Expires: The Expires header is a basic means of controlling caches and tells how long the resource representation in the response is good for. After that time, a cache will request an update from the target resource. If you are designing resources whose data content is fairly static, consider using Expires as an HTTP header to reduce load on the resources. The issues with Expires are that clocks are not synchronized, and that users may forget to update the time and date of the expiration. Cache-Control: Cache-Control gives more control than Expires. It can include the time for which the representation is considered fresh, expressed as a relative time in seconds rather than an absolute date. If a response carries neither of these headers and no validator, caches will usually treat it as uncacheable and will not store the representation at all. If a cache stores a representation that includes a Last-Modified header, the cache can ask the server whether the resource has been modified (with an If-Modified-Since header). However, I'll defer some of that discussion for a second. Lastly, HTTP 1.1 introduced the notion of entity tags (ETags). An ETag is a unique identifier that is generated by the server and changes every time the representation changes. Because the server controls how ETags are generated, a cache can be surer that the representation is really the same if the ETag matches when it makes an If-None-Match request. Almost all caches use Last-Modified times in determining whether a representation is fresh.
Again, let's hold off for a couple of slides on the remaining items, until I talk about concurrency control.