GET /Connected: A Tutorial on Web-based Services


Published in: Technology

  1. 1. GET /Connected: A Tutorial on Web-based Services, Jim Webber
  2. 2. whoami•  PhD in parallel computing –  Programming language design•  {developer, architect, director} with ThoughtWorks•  Author of “Developing Enterprise Web Services” –  And currently engaged in writing a book on Web-as-middleware
  3. 3. Roadmap•  Motivation and Introduction•  Web Architecture•  RPC Again!•  Embracing HTTP as an application protocol•  Semantics•  Hypermedia•  RESTful services•  Scalability•  Atom and AtomPub•  Security•  WS-* Wars•  Conclusions and further thoughts
  4. 4. Introduction•  This is a tutorial about the Web•  It’s very HTTP centric•  But it’s not about Web pages!•  The Web is a middleware platform which is… –  Globally deployed –  Has reach –  Is mature –  And is a reality in every part of our lives•  Which makes it interesting for distributed systems geeks
  5. 5. Why Web? Why not REST?•  REST is a brilliant architectural style•  But the Web allows for more than just RESTful systems•  There’s a spectrum of maturity of service styles –  From completely bonkers to completely RESTful•  We’ll use the Richardson maturity model to frame these kinds of discussions –  Level 0 to Level 3 –  Web-ignorant to RESTful!
  6. 6. Motivation•  This follows the plot from a book called GET /Connected which is currently being written by: –  Jim Webber –  Savas Parastatidis –  Ian Robinson•  With help from lots of other lovely people like: –  Halvard Skogsrud, Steve Vinoski, Mark Nottingham, Colin Jack, Spiros Tzavellas, Glen Ford, Sriram Narayan, Ken Kolchier and many more!•  The book deals with the Web as a distributed computing platform –  The Web as a whole, not just REST•  And so does this tutorial…
  7. 7. Web Architecture
  8. 8. Web History•  Started as a distributed hypermedia platform –  CERN, Berners-Lee, 1990•  Revolutionised hypermedia –  Imagine emailing someone a hypermedia deck nowadays!•  Architecture of the Web largely fortuitous –  W3C and others have since retrofitted/captured the Web’s architectural characteristics
  9. 9. The Web broke the rules
  10. 10. Web Fundamentals•  To embrace the Web, we need to understand how it works•  The Web is a distributed hypermedia model –  It doesn’t try to hide that distribution from you!•  Our challenge: –  Figure out the mapping between our problem domain and the underlying Web platform
  11. 11. Key Actors in the Web Architecture (diagram: Client, Cache, Firewalls, Router, ISP, Proxy Server, Reverse Proxy, Web Servers, Resources)
  12. 12. Resources•  A resource is something “interesting” in your system•  Can be anything –  Spreadsheet (or one of its cells) –  Blog posting –  Printer –  Winning lottery numbers –  A transaction –  Others?
  13. 13. Interacting with Resources•  We deal with representations of resources –  Not the resources themselves •  “Pass-by-value” semantics –  Representation can be in any format •  Any media type•  Each resource implements a standard uniform interface –  Typically the HTTP interface•  Resources have names and addresses (URIs) –  Typically HTTP URIs (aka URLs)
  14. 14. Resource Architecture (diagram: Consumer (Web Client); Resource Representation (e.g. XML document); Uniform Interface (Web Server); Logical Resources; Physical Resources)
  15. 15. Resource Representations•  Making your system Web-friendly increases its surface area –  You expose many resources, rather than fewer endpoints•  Each resource has one or more representations –  Representations like JSON or XML are good for the programmatic Web•  Moving representations across the network is the way we transact work in a Web-native system
  16. 16. URIs•  URIs are addresses of resources in Web-based systems –  Each resource has at least one URI•  They identify “interesting” things –  i.e. Resources •  Any resource implements the same (uniform) interface –  Which means we can access it programmatically!
  17. 17. URI/Resource Relationship•  Any two resources cannot be identical –  Because then you’ve only got one resource!•  But they can have more than one name•  No mechanism for URI equality•  Canonical URIs are long-lived –  Send back HTTP 303 (“See Other”) if the request is for an alternate URI –  Or set the Content-Location header in the response
  18. 18. Web Characteristics•  Scalable•  Fault-tolerant•  Recoverable•  Secure•  Loosely coupled•  Precisely the same characteristics we want in business software systems!
  19. 19. Scalability•  Web is truly Internet-scale –  Loose coupling •  Growth of the Web in one place is not impacted by changes in other places –  Uniform interface •  HTTP defines a standard interface for all actors on the Web •  Replication and caching are baked into this model –  Caches have the same interface as real resources! –  Stateless model •  Supports horizontal scaling
  20. 20. Fault Tolerant•  The Web is stateless –  All information required to process a request must be present in that request • Sessions are still plausible, but must be handled in a Web-consistent manner –  Modelled as resources!•  Statelessness means easy replication –  One Web server is replaceable with another –  Easy fail-over, horizontal scaling
  21. 21. Recoverable•  The Web places emphasis on repeatable information retrieval –  GET is idempotent •  Library of Congress found this the hard way! –  In failure cases, can safely repeat GET on resources•  HTTP verbs plus rich error handling help to remove guesswork from recovery –  HTTP statuses tell you what happened! –  Some verbs (e.g. PUT, DELETE) are safe to repeat
  22. 22. Secure•  HTTPS is a mature technology –  Based on SSL for secure point-to-point information retrieval•  Isn’t sympathetic to Web architecture –  Can’t cache!•  Higher-order protocols like Atom are starting to change this... –  Encrypt parts of a resource representation, not the transport channel –  OK to cache!
  23. 23. Loosely Coupled•  Adding a Web site to the WWW does not affect any other existing sites•  All Web actors support the same, uniform interface –  Easy to plumb new actors into the big wide web • Caches, proxies, servers, resources, etc
  24. 24. RPC Again!
  25. 25. Web Tunnelling•  Web Services tunnel SOAP over HTTP –  Using the Web as a transport only –  Ignoring many of the features for robustness the Web has built in•  Many Web people do the same! –  But they claim to be “lightweight” and RESTful –  URI tunnelling, POX approaches are the most popular styles on today’s Web –  Worse than SOAP! • Less metadata!
  26. 26. Richardson Model Level 1•  Lots of URIs –  But really has a more level 0 mindset•  Doesn’t understand HTTP –  Other than as a transport•  No hypermedia
  27. 27. URI Tunnelling Pattern•  Web servers understand URIs•  URIs have structure•  Methods have signatures•  Can match URI structure to method signature
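
The matching of URI structure to a method signature can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `UriTunnelSketch`, the `/PlaceOrder` path, and the two-argument `placeOrder` stand-in are hypothetical names chosen to echo the Restbucks ordering examples, not code from the tutorial.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the path names the method, the query string carries its arguments.
final class UriTunnelSketch {

    // Splits a query string such as "coffee=latte&size=small" into name/value pairs.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    // Stand-in for the service method that the URI is mapped onto (assumed signature).
    static String placeOrder(String coffee, String size) {
        return "OrderId=1 (" + size + " " + coffee + ")";
    }

    // Maps "/PlaceOrder?coffee=latte&size=small" onto placeOrder("latte", "small").
    static String dispatch(String pathAndQuery) {
        int q = pathAndQuery.indexOf('?');
        String path = q < 0 ? pathAndQuery : pathAndQuery.substring(0, q);
        Map<String, String> params = parseQuery(q < 0 ? "" : pathAndQuery.substring(q + 1));
        if (path.equals("/PlaceOrder")
                && params.containsKey("coffee") && params.containsKey("size")) {
            return placeOrder(params.get("coffee"), params.get("size"));
        }
        return "Failure: Could not place order.";
    }
}
```

Calling `UriTunnelSketch.dispatch("/PlaceOrder?coffee=latte&size=small")` invokes the mapped method; any URI that does not match the expected shape falls through to the failure string, which is exactly the manual error handling this style forces on you.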
  28. 28. On The Wire
  29. 29. Server-Side URI Tunnelling Example
      public void ProcessGet(HttpListenerContext context)
      {
          // Parse the URI
          Order order = ParseUriForOrderDetails(context.Request.QueryString);
          string response = string.Empty;
          if (order != null)
          {
              // Process the order by calling the mapped method
              var orderConfirmation = RestbucksService.PlaceOrder(order);
              response = "OrderId=" + orderConfirmation.OrderId.ToString();
          }
          else
          {
              response = "Failure: Could not place order.";
          }
          // Write to the response stream
          using (var sw = new StreamWriter(context.Response.OutputStream))
          {
              sw.Write(response);
          }
      }
  30. 30. Client-Side URI Tunnelling
      public OrderConfirmation PlaceOrder(Order order)
      {
          // Create the URI
          var sb = new StringBuilder("");
          sb.AppendFormat("coffee={0}", order.Coffee.ToString());
          sb.AppendFormat("&size={0}", order.Size.ToString());
          sb.AppendFormat("&milk={0}", order.Milk.ToString());
          sb.AppendFormat("&consume-location={0}", order.ConsumeLocation.ToString());
          // Set up the GET request (WebRequest.Create returns an HttpWebRequest for HTTP URIs)
          var request = WebRequest.Create(sb.ToString()) as HttpWebRequest;
          request.Method = "GET";
          // Get the response
          var response = request.GetResponse();
          // Read the contents of the response
          OrderConfirmation orderConfirmation = null;
          using (var sr = new StreamReader(response.GetResponseStream()))
          {
              var str = sr.ReadToEnd();
              // Create an OrderConfirmation object from the response
              orderConfirmation = new OrderConfirmation(str);
          }
          return orderConfirmation;
      }
  31. 31. URI Tunnelling Strengths•  Very easy to understand•  Great for simple procedure-calls•  Simple to code –  Do it with the servlet API, HttpListener, IHttpHandler, Rails, whatever!•  Interoperable –  It’s just URIs!
  32. 32. URI Tunnelling Weaknesses•  It’s brittle RPC!•  Tight coupling, no metadata –  No typing or “return values” specified in the URI•  Not robust – have to handle failure cases manually•  No metadata support –  Construct the URIs yourself, map them to the function manually•  You typically use GET (prefer POST) –  OK for functions, but against the Web for procedures with side effects
  33. 33. POX Pattern•  Web servers understand how to process requests with bodies –  Because they understand forms•  And how to respond with a body –  Because that’s how the Web works•  POX uses XML in the HTTP request and response to move a call stack between client and server
  34. 34. Richardson Model Level 0•  Single well-known endpoint –  Not really URI friendly•  Doesn’t understand HTTP –  Other than as a transport•  No hypermedia
  35. 35. POX Architecture
  36. 36. POX on the Wire
  37. 37. .Net POX Service Example
      // The context comes from the Web server
      private void ProcessRequest(HttpListenerContext context)
      {
          string verb = context.Request.HttpMethod.ToLower().Trim();
          switch (verb)
          {
              // Check HTTP verb (we want POST)
              case "post":
              {
                  // Everything's done with POST in this case
                  // Turn the HTTP body into an XML document for processing
                  XmlDocument request = new XmlDocument();
                  request.Load(XmlReader.Create(context.Request.InputStream));
                  // Dispatch it for processing
                  XmlElement result = MyApp.Process(request.DocumentElement);
                  // Get XML result, and get bytes
                  byte[] returnValue = Utils.ConvertUnicodeString(
                      Constants.Xml.XML_DECLARATION + result.OuterXml);
                  // Return XML bytes to client
                  context.Response.OutputStream.Write(returnValue, 0, returnValue.Length);
                  break;
              }
              ...
  38. 38. Java POX Servlet
      public class RestbucksService extends HttpServlet {
          @Override
          protected void doPost(HttpServletRequest request, HttpServletResponse response)
                  throws ServletException, IOException {
              // Initialization code omitted for brevity
              try {
                  requestReader = request.getReader();
                  responseWriter = response.getWriter();
                  String xmlRequest = extractPayload(requestReader);
                  Order order = createOrder(xmlRequest);
                  OrderConfirmation confirmation = restbucksService.placeOrder(order);
                  // Write the confirmation to the response, not the request
                  embedPayload(responseWriter, confirmation.toString());
              } finally {
                  // Cleanup code omitted for brevity
              }
          }
      }
  39. 39. C# POX Client Example
      public OrderConfirmation PlaceOrder(string customerId, Item[] items)
      {
          // Serialize our objects
          XmlDocument requestXml = CreateXmlRequest(customerId, items);
          var client = new WebClient();
          var ms = new MemoryStream();
          requestXml.Save(ms);
          client.Headers.Add("Content-Type", "application/xml");
          ms = new MemoryStream(client.UploadData(" PlaceOrder", null, ms.ToArray()));
          var responseXml = new XmlDocument();
          responseXml.Load(ms);
          return CreateOrderConfirmation(responseXml);
      }
  40. 40. Java Apache Commons Client
      public class OrderingClient {
          private static final String XML_HEADING = "<?xml version=\"1.0\"?>\n";
          private static final String NO_RESPONSE = "Error: No response.";

          public String placeOrder(String customerId, String[] itemIds) throws Exception {
              // XML string creation omitted for brevity
              // ...
              String response = sendRequestPost(request, "");
              Document xmlResponse = DocumentBuilderFactory.newInstance()
                  .newDocumentBuilder().parse(new InputSource(new StringReader(response)));
              // XML response handling omitted for brevity
          }

          private String sendRequestPost(String request, String uri)
                  throws IOException, HttpException {
              PostMethod method = new PostMethod(uri);
              method.setRequestHeader("Content-type", "application/xml");
              method.setRequestBody(XML_HEADING + request);
              String responseBody = NO_RESPONSE;
              try {
                  new HttpClient().executeMethod(method);
                  responseBody = new String(method.getResponseBody(), "UTF-8");
              } finally {
                  method.releaseConnection();
              }
              return responseBody;
          }
      }
  41. 41. POX Strengths•  Simplicity – just use HTTP POST and XML•  Re-use existing infrastructure and libraries•  Interoperable –  It’s just XML and HTTP•  Can use complex data structures –  By encoding them in XML
  42. 42. POX Weaknesses•  Client and server must collude on XML payload –  Tightly coupled approach•  No metadata support –  Unless you’re using a POX toolkit that supports WSDL with HTTP binding (like WCF)•  Does not use Web for robustness•  Does not use SOAP + WS-* for robustness either
  43. 43. Web Abuse•  Both POX and URI Tunnelling fail to take advantage of the Web –  Ignoring status codes –  Reduced scope for caching –  No metadata –  Manual crash recovery/compensation leading to high development cost –  Etc•  They’re useful in some situations –  And you can implement them with minimal toolkit support –  But they’re not especially robust patterns
  44. 44. Tech InterludeHTTP Fundamentals
  45. 45. The HTTP Verbs (listed in order of decreasing likelihood of being understood by a Web server today)•  Retrieve a representation of a resource: GET•  Create a new resource: PUT to a new URI, or POST to an existing URI•  Modify an existing resource: PUT to an existing URI•  Delete an existing resource: DELETE•  Get metadata about an existing resource: HEAD•  See which of the verbs the resource understands: OPTIONS
  46. 46. HTTP Status Codes•  The HTTP status codes provide metadata about the state of resources•  They are part of what makes the Web a rich platform for building distributed systems•  They cover five broad categories –  1xx – Informational –  2xx – Everything’s fine –  3xx – Redirection –  4xx – Client did something wrong –  5xx – Server did a bad thing•  There are a handful of these codes that we need to know in more detail
  47. 47. 1xx•  100 – Continue –  The operation will be accepted by the service –  The “look before you leap” pattern • Use with the Expect header
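
Because the first digit of a status code identifies its category, a client can branch on the category before looking at individual codes. The sketch below is a hypothetical helper (the class and method names are assumptions) that uses the class names from the HTTP specification rather than the informal descriptions on the slide.

```java
// Hypothetical helper: classifies an HTTP status code by its first digit.
final class StatusClassSketch {

    static String describe(int status) {
        switch (status / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default:
                throw new IllegalArgumentException("Not an HTTP status code: " + status);
        }
    }
}
```

A consumer might, for instance, treat a "client error" as a reason to fix the request and a "server error" as a reason to retry later.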
  48. 48. 2xx•  200 – OK –  The server successfully completed whatever the client asked of it•  201 – Created –  Sent when a new resource is created at the client’s request via POST –  Location header should contain the URI to the newly created resource•  202 – Accepted –  Client’s request can’t be handled in a timely manner –  Location header should contain a URI to the resource that will eventually be exposed to fulfil the client’s expectations
  49. 49. More 2xx Codes•  203 – Non-Authoritative Information –  Much like 200, except the client knows not to place full trust in any headers since they could have come from 3rd parties or be cached etc.•  204 – No Content –  The server declines to send back a representation •  Perhaps because the associated resource doesn’t have one –  Used like an “ack” •  Prominent in AJAX applications
  50. 50. 3xx•  301 – Moved Permanently – Location header contains the new location of the resource•  303 – See Other – Location header contains the location of an alternative resource –  Used for redirection
  51. 51. More 3xx•  304 – Not Modified –  The resource hasn’t changed, use the existing representation –  Used in conjunction with conditional GET –  Client sends the If-Modified-Since header –  Response Date header must be set –  Response ETag and Content-Location headers must be same as original representation
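
The server-side decision behind a conditional GET can be sketched as below. This is a simplified illustration, not a full implementation of HTTP's validator rules: the class name is hypothetical, header parsing is left out, and timestamps are assumed to be already converted to `java.time.Instant`.

```java
import java.time.Instant;

// Hypothetical sketch of how a server might choose between 200 and 304 for a GET.
final class ConditionalGetSketch {

    // ifModifiedSince is the parsed If-Modified-Since header, or null if the client sent none.
    // lastModified is when the resource last changed on the server.
    static int statusFor(Instant ifModifiedSince, Instant lastModified) {
        if (ifModifiedSince != null && !lastModified.isAfter(ifModifiedSince)) {
            return 304; // Not Modified: the client's cached representation is still good
        }
        return 200; // OK: send a fresh representation
    }
}
```

An unconditional GET (no header, so `null` here) always receives the full representation.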
  52. 52. 4xx•  400 – Bad Request –  The client has PUT or POST a resource representation that is in the right format, but contains invalid information•  401 – Unauthorised –  Proper credentials to operate on a resource weren’t provided –  Response WWW-Authenticate header contains the type of authentication the server expects •  Basic, digest, WSSE, etc –  Don’t leak information! •  Consider 404 in these situations
  53. 53. More 4xx•  403 – Forbidden –  The client request is OK, but the server doesn’t want to process it •  E.g. Restricted by IP address –  Implies that resource exists, beware leaking information•  404 – Not Found –  The standard catch-all response –  May be a lie to prevent 401 or 403 information leakage
  54. 54. Even more 4xx•  405 – Method Not Allowed –  The resource doesn’t support a given method –  The response Allow header lists the verbs the resource understands •  E.g. Allow: GET, POST, PUT•  406 – Not Acceptable –  The client places too many restrictions on the resource representation via the Accept-* header in the request –  The server can’t satisfy any of those representations
  55. 55. Yet More 4xx•  409 – Conflict –  Tried to change the state of the resource to something the server won’t allow •  E.g. Trying to DELETE something that doesn’t exist•  410 – Gone –  The resource has gone, permanently. –  Don’t send in response to DELETE •  The client won’t know if it was deleted, or if it was gone and the delete failed
  56. 56. Still more 4xx•  412 – Precondition Failed –  Server/resource couldn’t meet one or more preconditions •  As specified in the request header –  E.g. Using If-Unmodified-Since and PUT to modify a resource provided it hasn’t been changed by others•  413 – Request Entity Too Large –  Response comes with the Retry-After header in the hope that the failure is transient
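
The If-Unmodified-Since case that leads to 412 can be sketched with an in-memory resource. Everything here is an assumption made for illustration: the class name, the string representation, and the explicit `now` parameter (passed in so the behaviour is deterministic).

```java
import java.time.Instant;

// Hypothetical in-memory resource that honours If-Unmodified-Since on PUT.
final class ConditionalPutSketch {
    private String representation;
    private Instant lastModified;

    ConditionalPutSketch(String representation, Instant lastModified) {
        this.representation = representation;
        this.lastModified = lastModified;
    }

    // Applies the update only if the resource has not changed since the client last saw it.
    int put(String newRepresentation, Instant ifUnmodifiedSince, Instant now) {
        if (lastModified.isAfter(ifUnmodifiedSince)) {
            return 412; // Precondition Failed: someone else changed the resource
        }
        representation = newRepresentation;
        lastModified = now;
        return 200;
    }

    String representation() {
        return representation;
    }
}
```

If two clients read the resource at the same time and both PUT with the same If-Unmodified-Since value, the first update succeeds and the second receives 412, so the second client knows to GET the resource again before retrying.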
  57. 57. 5xx Codes•  500 – Internal Server Error –  The normal response when we’re lazy •  503 – Service Unavailable –  The HTTP server is up, but not supporting resource communication properly –  Server may send a Retry-After header, assuming the fault is transient
  58. 58. HTTP Headers•  Headers provide metadata to assist processing –  Identify resource representation format (media type), length of payload, supported verbs, etc•  HTTP defines a wealth of these –  And like status codes they are our building blocks for robust service implementations
  59. 59. Must-know Headers•  Authorization –  Contains credentials (basic, digest, WSSE, etc) –  Extensible•  Content-Length –  Length of payload, in bytes•  Content-Type –  The resource representation form • E.g. application/xml, application/xhtml+xml
  60. 60. More Must-Know Headers•  ETag/If-None-Match –  Opaque identifier – think “checksum” for resource representations –  Used for conditional actions•  If-Modified-Since/Last-Modified –  Used for conditional operations too•  Host –  Contains the domain-name part of the URI
  61. 61. Yet More Must-Know Headers•  Location –  Used to flag the location of a created/moved resource –  In combination with: •  201 Created, 301 Moved Permanently, 302 Found, 307 Temporary Redirect, 300 Multiple Choices, 303 See Other•  User-Agent –  Tells the server side what the client-side capabilities are –  Should not be used in the programmable Web!
  62. 62. Final Must-Know Headers•  WWW-Authenticate –  Used with 401 status –  Informs client what authentication is needed•  Date –  Mandatory! –  Timestamps on request and response
  63. 63. Useful Headers•  Accept –  Client tells server what formats it wants –  Can externalise this in URI names in the general case
  64. 64. More Useful Headers•  Cache-Control –  Metadata for caches, tells them how to cache (or not) the resource representation –  And for how long etc.•  Content-MD5 –  Cryptographic checksum of body –  Useful integrity check, has computation cost
  65. 65. Yet More Useful Headers•  Expect –  A conditional – client asks if it’s OK to proceed by expecting 100-Continue –  Server either responds with 100 or 417 – Expectation Failed•  Expires –  Server tells client or proxy server that representation can be safely cached until a certain time•  If-Match –  Used for ETag comparison –  Opposite of If-None-Match
  66. 66. Final Useful Headers•  If-Unmodified-Since –  Useful for conditional PUT/POST • Make sure the resource hasn’t changed while you’ve been manipulating it –  Compare with If-Modified-Since•  Range –  Specify part of a resource representation (a byte range) that you need – aka partial GET –  Useful for failure/recovery scenarios
  67. 67. HTTP RFC 2616 is Authoritative•  The statuses and headers here are a sample of the full range in the HTTP spec•  The spec contains more than we discuss here•  It is authoritative about usage•  And it’s a good thing to keep handy when you’re working on a Web-based distributed system!
  68. 68. Tech Interlude URI Templates
  69. 69. Conflicting URI Philosophies•  URIs should be descriptive, predictable? –  http://spreadsheet/cells/a2,a9 •  Convey some ideas about how the underlying resources are arranged •  Can infer http://spreadsheet/cells/b0,b10 and http:// for example •  Nice for programmatic access, but may introduce coupling•  URIs should be opaque? –  TimBL says “opaque URIs are cool” –  Convey no semantics, can’t infer anything from them •  Don’t introduce coupling
  70. 70. URI Templates, in brief•  Use URI templates to make your resource structure easy to understand•  For Amazon S3 (storage service) it’s easy: –  {bucket-name}/{object-name} (diagram: Bucket1 and Bucket2, each containing numbered objects)
  71. 71. URI Templates are Easy!•  Take the URI: {order_id}•  You could do the substitution and get a URI•  Can easily make more complex URIs too –  Mixing template and non-template sections: {orders}/{shop}/{year}/{month}.xml•  Use URI templates client-side to compute server-side URIs –  But beware introducing coupling!
  72. 72. Why URI Templates?•  Regular URIs are a good idiom in Web-based services –  Helps with understanding, self documentation•  They allow users to infer a URI –  If the pattern is regular•  URI templates formalise this arrangement –  And advertise a template rather than a regular URI
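
The substitution step can be sketched as a simple string expansion. This is a deliberately minimal illustration: the class name is hypothetical, the `/orders/{order_id}` path in the usage note is an assumed example, and the code does no percent-encoding and supports none of the richer operators that a full URI template processor would.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: replaces each {name} in a template with the matching value.
final class UriTemplateSketch {
    private static final Pattern VARIABLE = Pattern.compile("\\{([^}]+)\\}");

    static String expand(String template, Map<String, String> values) {
        Matcher m = VARIABLE.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for {" + m.group(1) + "}");
            }
            // quoteReplacement stops '$' and '\' in the value being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, `UriTemplateSketch.expand("/orders/{order_id}", Map.of("order_id", "1234"))` yields `/orders/1234`. The coupling warning still applies: a client that computes URIs this way depends on the server continuing to honour the template.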
  73. 73. URI Templates Pros and Cons•  Everything interesting in a Web-based service has a URI•  Remember, two schools of thought: –  Opaque URIs are cool (Berners-Lee) –  Transparent URIs are cool (everyone else)•  URI templates present two core concerns: –  They invite clients to invent URIs which may not be honoured by the server –  They increase coupling since servers must honour forever any URI templates they’ve advertised•  Use URI templates sparingly, and with caution –  Entry point URIs only is a good rule of thumb –  Or use them RESTfully, as we shall discuss later
  74. 74. Embracing HTTP as an Application Protocol
  75. 75. Using the Web•  URI tunnelling and POX use the Web as a transport –  Just like SOAP without metadata support•  CRUD services begin to use the Web’s coordination support•  But the Web is more than transport –  Transport (HTTP), plus –  Metadata (headers), plus –  Fault model (status codes), plus –  Component model (uniform interface), plus –  Runtime environment (caches, proxies, servers, etc), plus...
  76. 76. CRUD Resource Lifecycle•  The resource is created with POST•  It’s read with GET•  And updated via PUT•  Finally it’s removed using DELETE
  77. 77. Richardson Model Level 2•  Lots of URIs•  Understands HTTP!•  No hypermedia
  78. 78. Create with POST (diagram: the Ordering Client sends POST /orders <order … /> to the Ordering Service; possible responses are 201 Created with Location: …/1234, 400 Bad Request, or 500 Internal Error)
  79. 79. POST Semantics•  POST creates a new resource•  But the server decides on that resource’s URI•  Common human Web example: posting to Web log –  Server decides URI of posting and any comments made on that post•  Programmatic Web example: creating a new employee record –  And subsequently adding to it
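
The point that the server, not the client, chooses the new resource's URI can be sketched with an in-memory collection. The class name, the string representations, and the starting identifier are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory "/orders" collection: POST stores a representation
// and the server picks the URI, which it reports back (as a Location header would).
final class OrderCollectionSketch {
    private final Map<String, String> orders = new LinkedHashMap<>();
    private int nextId = 1234;

    // Handles "POST /orders" and returns the URI of the newly created resource.
    String post(String orderRepresentation) {
        String uri = "/orders/" + nextId++;
        orders.put(uri, orderRepresentation);
        return uri;
    }

    // Handles "GET <uri>"; returns null when no such resource exists.
    String get(String uri) {
        return orders.get(uri);
    }
}
```

Posting the same representation twice creates two resources with two different URIs, which is why a failed POST cannot simply be repeated blindly.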
  80. 80. POST Request
      POST /orders HTTP/1.1                 (verb, path, and HTTP version)
      Host: restbucks.example.com
      Content-Type: application/xml         (generic XML content)
      Content-Length: 225

      <order xmlns="">                      (content: again Restbucks XML)
        <location>takeAway</location>
        <items>
          <item>
            <name>latte</name>
            <quantity>1</quantity>
            <milk>whole</milk>
            <size>small</size>
          </item>
        </items>
      </order>
  81. 81. POST Response
      HTTP/1.1 201 Created
      Location: /orders/1234
  82. 82. When POST goes wrong•  We may get 4xx or 5xx errors –  Client versus server problem•  We turn to GET!•  Find out the resource states first –  Then figure out how to make forward or backward progress•  Then solve the problem –  May involve POSTing again –  May involve a PUT to rectify server-side resources in-place
  83. 83. POST Implementation with a Servlet
      protected void doPost(HttpServletRequest request, HttpServletResponse response) {
          try {
              Order order = extractOrderFromRequest(request);
              String internalOrderId = OrderDatabase.getDatabase().saveOrder(order);
              response.setHeader("Location", computeLocationHeader(request, internalOrderId));
              response.setStatus(HttpServletResponse.SC_CREATED);
          } catch (Exception ex) {
              response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
          }
      }
  84. 84. Read with GET (diagram: the Ordering Client sends GET /orders/1234 to the Ordering Service; possible responses are 200 OK with <order … />, 404 Not Found, or 500 Internal Error)
  85. 85. GET Semantics•  GET retrieves the representation of a resource•  Should be idempotent –  Shared understanding of GET semantics –  Don’t violate that understanding! –  Library of Congress catalogue incident!
  86. 86. GET Exemplified•  Should the expected resource representation be in a header, or made explicit in the URI?
      GET /orders/1234 HTTP/1.1
      Accept: application/vnd.restbucks+xml
      Host:
  87. 87. GET Response
      HTTP/1.1 200 OK
      Content-Length: 232
      Content-Type: application/vnd.restbucks+xml
      Date: Wed, 19 Nov 2008 21:48:10 GMT

      <order xmlns="">
        <location>takeAway</location>
        <items>
          <item>
            <name>latte</name>
            <quantity>1</quantity>
            <milk>whole</milk>
            <size>small</size>
          </item>
        </items>
        <status>pending</status>
      </order>
  88. 88. When GET Goes wrong•  Simple! –  Just 404 – the resource is no longer available
      HTTP/1.1 404 Not Found
      Date: Sat, 20 Dec 2008 19:01:33 GMT
      •  Are you sure? –  GET again!•  GET is safe and idempotent –  Great for crash recovery scenarios!
  89. 89. Idempotent Behaviour•  An action with no side effects –  Comes from mathematics•  In practice means two things: –  A safe operation is one which changes no state at all •  E.g. HTTP GET –  An idempotent operation is one which updates state in an absolute way •  E.g. x = 4 rather than x += 2•  Web-friendly systems scale because of safety –  Caching!•  And are fault tolerant because of idempotent behaviour –  Just re-try in failure cases
  90. 90. GET JAX-RS Implementation
      @Path("/")
      public class OrderingService {
          @GET
          @Produces("application/vnd.restbucks+xml")
          @Path("/{orderId}")
          public String getOrder(@PathParam("orderId") String orderId) {
              try {
                  Order order = OrderDatabase.getDatabase().getOrder(orderId);
                  if (order != null) {
                      return xstream.toXML(order);
                  } else {
                      throw new WebApplicationException(404);
                  }
              } catch (WebApplicationException e) {
                  // Let the 404 through; the generic handler below would turn it into a 500
                  throw e;
              } catch (Exception e) {
                  throw new WebApplicationException(500);
              }
          }
          // Remainder of implementation omitted for brevity
      }
  91. 91. Update with PUT (diagram: the Ordering Client sends PUT /orders/1234 <order … /> to the Ordering Service; possible responses are 200 OK, 404 Not Found, 409 Conflict, or 500 Internal Error)
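
The difference between an absolute update (x = 4) and a relative one (x += 2) shows up as soon as a request is retried. The sketch below uses hypothetical names and models a retried request as the same operation applied several times to some integer state.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical sketch: what happens to state when the same request is repeated.
final class IdempotenceSketch {

    // Absolute update (x = value): repeating it leaves the same final state.
    static int setTo(int state, int value) {
        return value;
    }

    // Relative update (x += delta): every repeat changes the state again.
    static int add(int state, int delta) {
        return state + delta;
    }

    // Simulates a client sending the same request a number of times, e.g. after timeouts.
    static int repeat(int state, int times, IntUnaryOperator request) {
        for (int i = 0; i < times; i++) {
            state = request.applyAsInt(state);
        }
        return state;
    }
}
```

Repeating `setTo(s, 4)` three times from 0 ends at 4, whereas repeating `add(s, 2)` three times ends at 6, which is why retrying is only a safe recovery strategy for idempotent operations.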
  92. 92. PUT Semantics•  PUT creates a new resource but the client decides on the URI –  Providing the server logic allows it•  Also used to update existing resources by overwriting them in-place•  PUT is idempotent –  Makes absolute changes•  But is not safe –  It changes state!
  93. 93. PUT Request
      PUT /orders/1234 HTTP/1.1
      Host: restbucks.com
      Content-Type: application/xml
      Content-Length: 386

      <order xmlns="">                      (updated content)
        <location>takeAway</location>
        <items>
          <item>
            <milk>whole</milk>
            <name>latte</name>
            <quantity>2</quantity>
            <size>small</size>
          </item>
          <item>
            <milk>whole</milk>
            <name>cappuccino</name>
            <quantity>1</quantity>
            <size>large</size>
          </item>
        </items>
        <status>preparing</status>
      </order>
  94. 94. PUT Response (minimalist response contains no entity body)
      HTTP/1.1 200 OK
      Date: Sun, 30 Nov 2008 21:47:34 GMT
      Content-Length: 0
  95. 95. When PUT goes wrong•  If we get 5xx error, or some 4xx errors simply PUT again! –  PUT is idempotent•  If we get errors indicating incompatible states (409, 417) then do some forward/backward compensating work –  And maybe PUT again
      HTTP/1.1 409 Conflict
      Date: Sun, 21 Dec 2008 16:43:07 GMT
      Content-Length:382

      <order xmlns="http://">
        <location>takeAway</location>
        <items>
          <item>
            <milk>whole</milk>
            <name>latte</name>
            <quantity>2</quantity>
            <size>small</size>
          </item>
          <item>
            <milk>whole</milk>
            <name>cappuccino</name>
            <quantity>1</quantity>
            <size>large</size>
          </item>
        </items>
        <status>served</status>
      </order>
  96. 96. WCF Implementation for PUT
      [ServiceContract]
      public interface IOrderingService
      {
          [OperationContract]
          [WebInvoke(Method = "PUT", UriTemplate = "/orders/{orderId}")]
          void UpdateOrder(string orderId, Order order);
          // …
      }
  97. 97. WCF Serializable Types
      [DataContract(Namespace = "", Name = "order")]
      public class Order
      {
          [DataMember(Name = "location")]
          public Location ConsumeLocation
          {
              get { return location; }
              set { location = value; }
          }

          [DataMember(Name = "items")]
          public List<Item> Items
          {
              get { return items; }
              set { items = value; }
          }

          [DataMember(Name = "status")]
          public Status OrderStatus
          {
              get { return status; }
              set { status = value; }
          }
          // …
      }
  98. 98. Remove with DELETE (diagram: the Ordering Client sends DELETE /orders/1234 to the Ordering Service; possible responses are 200 OK, 404 Not Found, 405 Method Not Allowed, or 503 Service Unavailable)
  99. 99. DELETE Semantics•  Stop the resource from being accessible –  Logical delete, not necessarily physical –  This is important for decoupling implementation details from resources
      •  Request
          DELETE /orders/1234 HTTP/1.1
          Host:
      •  Response
          HTTP/1.1 200 OK
          Content-Length: 0
          Date: Tue, 16 Dec 2008 17:40:11 GMT
  100. 100. When DELETE goes wrong•  Simple case, DELETE again! –  Delete is idempotent! –  DELETE once, DELETE 10 times has the same effect: one deletion
      HTTP/1.1 404 Not Found
      Content-Length: 0
      Date: Tue, 16 Dec 2008 17:42:12 GMT
  101. 101. When DELETE goes Really Wrong•  Look out for 405 and 409!•  Some 4xx responses indicate that deletion isn’t possible –  The state of the resource isn’t compatible –  Try forward/backward compensation instead –  Can’t delete an order that’s already served
      HTTP/1.1 409 Conflict
      Content-Length: 379
      Date: Tue, 16 Dec 2008 17:53:09 GMT

      <order xmlns="http://">
        <location>takeAway</location>
        <items>
          <item>
            <name>latte</name>
            <milk>whole</milk>
            <size>small</size>
            <quantity>2</quantity>
          </item>
          <item>
            <name>cappuccino</name>
            <milk>skim</milk>
            <size>large</size>
            <quantity>1</quantity>
          </item>
        </items>
        <status>served</status>
      </order>
  102. 102. CRUD is Good?•  CRUD is good –  But it’s not great•  CRUD-style services use some HTTP features•  But the application model is limited –  Suits database-style applications –  Hence frameworks like Microsoft’s Astoria•  CRUD has limitations –  CRUD ignores hypermedia –  CRUD encourages tight coupling through URI templates –  CRUD encourages server and client to collude•  The Web supports more sophisticated patterns than CRUD!
  103. 103. Tech InterludeDescribing CRUD Services
  104. 104. WADL Overview•  Web Application Description Language –  A contract language for Web-based services•  Think of a WADL contract as a site map for the programmatic Web –  Or like a resource-oriented WSDL for Web-based services•  Purpose: help tools generate clients which interact with a known set of resources•  Suits services with a static set of URIs or URI templates•  Does not suit hypermedia services •  Which we’ll see later!
  105. 105. WADL Close Up
      <resources base="">                                  <!-- Base URI -->
        <resource path="newsSearch">                       <!-- Local resource path; can be a template -->
          <method name="GET" id="search">                  <!-- Verb -->
            <request>                                      <!-- Request declaration -->
              <!-- Form parameters -->
              <param name="appid" type="xsd:string" style="query" required="true"/>
              <param name="query" type="xsd:string" style="query" required="true"/>
              <param name="type" style="query" default="all">
                <!-- Form options -->
                <option value="all"/>
                <option value="any"/>
                <option value="phrase"/>
              </param>
              ...
            </request>
            <response>                                     <!-- Response declaration -->
              <!-- XML response: results or fault -->
              <representation mediaType="application/xml" element="yn:ResultSet"/>
              <fault status="400" mediaType="application/xml" element="ya:Error"/>
            </response>
          </method>
        </resource>
      </resources>
  106. 106. A more Programmatic WADL•  The previous WADL example showed the Yahoo! news service•  But it looked like a meta-description of the HTML forms•  WADL seems more natural for the programmatic Web when it uses other formats –  XML, JSON, etc
  107. 107. WADL and XML Resources•  Path is templated•  PUT new order, with the order as an XML document•  Order returned as XML on success, or 409 with a helpful XML error message otherwise <resources base=""> <resource path="{order}"> <method name="PUT"> <request> <representation mediaType="application/xml" element="r:Order"/> </request> <response> <representation mediaType="application/xml" element="r:Order"/> <fault status="409" mediaType="application/xml" element="r:OrderError"/> </response> ...
  108. 108. WADL Issues•  Is WADL another WSDL and therefore evil? –  It is static, but the Web is dynamic •  But do we want the programmatic Web to be dynamic? •  Or do we want to have constraints around the resources we interact with?•  Can WADL avoid being another WSDL? –  We could return WADL documents when we move outside the scope of the original document –  Seems impractical, tightly coupled•  Can we use existing Web techniques for specifying contracts instead?
  109. 109. Tech Interlude Semantics
  110. 110. Microformats•  Microformats are an example of little “s” semantics•  Innovation at the edges of the Web –  Not by some central design authority (e.g. W3C)•  Started by embedding machine-processable elements in Web pages –  E.g. Calendar information, contact information, etc –  Using existing HTML features like class, rel, etc
  111. 111. Semantic versus semantic•  Semantic Web is top-down –  Driven by the W3C with extensive array of technology, standards, committees, etc –  Has not currently proven as scalable as the visionaries hoped •  RDF triples have been harvested and processed in private databases•  Microformats are bottom-up –  Little formal organisation, no guarantee of interoperability –  Popular formats tend to be adopted (e.g. hCard) –  Easy to use and extend for our systems –  Trivial to integrate into current and future programmatic Web systems
  112. 112. Microformats and Resources•  Use Microformats to structure resources where formats exist –  I.e. Use hCard for contacts, hCalendar for data•  Create your own formats (sparingly) in other places –  Annotating links is a good start –  <link rel="" .../> –  <link rel="" type="application/atom+xml" href="{post-uri}" title="some title">•  The rel attribute describes the semantics of the referred resource
  113. 113. Tech Interlude Hypermedia Formats
  114. 114. Media Types Rule!•  WADL takes an enterprise-y approach to the Web –  Make it static, dumb it down, sell tools to process it•  This is not how the Web works!•  The Web’s contracts are expressed in terms of media types –  If you know the type, you can process the content•  Some types are special because they work in harmony with the Web –  We call these “hypermedia formats”
  115. 115. Other Resource Representations•  Remember, XML is not the only way a resource can be serialised –  Remember the Web is based on REpresentational State Transfer•  The choice of representation is left to the implementer –  Can be a standard registered media type –  Or something else•  But there is a division on the Web between two families –  Hypermedia formats •  Formats which host URIs and links –  Regular formats •  Which don’t
  116. 116. Plain Old XML is not Hypermedia Friendly•  Where are the links?•  Where’s the protocol? HTTP/1.1 200 OK Content-Length: 227 Content-Type: application/xml Date: Wed, 19 Nov 2008 21:48:10 GMT <order xmlns=""> <location>takeAway</location> <items> <item> <name>latte</name> <quantity>1</quantity> <milk>whole</milk> <size>small</size> </item> </items> <status>pending</status> </order>
  117. 117. So what?•  How do you know the next thing to do?•  How do you know the resources you’re meant to interact with next?•  In short, how do you know the service’s protocol? –  Turn to WADL? Yuck! –  Read the documentation? Come on! –  URI Templates? Tight Coupling!
  118. 118. URI Templates are NOT a Hypermedia Substitute•  Often URI templates are used to advertise all resources a service hosts –  Do we really need to advertise them all?•  This is verbose•  This is out-of-band communication•  This encourages tight-coupling to resources through their URI template•  This has the opportunity to cause trouble! –  Knowledge of “deep” URIs is baked into consuming programs –  Service encapsulation is weak and consumers will program to it –  Service will change its implementation and break consumers
  119. 119. Bad Ideas with URI Templates•  Imagine we’ve created an order, what next?•  We could share this URI template: –{order_id}•  The order_id field should match the order ID that came from the Restbucks service –  Sounds great!•  But what if Restbucks outsources payment? –  Change the URI for payments, break the template, break consumers!•  Be careful what you share!
  120. 120. Better Ideas for URI Templates: Entry Points•  Imagine that we have a well-known entry point to our service –  Which corresponds to a starting point for a protocol•  Why not advertise that with a URI template?•  For example: –{store_id}/ {barista_id}•  Changes infrequently•  Is important to Restbucks•  Is transparent, and easy to bind to
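For an advertised entry point, expanding a template is a small piece of client code. The sketch below handles only simple `{name}` variables; the `expand` helper, the path-only `ENTRY_POINT` template and the sample values are assumptions for illustration (the host part of the original template is not shown on the slide).

```python
import re
from urllib.parse import quote


def expand(template: str, **values) -> str:
    """Expand simple {name} variables in a URI template, percent-encoding each value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for template variable {name!r}")
        return quote(str(values[name]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


# Hypothetical entry-point template, path only.
ENTRY_POINT = "/{store_id}/{barista_id}"
print(expand(ENTRY_POINT, store_id="soho", barista_id="jim"))
```

Keeping the template to a single well-known entry point limits how much URI structure gets baked into the consumer; everything past the entry point should come from links.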
  121. 121. Best Idea for URI Templates: Documentation!•  Services tend to support lots of resources•  We need a shorthand for talking about a large number of resources easily•  We can use a URI template for each “type” of resource that a service (or services) supports•  But we don’t share this information with others –  Don’t violate encapsulation!•  Diagram labels: Internal URI Templates, External URI Templates; /payment/{order_id}, /{store}/orders, /order/{order_id}
  122. 122. application/xml is not the media type you’re looking for •  Remember that HTTP is an application protocol –  Headers and representations are intertwined –  Headers set processing context for representations –  Unlike SOAP which can safely ignore HTTP headers •  It has its own header model •  Remember that application/xml has a particular processing model –  Which doesn’t include understanding the semantics of links •  Remember if a representation is declared in the Content- Type header, you must treat it that way –  HTTP is an application protocol – did you forget already?  •  We need real hypermedia formats!
  123. 123. Hypermedia Formats•  Standard –  Wide “reach” –  Software agents already know how to process them –  But sometimes need to be shoe-horned•  Self-created –  Can craft specifically for domain –  Semantically rich –  But lack reach
  124. 124. Two Common Hypermedia Formats: XHTML and ATOM•  Both are commonplace today•  Both are hypermedia formats –  They contain links•  Both have a processing model that explicitly supports links•  Which means both can describe protocols…
  125. 125. XHTML•  XHTML is just HTML that is also XML•  For example (a default XML namespace plus other XML namespaces): <html xmlns="" xml:lang="en" lang="en" xmlns:r=""> <head> <title> XHTML Example </title> </head> <body> <p> ...
  126. 126. What’s the big deal with XHTML? •  It does two interesting things: 1.  It gives us document structure 2.  It gives us links •  So? 1.  We can understand the format of those resources 2.  We can discover other resources! •  How? 1.  Follow the links! 2.  Interact with the resources through the uniform interface •  Contrast this with the APP approach...similar!
  127. 127. XHTML in Action•  Business data: the class-annotated elements•  “Hypermedia Control”: the payment link <html xmlns=""> <body> <div class="order"> <p class="location">takeAway</p> <ul class="items"> <li class="item"> <p class="name">latte</p> <p class="quantity">1</p> <p class="milk">whole</p> <p class="size">small</p> </li> </ul> <a href="" rel="payment">payment</a> </div> </body> </html>
  128. 128. application/xhtml+xml•  Can ask which verb the resource at the end of the link supports –  Via HTTP OPTIONS•  No easy way to tell what each link actually does –  Does it buy the music? –  Does it give you lyrics? –  Does it vote for a favourite album?•  We lack semantic understanding of the linkspace and resources –  But we have microformats for that semantic stuff!•  Importantly XHTML is a hypermedia format –  It contains hypermedia controls that can be used to describe protocols!
  129. 129. Atom Syndication Format•  We’ll study this in more depth later, but for now…•  The application/atom+xml media type is hypermedia aware•  You should expect links when processing such representations•  And be prepared to do things with them!•  Links to other resources, a nascent protocol HTTP/1.1 200 OK Content-Length: 342 Content-Type: application/atom+xml Date: Sun, 22 Mar 2009 17:04:10 GMT <entry xmlns=" 2005/Atom"> <title>Order 1234</title> <link rel="payment" href="http://"/> <link rel="special-offer" href=" offers/freeCookie"/> <id> 1234</id> <updated>2009-03-22T16:57:02Z</updated> <summary>1x Cafe Latte</summary> </entry>
  130. 130. application/atom+xml•  No easy way to tell what each link actually does –  But look at the way the rel attribute is being used –  Can we inject semantics there?•  Atom is a hypermedia format –  Both feeds and entries contain hypermedia controls that can describe protocols
  131. 131. application/vnd.restbucks+xml •  What a mouthful! •  The vnd namespace is for proprietary media types –  As opposed to the IANA-registered ones •  Restbucks XML is a hybrid –  We use plain old XML to convey information –  And Atom link elements to convey protocol •  This is important, since it allows us to create RESTful, hypermedia aware services
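To make the "plain XML for information, links for protocol" idea concrete, here is a minimal client-side sketch that pulls the links out of such a representation. The `links_by_rel` helper and the example.org URI are hypothetical; the sample document mirrors the shape of the order representations used later in the deck.

```python
import xml.etree.ElementTree as ET

# Sample representation; the href is a made-up placeholder URI.
REPRESENTATION = """\
<order xmlns="urn:restbucks">
  <drink>latte</drink>
  <link rel="payment" href="https://example.org/payment/order/1234" type="application/xml"/>
</order>
"""


def links_by_rel(xml_text: str) -> dict:
    """Map each link's rel value to its href, whatever namespace the link lives in."""
    root = ET.fromstring(xml_text)
    links = {}
    for element in root.iter():
        # ElementTree spells namespaced tags as "{namespace}local", so strip the prefix.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "link" and "rel" in element.attrib:
            links[element.attrib["rel"]] = element.attrib.get("href")
    return links


print(links_by_rel(REPRESENTATION))
```

A consumer written this way binds to rel values such as `payment` rather than to URI structure, so the service can move the payment resource without breaking it.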
  132. 132. Hypermedia and RESTful Services
  133. 133. Revisiting Resource Lifetime•  On the Web, the lifecycle of a single resource is more than: –  Creation –  Updating –  Reading –  Deleting•  Can also get metadata –  About the resource –  About the (subset of) verbs it understands•  And resources tell us about other resources we might want to interact with… –  A protocol!
  134. 134. Links•  Connectedness is good in Web-based systems•  Resource representations can contain other URIs•  Links act as state transitions•  Application (conversation) state is captured in terms of these states
  135. 135. Describing Contracts with Links•  The value of the Web is its “linked-ness” –  Links on a Web page constitute a contract for page traversals•  The same is true of the programmatic Web•  Use Links to describe state transitions in programmatic Web services –  By navigating resources you change application state•  Hypermedia formats support this –  Allow us to describe higher-order protocols which sit comfortably atop HTTP –  Hence application/vnd.restbucks+xml
  136. 136. Links are State Transitions
  137. 137. Links as APIs•  Following a link causes an action to occur•  This is the start of a state machine!•  Links lead to other resources which also have links•  Can make this stronger with semantics –  Microformats <confirm xmlns="..."> <link rel="payment" href="https://pay" type="application/xml"/> <link rel="postpone" href="https://wishlist" type="application/xml"/> </confirm>
  138. 138. We have a framework!•  The Web gives us a processing and metadata model –  Verbs and status codes –  Headers•  Gives us metadata contracts or Web “APIs” –  URI Templates –  Links•  Strengthened with semantics –  Little “s”
  139. 139. Richardson Model Level 3•  Lots of URIs that address resources•  Embraces HTTP as an application protocol•  Resource representations and formats identify other resources –  RESTful at last!
  140. 140. Workflow•  How does a typical enterprise workflow look when it’s implemented in a Web-friendly way?•  Let’s take Starbucks as an example; the happy path is: –  Make selection •  Add any specialities –  Pay –  Wait for a while –  Collect drink
  141. 141. Workflow and MOM•  With Web Services we exchange messages with the service•  Resource state is hidden from view•  Conversation state is all we know –  Advertise it with SSDL, BPEL•  Uniform interface, roles defined by SOAP –  No “operations”
  142. 142. Web-friendly Workflow•  What happens if workflow stages are modelled as resources?•  And state transitions are modelled as hyperlinks or URI templates?•  And events modelled by traversing links and changing resource states?•  Answer: we get Web-friendly workflow –  With all the quality of service provided by the Web•  So let’s see how we order a coffee at… –  This is written up on the Web: •
  143. 143. Placing an Order•  Place your order by POSTing it to a well- known URI – Starbuck’s Service Client
  144. 144. Placing an Order: On the Wire•  Request POST /order HTTP/1.1 Content-Length: ... <order xmlns="urn:restbucks"> <drink>latte</drink> </order>•  Response 201 Created Location: order/1234 Content-Type: application/vnd.restbucks+xml Content-Length: ... <order xmlns="urn:restbucks"> <drink>latte</drink> <link rel="payment" href=" payment/order/1234" type="application/xml"/> </order>•  A link! Is this the start of an API?•  If we have a (private) microformat, this can become a neat API!
  145. 145. Whoops! A mistake•  I like my coffee to taste like coffee!•  I need another shot of espresso –  What are my OPTIONS?•  Request OPTIONS /order/1234 HTTP/1.1 Host:•  Response 200 OK Allow: GET, PUT•  Phew! I can update my order, for now
  146. 146. Optional: Look Before You Leap•  See if the resource has changed since you submitted your order –  If you’re fast your drink hasn’t been prepared yet•  Request PUT /order/1234 HTTP/1.1 Host: Expect: 100-Continue•  Response 100 Continue•  I can still PUT this resource, for now (417 Expectation Failed otherwise)
  147. 147. Amending an Order•  Add specialities to your order via PUT –  Restbucks needs 2 shots! Starbuck’s Service Client
  148. 148. Amending an Order: On the Wire•  Request PUT /order/1234 HTTP/1.1 Host: Content-Type: application/vnd.restbucks+xml Content-Length: ... <order xmlns="urn:restbucks"> <drink>latte</drink> <additions>shot</additions> <link rel="payment" href=" payment/order/1234" type="application/xml"/> </order>•  Response 200 OK Location: order/1234 Content-Type: application/vnd.restbucks+xml Content-Length: ... <order xmlns="urn:restbucks"> <drink>latte</drink> <additions>shot</additions> <link rel="payment" href=" payment/order/1234" type="application/xml"/> </order>
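On the client side, the answer to OPTIONS boils down to reading the Allow header. A minimal sketch, with hypothetical helper names `allowed_methods` and `can`:

```python
def allowed_methods(allow_header: str) -> set:
    """Split an Allow header value such as 'GET, PUT' into a set of method names."""
    return {token.strip() for token in allow_header.split(",") if token.strip()}


def can(method: str, allow_header: str) -> bool:
    """True if the resource advertised the given method in its Allow header."""
    return method in allowed_methods(allow_header)


print(can("PUT", "GET, PUT"))  # the order can still be amended
print(can("PUT", "GET"))       # the order has become read-only
```

The answer is only a snapshot: as the later slides show, the set of allowed verbs can shrink between the OPTIONS request and the PUT that follows it.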
  149. 149. Side Note: PATCH•  PUT demands a full representation as the entity body –  Onerous if the payload is large•  PATCH is a verb that supports diffs –  The server figures out how to apply those diffs•  Still a proposal at this point –  Boo!
  150. 150. Statelessness•  Remember interactions with resources are stateless•  The resource “forgets” about you while you’re not directly interacting with it•  Which means race conditions are possible•  Use If-Unmodified-Since on a timestamp to make sure –  Or use If-Match and an ETag•  You’ll get a 412 Precondition Failed if you lost the race –  But you’ll avoid potentially putting the resource into some inconsistent state
  151. 151. Warning: Don’t be Slow!•  Can only make changes until someone actually makes your drink –  You’re safe if you use If-Unmodified-Since or If-Match –  But resource state can change without you!•  Request PUT /order/1234 HTTP/1.1 Host: ...•  Response 409 Conflict•  Too slow! Someone else has changed the state of my order•  Request OPTIONS /order/1234 HTTP/1.1 Host:•  Response Allow: GET
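The If-Match guard can be sketched with a small in-memory store. Everything here is an assumption for illustration: the `OrderStore` class, the choice of a truncated SHA-256 hash as the ETag, and the simplified return values (a status code plus the current ETag).

```python
import hashlib


class OrderStore:
    """Toy in-memory resource store that honours If-Match on PUT."""

    def __init__(self):
        self._bodies = {}

    @staticmethod
    def etag_for(body: str) -> str:
        # Assumption: derive the ETag from the representation's content.
        return '"' + hashlib.sha256(body.encode("utf-8")).hexdigest()[:16] + '"'

    def get(self, uri: str):
        body = self._bodies[uri]
        return 200, self.etag_for(body), body

    def put(self, uri: str, body: str, if_match=None):
        current = self._bodies.get(uri)
        if if_match is not None:
            # The precondition fails if the resource is missing or has changed.
            if current is None:
                return 412, None
            if if_match != self.etag_for(current):
                return 412, self.etag_for(current)
        self._bodies[uri] = body
        return (201 if current is None else 200), self.etag_for(body)


store = OrderStore()
_, first_tag = store.put("/order/1234", "<order><drink>latte</drink></order>")
print(store.put("/order/1234", "<order><drink>mocha</drink></order>", if_match=first_tag))
# A second writer still holding the old ETag loses the race.
print(store.put("/order/1234", "<order><drink>tea</drink></order>", if_match=first_tag))
```

The losing client gets a 412 and the resource keeps the winner's state, which is the behaviour the slide describes.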
  152. 152. Order Confirmation•  Check your order status by GETing it Starbuck’s Service Client
  153. 153. Order Confirmation: On the Wire•  Request GET /order/1234 HTTP/1.1 Host:•  Response 200 OK Location: order/1234 Content-Type: application/vnd.restbucks+xml Content-Length: ... <order xmlns="urn:restbucks"> <drink>latte</drink> <additions>shot</additions> <link rel="payment" href="https:// 1234" type="application/xml"/> </order>•  Are they trying to tell me something with hypermedia?
  154. 154. Order Payment•  PUT your payment to the order resource Starbuck’s Service Client New resource!
  155. 155. How did I know to PUT?•  The client knew the URI to PUT to from the link –  PUT is also idempotent (can safely re-try) in case of failure•  Verified with OPTIONS –  Just in case you were in any doubt•  Request OPTIONS /payment/order/1234 HTTP/1.1 Host:•  Response Allow: GET, PUT
  156. 156. Order Payment: On the Wire•  Request PUT /payment/order/1234 HTTP/1.1 Host: Content-Type: application/xml Content-Length: ... <payment xmlns="urn:restbucks"> <cardNo>123456789</cardNo> <expires>07/07</expires> <name>John Citizen</name> <amount>4.00</amount> </payment>•  Response 201 Created Location: https:// 1234 Content-Type: application/xml Content-Length: ... <payment xmlns="urn:restbucks"> <cardNo>123456789</cardNo> <expires>07/07</expires> <name>John Citizen</name> <amount>4.00</amount> </payment>
  157. 157. Check that you’ve paid•  Request GET /order/1234 HTTP/1.1 Host:•  Response 200 OK Content-Type: application/vnd.restbucks+xml Content-Length: ... <order xmlns="urn:restbucks"> <drink>latte</drink> <additions>shot</additions> </order>•  My “API” has changed, because I’ve paid enough now
  158. 158. What Happened Behind the Scenes?•  Restbucks can use the same resources!•  Plus some private resources of their own –  Master list of coffees to be prepared•  Authenticate to provide security on some resources –  E.g. only Restbucks is allowed to view payments
  159. 159. Payment•  Only Restbucks systems can access the record of payments –  Using the URI template: http://.../payment/order/{order_id}•  We can use HTTP authorisation to enforce this•  Request GET /payment/order/1234 HTTP/1.1 Host:•  Response 401 Unauthorized WWW-Authenticate: Digest realm="", qop="auth", nonce="ab656...", opaque="b6a9..."•  Request GET /payment/order/1234 HTTP/1.1 Host: restbucks.com Authorization: Digest username="jw" realm="" nonce="..." uri="payment/order/1234" qop=auth nc=00000001 cnonce="..." response="..." opaque="..."•  Response 200 OK Content-Type: application/xml Content-Length: ... <payment xmlns="urn:restbucks"> <cardNo>123456789</cardNo> <expires>07/07</expires> <name>John Citizen</name> <amount>4.00</amount> </payment>
  160. 160. Master Coffee List•  /orders URI for all orders, only accepts GET –  Anyone can use it, but it is only useful for Restbucks –  It’s not identified in any of our public APIs anywhere, but the back-end systems know the URI•  Request GET /orders HTTP/1.1 Host:•  Response (an Atom feed!) 200 OK Content-Type: application/atom+xml Content-Length: ... <?xml version="1.0" ?> <feed xmlns=""> <title>Coffees to make</title> <link rel="alternate" href="http://"/> <updated>2007-07-10T09:18:43Z</updated> <author><name>Johnny Barrista</name></author> <id>urn:starkbucks:45ftis90</id> <entry> <link rel="alternate" type="application/xml" href=""/> <id>urn:restbucks:a3tfpfz3</id> </entry> ... </feed>
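The `response` value in the Authorization header is not magic: with `qop=auth` it is an MD5 hash over the credentials, the challenge and the request line, as defined for HTTP Digest authentication in RFC 2617. The sketch below assumes the default MD5 algorithm; the function names are invented for this example, and the sample values come from the worked example in RFC 2617 rather than from Restbucks.

```python
import hashlib


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' field for qop=auth with the MD5 algorithm."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = _md5_hex(f"{method}:{uri}")                  # what they are asking for
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Inputs taken from the RFC 2617 worked example, not from the slides.
print(digest_response(
    username="Mufasa", realm="testrealm@host.com", password="Circle Of Life",
    method="GET", uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", nc="00000001", cnonce="0a4f113b",
))
```

The server repeats the same computation with its stored credentials and compares the results, so the password itself never crosses the wire.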
  161. 161. Finally drink your coffee...Source:
  162. 162. What did we learn from Restbucks? •  HTTP has a header/status combination for every occasion •  APIs are expressed in terms of links, and links are great! –  APP-esque APIs •  APIs can also be constructed with URI templates and inference –  But beware tight coupling outside of CRUD services! •  XML is fine, but we could also use formats like Atom, JSON or even default to XHTML as a sensible middle ground •  State machines (defined by links) are important –  Just as in Web Services…
  163. 163. Scalability
  164. 164. Statelessness•  Every action happens in isolation –  This is a good thing!•  In between requests the server knows nothing about you –  Excepting any state changes you caused when you last interacted with it.•  Keeps the interaction protocol simpler –  Makes recovery, scalability, failover much simpler too –  Avoid cookies!
  165. 165. Application vs Resource State•  Useful services hold persistent data – Resource state –  Resources are buckets of state –  What use is Google without state?•  Brittle implementations have application state –  They support long-lived conversations –  No failure isolation –  Poor crash recovery –  Hard to scale, hard to do fail-over fault tolerance•  Recall stateless Web Services – same applies in the Web too!
  166. 166. Stateful ExampleWhat if there’s a failure here?
  167. 167. Stateful Failure“Grandmother” Antipattern
  168. 168. Stateless System Tolerates Intermittent Failures
  169. 169. Scaling Horizontally•  Web farms have delivered horizontal scaling for years –  Though they sometimes do clever things with session affinity to support cookie-based sessions•  In the programmatic Web, statelessness enables scalability –  Just like in the Web Services world
  170. 170. Scalable Deployment Configuration •  Deploy services onto many servers •  Services are stateless –  No cookies! •  Servers share only back-end data
  171. 171. Scaling Vertically…without servers •  The most expensive round-trip: –  From client –  Across network –  Through servers –  Across network again –  To database –  And all the way back! •  The Web tries to short-circuit this –  By determining early if there is any actual work to do! –  And by caching
  172. 172. Caching•  Caching is about scaling vertically –  As opposed to horizontally•  Making a single service run faster –  Rather than getting higher overall throughput•  In the programmatic Web it’s about reducing load on servers –  And reducing latency for clients
  173. 173. Caching in a Scalable Deployment •  Cache (reverse proxy) in front of server farm –  Avoid hitting the server •  Proxy at client domain –  Avoid leaving the LAN •  Local cache with client –  Avoid using the network
  174. 174. Being workshy is a good thing!•  Provide guard clauses in requests so that servers can determine easily if there’s any work to be done –  Caches too•  Use headers: –  If-Modified-Since –  If-None-Match –  And friends•  Web infrastructure uses these to determine if it’s worth performing the request –  And often it isn’t –  So an existing representation can be returned
  175. 175. Conditional GET Avoids Work!•  Bandwidth-saving pattern•  Requires client and server to work together•  Server sends Last-Modified and/or ETag headers with representations•  Client sends back those values when it interacts with resource in If-Modified-Since and/or If-None-Match headers•  Server responds with a 304 Not Modified and an empty body if there have been no updates to that resource state•  Or gives a new resource representation (with new Last-Modified and/or ETag headers)•  ETag is an opaque identifier for a particular resource/version
  176. 176. Retrieving a Resource Representation•  Request GET /orders/1234 HTTP/1.1 Host: Accept: application/vnd.restbucks+xml If-Modified-Since: 2009-01-08T15:00:34Z If-None-Match: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432•  Response 200 OK Content-Type: application/vnd.restbucks+xml Content-Length: ... Last-Modified: 2009-01-08T15:10:32Z ETag: abbb4828-93ba-567b-6a33-33d374bcad39 <order … />
  177. 177. Not Retrieving a Resource Representation•  Request GET /orders/1234 HTTP/1.1 Host: restbucks.com Accept: application/vnd.restbucks+xml If-Modified-Since: 2009-01-08T15:00:34Z If-None-Match: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432•  Response HTTP/1.1 304 Not Modified•  Client’s representation of the resource is up-to-date
  178. 178. Works with other verbs too PUT /orders/1234 HTTP/1.1 Host: restbucks.com Accept: application/vnd.restbucks+xml If-Modified-Since: 2007-07-08T15:00:34Z If-None-Match: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432 <order …/>
  179. 179. PUT Results in no Work Done 200 OK Content-Type: application/xml Content-Length: ... Last-Modified: 2007-07-08T15:00:34Z ETag: aabd653b-65d0-74da-bc63-4bca-ba3ef3f50432
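The server's half of a conditional GET is a single comparison. A minimal sketch, assuming ETag-only validation; the `conditional_get` function and its tuple return shape are invented for this example:

```python
def conditional_get(current_etag: str, body: bytes, if_none_match=None):
    """Return (status, headers, body) for a GET guarded by If-None-Match."""
    if if_none_match == current_etag:
        # The client's copy is current: 304 Not Modified carries no body.
        return 304, {"ETag": current_etag}, b""
    headers = {"ETag": current_etag, "Content-Length": str(len(body))}
    return 200, headers, body


representation = b"<order><drink>latte</drink></order>"
print(conditional_get('"v2"', representation, if_none_match='"v2"')[0])
print(conditional_get('"v2"', representation, if_none_match='"v1"')[0])
```

Caches and reverse proxies can make the same decision without ever reaching the origin server, which is where the real saving comes from.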
  180. 180. Atom and AtomPub
  181. 181. Syndication History•  Originally syndication was used to provide feeds of information –  Same information available on associated Web sites•  Intended to be part of the "push" Web –  And allow syndication etc•  RSS was the primary driver here –  Several versions, loosely described •  Simple!•  ATOM followed –  Format and protocol –  Richer than RSS and now being used for the programmatic Web
  182. 182. Syndication•  Syndication: re-publishing articles collected from a variety of news sources•  Aggregation, grouping now commonplace on the Web –  Because we’re not tied to print
  183. 183. Feed Architecture Uniform Company feed interface! Public aggregator News feed Blog feed RSS Client (rich client,web app, etc) News feed Uniform interface! Bank account
  184. 184. Atom•  Atom is a format –  Aka Atom Syndication Format –  XML format for lists of (time-stamped) entries • Aka feeds –  Not to be confused with Atom Publishing Protocol
  185. 185. Atom Feeds•  Atom feeds contain useful information aimed at supporting publishing –  Its primary domain is weblogs, syndication, etc•  Better than XHTML in this case? –  Because it’s more specific to the problem domain•  Atom lists are known as feeds•  Items in Atom lists are known as entries
  186. 186. Anatomy of an Atom Feed•  HTTP metadata –  Media type: application/atom+xml•  Feed metadata <?xml version="1.0" encoding="utf-8"?> <feed xmlns=""> <title>GET Connected</title> <link rel="alternate" href=""/> <updated>2007-07-01T13:00:44Z</updated> <author><name>Jim Webber</name></author> <contributor><name>Savas Parastatidis</name></contributor> <contributor><name>Ian Robinson</name></contributor> <id>urn:ab45fe7e-7ff3-886c-11d2-7da3fe465322</id>
  187. 187. More Anatomy of an Atom Feed•  The links are the AtomPub API! (more later) <entry> <title>Chapter 10 complete, says Webber</title> <link rel="service.edit" type="application/atom+xml" href="http://restbucks/c10.aspx"/> <link rel="" type="application/atom+xml" href=""/> <id>urn:dd64ef10-975d-23de-13fa-33d32117acb432</id> <updated>2007-07-01T13:00:44Z</updated> <summary>Chapter 10 deals with the comparison of Web and Web Services approaches to building distributed applications. </summary> <category scheme="" term="local" label="book news"/> </entry> </feed>
  188. 188. Atom Feeds Analogy•  Diagram labels: Resource, Location, Name, Entry name, Creator(s), Content
  189. 189. Atom Feeds and Resources•  Atom is just a resource representation –  Like XHTML but structured for lists of things•  An Atom feed is a good resource representation for returning resources in response to a query –  It’s also hypermedia-friendly!
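Because a feed is just XML with a known media type, a consumer can walk it with a standard parser. The sketch below uses Python's ElementTree; the sample feed, its example.org URIs and the `summarise` helper are made up for illustration, while the namespace is the standard Atom one.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed for this sketch; the hrefs are placeholders.
FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Coffees to make</title>
  <updated>2007-07-10T09:18:43Z</updated>
  <entry>
    <title>Order 1234</title>
    <id>urn:example:order:1234</id>
    <updated>2007-07-10T09:18:43Z</updated>
    <link rel="alternate" type="application/xml" href="https://example.org/order/1234"/>
  </entry>
</feed>
"""


def summarise(feed_xml: str):
    """Return the feed title and, per entry, its title and links keyed by rel."""
    feed = ET.fromstring(feed_xml)
    entries = []
    for entry in feed.findall(f"{ATOM}entry"):
        # In Atom, a link without a rel attribute is treated as rel="alternate".
        links = {link.get("rel", "alternate"): link.get("href")
                 for link in entry.findall(f"{ATOM}link")}
        entries.append({"title": entry.findtext(f"{ATOM}title"), "links": links})
    return feed.findtext(f"{ATOM}title"), entries


print(summarise(FEED))
```

The links in each entry are what make the feed hypermedia-friendly: a consumer can go from the list to each individual resource without knowing any URI templates.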
  190. 190. Atom Extensibility•  Q: What if your resource representations don’t fit in Atom entries directly?•  A: Use your own data! –  This will be ignored if your client application doesn’t know the namespace <entry> <title>Chapter 10 complete, says Webber</title> ... <jw:openIssues xmlns:jw=""> <jw:issue title="Colour diagrams degraded in monochrome"> <jw:status>closed</jw:status> <jw:actionTaken date="2007-06-28T16:44:12Z"> <jw:takenBy></jw:takenBy> <jw:description>re-drew all diagrams</jw:description> </jw:actionTaken> </jw:issue> ... </jw:openIssues> </entry>
  191. 191. Atom Publishing Protocol•  APP defines a set of resources that handle publishing Atom documents –  Four kinds of resources •  Collection •  Member •  Service Document •  Category Document –  And their representations on the wire•  Another uniform interface atop the HTTP uniform interface
  192. 192. APP: Collections•  Collection’s representation is an Atom feed•  APP defines semantics for the collection representation –  GET – retrieve the collection/feed –  POST – adds a new member to the collection •  Adds a new entry to the feed –  PUT and DELETE undefined by APP •  But probably should update a collection in place or delete a collection respectively
  193. 193. Service Document Example•  Represents a group of collections –  Think: service metadata!•  Media type: application/atomsvc+xml•  Key features: –  Workspaces •  Containers for collections –  Accept •  The media type of the collections –  Categories •  Metadata describing the kind of the collection•  Designed to help the discovery process
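The GET and POST semantics of a collection can be sketched with a toy in-memory implementation. The `Collection` class, its URIs and its simplified return values are assumptions for illustration. The member links use `rel="edit"`, the relation name from the final AtomPub specification (RFC 5023); the slides show the earlier draft-era `service.edit`.

```python
import itertools
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


class Collection:
    """Toy AtomPub-style collection: POST adds a member, GET returns a feed."""

    def __init__(self, uri: str, title: str):
        self.uri = uri.rstrip("/")
        self.title = title
        self._members = {}
        self._ids = itertools.count(1)

    def post(self, entry_title: str):
        # The server mints the member URI and reports it in the Location header.
        member_uri = f"{self.uri}/{next(self._ids)}"
        self._members[member_uri] = entry_title
        return 201, {"Location": member_uri}

    def get(self):
        ET.register_namespace("", ATOM_NS)  # serialise Atom as the default namespace
        feed = ET.Element(f"{{{ATOM_NS}}}feed")
        ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = self.title
        for member_uri, entry_title in self._members.items():
            entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
            ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = entry_title
            ET.SubElement(entry, f"{{{ATOM_NS}}}link", rel="edit", href=member_uri)
        return 200, ET.tostring(feed, encoding="unicode")


orders = Collection("/orders", "Coffees to make")
print(orders.post("latte"))
print(orders.get()[1])
```

A fuller version would accept a complete Atom entry on POST and add the required id and updated elements; this sketch keeps only the parts that show the collection semantics.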
  194. 194. Service Document Example<service xmlns="" xmlns:atom="http://"> <workspace> <atom:title>Genome Data</atom:title> What to POST to a <collection href=""> resource <atom:title>Human Genome</atom:title> <accept>application/atom+xml</accept> <categories href=""/> </collection> </workspace> <workspace> <atom:title>Cellular Images</atom:title> <collection href=""> What to POST to a <atom:title>Human Cells</atom:title> resource <accept>image/*</accept> <categories href=""/> </collection> </workspace>
  195. 195. Category Document•  More metadata –  Attach a meaningful category scheme to resources –  Think: namespaced microformat for Atom entries•  Media type: application/atomcat+xml
  196. 196. Category Document Example•  Can only use categories in this scheme (fixed="yes") <app:categories xmlns:app="" xmlns="" scheme="http://genomes-r-us/" fixed="yes"> <category term="human" label="human dna"/> <category term="fruitfly" label="insect dna"/> </app:categories>
  197. 197. Atom and Binary Payloads•  APP defines a custom HTTP header: –  Slug•  Which describes a binary payload while it’s being uploaded POST /music/indie HTTP/1.1 Host: jim.webber.name Content-type: audio/mpeg Content-length: 6281125 Slug: Hounds of Love by Futureheads [mp3 content here]•  And gets a response: 201 Created Location: love.atom
  198. 198. Where’s the file?•  Look again at the location: –•  It refers to an Atom entry not to the uploaded file! <entry> <title>Hounds of Love</title> <updated>2007-07-02T11:52:29Z</updated> <id>55442bcd-77ae-83ce-234d-0f2754ca754c</id> <summary/> <link rel="service.edit" type="audio/mpeg" href="http://" /> </entry>•  The title is derived from the Slug; the id is server generated; the link is server generated, based on the Slug
  199. 199. What Can I do with this?•  Anything you can do with any other Atom entry•  Add a summary via PUT, or change the title•  Delete the resource, or anything else that HTTP supports –  The binary file will be deleted too!•  The beauty of Atom is that it splits binaries into binary (which can’t go into a feed) and metadata (which can)
  200. 200. AtomPub: A URI-based API•  Each Atom service supports a number of URIs for manipulating its resources –  An API expressed in terms of links!•  PostURI (returns an Edit URI in the header) –  Per service –  Creates entries and sub-entries•  EditURI –  Per resource•  FeedURI –  Per query•  ResourcePostURI –  Per service
  201. 201. Post URI•  Post URI used to create entries –  Either full entries (e.g. weblog post), –  Or additions (e.g. comments)•  Client POSTs completed Atom Entry to this URI•  To create new entry, the link tag is used –  Link tag is used in HTML and Atom –  In head element (HTML), children of the Feed element (Atom).•  <link rel="" type="application/atom+xml" href="{post-uri}" title="some title">•  Look out! The rel tag is case sensitive!•  Note: Multiple link tags may appear together and can be distinguished by having different rel, type and title attributes.•  Source: ATOM API Quick Reference, Joe Gregorio
  202. 202. Edit URI•  Edit URI used to edit an existing Entry.•  Client does a GET on the URI to retrieve the Atom Entry,•  Then modifies the Entry and then PUTs the Entry back to the same URI.•  Can use DELETE method on the URI to remove the Entry•  A link tag in either HTML or Atom of the following form will point to the URI for editing an Entry.•  <link rel="service.edit" type="application/atom+xml" href="{edit-uri}" title="some title">
  203. 203. Feed URI•  Feed URI used to retrieve Atom feed•  Feed may contain just Entries for syndication, or may contain additional link tags for browsing/navigating around to other feeds•  A typical feed for syndication would be indicated by a link tag of the form:•  <link href="feed-uri" type="application/atom+xml" rel="feed" title="feed title"/>•  If a feed contains extra link tags for navigation, it might be supplied specifically for client to use, then it is of the form:•  <link href="feed-uri" type="application/atom+xml" rel="feed.browse" title="feed title"/> Or•  <link href="feed-uri" type="application/atom+xml" rel="prev" title="feed title"/>
  204. 204. Consuming Feeds in Applications•  Feeds on the Internet have so far been used to optimise the human Web –  Site summaries, blog posting, etc•  However feeds are a data structure –  And so potentially machine-processable•  Embedding machine-readable payloads means we have a vehicle for computer-to-computer interaction
  205. 205. Using Atom and AtomPub for Eventing Atom What does published information look like? XML