A web service is a method of communication between two electronic devices over a network.
The W3C defines a web service as a software system designed to support interoperable machine-to-machine interaction over a network.
Nowadays the web is full of web services (search engines, online stores, weblogs, wikis, calculators, games and so on), and so it is full of data. Machines fetch this data from the programmable web to perform tasks.
The programmable web is much the same as the Web that we humans interact with.
The main difference is that instead of arranging its data in attractive HTML pages with banners, ads and cute logos, the programmable web usually serves stark, brutal XML documents that are not necessarily meant for human consumption. Its data is intended as input to a software program that performs some task.
At its best, the programmable web works the same way as the human web. Its clients retrieve data from it and figure out what that data means, and they can also modify the programmable web, just like the human web.
The programmable web is based on HTTP and XML. Some parts of it serve HTML, JSON, plain text or binary documents, but most parts use XML, and it's all based on HTTP.
There are basically two ways of classifying the services that inhabit the programmable web: by the technologies they use (URIs, SOAP, XML-RPC and so on), or by the underlying architectures and design philosophies.
Most of today's terminology sorts services by their superficial appearance: the technology they use. These classifications work in most cases, but they're conceptually lacking and lead to mistakes. A taxonomy based on architecture is better, because it shows how technology choices follow from underlying design principles.
HTTP – The Common Thing of the Programmable Web
To classify the programmable web, it's better to start off with an overview of HTTP, the protocol that all web services have in common.
HTTP is a document-based protocol, in which the client puts a document in an envelope and sends it to the server. The server returns the favor by putting a response document in an envelope and sending it to the client. The protocol defines a strict format for the envelope, but it doesn't care what goes inside.
In HTTP terms, this envelope is called either a request or a response. The envelope a client sends to a server is an HTTP Request, and the envelope the server sends back is an HTTP Response.
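The envelope idea can be made concrete with a small sketch. The parser below is a simplified illustration, not a full HTTP implementation, and the request it parses (host, path and body) is invented for the example. It splits a raw request into the envelope (request line and headers) and the document inside it.

```python
def parse_http_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request into its envelope parts and its document."""
    # A blank line separates the envelope (request line + headers) from the entity-body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line of the envelope carries the method, the target and the version.
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,  # HTTP does not care what this document contains
    }


# Hypothetical request, used only for illustration.
raw_request = (
    "PUT /notes/1 HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "hello"
)

request = parse_http_request(raw_request)
print(request["method"], request["target"])
print(request["body"])
```

The strict part is the envelope format; the body is passed through untouched.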
The first burning question is: how can the client convey its intentions to the server? How does the server know that a certain request is a request to retrieve some data, instead of a request to delete that same data or to overwrite it with different data? Why should the server do this instead of doing that?
This information about what to do with the data can be called the Method Information.
One way to convey method information in a web service is to put it in the HTTP method. This is how RESTful web services do it.
The great advantage of HTTP method names is that they're standardized.
Some web services keep method information in the URI path or the request document.
For example, consider the Flickr web service. When someone sends an HTTP request to its search API, the server searches for photos. The HTTP method used here is GET.
But Flickr supports many methods, not just GET-type ones: it also offers methods such as flickr.photos.addTags and flickr.photos.comments.deleteComment, which modify data. All of them are invoked with an HTTP GET request, regardless of whether or not they get any data. So in practice Flickr puts its method information in the URI. Similarly, SOAP services don't put their method information in the HTTP method; they store it in the entity-body and in an HTTP header.
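The difference between the two placements can be sketched in a few lines. The function below assumes an RPC-style service that names its operation in a `method` query parameter. The host `api.example.com`, the paths and the other parameters are hypothetical; only the method name comes from the text above.

```python
from urllib.parse import urlsplit, parse_qs


def method_information(http_method: str, uri: str) -> str:
    """Return the operation the client is really asking for."""
    query = parse_qs(urlsplit(uri).query)
    # RPC-style: the operation is named in the URI, and the HTTP method is only a carrier.
    if "method" in query:
        return query["method"][0]
    # RESTful: the HTTP method itself is the method information.
    return http_method


# Hypothetical URIs, used only for illustration.
rpc_uri = "http://api.example.com/services/rest/?method=flickr.photos.addTags&photo_id=1&tags=penguin"
rest_uri = "http://api.example.com/photos/1"

print(method_information("GET", rpc_uri))
print(method_information("DELETE", rest_uri))
```

In the first call the HTTP method says GET while the request actually adds tags, which is exactly the mismatch described above.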
The second question is: how does the client tell the server which part of the data set to operate on? Given that the server understands that the client wants to delete some data, how can it know which data the client wants to delete? Why should the server operate on this data instead of that data?
This information can be called the Scoping Information. One obvious place to put it is in the URI path. That's what most web sites do.
For example, the URI http://www.google.com/search?q=REST tells the server that the client wants to get a list of search results about REST. Here the method information is GET and the scoping information is /search?q=REST.
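As a minimal sketch, the split between method information and scoping information for that request can be computed with Python's standard `urllib.parse` module. The helper name `split_request` and the shape of its return value are choices made for this illustration.

```python
from urllib.parse import urlsplit, parse_qs


def split_request(http_method: str, uri: str) -> dict:
    """Separate the method information from the scoping information of a request."""
    parts = urlsplit(uri)
    # Scoping information: everything in the URI after the host.
    scoping = parts.path + ("?" + parts.query if parts.query else "")
    return {
        "method_information": http_method,
        "scoping_information": scoping,
        # Seen as a method call, the query parameters are its arguments.
        "arguments": parse_qs(parts.query),
    }


info = split_request("GET", "http://www.google.com/search?q=REST")
print(info["method_information"])
print(info["scoping_information"])
print(info["arguments"])
```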
Many web services put scoping information in the path. In a service where the method information defines a method in the programming language sense, the scoping information can be seen as a set of arguments to that method.
The alternative is to put the scoping information into the entity-body. A typical SOAP service does it this way.
Generally, the service design determines what information is method information and what information is scoping information.
A web service can be considered RESTful if it satisfies the following constraints:
Client-server : Clients are separated from servers by a uniform interface.
Stateless : No client context is stored on the server between requests.
Cacheable : Clients are able to cache responses.
Layered system : A client cannot ordinarily tell whether it is connected directly to the end server, or to an intermediary along the way.
Uniform Interface : The uniform interface between clients and servers simplifies and decouples the architecture, which enables each part to evolve independently.
Code on demand (optional) : Servers are able to temporarily extend or customize the functionality of a client by transferring logic to it that it can execute.
Type 1: RESTful resource-oriented web services
In RESTful architectures, the method information goes into the HTTP method. In Resource-Oriented Architectures, the scoping information goes into the URI.
This combination is really powerful because given the first line of an HTTP request to a resource-oriented RESTful web service, one can easily understand basically what the client wants to do.
If the HTTP method doesn't match the method information, the service isn't RESTful. If the scoping information isn't in the URI, the service isn't resource-oriented. These aren't the only requirements, but they're good rules of thumb.
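A small sketch of why the first line is enough: when the method information is in the HTTP method and the scoping information is in the URI, a generic reader can state the intent of a request without knowing anything about the particular service. The wording of the descriptions and the sample URIs are made up for this illustration.

```python
# What each standard HTTP method asks the server to do with a resource.
INTENT = {
    "GET": "retrieve a representation of",
    "HEAD": "retrieve the metadata of",
    "PUT": "create or replace",
    "POST": "submit data to",
    "DELETE": "delete",
}


def describe(request_line: str) -> str:
    """Explain a request using nothing but its first line."""
    method, uri, _version = request_line.split(" ", 2)
    intent = INTENT.get(method, "perform an unrecognized operation on")
    return f"{intent} {uri}"


print(describe("GET /photos/42 HTTP/1.1"))
print(describe("DELETE /photos/42 HTTP/1.1"))
```

The same reader applied to an RPC-style request would only ever report a GET or POST to one fixed URI, because the real intent is hidden elsewhere.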
A few well-known examples of RESTful, resource-oriented web services include:
Services that expose the Atom Publishing Protocol
Amazon's Simple Storage Service (S3)
Most of Yahoo's web services
Static web sites
Many web applications, especially read-only ones like search engines
An RPC-style web service accepts an envelope full of data from its client, and sends a similar envelope back. The method and the scoping information are kept inside the envelope, or on stickers applied to the envelope.
HTTP is a popular envelope format, and so is SOAP. Transmitting a SOAP document over HTTP puts the SOAP envelope inside an HTTP envelope.
In this architecture, not every object necessarily responds to the same basic interface.
The XML-RPC protocol is the most obvious example of the RPC architecture. In this protocol, the method data and the scoping data are put inside an XML document. This XML document becomes the entity-body inside the HTTP envelope.
In XML-RPC, the XML document containing method and scoping information is put into an envelope for transfer to the server. The envelope is an HTTP request with a method, URI and headers.
The XML document changes depending on which method someone is calling, but the HTTP envelope is always the same.
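This can be sketched with Python's standard `xmlrpc.client` module, whose `dumps` and `loads` functions encode and decode XML-RPC calls. The procedure names `photos.search` and `photos.delete` and the endpoint path `/RPC2` are hypothetical; the point is only that the HTTP envelope stays fixed while the XML document changes.

```python
import xmlrpc.client


def build_request(procedure: str, *args):
    """Build the HTTP envelope and the XML-RPC document for one call."""
    # Method information (the procedure name) and scoping information
    # (the arguments) both go inside the XML document.
    document = xmlrpc.client.dumps(args, methodname=procedure)
    # The envelope is identical for every call: same HTTP method, same URI.
    envelope = {
        "method": "POST",
        "uri": "/RPC2",  # hypothetical endpoint
        "headers": {"Content-Type": "text/xml"},
    }
    return envelope, document


envelope_a, document_a = build_request("photos.search", "penguin")
envelope_b, document_b = build_request("photos.delete", 42)

print(envelope_a == envelope_b)          # the envelopes do not differ
print(xmlrpc.client.loads(document_a))   # the documents do
print(xmlrpc.client.loads(document_b))
```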
Where a RESTful service would expose different URIs for different values of the scoping information, an RPC-style service typically exposes a URI for each Document Processor: something that can open the envelopes and transform them into software commands.
Despite the "rest" in its URI, Flickr's web API was clearly designed as an RPC-style service, one that uses HTTP as its envelope format. It has the scoping information in the URI, just like RESTful resource-oriented services, but the method information also goes in the URI. It gives the illusion of behaving like a RESTful web service, but it isn't one.
The uniform interface between clients and servers simplifies and decouples the architecture, which enables each part to evolve independently. The four guiding principles of this interface are:
Identification of resources
Manipulation of resources through representations
Self-descriptive messages
Hypermedia as the engine of application state
Uniform interface Guideline 1: Identification of resources
Individual resources are identified in requests, for example using URIs in web-based REST systems.
The resources themselves are conceptually separate from the representations that are returned to the client.
For example, the server does not send its database; rather, it might send some HTML, XML or JSON that represents some database records, expressed, for instance, in Bengali and encoded in UTF-8, depending on the details of the request and the server implementation.
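The separation between a resource and its representations can be illustrated as follows. This is a sketch under stated assumptions: the record, its field names and the `represent` helper are invented here, and a real server would negotiate the format from the request's Accept header in a more careful way.

```python
import json
import xml.etree.ElementTree as ET

# The resource as the server holds it (a hypothetical database record with Bengali text).
RECORD = {"id": 7, "title": "নমস্কার"}


def represent(record: dict, accept: str):
    """Return (Content-Type, body bytes) for the requested media type."""
    if accept == "application/json":
        text = json.dumps(record, ensure_ascii=False)
    elif accept == "application/xml":
        root = ET.Element("record")
        for key, value in record.items():
            ET.SubElement(root, key).text = str(value)
        text = ET.tostring(root, encoding="unicode")
    else:
        raise ValueError(f"no representation available for {accept}")
    # The same resource, serialized for this client and encoded as UTF-8.
    return f"{accept}; charset=utf-8", text.encode("utf-8")


json_type, json_body = represent(RECORD, "application/json")
xml_type, xml_body = represent(RECORD, "application/xml")
print(json_type, json_body.decode("utf-8"))
print(xml_type, xml_body.decode("utf-8"))
```

Both responses describe the same record; neither one is the record itself.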
Uniform interface Guideline 2: Manipulation of resources through these representations
When a client holds a representation of a resource, including any metadata attached, it has enough information to modify or delete the resource on the server, provided it has permission to do so.
A client can typically modify the resource by sending a PUT or POST request, and delete it by sending a DELETE request.
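A minimal in-memory sketch of this cycle follows. It assumes PUT is used to replace a resource, which is one common convention; the store, the URI `/notes/1` and the status codes returned are choices made for the illustration, not a description of any particular service.

```python
import json

# Hypothetical server-side state: one resource, addressed by its URI.
store = {"/notes/1": {"text": "draft"}}


def handle(method: str, uri: str, body: str = None):
    """Apply a request to the store and return (status code, response body)."""
    if method == "GET":
        if uri not in store:
            return 404, None
        return 200, json.dumps(store[uri])
    if method == "PUT":
        # The client sends back a full representation; the server stores it.
        store[uri] = json.loads(body)
        return 200, body
    if method == "DELETE":
        if store.pop(uri, None) is None:
            return 404, None
        return 204, None
    return 405, None


# The client fetches a representation, edits it locally, and sends it back.
status, representation = handle("GET", "/notes/1")
note = json.loads(representation)
note["text"] = "final"
put_status, _ = handle("PUT", "/notes/1", json.dumps(note))
updated = json.loads(handle("GET", "/notes/1")[1])

# Holding the URI is also enough to delete the resource.
delete_status, _ = handle("DELETE", "/notes/1")
print(put_status, updated, delete_status)
```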
Uniform interface Guideline 3: Self-descriptive messages
Each message from the server includes enough information to describe how the message can be processed by the client.
For example, which parser to invoke may be specified by an Internet media type (previously known as a MIME type).
Responses also explicitly indicate their cacheability.
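A short sketch of a client relying on these two pieces of metadata: it picks a parser from the Content-Type header and reads the cache lifetime from the Cache-Control header. The parser table and helper names are assumptions of this example, and the cache check is deliberately simplified to the `no-store` and `max-age` directives.

```python
import json
import xml.etree.ElementTree as ET

# Which parser to invoke for which Internet media type.
PARSERS = {
    "application/json": json.loads,
    "application/xml": ET.fromstring,
    "text/plain": str,
}


def parse_message(headers: dict, body: str):
    """Parse the body using only what the message says about itself."""
    # Drop parameters such as "; charset=utf-8" to get the bare media type.
    media_type = headers["Content-Type"].split(";")[0].strip().lower()
    parser = PARSERS.get(media_type)
    if parser is None:
        raise ValueError(f"no parser for media type {media_type}")
    return parser(body)


def cache_lifetime(headers: dict):
    """Seconds the response may be cached, 0 if it must not be stored, None if unstated."""
    directives = [d.strip().lower() for d in headers.get("Cache-Control", "").split(",")]
    if "no-store" in directives:
        return 0
    for directive in directives:
        if directive.startswith("max-age="):
            return int(directive.split("=", 1)[1])
    return None


headers = {"Content-Type": "application/json; charset=utf-8", "Cache-Control": "max-age=60"}
message = parse_message(headers, '{"id": 7}')
print(message, cache_lifetime(headers))
```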
Uniform interface Guideline 4: Hypermedia as the engine of application state
Clients make state transitions only through actions that are dynamically identified within hypermedia by the server (e.g. by hyperlinks within hypertext).
Except for simple fixed entry points to the application, a client does not assume that any particular action will be available for any particular resource beyond those described in representations previously received from the server.
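The following sketch shows a client that knows only the entry point and discovers everything else from links in the representations it receives. The toy server, its URIs, the link relation names and the order example are all hypothetical.

```python
# A toy server: each URI maps to a representation that carries its own links.
SERVER = {
    "/": {"links": {"orders": "/orders"}},
    "/orders": {"links": {"first": "/orders/1"}},
    "/orders/1": {
        "status": "unpaid",
        "links": {"payment": "/orders/1/payment", "cancel": "/orders/1/cancel"},
    },
    "/orders/1/payment": {"status": "awaiting payment", "links": {}},
    "/orders/1/cancel": {"status": "cancelled", "links": {}},
}


def get(uri: str) -> dict:
    """Stand-in for an HTTP GET against the toy server."""
    return SERVER[uri]


def follow(entry_point: str, relations: list) -> dict:
    """Move through application state by following named links only."""
    document = get(entry_point)
    for relation in relations:
        links = document["links"]
        # The client never builds a URI itself; it may only use what the server offered.
        if relation not in links:
            raise LookupError(f"the server did not offer a '{relation}' link here")
        document = get(links[relation])
    return document


result = follow("/", ["orders", "first", "payment"])
print(result["status"])
```

If the server later moves the payment resource to a different URI, this client keeps working as long as the link relation name stays the same.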