
  1. 1. UNIT I INTRODUCTION Role Of XML XML and The Web XML Language Basics SOAP Web Services Revolutions Of XML Service Oriented Architecture (SOA) THE ROLE OF XML XML is a metalanguage (literally, a language about languages) defined by the World Wide Web Consortium (W3C). It is a set of rules and guidelines for describing structured data in plain text rather than in proprietary binary representations, and it was standardized by the W3C in 1998. In its short history, XML has given rise to numerous vertical industry vocabularies in support of B2B e-commerce and to horizontal vocabularies that provide services to a wide range of industries. XML’s influence has been felt in three waves: from industry-specific vocabularies, to horizontal industry applications, to protocols that describe how businesses can exchange data across the Web. XML is a language for creating other languages based on the insertion of tags to help describe data. However, XML is more than just tags: it is a combination of tags and content in which the tags add meaning to the content. The following is a simple XML markup of customer information. <Customer> <Name>John von Neumann</Name> <PhoneNum>914.631.7722</PhoneNum> <FaxNum>914.631.7723</FaxNum> <E-Mail></E-Mail> </Customer> However, elements are only one way to describe data. It’s also possible to represent the data using attributes within a single element: <Customer name="John von Neumann" phone="914.631.7722" fax="914.631.7723" email=""/> • XML allows data to be stored in either elements or attributes. • Elements and attributes can be named to give the data meaning. • Start tags and end tags define elements that are the basis for XML tree-structured representations of documents. • Elements can contain text data and/or other elements.
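The element and attribute forms of the customer record above carry the same data, which can be checked with a small sketch. It uses Python's standard xml.etree.ElementTree module; the function names and dictionary keys are illustrative choices, not part of the slides, and the empty e-mail field is left out.

```python
import xml.etree.ElementTree as ET

# Customer record written with child elements (values taken from the slide).
ELEMENT_FORM = """
<Customer>
  <Name>John von Neumann</Name>
  <PhoneNum>914.631.7722</PhoneNum>
  <FaxNum>914.631.7723</FaxNum>
</Customer>
""".strip()

# The same record written with attributes on a single empty element.
ATTRIBUTE_FORM = (
    '<Customer name="John von Neumann" '
    'phone="914.631.7722" fax="914.631.7723"/>'
)


def customer_from_elements(xml_text):
    """Read the record from child elements (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("Name"),
        "phone": root.findtext("PhoneNum"),
        "fax": root.findtext("FaxNum"),
    }


def customer_from_attributes(xml_text):
    """Read the record from attributes (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {key: root.get(key) for key in ("name", "phone", "fax")}


from_elements = customer_from_elements(ELEMENT_FORM)
from_attributes = customer_from_attributes(ATTRIBUTE_FORM)
print(from_elements == from_attributes)
```

Both readers produce the same dictionary, so the choice between elements and attributes is one of document design rather than of information content.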
  2. 2. The XML Advantage XML has had an impact across a broad range of areas. The following is a list of some of the factors that have influenced XML’s adoption by a variety of organizations and individuals. • XML files are human-readable. Such is not the case with binary data formats. • Widespread industry support exists for XML. Numerous tools and utilities are being provided with Web browsers, databases, and operating systems, making it easier and less expensive for small and medium-sized organizations to import and export data in XML format. • Major relational databases now have the native capability to read and generate XML data. • A large family of XML support technologies is available for the interpretation and transformation of XML data for Web page display and report generation. XML: Design by Omission There are three key design elements that by omission contribute to XML’s success: • No display is assumed. Unlike HTML, XML makes no assumptions about how tags will be rendered in a browser or other display device. Auxiliary technologies such as style sheets add this capability. • There is no built-in data typing. DTDs and XML Schema provide support for defining the structure and data types associated with an XML document. • No transport is assumed. The XML specification makes no assumption about how XML is to be transported across the Internet. XML AND THE WEB XML may be used to communicate directly with partners and suppliers. Instead of exchanging data about purchases and orders either manually or over proprietary networks, data vocabularies can be defined using XML and delivered from server to server using standard protocols such as HTTP or FTP. Associated with this ability to move data freely across the Web is the rise in the use of messaging servers and software. These servers, supporting what is known as Message Oriented Middleware, provide guarantees of delivery and the ability to broadcast communications to multiple recipients. 
Web services is an ambitious initiative that is moving the Web to new levels of B2B (that is, software-to-software) interaction while trying to fulfill object technology's promise of reusable components from a service interface perspective.
  3. 3. SOAP SOAP is the XML glue that lets clients and providers talk to each other and exchange XML data. SOAP builds on XML and common Web protocols (HTTP, FTP, and SMTP) to enable communication across the Web. SOAP brings to the table a set of rules for moving data, either directly in a point-to-point fashion or by sending the data through a message queue intermediary. Prior to SOAP, there were three basic options for doing distributed computing: Microsoft’s Distributed Component Object Model (DCOM), Java’s Remote Method Invocation (RMI), or the Object Management Group’s Common Object Request Broker Architecture (CORBA). Their drawback is that they limit the potential reach of the enterprise to servers that share the same object infrastructure. With SOAP, however, the potential space of interconnection is the entire Web itself. WEB SERVICES Web services is both a process and set of protocols for finding and connecting to software exposed as services over the Web. By assuming a SOAP foundation, Web services can concentrate on what data to exchange instead of worrying about how to get it from point A to point B, which is the job of SOAP. To make things even easier SOAP also defines an XML envelope to carry XML and a convention for doing remote procedure calls so that a service can advertise “call me here” and a program will be able to do so without concern for language or platform. Although SOAP may be used with a variety of protocols, the only bindings specified in
  4. 4. the proposed SOAP specification are for HTTP. The Web services technical infrastructure ensures that services, even from different vendors, will interoperate to create a complete business process. Web services takes the object-oriented vision of assembling software from component building blocks to the next level. With Web services, however, the emphasis is on the assembly of services that may or may not be built on object technology. The interconnections opened up by the Web make possible a new way of interacting through the registration, discovery, and connection of software packaged as Web services. There are three major aspects to Web services: • The service provider provides an interface for software that can carry out a specified set of tasks. • The service requester discovers and invokes a software service to provide a business solution. The requester will commonly invoke a remote procedure call on the service provider, passing parameter data to the provider and receiving a result in reply. • The broker manages and publishes the service. Service providers publish their services with the broker, and requesters access those services by creating bindings to the service provider. XML: THE THREE REVOLUTIONS Three revolutions are centered on XML and the Web: data, architecture, and software. The Data Revolution XML-based industry-specific data vocabularies provide alternatives to specialized Electronic Data Interchange (EDI) solutions by facilitating B2B data exchange and playing a key role as a messaging infrastructure for distributed computing. XML's strength is its data independence. XML is pure data description, not tied to any programming language, operating system or transport protocol.
  5. 5. Data is free to move about globally without the constraints imposed by tightly coupled, transport-dependent architectures. Protocols such as HTTP have had a tremendous impact on XML's viability and have opened the door to alternatives to CORBA, RMI and DCOM, each of which depends on its own wire protocol rather than on standard Web protocols. XML does this by focusing on data and leaving other issues to supporting technologies. The Architectural Revolution The architectural revolution surrounding XML is reflected in a move from tightly coupled systems based on established infrastructures such as CORBA, RMI and DCOM, each with their own transport protocol, to loosely coupled systems riding atop standard Web protocols such as HTTP over TCP/IP. The loose coupling of the Web makes possible new system architectures built around message-based middleware or less structured peer-to-peer interaction.
  6. 6. XML plays a key role in this new architecture for distributed computing through a new XML protocol language called SOAP, the Simple Object Access Protocol. SOAP simply defines a set of XML tags for moving XML data around the Web using standard Web protocols, accomplishing in one simple initiative what client-server computing had been trying to do for over a decade. Associated with this ability to move data freely across the Web is the rise in the use of messaging servers and software that sit between conversational participants. These servers, supporting what is known as Message Oriented Middleware, are playing an increasingly important role in the new extended enterprise by providing guarantees of delivery and the ability to broadcast communications to multiple recipients. The Software Revolution During the 1970s and 1980s, software was constructed as monolithic applications built to solve specific problems. The problem with large software projects is that, by trying to tackle multiple problems at once, the software is often ill suited to adding new functionality and adapting to technological change. In the 1990s a different model for software emerged based on the concept of simplicity. Instead of trying to define all requirements up front, this new philosophy was built around the concept of creating building blocks capable of combination with other building blocks that either already existed or were yet to be created. Figure 4 illustrates the software revolution. A case in point is the Web. After decades of attempts to build complex infrastructures for exchanging information across distributed networks, the Web emerged from an assemblage of foundational technologies such as HTTP, HTML, browsers and a longstanding networking technology known as TCP/IP that had been put in place in the 1970s.
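As a rough illustration of the idea that SOAP "defines a set of XML tags for moving XML data", the sketch below wraps a small payload in a SOAP 1.1 envelope using Python's standard library. The envelope namespace is the SOAP 1.1 one; the application namespace, the GetQuote operation and the Symbol element are hypothetical names invented for this example, and no message is actually sent over HTTP.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, used only for this sketch.
APP_NS = "http://example.com/quotes"


def build_envelope(symbol):
    """Wrap a hypothetical GetQuote request in a SOAP Envelope/Body."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(request, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


message = build_envelope("XYZ")
print(message)

# The receiver only needs an XML parser to unpack the payload again.
parsed = ET.fromstring(message)
prefixes = {"soap": SOAP_ENV, "app": APP_NS}
symbol = parsed.find("soap:Body/app:GetQuote/app:Symbol", prefixes).text
print(symbol)
```

The envelope and body belong to SOAP, while everything inside the body belongs to the application, which is what lets the same envelope carry arbitrary XML between unrelated platforms.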
  7. 7. SERVICE ORIENTED ARCHITECTURE SOA is an architectural style whose goal is to achieve loose coupling among interacting software agents. A service is a unit of work done by a service provider to achieve desired end results for a service consumer. Both provider and consumer are roles played by software agents on behalf of their owners. Loose coupling places four requirements on the messages exchanged. First, the messages must be descriptive, rather than instructive, because the service provider is responsible for solving the problem. This is like going to a restaurant: you tell your waiter what you would like to order and your preferences, but you don't tell the cook how to cook your dish step by step. Second, service providers will be unable to understand your request if your messages are not written in a format, structure, and vocabulary that is understood by all parties. Limiting the vocabulary and structure of messages is a necessity for any efficient communication. The more restricted a message is, the easier it is to understand the message, although it comes at the expense of reduced extensibility. Third, extensibility is vitally important. If messages are not extensible, consumers and providers will be locked into one particular version of a service. Fourth, an SOA must have a mechanism that enables a consumer to discover a service provider under the context of a service sought by the consumer.
  8. 8. UNIT II XML TECHNOLOGY XML Name Spaces Structuring With Schemas and DTD Presentation Techniques Transformation XML Infrastructure. XML XML is a markup language for documents containing structured information. Structured information contains both content (words, pictures, etc.) and some indication of what role that content plays. A markup language is a mechanism to identify structures in a document. The XML specification defines a standard way to add markup to documents. XML specifies neither semantics nor a tag set. In fact, XML is really a meta-language for describing markup languages. In other words, XML provides a facility to define tags and the structural relationships between them. Since there's no predefined tag set, there can't be any preconceived semantics. All of the semantics of an XML document will either be defined by the applications that process them or by style sheets. Is XML the same as SGML? No. Well, yes, sort of. XML is defined as an application profile of SGML. SGML is the Standard Generalized Markup Language defined by ISO 8879. SGML has been the standard, vendor-independent way to maintain repositories of structured documentation for more than a decade, but it is not well suited to serving documents over the web (for a number of technical reasons beyond the scope of these notes). Defining XML as an application profile of SGML means that any fully conformant SGML system will be able to read XML documents. However, using and understanding XML documents does not require a system that is capable of understanding the full generality of SGML. XML is, roughly speaking, a restricted form of SGML. XML was created so that richly structured documents could be used over the web. The only viable alternatives, HTML and SGML, are not practical for this purpose. HTML comes bound with a set of semantics and does not provide arbitrary structure. SGML provides arbitrary structure, but is too difficult to implement just for a web browser. 
Full SGML systems solve large, complex problems that justify their expense. Viewing structured documents sent over the web rarely carries such justification.
  9. 9. This is not to say that XML can be expected to completely replace SGML. While XML is being designed to deliver structured content over the web, some of the very features it lacks to make this practical make SGML a more satisfactory solution for the creation and long-term storage of complex documents. In many organizations, filtering SGML to XML will be the standard procedure for web delivery. • XML stands for EXtensible Markup Language • XML is a markup language much like HTML • XML was designed to describe data • XML tags are not predefined. You must define your own tags • XML uses a Document Type Definition (DTD) or an XML Schema to describe the data • XML with a DTD or XML Schema is designed to be self-descriptive • XML is a W3C Recommendation The Extensible Markup Language (XML) became a W3C Recommendation on 10 February 1998. XML was designed to carry data. XML is not a replacement for HTML. XML was designed to describe data and to focus on what data is. HTML was designed to display data and to focus on how data looks. HTML is about displaying information, while XML is about describing information. XML was created to structure, store and send information. The following example is a note to Tove from Jani, stored as XML: <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.). XML allows the author to define his own tags and his own document structure. It is important to understand that XML is not a replacement for HTML. In future Web development it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data. My best description of XML is this: XML is a cross-platform, software- and hardware-independent tool for transmitting information. 
XML TECHNOLOGY FAMILY
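To make "structure, store and send" concrete, here is a minimal sketch that builds the note above in memory, serializes it to a string (the form that would be stored or sent), and parses it back on the receiving side. It uses Python's standard xml.etree.ElementTree module; the build_note helper is a hypothetical name introduced for the example.

```python
import xml.etree.ElementTree as ET


def build_note(to, sender, heading, body):
    """Structure the four fields of the note as child elements."""
    note = ET.Element("note")
    fields = (("to", to), ("from", sender), ("heading", heading), ("body", body))
    for tag, text in fields:
        ET.SubElement(note, tag).text = text
    return note


note = build_note("Tove", "Jani", "Reminder", "Don't forget me this weekend!")

# Serialize: this string is what would be written to a file or sent on the wire.
serialized = ET.tostring(note, encoding="unicode")
print(serialized)

# Any XML parser on any platform can recover the same fields.
received = ET.fromstring(serialized)
print(received.findtext("from"), "->", received.findtext("to"))
```

Nothing in the serialized text says how the note should look on screen; it only says what each piece of data is.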
  10. 10. Use of Elements vs. Attributes Data can be stored in child elements or in attributes. <person sex="female"> <firstname>Anna</firstname> <lastname>Smith</lastname> </person> <person> <sex>female</sex> <firstname>Anna</firstname> <lastname>Smith</lastname> </person> In the first example sex is an attribute. In the last, sex is a child element. Both examples provide the same information. There are no rules about when to use attributes, and when to use child elements. My experience is that attributes are handy in HTML, but in XML you should try to avoid them. Use child elements if the information feels like data. Some of the problems with using attributes are: • attributes cannot contain multiple values (child elements can) • attributes are not easily expandable (for future changes) • attributes cannot describe structures (child elements can) • attributes are more difficult to manipulate by program code
  11. 11. • attribute values are not easy to test against a Document Type Definition (DTD) - which is used to define the legal elements of an XML document If you use attributes as containers for data, you end up with documents that are difficult to read and maintain. Try to use elements to describe data. Use attributes only to provide information that is not relevant to the data. XML NAMESPACES XML Namespaces provide a method to avoid element name conflicts. Since element names in XML are not predefined, a name conflict can occur when two different documents use the same element names. This XML document carries information in a table: <table> <tr> <td>Apples</td> <td>Bananas</td> </tr> </table> This XML document carries information about a table (a piece of furniture): <table> <name>African Coffee Table</name> <width>80</width> <length>120</length> </table> If these two XML documents were added together, there would be an element name conflict because both documents contain a <table> element with different content and definition. Solving Name Conflicts Using a Prefix This XML document carries information in a table: <h:table> <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr> </h:table> This XML document carries information about a piece of furniture: <f:table> <f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length> </f:table> Now there will be no name conflict because the two documents use a different name for their <table> element (<h:table> and <f:table>).
  12. 12. By using a prefix, we have created two different types of <table> elements. Using Namespaces This XML document carries information in a table: <h:table xmlns:h=""> <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr> </h:table> This XML document carries information about a piece of furniture: <f:table xmlns:f=""> <f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length> </f:table> Instead of using only prefixes, we have added an xmlns attribute to the <table> tag to give the prefix a qualified name associated with a namespace. The XML Namespace (xmlns) Attribute The XML namespace attribute is placed in the start tag of an element and has the following syntax: xmlns:namespace-prefix="namespaceURI" When a namespace is defined in the start tag of an element, all child elements with the same prefix are associated with the same namespace. Note that the address used to identify the namespace is not used by the parser to look up information. The only purpose is to give the namespace a unique name. However, very often companies use the namespace as a pointer to a real Web page containing information about the namespace. Try to go to Uniform Resource Identifier (URI) A Uniform Resource Identifier (URI) is a string of characters which identifies an Internet Resource. The most common URI is the Uniform Resource Locator (URL), which identifies an Internet domain address. Another, not so common type of URI is the Uniform Resource Name (URN). In our examples we will only use URLs.
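The effect of binding a prefix to a namespace URI can be seen by parsing both kinds of table in one document. This is a minimal sketch using Python's standard xml.etree.ElementTree module. The namespace URIs in the slides above are blank, so the two example.com URIs below are hypothetical stand-ins, and the wrapping root element is added only so that both tables fit in a single document.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; they only need to be unique names.
HTML_NS = "http://example.com/html-tables"
FURNITURE_NS = "http://example.com/furniture"

DOC = f"""
<root xmlns:h="{HTML_NS}" xmlns:f="{FURNITURE_NS}">
  <h:table>
    <h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>
  </h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
""".strip()

root = ET.fromstring(DOC)

# The parser replaces each prefix with its URI, so the two <table>
# elements end up with different fully qualified names.
tags = [child.tag for child in root]
print(tags)

# Lookups use a prefix-to-URI mapping chosen by the caller.
prefixes = {"h": HTML_NS, "f": FURNITURE_NS}
cells = [td.text for td in root.findall("h:table/h:tr/h:td", prefixes)]
furniture_name = root.findtext("f:table/f:name", namespaces=prefixes)
print(cells, furniture_name)
```

The parser never fetches either URI; it only compares them as strings, which matches the note above that the address is not used to look up information.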
  13. 13. Default Namespaces Defining a default namespace for an element saves us from using prefixes in all the child elements. It has the following syntax: xmlns="namespaceURI" This XML document carries information in a table: <table xmlns=""> <tr> <td>Apples</td> <td>Bananas</td> </tr> </table> This XML document carries information about a piece of furniture: <table xmlns=""> <name>African Coffee Table</name> <width>80</width> <length>120</length> </table> Namespaces in Real Use When you start using XSL, you will soon see namespaces in real use. XSL style sheets are used to transform XML documents into other formats, like HTML. If you take a close look at the XSL document below, you will see that most of the tags are HTML tags. The tags that are not HTML tags have the prefix xsl, identified by the namespace "": <?xml version="1.0" encoding="ISO-8859-1"?> <xsl:stylesheet version="1.0" xmlns:xsl=""> <xsl:template match="/"> <html> <body> <h2>My CD Collection</h2> <table border="1"> <tr> <th align="left">Title</th> <th align="left">Artist</th> </tr> <xsl:for-each select="catalog/cd"> <tr> <td><xsl:value-of select="title"/></td> <td><xsl:value-of select="artist"/></td>
  14. 14. </tr> </xsl:for-each> </table> </body> </html> </xsl:template> </xsl:stylesheet> STRUCTURING WITH SCHEMAS The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. An XML Schema: • defines elements that can appear in a document • defines attributes that can appear in a document • defines which elements are child elements • defines the order of child elements • defines the number of child elements • defines whether an element is empty or can include text • defines data types for elements and attributes • defines default and fixed values for elements and attributes One of the greatest strengths of XML Schemas is the support for data types. With support for data types: • It is easier to describe allowable document content • It is easier to validate the correctness of data • It is easier to work with data from a database • It is easier to define data facets (restrictions on data) • It is easier to define data patterns (data formats) • It is easier to convert data between different data types XML Schemas use XML Syntax Another great strength of XML Schemas is that they are written in XML. Some benefits of XML Schemas being written in XML: • You don't have to learn a new language • You can use your XML editor to edit your Schema files • You can use your XML parser to parse your Schema files • You can manipulate your Schema with the XML DOM • You can transform your Schema with XSLT
  15. 15. XML Schemas Secure Data Communication When sending data from a sender to a receiver, it is essential that both parties have the same "expectations" about the content. With XML Schemas, the sender can describe the data in a way that the receiver will understand. A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March. However, an XML element with a data type like this: <date type="date">2004-03-11</date> ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD". XML Schemas are Extensible XML Schemas are extensible, because they are written in XML. With an extensible Schema definition you can: • Reuse your Schema in other Schemas • Create your own data types derived from the standard types • Reference multiple schemas in the same document Well-Formed is not Enough A well-formed XML document is a document that conforms to the XML syntax rules, like: • it should begin with the XML declaration (strictly optional in XML 1.0, but recommended) • it must have one unique root element • start-tags must have matching end-tags • element names are case sensitive • all elements must be closed • all elements must be properly nested • all attribute values must be quoted • entities must be used for special characters Even if documents are well-formed they can still contain errors, and those errors can have serious consequences. Think of the following situation: you order 5 gross of laser printers, instead of 5 laser printers. With XML Schemas, most of these errors can be caught by your validating software. DOCUMENT TYPE DEFINITION (DTD) A Document Type Definition (DTD) defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes. A DTD can be declared inline inside an XML document, or as an external reference.
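The gap between "well-formed" and "correct" described under Well-Formed is not Enough can be shown with a short sketch. Python's standard library parses XML but does not validate against XML Schema, so the check_order function below is a hand-written stand-in for what a validating parser would do with a schema; the order documents, element names and function names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date


def is_well_formed(xml_text):
    """True if the text obeys XML syntax rules (nothing more)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical order documents for illustration.
GOOD = "<order><quantity>5</quantity><date>2004-03-11</date></order>"
BROKEN = "<order><quantity>5</date></order>"  # mismatched end tag
SUSPECT = "<order><quantity>5 gross</quantity><date>03-11-2004</date></order>"


def check_order(xml_text):
    """Stand-in for schema validation: return a list of type problems."""
    order = ET.fromstring(xml_text)
    problems = []
    if not order.findtext("quantity", default="").isdigit():
        problems.append("quantity is not an integer")
    try:
        # ISO 8601 calendar date, the YYYY-MM-DD form used by the date type.
        date.fromisoformat(order.findtext("date", default=""))
    except ValueError:
        problems.append("date is not YYYY-MM-DD")
    return problems


print(is_well_formed(GOOD), is_well_formed(BROKEN), is_well_formed(SUSPECT))
print(check_order(GOOD))
print(check_order(SUSPECT))
```

The SUSPECT document passes the syntax check yet fails both type checks, which is the class of error that a schema with data types is meant to catch before the order is processed.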
  16. 16. Internal DTD Declaration If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax: <!DOCTYPE root-element [element-declarations]> Example XML document with an internal DTD: <?xml version="1.0"?> <!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> ]> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend</body> </note> The DTD above is interpreted like this: !DOCTYPE note defines that the root element of this document is note. !ELEMENT note defines that the note element contains four elements: "to,from,heading,body". !ELEMENT to defines the to element to be of the type "#PCDATA". !ELEMENT from defines the from element to be of the type "#PCDATA". !ELEMENT heading defines the heading element to be of the type "#PCDATA". !ELEMENT body defines the body element to be of the type "#PCDATA". External DTD Declaration If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax: <!DOCTYPE root-element SYSTEM "filename"> This is the same XML document as above, but with an external DTD (Open it, and select view source): <?xml version="1.0"?> <!DOCTYPE note SYSTEM "note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body>
  17. 17. </note> And this is the file "note.dtd" which contains the DTD: <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> Why Use a DTD? With a DTD, each of your XML files can carry a description of its own format. With a DTD, independent groups of people can agree to use a standard DTD for interchanging data. Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data. XML PARSING DOM • The XML DOM is the Document Object Model for XML • The XML DOM is platform- and language-independent • The XML DOM defines a standard set of objects for XML • The XML DOM defines a standard way to access XML documents • The XML DOM defines a standard way to manipulate XML documents • The XML DOM is a W3C standard The DOM views XML documents as a tree-structure. All elements; their containing text and their attributes, can be accessed through the DOM tree. Their contents can be modified or deleted, and new elements can be created. The elements, their text, and their attributes are all known as nodes. According to the DOM, everything in an XML document is a node.The DOM says that: • The entire document is a document node • Every XML tag is an element node • The texts contained in the XML elements are text nodes • Every XML attribute is an attribute node • Comments are comment nodes Node Hierarchy Nodes have a hierarchical relationship to each other.All nodes in an XML document form a document tree (or node tree). Each element, attribute, text, etc. in the XML document represents a node in the tree. The tree starts at the document node and continues to branch out until it has reached all text nodes at the lowest level of the tree.The terms "parent" and "child" are used to describe the relationships between nodes. 
Some nodes may have child nodes, while other nodes do not have children (leaf nodes).Because the XML data is structured in a tree form, it can be
  18. 18. traversed without knowing the exact structure of the tree and without knowing the type of data contained within. DOM Node Hierarchy Example Look at the following XML file: books.xml <?xml version="1.0" encoding="ISO-8859-1"?> <bookstore> <book category="COOKING"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="CHILDREN"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="WEB"> <title lang="en">XQuery Kick Start</title> <author>James McGovern</author> <author>Per Bothner</author> <author>Kurt Cagle</author> <author>James Linn</author> <author>Vaidyanathan Nagarajan</author> <year>2003</year> <price>49.99</price> </book> <book category="WEB"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore> Notice that the root element in the XML document above is named <bookstore>. All other elements in the document are contained within <bookstore>.The <bookstore> element represents the root node of the DOM tree. The root node <bookstore> holds four <book> child nodes.The first <book> child node also holds four children: <title>, <author>, <year>, and <price>, which contains one text node each, "Everyday Italian", "Giada De Laurentiis", "2005", and "30.00". Text is always stored in text nodes. A common error in DOM processing is to navigate to an element node and expect it to contain the text. However, even the simplest element node has a text node under it. For example, in <year>2005</year>, there is an element node (year), and a text node under it, which contains the text (2005).The following image illustrates a fragment of the DOM node tree from the XML document above:
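The point that text lives in a text node beneath the element node, not in the element itself, can be checked with Python's standard xml.dom.minidom module, which implements the W3C DOM interfaces. This is a minimal sketch using a shortened version of the first book from books.xml; the variable names are illustrative.

```python
from xml.dom import minidom

# Shortened version of the first <book> entry from books.xml.
BOOK = (
    '<book category="COOKING">'
    '<title lang="en">Everyday Italian</title>'
    "<year>2005</year>"
    "</book>"
)

doc = minidom.parseString(BOOK)
year = doc.getElementsByTagName("year")[0]

# The element node itself carries no text value.
print(year.nodeType == year.ELEMENT_NODE, year.nodeValue)

# The text is held by a child text node.
text_node = year.firstChild
print(text_node.nodeType == text_node.TEXT_NODE, text_node.nodeValue)

# Attributes are reached through the element that owns them.
category = doc.documentElement.getAttribute("category")
print(category)
```

Asking the year element for its nodeValue yields None, while its first child yields "2005", which is the common error described above in concrete form.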
  19. 19. SAX SAX is a common interface implemented for many different XML parsers (and things that pose as XML parsers), just as JDBC is a common interface implemented for many different relational databases (and things that pose as relational databases). If you want to use SAX, you'll need all of the following: • Java 1.1 or higher. • A SAX2-compatible XML parser installed on your Java classpath. • The SAX2 distribution installed on your Java classpath. (This probably came with your parser.) Most Java/XML tools distributions include SAX2 and a parser using it. Most web application servers use it for their core XML support. In particular, environments with JAXP 1.1 support include SAX2. PRESENTATION TECHNOLOGIES CSS and XSL XSL stands for EXtensible Stylesheet Language. The World Wide Web Consortium (W3C) started to develop XSL because there was a need for an XML-based stylesheet language. CSS = HTML Style Sheets HTML uses predefined tags, and the meaning of the tags is well understood. The <table> element in HTML defines a table - and a browser knows how to display it. Adding styles to HTML elements is simple. Telling a browser to display an element in a special font or color is easy with CSS. XSL = XML Style Sheets XML does not use predefined tags (we can use any tag names we like), and the meaning of these tags is not well understood. A <table> element could mean an HTML table, a piece of furniture, or something else - and a browser does not know how to display it. XSL describes how the XML document should be displayed! XSL - More Than a Style Sheet Language
  20. 20. XSL consists of three parts: • XSLT - a language for transforming XML documents • XPath - a language for navigating in XML documents • XSL-FO - a language for formatting XML documents XHTML • XHTML stands for EXtensible HyperText Markup Language • XHTML is aimed to replace HTML • XHTML is almost identical to HTML 4.01 • XHTML is a stricter and cleaner version of HTML • XHTML is HTML defined as an XML application • XHTML is a W3C Recommendation XForms An XForms Processor built into the browser will be responsible for submitting the XForms data to a target. The data can be submitted as XML and could look something like this: <person> <fname>Hege</fname> <lname>Refsnes</lname> </person> Or it can be submitted as text, looking something like this: fname=Hege;lname=Refsnes VOICE XML VoiceXML (VXML) is the W3C's standard XML format for specifying interactive voice dialogues between a human and a computer. It allows voice applications to be developed and deployed in an analogous way to HTML for visual applications. Just as HTML documents are interpreted by a visual web browser, VoiceXML documents are interpreted by a voice browser. A common architecture is to deploy banks of voice browsers attached to the public switched telephone network (PSTN) so that users can use a telephone to interact with voice applications. The following is an example of a VoiceXML document: <?xml version="1.0"?> <vxml version="2.0" xmlns=""> <form> <block> <prompt> Hello world! </prompt>
  21. 21.     </block>
  </form>
</vxml>
When interpreted by a VoiceXML interpreter, this will output "Hello world" with synthesized speech. Typically, HTTP is used as the transport protocol for fetching VoiceXML pages. Some applications may use static VoiceXML pages, while others rely on dynamic VoiceXML page generation using an application server like Tomcat, Weblogic, IIS, or WebSphere. In a well-architected web application, the voice interface and the visual interface share the same back-end business logic.
Historically, VoiceXML platform vendors have implemented the standard in different ways and added proprietary features. But the VoiceXML 2.0 standard, adopted as a W3C Recommendation on 16 March 2004, clarified most areas of difference. The VoiceXML Forum, an industry group promoting the use of the standard, provides a conformance testing process that certifies vendors' implementations as conformant.
TRANSFORMATION TECHNOLOGIES
XLINK
XLink defines a standard way of creating hyperlinks in XML documents. XPointer allows the hyperlinks to point to more specific parts (fragments) in the XML document.
• XLink is short for the XML Linking Language
• XLink is a language for creating hyperlinks in XML documents
• XLink is similar to HTML links - but it is a lot more powerful
• ANY element in an XML document can behave as an XLink
• XLink supports simple links (like HTML) and extended links (for linking multiple resources together)
• With XLink, the links can be defined outside of the linked files
• XLink is a W3C Recommendation
• XPointer is short for the XML Pointer Language
• XPointer allows the hyperlinks to point to specific parts of the XML document
• XPointer uses XPath expressions to navigate in the XML document
• XPointer is a W3C Recommendation
XLink Syntax
In HTML, we know (and all the browsers know!) that the <a> element defines a hyperlink. However, this is not how it works with XML.
In XML documents, you can use whatever element names you want - therefore it is impossible for browsers to predict what hyperlink elements will be called in XML documents. The solution for creating links in XML documents was to put a
  22. 22. marker on elements that should act as hyperlinks. Below is a simple example of how to use XLink to create links in an XML document:
<?xml version="1.0"?>
<homepages xmlns:xlink="">
  <homepage xlink:type="simple" xlink:href="">Visit W3Schools</homepage>
  <homepage xlink:type="simple" xlink:href="">Visit W3C</homepage>
</homepages>
To get access to the XLink attributes and features we must declare the XLink namespace at the top of the document. The XLink namespace is: "". The xlink: prefix on the type and href attributes of the <homepage> elements indicates that these attributes come from the XLink namespace. Setting xlink:type="simple" creates a simple, two-ended link (meaning "click from here to go there"). We will look at multi-ended (multidirectional) links later.
XPointer Syntax
In HTML, we can create a hyperlink that points either to an HTML page or to a bookmark inside an HTML page (using #). Sometimes it is more useful to point to more specific content. For example, let's say that we want to link to the third item in a particular list, or to the second sentence of the fifth paragraph. This is easy with XPointer. If the hyperlink points to an XML document, we can add an XPointer part after the URL in the xlink:href attribute, to navigate (with an XPath expression) to a specific place in the document. For example, below we use XPointer to point to the fifth item in a list with a unique id of "rock":
href="'rock').child(5,item)"
XPATH
XPath is the result of an effort to provide a common syntax and semantics for functionality shared between XSL Transformations [XSLT] and XPointer [XPointer]. The primary purpose of XPath is to address parts of an XML [XML] document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and booleans. XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values.
XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.
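As a rough illustration of the path notation described above, the following Python sketch uses the standard library's xml.etree.ElementTree module, which supports only a limited subset of XPath. The customer document, its element names, and its id values are made up for this example and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only.
DOC = """
<customers>
  <customer id="c1"><name>Ada</name><phone>555-0100</phone></customer>
  <customer id="c2"><name>Alan</name><phone>555-0101</phone></customer>
</customers>
"""

root = ET.fromstring(DOC)

# Path steps walk the element tree, much like directories in a URL.
names = [n.text for n in root.findall("./customer/name")]

# A predicate in square brackets filters on an attribute value.
alan_phone = root.find("./customer[@id='c2']/phone").text

# Position predicates are 1-based in XPath.
first_id = root.find("./customer[1]").get("id")

print(names, alan_phone, first_id)
```

The expressions operate on the parsed tree rather than on the text of the file, which matches the point that XPath works on the logical structure of a document and not its surface syntax.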
  23. 23. In addition to its use for addressing, XPath is also designed so that it has a natural subset that can be used for matching (testing whether or not a node matches a pattern); this use of XPath is described in XSLT.
XPath models an XML document as a tree of nodes. There are different types of nodes, including element nodes, attribute nodes and text nodes. XPath defines a way to compute a string-value for each type of node. Some types of nodes also have names. XPath fully supports XML Namespaces [XML Names]. Thus, the name of a node is modeled as a pair consisting of a local part and a possibly null namespace URI; this is called an expanded-name. The data model is described in detail in the XPath specification.
The primary syntactic construct in XPath is the expression. An expression matches the production Expr. An expression is evaluated to yield an object, which has one of the following four basic types:
• node-set (an unordered collection of nodes without duplicates)
• boolean (true or false)
• number (a floating-point number)
• string (a sequence of UCS characters)
Expression evaluation occurs with respect to a context. XSLT and XPointer specify how the context is determined for XPath expressions used in XSLT and XPointer respectively. The context consists of:
• a node (the context node)
• a pair of non-zero positive integers (the context position and the context size)
• a set of variable bindings
• a function library
• the set of namespace declarations in scope for the expression
XQUERY
• XQuery is the language for querying XML data
• XQuery for XML is like SQL for databases
• XQuery is built on XPath expressions
• XQuery is supported by all the major database engines (IBM, Oracle, Microsoft, etc.)
• XQuery is a W3C Recommendation
XQuery is a language for finding and extracting elements and attributes from XML documents. Here is an example of a question that XQuery could solve: "Select all CD records with a price less than $10 from the CD collection stored in the XML document called cd_catalog.xml."
XQuery can be used to:
• Extract information to use in a Web Service
• Generate summary reports
• Transform XML data to XHTML
• Search Web documents for relevant information
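The CD question above can be approximated without an XQuery engine. The sketch below is not XQuery: it is a plain Python filter over a parsed tree, shown only to make the query's intent concrete. The catalog layout (catalog/cd/title/price) and the titles are assumptions, since the slides do not show the contents of cd_catalog.xml.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for cd_catalog.xml; the real file's layout is not given in the text.
CATALOG = """
<catalog>
  <cd><title>Album A</title><price>10.90</price></cd>
  <cd><title>Album B</title><price>9.90</price></cd>
  <cd><title>Album C</title><price>8.50</price></cd>
</catalog>
"""

def cheap_cds(xml_text, limit=10.0):
    """Return the titles of all cd records priced below `limit`."""
    root = ET.fromstring(xml_text)
    return [
        cd.findtext("title")
        for cd in root.findall("cd")
        if float(cd.findtext("price")) < limit
    ]

print(cheap_cds(CATALOG))
```

An XQuery processor would express the same selection as a single path expression with a price predicate; the loop here simply spells out the filtering step by hand.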
  24. 24. XQuery is compatible with several W3C standards, such as XML, Namespaces, XSLT, XPath, and XML Schema. XQuery 1.0 became a W3C Recommendation on January 23, 2007.
XML INFRASTRUCTURE TECHNOLOGIES
INFOSET
The XML Infoset is an abstract data model describing the information available from an XML document. For many applications, this way of looking at an XML document is more useful than having to analyze and interpret XML syntax. DOM describes an API through which the information in an XML Infoset (i.e., the information available from a specific XML document) can be accessed from different programming languages.
The XML Information Set (Infoset) defines a data model for XML. This data model is a set of abstractions that detail the properties of XML trees. These abstractions provide a common viewpoint from which to think about XML APIs and higher-level specifications such as XPath, XSLT and XML Schema, as shown in Figure 1.
RDF
  25. 25. • RDF stands for Resource Description Framework
• RDF is a framework for describing resources on the web
• RDF provides a model for data, and a syntax so that independent parties can exchange and use it
• RDF is designed to be read and understood by computers
• RDF is not designed for being displayed to people
• RDF is written in XML
• RDF is a part of the W3C's Semantic Web Activity
• RDF is a W3C Recommendation
RDF - Examples of Use
• Describing properties for shopping items, such as price and availability
• Describing time schedules for web events
• Describing information about web pages, such as content, author, created and modified date
• Describing content and rating for web pictures
• Describing content for search engines
• Describing electronic libraries
RDF is Designed to be Read by Computers
RDF was designed to provide a common way to describe information so it can be read and understood by computer applications. RDF descriptions are not designed to be displayed on the web.
RDF is Written in XML
RDF documents are written in XML. The XML language used by RDF is called RDF/XML. By using XML, RDF information can easily be exchanged between different types of computers using different types of operating systems and application languages.
RDF and "The Semantic Web"
The RDF language is a part of the W3C's Semantic Web Activity. W3C's "Semantic Web Vision" is a future where:
• Web information has exact meaning
• Web information can be understood and processed by computers
• Computers can integrate information from the web
UNIT III
  26. 26. SOAP
OVERVIEW OF SOAP
SOAP allows Java objects and COM objects to talk to each other in a distributed, decentralized, Web-based environment. SOAP allows objects (or code) of any kind -- on any platform, in any language -- to cross-communicate. At present, SOAP has been implemented in over 60 languages on over 20 platforms. SOAP is simply one component in the emerging picture of the Web as a standards-based, language- and platform-neutral framework for business operations. These operations are commonly lumped under the generic tag "Web services," but Web services themselves are only as good as the infrastructure that supports them.
Three network tiers are evident in the evolution of Web services: TCP/IP, HTTP/HTML, and XML. These tiers build successively on top of each other and remain compatible today.
The first tier, the TCP/IP protocol, is concerned primarily with passing data across the wire in packets. A protocol that guarantees transmission across public networks, TCP/IP emphasizes reliability of data transport and physical connectivity. Originally the putty holding proprietary networks together, it's now the backbone protocol of the Web on which higher-level, standard protocols such as HTTP rely.
The second tier, HTML over HTTP, is a presentation tier and concerns itself with browser-based search, retrieval and sharing of information. The emphasis here is on GUI-based navigation and the manipulation of presentation formats. In many ways, HTML is more show than go and lacks both extensibility and true programming power. Nonetheless, sharing hypertext-linked documents in a browser-based environment revolutionized the way humans communicate text-based information to one another. Networked desktop environments, burdened with proprietary operating systems and platform-dependent software, are slowly but surely giving way to the standards-based, open-systems computing of the Internet.
Leading the charge into this brave new standards-based world is XML, the third and possibly the most compelling tier on the Internet. XML, a strongly-typed data interchange format, provides a new dimension to the HTTP/HTML tier, one in which machine-to-machine communication is made possible through standard interfaces. This layer -- variously described as A2A (application to application), B2B (business to business) or C2C (computer to computer) -- allows programs to exchange data formatted in a platform- and presentation-independent manner. XSLT style sheets may be added as an optional presentation and/or transformational component. SOAP is "a lightweight protocol for exchange of information in a decentralized, distributed environment." SOAP does not mandate a single programming model -- nor does it define language bindings for a specific programming language. In the context of the Java programming language, it's up to the
  27. 27. Java community to define the specific language binding. Java language bindings are now being pursued through the JAX-RPC initiative.
SOAP is an extensible, text-based framework for enabling communication between diverse parties -- in general, objects -- that have no prior knowledge of each other or of each other's platforms. From the point of view of objects on the net, SOAP is the ultimate blind date. Client applications can interoperate in loosely-coupled environments to discover and connect dynamically to services without any previous agreements having been established between them.
SOAP is extensible, because SOAP clients, servers and the protocol itself can evolve without breaking existing apps. SOAP, moreover, is generous in terms of supporting intermediaries and layered architectures. This means processing nodes can sit on the path a request takes between the client and server. These intermediate nodes process parts of the message specified by SOAP through the use of headers, which allow clients to identify which node works on what part of the message. This type of intermediate header processing is performed by private contract between the client application and the intermediate processing node.
SOAP provides a mustUnderstand attribute for headers, which allows the client to specify whether the processing is mandatory or optional. If mustUnderstand is set to 1, the server must either perform the intermediate
HTTP
In order to fetch a web page for you, your web browser must "talk" to a web server somewhere else. When web browsers talk to web servers, they speak a language known as HTTP, which stands for HyperText Transfer Protocol. This language is actually very simple and understandable and is not difficult for the human eye to follow.
A Simple HTTP Example
The browser says:
GET / HTTP/1.0
Host:
And the server replies:
HTTP/1.0 200 OK
Content-Type: text/html
HTTP is the protocol that drives the WWW.
It was conceived by Sir Tim Berners-Lee (that's right, they knighted him). The Web is based on the client-server programming model, in which the client (your browser) requests a resource (a Web page) from a server. A brief negotiation takes place and the server returns the resource, after which the browser renders the page so that you can view (or perhaps listen to) it.
  28. 28. The first line of the browser's request, GET / HTTP/1.0, indicates that the browser wants to see the home page of the site, and that the browser is using version 1.0 of the HTTP protocol. The second line, Host:, indicates the web site that the browser is asking for. This is required because many web sites may share the same IP address on the Internet and be hosted by a single computer. The Host: line was added a few years after the original release of HTTP 1.0 in order to accommodate this. The first line of the server's reply, HTTP/1.0 200 OK, indicates that the server is also speaking version 1.0 of the HTTP protocol, and that the request was successful. If the page the browser asked for did not exist, the response would read HTTP/1.0 404 Not Found. The second line of the server's reply, Content-Type: text/html, tells the browser that the object it is about to receive is a web page. This is how the browser knows what to do with the response from the server. If this line were Content-Type: image/png, the browser would know to expect a PNG image file rather than a web page, and would display it accordingly. A modern web browser would say a bit more using the HTTP 1.1 protocol, and a modern web server would respond with a bit more information, but the differences are not dramatic and the above transaction is still perfectly valid; if a browser made a request exactly like the one above today, it would still be accepted by any web server, and the response above would still be accepted by any browser. This simplicity is typical of most of the protocols that grew up around the Internet. In fact, you can try being a web browser yourself, if you are a patient typist. If you are using Windows, click the Start menu, select "Run," and type "telnet 80" in the dialog that appears. Then click OK. Users of other operating systems can do the same thing; just start your own telnet program and connect to your web site as the host and 80 as the port number. 
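The request and response lines walked through above can be sketched in Python without touching the network. This is a minimal illustration, not a full HTTP client: the host name www.example.com is a placeholder, the sample response is hand-written to match the example in the text, and the two helper functions are hypothetical names introduced here.

```python
def build_request(host, path="/"):
    """Build the two-line HTTP/1.0 GET request described above."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")

def parse_response_head(raw):
    """Split a raw response into (version, status code, reason, headers)."""
    # Headers end at the first blank line; the body (if any) follows it.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers

# Placeholder host; any site name would be formatted the same way.
request = build_request("www.example.com")

# Hand-written response matching the example in the text.
sample = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
version, code, reason, headers = parse_response_head(sample)

print(request)
print(version, code, reason, headers["content-type"])
```

The Content-Type value recovered here is what a browser would use to decide whether to render the body as a page or display it as an image.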
When the connection is made, type:
  29. 29. XML – RPC Inside every computer, every time you click a key or the mouse, thousands of "procedure calls" are spawned, analyzing, computing and then acting on your gestures. A procedure call is the name of a procedure, its parameters, and the result it returns. Every program is just a single procedure called main, every operating system has a main procedure called a kernel. There's a top level to every program that sits in a loop waiting for something to happen and then distributes control to a hierarchy of procedures that respond. This is at the heart of interactivity and networking, it's at the heart of software. What is RPC? RPC is a very simple extension to the procedure call idea, it says let's create connections between procedures that are running in different applications, or on different machines. Conceptually, there's no difference between a local procedure call and a remote one, but they are implemented differently, perform differently (RPC is much slower) and therefore are used for different things. Remote calls are "marshalled" into a format that can be understood on the other side of the connection. As long as two machines agree on a format, they can talk to each other. That's why Windows machines can be networked with other Windows machines, and Macs can talk to Macs, etc. The value in a standardized cross-platform approach for RPC is that it allows Unix machines to talk to Windows machines and vice versa. What is XML-RPC? There are an almost infinite number of formats possible. One possible format is XML, a new language that both humans and computers can read. XML-RPC uses XML as the marshalling format. It allows Macs to easily make procedure calls to software running on Windows machines and BeOS machines, as well as all flavors of Unix and Java, and IBM mainframes, and PDAs and sewing machines (they have computers in them too these days). 
With XML it's easy to see what it's doing, and it's also relatively easy to marshall the internal procedure call format into a remote format. OK, now that we understand what XML-RPC is, let the XML part fade into the background. It's an implementation detail. Programmers are interested in XML, as are web developers, but if you're a user or an investor, XML is about as important as C++ or Java. The developers like it, or seem to, and that's the only major take-away from the XML part of XML-RPC. But RPC is important, no matter what format is used, because it allows choices, you can replace a component with another one; and it opens possibilities, empowering advanced users to develop solutions with packaged software that the developers didn't anticipate.
  30. 30. XML-RPC is among the simplest and most foolproof web service approaches, and makes it easy for computers to call procedures on other computers. XML-RPC permits programs to make function or procedure calls across a network. XML-RPC uses the HTTP protocol to pass information from a client computer to a server computer. XML-RPC uses a small XML vocabulary to describe the nature of requests and responses. An XML-RPC client specifies a procedure name and parameters in the XML request, and the server returns either a fault or a response in the XML response. XML-RPC parameters are a simple list of types and content - structs and arrays are the most complex types available. XML-RPC has no notion of objects and no mechanism for including information that uses other XML vocabularies. With XML-RPC and web services, however, the Web becomes a collection of procedural connections where computers exchange information along tightly bound paths. XML-RPC emerged in early 1998; it was published by UserLand Software and initially implemented in their Frontier product.
XML-RPC consists of three relatively small parts:
• XML-RPC data model: a set of types for use in passing parameters, return values, and faults (error messages)
• XML-RPC request structures: an HTTP POST request containing method and parameter information
• XML-RPC response structures: an HTTP response that contains return values or fault information
The XML-RPC specification defines six basic data types and two compound data types that represent combinations of types.
Basic data types in XML-RPC
Type | Value | Examples
int or i4 | 32-bit integers between -2,147,483,648 and 2,147,483,647 | <int>27</int>, <i4>27</i4>
double | 64-bit floating-point numbers | <double>27.31415</double>, <double>-1.1465</double>
boolean | true (1) or false (0) | <boolean>1</boolean>, <boolean>0</boolean>
string | ASCII text, though many implementations support Unicode | <string>Hello</string>, <string>bonkers! @</string>
dateTime.iso8601 | Dates in ISO8601 format: CCYYMMDDTHH:MM:SS | <dateTime.iso8601>20021125T02:20:04</dateTime.iso8601>, <dateTime.iso8601>20020104T17:27:30</dateTime.iso8601>
base64 | Binary information encoded as Base 64, as defined in RFC 2045 | <base64>SGVsbG8sIFdvcmxkIQ==</base64>
  31. 31. These basic types are always enclosed in value elements. Strings (and only strings) may be enclosed in a value element but omit the string element. These basic types may be combined into two more complex types, arrays and structs. Arrays represent sequential information, while structs represent name-value pairs, much like hashtables, associative arrays, or properties.
Arrays are indicated by the array element, which contains a data element holding the list of values. Like other data types, the array element must be enclosed in a value element. For example, the following array contains four strings:
<value>
  <array>
    <data>
      <value><string>This </string></value>
      <value><string>is </string></value>
      <value><string>an </string></value>
      <value><string>array.</string></value>
    </data>
  </array>
</value>
XML-RPC requests are a combination of XML content and HTTP headers. The XML content uses the data typing structure to pass parameters and contains additional information identifying which procedure is being called, while the HTTP headers provide a wrapper for passing the request over the Web.
Each request contains a single XML document, whose root element is a methodCall element. Each methodCall element contains a methodName element and a params element. The methodName element identifies the name of the procedure to be called, while the params element contains a list of parameters and their values. Each params element includes a list of param elements which in turn contain value elements.
For example, to pass a request to a method called circleArea, which takes a Double parameter
  32. 32. (for the radius), the XML-RPC request would look like:
<?xml version="1.0"?>
<methodCall>
  <methodName>circleArea</methodName>
  <params>
    <param>
      <value><double>2.41</double></value>
    </param>
  </params>
</methodCall>
The HTTP headers for these requests will reflect the senders and the content. The basic template looks like:
POST /target HTTP/1.0
User-Agent: Identifier
Host: host.making.request
Content-Type: text/xml
Content-Length: length of request in bytes
For example, if the circleArea method were available from an XML-RPC server listening at /xmlrpc, the request might look like:
POST /xmlrpc HTTP/1.0
User-Agent: myXMLRPCClient/1.0
Host:
Content-Type: text/xml
Content-Length: 169
Assembled, the entire request would look like:
POST /xmlrpc HTTP/1.0
  33. 33. User-Agent: myXMLRPCClient/1.0
Host:
Content-Type: text/xml
Content-Length: 169

<?xml version="1.0"?>
<methodCall>
  <methodName>circleArea</methodName>
  <params>
    <param>
      <value><double>2.41</double></value>
    </param>
  </params>
</methodCall>
It's an ordinary HTTP request, with a carefully constructed payload.
Responses are much like requests, with a few extra twists. If the response is successful - the procedure was found, executed correctly, and returned results - then the XML-RPC response will look much like a request, except that the methodCall element is replaced by a methodResponse element and there is no methodName element:
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><double>18.24668429131</double></value>
    </param>
  </params>
</methodResponse>
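The request payload above does not have to be written by hand. As a small sketch, Python's standard xmlrpc.client module can marshal the same circleArea call; the code then re-parses the generated XML only to confirm that it has the methodCall / methodName / params structure described in the text. The exact whitespace and quoting of the generated XML may differ from the hand-written example.

```python
import xmlrpc.client
import xml.etree.ElementTree as ET

# Marshal a call to the circleArea method with one double parameter.
payload = xmlrpc.client.dumps((2.41,), methodname="circleArea")

# Inspect the generated XML to confirm it has the structure described above.
root = ET.fromstring(payload)
method_name = root.findtext("methodName")
radius = float(root.findtext("params/param/value/double"))

print(payload)
print(method_name, radius)
```

To actually send such a request, xmlrpc.client.ServerProxy wraps the marshalling and the HTTP POST together; that step needs a running server, so it is left out of this sketch.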
  34. 34. An XML-RPC response can only contain one parameter. That parameter may be an array or a struct, so it is possible to return multiple values. A response is always required to return a value; a method with nothing meaningful to return can send a "success value", perhaps a boolean set to true (1).
Like requests, responses are packaged in HTTP and have HTTP headers. All XML-RPC responses use the 200 OK response code, even if a fault is contained in the message. Headers use a common structure similar to that of requests, and a typical set of headers might look like:
HTTP/1.1 200 OK
Date: Sat, 06 Oct 2001 23:20:04 GMT
Server: Apache/1.3.12 (Unix)
Connection: close
Content-Type: text/xml
Content-Length: 124
• XML-RPC only requires HTTP 1.0 support, but HTTP 1.1 is compatible.
• The Content-Type must be set to text/xml.
• The Content-Length header specifies the length of the response in bytes.
  35. 35. A complete response, with both headers and a response payload, would look like:
HTTP/1.1 200 OK
Date: Sat, 06 Oct 2001 23:20:04 GMT
Server: Apache/1.3.12 (Unix)
Connection: close
Content-Type: text/xml
Content-Length: 124

<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><double>18.24668429131</double></value>
    </param>
  </params>
</methodResponse>
After the response is delivered from the XML-RPC server to the XML-RPC client, the connection is closed. Follow-up requests need to be sent as separate XML-RPC connections.
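On the client side, the response payload has to be unmarshalled again. The sketch below feeds the methodResponse document from the text to Python's standard xmlrpc.client.loads, and then shows how a fault surfaces as an exception even though, as noted above, it would travel with a 200 OK status. The fault code and fault string are invented for the example.

```python
import xmlrpc.client

# The success payload from the text.
RESPONSE = """<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><double>18.24668429131</double></value>
    </param>
  </params>
</methodResponse>"""

# loads returns the parameter tuple and the method name (None for responses).
params, method = xmlrpc.client.loads(RESPONSE)
area = params[0]

# Build a fault payload; the code and message here are made up.
fault_xml = xmlrpc.client.dumps(xmlrpc.client.Fault(4, "Too many parameters."))
try:
    xmlrpc.client.loads(fault_xml)
    fault = None
except xmlrpc.client.Fault as err:
    # The library turns a <fault> response into a Python exception.
    fault = (err.faultCode, err.faultString)

print(area, method, fault)
```

Because the HTTP status is the same in both cases, a client has to look inside the XML body to tell a result from a fault; here the library does that check while unmarshalling.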
  36. 36. SOAP
SOAP is a simple XML-based protocol to let applications exchange information over HTTP.
• SOAP stands for Simple Object Access Protocol
• SOAP is a communication protocol
• SOAP is for communication between applications
• SOAP is a format for sending messages
• SOAP is designed to communicate via Internet
• SOAP is platform independent
• SOAP is language independent
• SOAP is based on XML
• SOAP is simple and extensible
• SOAP allows you to get around firewalls
• SOAP will be developed as a W3C standard
Why SOAP?
It is important for application development to allow Internet communication between programs. Today's applications communicate using Remote Procedure Calls (RPC) between objects like DCOM and CORBA, but HTTP was not designed for this. RPC represents a compatibility and security problem; firewalls and proxy servers will normally block this kind of traffic. A better way to communicate between applications is over HTTP, because HTTP is supported by all Internet browsers and servers. SOAP was created to accomplish this. SOAP provides a way to communicate between applications running on different operating systems, with different technologies and programming languages.
A SOAP message is an ordinary XML document containing the following elements:
• A required Envelope element that identifies the XML document as a SOAP message
• An optional Header element that contains header information
• A required Body element that contains call and response information
• An optional Fault element that provides information about errors that occurred while processing the message
All the elements above are declared in the default namespace for the SOAP envelope:
Syntax Rules
Here are some important syntax rules:
1. A SOAP message MUST be encoded using XML
  37. 37. 2. A SOAP message MUST use the SOAP Envelope namespace
3. A SOAP message MUST use the SOAP Encoding namespace
4. A SOAP message must NOT contain a DTD reference
5. A SOAP message must NOT contain XML Processing Instructions
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="" soap:encodingStyle="">
  <soap:Header>
    ...
    ...
  </soap:Header>
  <soap:Body>
    ...
    ...
    <soap:Fault>
      ...
      ...
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
The SOAP Envelope Element
The required SOAP Envelope element is the root element of a SOAP message. It defines the XML document as a SOAP message. Note the use of the xmlns:soap namespace. It should always have the value of: and it defines the Envelope as a SOAP Envelope
<?xml version="1.0"?>
  38. 38. <soap:Envelope xmlns:soap="" soap:encodingStyle="">
  ...
  Message information goes here
  ...
</soap:Envelope>
The xmlns:soap Namespace
A SOAP message must always have an Envelope element associated with the "" namespace. If a different namespace is used, the application must generate an error and discard the message.
The SOAP Header Element
The optional SOAP Header element contains application specific information (like authentication, payment, etc) about the SOAP message. If the Header element is present, it must be the first child element of the Envelope element. All immediate child elements of the Header element must be namespace-qualified.
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="" soap:encodingStyle="">
  <soap:Header>
    <m:Trans xmlns:m="" soap:mustUnderstand="1">234</m:Trans>
  </soap:Header>
  ...
  ...
  39. 39. </soap:Envelope>
The example above contains a header with a "Trans" element, a "mustUnderstand" attribute value of "1", and a value of 234. SOAP defines three attributes in the default namespace (" envelope"). These attributes are: actor, mustUnderstand, and encodingStyle. The attributes defined in the SOAP Header define how a recipient should process the SOAP message.
The actor Attribute
A SOAP message may travel from a sender to a receiver by passing different endpoints along the message path. Not all parts of the SOAP message may be intended for the ultimate endpoint of the SOAP message but, instead, may be intended for one or more of the endpoints on the message path. The SOAP actor attribute may be used to address the Header element to a particular endpoint.
Syntax
soap:actor="URI"
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="" soap:encodingStyle="">
  <soap:Header>
    <m:Trans xmlns:m="" soap:actor=""> 234 </m:Trans>
  </soap:Header>
  ...
  ...
</soap:Envelope>
The mustUnderstand Attribute
  40. 40. The SOAP mustUnderstand attribute can be used to indicate whether a header entry is mandatory or optional for the recipient to process. If you add mustUnderstand="1" to a child element of the Header element, it indicates that the receiver processing the Header must recognize the element. If the receiver does not recognize the element, it must fail when processing the Header.
Syntax
soap:mustUnderstand="0|1"
Example
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="" soap:encodingStyle="">
  <soap:Header>
    <m:Trans xmlns:m="" soap:mustUnderstand="1"> 234 </m:Trans>
  </soap:Header>
  ...
  ...
</soap:Envelope>
The SOAP Body Element
The required SOAP Body element contains the actual SOAP message intended for the ultimate endpoint of the message.
  41. 41. Immediate child elements of the SOAP Body element may be namespace-qualified. SOAP defines one element inside the Body element in the default namespace (""). This is the SOAP Fault element, which is used to indicate error messages.
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="" soap:encodingStyle="">
  <soap:Body>
    <m:GetPrice xmlns:m="">
      <m:Item>Apples</m:Item>
    </m:GetPrice>
  </soap:Body>
</soap:Envelope>
The example above requests the price of apples. Note that the m:GetPrice and the Item elements above are application-specific elements. They are not a part of the SOAP standard.
A SOAP response could look something like this:
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="" soap:encodingStyle="">
  <soap:Body>
    <m:GetPriceResponse xmlns:m="">
      <m:Price>1.90</m:Price>
    </m:GetPriceResponse>
  </soap:Body>
</soap:Envelope>
The SOAP Fault Element
  42. 42. An error message from a SOAP message is carried inside a Fault element. If a Fault element is present, it must appear as a child element of the Body element. A Fault element can only appear once in a SOAP message.
The SOAP Fault element has the following sub elements:
Sub Element | Description
<faultcode> | A code for identifying the fault
<faultstring> | A human readable explanation of the fault
<faultactor> | Information about who caused the fault to happen
<detail> | Holds application specific error information related to the Body element
The faultcode values defined below must be used in the faultcode element when describing faults:
Error | Description
VersionMismatch | Found an invalid namespace for the SOAP Envelope element
MustUnderstand | An immediate child element of the Header element, with the mustUnderstand attribute set to "1", was not understood
Client | The message was incorrectly formed or contained incorrect information
Server | There was a problem with the server so the message could not proceed
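How a receiver might tell a normal Body from a Fault can be sketched with Python's standard ElementTree. Several things here are assumptions: the envelope namespace used is the SOAP 1.1 one (the namespace URIs are not shown in the text above), the urn:example:prices namespace and the fault text are invented, and read_body is a hypothetical helper, not part of any SOAP library.

```python
import xml.etree.ElementTree as ET

# Assumption: the SOAP 1.1 envelope namespace; the text's own URI is not shown.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
NS = {"soap": SOAP_NS}

def read_body(envelope_xml):
    """Return ("fault", details) or ("ok", first Body child) for a SOAP message."""
    root = ET.fromstring(envelope_xml)
    if root.tag != f"{{{SOAP_NS}}}Envelope":
        # Mirrors the VersionMismatch row in the table above.
        return "fault", {"faultcode": "VersionMismatch",
                         "faultstring": "unexpected Envelope namespace"}
    body = root.find("soap:Body", NS)
    if body is None or len(body) == 0:
        return "fault", {"faultcode": "Client",
                         "faultstring": "missing or empty Body"}
    fault = body.find("soap:Fault", NS)
    if fault is not None:
        # In SOAP 1.1 the fault sub-elements carry no namespace prefix.
        return "fault", {child.tag: (child.text or "").strip() for child in fault}
    return "ok", body[0]

FAULT_MSG = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Client</faultcode>
      <faultstring>Item element is missing</faultstring>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""

OK_MSG = f"""<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Body>
    <m:GetPriceResponse xmlns:m="urn:example:prices">
      <m:Price>1.90</m:Price>
    </m:GetPriceResponse>
  </soap:Body>
</soap:Envelope>"""

kind, info = read_body(FAULT_MSG)
ok_kind, payload = read_body(OK_MSG)
price = payload.findtext("{urn:example:prices}Price")
print(kind, info, ok_kind, price)
```

The namespace check comes first because, as described earlier, a message whose Envelope is in the wrong namespace is rejected before its contents are examined.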
  43. 43. SOAP WITH ATTACHMENTS You can associate a SOAP message with one or more attachments in their native format (for example GIF or JPEG) by using a multipart MIME structure for transport. There are two core standards that define how to do this:
  44. 44. SOAP with Attachments (SwA), or MIME for Web Services, refers to the method of using Web Services to send and receive files using a combination of SOAP and MIME, primarily over HTTP.
UNIT IV WEB SERVICES
Web Services Definition by W3C
● A Web service is a software application
● identified by a URI,
● whose interfaces and binding are capable of being defined, described and discovered by XML artifacts and
● supports direct interactions with other software applications
● using XML based messages
● via internet-based protocols
Characteristics of Web Services
● XML based everywhere
● Message-based
● Programming language independent
● Could be dynamically located
● Could be dynamically assembled or aggregated
● Accessed over the internet
● Loosely coupled
● Based on industry standards
● Are platform neutral
● Are accessible in a standard way
● Are accessible in an interoperable way
● Use simple and ubiquitous plumbing
● Are relatively cheap
● Simplify enterprise integration
Why Web Services?
● Interoperable: Connect across heterogeneous networks using ubiquitous web-based standards
● Economical: Recycle components, no installation and tight integration of software
● Automatic: No human intervention required even for highly complex transactions
● Accessible: Legacy assets & internal apps are exposed and accessible on the web
● Available: Services on any device, anywhere, anytime
● Scalable: No limits on scope of applications and amount of heterogeneous applications
  45. 45. WEB SERVICE ARCHITECTURE AND KEY TECHNOLOGIES
Web services are software components that can be accessed over the Web through standards-based protocols such as HTTP or SMTP for use in other applications. They provide a fundamentally new framework and set of standards for a computing environment that can include servers, workstations, desktop clients, and lightweight "pervasive" clients such as phones and PDAs. Web services are not limited to the Internet; they supply a powerful architecture for all types of distributed computing. Web services standards are the glue that allows computers and devices to interact, forming a greater computing whole that can be accessed from any device on the network. In Web services, computing nodes have three roles: client, service, and broker.
o A client is any computer that accesses functions from one or more other computing nodes on the network. Typical clients include desktop computers, Web browsers, Java applets, and mobile devices. A client process makes a request for a computing service and receives results for that request.
o A service is a computing process that receives and responds to requests and returns a set of results.
o A broker is essentially a service metadata portal for registering and discovering services. Any network client can search the portal for an appropriate service.
Because Web services can support the integration of information and services that are maintained on a distributed network, they are appealing to local governments and other organizations that have departments that independently collect and manage spatial data but must integrate these datasets.
  46. 46. A series of protocols provides the key standards for Web services and supports sophisticated communications between various nodes on a network: eXtensible Markup Language (XML); Simple Object Access Protocol (SOAP); Web Service Description Language (WSDL); and Universal Description, Discovery, and Integration (UDDI). These protocols enable smarter communication and collaborative processing among nodes built within any Web services-compliant architecture. UDDI allows clients to discover Web services. In a GIS context, the UDDI node plays the role of a metadata server for registered Web services. A user can search the UDDI directory and locate the distributed service providers or services that exist on a network. Web services interoperate (i.e., communicate) through an XML-based protocol known as SOAP. This is an XML API for the functions provided by a Web service. Each Web service advertises its SOAP API using WSDL, which allows easy discovery of any service's capabilities. Web services provide an open, interoperable, and highly efficient framework for implementing systems. Software components communicate with each other via standard SOAP and XML protocols. A developer need only wrap an application with a SOAP API and it can talk (either calling or serving) with other applications. Web services are efficient because they build on the stateless (i.e., loosely coupled) environment of the Internet. A number of nodes can be dynamically connected only when needed to carry out a specific task such as updating a database or providing a particular service.
UDDI
Universal Description, Discovery and Integration (UDDI) is a platform-independent, XML-based registry for businesses worldwide to list themselves on the Internet. UDDI is an open industry initiative, sponsored by OASIS, enabling businesses to publish service listings and discover each other and define how the services or software applications interact over the Internet.
A UDDI business registration consists of three components:
• White Pages: address, contact, and known identifiers;
• Yellow Pages: industrial categorizations based on standard taxonomies;
• Green Pages: technical information about services exposed by the business.
UDDI is one of the core Web services standards. It is designed to be interrogated by SOAP messages and to provide access to Web Services Description Language documents describing the protocol bindings and message formats required to interact with the web services listed in its directory.
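To make the three-part registration concrete, the following Python sketch models it as a small in-memory data structure. This is a toy model only, not the UDDI API (a real registry is queried with SOAP messages); the class names, field names, the sample business and the example.com URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessRegistration:
    """Toy model of one registration; field names are illustrative only."""
    # White Pages: address, contact, and known identifiers
    name: str
    address: str
    contact: str
    identifiers: list = field(default_factory=list)
    # Yellow Pages: industrial categorizations
    categories: list = field(default_factory=list)
    # Green Pages: technical information, here service name -> WSDL location
    services: dict = field(default_factory=dict)


class ToyRegistry:
    """In-memory stand-in for a broker that registers and discovers services."""

    def __init__(self):
        self._entries = []

    def publish(self, registration):
        self._entries.append(registration)

    def find_by_category(self, category):
        # Discovery by Yellow Pages category.
        return [r for r in self._entries if category in r.categories]


registry = ToyRegistry()
registry.publish(BusinessRegistration(
    name="Example Grocers",
    address="1 Market Street",
    contact="sales@example.com",
    identifiers=["EX-0001"],
    categories=["grocery"],
    services={"GetPrice": "http://example.com/prices?wsdl"},
))
matches = registry.find_by_category("grocery")
```

A client that finds a match would then read the Green Pages entry to locate the WSDL document and learn how to call the service.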
  47. 47. UDDI was written in August 2000, at a time when a public, broker-based marketplace for web services was envisioned. In such a world, the publicly operated UDDI node or broker would be critical for everyone. For the consumer, public or open brokers would only return services listed for public discovery by others, while for a service producer, getting a good placement in the brokerage, by relying on metadata of authoritative index categories, would be critical. UDDI was integrated into the Web Services Interoperability (WS-I) standard as a central pillar of web services infrastructure. By the end of 2005, it was on the agenda for use by more than seventy percent of the Fortune 500 companies in either a public or private implementation, particularly among those enterprises that seek to optimize software or service reuse. Many of these enterprises subscribe to some form of service-oriented architecture (SOA), server programs or database software licensed by some of the professed founders of the and OASIS. The UDDI specifications supported a publicly accessible Universal Business Registry in which a naming system was built around the UDDI-driven service broker. IBM, Microsoft and SAP announced they were closing their public UDDI nodes in January 2006. Some assert that the most common place that a UDDI system can be found is inside a company, where it is used to dynamically bind client systems to implementations. They would say that much of the search metadata permitted in UDDI is not used for this relatively simple role. However, the core of the trade infrastructure under UDDI, when deployed in the Universal Business Registries (now being disabled), has made all the information available to any client application, regardless of heterogeneous computing domains.
UDDI registries come in two forms: public and private. Both types comply with the same specifications. A private registry enables you to publish and test your internal e-business applications in a secure, private environment.
Rational® Developer products include a private UDDI registry. A public registry is a collection of peer directories that contain information about businesses and services. It locates services that are registered at one of its peer nodes and facilitates the discovery of published Web services. Data is replicated at each of the registries on a regular basis. This ensures consistency in service description formats and makes it easy to track changes as they occur.
WSDL
What is WSDL?
• WSDL stands for Web Services Description Language
• WSDL is written in XML
• WSDL is an XML document
• WSDL is used to describe Web services
• WSDL is also used to locate Web services
• WSDL 1.1 is not a W3C standard (it was published as a W3C Note)
  48. 48. WSDL stands for Web Services Description Language. WSDL is a document written in XML. The document describes a Web service. It specifies the location of the service and the operations (or methods) the service exposes. WSDL 1.1 was submitted as a W3C Note by Ariba, IBM and Microsoft for describing services for the W3C XML Activity on XML Protocols in March 2001. (A W3C Note is made available by the W3C for discussion only. Publication of a Note by W3C indicates no endorsement by W3C or the W3C Team, or any W3C Members.) The first Working Draft of WSDL 1.2 was released by W3C in July 2002. A WSDL document is just a simple XML document. It contains a set of definitions to describe a web service.
The WSDL Document Structure
A WSDL document describes a web service using these major elements:
Element       Defines
<portType>    The operations performed by the web service
<message>     The messages used by the web service
<types>       The data types used by the web service
<binding>     The communication protocols used by the web service
The main structure of a WSDL document looks like this:
<definitions>
  <types> definition of types........ </types>
  <message> definition of a message.... </message>
  <portType> definition of a port....... </portType>
  <binding> definition of a binding.... </binding>
</definitions>
A WSDL document can also contain other elements, like extension elements and a service element that makes it possible to group together the definitions of several web services in one single WSDL document.
WSDL Ports
The <portType> element is the most important WSDL element. It describes a web service, the operations that can be performed, and the messages that are involved. The <portType> element can be compared to a function library (or a module, or a class) in a traditional programming language.
WSDL Messages
The <message> element defines the data elements of an operation. Each message can consist of one or more parts.
The parts can be compared to the parameters of a function call in a traditional programming language.
  49. 49. WSDL Types
The <types> element defines the data types that are used by the web service. For maximum platform neutrality, WSDL uses XML Schema syntax to define data types.
WSDL Bindings
The <binding> element defines the message format and protocol details for each port.
WSDL Example
This is a simplified fraction of a WSDL document:
<message name="getTermRequest">
  <part name="term" type="xs:string"/>
</message>
<message name="getTermResponse">
  <part name="value" type="xs:string"/>
</message>
<portType name="glossaryTerms">
  <operation name="getTerm">
    <input message="getTermRequest"/>
    <output message="getTermResponse"/>
  </operation>
</portType>
In this example the <portType> element defines "glossaryTerms" as the name of a port, and "getTerm" as the name of an operation. The "getTerm" operation has an input message called "getTermRequest" and an output message called "getTermResponse". The <message> elements define the parts of each message and the associated data types. Compared to traditional programming, "glossaryTerms" is a function library, and "getTerm" is a function with "getTermRequest" as the input parameter and "getTermResponse" as the return parameter.
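The function-library analogy can be made concrete by reading the fragment above with a parser and listing each operation with its parameters. The Python sketch below is a simplified illustration under two assumptions: the fragment is wrapped in a bare <definitions> root, and WSDL namespaces are omitted (a real WSDL document declares them). The describe_operations helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# The slide's fragment, wrapped in a root element so it parses on its own.
# Assumption: namespaces are left out to keep the sketch short.
WSDL_FRAGMENT = """<definitions>
  <message name="getTermRequest">
    <part name="term" type="xs:string"/>
  </message>
  <message name="getTermResponse">
    <part name="value" type="xs:string"/>
  </message>
  <portType name="glossaryTerms">
    <operation name="getTerm">
      <input message="getTermRequest"/>
      <output message="getTermResponse"/>
    </operation>
  </portType>
</definitions>"""


def describe_operations(wsdl_text):
    """List each operation with the parts of its input and output messages."""
    root = ET.fromstring(wsdl_text)
    # Map message name -> list of (part name, part type), like a parameter list.
    parts = {
        message.get("name"): [(p.get("name"), p.get("type"))
                              for p in message.findall("part")]
        for message in root.findall("message")
    }
    operations = []
    for port_type in root.findall("portType"):
        for operation in port_type.findall("operation"):
            operations.append({
                "port": port_type.get("name"),
                "operation": operation.get("name"),
                "inputs": parts[operation.find("input").get("message")],
                "outputs": parts[operation.find("output").get("message")],
            })
    return operations


operations = describe_operations(WSDL_FRAGMENT)
```

Each resulting entry reads like a function signature: the port is the library, the operation is the function, and the message parts are its parameters and return value.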
  50. 50. Web Service Caveats
1. Different implementations may not work together
2. SOAP messages on port 80 may bypass firewalls
3. Transactions must be specified outside the web services framework
4. Change management is not addressed
ebXML
Electronic Business using eXtensible Markup Language, commonly known as e-business XML, or ebXML, is a family of XML based standards sponsored by OASIS and UN/CEFACT whose
  51. 51. mission is to provide an open, XML-based infrastructure that enables the global use of electronic business information in an interoperable, secure, and consistent manner by all trading partners. The ebXML architecture is a unique set of concepts, part theoretical and part implemented in the existing ebXML standards work. The ebXML work stemmed from earlier work on ooEDI (object oriented EDI), UML / UMM, XML markup technologies and the X12 EDI "Future Vision" work sponsored by ANSI X12 EDI.
History
ebXML was started in 1999 as a joint initiative between the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) and the Organization for the Advancement of Structured Information Standards (OASIS). A joint coordinating committee composed of representatives from each of the two organizations led the effort. Quarterly meetings of the working groups were held between November 1999 and May 2001. At the final plenary a Memorandum of Understanding was signed by the two organizations, splitting up responsibility for the various specifications but continuing oversight by the joint coordinating committee. The original project envisioned five layers of data specification, including XML standards for:
• Business processes,
• Collaboration protocol agreements,
• Core data components,
• Messaging,
• Registries and repositories
After completion of the specifications by the two organizations, the work was submitted to ISO TC 154 for approval. The International Organization for Standardization (ISO) has approved the following five ebXML specifications as the ISO 15000 standard, under the general title, Electronic business eXtensible markup language:
• ISO 15000-1: ebXML Collaborative Partner Profile Agreement
• ISO 15000-2: ebXML Messaging Service Specification
• ISO 15000-3: ebXML Registry Information Model
• ISO 15000-4: ebXML Registry Services Specification
• ISO 15000-5: ebXML Core Components Technical Specification, Version 2.01.
  52. 52. ebXML terminology
Registry: A central server that stores a variety of data necessary to make ebXML work. Amongst the information a Registry makes available in XML form are: Business Process & Information Meta Models, Core Library, Collaboration Protocol Profiles, and Business Library. Basically, when a business wants to start an ebXML relationship with another business, it queries a Registry in order to locate a suitable partner and to find information about requirements for dealing with that partner.
Business Processes: Activities that a business can engage in (and for which it would generally want one or more partners). A Business Process is formally described by the Business Process Specification Schema (a W3C XML Schema and also a DTD), but may also be modeled in UML.
Collaboration Protocol Profile (CPP): A profile filed with a Registry by a business wishing to engage in ebXML transactions. The CPP will specify some Business Processes of the business, as well as some Business Service Interfaces it supports.
Business Service Interface: The ways that a business is able to carry out the transactions necessary in its Business Processes. The Business Service Interface also includes the kinds of Business Messages the business supports and the protocols over which these messages might travel.
Business Messages: The actual information communicated as part of a business transaction. A message will contain multiple layers. At the outside layer, an actual communication protocol must be used (such as HTTP or SMTP). SOAP is an ebXML recommendation as an envelope for a message "payload." Other layers may deal with encryption or authentication.
Core Library: A set of standard "parts" that may be used in larger ebXML elements. For example, Core Processes may be referenced by Business Processes. The Core Library is
  53. 53. contributed by the ebXML initiative itself, while larger elements may be contributed by specific industries or businesses.
Collaboration Protocol Agreement (CPA): In essence, a contract between two or more businesses that can be derived automatically from the CPPs of the respective companies. If a CPP says "I can do X," a CPA says "We will do X together."
Simple Object Access Protocol (SOAP): A W3C protocol for exchange of information in a distributed environment endorsed by the ebXML initiative. Of interest for ebXML is SOAP's function as an envelope that defines a framework for describing what is in a message and how to process it.
OVERVIEW OF .NET
.NET Pros
• It offers multiple language support.
• It has a rich set of libraries, a la JVM.
• It's open-standard friendly (e.g., HTTP and XML); it may even become a standard itself.
• Its code is compiled natively, regardless of language or deployment (Web or desktop).
.NET Cons
• It's yet another platform to consider, which generally means rewriting and learning new tricks.
• Microsoft tends to have good ideas, but mediocre implementation.
• Currently, it's only available on Windows.
• Microsoft claims C#, IL, and CLR/CLS will be submitted to ECMA, but there's still no clear view on what will be standardized from the platform.
Microsoft's .NET initiative has its origins in the increasing importance of the Web in almost all areas of application development. Previous development tools, exemplified by Visual Studio version 6.0, were designed for the needs of a decade ago, when the ruling paradigm was applications that were stand-alone or were distributed over a local area network (LAN). As the need for Web-related capabilities grew, ad hoc solutions were crafted as enhancements to existing tools. Because the Web capabilities were not built into the development tools from the beginning, however, there were inevitable problems with deployment, maintenance, and efficiency. Things are different with .NET.
The .NET Framework provides a comprehensive set of classes that are designed for just about any programming task you can imagine. From the very beginning, the Framework was designed to integrate Web-related programming functionality.
  54. 54. The Framework can be used by any of Microsoft's three programming languages: Visual Basic, C++, and C# (pronounced "C sharp"). The new releases of Visual Basic and C++ will be familiar to anyone who has used earlier versions, although there are numerous changes to accommodate the .NET architecture. C# is a new language that is similar to Java in many respects, although there are significant differences between the two. Some observers consider C# to be a Java replacement made necessary because legal problems have forced Microsoft to stop supporting Java (or Visual J++, as Microsoft's version of Java was called). For the XML developer, .NET was designed to support XML from the ground up. There are no add-ons required, such as the MSXML Parser or the SOAP Toolkit. Everything you need is provided by the Framework. Please remember that as of this writing, the .NET Framework is a beta product. It is believed that the XML support is fairly stable, but it is possible that there will be some changes before the final product is released (which may happen by the time you read this).
UNIT V XML SECURITY
1. Security overview
2. Canonicalization
3. XML security framework
4. XML encryption
5. XML digital signatures
6. XKMS
The basic security requirements
(1) Confidentiality: ensuring that information is not made available or disclosed to unauthorized individuals.
(2) Authentication:
(i) the ability to determine that the message really comes from the listed sender;
(ii) non-repudiation: preventing the originator of the document from denying having sent it.
(3) Data integrity: ensuring that information is not tampered with in transit.
Approaches to cryptography fall into two categories:
(i) single key cryptography
(ii) public key cryptography
Single key cryptography
• A single key is used for both encryption and decryption.
• The key must be known to both sender and receiver
• The difficulty in this approach is the distribution of the key
• Example: DES (Data Encryption Standard)
• Single key systems are effective for secure communication between ATMs and a server
• However, this approach does not scale up to the web, where e-commerce depends on individuals just showing up to do business.
Public key cryptography
  55. 55. • Enables secure communication without having to exchange a secret key
• It uses a mathematical formula to generate two separate but related keys
• One key is open to public view and the other is private, known only to one individual.
XML canonicalization
Canonical form fixes the following aspects of a document:
• Encoding scheme used to represent characters
• Line breaks
• Attribute values are normalized
• Double quotes for attribute values
• Special characters in attribute values and character content
• Entity references
• Default attributes
• XML and DTD declarations
• White space outside the document element
• White space in start and end elements
• Empty elements
• Namespace declarations
• Ordering of namespace declarations and attributes
XML security framework
W3C is driving three XML security technologies:
• XML digital signature
• XML encryption
• XML key management services
XML Encryption
• An important issue not addressed by SSL is encrypting only part of the data being exchanged
• XML Encryption overcomes this by allowing part of the data to be encrypted
• It can handle both XML and non-XML data
• It does not support encryption of attributes
[Figure: sample file to be encrypted]
Steps for XML encryption
1. Selecting the XML to be encrypted
2. Converting it into canonical form
3. Encrypting the resulting canonical form with the public key
4. Sending the encrypted XML
XML Digital Signature
1. Defines both the syntax and the rules for processing XML digital signatures
2. Defines a series of XML elements for describing details of the signature:
• Signed info: holds the information that is actually signed
• Canonicalization method: the algorithm used to canonicalize the signed info
• Signature method: the algorithm used to convert the canonicalized signed info into the signature value; it is a combination of a digest algorithm and a key-dependent algorithm
  56. 56. • Reference: includes the method used to compute the digital hash and the identified data object; the signature is later checked via reference validation and signature validation
• Key info: indicates the key to be used to validate the signature
• Transforms: an optional ordered list of processing steps applied to the resource's content before the digest was computed
• Digest method: the algorithm applied to the data, after the transforms are applied, to yield the digest value
• Digest value: holds the value computed based on the data being signed
Public key infrastructure
• An arrangement that binds public keys with respective user identities by means of a certificate authority (CA), an entity that issues digital certificates for use by other parties
• The CA issues a digital certificate that contains a public key and the identity of the owner
• It attests that the public key contained in the certificate belongs to the person, organization, server or other entity noted in the certificate
• PKI consists of client software, server hardware, software, legal contracts and assurances
• XML encryption and digital signature rely on PKI to help encrypt, decrypt, sign and verify various documents
• Various PKI solutions are available: X.509, PGP, SPKI
• Applications need to integrate with a PKI solution
• Different organizations use different PKIs
XKMS
- Allows management of PKI by moving the complexity of managing the PKI from client applications to a trusted third party
- The trusted third party hosts the XKMS service while providing a PKI interface to the client application
- This allows a client application to access PKI features, thereby reducing the client application's complexity
XKMS is made up of two specifications:
1. X-KRSS (XML Key Registration Service Specification): registration of public keys
2. X-KISS (XML Key Information Service Specification): retrieval of information based on key information
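The reason canonicalization comes before the digest step in this unit can be shown with a short Python sketch. It assumes Python 3.8 or later, whose standard library provides xml.etree.ElementTree.canonicalize (an implementation of Canonical XML 2.0), and it uses SHA-256 purely as an example digest algorithm. The sample documents and the digest_of helper are made up for this illustration, and no signing key is involved, so this covers only the digest value and not a full XML signature.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Two serializations of the same data: different quote characters,
# different attribute order, and a different empty-element form.
doc_a = "<Customer phone='914.631.7722' name=\"John von Neumann\"/>"
doc_b = '<Customer name="John von Neumann" phone="914.631.7722"></Customer>'
# A document whose data really differs (last digit of the phone number).
doc_c = '<Customer name="John von Neumann" phone="914.631.7723"/>'


def digest_of(xml_text):
    """Canonicalize the XML text, then hash the canonical bytes."""
    canonical = canonicalize(xml_text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


raw_a = hashlib.sha256(doc_a.encode("utf-8")).hexdigest()
raw_b = hashlib.sha256(doc_b.encode("utf-8")).hexdigest()

same_after_c14n = digest_of(doc_a) == digest_of(doc_b)
same_without_c14n = raw_a == raw_b
changed_data_detected = digest_of(doc_a) != digest_of(doc_c)
```

Without canonicalization the two equivalent serializations hash differently, so a receiver that re-serialized the document could never reproduce the sender's digest value; with it, only a real change to the data changes the digest.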