2. XML
eXtensible Markup Language
W3C standard
Why?
Store and transport data
Easy data exchange
Create more languages
WSDL (Web Service Description Language)
RDF (Resource Description Framework)
RSS (Really Simple Syndication)
Self-describing data
Easy to learn
Must learn
3. 3 Major Components
XML
XSL (eXtensible Stylesheet Language)
Style sheet language for XML documents
XSD (XML Schema Definition)
Describes the structure of an XML document
4. XML Document
<?xml version="1.0"?>
<!-- this is a sample -->
<note>
<to>Tove</to>
<from source="contacts">Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
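XML declaration (uses processing-instruction syntax): <?xml version="1.0"?>
Comment: <!-- this is a sample -->
Element: <to>Tove</to>
Attribute: source="contacts"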
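As a rough illustration of the parts labelled above, the sample note can be loaded with Python's standard-library xml.dom.minidom and each part inspected. The variable names are chosen only for this sketch.

```python
from xml.dom import minidom

# The sample document from the slide, with straight quotes around the attribute.
NOTE = """<?xml version="1.0"?>
<!-- this is a sample -->
<note>
<to>Tove</to>
<from source="contacts">Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

doc = minidom.parseString(NOTE)

# The comment sits at document level, before the root element.
comment = doc.childNodes[0]
root = doc.documentElement

# An element and the attribute it carries.
from_el = root.getElementsByTagName("from")[0]

print(comment.data.strip())            # text inside <!-- ... -->
print(root.tagName)                    # name of the root element
print(from_el.getAttribute("source"))  # value of the source attribute
print(from_el.firstChild.data)         # text content of <from>
```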
5. XML Documents
Well formed and Valid
Well formed
Exactly one root element
Every start tag has a matching end tag (or is an empty-element tag such as <br/>)
Tags never overlap: <author><name> … </name></author>, not <author><name> … </author></name>
Attribute values must be quoted
Valid
Well formed and conforms to its schema
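A minimal sketch of the well-formedness rules, assuming Python's standard-library ElementTree parser: any document that breaks one of the rules raises ParseError. The helper name is_well_formed and the sample strings are made up for this example. Checking validity against a schema needs a separate validator, which the standard library does not include.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule on the slide (hypothetical snippets).
samples = {
    "single root": "<note><to>Tove</to></note>",
    "two roots": "<note/><note/>",
    "missing end tag": "<note><to>Tove</note>",
    "overlapping tags": "<author><name>x</author></name>",
    "unquoted attribute": "<from source=contacts>Jani</from>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

Only the first sample is accepted; the other four each violate one rule.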
6. XML Documents
Has tree structure
Tags are case sensitive
<name> is different from <Name>
Comments
<!-- this is a comment -->
7. XML Elements
Can contain
Other elements
Text
Attributes
Valid names
<name>, <first_name>, <first2names>
Invalid names
<2nd_name>, <$amount>, <first name>
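The naming rules can be checked the same way. The sketch below, again assuming Python's ElementTree, wraps each candidate name in an empty-element tag and lets the parser decide. The helper parses_as_element is hypothetical.

```python
import xml.etree.ElementTree as ET


def parses_as_element(name: str) -> bool:
    """Return True if <name/> is accepted as a single element called name."""
    try:
        return ET.fromstring(f"<{name}/>").tag == name
    except ET.ParseError:
        return False


valid = ["name", "first_name", "first2names"]
invalid = ["2nd_name", "$amount", "first name"]

valid_results = [parses_as_element(n) for n in valid]
invalid_results = [parses_as_element(n) for n in invalid]
print(valid_results)
print(invalid_results)
```

A name may not start with a digit, contain "$", or contain a space, so the parser rejects all three invalid examples.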
8. XML elements and Attributes
Data goes in elements
<person><name>john</name></person>
Metadata goes in attributes
<image type='gif'><name>graph.gif</name></image>
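The split shows up in how an application reads the document. In this small sketch, assuming Python's ElementTree, element content is read through child lookups and metadata through the attribute accessor.

```python
import xml.etree.ElementTree as ET

image = ET.fromstring("<image type='gif'><name>graph.gif</name></image>")

file_name = image.find("name").text  # data: content of a child element
image_type = image.get("type")       # metadata: attribute on the element

print(file_name, image_type)
```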
9. 1.0 vs 1.1
• 1.0 – in names, everything not permitted is forbidden
• 1.1 – in names, everything not forbidden is permitted
• 1.0 documents are compatible with 1.1, not vice versa
• Forward compatible
• Does not affect English documents
10. XML Namespaces
The same element name can appear in more than one domain
Example: file in a hardware domain and in an office domain
<file>
<length>18</length>
<price>3.69</price>
</file>
<file>
<content>Employee data</content>
<numberOfPages>25</numberOfPages>
</file>
11. XML Namespaces
How to distinguish?
Solution: namespaces
<h:file xmlns:h="http://www.hardware.com/">
<h:length>18</h:length>
<h:price>3.69</h:price>
</h:file>
<o:file xmlns:o="http://www.office.com/people">
<o:content>Employee data</o:content>
<o:numberOfPages>25</o:numberOfPages>
</o:file>
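A sketch of how a namespace-aware parser tells the two file elements apart, assuming Python's ElementTree. The <inventory> wrapper is a hypothetical root added only so both fragments fit in one well-formed document, and the prefix map NS is local to this example.

```python
import xml.etree.ElementTree as ET

# <inventory> is a made-up root element; the two <file> fragments are from the slide.
DOC = """<inventory>
  <h:file xmlns:h="http://www.hardware.com/">
    <h:length>18</h:length>
    <h:price>3.69</h:price>
  </h:file>
  <o:file xmlns:o="http://www.office.com/people">
    <o:content>Employee data</o:content>
    <o:numberOfPages>25</o:numberOfPages>
  </o:file>
</inventory>"""

# Prefixes used in queries map to namespace URIs; they need not match the document's prefixes.
NS = {"h": "http://www.hardware.com/", "o": "http://www.office.com/people"}

root = ET.fromstring(DOC)
hardware_file = root.find("h:file", NS)
office_file = root.find("o:file", NS)

print(hardware_file.tag)                             # expanded name includes the hardware URI
print(hardware_file.find("h:price", NS).text)
print(office_file.find("o:numberOfPages", NS).text)
```

The parser identifies each element by its namespace URI plus local name, so the two file elements never collide even though the local name is the same.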
13. XML Parsers
• Software that reads an XML document and presents its content to the application
• Implementing an XML parser
• Java way
• SAX (Simple API for XML)
• DOM (Document Object Model)
• StAX (Streaming API for XML)
15. XML Parsers
Feature                    StAX              SAX               DOM
API type                   Pull, streaming   Push, streaming   In-memory tree
Ease of use                High              Medium            High
XPath capability           No                No                Yes
CPU and memory efficiency  Good              Good              Varies
Forward only               Yes               Yes               No
Read XML                   Yes               Yes               Yes
Write XML                  Yes               No                Yes
CRUD                       No                No                Yes
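The three API styles can be compared side by side. The slides name the Java APIs; the sketch below instead uses rough Python standard-library counterparts as an assumption for illustration: xml.sax for push parsing, xml.dom.pulldom for pull parsing (similar in spirit to StAX, not the same API), and xml.dom.minidom for the in-memory tree. The NameCollector class is hypothetical.

```python
import xml.sax
from xml.dom import minidom, pulldom

NOTE = '<note><to>Tove</to><from source="contacts">Jani</from></note>'


class NameCollector(xml.sax.ContentHandler):
    """Push style: the parser calls back into this handler for each start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


collector = NameCollector()
xml.sax.parseString(NOTE.encode("utf-8"), collector)
sax_names = collector.names

# Pull style: the application asks for the next event when it is ready.
pull_names = []
for event, node in pulldom.parseString(NOTE):
    if event == pulldom.START_ELEMENT:
        pull_names.append(node.tagName)

# Tree style: the whole document is in memory, so nodes can be revisited and changed.
doc = minidom.parseString(NOTE)
from_el = doc.getElementsByTagName("from")[0]
from_el.setAttribute("source", "phonebook")
dom_source = from_el.getAttribute("source")

print(sax_names, pull_names, dom_source)
```

Both streaming styles see the elements once, in document order, while the tree can be navigated and updated freely, which matches the "Forward only" and "CRUD" rows of the table.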
16. XSL
• XSL is a language for expressing stylesheets
• eXtensible Stylesheet Language
• XSLT (XSL Transformations)
• XPath
• XSL-FO (XSL Formatting Objects): XML vocabulary for specifying formatting semantics
20. XPath
• A few examples
How to refer to the body element?
/note/body   (the leading '/' starts at the document root)
How to get the source attribute?
/note/from/@source
How to get all elements with a source attribute?
//*[@source]
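These queries can be tried out against the sample note. The sketch assumes Python's ElementTree, which supports only a subset of XPath: paths are relative to the element they are called on, and attribute values are read with get() rather than with an @source step.

```python
import xml.etree.ElementTree as ET

NOTE = """<note>
<to>Tove</to>
<from source="contacts">Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE)  # root is the <note> element

# /note/body: a path relative to <note>
body = root.find("body")

# /note/from/@source: locate the element, then read the attribute
source = root.find("from").get("source")

# //*[@source]: every descendant of <note> that has a source attribute
with_source = root.findall(".//*[@source]")

print(body.text)
print(source)
print([el.tag for el in with_source])
```

A full XPath engine would evaluate the three expressions exactly as written on the slide; the relative forms here are the closest equivalents in this subset.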
21. XSLT
• A language to convert XML documents to other formats
• W3C Recommendation
• Uses XPath