2. Introduction
XML stands for extensible Markup Language.
XML was designed to store and transport data.
XML was designed to be both human- and
machine-readable.
XML is a markup language much like HTML
XML was designed to be self-descriptive
XML is a W3C Recommendation
XML Simplifies Things
It simplifies data sharing
It simplifies data transport
It simplifies platform changes
It simplifies data availability
3. XML Vs HTML
XML was designed to carry data - with focus on
what data is
HTML was designed to display data - with
focus on how data looks
XML tags are not predefined like HTML tags
4. XML File
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</tit-le>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
5. XML Tree
XML documents form a tree structure that
starts at "the root" and branches to "the
leaves".
6. XML Tree Structure
XML documents are formed as element trees.
An XML tree starts at a root element and
branches from the root to child elements.
All elements can have sub elements (child
elements)
Self-Describing Syntax
A prolog defines the XML version and the character
encoding:
<?xml version="1.0" encoding="UTF-8"?>
7. XML Syntax Rules
The syntax rules of XML are very simple and logical
The XML Prolog
The XML prolog is optional. If it exists, it must come first in
the document.
XML documents can contain international characters, like
Norwegian or French.
To avoid errors, you should specify the encoding used, or save
your XML files as UTF-8.
UTF-8 is the default character encoding for XML documents.
XML Documents Must Have a Root Element
XML documents must contain one root element that is the
parent of all other elements:
8. XML Syntax Rules
All XML Elements Must Have a Closing Tag
In XML, it is illegal to omit the closing tag. All
elements must have a closing tag:
XML Tags are Case Sensitive
XML tags are case sensitive. The tag <Letter> is
different from the tag <letter>.
Opening and closing tags must be written with the
same case
XML Elements Must be Properly Nested
All elements must be properly nested within each
other
9. XML Syntax Rules
XML Attribute Values Must Always be Quoted
XML elements can have attributes in name/value pairs
just like in HTML.
Entity References
There are 5 pre-defined entity references in XML:
< < less than
> > greater than
& & ampersand
' ' apostrophe
" " quotation mark
10. XML Syntax Rules
Comments in XML
The syntax for writing comments in XML is similar to that of
HTML
<!-- This is a comment -->
11. XML Elements
An XML element is everything from (including)
the element's start tag to (including) the
element's end tag.
An element can contain:
text
attributes
other elements
or a mix of the above
<book category="children">
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
12. XML Elements
Empty XML Elements
An element with no content is said to be empty.
In XML, an empty element can indicate like this:
<element></element>
or
<element />
Empty elements can have attributes.
14. XML Attributes
attributes cannot contain multiple values (elements can)
attributes cannot contain tree structures (elements can)
attributes are not easily expandable (for future changes)
15. XML Namespaces
XML Namespaces provide a method to avoid element name
conflicts.
If these XML fragments were added together, there would be a
name conflict.
Both contain a <table> element, but the elements have
different content and meaning.
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
16. XML Namespaces
Solving the Name Conflict Using a Prefix
When using prefixes in XML, a namespace for the prefix must
be defined.
The namespace can be defined by an xmlns attribute in the
start tag of an element.
The namespace declaration has the following syntax.
xmlns:prefix="URI".
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
18. XML Namespaces
Namespaces can also be declared in the XML root element:
<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="https://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
19. Well Formed & Valid
An XML document with correct syntax is called "Well Formed".
An XML document validated against a DTD is both "Well
Formed" and "Valid".
20. XML DTD
DTD stands for Document Type Definition.
A DTD defines the structure and the legal elements and
attributes of an XML document.
Types
Internal DTD
External DTD
21. XML DTD
External DTD
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
</note>
Note.dtd:
<!DOCTYPE note
[
<!ELEMENT note (to,from,heading)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
]>
#PCDATA means parsed character data
22. XML DTD
An Internal DTD Declaration
#PCDATA means parsed character data
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
</note>
23. PCDATA & CDATA
PCDATA is text that WILL be parsed by a parser. The text will
be examined by the parser for entities and markup.
Tags inside the text will be treated as markup and entities will
be expanded.
However, parsed character data should not contain any &, <,
or > characters; these need to be represented by the &
< and > entities, respectively.
24. PCDATA & CDATA
CDATA means character data.
CDATA is text that will NOT be parsed by a parser.
Characters like "<" and "&" are illegal in XML elements.
"<" will generate an error because the parser interprets it as
the start of a new element.
"&" will generate an error because the parser interprets it as
the start of an character entity.
Some text, like JavaScript code, contains a lot of "<" or "&"
characters.
To avoid errors script code can be defined as CDATA.
Everything inside a CDATA section is ignored by the parser.
25. PCDATA & CDATA
<script>
<![CDATA[
function matchwo(a,b) {
if (a < b && a < 0) {
return 1;
} else {
return 0;
}
}
]]>
</script>
• A CDATA section cannot contain the string "]]>".
• Nested CDATA sections are not allowed.
• The "]]>" that marks the end of the CDATA section cannot contain spaces or line
breaks.
26. DTD - Elements
Declaring Only One Occurrence of an Element
<!ELEMENT note (message)>
Declaring Minimum One Occurrence of an Element
<!ELEMENT note (message+)>
Declaring Zero or More Occurrences of an Element
<!ELEMENT note (message*)>
Declaring Zero or One Occurrences of an Element
<!ELEMENT note (message?)>
Declaring either/or Content
<!ELEMENT note (to,from,header,(message|body))>
27. XML Schema
An XML Schema describes the structure of an XML document,
just like a DTD.
An XML document with correct syntax is called "Well Formed".
An XML document validated against an XML Schema is both
"Well Formed" and "Valid".
The XML Schema language is also referred to as XML Schema
Definition (XSD)
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
28. XML Schema
The purpose of an XML Schema is to define the legal building
blocks of an XML document:
the elements and attributes that can appear in a document
the number of (and order of) child elements
data types for elements and attributes
default and fixed values for elements and attributes
XML Schemas are More Powerful than DTD
XML Schemas are written in XML
XML Schemas are extensible to additions
XML Schemas support data types
XML Schemas support namespaces
•xs:string
•xs:decimal
•xs:integer
•xs:boolean
•xs:date
•xs:time
29. XML Schema
Data Types
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
Default and Fixed Values for Simple Elements
<xs:element name="color" type="xs:string" default="red"/>
<xs:element name="color" type="xs:string" fixed="red"/>
Attribute
<xs:attribute name="xxx" type="yyy"/>
30. XML Schema
The <schema> element is the root element of every XML Schema:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
31. Referencing a Schema in an XML Document
xmlns=https://www.w3schools.com
specifies the default namespace declaration. This declaration tells the schema-
validator that all the elements used in this XML document are declared in the
"https://www.w3schools.com" namespace.
Once you have the XML Schema Instance namespace available:
xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance
xsi:schemaLocation=“https://www.w3schools.com note.xsd”
This attribute has two values, separated by a space.
The first value is the namespace to use.
The second value is the location of the XML schema to use for that namespace
33. XML USER DEFINED DATA TYPE
User Defined Data types in XML is possible with
XML Schema with types
Simple TYPE
Complex Type
34. XML USER DEFINED DATA TYPE
There are 3 ways in which a simpleType can be extended;
Restriction
List
Union
Restriction
Restriction is a way to constrain an existing type definition.
We can apply a restriction to the built in data types xs:string, xs:integer,
xs:date, etc. or ones we create ourselves.
<xs:simpleType name="LetterType">
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z]" />
</xs:restriction>
</xs:simpleType>
35. XML USER DEFINED DATA TYPE
A <simpleType> tag is used to define a new type, we
must give a unique name - in this case "LetterType".
We are restricting an existing type - so the tag is
<restriction> (you can also extend an existing type -
but more about this later). We are basing our new
type on a string so type="xs:string".
We are applying a restriction in the form of a Regular
expression, this is specified using the <pattern>
element. The regular expression means the data must
contain a single lower or upper case letter a through
to z.
Closing tag for the restriction.
Closing tag for the simple type.
36. XML USER DEFINED DATA TYPE
Union
A union is a mechanism for combining two or more
different data types into one.
Syntax:
<xs:simpleType name="SomeType">
<xs:union memberTypes="Type1 Type2" />
</xs:simpleType>
37. XML USER DEFINED DATA TYPE
List
A list allows the value (in the XML document) to
contain a number of valid values separated by
whitespace.
<xs:simpleType name="SomeType">
<xs:list itemType="Type1" />
</xs:simpleType>
Valid value for this type would be "5 9 21"
38. XML USER DEFINED DATA TYPE
COMPLEX TYPE:
<xs:complexType name="AddressType">
<xs:sequence>
<xs:element name="Line1" type="xs:string" />
<xs:element name="Line2" type="xs:string" />
</xs:sequence>
</xs:complexType>
39. XML USER DEFINED DATA TYPE
<xs:complexType name="UKAddressType">
<xs:complexContent>
<xs:extension base="AddressType">
<xs:sequence>
<xs:element name="County" type="xs:string" />
<xs:element name="Postcode" type="xs:string" />
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
40. DTD Vs XML Schema
DTD XSD(XML Schema Definition)
DTDs are derived from SGML syntax. XSDs are written in XML.
DTD doesn't support datatypes. XSD supports datatypes for elements
and attributes.
DTD doesn't support namespace. XSD supports namespace.
DTD doesn't define order for child
elements.
XSD defines order for child elements.
DTD is not extensible. XSD is extensible.
DTD is not simple to learn. XSD is simple to learn because you
don't need to learn new language.
DTD provides less control on XML
structure.
XSD provides more control on XML
structure.
41. XML DOM
The DOM defines a standard for accessing and manipulating documents
The XML DOM defines a standard way for accessing and manipulating XML
documents. It presents an XML document as a tree-structure.
txt = xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
The XML DOM is a standard for how to get, change, add, and delete XML
elements.
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
42. XML DOM
A standard object model for XML
A standard programming interface for XML
Platform- and language-independent
A W3C standard
43. XML DOM-Example
<script>
var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
myFunction(this);
}
};
xhttp.open("GET", "books.xml", true);
xhttp.send();
function myFunction(xml) {
var xmlDoc = xml.responseXML;
document.getElementById("demo").innerHTML =
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
}
</script>
44. XML DOM-Example Explained
xmlDoc - the XML DOM object created by the parser.
getElementsByTagName("title")[0] - get the first <title> element
childNodes[0] - the first child of the <title> element (the text node)
nodeValue - the value of the node (the text itself)
45. XML DOM
Navigating DOM Nodes
parentNode
childNodes
firstChild
lastChild
nextSibling
previousSibling
46. XML DOM
Properties
x.nodeName - the name of x
x.nodeValue - the value of x
x.parentNode - the parent node of x
x.childNodes - the child nodes of x
x.attributes - the attributes nodes of x
Methods
x.getElementsByTagName(name) - get all elements with a
specified tag name
x.appendChild(node) - insert a child node to x
x.removeChild(node) - remove a child node from x
47. XML DOM Node
Node Properties
In the XML DOM, each node is an object.
Objects have methods and properties, that can be accessed
and manipulated by JavaScript.
Three important node properties are:
nodeName
nodeValue
nodeType
48. XML DOM Node
Node Properties
nodeName
nodeName is read-only
nodeName of an element node is the same as the tag name
nodeName of an attribute node is the attribute name
nodeName of a text node is always #text
nodeName of the document node is always #document
49. XML DOM Node
Node Properties
nodeValue
nodeValue for element nodes is undefined
nodeValue for text nodes is the text itself
nodeValue for attribute nodes is the attribute value
var x = xmlDoc.getElementsByTagName("title")[0].childNodes[0];
var txt = x.nodeValue;
var x = xmlDoc.getElementsByTagName("title")[0].childNodes[0];
x.nodeValue = "Easy Cooking";
50. XML DOM Node
Node Properties
nodeType
Node type NodeType
Element 1
Attribute 2
Text 3
Comment 8
Document 9
51. XML DOM
Accessing Nodes
By using the getElementsByTagName() method
By looping through (traversing) the nodes tree.
By navigating the node tree, using the node relationships.
node.getElementsByTagName("tagname");
x.getElementsByTagName("title");
52. XML DOM
Accessing Nodes
By looping through (traversing) the nodes tree.
The getElementsByTagName() method returns a node list. A
node list is an array of nodes.
The <title> elements in x can be accessed by index number.
To access the third <title> you can write:
The length property defines the length of a node list (the
number of nodes).
You can loop through a node list by using the length property:
x.getElementsByTagName("title");
y = x[2];
for (i = 0; i <x.length; i++) {
// do something for each node
}
53. XML DOM
Accessing Nodes
By navigating the node tree, using the node relationships.
x = xmlDoc.getElementsByTagName("book")[0];
xlen = x.childNodes.length;
y = x.firstChild;
txt = "";
for (i = 0; i <xlen; i++) {
if (y.nodeType == 1) {
txt += y.nodeName + "<br>";
}
y = y.nextSibling;
} <book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price> </book>
https://www.w3schools.com/xml/
tryit.asp?filename=try_dom_naviga
te
54. XML Parser
XML parser is a software library or a package that
provides interface for client applications to work
with XML documents.
The XML Parser is designed to read the XML and
create a way for programs to use XML.
XML parser validates the document and check that
the document is well formatted.
However, before an XML document can be accessed,
it must be loaded into an XML DOM object.
56. XML Parser
All modern browsers have a built-in XML parser that
can convert text into an XML DOM object.
Types of XML Parsers
DOM
SAX
57. DOM Parser
A DOM document is an object which contains all the
information of an XML document.
It is composed like a tree structure.
The DOM Parser implements a DOM API.
This API is very simple to use.
58. DOM Parser
Features
A DOM Parser creates an internal structure in memory
which is a DOM document object and the client
applications get information of the original XML document
by invoking methods on this document object.
Advantages
It supports both read and write operations and the API is
very simple to use.
It is preferred when random access to widely separated
parts of a document is required.
Disadvantages
It is memory inefficient. (consumes more memory because
the whole XML document needs to loaded into memory).
It is comparatively slower than other parsers.
59. XML Parser
<script>
var text, parser, xmlDoc;
text = "<bookstore><book>" +
"<title>Everyday Italian</title>" +
"<author>Giada De Laurentiis</author>" +
"<year>2005</year>" +
"</book></bookstore>";
parser = new DOMParser();
xmlDoc = parser.parseFromString(text,"text/xml");
document.getElementById("demo").innerHTML =
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
</script>
60. SAX Parser
SAX (Simple API for XML) is an event-based parser for
XML documents.
A SAX Parser implements SAX API.
Unlike a DOM parser, a SAX parser creates no parse
tree.
SAX is a streaming interface for XML, which means
that applications using SAX receive event
notifications about the XML document being
processed an element, and attribute, at a time in
sequential order starting at the top of the document,
and ending with the closing of the ROOT element.
61. SAX Parser
Reads an XML document from top to bottom, recognizing
the tokens that make up a well-formed XML document.
Tokens are processed in the same order that they appear
in the document.
Reports the application program the nature of tokens
that the parser has encountered as they occur.
The application program provides an "event" handler that
must be registered with the parser.
As the tokens are identified, callback methods in the
handler are invoked with the relevant information.
62. SAX Parser
Features
It does not create any internal structure.
Clients does not know what methods to call, they just
overrides the methods of the API and place his own code
inside method.
It is an event based parser, it works like an event handler
in Java.
Advantages
It is simple and memory efficient.
It is very fast and works for huge documents.
Disadvantages
It is event-based so its API is less intuitive.
Clients never know the full information because the data
is broken into pieces.
65. XMLHttpRequest
The XMLHttpRequest Object has a built in XML Parser.
The responseText property returns the response as a
string.
The responseXML property returns the response as an
XML DOM object.
66. XSL- Extensible Stylesheet Language
XSL is a language for expressing stylesheets
support for browsing, printing, and aural rendering
formatting highly structured documents (XML)
performing complex publishing tasks: tables of
contents, indexes, reports,...
addressing accessibility and internationalization
issues
written in XML
68. XSL Components
XSL is constituted of three main components:
XSLT: a transformation language
XPath: an expression language for addressing parts of
XML documents
FO: a vocabulary of formatting objects with their
associated formatting properties
XSL uses XSLT which uses XPath
69. XSLT
XSLT (eXtensible Stylesheet Language
Transformations) is the recommended style sheet
language for XML
XSLT is far more sophisticated than CSS.
With XSLT you can add/remove elements and
attributes to or from the output file.
Also rearrange and sort elements, perform tests and
make decisions about which elements to hide and
display, and a lot more.
XSLT uses XPath to find information in an XML
document.
72. XSLT Elements
Element Description
stylesheet Defines the root element of a style sheet
template Rules to apply when a specified node is
matched
text Writes literal text to the output
value-of Extracts the value of a selected node
if Contains a template that will be applied
only if a specified condition is true
sort Sorts the output
when Specifies an action for the <choose>
element
73. XSLT <xsl:stylesheet> and <xsl:transform> Elements
The <xsl:stylesheet> and <xsl:transform> elements are
completely synonymous elements.
Both are used to define the root element of the style
sheet.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
....
....
</xsl:stylesheet>
74. XSLT <xsl:template> Elements
The <xsl:template> element contains rules to apply when a specified node
is matched.
The match attribute is used to associate the template with an XML
element.
The match attribute can also be used to define a template for a whole
branch of the XML document (i.e. match="/" defines the whole document).
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:for-each select=“books/book">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="author"/></td>
</tr>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
75. XSLT <xsl:text> Elements
The <xsl:text> element is used to write literal text to the output.
<xsl:template match="/">
<xsl:for-each select=“books/book">
<xsl:value-of select="title"/>
<xsl:if test="position() < last()-1">
<xsl:text>, </xsl:text>
</xsl:if>
<xsl:if test="position()=last()-1">
<xsl:text>, and </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:template>
76. XSLT <xsl:value-of> Elements
The <xsl:value-of> element extracts the value of a
selected node.
The <xsl:value-of> element can be used to select the
value of an XML element and add it to the output.
<xsl:template match="/">
<xsl:for-each select=“books/book">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-
of select="author"/></td>
</tr>
</xsl:for-each>
</xsl:template>
77. XSLT <xsl:if> Elements
The <xsl:if> element contains a template that will be applied
only if a specified condition is true.
<xsl:template match="/">
<xsl:for-each select=“books/book">
<xsl:if test="price > 10">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="author"/></td>
</tr>
</xsl:if>
</xsl:for-each>
</xsl:template>
78. XSLT <xsl:sort> Elements
The <xsl:sort> element is used to sort the output.
<xsl:sort> is always within <xsl:for-each> or <xsl:apply-
templates>.
<xsl:template match="/">
<xsl:for-each select=“books/book">
<xsl:sort select="author"/>
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="author"/></td>
</tr>
</xsl:if>
</xsl:for-each>
</xsl:template>