2. Agenda
At the end of this presentation, you will be able to understand :
Understand why we need XML
Understand the features of XML
Create and use XML files
Understand and use DTDs
Understand XSD and XPath
Use XML parsers like SAX and DOM parser
Create and use XSL files
5. John Collects Data..
The organization has many departments like HR, Admin, R&D, Marketing, Sales, Technical
Writing etc., Mark gets data from all departments in a different format. Mark is unable to
read the files from the program as the format from each department is different.
10. Features of XML
Simplicity - Writing XML is very easy.
Extensibility - XML data can be extended with DTD and XSL. DTD is Document Type
definition which provides the format of the XML document and XSL is the way XML data will
be represented/displayed. It is like CSS for HTML.
Interoperability - It can work on any platform.
Openness - Any tool/language can open an xml file and parse it in the programming
language.
11. XML
Data is inserted in the XML file in tags like HTML.
XML is written using user defined tags and data is written in
between the user defined tags.
User can open the xml file in any text editor or xml editors.
It can also read/parsed using computer languages.
13. Where can we use XML?
XML can be used to:
Exchange data between different systems
Share data
Store data
Make data in more useful and meaningful form
create new “ML Languages”
17. Simple XML File
XML file will have an extension of .xml for the file name.
When you double click on XML file, it can be opened in any browser.
Sample XML file is given below:
<person>
<name> Berhanu</name>
<class> Java </class>
<profession> Trainer </profession>
<location> Adama </location>
</person>
18. Simple XML File
Here <person>,<name>,<class>, <profession> and <address> are
tags.
Data will be written in between these tags.
19. XML as Tree
XML forms a tree structure. Here root_e|ement is the root of the tree, person is a branch and name, class, profession
and location are the leaves of the tree.
<root_element>
<person>
<name> Emnet </name>
<class> Java </c1ass>
<profession> scholar</profession>
<1ocation> Adama</location>
</person>
<person>
<name> Mituku</name>
<class> Java </class>
<profession> Software developer </profession>
<location> California </location>
</person>
</root_e1ement>
20. XML Rules
White spaces are ignored in HTML but XML considers white spaces as the
actual data.
Ordering and nesting of XML document should be proper. XML file does not
work properly if there is clash in nesting.
XML tags are case sensitive. Opening and closing tags should be exactly the
same without any difference in the case.
Every opening tag must have a close tag else XML file will not work correctly.
23. XML Name Space
Syntax of Name space is:
Element:name space xmlns:name space=”name space identifier”
24. XML Prolog
First line of XML is called Prolog.
Prolog will contain the following:
<?xm| version="1.0" encoding="UTF-8"?>
Prolog usually have version number and encoding.
Encoding is like character set. This is used for Unicode.
UTF stands for Universal character set Transformation Format.
UTF-8 is used for standard web applications.
26. XML DTD
DTD stands for Document Type Definition.
DTD specifies the format of the XML Document.
DTD can be inside an XML file or outside the XML file.
27. Document Type Definitions
Provide way to constrain XML document content
Specify what kind of content can appear and where
Relatively simple to write, but not powerful
Markup declarations that define document content
DTD specify the following
Elements
Attributes
Entities
Processing instructions
Comments
Parameter entity references
DTD Format:
<!DOCTYPE Name [
Document Type Definition
]>
28. Declaring Elements in DTD
<!ELEMENT Name ContentSpecification>
EMPTY content- element has no child elements
ANY content – element has no content constraints
Element content – can contain specified elements
Mixed Content – can contain elements and character data
Eg:
<!ELEMENT MISCELLANEOUS ANY>
<!ELEMENT IMG EMPTY>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT CD (TITLE, ARTIST)>
<!ELEMENT DWELLING (HOUSE |ROOM | CONDOMINIUM| TRAILER)>
<!ELEMENT TITLE (#PCDATA | SUBTITLE )>
Specifying element population limits:
? Means zero or one of the element
+ means one or more of the element
• Means zero or more of element
Eg:
<!ELEMENT LETTER (A?, B*,C+)>
29. Declaring Attributes in DTD
All attributes must be declared
Format:
<!ATTLIST ElementName AttrName Type DefaultDecl >
String: indicated by CDATA
Enumerated: can be one of a list of values
Tokenized: used for IDs, other advanced DTD features
Eg:
<! ATTLIST PACKAGE status CDATA “OK”>
<! ATTLIST DWELLING type (house |condominium | tailer) >
Default Attributes:
#REQUIRED: attribute must be present
#IMPLIED: attribute is optional
Default value: value to use for default
#FIXED: fixed at certain value
Eg:
<ATTLIST PACKAGE ontime (yes| no) #REQUIRED)
<ATTLIST PACKAGE ontime (yes| no) #IMPLIED)
<ATTLIST PACKAGE ontime CDATA #FIXED “yes”)
30. DTD
1 <?xml version=“1.0” encoding=“UTF-8”?>
2 <!DOCTYPE TEST SYSTEM “Test.dtd”>
3 <TEST>
4 <Name> Berhanu</Name>
5 <From>Adama</From>
6 <Profession> Teaching</Profession>
7 </TEST>
8
9 Content of the file: Test.dtd
10 <!DOCTYPE TEST>
11 [
12 <!ELEMENT TEST (Name,From,Profession)>
13 <!ELEMENT Name (#PCDATA)>
14 <!ELEMENT From (#PCDATA)>
15 <!ELEMENT Profession (#PCDATA)>
16 ]>
31. DTD
DOCTYPE says the root element of the XML file.
ELEMENT Test says it has 3 elements which are Name, from and Profession.
All the ELEMENTS are PCDATA, that means all the elements are parsable
character data.
At times you need not parse certain elements then you can use CDATA.
Using XML parser, using validate() we can validate XML whether it is formed
according to DTD or not.
32. Why DTD?
Even the XML file can be written in different formats by different people.
To avoid this issue and to standardize the XML data in one format, DTD is
helpful and useful.
33. Features of DTD
Providing standard way of representing data.
XML data can be validated with DTD.
DTD can be given in the xml file or it can be given in a separate file having
extension of .dtd and can use it in the XML file.
During the validation of XML, parsers read the DTD file and validates against
the XML file. Finally validator will give the result as valid XML or Invalid XML.
34. XPath
XPath is a language for addressing
parts of an XML document, designed to
be used by both XSLT(XSL5
Transformations) and XPointer to
provide a common syntax and
semantics for functionality shared
between them.
The primary purpose of XPath is to
address parts of an XML [XML]
document. In support of this primary
purpose, it also provides basic facilities
for manipulation of strings, numbers
and booleans.
35. XSD(XML Schema Definition)
An XML schema represents the
interrelationship between the
attributes and elements of an XML
object (for example, a document or a
portion of a document).
This description can be used to
verify that each item of the content in
a document adheres to the
description of the element in which
the content used to be placed.
To create a schema for a document,
you analyze its structure, defining
each structural elements as you
encounter it.
36. XSD(XML Schema Definition)
Alternative way of specifying rules to DTDs
Provide way to constrain XML document content
Written using XML syntax
Similar to DTDs, but more powerful and sophisticated
Finer level of control than DTDs allow
Schema and XML files always stored separately
37. Anatomy of a Schema
Uses namespace http://www.w3.org/2001/XMLSchema
Usually uses the “xsd” prefix
<xsd:schema xmlns:xsd="http://www.w3.0rg/2001/XMLSchema”>
<xsd:element name="ElemName”>
<!-- other content definitions go in here -->
<xsd:attribute name="AttrName”/>
</xsd:element>
</xsd:schema>
38. Declaring Elements in Schema
Declared using <xsd:element> tag
Can be declared as either simple or complex type
Elements can have mixed or empty element content
Elements can have min and max times they can occur
Elements can be restricted to having certain values
42. Declaring Elements in Schema
Can declare custom types, derived from existing types
<xsd:element name="InvitedGuests”>
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger”>
<xsd:minInclusive value="0">
<xsd:maxInclusive value="50">
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
Eg of XML:
< InvitedGuests> 0 to 50</ InvitedGuests>
43. Declaring Elements in Schema
Can restrict content to a range of choices
<xsd:element name="dwelling”>
<xsd:simpleType>
<xsd:restriction base="xsd:string”>
<xsd:enumeration value="house”>
<xsd:enumeration value="apartment”>
<xsd:enumeration value="trailer”>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
Eg of XML:
< dwelling > house/ apartment / trailer </ dwelling >
44. Declaring Elements in Schema
Restrictions can even use regular expression patterns
<xsd:element name="SSN”>
<xsd:simpleType>
<xsd:restriction base="xsd:string”>
<xsd:pattern value="d{3]-d{2}-d{4}">
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
Eg of XML:
< SSN >123-45-6789</ SSN >
45. Declaring Complex Types
Use the xsd:anyType value for the type attribute
Use the <xsd: complexType> tag in the definition
<xsd:element name="EmptyElem”>
<xsd:complexType>
</xsd:complexType>
</xsd:element>
Eg of XML:
< EmptyElem ></ EmptyElem>
46. Declaring Complex Types
Declaring elements with mixed content
<xsd:element name="title”>
<xsd:complexType mixed="true”>
<xsd:sequence>
<xsd:element name="subtitle”>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
Eg of XML:
< title>
< subtitle> </ subtitle>
</ title >
47. Declaring Complex Types
Declaring elements with element content
<xsd:element name="address”>
<xsd:complexType>
<xsd:sequence>
<xsd:element name="num” type="xsd:integer”/>
<xsd:element name="street” type="xsd:string”/>
<xsd:element name="city” type="xsd:string”/>
<xsd:element name="state” type="xsd:string”/>
</xsd: sequence>
</xsd:complexType>
</xsd:element>
Eg of XML:
< address >
< num >123 </num>
< street > abc</ street >
< city >Adama </ city >
< state > Oromia</ state >
</ address >
48. Declaring Complex Types
Declaring elements with element content
<xsd:element name="vehicle”>
<xsd:complexType>
<xsd:choice>
<xsd:element name="car” type="xsd:string”/>
<xsd:element name="bus” type="xsd:string”/>
<xsd:element name="van” type="xsd:string”/>
<xsd:element name="truck” type="xsd:string”/>
</xsd:choice>
</xsd:complexType>
</xsd:element>
Eg of XML:
< vehicle >
< car >Yaris </ car > or
< bus > tata</ bus > or
< van >hilex </ van > or
< truck > mahindra</ truck >
</ vehicle >
49. Declaring Complex Types
Declaring elements with element content
<xsd:element name="CapitalCity”>
<xsd:complexType>
<xsd:all>
<xsd:element name="name” type="xsd:string”/>
<xsd:element name="state” type="xsd:string”/>
<xsd:element name="pop” type="xsd:positiveInteger”/>
</xsd:all>
</xsd:complexType>
</xsd:element>
Eg of XML: any order
< CapitalCity >
< name >berhanu </ name >
< state > oromia</ state >
< pop >12333 </ pop >
</ CapitalCity >
50. Declaring Attributes in Schema
Declared using <xsd:attribute> tag
Are always declared as simple types
Can use any of the types used by elements.
<xsd:attribute name="age” type="xsd:integer”>
<xsd:attribute name="age” type="xsd:integer” use="optional”>
(The “use” attribute can also be “required” or “prohibited”)
<xsd:attribute name="age” type="xsd:integer” default="20">
51. Declaring Attributes in Schema
Adding attribute to element that has character data
<xsd:element name="Name”>
<xsd:complexType mixed="true”>
<xsd:attribute name="Age” type="xsd:integer”>
</xsd:complexType>
</xsd:element>
Eg of XML:
< Name Age =“35”> </ Name >
52. Declaring Attributes in Schema
Adding attribute to element that has empty content
<xsd:element name="Image”>
<xsd:complexType>
<xsd:attribute name="src” type="xsd:string”>
</xsd:complexType>
</xsd:element>
53. Declaring Attributes in Schema
Adding attributes to element that has mixed content
<xsd:element name="car”>
<xsd:complexType>
<xsd:sequence>
<xsd:element name="make” type="xsd:string”>
<xsd:element name="model” type="xsd:string”>
</xsd:sequence>
<xsd:attribute name="year” type="xsd:gYear”>
</xsd: complexType>
</xsd:element>
57. Why DOM and SAX?
From the program, if the xml file data is to be read then developer has to
write their own code to parse the xml file. It is complicated.
To avoid this, Java/programming language has given parsers like DOM and
SAX to read the xml file.
XML parser consumes lot of time and it is complicated, DOM/SAX parser is
used to parse the xml file. Using this, the code can be developed fast and
accurate.
58. Features of DOM and SAX?
DOM reads the XML file and stores in tree format.
SAX reads in event based format.
59. DOM parsing - creating XML file using DOM - Required Import
import javax.xml.parsers.*;
import org.w3c.dom.*;
import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
61. DOM to Create XML file
In the program, document object is created.
Elements/nodes of the XML file are created.
Data to be placed in between the nodes, are created in text object.
Text and elements (nodes) are linked.
Nodes are linked to the Root node.
Finally, transformer Transform() method is used to convert the document object to XML file.
65. SAX Parser
In the main () method, XML file is parsed by SAXParser object and is attached with a
mySaxHandler class.
For every node or data read, it calls a corresponding method in mysaxHandler class.
StartDocument() and endDocument() methods are called when an XML document is started
to read and after reading the entire document.
When start tag is read, startEIement() method is invoked.
When end tag is read, end Element() method is invoked.
When the data in-between the tag is read, characters() method is called.
68. Why XSL?
XML is used only to store data.
We need a mechanism to display the data in a specified format. This
is possible using XSL.
69. What is XSL?
XSL stands for Extensible Stylesheet Language.
Like the way CSS is stylesheet for HTML, XSL is the stylesheet for
XML.
XSL can navigate all the nodes/elements of the XML file and can
display the XML data in the particular format. This is done by XSLT
which stands for Extensible stylesheet Language Transformations.
70. Features of XSL
Representing the XML data in the required style.
Queries also can be specified in XSL.
XSL file reads the XML data, looks for the query (If it is given in the
XSL file) and displays the data on browser in the format given in the
XSL file.
71. XSL File
This line should be used in the XML file to use the XSL file.
This line will use the XML data and use XSL format and displays XHTML in the
browser.
Note: Browser should support XSLT, other wise the above code will not work.