1. XML-Introduction
What is XML?
• XML stands for eXtensible Markup Language.
• It is a text-based markup language derived from Standard Generalized Markup Language (SGML).
• XML was designed to store and transport data.
• XML was designed to be self-descriptive.
• XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
• XML is a W3C Recommendation.
2. XML-Introduction
• XML documents can reference a Document Type Definition (DTD) or schema, which defines the structure of the document.
• Markup is information added to a document that enhances its meaning in certain ways, in that it identifies the parts and how they relate to each other.
• More specifically, a markup language is a set of symbols that can be placed in the text of a document to label the parts of that document.
3. XML
• It is a method for putting structured data into a text file; these files are
  - easy to read
  - unambiguous
  - extensible
  - platform-independent
• XML documents are used to transfer data from one place to another, often over the Internet.
April 29th, 2003 Organizing and Searching Information with XML 3
4. XML
XML is a meta markup language for text documents / textual data.
XML allows you to define languages ("applications") to represent text documents / textual data.
5. XML Does Not DO Anything
<note>
<to> SAM</to>
<from> JOHN</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
The XML above is quite self-descriptive:
It has sender, receiver information
It has a heading and a message body.
6. XML
• XML can work behind the scenes to simplify the creation of HTML documents for large web sites.
• XML can be used to exchange information between organizations and systems.
• XML can easily be merged with style sheets to create almost any desired output.
7. Differences between XML and HTML
HTML | XML
HTML tags have a fixed meaning, and browsers know what it is. | XML tags are different for different applications, and users know what they mean.
HTML tags are used for display. | XML tags are used to describe documents and data.
Tags and attributes are predetermined. | XML allows the user to specify what each tag and attribute means.
8. Differences between XML and HTML
HTML | XML
Content and formatting can be placed together. | Content and format are separate; formatting is contained in a stylesheet.
Designed to represent the presentation structure of a document. | Designed to represent the logical structure of a document.
HTML was designed to display data, with a focus on how data looks. | XML was designed to carry data, with a focus on what data is.
9. XML
• XML is Extensible
  Most XML applications will work as expected even if new data is added.
• XML Simplifies Things
  - It simplifies data sharing
  - It simplifies data transport
  - It simplifies platform changes
  - It simplifies data availability
10. XML-Uses
• XML is used in many aspects of web development.
• XML is often used to separate data from presentation.
• Web searching and automating Web tasks
• e-business applications
12. XML-uses
• Web publishing:
• XML allows you to create interactive pages, allows the customer to customize those pages, and makes creating e-commerce applications more intuitive.
• With XML, you store the data once and then render that content for different viewers or devices based on style sheet processing, using an Extensible Stylesheet Language (XSL)/XSL Transformations (XSLT) processor.
15. XML Syntax Rules
• XML Declaration
• An XML document can optionally have an XML declaration. It is written as below:
• <?xml version="1.0" encoding="UTF-8"?>
Syntax Rules for XML declaration
The XML declaration is case sensitive and must begin with "<?xml", where "xml" is written in lower case.
If the document contains an XML declaration, it must be the first statement of the XML document.
16. XML Syntax Rules
• Tags and Elements
• An XML file is structured by several XML elements, also called XML nodes.
• The names of XML elements are enclosed in angle brackets < > as shown below:
• <element-name>
• Syntax Rules for Tags and Elements
• Element Syntax: <element> text </element>
17. XML Syntax Rules
• Root element:
An XML document can have only one root element.
• Case sensitivity:
The names of XML elements are case-sensitive.
The names in the start tag and the end tag of an element must be in exactly the same case.
18. XML Syntax Rules
• Attributes
An attribute specifies a single property for the element, using a name/value pair.
• XML References
References usually allow you to add or include additional text or markup in an XML document.
References always begin with the symbol "&", which is a reserved character, and end with the symbol ";".
XML has two types of references:
19. • Entity References: An entity reference contains a name between the start and the end delimiters.
• For example, &amp; where amp is the name.
• Character References: These contain a hash mark ("#") followed by a number, such as &#65;.
• The number always refers to the Unicode code point of a character. In this case, 65 refers to the letter "A".
20. Entity References
Some characters have a special meaning in XML.
<message>salary < 1000</message>
This will generate an XML error, because the parser reads the "<" character as the start of a new tag.
To avoid this error, replace the "<" character with an entity reference:
<message>salary &lt; 1000</message>
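The difference between the two message elements can be checked with a parser. The following is a minimal sketch using the JAXP DOM parser that ships with the JDK; the class name EntityReferenceDemo and the helper rootText are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of any library.
class EntityReferenceDemo {

    // Returns the text of the root element, or null if the XML is not well-formed.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null; // a raw "<" inside text content ends up here
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Raw "<" in the content: rejected by the parser.
        System.out.println(rootText("<message>salary < 1000</message>"));
        // Entity reference: the parser expands &lt; back to "<".
        System.out.println(rootText("<message>salary &lt; 1000</message>"));
        // Character reference: &#65; expands to the letter A.
        System.out.println(rootText("<letter>&#65;</letter>"));
    }
}
```

The application never sees the reference itself; the parser substitutes the character before handing over the text.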
23. XML-Syntax Rules
• All XML Elements Must Have a Closing Tag
• XML Tags are Case Sensitive
• XML Elements Must be Properly Nested
• XML Attribute Values Must Always be Quoted
24. • Comments in XML
• The syntax for writing comments in XML is similar to that of HTML:
• <!-- This is a comment -->
25. Defining XML Tags
• XML tags form the foundation of XML.
• They define the scope of an element in XML.
• They can also be used to declare settings required for the parsing environment, and to insert special instructions.
26. Defining XML Tags
• Start Tag
• The beginning of every non-empty XML element is marked by a start-tag. Example of a start-tag:
<address>
• End Tag
• Every element that has a start-tag should end with an end-tag. Following is an example of an end-tag:
</address>
27. Defining XML Tags
• Empty Tag
• The text that appears between start-tag and end-tag is called content.
• An element which has no content is termed empty. An empty element can be represented in two ways, as follows:
• A start-tag immediately followed by an end-tag, as shown below:
• <hr></hr>
• A complete empty-element tag, as shown below:
• <hr />
28. Defining XML Tags
• XML Tags Rules
• Rule 1
• XML tags are case-sensitive.
• Invalid Tag
• <address>This is wrong syntax</Address>
• Valid Tag
• <address>This is correct syntax</address>
29. Defining XML Tags
• Rule 2
• XML tags must be closed in an appropriate order, i.e., an XML tag opened inside another element must be closed before the outer element is closed. For example:
<outer_element>
   <internal_element>
      This tag is closed before the outer_element
   </internal_element>
</outer_element>
30. XML Elements
• An XML element is everything from (including) the element's start tag to (including) the element's end tag.
• <price>29.99</price>
• An element can contain:
  - text
  - attributes
  - other elements
  - or a mix of the above
31. XML Elements
• Empty Element
• An element with no content is said to be empty.
• In XML, you can indicate an empty element like this:
• <element></element>
• Empty elements can have attributes.
32. XML Naming Rules
• XML elements must follow these naming rules:
• Element names are case-sensitive
• Element names must start with a letter or underscore
• Element names cannot start with the letters xml (or XML, or Xml, etc)
• Element names can contain letters, digits, hyphens, underscores, and periods
• Element names cannot contain spaces
• Any name can be used, no words are reserved (except xml).
33. XML Attributes
• Attributes are part of XML elements.
• An element can have multiple unique attributes. Attributes give more information about XML elements.
• Attributes define properties of elements.
• An XML attribute is always a name-value pair.
34. XML Attributes
• XML Attributes Must be Quoted
• Attribute values must always be quoted. Either single or double quotes can be used.
• An XML attribute has the following syntax, where each attribute is written as name="value":
<element-name attribute1 attribute2>
   ....content..
</element-name>
• Example: <book publisher="Tata McGraw Hill"></book>
35. Element Attribute Rules
• An attribute name must not appear more than once in the same start-tag or empty-element tag.
• For a document to be valid against a DTD, each attribute must be declared in the Document Type Definition (DTD) using an Attribute-List Declaration.
36. Document Type Definition
• A DTD is a Document Type Definition.
• A DTD defines the structure and the legal elements and attributes of an XML document.
• The DTD, which a document brings in through its document type declaration (<!DOCTYPE ...>), is a way to describe an XML language precisely.
• DTDs check the vocabulary and the validity of the structure of XML documents against the grammatical rules of the appropriate XML language.
37. DTD
• Syntax
• Basic syntax of a DTD is as follows:
<!DOCTYPE element DTD identifier
[
   declaration1
   declaration2
   ........
]>
38. DTD
• The DTD starts with the <!DOCTYPE delimiter.
• The element tells the parser to parse the document from the specified root element.
• The DTD identifier is an identifier for the document type definition, which may be the path to a file on the system or the URL of a file on the internet.
39. Internal DTD
• A DTD is referred to as an internal DTD if elements are declared within the XML file.
• Syntax
• Following is the syntax of an internal DTD:
• <!DOCTYPE root-element [element-declarations]>
where root-element is the name of the root element and element-declarations is where you declare the elements.
40. Internal DTD
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>
41. Internal DTD
• !DOCTYPE note defines that the root element of this document is note
• !ELEMENT note defines that the note element must contain four elements, in this order: "to,from,heading,body"
• !ELEMENT to defines the to element to be of type "#PCDATA"
• !ELEMENT from defines the from element to be of type "#PCDATA"
• !ELEMENT heading defines the heading element to be of type "#PCDATA"
• !ELEMENT body defines the body element to be of type "#PCDATA"
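To see these declarations enforced, a document can be parsed with DTD validation switched on. The sketch below assumes the JAXP DOM parser from the JDK; the class name DtdValidationDemo and the helper validate are hypothetical, and the second test document (with heading left out) is made up to trigger a validity error.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of any library.
class DtdValidationDemo {

    // XML declaration plus the internal DTD from the note example.
    static final String PROLOG =
            "<?xml version=\"1.0\"?>\n"
          + "<!DOCTYPE note [\n"
          + "<!ELEMENT note (to,from,heading,body)>\n"
          + "<!ELEMENT to (#PCDATA)>\n"
          + "<!ELEMENT from (#PCDATA)>\n"
          + "<!ELEMENT heading (#PCDATA)>\n"
          + "<!ELEMENT body (#PCDATA)>\n"
          + "]>\n";

    // Parses the document with DTD validation and returns the validity errors.
    static List<String> validate(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            List<String> errors = new ArrayList<>();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors are reported here
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return errors;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String valid = PROLOG + "<note><to>Tove</to><from>Jani</from>"
                + "<heading>Reminder</heading><body>Don't forget me</body></note>";
        String invalid = PROLOG + "<note><to>Tove</to><from>Jani</from>"
                + "<body>Don't forget me</body></note>"; // heading is missing
        System.out.println(validate(valid));
        System.out.println(validate(invalid));
    }
}
```

Without setValidating(true) the parser only checks well-formedness, so the document with the missing heading would be accepted.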
42. DTD
• Rules
• The document type declaration must appear at the start of the document (preceded only by the XML header); it is not permitted anywhere else within the document.
• Similar to the DOCTYPE declaration, the element declarations must start with an exclamation mark.
• The Name in the document type declaration must match the element type of the root element.
43. External DTD
• In an external DTD, elements are declared outside the XML file.
• They are accessed by specifying a system identifier, which may be either the path to a .dtd file or a valid URL.
• When a document relies on an external DTD, the standalone attribute in the XML declaration should be set to "no" (this is also the value assumed when the attribute is omitted). It means the declaration includes information from an external source.
44. External DTD
• Syntax
• Following is the syntax for an external DTD:
• <!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with the .dtd extension.
• Example
• The following example shows external DTD usage
47. External DTD
• Types
• You can refer to an external DTD by using either system identifiers or public identifiers.
• System Identifiers
• A system identifier enables you to specify the location of an external file containing DTD declarations. The syntax is as follows:
• <!DOCTYPE name SYSTEM "address.dtd" [...]>
48. External DTD
• Public Identifiers
• Public identifiers provide a mechanism to locate DTD resources and are written as below:
• <!DOCTYPE name PUBLIC "-//Beginning XML//DTD Address Example//EN">
• In XML, the public identifier must be followed by a system identifier, which the parser can fall back on when it cannot resolve the public identifier.
• Public identifiers are used to identify an entry in a catalog.
• Public identifiers follow a commonly used format called Formal Public Identifiers, or FPIs.
49. XML-DTD
• DTD-Element Declarations
• XML elements can be defined as the building blocks of an XML document.
• Elements can behave as containers to hold text, elements, attributes, media objects, or a mix of all.
• In a DTD, XML elements are declared with the following syntax:
• <!ELEMENT element-name type>
or
<!ELEMENT element-name (element-content)>
50. XML-DTD
• The value of type can be ANY or EMPTY
• Content can be data or other elements
Elements are classified as
• Standalone elements, also called singleton elements or empty elements: they have no content.
• <!ELEMENT element-name EMPTY>
• Ex: <!ELEMENT br EMPTY>
• In the XML file: <br />
51. XML-DTD
• Simple elements
These are the elements that contain text, or Parsed Character Data (represented as #PCDATA).
<!ELEMENT element-name (#PCDATA)>
Ex: <!ELEMENT greeting (#PCDATA)>
In XML: <greeting>Welcome to XML</greeting>
52. XML-DTD
• Compound elements: these elements can contain other elements. In the simplest case, an element contains one child element.
• <!ELEMENT element-name (child-element-name)>
Ex: <!ELEMENT employee (name)>
XML:
<employee>
   <name> ABC </name>
</employee>
53. XML-DTD
• Occurrence Indicators
• <!ELEMENT employee (contact*)> : zero or more contact elements
• <!ELEMENT company (employee+)> : one or more employee elements
• <!ELEMENT book (author+)> : one or more author elements
• <!ELEMENT employee (PAN?)> : zero or one PAN element
54. XML-DTD
Declaring Multiple Children
<!ELEMENT element-name (child1,child2,child3)>
Ex:
<!ELEMENT employee (name,dept,id)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT dept (#PCDATA)>
<!ELEMENT id (#PCDATA)>
55. XML-DTD
• Sequence operator ( , )
• <!ELEMENT employee (name, dept, id)>
• Choice operator ( | )
• <!ELEMENT product (price | discountprice)>
• Composite operator [ ( ) ]
• <!ELEMENT biodata (dob, (company, title)*)>
56. XML DTD elements
Mixed content example
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
<!DOCTYPE address [
<!ELEMENT address (#PCDATA | name)* > <!-- mixed content -->
<!ELEMENT name (#PCDATA)>
]>
<address>
Here's a bit of text mixed up with the child element.
<name>
Tanmay Patil
</name>
</address>
57. Attribute Declarations
• Attribute Declaration
• Attributes are used to associate (name, value) pairs with elements.
• They provide useful information about the element's content.
• Syntax:
• <!ATTLIST element-name attribute-name attribute-type default-value>
DTD example: <!ELEMENT payment EMPTY>
<!ATTLIST payment type CDATA "check">
XML example: <payment type="check" />
64. Entity Declaration
• Entities are variables that represent other values.
• If a text contains entities, each entity is replaced by its actual value whenever the text is parsed.
• Entities may be parsed or unparsed.
Two types of entity declarations
• GENERAL ENTITY: used within the document content
• PARAMETER ENTITY: parsed entities for use within the DTD
67. Entity Declaration
• Parameter Entity Declaration
It allows us to assign a collection of elements, attributes, and attribute values to a name and refer to them by that name instead of listing them every time they are used.
Parameter-Entity Declaration
• <!ENTITY % entity-name entity-def>
• Example: <!ENTITY % info "year, make, model">
DTD: <!ELEMENT car (%info;)>
which is equivalent to <!ELEMENT car (year, make, model)>
68. XML Schema
• XML Schema is commonly known as XML Schema Definition (XSD).
• It is a more powerful way of defining the structure and constraining the contents of XML documents.
• It is used to describe and validate the structure and the content of XML data.
• XML schema defines the elements, attributes, and data types.
• It supports namespaces.
• It is similar to a database schema that describes the data in a database.
69. XML Schema
• XML Schema is an XML-based alternative to DTD.
• XML Schemas Support Data Types
• It is easier to describe document content
• It is easier to define restrictions on data
• It is easier to validate the correctness of data
• It is easier to convert data between different data types
71. XML Schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="values" type="xs:string" />
</xs:schema>
72. XML Schema
<?xml version = "1.0"?>
<xs:schema
xmlns:xs = "http://www.w3.org/2001/XMLSchema"
targetNamespace = "http://www.xmldata.com"
xmlns = "http://www.xmldata.com" >
73. XML Schema
<xs:schema
xmlns:xs = "http://www.w3.org/2001/XMLSchema">
The above fragment specifies that the elements and data types used in the schema are defined in the http://www.w3.org/2001/XMLSchema namespace, and that these elements and data types should be prefixed with xs.
74. • targetNamespace = "http://www.xmldata.com"
The above fragment specifies that the elements defined by this schema belong to the http://www.xmldata.com namespace. It is optional.
• xmlns = "http://www.xmldata.com"
This fragment sets the default namespace, so unprefixed names in the schema refer to http://www.xmldata.com.
75. XML Schemas
<?xml version = "1.0" encoding = "UTF-8"?>
<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema">
76. XML Schema
• Defining XML Elements in Schema
• An element can be defined within an XSD as follows:
<xs:element name="x" type="y" />
77. XML Schema
• The valid data values for the element in the XML document can be further constrained using the fixed and default properties.
• Default means that if no value is specified in the XML document, the application reading the document (typically an XML parser or XML data binding library) should use the default specified in the XSD.
• Fixed means that the element in the XML document can only have the value specified in the XSD.
79. XML schema
• Specifying Element Cardinality
• It is possible to constrain the number of instances (cardinality) of an XML element that appear in an XML document.
• The cardinality is specified using the minOccurs and maxOccurs attributes, which allow an element to be specified as mandatory, optional, or able to appear up to a set number of times.
• The default value for both minOccurs and maxOccurs is 1.
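The effect of minOccurs and maxOccurs can be tried out with the javax.xml.validation API from the JDK. This is a sketch under assumptions: the employee schema (a mandatory name and up to two optional contact elements) is made up for the illustration, as are the class name CardinalityDemo and the helper isValid.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of any library.
class CardinalityDemo {

    // name uses the defaults (minOccurs=1, maxOccurs=1), so it is mandatory;
    // contact is optional and may appear at most twice.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"employee\">"
          + "<xs:complexType><xs:sequence>"
          + "<xs:element name=\"name\" type=\"xs:string\"/>"
          + "<xs:element name=\"contact\" type=\"xs:string\""
          + " minOccurs=\"0\" maxOccurs=\"2\"/>"
          + "</xs:sequence></xs:complexType>"
          + "</xs:element>"
          + "</xs:schema>";

    // Returns true if the document is valid against the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            try {
                validator.validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException e) {
                return false; // the document violates the schema
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<employee><name>ABC</name></employee>"));
        System.out.println(isValid("<employee><name>ABC</name>"
                + "<contact>1</contact><contact>2</contact><contact>3</contact>"
                + "</employee>"));
    }
}
```

A document with no contact passes because minOccurs is 0; a third contact exceeds maxOccurs, and a missing name fails because the defaults make it mandatory.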
81. XML schema
Element Definition Types
Simple Type
• A simple type element can contain only text; it cannot have child elements or attributes. Some of the predefined simple types are xs:integer, xs:boolean, xs:string, and xs:date. Examples:
• <xs:element name = "phone_number" type = "xs:integer" />
• <xs:element name="Customer_dob" type="xs:date" />
• <xs:element name="Customer_address" type="xs:string" />
82. XML schema
• Complex Type
• A complex type is a container for other element definitions.
• This allows you to specify which child elements an element can contain and to provide some structure within XML documents.
• A complex element is an XML element which can contain other elements and/or attributes.
83. XML schema
• Example: Complex Type
84. XML schema
• Defining Compositors
• Compositors provide rules that determine how, and in what order, their children can appear within an XML document.
• There are three types of compositors:
• <xs:sequence>
• <xs:choice>
• <xs:all>
86. XML Schemas
• Global Complex Types
• An xs:complexType can also be defined globally and given a name.
• Named xs:complexTypes can then be re-used throughout the schema, either referenced directly or used as the basis to define other xs:complexTypes.
89. XML Schemas
• Attributes
• Attributes in XSD provide extra information within an element. Attributes have name and type properties, as shown below:
• <xs:attribute name = "x" type = "y"/>
• An attribute can appear 0 or 1 times within a given element in the XML document. Attributes are either optional or mandatory.
90. XML Schemas
<xs:attribute name="ID" type="xs:string" />
<xs:attribute name="ID" type="xs:string" use="optional" />
Example:
<xs:element name="Order">
<xs:complexType>
<xs:attribute name="OrderID" type="xs:int" />
</xs:complexType>
</xs:element>
91. XML Schemas
• XML Element Mixed Content
92. XML Schemas
• XML Schemas are More Powerful than DTD
• XML Schemas are written in XML
• XML Schemas are extensible to additions
• XML Schemas support data types
• With XML Schema, your XML files can carry a description of their own format.
• With XML Schema, independent groups of people can agree on a standard for interchanging data.
• With XML Schema, you can verify data.
93. DOCUMENT OBJECT MODEL (DOM)
• The Document Object Model (DOM) is a W3C standard.
• It defines a standard for accessing documents like HTML and XML.
• The Document Object Model (DOM) is an application programming interface (API) for HTML and XML documents.
• It defines the logical structure of documents and the way a document is accessed and manipulated.
94. DOCUMENT OBJECT MODEL (DOM)
• DOM defines the objects, properties, and methods (interface) to access all XML elements.
• It is separated into 3 different parts / levels:
• Core DOM: standard model for any structured document
• XML DOM: standard model for XML documents
• HTML DOM: standard model for HTML documents
95. DOCUMENT OBJECT MODEL (DOM)
• XML DOM is a standard object model for XML. XML documents have a hierarchy of informational units called nodes.
• DOM is a standard programming interface for describing those nodes and the relationships between them.
• The XML DOM presents an XML document as a tree structure.
96. DOCUMENT OBJECT MODEL
• Following is the diagram for the DOM structure. The diagram shows that the parser turns an XML document into a DOM structure by traversing each node.
98. DOCUMENT OBJECT MODEL
• Advantages of XML DOM
• XML DOM is language and platform independent.
• XML DOM is traversable: information in XML DOM is organized in a hierarchy, which allows the developer to navigate around the hierarchy looking for specific information.
• XML DOM is modifiable: it is dynamic in nature, allowing the developer to add, edit, move, or remove nodes at any point in the tree.
99. DOCUMENT OBJECT MODEL
• Disadvantages of XML DOM
• It consumes more memory (if the XML structure is large), because the whole document tree is built once and then remains in memory until it is explicitly released.
• Due to this extensive usage of memory, it is slower than SAX.
100. XML-DOCUMENT OBJECT MODEL
XML DOM Properties
These are some typical DOM properties:
• x.nodeName - the name of x
• x.nodeValue - the value of x
• x.parentNode - the parent node of x
• x.childNodes - the child nodes of x
• x.attributes - the attribute nodes of x
• Note: In the list above, x is a node object.
101. XML-DOCUMENT OBJECT MODEL
• XML DOM Methods
• x.getElementsByTagName(name) - get all elements with a specified tag name
• x.appendChild(node) - append a child node to x
• x.removeChild(node) - remove a child node from x
• Note: In the list above, x is a node object.
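The same three methods exist in the Java binding of the DOM (org.w3c.dom). The sketch below applies them to a small note document; the class name DomEditDemo, the helper editNote, and the input document are assumptions made for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class DomEditDemo {

    // Appends a <body> child, removes the <from> child, and lists what is left.
    // Assumes the input has a <from> element directly under the root.
    static List<String> editNote(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // appendChild: add a new element at the end of the root's children.
            Element body = doc.createElement("body");
            body.setTextContent("Hello XML DOM");
            root.appendChild(body);

            // getElementsByTagName + removeChild: find and detach <from>.
            Node from = root.getElementsByTagName("from").item(0);
            root.removeChild(from);

            // childNodes / nodeName: walk the remaining children.
            List<String> result = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                result.add(child.getNodeName() + "=" + child.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(editNote("<note><to>SAM</to><from>JOHN</from></note>"));
    }
}
```

The input is written without whitespace between elements on purpose: with indentation, the child list would also contain whitespace-only text nodes.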
102. XML-DOCUMENT OBJECT MODEL
• The following example shows how an XML document ("note.xml") is parsed into an XML DOM object.
• It then extracts information from the DOM object with JavaScript.
103. XML-DOCUMENT OBJECT MODEL
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>john@starone.com</to>
<from>vimal@yahoo.com</from>
<body>Hello XML DOM</body>
</note>
104. XML-DOM
<!DOCTYPE html>
<html>
<body>
<h1>Important Note</h1>
<b>To:</b> <span id="to"></span><br>
<b>From:</b> <span id="from"></span><br>
<b>Message:</b> <span id="message"></span>
105. XML-DOM
<script>
if (window.XMLHttpRequest)
{
// code for IE7+, Firefox, Chrome, Opera, Safari
xmlhttp=new XMLHttpRequest();
}
else
{ // code for IE6, IE5
xmlhttp=new ActiveXObject("Microsoft.XMLHTTP");
}
109. XHTML
• XHTML stands for EXtensible HyperText Markup Language.
• XHTML was developed by the World Wide Web Consortium (W3C) to help web developers make the transition from HTML to XML.
• XHTML is almost identical to HTML.
• XHTML is stricter than HTML.
110. XHTML
Important Differences from HTML:
XHTML DOCTYPE is mandatory
The xmlns attribute in <html> is mandatory
<html>, <head>, <title>, and <body> are mandatory
XHTML Elements
XHTML elements must be properly nested
XHTML elements must always be closed
XHTML elements must be in lowercase
XHTML documents must have one root element
111. XHTML
XHTML Attributes
Attribute names must be in lower case
Attribute values must be quoted
Attribute minimization is forbidden
112. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title of document</title>
</head>
<body>
some content
</body>
</html>
113. USES OF XHTML
• XHTML documents are XML conforming, so they are readily viewed, edited, and validated with standard XML tools.
• XHTML documents can be written to operate better than they did before in existing browsers as well as in new browsers.
• XHTML documents can utilize applications such as scripts and applets that rely upon either the HTML Document Object Model or the XML Document Object Model.
• XHTML gives you a more consistent, well-structured format so that your webpages can be easily parsed and processed by present and future web browsers.
114. XHTML
• The XHTML 1.0 standard defines three Document Type Definitions (DTDs). The most commonly used and easiest one is XHTML Transitional.
• The three DTDs are:
• Strict
• Transitional
• Frameset
115. XHTML
• XHTML 1.0 Strict
• If you plan to rely strictly on Cascading Style Sheets (CSS) and to avoid most of the XHTML presentation attributes, then this DTD is recommended.
• XHTML 1.0 Transitional
• If you plan to use many XHTML attributes as well as a few Cascading Style Sheet properties, then you should adopt this DTD and write your XHTML document accordingly.
116. XHTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
120. DOM Parser in JAVA
• DOM is part of the Java API for XML Processing (JAXP).
• The Java DOM parser traverses the XML file and creates the corresponding DOM objects.
• These DOM objects are linked together in a tree structure.
• The parser reads the whole XML structure into memory.
121. DOM Parser in JAVA
DOM interfaces
• The DOM defines several Java interfaces:
• Node: the base datatype of the DOM.
• Element: the vast majority of the objects you'll deal with are Elements.
• Attr: represents an attribute of an element.
• Text: the actual content of an Element or Attr.
• Document: represents the entire XML document. A Document object is often referred to as a DOM tree.
122. DOM Methods
• Document.getDocumentElement(): returns the root element of the document.
• Node.getFirstChild(): returns the first child of a given Node.
• Node.getLastChild(): returns the last child of a given Node.
• Node.getNextSibling(): returns the next sibling of a given Node.
• Node.getPreviousSibling(): returns the previous sibling of a given Node.
• Element.getAttribute(attrName): for a given Element, returns the value of the attribute with the requested name.
123. USING DOM in JAVA
• Steps to Using the DOM Parser
• Following are the steps used while parsing a document using the DOM parser.
• Import XML-related packages.
• Create a DocumentBuilder.
• Create a Document from a file or stream.
• Extract the root element.
• Examine attributes.
• Examine sub-elements.
124. Using DOM in JAVA
• Import XML-related packages
import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;
• Create a DocumentBuilder
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
125. Using DOM in java
• Create a Document from a file
File inputFile = new File("input.xml");
Document doc = builder.parse(inputFile);
• Extract the root element
Element root = doc.getDocumentElement();
• Examine attributes
// returns the value of a specific attribute
root.getAttribute("attributeName");
// returns a NamedNodeMap of the element's attributes
root.getAttributes();
126. Using DOM in java
• Examine sub-elements
// returns a NodeList of descendant elements with the specified name
root.getElementsByTagName("subelementName");
// returns a NodeList of all child nodes
root.getChildNodes();
127. Using DOM in java
<?xml version = "1.0"?>
<class>
   <student rollno = "393">
      <firstname>dinkar</firstname>
      <marks>85</marks>
   </student>
   <student rollno = "493">
      <firstname>Vaneet</firstname>
      <marks>95</marks>
   </student>
</class>
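Putting the steps together for the class/student document, a possible program looks like the sketch below. It is an illustration, not the code from the original slides: the class name StudentDomDemo and the helper readStudents are made up, and the XML is read from a string rather than from input.xml so that the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class StudentDomDemo {

    static final String XML =
            "<?xml version = \"1.0\"?>"
          + "<class>"
          + "<student rollno = \"393\">"
          + "<firstname>dinkar</firstname><marks>85</marks></student>"
          + "<student rollno = \"493\">"
          + "<firstname>Vaneet</firstname><marks>95</marks></student>"
          + "</class>";

    // Returns one "rollno:firstname:marks" entry per student element.
    static List<String> readStudents(String xml) {
        try {
            // Create a DocumentBuilder.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Create a Document (here from a string instead of a file).
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Extract the root element.
            Element root = doc.getDocumentElement();
            // Examine sub-elements and attributes.
            NodeList students = root.getElementsByTagName("student");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < students.getLength(); i++) {
                Element student = (Element) students.item(i);
                String rollno = student.getAttribute("rollno");
                String first = student.getElementsByTagName("firstname")
                        .item(0).getTextContent();
                String marks = student.getElementsByTagName("marks")
                        .item(0).getTextContent();
                result.add(rollno + ":" + first + ":" + marks);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : readStudents(XML)) {
            System.out.println(line);
        }
    }
}
```

To read from a file as in the steps above, pass a java.io.File to builder.parse instead of the InputSource.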
131. SAX Parser
• SAX stands for Simple API for XML. It is an event-based way of parsing XML: the parser reads the XML file sequentially, as a stream, instead of loading it all at once, which makes it well suited to large XML files.
• A SAX parser fires an event when it encounters, for example, an opening tag, a closing tag, or text content (attributes are delivered together with the opening tag), and the application handles each event as it arrives.
132. SAX Parser
• Use the SAX parser for parsing large XML files in Java, because it does not need to load the whole XML file into memory and can read a big XML file in small parts.
• Java provides support for SAX, and you can parse any XML file in Java using the SAX parser.
• One disadvantage of the SAX parser in Java is that reading an XML file with it requires more code than with the DOM parser.
133. SAX Parser in JAVA
• The callback methods of the DefaultHandler class:
• startDocument(): called at the start of the XML document.
• endDocument(): called at the end of the XML document.
• startElement(): called at the start of a document element.
• endElement(): called at the end of a document element.
• characters(): called with the text contents in between the start and end tags of an XML document element.
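The callbacks can be observed by overriding them in a DefaultHandler subclass and recording each event. The following is a minimal sketch using the SAX parser from the JDK; the class name SaxEventsDemo, the helper events, and the event labels are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of any library.
class SaxEventsDemo {

    // Parses the XML and returns the sequence of callback events as strings.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver long text in several chunks;
                // whitespace-only chunks are skipped here.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return log;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String event : events("<note><to>SAM</to><from>JOHN</from></note>")) {
            System.out.println(event);
        }
    }
}
```

Unlike the DOM approach, nothing is kept in memory unless the handler stores it, which is why the application has to track its own state between callbacks.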