PREPARED BY
MS.V.MANOCHITRA,
HOD, DEPT OF IT
BON SECOURS COLLEGE FOR WOMEN
WEB DESIGN
Subject Code: 16SMBEIT1:1
Class: III-IT
(Affiliated to Bharathidasan University)
XML
Extensible Markup Language
Chapter : 1 Introduction
• XML is a subset of Standard Generalized Markup Language (SGML), which is
the parent of other markup languages, such as HyperText Markup Language
(HTML).
• A markup language is composed of commands that instruct a program, such as
a word processor, text editor or internet browser, how to present the output
on the screen.
• XML is a meta-markup language: a language for defining markup languages.
• It provides a format for describing structured data.
• It allows an unlimited set of tags.
• It provides a framework for tagging structured data.
• It is not a single, predefined markup language. It is a meta-language that
specifies rules for creating markup languages.
• XML is a language for describing other languages, which lets you design
your own markup.
• XML documents are made up of markup and character data.
• Character data is also known as content (all the text and images that
appear on the page).
Some reasons why XML has become as popular as it is today:
1. XML is easy to read and understand.
2. A large number of platforms support XML, and a large set of tools is
available for reading, writing and manipulating XML data.
3. XML can be used across the open standards that are available today.
4. XML allows developers to create their own definitions and models for
representation.
Advantages of XML
XML brings power and flexibility to web-based applications.
It provides a number of benefits to developers and users:
1. More meaningful searches.
2. Development of flexible web applications.
3. Data integration from different sources.
4. Local computation and manipulation of data.
5. Multiple views of the data.
6. Support for a wide variety of applications.
7. XML documents are easy to create.
Syntax of XML
The syntax of XML can be thought of at two distinct levels:
1. The general low-level syntax of XML, which specifies rules that apply to
all XML documents.
2. A second level, specified by either Document Type Definitions (DTDs) or
XML Schemas.
Rules to follow when you write XML:
1. XML names are used to name elements and attributes. An XML name must
begin with a letter or an underscore and can include digits, hyphens and
periods.
2. All XML elements must have a closing tag.
3. XML tags are case sensitive, so Body, body and BODY are all distinct names.
4. There is no length limitation for XML names.
5. All XML elements must be properly nested.
6. All XML documents must have a root element.
7. Attribute values must always be quoted.
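The rules above can be checked mechanically. The following is a minimal sketch in Python, assuming the standard-library parser xml.etree.ElementTree; the helper name is_well_formed and the sample strings are made up for this illustration. It only reports whether a document parses and does not say which rule was broken.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule being illustrated.
samples = {
    "valid": '<note id="1"><body>hi</body></note>',
    "missing closing tag": "<note><body>hi</note>",
    "case mismatch": "<Body>hi</body>",
    "improper nesting": "<a><b>hi</a></b>",
    "unquoted attribute": "<note id=1>hi</note>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample should be accepted; each of the others breaks one of the numbered rules.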
• An XML document should begin with an XML declaration:
• <?xml version = "1.0" encoding = "utf-8"?>
• Character set & Encoding
– All information in XML is Unicode text, which supports the representation
of all international character sets.
– Unicode can be transmitted directly as 16-bit characters.
– XML supports a range of encodings; the default is UTF-8.
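To illustrate the role of the encoding declaration, here is a small Python sketch using the standard-library xml.etree.ElementTree parser. The sample word and the choice of ISO-8859-1 as the second encoding are assumptions made for the illustration. The same text is stored as different bytes, and the parser uses the declaration to decode it.

```python
import xml.etree.ElementTree as ET

# The same one-element document stored in two different encodings.
# The sample text "café" is made up for this illustration.
utf8_doc = '<?xml version="1.0" encoding="utf-8"?><word>café</word>'.encode("utf-8")
latin1_doc = '<?xml version="1.0" encoding="iso-8859-1"?><word>café</word>'.encode("iso-8859-1")

# The parser reads the encoding declaration and decodes the bytes accordingly.
utf8_text = ET.fromstring(utf8_doc).text
latin1_text = ET.fromstring(latin1_doc).text

print(utf8_text, latin1_text)
print(utf8_text == latin1_text)
```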
Elements
• Every XML document defines a single root element.
• An element is everything from its start tag to its end tag, inclusive.
• An element can contain:
-> other elements
-> text
-> attributes
-> or a mix of all the above
• The top element is the root element, or document element.
• All the other elements are child elements.
• The elements at the ends of the branches contain character data.
• Empty elements do not contain any child elements or character data; they
are used for things such as image files, sound, video files and line breaks.
Example
<?xml version="1.0"?>              <!-- XML declaration -->
<mail>                             <!-- root element -->
  <to>virat</to>                   <!-- child elements -->
  <from>sachin</from>
  <heading>match</heading>
  <body>Don't forget to call me</body>
</mail>                            <!-- end of root element -->
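As a sketch of how a program sees the root and child elements of the mail document, the following Python snippet parses it with the standard-library xml.etree.ElementTree module and lists each child. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

mail_xml = """<?xml version="1.0"?>
<mail>
  <to>virat</to>
  <from>sachin</from>
  <heading>match</heading>
  <body>Don't forget to call me</body>
</mail>"""

root = ET.fromstring(mail_xml)          # the root element, <mail>
child_tags = [child.tag for child in root]

print("root:", root.tag)
for child in root:                      # iterate over the child elements in order
    print(child.tag, "->", child.text)
```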
The Syntax of XML Tags
<?xml version="1.0"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
</CATALOG>
Writing a Well-Formed XML Document
An XML document that strictly adheres to these syntax rules is considered
well formed.
• Start tags and end tags must match.
• Elements can’t overlap.
– Incorrect format
<title>computer<sub>science</title>ecom</sub>
– Correct format
<title>computer
<sub>Science</sub>
<author>Baker</author>
</title>
Chapter : 2 XML Document Structure
- An XML document often uses two auxiliary files:
- One to specify the structural syntactic rules
- One to provide a style specification
- An XML document has a single root element, but often consists of one or more
entities
- Entities can be as small as a single special character
- An XML document has one entity called the document entity
- Reasons for the entity structure:
1. Large documents are easier to manage
2. Repeated entities need not be literally repeated
3. Binary entities, such as images, can only be referenced in the document entity
- Entity names:
- No length limitation
- Must begin with a letter, an underscore, or a colon
- Can include letters, digits, periods, hyphens, underscores, or colons
- A reference to an entity has the form:
&entity_name;
For example, if apple_image is the name of the entity, &apple_image;
is a reference to it.
Predefined Entities or Reserved Characters
Character   Entity reference
<           &lt;
>           &gt;
&           &amp;
"           &quot;
'           &apos;
White space - Extra white space (spaces, tabs, new lines) between elements is generally ignored by applications.
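The predefined entities can be produced and reversed programmatically. This is a minimal Python sketch using escape and unescape from the standard-library module xml.sax.saxutils; the sample sentence and the two mapping names are assumptions for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; the extra mapping adds the two
# quote characters so that all five predefined entities are covered.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}

raw = 'if a < b & b > c then say "it\'s sorted"'
escaped = escape(raw, QUOTES)            # reserved characters -> entity references
restored = unescape(escaped, UNQUOTES)   # entity references -> original characters

print(escaped)
print(restored == raw)
```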
Chapter : 3 Document Type Definitions (DTD)
A Document Type Definition (DTD) defines the legal building blocks of an
XML document. It defines the document structure with a list of legal elements and
attributes
- A DTD is a set of structural rules called declarations
- These rules specify a set of elements, along with how and where they can
appear in a document.
- The DTD for a document can be internal or external
- All of the declarations of a DTD are enclosed in the block of a DOCTYPE
markup declaration
- DTD declarations have the form:
<!keyword … >
- There are four possible declaration keywords:
ELEMENT, ATTLIST, ENTITY, and NOTATION
Document Type Definitions (continued)
- Declaring Elements
- An element declaration specifies the name of an element and the
element’s structure
- If the element is a leaf node of the document tree, its structure is in
terms of characters
- If it is an internal node, its structure is a list of child elements
(either leaf or internal nodes)
- General form:
<!ELEMENT element_name (list of child names)>
e.g., for a memo document:
<!ELEMENT memo (from, to, date, re, body)>
Document tree: memo is the parent node; from, to, date, re and body are its children.
Document Type Definitions (continued)
- Declaring Attributes: An attribute declaration must include the name of the
element to which the attribute belongs, the attribute name, and its type.
- General form:
<!ATTLIST el_name at_name at_type [default]>
If more than one attribute is declared for a given element,
the declarations can be combined:
<!ATTLIST element_name
attribute_name_1 attribute_type default_value_1
attribute_name_2 attribute_type default_value_2
…
attribute_name_n attribute_type default_value_n
>
Document Type Definitions (continued)
- Declaring Attributes (continued)
- Attribute types: there are ten different types, but we will consider only CDATA
- Default values:
a value (used when the attribute is not specified),
#FIXED value (every element will have this value),
#REQUIRED (every instance of the element must have a value specified), or
#IMPLIED (no default value and need not specify a value)
<!ATTLIST element_name
attribute_name_1 attribute_type default_value_1>
<!ATTLIST car doors CDATA "4">
<!ATTLIST car engine_type CDATA #REQUIRED>
<!ATTLIST car price CDATA #IMPLIED>
<!ATTLIST car make CDATA #FIXED "Ford">
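To see the effect of the default values, the sketch below places the four car declarations in an internal DTD and parses a made-up car element with Python's standard-library xml.etree.ElementTree. Two assumptions apply: this parser is non-validating, so #REQUIRED is not enforced, and it relies on the underlying expat parser supplying attribute defaults declared in the internal DTD subset.

```python
import xml.etree.ElementTree as ET

# Internal DTD reusing the four car attribute declarations.
# The <car> instance at the end is a made-up example.
car_xml = """<?xml version="1.0"?>
<!DOCTYPE car [
  <!ELEMENT car EMPTY>
  <!ATTLIST car doors CDATA "4">
  <!ATTLIST car engine_type CDATA #REQUIRED>
  <!ATTLIST car price CDATA #IMPLIED>
  <!ATTLIST car make CDATA #FIXED "Ford">
]>
<car engine_type="V6"/>"""

car = ET.fromstring(car_xml)

print(car.get("engine_type"))  # specified in the document itself
print(car.get("doors"))        # not specified: taken from the declared default
print(car.get("make"))         # not specified: taken from the #FIXED value
print(car.get("price"))        # #IMPLIED and not specified: no value at all
```

Checking that a #REQUIRED attribute is actually present needs a validating parser, which is outside the scope of this sketch.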
Chapter : 4 XML Namespace
• An XML namespace is a collection of names used in XML documents as element types and attribute
names
• The name of an XML namespace has the form of a URL
• A namespace declaration has the form:
<element_name xmlns[:prefix] = "URL">
<gmcars xmlns:gm = "http://www.gm.com/names">
Suppose there are two XML files within our application, and the two files use some of the same tag names.
This makes it difficult for a program to tell the elements apart.
Eg 1:
<?xml version="1.0" encoding="UTF-8"?>
<book>
<title> asp.net</title>
<price> 49.99</price>
<year> 2005</year>
</book>
Eg 2:
<?xml version="1.0" encoding="UTF-8"?>
<author>
<title> asp.net</title>
<fname> 49.99</fname>
<lname> 2005</lname>
</author>
By using the XML namespace attribute xmlns, we can rectify the problem.
Eg 1:
<?xml version="1.0" encoding="UTF-8"?>
<Book xmlns="http://www.xmlws101.com/xmlns/Book">
<title> asp.net</title>
<price> 49.99</price>
<year> 2005</year>
</Book>
Eg 2:
<?xml version="1.0" encoding="UTF-8"?>
<Author xmlns="http://www.xmlws101.com/xmlns/Author">
<title> asp.net</title>
<fname> 49.99</fname>
<lname> 2005</lname>
</Author>
In these examples, the <Book> and <Author> elements each declare an XML namespace that
uniquely identifies that element and all the elements contained within it.
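A short Python sketch, assuming the standard-library xml.etree.ElementTree module, shows how the two title elements become distinguishable once each document declares its own namespace. The documents are shortened versions of the two examples, and the prefix "b" used in the lookup is local to the script.

```python
import xml.etree.ElementTree as ET

BOOK_NS = "http://www.xmlws101.com/xmlns/Book"
AUTHOR_NS = "http://www.xmlws101.com/xmlns/Author"

book = ET.fromstring(f'<Book xmlns="{BOOK_NS}"><title>asp.net</title></Book>')
author = ET.fromstring(f'<Author xmlns="{AUTHOR_NS}"><title>asp.net</title></Author>')

# ElementTree stores each tag as "{namespace-URI}local-name".
book_title = book[0]
author_title = author[0]
print(book_title.tag)
print(author_title.tag)
print(book_title.tag == author_title.tag)

# Namespace-aware lookup; the prefix "b" is chosen only for this script.
found = book.find("b:title", {"b": BOOK_NS})
print(found.text)
```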
Chapter : 5 XML Schemas
• “Schemas” is a general term; DTDs are one form of XML schema
– According to the dictionary, a schema is “a structured
framework or plan”
XML Schemas
An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear within elements
• defines which elements are child elements
• defines the sequence in which the child elements can appear
• defines the number of child elements
• defines whether an element is empty or can include text
• defines default values for attributes
XML Schema is one of the alternatives to DTDs
- Schemas are written using a namespace
- Every XML schema has a single root element, schema
The schema element must specify the namespace for schemas as its xmlns:xsd
attribute
• XML Schema defines 44 data types
- Primitive: string, boolean, float, decimal, …
- Derived: byte, positiveInteger, …
- User-defined (derived) data types, which specify the base type
Example of XML Schema document
<?xml version="1.0" encoding="UTF-8"?>
<City xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="AtomicType.xsd">
</City>
(xmlns:xsi specifies the namespace; xsi:noNamespaceSchemaLocation specifies the
schema filename)
<xsd:complexType name="sportscar">
  <xsd:sequence>
    <xsd:element name="make" type="xsd:string"/>
    <xsd:element name="model" type="xsd:string"/>
    <xsd:element name="engine" type="xsd:string"/>
    <xsd:element name="year" type="xsd:decimal"/>
  </xsd:sequence>
</xsd:complexType>
(A complex type can contain ordered or unordered groups of elements.)
(A sequence contains its elements only as an ordered group.)
Chapter : 6 Displaying XML Documents with CSS (Cascading Style Sheets)
CSS is a technology for defining the layout or formatting of documents.
- A CSS style sheet for an XML document is just a list of its tags and associated styles
Eg:
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/css" href="TwoColumn.css"?>
CSS coding:
body
{
background-color: #0000cc;
color: #000;
}
#banner
{
background-color: #00cc00;
color: #000;
}
LeftColumn
{
width: 300px;
color: 3000;
}
Example :
<?xml version="1.0"?>
<!-- XML demonstration -->
<?xml-stylesheet type="text/css" href="style9.css"?>
<!DOCTYPE planet>
<planet>
<ocean>
<name>Arctic</name>
<area>13,000</area>
<depth>1,200</depth>
</ocean>
<ocean>
<name>Atlantic</name>
<area>87,000</area>
<depth>3,900</depth>
</ocean>
</planet>
XSLT Style Sheets
• XSL (eXtensible Stylesheet Language) began as a standard for the
presentation of XML documents. XSL is a family of recommendations for
defining XML document transformations and presentation.
• It is split into three parts:
- XSLT: Transformations
- XPath: XML Path Language
- XSL-FO: Formatting objects for printable documents
XSLT (eXtensible Stylesheet Language Transformations) is a transformation language
for XML.
XSLT is a templating language that can be used to convert XML into something
else.
The result of a transformation can be XML, HTML, XHTML, or even plain text or
binary.
XSLT processing
[Figure: XSLT processing, with boxes labelled XML Document, XSL Document, XSLT Processor and XSLT Document]
XML Processors
- There are two different approaches to designing XML processors:
- SAX (Simple API for XML)
- DOM (Document Object Model)
SAX is an event-driven programming interface for XML parsing.
SAX is widely accepted and supported.
SAX packages:
org.xml.sax -> defines the handler interfaces, whose methods are called in response to
parsing events or errors
org.xml.sax.helpers -> provides default implementations
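The packages named above belong to the Java SAX API. As a rough illustration of the same event-driven idea, here is a sketch using Python's standard-library xml.sax module instead; the handler class TagCounter and the sample document are made up for this example. The parser calls the handler methods as it meets each start tag and end tag, without building a tree.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Records start/end events and counts start tags per element name."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.events = []

    def startElement(self, name, attrs):
        # Called by the parser each time a start tag is read.
        self.counts[name] = self.counts.get(name, 0) + 1
        self.events.append(("start", name))

    def endElement(self, name):
        # Called by the parser each time an end tag is read.
        self.events.append(("end", name))


# Made-up sample document for this illustration.
doc = (b"<planet><ocean><name>Arctic</name></ocean>"
       b"<ocean><name>Atlantic</name></ocean></planet>")

handler = TagCounter()
xml.sax.parseString(doc, handler)

print(handler.counts)
print(handler.events[:3])
```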
DOM
- The DOM (Document Object Model) is an interface used to navigate and manipulate
the structure and content of a document.
- The DOM processor builds a DOM tree structure of the document.
The root of the tree is the document node, which has one or more child nodes.
EXAMPLE
<?xml version="1.0" encoding="UTF-8"?>
<Products>
<Product Category="Actor">
<ProductID>thalaiva</ProductID>
<Name>Vijay</Name>
<ProductNumber>101</ProductNumber>
</Product>
<Product Category="Actress">
<ProductID>singam</ProductID>
<Name>Anushka</Name>
<ProductNumber>102</ProductNumber>
</Product>
<Product Category="Comedy">
<ProductID>OkOk</ProductID>
<Name>Santhanam</Name>
<ProductNumber>103</ProductNumber>
</Product>
</Products>
[Figure: An Example of DOM Structure. Root element (Products); child element (Product) with attribute (Category); its child elements (ProductID), (Name) and (ProductNumber) are siblings; the links are labelled parent and child]
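The DOM tree can be explored with any DOM implementation. The sketch below uses Python's standard-library xml.dom.minidom on a well-formed, two-product version of the Products example; the variable names are illustrative. It reads the root element, an attribute, a child element's text, and the parent link.

```python
from xml.dom import minidom

products_xml = """<?xml version="1.0" encoding="UTF-8"?>
<Products>
  <Product Category="Actor">
    <ProductID>thalaiva</ProductID>
    <Name>Vijay</Name>
    <ProductNumber>101</ProductNumber>
  </Product>
  <Product Category="Actress">
    <ProductID>singam</ProductID>
    <Name>Anushka</Name>
    <ProductNumber>102</ProductNumber>
  </Product>
</Products>"""

dom = minidom.parseString(products_xml.encode("utf-8"))
root = dom.documentElement               # the root element node, <Products>
print(root.tagName)

rows = []
for product in root.getElementsByTagName("Product"):
    category = product.getAttribute("Category")          # attribute node value
    name_node = product.getElementsByTagName("Name")[0]  # child element
    name = name_node.firstChild.data                      # text node under <Name>
    rows.append((category, name))
    print(category, name, "parent:", name_node.parentNode.tagName)
```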
