Teaching Innovation - Entrepreneurial - Global
The Centre for Technology enabled Teaching & Learning , N Y S S, India
DTEL(Department for Technology Enhanced Learning)
© By Ganesh K. Yenurkar
DEPARTMENT OF COMPUTER
TECHNOLOGY
III-SEMESTER
XML: AN OVERVIEW
AUTHOR
MR. GANESH K. YENURKAR
ASSISTANT PROFESSOR
YESHWANTRAO CHAVAN COLLEGE OF ENGINEERING , NAGPUR-441110 (MS)
PREFACE
As educators, we share a common goal: to guide our students so that they gain the
maximum possible in a positive environment that promotes their success and inculcates in
them the desire to learn. One of the best tools available to us in this pursuit is PPT instruction that
is systematic and supports self-learning. The goal of this PPT is to help teachers use
eLearning as an effective and efficient method for teaching our students. It has been
developed for purely academic and non-commercial purposes.
Our desire in preparing this PPT is to support teachers, who have the very demanding task
of following a teaching plan to deliver instruction on a lecture/period basis. The PPT is therefore prepared
lecture-wise. Further, questions have been included at the end of each chapter for practice.
We begin here with a tour of XML: An Overview for learners. This PPT is aimed at
beginners who want to build a career in, or get hands-on experience with, XML technologies.
It covers an introduction to XML, its purpose, how to use it, XML syntax, and
more.
With deep regards and humility, we thank both our Management of MGI for motivating and
our CEO for strong follow-ups to prepare PPTs under . We dedicate this PPT to students and
our shared profession.
G. K. Yenurkar
COURSE OBJECTIVES
1. To learn how XML and its related technologies function and
how they facilitate integration between applications
2. To master and integrate the core syntax of XML, DTD, and
XML Schema
3. To understand the fundamentals of XSL
COURSE OUTCOMES
On successful completion of this course the student should be able to:
1. Design and code data transfer scripts using XML languages for the transfer of data
over business networks and the Internet.
2. Validate XML documents with the Document Type Definitions and schemas according
to industry standards.
3. Transform various data formats such as text and images so that this information can be
transferred to and from server storage devices on business and health care networks and
the Internet.
4. Validate XML code and associated DTDs and schemas using an XML editing tool so
that the XML code can be used within business and health care industries.
Contents
1 Introduction to XML
2 Document Type Definition
3 Querying XML with XPath and XQuery
Introduction to XML
LECTURE-1
XML - Introduction
• XML stands for Extensible Markup Language. It is a text-based
markup language derived from the Standard Generalized Markup
Language (SGML).
• XML tags identify the data and are used to store and
organize it, rather than to specify how to display it, as
HTML tags do. XML is not
going to replace HTML in the near future, but it introduces new
possibilities by adopting many successful features of HTML.
• Three important characteristics of XML make it
useful in a variety of systems and solutions.
•XML is an extensible language − XML allows you to create your own
self-descriptive tags, or language, that suit your application.
•XML carries the data but does not present it − XML allows you to
store the data irrespective of how it will be presented.
•XML is a public standard − XML was developed by the
World Wide Web Consortium (W3C) and is
available as an open standard.
Usage of XML
XML is used in the following ways:
•XML can work behind the scenes to simplify the creation of HTML documents for
large web sites.
•XML can be used to exchange the information between organizations and
systems.
•XML can be used for offloading and reloading of databases.
•XML can be used to store and arrange the data, which can customize your data
handling needs.
•XML can easily be merged with style sheets to create almost any desired output.
•Virtually, any type of data can be expressed as an XML document.
What do you mean by XML Markup?
• XML is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable
and machine-readable. So what exactly is a markup language?
Markup is information added to a document that enhances its
meaning in certain ways, in that it identifies the parts and how
they relate to each other.
• More specifically, a markup language is a set of symbols that
can be placed in the text of a document to demarcate and label
the parts of that document.
• The following example shows how XML markup looks when
embedded in a piece of text −
<My_message>
<text>Hello, Friends</text>
</My_message>
Explanation: This snippet includes the markup symbols, or tags,
<My_message>...</My_message> and <text>...</text>. The tags <My_message> and
</My_message> mark the start and the end of the XML code fragment. The tags
<text> and </text> surround the text Hello, Friends.
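As a rough illustration of how a program sees this markup, the sketch below parses a version of the snippet in which the <text> element is properly closed. It uses Python's standard xml.etree.ElementTree module; the choice of Python and the variable names are assumptions made for illustration, since the slides do not prescribe a parser.

```python
import xml.etree.ElementTree as ET

# The snippet from the slide, with the <text> element properly closed.
snippet = "<My_message><text>Hello, Friends</text></My_message>"

root = ET.fromstring(snippet)      # parse the markup into an element tree
text_element = root.find("text")   # look up the nested <text> element

print(root.tag)            # name of the outermost element
print(text_element.text)   # character data enclosed by <text>...</text>
```

The parser separates the two kinds of information: the tag names become element names, and the enclosed characters become the element's text.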
Is XML a Programming Language?
No. A programming language consists of grammar rules and
its own vocabulary, which are used to create computer
programs. These programs instruct the computer to perform
specific tasks. XML does not qualify as a programming
language because it does not perform any computation or
express algorithms. It is usually stored in a simple text file and is
processed by special software that is capable of interpreting
XML.
XML - Syntax
Following is a complete XML document −
<?xml version = "1.0"?>
<contact>
<name>Ganesh Yenurkar </name>
<address> Nagpur</address>
<company> YCCE </company>
<phone> 0000000000</phone>
</contact>
There are two kinds of information in the above example −
•Markup, such as the tags <contact> and </contact>
•Text, or character data, such as YCCE and 0000000000
[Diagram: syntax rules for writing the different types of markup and text in an XML document]
XML Declaration
An XML document can optionally have an XML declaration. It is written as
follows −
<?xml version = "1.0" encoding = "UTF-8"?>
where version is the XML version and encoding specifies the character
encoding used in the document. The declaration is part of the document's prolog.
Syntax Rules for XML Declaration
•The XML declaration is case sensitive and must begin with "<?xml", where "xml"
is written in lower-case.
•If the document contains an XML declaration, it must be the first
statement of the XML document.
•The HTTP protocol can override the value of encoding that you put in the XML
declaration, for example through the charset parameter of the Content-Type header.
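The rule that the declaration must come first can be checked with a short experiment. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name parses and the three sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def parses(document):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

declaration_first = '<?xml version="1.0"?><contact/>'
declaration_late = '\n<?xml version="1.0"?><contact/>'   # a newline precedes it
no_declaration = '<contact/>'                             # the declaration is optional

print(parses(declaration_first))
print(parses(declaration_late))
print(parses(no_declaration))
```

Even a single newline before the declaration is enough for the parser to reject the document, while leaving the declaration out entirely is accepted.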
XML is not…
• A replacement for HTML
(but HTML can be generated from XML)
• A presentation format
(but XML can be converted into one)
• A programming language
(but it can be used with almost any language)
• A network transfer protocol
(but XML may be transferred over a network)
• A database
(but XML may be stored in a database)
But XML Is …
• XML is a meta markup language for text documents / textual
data
• XML allows us to define languages ("applications") to represent
text documents / textual data
XML with Examples
<article>
<author>GANESH YENURKAR</author>
<title>WEB TECHNOLOGY</title>
</article>
• Easy to understand for human users
• Very expressive and clear (semantics are given along with the
data)
• Well structured, easy to read and write from programs
This looks nice, but…
… this is XML, too:
<t108>
<x87>GANESH YENURKAR</x87>
<g10>WEB TECHNOLOGY</g10>
</t108>
• Very hard to understand for human users
• Not expressive (no semantics are given along with the
data)
• Well structured, easy to read and write from programs
… and what about this XML document:
<data>
ch37fhgks73j5mv9d63h5mgfkds8d984lgnsmcns983
</data>
• Impossible to understand for human users
• Not expressive (no semantics along with the data)
• Unstructured, read and write only with special programs
Note: The actual benefit of using XML depends heavily on the design
of the application.
LECTURE-2
Advantages of Using XML
• Truly Portable Data
• Easily readable by human users
• Very expressive (semantics near data)
• Very flexible and customizable (no finite tag set)
• Easy to use from programs (libs available)
• Easy to convert into other representations
(XML transformation languages)
• Many additional standards and tools
• Widely used and supported
App. Scenario : Data Exchange
[Diagram: two legacy systems (e.g., SAP R/2 and Cobol), one at the supplier and one at the buyer, each sit behind an XML adapter; the order is exchanged between the adapters as XML (BMECat, ebXML, RosettaNet, BizTalk, …).]
App. Scenario : XML for Metadata
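<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">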
<rdf:Description rdf:about="http://www-dbs/Sch03.pdf">
<dc:title>A Framework for…</dc:title>
<dc:creator>Ralf Schenkel</dc:creator>
<dc:description>While there are...</dc:description>
<dc:publisher>Saarland University</dc:publisher>
<dc:subject>XML Indexing</dc:subject>
<dc:rights>Copyright ...</dc:rights>
<dc:type>Electronic Document</dc:type>
<dc:format>text/pdf</dc:format>
<dc:language>en</dc:language>
</rdf:Description>
</rdf:RDF>
App. Scenario : Document Markup
<article>
<section id="01" title="XML: An Overview">
This article is about <index>XML</index>.
</section>
<section id="02" title="Main Results">
<name>Weikum</name> <cite idref="Weik01"/> shows the following
theorem (see Section <ref idref="01"/>)
<theorem id="theo:1" source="Weik01">
For any XML document x, ...
</theorem>
</section>
<literature>
<cite id="Weik01"><author>Weikum</author></cite>
</literature>
</article>
• Document Markup adds structural and semantic
information to documents, e.g.
Sections, Subsections, Theorems, …
1. Cross References
2. Literature Citations
3. Index Entries
4. Named Entities
• This allows queries like
Which articles cite Weikum's XML paper from 2001? and
Which articles talk about (the named entity) "Weikum"?
XML Documents
What's in an XML document?
It is a set of:
1. Elements
2. Attributes
plus some other details
A Simple XML Document
<article>
<author>Gerhard Weikum</author>
<title>The Web in Ten Years</title>
<text>
<abstract>In order to evolve...</abstract>
<section number="1" title="Introduction">
The <index>Web</index> provides the universal...
</section>
</text>
</article>
What is Markup?
LECTURE-3
• (Freely definable) tags: article, title, author
with start tag: <article> etc.
and end tag: </article> etc.
• Elements: <article> ... </article>
• Elements have a name (article) and a content (...)
• Elements may be nested.
• Elements may be empty: <this_is_empty/>
• Element content is typically parsed character data (PCDATA),
i.e., strings with special characters, and/or nested elements
(mixed content if both).
• Each XML document has exactly one root element and forms
a tree.
• Elements with a common parent are ordered.
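Several of these properties can be observed directly from a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the sample document reuses names from the slides, and the variable names are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <this_is_empty/>
</article>"""

root = ET.fromstring(doc)

child_names = [child.tag for child in root]   # children keep document order
empty = root.find("this_is_empty")

print(root.tag)                  # the single root element
print(child_names)
print(empty.text, len(empty))    # an empty element has no text and no children

# A second top-level element breaks the "exactly one root" rule.
try:
    ET.fromstring("<a/><b/>")
    two_roots_ok = True
except ET.ParseError:
    two_roots_ok = False
print(two_roots_ok)
```

The children come back in the order in which they appear in the document, and the parser rejects a document with two top-level elements.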
Elements vs. Attributes
Elements may have attributes (in the start tag) that have a name and
a value, e.g. <section number="1">.
What is the difference between elements and attributes?
• Only one attribute with a given name per element (but an arbitrary number of
subelements)
• Attributes have no structure, simply strings (while elements can have
subelements)
As a rule of thumb:
• Content into elements
• Metadata into attributes
Example:
<person born="1912-06-23" died="1954-06-07">
Alan Turing</person> proved that…
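The differences listed above can be tried out on the Alan Turing example. This is a sketch under the assumption that Python's standard xml.etree.ElementTree module is used; the second document with a repeated born attribute is invented to trigger the error.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    '<person born="1912-06-23" died="1954-06-07">Alan Turing</person>'
)

print(person.text)                         # content lives in the element
print(person.attrib)                       # metadata lives in the attributes
print(type(person.get("born")).__name__)   # attribute values are plain strings

# Repeating an attribute name on one element is not well-formed.
try:
    ET.fromstring('<person born="1912" born="1913"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```

The attribute values arrive as unstructured strings, so any further structure (such as the parts of a date) has to be interpreted by the application.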
XML Parsing
[Diagram: the example document parsed into a tree]
article
  author: Gerhard Weikum
  title: The Web in 10 years
  text
    abstract: In order …
    section (number="1", title="…")
      The
      index: Web
      provides …
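A parser builds this tree in memory. The sketch below prints an indented outline of the example article, assuming Python's standard xml.etree.ElementTree module; the function name outline and the output layout are illustrative choices, not part of the slides.

```python
import xml.etree.ElementTree as ET

doc = (
    "<article><author>Gerhard Weikum</author>"
    "<title>The Web in Ten Years</title>"
    "<text><abstract>In order to evolve...</abstract>"
    '<section number="1" title="Introduction">'
    "The <index>Web</index> provides the universal..."
    "</section></text></article>"
)

def outline(element, depth=0):
    """Return one indented line per element and per piece of text."""
    pad = "  " * depth
    lines = [f"{pad}{element.tag} {element.attrib}" if element.attrib
             else f"{pad}{element.tag}"]
    if element.text and element.text.strip():
        lines.append(f'{pad}  "{element.text.strip()}"')
    for child in element:
        lines.extend(outline(child, depth + 1))
        # Text that follows a child inside the same parent is the child's tail.
        if child.tail and child.tail.strip():
            lines.append(f'{pad}  "{child.tail.strip()}"')
    return lines

tree_lines = outline(ET.fromstring(doc))
print("\n".join(tree_lines))
```

The section element shows mixed content: the text before the index element, the index element itself, and the text after it all appear as separate entries under section.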
Special characters used in XML
Some special characters must be escaped using
entities:
< → &lt;
& → &amp;
(will be converted back when reading the XML doc)
Some other characters may be escaped, too:
> → &gt;
" → &quot;
' → &apos;
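Escaping is usually left to a library rather than done by hand. The following sketch assumes Python's standard xml.sax.saxutils and xml.etree.ElementTree modules; the sample strings are invented for illustration.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

raw = "if a < b && b > c"

escaped = escape(raw)   # replaces &, < and > with entity references
print(escaped)

# Quotes only matter inside attribute values; extra entities are opt-in.
quoted = escape('say "hi"', {'"': "&quot;"})
print(quoted)

# The parser converts the entities back when it reads the document.
element = ET.fromstring(f"<code>{escaped}</code>")
print(element.text == raw)
print(unescape(escaped) == raw)
```

After parsing, the application sees the original characters again, so the entities exist only in the serialized document.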
Well-Formed XML Documents
• A well-formed document must adhere to, among others, the
following rules:
• Every start tag has a matching end tag.
• Elements may nest, but must not overlap.
• There must be exactly one root element.
• Attribute values must be quoted.
• An element may not have two attributes with the same name.
• Comments and processing instructions may not appear inside
tags.
• No unescaped < or & signs may occur inside character data.
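A quick way to test these rules is to hand small documents to a parser and see which ones it accepts. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the function name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(document):
    """Return True if a non-validating parse of the document succeeds."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

samples = {
    "matching tags":       "<a><b>text</b></a>",
    "missing end tag":     "<a><b>text</a>",
    "overlapping":         "<a><b></a></b>",
    "unquoted attribute":  "<a x=1/>",
    "unescaped ampersand": "<a>fish & chips</a>",
    "escaped ampersand":   "<a>fish &amp; chips</a>",
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

Only the first and last samples are accepted; each of the others violates one of the rules listed above.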
Namespace Syntax
<dbs:book xmlns:dbs="http://www-dbs/dbs">
• xmlns: signals that a namespace definition happens
• dbs: prefix as abbreviation of the URI
• http://www-dbs/dbs: unique URI to identify the namespace
Namespace Example
<dbs:book xmlns:dbs="http://www-dbs/dbs">
<dbs:description> ... </dbs:description>
<dbs:text>
<dbs:formula>
<mathml:math
xmlns:mathml="http://www.w3.org/1998/Math/MathML">
...
</mathml:math>
</dbs:formula>
</dbs:text>
</dbs:book>
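To see that the prefix is only an abbreviation and the URI is what identifies the namespace, the example can be parsed and queried. The sketch assumes Python's standard xml.etree.ElementTree module; the query prefixes d and m are arbitrary names chosen for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<dbs:book xmlns:dbs="http://www-dbs/dbs">
  <dbs:text>
    <dbs:formula>
      <mathml:math xmlns:mathml="http://www.w3.org/1998/Math/MathML"/>
    </dbs:formula>
  </dbs:text>
</dbs:book>"""

root = ET.fromstring(doc)

# ElementTree expands each prefix to "{namespace-URI}local-name".
print(root.tag)

# Prefixes used in a query are local to the query; only the URI has to match.
prefixes = {"d": "http://www-dbs/dbs",
            "m": "http://www.w3.org/1998/Math/MathML"}
math = root.find("d:text/d:formula/m:math", prefixes)
print(math.tag)
```

The query finds the math element even though it uses different prefixes than the document, because both sets of prefixes map to the same URIs.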
LECTURE-4
Document Type Definitions
Sometimes XML is too flexible:
• Most programs can only process a subset of all possible XML
applications
• For exchanging data, the format (i.e., elements, attributes and their
semantics) must be fixed
Document Type Definitions (DTDs) establish the
vocabulary for one XML application (in some sense comparable to
schemas in databases).
A document is valid with respect to a DTD if it conforms to the
rules specified in that DTD.
Most XML parsers can be configured to validate documents against a DTD.
DTD Example: Elements
<!ELEMENT article (title,author+,text)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT text (abstract,section*,literature?)>
<!ELEMENT abstract (#PCDATA)>
<!ELEMENT section (#PCDATA|index)*>
<!ELEMENT literature (#PCDATA)>
<!ELEMENT index (#PCDATA)>
• Content of the title element is parsed character data.
• Content of the article element is a title element,
followed by one or more author elements,
followed by a text element.
• Content of the text element may
contain zero or more section
elements in this position.
Element Declarations in DTDs
One element declaration for each element type:
<!ELEMENT element_name content_specification>
where content_specification can be
(#PCDATA)    parsed character data
(child)      one child element
(c1,…,cn)    a sequence of child elements c1…cn
(c1|…|cn)    one of the elements c1…cn
For each component c, possible counts can be specified:
c            exactly one such element
c+           one or more
c*           zero or more
c?           zero or one
Plus arbitrary combinations using parentheses:
<!ELEMENT f ((a|b)*,c+,(d|e))*>
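Content models of this kind behave like regular expressions over the sequence of child element names. The sketch below makes that analogy concrete in Python; it is not how a validating parser is implemented, the function names are hypothetical, and it handles only models built from element names (not #PCDATA, EMPTY, or ANY).

```python
import re

def content_model_to_regex(model):
    """Translate a DTD content model of element names into a regular expression.

    Each child name is encoded as "name;" so that a list of children becomes
    one string, e.g. ["a", "c", "d"] -> "a;c;d;".
    """
    compact = model.replace(" ", "")
    # Turn every element name into a token; |, *, +, ? and () carry over as-is.
    tokens = re.sub(r"[A-Za-z_][\w.-]*",
                    lambda m: "(?:" + re.escape(m.group(0)) + ";)",
                    compact)
    # A comma means "followed by", which is plain concatenation in a regex.
    return re.compile(tokens.replace(",", ""))

def children_match(model, children):
    """Return True if the list of child names satisfies the content model."""
    encoded = "".join(name + ";" for name in children)
    return content_model_to_regex(model).fullmatch(encoded) is not None

model_f = "((a|b)*,c+,(d|e))*"
print(children_match(model_f, ["a", "b", "c", "d"]))
print(children_match(model_f, ["a", "b"]))
print(children_match("(title,author+,text)", ["title", "author", "author", "text"]))
```

The first and third calls succeed, while the second fails because the model requires at least one c followed by a d or e after any a and b elements.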
Element Declarations
Elements with mixed content:
<!ELEMENT text (#PCDATA|index|cite|glossary)*>
Elements with empty content:
<!ELEMENT image EMPTY>
Elements with arbitrary content (not suitable for production-level
DTDs):
<!ELEMENT thesis ANY>
Attribute Declarations in DTDs
Attributes are declared per element:
<!ATTLIST section number CDATA #REQUIRED
title CDATA #REQUIRED>
declares two required attributes for element section.
Each declaration consists of the element name (section), the attribute name (number, title),
the attribute type (CDATA), and the attribute default (#REQUIRED).
Possible attribute defaults:
#REQUIRED         the attribute is required in each element instance
#IMPLIED          the attribute is optional
#FIXED default    the attribute always has this default value
default           the attribute has this default value if it is
                  omitted from the element instance
Attribute Types in DTDs
CDATA        string data
(A1|…|An)    enumeration of all possible values of the
             attribute (each is an XML name)
ID           unique XML name to identify the element
IDREF        refers to the ID attribute of some other element
             ("intra-document link")
IDREFS       list of IDREF values, separated by white space
plus some more
Attribute Examples
<!ATTLIST publication type (journal|inproceedings) #REQUIRED
pubid ID #REQUIRED>
<!ATTLIST cite cid IDREF #REQUIRED>
<!ATTLIST citation ref IDREF #IMPLIED
cid ID #REQUIRED>
<publications>
<publication type="journal" pubid="Weikum01">
<author>Gerhard Weikum</author>
<text>In the Web of 2010, XML <cite cid="c12"/>...</text>
<citation cid="c12" ref="XML98"/>
<citation cid="c15">...</citation>
</publication>
<publication type="inproceedings" pubid="XML98">
<text>XML, the Extensible Markup Language, ...</text>
</publication>
</publications>
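The ID and IDREF constraints amount to two checks: every ID value is unique within the document, and every IDREF value matches some ID. The sketch below performs both checks by hand in Python, since the standard xml.etree.ElementTree module does not read attribute types from a DTD. The two lookup tables, the identifiers c12 and c15 (ID values must be XML names, which cannot start with a digit), and the reference Missing99 are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Attribute typing normally comes from the DTD; here it is spelled out by hand.
ID_ATTRIBUTES = {("publication", "pubid"), ("citation", "cid")}
IDREF_ATTRIBUTES = {("cite", "cid"), ("citation", "ref")}

# "Missing99" is a made-up reference that points at no existing ID.
doc = """<publications>
  <publication type="journal" pubid="Weikum01">
    <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
    <citation cid="c12" ref="XML98"/>
    <citation cid="c15" ref="Missing99"/>
  </publication>
  <publication type="inproceedings" pubid="XML98"/>
</publications>"""

root = ET.fromstring(doc)

ids, refs = [], []
for element in root.iter():
    for name, value in element.attrib.items():
        if (element.tag, name) in ID_ATTRIBUTES:
            ids.append(value)
        if (element.tag, name) in IDREF_ATTRIBUTES:
            refs.append(value)

duplicate_ids = sorted({v for v in ids if ids.count(v) > 1})
dangling_refs = sorted(v for v in refs if v not in ids)

print(duplicate_ids)   # IDs that occur more than once
print(dangling_refs)   # IDREFs without a matching ID
```

A validating parser would report the dangling reference as a validity error; here it simply shows up in the second list.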
XML - DTDs
• The XML Document Type Definition, commonly known as DTD, is a way
to describe an XML language precisely. DTDs are used to check the vocabulary and the validity of
the structure of XML documents against the grammatical rules of the appropriate
XML language.
• An XML DTD can either be specified inside the document, or it can be kept
in a separate document and then linked to it.
Syntax:
The basic syntax of a document type declaration is as follows −
<!DOCTYPE element DTD identifier
[
declaration1
declaration2 ........
]>
LECTURE-5
In the above syntax,
•The declaration starts with the <!DOCTYPE delimiter.
•element names the root element of the document, from which
the parser starts.
•DTD identifier is an identifier for the document type definition,
which may be the path to a file on the system or the URL of a file on
the internet. If the DTD points to an external path, it is called the
External Subset.
•The square brackets [ ] enclose an optional list of markup
declarations called the Internal Subset.
Internal DTD
A DTD is referred to as an internal DTD if elements are declared within the
XML file. When all declarations are internal, the standalone attribute in the XML declaration
can be set to yes. This means the document works independently of any
external source.
Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root-element
[
element-declarations
]>
where root-element is the name of root element and element-declarations is
where you declare the elements.
Example
Following is a simple example of internal DTD −
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
<!DOCTYPE address
[
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address> <name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
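Whether a DTD is actually enforced depends on the parser. As a cautionary sketch, the Python code below feeds a document that violates its own internal DTD to the standard xml.etree.ElementTree module, which is non-validating and therefore accepts it. The broken document (with company left out) and the hand-written check are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The DTD demands name, company, phone; this document omits <company>.
invalid_but_well_formed = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE address [
  <!ELEMENT address (name,company,phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<address>
  <name>Tanmay Patil</name>
  <phone>(011) 123-4567</phone>
</address>"""

root = ET.fromstring(invalid_but_well_formed)   # no error: the DTD is not enforced
found = [child.tag for child in root]
print(found)

# A hand-written check stands in for what a validating parser would report.
expected = ["name", "company", "phone"]
print(found == expected)
```

The document is well-formed but not valid, which is why validation has to be switched on explicitly or done with a validating parser.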
External DTD
In an external DTD, elements are declared outside the XML file. They are
accessed through a system identifier, which may be either the path to a .dtd
file or a valid URL. For a document that depends on an external DTD, the standalone attribute in the
XML declaration is set to no. This means the document includes
information from the external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with .dtd extension.
Example
The following example shows external DTD usage −
<?xml version = "1.0" encoding = "UTF-8" standalone = "no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address> <name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
The content of the DTD file address.dtd is as shown −
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
XML Schema Basics
• XML Schema is an XML application
• Provides simple types (string, integer, dateTime, duration, language,
…)
• Allows defining possible values for elements
• Allows defining types derived from existing types
• Allows defining complex types
• Allows posing constraints on the occurrence of elements
• Allows forcing uniqueness and foreign keys
• Way too complex to cover in an introductory talk
XML Schema Example
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="article">
<xs:complexType>
<xs:sequence>
<xs:element name="author" type="xs:string"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="text">
<xs:complexType>
<xs:sequence>
<xs:element name="abstract" type="xs:string"/>
<xs:element name="section" type="xs:string"
minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
LECTURE-6
Querying XML with XPath and
XQuery
XPath and XQuery are query languages for XML data, both
standardized by the W3C and supported by various database
products.
Their search capabilities include
• logical conditions over element and attribute content
(first-order predicate logic a la SQL; simple conditions only in
XPath)
• regular expressions for pattern matching of element names
along paths or subtrees within XML data
• joins, grouping, aggregation, transformation, etc. (XQuery
only)
XPath
• XPath is a simple language to identify parts of the XML
document (for further processing)
• XPath operates on the tree representation of the document
• Result of an XPath expression is a set of elements or
attributes
• These slides discuss the abbreviated syntax of XPath
Elements of XPath
• An XPath expression usually is a location path that consists of
location steps, separated by /:
/article/text/abstract: selects all abstract elements
• A leading / means that the path starts at the root of the document
• Each location step is evaluated in the context of a node in the tree,
the so-called context node
• Possible location steps:
child element x: select all child elements with name x
Attribute @x: select all attributes with name x
Wildcards * (any child), @* (any attribute)
Multiple matches, separated by |: x|y|z
XPath by Example
/literature/book/author retrieves all book authors:
starting with the root, traverses the tree, matches element
names literature, book, author, and returns elements
<author>Suciu, Dan</author>,
<author>Abiteboul, Serge</author>, ...,
<author><firstname>Jeff</firstname>
<lastname>Ullman</lastname></author>
/literature/*/author                            authors of books, articles, essays, etc.
/literature//author                             authors that are descendants of literature
/literature//@year                              value of the year attribute of descendants of literature
/literature//author[firstname]                  authors that have a subelement firstname
/literature/(book|article)/author               authors of books or articles
/literature/book[price < "50"]                  low priced books
/literature/book[author//country = "Germany"]   books with German author
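Some of these paths can be tried out with Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. In this sketch the paths are written relative to the literature root element, comparisons such as price < 50 are done in Python because the subset lacks them, and the sample document (including the year and price values) is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data; years and prices are invented for the example.
literature = ET.fromstring("""<literature>
  <book year="1999">
    <price>45</price>
    <author>Suciu, Dan</author>
    <author>Abiteboul, Serge</author>
  </book>
  <book year="2002">
    <price>80</price>
    <author><firstname>Jeff</firstname><lastname>Ullman</lastname></author>
  </book>
  <article year="1998">
    <author>Weikum, Gerhard</author>
  </article>
</literature>""")

book_authors = literature.findall("book/author")             # /literature/book/author
any_authors = literature.findall("*/author")                 # /literature/*/author
all_authors = literature.findall(".//author")                # /literature//author
with_firstname = literature.findall(".//author[firstname]")  # predicate on a child
years = [e.get("year") for e in literature.findall(".//*[@year]")]

# Comparisons such as price < 50 are outside the subset, so filter in Python.
cheap_books = [b for b in literature.findall("book")
               if float(b.findtext("price")) < 50]

print(len(book_authors), len(any_authors), len(all_authors))
print(len(with_firstname), years, len(cheap_books))
```

A full XPath engine would evaluate the original expressions directly; the subset used here is enough to show how location steps, wildcards, and simple predicates select nodes from the tree.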
XQuery
XQuery is an extremely powerful query language for XML data.
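A query has the form of a so-called FLWR expression (FLWOR in the final W3C
standard, which adds an ORDER BY clause):
FOR $var1 IN expr1, $var2 IN expr2, ...
LET $var3 := expr3, $var4 := expr4, ...
WHERE condition
RETURN result-doc-construction
The keywords are shown in upper case here for emphasis; standard XQuery writes them in lower case.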
The FOR clause evaluates expressions (which may be XPath-style
path expressions) and binds the resulting elements to variables.
For a given binding each variable denotes exactly one element.
The LET clause binds entire sequences of elements to variables.
The WHERE clause evaluates a logical condition with each of
the possible variable bindings and selects those bindings that
satisfy the condition.
The RETURN clause constructs, from each of the variable bindings,
an XML result tree. This may involve grouping and aggregation
and even complete subqueries.
XQuery Examples
(: find Web-related articles by Dan Suciu from the year 1998 :)
<results> {
for $a in doc("literature.xml")//article
for $n in $a//author, $t in $a/title
where $a/@year = "1998"
and contains($n, "Suciu") and contains($t, "Web")
return <result> { $n } { $t } </result> } </results>
(: find articles co-authored by authors who have jointly written a book after
1995 :)
<results> {
for $a in doc("literature.xml")//article
for $a1 in $a//author, $a2 in $a//author
where some $b in doc("literature.xml")//book satisfies
($b//author = $a1 and $b//author = $a2 and $b/@year > "1995")
return <result> { $a1 } { $a2 } <wrote> { $a } </wrote> </result> } </results>
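For readers without an XQuery processor at hand, the first query can be imitated with nested loops in Python, which makes the roles of the FOR, WHERE, and RETURN clauses visible. This is only an analogy built on the standard xml.etree.ElementTree module, not an XQuery implementation; the sample document and its titles are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for literature.xml.
literature = ET.fromstring("""<literature>
  <article year="1998">
    <author>Dan Suciu</author>
    <title>Sample Web Paper</title>
  </article>
  <article year="1998">
    <author>Gerhard Weikum</author>
    <title>Another Web Paper</title>
  </article>
  <article year="2001">
    <author>Dan Suciu</author>
    <title>Later Web Paper</title>
  </article>
</literature>""")

results = ET.Element("results")
for a in literature.iter("article"):          # FOR $a IN ...//article
    for n in a.iter("author"):                # FOR $n IN $a//author
        for t in a.findall("title"):          #     $t IN $a/title
            if (a.get("year") == "1998"       # WHERE ...
                    and "Suciu" in (n.text or "")
                    and "Web" in (t.text or "")):
                result = ET.SubElement(results, "result")   # RETURN <result>
                ET.SubElement(result, "author").text = n.text
                ET.SubElement(result, "title").text = t.text

print(ET.tostring(results, encoding="unicode"))
```

Each combination of variable bindings that passes the condition contributes one result element, which mirrors how the RETURN clause is evaluated once per surviving binding.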
References
1. https://www.tutorialspoint.com/xml/xml_schemas.htm
2. Ralf Schenkel, "XML for Beginners", Organizing and Searching
Information with XML, April 2003
3. W3Schools Online Web Tutorials, http://www.w3schools.com
THANK YOU

More Related Content

Similar to XML: An Overview.pptx

Introduction to xml
Introduction to xmlIntroduction to xml
Introduction to xml
soumya
 
Week1 xml
Week1 xmlWeek1 xml
Week1 xml
hapy
 

Similar to XML: An Overview.pptx (20)

Xml in bio medical field
Xml in bio medical fieldXml in bio medical field
Xml in bio medical field
 
XML-INTRODUCTION.pdf
XML-INTRODUCTION.pdfXML-INTRODUCTION.pdf
XML-INTRODUCTION.pdf
 
Xml tutorial
Xml tutorialXml tutorial
Xml tutorial
 
XML Introduction
XML IntroductionXML Introduction
XML Introduction
 
Web Technologies Unit 2 Print.pdf
Web Technologies Unit 2 Print.pdfWeb Technologies Unit 2 Print.pdf
Web Technologies Unit 2 Print.pdf
 
Markup For Dummies (Russ Ward)
Markup For Dummies (Russ Ward)Markup For Dummies (Russ Ward)
Markup For Dummies (Russ Ward)
 
Introduction to xml
Introduction to xmlIntroduction to xml
Introduction to xml
 
XML, XML Databases and MPEG-7
XML, XML Databases and MPEG-7XML, XML Databases and MPEG-7
XML, XML Databases and MPEG-7
 
XML.pptx
XML.pptxXML.pptx
XML.pptx
 
WEB TECHNOLOGIES XML
WEB TECHNOLOGIES XMLWEB TECHNOLOGIES XML
WEB TECHNOLOGIES XML
 
00 introduction
00 introduction00 introduction
00 introduction
 
93 peter butterfield
93 peter butterfield93 peter butterfield
93 peter butterfield
 
XML
XMLXML
XML
 
IT6801-Service Oriented Architecture- UNIT-I notes
IT6801-Service Oriented Architecture- UNIT-I notesIT6801-Service Oriented Architecture- UNIT-I notes
IT6801-Service Oriented Architecture- UNIT-I notes
 
Xml
XmlXml
Xml
 
Unit 5 xml (1)
Unit 5   xml (1)Unit 5   xml (1)
Unit 5 xml (1)
 
xml
xmlxml
xml
 
IT6801-Service Oriented Architecture
IT6801-Service Oriented ArchitectureIT6801-Service Oriented Architecture
IT6801-Service Oriented Architecture
 
Week1 xml
Week1 xmlWeek1 xml
Week1 xml
 
xml introduction in web technologies subject
xml introduction in web technologies subjectxml introduction in web technologies subject
xml introduction in web technologies subject
 

Recently uploaded

Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..
MaherOthman7
 
ALCOHOL PRODUCTION- Beer Brewing Process.pdf
ALCOHOL PRODUCTION- Beer Brewing Process.pdfALCOHOL PRODUCTION- Beer Brewing Process.pdf
ALCOHOL PRODUCTION- Beer Brewing Process.pdf
Madan Karki
 
Complex plane, Modulus, Argument, Graphical representation of a complex numbe...
Complex plane, Modulus, Argument, Graphical representation of a complex numbe...Complex plane, Modulus, Argument, Graphical representation of a complex numbe...
Complex plane, Modulus, Argument, Graphical representation of a complex numbe...
MohammadAliNayeem
 

Recently uploaded (20)

Electrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission lineElectrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission line
 
Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1
 
5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...5G and 6G refer to generations of mobile network technology, each representin...
5G and 6G refer to generations of mobile network technology, each representin...
 
How to Design and spec harmonic filter.pdf
How to Design and spec harmonic filter.pdfHow to Design and spec harmonic filter.pdf
How to Design and spec harmonic filter.pdf
 
Intelligent Agents, A discovery on How A Rational Agent Acts
Intelligent Agents, A discovery on How A Rational Agent ActsIntelligent Agents, A discovery on How A Rational Agent Acts
Intelligent Agents, A discovery on How A Rational Agent Acts
 

XML: An Overview.pptx

  • 1. Teaching Innovation - Entrepreneurial - Global The Centre for Technology enabled Teaching & Learning , N Y S S, India DTEL(Department for Technology Enhanced Learning) © By Ganesh K. Yenurkar
  • 2. DEPARTMENT OF COMPUTER TECHNOLOGY, III-SEMESTER. XML: AN OVERVIEW. AUTHOR: MR. GANESH K. YENURKAR, ASSISTANT PROFESSOR, YESHWANTRAO CHAVAN COLLEGE OF ENGINEERING, NAGPUR-441110 (MS)
  • 3. PREFACE
     As educators, we share a common goal: "to guide our students" so that they gain as much as possible in a positive environment that promotes their success and instills in them the desire to learn. One of the best tools available to us in this pursuit is PPT instruction that is systematic and supports self-learning. The goal of this PPT is to help teachers use eLearning as an effective and efficient method for teaching our students. It has been developed purely for academic and non-commercial purposes.
     Our desire in preparing this PPT is to support teachers, who have the very demanding task of following a teaching plan to deliver instruction on a lecture/period basis. The PPT is therefore prepared lecture-wise. Questions are also included at the end of each chapter for practice.
     We begin here with a tour of XML: An Overview. This PPT is aimed at XML beginners who want to build a career in, or gain hands-on experience with, XML technologies. It covers the introduction to XML, its purpose, how to use it, XML syntax and more.
     With deep regards and humility, we thank both our Management of MGI for motivating and our CEO for strong follow-ups to prepare PPTs under . We dedicate this PPT to students and our shared profession.
     G. K. Yenurkar
  • 4. COURSE OBJECTIVES
     1. To learn how XML and its related technologies function and how they facilitate integration between applications
     2. To master and integrate the core syntax of XML, DTD, and XML Schema
     3. To understand the fundamentals of XSL
  • 5. COURSE OUTCOMES
     On successful completion of this course the student should be able to:
     1. Design and code data transfer scripts using XML languages for the transfer of data over business networks and the Internet.
     2. Validate XML documents with Document Type Definitions and schemas according to industry standards.
     3. Transform various data formats such as text and images so that this information can be transferred to and from server storage devices on business and health care networks and the Internet.
     4. Validate XML code and associated DTDs and schemas using an XML editing tool so that the XML code can be used within the business and health care industries.
  • 6. Contents
     1. Introduction to XML
     2. Document Type Definition
     3. Querying XML with XPath and XQuery
  • 7. Introduction to XML
  • 8. LECTURE-1. XML - Introduction
     • XML stands for Extensible Markup Language. It is a text-based markup language derived from the Standard Generalized Markup Language (SGML).
     • XML tags identify the data and are used to store and organize it, whereas HTML tags specify how the data is displayed. XML is not going to replace HTML in the near future, but it introduces new possibilities by adopting many successful features of HTML.
     • Three characteristics of XML make it useful in a variety of systems and solutions.
  • 9. The three characteristics:
     • XML is extensible: you can create your own self-descriptive tags, or an entire language, to suit your application.
     • XML carries the data but does not present it: data is stored independently of how it will be presented.
     • XML is a public standard: it was developed by the World Wide Web Consortium (W3C) and is available as an open standard.
  • 10. Usage of XML
     XML is used in the following ways:
     • XML can work behind the scenes to simplify the creation of HTML documents for large web sites.
     • XML can be used to exchange information between organizations and systems.
     • XML can be used for offloading and reloading databases.
     • XML can be used to store and arrange data in a way that is customized to your data handling needs.
     • XML can easily be merged with style sheets to create almost any desired output.
     • Virtually any type of data can be expressed as an XML document.
  • 11. What do you mean by XML Markup?
     • XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. So what exactly is a markup language? Markup is information added to a document that enhances its meaning in certain ways, in that it identifies the parts and how they relate to each other.
     • More specifically, a markup language is a set of symbols that can be placed in the text of a document to demarcate and label the parts of that document.
  • 12. The following example shows how XML markup looks when embedded in a piece of text:
       <My_message>
         <text>Hello, Friends</text>
       </My_message>
     Explanation: this snippet includes markup symbols, or tags, such as <My_message>...</My_message> and <text>...</text>. The tags <My_message> and </My_message> mark the start and the end of the XML fragment. The tags <text> and </text> surround the text Hello, Friends.
  • 13. Is XML a Programming Language?
     No. A programming language consists of grammar rules and its own vocabulary, which are used to create computer programs. These programs instruct the computer to perform specific tasks. XML does not qualify as a programming language because it does not perform any computation or express algorithms. It is usually stored in a simple text file and is processed by software that is capable of interpreting XML.
  • 14. XML - Syntax
     The following is a complete XML document:
       <?xml version = "1.0"?>
       <contact>
         <name>Ganesh Yenurkar</name>
         <address>Nagpur</address>
         <company>YCCE</company>
         <phone>0000000000</phone>
       </contact>
     There are two kinds of information in the above example:
     • Markup tags, like <contact>
     • The text, or character data, such as YCCE and 0000000000
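The split between markup and character data can be seen by loading the contact document into a parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the variable names are chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# The contact document from the slide, held in a string for illustration.
CONTACT_XML = """<?xml version="1.0"?>
<contact>
  <name>Ganesh Yenurkar</name>
  <address>Nagpur</address>
  <company>YCCE</company>
  <phone>0000000000</phone>
</contact>"""

root = ET.fromstring(CONTACT_XML)

# Markup: the element names (tags) of the root and its children.
tags = [root.tag] + [child.tag for child in root]

# Character data: the text enclosed by each child element.
texts = [child.text.strip() for child in root]

for child in root:
    print(child.tag, "->", child.text.strip())
```

The parser hands back the tags and the text separately, which is the same distinction the slide draws between markup and character data.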
  • 15. [Diagram: the syntax rules for writing the different types of markup and text in an XML document.]
  • 16. XML Declaration
     An XML document can optionally begin with an XML declaration, written as follows:
       <?xml version = "1.0" encoding = "UTF-8"?>
     Here version is the XML version and encoding specifies the character encoding used in the document. The declaration is part of the document's "prolog".
     Syntax rules for the XML declaration:
     • The XML declaration is case sensitive and must begin with "<?xml", where "xml" is written in lower case.
     • If the document contains an XML declaration, it must be the very first thing in the document; not even white space may precede it.
     • Encoding information supplied by a transport protocol such as HTTP can override the value of encoding given in the XML declaration.
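The "must come first" rule is easy to check with a parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper `parses` and the sample byte strings are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

def parses(data: bytes) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# The declaration is optional: both of these documents are accepted.
with_decl = b'<?xml version="1.0" encoding="UTF-8"?><note>ok</note>'
without_decl = b"<note>ok</note>"

# A declaration that is not the very first thing in the document is rejected,
# even if only a single newline precedes it.
decl_not_first = b"\n" + with_decl

print(parses(with_decl), parses(without_decl), parses(decl_not_first))
```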
  • 17. XML is not…
     • A replacement for HTML (but HTML can be generated from XML)
     • A presentation format (but XML can be converted into one)
     • A programming language (but it can be used with almost any language)
     • A network transfer protocol (but XML may be transferred over a network)
     • A database (but XML may be stored in a database)
  • 18. But XML Is…
     • XML is a meta markup language for text documents / textual data
     • XML allows you to define languages ("applications") to represent text documents / textual data
  • 19. XML by Example
       <article>
         <author>GANESH YENURKAR</author>
         <title>WEB TECHNOLOGY</title>
       </article>
     • Easy for human readers to understand
     • Very expressive (the semantics are given along with the data)
     • Well structured, easy to read and write from programs
     This looks nice, but…
  • 20. … this is XML, too:
       <t108>
         <x87>GANESH YENURKAR</x87>
         <g10>WEB TECHNOLOGY</g10>
       </t108>
     • Very hard for human readers to understand
     • Not expressive (no semantics are given along with the data)
     • Well structured, easy to read and write from programs
  • 21. … and what about this XML document:
       <data>
         ch37fhgks73j5mv9d63h5mgfkds8d984lgnsmcns983
       </data>
     • Impossible for human readers to understand
     • Not expressive (no semantics along with the data)
     • Unstructured; can be read and written only with special programs
     Note: the actual benefit of using XML depends heavily on the design of the application.
  • 22. LECTURE-2. Advantages of Using XML
     • Truly portable data
     • Easily readable by human users
     • Very expressive (semantics near data)
     • Very flexible and customizable (no finite tag set)
     • Easy to use from programs (libraries available)
     • Easy to convert into other representations (XML transformation languages)
     • Many additional standards and tools
     • Widely used and supported
  • 23. App. Scenario: Data Exchange
     [Diagram: a supplier and a buyer, each running a legacy system (e.g., SAP R/2 or Cobol) behind an XML adapter, exchange an order as XML (BMECat, ebXML, RosettaNet, BizTalk, …).]
  • 24. App. Scenario: XML for Metadata
     (namespace declarations omitted)
       <rdf:RDF>
         <rdf:Description rdf:about="http://www-dbs/Sch03.pdf">
           <dc:title>A Framework for…</dc:title>
           <dc:creator>Ralf Schenkel</dc:creator>
           <dc:description>While there are...</dc:description>
           <dc:publisher>Saarland University</dc:publisher>
           <dc:subject>XML Indexing</dc:subject>
           <dc:rights>Copyright ...</dc:rights>
           <dc:type>Electronic Document</dc:type>
           <dc:format>application/pdf</dc:format>
           <dc:language>en</dc:language>
         </rdf:Description>
       </rdf:RDF>
  • 25. App. Scenario: Document Markup
       <article>
         <section id="01" title="XML: An Overview">
           This article is about <index>XML</index>.
         </section>
         <section id="02" title="Main Results">
           <name>Weikum</name> <cite idref="Weik01"/> shows the following theorem
           (see Section <ref idref="01"/>)
           <theorem id="theo:1" source="Weik01">
             For any XML document x, ...
           </theorem>
         </section>
         <literature>
           <cite id="Weik01"><author>Weikum</author></cite>
         </literature>
       </article>
  • 26. Document markup adds structural and semantic information to documents, e.g. sections, subsections, theorems, …
     1. Cross references
     2. Literature citations
     3. Index entries
     4. Named entities
     This allows queries like "Which articles cite Weikum's XML paper from 2001?" and "Which articles talk about (the named entity) 'Weikum'?"
  • 27. XML Documents
     What's in an XML document? It is a set of
     1. Elements
     2. Attributes
     plus some other details.
  • 28. A Simple XML Document
       <article>
         <author>Gerhard Weikum</author>
         <title>The Web in Ten Years</title>
         <text>
           <abstract>In order to evolve...</abstract>
           <section number="1" title="Introduction">
             The <index>Web</index> provides the universal...
           </section>
         </text>
       </article>
  • 29. In the document above, the tag names (article, author, title, text, abstract, section, index) are freely definable.
  • 30. LECTURE-3. What is Markup?
     • (Freely definable) tags: article, title, author, with start tag <article> etc. and end tag </article> etc.
     • Elements: <article> ... </article>
     • Elements have a name (article) and a content (...)
     • Elements may be nested.
     • Elements may be empty: <this_is_empty/>
     • Element content is typically parsed character data (PCDATA), i.e., strings with special characters, and/or nested elements (mixed content if both).
     • Each XML document has exactly one root element and forms a tree.
     • Elements with a common parent are ordered.
  • 31. Elements vs. Attributes
     Elements may have attributes (in the start tag) that have a name and a value, e.g. <section number="1">.
     What is the difference between elements and attributes?
     • Only one attribute with a given name per element (but an arbitrary number of subelements)
     • Attributes have no structure, they are simply strings (while elements can have subelements)
     As a rule of thumb:
     • Content into elements
     • Metadata into attributes
     Example:
       <person born="1912-06-23" died="1954-06-07">Alan Turing</person> proved that…
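The rule of thumb (content into elements, metadata into attributes) shows up directly in how a parser exposes the two. This is a small sketch with Python's standard `xml.etree.ElementTree`, using the Alan Turing element from the slide; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    '<person born="1912-06-23" died="1954-06-07">Alan Turing</person>'
)

# Content lives in the element itself.
content = person.text

# Metadata lives in the attributes: an unordered name -> string mapping,
# with at most one value per attribute name.
metadata = dict(person.attrib)

print(content, metadata["born"], metadata["died"])
```

Because attributes are plain strings keyed by name, anything that needs internal structure or repetition is usually better modeled as a subelement.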
  • 32. XML Parsing
     [Diagram: the parsed document as a tree]
       article
         author: Gerhard Weikum
         title: The Web in 10 years
         text
           abstract: In order …
           section (number="1", title="…")
             "The"
             index: Web
             "provides …"
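A parser builds this tree for you. The sketch below, assuming Python's standard `xml.etree.ElementTree`, parses the article document and walks the resulting tree; the helper `outline` is a hypothetical name introduced for this illustration.

```python
import xml.etree.ElementTree as ET

ARTICLE_XML = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <text>
    <abstract>In order to evolve...</abstract>
    <section number="1" title="Introduction">The <index>Web</index> provides the universal...</section>
  </text>
</article>"""

def outline(elem, depth=0):
    """Return one line per element: indentation, tag name, and attributes (if any)."""
    label = elem.tag + (" " + str(elem.attrib) if elem.attrib else "")
    lines = ["  " * depth + label]
    for child in elem:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(ARTICLE_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))

# Mixed content: text before the first child is stored in .text,
# text following a child element is stored in that child's .tail.
section = root.find("text/section")
index = section.find("index")
```

Note how ElementTree represents the mixed content of section: "The " is `section.text`, "Web" is `index.text`, and the remainder of the sentence is `index.tail`.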
  • 33. Special characters used in XML
     Some special characters must be escaped using entities (they are converted back when the XML document is read):
       <  →  &lt;
       &  →  &amp;
     Some other characters may be escaped, too:
       >  →  &gt;
       "  →  &quot;
       '  →  &apos;
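Escaping and the automatic conversion back on reading can be tried out with Python's standard library. This is a minimal sketch using `xml.sax.saxutils.escape` and `xml.etree.ElementTree`; the sample strings are made up for the illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = "if a < b & b > c"

# escape() replaces &, < and > with their entities.
escaped = escape(raw)

# Wrapping the escaped text in an element gives well-formed XML;
# the parser converts the entities back when reading.
round_trip = ET.fromstring("<code>" + escaped + "</code>").text

# Additional characters can be escaped by passing an extra mapping.
quoted = escape('say "hi"', {'"': "&quot;"})

print(escaped)
print(round_trip)
print(quoted)
```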
  • 34. Well-Formed XML Documents
     A well-formed document must adhere to, among others, the following rules:
     • Every start tag has a matching end tag.
     • Elements may nest, but must not overlap.
     • There must be exactly one root element.
     • Attribute values must be quoted.
     • An element may not have two attributes with the same name.
     • Comments and processing instructions may not appear inside tags.
     • No unescaped < or & signs may occur inside character data.
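Each of these rules can be exercised against a real parser. The following sketch, assuming Python's standard `xml.etree.ElementTree`, defines a hypothetical helper `is_well_formed` and feeds it one violating document per rule; the sample documents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

checks = {
    "valid document":            '<a x="1"><b>text</b><c/></a>',
    "missing end tag":           "<a><b>text</a>",
    "overlapping elements":      "<a><b></a></b>",
    "two root elements":         "<a/><b/>",
    "unquoted attribute value":  "<a x=1/>",
    "duplicate attribute":       '<a x="1" x="2"/>',
    "comment inside a tag":      "<a <!-- note --> />",
    "unescaped ampersand":       "<a>R&D</a>",
}

results = {name: is_well_formed(doc) for name, doc in checks.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

Only the first document is accepted; every other entry breaks exactly one of the rules listed above.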
  • 35. Namespace Syntax
       <dbs:book xmlns:dbs="http://www-dbs/dbs">
     • xmlns: signals that a namespace definition happens
     • dbs: the prefix, used as an abbreviation of the URI
     • http://www-dbs/dbs: a unique URI that identifies the namespace
  • 36. Namespace Example
       <dbs:book xmlns:dbs="http://www-dbs/dbs">
         <dbs:description> ... </dbs:description>
         <dbs:text>
           <dbs:formula>
             <mathml:math xmlns:mathml="http://www.w3.org/1998/Math/MathML">
               ...
             </mathml:math>
           </dbs:formula>
         </dbs:text>
       </dbs:book>
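To a parser, the prefix is only shorthand: what identifies an element is the namespace URI plus the local name. The sketch below uses Python's standard `xml.etree.ElementTree`; the prefix map `NS` and its short prefix `m` are choices made for this illustration and need not match the prefixes used in the document.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<dbs:book xmlns:dbs="http://www-dbs/dbs">
  <dbs:description>...</dbs:description>
  <dbs:text>
    <dbs:formula>
      <mathml:math xmlns:mathml="http://www.w3.org/1998/Math/MathML">...</mathml:math>
    </dbs:formula>
  </dbs:text>
</dbs:book>"""

# Prefixes used in queries are local to this program; only the URIs matter.
NS = {
    "dbs": "http://www-dbs/dbs",
    "m": "http://www.w3.org/1998/Math/MathML",
}

book = ET.fromstring(BOOK_XML)

# ElementTree stores qualified names as "{namespace-uri}local-name".
print(book.tag)

# Path queries take a prefix -> URI mapping.
math = book.find("dbs:text/dbs:formula/m:math", NS)
print(math.tag)
```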
  • 38. LECTURE-4. Document Type Definitions
     Sometimes XML is too flexible:
     • Most programs can only process a subset of all possible XML applications
     • For exchanging data, the format (i.e., elements, attributes and their semantics) must be fixed
     Document Type Definitions (DTDs) establish the vocabulary for one XML application (in some sense comparable to schemas in databases).
     A document is valid with respect to a DTD if it conforms to the rules specified in that DTD. Most XML parsers can be configured to validate documents.
  • 39. DTD Example: Elements
       <!ELEMENT article    (title,author+,text)>
       <!ELEMENT title      (#PCDATA)>
       <!ELEMENT author     (#PCDATA)>
       <!ELEMENT text       (abstract,section*,literature?)>
       <!ELEMENT abstract   (#PCDATA)>
       <!ELEMENT section    (#PCDATA|index)*>
       <!ELEMENT literature (#PCDATA)>
       <!ELEMENT index      (#PCDATA)>
     • Content of the title element is parsed character data.
     • Content of the article element is a title element, followed by one or more author elements, followed by a text element.
     • Content of the text element may contain zero or more section elements in this position.
  • 40. Element Declarations in DTDs
     One element declaration for each element type:
       <!ELEMENT element_name content_specification>
     where content_specification can be
       (#PCDATA)      parsed character data
       (child)        one child element
       (c1,…,cn)      a sequence of child elements c1…cn
       (c1|…|cn)      one of the elements c1…cn
     For each component c, possible counts can be specified:
       c              exactly one such element
       c+             one or more
       c*             zero or more
       c?             zero or one
     Plus arbitrary combinations using parentheses:
       <!ELEMENT f ((a|b)*,c+,(d|e))*>
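A content specification behaves like a regular expression over the sequence of child element names. The sketch below makes that analogy concrete in Python; it is not a DTD validator, and it assumes single-letter element names (as in the declaration for f above) so that each child maps to one character. The function names are hypothetical.

```python
import re

def content_model_to_regex(model):
    """Translate a DTD content model over single-letter element names into a regex.

    In a DTD ',' means "followed by"; in a regex plain concatenation does that,
    so the commas are simply dropped. '|', '*', '+', '?' and parentheses already
    mean the same thing in both notations.
    """
    return re.compile(model.replace(",", "").replace(" ", ""))

def children_match(model, children):
    """Check whether a list of child element names satisfies the content model."""
    return content_model_to_regex(model).fullmatch("".join(children)) is not None

MODEL = "((a|b)*,c+,(d|e))*"

print(children_match(MODEL, ["a", "b", "c", "d"]))   # (a|b)* then c+ then d
print(children_match(MODEL, []))                     # the outer * allows no children
print(children_match(MODEL, ["a", "b"]))             # fails: at least one c is required
```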
  • 41. Element Declarations
     Elements with mixed content:
       <!ELEMENT text (#PCDATA|index|cite|glossary)*>
     Elements with empty content:
       <!ELEMENT image EMPTY>
     Elements with arbitrary content (not suitable for production-level DTDs):
       <!ELEMENT thesis ANY>
  • 42. Attribute Declarations in DTDs
     Attributes are declared per element:
       <!ATTLIST section number CDATA #REQUIRED
                         title  CDATA #REQUIRED>
     This declares two required attributes for the element section. The declaration starts with the element name (section); each attribute is then given by its attribute name (number), attribute type (CDATA) and attribute default (#REQUIRED).
  • 43. Attribute Declarations in DTDs: possible attribute defaults
       #REQUIRED        the attribute is required in each element instance
       #IMPLIED         the attribute is optional
       #FIXED default   the attribute always has this default value
       default          the attribute has this default value if it is omitted from the element instance
  • 44. Attribute Types in DTDs
       CDATA       string data
       (A1|…|An)   enumeration of all possible values of the attribute (each is an XML name)
       ID          unique XML name to identify the element
       IDREF       refers to the ID attribute of some other element ("intra-document link")
       IDREFS      list of IDREF, separated by white space
     plus some more
  • 45. Attribute Examples
       <!ATTLIST publication type (journal|inproceedings) #REQUIRED
                             pubid ID #REQUIRED>
       <!ATTLIST cite        cid IDREF #REQUIRED>
       <!ATTLIST citation    ref IDREF #IMPLIED
                             cid ID #REQUIRED>

       <publications>
         <publication type="journal" pubid="Weikum01">
           <author>Gerhard Weikum</author>
           <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
           <citation cid="c12" ref="XML98"/>
           <citation cid="c15">...</citation>
         </publication>
         <publication type="inproceedings" pubid="XML98">
           <text>XML, the Extensible Markup Language, ...</text>
         </publication>
       </publications>
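What ID and IDREF buy you is link integrity: every ID is unique and every IDREF points at an existing ID. Python's standard `xml.etree.ElementTree` does not validate against a DTD, so the sketch below re-implements just that one check by hand. The tables `ID_ATTRS` and `IDREF_ATTRS` mirror the ATTLIST declarations above and, like `check_links`, are names introduced for this illustration. The ID values are written as c12 and c15 because ID values must be XML names, which cannot start with a digit.

```python
import xml.etree.ElementTree as ET

PUBLICATIONS_XML = """<publications>
  <publication type="journal" pubid="Weikum01">
    <author>Gerhard Weikum</author>
    <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
    <citation cid="c12" ref="XML98"/>
    <citation cid="c15">...</citation>
  </publication>
  <publication type="inproceedings" pubid="XML98">
    <text>XML, the Extensible Markup Language, ...</text>
  </publication>
</publications>"""

# Which attribute of which element is declared as ID / IDREF in the DTD.
ID_ATTRS = {"publication": "pubid", "citation": "cid"}
IDREF_ATTRS = {"cite": "cid", "citation": "ref"}

def check_links(root):
    """Return a list of problems: duplicate IDs and IDREFs without a target."""
    ids, errors = set(), []
    for elem in root.iter():
        attr = ID_ATTRS.get(elem.tag)
        if attr and attr in elem.attrib:
            value = elem.attrib[attr]
            if value in ids:
                errors.append(f"duplicate ID {value!r}")
            ids.add(value)
    for elem in root.iter():
        attr = IDREF_ATTRS.get(elem.tag)
        if attr and attr in elem.attrib and elem.attrib[attr] not in ids:
            errors.append(f"dangling IDREF {elem.attrib[attr]!r} on <{elem.tag}>")
    return errors

ok_errors = check_links(ET.fromstring(PUBLICATIONS_XML))
broken = PUBLICATIONS_XML.replace('ref="XML98"', 'ref="Missing"')
broken_errors = check_links(ET.fromstring(broken))
print(ok_errors, broken_errors)
```

A validating parser performs this check (and the rest of the DTD rules) automatically; the manual version is only meant to show what the ID and IDREF types guarantee.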
  • 46. XML - DTDs
     • The XML Document Type Definition, commonly known as DTD, is a way to describe an XML language precisely. DTDs check the vocabulary and the validity of the structure of XML documents against the grammatical rules of the appropriate XML language.
     • An XML DTD can either be specified inside the document, or it can be kept in a separate document and then linked to it.
     Syntax: the basic syntax of a DTD is as follows:
       <!DOCTYPE element DTD identifier
       [
         declaration1
         declaration2
         ........
       ]>
  • 47. LECTURE-5. DTD Syntax (continued)
     In the above syntax:
     • The DTD starts with the <!DOCTYPE delimiter.
     • element names the root element of the document; it tells the parser to parse the document starting from that root element.
     • DTD identifier is an identifier for the document type definition, which may be the path to a file on the system or the URL of a file on the internet. If the DTD points to an external path, it is called the external subset.
     • The square brackets [ ] enclose an optional list of markup declarations (elements, attributes, entities) called the internal subset.
  • 48. Internal DTD
     A DTD is referred to as an internal DTD if the elements are declared within the XML file itself. In that case the standalone attribute in the XML declaration can be set to yes, which signals that the document does not depend on declarations from an external source.
     Syntax: the syntax of an internal DTD is as follows:
       <!DOCTYPE root-element [element-declarations]>
     where root-element is the name of the root element and element-declarations is where you declare the elements.
  • 49. Internal DTD: Example
     The following is a simple example of an internal DTD:
       <?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
       <!DOCTYPE address [
         <!ELEMENT address (name,company,phone)>
         <!ELEMENT name (#PCDATA)>
         <!ELEMENT company (#PCDATA)>
         <!ELEMENT phone (#PCDATA)>
       ]>
       <address>
         <name>Tanmay Patil</name>
         <company>TutorialsPoint</company>
         <phone>(011) 123-4567</phone>
       </address>
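Declaring a DTD does not by itself enforce it: validation only happens if the parser is a validating one and is asked to validate. Python's standard `xml.etree.ElementTree` is non-validating, which the sketch below demonstrates by parsing one document that follows the internal DTD and one that does not. The manual child-order check at the end stands in for what a validating parser would do for the single rule `address (name,company,phone)`; all variable names are illustrative.

```python
import xml.etree.ElementTree as ET

INTERNAL_DTD = """<!DOCTYPE address [
  <!ELEMENT address (name,company,phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>"""

valid_doc = INTERNAL_DTD + """
<address>
  <name>Tanmay Patil</name>
  <company>TutorialsPoint</company>
  <phone>(011) 123-4567</phone>
</address>"""

# Violates the DTD: wrong order and no phone element. Still well-formed.
invalid_doc = INTERNAL_DTD + """
<address>
  <company>TutorialsPoint</company>
  <name>Tanmay Patil</name>
</address>"""

EXPECTED_CHILDREN = ["name", "company", "phone"]  # from the content model above

def follows_content_model(document):
    """Parse the document and compare the child sequence with the DTD's rule by hand."""
    root = ET.fromstring(document)  # a non-validating parse: DTD rules are not enforced
    return [child.tag for child in root] == EXPECTED_CHILDREN

valid_ok = follows_content_model(valid_doc)
invalid_ok = follows_content_model(invalid_doc)
print(valid_ok, invalid_ok)
```

Both documents parse without error; only the explicit comparison reveals that the second one is not valid with respect to the DTD.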
  • 50. External DTD
     In an external DTD, the elements are declared outside the XML file. They are accessed through a system identifier, which may be either the path to a .dtd file or a valid URL. Because the document then relies on declarations from an external source, the standalone attribute in the XML declaration is set to no.
     Syntax: the syntax for an external DTD is as follows:
       <!DOCTYPE root-element SYSTEM "file-name">
     where file-name is the file with the .dtd extension.
  • 51. External DTD: Example
     The following example shows external DTD usage:
       <?xml version = "1.0" encoding = "UTF-8" standalone = "no" ?>
       <!DOCTYPE address SYSTEM "address.dtd">
       <address>
         <name>Tanmay Patil</name>
         <company>TutorialsPoint</company>
         <phone>(011) 123-4567</phone>
       </address>
     The content of the DTD file address.dtd is as follows:
       <!ELEMENT address (name,company,phone)>
       <!ELEMENT name (#PCDATA)>
       <!ELEMENT company (#PCDATA)>
       <!ELEMENT phone (#PCDATA)>
  • 52. XML Schema Basics
     • XML Schema is an XML application
     • Provides simple types (string, integer, dateTime, duration, language, …)
     • Allows defining possible values for elements
     • Allows defining types derived from existing types
     • Allows defining complex types
     • Allows posing constraints on the occurrence of elements
     • Allows forcing uniqueness and foreign keys
     • Way too complex to cover in an introductory talk
  • 53. XML Schema Example
       <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
         <xs:element name="article">
           <xs:complexType>
             <xs:sequence>
               <xs:element name="author" type="xs:string"/>
               <xs:element name="title" type="xs:string"/>
               <xs:element name="text">
                 <xs:complexType>
                   <xs:sequence>
                     <xs:element name="abstract" type="xs:string"/>
                     <xs:element name="section" type="xs:string"
                                 minOccurs="0" maxOccurs="unbounded"/>
                   </xs:sequence>
                 </xs:complexType>
               </xs:element>
             </xs:sequence>
           </xs:complexType>
         </xs:element>
       </xs:schema>
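Because XML Schema is itself an XML application, a schema can be read with any ordinary XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` to list the element declarations of the example schema. It only inspects the schema; it does not validate documents against it (the standard library has no XSD validator), and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"  # the XML Schema namespace URI

SCHEMA_XML = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="article">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="text">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="abstract" type="xs:string"/>
              <xs:element name="section" type="xs:string"
                          minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(SCHEMA_XML)

# Collect (name, type, minOccurs, maxOccurs) for every xs:element declaration.
# minOccurs and maxOccurs both default to 1 when they are not written out.
declared = [
    (e.get("name"), e.get("type"), e.get("minOccurs", "1"), e.get("maxOccurs", "1"))
    for e in schema.iter(f"{{{XS}}}element")
]

for name, type_, lo, hi in declared:
    print(f"{name}: type={type_}, occurs {lo}..{hi}")
```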
  • 54. LECTURE-6. Querying XML with XPath and XQuery
  • 55. XPath and XQuery are query languages for XML data, both standardized by the W3C and supported by various database products. Their search capabilities include
     • logical conditions over element and attribute content (first-order predicate logic à la SQL; simple conditions only in XPath)
     • regular expressions for pattern matching of element names along paths or subtrees within XML data
     • joins, grouping, aggregation, transformation, etc. (XQuery only)
  • 56. XPath
     • XPath is a simple language to identify parts of an XML document (for further processing)
     • XPath operates on the tree representation of the document
     • The result of an XPath expression is a set of elements or attributes
     • Only the abbreviated syntax of XPath is discussed here
  • 57. Elements of XPath
     • An XPath expression usually is a location path that consists of location steps, separated by /:
       /article/text/abstract selects all abstract elements
     • A leading / always means the root of the document
     • Each location step is evaluated in the context of a node in the tree, the so-called context node
     • Possible location steps:
       child element x: selects all child elements with name x
       attribute @x: selects all attributes with name x
       wildcards: * (any child), @* (any attribute)
       multiple matches, separated by |: x|y|z
  • 58. XPath by Example
     /literature/book/author retrieves all book authors: starting with the root, it traverses the tree, matches the element names literature, book, author, and returns the elements
       <author>Suciu, Dan</author>,
       <author>Abiteboul, Serge</author>, ...,
       <author><firstname>Jeff</firstname> <lastname>Ullman</lastname></author>
     Further examples:
       /literature/*/author                             authors of books, articles, essays, etc.
       /literature//author                              authors that are descendants of literature
       /literature//@year                               value of the year attribute of descendants of literature
       /literature//author[firstname]                   authors that have a subelement firstname
       /literature/(book|article)/author                authors of books or articles (a parenthesized step like this requires XPath 2.0 or later)
       /literature/book[price < 50]                     low priced books
       /literature/book[author//country = "Germany"]    books with German author
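Several of these expressions can be tried with Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. The sketch below is built on assumptions: the sample document is invented for the illustration (only the author names come from the slide), paths are written relative to the root element literature rather than with a leading /, and the attribute and price queries are expressed in plain Python because the subset does not cover `//@year` or numeric comparisons.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; structure and values are made up for this sketch.
LITERATURE_XML = """<literature>
  <book year="1999">
    <author>Suciu, Dan</author>
    <author>Abiteboul, Serge</author>
    <price>45</price>
  </book>
  <book year="2002">
    <author><firstname>Jeff</firstname> <lastname>Ullman</lastname></author>
    <price>80</price>
  </book>
  <article year="1998">
    <author>Suciu, Dan</author>
  </article>
</literature>"""

root = ET.fromstring(LITERATURE_XML)  # root is the <literature> element

book_authors = root.findall("book/author")             # /literature/book/author
any_authors = root.findall("*/author")                 # /literature/*/author
descendant_authors = root.findall(".//author")         # /literature//author
with_firstname = root.findall(".//author[firstname]")  # /literature//author[firstname]

# /literature//@year: collect the year attribute of every descendant that has one.
years = [e.get("year") for e in root.iter() if e.get("year") is not None]

# /literature/book[price < 50]: the comparison is done in Python.
cheap_books = [b for b in root.findall("book") if float(b.findtext("price")) < 50]

print(len(book_authors), len(any_authors), len(with_firstname), years, len(cheap_books))
```

A full XPath engine evaluates the original expressions directly; the point here is only to show what each expression selects.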
  • 59. XQuery
     XQuery is an extremely powerful query language for XML data. A query has the form of a so-called FLWR expression (FLWOR in the final W3C standard, which adds an ORDER BY clause and writes the keywords in lower case):
       FOR $var1 IN expr1, $var2 IN expr2, ...
       LET $var3 := expr3, $var4 := expr4, ...
       WHERE condition
       RETURN result-doc-construction
     • The FOR clause evaluates expressions (which may be XPath-style path expressions) and binds the resulting elements to variables. For a given binding each variable denotes exactly one element.
     • The LET clause binds entire sequences of elements to variables.
     • The WHERE clause evaluates a logical condition with each of the possible variable bindings and selects those bindings that satisfy the condition.
     • The RETURN clause constructs, from each of the variable bindings, an XML result tree. This may involve grouping and aggregation and even complete subqueries.
  • 60. XQuery Examples
     (: find Web-related articles by Dan Suciu from the year 1998 :)
       <results> {
         for $a in doc("literature.xml")//article
         for $n in $a//author, $t in $a/title
         where $a/@year = "1998"
           and contains($n, "Suciu")
           and contains($t, "Web")
         return <result> {$n} {$t} </result>
       } </results>
     (: find articles co-authored by authors who have jointly written a book after 1995 :)
       <results> {
         for $a in doc("literature.xml")//article
         for $a1 in $a//author, $a2 in $a//author
         where some $b in doc("literature.xml")//book
               satisfies $b//author = $a1
                 and $b//author = $a2
                 and $b/@year > "1995"
         return <result> {$a1} {$a2} <wrote> {$a} </wrote> </result>
       } </results>
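For readers who do not have an XQuery processor at hand, the first query can be read as nested loops with a filter and a result constructor. The sketch below mirrors it clause by clause in Python with the standard `xml.etree.ElementTree`. It is an analogy, not an XQuery implementation; the sample document and the function name `web_articles_by` are invented for this illustration.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for literature.xml.
LITERATURE_XML = """<literature>
  <article year="1998">
    <author>Dan Suciu</author>
    <title>Querying the Web</title>
  </article>
  <article year="1998">
    <author>Serge Abiteboul</author>
    <title>Web Data</title>
  </article>
  <article year="2000">
    <author>Dan Suciu</author>
    <title>Web Views</title>
  </article>
</literature>"""

def web_articles_by(root, name, year, keyword):
    """Mirror the FLWR query: one <result> per (author, title) binding that passes WHERE."""
    results = ET.Element("results")
    for a in root.iter("article"):            # FOR $a IN //article
        for n in a.iter("author"):            # FOR $n IN $a//author
            for t in a.findall("title"):      #     $t IN $a/title
                if (a.get("year") == year     # WHERE $a/@year = year
                        and name in "".join(n.itertext())       # AND contains($n, name)
                        and keyword in "".join(t.itertext())):  # AND contains($t, keyword)
                    result = ET.SubElement(results, "result")   # RETURN <result> ...
                    result.append(copy.deepcopy(n))
                    result.append(copy.deepcopy(t))
    return results

root = ET.fromstring(LITERATURE_XML)
results = web_articles_by(root, "Suciu", "1998", "Web")
print(ET.tostring(results, encoding="unicode"))
```

Each FOR clause becomes a loop, the WHERE clause becomes the `if`, and the RETURN clause becomes the construction of a result element, which is the essence of how a FLWR expression is evaluated.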
  • 61. References
     1. https://www.tutorialspoint.com/xml/xml_schemas.htm
     2. Ralf Schenkel, "XML for Beginners", Organizing and Searching Information with XML, April 2003.
     3. W3Schools Online Web Tutorials, http://www.w3schools.com
  • 62. THANK YOU