XML: An Overview.pptx
- 1. Teaching Innovation - Entrepreneurial - Global
The Centre for Technology enabled Teaching & Learning , N Y S S, India
DTEL(Department for Technology Enhanced Learning)
© By Ganesh K. Yenurkar
- 3. PREFACE
As educators, we share a common goal: to guide our students so that they gain the
maximum possible in a positive environment that promotes their success and instils in
them a desire to learn. One of the best tools available to us in this pursuit is PPT instruction that
is systematic and supports self-learning. The goal of this PPT is to help teachers use
eLearning as an effective and efficient method for teaching our students. It has been
developed purely for academic and non-commercial purposes.
Our desire in preparing this PPT is to support teachers, who have the very demanding task
of delivering instruction on a lecture/period basis according to a teaching plan. The PPT is therefore prepared
lecture-wise. Further, at the end of each chapter, questions have been included for practice.
We begin here with a tour of XML: An Overview. This PPT is aimed at
beginners who want to build a career in, or gain hands-on experience with, XML technologies.
It covers an introduction to XML, its purpose, how to use it, XML syntax, and
more.
With deep regards and humility, we thank both our Management of MGI for motivating and
our CEO for strong follow-ups to prepare PPTs under . We dedicate this PPT to students and
our shared profession.
G. K. Yenurkar
- 4. COURSE OBJECTIVES
1. To learn how XML and its related technologies function and
how they facilitate integration between applications
2. To master and integrate the core syntax of XML, DTD, and
XML Schema
3. To understand the basic fundamentals of XSL
- 5. COURSE OUTCOMES
On successful completion of this course the student should be able to:
1. Design and code data transfer scripts using XML languages for the transfer of data
over business networks and the Internet.
2. Validate XML documents with the Document Type Definitions and schemas according
to industry standards.
3. Transform various data formats such as text and images so that this information can be
transferred to and from server storage devices on business and health care networks and
the Internet.
4. Validate XML code and associated DTDs and schemas using an XML editing tool so
that the XML code can be used within business and health care industries.
- 8. LECTURE-1
XML - Introduction
• XML stands for Extensible Markup Language. It is a text-based
markup language derived from the Standard Generalized Markup
Language (SGML).
• XML tags identify the data and are used to store and
organize it, rather than to specify how to display it, as
HTML tags do. XML is not
going to replace HTML in the near future, but it introduces new
possibilities by adopting many successful features of HTML.
• Three main characteristics of XML make it
useful in a variety of systems and solutions.
- 9. LECTURE-1
•XML is extensible − XML allows you to create your own
self-descriptive tags, or language, to suit your application.
•XML carries data but does not present it − XML allows you to
store data irrespective of how it will be presented.
•XML is a public standard − XML was developed by the
World Wide Web Consortium (W3C) and is
available as an open standard.
- 10. LECTURE-1
Usage of XML
XML is used in the following ways:
•XML can work behind the scenes to simplify the creation of HTML documents for
large web sites.
•XML can be used to exchange information between organizations and
systems.
•XML can be used for offloading and reloading of databases.
•XML can be used to store and arrange data in a way that is customized to your data
handling needs.
•XML can easily be merged with style sheets to create almost any desired output.
•Virtually any type of data can be expressed as an XML document.
- 11. What do you mean by XML Markup?
LECTURE-1
• XML is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable
and machine-readable. So what exactly is a markup language?
Markup is information added to a document that enhances its
meaning in certain ways, in that it identifies the parts and how
they relate to each other.
• More specifically, a markup language is a set of symbols that
can be placed in the text of a document to demarcate and label
the parts of that document.
- 12. LECTURE-1
• The following example shows how XML markup looks when
embedded in a piece of text −
<My_message>
   <text>Hello, Friends</text>
</My_message>
Explanation:- This snippet includes the markup symbols, or tags,
<My_message>...</My_message> and <text>...</text>. The tags <My_message> and
</My_message> mark the start and the end of the XML code fragment. The tags
<text> and </text> surround the text Hello, Friends.
- 13. LECTURE-1
Is XML a Programming Language?
No. A programming language consists of grammar rules and
its own vocabulary, which are used to create computer
programs. These programs instruct the computer to perform
specific tasks. XML does not qualify as a programming
language because it does not perform any computation or
express algorithms. It is usually stored in a simple text file and is
processed by special software that is capable of interpreting
XML.
- 14. LECTURE-1
XML - Syntax
Following is a complete XML document −
<?xml version = "1.0"?>
<contact>
<name>Ganesh Yenurkar </name>
<address> Nagpur</address>
<company> YCCE </company>
<phone> 0000000000</phone>
</contact>
There are two kinds of information in the above example −
•Markup, such as the tags <contact> and <name>, and
•Text, or character data, such as YCCE and 0000000000.
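The split between markup and character data can be made visible by parsing the document. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name split_markup_and_text is an assumption made for illustration, and stray whitespace inside the elements has been trimmed.

```python
import xml.etree.ElementTree as ET

# The contact document from the slide, with stray whitespace removed.
CONTACT_XML = """<?xml version="1.0"?>
<contact>
   <name>Ganesh Yenurkar</name>
   <address>Nagpur</address>
   <company>YCCE</company>
   <phone>0000000000</phone>
</contact>"""


def split_markup_and_text(xml_text):
    """Return (tag names, character data) for the root and its children.

    Hypothetical helper: it only looks one level below the root element.
    """
    root = ET.fromstring(xml_text)
    tags = [root.tag] + [child.tag for child in root]
    texts = [child.text for child in root]
    return tags, texts


tags, texts = split_markup_and_text(CONTACT_XML)
print("markup:", tags)
print("text:  ", texts)
```

The tag names come from the markup, while the strings in texts are the character data that the markup labels.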
- 16. LECTURE-1
XML Declaration
An XML document can optionally begin with an XML declaration. It is written as
follows −
<?xml version = "1.0" encoding = "UTF-8"?>
Here version is the XML version and encoding specifies the character
encoding used in the document. The declaration is part of the document "prolog".
Syntax Rules for XML Declaration
•The XML declaration is case sensitive and must begin with "<?xml", where "xml"
is written in lower-case.
•If the document contains an XML declaration, it must be the first
statement of the XML document.
•An encoding given by the transport protocol, such as the charset in an HTTP header,
can override the value of encoding that you put in the XML
declaration.
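Two of these rules can be observed with a parser. The sketch below uses Python's standard xml.etree.ElementTree module; the sample element and the helper name parses are assumptions made for illustration. It shows that the encoding named in the declaration controls how the bytes are decoded, and that a declaration which is not the very first thing in the document is rejected.

```python
import xml.etree.ElementTree as ET

# Bytes encoded as ISO-8859-1; the declaration tells the parser how to decode them.
latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><name>José</name>'.encode("iso-8859-1")
name = ET.fromstring(latin1_doc)
print(name.text)


def parses(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# The declaration must come first: even a leading newline makes it an error.
print(parses(b'<?xml version="1.0"?><a/>'))
print(parses(b'\n<?xml version="1.0"?><a/>'))
```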
- 17. LECTURE-1
XML is not…
• A replacement for HTML
(but HTML can be generated from XML)
• A presentation format
(but XML can be converted into one)
• A programming language
(but it can be used with almost any language)
• A network transfer protocol
(but XML may be transferred over a network)
• A database
(but XML may be stored into a database)
- 18. But XML Is …
LECTURE-1
• XML is a meta markup language for text documents / textual
data
• XML allows you to define languages ("applications") to represent
text documents / textual data
- 19. XML by Example
LECTURE-1
<article>
<author>GANESH YENURKAR</author>
<title>WEB TECHNOLOGY</title>
</article>
• Easy to understand for all human users
• Very expressive and clear (semantics are given along with the
data)
• Well structured, easy to read and write from programs
This looks nice, but…
- 22. LECTURE-2
Advantages of Using XML
• Truly Portable Data
• Easily readable by human users
• Very expressive (semantics near data)
• Very flexible and customizable (no finite tag set)
• Easy to use from programs (libs available)
• Easy to convert into other representations
(XML transformation languages)
• Many additional standards and tools
• Widely used and supported
- 23. App. Scenario : Data Exchange
LECTURE-2
[Figure: data exchange between a supplier and a buyer. Each side runs a legacy system (e.g., SAP R/2 or Cobol) behind an XML adapter; the order is exchanged between the adapters as XML (BMECat, ebXML, RosettaNet, BizTalk, …).]
- 24. App. Scenario : XML for Metadata
LECTURE-2
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
   <rdf:Description rdf:about="http://www-dbs/Sch03.pdf">
      <dc:title>A Framework for…</dc:title>
      <dc:creator>Ralf Schenkel</dc:creator>
      <dc:description>While there are...</dc:description>
      <dc:publisher>Saarland University</dc:publisher>
      <dc:subject>XML Indexing</dc:subject>
      <dc:rights>Copyright ...</dc:rights>
      <dc:type>Electronic Document</dc:type>
      <dc:format>text/pdf</dc:format>
      <dc:language>en</dc:language>
   </rdf:Description>
</rdf:RDF>
- 25. App. Scenario : Document Markup
LECTURE-2
<article>
   <section id="01" title="XML:An Overview">
      This article is about <index>XML</index>.
   </section>
   <section id="02" title="Main Results">
      <name>Weikum</name> <cite idref="Weik01"/> shows the following
      theorem (see Section <ref idref="01"/>)
      <theorem id="theo:1" source="Weik01">
         For any XML document x, ...
      </theorem>
   </section>
   <literature>
      <cite id="Weik01"><author>Weikum</author></cite>
   </literature>
</article>
- 26. LECTURE-2
• Document markup adds structural and semantic
information to documents, e.g.
   1. Sections, Subsections, Theorems, …
   2. Cross References
   3. Literature Citations
   4. Index Entries
   5. Named Entities
• This allows queries like
   Which articles cite Weikum's XML paper from 2001? and
   Which articles talk about (the named entity) "Weikum"?
- 28. A Simple XML Document
LECTURE-2
<article>
   <author>Gerhard Weikum</author>
   <title>The Web in Ten Years</title>
   <text>
      <abstract>In order to evolve...</abstract>
      <section number="1" title="Introduction">
         The <index>Web</index> provides the universal...
      </section>
   </text>
</article>
- 30. What is Markup?
LECTURE-3
• (Freely definable) tags: article, title, author
with start tag: <article> etc.
and end tag: </article> etc.
• Elements: <article> ... </article>
• Elements have a name (article) and a content (...)
• Elements may be nested.
• Elements may be empty: <this_is_empty/>
• Element content is typically parsed character data (PCDATA),
i.e., strings with special characters, and/or nested elements
(mixed content if both).
• Each XML document has exactly one root element and forms
a tree.
• Elements with a common parent are ordered.
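The tree structure, the ordering of children, and mixed content can be inspected programmatically. The following is a sketch with Python's standard xml.etree.ElementTree module, applied to the simple article document; the helper name outline is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

ARTICLE_XML = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <text>
    <abstract>In order to evolve...</abstract>
    <section number="1" title="Introduction">
      The <index>Web</index> provides the universal...
    </section>
  </text>
</article>"""


def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:  # children keep their document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(ARTICLE_XML)  # exactly one root element: <article>
print("\n".join(outline(root)))

# Mixed content: character data before and after the nested <index> element.
section = root.find("text/section")
index = section.find("index")
print(repr(section.text.strip()), index.text, repr(index.tail.strip()))
```

In ElementTree, the character data that follows a nested element is stored in that element's tail attribute, which is why the text after index is read from index.tail.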
- 31. Elements vs. Attributes
LECTURE-3
Elements may have attributes (in the start tag) that have a name and
a value, e.g. <section number="1">.
What is the difference between elements and attributes?
• Only one attribute with a given name per element (but an arbitrary number of
subelements)
• Attributes have no structure, they are simply strings (while elements can have
subelements)
As a rule of thumb:
• Content into elements
• Metadata into attributes
Example:
<person born="1912-06-23" died="1954-06-07">
Alan Turing</person> proved that…
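A parser reflects the same distinction: element content is read as text, while attributes arrive as a flat mapping from names to strings. This is a small sketch with Python's standard xml.etree.ElementTree module; the variable names are chosen for illustration.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    '<person born="1912-06-23" died="1954-06-07">Alan Turing</person>'
)

print(person.text)    # element content
print(person.attrib)  # metadata: a plain name -> string mapping

# Attribute values are unstructured strings, so any further structure
# (here the year inside a date) has to be extracted by the application.
born_year = int(person.get("born")[:4])
print(born_year)
```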
- 33. LECTURE-3
Special characters used in XML
Some special characters must be escaped using
entities:
< → &lt;
& → &amp;
(will be converted back when reading the XML doc)
Some other characters may be escaped, too:
> → &gt;
" → &quot;
' → &apos;
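Escaping is usually left to a library. The sketch below uses escape from Python's standard xml.sax.saxutils module, which replaces &, < and > by default and accepts a mapping for further characters; the sample string and the rule element name are assumptions made for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = 'if a < b & b > c then "ok"'

# escape() handles &, < and > itself; quotes are added through the mapping.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)

# Reading the document converts the entities back to the original characters.
doc = "<rule>" + escaped + "</rule>"
restored = ET.fromstring(doc).text
print(restored == raw)
```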
- 34. Well-Formed XML Documents
LECTURE-3
• A well-formed document must adhere to, among others, the
following rules:
• Every start tag has a matching end tag.
• Elements may nest, but must not overlap.
• There must be exactly one root element.
• Attribute values must be quoted.
• An element may not have two attributes with the same name.
• Comments and processing instructions may not appear inside
tags.
• No unescaped < or & signs may occur inside character data.
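Any conforming parser reports a violation of these rules as an error. The following sketch checks a few small documents with Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Each sample document is paired with the result the rules above predict.
samples = {
    "<a><b>ok</b></a>": True,
    "<a><b>text</a>": False,           # start tag without matching end tag
    "<a><b></a></b>": False,           # overlapping elements
    "<a/><b/>": False,                 # more than one root element
    "<a x=1/>": False,                 # unquoted attribute value
    '<a x="1" x="2"/>': False,         # duplicate attribute name
    "<a>1 < 2 & 3</a>": False,         # unescaped < and &
    "<a>1 &lt; 2 &amp; 3</a>": True,   # same content, properly escaped
}

for text, expected in samples.items():
    print(f"{text!r:28} well-formed: {is_well_formed(text)} (expected {expected})")
```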
- 38. Document Type Definitions
LECTURE-4
Sometimes XML is too flexible:
• Most programs can only process a subset of all possible XML
applications
• For exchanging data, the format (i.e., elements, attributes and their
semantics) must be fixed
Document Type Definitions (DTDs) establish the
vocabulary for one XML application (in some sense comparable to
schemas in databases).
A document is valid with respect to a DTD if it conforms to the
rules specified in that DTD.
Most XML parsers can be configured to validate documents against a DTD.
- 39. DTD Example: Elements
LECTURE-4
<!ELEMENT article (title,author+,text)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT text (abstract,section*,literature?)>
<!ELEMENT abstract (#PCDATA)>
<!ELEMENT section (#PCDATA|index)*>
<!ELEMENT literature (#PCDATA)>
<!ELEMENT index (#PCDATA)>
• Content of the title element is parsed character data.
• Content of the article element is a title element,
followed by one or more author elements,
followed by a text element.
• Content of the text element may
contain zero or more section
elements in this position.
- 40. Element Declarations in DTDs
LECTURE-4
One element declaration for each element type:
<!ELEMENT element_name content_specification>
where content_specification can be
   (#PCDATA)    parsed character data
   (child)      one child element
   (c1,…,cn)    a sequence of child elements c1…cn
   (c1|…|cn)    one of the elements c1…cn
For each component c, possible counts can be specified:
   c            exactly one such element
   c+           one or more
   c*           zero or more
   c?           zero or one
Plus arbitrary combinations using parentheses:
<!ELEMENT f ((a|b)*,c+,(d|e))*>
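A content specification behaves like a regular expression over the sequence of child element names. The sketch below makes that analogy concrete for the declaration of f. It relies on a simplifying assumption: every element name is a single letter, so a child sequence can be written as a string and the DTD model only needs its commas removed to become a Python regex. The helper names are hypothetical, and this is not how a validating parser is implemented.

```python
import re


def dtd_model_to_regex(model):
    """Turn a DTD content model over single-letter names into a regex.

    Assumption: names are one character long, so "a,b" (sequence in a DTD)
    can be written as "ab" (concatenation in a regex). The operators
    |, *, + and ? mean the same in both notations.
    """
    return re.compile(model.replace(",", "").replace(" ", ""))


# Content model taken from: <!ELEMENT f ((a|b)*,c+,(d|e))*>
F_MODEL = dtd_model_to_regex("((a|b)*,c+,(d|e))*")


def children_match(child_names):
    """Check whether a list of child element names satisfies the model of f."""
    return F_MODEL.fullmatch("".join(child_names)) is not None


print(children_match(["a", "b", "c", "c", "d"]))  # one full group
print(children_match(["c", "e", "b", "c", "d"]))  # two groups in a row
print(children_match([]))                          # the outer * allows no children
print(children_match(["a", "b"]))                  # no c, so the group is incomplete
```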
- 41. Element Declarations
LECTURE-4
Elements with mixed content:
<!ELEMENT text (#PCDATA|index|cite|glossary)*>
Elements with empty content:
<!ELEMENT image EMPTY>
Elements with arbitrary content (not suitable for production-level
DTDs):
<!ELEMENT thesis ANY>
- 42. Attribute Declarations in DTDs
LECTURE-4
Attributes are declared per element:
<!ATTLIST section number CDATA #REQUIRED
                  title  CDATA #REQUIRED>
declares two required attributes for element section.
Each declaration consists of the element name (section), the attribute name (number, title),
the attribute type (CDATA), and the attribute default (#REQUIRED).
- 43. Attribute Declarations in DTDs
LECTURE-4
Possible attribute defaults:
   #REQUIRED          the attribute is required in each element instance
   #IMPLIED           the attribute is optional
   #FIXED "default"   the attribute always has this default value
   "default"          the attribute has this default value if it is
                      omitted from the element instance
- 44. Attribute Types in DTDs
LECTURE-4
   CDATA        string data
   (A1|…|An)    enumeration of all possible values of the
                attribute (each is an XML name)
   ID           unique XML name to identify the element
   IDREF        refers to the ID attribute of some other element
                ("intra-document link")
   IDREFS       list of IDREF, separated by white space
plus some more
- 45. Attribute Examples
LECTURE-4
<!ATTLIST publication type (journal|inproceedings) #REQUIRED
                      pubid ID #REQUIRED>
<!ATTLIST cite cid IDREF #REQUIRED>
<!ATTLIST citation ref IDREF #IMPLIED
                   cid ID #REQUIRED>
<publications>
   <publication type="journal" pubid="Weikum01">
      <author>Gerhard Weikum</author>
      <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
      <citation cid="c12" ref="XML98"/>
      <citation cid="c15">...</citation>
   </publication>
   <publication type="inproceedings" pubid="XML98">
      <text>XML, the Extensible Markup Language, ...</text>
   </publication>
</publications>
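The ID/IDREF mechanism amounts to a referential check: every IDREF value must equal the ID value of some element in the same document. The following sketch performs that check by hand with Python's standard xml.etree.ElementTree module, which does not validate against DTDs. The two mapping tables restate the ATTLIST declarations above, the helper name dangling_idrefs is hypothetical, and the ID values are written as XML names (c12, c15), which is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

PUBLICATIONS_XML = """<publications>
  <publication type="journal" pubid="Weikum01">
    <author>Gerhard Weikum</author>
    <text>In the Web of 2010, XML <cite cid="c12"/>...</text>
    <citation cid="c12" ref="XML98"/>
    <citation cid="c15">...</citation>
  </publication>
  <publication type="inproceedings" pubid="XML98">
    <text>XML, the Extensible Markup Language, ...</text>
  </publication>
</publications>"""

# Which attribute of which element has type ID or IDREF (from the ATTLIST lines).
ID_ATTRS = {"publication": "pubid", "citation": "cid"}
IDREF_ATTRS = {"cite": "cid", "citation": "ref"}


def dangling_idrefs(root):
    """Return all IDREF values that do not match any ID in the document."""
    ids = set()
    for element in root.iter():
        attr = ID_ATTRS.get(element.tag)
        if attr and element.get(attr) is not None:
            ids.add(element.get(attr))
    missing = []
    for element in root.iter():
        attr = IDREF_ATTRS.get(element.tag)
        value = element.get(attr) if attr else None
        if value is not None and value not in ids:
            missing.append(value)
    return missing


root = ET.fromstring(PUBLICATIONS_XML)
print(dangling_idrefs(root))  # empty list: every reference resolves

# Break one reference to see it reported.
broken = ET.fromstring(PUBLICATIONS_XML.replace('ref="XML98"', 'ref="Missing99"'))
print(dangling_idrefs(broken))
```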
- 46. XML - DTDs
LECTURE-4
• The XML Document Type Definition, commonly known as DTD, is a way
to describe an XML language precisely. DTDs are used to check the vocabulary and validity of
the structure of XML documents against the grammatical rules of the appropriate
XML language.
• An XML DTD can be either specified inside the document, or it can be kept
in a separate document and then linked separately.
Syntax:-
Basic syntax of a DTD is as follows −
<!DOCTYPE element DTD identifier
[
   declaration1
   declaration2 ........
]>
- 47. Elements vs. Attributes
LECTURE-5
In the above syntax,
•The DTD starts with the <!DOCTYPE delimiter.
•element names the root element and tells the parser to parse the document from the
specified root element.
•DTD identifier is an identifier for the document type definition,
which may be the path to a file on the system or the URL of a file on
the internet. If the DTD points to an external path, it is called the
External Subset.
•The square brackets [ ] enclose an optional list of markup
declarations called the Internal Subset.
- 48. Internal DTD
LECTURE-5
A DTD is referred to as an internal DTD if the elements are declared within the
XML file. For an internal DTD, the standalone attribute in the XML declaration
can be set to yes. This means the declaration works independently of an
external source.
Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root-element
[
element-declarations
]>
where root-element is the name of root element and element-declarations is
where you declare the elements.
- 49. Elements vs. Attributes
LECTURE-5
Example
Following is a simple example of internal DTD −
<?xml version = "1.0" encoding = "UTF-8" standalone = "yes" ?>
<!DOCTYPE address
[
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address> <name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
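A parser sees the DOCTYPE declaration and the declarations of the internal subset as separate events. The following sketch reads them with Python's standard xml.parsers.expat module. Note that expat is a non-validating parser, so this only reports what the DTD declares and does not check the document against it; the helper name read_internal_dtd is an assumption made for illustration.

```python
import xml.parsers.expat

ADDRESS_XML = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE address [
   <!ELEMENT address (name,company,phone)>
   <!ELEMENT name (#PCDATA)>
   <!ELEMENT company (#PCDATA)>
   <!ELEMENT phone (#PCDATA)>
]>
<address>
   <name>Tanmay Patil</name>
   <company>TutorialsPoint</company>
   <phone>(011) 123-4567</phone>
</address>"""


def read_internal_dtd(data):
    """Collect the root element name and declared element names from the DTD."""
    info = {"root": None, "internal_subset": False, "declared": []}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once for <!DOCTYPE ...>; name is the declared root element.
        info["root"] = name
        info["internal_subset"] = bool(has_internal_subset)

    def on_element_decl(name, model):
        # Called once per <!ELEMENT ...> declaration.
        info["declared"].append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(data, True)
    return info


info = read_internal_dtd(ADDRESS_XML)
print(info)
```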
- 50. External DTD
LECTURE-5
In an external DTD, elements are declared outside the XML file. They are
accessed by specifying a SYSTEM identifier, which may be either a .dtd
file or a valid URL. For an external DTD, the standalone attribute in the
XML declaration is set to no. This means the declaration includes
information from the external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with .dtd extension.
- 51. LECTURE-5
Example
The following example shows external DTD usage −
<?xml version = "1.0" encoding = "UTF-8" standalone = "no" ?>
<!DOCTYPE address SYSTEM "address.dtd">
<address> <name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
The content of the DTD file address.dtd is as shown −
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
- 52. XML Schema Basics
LECTURE-5
• XML Schema is an XML application
• Provides simple types (string, integer, dateTime, duration, language,
…)
• Allows defining possible values for elements
• Allows defining types derived from existing types
• Allows defining complex types
• Allows posing constraints on the occurrence of elements
• Allows enforcing uniqueness and foreign keys
• Way too complex to cover in an introductory talk
- 53. XML Schema Example
LECTURE-5
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="article">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="author" type="xs:string"/>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="text">
               <xs:complexType>
                  <xs:sequence>
                     <xs:element name="abstract" type="xs:string"/>
                     <xs:element name="section" type="xs:string"
                                 minOccurs="0" maxOccurs="unbounded"/>
                  </xs:sequence>
               </xs:complexType>
            </xs:element>
         </xs:sequence>
      </xs:complexType>
   </xs:element>
</xs:schema>
- 55. LECTURE-6
XPath and XQuery are query languages for XML data, both
standardized by the W3C and supported by various database
products.
Their search capabilities include
• logical conditions over element and attribute content
(first-order predicate logic a la SQL; simple conditions only in
XPath)
• regular expressions for pattern matching of element names
along paths or subtrees within XML data
• joins, grouping, aggregation, transformation, etc. (XQuery
only)
- 56. XPath
LECTURE-6
• XPath is a simple language to identify parts of the XML
document (for further processing)
• XPath operates on the tree representation of the document
• Result of an XPath expression is a set of elements or
attributes
• Only the abbreviated syntax of XPath is discussed here
- 57. Elements of XPath
LECTURE-6
• An XPath expression usually is a location path that consists of
location steps, separated by /:
/article/text/abstract: selects all abstract elements that are children of text under the root element article
• A leading / always means the document root
• Each location step is evaluated in the context of a node in the tree,
the so-called context node
• Possible location steps:
   child element x: select all child elements with name x
   attribute @x: select all attributes with name x
   wildcards * (any child element), @* (any attribute)
   multiple matches, separated by |: x|y|z
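Location steps can be tried out with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. This is a sketch under that limitation: paths are written relative to the root element (so /article/text/abstract becomes ./text/abstract), and because the subset has no @x step and no | operator, attribute values are read with get() instead.

```python
import xml.etree.ElementTree as ET

ARTICLE_XML = """<article>
  <author>Gerhard Weikum</author>
  <title>The Web in Ten Years</title>
  <text>
    <abstract>In order to evolve...</abstract>
    <section number="1" title="Introduction">
      The <index>Web</index> provides the universal...
    </section>
  </text>
</article>"""

root = ET.fromstring(ARTICLE_XML)  # the context node is <article>

abstracts = root.findall("./text/abstract")  # child steps, like /article/text/abstract
children = root.findall("./*")               # wildcard: any child element
sections = root.findall(".//section")        # any descendant named section

# Attributes are read from the selected elements.
numbers = [section.get("number") for section in sections]

# A predicate on an attribute value is part of the supported subset.
first_section = root.findall("./text/section[@number='1']")

print([a.text for a in abstracts])
print([c.tag for c in children])
print(numbers, len(first_section))
```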
- 58. XPath by Example
LECTURE-6
/literature/book/author retrieves all book authors:
starting with the root, traverses the tree, matches element
names literature, book, author, and returns elements
<author>Suciu, Dan</author>,
<author>Abiteboul, Serge</author>, ...,
<author><firstname>Jeff</firstname>
<lastname>Ullman</lastname></author>
/literature/*/author                             authors of books, articles, essays, etc.
/literature//author                              authors that are descendants of literature
/literature//@year                               value of the year attribute of descendants of literature
/literature//author[firstname]                   authors that have a subelement firstname
/literature/(book|article)/author                authors of books or articles
/literature/book[price < "50"]                   low priced books
/literature/book[author//country = "Germany"]    books with German author
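Several of these expressions can be reproduced with the XPath subset in Python's standard xml.etree.ElementTree module. The literature document below is invented sample data for illustration only, paths are written relative to the literature root element, and conditions outside the supported subset (the attribute step and the numeric comparison on price) are expressed in plain Python.

```python
import xml.etree.ElementTree as ET

# Invented sample data, shaped like the document the slide queries.
LITERATURE_XML = """<literature>
  <book year="1999">
    <author>Suciu, Dan</author>
    <author>Abiteboul, Serge</author>
    <price>45</price>
  </book>
  <book year="2001">
    <author><firstname>Jeff</firstname><lastname>Ullman</lastname></author>
    <price>80</price>
  </book>
  <article year="1998">
    <author>Suciu, Dan</author>
  </article>
</literature>"""

lit = ET.fromstring(LITERATURE_XML)

book_authors = lit.findall("./book/author")           # /literature/book/author
any_authors = lit.findall("./*/author")               # /literature/*/author
descendant_authors = lit.findall(".//author")         # /literature//author
with_firstname = lit.findall(".//author[firstname]")  # /literature//author[firstname]

# /literature//@year: collect the attribute from every element that has one.
years = [e.get("year") for e in lit.iter() if e.get("year") is not None]

# /literature/book[price < "50"]: the comparison is done in Python.
cheap_books = [b for b in lit.findall("./book") if float(b.findtext("price")) < 50]

print(len(book_authors), len(any_authors), len(descendant_authors))
print(len(with_firstname), years, len(cheap_books))
```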
- 59. XQuery
LECTURE-6
XQuery is an extremely powerful query language for XML data.
A query has the form of a so-called FLWR expression (keywords are written in lower case in standard XQuery):
for $var1 in expr1, $var2 in expr2, ...
let $var3 := expr3, $var4 := expr4, ...
where condition
return result-doc-construction
The for clause evaluates expressions (which may be XPath-style
path expressions) and binds the resulting elements to variables.
For a given binding each variable denotes exactly one element.
The let clause binds entire sequences of elements to variables.
The where clause evaluates a logical condition with each of
the possible variable bindings and selects those bindings that
satisfy the condition.
The return clause constructs, from each of the variable bindings,
an XML result tree. This may involve grouping and aggregation
and even complete subqueries.
- 60. XQuery Examples
LECTURE-6
(: find Web-related articles by Dan Suciu from the year 1998 :)
<results> {
   for $a in doc("literature.xml")//article
   for $n in $a//author, $t in $a/title
   where $a/@year = "1998"
     and contains($n, "Suciu") and contains($t, "Web")
   return <result> {$n} {$t} </result> } </results>
(: find articles co-authored by authors who have jointly written a book after
1995 :)
<results> {
   for $a in doc("literature.xml")//article
   for $a1 in $a//author, $a2 in $a//author
   where some $b in doc("literature.xml")//book satisfies
     ($b//author = $a1 and $b//author = $a2 and $b/@year > "1995")
   return <result> {$a1} {$a2} <wrote> {$a} </wrote> </result> } </results>
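The structure of the first query can be mirrored in ordinary code: nested loops play the role of the for clause, an if statement the role of where, and element construction the role of return. The following is a sketch in Python with the standard xml.etree.ElementTree module, not an XQuery engine. The literature data and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample data standing in for literature.xml.
LITERATURE_XML = """<literature>
  <article year="1998">
    <author>Dan Suciu</author>
    <title>A Web Article</title>
  </article>
  <article year="1998">
    <author>Serge Abiteboul</author>
    <title>Another Web Article</title>
  </article>
  <article year="2000">
    <author>Dan Suciu</author>
    <title>A Later Web Article</title>
  </article>
</literature>"""


def web_articles_by_suciu_1998(root):
    """Find Web-related articles by Suciu from 1998, FLWR style."""
    results = ET.Element("results")
    for a in root.iter("article"):              # for $a in //article
        for n in a.iter("author"):              # for $n in $a//author
            for t in a.findall("title"):        #     $t in $a/title
                if (a.get("year") == "1998"     # where ...
                        and "Suciu" in (n.text or "")
                        and "Web" in (t.text or "")):
                    result = ET.SubElement(results, "result")  # return <result>
                    ET.SubElement(result, "author").text = n.text
                    ET.SubElement(result, "title").text = t.text
    return results


results = web_articles_by_suciu_1998(ET.fromstring(LITERATURE_XML))
print(ET.tostring(results, encoding="unicode"))
```

Each combination of $a, $n and $t that passes the condition contributes one result element, which matches the way the where clause filters variable bindings.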