2. Introduction
Extensible Markup Language (XML) is a markup
language that defines a set of rules for encoding
documents in a format that is both human-readable
and machine-readable. It is defined in the XML 1.0
Specification produced by the W3C and in several
other related specifications, all of which are free
open standards.
3. The main benefit of XML is that you can use it to take data
from a program like Microsoft SQL Server, convert it into XML,
then share that XML with other programs and platforms.
This lets you communicate between two platforms, which is
generally very difficult.
The main thing that makes XML truly powerful is its
international acceptance. Many corporations use XML
interfaces for databases, programming, office applications,
mobile phones and more. This is due to its platform-independent
nature.
4. XML tags identify the data and are used to store and
organize it, whereas HTML tags specify how to display
the data. XML is not going to replace HTML in the
near future, but it introduces new possibilities by
adopting many successful features of HTML.
5. There are three important characteristics of XML that
make it useful in a variety of systems and solutions −
XML is extensible − XML allows you to create your
own self-descriptive tags, or language, that suits your
application.
XML carries the data, does not present it − XML
allows you to store the data irrespective of how it will
be presented.
XML is a public standard − XML was developed by
an organization called the World Wide Web
Consortium (W3C) and is available as an open
standard.
6. XML Usage
XML can work behind the scenes to simplify the creation of
HTML documents for large web sites.
XML can be used to exchange information between
organizations and systems.
XML can be used for offloading and reloading of databases.
XML can be used to store and arrange data in a way that
suits your data handling needs.
XML can easily be merged with style sheets to create almost
any desired output.
Virtually any type of data can be expressed as an XML
document.
7. What is Markup?
XML is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable
and machine-readable. So what exactly is a
markup language? Markup is information added to a
document that enhances its meaning in certain ways, in
that it identifies the parts and how they relate to each
other. More specifically, a markup language is a set of
symbols that can be placed in the text of a document to
demarcate and label the parts of that document.
8. The following example shows how XML markup looks
when embedded in a piece of text −
<message>
<text>Hello, world!</text>
</message>
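As a rough illustration of how software reads this markup, the sketch below parses the same message. The use of Python and its standard xml.etree.ElementTree module is an assumption made for this example; the slides do not name a parser.

```python
import xml.etree.ElementTree as ET

# The markup from the example, held in an ordinary string.
doc = """<message>
<text>Hello, world!</text>
</message>"""

root = ET.fromstring(doc)        # parse the text into a tree of elements
print(root.tag)                  # name of the outermost element
print(root.find("text").text)    # character data inside <text>
```

The tags label the parts of the document, so the program can ask for the text element by name instead of by position.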
9. Is XML a Programming Language?
A programming language consists of grammar rules and
its own vocabulary which is used to create computer
programs. These programs instruct the computer to
perform specific tasks. XML does not qualify as a
programming language because it does not perform any
computation or algorithms. It is usually stored in a simple
text file and is processed by special software that is
capable of interpreting XML.
11. XML Declaration
An XML document can optionally have an XML
declaration. It is written as follows −
<?xml version = "1.0" encoding = "UTF-8"?>
Here, version is the XML version and encoding
specifies the character encoding used in the
document.
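A serializer can usually write the declaration for you. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (not mentioned in the slides) and Python 3.8 or later for the xml_declaration keyword. The element names are reused from the earlier message example.

```python
import xml.etree.ElementTree as ET

root = ET.Element("message")
ET.SubElement(root, "text").text = "Hello, world!"

# xml_declaration=True asks the serializer to emit the declaration first.
# With a named encoding, tostring() returns bytes in that encoding.
data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
print(data.decode("utf-8"))

# The result can be parsed again; the declaration is consumed by the parser.
reparsed = ET.fromstring(data)
print(reparsed.tag)
```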
12. Syntax Rules for XML Declaration
The XML declaration is case sensitive and must begin
with "<?xml", where "xml" is written in lower-case.
If the document contains an XML declaration, it strictly
needs to be the first statement of the XML document.
The HTTP protocol can override the value of encoding that
you put in the XML declaration.
13. Tags and Elements
An XML file is structured by several XML-elements, also called XML-nodes
or XML-tags. The names of XML-elements are enclosed in
angle brackets < > as shown below −
<element>
Syntax Rules for Tags and Elements
Element Syntax − Each XML-element needs to be closed, either with a
matching end tag or, if it is empty, as a self-closing tag, as shown below −
<element>....</element>
<element/>
14. Nesting of Elements − An XML-element can contain
multiple XML-elements as its children, but the child
elements must not overlap, i.e., an end tag of an element
must have the same name as that of the most recent
unmatched start tag.
The following example shows incorrectly nested tags −
<?xml version = "1.0"?>
<contact-info>
<company>TutorialsPoint
</contact-info>
</company>
15. The following example shows
correctly nested tags −
<?xml version = "1.0"?>
<contact-info>
<company>TutorialsPoint</company>
</contact-info>
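The nesting rule can be checked mechanically. The sketch below is one way to do so, assuming Python's standard xml.etree.ElementTree module. The helper name is_well_formed is hypothetical and introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False on a ParseError."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# End tags in the wrong order: the elements overlap.
overlapping = "<contact-info><company>TutorialsPoint</contact-info></company>"
# Each end tag matches the most recent unmatched start tag.
nested = "<contact-info><company>TutorialsPoint</company></contact-info>"

print(is_well_formed(overlapping))
print(is_well_formed(nested))
```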
16. Root Element
An XML document can have only one root
element. For example, the following is not a correct
XML document, because both the x and y
elements occur at the top level without a root
element −
<x>...</x>
<y>...</y>
The following example shows a correctly formed
XML document −
<root>
<x>...</x>
<y>...</y>
</root>
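A parser enforces the single-root rule as well. This is a small sketch under the assumption that Python's standard xml.etree.ElementTree module is used; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

two_roots = "<x>...</x><y>...</y>"
one_root = "<root><x>...</x><y>...</y></root>"

# Two top-level elements: the parser stops after the first one closes.
try:
    ET.fromstring(two_roots)
    two_roots_ok = True
except ET.ParseError as err:
    two_roots_ok = False
    print("rejected:", err)

# A single root element wrapping both children parses normally.
root = ET.fromstring(one_root)
print([child.tag for child in root])
```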
17. XML Attributes
An attribute specifies a single property for the
element, using a name/value pair. An XML-element
can have one or more attributes. For example −
<a href = "http://www.xy.com/">xy!</a>
Syntax Rules for XML Attributes
Attribute names in XML (unlike HTML) are case sensitive. That is,
HREF and href are considered two different XML attributes.
The same attribute cannot have two values in one start tag. For example,
<a b = "x" c = "y" b = "z">....</a> is incorrect because the attribute b is
specified twice.
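Both attribute rules can be observed with a parser. The sketch below assumes Python's standard xml.etree.ElementTree module. The second href value and the b/c attribute names are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Names that differ only in case are two distinct attributes.
elem = ET.fromstring('<a href="http://www.xy.com/" HREF="other">xy!</a>')
print(elem.get("href"))
print(elem.get("HREF"))

# Repeating the same attribute name in one start tag is an error.
try:
    ET.fromstring('<a b="x" c="y" b="z">....</a>')
    duplicate_ok = True
except ET.ParseError:
    duplicate_ok = False
print(duplicate_ok)
```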
18. XML References
References usually allow you to add or include
additional text or markup in an XML document.
References always begin with the symbol "&" which
is a reserved character and end with the symbol ";".
XML has two types of references −
Entity References − An entity reference contains a
name between the start and the end delimiters. For
example, &amp; where amp is the name. The name refers to
a predefined string of text and/or markup.
Character References − A character reference, such
as &#65;, contains a hash mark (“#”) followed by a
number. The number always refers to the Unicode code
point of a character. In this case, 65 refers to the letter "A".
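To see both kinds of reference resolved, the sketch below parses a short element that uses each one. It assumes Python's standard xml.etree.ElementTree module, and the note element and its text are invented for the example.

```python
import xml.etree.ElementTree as ET

# &amp; is an entity reference, &#65; is a character reference.
elem = ET.fromstring("<note>Fish &amp; chips, grade &#65;</note>")

# The parser replaces each reference with the character it stands for.
print(elem.text)

# When writing XML back out, the reserved "&" is escaped again.
print(ET.tostring(elem, encoding="unicode"))
```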
19. Characteristics of XML
XML has a number of important characteristics. Some of
them are given below:
XML is a structured format: Which means that we can
define exactly how the data is to be arranged, organized
and expressed within the file. When we are given a file, we
can validate that it conforms to a specific structure, prior to
importing the data. As we know the structure of the file in
advance, we know what it contains and how to process
each item. Prior to XML, the only structure in a text file was
positional – we knew the bit of text after the fourth comma
should be a date of birth – and we had no way to validate
whether it was a date of birth, or even a date, or whether it
was in day/month/year or month/day/year order.
20. XML is a described format:
Which means that within the text file, every item of data
has a name that is both human- and machine-readable as
well as being uniquely identifiable.
We can open these files, read their contents and
understand the data they contain, without having to refer
back to another document to find out what the text after the
fourth comma represents (and was that comma a
separator, or part of the text of the second item?).
Similarly, we can edit these documents with a fairly high
level of confidence that we’re making the correct changes.
21. XML can easily describe hierarchical data and the
relationships between data. If we want to import and
export a list of authors, with their names, addresses and
the books they’ve written, deciding on a reasonable
format for a CSV file is by no means straightforward. Using
XML, we can define what an Author item is and that it has
a name, address and multiple Book items. We can also
define what a Book item is and that it has a title, a
publisher and an ISBN. The hierarchy and relationships
are a natural consequence of the definition.
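The author and book hierarchy described here might be built as in the sketch below. It assumes Python's standard xml.etree.ElementTree module. The element names (Authors, Author, Name, Address, Book, Title, Publisher, ISBN) and all sample values are hypothetical placeholders, not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: names, addresses and ISBNs are placeholders.
authors = [
    {
        "name": "Example Author",
        "address": "1 Example Street",
        "books": [
            {"title": "First Book", "publisher": "Example Press",
             "isbn": "000-0-00-000000-1"},
            {"title": "Second Book", "publisher": "Example Press",
             "isbn": "000-0-00-000000-2"},
        ],
    },
]

root = ET.Element("Authors")
for entry in authors:
    author = ET.SubElement(root, "Author")
    ET.SubElement(author, "Name").text = entry["name"]
    ET.SubElement(author, "Address").text = entry["address"]
    # Each Book is a child of its Author, so the relationship is
    # expressed by the nesting itself.
    for item in entry["books"]:
        book = ET.SubElement(author, "Book")
        ET.SubElement(book, "Title").text = item["title"]
        ET.SubElement(book, "Publisher").text = item["publisher"]
        ET.SubElement(book, "ISBN").text = item["isbn"]

# ET.indent exists in Python 3.9 and later; it only adds whitespace.
if hasattr(ET, "indent"):
    ET.indent(root)
print(ET.tostring(root, encoding="unicode"))
```

An author with several books simply gets several Book children, which is the part that is awkward to express in a flat CSV row.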
22. XML can be validated:
Which means we can provide a second XML file
– an XML Schema Definition file – that describes
exactly how the XML data file should be
structured. Before processing an XML file, we
can compare it with the schema to ensure it
conforms to the structure we expect to receive.
23. XML is a discoverable format:
Which means programs (including Excel
2003/2007/2010/2013) can parse an XML data file
and infer the structure and relationships between the
items. This means we can read an XML file, infer its
structure and generate new XML data files that
conform to the same structure, with a high degree of
confidence the new XML data files will pass
validation.
24. XML is a strongly-typed format
Which means the schema definition file specifies the
data type of each element. When importing the data,
the application can check the schema definition to
identify the data type to import it as. We no longer run
the risk of the product code 01-03 being imported as a
date.
25. XML is a global format:
There is only one way to express a number in
an XML file (with US number formats) and
only one way to express a date. We no longer
have to check whether a CSV file was created
with US or French settings and adjust our
processing of it accordingly.
26. XML is a standard format
The way in which the content of an XML file is defined
has been specified by the World Wide Web
Consortium (W3C). This allows applications (including
Excel 2003/2007/2010/2013) to read, understand and
validate the structure of an XML file and create files
that conform to the specified structure. It also allows
different applications to read, write, understand and
validate the same XML files, allowing us to share data
between applications in an extremely robust manner.
27. XML–Advantage
Simplicity: Information coded in XML is easy to read and understand,
plus it can be processed easily by computers.
Openness: XML is a W3C standard, endorsed by software industry
market leaders.
Extensibility: There is no fixed set of tags. New tags can be created as
they are needed.
Self-description: XML documents can be stored without [schemas]
because they contain metadata; any XML tag can possess an
unlimited number of attributes such as author or version.
28. Contains machine-readable context information: Tags,
attributes and element structure provide context
information ... opening up new possibilities for highly
efficient search engines, intelligent data mining, agents,
etc.
Separates content from presentation: XML tags describe
meaning, not presentation.
29. Everything has pros and cons. Some
of the problems are:
XML syntax is redundant or large relative to binary
representations of similar data.
The redundancy may affect application efficiency through higher
storage, transmission and processing costs.
XML syntax is too verbose relative to other alternative ‘text-
based’ data transmission formats.
No intrinsic data type support: XML provides no specific notion
of “integer”, “string”, “boolean”, “date”, and so on.
The hierarchical model for representation is limited in
comparison to the relational model or an object oriented graph.
Expressing overlapping (non-hierarchical) node relationships
requires extra effort.
XML namespaces are problematic to use and namespace
support can be difficult to correctly implement in an XML parser.
XML is commonly depicted as “self-documenting” but this
depiction ignores critical ambiguities.