Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu ::
Faculty of Science, Technology and Communication (FSTC)
Bachelor en informatique (professionnel)
-- Media IT -–
Unit 8
XML
8. XML and its Sub-Languages
8.1 Limitations of HTML
8.2 XML introduction
8.3 XML specifications
8.4 Document Type Definition (DTD)
8.5 XML Schema
8.6 Namespaces
8.1 Limitations of HTML
<h1>Christoph Meinel</h1>
<h2>Viola Brehmer</h2>
<ul>
<li>Long Wang</li>
<li>Feng Cheng</li>
<li>Dirk Cordel</li>
<li>Serge Linckels</li>
</ul>
Harald Sack
Limitations of HTML
HTML was designed to give structure to a
document and to control its layout; NOT to
describe semantics
What is this Web page about?
What position has "Viola Brehmer"?
…
Meta-Tags
<meta name="description"
content="Homepage of Serge Linckels">
<meta name="keywords"
content="teacher, athlete">
<meta name="Author"
content="The Master of the Universe">
<meta name="xyz"
content="nothing special">
Do you believe in Meta-Tags?
HTML metadata are created by the author of the Web page.
Their syntax and semantics are chosen individually by each author.
Extensible Markup Language (XML)
Markup Language: gives structure to text documents
by using tags
Meta Language: XML does not have a fixed set of tags (new
tags can be created)
Extensible: XML can be adapted (extended) to meet many
different domains, e.g.,
• Mathematical Markup Language (MathML)
• Chemical Markup Language (CML)
• Synchronized Multimedia Integration Language (SMIL)
• WAP Markup Language (WML)
Creator Jon Bosak, 1996
XML is not…
a programming language
a network transport protocol
a database
XML is…
a simple data format
platform independent
does not require special applications
8.2 XML introduction
Picture created by Harald Sack
Standard Generalized Markup Language (SGML)
The Standard Generalized Markup Language
(SGML) is a metalanguage in which one can define
markup languages for documents
[Diagram: SGML → HTML, SGML → XML; HTML and XML → XHTML]
• Subset of SGML
• Structure of data
• Layout and data
are separated
Welcome to LIASIT
LIASIT stands for Luxembourg Advanced Studies in Information
Technology and since August 01, 2006 is a Doctoral School in the Faculty
of Science, Technology and Communication.
The faculty is composed of the following professors: David BASIN, Pascal
BOUVRY, Eric DUBOIS, Thomas ENGEL, Franck LEPREVOST, Christoph
MEINEL, Nicolas GUELFI, and Björn OTTERSTEN.
The PhD Students are: Christoph BRANDT, Pandu DEVARAKOTA, Daniel
FISCHER, Benjamin GATEAU, Markus GROSS, Joel GROTZ, Annie
GUERRIERO, Serge LINCKELS, Nicolas MAYER, Michael NOLL, Benoît RIES,
Michael STIEGHAHN.
Magali MARTIN is the secretary of LIASIT... and also a nice entertainer.
How we can see this…
Welcome to LIASIT
LIASIT stands for Luxembourg Advanced Studies in
Information Technology and since August 01, 2006 is a
Doctoral School in the Faculty of Science, Technology
and Communication.
The faculty is composed of the following professors:
David BASIN, Pascal BOUVRY, Eric DUBOIS, Thomas ENGEL,
Franck LEPREVOST, Christoph MEINEL, Nicolas GUELFI,
and Brn OTTERSTEN.
The PhD Students are: Christoph BRANDT, Pandu
DEVARAKOTA, Daniel FISCHER, Benamin GATEAU, Markus
GROSS, Joel GROTZ, Annie GUERRIERO, Serge LINCKELS,
Nicolas MAYER, Michael NOLL, Benot RIES, Michael
STIEGHAHN.
Magali MARTIN is the secretary of LIASIT... and also
a nice entertainer.
What a computer sees…
<title>Welcome to LIASIT</title>
<description>LIASIT stands for Luxembourg Advanced Studies
in Information Technology and since August 01, 2006 is
a Doctoral School in the Faculty of Science,
Technology and Communication.</description>
<profs>
<name>The faculty</name>
<name>is composed of</name>
<name>the following</name>
</profs>
<students>
<name>The PhD</name>
<name>Students are:</name>
<name>Christoph BRANDT,</name>
<name>Pandu DEVARAKOTA,</name>
</students>
<administration>
<name>Daniel FISCHER, </name>
</administration>
How we can help…
<profs>
<name>Thomas Engel</name>
<name>Christoph Meinel</name>
<name>David Basin</name>
<name>Björn Ottersten</name>
</profs>
<students>
<name>Benoît Ries</name>
<name>Daniel Fischer</name>
<name>Christoph Brandt</name>
<name>Pandu Devarakota</name>
<name>Serge Linckels</name>
</students>
Benefits of XML
Document is well-structured
Applications can process the file
XML file
Thomas Engel
Christoph Meinel
David Basin
Björn Ottersten
Benoît Ries
Daniel Fischer
Christoph Brandt
Pandu Devarakota
Serge Linckels
pure text file
Problems with text file
No structure
Difficult to process
Attention: although XML adds a certain amount of
semantics to the document, some information is
still missing, e.g., what is the
relation between "profs" and "students"?
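To illustrate "Applications can process the file", here is a minimal Python sketch using only the standard library. The wrapping `<liasit>` root element and the function name `names_by_group` are assumptions made for this example, since an XML document needs exactly one root element.

```python
import xml.etree.ElementTree as ET

# <liasit> is a hypothetical root element, added only so the fragment parses.
DOCUMENT = """
<liasit>
  <profs>
    <name>Thomas Engel</name>
    <name>Christoph Meinel</name>
  </profs>
  <students>
    <name>Daniel Fischer</name>
    <name>Serge Linckels</name>
  </students>
</liasit>
"""

def names_by_group(xml_text):
    """Map each group element (profs, students) to the names it contains."""
    root = ET.fromstring(xml_text)
    return {group.tag: [name.text for name in group.findall("name")]
            for group in root}

groups = names_by_group(DOCUMENT)
print(groups["profs"])
print(groups["students"])
```

With the pure text file, the same program would have to guess where the professors end and the students begin.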
8.3 XML specifications
<person type="Teacher">
<name>Serge Linckels</name>
<hp>http://www.linckels.lu</hp>
<size>173</size>
<phone>691-111111</phone>
</person>
Terminology
- element: person
- attribute: type="Teacher"
- child-element: name, hp, size, phone
- value: e.g., Serge Linckels
Tree representation
person
- name: Serge Linckels
- hp: http://www.linckels.lu
- size: 173
- phone: 691-111111
General
XML is composed of text and tags
Tags come in pairs, e.g., <hp></hp>
Tags must be properly nested, e.g.,
wrong: <person><hp></person></hp>
correct: <person><hp></hp></person>
Tags are case sensitive: <hp> ≠ <HP>
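These rules can be tried out with any XML parser. The following is a minimal sketch in Python with the standard library; the helper name `is_well_formed` is an assumption of this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<person><hp></hp></person>"))  # properly nested
print(is_well_formed("<person><hp></person></hp>"))  # overlapping tags
print(is_well_formed("<hp></HP>"))                   # case mismatch
```

Only the first call reports a well-formed document; the parser rejects the other two.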
XML structure
<staff>
<person type="Teacher">
<name>Serge Linckels</name>
<hp>http://www.linckels.lu</hp>
<size>173</size>
<phone>26 00 11 22</phone>
<phone>691-111111</phone>
</person>
<person type="Teacher">
<name>Denis Zampunieris</name>
<phone>4666445290</phone>
</person>
</staff>
The same element can be used repeatedly.
Nested tags can be part of a list too.
Order is not significant (unless a DTD or schema prescribes one).
Element or attribute?
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
or
<name first="Serge" last="Linckels"></name>
Both variants are semantically identical
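For an application, the two variants differ only in how the values are read. A minimal Python sketch (the function name `full_name` is an assumption of this example):

```python
import xml.etree.ElementTree as ET

AS_ELEMENTS = "<name><first>Serge</first><last>Linckels</last></name>"
AS_ATTRIBUTES = '<name first="Serge" last="Linckels"></name>'

def full_name(xml_text):
    """Read (first, last) from either the element or the attribute variant."""
    name = ET.fromstring(xml_text)
    first = name.findtext("first")
    if first is not None:
        # Variant 1: the values are stored in child elements.
        return (first, name.findtext("last"))
    # Variant 2: the values are stored in attributes.
    return (name.get("first"), name.get("last"))

print(full_name(AS_ELEMENTS))
print(full_name(AS_ATTRIBUTES))
```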
XML Names
Can include:
- letters (a..z, A..Z)
- digits (0..9)
- these punctuation chars:
- underscore (_)
- hyphen (-)
- period (.)
- special chars like ö, ç, Ω
Examples for valid XML Names:
<drivers_licence>
<_oki-doki>
<téléphone>
<this.works>
CDATA Sections
Everything between <![CDATA[ and ]]>
is treated as raw character data.
<person type="Teacher">
<name>Serge Linckels</name>
<![CDATA[This is just some
code that is ignored,
10 print "Hello world"
20 goto 10
]]>
<phone>26 00 11 22</phone>
</person>
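A short Python sketch of what "raw character data" means in practice: characters such as < and & inside a CDATA section are not interpreted as markup, but the text still reaches the application. The `<code>` element is an assumption of this example.

```python
import xml.etree.ElementTree as ET

# Without CDATA, the characters < and & would have to be escaped.
SNIPPET = '<code><![CDATA[if a < b && b > c then print "<ok>"]]></code>'

element = ET.fromstring(SNIPPET)
print(element.text)   # the raw characters, unchanged
print(len(element))   # number of child elements found inside
```

The parser reports no child elements, because `<ok>` inside the CDATA section is plain text.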
Comments
Comments are between <!-- and -->
like in HTML
XML declaration
<?xml version="1.0" encoding="ASCII" standalone="yes"?>
<person type="Teacher">
<name>Serge Linckels</name>
<hp>http://www.linckels.lu</hp>
<size>173</size>
<phone>691-111111</phone>
</person>
encoding: XML is pure text, but can use different encodings, e.g.,
ASCII, ISO-8859-1 (Latin-1), UTF-8, UTF-16. When omitted, UTF-8 is assumed
(or UTF-16 if the document starts with a byte order mark).
standalone:
- "yes", the document does not depend on declarations in an external DTD
- "no", the document may depend on an external DTD
XML-defined character sets
Unicode: 95156 characters from most of
Earth's living languages (encodings: UCS-2,
UCS-4, UTF-8, UTF-16)
ISO character sets: e.g., ISO-8859-15 (Latin
9) is ASCII + accented letters + €
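The effect of the encoding declaration can be sketched in Python: the same name is stored as different bytes, and the parser relies on the declaration (or on the UTF-8 default) to decode them. The variable names are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# The same document, once in ISO-8859-1 with a declaration ...
latin1_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>Björn</name>'
).encode("iso-8859-1")

# ... and once in UTF-8 without any declaration (UTF-8 is the default).
utf8_bytes = "<name>Björn</name>".encode("utf-8")

print(ET.fromstring(latin1_bytes).text)
print(ET.fromstring(utf8_bytes).text)
```

Both calls yield the same string, although "ö" is one byte in ISO-8859-1 and two bytes in UTF-8.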
JSON – JavaScript Object Notation
No element names
Primary data format used for asynchronous
browser/server communication (AJAX)
Language-independent data format
Supported by many programming languages
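For comparison, the person record from the XML examples could look as follows in JSON. This is only a sketch: the key names mirror the XML element names, and the mapping itself (attribute and elements as keys, repeated phone elements as a list) is an assumption of this example.

```python
import json

# Hypothetical JSON counterpart of the <person> element used earlier.
person = {
    "type": "Teacher",
    "name": "Serge Linckels",
    "hp": "http://www.linckels.lu",
    "size": 173,
    "phone": ["26 00 11 22", "691-111111"],
}

text = json.dumps(person, indent=2)   # serialize to a JSON string
restored = json.loads(text)           # parse it back
print(text)
```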
Exercise
8.4 Document Type Definition (DTD)
<person type="Teacher">
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
<phone>691-111111</phone>
</person>
<Personne>
<Type>Teacher</Type>
<Nom>Serge Linckels</Nom>
<HP>http://www.linckels.lu</HP>
<Sexe>M</Sexe>
</Personne>
?
Formal syntax is required
DTD – Document Type Definitions
The syntax of an XML document is described
A validating parser checks the XML document
against the syntax defined in the DTD
An XML document is valid if it respects the
syntax defined in its DTD
<!ELEMENT person (name, phone*)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
#PCDATA: value of type string
A person-element can contain
1 name sub-element and 0..*
phone sub-elements
Attention: a document can be well-formed
but not valid!
Web browser only checks if well-formed.
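The difference between well-formed and valid can be sketched in Python. Python's standard library does not include a DTD validator, so the content models below are translated by hand into regular expressions; the table `CONTENT_MODELS` and the function `is_valid` are assumptions of this example, and attributes are not checked.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models (hypothetical table, not generated from
# the DTD). Each child tag is written with a trailing comma.
CONTENT_MODELS = {
    "person": r"name,(phone,)*",   # (name, phone*)
    "name": r"first,last,",        # (first, last)
}

def is_valid(element):
    """Check element and all descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # Treated like (#PCDATA): no child elements allowed.
        return len(element) == 0
    children = "".join(child.tag + "," for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(is_valid(child) for child in element)

GOOD = ("<person><name><first>Serge</first><last>Linckels</last></name>"
        "<phone>691-111111</phone></person>")
BAD = ("<person><phone>691-111111</phone>"
       "<name><first>Serge</first><last>Linckels</last></name></person>")

print(is_valid(ET.fromstring(GOOD)))
print(is_valid(ET.fromstring(BAD)))   # parses (well-formed), but wrong order
```

Both documents parse without error, so both are well-formed; only the first one respects the content models.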
person.xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE person SYSTEM "http://www.linckels.lu/person.dtd">
<person type="Teacher">
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
<phone>691-111111</phone>
</person>
person.dtd
<!ELEMENT person (name, phone*)>
<!ATTLIST person type CDATA #IMPLIED>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
The SYSTEM identifier is the URI of the DTD file;
it can also be a local path, e.g., "/mydisk/person.dtd"
Validating a document
A Web browser does not validate documents
but only checks them for well-formedness
XML validator APIs are available, e.g., in Java
Online validators, e.g.,
http://www.stg.brown.edu/service/xmlvalid/
http://www.w3.org/2001/03/webdata/xsv
<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person (name, phone*)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ATTLIST person type CDATA #IMPLIED>
]>
<person type="Teacher">
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
<phone>691-111111</phone>
</person>
Valid XML document with internal
DTD
Sequences
* Zero or more of the element is allowed
? Zero or one of the element is allowed
+ One or more of the element is required
Elements must appear in the specified order
Choices
<!ELEMENT color (red | green)>
Here, the element color can have a child-
element red or green, not both at a time.
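The occurrence indicators and the choice operator behave much like their counterparts in regular expressions. The following Python sketch translates a simple content model into a regular expression; it is an illustration only, does not handle #PCDATA or mixed content, and the function names are assumptions of this example.

```python
import re

def content_model_to_regex(model):
    """Translate e.g. '(name, phone*)' into a regex over 'tag,' tokens."""
    # Sequence commas become plain separators.
    model = re.sub(r"\s*,\s*", " ", model)
    # Every element name becomes a group that matches 'name,'.
    model = re.sub(r"[\w.-]+",
                   lambda m: "(?:" + re.escape(m.group()) + ",)",
                   model)
    # Remaining whitespace carries no meaning.
    return re.sub(r"\s+", "", model)

def children_match(model, child_tags):
    """Check a list of child tag names against a content model."""
    children = "".join(tag + "," for tag in child_tags)
    return re.fullmatch(content_model_to_regex(model), children) is not None

print(children_match("(name, phone*)", ["name", "phone", "phone"]))
print(children_match("(name, phone*)", ["phone", "name"]))
print(children_match("(red | green)", ["red"]))
print(children_match("(red | green)", ["red", "green"]))
```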
<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person (name, phone*)>
<!ATTLIST person type CDATA #IMPLIED>
<!ELEMENT name EMPTY>
<!ATTLIST name first CDATA #IMPLIED
last CDATA #REQUIRED
>
<!ELEMENT phone (#PCDATA)>
]>
<person type="Teacher">
<name first="Serge" last="Linckels" />
<phone>691-111111</phone>
</person>
Attribute declarations
Attribute defaults:
#IMPLIED: value is optional
#REQUIRED: value is required
#FIXED: value is constant
Literal: value is given as quoted string
Attribute types:
CDATA: any string of text
Enumeration: list of values
ID: XML name that is unique within the document
IDREF: reference to the ID of some
element in the document
IDREFS: whitespace-separated list of IDREFs
Valid XML document with internal DTD
<family>
<person id="jane" mother="mary" father="john">
<name>Jane Doe</name>
</person>
<person id="john" children="jane jack">
<name>John Doe</name>
</person>
<person id="mary" children="jane jack">
<name>Mary Smith</name>
</person>
<person id="jack" mother="mary" father="john">
<name>Jack Smith</name>
</person>
</family>
DTD – ID, IDREF and IDREFS
<!DOCTYPE family [
<!ELEMENT family (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person
id ID #REQUIRED
mother IDREF #IMPLIED
father IDREF #IMPLIED
children IDREFS #IMPLIED>
]>
XML document
DTD
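What a validating parser checks for ID, IDREF and IDREFS can be sketched in Python: every id must be unique, and every reference must point to an existing id. The function `reference_errors` and its error messages are assumptions of this example; it only covers the attributes of the family document.

```python
import xml.etree.ElementTree as ET

FAMILY = """<family>
  <person id="jane" mother="mary" father="john"><name>Jane Doe</name></person>
  <person id="john" children="jane jack"><name>John Doe</name></person>
  <person id="mary" children="jane jack"><name>Mary Smith</name></person>
  <person id="jack" mother="mary" father="john"><name>Jack Smith</name></person>
</family>"""

def reference_errors(xml_text):
    """Return a list of ID/IDREF problems found in a family document."""
    persons = ET.fromstring(xml_text).findall("person")
    ids = [person.get("id") for person in persons]
    # ID: each value may occur only once in the document.
    errors = [f"duplicate ID: {i}" for i in sorted(set(ids)) if ids.count(i) > 1]
    for person in persons:
        # IDREF: a single reference; IDREFS: whitespace-separated references.
        refs = [person.get(a) for a in ("mother", "father") if person.get(a)]
        refs += person.get("children", "").split()
        errors += [f"unresolved reference: {r}" for r in refs if r not in ids]
    return errors

print(reference_errors(FAMILY))
```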
Problems and limitations
DTDs are context-free grammars; recursive definitions are possible
Order matters, e.g.,
<!ELEMENT person (last, first)>
Workaround:
<!ELEMENT person ((last, first) | (first, last))>
Can become unclear:
<!ELEMENT person ((name | phone | e-mail)*)>
Lack of expressiveness, e.g., restrictions on references are not possible
All elements are global in one namespace
XML Schema is more powerful than DTD and is a W3C recommendation
No support for newer features of XML
DTDs are expressed in a non-XML syntax
…but there are numerous other XML schema languages, e.g., RELAX NG, ISO DSDL, Schematron…
Exercise
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE university [
<!ELEMENT university (teachers)>
<!ELEMENT teachers (teacher*)>
<!ELEMENT teacher (first, name, title, office?, teach*)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT office (#PCDATA)>
<!ELEMENT teach EMPTY>
<!ATTLIST teach course IDREF #REQUIRED>
]>
8.5 XML Schema
XML Schema - Overview
An XML Schema is an XML document containing a formal description of what constitutes a valid
XML document
An XML document described by a schema is called an instance document
More explicit restrictions on the number and sequence of child elements are possible
Example
XML document
<?xml version="1.0"?>
<fullName>Serge Linckels</fullName>
XML Schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="fullName" type="xs:string" />
</xs:schema>
xs: is the conventional prefix for the XML
Schema namespace
Atomic types (more than 40!)
string: Unicode string
integer: positive or negative whole number
boolean: true/false or 0/1
ID, IDREF, IDREFS: cf. DTD
<xs:element name="first" type="xs:string" />
<xs:element name="age" type="xs:integer" />
<xs:element name="link" type="xs:anyURI" />
<xs:element name="year" type="xs:gYear" />
Simple types
New simple types can be created by using atomic types
<xs:simpleType name="aName">
<xs:restriction base="xs:string" />
</xs:simpleType>
Restrictions can be defined
<xs:simpleType name="aName">
<xs:restriction base="xs:string">
<xs:maxLength value="50" />
</xs:restriction>
</xs:simpleType>
More about restrictions
Restrictions (facets) can be defined over simple types using xs:restriction
Enumerations:
<xs:simpleType name="location">
<xs:restriction base="xs:string">
<xs:enumeration value="work" />
<xs:enumeration value="school" />
<xs:enumeration value="mobile" />
</xs:restriction>
</xs:simpleType>
Numeric facets:
<xs:simpleType name="age">
<xs:restriction base="xs:unsignedShort">
<xs:minExclusive value="0" />
<xs:maxInclusive value="120" />
</xs:restriction>
</xs:simpleType>
possible values:
• minInclusive
• maxInclusive
• minExclusive
• maxExclusive
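What these two facets accept can be sketched as plain Python checks. This is not an XML Schema validator; the function names are assumptions of this example, and the lexical rules of xs:unsignedShort are simplified to a run of ASCII digits.

```python
import re

# Enumeration facet of the "location" type.
LOCATIONS = {"work", "school", "mobile"}

def is_valid_location(text):
    """Enumeration: the value must be one of the listed strings."""
    return text in LOCATIONS

def is_valid_age(text):
    """Numeric facets of the "age" type: minExclusive 0, maxInclusive 120."""
    if re.fullmatch(r"[0-9]+", text) is None:
        return False            # simplified xs:unsignedShort lexical check
    value = int(text)
    return 0 < value <= 120     # 0 is excluded, 120 is included

print(is_valid_location("school"), is_valid_location("home"))
print(is_valid_age("0"), is_valid_age("120"), is_valid_age("121"))
```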
More about restrictions
Enforcing format:
<xs:simpleType name="mobile-phone">
<xs:restriction base="xs:string">
<xs:pattern value="\d\d\d-\d\d \d\d \d\d" />
</xs:restriction>
</xs:simpleType>
Enforces the rule that a
mobile phone-number
consists of 3 digits, a
dash, 2 digits, a space, 2
digits, another space and
finally 2 digits.
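The rule described above (3 digits, a dash, then three groups of 2 digits separated by spaces) can be tried out with an ordinary regular expression. In this Python sketch, `fullmatch` is used because a schema pattern always has to match the whole value; the names `MOBILE_PHONE` and `is_mobile_phone` are assumptions of this example.

```python
import re

# 3 digits, a dash, 2 digits, a space, 2 digits, a space, 2 digits
MOBILE_PHONE = re.compile(r"\d{3}-\d{2} \d{2} \d{2}")

def is_mobile_phone(text):
    """Return True if the whole string has the required format."""
    return MOBILE_PHONE.fullmatch(text) is not None

print(is_mobile_phone("691-11 11 11"))
print(is_mobile_phone("691-111111"))
```

The second number has the right digits but not the required spacing, so it is rejected.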
Lists:
XML Schema – simple type definition
<xs:simpleType name="TypeAuthor">
<xs:list itemType="xs:string" />
</xs:simpleType>
XML Schema – element definition
<xs:element name="author" type="TypeAuthor" />
The author element of an
instance document can
contain an unlimited list
of strings, each separated
by whitespace
Complex types
A complex type describes an element that contains
child-elements
XML Schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="first" type="xs:string" />
<xs:element name="last" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XML document
<?xml version="1.0"?>
<person>
<first>Serge</first>
<last>Linckels</last>
</person>
Only elements can have complex types,
attributes always have simple types
sequence: order of elements matters (a,b)
all: order of elements does not matter (a,b or b,a)
choice: one or the other element (a xor b)
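The three compositors can be compared with a small Python sketch that checks a list of child element names. It assumes that every declared element occurs exactly once (the default occurrence), and the function names are assumptions of this example.

```python
def matches_sequence(declared, children):
    """sequence: the same elements in the same order."""
    return list(children) == list(declared)

def matches_all(declared, children):
    """all: the same elements, in any order."""
    return sorted(children) == sorted(declared)

def matches_choice(declared, children):
    """choice: exactly one of the declared elements."""
    return len(children) == 1 and children[0] in declared

declared = ["first", "last"]
print(matches_sequence(declared, ["last", "first"]))
print(matches_all(declared, ["last", "first"]))
print(matches_choice(declared, ["last"]))
```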
Occurrence Constraints
Set the number of times an element may occur:
minOccurs: minimum occurrences
maxOccurs: maximum occurrences
<xs:element name="middle" type="xs:string"
minOccurs="0" />
The default value for minOccurs and maxOccurs is 1
In this example, maxOccurs is not set, but has a default value of 1. Therefore, the middle
element may appear 0 or 1 times.
<xs:element name="middle" type="xs:string"
minOccurs="0" maxOccurs="unbounded" />
The value unbounded indicates that the element may appear an unlimited number of times.
Derived complex types
Deriving by extension: add new definitions to an existing complex type. E.g.,
add the phone element to the existing person type.
<xs:complexType name="PersonWithPhone">
<xs:complexContent>
<xs:extension base="person">
<xs:sequence>
<xs:element name="phone" type="xs:string" />
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Deriving by restriction: by omitting parts of the parent definition, the
restriction element creates a new, constrained type.
<xs:complexType name="PersonWithMoreNames">
<xs:complexContent>
<xs:restriction base="person">
<xs:sequence>
<xs:element name="first" type="xs:string" minOccurs="2" />
<xs:element name="last" type="xs:string" />
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
Structure cannot be
changed!
Only available for
complex types
Attribute declarations
Attributes can be declared globally by top-level xs:attribute
<xs:attribute name="job" type="xs:string" />
Attributes can be declared locally as part of a complex type definition
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="first" type="xs:string" />
<xs:element name="last" type="xs:string" />
</xs:sequence>
<xs:attribute name="job" type="xs:string" use="optional" />
</xs:complexType>
</xs:element>
possible values of use (local declarations only):
optional or required
XML document
<?xml version="1.0"?>
<person job="Teacher">
<first>Serge</first>
<last>Linckels</last>
</person>
Conclusion: DTD vs. XML Schemas
XML Schema is a more powerful language than DTD for specifying the syntax of XML
documents; therefore, it is more expressive in terms of semantics
XML Schema is a W3C recommendation and widely used. As DTDs are simpler to use, they
are still used today
8.6 Namespaces
<compactDisk author="HS">
<titel>Remixes</titel>
<track number="1">
<titel>Night over Manaus</titel>
<author>Boozoo Bajou</author>
</track>
</compactDisk>
Problem of ambiguous names
The same XML name can be used for different
elements, but this creates ambiguities.
XML namespaces disambiguate elements with the same name from each
other by assigning elements and attributes to URIs.
Qualified names, prefixes and local parts
Elements are identified by qualified names:
cd:titel
qualified name = prefix (cd) + ":" + local name (titel)
Using XML namespaces
<cd:compactDisk
xmlns:cd = "http://www.linckels.lu/cd"
xmlns:tr = "http://www.xyz.com/tracks"
author="HS">
<cd:titel>Remixes</cd:titel>
<tr:track number="1">
<tr:titel>Night over Manaus</tr:titel>
<tr:author>Boozoo Bajou</tr:author>
</tr:track>
</cd:compactDisk>
Each element exists in a unique
namespace
Namespace URIs are purely formal
identifiers; they are not the
addresses of a page, and they are
not meant to be followed as links
<tr:title>Remixes</tr:title>
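A prefix is only shorthand for the complete URI: conceptually, tr:title stands for the name title in the
namespace http://www.xyz.com/tracks. The URI cannot be written inside a tag itself; it is always bound
through an xmlns declaration.
Namespace binding: each prefix in a qualified name must be associated with a URI
Default namespaces only apply to elements,
not to attributes; an unprefixed attribute belongs to no namespace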
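How a parser resolves prefixes can be observed with Python's standard library, which reports every element name together with its namespace URI. This is a sketch using the compactDisk example; the dictionary `NAMESPACES` and the variable names are assumptions of this example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<cd:compactDisk
    xmlns:cd="http://www.linckels.lu/cd"
    xmlns:tr="http://www.xyz.com/tracks"
    author="HS">
  <cd:titel>Remixes</cd:titel>
  <tr:track number="1">
    <tr:titel>Night over Manaus</tr:titel>
    <tr:author>Boozoo Bajou</tr:author>
  </tr:track>
</cd:compactDisk>"""

# Prefixes used in the queries below; only the URIs have to match the document.
NAMESPACES = {
    "cd": "http://www.linckels.lu/cd",
    "tr": "http://www.xyz.com/tracks",
}

root = ET.fromstring(DOCUMENT)
print(root.tag)   # the element name, expanded with its namespace URI

disk_title = root.findtext("cd:titel", namespaces=NAMESPACES)
track_title = root.findtext("tr:track/tr:titel", namespaces=NAMESPACES)
print(disk_title, "/", track_title)

# The unprefixed attribute is stored without any namespace.
print(root.get("author"))
```

The two titel elements are no longer ambiguous, because each lookup names the namespace it refers to.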
XML in a Nutshell
Elliotte R. Harold, W. Scott Means
E-Librarian Service
User-Friendly Semantic Search in Digital Libraries
Serge Linckels, Christoph Meinel

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 

Recently uploaded (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 

Media IT - XML and sublanguages

  • 1. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: Faculty of Science, Technology and Communication (FSTC) Bachelor en informatique (professionnel) -- Media IT -– ¯_(ツ)_/¯ Unit 8 XML
  • 2. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 2 8.1 Limitations of HTML 8.2 XML introduction 8.3 XML specifications 8.4 Document Type Definition (DTD) 8.5 XML Schema 8.6 Namespaces
  • 3. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.1 Limitations of HTML 3 <h1>Christoph Meinel</h1> <h2>Viola Brehmer</h2> <ul> <li>Long Wang</li> <li>Feng Cheng</li> <li>Dirk Cordel</li> <li>Serge Linckels</li> </ul> Harald Sack Limitations of HTML HTML was initiated to give a structure to a document and to modify its layout; NOT to describe semantics What is this Web page about? What position has "Viola Brehmer"? … Meta-Tags <meta name="description" content="Homepage of Serge Linckels"> <meta name="keywords" content="teacher, athlete"> <meta name="Author" content="The Master of the Universe"> <meta name="xyz" content="nothing special"> Do you believe in Meta-Tags? HTML metadata are created by the author of the Web page. Their syntax and semantics are individual.
  • 4. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.1 Limitations of HTML 4 Extensible Markup Language (XML) Markup Language: gives structure to text documents by using tags Meta Language: XML does not have a fixed set of tags (new tags can be created) Extensible: XML can be adapted (extended) to meet the needs of many different domains, e.g., • Mathematical Markup Language (MathML) • Chemical Markup Language (CML) • Synchronized Multimedia Integration Language (SMIL) • WAP Markup Language (WML) Creator Jon Bosak, 1996 XML is not… a programming language a network transport protocol a database XML is… a simple data format platform independent does not require special applications
  • 5. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.2 XML introduction 5 Picture created by Harald Sack
  • 6. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.2 XML introduction 6 Picture created by Harald Sack
  • 7. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 7 Standard Generalized Markup Language (SGML) The Standard Generalized Markup Language (SGML) is a metalanguage in which one can define markup languages for documents HTML XML    XHTML • Subset of SGML • Structure of data • Layout and data are separated 8.2 XML introduction
  • 8. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.2 XML introduction 8 Welcome to LIASIT LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication. The faculty is composed of the following professors: David BASIN, Pascal BOUVRY, Eric DUBOIS, Thomas ENGEL, Franck LEPREVOST, Christoph MEINEL, Nicolas GUELFI, and Björn OTTERSTEN. The PhD Students are: Christoph BRANDT, Pandu DEVARAKOTA, Daniel FISCHER, Benjamin GATEAU, Markus GROSS, Joel GROTZ, Annie GUERRIERO, Serge LINCKELS, Nicolas MAYER, Michael NOLL, Benoît RIES, Michael STIEGHAHN. Magali MARTIN is the secretary of LIASIT... and also a nice entertainer. How we can see this…
  • 9. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.2 XML introduction 9 Welcome to LIASIT LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication. The faculty is composed of the following professors: David BASIN, Pascal BOUVRY, Eric DUBOIS, Thomas ENGEL, Franck LEPREVOST, Christoph MEINEL, Nicolas GUELFI, and Björn OTTERSTEN. The PhD Students are: Christoph BRANDT, Pandu DEVARAKOTA, Daniel FISCHER, Benjamin GATEAU, Markus GROSS, Joel GROTZ, Annie GUERRIERO, Serge LINCKELS, Nicolas MAYER, Michael NOLL, Benoît RIES, Michael STIEGHAHN. Magali MARTIN is the secretary of LIASIT... and also a nice entertainer. What a computer sees…
  • 10. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.2 XML introduction 10 <title>Welcome to LIASIT</title> <description>LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication.</description> <profs> <name>The faculty</name> <name>is composed of</name> <name>the following</name> </profs> <students> <name>The PhD</name> <name>Students are:</name> <name>Christoph BRANDT,</name> <name>Pandu DEVARAKOTA,</name> </students> <administration> <name>Daniel FISCHER, </name> </administration> How we can help…
  • 11. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.2 XML introduction 11 <title>Welcome to LIASIT</title> <description>LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication.</description> <profs> <name>The faculty</name> <name>is composed of</name> <name>the following</name> </profs> <students> <name>The PhD</name> <name>Students are:</name> <name>Christoph BRANDT,</name> <name>Pandu DEVARAKOTA,</name> </students> <administration> <name>Daniel FISCHER, </name> </administration> How we can help…
  • 12. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.2 XML introduction 12 <profs> <name>Thomas Engel</name> <name>Christoph Meinel</name> <name>David Basin</name> <name>Björn Ottersten</name> </profs> <students> <name>Benoît Ries</name> <name>Daniel Fischer</name> <name>Christoph Brandt</name> <name>Pandu Devarakota</name> <name>Serge Linckels</name> </students> Benefits of XML Document is well-structured Applications can process the file XML file Thomas Engel Christoph Meinel David Basin Björn Ottersten Benoît Ries Daniel Fischer Christoph Brandt Pandu Devarakota Serge Linckels pure text file Problems with text file No structure Difficult to process Attention: although XML adds a certain amount of semantics to the document, some information is still missing, e.g., what is the relation between "profs" and "students"?
  • 13. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.3 XML specifications 13 <person type="Teacher"> <name>Serge Linckels</name> <hp>http://www.linckels.lu</hp> <size>173</size> <phone>691-111111</phone> </person> element attribute child-element value Terminology Tree representation person name hp size phone Serge Linckels http://www.linckels.lu 173 691-111111 General XML is composed of text and tags Tags come in pairs, e.g., <hp></hp> Tags must be properly nested, e.g., <person><hp></person></hp> is wrong, <person><hp></hp></person> is correct Tags are case sensitive: <hp> ≠ <HP>
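The terminology and nesting rules on this slide can be tried out with a few lines of Python. This is only a minimal sketch using the standard library module `xml.etree.ElementTree`; the helper name `is_well_formed` is an assumption made for illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET

# The person document from the slide.
PERSON_XML = """<person type="Teacher">
  <name>Serge Linckels</name>
  <hp>http://www.linckels.lu</hp>
  <size>173</size>
  <phone>691-111111</phone>
</person>"""


def is_well_formed(text: str) -> bool:
    """Hypothetical helper: True if the parser accepts the text as XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


person = ET.fromstring(PERSON_XML)
root_tag = person.tag                                   # element name
person_type = person.get("type")                        # attribute value
children = {child.tag: child.text for child in person}  # child elements and values

print(root_tag, person_type, children)
print(is_well_formed("<person><hp></hp></person>"))  # properly nested tags
print(is_well_formed("<person><hp></person></hp>"))  # overlapping tags
print(is_well_formed("<hp></HP>"))                   # tag names differ in case
```

The parser reports overlapping tags and case mismatches as errors, which is the well-formedness check referred to throughout the unit.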
  • 14. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.3 XML specifications 14 XML structure <staff> <person type="Teacher"> <name>Serge Linckels</name> <hp>http://www.linckels.lu</hp> <size>173</size> <phone>26 00 11 22</phone> <phone>691-111111</phone> </person> <person type="Teacher"> <name>Denis Zampunieris</name> <phone>4666445290</phone> </person> </staff> same element can be used repeatedly Nested tags can be part of a list too. Order is not significant. Element or attribute? <name> <first>Serge</first> <last>Linckels</last> </name> or <name first="Serge" last="Linckels"></name> Both variants are semantically identical
  • 15. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.3 XML specifications 15 XML Names Can include: - letters (a..z, A..Z) - digits (0..9) - these punctuation chars: - underscore (_) - hyphen (-) - period (.) - special chars like ö, ç, Ω Examples for valid XML Names: <drivers_licence> <_oki-doki> <téléphone> <this.works> CDATA Sections Everything between <![CDATA[ and ]]> is treated as raw character data. <person type="Teacher"> <name>Serge Linckels</name> <![CDATA[This is just some code that is ignored, 10 print "Hello world" 20 goto 10 ]]> <phone>26 00 11 22</phone> </person> Comments Comments are between <!-- and --> like in HTML
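To illustrate what "raw character data" means for a CDATA section, here is a small Python sketch with the standard library parser. The `<note>` element and its content are made up for the example; they do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, characters such as < and & need no escaping.
WITH_CDATA = "<note><![CDATA[if a < b && b > c then print <ok>]]></note>"

# The same text without CDATA is not well-formed XML.
WITHOUT_CDATA = "<note>if a < b && b > c then print <ok></note>"

cdata_text = ET.fromstring(WITH_CDATA).text
print(cdata_text)

try:
    ET.fromstring(WITHOUT_CDATA)
    plain_text_accepted = True
except ET.ParseError:
    plain_text_accepted = False
print(plain_text_accepted)
```

The parser hands the CDATA content back as ordinary text, so an application sees the characters exactly as they were written between `<![CDATA[` and `]]>`.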
  • 16. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.3 XML specifications 16 XML declaration <?xml version="1.0" encoding="ASCII" standalone="yes"?> <person type="Teacher"> <name>Serge Linckels</name> <hp>http://www.linckels.lu</hp> <size>173</size> <phone>691-111111</phone> </person> encoding: XML is pure text, but can use different encodings, e.g., ASCII, ISO-8859-1 (Latin-1), UTF-8 or UTF-16 (Unicode). When omitted, the Unicode encoding UTF-8 is the default. standalone: - "yes", no external DTD/Schema is given - "no", external DTD/Schema is specified XML-defined character sets Unicode: 95156 characters from most of Earth's living languages (variants: UCS-2, UCS-4, UTF-8, UTF-16) ISO character sets: e.g., ISO-8859-15 (Latin 9) is ASCII + accented letters + €
  • 17. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.3 XML specifications 17 JSON – Javascript Object Notation No element names Primary data format used for asynchronous browser/server communication (AJAX) Language-independent data format Supported by many programming languages
  • 18. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages Exercise
  • 19. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.4 Document Type Definition (DTD) 19 <person type="Teacher"> <name> <first>Serge</first> <last>Linckels</last> </name> <phone>691-111111</phone> </person> <Personne> <Type>Teacher</Type> <Nom>Serge Linckels</Nom> <HP>http://www.linckels.lu</HP> <Sexe>M</Sexe> </Personne> ? Formal syntax is required DTD – Document Type Definitions Syntax of XML document is described Validating parser checks syntax: XML document with DTD syntax An XML document is valid if it respects the syntax defined in its DTD <!ELEMENT person (name, phone*)> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT phone (#PCDATA)> #PCDATA: value of type string A person-element can contain 1 name sub-element and 0..* phone sub-elements Attention: a document can be well-formed but not valid! A Web browser only checks whether a document is well-formed.
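The difference between well-formed and valid can be sketched in Python. The Python standard library has no DTD validator, so the following is not a real validating parser: it assumes that the content models of the DTD above are translated by hand into regular expressions (the table `CONTENT_MODELS` and the function `matches_content_model` are hypothetical), and it ignores attributes and the DOCTYPE declaration.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation (an assumption of this sketch) of the DTD content models
# into regular expressions over the space-terminated child element names.
CONTENT_MODELS = {
    "person": r"name (phone )*",  # <!ELEMENT person (name, phone*)>
    "name": r"first last ",       # <!ELEMENT name (first, last)>
}


def matches_content_model(element: ET.Element) -> bool:
    """Check an element and all its descendants against CONTENT_MODELS."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # Everything else is treated as (#PCDATA): no child elements allowed.
        return len(element) == 0
    child_names = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, child_names) is None:
        return False
    return all(matches_content_model(child) for child in element)


good = ET.fromstring(
    '<person type="Teacher"><name><first>Serge</first><last>Linckels</last>'
    "</name><phone>691-111111</phone></person>"
)
# Well-formed, but the phone comes before the name.
bad = ET.fromstring(
    "<person><phone>691-111111</phone><name><first>Serge</first>"
    "<last>Linckels</last></name></person>"
)

print(matches_content_model(good))
print(matches_content_model(bad))
```

Both documents parse without error, but only the first one respects the element order required by `(name, phone*)`.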
  • 20. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.4 Document Type Definition (DTD) 20 <?xml version="1.0" standalone="no"?> <!DOCTYPE person SYSTEM "http://www.linckels.lu/person.dtd"> <person type="Teacher"> <name> <first>Serge</first> <last>Linckels</last> </name> <phone>691-111111</phone> </person> <!ELEMENT person (name, phone*)> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT phone (#PCDATA)> person.xml person.dtd URI of the DTD-file e.g., "/mydisk/person.dtd" Validating a document A Web browser does not validate documents but only checks them for well-formedness XML validator APIs are available in Java Online validators, e.g., http://www.stg.brown.edu/service/xmlvalid/ http://www.w3.org/2001/03/webdata/xsv
  • 21. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.4 Document Type Definition (DTD) 21 <?xml version="1.0"?> <!DOCTYPE person [ <!ELEMENT person (name, phone*)> <!ATTLIST person type CDATA #IMPLIED> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT phone (#PCDATA)> ]> <person type="Teacher"> <name> <first>Serge</first> <last>Linckels</last> </name> <phone>691-111111</phone> </person> Valid XML document with internal DTD Sequences * Zero or more of the element is allowed ? Zero or one of the element is allowed + One or more of the element is required Elements must appear in the specified order Choices <!ELEMENT color (red | green)> Here, the element color can have a child-element red or green, not both at a time.
  • 22. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.4 Document Type Definition (DTD) 22 <?xml version="1.0"?> <!DOCTYPE person [ <!ELEMENT person (name, phone*)> <!ATTLIST person type CDATA #IMPLIED> <!ELEMENT name EMPTY> <!ATTLIST name first CDATA #IMPLIED last CDATA #REQUIRED > <!ELEMENT phone (#PCDATA)> ]> <person type="Teacher"> <name first="Serge" last="Linckels" /> <phone>691-111111</phone> </person> Attribute declarations Attribute defaults: #IMPLIED: value is optional #REQUIRED: value is required #FIXED: value is constant Literal: default value is given as quoted string Attribute types: CDATA: any string of text Enumeration: list of values ID: unique XML name IDREF: reference to the ID of some element in the document IDREFS: set of IDREFs Valid XML document with internal DTD
  • 23. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.4 Document Type Definition (DTD) 23 <family> <person id="jane" mother="mary" father="john"> <name>Jane Doe</name> </person> <person id="john" children="jane jack"> <name>John Doe</name> </person> <person id="mary" children="jane jack"> <name>Mary Smith</name> </person> <person id="jack" mother="mary" father="john"> <name>Jack Smith</name> </person> </family> DTD – ID, IDREF and IDREFS <!DOCTYPE family [ <!ELEMENT family (person*)> <!ELEMENT person (name)> <!ELEMENT name (#PCDATA)> <!ATTLIST person id ID #REQUIRED mother IDREF #IMPLIED father IDREF #IMPLIED children IDREFS #IMPLIED> ]> XML document DTD
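What ID, IDREF and IDREFS promise can be checked by hand. The following Python sketch is an illustration, not a DTD validator: it hard-codes the attribute names of the family example, and the function names `duplicate_ids` and `dangling_references` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

FAMILY_XML = """<family>
  <person id="jane" mother="mary" father="john"><name>Jane Doe</name></person>
  <person id="john" children="jane jack"><name>John Doe</name></person>
  <person id="mary" children="jane jack"><name>Mary Smith</name></person>
  <person id="jack" mother="mary" father="john"><name>Jack Smith</name></person>
</family>"""


def duplicate_ids(family: ET.Element) -> list:
    """ID rule: every id value may occur only once in the document."""
    seen, duplicates = set(), []
    for person in family.iter("person"):
        person_id = person.get("id")
        if person_id in seen:
            duplicates.append(person_id)
        seen.add(person_id)
    return duplicates


def dangling_references(family: ET.Element) -> list:
    """IDREF/IDREFS rule: every reference must point to an existing id."""
    known = {person.get("id") for person in family.iter("person")}
    missing = []
    for person in family.iter("person"):
        # mother and father are single IDREFs.
        refs = [person.get(a) for a in ("mother", "father") if person.get(a)]
        # children is an IDREFS value: a whitespace-separated list of IDREFs.
        refs += (person.get("children") or "").split()
        missing += [(person.get("id"), ref) for ref in refs if ref not in known]
    return missing


family = ET.fromstring(FAMILY_XML)
print(duplicate_ids(family))
print(dangling_references(family))
```

For the document on the slide both lists are empty; changing `mother="mary"` to an unknown name would make the second check report it.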
  • 24. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.4 Document Type Definition (DTD) 24 Problems and limitations DTDs are context-free grammars; recursive definitions are possible Order matters, e.g., <!ELEMENT person (last, first)> Workaround: <!ELEMENT person ((last, first) | (first, last))> Can become unclear: <!ELEMENT person ((name | phone | e-mail)*)> Lack of expressiveness, e.g., restrictions on references are not possible All elements are global in one namespace XML Schema, more powerful than DTD and W3C recommendation No support for newer features of XML DTDs are expressed in a non-XML syntax …but there are numerous other XML schema languages, e.g., RELAX NG, ISO DSDL, Schematron…
  • 25. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages Exercise
  • 26. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages Exercise <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE university [ <!ELEMENT university (teachers)> <!ELEMENT teachers (teacher*)> <!ELEMENT teacher (first, name, title, office?, teach*)> <!ELEMENT first (#PCDATA)> <!ELEMENT name (#PCDATA)> <!ELEMENT title (#PCDATA)> <!ELEMENT office (#PCDATA)> <!ELEMENT teach EMPTY> <!ATTLIST teach course IDREF #REQUIRED> ]>
  • 27. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 27 XML Schema - Overview XML Schema is an XML document containing a formal description of what comprises a valid XML document An XML document described by a schema is called an instance document More explicit restrictions on the number and sequence of child elements are possible Example <?xml version="1.0"?> <fullName>Serge Linckels</fullName> XML document <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="fullName" type="xs:string" /> </xs:schema> XML Schema xs: is standard prefix for XML Schema namespace
  • 28. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 28 Atomic types (more than 40!) string: Unicode string integer: positive or negative whole number boolean: true/false or 0/1 ID, IDREF, IDREFS: cf. DTD Simple types New simple types can be created by using atomic types <xs:element name="first" type="xs:string" /> <xs:element name="age" type="xs:integer" /> <xs:element name="link" type="xs:anyURI" /> <xs:element name="year" type="xs:gYear" /> <xs:simpleType name="aName"> <xs:restriction base="xs:string" /> </xs:simpleType> Restrictions can be defined <xs:simpleType name="aName"> <xs:restriction base="xs:string"> <xs:maxLength value="50" /> </xs:restriction> </xs:simpleType>
  • 29. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 29 More about restrictions Restrictions (facets) can be defined over simple types using xs:restriction <xs:simpleType name="location"> <xs:restriction base="xs:string"> <xs:enumeration value="work" /> <xs:enumeration value="school" /> <xs:enumeration value="mobile" /> </xs:restriction> </xs:simpleType> Enumerations: <xs:simpleType name="age"> <xs:restriction base="xs:unsignedShort"> <xs:minExclusive value="0" /> <xs:maxInclusive value="120" /> </xs:restriction> </xs:simpleType> Numeric facets: possible values: • minInclusive • maxInclusive • minExclusive • maxExclusive
  • 30. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 30 More about restrictions <xs:simpleType name="mobile-phone"> <xs:restriction base="xs:string"> <xs:pattern value="\d\d\d-\d\d \d\d \d\d" /> </xs:restriction> </xs:simpleType> Enforcing format: Enforces the rule that a mobile phone-number consists of 3 digits, a dash, 2 digits, a space, 2 digits, another space and finally 2 digits. <xs:simpleType name="TypeAuthor"> <xs:list itemType="xs:string" /> </xs:simpleType> Lists: <xs:element name="author" type="TypeAuthor" /> XML Schema – element definition XML Schema – simple type definition The author element of an instance document can contain an unlimited list of strings, each separated by whitespace
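The format rule stated on this slide (3 digits, a dash, 2 digits, a space, 2 digits, a space, 2 digits) and the list type can be mimicked in Python. This is only a sketch with the `re` module, not an XML Schema processor; the regular expression and the function names `is_mobile_phone` and `parse_author_list` are assumptions for illustration.

```python
import re

# An xs:pattern must match the whole value, so fullmatch() is used below.
MOBILE_PHONE = re.compile(r"\d{3}-\d{2} \d{2} \d{2}")


def is_mobile_phone(value: str) -> bool:
    """True if the whole value has the form ddd-dd dd dd (d = digit)."""
    return MOBILE_PHONE.fullmatch(value) is not None


def parse_author_list(value: str) -> list:
    """An xs:list value is split into its items at whitespace."""
    return value.split()


print(is_mobile_phone("691-11 11 11"))
print(is_mobile_phone("691-111111"))
print(parse_author_list("Harold Means"))
```

Because list items are separated by whitespace, an author name that itself contains a space would be split into two items, which is worth keeping in mind when choosing `xs:string` as the item type.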
  • 31. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 31 Complex types A complex type is an element that contains child-elements <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="person"> <xs:complexType> <xs:sequence> <xs:element name="first" type="xs:string" /> <xs:element name="last" type="xs:string" /> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> <?xml version="1.0"?> <person> <first>Serge</first> <last>Linckels</last> </person> XML document XML Schema Only elements can have complex types, attributes always have simple types sequence: order of elements matters (a,b) all: order of elements does not matter (a,b or b,a) choice: one or the other element (a xor b)
  • 32. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 32 Occurrence Constraints Set the number of times an element may occur: minOccurs: minimum occurrences maxOccurs: maximum occurrences <xs:element name="middle" type="xs:string" minOccurs="0" maxOccurs="unbounded" /> The default value for minOccurs and maxOccurs is 1 If only minOccurs="0" is set, maxOccurs keeps its default value of 1. Therefore, the middle element may then appear 0 or 1 times. The value unbounded, as in this example, indicates that the element may appear an unlimited number of times.
  • 33. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 33 Derived complex types Deriving by extension: add new definitions to an existing complex type. E.g., add the phone element to the existing person type. <xs:complexType name="PersonWithPhone"> <xs:complexContent> <xs:extension base="person"> <xs:sequence> <xs:element name="phone" type="xs:string" /> </xs:sequence> </xs:extension> </xs:complexContent> </xs:complexType> Deriving by restriction: by omitting parts of the parent definition, the restriction element creates a new, constrained type. <xs:complexType name="PersonWithMoreNames"> <xs:complexContent> <xs:restriction base="person"> <xs:sequence> <xs:element name="first" type="xs:string" minOccurs="2" /> <xs:element name="last" type="xs:string" /> </xs:sequence> </xs:restriction> </xs:complexContent> </xs:complexType> Structure cannot be changed! Only available for complex types
  • 34. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 34 <xs:element name="person"> <xs:complexType> <xs:sequence> <xs:element name="first" type="xs:string" /> <xs:element name="last" type="xs:string" /> </xs:sequence> <xs:attribute name="job" type="xs:string" /> </xs:complexType> </xs:element> Attribute declarations Attributes can be declared globally by top-level xs:attribute <xs:attribute name="job" type="xs:string" use="optional" /> Attributes can be declared locally as part of a complex type definition <?xml version="1.0"?> <person job="Teacher"> <first>Serge</first> <last>Linckels</last> </person> possible values: optional or required XML document
  • 35. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.5 XML Schema 35 Conclusion: DTD vs. XML Schema XML Schema is a more powerful language than DTD for specifying the syntax of XML documents; therefore, it is more expressive in terms of semantics XML Schema is a W3C recommendation and widely used. As DTDs are simpler to use, they are still used today
  • 36. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.6 Namespaces 36 <compactDisk author="HS"> <titel>Remixes</titel> <track number="1"> <titel>Night over Manaus</titel> <author>Boozoo Bajou</author> </track> </compactDisk> Problem of ambiguous names XML names can be used for different elements. But this creates ambiguities. XML namespaces disambiguate elements with the same name from each other by assigning elements and attributes to URIs. Qualified names, prefixes and local parts Elements are identified by qualified names: cd:titel prefix local name qualified name
  • 37. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 8.6 Namespaces 37 Using XML namespaces <cd:compactDisk xmlns:cd = "http://www.linckels.lu/cd" xmlns:tr = "http://www.xyz.com/tracks" author="HS"> <cd:titel>Remixes</cd:titel> <tr:track number="1"> <tr:titel>Night over Manaus</tr:titel> <tr:author>Boozoo Bajou</tr:author> </tr:track> </cd:compactDisk> Each element exists in a unique namespace Namespace URIs are purely formal identifiers; they are not the addresses of a page, and they are not meant to be followed as links <tr:title>Remixes</tr:title> Conceptually, the prefix stands for the complete URI (this expanded form is not legal XML syntax), e.g., <http://www.xyz.com/tracks#title> Remixes </http://www.xyz.com/tracks#title> Namespace binding: each prefix in a qualified name must be associated with a URI Unprefixed attributes, such as author and number here, are not in any namespace
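How a parser resolves the prefixes can be observed with Python's standard library. This is a minimal sketch; `xml.etree.ElementTree` represents a qualified name as `{namespace URI}local name`, which is a notation of that library and not XML syntax. The variable names are chosen for this example.

```python
import xml.etree.ElementTree as ET

CD_XML = """<cd:compactDisk xmlns:cd="http://www.linckels.lu/cd"
                xmlns:tr="http://www.xyz.com/tracks" author="HS">
  <cd:titel>Remixes</cd:titel>
  <tr:track number="1">
    <tr:titel>Night over Manaus</tr:titel>
    <tr:author>Boozoo Bajou</tr:author>
  </tr:track>
</cd:compactDisk>"""

# Prefix-to-URI bindings used for the searches below. The prefixes are free
# to choose; only the URIs have to match those in the document.
NS = {"cd": "http://www.linckels.lu/cd", "tr": "http://www.xyz.com/tracks"}

disk = ET.fromstring(CD_XML)
disk_title = disk.find("cd:titel", NS)
track_title = disk.find("tr:track/tr:titel", NS)

print(disk.tag)         # the prefix is replaced by its URI
print(disk_title.tag, disk_title.text)
print(track_title.tag, track_title.text)
print(disk.attrib)      # the unprefixed attribute keeps its plain name
```

The two `titel` elements end up with different expanded names, which is exactly the disambiguation that namespaces are meant to provide.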
  • 38. Media IT :: Dr Serge Linckels :: http://www.linckels.lu/ :: serge@linckels.lu :: 8. XML and its Sub-Languages 38 XML in a Nutshell Elliotte R. Harold, W. Scott Means E-Librarian Service User-Friendly Semantic Search in Digital Libraries Serge Linckels, Christoph Meinel