Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 1
Semantic Web
Unit 3: XML and its Sub-Languages
Faculty of Science, Technology and Communication (FSTC)
Bachelor en informatique (professionnel)
3. XML and its Sub-Languages
Semantic Web Roadmap:
Controlled growth, bottom-up, according to this architecture.
The architecture has been (slightly) modified in recent years.
3.1. Why is HTML not Sufficient?
3.2. XML - Introduction
3.3. XML – Language Specifications
3.4. Document Type Definitions (DTD)
3.5. XML Schemas
3.6. Namespaces
3.7. Programming Models
3.8. XLink, XPath and XPointer
3.9. XSL Transformations (XSLT)
3.10. References
3.1. Why is HTML not Sufficient?
<h1>Christoph Meinel</h1>
<h2>Viola Brehmer</h2>
<ul>
<li>Long Wang</li>
<li>Feng Cheng</li>
<li>Dirk Cordel</li>
<li>Serge Linckels</li>
</ul>
Harald Sack
Limitations of HTML
HTML was designed to give a document structure and to control its layout, NOT to
describe semantics
What is this Web page about?
What position has "Viola Brehmer"?
…
Meta-Tags
<meta name="description"
content="Homepage of Serge Linckels">
<meta name="keywords"
content="teacher, athlete">
<meta name="Author"
content="The Master of the Universe">
<meta name="xyz"
content="nothing special">
Do you believe in Meta-Tags?
HTML metadata are created by the author of the Web page.
Their syntax and semantics are chosen individually by each author and are not standardized.
3.2. XML - Introduction
Extensible Markup Language (XML)
Markup Language: gives structure to text documents
by using tags
Meta Language: XML does not have a fixed set of tags (new
tags can be created)
Extensible: XML can be adapted (extended) to meet many
different domains, e.g.,
• Mathematical Markup Language (MathML)
• Chemical Markup Language (CML)
• Synchronized Multimedia Integration Language (SMIL)
• Wireless Markup Language (WML), used with WAP
Creator Jon Bosak, 1996
XML is not…
a programming language
a network transport protocol
a database
XML is…
a simple data format
platform independent
does not require special applications
[Two figures; pictures created by Harald Sack]
Standard Generalized Markup Language (SGML)
The Standard Generalized Markup Language
(SGML) is a metalanguage in which one can
define markup languages for documents
HTML
• Instance of SGML
• Layout of data
• Layout and data are mixed up
XML
• Subset of SGML
• Structure of data
• Layout and data are separated
HTML and XML combine into XHTML
Welcome to LIASIT
LIASIT stands for Luxembourg Advanced Studies in Information
Technology and since August 01, 2006 is a Doctoral School in the Faculty
of Science, Technology and Communication.
The faculty is composed of the following professors: David BASIN, Pascal
BOUVRY, Eric DUBOIS, Thomas ENGEL, Franck LEPREVOST, Christoph
MEINEL, Nicolas GUELFI, and Björn OTTERSTEN.
The PhD Students are: Christoph BRANDT, Pandu DEVARAKOTA, Daniel
FISCHER, Benjamin GATEAU, Markus GROSS, Joel GROTZ, Annie
GUERRIERO, Serge LINCKELS, Nicolas MAYER, Michael NOLL, Benoît RIES,
Michael STIEGHAHN.
Magali MARTIN is the secretary of LIASIT... and also a nice entertainer.
How we can see this…
Welcome to LIASIT
LIASIT stands for Luxembourg Advanced Studies in
Information Technology and since August 01, 2006 is a
Doctoral School in the Faculty of Science, Technology
and Communication.
The faculty is composed of the following professors:
David BASIN, Pascal BOUVRY, Eric DUBOIS, Thomas ENGEL,
Franck LEPREVOST, Christoph MEINEL, Nicolas GUELFI,
and Brn OTTERSTEN.
The PhD Students are: Christoph BRANDT, Pandu
DEVARAKOTA, Daniel FISCHER, Benamin GATEAU, Markus
GROSS, Joel GROTZ, Annie GUERRIERO, Serge LINCKELS,
Nicolas MAYER, Michael NOLL, Benot RIES, Michael
STIEGHAHN.
Magali MARTIN is the secretary of LIASIT... and also
a nice entertainer.
What a computer sees…
<title>Welcome to LIASIT</title>
<description>LIASIT stands for Luxembourg Advanced
Studies in Information Technology and since August 01,
2006 is a Doctoral School in the Faculty of Science,
Technology and Communication.</description>
<profs>
<name>The faculty</name>
<name>is composed of</name>
<name>the following</name>
</profs>
<students>
<name>The PhD</name>
<name>Students are:</name>
<name>Christoph BRANDT,</name>
<name>Pandu DEVARAKOTA,</name>
</students>
<administration>
<name>Daniel FISCHER, </name>
</administration>
How we can help…
<profs>
<name>Thomas Engel</name>
<name>Christoph Meinel</name>
<name>David Basin</name>
<name>Björn Ottersten</name>
</profs>
<students>
<name>Benoît Ries</name>
<name>Daniel Fischer</name>
<name>Christoph Brandt</name>
<name>Pandu Devarakota</name>
<name>Serge Linckels</name>
</students>
Benefits of XML
Document is well-structured
Applications can process the file
XML file
Thomas Engel
Christoph Meinel
David Basin
Björn Ottersten
Benoît Ries
Daniel Fischer
Christoph Brandt
Pandu Devarakota
Serge Linckels
pure text file
Problems with text file
No structure
Difficult to process
Attention: although XML adds a certain amount of
semantics to the document, some information is still
missing, e.g., what is the
relation between "profs" and "students"?
3.3. XML – Language Specifications
<person type="Teacher">
<name>Serge Linckels</name>
<hp>http://www.linckels.lu</hp>
<size>173</size>
<phone>691-111111</phone>
</person>
element
attribute
child-element
value
Terminology
Tree representation
person
name hp size phone
Serge Linckels
http://www.linckels.lu
173
691-111111
General
XML is composed of text and tags
Tags come in pairs, e.g., <hp></hp>
Tags must be properly nested, e.g.,
wrong: <person><hp></person></hp>
correct: <person><hp></hp></person>
Tags are case sensitive: <hp> ≠ <HP>
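The nesting and case rules above can be checked mechanically. The following is a minimal sketch, assuming the standard JAXP parser that ships with the JDK; the class name WellFormedCheck and the method isWellFormed are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original slides.
class WellFormedCheck {

    // Returns true if the JAXP parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // raised for well-formedness (fatal) errors
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Properly nested tags
        System.out.println(isWellFormed("<person><hp></hp></person>"));
        // Overlapping tags
        System.out.println(isWellFormed("<person><hp></person></hp>"));
        // Start and end tag differ in case
        System.out.println(isWellFormed("<hp></HP>"));
    }
}
```

Only the first of the three inputs should be accepted; the other two break the nesting rule and the case-sensitivity rule respectively.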
XML structure
<staff>
<person type="Teacher">
<name>Serge Linckels</name>
<hp>http://www.linckels.lu</hp>
<size>173</size>
<phone>26 00 11 22</phone>
<phone>691-111111</phone>
</person>
<person type="Teacher">
<name>Denis Zampunieris</name>
<phone>4666445290</phone>
</person>
</staff>
The same element can be used repeatedly.
Nested tags can be part of a list too.
XML itself does not prescribe an order for child elements (a DTD or schema can impose one).
Element or attribute?
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
or
<name first="Serge" last="Linckels"></name>
Both variants are semantically identical
XML Names
Can include:
- letters (a..z, A..Z)
- digits (0..9)
- these punctuation chars:
- underscore (_)
- hyphen (-)
- period (.)
- special chars like ö, ç, Ω
Examples for valid XML Names:
<drivers_licence>
<_oki-doki>
<téléphone>
<this.works>
CDATA Sections
Everything between <![CDATA[ and ]]> is
treated as raw character data.
<person type="Teacher">
<name>Serge Linckels</name>
<![CDATA[This is just some
code that is ignored,
10 print "Hello world"
20 goto 10
]]>
<phone>26 00 11 22</phone>
</person>
Comments
Comments are between <!-- and --> like in
HTML
XML declaration
<?xml version="1.0" encoding="ASCII" standalone="yes"?>
<person type="Teacher">
<name>Serge Linckels</name>
<hp>http://www.linckels.lu</hp>
<size>173</size>
<phone>691-111111</phone>
</person>
encoding: XML is pure text, but can use different encodings, e.g.,
ASCII, ISO-8859-1 (Latin-1), UTF-8, UTF-16. When omitted, UTF-8 is assumed
(or UTF-16 if a byte order mark is present).
standalone:
- "yes", the document does not depend on external markup declarations (external DTD)
- "no", external markup declarations may be used
XML-defined character sets
Unicode: 95156 characters from most of Earth's
living languages (encodings: UCS-2, UCS-4, UTF-8,
UTF-16)
ISO character sets: e.g., ISO-8859-15 (Latin 9)
is ASCII + accented letters + €
JSON – JavaScript Object Notation
No element names
Primary data format used for asynchronous
browser/server communication (AJAX)
Language-independent data format
Supported by many programming languages
3.4. Document Type Definitions (DTD)
<person type="Teacher">
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
<phone>691-111111</phone>
</person>
<Personne>
<Type>Teacher</Type>
<Nom>Serge Linckels</Nom>
<HP>http://www.linckels.lu</HP>
<Sexe>M</Sexe>
</Personne>
?
Formal syntax is required
DTD – Document Type Definitions
A DTD describes the syntax of an XML document
A validating parser checks the XML document
against the syntax given in the DTD
An XML document is valid if it respects the
syntax defined in its DTD
<!ELEMENT person (name, phone*)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
#PCDATA: parsed character data (text content, i.e., a string)
A person-element can contain
1 name sub-element and 0..*
phone sub-elements
Attention: a document can be well-formed
but not valid!
Web browser only checks if well-formed.
person.xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE person SYSTEM "http://www.linckels.lu/person.dtd">
<person type="Teacher">
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
<phone>691-111111</phone>
</person>
The SYSTEM identifier is the URI of the DTD file,
e.g., "/mydisk/person.dtd"
person.dtd
<!ELEMENT person (name, phone*)>
<!ATTLIST person type CDATA #IMPLIED>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Validating a document
A Web browser does not validate documents;
it only checks them for well-formedness
XML validation APIs are available in Java
Online validators, e.g.,
http://www.stg.brown.edu/service/xmlvalid/
http://www.w3.org/2001/03/webdata/xsv
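As a rough illustration of validation in Java, the sketch below switches on DTD validation in the standard JAXP DocumentBuilderFactory and collects validity errors. The class name DtdValidation, the constants and the method validate are assumptions made for this example; the DTD is the person DTD from this unit, embedded as an internal subset and extended with an ATTLIST for the type attribute so that the sample document can be valid.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original slides.
class DtdValidation {

    // Internal DTD subset, prepended to every document that is checked.
    static final String DTD =
        "<!DOCTYPE person [\n"
      + "<!ELEMENT person (name, phone*)>\n"
      + "<!ATTLIST person type CDATA #IMPLIED>\n"
      + "<!ELEMENT name (first, last)>\n"
      + "<!ELEMENT first (#PCDATA)>\n"
      + "<!ELEMENT last (#PCDATA)>\n"
      + "<!ELEMENT phone (#PCDATA)>\n"
      + "]>\n";

    static final String VALID =
        "<person type=\"Teacher\"><name><first>Serge</first>"
      + "<last>Linckels</last></name><phone>691-111111</phone></person>";

    // Well-formed, but the required name element is missing.
    static final String INVALID =
        "<person><phone>691-111111</phone></person>";

    // Returns the list of problems found; an empty list means "valid".
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors arrive here
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (Exception e) {
            errors.add("not well-formed or unreadable: " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("valid document:   " + validate(VALID));
        System.out.println("invalid document: " + validate(INVALID));
    }
}
```

The second document illustrates the point that a document can be well-formed but not valid: it parses, yet the validating parser reports that person does not match (name, phone*).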
<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person (name, phone*)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ATTLIST person type CDATA #IMPLIED>
]>
<person type="Teacher">
<name>
<first>Serge</first>
<last>Linckels</last>
</name>
<phone>691-111111</phone>
</person>
Valid XML document with internal DTD
Sequences
* Zero or more of the element is allowed
? Zero or one of the element is allowed
+ One or more of the element is required
Elements must appear in the specified order
Choices
<!ELEMENT color (red | green)>
Here, the element color can have a child-
element red or green, not both at a time.
<?xml version="1.0"?>
<!DOCTYPE person [
<!ELEMENT person (name, phone*)>
<!ATTLIST person type CDATA #IMPLIED>
<!ELEMENT name EMPTY>
<!ATTLIST name first CDATA #IMPLIED
last CDATA #REQUIRED
>
<!ELEMENT phone (#PCDATA)>
]>
<person type="Teacher">
<name first="Serge" last="Linckels" />
<phone>691-111111</phone>
</person>
Attribute declarations
Attribute defaults:
#IMPLIED: value is optional
#REQUIRED: value is required
#FIXED: value is constant
Literal: default value, given as a quoted string
Attribute types:
CDATA: any string of text
Enumeration: list of values
ID: unique XML name
IDREF: reference to the ID of some
element in the document
IDREFS: whitespace-separated list of IDREFs
Valid XML document with internal DTD
<family>
<person id="jane" mother="mary" father="john">
<name>Jane Doe</name>
</person>
<person id="john" children="jane jack">
<name>John Doe</name>
</person>
<person id="mary" children="jane jack">
<name>Mary Smith</name>
</person>
<person id="jack" mother="mary" father="john">
<name>Jack Smith</name>
</person>
</family>
DTD – ID, IDREF and IDREFS
<!DOCTYPE family [
<!ELEMENT family (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person
id ID #REQUIRED
mother IDREF #IMPLIED
father IDREF #IMPLIED
children IDREFS #IMPLIED>
]>
XML document
DTD
Problems and limitations
DTD are context-free grammars; recursive definitions are possible
Order matters, e.g.,
<!ELEMENT person (last, first)>
Workaround:
<!ELEMENT person ((last, first) | (first, last))>
Can become unclear:
<!ELEMENT person ((name | phone | e-mail)*)>
Lack of expressiveness, e.g., restrictions on references are not possible
All elements are global in one namespace
XML Schema is more powerful than DTD and is a W3C recommendation
No support for newer features of XML
DTDs are expressed in a non-XML syntax
…but there are numerous other XML schema languages, e.g., RELAX NG, ISO DSDL, Schematron…
3.5. XML Schemas
XML Schema - Overview
XML Schema is an XML document containing a formal description of what comprises a valid
XML document
An XML document described by a schema is called an instance document
More explicit restrictions on the number and sequence of child elements are possible
Example
<?xml version="1.0"?>
<fullName>Serge Linckels</fullName>
XML document
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="fullName" type="xs:string" />
</xs:schema>
XML Schema
xs: is the conventional prefix for the XML Schema namespace
Atomic types (more than 40!)
string: Unicode string
integer: positive or negative whole number
boolean: true/false or 0/1
ID, IDREF, IDREFS: cf. DTD
Simple types
New simple types can be created by using atomic types
<xs:element name="first" type="xs:string" />
<xs:element name="age" type="xs:integer" />
<xs:element name="link" type="xs:anyURI" />
<xs:element name="year" type="xs:gYear" />
<xs:simpleType name="aName">
<xs:restriction base="xs:string" />
</xs:simpleType>
Restrictions can be defined
<xs:simpleType name="aName">
<xs:restriction base="xs:string">
<xs:maxLength value="50" />
</xs:restriction>
</xs:simpleType>
More about restrictions
Restrictions (facets) can be defined over simple types using xs:restriction
Enumerations:
<xs:simpleType name="location">
<xs:restriction base="xs:string">
<xs:enumeration value="work" />
<xs:enumeration value="school" />
<xs:enumeration value="mobile" />
</xs:restriction>
</xs:simpleType>
Numeric facets:
<xs:simpleType name="age">
<xs:restriction base="xs:unsignedShort">
<xs:minExclusive value="0" />
<xs:maxInclusive value="120" />
</xs:restriction>
</xs:simpleType>
possible values:
• minInclusive
• maxInclusive
• minExclusive
• maxExclusive
Enforcing format:
<xs:simpleType name="mobile-phone">
<xs:restriction base="xs:string">
<xs:pattern value="\d\d\d-\d\d \d\d \d\d" />
</xs:restriction>
</xs:simpleType>
Enforces the rule that a mobile phone number consists of 3 digits, a
dash, 2 digits, a space, 2 digits, another space and finally 2 digits.
Lists:
XML Schema – simple type definition
<xs:simpleType name="TypeAuthor">
<xs:list itemType="xs:string" />
</xs:simpleType>
XML Schema – element definition
<xs:element name="author" type="TypeAuthor" />
The author element of an instance document can contain an unlimited list
of strings, each separated by whitespace
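To see a pattern facet at work, the sketch below validates a one-element instance document against a small schema using the standard javax.xml.validation API. The class name PhonePatternCheck, the method isValidPhone and the phone element declaration are assumptions added for this example; the simple type is the mobile-phone type from this unit.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original slides.
class PhonePatternCheck {

    // Schema with the pattern facet; the phone element is an assumption.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='mobile-phone'>"
      + "<xs:restriction base='xs:string'>"
      + "<xs:pattern value='\\d\\d\\d-\\d\\d \\d\\d \\d\\d'/>"
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "<xs:element name='phone' type='mobile-phone'/>"
      + "</xs:schema>";

    // Wraps the value in a phone element and validates it.
    // The value is inserted unescaped, which is fine for plain digits only.
    static boolean isValidPhone(String value) {
        Validator validator;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(XSD)));
            validator = schema.newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is broken", e);
        }
        try {
            validator.validate(new StreamSource(
                new StringReader("<phone>" + value + "</phone>")));
            return true;
        } catch (SAXException e) {
            return false; // the instance violates the pattern facet
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidPhone("691-11 22 33"));
        System.out.println(isValidPhone("691-111111"));
    }
}
```

The first value follows the 3-2-2-2 digit layout and should be accepted; the second has the right digits but the wrong layout and should be rejected.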
Complex types
A complex type describes an element that contains
child-elements
XML Schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="first" type="xs:string" />
<xs:element name="last" type="xs:string" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XML document
<?xml version="1.0"?>
<person>
<first>Serge</first>
<last>Linckels</last>
</person>
Only elements can have complex types,
attributes always have simple types
sequence: order of elements matters (a,b)
all: order of elements does not matter (a,b or b,a)
choice: one or the other element (a xor b)
Occurrence Constraints
Set the number of times an element may occur:
minOccurs: minimum occurrences
maxOccurs: maximum occurrences
<xs:element name="middle" type="xs:string"
minOccurs="0" maxOccurs="unbounded" />
The default value for minOccurs and maxOccurs is 1
If maxOccurs were omitted in this example, it would default to 1 and the middle
element could appear 0 or 1 times.
The value unbounded indicates that the element may appear an unlimited number of times;
here, the middle element may appear 0 or more times.
Derived complex types
Deriving by extension: add new definitions to an existing complex type. E.g.,
add the phone element to the existing person type.
<xs:complexType name="PersonWithPhone">
<xs:complexContent>
<xs:extension base="person">
<xs:sequence>
<xs:element name="phone" type="xs:string" />
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Deriving by restriction: by omitting or narrowing parts of the parent definition, the
restriction element creates a new, constrained type.
<xs:complexType name="PersonWithMoreNames">
<xs:complexContent>
<xs:restriction base="person">
<xs:sequence>
<xs:element name="first" type="xs:string" minOccurs="2" />
<xs:element name="last" type="xs:string" />
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
Structure cannot
be changed!
Only available for
complex types
Attribute declarations
Attributes can be declared globally by a top-level xs:attribute
<xs:attribute name="job" type="xs:string" />
Attributes can be declared locally as part of a complex type definition
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="first" type="xs:string" />
<xs:element name="last" type="xs:string" />
</xs:sequence>
<xs:attribute name="job" type="xs:string" use="optional" />
</xs:complexType>
</xs:element>
use: possible values include optional or required (use is only allowed on local declarations)
XML document
<?xml version="1.0"?>
<person job="Teacher">
<first>Serge</first>
<last>Linckels</last>
</person>
Conclusion: DTD vs. XML Schemas
XML Schema is a more powerful language than DTD for specifying the syntax of XML
documents; it can therefore express more constraints on their content
XML Schema is a W3C recommendation and widely used. Because DTDs are simpler to use, they
are still in use today
3.6. Namespaces
<compactDisk author="HS">
<titel>Remixes</titel>
<track number="1">
<titel>Night over Manaus</titel>
<author>Boozoo Bajou</author>
</track>
</compactDisk>
Problem of ambiguous names
The same XML name can be used for different
elements, which creates ambiguities.
XML namespaces disambiguate elements with the same name from each
other by assigning elements and attributes to URIs.
Qualified names, prefixes and local parts
Elements are identified by qualified names:
cd:titel
prefix local name
qualified name
Using XML namespaces
<cd:compactDisk
xmlns:cd = "http://www.linckels.lu/cd"
xmlns:tr = "http://www.xyz.com/tracks"
author="HS">
<cd:titel>Remixes</cd:titel>
<tr:track number="1">
<tr:titel>Night over Manaus</tr:titel>
<tr:author>Boozoo Bajou</tr:author>
</tr:track>
</cd:compactDisk>
Each element exists in a unique
namespace
Namespace URIs are purely formal
identifiers; they are not the
addresses of a page, and they are
not meant to be followed as links
<tr:title>Remixes</tr:title>
Conceptually, the prefix stands for the complete URI (this expanded form is not valid XML syntax), e.g.,
<http://www.xyz.com/tracks#title>
Remixes
</http://www.xyz.com/tracks#title>
Namespace binding: each prefix in a qualified name must be associated with a URI
Default namespaces only apply to elements,
not to attributes; an attribute is in a namespace only if it carries a prefix
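The prefix-to-URI binding can be observed with a namespace-aware DOM parser. The following is a minimal sketch, assuming the JAXP parser of the JDK and a shortened version of the compactDisk document from this unit; the class name NamespaceDemo and its methods are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original slides.
class NamespaceDemo {

    static final String XML =
        "<cd:compactDisk xmlns:cd='http://www.linckels.lu/cd'"
      + " xmlns:tr='http://www.xyz.com/tracks' author='HS'>"
      + "<cd:titel>Remixes</cd:titel>"
      + "<tr:track number='1'>"
      + "<tr:titel>Night over Manaus</tr:titel>"
      + "</tr:track>"
      + "</cd:compactDisk>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Text of the first "titel" element in the given namespace, or null.
    static String titleIn(String namespaceUri) {
        NodeList hits = parse().getElementsByTagNameNS(namespaceUri, "titel");
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
    }

    // Namespace URI of the unprefixed author attribute on the root element.
    static String authorAttributeNamespace() {
        return parse().getDocumentElement()
                      .getAttributeNode("author")
                      .getNamespaceURI();
    }

    public static void main(String[] args) {
        System.out.println(titleIn("http://www.linckels.lu/cd"));
        System.out.println(titleIn("http://www.xyz.com/tracks"));
        System.out.println(authorAttributeNamespace());
    }
}
```

The two titel elements share a local name but are told apart by their namespace URI, while the unprefixed author attribute reports no namespace at all.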
3.7. Programming Models
Common XML processing models
Treating XML as text
Treating XML as events; the document is reported piece by piece as it is read (e.g., an "event" can
be the start of an element, the content of an element, and the end of an element)
Treating XML as tree models
XML transformations
Abstracting XML away; the application does not deal with the XML elements directly
Most commonly used
Document Object Model (DOM)
Simple API for XML (SAX)
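The event model can be sketched with SAX as follows. This is only an illustration, assuming the SAX parser bundled with the JDK; the class name SaxEvents and the method events are made up, and the handler simply records each event instead of processing it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original slides.
class SaxEvents {

    // Parses the text and returns the events in the order they occurred.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                log.add("start " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks; skip pure whitespace.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text " + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : events("<person><name>Serge</name></person>")) {
            System.out.println(event);
        }
    }
}
```

Unlike DOM, nothing is kept in memory beyond what the handler stores itself, which is why the event model suits very large documents.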
DOM overview
The entire document must be read and parsed before it is available as DOM; unsuitable for
very large documents
User accesses data by traversing the tree (tree and its traversal conform to a W3C standard)
The API allows for constructing, accessing and manipulating the structure and content of XML
documents
Example:
<countries>
<country continent="Asia">
<name>Israel</name>
<population year="2001">6199008</population>
<city capital="yes"><name>Jerusalem</name></city>
<city><name>Ashdod</name></city>
</country>
<country continent="Europe">
<name>France</name>
<population year="2004">60424213</population>
</country>
</countries>
DOM tree (figure): the root node "document" contains "countries", which contains two "country" nodes. Each country has a "continent" attribute (Asia, Europe) and the children "name" (Israel, France) and "population" (6199008, 60424213) with a "year" attribute (2001, 2004). The first country also contains two "city" nodes, each with a "capital" attribute (yes, no) and a "name" child. Legend: root node, node, value node.
DOM Java API - general
XML document → DOM parser → DOM tree (in memory) → API → Application
DOM tree is generated by a DocumentBuilder
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document myXMLdoc = builder.parse("world.xml");
The builder is created by a factory so that the code stays implementation independent.
The factory is chosen according to the system configuration
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
DOM Java API – node API
//create the root element
Element root = myXMLdoc.createElement("root");
//add it to the xml tree
myXMLdoc.appendChild(root);
//create child element
Element childElement = myXMLdoc.createElement("Child");
//add the attribute to the child
childElement.setAttribute("attribute1","The value of Attribute 1");
//add child element to the root element
root.appendChild(childElement);
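The fragment above can be put into a complete program roughly as follows. This sketch assumes an empty document created with DocumentBuilder.newDocument(), because a DOM document can hold only one root element; the class name DomBuild and its methods are made up for this illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the original slides.
class DomBuild {

    // Builds the tree: document -> root -> Child (with attribute1).
    static Document build() {
        try {
            // Start from an empty document instead of a parsed one.
            Document doc = DocumentBuilderFactory.newInstance()
                                                 .newDocumentBuilder()
                                                 .newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            Element child = doc.createElement("Child");
            child.setAttribute("attribute1", "The value of Attribute 1");
            root.appendChild(child);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Name of the root element of the built tree.
    static String rootName() {
        return build().getDocumentElement().getNodeName();
    }

    // Value of attribute1 on the first child of the root element.
    static String childAttribute() {
        Element child = (Element) build().getDocumentElement().getFirstChild();
        return child.getAttribute("attribute1");
    }

    public static void main(String[] args) {
        System.out.println(rootName());
        System.out.println(childAttribute());
    }
}
```

Calling appendChild with a second element directly on a document that already has a root element would raise a DOMException, which is why the sketch does not reuse a document parsed from a file.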
The nodes of the DOM tree include
- a special root (denoted document)
- element nodes
- text nodes and CDATA sections
- attributes
- comments
- and more ...
Example (figure): the code above yields the tree myXMLdoc → root → Child, where Child carries attribute1 = "The value of Attribute 1"
Every node in the DOM tree implements the Node interface
Sub-interfaces of Node (org.w3c.dom):
DocumentFragment
Document
Element
Attr
CharacterData (with Text, Comment and CDATASection)
DocumentType
Notation
Entity
EntityReference
ProcessingInstruction
Every node has a specific location in tree
Node interface specifies methods for tree navigation
• Node getFirstChild();
• Node getLastChild();
• Node getNextSibling();
• Node getPreviousSibling();
• Node getParentNode();
• NodeList getChildNodes();
• NamedNodeMap getAttributes();
Every node has :
• a type
• a name
• a value
• attributes
The roles of these properties differ according to the node types
if (myNode.getNodeType() == Node.ELEMENT_NODE) {
//process node
…
}
Node-types:
ELEMENT_NODE = 1
ATTRIBUTE_NODE = 2
TEXT_NODE = 3
CDATA_SECTION_NODE = 4
ENTITY_REFERENCE_NODE = 5
ENTITY_NODE = 6
PROCESSING_INSTRUCTION_NODE = 7
COMMENT_NODE = 8
DOCUMENT_NODE = 9
DOCUMENT_TYPE_NODE = 10
DOCUMENT_FRAGMENT_NODE = 11
NOTATION_NODE = 12
Example
// print one given city (city is given as element)
public static void printCity(Element city) {
Node nameNode = city.getElementsByTagName("name").item(0);
String cName = nameNode.getFirstChild().getNodeValue();
System.out.println("Found City: " + cName);
}
// prints all cities found in the DOM tree
public static void printCities(Document myXMLdoc) {
NodeList cities = myXMLdoc.getElementsByTagName("city");
for(int i = 0; i < cities.getLength(); ++i) {
printCity((Element)cities.item(i));
}
}
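A self-contained variant of the two methods above might look like the following sketch. It assumes the countries document from this unit, given as a string instead of the file world.xml, and it collects the city names in a list instead of printing them directly; the class name CityNames is made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original slides.
class CityNames {

    static final String WORLD =
        "<countries>"
      + "<country continent='Asia'><name>Israel</name>"
      + "<population year='2001'>6199008</population>"
      + "<city capital='yes'><name>Jerusalem</name></city>"
      + "<city><name>Ashdod</name></city></country>"
      + "<country continent='Europe'><name>France</name>"
      + "<population year='2004'>60424213</population></country>"
      + "</countries>";

    // Name of one given city (the city is given as an element).
    static String cityName(Element city) {
        Node nameNode = city.getElementsByTagName("name").item(0);
        // The text of an element lives in its first child, a text node.
        return nameNode.getFirstChild().getNodeValue();
    }

    // Names of all cities found in the DOM tree, in document order.
    static List<String> cityNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList cities = doc.getElementsByTagName("city");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < cities.getLength(); ++i) {
                names.add(cityName((Element) cities.item(i)));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : cityNames(WORLD)) {
            System.out.println("Found City: " + name);
        }
    }
}
```

Country names are not picked up, because only name elements below a city element are read.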
Children of a node in a DOM tree can be manipulated: added, edited, deleted, copied etc.
To construct new nodes, use the methods of Document: createElement, createAttribute,
createTextNode, createCDATASection etc.
To manipulate a node, use the methods of Node: appendChild, insertBefore, removeChild,
replaceChild, setNodeValue, cloneNode(boolean deep) etc.
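Figure: effect of insertBefore() and replaceChild() with a new node, and of cloneNode(false) (copies the node only) versus cloneNode(true) (copies the node and its whole subtree)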
Picture created by Harald Sack
3.8. XLink, XPath and XPointer
Overview
XML Linking Language (XLink) is an XML markup language used for creating hyperlinks in XML
documents; compare the HTML tag <a href="…">
XPointer is a system for addressing components of XML-based Internet media; compare the HTML anchor <a name="…">
XML Path Language (XPath) is a language for selecting nodes from an XML document
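As an illustration of node selection with XPath, the sketch below evaluates a few expressions with the standard javax.xml.xpath API. It assumes the countries document used in the DOM part of this unit; the class name XPathDemo, the method evaluate and the chosen expressions are made up for this example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original slides.
class XPathDemo {

    static final String WORLD =
        "<countries>"
      + "<country continent='Asia'><name>Israel</name>"
      + "<population year='2001'>6199008</population>"
      + "<city capital='yes'><name>Jerusalem</name></city>"
      + "<city><name>Ashdod</name></city></country>"
      + "<country continent='Europe'><name>France</name>"
      + "<population year='2004'>60424213</population></country>"
      + "</countries>";

    // Evaluates the expression against WORLD and returns the string result.
    static String evaluate(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression,
                                  new InputSource(new StringReader(WORLD)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Name of the country whose continent attribute is Europe
        System.out.println(evaluate("/countries/country[@continent='Europe']/name"));
        // Name of the city marked as capital, anywhere in the document
        System.out.println(evaluate("//city[@capital='yes']/name"));
        // Number of city elements in the document
        System.out.println(evaluate("count(//city)"));
    }
}
```

Each expression is a path through the document tree, with predicates in square brackets narrowing the selection by attribute value.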
XLink - general
Allows creating unidirectional links between exactly 2 resources
Origin of the link is always the starting document (where the link is embedded)
Browsers are free to interpret this link as they like (depends on the CSS used)
Starting document → Destination document, comparable to <a href="…">
Example: implemented like an HTML hyperlink, but XLink can express more…
XLink – parameters 1/3
<publications
xmlns:xlink="http://www.w3.org/1999/xlink"
xlink:title="My publications"
xlink:href="http://www.linckels.lu/publications.txt"
xlink:role="http://www.dblp.de"
xlink:show="new"
xlink:actuate="onRequest"
xlink:type="simple"
/>
xlink: specifies that this is an XLink definition
title: (optional) textual information about the link (displayed in a browser as a hint)
href: URI of the linked destination resource (need not be a URL)
role: (optional) points to a resource that specifies the meaning of the connection between
the resources, e.g., a Web page that gives further information
XLink – parameters 2/3
show: (optional) context in which the linked resource is to be displayed
- new: new window
- replace: current window
- embed: embed in current document
- other: customized
- none: no behavior
<publications
xmlns:xlink="http://www.w3.org/1999/xlink"
xlink:title="My publications"
xlink:href="http://www.linckels.lu/publications.txt"
xlink:role="http://www.dblp.de/"
xlink:show="new"
xlink:actuate="onRequest"
xlink:type="simple"
/>
actuate: (optional) specifies when an application that encounters an XLink should follow it
- onLoad: as soon as the application sees it
- onRequest: when the user asks to follow it
- other: customized
- none: no behavior (e.g., if the link is an ISBN number of a "physical book")
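How an application reacts to show and actuate is left to the application. The sketch below shows one way a hypothetical viewer might map the values listed above to behavior; the class XLinkBehavior, its methods, and the descriptive strings are assumptions made for illustration, not part of any XLink library.

```java
class XLinkBehavior {
    // Maps an xlink:show value to what a (hypothetical) viewer would do with the target.
    static String presentation(String show) {
        switch (show) {
            case "new":     return "open the target in a new window";
            case "replace": return "load the target in the current window";
            case "embed":   return "embed the target in the current document";
            case "other":   return "use application-specific presentation";
            case "none":    return "apply no particular presentation";
            default: throw new IllegalArgumentException("unknown xlink:show value: " + show);
        }
    }

    // Maps an xlink:actuate value to the moment at which the link is followed.
    static String timing(String actuate) {
        switch (actuate) {
            case "onLoad":    return "as soon as the link is loaded";
            case "onRequest": return "when the user asks to follow the link";
            case "other":     return "at an application-specific moment";
            case "none":      return "never automatically";
            default: throw new IllegalArgumentException("unknown xlink:actuate value: " + actuate);
        }
    }

    // Combines both attributes into one human-readable description.
    static String describe(String show, String actuate) {
        return timing(actuate) + ": " + presentation(show);
    }

    public static void main(String[] args) {
        // The combination used in the slide's example link.
        System.out.println(describe("new", "onRequest"));
    }
}
```

Rejecting unknown values with an exception is a design choice of this sketch; a tolerant viewer could instead fall back to a default behavior.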
XLink – parameters 3/3
type: specifies the type of the link
- simple: one standard link between two resources (e.g., an HTML hyperlink)
- extended: several links among a collection of resources (≈ directed graph)
Relations that extended links can express:
- 1..n relation: e.g., a book is published in three particular editions
- sequences: e.g., a book has two preceding versions
- n..m relation: e.g., a pizza is composed of different ingredients, and the same ingredients can result in different compositions
[Figure: starting documents connected to destination documents through a link base]
XLink – Extended type example
<xlink:extended
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xlink:role="http://www.pizza.de/pizzaworld"
  xlink:title="Pizza Tonno">
  <xlink:locator
    href="Pizzaboden.xml"
    role="http://www.pizza.de/base"
    title="Pizzaboden"/>
  <xlink:locator
    href="Basilikum.xml"
    role="http://www.pizza.de/base"
    title="Basilikum"/>
  <xlink:arc
    from="http://www.pizza.de/base"
    to="http://www.pizza.de/special"
    show="new"
    actuate="onRequest"/>
</xlink:extended>
xlink:locator: specifies the resources involved
xlink:arc: specifies how the resources relate to each other
[Figure: Pizzaboden and Basilikum (role "base") linked to Pizza Tonno (role "special")]
XPointer and XPath - general
Purpose: set an anchor inside a document, as HTML does with <a name="…">
Problems with HTML anchors:
• they cannot be set inside remote documents (no permission to modify the source code)
• the complete destination document must be transmitted, even if only one sub-part of the
document is addressed
Solution in principle:
• represent documents as trees
• address only the required subtree (by navigating through the tree)
XML Pointer Language (XPointer)
General definition of an XML Pointer:
URI#xpointer(anchor description)
Example:
XML document:
<gedicht>
  <strophe ID="strophe1">
    <zeile ID="zeile1">Dreifach ist des Raumes Maß:</zeile>
    <zeile ID="zeile2">Rastlos fort ohne Unterlaß</zeile>
    <zeile ID="zeile3">Strebt die Länge fort ins Weite,</zeile>
    <zeile ID="zeile4">Endlos gießet sich die Breite,</zeile>
    <zeile ID="zeile5">Grundlos senkt die Tiefe sich</zeile>
  </strophe>
</gedicht>
XPointers (all three address the second line of the poem):
http://www.bsp.de/gedicht.xml#xpointer(id('zeile2')) (requires the ID attribute to be declared with type ID in a DTD)
http://www.bsp.de/gedicht.xml#xpointer(//zeile[@ID="zeile2"])
http://www.bsp.de/gedicht.xml#element(/1/1/2)
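To make the addressing concrete, the sketch below resolves two of the pointer forms against the poem. It is not an XPointer implementation: it assumes only the JDK's DOM and XPath APIs, evaluates the XPath expression found inside xpointer(...), and walks an element(...) child sequence by hand. The class XPointerSketch and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPointerSketch {
    // The poem from the slide; German letters are written as Unicode escapes.
    static final String POEM =
        "<gedicht>\n"
        + "  <strophe ID=\"strophe1\">\n"
        + "    <zeile ID=\"zeile1\">Dreifach ist des Raumes Ma\u00DF:</zeile>\n"
        + "    <zeile ID=\"zeile2\">Rastlos fort ohne Unterla\u00DF</zeile>\n"
        + "    <zeile ID=\"zeile3\">Strebt die L\u00E4nge fort ins Weite,</zeile>\n"
        + "    <zeile ID=\"zeile4\">Endlos gie\u00DFet sich die Breite,</zeile>\n"
        + "    <zeile ID=\"zeile5\">Grundlos senkt die Tiefe sich</zeile>\n"
        + "  </strophe>\n"
        + "</gedicht>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
    }

    // xpointer(...) form: the part in parentheses is an XPath expression.
    // Returns the text of the first matching node, or null if nothing matches.
    static String textByXPath(String xml, String xpath) {
        try {
            Node node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, parse(xml), XPathConstants.NODE);
            return node == null ? null : node.getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid XPath: " + xpath, e);
        }
    }

    // element(...) form: "/1/1/2" selects the first element child of the document,
    // then that element's first element child, then its second element child.
    static String textByChildSequence(String xml, String sequence) {
        Node current = parse(xml);
        for (String step : sequence.substring(1).split("/")) {
            int wanted = Integer.parseInt(step);
            Node match = null;
            int seen = 0;
            for (Node c = current.getFirstChild(); c != null; c = c.getNextSibling()) {
                // Only element nodes are counted; whitespace text nodes are skipped.
                if (c.getNodeType() == Node.ELEMENT_NODE && ++seen == wanted) {
                    match = c;
                    break;
                }
            }
            if (match == null) {
                return null;
            }
            current = match;
        }
        return current.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(textByXPath(POEM, "//zeile[@ID='zeile2']"));
        System.out.println(textByChildSequence(POEM, "/1/1/2"));
    }
}
```

Both calls address the same zeile element, which illustrates that a sub-part can be selected without placing an anchor in the document itself.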
XML Path Language (XPath)
XPath is a non-XML language for identifying particular parts of XML documents, i.e., for
picking nodes and sets of nodes out of a tree
Example:
gedicht.xml#xpointer(/child::gedicht[position()=3])
- axis: child
- node test: gedicht (evaluated relative to the context node, here the document root /)
- predicate: [position()=3]
Axes: child, descendant, following, following-sibling, parent, ancestor, preceding, preceding-sibling
[Figure: for each axis, the context node and the nodes it addresses]
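The axes can be tried out with the XPath engine that ships with the JDK. This is a small sketch under stated assumptions: it reuses the structure of the poem from the XPointer example with the line texts omitted, takes the line with ID zeile3 as the context node, and uses the hypothetical class name XPathAxesSketch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathAxesSketch {
    // Structure of the poem only; the text of the lines is left out for brevity.
    static final String POEM =
        "<gedicht><strophe ID=\"strophe1\">"
        + "<zeile ID=\"zeile1\"/><zeile ID=\"zeile2\"/><zeile ID=\"zeile3\"/>"
        + "<zeile ID=\"zeile4\"/><zeile ID=\"zeile5\"/>"
        + "</strophe></gedicht>";

    // Evaluates an XPath expression against the poem and returns its string value.
    static String eval(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(POEM)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String context = "//zeile[@ID='zeile3']";
        // Unabbreviated syntax, axis::node-test[predicate], as in the slide's example.
        System.out.println(eval("string(/child::gedicht/child::strophe/child::zeile[position()=3]/@ID)"));
        // parent and ancestor move up the tree.
        System.out.println(eval("name(" + context + "/parent::*)"));
        System.out.println(eval("name(" + context + "/ancestor::*[last()])"));
        // On sibling axes, position 1 is the sibling nearest to the context node.
        System.out.println(eval("string(" + context + "/preceding-sibling::zeile[1]/@ID)"));
        System.out.println(eval("string(" + context + "/following-sibling::zeile[1]/@ID)"));
        // descendant walks down from the document root.
        System.out.println(eval("string(/descendant::zeile[last()]/@ID)"));
    }
}
```

Note that reverse axes such as ancestor and preceding-sibling number their nodes starting from the context node, which is why [1] on preceding-sibling yields the directly preceding line.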
Picture created by Harald Sack
3.9. XSL Transformations (XSLT)
XSLT overview
Extensible Stylesheet Language Transformations (XSLT) is an XML-based language used to
specify rules by which one XML document is transformed into another document
The resulting document may be in XML syntax or in another format, such as HTML or plain text
Examples of applications of XSLT:
- convert data between different XML schemas
- convert XML data into HTML or XHTML documents for web pages (e.g., with CSS)
- create dynamic web pages
- convert into an intermediate XML format that can be converted to PDF documents
Such a transformation is based on the following languages:
- XSLT: specifies the transformation rules
- XSL-FO: describes the layout (formatting objects) of the output
- XPath: gives access to specific parts of an XML document
XSLT transformation principle (example)
[Figure: XML documents (described by a DTD or XML Schema) and XSL documents 1 and 2 are fed to the XSLT processor, which produces a PDF document, HTML, or WML]
XSL works on the abstract tree representation of the XML document
A set of transformation rules (templates) is required in the form of an XSLT document (the XSL
stylesheet)
The tree is traversed, and for each node the matching template from the XSL
stylesheet is applied
[Figure: tree representation of the XML document + XSL stylesheet → output document]
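The rule-per-node principle can be observed with a stylesheet that contains one template per element type. The following is a minimal sketch, assuming the JAXP transformation API bundled with the JDK; the small data document, the stylesheet, and the class name TemplateRulesSketch are made up for illustration, and text output is chosen only to keep the result easy to read.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplateRulesSketch {
    // Hypothetical input document with three rows.
    static final String DATA =
        "<data><row num=\"1\">toto 1</row><row num=\"2\">toto 2</row>"
        + "<row num=\"3\">toto 3</row></data>";

    // One template for <data> and one for <row>; apply-templates hands each
    // row node to the template whose match pattern fits it.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"data\">Rows:<xsl:apply-templates select=\"row\"/></xsl:template>"
        + "<xsl:template match=\"row\"> [<xsl:value-of select=\"@num\"/>="
        + "<xsl:value-of select=\".\"/>]</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the document and returns the output as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(DATA, STYLESHEET)); // Rows: [1=toto 1] [2=toto 2] [3=toto 3]
    }
}
```

No template is written for the document root here: XSLT's built-in rule for the root simply continues with its children, so processing reaches the data template on its own.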
XML document:
<data>
  <row num="1">toto 1</row>
  <row num="2">toto 2</row>
  <row num="3">toto 3</row>
</data>
XSL stylesheet:
<?xml version="1.0" encoding="ISO-8859-1"?>
<html xsl:version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <body>
    <h1>Demo</h1>
    <xsl:for-each select="data/row">
      <br/>Row:
      <xsl:value-of select="@num"/>
      - Data:
      <xsl:value-of select="."/>
    </xsl:for-each>
  </body>
</html>
HTML document (output of the XSLT processor):
<html><body>
<h1>Demo</h1>
<br/>Row: 1 - Data: toto 1
<br/>Row: 2 - Data: toto 2
<br/>Row: 3 - Data: toto 3
</body></html>
Performing an XSL transformation with a Java program:
import java.io.File;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TestTransformation {
    public static void main(String[] args) throws Exception {
        // Passing File objects (rather than FileReader) lets the XML parser
        // honor the encoding declared in each document.
        Source sourceXSL = new StreamSource(new File("stylesheet.xsl"));
        Source sourceXML = new StreamSource(new File("data.xml"));

        // Compile the stylesheet into a Transformer.
        TransformerFactory trFac = TransformerFactory.newInstance();
        Transformer tf = trFac.newTransformer(sourceXSL);

        // Apply the stylesheet to the XML document and print the result.
        Result resultOnScreen = new StreamResult(System.out);
        tf.transform(sourceXML, resultOnScreen);
    }
}
Performing an XSL transformation through a stylesheet reference in the XML document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl" ?>
<person type="Teacher">
  <name>Serge Linckels</name>
  <hp>http://www.linckels.lu</hp>
  <size>173</size>
  <phone>691-111111</phone>
</person>
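A processor that honors the xml-stylesheet processing instruction first has to find it and extract the href pseudo-attribute. The sketch below shows one way to do that with the JDK's DOM API; the class StylesheetLocator and the regular expression are assumptions for illustration, and the encoding declaration is dropped because the document is parsed from a string. JAXP also offers TransformerFactory.getAssociatedStylesheet for this purpose, which is not used here so that the lookup stays visible.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

class StylesheetLocator {
    // The person document from the slide, shortened to two child elements.
    static final String PERSON =
        "<?xml version=\"1.0\"?>\n"
        + "<?xml-stylesheet type=\"text/xsl\" href=\"stylesheet.xsl\" ?>\n"
        + "<person type=\"Teacher\">"
        + "<name>Serge Linckels</name><hp>http://www.linckels.lu</hp>"
        + "</person>";

    // Matches href="..." or href='...' inside the processing instruction's data.
    static final Pattern HREF = Pattern.compile("href\\s*=\\s*[\"']([^\"']*)[\"']");

    // Returns the href of the first xml-stylesheet instruction, or null if there is none.
    static String stylesheetHref(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the XML", e);
        }
        // Processing instructions placed before the root element are children of the document node.
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                ProcessingInstruction pi = (ProcessingInstruction) n;
                if ("xml-stylesheet".equals(pi.getTarget())) {
                    Matcher m = HREF.matcher(pi.getData());
                    if (m.find()) {
                        return m.group(1);
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(stylesheetHref(PERSON)); // stylesheet.xsl
    }
}
```

The returned value is a relative reference, so a real processor would still have to resolve it against the location of the XML document before loading the stylesheet.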
3.10. References
XML in a Nutshell. Elliotte R. Harold, W. Scott Means
E-Librarian Service: User-Friendly Semantic Search in Digital Libraries. Serge Linckels, Christoph Meinel

  • 1. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 1 Semantic Web Unit 3: XML and its Sub-Languages Faculty of Science, Technology and Communication (FSTC) Bachelor en informatique (professionnel)
  • 2. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 2 3. XML and its Sub-Languages Semantic Web Roadmap: Controlled growth bottom up according to this architecture. Architecture was (slightly) modified in the last years. 2
  • 3. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 3 3.1. Why is HTML not Sufficient? 3.2. XML - Introduction 3.3. XML – Language Specifications 3.4. Document Type Definitions (DTD) 3.5. XML Schemas 3.6. Namespaces 3.7. Programming Models 3.8. XLink, XPath and XPointer 3.9. XSL Transformations (XSLT) 3.10. References 3. XML and its Sub-Languages 3
  • 4. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 4 3.1. Why is HTML not Sufficient? 3. XML and its Sub-Languages <h1>Christoph Meinel</h1> <h2>Viola Brehmer</h2> <ul> <li>Long Wang</li> <li>Feng Cheng</li> <li>Dirk Cordel</li> <li>Serge Linckels</li> </ul> Harald Sack Limitations of HTML HTML was initiated to give a structure to a document and to modify its layout; NOT to describe semantics What is this Web page about? What position has "Viola Brehmer"? … Meta-Tags <meta name="description" content="Homepage of Serge Linckels"> <meta name="keywords" content="teacher, athlete"> <meta name="Author" content="The Master of the Universe"> <meta name="xyz" content="nothing special"> Do you believe in Meta-Tags? HTML metadata are created by the author of the Web page. Their syntax and semantics are individual. 4
  • 5. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 5 3.2. XML - Introduction 3. XML and its Sub-Languages Extensible Markup Language (XML) Markup Language: allows to give a structure to text documents by using tags Meta Language: XML does not have a fixed set of tags (new tags can be created) Extensible: XML can be adapted (extended) to meet many different domains, e.g., • Mathematical Markup Language (MathML) • Chemical Markup Language (CML) • Synchronized Multimedia Integration Language (SMIL) • WAP Markup Language (WML) Creator Jon Bosak, 1996 XML is not… a programming language a network transport protocol a database XML is… a simple data format platform independent does not require special applications 5
  • 6. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 6 3.2. XML - Introduction 3. XML and its Sub-Languages Picture created by Harald Sack 6
  • 7. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 7 3.2. XML - Introduction 3. XML and its Sub-Languages Picture created by Harald Sack 7
  • 8. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 8 3.2. XML - Introduction 3. XML and its Sub-Languages Standard Generalized Markup Language (SGML) The Standard Generalized Markup Language (SGML) is a metalanguage in which one can define markup languages for documents HTML XML    XHTML • Instance of SGML • Layout of data • Layout and data are mixed-up • Subset of SGML • Structure of data • Layout and data are separated 8
  • 9. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 9 Welcome to LIASIT LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication. The faculty is composed of the following professors: David BASIN, Pascal BOUVRY, Eric DUBOIS, Thomas ENGEL, Franck LEPREVOST, Christoph MEINEL, Nicolas GUELFI, and Björn OTTERSTEN. The PhD Students are: Christoph BRANDT, Pandu DEVARAKOTA, Daniel FISCHER, Benjamin GATEAU, Markus GROSS, Joel GROTZ, Annie GUERRIERO, Serge LINCKELS, Nicolas MAYER, Michael NOLL, Benoît RIES, Michael STIEGHAHN. Magali MARTIN is the secretary of LIASIT... and also a nice entertainer. 3.2. XML - Introduction 3. XML and its Sub-Languages How we can see this… 9
  • 10. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 10 Welcome to LIASIT LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication. The faculty is composed of the following professors: David BASIN, Pascal BOUVRY, Eric DUBOIS, Thomas ENGEL, Franck LEPREVOST, Christoph MEINEL, Nicolas GUELFI, and Brn OTTERSTEN. The PhD Students are: Christoph BRANDT, Pandu DEVARAKOTA, Daniel FISCHER, Benamin GATEAU, Markus GROSS, Joel GROTZ, Annie GUERRIERO, Serge LINCKELS, Nicolas MAYER, Michael NOLL, Benot RIES, Michael STIEGHAHN. Magali MARTIN is the secretary of LIASIT... and also a nice entertainer. 3.2. XML - Introduction 3. XML and its Sub-Languages What a computer sees… 10
  • 11. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 11 <title>Welcome to LIASIT</title> <description>LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication.</description> <profs> <name>The faculty</name> <name>is composed of</name> <name>the following</name> </profs> <students> <name>The PhD</name> <name>Students are:</name> <name>Christoph BRANDT,</name> <name>Pandu DEVARAKOTA,</name> </students> <administration> <name>Daniel FISCHER, <name> </administration> 3.2. XML - Introduction 3. XML and its Sub-Languages How we can help… 11
  • 12. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 12 <title>Welcome to LIASIT</title> <description>LIASIT stands for Luxembourg Advanced Studies in Information Technology and since August 01, 2006 is a Doctoral School in the Faculty of Science, Technology and Communication.</description> <profs> <name>The faculty</name> <name>is composed of</name> <name>the following</name> </profs> <students> <name>The PhD</name> <name>Students are:</name> <name>Christoph BRANDT,</name> <name>Pandu DEVARAKOTA,</name> </students> <administration> <name>Daniel FISCHER, </name> </administration> How we can help… 3. XML and its Sub-Languages 3.2. XML - Introduction 12
  • 13. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 13 <profs> <name>Thomas Engel</name> <name>Christoph Meinel</name> <name>David Basin</name> <name> Björn Ottersten</name> </profs> <students> <name>Benoît Ries</name> <name>Daniel Fischer</name> <name>Christoph Brandt,</name> <name>Pandu Devarakota</name> <name>Serge Linckels</name> </students> 3.3. XML - Introduction 3. XML and its Sub-Languages Benefits of XML Document is well-structured Applications can process the file XML file Thomas Engel Christoph Meinel David Basin Björn Ottersten Benoît Ries Daniel Fischer Christoph Brandt Pandu Devarakota Serge Linckels pure text file Problems with text file No structure Difficult to process Attention: although XML adds a certain amount of semantics to the document, there are sill information that are missing, e.g., what is the relation between "profs" and "students"? 13
  • 14. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 14 3.3. XML – Language Specifications 3. XML and its Sub-Languages <person type="Teacher"> <name>Serge Linckels</name> <hp>http://www.linckels.lu</hp> <size>173</size> <phone>691-111111</phone> </person> element attribute child-element value Terminology Tree representation person name hp size phone Serge Linckels http://www.linckels.lu 173 691-111111 General XML is composed of text and tags Tags come in pairs, e.g., <hp></hp> Tags must be properly nested, e.g., <person><hp></person></hp> <person><hp></hp></person> Tags are case sensitive: <hp> ≠ <HP> 14
  • 15. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 15 <staff> </staff> 3.3. XML – Language Specifications 3. XML and its Sub-Languages XML structure <person type="Teacher"> <name>Serge Linckels</name> <hp>http://www.linckels.lu</hp> <size>173</size> <phone>26 00 11 22</phone> <phone>691-111111</phone> </person> <person type="Teacher"> <name>Denis Zampunieris</name> <phone>4666445290</phone> </person> same element can be used repeatedly Nested tags can be part of a list too. Order is not significant. Element or attribute? <name> <first>Serge</first> <last>Linckels</last> </name> <name first="Serge" last="Linckels”></name> Both variants are semantically identical or 15
  • 16. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 16 3.3. XML – Language Specifications 3. XML and its Sub-Languages XML Names Can include: - letters (a..z, A..Z) - digits (0..9) - these punctuation chars: - underscore (_) - hyphen (-) - period (.) - special chars like ö, ç, Ω Examples for valid XML Names: <drivers_licence> <_oki-doki> <téléphone> <this.works> CDATA Sections Everything between <![CDATA[ and ]]> is treated as raw character data. <person type="Teacher"> <name>Serge Linckels</name> <![CDATA[This is just some code that is ignored, 10 print "Hello world" 20 goto 10 ]]> <phone>26 00 11 22</phone> </person> Comments Comments are between <!-- and --> like in HTML 16
  • 17. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 17 3.3. XML – Language Specifications 3. XML and its Sub-Languages XML declaration <?xml version="1.0" encoding="ASCII" standalone="yes"?> <person type="Teacher"> <name>Serge Linckels</name> <hp>http://www.linckels.lu</hp> <size>173</size> <phone>691-111111</phone> </person> encoding: XML is pure text, but can use different encoding, e.g., ASCII, Latin-1, Unicode, ISO-8859-1. When omitted then Unicode is default. standalone: - "yes", no external DTD/Schema is given - "no", external DTD/Schema is specified XML-defined character sets Unicode: 95156 characters from most of Earths living languages (variants: UCS-2, UCS-4, UTF-8, UTF-16) ISO character sets: e.g., ISO-8859-15 (Latin 9) is ASCII + accented letters + € 17
  • 18. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 18 3.3. XML – Language Specifications 3. XML and its Sub-Languages JSON – Javascript Object Notation No element names Primary data format used for asynchronous browser/server communication (AJAX) Language-independent data format Supported by many programming languages 18
  • 19. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 19 3.4. Document Type Definitions (DTD) 3. XML and its Sub-Languages <person type="Teacher"> <name> <first>Serge</first> <last>Linckels</last> </name> <phone>691-111111</phone> </person> <Personne> <Type>Teacher</Type> <Nom>Serge Linckels</Nom> <HP>http://www.linckels.lu</HP> <Sexe>M</Sexe> </Personne> ? Formal syntax is required DTD – Document Type Definitions Syntax of XML document is described Validating parser checks syntax: XML document with DTD syntax A XML document is valid if it respects the syntax defined in its DTD <!ELEMENT person (name, phone*)> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT phone (#PCDATA)> #PCDATA: value of type string A person-element can contain 1 name sub-element and 0..* phone sub-elements Attention: a document can be well-formed but not valid! Web browser only checks if well-formed. 19
  • 20. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 20 3.4. Document Type Definitions (DTD) 3. XML and its Sub-Languages <?xml version="1.0" standalone="no"?> <!DOCTYPE person SYSTEM "http://www.linckels.lu/person.dtd"> <person type="Teacher"> <name> <first>Serge</first> <last>Linckels</last> </name> <phone>691-111111</phone> </person> <!ELEMENT person (name, phone*)> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT phone (#PCDATA)> person.xml person.dtd URI of the DTD-file e.g., "/mydisk/person.dtd" Validating a document A Web browser does not validate documents but only checks it for well-formedness XML validators APIs are available in Java Online validators, e.g., http://www.stg.brown.edu/service/xmlvalid/ http://www.w3.org/2001/03/webdata/xsv 20
  • 21. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 21 3.4. Document Type Definitions (DTD) 3. XML and its Sub-Languages <?xml version="1.0"?> <!DOCTYPE person [ <!ELEMENT person (name, phone*)> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT phone (#PCDATA)> ]> <person type="Teacher"> <name> <first>Serge</first> <last>Linckels</last> </name> <phone>691-111111</phone> </person> Valid XML document with internal DTD Sequences * Zero or more of the element is allowed ? Zero or one of the element is allowed + One or more of the element is required Elements must appear in the specified order Choices <!ELEMENT color (red | green) Here, the element color can have a child- element red or green, not both at a time. 21
  • 22. Semantic Web ::: Serge Linckels ::: www.linckels.lu ::: serge@linckels.lu ::: 22 3.4. Document Type Definitions (DTD) 3. XML and its Sub-Languages <?xml version="1.0"?> <!DOCTYPE person [ <!ELEMENT person (name, phone*)> <!ATTLIST name first CDATA #IMPLIED last CDATA #REQUIRED > <!ELEMENT phone (#PCDATA)> ]> <person type="Teacher"> <name first="Serge" last="Linckels" /> <phone>691-111111</phone> </person> Attribute declarations Attribute defaults: #IMPLIED: value is optional #REQUIRED: value is required #FIXED: value is constant Literal: value is given as quoted string Attribute types: CDATA: any string of text Enumeration: list of values ID: unique XML name IDREF: unique identification of some element in the document IDREFS: set of IDREFs Valid XML document with internal DTD 22
  • 23. 3.4. Document Type Definitions (DTD)
    DTD – ID, IDREF and IDREFS
    XML document:
      <family>
        <person id="jane" mother="mary" father="john"> <name>Jane Doe</name> </person>
        <person id="john" children="jane jack"> <name>John Doe</name> </person>
        <person id="mary" children="jane jack"> <name>Mary Smith</name> </person>
        <person id="jack" mother="mary" father="john"> <name>Jack Smith</name> </person>
      </family>
    DTD:
      <!DOCTYPE family [
        <!ELEMENT family (person*)>
        <!ELEMENT person (name)>
        <!ELEMENT name (#PCDATA)>
        <!ATTLIST person id       ID     #REQUIRED
                         mother   IDREF  #IMPLIED
                         father   IDREF  #IMPLIED
                         children IDREFS #IMPLIED>
      ]>
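The ID and IDREFS attributes of the family example can be followed programmatically. The sketch below assumes the JAXP DOM parser, which records attributes declared as ID in the DTD so that getElementById can find them; the class name IdRefSketch and the helper childrenOf are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class IdRefSketch {
    static final String XML =
          "<!DOCTYPE family [\n"
        + "  <!ELEMENT family (person*)>\n"
        + "  <!ELEMENT person (name)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "  <!ATTLIST person id ID #REQUIRED mother IDREF #IMPLIED\n"
        + "                   father IDREF #IMPLIED children IDREFS #IMPLIED>\n"
        + "]>\n"
        + "<family>"
        + "<person id=\"jane\" mother=\"mary\" father=\"john\"><name>Jane Doe</name></person>"
        + "<person id=\"john\" children=\"jane jack\"><name>John Doe</name></person>"
        + "<person id=\"mary\" children=\"jane jack\"><name>Mary Smith</name></person>"
        + "<person id=\"jack\" mother=\"mary\" father=\"john\"><name>Jack Smith</name></person>"
        + "</family>";

    static Document parse() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // the DTD is read, so id is known to be of type ID
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
    }

    // Resolves the IDREFS attribute "children" of the person with the given ID
    // and returns the names of the referenced persons.
    static List<String> childrenOf(Document doc, String id) {
        List<String> names = new ArrayList<>();
        Element person = doc.getElementById(id);
        for (String ref : person.getAttribute("children").trim().split("\\s+")) {
            if (ref.isEmpty()) continue; // person without a children attribute
            Element child = doc.getElementById(ref);
            names.add(child.getElementsByTagName("name").item(0).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(childrenOf(parse(), "john"));
    }
}
```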
  • 24. 3.4. Document Type Definitions (DTD)
    Problems and limitations
    - DTDs are context-free grammars; recursive definitions are possible
    - Order matters, e.g., <!ELEMENT person (last, first)>
      Workaround: <!ELEMENT person ((last, first) | (first, last))>
      Can become unclear: <!ELEMENT person ((name | phone | e-mail)*)>
    - Lack of expressiveness, e.g., restrictions on references are not possible
    - All elements are global and share one namespace
    - No support for newer features of XML
    - DTDs are expressed in a non-XML syntax
    XML Schema is more powerful than DTD and is a W3C recommendation, but there are numerous other XML schema languages, e.g., RELAX NG, ISO DSDL, Schematron…
  • 25. 3.5. XML Schemas
    XML Schema - Overview
    - An XML Schema is an XML document containing a formal description of what comprises a valid XML document
    - An XML document described by a schema is called an instance document
    - More explicit restrictions on the number and sequence of child elements are possible
    Example
    XML document:
      <?xml version="1.0"?>
      <fullName>Serge Linckels</fullName>
    XML Schema (xs: is the standard prefix for the XML Schema namespace):
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="fullName" type="xs:string" />
      </xs:schema>
  • 26. 3.5. XML Schemas
    Atomic types (more than 40!)
    - string: Unicode string
    - integer: positive or negative whole number
    - boolean: true/false or 1/0
    - ID, IDREF, IDREFS: cf. DTD
    Simple types
      <xs:element name="first" type="xs:string" />
      <xs:element name="age" type="xs:integer" />
      <xs:element name="link" type="xs:anyURI" />
      <xs:element name="year" type="xs:gYear" />
    New simple types can be created from atomic types, and restrictions can be defined:
      <xs:simpleType name="aName">
        <xs:restriction base="xs:string">
          <xs:maxLength value="50" />
        </xs:restriction>
      </xs:simpleType>
  • 27. 3.5. XML Schemas
    More about restrictions
    Restrictions (facets) can be defined over simple types using xs:restriction
    Enumerations:
      <xs:simpleType name="location">
        <xs:restriction base="xs:string">
          <xs:enumeration value="work" />
          <xs:enumeration value="school" />
          <xs:enumeration value="mobile" />
        </xs:restriction>
      </xs:simpleType>
    Numeric facets:
      <xs:simpleType name="age">
        <xs:restriction base="xs:unsignedShort">
          <xs:minExclusive value="0" />
          <xs:maxInclusive value="120" />
        </xs:restriction>
      </xs:simpleType>
    Available numeric facets:
    • minInclusive
    • maxInclusive
    • minExclusive
    • maxExclusive
  • 28. 3.5. XML Schemas
    More about restrictions
    Enforcing format:
      <xs:simpleType name="mobile-phone">
        <xs:restriction base="xs:string">
          <xs:pattern value="\d\d\d-\d\d \d\d \d\d" />
        </xs:restriction>
      </xs:simpleType>
    Enforces the rule that a mobile phone number consists of 3 digits, a dash, 2 digits, a space, 2 digits, another space and finally 2 digits.
    Lists:
      XML Schema – simple type definition:
      <xs:simpleType name="TypeAuthor">
        <xs:list itemType="xs:string" />
      </xs:simpleType>
      XML Schema – element definition:
      <xs:element name="author" type="TypeAuthor" />
    The author element of an instance document can contain a list of any number of strings, separated by whitespace.
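The pattern facet can be checked with the javax.xml.validation API. This is a sketch under stated assumptions: the schema declares a hypothetical top-level element phone of type mobile-phone, and the class name PatternFacetSketch and the helper isValid are invented for the illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class PatternFacetSketch {
    // Schema with the mobile-phone pattern; the element "phone" is an assumption of this sketch.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:simpleType name='mobile-phone'>"
        + "    <xs:restriction base='xs:string'>"
        + "      <xs:pattern value='\\d\\d\\d-\\d\\d \\d\\d \\d\\d'/>"
        + "    </xs:restriction>"
        + "  </xs:simpleType>"
        + "  <xs:element name='phone' type='mobile-phone'/>"
        + "</xs:schema>";

    // Returns true if the instance document is valid against the schema above.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // without a custom error handler, validity errors are thrown
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<phone>691-11 11 11</phone>")); // matches the pattern
        System.out.println(isValid("<phone>691-111111</phone>"));   // does not match
    }
}
```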
  • 29. 3.5. XML Schemas
    Complex types
    An element that contains child elements has a complex type
    XML Schema:
      <?xml version="1.0"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:element name="person">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="first" type="xs:string" />
              <xs:element name="last" type="xs:string" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:schema>
    XML document:
      <?xml version="1.0"?>
      <person>
        <first>Serge</first>
        <last>Linckels</last>
      </person>
    Only elements can have complex types; attributes always have simple types
    - sequence: order of elements matters (a,b)
    - all: order of elements does not matter (a,b or b,a)
    - choice: one or the other element (a xor b)
  • 30. 3.5. XML Schemas
    Occurrence Constraints
    Set the number of times an element may occur:
    - minOccurs: minimum occurrences
    - maxOccurs: maximum occurrences
    The default value for minOccurs and maxOccurs is 1.
      <xs:element name="middle" type="xs:string" minOccurs="0" />
    In this example, maxOccurs is not set, but has a default value of 1. Therefore, the middle element may appear 0 or 1 times.
      <xs:element name="middle" type="xs:string" minOccurs="0" maxOccurs="unbounded" />
    The value unbounded indicates that the element may appear an unlimited number of times.
  • 31. 3.5. XML Schemas
    Derived complex types
    Deriving by extension: add new definitions to an existing complex type. E.g., add the phone element to the existing person type.
      <xs:complexType name="PersonWithPhone">
        <xs:complexContent>
          <xs:extension base="person">
            <xs:sequence>
              <xs:element name="phone" type="xs:string" />
            </xs:sequence>
          </xs:extension>
        </xs:complexContent>
      </xs:complexType>
    Deriving by restriction: by omitting or tightening parts of the parent definition, the restriction element creates a new, constrained type.
      <xs:complexType name="PersonWithMoreNames">
        <xs:complexContent>
          <xs:restriction base="person">
            <xs:sequence>
              <xs:element name="first" type="xs:string" minOccurs="2" />
              <xs:element name="last" type="xs:string" />
            </xs:sequence>
          </xs:restriction>
        </xs:complexContent>
      </xs:complexType>
    - Structure cannot be changed!
    - Only available for complex types
  • 32. 3.5. XML Schemas
    Attribute declarations
    - Attributes can be declared globally by a top-level xs:attribute
    - Attributes can be declared locally as part of a complex type definition:
      <xs:element name="person">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="first" type="xs:string" />
            <xs:element name="last" type="xs:string" />
          </xs:sequence>
          <xs:attribute name="job" type="xs:string" />
        </xs:complexType>
      </xs:element>
    - The use attribute of a local declaration states whether the attribute is optional or required:
      <xs:attribute name="job" type="xs:string" use="optional" />
    XML document:
      <?xml version="1.0"?>
      <person job="Teacher">
        <first>Serge</first>
        <last>Linckels</last>
      </person>
  • 33. 3.5. XML Schemas
    Conclusion: DTD vs. XML Schemas
    - XML Schema is a more powerful language than DTD for specifying the syntax of XML documents; it can express more constraints, e.g., data types
    - XML Schema is a W3C recommendation and widely used
    - As DTDs are simpler to use, they are still used today
  • 34. 3.6. Namespaces
    Problem of ambiguous names
      <compactDisk author="HS">
        <titel>Remixes</titel>
        <track number="1">
          <titel>Night over Manaus</titel>
          <author>Boozoo Bajou</author>
        </track>
      </compactDisk>
    The same XML name can be used for different elements, which creates ambiguities.
    XML namespaces disambiguate elements with the same name from each other by assigning elements and attributes to URIs.
    Qualified names, prefixes and local parts
    Elements are identified by qualified names, e.g., cd:titel, where cd is the prefix and titel is the local name.
  • 35. 3.6. Namespaces
    Using XML namespaces
      <cd:compactDisk xmlns:cd="http://www.linckels.lu/cd"
                      xmlns:tr="http://www.xyz.com/tracks"
                      author="HS">
        <cd:titel>Remixes</cd:titel>
        <tr:track number="1">
          <tr:titel>Night over Manaus</tr:titel>
          <tr:author>Boozoo Bajou</tr:author>
        </tr:track>
      </cd:compactDisk>
    - Each qualified element belongs to exactly one namespace
    - Namespace URIs are purely formal identifiers; they are not the addresses of a page, and they are not meant to be followed as links
    - Namespace binding: each prefix in a qualified name must be associated with a URI
    - Conceptually, the prefix stands for the complete URI: <tr:titel>Remixes</tr:titel> identifies the name titel in http://www.xyz.com/tracks (the URI itself cannot be written as a tag name in the document)
    - A namespace declaration applies to elements; an attribute belongs to a namespace only if it carries a prefix itself
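The effect of the prefix bindings can be observed with a namespace-aware DOM parser. This is a minimal sketch using the compact disk example; the class name NamespaceSketch and the helper titleIn are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceSketch {
    static final String CD_NS = "http://www.linckels.lu/cd";
    static final String TR_NS = "http://www.xyz.com/tracks";

    static final String XML =
          "<cd:compactDisk xmlns:cd='" + CD_NS + "' xmlns:tr='" + TR_NS + "' author='HS'>"
        + "<cd:titel>Remixes</cd:titel>"
        + "<tr:track number='1'>"
        + "<tr:titel>Night over Manaus</tr:titel>"
        + "<tr:author>Boozoo Bajou</tr:author>"
        + "</tr:track>"
        + "</cd:compactDisk>";

    static Document parse() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // off by default in JAXP
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
    }

    // Returns the text of the first "titel" element in the given namespace.
    static String titleIn(Document doc, String namespaceUri) {
        return doc.getElementsByTagNameNS(namespaceUri, "titel").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse();
        Element root = doc.getDocumentElement();
        System.out.println(root.getLocalName() + " in " + root.getNamespaceURI());
        System.out.println("CD title:    " + titleIn(doc, CD_NS));
        System.out.println("Track title: " + titleIn(doc, TR_NS));
        // The unprefixed attribute "author" is in no namespace.
        System.out.println("author namespace: " + root.getAttributeNode("author").getNamespaceURI());
    }
}
```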
  • 36. 3.7. Programming Models
    Common XML processing models
    - Treating XML as text
    - Treating XML as events; events are reported while the document is being read (e.g., an "event" can be the start of an element, the content of an element, and the end of an element): Simple API for XML (SAX)
    - Treating XML as tree models: Document Object Model (DOM)
    - XML transformations
    - Abstracting XML away; the application does not deal with the XML elements directly
    Most commonly used: DOM and SAX
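The event model can be sketched with the SAX parser that ships with Java. The handler below simply records the three kinds of events mentioned above; the class name SaxEventSketch and the helper events are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventSketch {
    // Parses the document and returns the events in the order they were reported.
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver long text in several chunks; short text arrives in one.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        });
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String event : events("<name><first>Serge</first><last>Linckels</last></name>")) {
            System.out.println(event);
        }
    }
}
```

No tree is built here: once an event has been handled, the parser keeps nothing of it, which is what makes this model suitable for very large documents.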
  • 37. 3.7. Programming Models
    DOM overview
    - The entire document must be read and parsed before it is available as DOM; unsuitable for very large documents
    - The user accesses data by traversing the tree (the tree and its traversal conform to a W3C standard)
    - The API allows for constructing, accessing and manipulating the structure and content of XML documents
    Example:
      <countries>
        <country continent="Asia">
          <name>Israel</name>
          <population year="2001">6199008</population>
          <city capital="yes"><name>Jerusalem</name></city>
          <city><name>Ashdod</name></city>
        </country>
        <country continent="Europe">
          <name>France</name>
          <population year="2004">60424213</population>
        </country>
      </countries>
  • 38. 3.7. Programming Models
    [Figure: DOM tree of the countries example. The document node is the root node; below it are the element nodes countries, country, name, population and city, the attribute nodes continent, year and capital, and the value nodes (e.g., Israel, 6199008, Jerusalem).]
  • 39. 3.7. Programming Models
    DOM Java API - general
    [Figure: the DOM parser reads the XML document and builds the DOM tree in memory; the application accesses the tree through the API.]
    The DOM tree is generated by a DocumentBuilder:
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document myXMLdoc = builder.parse("world.xml");
    The builder is generated by a factory in order to be implementation independent. The factory is chosen according to the system configuration.
  • 40. 3.7. Programming Models
    DOM Java API - general
      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.parsers.ParserConfigurationException;
      import org.w3c.dom.Document;
      import org.w3c.dom.Element;
  • 41. 3.7. Programming Models
    DOM Java API – node API
    The nodes of the DOM tree include
    - a special root (denoted document)
    - element nodes
    - text nodes and CDATA sections
    - attributes
    - comments
    - and more ...
    Example:
      // myXMLdoc is assumed to be an empty document here, e.g., created with builder.newDocument()
      // create the root element
      Element root = myXMLdoc.createElement("root");
      // add it to the XML tree
      myXMLdoc.appendChild(root);
      // create a child element
      Element childElement = myXMLdoc.createElement("Child");
      // add the attribute to the child
      childElement.setAttribute("attribute1", "The value of Attribute 1");
      // add the child element to the root element
      root.appendChild(childElement);
    [Figure: resulting tree: myXMLdoc → root → Child, with the attribute attribute1 = "The value of Attribute 1".]
  • 42. 3.7. Programming Models
    DOM Java API – node API
    Every node in the DOM tree implements the Node interface
    [Figure: interfaces derived from Node: Document, DocumentFragment, DocumentType, Element, Attr (attribute), CharacterData (with Text, Comment and CDATASection), Entity, EntityReference, Notation, ProcessingInstruction.]
  • 43. 3.7. Programming Models
    DOM Java API – node API
    Every node has a specific location in the tree
    The Node interface specifies methods for tree navigation
    • Node getFirstChild();
    • Node getLastChild();
    • Node getNextSibling();
    • Node getPreviousSibling();
    • Node getParentNode();
    • NodeList getChildNodes();
    • NamedNodeMap getAttributes();
    [Figure: a node with arrows to its parent (getParentNode), its siblings (getPreviousSibling, getNextSibling) and its children (getFirstChild, getChildNodes, getLastChild).]
  • 44. 3.7. Programming Models
    DOM Java API – node API
    Every node has:
    • a type
    • a name
    • a value
    • attributes
    The roles of these properties differ according to the node types
      if (myNode.getNodeType() == Node.ELEMENT_NODE) {
        // process node …
      }
    Node types:
      ELEMENT_NODE = 1
      ATTRIBUTE_NODE = 2
      TEXT_NODE = 3
      CDATA_SECTION_NODE = 4
      ENTITY_REFERENCE_NODE = 5
      ENTITY_NODE = 6
      PROCESSING_INSTRUCTION_NODE = 7
      COMMENT_NODE = 8
      DOCUMENT_NODE = 9
      DOCUMENT_TYPE_NODE = 10
      DOCUMENT_FRAGMENT_NODE = 11
      NOTATION_NODE = 12
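The navigation methods and the node type check can be combined to walk over the children of a node. The sketch below is only an illustration; the class name DomNavigationSketch and the helper childElementNames are hypothetical. It also shows why the type check matters: the whitespace between elements is kept in the tree as text nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomNavigationSketch {
    static final String XML =
          "<country continent=\"Asia\">\n"
        + "  <name>Israel</name>\n"
        + "  <population year=\"2001\">6199008</population>\n"
        + "  <city capital=\"yes\"><name>Jerusalem</name></city>\n"
        + "</country>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Walks the children with getFirstChild()/getNextSibling() and
    // keeps only the element nodes (whitespace text nodes are skipped).
    static List<String> childElementNames(Node parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Node country = parse(XML).getDocumentElement();
        System.out.println("child nodes:    " + country.getChildNodes().getLength());
        System.out.println("child elements: " + childElementNames(country));
        System.out.println("continent:      "
                + country.getAttributes().getNamedItem("continent").getNodeValue());
    }
}
```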
  • 45. 3.7. Programming Models
    [Figure: DOM tree of the countries example. The document node is the root node; below it are the element nodes countries, country, name, population and city, the attribute nodes continent, year and capital, and the value nodes (e.g., Israel, 6199008, Jerusalem).]
  • 46. 3.7. Programming Models
    Example
      // print one given city (city is given as element)
      public static void printCity(Element city) {
        Node nameNode = city.getElementsByTagName("name").item(0);
        String cName = nameNode.getFirstChild().getNodeValue();
        System.out.println("Found City: " + cName);
      }

      // prints all cities found in the DOM tree
      public static void printCities(Document myXMLdoc) {
        NodeList cities = myXMLdoc.getElementsByTagName("city");
        for (int i = 0; i < cities.getLength(); ++i) {
          printCity((Element) cities.item(i));
        }
      }
  • 47. 3.7. Programming Models
    DOM Java API – node API
    - Children of a node in a DOM tree can be manipulated: added, edited, deleted, copied etc.
    - To construct new nodes, use the methods of Document: createElement, createAttribute, createTextNode, createCDATASection etc.
    - To manipulate a node, use the methods of Node: appendChild, insertBefore, removeChild, replaceChild, setNodeValue, cloneNode(boolean deep) etc.
    [Figure: effect of insertBefore(), replaceChild(), cloneNode(false) and cloneNode(true) on a tree; cloneNode(false) copies only the node itself, cloneNode(true) copies the node together with its subtree.]
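The difference between a shallow and a deep copy can be sketched as follows, using a reduced version of the countries example. The class name DomCloneSketch and the helper appendDeepCopy are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomCloneSketch {
    static final String XML =
        "<country><city capital=\"yes\"><name>Jerusalem</name></city></country>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Appends a deep copy of the first city to the root element and
    // returns the number of city elements in the document afterwards.
    static int appendDeepCopy(Document doc) {
        Node city = doc.getElementsByTagName("city").item(0);
        doc.getDocumentElement().appendChild(city.cloneNode(true));
        return doc.getElementsByTagName("city").getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        Element city = (Element) doc.getElementsByTagName("city").item(0);

        Element shallow = (Element) city.cloneNode(false); // node and attributes only
        Element deep = (Element) city.cloneNode(true);     // node, attributes and subtree

        System.out.println("shallow children:  " + shallow.getChildNodes().getLength());
        System.out.println("shallow attribute: " + shallow.getAttribute("capital"));
        System.out.println("deep text content: " + deep.getTextContent());
        System.out.println("cities after append: " + appendDeepCopy(doc));
    }
}
```

A clone has no parent until it is inserted with appendChild or insertBefore, so the original tree is unchanged until that point.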
  • 48. Picture created by Harald Sack
  • 49. 3.8. XLink, XPath and XPointer
    Overview
    - XML Linking Language (XLink) is an XML markup language used for creating hyperlinks in XML documents; compare the HTML tag <a href="…">
    - XPointer is a system for addressing components of XML-based internet media; compare the HTML anchor <a name="…">
    - XML Path Language (XPath) is a language for selecting nodes from an XML document
  • 50. 3.8. XLink, XPath and XPointer
    XLink - general
    - In its simplest form, XLink creates a unidirectional link between exactly 2 resources
    - The origin of the link is always the starting document (the document in which the link is nested)
    - Browsers are free to interpret this link as they like (depends on the CSS used)
    Example: an HTML hyperlink <a href="…"> from a starting document to a destination document is one possible implementation, but XLink can express more…
  • 51. 3.8. XLink, XPath and XPointer
    XLink – parameters 1/3
      <publications xmlns:xlink="http://www.w3.org/1999/xlink"
                    xlink:title="My publications"
                    xlink:href="http://www.linckels.lu/publications.txt"
                    xlink:role="http://www.dblp.de"
                    xlink:show="new"
                    xlink:actuate="onRequest"
                    xlink:type="simple" />
    - xlink: the prefix specifies that this is an XLink definition
    - title: (optional) textual information about the link (displayed in a browser as a hint)
    - href: URI of the linked destination resource (need not be a URL)
    - role: (optional) points to a resource that specifies the meaning of the connection between the resources, e.g., a Web page that gives further information
  • 52. 3.8. XLink, XPath and XPointer
    XLink – parameters 2/3
      <publications xmlns:xlink="http://www.w3.org/1999/xlink"
                    xlink:title="My publications"
                    xlink:href="http://www.linckels.lu/publications.txt"
                    xlink:role="http://www.dblp.de/"
                    xlink:show="new"
                    xlink:actuate="onRequest"
                    xlink:type="simple" />
    show: (optional) context in which the linked resource is to be displayed
    - new: new window
    - replace: current window
    - embed: embed in current document
    - other: customized
    - none: no behavior
    actuate: (optional) specifies when an application that encounters an XLink should follow it
    - onLoad: as soon as the application sees it
    - onRequest: when the user asks to follow it
    - other: customized
    - none: no behavior (e.g., if the link is the ISBN of a "physical book")
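An application that wants to interpret such a link has to read the attributes through their namespace, not through the prefix. The following sketch assumes a namespace-aware DOM parser; the class name XLinkAttributeSketch and the helper xlink are hypothetical, and what the application does with show and actuate is left open.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XLinkAttributeSketch {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    static final String XML =
          "<publications xmlns:xlink='" + XLINK_NS + "'"
        + " xlink:type='simple'"
        + " xlink:title='My publications'"
        + " xlink:href='http://www.linckels.lu/publications.txt'"
        + " xlink:show='new'"
        + " xlink:actuate='onRequest'/>";

    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getAttributeNS
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    // Reads one XLink attribute; returns the empty string if it is absent.
    static String xlink(Element element, String localName) {
        return element.getAttributeNS(XLINK_NS, localName);
    }

    public static void main(String[] args) throws Exception {
        Element link = parseRoot(XML);
        if ("simple".equals(xlink(link, "type"))) {
            System.out.println("target:  " + xlink(link, "href"));
            System.out.println("show:    " + xlink(link, "show"));
            System.out.println("actuate: " + xlink(link, "actuate"));
        }
    }
}
```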
  • 53. 3.8. XLink, XPath and XPointer
    XLink – parameters 3/3
    type: specifies the type of the link
    - simple: one standard link between two resources (i.e., like HTML hyperlinks)
    - extended: several links between a collection of resources (≈ directed graph)
      - 1..n relation, e.g., a book is published in three particular editions
      - sequences, e.g., a book has two preceding versions
      - n..m relation, e.g., a pizza is composed of different ingredients, and ingredients can result in different compositions
    [Figure: starting documents and destination documents connected through a link basis.]
  • 54. 3.8. XLink, XPath and XPointer
    XLink – Extended type example
      <xlink:extended xmlns:xlink="http://www.w3.org/1999/xlink"
                      xlink:role="http://www.pizza.de/pizzaworld"
                      xlink:title="Pizza Tonno">
        <xlink:locator href="Pizzaboden.xml" role="http://www.pizza.de/base" title="Pizzaboden"/>
        <xlink:locator href="Basilikum.xml" role="http://www.pizza.de/base" title="Basilikum"/>
        <xlink:arc from="http://www.pizza.de/base" to="http://www.pizza.de/special"
                   show="new" actuate="onRequest"/>
      </xlink:extended>
    - locator: specifies the resources involved
    - arc: specifies how the resources relate to each other
    [Figure: the resources Pizzaboden and Basilikum (role base) linked to Pizza Tonno (role special).]
  • 55. 3.8. XLink, XPath and XPointer
    XPointer and XPath - general
    Fix an anchor inside a document, e.g., HTML: <a name="…">
    Problems with HTML anchors:
    • not possible inside remote documents (no permission to modify the source code)
    • the complete destination document must be transmitted, even if only one sub-part of the document is addressed
    Solution in principle:
    • represent documents as trees
    • address a sub-tree only (navigation through the tree)
  • 56. 3.8. XLink, XPath and XPointer
    XML Pointer Language (XPointer)
    General definition of an XML Pointer: URI#xpointer(anchor description)
    Example:
    XML document:
      <gedicht>
        <strophe ID="strophe1">
          <zeile ID="zeile1">Dreifach ist des Raumes Maß:</zeile>
          <zeile ID="zeile2">Rastlos fort ohne Unterlaß</zeile>
          <zeile ID="zeile3">Strebt die Länge fort ins Weite,</zeile>
          <zeile ID="zeile4">Endlos gießet sich die Breite,</zeile>
          <zeile ID="zeile5">Grundlos senkt die Tiefe sich</zeile>
        </strophe>
      </gedicht>
    XPointers (all three address the second line, provided the ID attribute is declared as type ID for the first form):
      http://www.bsp.de/gedicht.xml#xpointer(id('zeile2'))
      http://www.bsp.de/gedicht.xml#xpointer(//zeile[@ID="zeile2"])
      http://www.bsp.de/gedicht.xml#element(/1/1/2)
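The Java standard library has no XPointer resolver, but the child sequence of the element() scheme is simple enough to sketch by hand: each step /n selects the n-th child element of the current node. The class name ChildSequenceSketch and the helper resolve are hypothetical, and the text of the lines is shortened to placeholders.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ChildSequenceSketch {
    // Reduced version of the poem; the line texts are placeholders.
    static final String XML =
          "<gedicht>\n"
        + "  <strophe ID=\"strophe1\">\n"
        + "    <zeile ID=\"zeile1\">Zeile 1</zeile>\n"
        + "    <zeile ID=\"zeile2\">Zeile 2</zeile>\n"
        + "    <zeile ID=\"zeile3\">Zeile 3</zeile>\n"
        + "  </strophe>\n"
        + "</gedicht>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Resolves a child sequence such as "/1/1/2", starting at the document node.
    // Returns null if a step does not exist.
    static Element resolve(Document doc, String childSequence) {
        Node current = doc;
        for (String step : childSequence.split("/")) {
            if (step.isEmpty()) continue; // leading slash
            int wanted = Integer.parseInt(step);
            int seen = 0;
            Node match = null;
            for (Node n = current.getFirstChild(); n != null; n = n.getNextSibling()) {
                // only element nodes are counted; text and comments are skipped
                if (n.getNodeType() == Node.ELEMENT_NODE && ++seen == wanted) {
                    match = n;
                    break;
                }
            }
            if (match == null) return null;
            current = match;
        }
        return current instanceof Element ? (Element) current : null;
    }

    public static void main(String[] args) throws Exception {
        Element target = resolve(parse(XML), "/1/1/2");
        System.out.println(target.getAttribute("ID") + ": " + target.getTextContent());
    }
}
```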
  • 57. 3.8. XLink, XPath and XPointer
    XML Path Language (XPath)
    XPath is a non-XML language for identifying particular parts of XML documents, i.e., for picking nodes and sets of nodes out of a tree
    Example:
      gedicht.xml#xpointer(/child::gedicht[position()=3])
    Here child is the axis, gedicht is the node test and [position()=3] is the predicate.
    Axes (relative to the context node): child, descendant, parent, ancestor, following, following-sibling, preceding, preceding-sibling
    [Figure: a tree showing the context node, the addressed node and the regions covered by each axis.]
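XPath expressions with explicit axes can be tried out with the javax.xml.xpath API. The sketch below uses a reduced poem with placeholder line texts; the class name XPathSketch and the helpers select and count are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Reduced version of the poem; the line texts are placeholders.
    static final String XML =
          "<gedicht><strophe ID=\"strophe1\">"
        + "<zeile ID=\"zeile1\">Zeile 1</zeile>"
        + "<zeile ID=\"zeile2\">Zeile 2</zeile>"
        + "<zeile ID=\"zeile3\">Zeile 3</zeile>"
        + "</strophe></gedicht>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the first node selected by the expression, or null.
    static String select(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
        return node == null ? null : node.getTextContent();
    }

    // Returns the number of nodes selected by the expression.
    static int count(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return ((NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET)).getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // axis::node-test[predicate], written out in full
        System.out.println(select(doc, "/child::gedicht/child::strophe/child::zeile[position()=2]"));
        // the same node, addressed through its attribute
        System.out.println(select(doc, "//zeile[@ID='zeile2']"));
        // siblings after the second line
        System.out.println(count(doc, "//zeile[@ID='zeile2']/following-sibling::zeile"));
    }
}
```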
  • 58. Picture created by Harald Sack
  • 59. 3.9. XSL Transformations (XSLT)
    XSLT overview
    - Extensible Stylesheet Language Transformations (XSLT) is an XML-based language used to specify rules by which one XML document is transformed into another document
    - The resulting document may be in XML syntax or in another format, such as HTML or plain text
    Examples of applications of XSLT:
    - convert data between different XML schemas
    - convert XML data into HTML or XHTML documents for web pages (e.g., with CSS)
    - create a dynamic web page
    - convert into an intermediate XML format that can be converted to PDF documents
    Such a transformation is based on the following languages:
    - XSLT: specifies the transformation rules
    - XSL-FO: describes the formatting (layout) of the result
    - XPath: gives access to specific parts of an XML document
  • 60. 3.9. XSL Transformations (XSLT)
    XSLT transformation principle (example)
    [Figure: XML documents (described by a DTD or XML Schema) and XSL documents 1 and 2 are passed to the XSLT processor, which produces output documents such as PDF, HTML or WML.]
  • 61. 3.9. XSL Transformations (XSLT)
    XSLT transformation principle (example)
    - XSL works on the abstract tree representation of the XML document
    - A set of transformation rules is required in the form of an XSLT document (XSL stylesheet); each rule is a template
    - The structure tree is traversed, and for each node the appropriate template from the XSL stylesheet is applied
    [Figure: tree representation of the XML document + XSL stylesheet → output document.]
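The template mechanism can be sketched with a small stylesheet that holds one template per kind of node and is applied in memory through the javax.xml.transform API. The stylesheet, the class name XsltTemplateSketch and the helper transform are assumptions made for this illustration; plain text output is chosen to keep the result easy to read.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltTemplateSketch {
    // One template for the data element, one for each row element.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/data'>"
        + "<xsl:apply-templates select='row'/>"
        + "</xsl:template>"
        + "<xsl:template match='row'>"
        + "Row <xsl:value-of select='@num'/>: <xsl:value-of select='.'/>;"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet above to the given XML document and returns the result.
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<data><row num='1'>toto 1</row><row num='2'>toto 2</row></data>";
        System.out.println(transform(xml));
    }
}
```

The processor starts at the root, finds the template matching data, and apply-templates then hands every row child to the template that matches row.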
  • 62. 3.9. XSL Transformations (XSLT)
    XML document:
      <data>
        <row num="1">toto 1</row>
        <row num="2">toto 2</row>
        <row num="3">toto 3</row>
      </data>
    XSL stylesheet:
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <body>
          <h1>Demo</h1>
          <xsl:for-each select="data/row">
            <br/>Row: <xsl:value-of select="@num"/> - Data: <xsl:value-of select="."/>
          </xsl:for-each>
        </body>
      </html>
    HTML document (produced by the XSLT processor):
      <html><body>
        <h1>Demo</h1>
        <br/>Row: 1 - Data: toto 1
        <br/>Row: 2 - Data: toto 2
        <br/>Row: 3 - Data: toto 3
      </body></html>
  • 63. 3.9. XSL Transformations (XSLT)
    Perform an XSL transformation using a Java program
      import java.io.FileReader;
      import javax.xml.transform.Result;
      import javax.xml.transform.Source;
      import javax.xml.transform.Transformer;
      import javax.xml.transform.TransformerFactory;
      import javax.xml.transform.stream.StreamResult;
      import javax.xml.transform.stream.StreamSource;

      public class TestTransformation {
        public static void main(String[] args) throws Exception {
          Source sourceXSL = new StreamSource(new FileReader("stylesheet.xsl"));
          Source sourceXML = new StreamSource(new FileReader("data.xml"));
          TransformerFactory trFac = TransformerFactory.newInstance();
          Transformer tf;
          Result resultOnScreen = new StreamResult(System.out);
          tf = trFac.newTransformer(sourceXSL);
          tf.transform(sourceXML, resultOnScreen);
        }
      }
  • 64. 3.9. XSL Transformations (XSLT)
    Perform an XSL transformation by referencing the stylesheet from the XML document
      <?xml version="1.0" encoding="ISO-8859-1"?>
      <?xml-stylesheet type="text/xsl" href="stylesheet.xsl" ?>
      <person type="Teacher">
        <name>Serge Linckels</name>
        <hp>http://www.linckels.lu</hp>
        <size>173</size>
        <phone>691-111111</phone>
      </person>
  • 65. 3.10. References
    - Elliotte R. Harold, W. Scott Means: XML in a Nutshell
    - Serge Linckels, Christoph Meinel: E-Librarian Service: User-Friendly Semantic Search in Digital Libraries