1. Semantic Analysis in Language Technology
http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm
Ontologies and the Semantic Web
Marina Santini
santinim@stp.lingfil.uu.se
Department of Linguistics and Philology
Uppsala University, Uppsala, Sweden
Spring 2016
3. Outline
• The Semantic Web
• Ontologies
The Semantic Web & Ontologies
4. Chronology
http://en.wikipedia.org/wiki/History_of_the_World_Wide_Web
• On August 6, 1991, Berners-Lee posted a short summary of the World Wide Web project on the alt.hypertext newsgroup, inviting collaborators. This date also marked the debut of the Web as a publicly available service on the Internet, although new users could only access it after August 23.
• Beginning in 2002, new ideas for sharing and exchanging content ad hoc, such as Weblogs and RSS, rapidly gained acceptance on the Web. This new model for information exchange, primarily featuring user-generated and user-edited websites, was dubbed Web 2.0.
• Popularized by Berners-Lee's book Weaving the Web (2000) and a Scientific American article by Berners-Lee, James Hendler, and Ora Lassila, the term Semantic Web describes an evolution of the existing Web in which the network of hyperlinked human-readable web pages is extended by machine-readable metadata about documents and how they are related to each other, enabling automated agents to access the Web more intelligently and perform tasks on behalf of users.
• In 2006, Berners-Lee and colleagues stated that the idea "remains largely unrealized".
5. Web 1.0
• Web 1.0 is a retronym referring to an early stage of the World Wide Web's evolution.
• Some design elements of a Web 1.0 site include:
– Personal web pages were common, consisting mainly of static pages
– Static pages instead of dynamic HTML
– The use of HTML 3.2-era elements such as frames and tables to position and align elements on a page (now we use CSS, and frames are deprecated)
– GIF buttons...
6. Web 2.0
• Web 2.0 describes World Wide Web sites that use technology beyond the static pages of earlier Web sites.
• The key features of Web 2.0 include:
– Tagging - allows users to collectively classify and find information
– Rich User Experience - dynamic content; responsive to user input
– User Participation - information flows two ways between site owner and site user by means of evaluation, review, and commenting
– Site users add content for others to see
– Mass Participation - universal web access leads to differentiation of concerns from the traditional internet userbase
– etc.
7. Web 3.0
• “Web 3.0, a phrase coined by John Markoff of the New York Times in 2006, refers to a supposed third generation of Internet-based services that collectively comprise what might be called ‘the intelligent Web’ - such as those using semantic web, microformats, natural language search, data-mining, machine learning, recommendation agents, and artificial intelligence technologies - which emphasize machine-facilitated understanding of information in order to provide a more productive and intuitive user experience.”
• Web 3.0 will be more connected, open, and intelligent, with semantic Web technologies, distributed databases, natural language processing, machine learning, machine reasoning, and autonomous agents.
– http://lifeboat.com/ex/web.3.0
This has yet to happen.
8. The web: present and future
• "The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help.
• One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the Web.
• Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form."
– Tim Berners-Lee, The Semantic Web Roadmap, 1998
– http://www.w3.org/DesignIssues/Semantic.html
9. Today…
• The web is relatively simple:
– Hypertexts and hypermedia
– Access is engineered via a combination of keyword-based search and link navigation.
This simplicity has been one of the great strengths of the web, and has been an important factor in its popularity and in enabling users to create their own content.
10. Shortcomings
Examples:
• Finding information about people with very common names can be a frustrating experience.
• Answering more complex queries, along with more general information retrieval, integration, sharing and processing, can be difficult…
We have seen that…
11. Some solutions
• Software glue: Mashups
– location information from one source might be combined with map information from another source in order to show the location of, and provide directions to, points of interest such as hotels and restaurants.
• Tagging via social networks (Web 2.0)
– harness the power of user communities in order to share and annotate information.
• Examples include image and video sharing sites such as Flickr and YouTube, and auction sites such as eBay.
– In these applications, annotations usually take the form of simple tags, such as "beach", "birthday", "family" and "friends". The meaning of tags is, however, typically not well defined, and may be impenetrable even to human users: typical examples (from Flickr) include "sasquatchmusicfestival", "celebritylookalikes", and "wab08".
12. The "travel agent"
• The classic example of a semantic web application is an automated travel agent that, given various constraints and preferences, would offer the user suitable travel or vacation suggestions.
• A key feature of such a "software agent" is that it would not simply exploit a predetermined set of information sources, but would search the web for relevant information in much the same way that a human user might do when planning a vacation.
13. The goal
• The goal of the Semantic Web is to allow web information and services to be more effectively exploited by humans and automated tools.
14. Semantic Web
• The focus of the semantic web is to share data instead of documents.
• In other words, it is a project that should provide a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
• It is a collaborative effort led by the World Wide Web Consortium (W3C).
15. Semantic Web & Ontologies
• How are we going to represent meaning and knowledge on the web?
• A key idea behind the semantic web is to address this problem by giving machine-accessible semantics via annotation.
• Knowledge is represented in the form of rich conceptual schemas called ontologies.
• Ontologies are the backbone of the Semantic Web.
• Ontologies are rich conceptual schemas that give formally defined meanings to the terms used in annotations, transforming them into semantic annotations.
• They provide the knowledge that is required for semantic applications of all kinds.
16. Main Difficulty
• Current web content is intended for humans (HTML markup with layout, images and other presentational features).
• Humans understand this content, but machines can't.
17. Basically...
• Ontologies provide a shared understanding of a domain.
• They provide background knowledge to automatize certain tasks.
• By the process of annotation, knowledge can be linked to ontologies.
– Example: “Angelina Jolie” (Text) linked to concept Actress
– In our ontology we also know that an actress always is female and a person.
• Ontologies allow the creation of annotations → machine-readable and machine-understandable content.
• If machines can understand content, they can also perform more
meaningful and intelligent queries.
– Distinction of Jaguar the animal and the car.
– Combination of information that is distributed on the Web.
18. Old and New Issues
Old ones:
• Knowledge representation
• Reasoning
• Harnessing the idiosyncrasies of natural languages
• …
New ones:
• Integrating different ontologies may prove to be at least as hard as integrating the resources that they describe
• Creation of suitable annotations
• …
19. Regardless of these issues…
• … considerable progress has been made in the development of the infrastructure needed to support the semantic web.
• In particular, there has been impressive progress in the development of languages and tools for content annotation and for the design and deployment of ontologies.
20. Semantic Annotation
• To facilitate the process of semantic annotation, RDF and OWL have been developed as standard formats for the sharing and integration of data and knowledge.
• RDF and OWL are standards:
– RDF (Resource Description Framework)
– OWL (Web Ontology Language)
21. Ontologies (Metaphysics)
• Ontology, in its original philosophical sense, is a fundamental branch of metaphysics focusing on the study of existence.
• Its objective is to determine what entities and types of entities actually exist, and thus to study the structure of the world.
• The study of ontology can be traced back to the work of Plato and Aristotle, and includes the development of hierarchical categorisations of different kinds of entities and the features that distinguish them.
Tree of Porphyry
22. Tree of Porphyry, 3rd century AD
• The Porphyrian tree, Tree of Porphyry or Arbor Porphyriana is a classic device for illustrating what is also called a "scale of being". It was suggested by the 3rd century AD Greek neoplatonist philosopher and logician Porphyry.
23. Ontology (Computer Science, AI, LT, IR…)
• Engineering artefact, usually a model of some aspect of the world.
• It introduces vocabulary describing various aspects of the domain being modelled, and provides an explicit specification of the intended meaning of the vocabulary.
• This specification often includes classification-based information, not unlike that in Porphyry's tree.
24. What is an ontology (i)?
“An ontology is a formal, explicit specification of a shared conceptualization”
Studer, Benjamins, Fensel. Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering. 25 (1998) 161-197
“An ontology is an explicit specification of a conceptualization”
Gruber, T. A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition. Vol. 5. 1993. 199-220
The terms in the first definition are annotated as follows:
• formal: machine-readable
• explicit: concepts, properties, relations, functions, constraints, axioms are explicitly defined
• shared: consensual knowledge
• conceptualization: abstract model and simplified view of some phenomenon in the world that we want to represent
25. What is an ontology (ii)?
• An ontology is a hierarchically structured set of terms for describing a domain that can be used as a skeletal foundation for a knowledge base
B. Swartout; R. Patil; K. Knight; T. Russ. Toward Distributed Use of Large-Scale Ontologies. Ontological Engineering. AAAI-97 Spring Symposium Series. 1997. 138-148
• An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary
Neches, R.; Fikes, R.; Finin, T.; Gruber, T.; Patil, R.; Senator, T.; Swartout, W.R. Enabling Technology for Knowledge Sharing. AI Magazine. Winter 1991. 36-56
• An ontology provides the means for describing explicitly the conceptualization behind the knowledge represented in a knowledge base
A. Bernaras; I. Laresgoiti; J. Correra. Building and Reusing Ontologies for Electrical Network Applications. ECAI96. 12th European Conference on Artificial Intelligence. Ed. John Wiley & Sons, Ltd. 298-302
26. Examples
• Top level ontology: Standard Upper Ontology
– In information science, an upper ontology (also known as a top-level ontology or foundation ontology) is an ontology (in the sense used in information science) which describes very general concepts that are the same across all knowledge domains.
• Linguistic ontology: WordNet
• General ontology: Cyc, UNSPSC, eCl@ss
• Domain ontology: MeSH (Medical Subject Headings), CHEMICALS, UMLS
• Research ontology: KA2 (Knowledge Acquisition Community Ontology)
27. Resource Description Framework (i)
• A language that has been developed in order to provide an extensible mechanism for describing web resources and relationships between them.
• A key feature of RDF is the use of Internationalized Resource Identifiers (IRIs), a generalisation of Uniform Resource Locators (URLs), to refer to resources.
• RDF is a very simple language: its underlying data structure is a labelled directed graph, and its only syntactic construct is the triple.
• A triple consists of three components, referred to as the subject, predicate and object.
Note: a directed graph is a set of nodes connected by edges, where the edges have a direction associated with them.
Note: "IRI" is pronounced /ˈaɪˌɑːˌraɪ/.
28. RDF (ii)
• More formally, a triple represents a single edge (labelled with the predicate) connecting two nodes (labelled with the subject and object); it describes a binary relationship between the subject and object via the predicate.
• The predicate of a triple is always an IRI, and an IRI that is used in the predicate position of a triple is called a property.
• A set of triples is called an RDF graph.
• In order to facilitate the sharing and exchanging of graphs on the web, an XML serialisation has also been defined.
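The definitions above (triple, property, RDF graph) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `http://example.org/hogwarts#` namespace, the `Triple` type and the helper names are hypothetical, and no RDF library is used.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a directed edge from subject to object."""
    subject: str
    predicate: str  # always an IRI; in this position it is called a property
    obj: str


# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/hogwarts#"

# An RDF graph is simply a set of triples.
graph = {
    Triple(EX + "HarryPotter", EX + "hasPet", EX + "Hedwig"),
    Triple(EX + "Hedwig", EX + "type", EX + "Owl"),
}


def nodes(g):
    """Nodes are every subject and object that appears in the graph."""
    return {t.subject for t in g} | {t.obj for t in g}


def properties(g):
    """Properties are the IRIs used in predicate position."""
    return {t.predicate for t in g}
```

Representing the graph as a plain set mirrors the definition "a set of triples is called an RDF graph": adding the same statement twice changes nothing.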
29. "Harry Potter has a pet called Hedwig…"
Figure panels: RDF/XML, RDF graph
30. Lect 09: Relation Extraction: DBPedia
Relation databases that draw from Wikipedia
• Resource Description Framework (RDF) triples
subject                     predicate               object
Golden Gate Park            location                San Francisco
dbpedia:Golden_Gate_Park    dbpedia-owl:location    dbpedia:San_Francisco
• DBPedia: The DBpedia project uses the Resource Description Framework (RDF) to represent the extracted information and consists of 3 billion RDF triples, 580 million extracted from the English edition of Wikipedia and 2.46 billion from other language editions (Wikipedia, March 2016).
31. … but … not enough…
• Capabilities of RDF as an ontology language are limited
– No cardinality constraints
– Not possible to describe conjunction of classes
– …
RDF is a very simple language
Note: the cardinality of a set is a measure of the "number of elements of the set". For example, the set A = {2, 4, 6} contains 3 elements, and therefore A has a cardinality of 3.
32. Need for a more expressive ontology language: OWL (Web Ontology Language)
• Since the architecture of the web depends on agreed standards, the World Wide Web Consortium (W3C) set up a standardisation working group to develop a standard for a web ontology language.
• The result of this activity was the OWL ontology language standard.
• The integration of OWL with RDF has the advantage of making OWL ontologies directly accessible to web based applications.
33. Back Story:
http://ileriseviye.wordpress.com/2011/11/01/why-web-ontology-language-is-abbreviated-as-owl-and-not-wol/
34. Description Logics (DLs)
• A key feature of OWL is its basis in Description Logics, a family of logic-based knowledge representation formalisms that have a formal semantics based on first-order logic (FOL).
35. Description Logics
• We can use DLs to model an application domain. The focus is then on:
– Representation of knowledge about categories
– The set of categories in an application domain is called a terminology
– The terminology is arranged in a hierarchical organization called an ontology, which captures superset and subset relations among categories/concepts.
– In order to specify a hierarchical structure, we can use subsumption relations between the appropriate concepts in a terminology
– Subsumption is a form of inference. It determines whether a superset/subset relation (based on the facts asserted in a terminology) exists between two concepts.
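Subsumption checking over an asserted hierarchy can be sketched as follows. This is a deliberately minimal illustration: the concept names in `ASSERTED` are hypothetical, and the function only follows asserted subclass links transitively, whereas a real DL reasoner also reasons over complex class descriptions.

```python
# Asserted subclass relations (hypothetical terminology): child -> direct parents.
ASSERTED = {
    "Owl": {"Bird"},
    "Bird": {"Animal"},
    "Phoenix": {"Bird"},
    "Animal": set(),
}


def subsumes(general, specific, asserted=ASSERTED):
    """Return True if `general` is a superset concept of `specific`.

    Follows asserted subclass links transitively; a concept subsumes itself.
    """
    seen = set()
    stack = [specific]
    while stack:
        concept = stack.pop()
        if concept == general:
            return True
        if concept in seen:
            continue
        seen.add(concept)
        stack.extend(asserted.get(concept, ()))
    return False
```

Here "Animal subsumes Owl" is never asserted directly; it is inferred from the two asserted links Owl → Bird and Bird → Animal.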
36. In short, DLs are…
• … formalisms based on object-oriented modelling, in which the domain is described in terms of individuals (instances), concepts (classes), and roles (properties/predicates):
– individuals, e.g., "Hedwig", are the basic elements of the domain;
– concepts, e.g., "Owl", describe sets of individuals having similar characteristics;
– roles, e.g., "hasPet", describe relationships between pairs of individuals, such as "HarryPotter hasPet Hedwig".
37. Axioms
• An OWL ontology consists of a set of axioms
• Example:
– given the axiom C equivalentClass D, an individual is an instance of C if and only if it is an instance of D.
– Combining axioms with class descriptions allows for easy extension of the vocabulary by introducing new names as abbreviations for descriptions. The following axiom:
Class: HogwartsStudent
    EquivalentTo: Student and attendsSchool value Hogwarts
introduces the class name HogwartsStudent, and asserts that its instances are just those Students who attend Hogwarts.
38. TBox & ABox
• Axioms describe constraints on the structure of the domain:
– in DLs such a set of axioms is called a TBox (Terminology Box).
• OWL also allows for axioms asserting facts about some concrete situation, similar to data in a database setting:
– in DLs such a set of axioms is called an ABox (Assertion Box).
39. Decidability (i)
• Description Logics are fully-fledged logics and so have a formal semantics.
• DLs can be seen as decidable subsets of FOL with:
– individuals being equivalent to constants,
– concepts to unary predicates,
– roles to binary predicates.
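As a worked illustration of this correspondence, the HogwartsStudent axiom from the Axioms slide can be written once in DL notation and once as a first-order sentence. The DL rendering below is one conventional way of writing the Manchester-syntax restriction "attendsSchool value Hogwarts"; it is offered as a sketch, not as the notation used in the lecture.

```latex
% DL axiom: concepts are unary predicates, the role is a binary predicate,
% and the individual Hogwarts is a constant.
\mathit{HogwartsStudent} \equiv \mathit{Student} \sqcap \exists\, \mathit{attendsSchool}.\{\mathit{Hogwarts}\}

% Corresponding first-order sentence.
\forall x \, \bigl( \mathit{HogwartsStudent}(x) \leftrightarrow
    \mathit{Student}(x) \land \mathit{attendsSchool}(x, \mathit{Hogwarts}) \bigr)
```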
40. FOL … undecidable (sometimes)
• Church and Turing showed in 1936 that first-order logic is in general undecidable. (Gödel's incompleteness theorems, published in 1931, are a related but distinct result.)
• That means there is no algorithm that can decide, for every statement in this logic, whether or not it is valid.
• Ex: such an algorithm could be used to solve the Halting Problem
41. Halting Problem
• In 1936 Alan Turing proved that it's not possible to decide whether an arbitrary program will eventually halt, or run forever.
• The official definition of the problem is to write a program (actually, a Turing Machine*) that accepts as parameters a program and its parameters. That program needs to decide, in finite time, whether the given program will ever halt when run with these parameters.
• The halting problem is a cornerstone problem in computer science. It is used mainly as a way to prove a given task is impossible, by showing that solving that task would allow one to solve the halting problem.
*A Turing machine is a hypothetical device that manipulates symbols according to a table of rules. Despite its simplicity, a Turing machine can be adapted to simulate the logic of any computer algorithm.
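The usual argument for why no such deciding program can exist can be sketched in Python. Given any candidate decider `halts`, one builds a program that does the opposite of what the decider predicts for it. The function names are hypothetical, and the sketch only runs the case that is safe to execute (a candidate that always answers "does not halt").

```python
def make_paradox(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def paradox():
        if halts(paradox):
            while True:   # predicted to halt, so loop forever
                pass
        return "halted"   # predicted to loop, so halt immediately
    return paradox


def claims_nothing_halts(program):
    """A (wrong) candidate decider that always answers 'does not halt'."""
    return False


paradox = make_paradox(claims_nothing_halts)
prediction = claims_nothing_halts(paradox)   # False: "paradox never halts"
outcome = paradox()                          # yet it returns immediately
```

Whatever a candidate decider answers about its own `paradox` program, the program behaves the other way, so the candidate is wrong on at least one input. A decider that answered True instead would be refuted by `paradox` looping forever, which is why that case is not executed here.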
42. Decidability (ii)
• DLs give a precise and unambiguous meaning to descriptions of the domain
• This also allows for the development of reasoning algorithms that can provide correct answers to arbitrarily complex queries about the domain.
43. Reasoning: OWL vs Databases
OWL axioms behave like inference rules rather than database constraints.
Class: Phoenix
    SubClassOf: isPetOf only Wizard

Individual: Fawkes
    Types: Phoenix
    Facts: isPetOf Dumbledore
• Fawkes is said to be a Phoenix and to be the pet of Dumbledore, and it is also stated that only a Wizard can have a pet Phoenix.
• In OWL, this leads to the implication that Dumbledore is a Wizard. That is, if we were to query the ontology for instances of Wizard, then Dumbledore would be part of the answer.
• In a database setting the schema could include a similar statement about the Phoenix class, but in this case it would be interpreted as a constraint on the data: adding the fact that Fawkes isPetOf Dumbledore without Dumbledore being already known to be a Wizard would lead to an invalid database state, and such an update would therefore be rejected by a database management system as a constraint violation.
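The contrast between the two readings can be sketched in Python. This is a toy model under stated assumptions: the data structures and function names are hypothetical, the rule is hard-coded for the Phoenix example, and neither function stands in for a real DL reasoner or database engine.

```python
# Facts asserted so far (hypothetical, mirroring the Fawkes example).
types = {"Fawkes": {"Phoenix"}}          # individual -> set of classes
is_pet_of = [("Fawkes", "Dumbledore")]   # (pet, owner) pairs


def infer_wizards(types, is_pet_of):
    """Ontology-style reading: 'Phoenix SubClassOf isPetOf only Wizard'
    lets us conclude that every owner of a Phoenix is a Wizard."""
    inferred = {name: set(classes) for name, classes in types.items()}
    for pet, owner in is_pet_of:
        if "Phoenix" in inferred.get(pet, set()):
            inferred.setdefault(owner, set()).add("Wizard")
    return inferred


def check_constraint(types, is_pet_of):
    """Database-style reading: the same statement is a check on the data.
    Returns the facts that would be rejected as constraint violations."""
    return [
        (pet, owner)
        for pet, owner in is_pet_of
        if "Phoenix" in types.get(pet, set())
        and "Wizard" not in types.get(owner, set())
    ]
```

With the same input, `infer_wizards` adds new knowledge (Dumbledore is a Wizard), while `check_constraint` reports the Fawkes fact as a violation because Dumbledore was not already known to be a Wizard.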
44. Ontology Development Tools
• State-of-the-art ontology development tools, such as SWOOP, Protégé, and TopBraid Composer, use DL reasoners to provide feedback to the user about the logical implications of their design:
– e.g. warnings about inconsistencies and synonyms.
46. VOWL: Visual Notation for OWL Ontologies
http://vowl.visualdataweb.org/v2/
47. Domain-specific ontologies
• The availability of tools has contributed to the increasingly widespread use of OWL, and it has become the de facto standard for ontology development in fields as diverse as
– Biology
– Medicine
– Geography
– Geology
– Agriculture
– Defence
– etc.
48. Complex Queries
• The use of DL reasoners allows OWL ontology applications to answer complex queries and to provide guarantees about the correctness of the result.
• Reliability and correctness are clearly important features of any information system;
• They are particularly important if ontology based systems are to be used in safety-critical applications such as medicine, where incorrect reasoning could adversely impact patient care.
49. Standard Query Language
• It has long been recognised that the semantic web, and semantic web knowledge representation languages such as RDF and OWL, would also benefit from the availability of a standardised query language (comparable to SQL for relational databases).
• A W3C standardisation working group was set up, and has completed its work on the SPARQL query language standard.
50. SPARQL Protocol and RDF Query Language …
• … is an RDF query language, i.e. a query language that can retrieve and manipulate data stored in RDF format (i.e. triples).
• SPARQL allows for a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns.
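How a conjunction of triple patterns is matched against a graph can be sketched in plain Python. This is not SPARQL itself and uses no SPARQL library: the graph contents, `match_pattern` and `query` are hypothetical, and only conjunctions are covered (no disjunctions or optional patterns).

```python
# A tiny RDF-like graph as (subject, predicate, object) tuples; names are hypothetical.
GRAPH = {
    ("HarryPotter", "hasPet", "Hedwig"),
    ("Hedwig", "type", "Owl"),
    ("Dumbledore", "hasPet", "Fawkes"),
    ("Fawkes", "type", "Phoenix"),
}


def match_pattern(pattern, binding, graph):
    """Yield extensions of `binding` that make one triple pattern match.
    Terms starting with '?' are variables, as in SPARQL."""
    for triple in graph:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its existing value.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new


def query(patterns, graph=GRAPH):
    """Conjunction of triple patterns: every pattern must match under one binding."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match_pattern(pattern, b, graph)]
    return bindings


# "Who has a pet that is an owl?"
owl_owners = query([("?person", "hasPet", "?pet"), ("?pet", "type", "Owl")])
```

The shared variable `?pet` is what joins the two patterns: a binding survives only if the same individual is both somebody's pet and an Owl.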
51. Tags & Ontologies
• Tagging facilities within Web 2.0 applications have shown how it might be possible for user communities to collaboratively annotate web content, and create simple forms of ontology via the development of hierarchically organised sets of tags, often called folksonomies…
52. Challenges
• Currently hard to combine:
– Increased expressive power (by using more sophisticated logics) with scalability (large ontologies)
53. Ontology Learning
• Ontology learning (ontology extraction, ontology generation, or ontology acquisition) is the automatic or semi-automatic creation of ontologies, including extracting the corresponding domain's terms and the relationships between those concepts from a corpus of natural language text, and encoding them with an ontology language for easy retrieval.
• As building ontologies manually is extremely labor-intensive and time consuming, there is great motivation to automate the process.
• Typically, the process starts by extracting terms and concepts or noun phrases from plain text using linguistic processors such as part-of-speech tagging and phrase chunking. Then statistical techniques, often based on Machine Learning, are used to extract relations.
– http://en.wikipedia.org/wiki/Ontology_learning
54. In summary… Why build an ontology?
• To share common understanding of the structure of information among people or software agents
• To enable reuse of domain knowledge
• To make domain assumptions explicit
• To analyze domain knowledge
55. How to build an ontology
Generally (and roughly) speaking, when designing an ontology, four main components are used:
1. Classes
2. Relations
3. Axioms
4. Instances
56. Classes
• Concepts of the domain or tasks, which are usually organized in taxonomies
Ex: in a university ontology, student and professor are two classes
57. Relations
A type of interaction between concepts of the domain
Ex: subclass-of or is-a are relations
58. Axioms
Assertions that are always true for the domain of interest
Ex: if a student attends both "Math" and "Basic text processing" courses, then he or she must be a 1st year student.
59. Instances
Represent specific elements
Ex: a Student called Peter is an instance of the Student class
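The four components can be put together in a small sketch of the university example. Everything here is a toy model: the dictionaries, the class name `FirstYearStudent` and the function `apply_first_year_axiom` are hypothetical, and a real ontology would be written in a language such as OWL rather than in Python.

```python
# 1. Classes, organised in a small taxonomy (child -> parent).
SUBCLASS_OF = {"Student": "Person", "Professor": "Person", "FirstYearStudent": "Student"}

# 2. Relations between concepts (here: which courses an individual attends).
attends = {"Peter": {"Math", "Basic text processing"}}

# 4. Instances: specific elements and the classes they belong to.
instance_of = {"Peter": {"Student"}}


# 3. Axiom: a student attending both courses must be a first-year student.
def apply_first_year_axiom(instance_of, attends):
    required = {"Math", "Basic text processing"}
    result = {name: set(classes) for name, classes in instance_of.items()}
    for name, classes in result.items():
        if "Student" in classes and required <= attends.get(name, set()):
            classes.add("FirstYearStudent")
    return result
```

Applying the axiom to the asserted facts yields the additional conclusion that Peter is a first-year student, without changing the original `instance_of` data.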
60. Important!
• There is no single correct class hierarchy for any given domain.
• The hierarchy depends on the possible uses of the ontology.
• The level of detail depends on the applications and purposes.