Integrating Relational Databases with the Semantic Web: A Reflection

Smart Data for Smarter Business | © 2016 Capsenta | capsenta.com
Integrating
Relational
Databases

with
the
Semantic
Web:
A
Reflection
Juan
F.
Sequeda
Joint
work
with
Daniel
P.
Miranker
(UT
Austin)
and
Marcelo
Arenas
(PUC
Chile)
Thanks
to:
Oscar
Corcho,
Aibo Tian,
Mayank Kejriwal,
Hamid
Tirmizi
13th
Reasoning
Web
Summer
School
(RW
2017)
– July
7
to
11,
2017
– London,
UK

Take
away
message
of
this
talk
• Reflect
on
10
years
of
(our)
research
on

Integrating
Relational
Database
with
the

Semantic
Web
– DISCLAIMER:
This
is
NOT
a
Survey
– W3C
Relational
Database
to
RDF
Standards

(Science
vs
Engineering)
• Provide
answer
to
the
research
question:

• Thesis:
How and
to
what extent can
Relational
Databases
be

integrated
with
the
Semantic
Web?
Much
of
the
existing
Relational
Database
infrastructure
can
be

reused
to
support
the
Semantic
Web

Data
Logic
RDBMS
Semantic

Web
Workshop
on

Logic
and
Data
Bases,

Toulouse
1977
Gallaire,
Nicolas
&

Minker
SQL99
Recursion
KL-‐ONE
Description

Logic RDF OWL
Views Triggers
Semantic
Networks
Japanese
5th
Generation
Project
MCC
Austin,
TX
Today1970s
Relational

Algebra
Workshops
on
Expert
Systems
Deductive
Databases
KRDB
1980s 1990s 2000s
Let’s
put
History
in
Today’s
Context
4

What
is
the
relationship
between
Relational
Model
Table
Definition
ConstraintsS
Q
L
Relational
Databases
RDF
RDFS
OWL
S
P
A
R
Q
L
TIME
Triggers Rules
Semantic
Web
Sequeda
et
al.
SQL
Databases
are
a
Moving
Target.
W3C
Workshop
on
RDF
Access
on
RDB.
2007
Progra
mmer
type
2 “Bob”
name
ITEmployee
subClassOf
SELECT
?s
?n
{
?s
type
ITEmployee.
?s
name
?n
}
Literal
name

Once
upon
a
time
…
• D2R
(Map,Q,Server),
Virtuoso
RDF
Views,
SquirrelRDF,
R2D2,

Relational.OWL,
DB2OWL,
R2O,
Triplify,
Dartgrid,
RDBToOnto,

METAmorphoses,…
https://www.w3.org/2007/03/RdfRDB/

F2F
Meeting

ISWC
2008
March
2008 February
2009
1. Recommendation

to
standardize
a

mapping
language
2. RDB2RDF
Survey
(2)
http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
(1)
http://www.w3.org/2005/Incubator/rdb2rdf/XGR-‐rdb2rdf-‐20090126/
October
2008

Sept
2009 Sept
2012

0
50
100
150
200
250
Sep-‐09
Oct-‐09
Nov-‐09
Dec-‐09
Jan-‐10
Feb-‐10
Mar-‐10
Apr-‐10
May-‐10
Jun-‐10
Jul-‐10
Aug-‐10
Sep-‐10
Oct-‐10
Nov-‐10
Dec-‐10
Jan-‐11
Feb-‐11
Mar-‐11
Apr-‐11
May-‐11
Jun-‐11
Jul-‐11
Aug-‐11
Sep-‐11
Oct-‐11
Nov-‐11
Dec-‐11
Jan-‐12
Feb-‐12
Mar-‐12
Apr-‐12
May-‐12
Jun-‐12
Jul-‐12
Aug-‐12
Sep-‐12
Oct-‐12
First
F2F

@Semtech 2010
FPWD
R2RML
WD
R2RML
+
DM
Rec
R2RML
+
DM
Candidate
Rec
R2RML
+
DM Proposed
Rec
R2RML
+
DM
FPWD
DM
WD
R2RML+DM
WD
R2RML+DM
Photo
from
cygri http://www.flickr.com/photos/cygri/4719458268/

W3C
Relational
Database
to
RDF
(RDB2RDF)
Standards
• Tools:
Ultrawrap,
Morph,
ontop,
…
• Ontology
Based
Data
Access
(OBDA)

Outline
• 9:00
– 10:30
– Intro
– Relational
Database
à Semantic
Web:

Data
Mapping
• 10:30
– 11:00
– Coffee
Break
• 11:00
– 12:30
– Semantic
Web
à Relational
Database:

Data
Access
– Conclusion
11

RELATIONAL
DATABASES
à
SEMANTIC
WEB:
DATA
MAPPING
12

RDF
W3C
Direct
Mapping
Overview
Relational
Database
Direct

Mapping
Engine
Input:

Database
(Schema
and
Data)
Primary
Keys
Foreign
Keys
Output
RDF
graph

orderid date total currency status
1234 2017-‐07-‐07 100 USD 1
Order
LineItem
LineItem/lineid=6789
Order#orderid
6789
Input
• Relational
Schema
• Primary
Keys
PK
and

Foreign
Keys
FK over
R
• Relational
Data
Output
• RDF
graphDirect
Mapping
W3C
Direct
Mapping
14
lineid price quantity product orderid
6789 30 2 Shoes Foo 1234
6790 20 2 Tshirt Bar 1234
<LineItem/lineid=6790>
<Order/orderid=1234>
LineItem#ref-‐orderid LineItem#ref-‐orderid
1234
2017-‐07-‐07
100
USD
1
30
2
Shoes
Foo
1234
6790
20
2Tshirt Bar
1234
Order#date
Order#total Order#currency
Order#status
Lineitem#lineid
Lineitem#price
Lineitem#quantity
Lineitem#product
Lineitem#orderid
Lineitem#lineid
Lineitem#price
Lineitem#quantity
Lineitem#product

What
do
we
need
to
automatically
generate?
• Generate
Identifiers
– IRI
– Blank
Nodes
• Generate
Triples
– Table
– Literal
– Reference

Generating
Identifiers
• Identifier
for
rows,
tables,
columns
and

foreign
keys
• If
a
table
has
a
primary
key,

– then
the
row
identifier
will
be
an
IRI,

– otherwise
a
blank
node
• The
identifiers
for
table,
columns
and
foreign

keys
are
IRIs
• IRIs
are
generated
by
appending
to
a
given

base
IRI
• All
strings
are
percent
encoded

Row
Node
1)
<http://www.ex.com/Person/ID=1>
Base
IRI “Table
Name”/“PK
attr”=“PK
value”
2)
<http://www.ex.com/Person/ID=1;SID=123>
Base
IRI “Table
Name”/“PK
attr”=“PK
value”
3)
Fresh
Blank
Node

More
IRI
1)
<http://www.ex.com/Person>
Base
IRI “Table
Name”
2)
<http://www.ex.com/Person#NAME>
Base
IRI “Table
Name”#“Attribute”
3)
<http://www.ex.com/Person#ref-‐CID>
Base
IRI “Table
Name”#ref-‐“Attribute”

ID (pk) NAME AGE
1 Alice 25
2 Bob NULL
Person
Table
Triple
19
rdf:type

“Alice”
.
Literal
Triples
20
ID (pk) NAME AGE
1 Alice 25
2 Bob NULL
Person

ID
(pk)
NAME AGE
CID
(fk)
1 Alice 25 100
2 Bob NULL 200
Person
CID
(pk)
TITLE
100 Austin
200 Madrid
City
Reference
Triples
21
<http://www.ex.com/Person#ref-‐CID>
<http://www.ex.com/City/CID=100>.

Direct
Mapping
Result
22
ID NAME AGE CID
1 Alice 25 100
2 Bob NULL 100
Person
CID NAME
100 Austin
200 Madrid
City
<Person/ID=1>
<City/CID=100>
Alice
25
Austin
<Person/ID=2>
Alice
<City/CID=200> Madrid
<Person#NAME>
<Person#AGE> <Person#NAME>
<Person#NAME>
<Person#NAME>
<Person#ref-‐CID>
<Person#ref-‐CID>

Summary
of
W3C
Direct
Mapping
• Default
and
Automatic
Mapping
• URIs
are
automatically
generated
– <table>
– <table#attribute>
– <table#ref-‐attribute>
– <Table/pkAttr=pkValue>
• RDF
represents
the
same
relational
schema
• RDF
can
be
transformed
by

SPARQL
CONSTRUCT
– RDF
represents
the
structure
and
ontology
of
mapping

author’s
choice
23

Issues
with
the
W3C
Direct
Mapping
1. Mapping
is
only
from
Relational
data
to
RDF

data.

– The
relational
schema
is
not
taken
in
account.

Hence
no
relational
schema
to
OWL
2. Semantics
is
not
defined
for
NULL
values
– “The
direct
mapping
does
not
generate
triples

for
NULL
values.
Note
that
it
is
not
known
how
to

relate
the
behavior
of
the
obtained
RDF
graph

with
the
standard
SQL
semantics
of
the
NULL

values
of
the
source
RDB.”

24

Research
Problem
with
Direct
Mapping
• How
can
a
relational
database
schema
and

data,
including
nulls,
be
automatically
mapped
to
RDF
and
OWL?
• How
can
we
assure
correctness
of
mapping?
– Information
Preservation:
no
information
is
lost
– Query
Preservation:
no
queries
are
lost
– Monotonicity:
inserts
does
not
affect
– Semantics
Preservation:
constraints
are
not
lost
25
Hypothesis:
Relational
Databases
can
be
automatically

mapped
to
RDF
and
OWL
under
a
correct
mapping

1234 2017-‐07-‐07 100 USD 1
Order
LineItem
Order#orderid
6789
Input
• Relational
Schema
R
• Set
Σ of
Primary
Keys
PK

and
Foreign
Keys
FK over
R
• Instance
I of
R
Output
• RDF
graph
• OWL
Ontology
as

RDF
graph
Direct
Mapping
Direct
Mapping
as
Ontology
Overview
26
6789 30 2 Shoes Foo 1234
6790 20 2 Tshirt Bar 1234
<LineItem/lineid=6790>
<Order/orderid=1234>
LineItem#ref-‐orderid
1234
2017-‐07-‐07
100
USD
1
30
2
Shoes
Foo
1234
6790
20
2Tshirt Bar
1234
Order#date
Order#total Order#currency
Order#status
Lineitem#lineid
Lineitem#price
Lineitem#quantity
Lineitem#product
Lineitem#orderid
Lineitem#lineid
Lineitem#price
Lineitem#quantity
Lineitem#product
<Order> <LineItem>
LineItem#ref-‐orderid
owl:Class
rdf:type rdf:type
We
need
to
be
careful
about
two
issues
• Binary
Relations
• NULLs

NULLs
• What
should
we
do
with
NULLs?
– Generate
a
Blank
Node
– Don’t
generate
a
triple
27
How
do
we

reconstruct
the

NULL?
lineid product comment
6789 Shoes
Foo “…”
6790 Tshirt Bar NULL
“…”
_:a
LineItem/lineid=6789 “…”
pr:title

Direct
Mapping
Input:
A
relational
schema
R a
set
of
Σ of

primary
keys
and
foreign
keys
and
a
database

instance
I of
this
schema
Output:
An
RDF
Graph
28
Definition:
A
direct
mapping
M is
a
total
function
from
the

set
of
all
(R,
Σ,
I)
to
the
set
of
all
RDF
graphs

I
R,
Σ Predicates
to

store
(R,
Σ,
I)
Predicates
to

Store
Ontology
O
Datalog
Rules

to
generate

O
from
R,
Σ
Datalog
Rules

to
generate

RDF
from
O
and
I
Datalog
Rules

to
generate

OWL
from
O
OWL
RDF
Direct
Mapping
RDB
to
RDF
and
OWL
29
Rel(r)
Attr(a,
r)
PKn(a1,
…
,
an,
r)
Value(v,
a,
t,
r)
…
Class(X)
←
Rel(X),

¬IsBinRel(X)
Triple(U,"rdf:type","owl:Class")

←
Class(R),
ClassIRI(R,
U)
Triple(s,
p,
o)
←
…

On
Directly
Mapping
Relational
Databases
to
RDF
and
OWL
Sequeda,
Arenas,
Miranker.
WWW
2012
Class(C)
DtP(p,
C)
ObjP(p,
S,
T)
…

Input:
Relational
Schema
• Rel(r) :

– Rel(Order)
• Attr(a,
r) :

– Attr(total,
Order)
• PKn(a1,
…
,
an,
r) :

– PK1(orderid,
Order)
• FKn(a1,
…
,
an,
r,
b1,
…
,
bn,
s)
:

– FK1(orderid,
LineItem,
orderid,
Order)
30
1234 2017-‐07-‐07 100 USD 1
Order
LineItem
6789 30 2 Shoes Foo 1234
6790 20 2 Tshirt Bar 1234

Input:
Relational
Schema
• Value(v,
a,
t,
r)
31
1234 2017-‐07-‐07 100 USD 1
Order

Input:
Relational
Schema
• Value(v,
a,
t,
r)
– Value(
1234,
orderid,
t1,
Order)
32
1234 2017-‐07-‐07 100 USD 1
Order

Input:
Relational
Schema
• Value(v,
a,
t,
r)
– Value(
1234,
orderid,
t1,
Order)
– Value(
2017-‐07-‐07,
date,
t1,
Order)
33
1234 2017-‐07-‐07 100 USD 1
Order

Input:
Relational
Schema
• Value(v,
a,
t,
r)
– Value(
1234,
orderid,
t1,
Order)
– Value(
2017-‐07-‐07,
date,
t1,
Order)
– Value(
100,
total,
t1,
Order)
34
1234 2017-‐07-‐07 100 USD 1
Order

Mapping
to
OWL
35
Triple(http://ex.org/Order,
rdf:type,
owl:Class)
Triple(U,"rdf:type","owl:Class")
←
Class(R),
ClassIRI(R,
U)
ClassIRI(R,
X)
←
Class(R),
Concat2(base,
R,
X)
Class(X)
←
Rel(X),
¬IsBinRel(X)
IsBinRel(X)
←
BinRel(X,
A,
B,
S,
C,
T,
D)
BinRel(R,
A,
B,
S,
C,
T,
D)
←

PK2(A,
B,
R),
¬ThreeAttr(R),
FK1(A,R,C,S),R
≠
S,
FK1(B,R,D,T),R
≠
T,
¬TwoFK(A,
R),
¬TwoFK (B,
R),
¬OneFK(A,
B,
R),
¬FKTo(R)

Generating
IRIs
for
Tuples
36
Generate
IRIs
for
the
tuples
of
the
relations
having
a
primary
key:

Generate
blank
nodes
for
the
tuples
of
the
relations
not
having
a
primary
key

Generate
an
identifier
X
of
a
tuple
T
of
a
relation
R,
which
is
an
IRI
if
R
has
a
primary
key
or
a

blank
node.

Mapping
to
RDF:
Table
Triples
37
Table
triples:
for
each
relation,
store
the
tuples
that
belongs
to
it
Triple(http://ex.org/Order#orderid=1234 ,
rdf:type,
http://ex.org/Order )

Mapping
to
RDF:
Literal
Triples
38
Literal
triples:
for
each
tuple,
store
the
values
in
each
of
its
attributes
Triple(http://ex.org/person#ssn=123 ,
http://ex.org/person#name ,
“Juan”)
Generate
for
every
tuple
t
in
a
relation
R
and
for
every
attribute
A
of
R,
a
triple

storing
the
value
of
t
in
A,
which
is
called
a
literal
triple.

Mapping
to
RDF:
Reference
Triples
39
Reference
triples:
store
the
references
generated
by
the
FKs
Triple(http://ex.org/student#id=3 ,

http://ex.org/student,person#ssn,ssn ,

http://ex.org/person#ssn=123 )
Construct
reference
triples
for
object

properties
that
are
generated
from
binary

relations

Construct
reference
triples
for
object

properties
that
are
generated
from
foreign

keys

Information
Preservation
40
I
R,
Σ
DM(R,
Σ,
I)
DM-‐ (DM(R,
Σ,
I))

Proof:
Provide
a
computable
mapping
DM-‐
Theorem:
The
Direct
Mapping
DM is
information
preserving

Query
Preservation
41
I
R,
Σ DM(R,
Σ,
I)
eval(Q*,
DM(R,
Σ,
I))eval(Q,
I)

Relational
Algebra
tuples
vs.

SPARQL
mappings
42
ssn name age
789 Daniel NULL
person
t.ssn
=
789
t.name
=
Daniel
t.age
=
NULL
Then,
tr(t)
=
μ
:
• Domain
of
μ
is
{?ssn,
?name}
• μ(?ssn)
=
789
• μ(?name)
=
Daniel

Query
Preservation
43
I
R,
Σ DM(R,
Σ,
I)
eval(Q*,
DM(R,
Σ,
I))tr(eval(Q,
I)) =
Proof:
By
induction
on
the
structure
of
Q
Bottom-‐up
algorithm
for
translating
Q
into
Q*
Theorem:
The
Direct
Mapping
is
query
preserving

Monotonicity
44
DM(R,
Σ,
I1)

DM(R,
Σ,
I2)
I1 ⊆ I2 DM(R,
Σ,
I1)
⊆ DM(R,
Σ,
I2)
I1
R,
Σ
I2
R,
Σ
Proof:
All
negative
atoms
in
the
Datalog
rules
refer
to
the
schema,
where
the
schema
is
fixed
Theorem:
The
Direct
Mapping
DM
is
Monotone

Semantics
Preservation
45
DM(R,
Σ,
I)

DM(R,
Σ,
I)

I
R,
Σ
I
R,
Σ
I satisfies
Σ

I does
not
satisfies
Σ
Consistent
under
OWL
semantics
Not
consistent
under
OWL
semantics
ssn name
123 Juan
123 Marcelo
person
ssn is
the
PK
#ssn=123
Juan
Marcelo
DM(R,
Σ,
I)

12
3
person#ssn
I does
not satisfy
Σ however DM(R,
Σ,
I)
is
consistent
under
OWL
semantics
Proposition:
The
direct
mapping
DM is
not
semantics
preserving.

Theorem:
No
monotone
direct
mapping
is
semantics
preserving

Extending
DM for
Semantics
Preservation
• Family
of
Datalog rules
to
determine
violation

– Primary
Keys
– Foreign
Keys
– Create
artificial
triple
that
will
generate

contradiction
• Non-‐monotone
direct
mapping
• Information
Preserving
• Query
Preserving
• Semantics
Preserving
46

Reflection
1
• We
studied
how
Relational
Databases
can
be

automatically
and
correctly
mapped
to
the

Semantic
Web
– HOW:
Defined
a
Direct
Mapping
using
Datalog

rules
– EXTENT:
Information
and
Query
Preserving.

Monotonicity
is
an
obstacle
for
Semantics

Preservation

Recall
the
Hypothesis:

Relational
Databases
can
be
automatically
mapped

to
RDF
and
OWL
under
a
correct
mapping
Information
Preserving,
Query
Preserving
and
Monotone
or

Information,
Query
and
Semantics
Preserving

RDF
W3C
R2RML
Relational
Database
R2RML
Mapping
Engine
OWL
Ontologies

(e.g FOAF,
etc)
R2RML
File
Input
Database
(schema
and
data)
Target
Ontologies
Mappings
between
the
Database
and

Target
Ontologies
in
R2RML
Output
RDF
graph
Direct
Mapping
helps
to
“bootstrap”

Direct
Mapping
as
R2RML
49
ID NAME AGE CID
1 Alice 25 100
2 Bob NULL 100
Person
CID NAME
100 Austin
200 Madrid
City
<Person/ID=1>
<City/CID=100>
Alice
25
Austin
<Person/ID=2>
Alice
<City/CID=200> Madrid
<Person#NAME>
<Person#AGE> <Person#NAME>
<Person#NAME>
<Person#NAME>
<Person#ref-‐CID>
<Person#ref-‐CID>
How
can
this
be

represented
as
R2RML?

@prefix rr: <http://www.w3.org/ns/r2rml#> .
<TriplesMap1>
a rr:TriplesMap;
rr:logicalTable [ rr:tableName ”Person”];
rr:subjectMap [
rr:template "http://www.ex.com/Person/ID={ID}";
rr:class <http://www.ex.com/Person>
];
rr:predicateObjectMap [
rr:predicate <http://www.ex.com/Person#NAME> ;
rr:objectMap [rr:column ”NAME" ]
].
Direct
Mapping
as
R2RML

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
].
Direct
Mapping
as
R2RML
51
Logical
Table:
What
is
being
mapped?

SubjectMap:
How
to
generate
the
Subject?
PredicateObjectMap:
How
to
generate
the
Predicate
and
Object?

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
Logical
Table
52
What
is
being
mapped?

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
Subject
URI
Template
53
Subject
URI
<Subject
URI>
rdf:type <Class
URI>

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
Predicate
URI
Constant
54
Predicate
URI

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
Object
Column
Value
55
Object
Literal

<http://www.ex.com/Person/1>
foaf:name
“Ugly”
vs “Cool”
URIs
56
foaf:Person

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
rr:template "http://www.ex.com/Person/{ID}";
rr:class foaf:Person
];
rr:predicate foaf:name;
]
.
Customization
57
Customized
Subject
URI
Customized
Class
Customized
Property

What
if
…

58
ID NAME GENDER
1 Alice F
2 Bob M
Person
<Person/1> Alice
foaf:name
<Woman>
rdf:type
SELECT
ID,
NAME

FROM
Person

WHERE
GENDER
=
"F"
R2RML
View

<TriplesMap1>
a rr:TriplesMap;
rr:logicalTable [ rr:sqlQuery
“””SELECT ID, NAME
FROM Person WHERE gender = “F” “””];
rr:subjectMap [
rr:class <http://www.ex.com/Woman>
];
]
.
R2RML
View
59
Query
instead
of
table

Quick
Overview
of
R2RML
• Manual
and
Customizable
Language
• Learning
Curve
• Direct
Mapping
bootstraps
R2RML
• RDF
represents
the
structure
and
ontology
of

mapping
author’s
choice
60

W3C
R2RML
Details
• Logical
Tables:
What
is
being
mapped
• Term
Maps:
How
to
create
RDF
terms
• How
to
create
Triples
from
a
table
• How
to
create
Triples
between
two
tables
• Languages
• Datatypes

R2RML
Mapping
Input
Database
Logical
Table

Logical
Table
=
existing
table
or
view
in
database
R2RML
View
=
SQL
Query
R2RML
Mapping

sid name pid
1 Juan 100
2 Martin 200
pid name
100 Dan
200 Marcelo
Student
Professor
ex:Student1
rdf:type ex:Student .
ex:Student2
ex:Professor100
rdf:type ex:Professor .
ex:Professor200
ex:Student1
foaf:name “Juan”.
…
R2RML
Mapping
R2RML
Mapping

R2RML
Mapping
• A
R2RML
Mapping
M consists
of
a
finite
set

TM TripleMaps.
• Each
TM
∈TM
consists
of
a
tuple

(LT,
SM,
POM)
– LT:
LogicalTable
– SM:
SubjectMap
– POM:
PredicateObjectMap
• Each
POM∈POM
consists
of
a
pair
(PM,
OM)*
– PM:
PredicateMap
– OM:
ObjectMap
*
For
simplicity

R2RML
Mapping
• An
R2RML
Mapping
is
represented
as
an
RDF

Graph
itself.
• Associated
RDFS
schema
– http://www.w3.org/ns/r2rml
• Turtle
is
the
recommended
syntax

LogicalTable
• Tabular
data
mapped
to
RDF
– rr:logicalTable
1. Existing
Relational
table
or
view
– rr:tableName
2. R2RML
(SQL)
View
– rr:sqlQuery

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
67

<TriplesMap1>
a rr:TriplesMap;
rr:logicalTable [ rr:sqlQuery
“””SELECT ID, NAME
FROM Person WHERE gender = “F” “””];
rr:subjectMap [
rr:class <http://www.ex.com/Woman>
];
]
.

How
to
create
RDF
terms
that
define
RDF
Triples?
• RDF
term
is
either
an
IRI,
a
blank
node,
or
a

literal
• Answer
1. Constant
Value
2. Value
in
the
database
a. Raw
Value
in
a
Column
b. Column
Value
applied
to
a
template

TermMap
• A
TermMap is
a
function
that
generates
an

RDF
Term
from
a
logical
table
row.
• RDF
Term
is
either
a
IRI,
or
a
Blank
Node,
or
a

Literal
Logical
Table
Row
TermMap
IRI
Bnode
Literal
RDF
Term

TermMap
• A
TermMap must
be
exactly
on
of
the

following
– Constant-‐valued
TermMap
– Column-‐valued
TermMap
– Template-‐valued
TermMap
• If
TermMaps are
used
to
create
S,
P,
O,
then
– 3
ways
to
create
a
subject
– 3
ways
to
create
a
predicate
– 3
ways
to
create
an
object

Stemplate
Ptemplate
Otemplate
Oconstant
Ocolumn
PConstant
Otemplate
Oconstant
Ocolumn
Pcolumn
Otemplate
Oconstant
Ocolumn
Sconstant
Ptemplate
Otemplate
Oconstant
Ocolumn
PConstant
Otemplate
Oconstant
Ocolumn
Pcolumn
Otemplate
Oconstant
Ocolumn
Scolumn
Ptemplate
Otemplate
Oconstant
Ocolumn
PConstant
Otemplate
Oconstant
Ocolumn
Pcolumn
Otemplate
Oconstant
Ocolumn
How
many
ways
to
create
a
Triple?

Constant-‐valued
TermMap
• A
TermMap that
ignores
the
logical
table
row

and
always
generates
the
same
RDF
term
• rr:constant
• Commonly
used
to
generate
constant
IRIs
as

the
predicate

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
rr:predicateMap [rr:constant foaf:name ]
]
.
75

Column-‐valued
TermMap
• A
TermMap that
maps
a
column
value
of
a

column
name
in
a
logical
table
row
• rr:column
• Commonly
used
to
generate
Literals
as
the

object

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
77

Template-‐valued
TermMap
• A
TermMap that
maps
the
column
values
of
a

set
of
column
names
to
a
string
template.
• A
string
template is
a
format
that
can
be
used

to
build
strings
from
multiple
components.
• rr:template
• Commonly
used
to
generate
IRIs
as
the

subject
or
concatenate
different
attributes

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
79

Commonly used…
• …
but
any
of
these
TermMaps can
be
used
to

create
any
RDF
Term
(s,p,o).
Recall:
– 3
ways
to
create
a
subject
– 3
ways
to
create
a
predicate
– 3
ways
to
create
an
object
• Template-‐valued
TermMap are
commonly

used
to
create
an
IRI
for
a
subject,
but
can
be

used
to
create
Literal
for
an
object.
• How
to
specify
the
term
(IRI
or
Literal
in
this

case)?

TermType
• Specify
the
type
of
a
term
that
a
TermMap
should
generate
• Force what
the
RDF
term
should
be
• Three
types
of
TermType:
– rr:IRI
– rr:BlankNode
– rr:Literal

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
rr:objectMap [
rr:template ”{FIRST_NAME} {LAST_NAME}”;
rr:termType rr:Literal;
]
]
.
82

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
rr:template ”person{ID}";
rr:termType rr:BlankNode;
];
]
.
83

TermType (cont…)
• Can
only
be
applied
to
Template
and
Column

valued
TermMap
• Applying
to
Constant-‐valued
TermMap has
no

effect
– i.e If
the
constant
is
an
IRI,
the
term
type
is

automatically
an
IRI

TermType Rules
• If
the
Term
Map
is
for
a

1. Subject
à TermType =
IRI
or
Blank
Node
2. Predicate
à TermType =
IRI

3. Object
à TermType =
IRI or
Blank
Node
or
Literal

TermType is
Optional
• If
a
TermType is
not
specified
then
– Default
=
IRI
– Unless
it’s
for
an
object
being
defined
by
a

Column-‐based
TermMap or
has
a
language
tag
or

specified
datatype,
then
the
TermType is
a
Literal
• That’s
why
if
there
is
a
template
in
an

ObjectMap,
it
will
always
generate
an
IRI,

unless
a
TermType to
Literal
is
specified.

rr:objectMap [
rr:template ”{FIRST_NAME} {LAST_NAME}”;
]
]
87
rr:predicateMap [rr:constant ex:role ]
rr:objectMap [
rr:template ”http://ex.com/role/{role}”
]
]
rr:objectMap [
rr:template ”{FIRST_NAME} {LAST_NAME}”
]
]

NOW
WE
HAVE
THE
ELEMENTS
TO

CREATE
TRIPLES

Generating
an
RDF
Triple
• TermMap that
specifies
what
RDF
term
should

be
for
S,
P,
O
– SubjectMap
– PredicateMap
– ObjectMap

SubjectMap
• SubjectMap is
a
TermMap
• rr:subjectMap
• Specifies
what
the
subject
of
a
triple
should
be
• 3
ways
to
create
a
subject
Term
Map
Term
Map
Term
Map
• Has
to
be
an
IRI
or
Blank
Node

SubjectMap
• SubjectMaps are
usually Template-‐valued

TermMap
• Use-‐case
for
Column-‐valued
TermMap
– Use
a
column
value
to
create
a
blank
node
– URI
exist
as
a
column
value
• Use-‐case
for
Constant-‐valued
TermMap
– For
all
tuples:
<CompanyABC>
<consistsOf>
<Dep{id}>

SubjectMap
• Optionally,
a
SubjectMap may
have
one
or

more
Class
IRIs
associated
– This
will
generate
rdf:type triples
• rr:class

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
94
Optional

PredicateObjectMap
• A
function
that
creates
one
or
more
predicate-‐
object
pairs
for
each
logical
table
row.
• rr:predicateObjectMap
• It
is
used
in
conjunction
with
a
SubjectMap to

generate
RDF
triples
in
a
TriplesMap.
• A
predicate-‐object
pair
consists
of
– One
or
more
PredicateMaps
– One
or
more
ObjectMaps or

ReferencingObjectMaps

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
rr:predicateMap [rr:constant foaf:name];
]
.
96

PredicateMap
• PredicateMap is
a
TermMap
• rr:predicateMap
• Specifies
what
the
predicate
of
a
RDF
triple

should
be
• 3
ways
to
create
a
predicate
Term
Map
Term
Map
Term
Map
• Has
to
be
an
IRI

PredicateMap
• PredicateMaps are
usually Constant-‐valued

TermMap
• Use-‐case
for
Column-‐valued
TermMap
– …

• Use-‐case
for
Template-‐valued
TermMap
– …

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
99

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
10
0
Shortcut!

Constant
Shortcut
Properties
• ?x
rr:predicate ?y
• ?x
rr:predicateMap [
rr:constant ?y
]
• ?x
rr:subject ?y
• ?x
rr:subjectMap [
rr:constant ?y
]
• ?x
rr:object ?y
• ?x
rr:objectMap [
rr:constant ?y
]

ObjectMap
• ObjectMap is
a
TermMap
• rr:objectMap
• Specifies
what
the
object
of
a
triple
should
be
• 3
ways
to
create
a
predicate
Term
Map
Term
Map
Term
Map
• Has
to
be
an
IRI
or
Literal
or
Blank
Node

ObjectMap
• ObjectMaps are
usually Column-‐valued

TermMap
• Use-‐case
for
Template-‐valued
TermMap
– Concatenate
values
– Create
IRIs
• Use-‐case
for
Constant-‐valued
TermMap
– All
rows
in
a
table
share
a
role

<TriplesMap1>
a rr:TriplesMap;
rr:subjectMap [
];
]
.
10
4

sid name pid
1 Juan 100
2 Martin 200
Student
@prefix ex: <http://example.com/ns/>.
ex:Student1 rdf:type ex:Student .
TripleMap
Example
1
• We
now
have
sufficient
elements
to
create
a

mapping
that
will
generate
– A
Subject
IRI
– rdf:Type triple(s)

Example
1
@prefix rr: <http://www.w3.org/ns/r2rml#>.
<#TriplesMap1>
rr:logicalTable [ rr:tableName ”Student”];
rr:subjectMap [
rr:template "http://example.com/ns/{sid}";
rr:class ex:Student;
].
Logical
Table
is
a
Table
Name
SubjectMap is
a
Template-‐valued
TermMap
And
it
has
one
Class
IRI

sid name pid
1 Juan 100
2 Martin 200
Student
TripleMap
ρ1 :

Student(s,
x,
y)
∧ p
=
rdf:type∧ o
=
ex:Student →
Triple(s,
p,
o)

Predicate
ObjectQuery
over
R
and
Subject

Class
RDB2RDF
Rule
• Given
a
relational
schema
R such,
a
class

RDB2RDF-‐rule
ρ over
R
is
a
first-‐order
formula

of
the
form:

∀s∀p∀o∀x̄
α(s,x ̄)
∧ p
=
type ∧ o
=
c
→
triple(s,p,o)

where
α(s,x ̄) is
a
query
over
R
and
c
∈ D and
D
is
a
a
countably infinite
domain
of
constants

10
8
ρ1 :

Student(s,
x,
y)
∧ p
=
rdf:type∧ o
=
ex:Student →
Triple(s,
p,
o)

sid name pid
1 Juan 100
2 Martin 200
Student
ex:Student1 ex:name “Juan” .
ex:Student2 ex:name “Martin” .
TripleMap
Example
2

Example
2
<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:name;
rr:objectMap [ rr:column “name”];
].
Logical
Table
is
a
Table
Name
SubjectMap is
a
Template-‐valued
TermMap
And
it
has
one
Class
IRI

PredicateObjectMap
PredicateMap which
is
a

Constant-‐valued
TermMap
ObjectMap which
is
a

Column-‐valued
TermMap
ρ2 :

Student(s,
o,
y)
∧ p
=
ex:name →
Triple(s,
p,
o)

Predicate
RDB2RDF
Rule
• Given
a
relational
schema
R such,
a
class

RDB2RDF-‐rule
ρ over
R
is
a
first-‐order
formula

of
the
form:

∀s∀p∀o∀x̄
β(s,
o,
x
̄)
∧ p
=
c
→
triple(s,
p,
o)

where
β(s,
o,
x
̄)
is
a
query
over
R
and
c
∈ D and

D is
a
a
countably infinite
domain
of
constants

11
1
ρ2 :

Student(s,
o,
y)
∧ p
=
ex:name →
Triple(s,
p,
o)

RDB2RDF
Mapping
• An
RDB2RDF
mapping
M over
R is
a
finite
set

of
class
or
predicate
RDB2RDF
rules
over
R
11
2
M =
{ρ1, ρ2}
ρ1 :

Student(s,
x,
y)
∧ p
=
rdf:type∧ o
=
ex:Student →
Triple(s,
p,
o)

ρ2 :

Student(s,
o,
y)
∧ p
=
ex:name →
Triple(s,
p,
o)

<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:name;
rr:objectMap [ rr:column “name”];
].

sid name pid
1 Juan 100
2 Martin 200
Student
ex:Student1 ex:comment “Juan is a Student” .
ex:Student2 ex:comment “Martin is a Student” .
TripleMap
Example
3

Example
3
<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:comment;
rr:objectMap [
rr:template “{name} is a Student”;
];
].
Logical
Table
is
a
Table
Name
SubjectMap is
a
Template-‐valued
TermMap
And
it
has
one
Class
IRI

PredicateObjectMap
PredicateMap which
is
a

Constant-‐valued
TermMap
ObjectMap which
is
a

Template-‐valued
TermMap
TermType

sid name pid
1 Juan 100
2 Martin 200
Student
ex:Student1 ex:webpage <http://ex.com/Juan>.
ex:Student2 ex:webpage <http://ex.com/Martin>.
TripleMap
Example
4

Example
4
<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:webpage;
rr:objectMap [
rr:template “http://ex.com/{name}”;
];
].
Logical
Table
is
a
Table
Name
SubjectMap is
a
Template-‐valued
TermMap
And
it
has
one
Class
IRI

PredicateObjectMap
PredicateMap which
is
a

Constant-‐valued
TermMap
ObjectMap which
is
a

Template-‐valued
TermMap
Note
that
there
is
not
TermType

sid name pid
1 Juan 100
2 Martin 200
Student
ex:Student1 ex:studentType ex:GradStudent.
ex:Student2 ex:studentType ex:GradStudent.
TripleMap
Example
5

Example
6
<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:studentType;
rr:object ex:GradStudent ;
].
Logical
Table
is
a
Table
Name
SubjectMap is
a
Template-‐valued
TermMap
And
it
has
one
Class
IRI

PredicateObjectMap
PredicateMap which
is
a

Constant-‐valued
TermMap
ObjectMap which
is
a

Constant-‐valued
TermMap

RefObjectMap
• A
RefObjectMap (Referencing
ObjectMap)

allows
using
the
subject
of
another

TriplesMap as
the
object
generated
by
a

ObjectMap.
• rr:objectMap
• A
RefObjectMap defined
by
– Exactly
one
ParentTripleMap,
which
must
be
a

TripleMap
– May
have
one
or
more
JoinConditions

<TriplesMap1>
a rr:TriplesMap;
rr:logicalTable [ rr:tableName”Person" ];
rr:subjectMap [ rr:template "http://www.ex.com/Person/{ID}";
rr:class foaf:Person ];
rr:predicate foaf:based_near ;
rr:objectMap [
rr:parentTripelMap <TripleMap2>;
rr:joinCondition [
rr:child “CID”;
rr:parent “CID”;
]
]
]
.
<TriplesMap2>
a rr:TriplesMap;
rr:logicalTable [ rr:tableName ”City" ];
rr:subjectMap [ rr:template "http://ex.com/City/{CID}";
rr:class ex:City ];
rr:objectMap [ rr:column ”TITLE" ]
]
.
12
0
RefObjectMap

ParentTripleMap
• The
referencing
TripleMap
• rr:parentTriplesMap
<TriplesMap1>
a rr:TriplesMap;
rr:objectMap [
rr:joinCondition [
rr:child “CID”;
]
]
]
.
Parent
TriplesMap

JoinCondition
• Join
between
child
and
parent
attributes
• rr:joinCondition
<TriplesMap1>
a rr:TriplesMap;
rr:objectMap [
rr:joinCondition [
rr:child “CID”;
]
]
]
.
JoinCondition

<TriplesMap1>
a rr:TriplesMap;
rr:objectMap [
rr:joinCondition [
rr:child “CID”;
]
]
]
.
<TriplesMap2>
a rr:TriplesMap;
rr:subjectMap [ rr:template "http://ex.com/City/{CID}";
rr:class ex:City ];
rr:objectMap [ rr:column ”TITLE" ]
]
.
12
3
RefObjectMap
Parent
TriplesMap
JoinCondition

JoinCondition
• Child
Column
which
must

be
the
column
name
that

exists
in
the
logical
table

of
the
TriplesMap that

contains
the

RefObjectMap
• Parent
Column
which

must
be
the
column

name
that
exists
in
the

logical
table
of
the

RefObjectMap’sParent

TriplesMap.
<TriplesMap1>
a rr:TriplesMap;
...
rr:objectMap [
rr:joinCondition [
rr:child “CID”;
rr:parent “CID”;]
]
] .
<TriplesMap2>
a rr:TriplesMap;
...
.

JoinCondition
• Child
Query
– The
Child
Query
of
a

RefObjectMap is
the

LogicalTable of
the

TriplesMap containing
the

RefObjectMap
• Parent
Query
– The
ParentQuery of
a

RefObjectMap is
the

LogicalTable of
the
Parent

TriplesMap
• If
the
ChildQuery and

ParentQuery are
not

identical,
then
a

JoinCondition must
exist
<TriplesMap1>
a rr:TriplesMap;
...
rr:objectMap [
rr:joinCondition [
rr:child “CID”;
rr:parent “CID”;]
]
] .
<TriplesMap2>
a rr:TriplesMap;
...
.

sid name pid
1 Juan 100
2 Martin 200
pid name
100 Dan
200 Marcelo
Student
Professor
ex:Student1
ex:Student2
ex:Professor100
ex:Professor200
ex:Student1
ex:hasAdvisor ex:Professor100
.
ex:Student2
ex:hasAdvisor ex:Professor200
R2RML
Mapping
Example
7
ρ1 :

Student(s,
x,
o)
∧ Professor(o,
z)
∧ p
=
ex:hasAdvisor →
Triple(s,
p,
o)

ρ2 :

Student(s,
x,
y)
∧ p
=
rdf:type∧ o
=
ex:Student →
Triple(s,
p,
o)

ρ3 :
Professor(s,
x)
∧ p
=
rdf:type∧ o
=
ex:Professor →
Triple(s,
p,
o)

<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:hasAdvisor;
rr:objectMap [
rr:parentTriplesMap <#TriplesMap2>;
rr:joinCondition [
rr:child “pid”;
rr:parent “pid”;
]
]
].
<#TriplesMap2>
rr:logicalTable [ rr:tableName ”Professor”];
rr:subjectMap [
rr:template "http://example.com/ns/{pid}";
rr:class ex:Professor;
].
RefObjectMap
Parent
TriplesMap
JoinCondition

Summary

Languages
• TermMap with
a
TermType of
rr:Literal may

have
a
language
tag
• rr:language
<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:comment;
rr:objectMap [
rr:column “comment”;
rr:language “en”;
];
].

sid name comment
1 Juan Excellent Student
2 Martin Wonderful
student
Student
ex:Student1 ex:comment “Excellent Student”@en .
ex:Student2 ex:comment “Wonderful Student”@en .

Issue
with
Languages
• What
happens
if
language
value
is
in
the
data?
ID COUNTRY_ID LABEL LANG
1 1 United
States en
2 1 Estados Unidos es
3 2 England en
4 2 Inglaterra es

ex:country1 rdfs:label “United States”@en .
ex:country1 rdfs:label “Estados Unidos”@es .
ex:country2 rdfs:label “England”@en .
ex:country2 rdfs:label “Inglaterra”@es .
ID COUNTRY_ID LABEL LANG
1 1 United
States en
2 1 Estados Unidos es
3 2 England en
4 2 Inglaterra es
?

Issue
with
Languages
• Mapping
for
each
language
<#TripleMap_Countries_EN>
a rr:TriplesMap;
rr:logicalTable [ rr:sqlQuery """SELECT COUNTRY_ID, LABEL FROM
COUNTRY WHERE LANG = ’en'""" ];
rr:subjectMap [
rr:template "http://example.com/country{COUNTRY_ID}"
];
rr:predicate rdfs:label;
rr:objectMap [
rr:column “LABEL”;
rr:language “en”;
];
].

Datatypes
• TermMap with
a
TermType of
rr:Literal
• TermMap does
not
have
rr:language
<#TriplesMap1>
rr:subjectMap [
];
rr:predicate ex:startDate;
rr:objectMap [
rr:column “start_date”;
rr:datatype xsd:date;
];
].

Summary
of
Terminology
• R2RML
Mapping
• Logical
Table
• Input
Database
• R2RML
View
• TriplesMap
• Logical
Table
Row
• TermMap
• TermType
• SubjectMap
• PredicateObjectMap
• PredicateMap
• ObjectMap
• Constant-‐valued
TermMap
• Column-‐valued
TermMap
• Template-‐valued
TermMap
• RefObjectMap
• JoinConditions
• ChildQuery
• ParentQuery
• Language
• Datatype

SEMANTIC
WEB
à RELATIONAL

DATABASES:
DATA
ACCESS
13
7

Semantic
Web
à Relational
Database

139
ETL
SPARQL
RDBMS RDF
Graph

Triplestore
SPARQL
Results
ETL
Mapping

140
NoETL (Wrapper)
SPARQL
RDBMS Virtual
RDF
SQL
SQL

Results
SPARQL
Results
NoETL
R2RML
Mapping

“Comparing the overall performance […] of the fastest rewriter with the fastest relational
database shows an overhead for query rewriting of 106%. This is an indicator that there is
still room for improving the rewriting algorithms”
Larger

numbers
are

better
100M
Triple
Dataset
[Bizer and
Schultz.
Berlin
SPARQL
Benchmark
2009]

Current
rdb2rdf
systems
are
not
capable
of

providing
the
query
execution
performance

required
[...]
it
is
likely
that
with
more
work

on
query
translation,
suitable
mechanisms

for
translating
queries
could
be
developed.

These
mechanisms
should
focus
on

exploiting
the
underlying
database
system’s

capabilities
to
optimize
queries
and
process

large
quantities
of
structure
data

[Gray
et
al.
2009]

https://sourceforge.net/p/d2rq-‐map/mailman/message/28055191/
Sept
2011

Why
was
this
happening
if
…
ISWC
2008
Hypothesis:
Existing
commercial
relational
databases
already

subsume
algorithms
and
optimizations
needed
to
support

effective
SPARQL
execution
on
relationally
stored
data

Compile
Time
1. Translate
SQL
Schema

to
OWL
and
Mapping
2. Define
RDF
Triples,
as
a
View
Run
Time
3. SPARQL
to
SQL

translation
4. SQL
Optimizer

creates
relational

query
plan
14
5
Ultrawrap1:
SPARQL
to
SQL
under
Direct
Mapping
Ultrawrap:
SPARQL
execution
on
relational
data
Sequeda
&
Miranker.
J.
WebSem 2013
US
Patent
8719252,
9396283

Creating
Tripleview
• For
every
ontology
element
(Class,
Object

Property
and
Datatype
property),
create
a
SQL

SELECT
query
that
outputs
triples
SELECT
'Product’+ptID as
s,

‘label’
as
p,
label
as
o
FROM
Product
WHERE
label
IS
NOT
NULL
S P O
Product1 label ACME
Inc
Product2 label Foo
Bars
ptID label prID
1 ACME
Inc 4
2 Foo
Bars 5
Product
14
6

Creating
Tripleview
SELECT
‘Product’+ptID as
s,
prID as
s_id,
‘label’
as
p,
label
as
o,
NULL
as
o_id
FROM
Product
WHERE
label
IS
NOT
NULL
S S_id P O O_id
Product1 1 label ACME
Inc NULL
Product2 2 label Foo
Bars NULL
ptID label prID
1 ACME
Inc 4
2 Foo
Bars 5
Product
14
7

Creating
Tripleview (…)
• Create
TripleViews (SQL
View),
which
are

unions
of
the
SQL
SELECT
query
that
have
the

same
datatype
CREATE
VIEW
Tripleview_varchar AS
SELECT
‘Product’+ptID as
s,
ptID as
s_id,
‘label’
as
p,
label
as
o,
NULL
as
o_id FROM
Product
UNION
ALL
SELECT
‘Producer’+prID as
s,
prID as
s_id,
‘title’
as
p,
title
as
o,
NULL
as
o_id FROM
Producer
UNION
ALL
…
S S_id P O O_id
Product1 1 label ACME
Inc NULL
Product2 2 label Foo
Bars NULL
Producer4 4 title Foo NULL
Producer5 5 Ttitle Bars NULL
14
8

SPARQL
and
SQL
• Translating
a
SPARQL
query
to
a
semantically

equivalent
SQL
query
SELECT
?label
?pnum1
WHERE{

?x
label
?label.
?x
pnum1
?pnum1.
}
à SELECT
label,
pnum1
FROM
product
SQL
on
Tripleview
SELECT
t1.o
AS
label,
t2.o
AS
pnum1
FROM
tripleview_varchar t1,
tripleview_int t2
WHERE
t1.p
=
'label'
AND

t2.p
=
'pnum1'
AND
t1.s_id
=
t2.s_id
What

is
the

Query

Plan?
14
9

Tripleview_varchar t1
Product
π
Product+’id’
AS
s ,
‘label’
AS
p,
label
AS
o

σlabel ≠
NULL
Producer
π
Producer+’id’
AS
s ,
‘title’
AS
p,
title
AS
o

σtitle ≠
NULL
U
Tripleview_int t2
Product
π
Product+’id’
AS
s ,
‘pnum1’
AS
p,
pnum1
AS
o

σpnum1
≠
NULL
Product
π
Product+’id’
AS
s ,
‘pnum2’
AS
p,
pnum2
AS
o

σpnum2
≠
NULL
U
π
t1.o
AS
label,
t2.o
AS
pnum1
σp =
‘label’
σp =
‘pnum1’
CONTRADICTION
CONTRADICTION
15
0

Detection
of
Unsatisfiable Conditions
• Determine
that
the
query
result
will
be
empty

if
the
existence
of
another
answer
would

violate
some
integrity
constraint
in
the

database.

• This
would
imply
that
the
answer
to
the
query

is
null
and
therefore
the
database
does
not

need
to
be
accessed
Chakravarthy,
Grant
and
Minker.
(1990)
Logic-‐Based
Approach
to
Semantic
Query
Optimization.

15
1

Product
π
Product+’id’
AS
s ,
‘label’
AS
p,
label
AS
o

σlabel ≠
NULL
Product
π
Product+’id’
AS
s ,
‘pnum1’
AS
p,
pnum1
AS
o

σpnum1
≠
NULL
π
t1.o
AS
label,
t2.o
AS
pnum1
Join
on
the
same
table?
à REDUNDANT
15
2

Self
Join
Elimination
• If
attributes
from
the
same
table
are
projected

separately
and
then
joined,
then
the
join
can

be
dropped
SELECT
label,
pnum1

FROM
product

WHERE

id
=
1
SELECT
p1.label,
p2.pnum1

FROM
product
p1,
product
p2

WHERE

p1.id
=
1
and

p1.id
=
p2.id
SELECT
p1.id

FROM
product
p1,
product
p2

WHERE

p1.pnum1
>100
and

p2.pnum2
<
500
and

p1.id
=
p2.id
SELECT
id

FROM
product

WHERE

pnum1
>
100
and
pnum2
<
500
Self
Join
Elimination
of
Projection
Self
Join
Elimination
of
Selection
15
3

Product
σlabel ≠
NULL
AND
pnum1
≠
NULL
π
label,
pnum1
15
4

Evaluation
• Used
Two
Benchmarks
that
stores
data
in

relational
databases,
provides
SPARQL
queries

and
their
semantically
equivalent
SQL
queries
Detection
of

Unsatisfiable
Conditions
Self
Join

Elimination
MYSQL
MSSQL
ORACLE

DB2
✖
✔
✖
✖
✖ ✔
✔ ✔
15
5

Ultrawrap
Experiment

Augmented
Ultrawrap
Experiment
• Implemented
DoUC

Reflection
2
• We
studied
how
SQL
systems
can
be
used
to

effectively
evaluate
SPARQL
queries
– HOW:
Defined
architecture
based
on
SQL
Views

which
allows
RDBMS
to
do
the
optimization
and

Identified
two
important
optimizations
that

already
exist
in
commercial
RDBMS.
– EXTENT:
SPARQL
1.0
(relational
core)
Recall
the
Hypothesis:

Existing
commercial
relational
databases
already
subsume

algorithms
and
optimizations
needed
to
support
effective

SPARQL
execution
on
relationally
stored
data

Ontology
Based
Data
Access

UltrawrapOBDA:
Ontology-‐Based Data
Access
• Given

– a
source
relational
database
D
– a
target
OWL
ontology
O,
and
– a
mapping
from
the
source
database
to
the
target

ontology
M
• Goal:
Answer
SPARQL
queries
in
terms
of
the

target
ontology using
mappings and
the

database
Hypothesis:
We
can
effect
optimizations
for
OBDA
by
push

processing
into
the
RDBMS,
thus
acting
as
a
reasoner.

16
1
ID Name Age Job
1 Alice 40 CTO
2 Bob 41 Java
3 John 42 SysAd
Employee
Executive IT

Employee
Programmer SysAdmin
EMP
subClassOf
subClassOf
subClassOf
subClassOf
EMP(s, y, z, ”CTO”)
à Triple(s, type, ”CTO”)

EMP(s, y, z, ”Java”)
à Triple(s, type, ”Programmer”)

EMP(s, y, z, ”SysAd”)
à Triple(s, type, ”SysAdmin”)

CTO
subClassOf

16
2
Forward
Chaining
– Materialization

Backward
Chaining
– Query
Rewriting
RDBMS OWL
Ontology

State
of
the
Art:
Materialization
SELECT
?x
WHERE

{
?x
type

ITEmployee}
Triple(2,type,Programmer)
Triple(2,type,ITEmployee)
Triple(3,type,SysAdmin)
Triple(3,type, ITEmployee)
OWL
Ontology
Mapping
Relational
Database
Materialization
?x
=2
?x
=3
SPARQL
Query Ans
Programmer
⊑ ITEmployee
SysAdmin ⊑ ITEmployee
RDF
Database

EMP(ID,NAME,AGE,JOB)
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
EMP(3,John,42,SysAd)

State
of
the
Art:
Query
Rewriting
Rewriting
Unfolding
Evaluation
SPARQL
Query OWL
Ontology
Qo
Qsql
Mapping
Ans
Programmer
⊑
ITEmployee
SELECT
?x
WHERE

{
?x
type
ITEmployee}
SELECT
?x
WHERE

{?x
type
ITEmployee UNION

?x
type
Programmer
UNION

?x
type
SysAdmin}
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
?x
=2
?x
=3SELECT
SID
FROM
EMP
WHERE
JOB
=
‘Java’

UNION

SELECT
SID
FROM
EMP
WHERE
JOB
=

‘SysAd’ Database


16
5
RDB2RDF

Mapping
SQL
Optimizer:
Views
&
Recursion
Inheritance
&

Transitivity
Hybrid:
Backward
– Forward
Chaining
RDBMS OWL
Ontology

Hybrid
Approach:
UltrawrapOBDA
OWL
Ontology
Mapping
Programmer
⊑
ITEmployee

Compiler
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
Saturated
Mapping
à Triple(s, type, ”ITEmployee”)

OBDA:
query
rewriting
or
materialization?
In
practice,
both!
Sequeda,
Arenas,
Miranker.
ISWC
2014
(Best
Paper)

RDFS
Subclass
Inference
Rule
16
7
Programmer
subClassOf ITEmployee X
type
Programmer

X
type
ITEmployee

Generating
Saturated
Mappings
Programmer
subClassOf ITEmployee X
type
Programmer

X
type
ITEmployee
Programmer
subClassOf ITEmployee EMP(s,
x1,
“Java”)
∧ p
=
type
∧ o
=
Programmer→
Triple(s,
p,
o)

EMP(s,
x1,
“Java”)
∧ p
=
type
∧ o
=
ITEmployee→
Triple(s,
p,
o)

ρ :

EMP(s,
y,
z,
‘Java’)
∧ p
=
type
∧ o
=
Programmer→
Triple(s,
p,
o)

Predicate
ObjectQuery
over
R
and
Subject

Transitivity
and
SQL
Recursion
16
9
WITH
MANAGER (X,
Y)
AS(
SELECT

ID,
MAN
FROM
EMP

UNION
ALL

SELECT
EMP.ID,
MANAGER.Y

FROM
EMP,
MANAGER
WHERE
EMP.MAN=
MANAGER.X
)
SELECT
X,
Y
FROM
MANAGER

OWL 2 QL OWL 2 RL
OWL 2 EL
OWL 2 DL
EL QL RL
subClass X X X
subProp X X X
domain X X X
range X X X
eqClass X X X
eqProp X X X
inverseProp X X
symProp X X
transProp X X
17
0
Check
paper
for

all
9
Inference

rules

Saturated
Mappings
17
1
OWL
Ontology
Qo
Mapping
Programmer
⊑
ITEmployee

Compiler
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
Saturated
Mapping

Saturated
Mappings
are

similar
to
T-‐Mapping
per

Rodriguez-‐Muro et.
al.

(AMW2011,
ISWC2013)
Saturation
is

performed
by

exhaustively applying

the
inference
rules
We
present
a

linear-‐time
algorithm
Check

paper

Represent
Saturated
Mappings
as
SQL
Views
17
2
OWL
Ontology
Qo
Mapping
Programmer
⊑
ITEmployee

Compiler
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
Saturated
Mapping

CREATE VIEW ITEmployeeView AS
SELECT
ID as S, “type” as P, “ITEmployee” as O
FROM EMP where JOB = ‘Java’ UNION …

From
Saturated
Mappings
to
SQL
Views
Mapping
Triplequery
Tripleview
EMP(s,
x1,
“Java”)
∧ p
=
type
∧ o
=
ITEmployee →
Triple(s,
p,
o)

EMP(s,
x1,
“SysAd”)
∧ p
=
type
∧ o
=
ITEmployee →
Triple(s,
p,
o)

SELECT
ID
as
S,
“type”
as
P,
“ITEmployee”
as
O
FROM
EMP
where
JOB
=
‘Java’
SELECT
ID
as
S,
“type”
as
P,
“ITEmployee”
as
O
FROM
EMP
where
JOB
=
‘SysAd’
CREATE
VIEW
ITEmployeeView AS
SELECT
ID
as
S,
“type”
as
P,
“ITEmployee”
as
O
FROM
EMP
where
JOB
=
‘Java’
UNION
SELECT
ID
as
S,
“type”
as
P,
“ITEmployee”
as
O
FROM
EMP
where
JOB
=
‘SysAd’
S P O
… … …

All
of
this
is
offline
OWL
Ontology
Qo
Mapping
Programmer
⊑
ITEmployee

Compiler
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
Saturated
Mapping

SELECT

Runtime:
SPARQL
Execution
17
5
EMP(1,Alice,40,CTO)
EMP(2,Bob,41,Java)
SELECT
?x
WHERE

{
?x
type

ITEmployee}
s
=
2
s
=
3
SPARQL
Query Ans
SELECT
SELECT
s
FROM
Tripleview

WHERE
p
=
type
and
o
=
ITEmployee
SQL
Query
on
Views
S P O
… … …
Theorem:
Given
a
RDB2RDF
Mapping
M,
every
SPARQL
query
is
SQL-‐rewritable
under
M
Proof:
by
induction
on
the
structure
of
SPARQL

Query
Optimization:
Materialize
Views?
17
6
Best
query

response
time
Worst
query

response
time
Less
Space
Consumed
Most
Space
Consumed
Materialize

Everything
Materialize

Nothing
Harinarayan et
al.
Implementing
Data
Cubes
Efficiently.
SIGMOD96
…..
Mami &
Bellahsene.
A
Survey
of
View
Selection
Methods.
SIGMOD
Record
2012
Hybrid

Approach

Cost
Model
Best
query

response

time
Worst
query

response
time
Less
Space
Consumed
Most
Space
Consumed
Materialize
All
Materialize

Nothing
Materialize
views
representing

mappings
to
leaf
classes

Query
cost
=
n
x
NR x
S(A,R)
Space
cost
=
NR +
(NR x
d)
Query
cost
=
n
x
NR
Space
cost
=
NR
Query
cost
=
n
x
NR x
S(A,R)
Space
cost
=
2NR
Hypothesis:
If
a
RDBMS
rewrites

queries
in
terms
of
materialized

views,
then
…
Check

paper

for

details
n
is
the
number
of
leaf
classes
underneath
the
class
that
is
being
queried
NR is
the
number
of
tuples
of
the
relation
R
in
the
mapping

S(A,
R)
is
the
selectivity
of
the
attribute
A
of
the
relation
R
in
the
mapping

d
is
the
depth
of
the
ontology

Texas
Benchmark
id …. TWO FIVE TEN TWENTY FIFTY HUNDRED
1 1 1 1 1 1
… … … … … …
2 5 10 20 50 100
1 100…
d
=
2
1 100…
d
=
5… …
Database
…
Ontologies
Goal:
Understand
the
behavior
when

querying
for
instances
of
a
class
depending

on
the
depth of
the
ontology
and
the

selectivity

Oracle
implements

Query
Rewriting
with
Materialized
Views

Seconds
www.obda-‐benchmark.org

BSBM
Extension
for
Transitivity
SELECT
?x
WHERE
{
?x
typeAncestor ProductType7
}
SELECT
?product
?x
WHERE
{
?x
.
?product
hasType ?x.
}
SELECT
?product
?x
WHERE
{
?x
.
?product
hasType ?x.

?product
label
?label
.
?product
numProp ?num.}
Product ProductType
hasType
typeAncestor
Literal
Literal
label
numProp
Ontology
Simple
Query Join
Query More
Join
Query

Query
Plan
for
Transitivity
18
1
Unmaterialized
View Materialized
View
www.obda-‐benchmark.org

Reflection
3
• We
studied
how
SQL
systems
can
be
used
as

reasoners for
SPARQL
queries
in
terms
of

Ontologies

– HOW:
Incorporate
semantics
of
ontologies
in

Saturating
Mappings
and
take
advantages
of
query

rewriting
using
materialized
views
and
recursion

which
exist
in
RDBMS
– EXTENT:
OWL-‐ SQL
Recall
the
Hypothesis:

We
can
effect
optimizations
for
OBDA
by
push
processing
into

the
RDBMS,
thus
acting
as
a
reasoner.

RELATIONAL
DATABASES
AND

SEMANTIC
WEB
IN
PRACTICE
18
3

past
à present
à FUTURE
• Federated
Semantic
Data
Management
– “Semantify”
data
by
mapping
to
ontologies
• Business
view
of
heterogeneous
data

– Federation
(NoETL)
in
order
to
avoid
centralization

(ETL)
– Dagstuhl seminar
on
this
topic
(June
2017)
• http://www.dagstuhl.de/17262
• “Start”
of
commercial
interest
– Startups:
Capsenta,
…

– Industries:
Pharma,
Finance,
…

– EU
Project:
Optique

IT Biz
Total
net

sales
of

all
Orders

today
Reports
Real
World
Data
Integration
Problem
18
5

What
do
you
mean
by
…
How
many
orders
were

placed
in
June
2017?
317,595
317,124
316,899
Billing
Shipping
E-‐Commerce
18
6

It’s
a
Semantic
Problem!
What
is
an
Order?
When
a
user

clicks
“Order”
on

the
website
When
the

customer
has

received
the

product
When
it
comes

out
of
the
billing

system
and
the
CC

has
been
charged
Billing
Shipping
E-‐Commerce
18
7

Cross
Organizational
Data
Integration

Organization
1
Organization
2
Organization
n
18
8

IT
Biz
Total
net

sales
of

all
Orders

today
Data
Architect
SELECT

..

FROM
…
csv csv
csv
MS
Access
T=1
T=2T=3
XLS
Did
the
Biz
User
communicate
the
correct

message
to
IT?

Did
IT
understand
correctly
what
the
Biz

User
wanted?

Did
IT
deliver
the
correct/precise
results?
Reports
XLS
XLS
Status
Quo
1
18
9

Enterprise
Data
Warehouse
IT Biz
Reports
Time
and
$
Total
net

sales
of

all
Orders

today
ETL
ETL
ETL
Total
net

sales
of
all

Orders

today
with

FX
Status
Quo
2
Data
Architect
19
0

Integrating
Data
using
Graphs
and
Semantics
19
1
HIVE
Impala,
etc
Oracle
SQL

Server
Postgres
Unstructured
Semi-‐
Structured
Mappings
Enterprise
Knowledge
Graph
Search ReportsAPI Dashboard

Semantic
Technology
is
not
easy

Who
creates
this?
Using
what
tools?
Funny
Note:
I
found
my

presentation
from
2007
where
I

asked
this
same
question

Real
World
19
3

Real
World
Mappings
are
not
easy
and
obvious
19
4

Mappings
and
Ontologies
from
Questions
19
5
A
Pay-‐As-‐You-‐Go
Methodology
for
Ontology-‐Based
Data
Access
Sequeda
&
Miranker.
IEEE
Internet
Computing
2017

Real
World
Example
SELECT
o.orderid, o.orderdate,
o.ordertotal
- ot.finaltax
- CASE
WHEN o.currencyid in (‘USD’, ‘CAD’) THEN
o.shippingcost
ELSE o.shippingcost - ot.shippingtax
END AS netsales,
o.currencyid
FROM order o, ordertax ot
WHERE o.orderid = ordertax.orderid
AND o.statusid NOT IN (4, 5)

Reflection
4
• We
are
studying
how
non-‐semantic
web
(and

non-‐technical)
users
can
integrate
data
using

semantic
web
technologies
– HOW:
We
need
better
tools
– EXTENT:
I
don’t
know

CONCLUSION
19
8

HOW and
to
what
EXTENT can
RDB
be
integrated
with
the
SW?
1. RDB
can
be
automatically
directly
mapped
to

RDF
and
OWL
– Monotone,
Information
and
Query
Preserving
– Monotone
is
obstacle
for
Semantics
Preserving
2. RDB
can
evaluate
and
optimize
SPARQL
1.0

queries
– Two
important
optimizations
3. RDB
can
act
as
a
reasoner for
Ontologies
with

inheritance
and
transitivity
– Saturated
mappings,
query
rewriting
using
mat

views
and
recursion

Tipping
Point
Relational

Database
Semantic

Web
• Semantics
• “Graphy”
Queries
• Data
Integration
• Flexible
• Metadata
• Provenance
• Graph
Visualizations
OWL 2 QL OWL 2 RL
OWL 2 EL
OWL 2 DL
OWL SQL

HOW and
to
what
EXTENT can
RDB
be
integrated
with
the
SW?
20
1
Juan
Sequeda,
Ph.D
Co-‐Founder
– Capsenta
juan@capsenta.com
@juansequeda
Sequeda
J.
Integrating
Relational
Databases
with
the
Semantic
Web.
IOS
Press.
2016
http://www.iospress.nl/book/integrating-‐relational-‐databases-‐with-‐the-‐semantic-‐web/
We
are
always
looking
for

smart
people
THANK
YOU!
RDB
can
be
automatically
directly

mapped
to
RDF
and
OWL
and

preserve
information
and
queries
RDB
can
evaluate

and
optimize

SPARQL
1.0
queries
RDB
can
act
as
a
reasoner
for
Ontologies
with

inheritance
and
transitivity

Integrating Relational Databases with the Semantic Web: A Reflection

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Integrating Relational Databases with the Semantic Web: A Reflection

Similar to Integrating Relational Databases with the Semantic Web: A Reflection (20)

More from Juan Sequeda

More from Juan Sequeda (20)

Recently uploaded

Recently uploaded (20)

Integrating Relational Databases with the Semantic Web: A Reflection