Ontology-mediated query answering with data-tractable description logics

ontology-mediated
query answering with
data-tractable
description logics
Meghyn Bienvenu (CNRS & Université de Montpellier)
Magdalena Ortiz (Vienna University of Technology)

ontology-mediated query answering (omqa)
data
incomplete
database
(ground facts)
ontology
(logical theory)
???
user query
2/109

data ???
patient data
“Melanie has listeriosis”
“Paul has Lyme disease”
medical knowledge
“Listeriosis & Lyme disease
are bacterial infections”
user query
“Find all patients with
bacterial infections”
2/109

data ???
patient data
“Melanie has listeriosis”
“Paul has Lyme disease”
medical knowledge
“Listeriosis & Lyme disease
are bacterial infections”
user query
“Find all patients with
bacterial infections”
expected answers: Melanie, Paul
2/109

data ???
employee data
“Marie is a professor”
“Mark teaches CS200”
org. knowledge
“Professors are teaching staff”
“Someone who teaches is
part of the teaching staff”
user query
“Find all teaching staff”
2/109

data ???
employee data
“Marie is a professor”
“Mark teaches CS200”
org. knowledge
“Professors are teaching staff”
“Someone who teaches is
part of the teaching staff”
user query
“Find all teaching staff”
expected answers: Marie, Mark
2/109

what are ontologies good for?
To standardize the terminology of an application domain
∙ meaning of terms is constrained, so less misunderstandings
∙ by adopting a common vocabulary, easy to share information
3/109

To present an intuitive and uniﬁed view of data sources
∙ ontology can be used to enrich the data vocabulary, making it
easier for users to formulate their queries
∙ especially useful when integrating multiple data sources
3/109

To present an intuitive and uniﬁed view of data sources
∙ ontology can be used to enrich the data vocabulary, making it
easier for users to formulate their queries
∙ especially useful when integrating multiple data sources
To support automated reasoning
∙ uncover implicit connections between terms, errors in modelling
∙ exploit knowledge in the ontology during query answering, to get
back a more complete set of answers to queries
3/109

applications of omqa: medicine
General medical ontologies: SNOMED CT (∼ 400,000 terms!), GALEN
Specialized ontologies: FMA (anatomy), NCI (cancer), ...
Querying & exchanging medical records (ﬁnd patients for medical trials)
∙ myocardial infarction vs. MI vs. heart attack vs. 410.0
Supports tools for annotating and visualizing patient data (scans, x-rays) 4/109

applications of omqa: life sciences
Hundreds of ontologies at BioPortal (http://bioportal.bioontology.org/):
Gene Ontology (GO), Cell Ontology, Pathway Ontology, Plant Anatomy, ...
Help scientists share, query, & visualize experimental data
5/109

applications of omqa: entreprise information systems
Companies and organizations have lots of data
∙ need easy and ﬂexible access to support decision-making
Example industrial projects:
∙ Public debt data: Sapienza Univ. & Italian Department of Treasury
∙ Energy sector: Optique EU project (several univ, StatOil, & Siemens)
6/109

our focus: horn description logics
Ontologies formulated using description logics (DLs):
∙ family of decidable fragments of ﬁrst-order logic
∙ basis for OWL web ontology language (W3C)
∙ range from fairly simple to highly expressive
∙ complexity of query answering well understood
7/109

our focus: horn description logics
Ontologies formulated using description logics (DLs):
∙ family of decidable fragments of ﬁrst-order logic
∙ basis for OWL web ontology language (W3C)
∙ range from fairly simple to highly expressive
∙ complexity of query answering well understood
In this tutorial, focus on Horn description logics:
∙ DL-LiteR, EL, ELHI, Horn-SHIQ, ...
∙ good computational properties, well suited for OMQA
∙ still expressive enough for interesting applications
∙ basis for OWL 2 QL and OWL 2 EL proﬁles
Consider various types of queries
7/109

plan for today
∙ Horn Description Logics
∙ Basics of OMQA
∙ Instance Queries
∙ Conjunctive Queries
∙ Navigational Queries
∙ Queries with Negation and Recursion
∙ Research Trends in OMQA
8/109

dl basics
Building blocks of DLs:
∙ concept names (unary predicates, classes)
IceCream, Pizza, Meat, SpicyDish, Dish, Menu, Restaurant, ...
∙ role names (binary predicates, properties)
hasIngred, hasCourse, hasDessert, serves, ...
∙ individual names (constants)
menu32, pastadish17, d3, rest156, r12, ...
(speciﬁc menus, dishes, restaurants ...)
NC / NR / NI: set of all concept / role / individual names
10/109

dl knowledge bases
Knowledge base (KB) = ABox (data) + TBox (ontology)
ABox contains facts about speciﬁc individuals
∙ ﬁnite set of concept assertions A(a) and role assertions r(a, b)
∙ IceCream(d2): dish d2 is of type IceCream
∙ hasDessert(m, d2): menu m is connected via hasDessert to dish d2
11/109

dl knowledge bases
Knowledge base (KB) = ABox (data) + TBox (ontology)
ABox contains facts about specific individuals
∙ finite set of concept assertions A(a) and role assertions r(a, b)
∙ IceCream(d2): dish d2 is of type IceCream
∙ hasDessert(m, d2): menu m is connected via hasDessert to dish d2
TBox contains general knowledge about the domain of interest
∙ finite set of axioms (details on syntax to follow)
∙ IceCream is a subclass of Dessert
∙ hasCourse connects Menus to Dishes
∙ every Menu is connected to at least one dish via hasCourse
11/109

concept and role constructors
Can build complex concepts and roles using constructors:
∙ conjunction (⊓), disjunction (⊔), negation (¬)
Dessert ⊓ ¬IceCream Pizza ⊔ PastaDish
12/109

∙ restricted forms of existential and universal quantiﬁcation (∃, ∀)
∃hasCourse.⊤ ∃contains.Meat Dish ⊓ ∀contains.¬Meat
( ⊤ acts as a “wildcard”, denotes set of all things)
12/109

∙ inverse (−
) and composition (·) of roles
hasCourse−
contains · contains
(use N±
R for set of role names and inverse roles)
(use inv(r) to toggle −: inv(r) = r−, inv(r−) = r )
12/109

∙ inverse (−
) and composition (·) of roles
hasCourse−
contains · contains
(use N±
R for set of role names and inverse roles)
(use inv(r) to toggle −: inv(r) = r−, inv(r−) = r )
Note: set of available constructors depends on the particular DL! 12/109

tbox axioms
Concept inclusions C ⊑ D (C, D possibly complex concepts)
IceCream ⊑ Dessert Menu ⊑ ∃hasCourse.⊤ Spicy ⊓ Dish ⊑ SpicyDish
Role inclusions R ⊑ S (R, S possibly complex roles)
hasIngred ⊑ contains ingredOf−
⊑ hasIngred hasDessert ⊑ hasCourse
Note: type and syntax of axioms depends on the particular DL!
13/109

dl semantics
Interpretation I (“possible world”)
∙ domain of objects ∆I
(possibly inﬁnite set)
∙ interpretation function ·I
that maps
∙ concept name A ⇝ set of objects AI
⊆ ∆I
∙ role name r ⇝ set of pairs of objects rI
⊆ ∆I
× ∆I
∙ individual name a ⇝ object aI
∈ ∆I
14/109

example: interpretation
∆I
italFeastI
chCakeI
Dish
I
DessertI
Appetizer
I
MenuI
hasCourse
hasCourse
hasCourse
hasCourse
4 concept names: Dish, Dessert, Appetizer, Menu
1 role name: hasCourse 2 individual names: italFeast, chCake
15/109

dl semantics
Interpretation I (“possible world”)
∙ domain of objects ∆I
(possibly inﬁnite set)
∙ interpretation function ·I
that maps
∙ concept name A ⇝ set of objects AI
⊆ ∆I
∙ role name r ⇝ set of pairs of objects rI
⊆ ∆I
× ∆I
∙ individual name a ⇝ object aI
∈ ∆I
Interpretation function ·I
extends to complex concepts and roles:
⊤ ∆I
⊥ ∅
¬C ∆I
CI
C1 ⊓ C2 C1
I
∩ C2
I
∃R.C {d1 | there exists (d1, d2) ∈ RI
with d2 ∈ CI
}
∀R.C {d1 | d2 ∈ CI
for all (d1, d2) ∈ RI
}
r−
{(d2, d1) | (d1, d2) ∈ rI
}
16/109

back to the example
∆I
italFeastI
chCakeI
Dish
I
DessertI
Appetizer
I
MenuI
hasCourse
hasCourse
hasCourse
hasCourse
Dish ⊓ Menu Dessert ⊓ Appetizer ∃hasCourse.⊤ ∃hasCourse−
.Dessert
17/109

semantics of dl kbs
Satisfaction in an interpretation
∙ I satisﬁes C ⊑ D ⇔ CI
⊆ DI
∙ I satisﬁes R ⊑ S ⇔ RI
⊆ SI
18/109

semantics of dl kbs
⊆ DI
⊆ SI
∙ I satisﬁes A(a) ⇔ aI
∈ AI
∙ I satisﬁes r(a, b) ⇔ (aI
, bI
) ∈ rI
18/109

semantics of dl kbs
⊆ DI
⊆ SI
∙ I satisfies A(a) ⇔ aI
∈ AI
∙ I satisfies r(a, b) ⇔ (aI
, bI
) ∈ rI
Model of a KB K = interpretation that satisfies all statements in K
K is satisfiable = K has at least one model
K entails α (written K |= α) = every model I of K satisfies α
18/109

back to the example
∆I
italFeastI
chCakeI
Dish
I
DessertI
Appetizer
I
MenuI
hasCourse
hasCourse
hasCourse
hasCourse
Which of the following assertions / axioms is satisﬁed in I?
Dessert ⊑ Dish Dish ⊓ Menu ⊑ ⊥ Menu ⊑ ∃hasCourse.⊤
∃hasCourse−
.⊤ ⊑ Dish Menu(italFeast) hasCourse(italFeast, chCake)
19/109

some important horn dls
Idea: Horn DLs cannot express disjunction (explicitly or implicitly)
∙ better computational properties than non-Horn DLs (more on this later)
20/109

DL-LiteR
∙ concept inclusions B1 ⊑ (¬)B2 B1, B2 either A ∈ NC or ∃R (R ∈ N±
R )
∙ role inclusions R1 ⊑ (¬)R2 R1, R2 ∈ N±
R
20/109

DL-LiteR
R )
R
EL
∙ allows only ⊤, ⊓, and ∃r.C as constructors
∙ only concept inclusions in TBox
20/109

DL-LiteR
R )
R
EL
ELHI⊥
∙ additionally allows for ⊥ and inverse roles (r−
)
∙ can also have role inclusions
20/109

DL-LiteR
R )
R
EL
ELHI⊥
∙ additionally allows for ⊥ and inverse roles (r−
)
∙ can also have role inclusions
Horn-SHIQ
∙ limited use of ¬, ∀r.C, and number restrictions (≥ nR.C, ≤ nR.C)
∙ also have transitivity axioms (e.g. assert contains is transitive) 20/109

aboxes vs. databases
ABoxes and databases (DBs) and are syntactically similar:
∙ ABox = ﬁnite set of assertions (unary and binary facts)
∙ Database = ﬁnite set of facts of arbitrary arity
22/109

ABoxes interpreted under open world assumption:
∙ every assertion in the ABox is assumed to hold (true)
∙ assertions not present in the ABox may hold or not (unknown)
Each ABox gives rise to many interpretations (its models)
∙ models can be inﬁnite, can have inﬁnitely many models
22/109

ABoxes interpreted under open world assumption:
∙ every assertion in the ABox is assumed to hold (true)
∙ assertions not present in the ABox may hold or not (unknown)
Each ABox gives rise to many interpretations (its models)
∙ models can be infinite, can have infinitely many models
Databases interpreted under closed world assumption:
∙ every fact in the DB is assumed to hold (true)
∙ every fact not in the DB is assumed not to hold (false)
In other words, each DB corresponds to single finite interpretation
∙ domain of the interpretation = set of constants in DB
22/109

querying databases
Database query q of arity n maps (Boolean query = arity 0)
Database D ⇝ ans(q, D) = set of n-tuples of constants from D
23/109

querying databases
Interpretation I ⇝ ans(q, I) = set of n-tuples of elements from I
23/109

querying databases
First-order (FO) query = ﬁrst-order formula
∙ arity of FO query = number of free variables
∙ answers = substitutions for free vars that make formula hold
∙ example: Dish(x) ∧ ∀y.(contains(x, y) → ¬Spicy(y))
23/109

querying databases
First-order (FO) query = ﬁrst-order formula
∙ arity of FO query = number of free variables
∙ answers = substitutions for free vars that make formula hold
∙ example: Dish(x) ∧ ∀y.(contains(x, y) → ¬Spicy(y))
Datalog queries = ﬁnite set of Datalog rules + ‘goal’ relation
∙ arity of Datalog query = arity of goal relation
∙ answers = exhaustively apply rules to DB / interpretation, collect
tuples in goal relation
∙ example: rules contains(x, z) ← contains(x, y), contains(y, z) and
SpicyDish(x) ← Dish(x), contains(x, y), Spicy(y)
23/109

querying dl knowledge bases
Problem: each KB gives rise to multiple interpretations (its models),
but DB query semantics deﬁnes answers w.r.t. a single interpretation
24/109

Solution: adopt certain answer semantics
∙ require tuple to be an answer w.r.t. all models of KB
24/109

Formally: Call a tuple (a1, . . . , an) of individuals from A a certain
answer to n-ary query q over DL KB K = (T , A) iff
(aI
1 , . . . , aI
n ) ∈ ans(q, I) for every model I of K
24/109

Formally: Call a tuple (a1, . . . , an) of individuals from A a certain
answer to n-ary query q over DL KB K = (T , A) iff
(aI
1 , . . . , aI
n ) ∈ ans(q, I) for every model I of K
Ontology-mediated query answering (OMQA)
= computing certain answers to queries
24/109

example: certain answers
Consider the query q(x) = Dessert(x) and the DL-LiteR KB K = (T , A):
T = {Cake ⊑ Dessert IceCream ⊑ Dessert hasDessert ⊑ hasCourse
∃hasCourse ⊑ Menu ∃hasDessert−
⊑ Dessert }
A = {Cake(d1) IceCream(d2) Dessert(d3) hasDessert(m, d4)}
25/109

⊑ Dessert }
There are four certain answers to q w.r.t. K:
∙ d1 ∈ cert(q, K) Cake(d1) ∈ A, Cake ⊑ Dessert ∈ T
∙ d2 ∈ cert(q, K) IceCream(d2) ∈ A, IceCream ⊑ Dessert ∈ T
∙ d3 ∈ cert(q, K) Dessert(d3) ∈ A
∙ d4 ∈ cert(q, K) hasDessert(m, d4)∈A, hasDessert−
⊑ Dessert ∈ T
25/109

⊑ Dessert }
There are four certain answers to q w.r.t. K:
∙ d1 ∈ cert(q, K) Cake(d1) ∈ A, Cake ⊑ Dessert ∈ T
∙ d2 ∈ cert(q, K) IceCream(d2) ∈ A, IceCream ⊑ Dessert ∈ T
∙ d3 ∈ cert(q, K) Dessert(d3) ∈ A
∙ d4 ∈ cert(q, K) hasDessert(m, d4)∈A, hasDessert−
⊑ Dessert ∈ T
The ﬁfth individual m is not a certain answer: can construct model
J of K in which mJ
̸∈ DessertJ
(see lecture notes).
25/109

key techniques for omqa: query rewriting
Query rewriting: Reduces problem of ﬁnding certain answers to
standard DB query evaluation (⇝ exploit existing DB systems)
+
query rewriting
+
+
query evaluation
TBox T
query
ABox
q
database query q0
query answersA
26/109

+
query rewriting
+
+
query evaluation
TBox T
query
ABox
q
database query q0
query answersA
Call q′
(⃗x) a rewriting of q(⃗x) and T iff for every ABox A and tuple ⃗a
T , A |= q(⃗a) ⇔ ⃗a ∈ ans(q′
(⃗x), IA) (IA = treat A as DB)
26/109

+
query rewriting
+
+
query evaluation
TBox T
query
ABox
q
database query q0
query answersA
Call q′
(⃗x) a rewriting of q(⃗x) and T iff for every ABox A and tuple ⃗a
T , A |= q(⃗a) ⇔ ⃗a ∈ ans(q′
(⃗x), IA) (IA = treat A as DB)
Types of rewritings: FO-rewritings (SQL), Datalog rewritings, ...
26/109

key techniques for omqa: saturation
Saturation: Render explicit (some of) the implicit information
contained in the KB, making it available for query evaluation
27/109

Simple use of saturation: (works e.g. for RDFS ontologies)
∙ use saturation to ‘complete’ the ABox by adding those assertions
that are logically entailed from the KB
∙ then evaluate the query over the saturated ABox
27/109

Simple use of saturation: (works e.g. for RDFS ontologies)
∙ use saturation to ‘complete’ the ABox by adding those assertions
that are logically entailed from the KB
∙ then evaluate the query over the saturated ABox
More complex uses:
∙ enrich the ABox in other ways (e.g. add new ABox individuals to
witness the existential restrictions ∃R.C)
∙ combine saturation with query rewriting
27/109

complexity of omqa
View OMQA as a decision problem (yes-or-no question):
Problem: Q answering in L (Q a query language, L a DL)
Input: An n-ary query q ∈ Q, an ABox A, a L-TBox T ,
and a tuple ⃗a ∈ Ind(A)n
Question: Does ⃗a belong to cert(q, (T , A))?
28/109

complexity of omqa
View OMQA as a decision problem (yes-or-no question):
Problem: Q answering in L (Q a query language, L a DL)
Input: An n-ary query q ∈ Q, an ABox A, a L-TBox T ,
and a tuple ⃗a ∈ Ind(A)n
Question: Does ⃗a belong to cert(q, (T , A))?
Combined complexity: in terms of size of whole input
Data complexity: in terms of size of A only
∙ view rest of input as ﬁxed (of constant size)
∙ motivation: ABox typically much larger than rest of input
Note: use |A| to denote size of A (similarly for |T |, |q|, etc.)
28/109

complexity classes
We will mention the following standard classes:
P problems solvable in deterministic polynomial time
NP problems solvable in non-det. polynomial time
coNP problems whose complement is solvable in
non-deterministic polynomial time
LogSpace problems solvable in deterministic logarithmic space
NLogSpace problems solvable in non-det. logarithmic space
PSpace problems solvable in polynomial space (note: =NPSpace)
Exp problems solvable in deterministic exponential time
29/109

complexity classes
We will mention the following standard classes:
P problems solvable in deterministic polynomial time
NP problems solvable in non-det. polynomial time
coNP problems whose complement is solvable in
non-deterministic polynomial time
LogSpace problems solvable in deterministic logarithmic space
NLogSpace problems solvable in non-det. logarithmic space
PSpace problems solvable in polynomial space (note: =NPSpace)
Exp problems solvable in deterministic exponential time
Another less known but important class:
AC0 problems solvable by uniform family of
polynomial-size constant-depth circuits
Relationships between classes:
AC0 ⊊ LogSpace ⊆ NLogSpace ⊆ P ⊆ NP ⊆ PSpace ⊆ Exp
29/109

instance queries
Instance queries (IQs): ﬁnd instances of a given concept or role
A(x) where A ∈ NC concept instance query
r(x, y) where r ∈ NR role instance query
31/109

instance queries
To query for a complex concept C, take AC(x) for fresh AC ∈ NC and
add C ⊑ AC to the TBox
31/109

instance queries
To query for a complex concept C, take AC(x) for fresh AC ∈ NC and
add C ⊑ AC to the TBox
Remarks:
∙ Instance query answering is often called instance checking
∙ Focus of OMQA until mid-2000s
31/109

instance checking in dl-lite via query rewriting
Input = instance query q + DL-LiteR TBox T
We construct an FO-rewriting of q w.r.t. T
More speciﬁcally, we construct:
∙ an FO-rewriting of q relative to consistent ABoxes, and
∙ an FO-rewriting of unsatisﬁability
(these can be easily combined into FO-rewriting of q for all ABoxes)
32/109

rewriting relative to consistent aboxes
We ﬁrst deﬁne two procedures:
ComputeSubsumees all reasons for an individual to be in B
input concept B, TBox T
output set of C such that T |= C ⊑ B ⇝ subsumees of B w.r.t. T
ComputeSubroles all reasons for a pair to be in R
input role R, TBox T
output set of S such that T |= S ⊑ R ⇝ subroles of R w.r.t. T
33/109

computing subsumees
Algorithm ComputeSubsumees
Input: DL-LiteR TBox T , concept B ∈ NC ∪ {∃R | R ∈ N±
R }
1. Initialize Subsumees = {B} and Examined = ∅.
2. While Subsumees Examined ̸= ∅
2.1 Select D ∈ Subsumees Examined and add D to Examined.
2.2 For every concept inclusion C ⊑ D ∈ T
∙ If C ̸∈ Subsumees, add C to Subsumees
2.3 For every role inclusion R ⊑ S ∈ T such that D = ∃S.
∙ If ∃R ̸∈ Subsumees, add ∃R to Subsumees
2.4 For every role inclusion R ⊑ S ∈ T such that D = ∃inv(S).
∙ If ∃inv(R) ̸∈ Subsumees, add ∃inv(R) to Subsumees.
3. Return Subsumees.
34/109

computing subsumees: an example (1/3)
ComputeSubsumees on (T , Dish), where T : ItalDish ⊑ Dish
VegDish ⊑ Dish
Dish ⊑ ∃hasIngred
∃hasCourse−
⊑ Dish
hasMain ⊑ hasCourse
hasDessert ⊑ hasCourse
Examined = ∅
Subsumees = {Dish}
35/109

VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
Examined = ∅
Subsumees = {Dish}
Choose: Dish Examined = {Dish}
Subsumees = {Dish, ItalDish, VegDish, ∃hasCourse−
}
35/109

VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
Examined = ∅
Subsumees = {Dish}
Choose: Dish Examined = {Dish}
}
Choose: ItalDish Examined = {Dish, ItalDish}
}
35/109

VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
Choose: VegDish Examined = {Dish, ItalDish, VegDish}
}
36/109

VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
Choose: VegDish Examined = {Dish, ItalDish, VegDish}
}
Choose: ∃hasCourse−
Examined = {Dish, ItalDish, VegDish, ∃hasCourse−
}
,
∃hasMain−
, ∃hasDessert−
}
36/109

VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
Choose: ∃hasMain−
,
hasMain−
}
,
∃hasMain−
, ∃hasDessert−
}
37/109

VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
Choose: ∃hasMain−
,
hasMain−
}
,
∃hasMain−
, ∃hasDessert−
}
Choose: ∃hasDessert−
,
∃hasMain−
, ∃hasDessert−
}
,
∃hasMain−
, ∃hasDessert−
}
37/109

computing subroles
Algorithm ComputeSubroles
Input: DL-LiteR TBox T , role R ∈ N±
R
1. Initialize Subroles = {R} and Examined = ∅.
2. While Subroles Examined ̸= ∅
2.1 Select S ∈ Subroles Examined and add S to Examined.
2.2 For every role inclusion U ⊑ S or inv(U) ⊑ inv(S) in T
∙ If U ̸∈ Subsumees, add U to Subsumees
3. Return Subroles.
38/109

computing subroles
Algorithm ComputeSubroles
Input: DL-LiteR TBox T , role R ∈ N±
R
1. Initialize Subroles = {R} and Examined = ∅.
2. While Subroles Examined ̸= ∅
2.1 Select S ∈ Subroles Examined and add S to Examined.
2.2 For every role inclusion U ⊑ S or inv(U) ⊑ inv(S) in T
∙ If U ̸∈ Subsumees, add U to Subsumees
3. Return Subroles.
ItalDish ⊑ Dish
VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
Run on hasCourse:
Subroles = {hasCourse, hasMain,
hasDessert}
38/109

from concepts and roles to queries
Let SC = ComputeSubsumees(A, T ), SR = ComputeSubroles(r, T ).
Rewriting of A(x) w.r.t. T (and consistent ABoxes):
RewriteIQ(A, T ) =
∨
C∈SC∩NC
C(x) ∨
∨
∃r∈SC
∃y.r(x, y) ∨
∨
∃r−∈SC
∃y.r(y, x)
39/109

RewriteIQ(A, T ) =
∨
C∈SC∩NC
C(x) ∨
∨
∃r∈SC
∃y.r(x, y) ∨
∨
∃r−∈SC
∃y.r(y, x)
Rewriting of r(x, y) w.r.t. T (and consistent ABoxes):
RewriteIQ(r, T ) =
∨
s∈SR
s(x, y) ∨
∨
s−∈SR
s(y, x)
39/109

RewriteIQ(A, T ) =
∨
C∈SC∩NC
C(x) ∨
∨
∃r∈SC
∃y.r(x, y) ∨
∨
∃r−∈SC
∃y.r(y, x)
Rewriting of r(x, y) w.r.t. T (and consistent ABoxes):
RewriteIQ(r, T ) =
∨
s∈SR
s(x, y) ∨
∨
s−∈SR
s(y, x)
The rewriting is ABox-independent and polysize in |T | and |q|.
39/109

example of query rewriting (1/2)
We have already computed:
ComputeSubsumees(Dish, T ) ={Dish, ItalDish, VegDish,
∃hasCourse−
, ∃hasMain
−
, ∃hasDessert−
}
Get following rewriting of Dish(x) w.r.t. T :
RewriteIQ(Dish, T ) = Dish(x) ∨ ItalDish(x) ∨ VegDish(x)
∨ ∃y.hasCourse(y, x) ∨ ∃y.hasMain(y, x)
∨ ∃y.hasDessert(y, x)
40/109

example of query rewriting (2/2)
ItalDish ⊑ Dish
VegDish ⊑ Dish
∃hasCourse−
⊑ Dish
ABox A:
hasMain(m, d1)
hasDessert(m, d2)
VegDish(d3)
RewriteIQ(Dish, T ) = Dish(x) ∨ ItalDish(x) ∨ VegDish(x) ∨ ∃y.hasCourse(y, x)
∨ ∃y.hasMain(y, x) ∨ ∃y.hasDessert(y, x)
Certain answers: d1, because of the disjunct ∃y.hasMain(y, x)
d2, because of the disjunct ∃y.hasDessert(y, x)
d3, because of the disjunct VegDish(x)
41/109

checking unsatisfiability
We have a FO-rewriting of q w.r.t. T relative to consistent ABoxes
We need a rewriting of unsatisﬁability to obtain a rewriting of q
42/109

checking unsatisfiability
We have a FO-rewriting of q w.r.t. T relative to consistent ABoxes
We need a rewriting of unsatisﬁability to obtain a rewriting of q
∙ only negative inclusions relevant
∙ one subquery for each such inclusion G ⊑ ¬H
∙ consider all possible ways of violating G ⊑ ¬H: combinations of a
subsumee (subrole) of G and a subsumee (subrole) of H
Details in the lecture notes.
42/109

complexity of instance checking in dl-lite
In data complexity
∙ rewriting takes constant time, yields FO query
∙ upper bound from FO query evaluation: AC0
43/109

In data complexity
In combined complexity:
∙ P membership: rewriting and evaluation both in polynomial time
∙ NLogSpace upper bound: ‘guess’ relevant part of rewriting
43/109

In data complexity
In combined complexity:
∙ P membership: rewriting and evaluation both in polynomial time
∙ NLogSpace upper bound: ‘guess’ relevant part of rewriting
Theorem In DL-LiteR, satisﬁability and instance checking are
1. in AC0 for data complexity
2. NLogSpace-complete for combined complexity.
Note: Same bounds hold for several other DL-Lite dialects
43/109

instance checking in el
Next consider instance checking in EL.
Assume EL TBoxes given in normal form: axioms of the forms
A1 ⊓ . . . ⊓ An ⊑ B A ⊑ ∃r.B ∃r.A ⊑ B
44/109

A1 ⊓ . . . ⊓ An ⊑ B A ⊑ ∃r.B ∃r.A ⊑ B
Cannot use FO query rewriting approach for EL:
no FO-rewriting of A(x) w.r.t. T = {∃r.A ⊑ A}
44/109

A1 ⊓ . . . ⊓ An ⊑ B A ⊑ ∃r.B ∃r.A ⊑ B
Cannot use FO query rewriting approach for EL:
no FO-rewriting of A(x) w.r.t. T = {∃r.A ⊑ A}
We present a saturation-based approach.
44/109

saturation rules for el
TBox rules
A ⊑ Bi (1 ≤ i ≤ n) B1 ⊓ . . . ⊓ Bn ⊑ D
A ⊑ D
T1
A ⊑ B B ⊑ ∃r.D
A ⊑ ∃r.D
T2
A ⊑ ∃r.B B ⊑ D ∃r.D ⊑ E
A ⊑ E
T3
ABox rules
A1 ⊓ . . . ⊓ An ⊑ B Ai(a) (1 ≤ i ≤ n)
B(a)
A1
∃r.B ⊑ A r(a, b) B(b)
A(a)
A2
Algorithm: apply rules exhaustively, check if A(a) (r(a, b)) is present
45/109

example: saturation in el
46/109

example: saturation in el
ArrabSauce ⊑ Spicy T3 : (5), (6), (7) (10)
PenneArrab ⊑ Spicy T3 : (1), (10), (7) (11)
PenneArrab ⊑ Dish T1 : (2), (3) (12)
PenneArrab ⊑ ∃hasIngred.Pasta T2 : (2), (4) (13)
PenneArrab ⊑ SpicyDish T1 : (11), (12), (8) (14)
Spicy(p) A1 : (11), (9) (15)
Dish(p) A1 : (12), (9) (16)
SpicyDish(p) A1 : (16), (15) (17)
46/109

complexity of instance checking in el
Saturation approach is sound: everything derived is entailed
47/109

Also complete for instance checking:
Theorem Let K be an EL knowledge base, and let K′
be the result
of saturating K. For every ABox assertion α, we have:
K |= α iff α ∈ K′
47/109

be the result
Note: does not make all consequences explicit
∙ can have inﬁnitely many implied axioms ⇝ would not terminate!
∙ so: only complete for some reasoning tasks
47/109

be the result
Note: does not make all consequences explicit
∙ can have inﬁnitely many implied axioms ⇝ would not terminate!
∙ so: only complete for some reasoning tasks
Runs in polynomial time in |K|. This is optimal:
Theorem Instance checking in EL is P-complete for both data and
combined complexity.
47/109

extending the saturation approach
Saturation approach can be extended to ELHI⊥
Additional rules required
Key difference: new conjunctions of concepts can occur
A ⊑ ∃R.D ∃R−
.B ⊑ E
48/109

A ⊑ ∃R.D ∃R−
.B ⊑ E
A ⊓ B ⊑ ∃R.(D ⊓ E)
48/109

extended set of saturation rules
TBox rules
{A ⊑ Bi}n
i=1 B1 ⊓ . . . ⊓ Bn ⊑ D
A ⊑ D
T1
R ⊑ S S ⊑ T
R ⊑ T
T4
M ⊑ ∃R.(N ⊓ ⊥)
M ⊑ ⊥
T5
M ⊑ ∃R.(N ⊓ N′
) N ⊑ A
M ⊑ ∃R.(N ⊓ N′
⊓ A)
T6
M ⊑ ∃R.(N ⊓ A) ∃S.A ⊑ B R ⊑ S
M ⊑ B
T7
M ⊑ ∃R.N ∃inv(S).A ⊑ B R ⊑ S
M ⊓ A ⊑ ∃R.(N ⊓ B)
T8
ABox rules
A1 ⊓ . . . ⊓ An ⊑ B Ai(a) (1 ≤ i ≤ n)
B(a)
A1
∃r.B ⊑ A r(a, b) B(b)
A(a)
A2
∃r−
.B ⊑ A r(b, a) B(b)
A(a)
A3
r ⊑ s r(a, b)
s(a, b)
A4
r ⊑ s−
r(a, b)
s(b, a)
A5
49/109

C ⊑ ∃R.D D ⊑ ∀R−
.B
C ⊑ ∃R.(N ⊓ N′
⊓ A)
New set of rules ⇝ exponentially many different new axioms
Theorem Instance checking in ELHI⊥ is P-complete for data and
Exp-complete for combined complexity.
50/109

saturation as a datalog program
Let sat(T ) be result of applying TBox saturation rules to T .
For each ELHI⊥ TBox T and ABox signature Σ deﬁne following
Datalog program Π(T , Σ):
Π(T , Σ) ={B(x) ← A1(x), . . . , An(x) | A1 ⊓ . . . ⊓ An ⊑ B ∈ sat(T )} ∪
{B(x) ← A(y), r(x, y) | ∃r.A ⊑ B ∈ T } ∪
{B(y) ← A(x), r(x, y) | ∃r−
.A ⊑ B ∈ T } ∪
{s(x, y) ← r(x, y) | r ⊑ s ∈ sat(T ), s ∈ NR} ∪
{s(y, x) ← r(x, y) | r ⊑ s−
∈ sat(T ), s ∈ NR} ∪
{⊤(x) ← A(x) | A ∈ NC ∩ Σ} ∪
{⊤(x) ← r(x, y) | r ∈ NR ∩ Σ} ∪
{⊤(x) ← r(y, x) | r ∈ NR ∩ Σ}
51/109

saturation as a datalog program (continued)
Theorem For every finite signature Σ and ELHI⊥ KB K = (T , A)
with sig(A) ⊆ Σ:
1. K is unsatisfiable iff ans((Π(T , Σ), ⊥), IA) ̸= ∅;
2. If K is satisfiable, then for all A ∈ NC, r ∈ NR, and a, b ∈ Ind(A):
∙ K |= A(a) iff a ∈ ans((Π(T , Σ), A), IA);
∙ K |= r(a, b) iff (a, b) ∈ ans((Π(T , Σ), r), IA).
This means:
∙ get Datalog rewriting of instance queries in ELHI⊥
∙ can use Datalog program to create saturated ABox
52/109

saturation as a datalog program: an example
The Datalog program associated with our example:
PastaDish(x) ← PenneArrab(x) PenneArrab ⊑ PastaDish
Dish(x) ← PastaDish(x) PastaDish ⊑ Dish
Spicy(x) ← Peperonc(x) Peperonc ⊑ Spicy
Spicy(x) ← hasIngred(x, y), Spicy(y) ∃hasIngred.Spicy ⊑ Spicy
SpicyDish(x) ← Spicy(x), Dish(x) Dish ⊓ Spicy ⊑ SpicyDish
Spicy(x) ← ArrabSauce(x) ArrabSauce ⊑ Spicy
Spicy(x) ← PenneArrab(x) PenneArrab ⊑ Spicy
Dish(x) ← PenneArrab(x) PenneArrab ⊑ Dish
SpicyDish(x) ← PenneArrab PenneArrab ⊑ SpicyDish
(technically, also have T -independent rules for ⊤...)
53/109

(unions of) conjunctive queries
IQs quite restricted: No selections and joins as in DB queries
55/109

Most work on OMQA adopts (unions of) conjunctive queries (CQs)
55/109

A conjunctive query (CQ) is a ﬁrst-order query q(⃗x) of the form
∃⃗y.P1(⃗t1) ∧ · · · ∧ Pn(⃗tn)
where every variable in some ⃗ti appears in either ⃗x or ⃗y
55/109

A conjunctive query (CQ) is a ﬁrst-order query q(⃗x) of the form
∃⃗y.P1(⃗t1) ∧ · · · ∧ Pn(⃗tn)
where every variable in some ⃗ti appears in either ⃗x or ⃗y
A union of CQs (UCQ) is a ﬁrst-order query q(⃗x) of the form
q1(⃗x) ∨ · · · ∨ qn(⃗x)
where the qi(⃗x) are CQs with same tuple ⃗x of free vars
55/109

cqs and ucqs in datalog
Alternatively, CQs and UCQs can be seen as Datalog rules
CQs:
q(⃗x) = ∃⃗y.P1(⃗t1) ∧ · · · ∧ Pn(⃗tn) ⇝ q(⃗x) ← P1(⃗t1), . . . , Pn(⃗tn)
56/109

cqs and ucqs in datalog
Alternatively, CQs and UCQs can be seen as Datalog rules
CQs:
q(⃗x) = ∃⃗y.P1(⃗t1) ∧ · · · ∧ Pn(⃗tn) ⇝ q(⃗x) ← P1(⃗t1), . . . , Pn(⃗tn)
UCQs:
q(⃗x) = ∃⃗y1.P1
1(⃗t1
1) ∧ · · · ∧ P1
n1
( ⃗t1
n1
) q(⃗x) ← P1
1(⃗t1
1), . . . , P1
n1
( ⃗t1
n1
)
∨ ∃⃗y2.P2
1(⃗t2
1) ∧ · · · ∧ P2
n2
( ⃗t2
n2
) q(⃗x) ← P2
1(⃗t2
1), . . . , P2
n(⃗t2
n)
... ⇝
...
∨ ∃⃗yℓ.Pℓ
1(⃗tℓ
1) ∧ · · · ∧ Pℓ
nℓ
( ⃗tℓ
nℓ
) q(⃗x) ← Pℓ
1(⃗tℓ
1), . . . , Pℓ
nℓ
( ⃗tℓ
nℓ
)
56/109

what can we express as (u)cqs?
q1(y, x) = ∃z.serves(x, y) ∧ hasIngred(y, z) ∧ Spicy(z)
q2(y, x) = q2(y, x) ∨
(
∃z, z′
.serves(x, y) ∧ hasIngred(y, z)
∧hasIngred(z, z′
) ∧ Spicy(z′
)
)
57/109

what can we express as (u)cqs?
q1(y, x) = ∃z.serves(x, y) ∧ hasIngred(y, z) ∧ Spicy(z)
q2(y, x) = q2(y, x) ∨
(
∃z, z′
.serves(x, y) ∧ hasIngred(y, z)
∧hasIngred(z, z′
) ∧ Spicy(z′
)
)
Capture select-project-join queries of relational algebra
Capture basic graph patterns of SPARQL
57/109

query matches
A match for q(⃗x) = ∃⃗y.φ(⃗x,⃗y) in an interpretation I is a mapping π
from the variables in ⃗x ∪⃗y to objects in ∆I
such that:
∙ π(⃗t) ∈ PI
for every atom P(⃗t) ∈ q
58/109

query matches
such that:
∙ π(⃗t) ∈ PI
We write I |=π q(⃗a) if π is a match for q(⃗x) in I and π(⃗x) = ⃗a
58/109

query matches
such that:
∙ π(⃗t) ∈ PI
By deﬁnition: ⃗a ∈ ans(q, I)
iff
there exists a match π such that I |=π q(⃗a)
58/109

query matches
such that:
∙ π(⃗t) ∈ PI
By deﬁnition: ⃗a ∈ ans(q, I)
iff
there exists a match π such that I |=π q(⃗a)
Answering CQs amounts to searching for matches
Recall that ⃗a ∈ cert(q, K) iff ⃗a ∈ ans(q, I) for every model I of K
Challenge: how do we check that there is a match in every model?
58/109

the universal model property
For Horn DLs, each satisﬁable K has a universal model IK
IK is ‘contained’ in every model I of K
⇝ formally, there is a homomorphism from IK to I
An answer to a (U)CQ q in IK is an answer to q in every model of K
⇝ matches of (U)CQs are preserved under homomorphisms
⃗a ∈ cert(q, K) iff ⃗a ∈ ans(q, IK)
So: IK gives us the certain answers to q over K
59/109

constructing a universal model
Use the saturation of (T , A) for building a universal model IT ,A
60/109

Intuition: - IT ,A contains the saturated ABox A′
- if an object satisﬁes M and M ⊑ ∃R.M′
∈ sat(T ),
a fresh object witnessing this is created
60/109

Intuition: - IT ,A contains the saturated ABox A′
- if an object satisﬁes M and M ⊑ ∃R.M′
∈ sat(T ),
a fresh object witnessing this is created
Formally, ∆IT ,A
contains words aR1M1 . . . RnMn with a ∈ Ind(A) and:
∙ Ri are roles and Mi are conjunctions of concepts
∙ there exists M ⊑ ∃R1.M1 ∈ sat(T ) such that T , A |= M(a)
∙ Mi ⊑ ∃Ri+1.Mi+1 ∈ sat(T ) for every 1 ≤ i < n
(note: use only strongest axioms M ⊑ ∃R.M′ in sat(T ))
The interpretation function is deﬁned as follows:
∙ aIT ,A
= a,
∙ a ∈ AI
iff A(a) ∈ sat(T , A), eRM ∈ AIT ,A
iff M ⊑ A ∈ sat(T )
∙ (a, b) ∈ rI
iff r(a, b) ∈ sat(T , A), (e, eRM) ∈ rIT ,A
iff R ⊑ r ∈ sat(T ),
and (eRM, e) ∈ rIT ,A
if R ⊑ r−
∈ sat(T )
60/109

example of the canonical model construction (1/3)
TBox: PenneArrab ⊑ ∃hasIngred.Penne
Penne ⊑ Pasta
PenneArrab ⊑ ∃hasIngred.ArrabSauce
ArrabSauce ⊑ ∃hasIngred.Peperonc
Peperonc ⊑ Spicy
PizzaCalab ⊑ ∃hasIngred.Nduja
Nduja ⊑ Spicy
ABox: serves(r, b) serves(r, p) PenneArrab(b) PizzaCalab(p)
The saturated TBox additionally contains:
PenneArrab ⊑ ∃hasIngred.(Penne ⊓ Pasta)
ArrabSauce ⊑ ∃hasIngred.(Peperonc ⊓ Spicy)
PizzaCalab ⊑ ∃hasIngred.(Nduja ⊓ Spicy)
61/109

IT ,A contains the ABox and is closed under inclusions
serves(r, b) serves(r, p) PenneArrab(b) PizzaCalab(p)
rp
PizzaCalab
b
PenneArrabserves serves
62/109

The anonymous objects witnessing existential concepts form trees
rp
PizzaCalab
b
PenneArrab
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauce
e4
Peperonc, Spicy
serves serves
hasIngred hasIngred hasIngred
hasIngred
PenneArrab ⊑ ∃hasIngred.(Penne ⊓ Pasta)
PizzaCalab ⊑ ∃hasIngred.(Nduja ⊓ Spicy)
63/109

finding answers in the canonical model
To answer CQ q, it sufﬁces to test whether it has a match in IT ,A
But this is still challenging!
- IT ,A contains assertions and objects not present in A
- we cannot build IT ,A explicitly: can be inﬁnite!
Our approach: use query rewriting!
Formally: given a CQ q, we construct a UCQ REWT (q) such that
⃗a ∈ ans(q, IT ,A)
iff
there is a match π for a disjunct q′
of rewT (q) such that
IT ,A |=π q′
(⃗a) and π sends all vars to individuals from A
64/109

rewriting the query
Rewriting step (idea):
1. Choose leaf variable x so that no vars are mapped below it in IT ,A
2. Find axiom M ⊑ ∃R.N in sat(T ) that ensures atoms containing x
are satisﬁed
3. Drop from q all atoms with x and add M to parent of x to get q′
65/109

rewriting the query
are satisﬁed
Properties:
∙ Every match for q′
can be extended to a match for q
∙ Every match for q contains a match for q′
∙ The answers of q and q′
coincide, but the relevant matches for q′
are closer to the ABox than those of q
65/109

rewriting the query
are satisﬁed
Properties:
∙ Every match for q′
can be extended to a match for q
∙ Every match for q contains a match for q′
∙ The answers of q and q′
coincide, but the relevant matches for q′
are closer to the ABox than those of q
We repeatedly apply this rewriting step to obtain a set of queries
whose relevant matches range over ABox individuals.
65/109

example of a rewriting step (1/2)
q(y, x) = ∃z, z′
.serves(x, y) ∧ hasIngred(y, z) ∧ hasIngred(z, z′
) ∧ Spicy(z′
)
We have IK |=π q(b, r) with π(x) = r, π(y) = b, π(z) = e3, π(z′
) = e4
r
π(x)
p
PizzaCalab
b
PenneArrab
π(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauceπ(z)
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
66/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
) = e4
r
π(x)
p
PizzaCalab
b
PenneArrab
π(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauceπ(z)
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
• Choose z′
as ‘leaf’
• Choose ArrabSauce ⊑ ∃hasIngred.Spicy ∈ sat(T )
• RHS ensures hasIngred(z, z′
), Spicy(z′
)
• We replace these atoms by ArrabSauce(z)
66/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
) = e4
r
π(x)
p
PizzaCalab
b
PenneArrab
π(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauceπ(z)
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
• Choose z′
as ‘leaf’
• Choose ArrabSauce ⊑ ∃hasIngred.Spicy ∈ sat(T )
• RHS ensures hasIngred(z, z′
), Spicy(z′
)
• We replace these atoms by ArrabSauce(z)
q′
(y, x) = ∃z.serves(x, y) ∧ hasIngred(y, z) ∧ ArrabSauce(z)
66/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
q′
67/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
q′
IK |=π q(b, r) IK |=π′ q′
(b, r)
r
π(x),π′(x)
p
PizzaCalab
b
PenneArrab
π(y),π′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauceπ(z) ,π′(z)
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
67/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
q′
(b, r)
r
π(x),π′(x)
p
PizzaCalab
b
PenneArrab
π(y),π′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
depth(π) > depth(π′
)
67/109

another rewriting step (1/2)
q′
IK |=π′ q′
(b, r)
r
π′(x)
p
PizzaCalab
b
PenneArrab
π′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauceπ′(z)
e4
Peperonc, Spicy
serves serves
hasIngred
68/109

q′
IK |=π′ q′
(b, r)
r
π′(x)
p
PizzaCalab
b
PenneArrab
π′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e4
Peperonc, Spicy
serves serves
hasIngred
• Choose z as leaf
• Choose PenneArrab ⊑ ∃hasIngred.ArrabSauce
• RHS yields hasIngred(y, z) and ArrabSauce(z)
• We replace these atoms by PenneArrab(y)
68/109

q′
IK |=π′ q′
(b, r)
r
π′(x)
p
PizzaCalab
b
PenneArrab
π′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e4
Peperonc, Spicy
serves serves
hasIngred
• Choose z as leaf
• Choose PenneArrab ⊑ ∃hasIngred.ArrabSauce
• RHS yields hasIngred(y, z) and ArrabSauce(z)
• We replace these atoms by PenneArrab(y)
q′′
(y, x) = serves(x, y) ∧ PenneArrab(y)
68/109

q(y, x) = ∃z, z′
serves(x, y) ∧ hasIngred(y, z) ∧ hasIngred(z, z′
) ∧ Spicy(z′
)
q′
q′′
69/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
q′
q′′
(b, r) IK |=π′′ q′′
(b, r)
69/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
q′
q′′
(b, r) IK |=π′′ q′′
(b, r)
r
π(x),π′(x),π′′(x)
p
PizzaCalab
b
PenneArrab
π(y),π′(y),π′′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
69/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
q′
q′′
(b, r) IK |=π′′ q′′
(b, r)
r
π(x),π′(x),π′′(x)
p
PizzaCalab
b
PenneArrab
π(y),π′(y),π′′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
) > depth(π′′
)
69/109

q(y, x) = ∃z, z′
) ∧ Spicy(z′
)
q′
q′′
(b, r) IK |=π′′ q′′
(b, r)
r
π(x),π′(x),π′′(x)
p
PizzaCalab
b
PenneArrab
π(y),π′(y),π′′(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e4
Peperonc, Spicy
π(z′)
serves serves
hasIngred
) > depth(π′′
)
In π′′
all variables are mapped to individuals
69/109

decision procedure
Theorem For every satisﬁable ELHI⊥KB K = (T , A), and CQ q(⃗x):
⃗a ∈ cert(q, K) iff IK |=π q′
(⃗a) for some q′
∈ rewT (q) and some π that
maps all variables to individuals in A.
70/109

decision procedure
There is a bounded number of such restricted matches π
Checking if π is match reduces to linearly many instance checks
70/109

decision procedure
There is a bounded number of such restricted matches π
Checking if π is match reduces to linearly many instance checks
Yields terminating, sound, and complete CQ answering procedure
70/109

complexity of cq answering
Combined complexity:
sat(T ) and rewT (q) can be constructed in single exponential time
single exponential bound on candidate matches π
instance checking in single exponential time
71/109

Data complexity:
sat(T ) and rewT (q) are ABox independent
polynomial bound on candidate matches π
Instance checking in polynomial time
71/109

Data complexity:
sat(T ) and rewT (q) are ABox independent
polynomial bound on candidate matches π
Instance checking in polynomial time
Theorem CQ answering in ELHI⊥and Horn-SHIQ is Exp-complete
in combined complexity and P-complete in data complexity.
71/109

optimal bounds for lightweight dls
Adapting our technique gives optimal bounds for lightweight DLs:
For ELH and DL-LiteR we get NP in combined complexity:
∙ compute sat(T ) in polynomial time
∙ non-deterministically build the right q′
∈ rewT (q)
∙ guess a candidate π
∙ check if it is a match ⇝ instance checking in polynomial time
CQ answering is NP-hard over ABox alone seen as DB (no TBox)
72/109

∈ rewT (q)
For EL in data complexity, yields P membership
⇝ optimal since instance queries already P-hard
72/109

∈ rewT (q)
In DL-LiteR, can get a FO rewriting ⇝ in AC0 for data complexity.
72/109

∈ rewT (q)
In DL-LiteR, can get a FO rewriting ⇝ in AC0 for data complexity.
Theorem CQ answering in ELH and DL-LiteR is NP-complete in
combined complexity. For ELH the data complexity is P-complete,
and for DL-LiteR the data complexity is in AC0. 72/109

datalog rewriting
Our procedure yields a Datalog rewriting:
∙ rewT (q) is a UCQ ⇝ translate into set of Datalog rules Πrew(q)
∙ use Q in head of rules
∙ the program Π(T ) (from earlier) computes all entailed ABox
assertions
73/109

datalog rewriting
Our procedure yields a Datalog rewriting:
∙ rewT (q) is a UCQ ⇝ translate into set of Datalog rules Πrew(q)
∙ use Q in head of rules
∙ the program Π(T ) (from earlier) computes all entailed ABox
assertions
(Πrew(q) ∪ Π(T ), Q)
is a Datalog rewriting of q w.r.t. T relative to consistent ABoxes
73/109

combined approach for cqs in elhi
Alternative: combined approach (saturation and rewriting)
74/109

Know that it sufﬁces to evaluate the UCQ rewT (q) over the set of
ABox assertions entailed from the KB K
74/109

Also know: assertions entailed from K = assertions in sat(K)
74/109

Also know: assertions entailed from K = assertions in sat(K)
Materialize assertions in sat(K) and view result as database
+ only need to evaluate a UCQ
can use standard relational database systems
– materializing not always convenient
saturation needs to be updated if data changes
74/109

an fo rewriting approach for cqs in dl-lite
For DL-LiteR we can generate an FO-rewriting as follows.
Replace in all q′
∈ rewT (q) each atom by its FO-rewriting for
instance checking:
75/109

an fo rewriting approach for cqs in dl-lite
For DL-LiteR we can generate an FO-rewriting as follows.
Replace in all q′
∈ rewT (q) each atom by its FO-rewriting for
instance checking:
∙ replace each A(t) by RewriteIQ(A, T )
∙ replace each r(t, t′
) by RewriteIQ(r, T )
Resulting FO formula:
∙ positive, can be transformed into a UCQ
∙ rewriting relative to consistent ABoxes
∙ can be combined with a rewriting of unsatisﬁability
∙ yields AC0 upper bound in data complexity
75/109

a glimpse beyond
Other Horn DLs
∙ Similar results hold for other dialects of DL-Lite and EL
∙ Sometimes complexity increases, e.g., EL with complex role
inclusions
∙ The P data and Exp combined upper bounds extend to even more
expressive Horn DLs, like Horn-SHOIQ
76/109

a glimpse beyond
Other Horn DLs
∙ Similar results hold for other dialects of DL-Lite and EL
∙ Sometimes complexity increases, e.g., EL with complex role
inclusions
∙ The P data and Exp combined upper bounds extend to even more
expressive Horn DLs, like Horn-SHOIQ
Beyond Horn DLs
∙ no universal model property
∙ query answering usually exponentially harder
∙ different techniques: automata, rolling-up, resolution, etc.
∙ for some well-known DLs (e.g., SHOIQ) decidability open
76/109

complexity of answering (u)cqs
IQs CQs
data
complexity
combined
complexity
data
complexity
combined
complexity
DL-Lite
DL-LiteR
in AC0 NLogSpace in AC0 NP
EL, ELH P P P NP
ELI, ELHI⊥,
Horn-SHOIQ
P Exp P Exp
ALC,
ALCHQ
coNP Exp coNP Exp
ALCI, SH,
SHIQ
coNP Exp coNP 2Exp
SHOIQ coNP-hard coNExp coNP-hard1
coN2Exp-hard1
1
decidability open
77/109

limitations of (u)cqs
Some very natural queries are not expressible as CQs:
- ﬁnd dishes that contain something spicy
- is a a relative of b?
- is there a bus connection from X to Y?
79/109

We need navigational queries queries that can query the topology
of the graph, that is, the information stored in the connections
79/109

We need navigational queries queries that can query the topology
of the graph, that is, the information stored in the connections
Especially important for highly connected data with no ﬁxed schema
∙ social, biological, and chemical networks
79/109

navigational queries
Prominent navigational query languages:
80/109

Regular Path Queries (RPQs): ﬁnd pairs of objects that are connected
by a chain of roles that comply with a given regular language
(hasCourse ∪ courseOf−
) · (hasIngred ∪ ingredOf
−
)∗
· Spicy?(x, y)
80/109

Regular Path Queries (RPQs): ﬁnd pairs of objects that are connected
by a chain of roles that comply with a given regular language
(hasCourse ∪ courseOf−
) · (hasIngred ∪ ingredOf
−
)∗
· Spicy?(x, y)
Conjunctive RPQs: allow to join RPQs conjunctively
∙ similar to CQs, but each atom is an RPQ
∙ extend CQs with the navigational power of RPQs
q(x, x′
) = ∃y, z. serves · Menu? · (hasMain ∪ hasStarter)(x, y) ∧
serves · Menu? · (hasCourse ∪ courseOf−
)(x′
, y) ∧
(hasIngred ∪ ingredOf
−
)∗
· Spicy?(y, z)
Both languages have 1-way and 2-way variants
80/109

our most expressive navigational queries: c2rpqs
Recall: N±
R contains all role names and their inverses.
A conjunctive two-way regular path query (C2RPQ) has the form
q(⃗x) = ∃⃗y.
∧
L(t, t′
) ∧
∧
A(t)
where A is a concept name
t, t′
are variables or individuals (in NI ∪⃗x ∪⃗y)
L is regular language over N±
R ∪ {A? | A ∈ NC}
81/109

our most expressive navigational queries: c2rpqs
Recall: N±
R contains all role names and their inverses.
A conjunctive two-way regular path query (C2RPQ) has the form
q(⃗x) = ∃⃗y.
∧
L(t, t′
) ∧
∧
A(t)
where A is a concept name
t, t′
are variables or individuals (in NI ∪⃗x ∪⃗y)
L is regular language over N±
R ∪ {A? | A ∈ NC}
Regular languages can be given as:
∙ regular expressions E → r ∈ N±
R | A? | r · r | r ∪ r | r∗
∙ non-deterministic ﬁnite automata NFA
Recall: RegExps and NFAs are equivalent, but NFAs are more succinct
81/109

other navigational query languages
Conjunctive (one-way) regular path queries (CRPQs) disallow inverses ⇝
regular expressions use only (direct) role names
q(x, x′
) = ∃y, z.serves · Menu?hasCourse(x, y) ∧
serves · Menu? · hasCourse(x′
, y) ∧ hasIngred∗
· Spicy?(y, z)
q(x) = ∃y.hasIngred∗
· Spicy?(x, y)
82/109

q(x, x′
· Spicy?(y, z)
· Spicy?(x, y)
Two-way regular path queries (2RPQs) have only one atom and no
existential variables ⇝ both variables are answer variables
q(x, y) = (hasIngred ∪ ingredOf−
)∗
· Spicy?(x, y)
)∗
· Spicy? · Σ∗
(x, y)
82/109

q(x, x′
· Spicy?(y, z)
· Spicy?(x, y)
Two-way regular path queries (2RPQs) have only one atom and no
existential variables ⇝ both variables are answer variables
)∗
· Spicy?(x, y)
)∗
· Spicy? · Σ∗
(x, y)
(One-way) Regular path queries (RPQs) are 2RPQs with no inverses
⇝ all of the restrictions above
q(x, y) = hasIngred∗
· Spicy?(x, y)
q(x, y) = hasCourse · hasIngred∗
· Spicy?(x, y)
82/109

semantics of c2rpqs
Satisfaction of atoms L(t, t′
):
(d, d′
) ∈ LI
if there is an L-path from d to d′
, i.e.,
∙ a sequence e0e1 . . . en objects from ∆I
with e0 = d and en = d′
∙ a word u1u2 . . . un ∈ L over N±
R ∪ {A? | A ∈ NC}
such that, for every 1 ≤ i ≤ n:
∙ if ui = A?, then ei−1 = ei ∈ AI
∙ if ui = R ∈ N±
R , then (ei−1, ei) ∈ RI
83/109

semantics of c2rpqs
):
(d, d′
) ∈ LI
, i.e.,
R ∪ {A? | A ∈ NC}
∙ if ui = R ∈ N±
Match: mapping π from terms to elements that satisﬁes all atoms
As before: I |=π q(⃗a) if match π maps answer variables to ⃗a
83/109

semantics of c2rpqs
):
(d, d′
) ∈ LI
, i.e.,
R ∪ {A? | A ∈ NC}
∙ if ui = R ∈ N±
Match: mapping π from terms to elements that satisfies all atoms
As before: I |=π q(⃗a) if match π maps answer variables to ⃗a
Certain answers defined as for CQs
Again suffices to find match in canonical model
83/109

answering 2rpqs
We focus on answering 2PRQs: one atom, no existential variables
84/109

answering 2rpqs
Bound on matches ranging over individuals only
84/109

answering 2rpqs
Bound on matches ranging over individuals only
Challenge: paths may need to go deep into the canonical model
q(x, y) = serves · (hasIngred ∪ ingredOf
−
)∗
· Spicy? · Σ∗
(x, y)
r
π(x)
p
PizzaCalab
b
PenneArrabπ(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauce
e4
Peperonc, Spicy
serves serves
hasIngred
84/109

loops through the anonymous part
Goal: compact representation of all ways in which paths through
the anonymous part can participate in matches
85/109

s0 s1 sf
serves
hasIngred
ingredOf
−
Spicy?
Σ∗
We use NFA representation
We write M ∈ Loopα[s, s′
] iff a ∈ MIK
implies the existence of a path
p below a that takes the NFA α from s to s′
, e.g.,
85/109

s0 s1 sf
serves
hasIngred
ingredOf
−
Spicy?
Σ∗
We use NFA representation
We write M ∈ Loopα[s, s′
] iff a ∈ MIK
implies the existence of a path
p below a that takes the NFA α from s to s′
, e.g.,
PenneArrab ∈ Loopα[s1, sf]
because of
85/109

computing the loop table
We can explicitly compute the full table Loopα inductively:
86/109

computing the loop table
We can explicitly compute the full table Loopα inductively:
if s is a state then Loopα[s, s] = NC
if M1 ∈ Loopα[s1, s2] and
M2 ∈ Loopα[s2, s3]
then M1 ⊓ M2 ∈ Loopα[s1, s3]
if T |= C1 ⊓ · · · ⊓ Cn ⊑ A and
(s1, A?, s2) ∈ δ
then C1 ⊓ · · · ⊓ Cn ∈ Loopα[s1, s2]
if T |= C1 ⊓ · · · ⊓ Cn ⊑ ∃R.D,
T |= R ⊑ R′
, T |= R ⊑ R′′
,
(s1, R′
, s2) ∈ δ,
D ∈ Loopα[s2, s3], and
(s3, R′′−
, s4) ∈ δ
then C1 ⊓ · · · ⊓ Cn ∈ Loopα[s1, s4]
86/109

computing loops: an example
s0 s1 sf
serves
hasIngred
ingredOf
−
Spicy?
Σ∗ rp
PizzaCalab
b
PenneArrab
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauce
e4
Peperonc, Spicy
serves serves
hasIngred
87/109

s0 s1 sf
serves
hasIngred
ingredOf
−
Spicy?
Σ∗ rp
PizzaCalab
b
PenneArrab
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauce
e4
Peperonc, Spicy
serves serves
hasIngred
∙ Peperonc ∈ Loopα[s1, sf] because (s1, Spicy?, sf) ∈ δ and
Peperonc ⊑ Spicy
87/109

s0 s1 sf
serves
hasIngred
ingredOf
−
Spicy?
Σ∗ rp
PizzaCalab
b
PenneArrab
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauce
e4
Peperonc, Spicy
serves serves
hasIngred
Peperonc ⊑ Spicy
∙ ArrabSauce ∈ Loopα[s1, sf] because
(s1, hasIngred, s1), (sf, hasIngred−
, sf) ∈ δ and
Peperonc ∈ Loopα[s1, sf]
87/109

s0 s1 sf
serves
hasIngred
ingredOf
−
Spicy?
Σ∗ rp
PizzaCalab
b
PenneArrab
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauce
e4
Peperonc, Spicy
serves serves
hasIngred
Peperonc ⊑ Spicy
∙ ArrabSauce ∈ Loopα[s1, sf] because
, sf) ∈ δ and
Peperonc ∈ Loopα[s1, sf]
∙ PenneArrab ∈ Loopα[s1, sf] because
, sf) ∈ δ and
ArrabSauce ∈ Loopα[s1, sf]
87/109

evaluation 2rpqs using the loop table
Non-deterministic algorithm to decide (a, b) ∈ cert(α(x, y), K)
Input: NFA α = (S, Σ, δ, s0, F), KB K = (T , A), (a, b) from A
88/109

evaluation 2rpqs using the loop table
Non-deterministic algorithm to decide (a, b) ∈ cert(α(x, y), K)
Input: NFA α = (S, Σ, δ, s0, F), KB K = (T , A), (a, b) from A
∙ After checking consistency, we start from (a, s0)
∙ At pair (c, s), guess new pair (d, s′
) together with one of:
∙ transition (s, σ, s′
) a σ-step from c to d in ABox
⇝ check if (c, d) ∈ σI
∙ concepts M in Loopα[s, s′
] stay at same individual, and jump to s′
⇝ check if c = d ∈ MI
∙ Exit when we get pair (b, sf)
∙ Use counter to ensure termination (only need to consider each
pair once)
88/109

evaluation algorithm
Algorithm EvalAtom
Input: NFA α = (S, Σ, δ, s0, F) with Σ ⊆ N±
R ∪ {A? | A ∈ NC}, ELHI⊥ KB
(T , A), (a, b) ∈ Ind(A) × Ind(A)
1. Test whether (T , A) is satisﬁable, output yes if not.
2. Initialize current = (a, s0) and count = 0. Set max = |A| · |S| + 1.
3. While count < max and current ̸∈ {(b, sf) | sf ∈ F}
3.1 Let current = (c, s).
3.2 Guess a pair (d, s′
) ∈ Ind(A) × S and either (s, σ, s′
) ∈ δ or
M ∈ Loopα[s, s′
].
3.3 If (s, σ, s′
) was guessed
∙ If σ ∈ N±
R , then verify that T , A |= σ(c, d), and return no if not.
∙ If σ = A?, then verify that c = d and T , A |= A(c), and return no if not.
3.4 If M was guessed, then verify that c = d and that T , A |= B(c) for every
concept name B ∈ M, and return no if not.
3.5 Set current = (d, s′
) and increment count.
4. If current = (b, sf) for some sf ∈ F, return yes. Else return no.
89/109

evaluation algorithm: example (1/2)
q(x, y) = serves · (hasIngred ∪ ingredOf−
)∗
· Spicy? · Σ∗
(x, y)
PenneArrab ⊑ PastaDish ⊓ ∃hasIngred.ArrabSauce
PastaDish ⊑ Dish ⊓ ∃hasIngred.Pasta
Peperonc ⊔ ∃hasIngred.Spicy ⊑ Spicy
Spicy ⊓ Dish ⊑ SpicyDish
r
π(x)
p
PizzaCalab
b
PenneArrabπ(y)
e1
Nduja, Spicy
e2
Penne, Pasta
e3 ArrabSauce
e4
Peperonc, Spicy
serves serves
hasIngred
90/109

)∗
· Spicy? · Σ∗
(x, y)
s0 s1 sf
serves
hasIngred
ingredOf−
Spicy?
Σ∗Peperonc ∈ Loopα[s1, sf]
91/109

)∗
· Spicy? · Σ∗
(x, y)
s0 s1 sf
serves
hasIngred
ingredOf−
Spicy?
Σ∗Peperonc ∈ Loopα[s1, sf]
count: 0 1 2
Guess (r, s0) (b, s1) (b, sf)
(s0, serves, s1) ∈ δ PenneArrab ∈ Loopα[s1, sf]
Test (r, b) ∈ servesI
b ∈ PenneArrabI
return yes
91/109

complexity of the algorithm
Theorem (a, b) ∈ cert(q, K) iff there is some execution of
EvalAtom(α, K, (a, b)) that returns yes.
92/109

∙ Iterations bounded by counter (poly. counter ⇝ log space)
∙ We need calls to procedures for:
satisﬁability instance checking membership in Loopα table
∙ These calls are in Exp for ELHI⊥
Loopα computation: polynomially many iterations
each one tests entailment
92/109

Exp upper bound for ELHI⊥ (combined complexity)
92/109

For ELH and DL-Lite, we can obtain P upper bound (combined)
92/109

For ELH and DL-Lite, we can obtain P upper bound (combined)
Data complexity:
∙ Loopα computation in constant time (ABox independent)
∙ called procedures in P for ELH and NLogSpace for DL-LiteR
92/109

complexity bounds
Theorem
∙ For ELHI⊥, 2RPQ answering is Exp-complete in combined
complexity and P-complete in data complexity
∙ For DL-LiteR and ELH, the combined complexity drops to
P-complete
∙ In data complexity, the problem is NLogSpace-complete for
DL-LiteR, and P-complete for ELH.
93/109

complexity bounds
Theorem
P-complete
Most matching lower bounds from simpler problems:
93/109

complexity bounds
Theorem
P-complete
∙ instance checking
∙ graph reachability = RPQ over plain ABox
93/109

complexity bounds
Theorem
P-complete
∙ instance checking
∙ graph reachability = RPQ over plain ABox
P-hardness for DL-LiteR non-trivial
93/109

answering c2rpqs
For answering C2RPQs, we combine the ideas above:
∙ rewrite the query so that matches ranging over individuals sufﬁce
∙ in each step, consider possibly deeper paths with Loopαtable
After rewriting, guess matches using individuals only and check
them using EvalAtom on each atom
94/109

answering c2rpqs
This algorithm works for all DLs discussed and gives optimal
complexity bounds
94/109

answering c2rpqs
This algorithm works for all DLs discussed and gives optimal
complexity bounds
Answering C2RPQs is not much harder:
∙ Combined complexity increases to PSpace for DL-LiteR and ELH
∙ but most other bounds are the same as for RPQs and CQs
∙ even for very expressive DLs that are not Horn
94/109

final remarks on navigational queries
Navigational queries provide more querying power at moderate
computational cost
Expressible in Datalog, but computationally better behaved
Good alternative to CQs, gaining increasing attention
Property paths in SPARQL
∙ included in the SPARQL 1.1 standard
∙ add regular paths as in C2RPQs
Ongoing quest for more ﬂexible navigational languages
95/109

complexity of answering (c)(2)rpqs
2RPQs C2RPQs
data
complexity
combined
complexity
data
complexity
combined
complexity
DL-Lite
DL-LiteR
NLogSpace P NLogSpace PSpace
EL, ELH P P P PSpace
ELI, ELHI⊥,
Horn-SHOIQ
P Exp P Exp
ALC,
ALCHQ
coNP Exp coNP-hard 2Exp
ALCI, SH,
SHIQ
coNP Exp coNP-hard 2Exp
coN2Exp-hard1
1
decidability open 96/109

comparison: complexity of answering (u)cqs
IQs CQs
data
complexity
combined
complexity
data
complexity
combined
complexity
DL-Lite
DL-LiteR
in AC0 NLogSpace in AC0 NP
EL, ELH P P P NP
ELI, ELHI⊥,
Horn-SHOIQ
P Exp P Exp
ALC,
ALCHQ
coNP Exp coNP Exp
ALCI, SH,
SHIQ
coNP Exp coNP 2Exp
coN2Exp-hard1
1
decidability open
97/109

queries with negation or re-
cursion

queries with negation
Conjunctive query with safe negation (CQ¬s
):
∙ like a CQ, but can also have negated atoms
∙ safety condition: every variable occurs in some positive atom
∙ example: ﬁnd menus whose main course is not spicy
∃y Menu(x) ∧ hasMain(x, y) ∧ ¬Spicy(y)
99/109

):
Conjunctive query with inequalities (CQ̸=
)
∙ like a CQ, but can also have atoms t1 ̸= t2 (t1, t2 vars or individuals)
∙ example: ﬁnd restaurant offering two menus having different
dessert courses
∃y1y2z1z2 offers(x, y1) ∧ Menu(y1) ∧ hasDessert(y1, z1)∧
offers(x, y2) ∧ Menu(y2) ∧ hasDessert(y2, z2) ∧ z1 ̸= z2
∙ example: ﬁnd menus with at least three courses (see notes)
99/109

):
Conjunctive query with inequalities (CQ̸=
)
∙ like a CQ, but can also have atoms t1 ̸= t2 (t1, t2 vars or individuals)
∙ example: find restaurant offering two menus having different
dessert courses
∃y1y2z1z2 offers(x, y1) ∧ Menu(y1) ∧ hasDessert(y1, z1)∧
offers(x, y2) ∧ Menu(y2) ∧ hasDessert(y2, z2) ∧ z1 ̸= z2
∙ example: find menus with at least three courses (see notes)
Note: can define UCQ¬s
s and UCQ̸=
s in the obvious way
99/109

undecidability results for queries with negation
Adding negation leads to undecidability even in very restricted
settings.
Theorem The following problems are undecidable:
∙ CQ¬s
answering in DL-LiteR
∙ UCQ¬s
answering in EL⊥
∙ CQ̸=
∙ CQ̸=
answering in EL⊥
100/109

undecidability results for queries with negation
Adding negation leads to undecidability even in very restricted
settings.
Theorem The following problems are undecidable:
∙ CQ¬s
∙ UCQ¬s
answering in EL⊥
∙ CQ̸=
∙ CQ̸=
answering in EL⊥
Possible solution: adopt alternative semantics (see lecture notes)
100/109

undecidability results for queries with recursion
Signiﬁcant interest in combining DLs with Datalog rules
Unfortunately, this almost always leads to undecidability:
Theorem Datalog query answering is undecidable in every DL that
can express (directly or indirectly) A ⊑ ∃r.A
In particular: undecidable in both DL-Lite and EL
101/109

undecidability results for queries with recursion
Signiﬁcant interest in combining DLs with Datalog rules
Unfortunately, this almost always leads to undecidability:
Theorem Datalog query answering is undecidable in every DL that
can express (directly or indirectly) A ⊑ ∃r.A
In particular: undecidable in both DL-Lite and EL
Possible solutions:
∙ use restricted classes of Datalog queries (e.g. path queries)
∙ DL-safe rules: can only apply rules to (named) individuals
101/109

efficient omqa
Lots of work on developing and implementing efﬁcient OMQA
algorithms
Focus mostly on DL-Lite (and related dialects):
∙ First algorithm PerfectRef proposed in mid-2000’s
∙ Rewrites into UCQs, implemented in Quonto
∙ Improved versions proposed in Requiem, Presto, Rapid, …
∙ Some algorithms rewrite into positive existential queries or
Datalog programs instead of UCQs
∙ Resulting queries are smaller, can be easier to evaluate
103/109

optimizations and omqa beyond dl-lite
Tractable classes, fragments of lower complexity
Rewriting engines for other Horn DLs also developed, e.g.,
∙ Requiem and the related Kyrie cover several EL dialects
∙ Clipper, and recently Rapid cover Horn-SHIQ
They usually rewrite into Datalog programs
104/109

understanding rewritability
Much attention devoted to understanding the limits of rewritability
and size of rewritings
When are polynomial rewritings possible?
Can we give bounds on the size of rewritings?
Which non-DL-Lite ontologies can be rewritten into FO-queries?
⇝ related to non-uniform complexity:
∙ study speciﬁc pairs (q, T ), called ontology-mediated queries
105/109

combined approaches
Saturate the ABox using the TBox axioms
⇝ a ﬁnite version of the canonical model
and then evaluate the query over the saturated ABox
Two approaches:
∙ modify the query before evaluation to ensure soundess, or
∙ evaluate and then ﬁlter unsound answers
First proposed for EL, then also for DL-Lite
Extended to other dialects, richer DLs
106/109

querying existing relational data using mappings
Today: assumed data given as ABox assertions (unary + binary facts)
Problem: how to query existing relational data (arbitrary arity)?
Solution: use mapping that speciﬁes relationship between the
database relations and the concepts / roles in DL vocabulary
Formally: mapping assertions of the form φ → ψ where:
∙ φ is an query formulated using DB relations
∙ ψ is a query in the DL vocabulary
Global-as-view (GAV) mappings: φ CQ, ψ atom (no quantiﬁers)
Handling mappings:
∙ apply mappings to generate ABox, proceed as usual
∙ virtual ABox: unfolding step to get rewriting over DB relations
107/109

other research topics (non-exhaustive)
Beyond classical OMQA
∙ inconsistency-tolerant query answering
∙ probabilistic query answering
∙ privacy-aware query answering
∙ temporal query answering
Support for building and maintaining OMQA systems
∙ module extraction
∙ ontology evolution
∙ query inseparability and emptiness
Improving the usability of OMQA systems
∙ interfaces and support for query formulation
∙ explaining query (non-)answers
108/109

Ontology-mediated query answering with data-tractable description logics

More Related Content

What's hot

Viewers also liked

Similar to Ontology-mediated query answering with data-tractable description logics

Recently uploaded

Ontology-mediated query answering with data-tractable description logics