1. Query Answering over Contextualized
RDF/OWL Knowledge with Forall-Existential
Bridge Rules: Decidable Classes
Mathew Joseph (1, 2)
(1) DKM, FBK-IRST, Trento, Italy
(2) DISI, University of Trento, Trento, Italy
PhD Defence Presentation
Mathew Joseph Query Answering over Contextualized RDF/OWL Knowledge
2. Outline of the talk
1 Introduction
2 Quad-Systems
3 Query Answering over Quad-Systems
4 Decidable Classes of Quad-Systems
Context Acyclic Quad-Systems
Csafe, Msafe, and Safe Quad-systems
Range Restricted Quad-Systems
5 Quad-systems and Forall-Existential rules
6 Related Work
7 Conclusion
4. Contextualized Knowledge
The fact “I am giving this presentation” is only true in a
certain context.
Contextualized RDF knowledge is proliferating:
Recent releases of the Billion Triples Challenge datasets and the DBpedia datasets are all in N-Quads format.
Triple stores are increasingly moving to quad-stores: 4store, OpenLink Virtuoso, Sesame.
RDF 1.1 introduced N-Quads as an official W3C Recommendation in 2014.
The focus of the thesis work is query answering over contextualized RDF knowledge (quads).
6. Contexts: Literature review
John McCarthy, 1987: proposed contexts as a solution to the generality problem in AI.
Multi-context Systems (MCS): contexts are propositional theories, and propositional bridge rules enable interoperability.
Distributed Description Logics (DDL): contexts are description logic KBs, and bridge rules are of the form:
c : φ(x) → c' : φ'(x),
where φ(x) and φ'(x) are either both concept atoms or both role atoms.
Contextualized Knowledge Repository (CKR) - a framework
developed at DKM group. Its aims are to design and implement
effective algorithms for reasoning and query answering over
contextual knowledge.
7. Thesis Novelty/Advancement
Key Difference
The bridge rules (BRs) we consider are more expressive than the BRs in the above works: they may contain conjunctions (∧) and existential quantifiers (∃).
9. Quads and Quad-graphs
Let C be a distinguished set of URIs called context identifiers.
A quad is an expression of the form c : (s, p, o), where c ∈ C,
(s, p, o) is a triple.
A quad graph is a set of quads.
Notation:
QC is the quad-graph whose set of context identifiers is C.
For any c ∈ C, graphQC(c) = {(s, p, o) | c : (s, p, o) ∈ QC}
10. Example: Quad-graph
Example
Let C = {cWC2014, cUEFA2014, cSerieA2014}
cWC2014 - context about World cup football 2014.
cUEFA2014 - context about UEFA cup football 2014.
cSerieA2014 - context about Italian Serie A football 2014.
Quad-graph
QC =
cWC2014 : (Buffon, playsFor, Italy)
cWC2014 : (Buffon, captains, Italy)
. . .
cUEFA2014 : (Buffon, playsFor, Juventus)
. . .
cSerieA2014 : (Buffon, playsFor, Juventus)
. . .
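The quad-graph of this example can be sketched in Python as a set of four-field tuples. This is only an illustration: the names Quad and graph are hypothetical, and only the quads listed explicitly above are included.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A quad c : (s, p, o); all four fields are plain strings here."""
    c: str
    s: str
    p: str
    o: str


def graph(quad_graph, c):
    """Return graphQC(c): the set of triples asserted in context c."""
    return {(q.s, q.p, q.o) for q in quad_graph if q.c == c}


# The quads spelled out in the example (the elided ones are omitted).
Q = {
    Quad("cWC2014", "Buffon", "playsFor", "Italy"),
    Quad("cWC2014", "Buffon", "captains", "Italy"),
    Quad("cUEFA2014", "Buffon", "playsFor", "Juventus"),
    Quad("cSerieA2014", "Buffon", "playsFor", "Juventus"),
}

world_cup = graph(Q, "cWC2014")
```

Here world_cup contains the two triples about Buffon and Italy, while a context identifier that does not occur in Q yields the empty set.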
11. Quad-graph: Visualization
QC can be viewed as a family of RDF graphs
[Figure: three RDF graphs, graphQC(cWC2014), graphQC(cUEFA2014), and graphQC(cSerieA2014), over the nodes Buffon, Italy, Agnelli, Juventus, superMario, and InterMilan, with edges labelled playsFor and owns]
12. Bridge Rules (BRs)
E.g.: cUEFA2014 : (x, a, GoodPlayer), cSerieA2014 : (x, a, GoodPlayer) → cWC2014 : (x, playsFor, Italy)
A BR is an expression of the form:
∀x∀z [c1 : t1(x, z) ∧ ... ∧ cn : tn(x, z) → ∃y c'1 : t'1(x, y) ∧ ... ∧ c'm : t'm(x, y)]
The conjunction to the left of → is the body, and the conjunction to its right is the head.
c1 : t1(x, z), ..., cn : tn(x, z) are quad patterns over variable sets {x} or {z}.
c'1 : t'1(x, y), ..., c'm : t'm(x, y) are quad patterns over variable sets {x} or {y},
where a quad-pattern is a quad that allows variables at s, p, o.
Variables in x are called frontier variables.
13. Quad-Systems
Definition (Quad-System)
A quad-system QSC is defined as a pair ⟨QC, R⟩, where QC is a quad-graph, and R is a set of bridge rules.
14. Quad-System Semantics
The semantics of a quad-system QSC is defined on top of a distributed interpretation structure IC = {Ic}c∈C, where Ic = ⟨∆c, ·c⟩, for each c ∈ C, is a local interpretation structure.
local ∈ {rdf, rdfs, owl-horst, ...}
Ic |=local graphQC(c) when Ic is a local model of the triples in context c.
15. Model of a Quad-system
Definition (Model of a Quad-system (|=))
A distributed interpretation structure IC = {Ic}c∈C satisfies a quad-system QSC = ⟨QC, R⟩, in symbols IC |= QSC, iff all the following are satisfied:
1 For every c ∈ C, Ic |=local graphQC(c);
2 For every BR r ∈ R and every assignment σ : {x} ∪ {z} → ∆C, where ∆C = ⋃c∈C ∆c, if
Ic1 |=local t1(x, z)[σ], ..., Icn |=local tn(x, z)[σ],
then there exists a function σ' ⊇ σ s.t.
Ic'1 |=local t'1(x, y)[σ'], ..., Ic'm |=local t'm(x, y)[σ'].
17. Contextualized Conjunctive Queries
A Contextualized Conjunctive Query (CCQ) is an expression of
the form:
∃y c1 : t1(x, y) ∧ ... ∧ cp : tp(x, y)
where ci : ti(x, y), for i = 1, ..., p, are quad patterns over the vector of free variables x and the vector of quantified variables y.
Example
Suppose context c1 is about the Football World Cup 2014 and context c2 is about the Football Euro Cup 2012. Then the CCQ
c1 : (x, beat, Italy) ∧ c2 : (x, beat, Italy), where x is a variable,
intuitively means "Who beat Italy in both Euro Cup 2012 and World Cup 2014?".
18. Query Answering Decision Problem over QS
CCQ evaluation problem
The decision problem of determining, for any vector of constants a and any CCQ CQ(x) over a quad-system QSC, whether QSC |= CQ(a).
Distributed chase (dChase) of a quad-system
We extend the standard chase algorithm [Maier et al. 79] to our setting and call its output the distributed chase, abbreviated dChase.
The algorithm runs iteratively, for iterations i = 0, 1, ..., producing outputs dChase_0(QSC), dChase_1(QSC), ..., respectively.
dChase(QSC) = ⋃_{i∈N} dChase_i(QSC)
26. Distributed chase of a quad-system
Termination condition: If ∃i s.t. dChase_i(QSC) = dChase_{i+1}(QSC), then dChase(QSC) = dChase_i(QSC).
It might be the case that the termination condition is never satisfied and dChase is infinite, which leads to non-termination of the dChase algorithm.
So what? The same problem occurs in DLs such as DL-Lite and EL, but QA algorithms based on rewriting techniques [Calvanese et al. 2007] and combined approaches [Lutz et al., 2009] exist.
Is there an algorithm for deciding the CCQ evaluation problem for quad-systems? Ans: NO
27. Undecidability of Query Answering for QS
Theorem
The CCQ evaluation problem over quad-systems is undecidable.
Non-emptiness checking of the intersection of the languages generated by two CFGs is undecidable.
Reduction: Each production rule (PR) of the form S → S1 S2 ... Sn can be encoded as a BR of the form:
c : (x1, S1, x2), c : (x2, S2, x3), ..., c : (xn, Sn, xn+1) → c : (x1, S, xn+1)
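As a rough illustration of this encoding step only (not of the full undecidability proof), the following sketch builds the body and head of the BR for one production rule. The function name production_to_bridge_rule, the tuple layout (context, s, p, o), and the "?" prefix for variables are assumptions made for the example.

```python
def production_to_bridge_rule(lhs, rhs, context="c"):
    """Encode the production lhs -> rhs[0] ... rhs[n-1] as a (body, head) pair.

    Each grammar symbol becomes the predicate of one quad pattern, and the
    variables ?x1, ..., ?x(n+1) chain the patterns together like positions
    in a derived word.
    """
    body = [
        (context, f"?x{i}", symbol, f"?x{i + 1}")
        for i, symbol in enumerate(rhs, start=1)
    ]
    head = [(context, "?x1", lhs, f"?x{len(rhs) + 1}")]
    return body, head


# Hypothetical production S -> A B
body, head = production_to_bridge_rule("S", ["A", "B"])
```

For S → A B this gives the body c : (?x1, A, ?x2), c : (?x2, B, ?x3) and the head c : (?x1, S, ?x3).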
30. Triple generating context
For any quad-system QSC = ⟨QC, R⟩, a context c ∈ C is called a triple generating context (TGC) if there exists a BR r ∈ R with c : (s, p, o) ∈ head(r) such that s or p or o is an existentially quantified variable.
Definition (Context dependency graph)
The context dependency graph of a quad-system QSC = ⟨QC, R⟩ is a directed graph ⟨V, E⟩, where V is the set of context identifiers in C, with TGCs marked with a ∗, and E is s.t.:
for each BR r ∈ R {
for each context ci occurring in the body of r {
for each context cj occurring in the head of r {
there exists an edge from ci to cj;
}}}
33. Example
Consider a quad-system whose set of BRs R is:
c1 : (x1, x2, U1) → ∃y1 c2 : (x1, x2, y1), c3 : (x2, a, rdf:Property)
c2 : (x1, x2, z1) → c1 : (x1, x2, U1)
c3 : (x1, x2, x3) → c1 : (x1, x2, x3)
[Figure: context dependency graph over the nodes c1, c2 (marked with ∗ as a TGC), and c3]
A quad-system is said to be context acyclic (cAcyclic), iff its
context dependency graph does not contain cycles involving
TGCs.
Since the cycle (c1, c2, c1) in the quad-system contains c2
which is a TGC, the quad-system is not cAcyclic.
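The check on this example can be reproduced with a short sketch. It assumes a BR is given as a (body, head) pair of quad patterns (context, s, p, o), that variables carry a "?" prefix, and that a TGC lies on a cycle exactly when it can reach itself in the dependency graph. The function names are illustrative.

```python
def is_var(term):
    return term.startswith("?")


def dependency_graph(rules):
    """Return (edges, tgcs) of the context dependency graph of the rules."""
    edges, tgcs = set(), set()
    for body, head in rules:
        body_vars = {t for q in body for t in q[1:] if is_var(t)}
        for cj, *terms in head:
            # A head variable absent from the body is existential.
            if any(is_var(t) and t not in body_vars for t in terms):
                tgcs.add(cj)
            for ci, *_ in body:
                edges.add((ci, cj))
    return edges, tgcs


def reachable(edges, start):
    """Contexts reachable from start by following one or more edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for u, v in edges:
            if u == node and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_context_acyclic(rules):
    edges, tgcs = dependency_graph(rules)
    return all(c not in reachable(edges, c) for c in tgcs)


rules = [
    ([("c1", "?x1", "?x2", "U1")],
     [("c2", "?x1", "?x2", "?y1"), ("c3", "?x2", "a", "rdf:Property")]),
    ([("c2", "?x1", "?x2", "?z1")], [("c1", "?x1", "?x2", "U1")]),
    ([("c3", "?x1", "?x2", "?x3")], [("c1", "?x1", "?x2", "?x3")]),
]
edges, tgcs = dependency_graph(rules)
acyclic = is_context_acyclic(rules)
```

On these three BRs the only TGC is c2, the edges are c1 → c2, c1 → c3, c2 → c1, and c3 → c1, and acyclic is False because c2 reaches itself through c1. Dropping the second BR removes that cycle.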
34. Context Acyclic Quad-Systems: Complexity Results
Theorem
(i) The combined complexity of CCQ evaluation is 2EXPTIME-complete.
(ii) The data complexity of CCQ evaluation is PTIME-complete.
(i) 2EXPTIME-hardness is established by a reduction from the word problem of a double exponentially time bounded Deterministic Turing Machine.
(ii) PTIME-hardness is established by a reduction from 3HornSat, i.e. satisfiability of propositional Horn clauses with at most 3 literals.
35. 2EXPTIME-Hardness of CCQ Evaluation
[Figure: Deterministic Turing Machine (DTM), showing a tape with cells 0 0 1 1 ... and the initial state qI]
A 2EXPTIME DTM is a DTM that decides acceptance in at most a double exponential number of transitions w.r.t. the input size.
The computation also uses at most a double exponential number of cells.
Reduction of the word problem of a 2EXPTIME DTM to the CCQ evaluation problem of context acyclic quad-systems.
37. The set of constants C(dChase(QSC)) = U(dChase(QSC)) ∪ L(dChase(QSC)) ∪ B(dChase(QSC)) can be potentially infinite.
U(dChase(QSC)) ⊆ U(QSC) and L(dChase(QSC)) ⊆ L(QSC) are finite sets.
Hence, the real cause of non-finiteness is B(dChase(QSC)) and, specifically, the set of Skolem blank nodes.
Intuitively, the csafe, msafe, and safe classes restrict the structure of the Skolem blank nodes in the dChase to be DAGs of bounded depth.
Assumption: Every BR has a unique identifier.
38. Origin ruleId, Origin vector, Descendants of Skolem blank nodes
Consider the application of an assignment µ on the following BR: ri = body(ri)(x, z) → head(ri)(x, y)
[Figure: µ maps the body variables x1, ..., xp, z1, ..., zq of ri to a1, ..., ap, c1, ..., cq; applying ri then instantiates the head variables x1, ..., xp, y1, ..., yr with a1, ..., ap, _:b1, ...]
originRuleId(_:b1) = i
originVector(_:b1) = ⟨a1, ..., ap⟩
hasChild(_:b1, a1), ..., hasChild(_:b1, ap)
44. Origin ruleId, Origin vector of Skolem blank nodes
For any Skolem blank node _: b generated in the dChase by
the application of the BR ri = body(ri)(x, z) → head(ri)(x, y)
using assignment µ,
we say that
originRuleId(_: b) = i
Also, we say
originVector(_: b) = a = x[µ]
45. Origin Contexts/Descendants of Skolem blank nodes
Origin contexts
of _:b is the set of contexts in which triples containing _:b are first generated during the dChase construction. Formally,
originContexts(_:b) = {c | c : (s, p, o) ∈ dChase_i(QSC), s = _:b or p = _:b or o = _:b, and ∄ j < i with c' : (s', p', o') ∈ dChase_j(QSC), s' = _:b or p' = _:b or o' = _:b}
Descendants
Let µ be the assignment whose application generated _:b. We call any c = µ(xi), with xi ∈ x, a child of _:b, in symbols hasChild(_:b, c).
hasDescendant = hasChild+ (transitive closure)
55. Example Contd.
Consider the quad-system ⟨QC, R⟩, where QC = {c1 : (a, b, c)}.
Suppose R is the following set:
R = {
c1 : (x11, x12, z1) → c2 : (x11, x12, y1) (r1)
c2 : (z21, z22, x2) → c3 : (y21, y22, x2) (r2)
c3 : (z3, x31, x32) → c2 : (y3, x31, x32) (r3)
}
dChase_3(QSC) = {c1 : (a, b, c), c2 : (a, b, _:b1), c3 : (_:b2, _:b3, _:b1), c2 : (_:b4, _:b3, _:b1)}
dChase_4(QSC) = dChase_3(QSC)
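This example can be replayed with a small Skolem-style chase. The sketch below is an illustration under stated assumptions, not the dChase algorithm of the thesis: variables are written with a leading "?", every round applies all BRs in parallel, and a blank node is reused whenever the same BR fires with the same frontier values (that is, with the same origin ruleId and origin vector). The names match, matches, and chase are hypothetical.

```python
from itertools import count


def is_var(term):
    return term.startswith("?")


def match(pattern, quad, binding):
    """Extend binding so that pattern equals quad, or return None."""
    if pattern[0] != quad[0]:  # context identifiers must agree
        return None
    extended = dict(binding)
    for p_term, q_term in zip(pattern[1:], quad[1:]):
        if is_var(p_term):
            if extended.setdefault(p_term, q_term) != q_term:
                return None
        elif p_term != q_term:
            return None
    return extended


def matches(body, quads, binding=None):
    """Yield every assignment that maps all body patterns into quads."""
    binding = binding or {}
    if not body:
        yield binding
        return
    for quad in quads:
        extended = match(body[0], quad, binding)
        if extended is not None:
            yield from matches(body[1:], quads, extended)


def chase(quads, rules, max_rounds=50):
    """Apply the rules round by round until no new quad is produced."""
    quads = set(quads)
    fresh = count(1)
    skolem = {}  # (rule id, frontier values) -> {existential var: blank node}
    for _ in range(max_rounds):
        new = set()
        for rule_id, body, head in rules:
            body_vars = {t for q in body for t in q[1:] if is_var(t)}
            head_vars = [t for q in head for t in q[1:] if is_var(t)]
            frontier = sorted(body_vars & set(head_vars))
            existential = [v for v in dict.fromkeys(head_vars)
                           if v not in body_vars]
            for b in matches(body, sorted(quads)):
                key = (rule_id, tuple(b[v] for v in frontier))
                if key not in skolem:
                    skolem[key] = {v: f"_:b{next(fresh)}" for v in existential}
                full = {**b, **skolem[key]}
                for c, *terms in head:
                    new.add((c, *(full.get(t, t) for t in terms)))
        if new <= quads:  # fixpoint: the termination condition holds
            return quads
        quads |= new
    raise RuntimeError("no fixpoint within max_rounds")


rules = [
    ("r1", [("c1", "?x11", "?x12", "?z1")], [("c2", "?x11", "?x12", "?y1")]),
    ("r2", [("c2", "?z21", "?z22", "?x2")], [("c3", "?y21", "?y22", "?x2")]),
    ("r3", [("c3", "?z3", "?x31", "?x32")], [("c2", "?y3", "?x31", "?x32")]),
]
result = chase({("c1", "a", "b", "c")}, rules)
```

On this input the loop produces one new quad in each of the first three rounds and nothing in the fourth, so result holds the four quads listed above. The max_rounds guard is there because the chase of an arbitrary quad-system need not terminate.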
56. Descendance Graph
Descendance graph for _:b4 of the example above:
_:b4, labelled ⟨3, ⟨_:b3, _:b1⟩, {c2}⟩, with children _:b3 and _:b1
_:b3, labelled ⟨2, ⟨_:b1⟩, {c3}⟩, with child _:b1
_:b1, labelled ⟨1, ⟨a, b⟩, {c2}⟩, with children a and b
Figure: Nodes labelled with the tuple ⟨origin ruleId, origin vector, origin contexts⟩
57. Safe, Msafe, and Csafe Quad-systems
Definition (safe, msafe, csafe quad-systems)
A quad-system QSC is said to be:
unsafe iff ∃ Skolem blank nodes _:b ≠ _:b' in dChase(QSC) s.t. _:b is a descendant of _:b', with originRuleId(_:b) = originRuleId(_:b') and originVector(_:b) ≅ originVector(_:b'),
unmsafe iff ∃ Skolem blank nodes _:b ≠ _:b' in dChase(QSC) s.t. _:b is a descendant of _:b', with originRuleId(_:b) = originRuleId(_:b'),
uncsafe iff ∃ Skolem blank nodes _:b ≠ _:b' in dChase(QSC) s.t. _:b is a descendant of _:b', with originContexts(_:b) = originContexts(_:b').
A quad-system is safe (resp. msafe, resp. csafe) iff it is not
unsafe (resp. unmsafe, resp. uncsafe).
58. Safe, Msafe, and Csafe Quad-systems: Properties
Theorem
Let CACYCLIC, SAFE, MSAFE, and CSAFE denote the classes of context acyclic, safe, msafe, and csafe quad-systems, respectively. Then the following holds:
CACYCLIC ⊂ CSAFE ⊂ MSAFE ⊂ SAFE
Lemma (DAG property)
For a safe (csafe, msafe) quad-system QSC, and for any blank
node b ∈ Bsk (dChase(QSC)), its descendance graph is a DAG
with bounded depth.
59. Safe quad-systems: Properties
Theorem
For any safe/msafe/csafe quad-system, the following holds:
(i) the size of the dChase is double exponential,
(ii) the dChase can be computed in 2EXPTIME,
(iii) when the size of the bridge rules is assumed to be constant, the dChase can be computed in PTIME.
Theorem
For any safe/msafe/csafe quad-system, the following holds:
(i) The data complexity of CCQ evaluation is PTIME-complete.
(ii) The combined complexity of CCQ evaluation is 2EXPTIME-complete.
61. Range Restricted (RR) Quad-Systems
If we disallow existentially quantified variables in our bridge rules, then the resulting BRs are of the form:
∀x∀z [c1 : t1(x, z) ∧ ... ∧ cn : tn(x, z) → c'1 : t'1(x) ∧ ... ∧ c'm : t'm(x)]
Any such BR can be replaced with the following equivalent set of BRs, each of which has exactly one quad-pattern in its head:
∀x∀z [c1 : t1(x, z) ∧ ... ∧ cn : tn(x, z) → c'1 : t'1(x)]
...
∀x∀z [c1 : t1(x, z) ∧ ... ∧ cn : tn(x, z) → c'm : t'm(x)]
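The head-splitting step can be sketched as follows. The tuple layout (context, s, p, o), the "?" prefix for variables, the function name split_head, and the sample rule about players are all assumptions made for the illustration.

```python
def is_var(term):
    return term.startswith("?")


def split_head(body, head):
    """Replace one RR bridge rule by single-head rules sharing its body."""
    body_vars = {t for q in body for t in q[1:] if is_var(t)}
    for quad in head:
        # Range restriction: every head variable must occur in the body.
        if any(is_var(t) and t not in body_vars for t in quad[1:]):
            raise ValueError("existential variable in head: not an RR rule")
    return [(body, [quad]) for quad in head]


# Hypothetical RR bridge rule with two quad patterns in its head.
body = [("c1", "?x", "playsFor", "?z")]
head = [("c2", "?x", "a", "Player"), ("c3", "?x", "a", "Person")]
single_head_rules = split_head(body, head)
```

Each resulting rule keeps the full body, so the set derives the same quads as the original rule. The split is rejected when the head contains an existential variable, because the separate copies would then no longer share that variable.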
62. RR quad-systems: Computational Properties
Theorem
For any RR quad-system QSC = ⟨QC, R⟩, the following holds:
The size of dChase(QSC) is polynomial,
dChase(QSC) can be computed in EXPTIME,
When R is assumed to be of constant size, dChase(QSC) can be computed in PTIME.
Theorem
For RR quad-systems, the following holds:
(i) The combined complexity of the CCQ evaluation problem is in EXPTIME,
(ii) The data complexity of the CCQ evaluation problem is PTIME-complete. P-hardness is by a reduction from 3HornSat.
64. Restricted RR quad-systems
Restricted RR quad-system
is an RR quad-system in which the number of quad-patterns in the body of each bridge rule is less than or equal to a constant n. For instance,
for n = 1, we get linear quad-systems,
for n = 2, we get quadratic quad-systems, etc.
Theorem
For restricted RR quad-systems, the following holds:
(i) The data complexity of the CCQ evaluation problem is PTIME-complete. P-hardness is by a reduction from 3HornSat.
(ii) The combined complexity of the CCQ evaluation problem is NP-complete. NP-hardness is by a reduction from the graph coloring problem.
67. Quad-systems and Forall-Existential (∀∃) rules
A ternary ∀∃ rule is an expression of the form:
∀x∀z [P1(x, z) ∧ ... ∧ Pn(x, z) → ∃y P'1(x, y) ∧ ... ∧ P'm(x, y)],
where
Pi(x, z), 1 ≤ i ≤ n, are atoms over variables {x} or {z},
P'j(x, y), 1 ≤ j ≤ m, are atoms over variables {x} or {y},
ar(Pi) ≤ 3 and ar(P'j) ≤ 3.
68. Translation from Quad-systems to ∀∃ rules
Let τq be the translation s.t. for any quad (pattern) c : (s, p, o),
τq(c : (s, p, o)) = c(s, p, o);
For any quad-graph QC with bnodes _:b1, ..., _:bn,
τ(QC) = → ∃y1, ..., yn ⋀_{qi ∈ QC} τq(qi)[µB],
where µB = {_:bi → yi}i=1...n;
For any BR r of the form seen before,
τ(r) = ∀x∀z [τq(q1(x, z)) ∧ ... ∧ τq(qn(x, z)) → ∃y τq(q'1(x, y)) ∧ ... ∧ τq(q'm(x, y))];
For any quad-system QSC = ⟨QC, R⟩,
τ(QSC) = τ(QC) ∪ ⋃_{r∈R} τ(r).
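The first two cases of the translation can be sketched as string builders. This is a minimal illustration, assuming quads are (context, s, p, o) tuples and blank nodes are written with the "_:" prefix; the function names tau_q and tau_graph and the sample quads are hypothetical.

```python
def tau_q(quad):
    """Translate the quad c : (s, p, o) into the ternary atom c(s, p, o)."""
    c, s, p, o = quad
    return f"{c}({s}, {p}, {o})"


def tau_graph(quads):
    """Translate a quad-graph into a body-less ∀∃ rule, written as a string.

    Blank nodes are replaced by existentially quantified variables y1, y2, ...
    """
    bnodes = sorted({t for q in quads for t in q[1:] if t.startswith("_:")})
    mu_b = {b: f"y{i}" for i, b in enumerate(bnodes, start=1)}
    atoms = [
        tau_q((q[0], *(mu_b.get(t, t) for t in q[1:])))
        for q in sorted(quads)
    ]
    prefix = f"∃{', '.join(mu_b.values())} " if mu_b else ""
    return f"→ {prefix}{' ∧ '.join(atoms)}"


quads = {("c1", "a", "p", "_:b1"), ("c2", "_:b1", "q", "b")}
formula = tau_graph(quads)
```

Both occurrences of _:b1 are mapped to the same variable y1, which is what keeps the two atoms connected after the translation.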
69. Quad-system and ∀∃ rules
Theorem
Given the translation τ as defined above, for any quad-system
QSC and boolean CCQ CQ(a), QSC |= CQ(a) iff
τ(QSC) |=fol τ(CQ(a)).
Note that τ(QSC) is a ∀∃ rule set, τ(CQ(a)) is a standard
conjunctive query, and τ is a PTIME translation.
Inverse translation τ−1
Similarly, PTIME inverse translation τ−1 exists from ternary ∀∃
rules (resp. CQs) to quad-systems (resp. CCQs) s.t. for any ∀∃
ruleset P and a boolean CQ Q(), P |=fol Q() iff τ−1(P) |=
τ−1(Q()).
70. Quad-systems and Ternary ∀∃ rules
Corollary
CCQ evaluation problem of quad-systems is polynomially
equivalent to CQ evaluation problem over ternary ∀∃ rules.
This means that the well-known techniques for decidability guarantees such as Weak acyclicity (WA) [Fagin et al. 2005], Joint acyclicity (JA) [Krötzsch et al. 2011], and Model faithful acyclicity (MFA) [Cuenca Grau et al. 2013] from the discipline of ∀∃ rules are also applicable in our setting, and vice versa.
The following relations hold [Cuenca Grau et al. 2013]:
WA ⊂ JA ⊂ MFA.
What are the relations between our decidability approaches and these existing notions?
73. Quad-systems and ∀∃ rules
Theorem
1 CACYCLIC ⊂ WA,
2 If local semantics of contexts is OWL-Horst or its derivative,
then QSC is context acyclic iff τ(QSC) is weakly acyclic.
3 WA ⊈ CSAFE and CSAFE ⊈ WA,
4 JA ⊈ CSAFE and CSAFE ⊈ JA,
5 MFA ≡ MSAFE,
6 MFA ⊂ SAFE. This is important because MFA was, so far, the most expressive of the known classes with the finite chase property.
76. Related Work
∀∃ rules, Datalog± rules, TGDs
Beeri and Vardi, 1981: Proved that reasoning with TGDs is undecidable.
Deutsch et al., Fagin et al., 2003: Weakly acyclic TGDs, a decidable class for query answering. TGDs are analyzed using a dependency graph. Difference: nodes in the dependency graph are predicate positions, in place of context identifiers in our approach.
(Weakly) (Frontier) Guarded Rules: Ensure decidability using the bounded tree-width property of the underlying models (Courcelle's theorem).
Linear TGDs, Sticky TGDs: Ensure decidability using a query rewriting approach.
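78. Expressivity Landscape: Complexity of CCQ Entailment and dChase size
[Figure: expressivity landscape, summarized below]
Unrestricted quad-systems / ternary ∀∃ rules: CCQ entailment undecidable; dChase infinite
SAFE, MSAFE, MFA [Cuenca Grau et al. 2013], CSAFE, JA [Krötzsch et al. 2011], WA [Fagin et al. 2005], CACYCLIC: CCQ entailment 2EXPTIME-complete; dChase double exponential
RR: CCQ entailment in EXPTIME; dChase polynomial
Restricted RR: CCQ entailment NP-complete; dChase polynomial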
79. Data Complexity & Complexity of Recognition
Quad-System Fragment    Complexity of Recognition    Data Complexity of CCQ evaluation
Unrestricted            PTIME                        Undecidable
Safe                    2EXPTIME                     PTIME-complete
MSafe                   2EXPTIME                     PTIME-complete
CSafe                   2EXPTIME                     PTIME-complete
Context Acyclic         PTIME                        PTIME-complete
RR                      PTIME                        PTIME-complete
Restricted RR           PTIME                        PTIME-complete
80. Conclusion
We know ways to write bridge rules over quads s.t. query answering can be done with termination guarantees and reasonably efficiently.
Since SAFE ⊃ MFA, we also have new ways of writing ternary ∀∃ rules that allow for termination-guaranteed query answering.
The technique of safety can also be ported to the general ∀∃ rules setting by keeping track of origin ruleId/vector and descendants.
81. Articles and Conference Experiences
M.Joseph, G.Kuper, T. Mossakowski, L.Serafini. Query
Answering over Contextualized RDF/OWL Knowledge with
Forall-Existential Bridge Rules: Decidable Finite Extension
Classes. Semantic Web Journal (Accepted for
Publications, To Appear). IOS Press. 2015
M.Joseph, G.Kuper, L.Serafini. Query Answering over
Contextualized RDF/OWL Knowledge with
Forall-Existential Bridge Rules: Attaining Decidability using
Acyclicity. In Proceedings of International Conference in
Web Reasoning and Rule Systems (RR-2014). 2014
M.Joseph, G.Kuper, L.Serafini. Query Answering over
Contextualized RDF Knowledge with Forall-Existential
Bridge Rules: Attaining Decidability using Acyclicity. In
Proceedings of Italian Conference in Computational Logic
(CILC-2014). 2014
82. Articles and Conference Experiences: Contd
M. Joseph, L. Serafini. Simple Reasoning for Contextualized RDF Knowledge. In Proceedings of Workshop on Modular Ontologies (WOMO-2011). Ljubljana, Slovenia. 2011
A. Tamilin, B. Magnini, L. Serafini, C. Girardi, M. Joseph, R. Zanoli. Context-driven Semantic Enrichment of Italian News Archive. In Proceedings of Extended Semantic Web Conference (ESWC-2010). In-use track. 364-378. Crete, Greece. 2010
M. Joseph. A Contextualized Knowledge Framework for Semantic Web. In Proceedings of Extended Semantic Web Conference (ESWC-2010). PhD symposium track. Crete, Greece. 2010
83. THANKS
Thanks for your attention
Questions?
89. References
[Borgida, Serafini. 2003]
Borgida, A., Serafini, L.: Distributed Description Logics: Assimilating Information from Peer Sources. J. Data Semantics 1, 153–184 (2003)
[Giunchiglia and Ghidini, 2001]
Giunchiglia, F., Ghidini, C.: Local models semantics, or contextual reasoning = locality + compatibility. Artificial Intelligence 127 (2001)
[Fagin et al. 2005]
Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data Exchange: Semantics and Query Answering. Theoretical Computer Science 336(1), 89–124 (2005)
[Deutsch et al. 2008]
Deutsch, A., Nash, A., Remmel, J.: The chase revisited. In: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (PODS '08), pp. 149–158 (2008)
[Johnson and Klug, 84]
Johnson, D.S., Klug, A.C.: Testing containment of conjunctive queries under functional and inclusion dependencies. Journal of Computer and System Sciences 28, 167–189 (1984)
[Lutz et al., 2009]
Lutz, C., Toman, D., Wolter, F.: Conjunctive query answering in the description logic EL using a relational database system. In: Twenty-first International Joint Conference on Artificial Intelligence (IJCAI 09) (2009)
[Calvanese et al. 2007]
Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: The DL-Lite family. J. Autom. Reason. 39, 385–429 (2007)
[Cuenca Grau et al. 2013]
Cuenca Grau, B., Horrocks, I., Krötzsch, M., Kupke, C., Magka, D., Motik, B., Wang, Z.: Acyclicity Notions for Existential Rules and Their Application to Query Answering in Ontologies. Journal of Artificial Intelligence Research (JAIR) 47, 741–808 (2013)
[Krötzsch et al. 2011]
Krötzsch, M., Rudolph, S.: Extending decidable existential rules by joining acyclicity and guardedness. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI'11) (T. Walsh, ed.), pp. 963–968. AAAI Press/IJCAI (2011)