1. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Learning Commonalities in RDF
and SPARQL
Sara El Hassad
Supervisors
François Goasdoué Hélène Jaudoin
2nd February 2018
1/63
Introduction
Semantic web
"I have a dream for the Web [...], collaborations extend to computers. Machines become capable of analyzing all the data on the Web - the content, links, and transactions between people and computers. A "Semantic Web," which should make this possible [...]." [Berners-Lee and Fischetti, 1999]
Tim Berners-Lee
Introduction
Q: How can we find commonalities between datasets and between queries?
Introduction
Least general generalization (lgg)
Introduced in machine learning in the early 1970s by Gordon Plotkin
Learning from examples
Learning concept descriptions from instances
Studied in knowledge representation in the early 1990s
More recently studied for the Semantic Web [Lehmann and Bühmann, 2011], [Colucci et al., 2016], [Petrova et al., 2017]
Introduction
Applications of lgg of datasets
Searching for common patterns across datasets
Linked Data Cloud: links between datasets
Social network context: lgg of user descriptions (profiles)
Applications of lgg of queries
Clustering user queries to help recommend similar and complementary searches
Query optimization: identifying candidate views or potential query sharing
Social network context: lgg of user searches
Introduction
Contributions
To study the problem of computing an lgg in the setting of the entire RDF standard. Published in [ESWC17] and presented in [ILP17].
To define and to compute an lgg
An lgg in RDF always exists
To provide algorithms to compute lggs of small-to-huge RDF graphs
To report on experiments using DBpedia
To study the problem of computing an lgg in the setting of the entire conjunctive fragment of SPARQL. Published in [ESWC17], [ISWC17] and presented in [BDA17].
To define and to compute an lgg
An lgg in SPARQL may not exist
To report on experiments using LUBM and DBpedia
Outline
1 Introduction
2 Preliminaries
3 Learning commonalities in RDF
Defining lgg in RDF
Computing lgg in RDF
Experimentation
4 Learning commonalities in SPARQL
Defining lgg in SPARQL
Computing lgg in SPARQL
Experimentation
5 Related work
6 Conclusion
Towards defining lgg in RDF
G. Plotkin
A least general generalization (lgg) of n descriptions d1, . . . , dn is a most specific description d generalizing every di, 1 ≤ i ≤ n, for some generalization/specialization relation between descriptions.
lgg in RDF
descriptions are RDF graphs
the generalization/specialization relation is entailment between RDF graphs
lgg in our SPARQL setting
descriptions are BGP queries
the generalization/specialization relation is entailment between BGP queries
RDF graphs
Specification of RDF graphs with triples:
(s, p, o) ∈ (U ∪ B) × U × (U ∪ L ∪ B), drawn as an edge from s to o labelled p
Built-in property URIs to state RDF statements
RDF statement        Triple
Class assertion      (s, rdf:type, o)
Property assertion   (s, p, o) with p ≠ rdf:type
[Figure: example RDF graph with the triples (b, τ, ConfPaper), (b, hasTitle, "LGG in RDF") and (b, hasContactAuthor, b1), where τ abbreviates rdf:type]
Adding ontological knowledge to RDF graphs
Built-in property URIs to state RDF Schema statements, i.e., ontological constraints.
RDFS statement    Triple
Subclass          (s, sc, o)
Subproperty       (s, sp, o)
Domain typing     (s, ←d, o)
Range typing      (s, →r, o)
[Figure: the example RDF graph extended with the RDFS constraints (ConfPaper, sc, Publication), (hasContactAuthor, sp, hasAuthor), (hasAuthor, ←d, Publication) and (hasAuthor, →r, Researcher)]
Deriving the implicit triples
[Figure: RDF graph G with its implicit triples added one by one: (b, τ, Publication), (b, hasAuthor, b1), (b1, τ, Researcher), (hasContactAuthor, →r, Researcher) and (hasContactAuthor, ←d, Publication)]
How to derive the implicit triples of an RDF graph?
Materializing implicit triples using rules
Entailment rules applied to the RDF graph G, each followed by the implicit triple it derives:
rdfs9 : (s, sc, o), (s1, τ, s) → (s1, τ, o)
(ConfPaper, sc, Publication), (b, τ, ConfPaper) → (b, τ, Publication)
rdfs7 : (p1, sp, p2), (s, p1, o) → (s, p2, o)
(hasContactAuthor, sp, hasAuthor), (b, hasContactAuthor, b1) → (b, hasAuthor, b1)
rdfs3 : (p, →r, o), (s1, p, o1) → (o1, τ, o)
(hasAuthor, →r, Researcher), (b, hasAuthor, b1) → (b1, τ, Researcher)
ext4 : (p, sp, p1), (p1, →r, o) → (p, →r, o)
(hasContactAuthor, sp, hasAuthor), (hasAuthor, →r, Researcher) → (hasContactAuthor, →r, Researcher)
ext3 : (p, sp, p1), (p1, ←d, o) → (p, ←d, o)
(hasContactAuthor, sp, hasAuthor), (hasAuthor, ←d, Publication) → (hasContactAuthor, ←d, Publication)
[Figure: RDF graph G, with the triples matched by each rule highlighted and the derived triple added]
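The rule applications above can be sketched as a small fixpoint loop. This is only an illustration under stated assumptions: triples are Python tuples of strings, blank nodes carry a `_:` prefix, and the names `saturate`, `G`, `G_sat` and `implicit` are hypothetical. Only the five rules shown on the slides are implemented, not the full RDFS rule set.

```python
# Hypothetical shorthand for the built-in properties used on the slides.
TYPE, SC, SP, DOM, RNG = "τ", "sc", "sp", "←d", "→r"

def saturate(graph):
    """Apply rdfs9, rdfs7, rdfs3, ext4 and ext3 until no new triple appears."""
    g = set(graph)
    while True:
        new = set()
        for (s, p, o) in g:
            for (s1, p1, o1) in g:
                if p == SC and p1 == TYPE and o1 == s:   # rdfs9
                    new.add((s1, TYPE, o))
                if p == SP and p1 == s:                  # rdfs7
                    new.add((s1, o, o1))
                if p == RNG and p1 == s:                 # rdfs3
                    new.add((o1, TYPE, o))
                if p == SP and p1 == RNG and s1 == o:    # ext4
                    new.add((s, RNG, o1))
                if p == SP and p1 == DOM and s1 == o:    # ext3
                    new.add((s, DOM, o1))
        if new <= g:          # fixpoint reached: nothing new was derived
            return g
        g |= new

# The running example graph G (blank nodes b and b1 written _:b and _:b1).
G = {
    ("_:b", TYPE, "ConfPaper"),
    ("_:b", "hasTitle", '"LGG in RDF"'),
    ("_:b", "hasContactAuthor", "_:b1"),
    ("ConfPaper", SC, "Publication"),
    ("hasContactAuthor", SP, "hasAuthor"),
    ("hasAuthor", DOM, "Publication"),
    ("hasAuthor", RNG, "Researcher"),
}

G_sat = saturate(G)
implicit = G_sat - G   # the triples that were only implicit in G
```

The quadratic pairing of triples keeps the sketch short; it is not meant to reflect how saturation is implemented for large graphs.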
Semantics of RDF graphs
[Figure: saturated RDF graph G∞, i.e., G together with all its implicit triples]
The saturated graph G∞ materializes the semantics of G.
Entailment between RDF graphs
Let G and G′ be two RDF graphs and R a set of RDF entailment rules.
Entailment between graphs is the relation used to compare G and G′.
G is more specific than G′:
G |=R G′ ⇐⇒ G∞ |= G′
That is, there must exist an embedding of G′ into G∞.
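The embedding test can be sketched as a backtracking search that maps the blank nodes of G′ to terms of G∞. This is a minimal sketch under assumptions: triples are tuples of strings, blank nodes carry a `_:` prefix, the saturated graph `G_sat` of the running example is written out by hand, and the names `is_blank` and `embeds` are hypothetical.

```python
def is_blank(term):
    """Assumed convention: blank nodes are strings starting with '_:'."""
    return term.startswith("_:")

def embeds(pattern, target, mapping=None):
    """True if the triples of `pattern` map into `target`, sending each
    blank node of `pattern` to one fixed term and keeping URIs/literals."""
    mapping = dict(mapping or {})
    pattern = list(pattern)
    if not pattern:
        return True
    (s, p, o), rest = pattern[0], pattern[1:]
    for candidate in target:
        trial = dict(mapping)
        ok = True
        for term, image in zip((s, p, o), candidate):
            if is_blank(term):
                # Bind the blank node, or check an earlier binding.
                if trial.setdefault(term, image) != image:
                    ok = False
                    break
            elif term != image:
                ok = False
                break
        if ok and embeds(rest, target, trial):
            return True
    return False

# Saturation of the running example graph, listed explicitly.
G_sat = {
    ("_:b", "τ", "ConfPaper"), ("_:b", "τ", "Publication"),
    ("_:b", "hasTitle", '"LGG in RDF"'),
    ("_:b", "hasContactAuthor", "_:b1"), ("_:b", "hasAuthor", "_:b1"),
    ("_:b1", "τ", "Researcher"),
    ("ConfPaper", "sc", "Publication"),
    ("hasContactAuthor", "sp", "hasAuthor"),
    ("hasAuthor", "←d", "Publication"), ("hasAuthor", "→r", "Researcher"),
    ("hasContactAuthor", "←d", "Publication"),
    ("hasContactAuthor", "→r", "Researcher"),
}

# G′: some publication that has some title.
G_prime = [("_:b", "τ", "Publication"), ("_:b", "hasTitle", "_:b2")]
entailed = embeds(G_prime, G_sat)
```

Here `entailed` is true because `_:b2` can be mapped to the literal "LGG in RDF"; a pattern asking for a `JourPaper` would fail, since no such triple exists in G∞.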
Entailment between RDF graphs
Does G |=R G′ hold? Equivalently, does G∞ |= G′ hold?
[Figure: the saturated graph G∞ next to G′ = {(b, τ, Publication), (b, hasTitle, b2)}; G′ embeds into G∞ by mapping its blank node b2 to the literal "LGG in RDF"]
G∞ |= G′ holds, so RDF graph G is more specific than RDF graph G′.
Towards defining lgg in SPARQL
G. Plotkin
A least general generalization (lgg) of n descriptions d1, . . . , dn is a most specific description d generalizing every di, 1 ≤ i ≤ n, for some generalization/specialization relation between descriptions.
lgg in RDF
descriptions are RDF graphs
the generalization/specialization relation is entailment between RDF graphs
lgg in our SPARQL setting
descriptions are BGP queries
the generalization/specialization relation is entailment between BGP queries
Basic graph pattern queries (BGPQ)
BGPQ: the conjunctive fragment of SPARQL queries; it is the counterpart of select-project-join queries in databases.
Triple patterns: (s, p, o) ∈ (V ∪ U) × (V ∪ U) × (V ∪ U ∪ L), where V is a set of variables
[Figure: sample BGPQ q1(x1) with body(q1) = {(x1, τ, ConfPaper), (x1, hasContactAuthor, y1)}]
Entailing and answering queries
Query entailment
G |=R q ⇐⇒ G∞ |= q
[Figure: BGPQ q(x1, x2) with body {(x1, τ, x2)} next to the saturated graph G∞; q is entailed, e.g., by mapping x1 to b and x2 to Publication]
Query answering
q(G) = {(x̄)φ | G |=R^φ body(q)}
where φ maps the variables of body(q) to terms of G∞ and (x̄)φ is the image of the answer variables x̄.
[Figure: the answers of q(x1, x2) on G are found one by one in G∞: (b, Publication), (b, ConfPaper) and (b1, Researcher)]
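Query answering as defined above can be sketched by enumerating every assignment φ of the body variables into the saturated graph. This is an illustrative sketch only: variables are assumed to be strings starting with `?`, the saturated graph `G_sat` is written out by hand, and `bind` and `answers` are hypothetical helper names.

```python
def bind(term, value, phi):
    """Match one query term against one graph term, extending phi in place."""
    if term.startswith("?"):                      # a variable
        return phi.setdefault(term, value) == value
    return term == value                          # a URI or literal

def answers(head, body, graph):
    """Images of the answer variables `head` under every assignment
    that maps all triple patterns of `body` into `graph`."""
    results = set()

    def extend(remaining, phi):
        if not remaining:
            results.add(tuple(phi[v] for v in head))
            return
        pattern, rest = remaining[0], remaining[1:]
        for triple in graph:
            trial = dict(phi)                     # work on a copy per candidate
            if all(bind(t, v, trial) for t, v in zip(pattern, triple)):
                extend(rest, trial)

    extend(list(body), {})
    return results

# Saturation of the running example graph, listed explicitly.
G_sat = {
    ("_:b", "τ", "ConfPaper"), ("_:b", "τ", "Publication"),
    ("_:b", "hasTitle", '"LGG in RDF"'),
    ("_:b", "hasContactAuthor", "_:b1"), ("_:b", "hasAuthor", "_:b1"),
    ("_:b1", "τ", "Researcher"),
    ("ConfPaper", "sc", "Publication"),
    ("hasContactAuthor", "sp", "hasAuthor"),
    ("hasAuthor", "←d", "Publication"), ("hasAuthor", "→r", "Researcher"),
    ("hasContactAuthor", "←d", "Publication"),
    ("hasContactAuthor", "→r", "Researcher"),
}

# q(x1, x2) with body {(x1, τ, x2)}, evaluated against G∞.
result = answers(("?x1", "?x2"), [("?x1", "τ", "?x2")], G_sat)
```

Evaluating against G∞ rather than G is what makes the implicit answers (b, Publication) and (b1, Researcher) appear alongside the explicit one (b, ConfPaper).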
Entailment between BGPQs
q |=R q′ ⇐⇒ q∞ |= q′
[Figure: saturated query q∞(x1), over the terms x1, y1, z1, SA and Publication and the properties τ, hasAuthor, hasContactAuthor and title, next to query q′(x2) with body {(x2, τ, Publication), (x2, hasAuthor, y2)}; q′ embeds into q∞ by mapping x2 to x1 and y2 to y1, onto the triples (x1, τ, Publication) and (x1, hasAuthor, y1)]
Outline
1 Introduction
2 Preliminaries
3 Learning commonalities in RDF
4 Learning commonalities in SPARQL
5 Related work
6 Conclusion
Defining the lgg of RDF graphs
Definition (lgg of RDF graphs)
Let G1, . . . , Gn be RDF graphs and R a set of RDF entailment rules.
A generalization of G1, . . . , Gn is an RDF graph Gg such that
Gi |=R Gg holds for 1 ≤ i ≤ n.
A least general generalization (lgg) of G1, . . . , Gn is a generalization
Glgg of G1, . . . , Gn such that for any other generalization Gg of
G1, . . . , Gn, Glgg |=R Gg holds.
Theorem
An lgg of RDF graphs always exists ; it is unique up to entailment.
Result: lgg of n RDF graphs vs lgg of two RDF graphs
ℓ3(G1, G2, G3) ≡R ℓ2(ℓ2(G1, G2), G3)
· · ·
ℓn(G1, . . . , Gn) ≡R ℓ2(ℓn−1(G1, . . . , Gn−1), Gn)
≡R ℓ2(ℓ2(· · · ℓ2(ℓ2(G1, G2), G3) · · · , Gn−1), Gn)
where ℓk denotes an lgg of k RDF graphs.
We therefore focus on computing an lgg of two RDF graphs.
Defining the lgg of RDF graphs
[Figure: RDF graphs G1, describing i1 (a ConfPaper titled "DiD"), and G2, describing i2 (a JourPaper titled "CwFOL"), with the authors SA and VV typed as Researcher, the properties hasAuthor, hasContactAuthor and title, and Publication as superclass; below them, an lgg Glgg over the blank nodes b1, b2, b3 and the shared terms SA, Researcher and Publication, with the properties title, τ, hasAuthor and sc]
How to compute this graph?
The cover graph of RDF graphs
Definition (Cover graph)
The cover graph G of two RDF graphs G1 and G2 is the RDF graph such that for every property p in both G1 and G2:
(t1, p, t2) ∈ G1 and (t3, p, t4) ∈ G2 iff (ς(t1, t3), p, ς(t2, t4)) ∈ G
with ς(t1, t3) = t1 if t1 = t3 and t1 ∈ U ∪ L, else ς(t1, t3) is the blank node b_t1t3, and, similarly, ς(t2, t4) = t2 if t2 = t4 and t2 ∈ U ∪ L, else ς(t2, t4) is the blank node b_t2t4.
Example
(i1, hasAuthor, SA) ∈ G1 and (i2, hasAuthor, SA) ∈ G2 yield (b_i1i2, hasAuthor, SA) ∈ G
(i1, hasAuthor, SA) ∈ G1 and (i2, hasContactAuthor, SA) ∈ G2 have different properties, so (b_i1i2, b_hAhCA, SA) ∉ G
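The definition can be read as a filtered product of the two triple sets. The sketch below is one possible rendering, assuming triples are tuples of strings, blank nodes carry a `_:` prefix, and terms contain neither `|` nor `]`; the blank-node naming scheme `_:b[t1|t3]` and the names `sigma` and `cover_graph` are hypothetical.

```python
def is_blank(term):
    """Assumed convention: blank nodes are strings starting with '_:'."""
    return term.startswith("_:")

def sigma(t1, t3):
    """ς: keep a URI or literal shared by both sides, otherwise
    return the blank node that stands for the pair (t1, t3)."""
    if t1 == t3 and not is_blank(t1):
        return t1
    return f"_:b[{t1}|{t3}]"

def cover_graph(g1, g2):
    """Pair every triple of g1 with every triple of g2 that has the
    same property, generalizing subjects and objects with sigma."""
    return {
        (sigma(s1, s2), p1, sigma(o1, o2))
        for (s1, p1, o1) in g1
        for (s2, p2, o2) in g2
        if p1 == p2
    }

# Fragments of the example graphs.
G1 = {("i1", "hasAuthor", "SA"), ("SA", "τ", "Researcher")}
G2 = {
    ("i2", "hasAuthor", "SA"),
    ("i2", "hasContactAuthor", "SA"),
    ("VV", "τ", "Researcher"),
}

cover = cover_graph(G1, G2)
```

The same pair of terms always yields the same blank node, which is what lets generalized triples stay connected to each other in the cover graph.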
Properties of cover graph
Theorem
The cover graph G of the RDF graphs G1 and G2 is an lgg of them for
the empty set R of RDF entailment rules (i.e., R = ∅).
Proposition
The cover graph of two RDF graphs G1 and G2 can be computed in
O(|G1| × |G2|) ; its size is bounded by |G1| × |G2|.
Cover graph-based lgg
Theorem
Let G1 and G2 be two RDF graphs, and R a set of RDF entailment rules.
The cover graph G of G1∞ and G2∞ is an lgg of G1 and G2.
Proposition
An lgg of two RDF graphs G1 and G2 can be computed in O(|G1∞| × |G2∞|) and its size is bounded by |G1∞| × |G2∞|.
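Put together, the theorem suggests a two-step procedure: saturate both graphs, then take their cover graph; the pairwise decomposition stated earlier then extends it to n graphs by folding. The following is a rough sketch under assumptions, not the authors' implementation: it uses only the five rules shown on the slides, a hypothetical blank-node naming scheme, and illustrative names (`saturate`, `cover_graph`, `lgg2`, `lgg`).

```python
from functools import reduce

TYPE, SC, SP, DOM, RNG = "τ", "sc", "sp", "←d", "→r"

def saturate(graph):
    """Fixpoint of the slide rules rdfs9, rdfs7, rdfs3, ext4 and ext3."""
    g = set(graph)
    while True:
        new = set()
        for (s, p, o) in g:
            for (s1, p1, o1) in g:
                if p == SC and p1 == TYPE and o1 == s:   # rdfs9
                    new.add((s1, TYPE, o))
                if p == SP and p1 == s:                  # rdfs7
                    new.add((s1, o, o1))
                if p == RNG and p1 == s:                 # rdfs3
                    new.add((o1, TYPE, o))
                if p == SP and p1 == RNG and s1 == o:    # ext4
                    new.add((s, RNG, o1))
                if p == SP and p1 == DOM and s1 == o:    # ext3
                    new.add((s, DOM, o1))
        if new <= g:
            return g
        g |= new

def sigma(t1, t2):
    """Keep a shared URI/literal, else name a blank node after the pair."""
    if t1 == t2 and not t1.startswith("_:"):
        return t1
    return f"_:b[{t1}|{t2}]"

def cover_graph(g1, g2):
    return {(sigma(s1, s2), p1, sigma(o1, o2))
            for (s1, p1, o1) in g1 for (s2, p2, o2) in g2 if p1 == p2}

def lgg2(g1, g2):
    """Cover graph of the two saturations."""
    return cover_graph(saturate(g1), saturate(g2))

def lgg(graphs):
    """Fold lgg2 over n graphs, following the pairwise decomposition."""
    return reduce(lgg2, graphs)

G1 = {("i1", TYPE, "ConfPaper"), ("ConfPaper", SC, "Publication")}
G2 = {("i2", TYPE, "JourPaper"), ("JourPaper", SC, "Publication")}

plain = cover_graph(G1, G2)   # lgg for R = ∅
with_rules = lgg2(G1, G2)     # lgg taking the entailment rules into account
```

On this toy input, `with_rules` contains the triple stating that the generalized paper is a Publication, which `plain` misses; that difference is the kind of precision gain the experiments measure.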
Cover graph-based lgg of RDF graphs
[Figure: step-by-step construction of the cover graph of G1∞ and G2∞. Pairs of triples sharing a property yield, in turn, (b_i1i2, hasAuthor, b_SAVV), (b_i1i2, hasAuthor, SA), (b_SAVV, τ, Researcher) and (b_i1i2, title, b_DC). The complete cover graph is built over the nodes b_i1i2, b_SAVV, b_PR, SA, Researcher, b_i1SA, b_CPR, b_i1VV, b_DC, b_PJP, b_CPP, b_CPJP, Publication, b_RP, b_SAi2 and b_RJP, with the properties title, τ, hasAuthor and sc]
Experimentation : RDF graphs (DBPedia)
For every graph Gi, 1 ≤ i ≤ n: Gi |=R G^R_lgg |=R Glgg
(G^R_lgg is the lgg computed on saturated graphs; Glgg is the lgg computed without saturation)
Goal
To show how much more precise lggs are when entailment between RDF graphs (|=R) is used instead of just simple entailment (|=).
DBpedia graph Gi, 1 ≤ i ≤ 9    G1     G2     G3     G4     G5     G6     G7     G8     G9
Shape of Gi                    Graph  Graph  Tree   Graph  Graph  Graph  Graph  Graph  Graph
|Gi \ O_DBpedia|               4      5      4      6      6      4      6      5      5
|Gi∞ \ O∞_DBpedia|             16     17     19     22     23     19     25     21     24
Table: Characteristics of our test graphs (top) and of their saturations (bottom); times are in ms. |O_DBpedia| = 397 and |O∞_DBpedia| = 1,067.
Experimentation : lgg of RDF graphs (DBPedia)
lgg of 2 DBpedia graphs
Graphs    Time to compute Glgg    Time to compute G^R_lgg    Gain in precision
G1G2      5                       8                          42.46
G1G3      5                       9                          47.70
G1G4      4                       11                         49.15
G2G4      5                       15                         47.86
G3G4      5                       13                         61.12
G5G6      5                       8                          2.17
G5G9      6                       14                         2.17
G6G7      6                       8                          19.89
G6G8      5                       8                          0
G7G8      5                       9                          19.14
G8G9      5                       10                         13.39
Table: Characteristics of cover graph-based lggs of 2 test graphs, with or without saturation w.r.t. DBpedia RDFS constraints; times are in ms.
lgg of 3 DBpedia graphs
Graphs    Time to compute Glgg    Time to compute G^R_lgg    Gain in precision
G1G2G3    10                      30                         47.48
G1G2G4    10                      41                         50.41
G1G3G4    11                      39                         48.46
G2G3G4    12                      41                         51.65
G5G6G7    11                      33                         19.06
G5G6G8    12                      42                         1.44
G5G6G9    11                      37                         19.39
G6G7G8    11                      49                         29.13
G7G8G9    11                      52                         16.92
Table: Characteristics of cover graph-based lggs of 3 test graphs, with or without saturation w.r.t. DBpedia RDFS constraints; times are in ms.
Outline
1 Introduction
2 Preliminaries
3 Learning commonalities in RDF
4 Learning commonalities in SPARQL
5 Related work
6 Conclusion
Defining the lgg of queries
lgg of BGPQs
Let q1, . . . , qn be BGPQs with the same arity and R a set of RDF
entailment rules.
A generalization of q1, . . . , qn is a BGPQ qg such that qi |=R qg for
1 ≤ i ≤ n.
A least general generalization of q1, . . . , qn is a generalization qlgg of
q1, . . . , qn such that for any other generalization qg of q1, . . . , qn :
qlgg |=R qg .
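[Figure: q1(x1) with body {(x1, τ, ConfPaper), (x1, hasContactAuthor, y1)}; q2(x2) with body {(x2, τ, JourPaper), (x2, hasAuthor, y2)}; their lgg qlgg(x) with body {(x, τ, y)}; and their lgg w.r.t. the background knowledge O, qlggO(x), with body {(x, τ, Publication), (x, hasAuthor, y), (y, τ, Researcher)}]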
35/63
76. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Enriching queries w.r.t. background knowledge
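Figure: the ontology O (RDFS constraints over the classes Publication, Researcher, ConfPaper and JourPaper and the properties hasAuthor and hasContactAuthor, with edges labelled sc, sp, ←d and →r) and the saturations of q1 and q2 w.r.t. O.
Saturation of q1(x1) w.r.t. O: (x1, τ, ConfPaper), (x1, hasContactAuthor, y1), (x1, τ, Publication), (x1, hasAuthor, y1), (y1, τ, Researcher)
Saturation of q2(x2) w.r.t. O: (x2, τ, JourPaper), (x2, hasAuthor, y2), (x2, τ, Publication), (y2, τ, Researcher)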
36/63
77. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Saturation of queries
BGPQ saturation w.r.t. RDFS constraints
Let R be a set of RDF entailment rules, O a set of RDFS statements,
and q a BGPQ. The saturation of q w.r.t. O, noted $q^{\infty}_{O}$, is the BGPQ
with the same answer variables as q and whose body, noted $body(q^{\infty}_{O})$, is
the maximal subset of $(body(q) \cup O)^{\infty}$ such that for any of its subsets
S: if O |=R S holds then body(q) |=R S holds.
37/63
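The saturation step can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the ontology triples are an assumed encoding of the running example, the four rules are an illustrative instance-level subset of R, variables are written with a leading "?", and the filter simply drops triples that follow from O alone unless they already occur in the query body (a simplification of the maximal-subset condition).

```python
TYPE, SC, SP, DOM, RNG = "type", "sc", "sp", "domain", "range"

# Assumed encoding of the ontology O of the running example.
ONTOLOGY = {
    ("ConfPaper", SC, "Publication"),
    ("JourPaper", SC, "Publication"),
    ("hasContactAuthor", SP, "hasAuthor"),
    ("hasAuthor", DOM, "Publication"),
    ("hasAuthor", RNG, "Researcher"),
}

def closure(triples):
    """Fixpoint of four instance-level RDFS rules (illustrative subset of R)."""
    result = set(triples)
    while True:
        new = set()
        for (s, p, o) in result:
            for (s2, p2, o2) in result:
                if p2 == SP and s2 == p:          # subproperty propagation
                    new.add((s, o2, o))
                elif p2 == DOM and s2 == p:       # domain typing of the subject
                    new.add((s, TYPE, o2))
                elif p2 == RNG and s2 == p:       # range typing of the object
                    new.add((o, TYPE, o2))
                elif p2 == SC and p == TYPE and s2 == o:  # subclass propagation
                    new.add((s, TYPE, o2))
        if new <= result:
            return result
        result |= new

def saturate(body, ontology):
    """Body of the saturated query: derived triples minus what O entails alone."""
    body = set(body)
    from_ontology_alone = closure(ontology)
    return {t for t in closure(body | set(ontology))
            if t in body or t not in from_ontology_alone}

Q1_BODY = {("?x1", TYPE, "ConfPaper"), ("?x1", "hasContactAuthor", "?y1")}
Q2_BODY = {("?x2", TYPE, "JourPaper"), ("?x2", "hasAuthor", "?y2")}

for triple in sorted(saturate(Q1_BODY, ONTOLOGY)):
    print(triple)
```

Under these assumptions the saturated body of q1 gains the hasAuthor edge and the Publication and Researcher typings, which matches the saturated queries shown on the previous slide.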
78. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Entailment relation between BGPQs w.r.t. background
knowledge
Entailment between BGPQs w.r.t. R, O
Given a set R of RDF entailment rules, a set O of RDFS statements, and
two BGPQs q1 and q2 with the same arity, q1 entails q2 w.r.t. O,
denoted q1 |=R,O q2, iff ${q_1}^{\infty}_{O} \models q_2$ holds.
Well-founded relation : q1 |=R,O q2
Query entailment : if G |=R q1 holds then G |=R q2 holds,
Query answering : q1(G) ⊆ q2(G) holds.
38/63
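A possible way to test this entailment is sketched below, assuming the usual homomorphism-based reading of simple entailment between BGPQs: q2 is entailed when its body maps into the (already saturated) body of q1 with answer variables mapped position-wise. The function names, the "?" prefix for variables and the hard-coded saturated body are assumptions made for the example.

```python
def is_var(term):
    return term.startswith("?")

def entails(head1, body1, head2, body2):
    """True if body2 maps homomorphically into body1, sending head2 to head1."""
    if len(head1) != len(head2):
        return False
    patterns = list(body2)

    def extend(mapping, pattern, triple):
        # Try to match one triple pattern of q2 against one triple pattern of q1.
        mapping = dict(mapping)
        for term, target in zip(pattern, triple):
            if is_var(term):
                if mapping.setdefault(term, target) != target:
                    return None
            elif term != target:
                return None
        return mapping

    def search(i, mapping):
        # Backtracking search over the triple patterns of q2.
        if i == len(patterns):
            return True
        for triple in body1:
            extended = extend(mapping, patterns[i], triple)
            if extended is not None and search(i + 1, extended):
                return True
        return False

    initial = {}
    for v2, v1 in zip(head2, head1):
        if initial.setdefault(v2, v1) != v1:
            return False
    return search(0, initial)

# Saturated body of q1 from the running example (assumed encoding).
Q1_SATURATED = [
    ("?x1", "type", "ConfPaper"), ("?x1", "hasContactAuthor", "?y1"),
    ("?x1", "type", "Publication"), ("?x1", "hasAuthor", "?y1"),
    ("?y1", "type", "Researcher"),
]
Q1_PLAIN = Q1_SATURATED[:2]
Q_LGG_O = [("?x", "type", "Publication"), ("?x", "hasAuthor", "?y"),
           ("?y", "type", "Researcher")]

print(entails(("?x1",), Q1_SATURATED, ("?x",), Q_LGG_O))
print(entails(("?x1",), Q1_PLAIN, ("?x",), Q_LGG_O))
```

The second call uses the unsaturated body of q1, so the Publication typing is missing and the generalization is not recognized; this is the gap that saturation w.r.t. O closes.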
79. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Defining the lgg of queries w.r.t. background knowledge
Definition (lgg of BGPQs w.r.t. RDFS constraints)
Let R be a set of RDF entailment rules, O a set of RDFS statements,
and q1, . . . , qn n BGPQs with the same arity.
A generalization of q1, . . . , qn w.r.t. O is a BGPQ qg such that
qi |=R,O qg for 1 ≤ i ≤ n.
A least general generalization of q1, . . . , qn w.r.t. O is a
generalization qlgg of q1, . . . , qn w.r.t. O such that for any other
generalization qg of q1, . . . , qn w.r.t. O: qlgg |=R,O qg.
Theorem
An lgg of BGPQs w.r.t. RDFS statements may not exist for some set of
RDF entailment rules ; when it exists, it is unique up to entailment
(|=R,O).
39/63
80. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Defining the lgg of queries w.r.t. background knowledge
Result: lgg of n BGPQs vs lgg of two BGPQs
Writing $\ell_k$ for an lgg of k BGPQs w.r.t. O:
$\ell_3(q_1, q_2, q_3) \equiv_{R,O} \ell_2(\ell_2(q_1, q_2), q_3)$
· · ·
$\ell_n(q_1, \ldots, q_n) \equiv_{R,O} \ell_2(\ell_{n-1}(q_1, \ldots, q_{n-1}), q_n)$
$\equiv_{R,O} \ell_2(\ell_2(\cdots \ell_2(\ell_2(q_1, q_2), q_3) \cdots, q_{n-1}), q_n)$
We focus on computing the lgg of two BGPQs.
40/63
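The reduction of the n-ary case to the binary case amounts to a left fold. The sketch below assumes some binary operator lgg2 is available; here it is replaced by a hypothetical stand-in that only records the nesting, so that the shape of the computation is visible.

```python
from functools import reduce

def lgg_of_many(queries, lgg2):
    """Fold a binary lgg operator over a non-empty sequence of queries."""
    return reduce(lgg2, queries)

def symbolic_lgg2(a, b):
    # Hypothetical stand-in: records the call instead of computing an lgg.
    return f"l2({a}, {b})"

print(lgg_of_many(["q1", "q2", "q3", "q4"], symbolic_lgg2))
# prints: l2(l2(l2(q1, q2), q3), q4)
```

Any real binary lgg procedure with the same two-argument shape could be passed in place of symbolic_lgg2.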
83. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Defining the lgg of queries
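Figure: the queries q1 and q2, the ontology O (edges labelled sc, sp, ←d and →r over Publication, Researcher, ConfPaper, JourPaper, hasAuthor and hasContactAuthor) and the lgg of q1 and q2 w.r.t. O.
q1(x1) ← (x1, τ, ConfPaper), (x1, hasContactAuthor, y1)
q2(x2) ← (x2, τ, JourPaper), (x2, hasAuthor, y2)
qlggO(x) ← (x, τ, Publication), (x, hasAuthor, y), (y, τ, Researcher)
How to compute this query?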
41/63
85. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
The cover of SPARQL queries
Definition (Cover query)
Let q1, q2 be two BGPQs with the same arity n.
If there exists the BGPQ q such that
$head(q_1) = q(x_1^1, \ldots, x_1^n)$ and $head(q_2) = q(x_2^1, \ldots, x_2^n)$ iff
$head(q) = q(v_{x_1^1 x_2^1}, \ldots, v_{x_1^n x_2^n})$
$(t_1, t_2, t_3) \in body(q_1)$ and $(t_4, t_5, t_6) \in body(q_2)$ iff
$(\varsigma(t_1, t_4), \varsigma(t_2, t_5), \varsigma(t_3, t_6)) \in body(q)$ with, for 1 ≤ i ≤ 3,
$\varsigma(t_i, t_{i+3}) = t_i$ if $t_i = t_{i+3}$ and $t_i \in U \cup L$, otherwise $\varsigma(t_i, t_{i+3})$ is the
variable $v_{t_i t_{i+3}}$
then q is the cover query of q1, q2.
Example
(i1, hasAuthor, SA) ∈ body(q1) and
(i2, hasContactAuthor, SA) ∈ body(q2) iff $(b_{i_1 i_2}, b_{hA\,hCA}, SA) \in body(q)$
42/63
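The construction in the definition can be sketched as a pairwise product of the two bodies. This is an illustrative sketch only: the variable naming scheme "?v[a|b]" is an assumption chosen to keep ς injective, variables carry a leading "?", and the sketch does not check the conditions under which the cover query exists.

```python
from itertools import product

def is_var(term):
    return term.startswith("?")

def generalize(t1, t2):
    """ς: keep a shared URI or literal, otherwise a variable named after the pair."""
    if t1 == t2 and not is_var(t1):
        return t1
    return f"?v[{t1}|{t2}]"

def cover_query(head1, body1, head2, body2):
    """Head and body of the cover of two BGPQs given as (head, body) pairs."""
    if len(head1) != len(head2):
        raise ValueError("queries must have the same arity")
    head = tuple(generalize(a, b) for a, b in zip(head1, head2))
    body = {
        tuple(generalize(a, b) for a, b in zip(tp1, tp2))
        for tp1, tp2 in product(body1, body2)
    }
    return head, body

head, body = cover_query(
    ("?i1",), [("?i1", "hasAuthor", "SA")],
    ("?i2",), [("?i2", "hasContactAuthor", "SA")],
)
print(head, body)
```

On the slide's example, the shared value SA is kept, while the subjects and the two distinct properties are each replaced by a fresh variable.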
86. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Cover query-based lgg
Theorem
Given a set R of RDF entailment rules, a set O of RDFS statements and
two BGPQs q1, q2 with the same arity,
1 the cover query q of ${q_1}^{\infty}_{O}$, ${q_2}^{\infty}_{O}$ exists iff an lgg of q1, q2 w.r.t. O
exists;
2 the cover query q of ${q_1}^{\infty}_{O}$, ${q_2}^{\infty}_{O}$ is an lgg of q1, q2 w.r.t. O.
Proposition
A cover query-based lgg of two BGPQs q1 and q2 is computed in
$O(|body({q_1}^{\infty}_{O})| \times |body({q_2}^{\infty}_{O})|)$ and its size is
$|body({q_1}^{\infty}_{O})| \times |body({q_2}^{\infty}_{O})|$.
43/63
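As a worked instance of the size bound, assume the saturations of the running example's q1 and q2 contain 5 and 4 triple patterns respectively (the counts read off the saturated queries shown earlier). The cover query then has:

```latex
\[
|body(q)| \;=\; |body({q_1}^{\infty}_{O})| \times |body({q_2}^{\infty}_{O})|
        \;=\; 5 \times 4 \;=\; 20 \text{ triple patterns.}
\]
```

The lgg drawn for the same example has only 3 triple patterns, which suggests that cover query-based lggs can carry redundant triple patterns.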
92. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Experimentation : BGPQs (DBPedia)
$q^{O_{DBpedia}}_{lgg} \models_R q_{lgg}$
Goal
To show how much more precise lggs are when entailment between
BGPQs w.r.t. background knowledge (|=R,O) is used instead of
simple entailment (|=).
DBpedia query Qi (1 ≤ i ≤ 8)                 Q1     Q2     Q3      Q4     Q5     Q6     Q7     Q8
Qi's shape                                   tree   tree   tree    graph  graph  graph  graph  graph
|body(Qi)|                                   4      6      4       6      4      6      6      6
Number of URI/variable occurrences in Qi     7/5    9/9    5/7     7/11   5/7    9/9    9/9    9/9
|Qi(GDBpedia)|                               77     0      41,695  13     6      0      1      0
$|body({Q_i}^{\infty}_{O_{DBpedia}})|$       16     19     19      23     16     23     23     23
Table: Characteristics of our test BGPQs (top) and of their saturations
w.r.t. DBpedia constraints (bottom).
45/63
93. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Experimentation : lgg of BGPQs (DBPedia)
lgg of 2 DBpedia BGPQs:                         Q1Q2      Q1Q3        Q1Q4        Q2Q3        Q4Q5    Q5Q6    Q5Q7    Q7Q8
Time to compute qlgg (ms)                       3         3           5           4           4       5       6       5
|qlgg(GDBpedia)|                                477,455   34,747,102  34,901,117  34,747,102  1,977   1,221   35      70
Time to compute $q^{O_{DBpedia}}_{lgg}$ (ms)    13        14          14          15          15      14      17      18
$|q^{O_{DBpedia}}_{lgg}(G_{DBpedia})|$          10,637    7,874,768   456,690     4,537,824   1,701   780     34      36
Gain in precision (%)                           97.77     77.33       98.69       86.94       13.96   36.11   2.85    48.57
Table: Characteristics of cover query-based lggs of test queries, w/ or w/o
using the DBpedia RDFS constraints; times are in ms.
lgg of 3 DBpedia BGPQs:                         Q1Q2Q3      Q1Q2Q4      Q1Q3Q4      Q2Q3Q4      Q4Q7Q8  Q5Q7Q8  Q6Q7Q8
Time to compute qlgg (ms)                       5           4           5           6           10      11      12
|qlgg(GDBpedia)|                                34,747,102  34,901,117  34,901,117  34,901,117  70      1,977   4,969
Time to compute $q^{O_{DBpedia}}_{lgg}$ (ms)    19          20          20          24          27      27      33
$|q^{O_{DBpedia}}_{lgg}(G_{DBpedia})|$          7,874,768   615,339     7,874,779   4,537,824   36      1,701   335
Gain in precision (%)                           77.33       98.23       77.43       86.99       48.57   13.96   93.25
Table: Characteristics of cover query-based lggs of 3 test queries, w/ or w/o
using the DBpedia RDFS constraints; times are in ms.
46/63
94. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Outline
1 Introduction
2 Preliminaries
3 Learning commonalities in RDF
Defining lgg in RDF
Computing lgg in RDF
Experimentation
4 Learning commonalities in SPARQL
Defining lgg in SPARQL
Computing lgg in SPARQL
Experimentation
5 Related work
6 Conclusion
47/63
95. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Related work
Structural approaches
Description Logics
- [Baader et al., 1999].
- [Zarrieß and Turhan, 2013].
RDF : Rooted graphs, ignore RDF entailment :
- [Colucci et al., 2016].
SPARQL : tree queries, ignore RDF entailment :
- [Lehmann and Bühmann, 2011].
Approaches independent of the structure
First Order Clauses
- [Plotkin, 1970].
- [Nienhuys-Cheng and de Wolf, 1996].
Conceptual Graphs
- [Chein and Mugnier, 2009].
48/63
96. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Conclusion & Perspectives
We revisited the problem of computing a least general generalization
in the entire setting of RDF and of SPARQL conjunctive queries.
We defined a new entailment relationship between BGPQs
w.r.t. background knowledge.
We provided algorithms to compute lggs of queries and of small-to-huge RDF
graphs:
Memory
Data management system
MapReduce
We studied the added value of considering entailment rules when
learning lggs of RDF graphs, and of entailment rules plus an external
ontology when learning lggs of BGPQs, using synthetic LUBM data
and real DBpedia data.
49/63
97. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Conclusion & Perspectives
Perspectives
Redundancy elimination :
to define heuristics in order to efficiently prune out as much as
possible redundant triples from our cover graph/query-based lggs.
Learning commonalities in DL-Lite:
to study the problem of learning lggs in the setting of DL-Lite_R,
which underpins the OWL2 QL profile of the Web Ontology
Language, the other Semantic Web data model by W3C.
Analogy and analogical proportion :
Analogy : Green Algae is like Cray Fish
Analogical proportion : Green Algae is to Bretagne Bay as Cray Fish
is to Aquitaine River
50/63
99. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
References I
[Baader et al., 1999] Baader, F., Küsters, R., and Molitor, R. (1999).
Computing least common subsumers in description logics with existential restrictions.
In IJCAI.
[Berners-Lee and Fischetti, 1999] Berners-Lee, T. and Fischetti, M. (1999).
Weaving the Web : The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor.
Harper San Francisco, 1st edition.
[Chein and Mugnier, 2009] Chein, M. and Mugnier, M. (2009).
Graph-based Knowledge Representation - Computational Foundations of Conceptual Graphs.
Springer.
[Colucci et al., 2016] Colucci, S., Donini, F., Giannini, S., and Sciascio, E. D. (2016).
Defining and computing least common subsumers in RDF.
J. Web Semantics, 39(0).
[El Hassad, 2015] El Hassad, S. (2015).
Interrogation par analogie dans les bases de données.
Base de Données Avancées.
Poster.
[Lehmann and Bühmann, 2011] Lehmann, J. and Bühmann, L. (2011).
Autosparql : Let users query your knowledge base.
In ESWC.
[Nienhuys-Cheng and de Wolf, 1996] Nienhuys-Cheng, S. and de Wolf, R. (1996).
Least generalizations and greatest specializations of sets of clauses.
J. Artif. Intell. Res.
[Petrova et al., 2017] Petrova, A., Sherkhonov, E., Grau, B. C., and Horrocks, I. (2017).
Entity comparison in RDF graphs.
In International Semantic Web Conference (ISWC). Springer.
[Plotkin, 1970] Plotkin, G. D. (1970).
A note on inductive generalization.
Machine Intelligence, 5.
52/63
100. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
References II
[W3C-RDFS, 2014] W3C-RDFS (2014).
RDF 1.1 semantics.
https://www.w3.org/TR/rdf11-mt/.
[Zarrieß and Turhan, 2013] Zarrieß, B. and Turhan, A. (2013).
Most specific generalizations w.r.t. general EL-TBoxes.
In IJCAI.
53/63
101. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Anti-unification of triples
Definition (Anti-unification of triples)
Let ς be an injective generalization function from
(U ∪ L ∪ B) × (U ∪ L ∪ B) to U ∪ L ∪ B that maps (i) any pair of same
input URI or literal value to itself, i.e., ς(v, v) = v, and (ii) any other pair
of values (URIs, blank nodes, literals and mix thereof) to a blank node.
The anti-unification of the two triples (t1, p, t2) and (t3, p, t4) is the
triple (ς(t1, t3), p, ς(t2, t4)).
54/63
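The anti-unification of two RDF triples can be sketched as below. The blank-node naming scheme "_:b[a|b]" is an assumption used to keep the generalization function injective, blank nodes are recognized by the "_:" prefix, and the helper returns None for triples with different properties since the definition only covers triples sharing a property.

```python
def is_blank(term):
    return term.startswith("_:")

def generalize(v1, v2):
    """ς: identical URIs or literals map to themselves, any other pair to a blank node."""
    if v1 == v2 and not is_blank(v1):
        return v1
    return f"_:b[{v1}|{v2}]"

def anti_unify(triple1, triple2):
    """Anti-unification of (t1, p, t2) and (t3, p, t4); None if the properties differ."""
    s1, p1, o1 = triple1
    s2, p2, o2 = triple2
    if p1 != p2:
        return None
    return (generalize(s1, s2), p1, generalize(o1, o2))

print(anti_unify(("i1", "hasAuthor", "SA"), ("i2", "hasAuthor", "SA")))
# prints: ('_:b[i1|i2]', 'hasAuthor', 'SA')
```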
102. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Anti-unification of triple patterns
Definition (Anti-unification of triple patterns)
Let ς be an injective generalization function from
(U ∪ L ∪ V) × (U ∪ L ∪ V) to U ∪ L ∪ V that maps (i) any pair of same
input URI or literal value to itself, i.e., ς(v, v) = v, and (ii) any other
pair of values (URIs, literals, variables and mix thereof) to a variable.
The anti-unification of the two triple patterns (t1, t2, t3) and (t4, t5, t6) is the
triple pattern (ς(t1, t4), ς(t2, t5), ς(t3, t6)).
55/63
103. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Gain in precision for lgg of graphs : metric
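For each input graph $G_i$, 1 ≤ i ≤ n: $G_i \models_R G^{R}_{lgg} \models_R G_{lgg}$
We measure the gain in precision of the lgg $G^{R}_{lgg}$ over the lgg $G_{lgg}$
w.r.t. the input RDF graph $G_i$ (1 ≤ i ≤ n), noted $\rho_i$:
$\rho_i = \dfrac{|([G^{R}_{lgg}]_{\phi_i})^{\infty}| - |([G_{lgg}]_{\phi_i})^{\infty}|}{|G_i^{\infty}|}$
We measure the gain in precision of the lgg $G^{R}_{lgg}$ over the lgg $G_{lgg}$
w.r.t. the input RDF graphs G1, . . . , Gn, noted $\rho$:
$\rho = \dfrac{\sum_{i=1}^{n} \rho_i}{n}$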
56/63
104. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Gain in precision for lgg of BGPQs : metric
$q^{O_{DBpedia}}_{lgg} \models_R q_{lgg}$
We define the gain in precision ρ that standard entailment endowed
with background knowledge (|=R,O) yields over simple entailment
(|=) w.r.t. query answering:
$\rho = 1 - \dfrac{|q^{R,O}_{lgg}(G)|}{|q_{lgg}(G)|}$
57/63
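The metric can be checked against the reported DBpedia figures with a short Python sketch. The function name is hypothetical, and expressing the result as a percentage truncated to two decimals is an assumption that appears to match the values in the experiment tables.

```python
def gain_in_precision(answers_simple, answers_with_ontology):
    """rho = 1 - |q^{R,O}_lgg(G)| / |q_lgg(G)|, as a percentage truncated to 2 decimals."""
    if answers_simple <= 0:
        raise ValueError("undefined when the simple-entailment lgg has no answers")
    # Integer arithmetic avoids floating-point noise before truncation.
    hundredths = (answers_simple - answers_with_ontology) * 10000 // answers_simple
    return hundredths / 100

# Answer counts reported for the lggs of Q1Q2, Q7Q8 and Q5Q7.
print(gain_in_precision(477455, 10637))  # 97.77
print(gain_in_precision(70, 36))         # 48.57
print(gain_in_precision(35, 34))         # 2.85
```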
105. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Our approach vs [Lehmann and Bühmann, 2011]
Lgg:                                      Q01Q02  Q01Q03  Q01Q04  Q02Q04  Q03Q04
|qlgglehman(GDBpedia)|                    62485   9818    62485   9       62485
|qlgg(GDBpedia)|                          62485   9818    62485   9       0
Gain                                      0       0       0       0       1
$|q^{O_{DBpedia}}_{lgg}(G_{DBpedia})|$    4384    9818    4014    9       0
Gain                                      92.99   0       93.57   0       1
Table: Our cover query-based lggs vs lgg of [Lehmann and Bühmann, 2011]
58/63
106. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Our approach vs [Colucci et al., 2016]
Lgg:                                G1G2   G1G3   G1G4   G1G5   G2G3   G2G4   G2G5   G3G4   G3G5   G4G5
triplesCGlgg                        4      3.5    3      3      4.5    4      4      3.5    3.5    8.5
triplesGlgg                         5.5    3.5    3      3      7      5.5    5.5    3.5    3.5    16.5
Gain                                18.75  0      0      0      32.22  18.75  18.75  0      0      48.33
triples$G^{O_{DBpedia}}_{lgg}$      11.5   13.5   13     14     12     11.5   12.5   16     15.5   18
Gain                                63.46  74.16  76.78  78.46  58.88  61.50  65     78.12  77.50  52.78
Table: Our cover query-based lggs vs lgg of [Colucci et al., 2016]
59/63
107. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Using lgg for defining analogy in SPARQL
Green Algae is like Cray Fish
60/63
108. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Using lgg for defining analogy in SPARQL
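Figure: the queries QA and QB, the query Qlgg, and the ontology O (edges labelled sc, sp, ←d and →r over the classes GreenAlgae, Crayfish, Species, BretagneBay, AquitaineRiver and ObservationZone and the properties proliferate, contaminate and invade).
QA(x1) ← (x1, τ, GreenAlgae), (y1, τ, BretagneBay), (x1, proliferate, y1)
QB(x2) ← (x2, τ, Crayfish), (y2, τ, AquitaineRiver), (x2, contaminate, y2)
Qlgg(x) ← (x, τ, Species), (y, τ, ObservationZone), (x, invade, y)
61/63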
109. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Using lgg for defining analogical proportion in SPARQL
Green Algae is to Bretagne Bay as Cray Fish is to Aquitaine
River
62/63
110. Introduction Preliminaries Learning commonalities in RDF Learning commonalities in SPARQL Related work Conclusion
Using lgg for defining analogy in SPARQL
Definition (Analogy [El Hassad, 2015])
Let QA and QB be two BGPQs with the same arity, R a set of RDF
entailment rules and O a set of RDFS statements.
QA is like QB iff there exists a BGPQ Q such that QA |=R,O Q and
QB |=R,O Q.
63/63