Learning Commonalities in RDF

Sara El Hassad François Goasdoué Hélène Jaudoin
IRISA, Univ. Rennes 1, Lannion, France
ESWC 2017 28th May - 1st June 2017

Introduction
Least general generalization (lgg)
• Introduced in machine learning in the early 1970s by Gordon Plotkin
• Studied in knowledge representation since the early 1990s
• More recently studied in the Semantic Web

Introduction
Applications of lgg
• Social context: lgg of user descriptions (profiles)
• Search for common graph patterns across datasets
• Linked Data Cloud: links between datasets
Goal
To study the lgg problem in the setting of the entire RDF standard

Outline
Introduction
The Resource Description Framework
Finding commonalities between RDF graphs
Related work
Conclusion

RDF graphs
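• RDF graphs are specified with triples:
(s, p, o) ∈ (U ∪ B) × U × (U ∪ L ∪ B)
where U is the set of URIs, B the set of blank nodes and L the set of literals; a triple is drawn as an edge labelled p from node s to node o
• Built-in property URIs are used to state RDF statements
RDF statement        Triple
Class assertion      (s, rdf:type, o)
Property assertion   (s, p, o) with p ≠ rdf:type
Figure: example RDF graph made of the triples (b, τ, ConfPaper), (b, hasTitle, "LGG in RDF") and (b, hasContactAuthor, b1), where τ abbreviates rdf:type and b, b1 are blank nodes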
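The membership condition on triples can be sketched in code. This is a minimal illustration, not part of the talk: the classes `URI`, `Blank` and `Literal` and the functions `is_well_formed` and `kind` are hypothetical names, and the sample triples only mirror the example graph.

```python
from dataclasses import dataclass

# Hypothetical term classes (assumptions), one per set U, B and L.
@dataclass(frozen=True)
class URI:
    value: str

@dataclass(frozen=True)
class Blank:
    label: str

@dataclass(frozen=True)
class Literal:
    value: str

RDF_TYPE = URI("rdf:type")  # written τ on the slides

def is_well_formed(triple):
    """Check (s, p, o) ∈ (U ∪ B) × U × (U ∪ L ∪ B)."""
    s, p, o = triple
    return (isinstance(s, (URI, Blank))
            and isinstance(p, URI)
            and isinstance(o, (URI, Literal, Blank)))

def kind(triple):
    """Class assertion if the property is rdf:type, property assertion otherwise."""
    return "class assertion" if triple[1] == RDF_TYPE else "property assertion"

# Sample triples mirroring the example graph.
b, b1 = Blank("b"), Blank("b1")
G = {
    (b, RDF_TYPE, URI("ConfPaper")),
    (b, URI("hasTitle"), Literal("LGG in RDF")),
    (b, URI("hasContactAuthor"), b1),
}
```

A literal in subject position, e.g. `(Literal("x"), URI("p"), b)`, is rejected by `is_well_formed`.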

Adding ontological knowledge to RDF graphs
• Built-in property URIs are used to state RDF Schema (RDFS) statements, i.e., ontological constraints
RDFS statement   Triple
Subclass         (s, sc, o)
Subproperty      (s, sp, o)
Domain typing    (s, ←d, o)
Range typing     (s, →r, o)
Figure: the example graph extended with the RDFS constraints (ConfPaper, sc, Publication), (hasContactAuthor, sp, hasAuthor), (hasAuthor, ←d, Publication) and (hasAuthor, →r, Researcher)

Deriving the implicit triples
Figure: RDF graph G, progressively completed with its implicit triples (additional τ, hasAuthor, ←d and →r edges)
How can the implicit triples of an RDF graph be derived?

Sample set of entailment rules
Rule [7]   Entailment rule
rdfs2      (p, ←d, o), (s1, p, o1) → (s1, τ, o)
rdfs3      (p, →r, o), (s1, p, o1) → (o1, τ, o)
rdfs5      (p1, sp, p2), (p2, sp, p3) → (p1, sp, p3)
rdfs7      (p1, sp, p2), (s, p1, o) → (s, p2, o)
rdfs9      (s, sc, o), (s1, τ, s) → (s1, τ, o)
rdfs11     (s, sc, o), (o, sc, o1) → (s, sc, o1)
ext1       (p, ←d, o), (o, sc, o1) → (p, ←d, o1)
ext2       (p, →r, o), (o, sc, o1) → (p, →r, o1)
ext3       (p, sp, p1), (p1, ←d, o) → (p, ←d, o)
ext4       (p, sp, p1), (p1, →r, o) → (p, →r, o)
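The ten sample rules can be applied until no new triple appears. The sketch below is a naive fixpoint computation under stated assumptions: triples are plain Python tuples of strings, the built-in properties are spelled as on the slides, and the names `derive` and `saturate` are illustrative. It applies the rules literally (no check that derived subjects are not literals) and makes no attempt at efficiency.

```python
from itertools import product

# Built-in properties, abbreviated as on the slides (string spellings are assumptions).
TYPE, SC, SP, DOM, RNG = "τ", "sc", "sp", "←d", "→r"

def derive(t1, t2):
    """Yield the triples the sample rules entail from the ordered premises (t1, t2)."""
    a, p, c = t1
    s, q, o = t2
    if p == DOM:
        if q == a:
            yield (s, TYPE, c)        # rdfs2
        if q == SC and s == c:
            yield (a, DOM, o)         # ext1
    elif p == RNG:
        if q == a:
            yield (o, TYPE, c)        # rdfs3
        if q == SC and s == c:
            yield (a, RNG, o)         # ext2
    elif p == SP:
        if q == a:
            yield (s, c, o)           # rdfs7
        if s == c and q in (SP, DOM, RNG):
            yield (a, q, o)           # rdfs5, ext3, ext4
    elif p == SC:
        if q == TYPE and o == a:
            yield (s, TYPE, c)        # rdfs9
        if q == SC and s == c:
            yield (a, SC, o)          # rdfs11

def saturate(graph):
    """Return the graph extended with every triple derivable by the sample rules."""
    g = set(graph)
    while True:
        new = {t for t1, t2 in product(g, repeat=2) for t in derive(t1, t2)} - g
        if not new:
            return g
        g |= new

# Example graph: a paper description plus four RDFS constraints.
G = {
    ("b", TYPE, "ConfPaper"),
    ("b", "hasTitle", '"LGG in RDF"'),
    ("b", "hasContactAuthor", "b1"),
    ("ConfPaper", SC, "Publication"),
    ("hasContactAuthor", SP, "hasAuthor"),
    ("hasAuthor", DOM, "Publication"),
    ("hasAuthor", RNG, "Researcher"),
}
G_inf = saturate(G)
```

Each round examines every ordered pair of triples, so this version is only suitable for small graphs.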

Semantics of RDF graphs
Figure: RDF graph G
Starting from G, the entailment rules derive the implicit triples one at a time:
• rdfs9: (ConfPaper, sc, Publication), (b, τ, ConfPaper) → (b, τ, Publication)
• rdfs7: (hasContactAuthor, sp, hasAuthor), (b, hasContactAuthor, b1) → (b, hasAuthor, b1)
• rdfs3: (hasAuthor, →r, Researcher), (b, hasAuthor, b1) → (b1, τ, Researcher)
• ext4: (hasContactAuthor, sp, hasAuthor), (hasAuthor, →r, Researcher) → (hasContactAuthor, →r, Researcher)
• ext3: (hasContactAuthor, sp, hasAuthor), (hasAuthor, ←d, Publication) → (hasContactAuthor, ←d, Publication)
Figure: Saturated RDF graph G∞, i.e., G together with all its implicit triples

Entailment between RDF graphs
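Let G and G′ be two RDF graphs and R a set of RDF entailment rules.
Entailment between graphs is the relationship used to compare G and G′.
G is more specific than G′ when G entails G′:
• G |=R G′ ⇐⇒ G∞ |= G′
That is, there must exist an embedding (homomorphism) of G′ into G∞: a mapping of the blank nodes of G′ to terms of G∞ that turns every triple of G′ into a triple of G∞.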
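The embedding test can be sketched as a small backtracking search. This is an illustration under assumptions, not the authors' algorithm: triples are tuples of strings, blank nodes are the strings starting with `_:`, and the function `entails` expects its first argument to be already saturated.

```python
def is_blank(term):
    """Assumed convention: blank nodes are strings starting with '_:'."""
    return isinstance(term, str) and term.startswith("_:")

def entails(g_sat, g_prime):
    """Return True if g_prime embeds into the (already saturated) graph g_sat."""
    todo = list(g_prime)

    def match(pattern, value, binding):
        """Extend binding so that pattern maps to value, or return None."""
        if not is_blank(pattern):
            return binding if pattern == value else None
        if pattern in binding:
            return binding if binding[pattern] == value else None
        return {**binding, pattern: value}

    def search(i, binding):
        if i == len(todo):
            return True                      # every triple of g_prime is mapped
        s, p, o = todo[i]
        for (s2, p2, o2) in g_sat:
            if p != p2:
                continue
            b = match(s, s2, binding)
            if b is not None:
                b = match(o, o2, b)
            if b is not None and search(i + 1, b):
                return True
        return False

    return search(0, {})

# G: explicit triples only; G_inf adds the one implicit triple needed here (rule rdfs9).
G = {
    ("_:b", "τ", "ConfPaper"),
    ("_:b", "hasTitle", '"LGG in RDF"'),
    ("_:b", "hasContactAuthor", "_:b1"),
    ("ConfPaper", "sc", "Publication"),
}
G_inf = G | {("_:b", "τ", "Publication")}
G_prime = {("_:x", "τ", "Publication"), ("_:x", "hasTitle", "_:y")}
```

Here `entails(G_inf, G_prime)` succeeds by mapping `_:x` to `_:b` and `_:y` to the title literal, whereas `entails(G, G_prime)` fails because G alone has no triple typing a node as Publication.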

Figure: checking G |=R G′ on the running example. The test G∞ |= G′ succeeds: G′ embeds into G∞ through the implicit triple (b, τ, Publication) and the explicit triple (b, hasTitle, "LGG in RDF").
RDF graph G is more specific than RDF graph G′

Towards defining lgg in RDF
A least general generalization (lgg) of n descriptions d1, . . . , dn is a most
specific description d generalizing every di, for 1 ≤ i ≤ n, with respect to some
generalization/specialization relation between descriptions (G. Plotkin).
lgg in RDF
• descriptions are RDF graphs
• the generalization/specialization relation is entailment between RDF
graphs

Defining the lgg of RDF graphs
Definition (lgg of RDF graphs)
Let G1, . . . , Gn be RDF graphs and R a set of RDF entailment rules.
• A generalization of G1, . . . , Gn is an RDF graph Gg such that
Gi |=R Gg holds for 1 ≤ i ≤ n.
• A least general generalization (lgg) of G1, . . . , Gn is a generalization
Glgg of G1, . . . , Gn such that for any other generalization Gg of
G1, . . . , Gn, Glgg |=R Gg holds.
Result: lgg of n RDF graphs vs lgg of two RDF graphs
Writing ℓk for an lgg of k RDF graphs and ≡R for equivalence w.r.t. R (mutual entailment):
ℓ3(G1, G2, G3) ≡R ℓ2(ℓ2(G1, G2), G3)
· · ·
ℓn(G1, . . . , Gn) ≡R ℓ2(ℓn−1(G1, . . . , Gn−1), Gn)
≡R ℓ2(ℓ2(· · · ℓ2(ℓ2(G1, G2), G3) · · · , Gn−1), Gn)
We therefore focus on computing an lgg of two RDF graphs
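The reduction of the n-ary case to the binary case amounts to a left fold. The sketch below assumes a binary operator is supplied by the caller; `lgg_n`, `pair` and `pair_triples` are hypothetical names. The stand-in operator simply pairs triples that share a property and only makes sense for R = ∅ and for input graphs without blank nodes of their own.

```python
from functools import reduce

def lgg_n(lgg2, graphs):
    """Fold a binary lgg operator left to right: lgg2(...lgg2(lgg2(G1, G2), G3)..., Gn)."""
    return reduce(lgg2, graphs)

def pair(t1, t2):
    """Keep a constant shared by both sides; otherwise build a blank node,
    modelled here as the tuple of the two paired terms (an assumption)."""
    return t1 if t1 == t2 and not isinstance(t1, tuple) else (t1, t2)

def pair_triples(g1, g2):
    """Stand-in binary operator: pair every two triples sharing the same property."""
    return {(pair(s1, s2), p1, pair(o1, o2))
            for (s1, p1, o1) in g1
            for (s2, p2, o2) in g2
            if p1 == p2}

G1 = {("i1", "τ", "ConfPaper"), ("i1", "hasAuthor", "SA")}
G2 = {("i2", "τ", "JourPaper"), ("i2", "hasAuthor", "SA")}
G3 = {("i3", "τ", "ConfPaper"), ("i3", "hasAuthor", "SA")}

result = lgg_n(pair_triples, [G1, G2, G3])
```

In `result`, the three papers are generalized by the nested blank node `(("i1", "i2"), "i3")`, which keeps the shared author `SA`. `reduce` raises a `TypeError` on an empty list of graphs, which matches the fact that an lgg is defined for at least one graph.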

Figure: two RDF graphs, G1 (nodes i1, SA, Researcher, Publication, ConfPaper, "DiD") and G2 (nodes i2, SA, VV, Researcher, Publication, JourPaper, "CwFOL"), and an lgg Glgg of them (nodes bi1i2, SA, Researcher, bDC, bCPJP, Publication), in which blank nodes such as bi1i2 generalize pairs of terms that differ between G1 and G2
How to compute this graph?

The cover graph of RDF graphs
Definition (Cover graph)
The cover graph G of two RDF graphs G1 and G2 is the RDF graph such
that for every property p in both G1 and G2:
(t1, p, t2) ∈ G1 and (t3, p, t4) ∈ G2 iff (t5, p, t6) ∈ G
with t5 = t1 if t1 = t3 and t1 ∈ U ∪ L, else t5 is the blank node bt1t3 (named after the pair t1, t3), and,
similarly, t6 = t2 if t2 = t4 and t2 ∈ U ∪ L, else t6 is the blank node bt2t4.
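The definition translates almost directly into code. This is a minimal sketch under assumptions: triples are tuples of strings, blank nodes are the strings starting with `_:`, and the blank node for a pair of terms is named from the `repr` of that pair so that distinct pairs get distinct names. The names `generalize` and `cover_graph` are illustrative, and the two small graphs are only loosely based on the running example.

```python
def is_blank(term):
    """Assumed convention: blank nodes are strings starting with '_:'."""
    return term.startswith("_:")

def generalize(t1, t3):
    """Return t1 if both terms are the same URI or literal, else the blank node b_t1t3."""
    if t1 == t3 and not is_blank(t1):
        return t1
    return "_:b" + repr((t1, t3))   # one distinct name per pair of terms

def cover_graph(g1, g2):
    """Pair every triple of g1 with every triple of g2 that has the same property."""
    by_property = {}
    for (t3, p, t4) in g2:
        by_property.setdefault(p, []).append((t3, t4))
    return {
        (generalize(t1, t3), p, generalize(t2, t4))
        for (t1, p, t2) in g1
        for (t3, t4) in by_property.get(p, [])
    }

G1 = {
    ("i1", "hasAuthor", "SA"),
    ("SA", "τ", "Researcher"),
    ("i1", "title", '"DiD"'),
}
G2 = {
    ("i2", "hasAuthor", "VV"),
    ("VV", "τ", "Researcher"),
    ("i2", "title", '"CwFOL"'),
}
cover = cover_graph(G1, G2)
```

Indexing G2 by property avoids comparing triples whose properties differ; in the worst case (a single property everywhere) every pair of triples is still visited, and the result can hold up to |G1| × |G2| triples.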

Figure: step-by-step construction of the cover graph of G1 and G2. Pairing triples that share a property yields, for instance, (bi1i2, hasAuthor, bSAVV), (bi1i2, hasAuthor, SA), (bSAVV, τ, Researcher) and (bi1i2, title, bDC). The complete cover graph also contains blank nodes such as bPR, bi1SA, bCPR, bi1VV, bPJP, bCPP, bCPJP, bRP, bSAi2 and bRJP.

Cover graph vs lgg
Theorem (R = ∅)
The cover graph G of the RDF graphs G1 and G2 is an lgg of them for
the empty set R of RDF entailment rules (i.e., R = ∅).
Proposition (R = ∅)
The cover graph of two RDF graphs G1 and G2 can be computed in
O(|G1| × |G2|); its size is bounded by |G1| × |G2|.
Theorem (R ≠ ∅)
Let G1 and G2 be two RDF graphs, and R a set of RDF entailment rules.
The cover graph G of G1∞ and G2∞ is an lgg of G1 and G2.
Corollary (R ≠ ∅)
An lgg of two RDF graphs G1 and G2 can be computed in
O(|G1∞| × |G2∞|) and its size is bounded by |G1∞| × |G2∞|.

Cover graph vs lgg
Figure: two versions of the cover graph of G1 and G2 shown side by side; the larger one contains additional blank nodes (bPR, bPJP, bCPP, bRP) and the triples attached to them

Related work
Structure-based approaches
• Description Logics (EL)
- F. Baader et al.: Computing least common subsumers in
description logics with existential restrictions. In IJCAI, 1999.
- B. Zarrieß et al.: Most specific generalizations w.r.t. general
EL-TBoxes. In IJCAI, 2013.
• RDF
• SPARQL: tree queries
- J. Lehmann and L. Bühmann: Autosparql: Let users query your
knowledge base. In ESWC, 2011.
• Rooted graphs, ignoring RDF entailment:
- S. Colucci et al.: Defining and computing least common subsumers
in RDF. J. Web Semantics, 39(0), 2016.
Structure-independent approach
• Conceptual Graphs
- M. Chein and M. Mugnier: Graph-based Knowledge Representation -
Computational Foundations of Conceptual Graphs. Springer, 2009.

Conclusion
• We revisited the problem of computing a least general generalization in
the entire setting of RDF.
• Algorithms to compute lggs of small-to-huge RDF graphs, depending on where the graphs fit:
• Memory
• Data management system
• MapReduce
• Perspective: heuristics to compute lggs without redundant
triples.

Thank you !
Questions ?

References
[1] F. Baader, R. Küsters, and R. Molitor.
Computing least common subsumers in description logics with existential restrictions.
In IJCAI, 1999.
[2] F. Baader, B. Sertkaya, and A.-Y. Turhan.
Computing the least common subsumer w.r.t. a background terminology.
Journal of Applied Logic, 5(3), 2007.
[3] M. Chein and M. Mugnier.
Graph-based Knowledge Representation - Computational Foundations of Conceptual Graphs.
Springer, 2009.
[4] S. Colucci, F. Donini, S. Giannini, and E. D. Sciascio.
Defining and computing least common subsumers in RDF.
J. Web Semantics, 39(0), 2016.
[5] S. Colucci, F. M. Donini, and E. D. Sciascio.
Common subsumbers in RDF.
In AI*IA, 2013.
[6] J. Lehmann and L. Bühmann.
Autosparql: Let users query your knowledge base.
In ESWC, 2011.
[7] RDF 1.1 semantics.
https://www.w3.org/TR/rdf11-mt/.
[8] B. Zarrieß and A. Turhan.
Most specific generalizations w.r.t. general EL-TBoxes.
In IJCAI, 2013.
