The talk covers my PhD research work so far and was given as an introductory presentation at the beginning of my visiting period at the Web&Media Group of the Vrije Universiteit, Amsterdam. First, I introduce a system for automated knowledge management, I.M.P.A.K.T., which embeds a module for Core Competence extraction. The module is described as a use case for applying non-standard inference services based on the Least Common Subsumer in Description Logics (DLs) to the problem of finding commonalities in knowledge bases modeled in DLs. Moreover, I present the Knowledge Compilation approach adopted for efficiently solving subsumption through standard SQL queries only.
Then, I focus on my current investigation into the possibility of extending the Common Subsumer (CS) reasoning service to RDF datasets. Here, the formal definition of CS in RDF is given, together with a sketch of possible applications (e.g., clustering of RDF resources).
Finding Commonalities: from Description Logics to the Web of Data
1. Finding Commonalities in Linked Open Data
Silvia Giannini
PhD Student
(Supervisor: Prof. Eugenio Di Sciascio)
Dipartimento di Ingegneria Elettrica e dell'Informazione (DEI),
Politecnico di Bari, Bari, Italy
in collaboration with
Prof. Francesco M. Donini, Ph.D. Simona Colucci
Web&Media Group Meeting | 31 March, 2014
2. Finding Commonalities: A DLs use case Finding Commonalities: the Web of Data Conclusion
Outline
1 Finding Commonalities: A DLs use case
The I.M.P.A.K.T. system
The Core Competence module
2 Finding Commonalities: the Web of Data
3 Conclusion
Silvia Giannini Finding commonalities in Linked Open Data
The I.M.P.A.K.T. system
What is I.M.P.A.K.T.
Information Management and Processing with the Aid of
Knowledge-based Technologies
An integrated system managing three enterprise business services based on
knowledge management:
1 Skill Matching [1]
2 Team Composition [2]
3 Core Competence Extraction [3]
[1] E. Tinelli, S. Colucci, S. Giannini, E. Di Sciascio, and F.M. Donini, Large scale skill matching through knowledge compilation. In: Proc. of ISMIS 2012, Springer-Verlag (2012) 192-201.
[2] E. Tinelli, S. Colucci, E. Di Sciascio, and F.M. Donini, Knowledge compilation for automated team composition exploiting standard SQL. In: Proc. of SAC 2012, ACM (2012) 1680-1685.
[3] S. Colucci, E. Tinelli, S. Giannini, E. Di Sciascio, and F.M. Donini, Knowledge Compilation for Core Competence Extraction in Organizations. In: Proc. of Business Information Systems 2013, Springer (2013) 163-174.
Skill Matching GUI
Behind I.M.P.A.K.T.
An ontology for the HR domain (nearly 5000 concepts)
T-Box modules:
Employee Profile (M0)
Industry (M1)
Complementary Skill (M2)
Level (M3)
Knowledge (M4)
Language (M5)
Job Title (M6)
Main module M0: it models the properties (entry points) needed to
import all the sections describing an employee CV.
Possible employee skills and technical tools usage ability.
Specified through:
type - experience role (e.g., developer, administrator)
year - experience level
lastdate - last temporal update of work experience
A Curriculum Vitae representation
A-Box
A profile P = ⊓(∃R^0_j.C) is a concept in ALE(D), where R^0_j, 1 ≤ j ≤ 6, is
an entry point, and C is a concept in FL0(D) modeled in Mj.
The Core Competence module
What is a Core Competence
Core Competence: a Knowledge Management process
Core competencies are a company's collective knowledge about
how to coordinate diverse production skills and integrate multiple
streams of technologies. Identifying core competencies helps to support
competitive advantage, articulate a strategic intent, and allocate
resources to build cross-unit technological and production links.
(G. Hamel and C.K. Prahalad, The core competence of the corporation. Harvard
Business Review, May-June (1990) 79-90)
Examples:
Apple - design
Netflix - content delivery
Google - expertise in algorithms
...
The reasoning service
Objective: Automatically extract Core Competence by identifying a common
know-how in a significant portion of personnel (k employees, with k set as a
threshold value by the people in charge of the strategic analysis).
Tool:
Logic-based approach
Non-standard inference services (LCS, k-CS, BICS)
Method:
Knowledge-compilation process
It solves subsumption only via SQL queries against a proper R-DB schema,
without any exponential-time inference engine
A logic-based approach
Least Common Subsumer (LCS)
Let C1, ..., Cn be a collection of n concepts in a DL L. The Least Common Subsumer (LCS) of C1, ..., Cn is a concept D in L such that D is the most specific concept subsuming all the elements of the collection.
k-Common Subsumer (k-CS)
Let C1, ..., Cn be a collection of n concepts in a DL L and let k < n. A k-Common Subsumer (k-CS) of C1, ..., Cn is a concept D in L such that D is an LCS of k concepts among C1, ..., Cn.
Informative k-Common Subsumer (IkCS)
Given k < n, an Informative k-Common Subsumer (IkCS) of the concepts C1, ..., Cn in a DL L is a concept D such that D is a k-CS strictly subsumed by LCS(C1, ..., Cn), thus adding informative content to it.
Best Informative Common Subsumer (BICS)
Given k < n, a Best Informative Common Subsumer (BICS) of the concepts C1, ..., Cn in a DL L is a concept B such that B is an IkCS for C1, ..., Cn, and for every k < j ≤ n every j-CS is not informative.
The Knowledge Compilation process
Issues:
Computational difficulties of deduction in knowledge bases expressed
through a logical formalism;
Combining the representation power of a logical language with the
scalability and efficiency of information processing in a DBMS.
Knowledge Compilation:
1 OFF-LINE REASONING
pre-processing of a company's intellectual capital, described in a Description
Logics (DLs) Knowledge Base (KB), into an appropriate relational database
schema.
2 ON-LINE REASONING
querying of the data structure resulting from the first phase through
standard SQL queries for efficient Core Competence Extraction.
CV translation
OFF-LINE REASONING: Relational schema design rules
T-Box informative content
Table CONCEPT: it stores the CCNF of all the FL0(D) concepts (part (a))
A table is created for each entry point R^0_j, j > 0 (part (b))
A-Box informative content
Each atom of CCNF(C) of a conjunct ∃R^0_j.C is stored in a different tuple
of table Rj with the same groupID (part (b))
Table PROFILE includes profileID and extra-ontological structured
information (e.g., personal data, work-related information) (part (b))
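The schema rules above can be illustrated with a minimal sketch. It assumes that, once concepts are stored as CCNF atoms, checking whether a profile is subsumed by a requested concept reduces to checking that one groupID of the profile contains all the requested atoms. The table layout, the column names (profileID, groupID, atom), the sample rows and the helper profiles_subsumed_by are hypothetical and only loosely follow the slides; numeric restrictions such as years of experience are left out. This is not the actual I.M.P.A.K.T. schema or query.

```python
import sqlite3

# Hypothetical, simplified schema: one table per entry point (here hasKnowledge),
# one tuple per CCNF atom, atoms of the same conjunct sharing a groupID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROFILE (profileID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE hasKnowledge (profileID INTEGER, groupID INTEGER, atom TEXT);
""")
conn.executemany("INSERT INTO PROFILE VALUES (?, ?)",
                 [(1, "P1"), (2, "P2"), (3, "P3")])
conn.executemany("INSERT INTO hasKnowledge VALUES (?, ?, ?)", [
    (1, 1, "ComputerScienceSkill"), (1, 1, "ProgrammingLanguage"), (1, 1, "Java"),
    (2, 1, "ComputerScienceSkill"), (2, 1, "DBMS"),
    (3, 1, "ComputerScienceSkill"), (3, 1, "ProgrammingLanguage"), (3, 1, "C++"),
])


def profiles_subsumed_by(conn, atoms):
    """Return IDs of profiles having a group that contains every requested atom."""
    atoms = sorted(set(atoms))
    placeholders = ", ".join("?" for _ in atoms)
    sql = f"""
        SELECT DISTINCT profileID
        FROM hasKnowledge
        WHERE atom IN ({placeholders})
        GROUP BY profileID, groupID
        HAVING COUNT(DISTINCT atom) = ?
        ORDER BY profileID
    """
    return [row[0] for row in conn.execute(sql, [*atoms, len(atoms)])]


print(profiles_subsumed_by(conn, ["ComputerScienceSkill", "ProgrammingLanguage"]))
# -> [1, 3]
```

The GROUP BY / HAVING pattern is ordinary relational division, which is one way a subsumption check can be answered by the DBMS alone once the off-line phase has normalized the concepts.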
ON-LINE REASONING: The Core Competence Extraction Algorithm
1 Profile Subsumers Matrix computation
Idea: Extract the common know-how, expressed in the form of atomic
information, shared by the same group of employees, with cardinality
greater than or equal to k.
Example
Mario Rossi: Cplusplus (5 years), Java (5 years), Visual Basic (5 years)
Daniela Bianchi: Cplusplus (2 years), Java (6 years), Visual Basic (1 year)
Elena Pomarico: Cplusplus, Java, Visual Basic
Carmelo Piccolo: VBScript, Process Performance Monitoring
Lucio Battista: DBMS (2 years)
Mariangela Porro: DBMS (2 years), Internet Technologies (2 years)
Nicola Marco: DBMS (5 years), Internet Technologies (5 years)
Domenico De Palo: OOprogramming (6 years), Artificial intelligence (4 years), Internet technologies (4 years)
Profile  D1  D2  D3  D4  D5  D6  D7  D8  D9  D10  D11  ...
1        1   1   1   1   1   0   1   0   1   1    1    ...
2        1   1   1   1   1   0   1   0   1   1    1    ...
3        1   1   0   0   0   1   0   0   0   0    0    ...
4        1   1   0   0   0   1   0   1   0   0    0    ...
5        1   1   0   0   1   1   0   1   0   0    0    ...
6        1   0   1   0   0   0   0   0   0   0    0    ...
7        1   0   1   1   0   0   0   0   1   1    1    ...
8        1   1   1   1   1   0   1   1   0   0    0    ...
Table: Portion of the Profile Subsumers Matrix for the previous Example
D1   ∃hasKnowledge.ComputerScienceSkill
D2   ∃hasKnowledge.(ComputerScienceSkill ⊓ ≥2 years)
D3   ∃hasKnowledge.ProgrammingLanguage
D4   ∃hasKnowledge.OOP
D5   ∃hasKnowledge.(ComputerScienceSkill ⊓ ≥5 years)
D6   ∃hasKnowledge.(DBMS ⊓ ≥2 years)
D7   ∃hasKnowledge.(OOP ⊓ ≥5 years)
D8   ∃hasKnowledge.(InternetTechnologies ⊓ ≥2 years)
D9   ∃hasKnowledge.C++
D10  ∃hasKnowledge.VisualBasic
D11  ∃hasKnowledge.Java
...
Table: Description of D1, ..., D11 reported in the previous Table
2 Common Subsumers enumeration
Referring to the PSM of the set P = {P(a1), ..., P(an)}, and to a concept
component Dk ∈ {D1, ..., Dm} deriving from P, a Core Competence is the
union of the most specific features (i.e., profile concept components Dj) shared
by the same group of k employees, where k is a predefined threshold.
LCS = ∃hasKnowledge.ComputerScienceSkill
BICS = ∃hasKnowledge.(ComputerScienceSkill ⊓ ≥5 years)
ICS3 = ∃hasKnowledge.(DBMS ⊓ ≥2 years)
ICS3 = ∃hasKnowledge.(OOP ⊓ ≥5 years)
ICS3 = ∃hasKnowledge.(InternetTechnologies ⊓ ≥2 years)
ICS3 = ∃hasKnowledge.(C++ ⊓ VisualBasic ⊓ Java)
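The enumeration step can be sketched in a few lines, under the assumption that it is enough to group the columns of the Profile Subsumers Matrix by the exact set of profiles they cover and to keep the groups with at least k profiles. The matrix is the portion shown on the slides; the function name components_by_group is hypothetical, and the sketch neither checks subsumption among components nor selects the BICS.

```python
from collections import defaultdict

# Portion of the Profile Subsumers Matrix from the slides:
# rows are profiles 1..8, columns are concept components D1..D11.
PSM = [
    [1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0],
]


def components_by_group(psm, k):
    """Map each set of at least k profiles to the components shared by exactly that set."""
    groups = defaultdict(list)
    for col in range(len(psm[0])):
        # Profiles (1-based) whose row has a 1 in this column.
        extent = frozenset(row + 1 for row, bits in enumerate(psm) if bits[col])
        if len(extent) >= k:
            groups[extent].append(f"D{col + 1}")
    return dict(groups)


groups = components_by_group(PSM, k=3)
print(groups[frozenset({1, 2, 7})])    # -> ['D9', 'D10', 'D11']
print(groups[frozenset(range(1, 9))])  # -> ['D1']
```

On this portion of the matrix, D1 is shared by all eight profiles, in line with the LCS shown above, while D9, D10 and D11 are shared by the same three profiles, in line with the conjunction C++ ⊓ VisualBasic ⊓ Java.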
Core Competence module GUI
Lessons learned
Proposal: Knowledge Compilation approach for Core Competence Extraction.
+ It improves performance in terms of execution time w.r.t. the classical
logic-based approach.
+ It adopts standard SQL queries to compute the same informative content
as advanced inference services.
+ It makes the computational costs of the process affordable also for large
organizations, while retaining the full expressiveness of the logic-based
approaches.
Notes on Performance:
The number of profiles is highly relevant in the common subsumers
enumeration process.
The most computationally expensive process is the profile subsumers
matrix creation, under a threshold of profile concept components.
Outline
1 Finding Commonalities: A DLs use case
2 Finding Commonalities: the Web of Data
Common Subsumer in RDF
RDF Clustering
3 Conclusion
Common Subsumer in RDF
Motivation
Learning from the Web of Data:
huge amount of interconnected and machine-understandable data
data modeled as RDF resources
dataset addressed as Linked (Open) Data (LOD).
Facts to learn
identification of subsets of resources related to a common informative
content
- Cluster search (approximate matching)
- Disambiguation
- Personalization
Problem Definition
In analogy to the LCS service, proposed in DLs to learn from examples.
Adaptation to the Web of Data:
giving up the subsumption minimality requirement: even rough
Common Subsumers are useful for learning in the Web of Data
definition of Common Subsumer of pairs of RDF resources
Definition (Rooted Graph (r-graph))
Let TWr be the set of all triples with subject r in the Web. A Rooted Graph
(r-graph) is a pair ⟨r, Tr⟩, where
1 r is either the URI of an RDF resource, or a blank node
2 Tr = {t | t = r p c} is a subset of relevant triples in TWr
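As an illustration of the definition, an r-graph can be represented as a pair made of a root and the selected triples having that root as subject. The sketch below assumes triples are plain (subject, predicate, object) string tuples; the names Triple, r_graph and keep, and the ex: identifiers, are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    s: str  # subject
    p: str  # predicate
    o: str  # object


def r_graph(r, web, keep=lambda t: True):
    """Return the pair (r, Tr): r plus the relevant triples of `web` with subject r."""
    return r, frozenset(t for t in web if t.s == r and keep(t))


# Toy "Web" of triples, invented for the example.
web = {
    Triple("ex:a", "ex:p", "ex:c"),
    Triple("ex:a", "ex:q", "ex:d"),
    Triple("ex:b", "ex:p", "ex:c"),
}

root, triples = r_graph("ex:a", web)
print(len(triples))  # -> 2
```

The keep predicate stands for whatever relevance criterion is chosen, so Tr can be a proper subset of TWr as the definition allows.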
Example: A Possible Representation for resources a and b
Example: A(nother) Possible Representation for resources a and b
Common Subsumer
Definition (Common Subsumer)
Let ⟨a, Ta⟩, ⟨b, Tb⟩ be two r-graphs and x, w, y be blank nodes.
If ⟨a, Ta⟩ = ⟨b, Tb⟩, then ⟨a, Ta⟩ is a Common Subsumer of ⟨a, Ta⟩, ⟨b, Tb⟩.
If Ta = ∅ or Tb = ∅, the pair ⟨x, ∅⟩ is a Common Subsumer of ⟨a, Ta⟩, ⟨b, Tb⟩.
Otherwise, a pair ⟨x, T⟩ is a Common Subsumer of ⟨a, Ta⟩, ⟨b, Tb⟩ iff:
∃t = x w y such that (T entails t)  ⇒  ∃t1 = a p c, t2 = b q d such that (T entails t1) ∧ (T entails t2)    (1)
where Ta ⊆ T, Tb ⊆ T and ⟨w, T⟩ is a Common Subsumer of ⟨p, Tp⟩ and
⟨q, Tq⟩, and ⟨y, T⟩ is a Common Subsumer of ⟨c, Tc⟩ and ⟨d, Td⟩.
Note: We consider only simple entailment
Example: a Common Subsumer of a and b
Note: Triples with a blank node in predicate and object positions are discarded
Example: a(nother) Common Subsumer of a and b
Solving Algorithm
Main Features:
anytime: if interrupted, it always returns a Common Subsumer of the
input pair of RDF resources
modular: it takes as input a function computing the sets of triples relevant
for the input RDF resources
Our current criterion for triples selection:
triples within a given graph distance from the input resource
triples whose properties belong to a selected set of significant properties
for the dataset/application of interest
Output: A Common Subsumer of two r-graphs ⟨a, Ta⟩ and ⟨b, Tb⟩:
a pair made up of a resource (anonymous or not) and a set of triples
stating facts about such a resource which are true for both a and b.
Alternative cases:
⟨_:cs, T⟩: a blank node _:cs together with a set of triples related to _:cs
⟨a, Ta⟩, iff ⟨a, Ta⟩ = ⟨b, Tb⟩
⟨_:cs, ∅⟩, if either Ta = ∅ or Tb = ∅
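The three output cases can be sketched as a small recursive procedure. This is a simplified illustration, not the algorithm presented in the talk: it assumes triples are (subject, predicate, object) string tuples, that input resources are URIs or literals (blank nodes are recognized only by the "_:" prefix), that the hypothetical relevant function plays the role of the pluggable triple-selection function, and that a max_depth bound stands in for the graph-distance criterion. All data and names are invented for the example.

```python
import itertools


def common_subsumer(a, b, relevant, max_depth=2):
    """Return (root, triples): a Common Subsumer of resources a and b (sketch)."""
    counter = itertools.count()

    def is_blank(node):
        return node.startswith("_:")

    def cs(u, v, depth):
        tu, tv = relevant(u), relevant(v)
        if u == v:
            # Identical r-graphs: the resource itself with its own triples.
            return u, set(tu)
        x = f"_:cs{next(counter)}"  # fresh blank node for the common part
        if not tu or not tv or depth == 0:
            return x, set()  # trivial Common Subsumer
        triples = set()
        for _, p, c in tu:
            for _, q, d in tv:
                w, tw = cs(p, q, depth - 1)
                y, ty = cs(c, d, depth - 1)
                if is_blank(w) and is_blank(y):
                    continue  # blank predicate and blank object: discarded
                triples.add((x, w, y))
                triples |= tw | ty
        return x, triples

    return cs(a, b, max_depth)


# Toy dataset, invented for the example.
TRIPLES = {
    ("ex:a", "rdf:type", "ex:Deputy"),
    ("ex:a", "foaf:gender", "female"),
    ("ex:a", "ex:bornIn", "ex:City1"),
    ("ex:b", "rdf:type", "ex:Deputy"),
    ("ex:b", "foaf:gender", "female"),
    ("ex:b", "ex:bornIn", "ex:City2"),
}


def relevant(r):
    """Hypothetical selection function: all triples with subject r."""
    return {t for t in TRIPLES if t[0] == r}


root, triples = common_subsumer("ex:a", "ex:b", relevant)
for t in sorted(triples):
    print(t)
```

In this toy run the result states that both resources are deputies, both are female, and both have some birthplace, the latter expressed by a triple whose object is a fresh blank node.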
RDF Clustering
Target Semantic Web Task
Clustering of Web resources with a CS
retrieving resources conveying the same information
in their dierent RDF descriptions
CS description → SPARQL queries:
WHERE { Tcs [blank nodes → variables] }
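The mapping from a CS description to a query can be sketched as follows. The sketch assumes that the root of the CS is a blank node, that blank nodes are written with the "_:" prefix, and that every other term is already serialized in SPARQL syntax (for example an IRI in angle brackets or a quoted literal). The function name cs_to_sparql and the sample triples are hypothetical.

```python
def cs_to_sparql(root, triples):
    """Build a SELECT query whose WHERE clause is the CS with blank nodes as variables."""
    if not root.startswith("_:"):
        raise ValueError("this sketch expects the CS root to be a blank node")

    def term(t):
        # Blank node _:x0 becomes variable ?x0; other terms are used as they are.
        return "?" + t[2:] if t.startswith("_:") else t

    patterns = sorted(f"  {term(s)} {term(p)} {term(o)} ." for s, p, o in triples)
    return f"SELECT DISTINCT {term(root)}\nWHERE {{\n" + "\n".join(patterns) + "\n}"


query = cs_to_sparql("_:x0", {
    ("_:x0", "a", "<http://dati.camera.it/ocd/deputato>"),
    ("_:x0", "<http://dati.camera.it/ocd/rif_mandatoCamera>", "_:x1"),
})
print(query)
```

Every resource returned for the root variable matches all the triple patterns of the CS, so the result set can be read as the cluster described by that CS.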
Clustering with a CS: A use case
The Italian Chamber of Deputies LOD
Public SPARQL endpoint (http://dati.camera.it/sparql)
Running example: Find the commonalities between deputies Nilde Iotti
and Tina Anselmi in the 10th Legislature
SELECT DISTINCT ?x0
WHERE {
?x0 a <http://dati.camera.it/ocd/deputato> .
?x0 <http://xmlns.com/foaf/0.1/gender> "female" .
?x0 <http://dati.camera.it/ocd/rif_mandatoCamera> ?x1 .
. . .
}
1st Legislature clusters
Conclusion
Motivation: learning shared informative content in collections of RDF
resources
Problem Definition: search for Common Subsumers that are not subsumption
minimal, in order to ensure computability in the Web of Data, which is too
large to be fully explored
Results:
An anytime algorithm computing Common Subsumers of pairs of RDF
resources:
allowing partially learned informative content to be used for further processing,
whenever the search for Common Subsumers is interrupted
possibly supporting the clustering of collections of RDF resources, by
exploiting associativity of Common Subsumers.
Future work:
Extension of the CS definition to other entailment regimes
Investigation of methods for the selection of relevant triples
Automated link traversal techniques for further dataset exploration
Application to data quality problems (e.g., missing values)