Date: 30/11/2012
SSONDE: Semantic Similarity On
liNked Data Entities
Riccardo Albertoni
ralbertoni@delicias.dia.fi.upm.es
Ontology Engineering Group. Departamento de Inteligencia Artificial
Facultad de Informática
Universidad Politécnica de Madrid
Joint work with Monica De Martino (CNR-IMATI-GE)
MTSR 2012,
6th Metadata and Semantics Research Conference
28-30 November 2012 - Cádiz (Spain)
2
Presentation Outline
1. How SSONDE fits with other linked data
technologies
• What is it for? what is it not for?
2. Characteristics of instance similarity in SSONDE
• The theory behind SSONDE’s similarity is detailed in
• Albertoni R. and De Martino M.; Asymmetric and
context dependent semantic similarity among
ontology instances, Journal of Data Semantics,
LNCS, 2008.
3. SSONDE Architecture and Examples on Linked
Data
3
Linked Data Crawling architectural pattern
[Figure: the crawling pattern with LDSpider/Fuseki and LDIF, and SSONDE used to build analysis services (cluster analysis, explorative search on resources).]
Tom Heath and Christian Bizer (2011) Linked Data: Evolving the Web into a Global Data Space (1st edition). 1-136. Morgan & Claypool
4
SSONDE Instance similarity
is not meant
• to align ontologies/schemas;
• to interlink/consolidate entities.
aims at
• providing a method for comparing entities represented as instances in an ontology-driven repository or as entities exposed as linked data;
• supporting explorative searches.
assumes all the integration steps are already done
• it works at the Application Layer of the Linked Data Crawling Architectural Pattern.
main characteristics (which make SSONDE unique of its kind)
• Context, to represent similarity criteria (algorithm parameters);
• Asymmetry, to emphasize containment between instances.
Example: comparing researchers
6
Example: Researchers' comparison
[Figure: researchers linked to their publications, their research topics and their projects.]
7
Different Contexts
The researchers, publications, … are instances.
Contexts and the researchers' features (data/object properties) considered in the similarity:
Researchers' Experience
• Age
• Number of publications
• Number of projects
Researchers' Scientific Interest
• Common publications
• Common research projects
• Similar research interests
[Callouts in the figure: one feature "is used only in this context"; other features "are used in both the contexts".]
8
Formalization of Application Context
A function that, for each recursion path, specifies the data/object properties and which operations to consider
Example: Researchers' Scientific Interest
• Common publications
• Common research projects
• Similar research interests
[ResearchStaff] {{Φ}, {{Publication, Inter}, {WorkAtProject, Inter}, {interest, Simil}}}
[ResearchStaff, Interest] {{{TopicName, Inter}}, {{RelatedTopic, Inter}}}
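The idea of a context as a function from recursion paths to properties and operations can be sketched as a plain lookup table. This is only an illustration: the dictionary layout, the `data`/`object` keys and the `operations_for` helper are hypothetical names chosen here, not SSONDE's own context file format.

```python
# Hypothetical encoding of the "Researchers' Scientific Interest" context.
# Keys are recursion paths; values list (property, operation) pairs,
# split into data properties and object properties.
RESEARCH_INTEREST = {
    ("ResearchStaff",): {
        "data": [],  # no data properties at the top level
        "object": [
            ("Publication", "Inter"),
            ("WorkAtProject", "Inter"),
            ("interest", "Simil"),
        ],
    },
    ("ResearchStaff", "Interest"): {
        "data": [("TopicName", "Inter")],
        "object": [("RelatedTopic", "Inter")],
    },
}


def operations_for(context, path):
    """Return (data_ops, object_ops) for a recursion path.

    Paths the context does not mention contribute nothing (an assumption
    made for this sketch).
    """
    entry = context.get(tuple(path), {"data": [], "object": []})
    return entry["data"], entry["object"]


data_ops, object_ops = operations_for(RESEARCH_INTEREST, ["ResearchStaff", "Interest"])
print(data_ops)    # [('TopicName', 'Inter')]
print(object_ops)  # [('RelatedTopic', 'Inter')]
```

Keying the table by the whole path, rather than by the last class only, lets the same property be treated differently depending on how the recursion reached it.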
9
Why an Asymmetric Similarity?
Sim(a,b) might differ from Sim(b,a)
• Sim is not the inverse of a metric distance → metric properties cannot be exploited to prune comparisons
Here asymmetry is adopted to highlight the containment between instances A, B
Example of containment (comparing w.r.t. publications only):
• A is a Ph.D. student who has always published with his tutor B
[Figure: A and B with publications pub 1, pub 2 and pub 3; every publication of A is also a publication of B.]
• A is contained in B (A<<B): A can be replaced by B
• B is not contained in A: if you replace B with A, some experience gets lost
10
SSONDE's Asymmetric Similarity returns
Sim(A,B) ranges in [0,1]
It is proportional to the number of data and object property values that A shares with B
• If A is contained in B → Sim(A,B)=1
• If A is not contained in B → Sim(A,B)<1
• If A and B don't share any "features" → Sim(A,B)=0
• If A has exactly the same characteristics as B (A<<B, B<<A) → Sim(A,B) = Sim(B,A) = 1
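The properties listed above can be illustrated with a much simpler stand-in measure: the share of A's features that B also has. This is an assumption made for illustration, not SSONDE's actual formula (which is defined in the cited papers); the feature sets and the assignment of publications to the student and the tutor are hypothetical.

```python
def containment_sim(features_a, features_b):
    """Fraction of A's features that also belong to B (illustrative stand-in).

    Returns 0.0 when A has no features at all, a choice made for this sketch.
    """
    if not features_a:
        return 0.0
    return len(features_a & features_b) / len(features_a)


# Hypothetical data: a Ph.D. student who always published with the tutor.
student = {"pub 1", "pub 2"}
tutor = {"pub 1", "pub 2", "pub 3"}

print(containment_sim(student, tutor))  # 1.0: the student is contained in the tutor
print(containment_sim(tutor, student))  # below 1: pub 3 would get lost
print(containment_sim({"pub 4"}, tutor))  # 0.0: nothing shared
```

Dividing by the size of A's feature set only, instead of the union of both sets, is what makes the measure asymmetric.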
11
Results comparing young and senior researchers of IMATI
[Figure: two similarity matrices, one for Research Experience and one for Research Interest.]
The darker the matrix value, the higher the similarity
13
SSONDE ARCHITECTURE
[Figure: SSONDE architecture. Elements recoverable from the diagram:]
Configuration
• Kind of store
• List of instances: Java class to generate the list
• Ref. context
• Ref. rules (e.g., Jena rules)
Similarity
• Context Layer
• Ontology Layer
• Data Layer
Data wrappers
• Jena TDB
• Jena SDB
• Jena MEM
• Virtuoso wrapper
• …
Stores
• TDB repository
• SDB repository
• RDF dumps
• Virtuoso
Output
• Similarity matrix in CSV
• n-most similar entities in JSON
Linked data consumption
• Web of Data: RDF dumps, HTTP dereferenceable URIs, SPARQL endpoints; linked datasets served by third parties
• Crawling architectural pattern: LDIF, LDSpider + Fuseki
• Local data store/cache
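As a rough illustration of the "n-most similar entities in JSON" output, the sketch below ranks, for each entity, the other entities by decreasing similarity and serializes the result. The matrix values, the `ex:` URIs and the `uri`/`sim` field names are hypothetical; the real SSONDE output layout may differ.

```python
import json


def n_most_similar(matrix, n):
    """For each entity, keep the n other entities with the highest Sim(entity, other)."""
    ranking = {}
    for entity, row in matrix.items():
        others = [(other, score) for other, score in row.items() if other != entity]
        # Highest score first; ties broken by URI so the order is deterministic.
        others.sort(key=lambda pair: (-pair[1], pair[0]))
        ranking[entity] = [{"uri": other, "sim": score} for other, score in others[:n]]
    return ranking


# Hypothetical asymmetric similarity matrix: matrix[a][b] holds Sim(a, b).
matrix = {
    "ex:A": {"ex:A": 1.0, "ex:B": 1.0, "ex:C": 0.2},
    "ex:B": {"ex:A": 0.67, "ex:B": 1.0, "ex:C": 0.5},
    "ex:C": {"ex:A": 0.1, "ex:B": 0.4, "ex:C": 1.0},
}

print(json.dumps(n_most_similar(matrix, 2), indent=2))
```

Because the matrix is asymmetric, the ranking is computed row by row: the list for `ex:A` uses Sim(ex:A, other), not Sim(other, ex:A).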
14
SSONDE: a building block for new analysis services
SSONDE applied on "real linked data"
• Analysing Habitats and Species
  • published in NatureSDIplus (ECP-2007-GEO-317007), a European project developing a Spatial Data Infrastructure for Nature Conservation.
  • to rank habitats according to the species they host → an insight into inter-dependencies between habitats and species
• Analysing overlaps among scientific interests
  • subset of the linked dataset provided by data.cnr.it as part of the SemanticScout framework by third parties (Gangemi et al.)
  • to compare IMATI-CNR researchers according to their research interests
15
Applying SSONDE on data.cnr.it
16
Applying SSONDE on data.cnr.it
http://code.google.com/p/ssonde/wiki/RDF_statements_download
17
Configuration file 1
{
  "StoreConfiguration": {
    "KindOfStore": "JENATDB",
    "RDFDocumentURIs": [],
    "TDBDirectory": "data/CNRIT/TDB-0.8.9/CNRR/"
  },
  "InstanceConfiguration": {
    "InstanceURIsClass": "application.dataCNRIt.GetResearcherIMATIplusCoauthor"
  },
  "OutputConfiguration": {
    "KindOfOutput": "JSONOrderedResult",
    "NumberOfOrderedResult": "20",
    "FilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CRRIIntPub.res.json"
  },
  "ContextConfiguration": {
    "ContextFilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.ctx"
  }
}
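Notes on the configuration sections:
• InstanceConfiguration: list of LOD entity URIs, provided by a Java class implementing ListOfInputInstances
• OutputConfiguration: similarity matrix in CSV, or JSON encoding of the n-most similar entities
• ContextConfiguration: context encoded in an in-house text format (hopefully soon in JSON)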
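Before launching a run, a configuration like the one above can be checked with a few lines of code. The sketch below parses the JSON and verifies that the four sections are present. Treating all four as mandatory and the `load_config` helper are assumptions of this example, not documented SSONDE behaviour. Strict JSON parsers also reject typographic quotes, so every value must use plain double quotes.

```python
import json

# Section names taken from the configuration file; requiring all of them
# is an assumption made for this sketch.
REQUIRED_SECTIONS = (
    "StoreConfiguration",
    "InstanceConfiguration",
    "OutputConfiguration",
    "ContextConfiguration",
)


def load_config(text):
    """Parse a configuration string and check that every section is present."""
    config = json.loads(text)
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    return config


EXAMPLE = """
{ "StoreConfiguration": {"KindOfStore": "JENATDB", "RDFDocumentURIs": [],
                         "TDBDirectory": "data/CNRIT/TDB-0.8.9/CNRR/"},
  "InstanceConfiguration": {"InstanceURIsClass": "application.dataCNRIt.GetResearcherIMATIplusCoauthor"},
  "OutputConfiguration": {"KindOfOutput": "JSONOrderedResult", "NumberOfOrderedResult": "20",
                          "FilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CRRIIntPub.res.json"},
  "ContextConfiguration": {"ContextFilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.ctx"}
}
"""

config = load_config(EXAMPLE)
# NumberOfOrderedResult is stored as a string in the file, so convert it.
top_n = int(config["OutputConfiguration"]["NumberOfOrderedResult"])
print(config["StoreConfiguration"]["KindOfStore"], top_n)
```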
18
Data.cnr.it – defining a context
[Figure: researchers (Res 225, Res 226), their publications and their topics, connected by the three properties used in the context below; part of the graph is labelled "Crawled by Data.CNR.it" and part "Crawled by DBPEDIA".]
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX pub: <http://www.cnr.it/ontology/cnr/pubblicazioni.owl#>
[owl:Thing]-> {{}, {(pub:autoreCNRDi, Inter), (dc:subject, Simil)}}
[owl:Thing, dc:subject]-> {{}, {(skos:broader, Inter)}}
• No data properties are considered in this context
• Object properties considered: publications (pub:autoreCNRDi), interests (dc:subject), interest hierarchy (skos:broader)
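How such a context might drive a recursive comparison can be sketched on a toy graph. Everything below is an assumption made for illustration: the `ex:` URIs and triples are invented, and the scoring (averaging over properties, with Inter as the share of A's values found in B and Simil as the best recursive match for each of A's values) is a simplification, not the measure defined in the cited papers.

```python
# Toy graph: subject -> property -> set of values (all URIs are made up).
GRAPH = {
    "ex:res225": {"pub:autoreCNRDi": {"ex:pub22", "ex:pub26"},
                  "dc:subject": {"ex:topic25"}},
    "ex:res226": {"pub:autoreCNRDi": {"ex:pub22", "ex:pub26", "ex:pub29"},
                  "dc:subject": {"ex:topic26", "ex:topic27"}},
    "ex:topic25": {"skos:broader": {"ex:topic2"}},
    "ex:topic26": {"skos:broader": {"ex:topic2"}},
    "ex:topic27": {"skos:broader": {"ex:topic23"}},
}

# The context from the slide: recursion path -> (object property, operation) pairs.
CONTEXT = {
    ("owl:Thing",): [("pub:autoreCNRDi", "Inter"), ("dc:subject", "Simil")],
    ("owl:Thing", "dc:subject"): [("skos:broader", "Inter")],
}


def values(node, prop):
    return GRAPH.get(node, {}).get(prop, set())


def sim(a, b, path=("owl:Thing",)):
    """Assumed scoring: average, over the properties listed for `path`,
    of how much of a's values is covered by b's values."""
    if a == b:
        return 1.0
    scores = []
    for prop, op in CONTEXT.get(path, []):
        va, vb = values(a, prop), values(b, prop)
        if not va:
            scores.append(0.0)
        elif op == "Inter":
            # Share of a's values that b has as well.
            scores.append(len(va & vb) / len(va))
        else:
            # "Simil": recurse along the property, keep the best match in b.
            best = [max((sim(x, y, path + (prop,)) for y in vb), default=0.0) for x in va]
            scores.append(sum(best) / len(best))
    return sum(scores) / len(scores) if scores else 0.0


print(sim("ex:res225", "ex:res226"))  # 1.0: res225 is contained in res226
print(sim("ex:res226", "ex:res225"))  # lower: pub29 and topic27 are not covered
```

In this toy data the two topics `ex:topic25` and `ex:topic26` count as similar only because they share a broader topic, which is the role the second context line plays.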
19
Similarity Matrix:
[Figure: similarity matrix.]
• data is more recent but less accurate
• but more researchers are represented
• and containment is still highlighted
20
Hierarchical clustering: scientific clusters are discovered
Hierarchical Clustering Explorer 3.0, Human-Computer Interaction Lab, University of Maryland. http://www.cs.umd.edu/hcil/multi-cluster/.
21
What next?
(i) semantic similarity optimization:
  (a) caching of intermediate similarity results
  (b) adoption of the MapReduce paradigm to speed up the assessment of semantic similarity
(ii) domain-driven extensions at the data layer:
  (a) defining new data layer measures suited for geo-referenced entities
  (b) multilingual similarity
(iii) definition of interfaces sifting entities according to their similarity, exploiting visualization frameworks such as Exhibit, Google Visualization and the JavaScript InfoVis Toolkit.
22
THANKS for your kind attention!
Questions / Discussion / Suggestions
SSONDE framework
• pushes our instance similarity as a ready-to-go tool for the analysis of linked data
• its Java code is available on Google Code
• http://purl.oclc.org/NET/SSONDE
• licensed as open source code (GNU GPL v3)
Do not hesitate to contact us if
• SSONDE can be deployed in some of your future projects (proposals)
• you are interested in contributing to the SSONDE open framework
23
SSONDE Framework
• R. Albertoni, M. De Martino, SSONDE: Semantic Similarity On liNked Data Entities, 6th Metadata
and Semantics Research Conference, 28-30 November 2012 - Cádiz (Spain) [to appear]
• Framework Installation & use http://code.google.com/p/ssonde/wiki/GettingStarted
Semantic Similarity Theoretical Framework
• Albertoni R. and De Martino M.; Asymmetric and context dependent semantic similarity among
ontology instances, Journal of Data Semantics, LNCS, 2008.
• Albertoni R. and De Martino M.; Semantic similarity of ontology instances tailored on the application context. Full paper at On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE, volume 4275 of LNCS, pages 1020–1038. Springer, 2006.
Issues adapting theoretical framework to Linked Data
• Albertoni R., De Martino M.; Semantic Similarity and Selection of Resources Published
According to Linked Data Best Practice, OnToContent 2010, Part of the OTM (OTM'10)
Further Applications
Comparing EUNIS habitats wrt their species
• Albertoni R., De Martino M.; Semantic Technology to Exploit Digital Content Exposed as Linked
Data, eChallenges e-2011, 26-28 October 2011 Florence, Italy
Comparing shapes metadata (not Linked Data)
• Albertoni R., De Martino M.; Using Context Dependent Semantic Similarity to Browse
Information Resources: an Application for the Industrial Design, First workshop on multimedia
Annotation and Retrieval enabled by Shared Ontologies, Genoa, Italy, (2007)
A complete list of references on SSONDE and its Instance Similarity

More Related Content

What's hot

Revealing digital documents - concealed structures in data
Revealing digital documents - concealed structures in dataRevealing digital documents - concealed structures in data
Revealing digital documents - concealed structures in dataJakob .
 
Ontology Building and its Application using Hozo
Ontology Building and its Application using HozoOntology Building and its Application using Hozo
Ontology Building and its Application using HozoKouji Kozaki
 
Linked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsLinked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsVito Ostuni
 
SemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebSemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebAdrian Paschke
 
SemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in PracticeSemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in PracticeDan Brickley
 
Querying Linked Data and Büchi automata
Querying Linked Data and Büchi automataQuerying Linked Data and Büchi automata
Querying Linked Data and Büchi automataKonstantinos Giannakis
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly CommunityMarko Rodriguez
 
Atlas.ti making sense of research data in policy analysis
Atlas.ti   making sense of research data in policy analysisAtlas.ti   making sense of research data in policy analysis
Atlas.ti making sense of research data in policy analysisMerlien Institute
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...Raphael Troncy
 
Semantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionSemantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionKent State University
 
From Exploratory Search to Web Search and back - PIKM 2010
From Exploratory Search to Web Search and back - PIKM 2010From Exploratory Search to Web Search and back - PIKM 2010
From Exploratory Search to Web Search and back - PIKM 2010Roku
 
GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003butest
 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Andre Freitas
 
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchLinked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchChristoph Lange
 
Instance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge AcquisitionInstance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge AcquisitionLihua Zhao
 
Mid-Ontology Learning from Linked Data @JIST2011
Mid-Ontology Learning from Linked Data @JIST2011Mid-Ontology Learning from Linked Data @JIST2011
Mid-Ontology Learning from Linked Data @JIST2011Lihua Zhao
 

What's hot (20)

Revealing digital documents - concealed structures in data
Revealing digital documents - concealed structures in dataRevealing digital documents - concealed structures in data
Revealing digital documents - concealed structures in data
 
Ontology Building and its Application using Hozo
Ontology Building and its Application using HozoOntology Building and its Application using Hozo
Ontology Building and its Application using Hozo
 
Linked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender SystemsLinked Open Data to support content based Recommender Systems
Linked Open Data to support content based Recommender Systems
 
SemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebSemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic Web
 
SemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in PracticeSemWeb Fundamentals - Info Linking & Layering in Practice
SemWeb Fundamentals - Info Linking & Layering in Practice
 
Querying Linked Data and Büchi automata
Querying Linked Data and Büchi automataQuerying Linked Data and Büchi automata
Querying Linked Data and Büchi automata
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly Community
 
Atlas.ti making sense of research data in policy analysis
Atlas.ti   making sense of research data in policy analysisAtlas.ti   making sense of research data in policy analysis
Atlas.ti making sense of research data in policy analysis
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...
 
Semantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: IntroductionSemantic Web, Ontology, and Ontology Learning: Introduction
Semantic Web, Ontology, and Ontology Learning: Introduction
 
Linked open data: standardization, interoperability and multilingual challeng...
Linked open data: standardization, interoperability and multilingual challeng...Linked open data: standardization, interoperability and multilingual challeng...
Linked open data: standardization, interoperability and multilingual challeng...
 
From Exploratory Search to Web Search and back - PIKM 2010
From Exploratory Search to Web Search and back - PIKM 2010From Exploratory Search to Web Search and back - PIKM 2010
From Exploratory Search to Web Search and back - PIKM 2010
 
Semantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including AstrophysicsSemantic Technologies for Big Sciences including Astrophysics
Semantic Technologies for Big Sciences including Astrophysics
 
GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003
 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)
 
Fqas09
Fqas09Fqas09
Fqas09
 
POSTDATA: Towards publishing European Poetry as Linked Open Data
POSTDATA: Towards publishing European Poetry as Linked Open DataPOSTDATA: Towards publishing European Poetry as Linked Open Data
POSTDATA: Towards publishing European Poetry as Linked Open Data
 
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchLinked Open (Geo)Data and the Distributed Ontology Language – a perfect match
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match
 
Instance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge AcquisitionInstance-Based Ontological Knowledge Acquisition
Instance-Based Ontological Knowledge Acquisition
 
Mid-Ontology Learning from Linked Data @JIST2011
Mid-Ontology Learning from Linked Data @JIST2011Mid-Ontology Learning from Linked Data @JIST2011
Mid-Ontology Learning from Linked Data @JIST2011
 

Viewers also liked

SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...Riccardo Albertoni
 
Registreren en publiceren volgens CEST richtlijnen
Registreren en publiceren volgens CEST richtlijnenRegistreren en publiceren volgens CEST richtlijnen
Registreren en publiceren volgens CEST richtlijnenPACKED vzw
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...John Breslin
 
Presentatie speel en deelsessie web 3 0 (frans van der horst) (2)
Presentatie speel  en deelsessie web 3 0 (frans van der horst) (2)Presentatie speel  en deelsessie web 3 0 (frans van der horst) (2)
Presentatie speel en deelsessie web 3 0 (frans van der horst) (2)PHC
 
Cuestionario china
Cuestionario chinaCuestionario china
Cuestionario chinamaqui17
 

Viewers also liked (6)

SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...
 
Registreren en publiceren volgens CEST richtlijnen
Registreren en publiceren volgens CEST richtlijnenRegistreren en publiceren volgens CEST richtlijnen
Registreren en publiceren volgens CEST richtlijnen
 
Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...Interlinking Online Communities and Enriching Social Software with the Semant...
Interlinking Online Communities and Enriching Social Software with the Semant...
 
Presentatie speel en deelsessie web 3 0 (frans van der horst) (2)
Presentatie speel  en deelsessie web 3 0 (frans van der horst) (2)Presentatie speel  en deelsessie web 3 0 (frans van der horst) (2)
Presentatie speel en deelsessie web 3 0 (frans van der horst) (2)
 
Cuestionario china
Cuestionario chinaCuestionario china
Cuestionario china
 
China antigua
China antiguaChina antigua
China antigua
 

Similar to Presentation at MTSR 2012

Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data VisualizationLaura Po
 
A Framework for Ontology Usage Analysis
A Framework for Ontology Usage AnalysisA Framework for Ontology Usage Analysis
A Framework for Ontology Usage AnalysisJamshaid Ashraf
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the partsCarole Goble
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things PayamBarnaghi
 
Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...Riccardo Albertoni
 
Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Enrico Daga
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so farElena Simperl
 
The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...João Rocha da Silva
 
Ontology mapping for the semantic web
Ontology mapping for the semantic webOntology mapping for the semantic web
Ontology mapping for the semantic webWorawith Sangkatip
 
Linked Open Data about Springer Nature conferences. The story so far
Linked Open Data about Springer Nature conferences. The story so farLinked Open Data about Springer Nature conferences. The story so far
Linked Open Data about Springer Nature conferences. The story so farAliaksandr Birukou
 
Scientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an OverviewScientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an OverviewAngelo Salatino
 
AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep LearningAndre Freitas
 
X api chinese cop monthly meeting feb.2016
X api chinese cop monthly meeting   feb.2016X api chinese cop monthly meeting   feb.2016
X api chinese cop monthly meeting feb.2016Jessie Chuang
 
Semantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaSemantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaGiorgia Lodi
 
Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1ErhardRahm
 
Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Fernando de Assis Rodrigues
 
From Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsFrom Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsAndre Freitas
 

Similar to Presentation at MTSR 2012 (20)

Linked Open Data Visualization
Linked Open Data VisualizationLinked Open Data Visualization
Linked Open Data Visualization
 
A Framework for Ontology Usage Analysis
A Framework for Ontology Usage AnalysisA Framework for Ontology Usage Analysis
A Framework for Ontology Usage Analysis
 
Exploring Linked Data
Exploring Linked DataExploring Linked Data
Exploring Linked Data
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things
 
Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...Semantic Similarity and Selection of Resources Published According to Linked ...
Semantic Similarity and Selection of Resources Published According to Linked ...
 
Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.Data integration with a façade. The case of knowledge graph construction.
Data integration with a façade. The case of knowledge graph construction.
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so far
 
The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...
 
Weso research group
Weso research groupWeso research group
Weso research group
 
Ontology mapping for the semantic web
Ontology mapping for the semantic webOntology mapping for the semantic web
Ontology mapping for the semantic web
 
Linked Open Data about Springer Nature conferences. The story so far
Linked Open Data about Springer Nature conferences. The story so farLinked Open Data about Springer Nature conferences. The story so far
Linked Open Data about Springer Nature conferences. The story so far
 
Scientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an OverviewScientific Knowledge Graphs: an Overview
Scientific Knowledge Graphs: an Overview
 
AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep Learning
 
X api chinese cop monthly meeting feb.2016
X api chinese cop monthly meeting   feb.2016X api chinese cop monthly meeting   feb.2016
X api chinese cop monthly meeting feb.2016
 
Semantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaSemantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenza
 
Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1Scalable and privacy-preserving data integration - part 1
Scalable and privacy-preserving data integration - part 1
 
dotte.ppt
dotte.pptdotte.ppt
dotte.ppt
 
Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...
 
From Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsFrom Linked Data to Semantic Applications
From Linked Data to Semantic Applications
 

More from Riccardo Albertoni

Albertoni ldq workshop ESWC 2015
Albertoni ldq workshop ESWC 2015Albertoni ldq workshop ESWC 2015
Albertoni ldq workshop ESWC 2015Riccardo Albertoni
 
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Riccardo Albertoni
 
LusTRE: a Linked Thesaurus fRamework for Environment
LusTRE: a Linked Thesaurus fRamework for EnvironmentLusTRE: a Linked Thesaurus fRamework for Environment
LusTRE: a Linked Thesaurus fRamework for EnvironmentRiccardo Albertoni
 
SSONDE: Semantic Similarity On liNked Data Entities
SSONDE: Semantic Similarity On liNked Data EntitiesSSONDE: Semantic Similarity On liNked Data Entities
SSONDE: Semantic Similarity On liNked Data EntitiesRiccardo Albertoni
 
An ontology driven module for accessing chronic pathology literature- CHRONIO...
An ontology driven module for accessing chronic pathology literature- CHRONIO...An ontology driven module for accessing chronic pathology literature- CHRONIO...
Presentation at MTSR 2012

  • 1. Date: 30/11/2012 SSONDE: Semantic Similarity On liNked Data Entities Riccardo Albertoni ralbertoni@delicias.dia.fi.upm.es Ontology Engineering Group. Departamento de Inteligencia Artificial Facultad de Informática Universidad Politécnica de Madrid Joint work with Monica De Martino (CNR-IMATI-GE) MTSR 2012, 6th Metadata and Semantics Research Conference 28-30 November 2012 - Cádiz (Spain)
  • 2. Presentation Outline: 1. How SSONDE fits with other linked data technologies (What is it for? What is it not for?). 2. Characteristics of instance similarity in SSONDE (the theory behind SSONDE’s similarity is detailed in Albertoni R. and De Martino M.; Asymmetric and context dependent semantic similarity among ontology instances, Journal of Data Semantics, LNCS, 2008). 3. SSONDE Architecture and Examples on Linked Data.
  • 3. Linked Data crawling architectural pattern (figure). Labels: SSONDE; LDSpider/Fuseki; LDIF; build analysis services (cluster analysis, explorative search on resources). Source: Tom Heath and Christian Bizer (2011) Linked Data: Evolving the Web into a Global Data Space (1st edition). 1-136. Morgan & Claypool
  • 4. SSONDE instance similarity is not meant to align ontologies/schemas or to interlink/consolidate entities. It aims at providing a method for comparing entities represented as instances in an ontology-driven repository or as entities exposed in linked data, and at supporting explorative searches. It assumes all the integration steps are done: it works at the Application Layer of the Linked Data Crawling Architectural Pattern. Main characteristics (which make SSONDE unique in its kind): context, to represent similarity criteria (algorithm parameters); asymmetry, to emphasize containment between instances. Example: comparing researchers.
  • 7. Different contexts (the researchers, publications, … are instances). Each context selects the researchers’ features (data/object properties) considered in the similarity. Context “Researcher’s Experience”: age, number of publications, number of projects. Context “Researchers’ Scientific Interest”: common publications, common research projects, similar research interests. Callouts in the figure mark one feature as used only in one context and others as used in both contexts.
  • 8. Formalization of Application Context: a function that, for each recursion path, specifies which data/object properties and which operations to consider. Example (Researchers’ Scientific Interest: common publications, common research projects, similar research interests): [ResearchStaff] {{Φ}, {{Publication, Inter} {WorkAtProject, Inter} {interest, Simil}}}; [ResearchStaff, Interest] {{{TopicName, Inter}}, {{RelatedTopic, Inter}}}
  • 9. Why an asymmetric similarity? Sim(a,b) might differ from Sim(b,a). Sim is not the inverse of a metric distance, so metric properties cannot be exploited to prune comparisons. Here asymmetry is adopted to highlight the containment between instances A and B. Example of containment (comparing with respect to publications only; figure: A, B and their publications pub 1, pub 2, pub 3): A is a Ph.D. student who has always published with his tutor B. A is contained in B (A<<B): A can be replaced by B. B is not contained in A: if you replace B with A, some experience gets lost.
  • 10. SSONDE’s asymmetric similarity: Sim(A,B) ranges in [0,1] and is proportional to the number of data and object property values that A shares with B. If A is contained in B, Sim(A,B)=1. If A is not contained in B, Sim(A,B)<1. If A and B don’t share any “features”, Sim(A,B)=0. If A has exactly the same characteristics as B (A<<B, B<<A), Sim(A,B) = Sim(B,A) = 1.
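The properties listed on slide 10 can be illustrated with a deliberately simplified sketch. It assumes each instance is reduced to a flat set of property values and scores containment as the fraction of A's values that B also has. The function name `containment_similarity` and the sample data are hypothetical, and SSONDE's actual measure (context-driven and recursive, as described in the 2008 paper) is richer than this ratio.

```python
def containment_similarity(a_values, b_values):
    """Fraction of A's property values that B shares (asymmetric by design)."""
    a, b = set(a_values), set(b_values)
    if not a:
        # Assumption of this sketch: an instance with no values shares nothing.
        return 0.0
    return len(a & b) / len(a)


# Hypothetical data mirroring the Ph.D. student / tutor example.
student = {"pub1", "pub2"}          # A: always published with the tutor
tutor = {"pub1", "pub2", "pub3"}    # B: has one more publication

print(containment_similarity(student, tutor))  # 1.0, A is contained in B
print(containment_similarity(tutor, student))  # about 0.67, B is not contained in A
```

Swapping the arguments changes the result, which is the asymmetry the slides use to reveal containment.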
  • 11. Results comparing young and senior researchers of IMATI (two similarity matrices: Research Experience, Research Interest). The darker the matrix value, the higher the similarity.
  • 13. SSONDE architecture (figure). Configuration: kind of store; list of instances (Java class to generate the list); reference to the context; reference to rules (e.g., JENA rules). Layers: Similarity Context Layer, Ontology Layer, Data Layer. Data wrappers: JENA TDB, JENA SDB, JENA MEM, Virtuoso wrapper, …, over a TDB repository, an SDB repository, RDF dumps, Virtuoso. Output: similarity matrix in CSV; n-most similar entities in JSON. Linked data consumption: the Web of Data (RDF dumps, HTTP dereferenceable URIs, SPARQL endpoints, linked datasets served by third parties) feeds the local data store/cache through the crawling architectural pattern (LDIF, LDSpider + Fuseki).
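The "n-most similar entities in JSON" output can be pictured with a small sketch. It assumes the similarity matrix is held as a nested dictionary where `matrix[a][b]` is Sim(a, b); the function name `n_most_similar`, the sample scores, and the resulting JSON layout are all hypothetical and are not SSONDE's actual output format.

```python
import json


def n_most_similar(matrix, n):
    """For each entity, return its n most similar other entities, best first."""
    ranked = {}
    for entity, row in matrix.items():
        # Skip the entity itself, then sort the rest by decreasing similarity.
        others = [(other, score) for other, score in row.items() if other != entity]
        others.sort(key=lambda pair: pair[1], reverse=True)
        ranked[entity] = others[:n]
    return ranked


# Hypothetical asymmetric matrix: matrix[a][b] is Sim(a, b).
matrix = {
    "A": {"A": 1.0, "B": 1.0, "C": 0.0},
    "B": {"A": 0.5, "B": 1.0, "C": 0.25},
    "C": {"A": 0.0, "B": 0.5, "C": 1.0},
}

top = n_most_similar(matrix, 1)
print(json.dumps(top, indent=2))
```

Because the matrix is asymmetric, each row is ranked on its own: A's best match is B with score 1.0, while B's best match is A with only 0.5.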
  • 14. SSONDE: a building block for new analysis services. SSONDE applied on “real linked data”: (1) Analysing habitats and species: data published in NatureSDIplus (ECP-2007-GEO-317007), a European project developing a Spatial Data Infrastructure for Nature Conservation; used to rank habitats according to the species they host, giving an insight into inter-dependencies between habitats and species. (2) Analysing overlaps among scientific interests: a subset of the linked dataset provided by data.cnr.it as part of the SemanticScout framework by third parties (Gangemi et al.); used to compare IMATI-CNR researchers according to their research interests.
  • 16. Applying SSONDE on data.cnr.it http://code.google.com/p/ssonde/wiki/RDF_statements_download
  • 17. Configuration file: { "StoreConfiguration":{ "KindOfStore":"JENATDB", "RDFDocumentURIs":[ ], "TDBDirectory":"data/CNRIT/TDB-0.8.9/CNRR/" }, "InstanceConfiguration":{ "InstanceURIsClass":"application.dataCNRIt.GetResearcherIMATIplusCoauthor" }, "OutputConfiguration":{ "KindOfOutput":"JSONOrderedResult", "NumberOfOrderedResult":"20", "FilePath":"conf/dataCNRIt/ComplexContextResearchInterest/CRRIIntPub.res.json" }, "ContextConfiguration":{ "ContextFilePath":"conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.ctx" } } Annotations: InstanceConfiguration gives the list of LOD entity URIs through a Java class implementing ListOfInputInstances; OutputConfiguration selects either the similarity matrix in CSV or the JSON encoding of the top n-most similar entities; ContextConfiguration points to the context, encoded in an in-house text format (hopefully soon in JSON).
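As a quick sanity check, the configuration from slide 17 can be parsed with any JSON library. The sketch below retypes it with straight quotes and verifies that the four sections shown on the slide are present. The names `CONFIG_TEXT` and `REQUIRED_SECTIONS` are hypothetical, and this is not SSONDE's own loader; it only checks that the file is well-formed JSON.

```python
import json

# Configuration as shown on the slide, with straight double quotes throughout.
CONFIG_TEXT = """
{
  "StoreConfiguration": {
    "KindOfStore": "JENATDB",
    "RDFDocumentURIs": [],
    "TDBDirectory": "data/CNRIT/TDB-0.8.9/CNRR/"
  },
  "InstanceConfiguration": {
    "InstanceURIsClass": "application.dataCNRIt.GetResearcherIMATIplusCoauthor"
  },
  "OutputConfiguration": {
    "KindOfOutput": "JSONOrderedResult",
    "NumberOfOrderedResult": "20",
    "FilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CRRIIntPub.res.json"
  },
  "ContextConfiguration": {
    "ContextFilePath": "conf/dataCNRIt/ComplexContextResearchInterest/CCRIIntPub.ctx"
  }
}
"""

# Hypothetical checklist: the four top-level sections visible on the slide.
REQUIRED_SECTIONS = (
    "StoreConfiguration",
    "InstanceConfiguration",
    "OutputConfiguration",
    "ContextConfiguration",
)

config = json.loads(CONFIG_TEXT)
missing = [section for section in REQUIRED_SECTIONS if section not in config]
print("missing sections:", missing)
print("store:", config["StoreConfiguration"]["KindOfStore"])
```

Note that `NumberOfOrderedResult` is a string on the slide, so a consumer would need to convert it before using it as a count.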
  • 18. Data.cnr.it: defining a context. Figure: a graph of researchers (Res 225, Res 226), publications (pub: 22, pub: 26, pub: 29) and topics (Topic:2, Topic:23, Topic:25, Topic:26, Topic:27), connected by pub:autoreCNRdi (publications), dc:subject (interests) and skos:broader (interest hierarchy); one part is marked “Crawled by Data.CNR.it”, the other “Crawled by DBPEDIA”. PREFIX dc: <http://purl.org/dc/terms/> PREFIX pub: <http://www.cnr.it/ontology/cnr/pubblicazioni.owl#> [owl:Thing, dc:subject]-> {{},{(skos:broader, Inter)}} [owl:Thing]-> {{}, { (pub:autoreCNRDi, Inter),(dc:subject, Simil)}} No data properties are considered in this context.
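One way to read the two context rules on slide 18 is as a lookup table from recursion path to the properties and operations to apply. The sketch below is a hypothetical in-memory representation, not SSONDE's `.ctx` file format or its Java API; the names `CONTEXT` and `criteria_for` are introduced only for illustration.

```python
# Each key is a recursion path; each value lists the data properties and the
# object properties (with their operation, Inter or Simil) to consider there.
CONTEXT = {
    ("owl:Thing",): {
        "data": [],  # no data properties are considered in this context
        "object": [("pub:autoreCNRDi", "Inter"), ("dc:subject", "Simil")],
    },
    ("owl:Thing", "dc:subject"): {
        "data": [],
        "object": [("skos:broader", "Inter")],
    },
}


def criteria_for(path):
    """Return the (property, operation) pairs configured for a recursion path."""
    entry = CONTEXT.get(tuple(path), {"data": [], "object": []})
    return entry["data"] + entry["object"]


print(criteria_for(["owl:Thing"]))
print(criteria_for(["owl:Thing", "dc:subject"]))
```

Under this reading, researchers are first compared on shared publications and on similar subjects, and the comparison then recurses along dc:subject, where topics are matched through skos:broader.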
  • 19. Similarity matrix: the data is more recent but less accurate; however, more researchers are represented and containment is still highlighted.
  • 20. Hierarchical clustering: scientific clusters are discovered. Tool: Hierarchical Clustering Explorer, 3.0, Human-Computer Interaction Lab, University of Maryland. http://www.cs.umd.edu/hcil/multi-cluster/.
  • 21. What next? (i) Semantic similarity optimization: (a) caching of intermediate similarity results; (b) adoption of the MapReduce paradigm to speed up the assessment of semantic similarity. (ii) Domain-driven extensions at the data layer: (a) defining new data layer measures suited for geo-referenced entities; (b) multilingual similarity. (iii) Definition of interfaces sifting entities according to their similarity, exploiting visualization frameworks such as Exhibit, Google Visualization and JavaScript InfoVis Toolkit.
  • 22. THANKS for your kind attention!!! Questions / Discussion / Suggestions. The SSONDE framework pushes our instance similarity as a ready-to-go tool for the analysis of linked data; its Java code is available on Google Code (http://purl.oclc.org/NET/SSONDE) and is licensed as open source (GNU GPL v3). Do not hesitate to contact us if SSONDE can be deployed in some of your future projects (proposals) or if you are interested in contributing to the SSONDE open framework.
  • 23. References. SSONDE Framework: R. Albertoni, M. De Martino, SSONDE: Semantic Similarity On liNked Data Entities, 6th Metadata and Semantics Research Conference, 28-30 November 2012 - Cádiz (Spain) [to appear]. Framework installation & use: http://code.google.com/p/ssonde/wiki/GettingStarted. Semantic Similarity Theoretical Framework: Albertoni R. and De Martino M.; Asymmetric and context dependent semantic similarity among ontology instances, Journal of Data Semantics, LNCS, 2008. Albertoni R. and De Martino M.; Semantic similarity of ontology instances tailored on the application context. Full paper at On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE, volume 4275 of LNCS, pages 1020–1038. Springer, 2006. Issues adapting the theoretical framework to Linked Data: Albertoni R., De Martino M.; Semantic Similarity and Selection of Resources Published According to Linked Data Best Practice, OnToContent 2010, Part of the OTM (OTM'10). Further Applications. Comparing EUNIS habitats wrt their species: Albertoni R., De Martino M.; Semantic Technology to Exploit Digital Content Exposed as Linked Data, eChallenges e-2011, 26-28 October 2011, Florence, Italy. Comparing shapes metadata (not Linked Data): Albertoni R., De Martino M.; Using Context Dependent Semantic Similarity to Browse Information Resources: an Application for the Industrial Design, First workshop on multimedia Annotation and Retrieval enabled by Shared Ontologies, Genoa, Italy, (2007). A complete list of references on SSONDE and its Instance Similarity