Semantic Similarity and Selection
of Resources Published According
to Linked Data Best Practice
Riccardo Albertoni,
Monica De Martino
CNR-IMATI-GE
Institute of Applied Mathematics and Information Technologies
(Dept of Genoa)
Consiglio Nazionale delle Ricerche, Italy
The 6th International Workshop on Ontology Content (OnToContent 2010)
Oct 28, 2010, Crete. Part of OTM (OTM'2010)
Outline
• Resource Selection, Semantic Similarity and Linked
data.
▫ Why does resource selection matter?
▫ Real example:
▪ Complex metadata to document resources
▪ Linked data paves the way for sharing complex metadata
▫ Semantic similarity as a basis for resource selection
▪ Useful features such as asymmetry and context dependence
• Scaling semantic similarity up to the Web of Data
▫ Issues and research plan/direction
▫ Exploratory phase with real data from the Web of Data
▪ Are the issues we consider relevant? In which varieties and shapes do the issues occur in real data?
▫ Lessons learnt from the exploratory phase
Resource Selection:
• why does it matter?
▫ Effective sharing and reuse of data are still
desiderata of many scientific and industrial
domains where the selection of tailored and
high-quality data is a necessary condition to
provide successful and competitive services
• Resource selection
▫ in order to select the resources which fit a given
problem/task, we rely on an analysis of the
metadata documenting the resources
Real Example
[Figure: sea trial data pipeline with the stages Acquisition, Preprocessing, Integration, Models/Analysis, and Web server]
courtesy of NATO Undersea Research Centre (NURC),
Example developed in NURC Research Assistance
granted to R. Albertoni (2008)
Short term perspective: data is collected and
elaborated for well planned purposes (aka sea trial experiments)
Potential new “customers” for sea trial data
• NATO agencies/nations ask for previously collected data
• New scientists arriving at NURC
▫ They want to access the data in order to produce models with their own
approaches and to compare the results with models already
produced at NURC (benchmarking)
• Scientists/agencies investigating how phenomena
have changed over a long period
▫ They are interested in data collected in the past
• Scientists/agencies planning a new sea trial
▫ It is useful to know what was collected in previous sea trials and
how the data were elaborated
Data reusability: unplanned use of data
long term perspective
Potential customers’ point of view
These customers were not involved in the sea
trials; thus, when searching for data, they
wonder:
• Is the data collected at NURC suitable for the
application I have in mind?
• Is the data reliable enough?
To answer these macro questions
• Users need details about how the data has
been acquired, pre-processed, integrated, and
analyzed, and even who was in charge
of which part…
[Figure labels: Models/Data, Processes, People, Sensors, Characteristics]
Metadata complexity in the real world: linked data helps in
sharing complex metadata
[Figure: metadata elements and the vocabularies documenting them. Elements: sensor's responsible party; sensor settings; parameters and choices made during the preprocessing; analysis applied, parameters, etc. Vocabularies: SensorML (one description per sensor), FOAF, ISO 19115 Core, Test Plan, Dublin Core, APO]
Problem: keep the bar balanced!
• Semantic similarity as metadata analysis, to support users in comparing the features of candidate resources
• Huge amount of ontology-driven metadata describing complex features as linked data
semantic similarity as metadata
analysis tool
• Instance similarity is fundamental to support detailed
comparison, ranking and selection of resources through
their ontology-driven metadata
▫ Albertoni R., De Martino M., Asymmetric and context-dependent
semantic similarity among ontology instances, Journal on Data
Semantics X, Springer Verlag, (2008).
• It explicitly addresses
▫ Context as an explicit parameterization of the similarity assessment
▪ The context specifies which features to consider and how
▫ Asymmetry to highlight containment between resources
▪ Sim(A,B) ranges over [0,1] and measures how many
features A shares with B out of all of A's features
▪ If features(A) are contained in features(B): sim(A,B)=1 and
sim(B,A)<1
• Limitation: not designed for linked data; it targeted a locally stored,
ontology-driven repository with one well-defined schema
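The asymmetry property stated above can be illustrated with a minimal sketch. It assumes features are plain sets of labels and uses simple set containment; the published measure is richer (it is parameterized by a context and exploits the ontology structure), so the function name and the feature labels below are illustrative assumptions only.

```python
def asymmetric_similarity(features_a: set, features_b: set) -> float:
    """Fraction of A's features that B also has, in [0, 1].

    Simplified stand-in for the published measure: no context and no
    ontology structure, only set containment.
    """
    if not features_a:
        # Assumption: an empty feature set is trivially contained in any other.
        return 1.0
    return len(features_a & features_b) / len(features_a)


# Hypothetical feature sets: features(A) is contained in features(B).
a = {"sensor", "settings"}
b = {"sensor", "settings", "preprocessing"}

sim_ab = asymmetric_similarity(a, b)  # all of A's features are in B
sim_ba = asymmetric_similarity(b, a)  # B has a feature that A lacks
print(sim_ab, sim_ba)
```

With these inputs sim(A,B) is 1 while sim(B,A) is below 1, which is the containment behaviour described in the bullet above.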
How to make semantic similarity scale up to the Web of Data? 1/2
• Identified issue: non-authoritative metadata, i.e., metadata published by actors who are neither the resource producers nor the owners
▫ WHEN: metadata documents resources that have been re-elaborated or reviewed by third parties
▫ Research plan: synergies with semantic web indexes (e.g., SINDICE) to retrieve non-authoritative features
• Identified issue: heterogeneous metadata, i.e., metadata provided according to different, sometimes interlinked, more often overlapping metadata vocabularies
▫ WHEN: metadata for a resource is provided by stakeholders with different fields of competency, who may use different vocabularies that are not always independent
▫ Research plan: deploying schema- and entity-level consolidation using both explicit metadata statements and the mining of implicit equivalences through co-occurring resource annotations
How to make semantic similarity scale up to the Web of Data? 2/2
• Identified issue: non-consistently identified metadata, namely metadata occurring when the same resource has different identifiers in distinct metadata sets
▫ WHEN: two actors in the pipeline independently document the same resource at different stages of the pipeline
▫ Research plan: reasoning techniques to be applied to web datasets, e.g., to smush fragments of distributed metadata; scripts to interlink resources relying on a-priori knowledge about how the datasets originated
• Identified issue: efficiency and computational issues; in the longer term, an accurate similarity assessment might prove computationally prohibitive
▫ WHEN: the number of resources discovered and features considered increases
▫ Research plan: caching of intermediate comparisons; techniques to prune comparisons according to a specified application context; algorithms for efficient parallelization can be studied
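As a rough illustration of the first item in the efficiency research plan, the sketch below memoizes pairwise feature comparisons with Python's `functools.lru_cache`. The comparison function is a deliberately trivial placeholder (exact match), and all names are hypothetical; it only shows where a cache would sit, not how the prototype is implemented.

```python
from functools import lru_cache

calls = 0  # counts how many comparisons are actually computed


@lru_cache(maxsize=None)
def compare(feature_a: str, feature_b: str) -> float:
    """Placeholder pairwise comparison: 1.0 on exact match, else 0.0."""
    global calls
    calls += 1
    return 1.0 if feature_a == feature_b else 0.0


def similarity(features_a, features_b) -> float:
    """Average, over A's features, of the best match found among B's features."""
    if not features_a:
        return 1.0  # assumption: nothing to match
    best = [max((compare(a, b) for b in features_b), default=0.0)
            for a in features_a]
    return sum(best) / len(best)


a = ("x", "y")
b = ("x", "z")
first = similarity(a, b)   # computes the four pairwise comparisons
second = similarity(a, b)  # served entirely from the cache
print(first, second, calls)
```

The second assessment reuses the intermediate comparisons, so the counter does not grow; in a real setting the cached unit would be the expensive comparison between resources or property values.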
Exploratory phase
• Facing the aforementioned issues is a very
challenging research plan!
• Let's get first-hand experience of the varieties introduced
by data providers
▫ Requirements:
▪ Real metadata published as linked data
▪ Provided by third parties
• Linked data provides huge potential for documenting
resources produced in complex pipelines, but it is not yet
a common practice
▫ We considered a simpler domain (researchers and
their publications)
▪ Semantic Web Dog Food-SWDF
(http://data.semanticweb.org/)
▪ DBLP in RDF (http://dblp.l3s.de/d2r)
Instance similarity redesigned
prototype
• As a test bed for experimenting with and investigating the
aforementioned issues
• Extension
▫ Extended the notion of context to include
namespaces, in order to consider properties from different
RDF schemas
▫ Updated the ontology model, moving from the
Protégé API to an RDF model
▪ JENA reasoner and SPARQL
Context:
Researcher X (URI(X)=A)
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc> ;
foaf:made <paperC>;
foaf:made <paperD>;
foaf:made <paperH>.
Researcher Y (URI(Y)=B)
<B> rdfs:label "B descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <vbn> ;
foaf:made <paperC>;
foaf:made <paperF>.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
[foaf:Person]->{{},{(foaf:made, Count)}}
Two researchers are as similar as their
numbers of publications are similar
X has 3 publications, Y has 2
SIM(X,Y)= SIM(A,B)= 2/max(3,2)=2/3
SIM(Y,X)= SIM(B,A)= 3/max(3,2)=1
For more complex similarity assessments, see
R. Albertoni, M. De Martino, JoDS X, 2008.
We compare researchers by their URIs
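The count-based context on this slide can be reproduced with a minimal sketch. It assumes triples are plain Python tuples and reads the two worked values as sim(X,Y) = count(Y) / max(count(X), count(Y)); the function names are hypothetical, and the general measure in the cited JoDS paper covers much more than this single Count operation.

```python
from fractions import Fraction

FOAF_MADE = "http://xmlns.com/foaf/0.1/made"

# The two researchers from the slide, as (subject, property, object) tuples.
triples = [
    ("A", FOAF_MADE, "paperC"),
    ("A", FOAF_MADE, "paperD"),
    ("A", FOAF_MADE, "paperH"),
    ("B", FOAF_MADE, "paperC"),
    ("B", FOAF_MADE, "paperF"),
]


def count_values(triples, subject, prop):
    """Number of distinct values of `prop` for `subject`."""
    return len({o for s, p, o in triples if s == subject and p == prop})


def count_similarity(triples, x, y, prop=FOAF_MADE):
    """Count-based similarity of x to y, following the slide's two values."""
    cx = count_values(triples, x, prop)
    cy = count_values(triples, y, prop)
    if max(cx, cy) == 0:
        return Fraction(1)  # assumption: no values on either side
    return Fraction(cy, max(cx, cy))


print(count_similarity(triples, "A", "B"))  # SIM(A,B)
print(count_similarity(triples, "B", "A"))  # SIM(B,A)
```

Exact fractions are used so that the result can be compared directly with the 2/3 and 1 given on the slide.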
Non-Authoritative Metadata - Example
URI(Giovanni)=A
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc> ;
foaf:made <paperC>;
foaf:made <paperD>;
foaf:made <paperH>.
URI(Renaud)=B
<B> rdfs:label "B descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <vbn> ;
foaf:made <paperC>;
foaf:made <paperF>.
Let's compare Giovanni and Renaud starting from their URIs in
DBLP
A= http://dblp.l3s.de/…../Giovanni_Tummarello
B= http://dblp.l3s.de/…./Renaud_Delbru
But we know that Semantic Web Dog Food (SWDF) might provide more information about
Giovanni and Renaud.
What if SWDF provides an additional paper for Renaud,
a paper which Giovanni did not coauthor?
SIM(Giovanni, Renaud)=1 instead of 2/3….
Non-Authoritative Metadata -SINDICE
IDEA: query SINDICE by the researchers' URIs A and B to get the RDF
fragments pertaining to Giovanni and Renaud
• URIs rather than names as keywords: different people might share
the same name, so URIs are in principle more precise
Result: you get RDF fragments from DBLP only!
None from Semantic Web Dog Food, which we know provides further information.
First lesson: non-authoritative metadata and non-consistently
identified metadata are tightly interrelated in real practice. To
deal effectively with the former issue, we often have to take care of the
latter.
[Figure: the SWDF researchers' URIs and the DBLP researchers' URIs do not overlap!]
How to move next?
IDEA: if SWDF added rules like
<http://data.semanticweb.org/person/name-[middlename]-[familyname]>
owl:sameAs <http://dblp.l3s.de/d2r/resource/authors/name_[middle-name]_familyname>
[Figure: DBLP URI --- owl:sameAs --- SWDF URI]
the SWDF fragments would have been retrieved by SINDICE.
[We are implicitly assuming some reasoning,
e.g.:
(X owl:sameAs X1) and (X1 rel Z) -> (X rel Z)
]
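The reasoning assumed in the bracketed note can be sketched in a few lines. This is an illustration under stated assumptions, not the prototype's JENA configuration: triples are plain tuples, the identifiers are made up, owl:sameAs is treated as symmetric, and sameAs transitivity is deliberately left out.

```python
SAME_AS = "owl:sameAs"


def apply_same_as_rule(triples):
    """Apply (X sameAs X1) and (X1 rel Z) -> (X rel Z) until a fixpoint.

    owl:sameAs is symmetric, so statements are copied in both directions.
    sameAs statements themselves are not propagated (no transitivity here).
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        same = {(s, o) for s, p, o in inferred if p == SAME_AS}
        same |= {(o, s) for s, o in same}
        for x, x1 in same:
            for s, p, o in list(inferred):
                if s == x1 and p != SAME_AS and (x, p, o) not in inferred:
                    inferred.add((x, p, o))
                    changed = True
    return inferred


# Hypothetical identifiers standing in for the SWDF and DBLP URIs.
triples = {
    ("swdf:giovanni", SAME_AS, "dblp:Giovanni"),
    ("dblp:Giovanni", "foaf:made", "dblp:paperC"),
    ("swdf:giovanni", "foaf:made", "swdf:paperE"),
}
closed = apply_same_as_rule(triples)
print(sorted(closed))
```

After the rule is applied, both identifiers carry both publications, which is what makes the SWDF statements usable when the comparison starts from the DBLP URI.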
Heterogeneous metadata - Example
RDF for Giovanni in DBLP
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc> ;
foaf:made <paperC>;
foaf:made <paperD>;
foaf:made <paperH>.
RDF for Giovanni in SWDF
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc>.
<paperE> foaf:maker <A>.
<paperB> foaf:maker <A>.
Here the problem does not show up as different RDF
schemas:
both DBLP and SWDF use FOAF …
foaf:made is the owl:inverseOf foaf:maker, but you cannot know that unless you
dereference/load the FOAF schema
Second lesson: the ontologies/schemas/properties in the context must be
dereferenced, just like the entities' URIs, to make the semantics of the
properties exploitable.
We must be careful when dereferencing
• Dereferencing schemata and URIs
▫ is extremely slow
▫ adds many RDF statements which might prove
useless for semantic similarity assessment
▪ Info not pertaining to the specified context
▫ ends up with a huge amount of derived RDF
statements, which might worsen efficiency and
computational problems
Third lesson: specific and context-driven policies to dereference the URIs
and retrieve RDF fragments should be deployed in order to ease
efficiency and computational problems.
For example: dereference only the properties mentioned in the context, or
consider only the RDF fragments returned by SINDICE with an explicit
reference to the schemas mentioned in the context.
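One possible reading of such a context-driven policy is sketched below: keep only the statements whose property appears in the context, and derive from the context the schema namespaces worth dereferencing. The helper names and the namespace heuristic (cut at the last `#` or `/`) are assumptions made for illustration.

```python
FOAF_MADE = "http://xmlns.com/foaf/0.1/made"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def namespace_of(uri: str) -> str:
    """Heuristic: the namespace is everything up to the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


def restrict_to_context(triples, context_properties):
    """Keep only statements whose property is mentioned in the context."""
    return [t for t in triples if t[1] in context_properties]


def schemas_to_dereference(context_properties):
    """Namespaces of the context properties: the only schemas to load."""
    return {namespace_of(p) for p in context_properties}


context = {FOAF_MADE}
triples = [
    ("A", RDFS_LABEL, "A descr"),
    ("A", FOAF_MADE, "paperC"),
    ("A", FOAF_MADE, "paperD"),
]
kept = restrict_to_context(triples, context)
schemas = schemas_to_dereference(context)
print(kept, schemas)
```

Filtering before reasoning keeps statements such as labels out of the similarity computation, at the cost of missing properties that only become relevant after inference (for example an inverse property), so the context would need to list those as well.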
How to move next?
RDF for Giovanni in DBLP
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc> ;
foaf:made <paperC>;
foaf:made <paperD>;
foaf:made <paperH>.
RDF for Giovanni in SWDF
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc>.
<paperE> foaf:maker <A>.
<paperB> foaf:maker <A>.
<A> foaf:made <paperE>.
<A> foaf:made <paperB>.
Assuming we have dereferenced foaf:maker, or
loaded into the reasoner a rule saying (P foaf:maker X) ->
(X foaf:made P)
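The inverse-property rule, written with the FOAF property name foaf:made, can be sketched as follows. Triples are plain tuples and the function name is hypothetical; a reasoner loaded with the FOAF schema would derive the same statements from the owl:inverseOf declaration.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def apply_inverse_rule(triples, prop, inverse_prop):
    """Apply (P prop X) -> (X inverse_prop P) to every matching statement."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == prop:
            inferred.add((o, inverse_prop, s))
    return inferred


# DBLP states foaf:made on the author; SWDF states foaf:maker on the paper.
triples = {
    ("A", FOAF + "made", "paperC"),
    ("A", FOAF + "made", "paperD"),
    ("A", FOAF + "made", "paperH"),
    ("paperE", FOAF + "maker", "A"),
    ("paperB", FOAF + "maker", "A"),
}
merged = apply_inverse_rule(triples, FOAF + "maker", FOAF + "made")
made_by_a = {o for s, p, o in merged if s == "A" and p == FOAF + "made"}
print(sorted(made_by_a))
```

Once the rule has been applied, a Count over foaf:made sees the publications from both sources, instead of only those stated with foaf:made.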
Non-consistently identified metadata
What if the same publication is provided by both DBLP and SWDF?
E.g., DBLP:paperC and SWDF:paperB are two URIs for the same paper
We count it twice.
Fourth lesson: non-consistently identified metadata is a recursive
problem. Consolidating researchers without consolidating papers leads
to wrong similarity results. We must be sure that the entities and properties in
the similarity context have been properly consolidated before applying
instance similarity.
RDF for Giovanni in DBLP + RDF for Giovanni in SWDF
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc> ;
foaf:made <DBLP:paperC>;
foaf:made <DBLP:paperD>;
foaf:made <DBLP:paperH>.
<A> rdfs:label "A descr" ;
dc:license <http://vb.com> ;
foaf:primaryTopic <xzc>.
<A> foaf:made <SWDF:paperE>.
<A> foaf:made <SWDF:paperB>.
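The double-counting effect can be made concrete with a small sketch that consolidates paper identifiers before counting. It assumes the equivalence between DBLP:paperC and SWDF:paperB is already known and uses a simple union-find to pick one canonical identifier per cluster; how such equivalences are discovered is outside this sketch.

```python
def consolidate(uris, same_as_pairs):
    """Map each URI to one canonical representative of its sameAs cluster."""
    parent = {u: u for u in uris}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in same_as_pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Deterministic choice: keep the lexicographically smaller root.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return {u: find(u) for u in parent}


papers = ["DBLP:paperC", "DBLP:paperD", "DBLP:paperH",
          "SWDF:paperE", "SWDF:paperB"]
same_as = [("DBLP:paperC", "SWDF:paperB")]  # assumed known equivalence

canonical = consolidate(papers, same_as)
naive_count = len(set(papers))
consolidated_count = len({canonical[p] for p in papers})
print(naive_count, consolidated_count)
```

Counting over canonical identifiers gives one publication fewer than the naive count, which is the correction the fourth lesson asks for before instance similarity is applied.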
Conclusion (I)
• Linked data best practice and our semantic
similarity
▫ good potential to support data selection for
resources in complex domains
• But scaling semantic similarity up to the Web of Data
means dealing with
▫ Non-authoritative metadata
▫ Heterogeneous metadata
▫ Non-consistently identified metadata
▫ Efficiency and computational issues
Conclusion (II)
• The exploratory phase shows
▫ All the mentioned issues arise even in a very simple
scenario of semantic similarity assessment
▫ It is pivotal to have first-hand experience with real
data to discover the shapes the issues might assume
• Consideration
▫ The problems we found are not exclusive to similarity
assessment
▪ We suspect these issues arise whenever you try to
elaborate information published as linked data in
order to mine new facts/info from the published
data
Do not hesitate to email me (Albertoni@ge.imati.cnr.it)
if you have offline questions

MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
Yongyao Jiang
 

Similar to Semantic Similarity and Selection of Resources Published According to Linked Data Best Practice (20)

10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...
 
Hansen Metadata for Institutional Repositories
Hansen Metadata for Institutional RepositoriesHansen Metadata for Institutional Repositories
Hansen Metadata for Institutional Repositories
 
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository Services
 
Linked Data: Uses and Users
Linked Data: Uses and UsersLinked Data: Uses and Users
Linked Data: Uses and Users
 
Hide the Stack: Toward Usable Linked Data
Hide the Stack:Toward Usable Linked DataHide the Stack:Toward Usable Linked Data
Hide the Stack: Toward Usable Linked Data
 
Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...
Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...
Llebot "Research Data Support for Researchers: Metadata, Challenges, and Oppo...
 
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
Semantics-enhanced Geoscience Interoperability, Analytics, and ApplicationsSemantics-enhanced Geoscience Interoperability, Analytics, and Applications
Semantics-enhanced Geoscience Interoperability, Analytics, and Applications
 
Pemanfaatan Big Data Dalam Riset 2023.pptx
Pemanfaatan Big Data Dalam Riset 2023.pptxPemanfaatan Big Data Dalam Riset 2023.pptx
Pemanfaatan Big Data Dalam Riset 2023.pptx
 
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
 
Big data deep learning: applications and challenges
Big data deep learning: applications and challengesBig data deep learning: applications and challenges
Big data deep learning: applications and challenges
 
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...Big Data (SOCIOMETRIC METHODS FOR  RELEVANCY ANALYSIS OF LONG TAIL  SCIENCE D...
Big Data (SOCIOMETRIC METHODS FOR RELEVANCY ANALYSIS OF LONG TAIL SCIENCE D...
 
Integrating an electronic lab notebook with a data repository; American Chemi...
Integrating an electronic lab notebook with a data repository; American Chemi...Integrating an electronic lab notebook with a data repository; American Chemi...
Integrating an electronic lab notebook with a data repository; American Chemi...
 
Elns and repositories, American Chemical Society, Dallas, March 2014
Elns and repositories, American Chemical Society, Dallas, March 2014Elns and repositories, American Chemical Society, Dallas, March 2014
Elns and repositories, American Chemical Society, Dallas, March 2014
 
How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?
 
How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?
 
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogueseROSA Stakeholder WS1: Data discovery through federated dataset catalogues
eROSA Stakeholder WS1: Data discovery through federated dataset catalogues
 
Data sharing as part of the research ecosystem
Data sharing as part of the research ecosystemData sharing as part of the research ecosystem
Data sharing as part of the research ecosystem
 
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at ScaleFull Erdmann Ruttenberg Community Approaches to Open Data at Scale
Full Erdmann Ruttenberg Community Approaches to Open Data at Scale
 
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
 

More from Riccardo Albertoni

Albertoni ldq workshop ESWC 2015
Albertoni ldq workshop ESWC 2015Albertoni ldq workshop ESWC 2015
Albertoni ldq workshop ESWC 2015
Riccardo Albertoni
 
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Riccardo Albertoni
 
LusTRE: a Linked Thesaurus fRamework for Environment
LusTRE: a Linked Thesaurus fRamework for EnvironmentLusTRE: a Linked Thesaurus fRamework for Environment
LusTRE: a Linked Thesaurus fRamework for Environment
Riccardo Albertoni
 
SSONDE: Semantic Similarity On liNked Data Entities
SSONDE: Semantic Similarity On liNked Data EntitiesSSONDE: Semantic Similarity On liNked Data Entities
SSONDE: Semantic Similarity On liNked Data Entities
Riccardo Albertoni
 
An ontology driven module for accessing chronic pathology literature- CHRONIO...
An ontology driven module for accessing chronic pathology literature- CHRONIO...An ontology driven module for accessing chronic pathology literature- CHRONIO...
An ontology driven module for accessing chronic pathology literature- CHRONIO...Riccardo Albertoni
 
Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...
Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...
Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...
Riccardo Albertoni
 
SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...Riccardo Albertoni
 

More from Riccardo Albertoni (10)

Albertoni ldq workshop ESWC 2015
Albertoni ldq workshop ESWC 2015Albertoni ldq workshop ESWC 2015
Albertoni ldq workshop ESWC 2015
 
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014)
 
Presentation at MTSR 2012
Presentation at MTSR 2012Presentation at MTSR 2012
Presentation at MTSR 2012
 
LusTRE: a Linked Thesaurus fRamework for Environment
LusTRE: a Linked Thesaurus fRamework for EnvironmentLusTRE: a Linked Thesaurus fRamework for Environment
LusTRE: a Linked Thesaurus fRamework for Environment
 
Linkset quality (LWDM 2013)
Linkset quality (LWDM 2013)Linkset quality (LWDM 2013)
Linkset quality (LWDM 2013)
 
Linkset quality
Linkset qualityLinkset quality
Linkset quality
 
SSONDE: Semantic Similarity On liNked Data Entities
SSONDE: Semantic Similarity On liNked Data EntitiesSSONDE: Semantic Similarity On liNked Data Entities
SSONDE: Semantic Similarity On liNked Data Entities
 
An ontology driven module for accessing chronic pathology literature- CHRONIO...
An ontology driven module for accessing chronic pathology literature- CHRONIO...An ontology driven module for accessing chronic pathology literature- CHRONIO...
An ontology driven module for accessing chronic pathology literature- CHRONIO...
 
Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...
Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...
Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an...
 
SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...SKOS and semantic web best practice to access terminological resources: Natur...
SKOS and semantic web best practice to access terminological resources: Natur...
 

Recently uploaded

Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 

Recently uploaded (20)

Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 

Semantic Similarity and Selection of Resources Published According to Linked Data Best Practice

  • 1. Semantic Similarity and Selection of Resources Published According to Linked Data Best Practice. Riccardo Albertoni, Monica De Martino, CNR-IMATI-GE, Institute of Applied Mathematics and Information Technologies (Dept of Genoa), Consiglio Nazionale delle Ricerche, Italy. The 6th International Workshop on Ontology Content (OnToContent 2010), Oct 28, 2010, Crete. Part of the OTM (OTM'2010).
  • 2. Outline • Resource selection, semantic similarity and Linked Data ▫ Why does resource selection matter? ▫ Real example: ▪ Complex metadata to document resources ▪ Linked Data paves the way for sharing complex metadata ▫ Semantic similarity as a basis for resource selection ▪ Useful features such as asymmetry and context dependence • Scaling semantic similarity up to the Web of Data ▫ Issues and research plan ▫ Exploratory phase with real data from the Web of Data ▪ Are the issues we consider relevant? In which varieties do the issues occur in real data? ▫ Lessons learnt from the exploratory phase
  • 3. Resource selection • Why does it matter? ▫ Effective sharing and reuse of data are still desiderata in many scientific and industrial domains, where selecting tailored, high-quality data is a necessary condition for providing successful and competitive services • Resource selection ▫ To select the resources that fit a given problem or task, we rely on an analysis of the metadata documenting those resources
  • 4. Real example: sea trial. Pipeline stages (figure labels): Acquisition, Preprocessing, Integration, Models, Analysis, Web server. Courtesy of NATO Undersea Research Centre (NURC); example developed in the NURC Research Assistance granted to R. Albertoni (2008). Short-term perspective: data is collected and processed for well-planned purposes (i.e., sea trial experiments).
  • 5. Potential new “customers” for sea trial data • NATO agencies/nations asking for previously collected data • New scientists arriving at NURC ▫ They want to access the data in order to build models with their own approaches and compare the results with models already produced at NURC (benchmarking) • Scientists/agencies investigating how phenomena have changed over a long period ▫ They are interested in data collected in the past • Scientists/agencies planning a new sea trial ▫ It can be useful to know what was collected in previous sea trials and how the data was processed. Data reusability: unplanned use of data, long-term perspective.
  • 6. Potential customers’ point of view. These customers were not involved in the sea trials; thus, when searching for data, they wonder: • Is the data collected at NURC suitable for the application I have in mind? • Is the data reliable enough? To answer these macro questions: • Users need details about how the data was acquired, pre-processed, integrated and analyzed, and even who was in charge of which part.
  • 7. Metadata complexity in the real world: Linked Data helps in sharing complex metadata. Figure labels: Models, Data, Processes, People, Sensors, Characteristics. Aspects to document: the sensor's responsible party; sensor settings; parameters and choices made during preprocessing; the analysis applied and its parameters; etc. Vocabularies shown: APO, FOAF, ISO19115 Core, Test Plan, Dublin Core, SensorML (one description per sensor).
  • 8. Problem: keep the bar balanced! On one side, a huge amount of ontology-driven metadata describing complex features as Linked Data; on the other, semantic similarity as a metadata analysis tool that supports users in comparing the features of candidate resources.
  • 9. Semantic similarity as a metadata analysis tool • Instance similarity is fundamental to support detailed comparison, ranking and selection of resources through their ontology-driven metadata ▫ Albertoni R., De Martino M., Asymmetric and context-dependent semantic similarity among ontology instances, Journal on Data Semantics X, Springer Verlag (2008). • It explicitly addresses ▫ Context, as an explicit parameterization of the similarity assessment ▪ The context specifies which features to consider and how ▫ Asymmetry, to highlight containment between resources ▪ Sim(A,B) ranges over [0,1] and measures how many features A shares with B out of all of A's features ▪ If features(A) are contained in features(B): sim(A,B)=1 and sim(B,A)<1 • Limitation: it was not designed for Linked Data, but for a locally stored, ontology-driven repository with one well-defined schema
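The containment behaviour described in slide 9 can be illustrated with a deliberately simplified sketch. It treats each resource as a flat set of features and ignores the context parameterization, so it is not the measure published in the cited paper; the function name and the sample feature sets are hypothetical.

```python
def containment_similarity(features_a: set, features_b: set) -> float:
    """Share of A's features that also occur in B (asymmetric by construction)."""
    if not features_a:
        # Assumption of this sketch: an empty feature set is trivially contained.
        return 1.0
    return len(features_a & features_b) / len(features_a)


# Hypothetical feature sets: everything in A also appears in B.
a = {"sonar", "ctd"}
b = {"sonar", "ctd", "glider"}

sim_ab = containment_similarity(a, b)  # A is contained in B
sim_ba = containment_similarity(b, a)  # B has a feature that A lacks
print(sim_ab, sim_ba)
```

Swapping the arguments changes the denominator, which is what makes the score sensitive to containment.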
  • 10. How can semantic similarity scale up to the Web of Data? (1/2) Identified issues and research plan:
    ▫ Issue: non-authoritative metadata, i.e., metadata published by actors who are neither the resource producers nor the owners. When: metadata documents resources that have been re-elaborated or reviewed by third parties. Research plan: synergies with Semantic Web indexes (e.g., SINDICE) to retrieve non-authoritative features.
    ▫ Issue: heterogeneous metadata, i.e., metadata provided according to different, sometimes interlinked, more often overlapping metadata vocabularies. When: metadata for a resource is provided by stakeholders with different fields of competency, who may use different vocabularies that are not always independent. Research plan: deploying schema-level and entity-level consolidation, using both explicit metadata statements and implicit equivalences mined from co-occurring resource annotations.
  • 11. How can semantic similarity scale up to the Web of Data? (2/2) Identified issues and research plan:
    ▫ Issue: non-consistently identified metadata, i.e., metadata in which the same resource has different identifiers in distinct metadata sets. When: two actors in the pipeline independently document the same resource at different stages of the pipeline. Research plan: reasoning techniques applied to web datasets, e.g., to smush fragments of distributed metadata; scripts to interlink resources, relying on a-priori knowledge about how the datasets originated.
    ▫ Issue: efficiency and computational cost. In the longer term, an accurate similarity assessment might become computationally prohibitive. When: the number of resources discovered and features considered increases. Research plan: caching of intermediate comparisons; techniques to prune comparisons according to a specified application context; algorithms for efficient parallelization.
  • 12. Exploratory phase • Facing the aforementioned issues is a very challenging research plan! • Let's get first-hand experience of the varieties introduced by data providers ▫ Requirements: ▪ Real metadata published as Linked Data ▪ Provided by third parties • Linked Data offers huge potential for documenting resources produced in complex pipelines, but it is not yet common practice ▫ We considered a simpler domain (researchers and their publications) ▪ Semantic Web Dog Food, SWDF (http://data.semanticweb.org/) ▪ DBLP in RDF (http://dblp.l3s.de/d2r)
  • 13. Instance similarity: redesigned prototype • A test bed for experimenting with and investigating the aforementioned issues • Extensions ▫ Extended the notion of context to include namespaces, so that properties from different RDF schemas can be considered ▫ Updated the ontology model, moving from the Protege API to an RDF model ▪ Jena reasoner and SPARQL
  • 14. Context: Researcher X (URI(X)=A), Researcher Y (URI(Y)=B). We compare researchers by their URIs. RDF for A: <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> ; foaf:made <paperC>; foaf:made <paperD>; foaf:made <paperH>. RDF for B: <B> rdfs:label "B descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <vbn> ; foaf:made <paperC>; foaf:made <paperF>. Context definition: PREFIX foaf: <http://xmlns.com/foaf/0.1/> [foaf:Person]->{{},{(foaf:made, Count)}} With this context, two researchers are similar to the extent that they have a similar number of publications (here A has 3 and B has 2). SIM(X,Y)=SIM(A,B)=2/max(3,2)=2/3; SIM(Y,X)=SIM(B,A)=3/max(3,2)=1. See R. Albertoni, M. De Martino, JoDS X, 2008 for more complex similarity assessments.
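The arithmetic in slide 14 can be reproduced with a small sketch. The formula used here, count(y) / max(count(x), count(y)), is read off the two worked values on the slide and is an assumption about how the Count operator behaves, not a definition taken from the paper. Triples are plain Python tuples, and the function name is hypothetical.

```python
from fractions import Fraction

FOAF_MADE = "http://xmlns.com/foaf/0.1/made"

# The foaf:made statements from the slide, as (subject, predicate, object) tuples.
triples = [
    ("A", FOAF_MADE, "paperC"),
    ("A", FOAF_MADE, "paperD"),
    ("A", FOAF_MADE, "paperH"),
    ("B", FOAF_MADE, "paperC"),
    ("B", FOAF_MADE, "paperF"),
]


def count_similarity(triples, x, y, prop):
    """Assumed Count operator: count(y) / max(count(x), count(y))."""
    n_x = sum(1 for s, p, _ in triples if s == x and p == prop)
    n_y = sum(1 for s, p, _ in triples if s == y and p == prop)
    if max(n_x, n_y) == 0:
        return Fraction(1)  # assumption: no values on either side counts as equal
    return Fraction(n_y, max(n_x, n_y))


sim_ab = count_similarity(triples, "A", "B", FOAF_MADE)
sim_ba = count_similarity(triples, "B", "A", FOAF_MADE)
print(sim_ab, sim_ba)
```

Using `Fraction` keeps the result in the same form as the slide (2/3 and 1) instead of a rounded float.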
  • 15. Non-authoritative metadata: example. URI(Giovanni)=A, URI(Renaud)=B. RDF for A: <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> ; foaf:made <paperC>; foaf:made <paperD>; foaf:made <paperH>. RDF for B: <B> rdfs:label "B descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <vbn> ; foaf:made <paperC>; foaf:made <paperF>. Let's compare Giovanni and Renaud starting from their URIs in DBLP: A= http://dblp.l3s.de/…../Giovanni_Tummarello B= http://dblp.l3s.de/…./Renaud_Delbru But we know that Semantic Web Dog Food (SWDF) might provide more information about Giovanni and Renaud. What if SWDF provides an additional paper for Renaud that Giovanni did not co-author? Then SIM(Giovanni, Renaud)=1 instead of 2/3.
  • 16. Non-authoritative metadata: SINDICE. IDEA: query SINDICE by the researchers' URIs A and B to get the RDF fragments pertaining to Giovanni and Renaud • URIs rather than names as keywords: different people might share the same name, so URIs are in principle more precise. Result: you get RDF fragments from DBLP only, and none from Semantic Web Dog Food, even though we know it provides further information. The SWDF researchers' URIs and the DBLP researchers' URIs do not overlap! First lesson: non-authoritative metadata and non-consistently identified metadata are tightly interrelated in real practice. To deal effectively with the former issue, we often have to take care of the latter.
  • 17. How to move on? IDEA: if SWDF added rules like <http://data.semanticweb.org/person/name-[middlename]-[familyname]> owl:sameAs <http://dblp.l3s.de/d2r/resource/authors/name_[middle-name]_familyname> (figure: DBLP URI, owl:sameAs, SWDF URI), the SWDF fragments would have been retrieved by SINDICE. [We are implicitly assuming some reasoning, e.g.: (X owl:sameAs X1) and (X1 rel Z) -> (X rel Z)]
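The reasoning step assumed in slide 17, (X owl:sameAs X1) and (X1 rel Z) -> (X rel Z), can be sketched over plain tuples as follows. This is a single, non-transitive pass applied in both directions (owl:sameAs is symmetric); it is not how the Jena reasoner works internally, and the prefixed identifiers below are hypothetical placeholders rather than real DBLP or SWDF URIs.

```python
def apply_same_as(triples, same_as_pairs):
    """Copy every statement about a resource onto its declared aliases."""
    aliases = {}
    for x, x1 in same_as_pairs:
        aliases.setdefault(x, set()).add(x1)
        aliases.setdefault(x1, set()).add(x)

    inferred = set(triples)
    for s, p, o in triples:
        for alias in aliases.get(s, ()):
            inferred.add((alias, p, o))
    return inferred


# Hypothetical identifiers for the same researcher in the two datasets.
triples = {
    ("dblp:Renaud", "foaf:made", "dblp:paperC"),
    ("swdf:renaud", "foaf:made", "swdf:paperF"),
}
same_as = [("swdf:renaud", "dblp:Renaud")]

merged = apply_same_as(triples, same_as)
for t in sorted(merged):
    print(t)
```

After the pass, statements published under the SWDF identifier are also available under the DBLP identifier, which is the URI the comparison starts from.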
  • 18. Heterogeneous metadata - Example. RDF for Giovanni in DBLP: <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> ; foaf:made <paperC> ; foaf:made <paperD> ; foaf:made <paperH> . RDF for Giovanni in SWDF: <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> . <paperE> foaf:maker <A> . <paperB> foaf:maker <A> . Here the problem does not stem from different RDF schemas: both DBLP and SWDF use FOAF. However, DBLP states authorship with foaf:made, while SWDF uses foaf:maker. foaf:made is owl:inverseOf foaf:maker, but you cannot know that unless you dereference/load the FOAF schema. Second lesson: the ontologies/schemas/properties in the context must be dereferenced just like the entities' URIs to make the semantics of properties exploitable.
  • 19. We must be careful when dereferencing • Dereferencing schemas and URIs ▫ is extremely slow ▫ adds many RDF statements that might be useless for semantic similarity assessment (info not pertaining to the specified context) ▫ ends up with a huge number of derived RDF statements, which might worsen efficiency and computational problems. Third lesson: specific, context-driven policies for dereferencing URIs and retrieving RDF fragments should be deployed in order to ease efficiency and computational problems. For example: dereference only the properties mentioned in the context, or consider only the RDF fragments returned by SINDICE that explicitly reference schemas mentioned in the context.
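One way to picture a context-driven policy is a filter that keeps only the statements whose property appears in the similarity context. This is a sketch of the filtering idea only, applied to fragments that have already been retrieved; it does not dereference anything, and the function name `filter_by_context` is an illustrative assumption.

```python
def filter_by_context(triples, context_properties):
    """Keep only the triples whose predicate is mentioned in the context."""
    return [t for t in triples if t[1] in context_properties]


# The context on slide 14 mentions a single property, foaf:made.
context_properties = {"foaf:made"}

# Statements about <A> as shown in the slides.
fragment = [
    ("A", "rdfs:label", "A descr"),
    ("A", "dc:license", "http://vb.com"),
    ("A", "foaf:primaryTopic", "xzc"),
    ("A", "foaf:made", "paperC"),
    ("A", "foaf:made", "paperD"),
    ("A", "foaf:made", "paperH"),
]
relevant = filter_by_context(fragment, context_properties)
```

With this context, three of the six statements survive, so any later reasoning works on half the data.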
  • 20. How to move next? RDF for Giovanni in DBLP: <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> ; foaf:made <paperC> ; foaf:made <paperD> ; foaf:made <paperH> . RDF for Giovanni in SWDF: <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> . <paperE> foaf:maker <A> . <paperB> foaf:maker <A> . Derived statements: <A> foaf:made <paperE> . <A> foaf:made <paperB> . This assumes we have dereferenced foaf:maker, or loaded into the reasoner a rule saying (P foaf:maker X) -> (X foaf:made P).
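The inverse-property rule on this slide, (P foaf:maker X) -> (X foaf:made P), can be sketched over plain tuples as follows. foaf:made and foaf:maker are the actual FOAF property names; the helper `apply_inverse` is an illustrative assumption, standing in for what a reasoner would do once the FOAF schema is loaded.

```python
def apply_inverse(triples, prop, inverse_prop):
    """For every (s, prop, o), add the inverse statement (o, inverse_prop, s)."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == prop:
            inferred.add((o, inverse_prop, s))
    return inferred


# DBLP and SWDF statements about <A>, merged as on the slide.
merged = {
    ("A", "foaf:made", "paperC"),
    ("A", "foaf:made", "paperD"),
    ("A", "foaf:made", "paperH"),
    ("paperE", "foaf:maker", "A"),
    ("paperB", "foaf:maker", "A"),
}
inferred = apply_inverse(merged, "foaf:maker", "foaf:made")

# Papers now reachable from <A> through foaf:made.
made_by_a = {o for s, p, o in inferred if s == "A" and p == "foaf:made"}
```

Without the rule a Count over foaf:made sees only the three DBLP papers; with it, the two SWDF papers are counted as well.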
  • 21. Non-consistently identified metadata. What if the same publication is provided by both DBLP and SWDF? E.g., DBLP:paperC and SWDF:paperB are two URIs for the same paper. RDF for Giovanni in DBLP + RDF for Giovanni in SWDF: <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> ; foaf:made <DBLP:paperC> ; foaf:made <DBLP:paperD> ; foaf:made <DBLP:paperH> . <A> rdfs:label "A descr" ; dc:license <http://vb.com> ; foaf:primaryTopic <xzc> . <A> foaf:made <SWDF:paperE> . <A> foaf:made <SWDF:paperB> . We count it twice. Fourth lesson: non-consistently identified metadata is a recursive problem. Consolidating researchers without consolidating papers leads to wrong similarity results. We must be sure that the entities and properties in the similarity context have been properly consolidated before applying instance similarity.
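The double counting can be illustrated with a small sketch that maps each paper URI to a canonical URI before counting. The mapping `paper_aliases` is a hypothetical stand-in for whatever consolidation step (for example owl:sameAs links between papers) is available; the slides do not say how such links would be obtained.

```python
def count_distinct(papers, aliases):
    """Count papers after replacing each known alias with its canonical URI."""
    return len({aliases.get(paper, paper) for paper in papers})


# foaf:made values for <A> after merging DBLP and SWDF, as on the slide.
papers = [
    "DBLP:paperC",
    "DBLP:paperD",
    "DBLP:paperH",
    "SWDF:paperE",
    "SWDF:paperB",
]

# Hypothetical consolidation result: SWDF:paperB is the same paper as DBLP:paperC.
paper_aliases = {"SWDF:paperB": "DBLP:paperC"}

naive_count = count_distinct(papers, {})
consolidated_count = count_distinct(papers, paper_aliases)
```

The naive count is 5 and the consolidated count is 4, and any count-based similarity computed from the naive figure inherits the error.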
  • 22. Conclusion (I) • Linked data best practice and our semantic similarity ▫ have good potential to support data selection for resources in complex domains • But scaling semantic similarity up to the web of data means dealing with ▫ Non-authoritative metadata ▫ Heterogeneous metadata ▫ Non-consistently identified metadata ▫ Efficiency and computational issues
  • 23. Conclusion (II) • The exploratory phase shows ▫ All the mentioned issues arise even in a very simple scenario of semantic similarity assessment ▫ It is pivotal to have first-hand experience with real data to discover the shapes these issues might assume • Consideration ▫ The problems we found are not exclusive to similarity assessment: we suspect these issues arise whenever you try to process information published as linked data in order to mine new facts/info from the published data
  • 24. Do not hesitate to email me (Albertoni@ge.imati.cnr.it) if you have offline questions

Editor's Notes

  1. Enabling factors for establishing the web of data as the preferred selling point for complex resources are: (i) linked data best practice relies on lightweight ontologies encoded in the Resource Description Framework (RDF), which can be exploited to provide ontology-driven metadata. This kind of metadata takes advantage of the Open World Assumption, enabling the adoption of complex, domain-specialized and independently developed metadata vocabularies, which are pivotal to document resources produced in complex and loosely coupled pipelines; (ii) linked data best practice relies on content negotiation over the standard HTTP protocol; it does not propose a brand new platform replacing existing technologies. Rather, it can be placed side by side with domain-specific protocols and standards (e.g., the Open Geospatial Consortium specifications for the geographic domain), making metadata available in human- and machine-consumable formats; (iii) technological advances have produced mature prototypes to expose resources as linked data (e.g., D2R and Pubby), to query them with an appropriate query language (i.e., SPARQL), to retrieve their pertaining RDF fragments published around the web (e.g., Sindice), and to reason over, store and manipulate these fragments once they are retrieved (e.g., JENA API).
  2. However, even supposing linked data were massively adopted to share the metadata of complex resources, selecting the most suitable datasets for complex domains like environmental analysis would still be an enervating task. A huge number of resource features and their complex relations must be considered during the selection process. To assist in this process, semantic similarity algorithms supporting a deep comparison of resource features are pivotal.
  3. Before engaging in this challenging research plan, we have undertaken an exploratory phase analyzing real web data. The goal is to get first-hand experience of the varieties introduced by data providers when publishing metadata. Although publishing metadata according to linked data best practice has huge potential for documenting resources produced in complex pipelines, it is not yet common practice in the specialized domains we have mentioned. For this reason, we have been forced to move to a simpler domain, considering the scientific publications exposed as linked data by Semantic Web Dog Food (SWDF, http://data.semanticweb.org/) and DBLP in RDF (http://dblp.l3s.de/d2r).
  4. Very simple context!
  5. We would expect a similarity assessment that starts from comparing two URIs to take advantage of non-authoritative info, in order to give as realistic an assessment of the entity similarity as possible.