Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets
INSPIRE 2011, slideshow


Published in: Technology, Education


Transcript

  • 1. INSPIRE 2011, Edinburgh, June 29, 2011
      Semantic Similarity Assessment to Browse Resources exposed as Linked Data: an Application to Habitat and Species Datasets
      R. Albertoni, M. De Martino, Institute for Applied Mathematics and Information Technologies, National Research Council (CNR), Italy
  • 2. Outline
      • Linked data: motivation
      • EUNIS Habitat and Species
      • Asymmetric and context-dependent semantic similarity
          • Two contexts
          • Examples of assessments
      • Semantic similarity: query refinement
          • Searching for geographical data sets
      • Conclusion and remarks
  • 3. Linked Data
      Why Linked Data?
      • Data portability across current data silos
      • HTTP-based Open Database Connectivity
      • Platform-independent data and information access: Linked Data Spaces
      • Serendipitous discovery of relevant things via the Web
      Examples of geography-related linked data datasets: EARTH, GEMET, EUNIS SPECIES & SITE, LINKED GEO DATA, GEONAMES …
      The items under “Why Linked Data?” are borrowed from Kingsley Idehen’s presentation “Creating_Deploying_Exploiting_Linked_Data2”.
  • 4. What can we do with linked data?
      Applications already successful:
      • Improving and enriching the results returned by search engines (RDF/RDFa snippets) (Google, Yahoo)
      • Linked-data-driven mash-ups combining data from different sources (LOD Graph, …)
      What else can we do?
      • We want to push ahead with serendipitous discovery, supporting decision making by analyzing Linked Data sources
      • Tools analyzing linked data: context-dependent instance semantic similarity
          • Albertoni R., De Martino M., Asymmetric and context-dependent semantic similarity among ontology instances, Journal on Data Semantics X, Springer Verlag, pp. 1-30 (2008).
  • 5. EUNIS Species-Habitats
  • 6. EUNIS Habitat and Species mapped in SKOS and published as Linked Data
      [Figure labels: skos:prefLabel; URI: http://linkeddata.ge.imati.cnr.it:2020/…/B2.1; skos:description]
  • 7. Species and Habitats are instances of the SKOS schema
      skos:description: “Beach and upper beach formations, mostly of annuals of the low … ….. characteristic are [Cakile edentula], [Polygonum norvegicum] ([Polygonum oxyspermum ssp. raii]), [Atriplex longipes] s.l., [Atriplex glabriuscula], [Mertensia maritima].”
      Species are easily identifiable in the habitat title and description.
      We did not use SILK; we developed an ad hoc interlinking procedure in JENA.
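The slide notes that species names appear in square brackets inside the habitat description. The following is a minimal Python sketch of how such mentions could be pulled out of a description string. It is not the authors' JENA procedure, which the slides do not show; the function name `extract_species_mentions` and the regular expression are assumptions made for illustration.

```python
import re

# Assumption: a species mention is any text enclosed in square brackets,
# e.g. "[Cakile edentula]", as in the description quoted on the slide.
BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def extract_species_mentions(description: str) -> list[str]:
    """Return bracketed names in order of first appearance, without duplicates."""
    seen = set()
    names = []
    for match in BRACKETED.finditer(description):
        # Collapse internal whitespace so line breaks do not create variants.
        name = " ".join(match.group(1).split())
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Fragment of the skos:description shown on the slide.
description = (
    "characteristic are [Cakile edentula], [Polygonum norvegicum] "
    "([Polygonum oxyspermum ssp. raii]), [Atriplex longipes] s.l., "
    "[Atriplex glabriuscula], [Mertensia maritima]."
)
mentions = extract_species_mentions(description)
print(mentions)
```

Each extracted name would still have to be matched against the species dataset before a link between the habitat and the species resource can be asserted; that matching step is not covered here.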
  • 8. Applying semantic similarity on EUNIS Species-Habitats
      Details on the context formalization and the mathematical formulas behind our semantic similarity are available in Albertoni R., De Martino M., Asymmetric and context-dependent semantic similarity among ontology instances, Journal on Data Semantics X, Springer Verlag, pp. 1-30 (2008).
  • 9. Definition of contexts: parameterizations of our instance similarity
      Context 1: “habitat species-based similarity”. Habitats are compared according to the species that they host, or vice versa.
          PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
          [skos:Concept]->{{},{(skos:relatedMatch, Inter)}}
      Context 2: “taxonomy-based similarity”. Habitat or species instances are compared with respect to their position in the taxonomy hierarchy.
          PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
          [skos:Concept]->{{},{(skos:broader, Inter)}}
      Contexts can be as complex as you want, for example:
          1) considering different ontology schemas
          2) providing recursive similarity assessment
  • 10. Context 1: “habitat species-based similarity”. Habitats are compared according to the species that they host, or vice versa.
          PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
          [skos:Concept]->{{},{(skos:relatedMatch, Inter)}}
      Example assessments:
          SIM(B211,X) = SIM(X,B211) = 0
          SIM(B211,X) = 2/4, SIM(X,B211) = 1
          SIM(B211,X) = 1/3, SIM(X,B211) = 1/2
          SIM(B211,X) = SIM(X,B211) = 1
  • 11. Context 2: “taxonomy-based similarity”. Habitat or species instances are compared with respect to their position in the taxonomy hierarchy.
          PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
          [skos:Concept]->{{},{(skos:broader, Inter)}}
      JENA rules make skos:broader a transitive and reflexive relation, so that nodes can be compared according to their ancestors:
          (?x skos:broader ?y) (?y skos:broader ?z) -> (?x skos:broader ?z)
          (?y skos:broader ?z) -> (?y skos:broader ?y)
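To make the effect of the two rules concrete, here is a small Python sketch that computes the same kind of closure over a plain dictionary instead of an RDF graph. It stands in for the JENA rule engine and is not the authors' code. The function name and the toy hierarchy (codes B, B2, B2.1, B2.2) are assumptions for illustration. Note one difference: the second rule on the slide adds the reflexive triple only for concepts that have a broader concept, whereas this sketch adds it for every concept, including the top one.

```python
def reflexive_transitive_closure(broader: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each concept to itself plus every ancestor reachable via 'broader'."""
    # Collect every concept, whether it appears as a child or only as a parent.
    nodes = set(broader)
    for parents in broader.values():
        nodes |= parents

    closure = {}
    for node in nodes:
        reached = {node}  # reflexive step (assumption: applied to all concepts)
        stack = [node]
        while stack:  # transitive step: walk up the hierarchy
            current = stack.pop()
            for parent in broader.get(current, ()):
                if parent not in reached:
                    reached.add(parent)
                    stack.append(parent)
        closure[node] = reached
    return closure


# Toy hierarchy (hypothetical): child -> set of directly broader concepts.
broader = {"B2.1": {"B2"}, "B2.2": {"B2"}, "B2": {"B"}}
closure = reflexive_transitive_closure(broader)

# Two sibling concepts can now be compared through their shared ancestors.
shared = closure["B2.1"] & closure["B2.2"]
print(sorted(shared))
```

With the closure in place, comparing two nodes "according to their ancestors" reduces to comparing the two ancestor sets.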
  • 12. Our semantic similarity was adapted to work with Linked Data (here we have considered fairly “harmonized” linked data sets)
      Semantic similarity design enhancements:
      • Direct access to linked data (no more centralized, ontology-driven repositories):
          • (i) follow-your-nose approach, (ii) RDF dumps, (iii) SPARQL endpoints
      • Increased independence from the ontology schema
          • Contexts can mix different lightweight ontology schemas, as is common practice in Linked Data.
      • A reasoner to add simple RDF entailments
      Quite challenging when we consider sources that are not “harmonized”:
      • non-authoritative resources, heterogeneous schemas, inconsistently identified entities
      • Riccardo Albertoni, Monica De Martino: Semantic Similarity and Selection of Resources Published According to Linked Data Best Practice. OTM Workshops 2010, LNCS vol. 6428/2010
  • 13. Results for the habitats and sub-habitats of Coastal shingle (B2)
      Context A
      • If SIM(X,Y)=1 and SIM(Y,X)=1, then Y contains the same species as X.
      • If SIM(X,Y)=1 and SIM(Y,X)<1, then Y contains the species of X, but not vice versa.
      • SIM(X,Y) is proportional to the percentage of the species of X that are also contained in Y.
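The reading rules above can be illustrated with a simple set-based sketch in Python. It assumes that SIM(X,Y) is exactly the fraction of X's species also found in Y. The measure defined in the cited Journal on Data Semantics paper is more general, so this reproduces only the interpretation given on the slide. The names `sim` and `describe` and the placeholder species identifiers are hypothetical.

```python
def sim(species_of_x: set[str], species_of_y: set[str]) -> float:
    """Fraction of X's species that are also recorded for Y (asymmetric)."""
    if not species_of_x:
        return 0.0  # assumption: an empty species set is similar to nothing
    return len(species_of_x & species_of_y) / len(species_of_x)


def describe(species_of_x: set[str], species_of_y: set[str]) -> str:
    """Apply the reading rules from the slide to a pair of habitats."""
    xy = sim(species_of_x, species_of_y)
    yx = sim(species_of_y, species_of_x)
    if xy == 1 and yx == 1:
        return "Y contains the same species as X"
    if xy == 1:
        return "Y contains all species of X, but not vice versa"
    if yx == 1:
        return "X contains all species of Y, but not vice versa"
    if xy == 0:
        return "no shared species"
    return "partial overlap"


# Placeholder data (not taken from EUNIS): X hosts four species, Y two of them.
habitat_x = {"s1", "s2", "s3", "s4"}
habitat_y = {"s1", "s2"}

print(sim(habitat_x, habitat_y))  # 2 of X's 4 species are in Y
print(sim(habitat_y, habitat_x))  # both of Y's species are in X
print(describe(habitat_y, habitat_x))
```

The asymmetry is what carries the containment information: a single symmetric score could not tell which of the two habitats hosts the larger species set.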
  • 14. Comparing species according to the habitats they can be found in
  • 15. How to use it
      Example: searching for data
      • You might want to use similarity to refine your keyword query
      • Habitats and species can be deployed as a thesaurus/controlled vocabulary
      Advantages of our approach with respect to other similarities:
      • Different contexts: even more personalized suggestions
      • Asymmetry/containment highlighting: even more information when browsing the refinement alternatives
  • 16. Conclusion
      • After publishing your data, let’s start to consume Linked Data, and not only for mash-ups!
          • Assuming the data is properly interlinked, we can consume data from different distributed sources and mix lightweight ontology schemas.
          • The more datasets are interlinked, the more potential contexts and similarity applications there are.
      • Here we presented some very simple examples
          • We can define more complex contexts considering instances’ relations and properties
      • Our semantic similarity is a working prototype written in JAVA/JENA
      • Future work
          • Further use cases (Do you fancy trying our semantic similarity on your data? Let’s talk about it)
          • Development of a front end to define user-driven contexts
          • Further reengineering of the prototype to scale up to even more complex use cases