LusTRE: a Linked Thesaurus fRamework for Environment
Riccardo Albertoni (1), Monica De Martino (1), Paola Podestà (1), Paolo Plini (2)

(1) CNR-IMATI, Via De Marini, 6, Torre di Francia, 16149 Genova, Italia
{name.surname}@ge.imati.cnr.it
(2) CNR-IIA-EKOLab - Via Salaria Km 29,300 C.P. 10, I-00016 Monterotondo stazione RM, Italia, plini@iia.cnr.it

W3C-IT LOD 2014, Rome, Italy, 20-21 Feb 2014
INSPIRE vs thesauri
INSPIRE implementation rules
• recommend the adoption of (multilingual) thesauri when compiling metadata for data/services
However
Different thesauri have been developed, and may be deployed for cataloguing geographical data and services, e.g.,
• AGROVOC
• EARTh
• GEMET
• ThIST
• …
Thesauri are heterogeneous with respect to thematic coverage, multilinguality, granularity, and popularity in certain communities
Heterogeneity is precious!
LusTRE: Enabling Thesauri Joint Exploitation

Common Thesaurus Framework (TF)

Design Principle (NatureSDIPlus 2009-2011)
• Modularity: to add a new KOS as a new module easily plugged into the set of thesauri in the TF
• Openness: to easily extend each KOS in the TF while keeping the original one separate
• Interlinking: linking among the terms referring to the same concepts in more than one thesaurus, in order to harmonize their usage
• Exploitability: to encode in a standard and flexible format, in order to encourage adoption and enrichment by third-party systems

Simple Knowledge Organization System (SKOS) to encode the thesaurus content
Linked Data best practices to publish the thesaurus in a machine-understandable format

De Martino M. and Albertoni R., A multilingual/multicultural semantic-based approach to improve Data Sharing in a SDI for Nature Conservation, IJSDIR, vol. 6, ISSN 1725-0463, pp. 206-233, 2011

TF Extension (eENVplus 2013-2015)
• Publication of further thesauri not yet exposed as Linked Data
• Interlinking with well-known environment-related LOD thesauri
• Services to access LusTRE and to cross-walk from one thesaurus to another
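The cross-walking idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LusTRE service: it assumes the interlinks are expressed with skos:exactMatch (which SKOS defines as symmetric and transitive), and every URI under example.org as well as the function name cross_walk is hypothetical.

```python
from collections import deque

SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Hypothetical interlinks between three thesauri (placeholder URIs, not real data).
TRIPLES = [
    ("http://example.org/thesaurusA/soil", SKOS_EXACT_MATCH, "http://example.org/thesaurusB/c_100"),
    ("http://example.org/thesaurusB/c_100", SKOS_EXACT_MATCH, "http://example.org/thesaurusC/soil"),
    ("http://example.org/thesaurusA/water", SKOS_EXACT_MATCH, "http://example.org/thesaurusB/c_200"),
]


def cross_walk(concept, triples):
    """Return every concept reachable from `concept` through skos:exactMatch,
    following the links in both directions (symmetric) and across hops (transitive)."""
    neighbours = {}
    for subject, predicate, obj in triples:
        if predicate == SKOS_EXACT_MATCH:
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)

    # Breadth-first search over the undirected exactMatch graph.
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for candidate in neighbours.get(current, ()):
            if candidate not in seen:
                seen.add(candidate)
                queue.append(candidate)

    seen.discard(concept)  # the starting concept is not its own equivalent
    return seen
```

Starting from a concept in one thesaurus, the function returns the equivalent concepts in the others, which is the step a search client would use to expand a query term annotated with a different thesaurus.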
Candidate List of Terminological resources to be considered in LusTRE
Survey on Environmental Thesauri: terminologies in the LOD to be interlinked and terminologies to be included in the TF
• AGROVOC
• ShowTerm
• Eurovoc
• EOSterm
• SoilThes
• UMTHES
• NERC
• ThIST
• ThesSoz
• Geological Survey of Austria (GBA) thesaurus
• INSPIRE IFCD and INSPIRE Glossary
• EnvThes
• EEA-EIONET Data Dictionaries
• EEA-EIONET AQ pollutants
• OneGeology, IUGS–CGI vocabularies

M. De Martino (CNR-IMATI), R. Albertoni (CNR-IMATI), P. Podestà (CNR-IMATI), C. Cipolloni (ISPRA), P. Plini (CNR-IIA), D 4.1 – Survey on environmental thesauri, eENVplus, December 2013
Server re-engineering
We moved from D2R + MySQL to Virtuoso + Pubby to have
+ More flexibility (Named Graphs)
+ Better performance
+ Materialization of SKOS entailments
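
As a rough illustration of what materializing SKOS entailments means, the sketch below adds inferred triples to a toy graph ahead of query time. It covers only three rules from the SKOS Reference (skos:broader and skos:narrower are inverses; skos:broader is a sub-property of skos:broaderTransitive; skos:broaderTransitive is transitive). It is an assumption-laden toy in plain Python, not the Virtuoso configuration used for LusTRE, and the ex: identifiers are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BROADER = SKOS + "broader"
NARROWER = SKOS + "narrower"
BROADER_TRANSITIVE = SKOS + "broaderTransitive"


def materialize(triples):
    """Return the input triples plus the triples entailed by three SKOS rules."""
    graph = set(triples)

    # Rule 1: skos:broader and skos:narrower are inverse properties.
    for subject, predicate, obj in list(graph):
        if predicate == BROADER:
            graph.add((obj, NARROWER, subject))
        elif predicate == NARROWER:
            graph.add((obj, BROADER, subject))

    # Rule 2: every skos:broader statement is also a skos:broaderTransitive one.
    for subject, predicate, obj in list(graph):
        if predicate == BROADER:
            graph.add((subject, BROADER_TRANSITIVE, obj))

    # Rule 3: close skos:broaderTransitive under transitivity (naive fixpoint).
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in graph if p == BROADER_TRANSITIVE]
        for s1, o1 in pairs:
            for s2, o2 in pairs:
                if o1 == s2 and (s1, BROADER_TRANSITIVE, o2) not in graph:
                    graph.add((s1, BROADER_TRANSITIVE, o2))
                    changed = True
    return graph


# Hypothetical toy thesaurus: loam < soil < land.
toy = [
    ("ex:loam", BROADER, "ex:soil"),
    ("ex:land", NARROWER, "ex:soil"),
]
inferred = materialize(toy)
```

Note that skos:broader itself stays non-transitive: the indirect ancestor is reachable only through skos:broaderTransitive, so a client can still tell direct from inferred hierarchy links.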

EARTh Interlinking to
• GEMET
• AGROVOC
• UMTHES
• DBpedia
• Eurovoc

EARTh included into the Linked Open Data Cloud:
http://datahub.io/it/dataset/environmental-applications-reference-thesaurus
Albertoni R., De Martino M., Di Franco S., De Santis V., Plini P.: EARTh: An Environmental Application Reference Thesaurus in the Linked Open Data cloud. Semantic Web, Vol. 5, No. 2, DOI: 10.3233/SW-130122, 2014
Future actions
Further thesauri and linksets (end of the year)
Evaluation of provided content
• Quality of thesauri included in LusTRE, e.g., deploying qSKOS
  Suominen, O., Mader, C.: Assessing and Improving the Quality of SKOS Vocabularies. J. Data Semant. (2013).
• Quality and usefulness of linksets
• We are developing in-house quality measures for linksets, extending
  Albertoni R., Gómez-Pérez A.: Assessing linkset quality for complementing third-party datasets. LDWM 2013: 52-59, 2013
Conclusions
… Waiting for the next LusTRE release … We invite you:
• To take a look at the Thesaurus Framework at: http://linkeddata.ge.imati.cnr.it:2020
• To check if a term is contained in the TF at: http://linkeddata.ge.imati.cnr.it:8890/fct/
• To access the SPARQL endpoint at: http://linkeddata.ge.imati.cnr.it:8890/sparql
  • To query thesaurus concepts and thesaurus relationships
  • To query mappings between EARTh and GEMET, AGROVOC, UMTHES…
  • To build your own services/applications on the TF
• To interlink your vocabularies/thesauri with LusTRE's thesauri
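
A query against the SPARQL endpoint listed above might be prepared as in the following sketch. The endpoint URL is the one given on the slide; everything else is an assumption: the slides do not say which property the linksets use, so skos:exactMatch is a guess, and the function names are hypothetical. The request follows the standard SPARQL Protocol (a `query` parameter on a GET request, with the result format negotiated through the Accept header).

```python
import json
import urllib.parse
import urllib.request

# Endpoint as given on the slide; it may no longer be reachable.
ENDPOINT = "http://linkeddata.ge.imati.cnr.it:8890/sparql"


def mapping_query(limit=10):
    """Build a SPARQL query listing concept mappings.
    Assumption: mappings are published as skos:exactMatch links."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?match WHERE {\n"
        "  ?concept skos:exactMatch ?match .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def build_request(query, endpoint=ENDPOINT):
    """Prepare (but do not send) a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return (concept, match) URI pairs from the JSON results."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=timeout) as response:
        results = json.load(response)
    return [
        (binding["concept"]["value"], binding["match"]["value"])
        for binding in results["results"]["bindings"]
    ]
```

Calling `run(mapping_query(5))` would perform the network request; keeping request construction separate from execution makes the query easy to inspect or to point at a different endpoint.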
For info and support contact
albertoni@ge.imati.cnr.it, demartino@ge.imati.cnr.it
Further in-house technology
SSONDE,
• an Open Source framework,
• providing instance similarity
• enabling a detailed comparison and ranking of resources through the comparison of their RDF ontology-driven metadata
• code available at https://code.google.com/p/ssonde/
Albertoni R., De Martino M.: SSONDE: Semantic Similarity on LiNked Data Entities. MTSR 2012: 25-36
Albertoni R., De Martino M.: Asymmetric and Context-Dependent Semantic Similarity among Ontology Instances. J. Data Semantics 10: 1-30 (2008)


Editor's Notes

  • #2 In this short presentation I will introduce LusTRE, a thesaurus framework for the environment that we are developing within the European project eENVplus.
  • #3 As is well known, in order to facilitate the cataloguing and reuse of geographical data and services, the INSPIRE implementation rules prescribe the use of thesauri when compiling INSPIRE metadata. However, there are different thesauri, developed over the years, with rather heterogeneous characteristics in terms of thematic coverage, multilinguality, and popularity within the specific communities involved in managing geographical data and services. Depending on who compiles the metadata and on the themes involved, the choice may legitimately fall on one thesaurus rather than another. At the European level, however, we end up with spatial data infrastructures whose resources are documented with different thesauri. It is therefore important to facilitate the joint use of thesauri in order to (i) take advantage of the different specificities of the thesauri and (ii) facilitate the search for resources that may have been annotated with different thesauri.
  • #4 LusTRE was created within the European project eENVplus to facilitate the joint use of these thesauri. We are building a framework that pools different thesauri. In particular, we start from the results obtained in a previous project of ours, aimed at cataloguing geographical data/information/services and enabling sharing among the different communities working in environment-related domains.
  • #5 EARTh serves as a general-purpose thesaurus: it is an extension of GEMET and includes more than 14000 concepts. To this we added terminological resources covering more specifically the INSPIRE themes on which the NatureSDIplus project focused: habitats and biotopes, species distribution, biogeographical regions, protected sites.