SlideShare a Scribd company logo
1 of 11
Download to read offline
(Semantic) Interoperability 
in the 
CLARIN Infrastructure 
Menzo Windhouwer 
The Language Archive - DANS 
menzo.windhouwer@dans.knaw.nl 
Succeed Interoperability Workshop 
2nd October 2014 
KB, The Hague, Netherlands
CLARIN 
 CLARIN = Common Language Resources and Technology 
Infrastructure = an european ESFRI infrastructure project 
 Aims at providing easy and sustainable access for scholars 
in the humanities and social sciences to digital language 
data (in written, spoken, video or multimodal form) and 
advanced tools to discover, explore, exploit, annotate, 
analyze or combine them, independent of where they are 
located. 
http://www.clarin.eu/
CLARIN & Interoperability 
 CLARIN Centre Committee 
 Sets technical standards for the infrastructure to operate 
 CLARIN Standards Committee 
 Standards/Recommendations for linguistic resources 
 http://clarin.ids-mannheim.de/standards/ 
 Various (national) coordinators 
 Recommendations and best practices for specific topics 
 CMDI and ISOcat 
 My own focus: achieve a level of semantic interoperability
CLARIN and Semantics 
Metadata Concept Registry 
Resources (aka data) 
 (meta)data schema building blocks have semantics 
 Make semantics explicit and share them, i.e., identify common (emerging) concepts 
 The concept registries used by CLARIN are lightweight 
 CLARIN doesn’t try to construct a ‘global’ domain ontology 
 Relationships are on the CLARIN scale suitable for resource discovery, not for formal reasoning 
 Technology 
 Resources are taken as they are, i.e., no CLARIN scale conversion to RDF 
 Metadata is XML-based, but some RDF experiments are underway
Component Metadata Infrastructure 
OAI-PMH 
Data provider 
OAI-PMH 
Service provider 
Local 
metadata 
repository 
Joint 
metadata 
repository 
metadata 
modeler 
metadata 
user 
metadata 
creator 
component 
registry & 
editor 
metadata 
editor 
metadata 
curator 
metadata 
curator 
metadata 
catalogue 
Relation 
Registry 
search & 
semantic 
mapping 
DATA 
ISOcat
CMDI 
Concept 
Registry 
Name 
Age 
Actor Sex (male, female) 
Technical 
Metadata 
Metadata Profile 
Language 
Sample frequency 
Format 
Size 
Language 
Name 
Id (aaa … zzj) 
Location 
Continent 
Country 
Address 
Project 
Name 
Contact
ISOcat Concept Registry 
http://www.isocat.org/
CMDI: Component Editor 
http://catalog.clarin.eu/ds/ComponentRegistry/#
CMDI: Semantic Mapping 
http://clarin.aac.ac.at/smc-browser
Virtual Language Observatory 
http://catalog.clarin.eu/vlo/
Good vs Bad & the future 
+ The CMD Infrastructure is very flexible with regard to metadata 
structures, but also provides an integrated light weight semantic layer to 
achieve semantic interoperability and overcome the structural 
differences  
 But sometimes more context needs to be taken into account 
- Do deal with emerging semantics and specific needs many registries 
are open, which leads to proliferation  
- The ISOcat registry operated together with ISO TC 37 is too complex  
 Move to a simpler model based on SKOS 
 Move to a simpler workflow based on CLARIN recommendations 
 The organisation problem is harder than the technical one! 
 CLARIN-NL experimented with the same kind of semantic 
interoperability for linguistic resources 
 Might be exploited by a higher level of the CLARIN Federated Content 
Search (currently supports full text search) 
 How to make semantic annotation a part of a researcher’s workflow? 
 Embed interaction with the registries in their tools

More Related Content

Similar to 8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer. larin

Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Andrea Scharnhorst
 
D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...
FAO
 
D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...
FAO
 

Similar to 8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer. larin (20)

Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.
 
Digital Libraries
Digital LibrariesDigital Libraries
Digital Libraries
 
Digital Libraries
Digital LibrariesDigital Libraries
Digital Libraries
 
NISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector AdministrationNISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector Administration
 
How to Find a Needle in the Haystack
How to Find a Needle in the HaystackHow to Find a Needle in the Haystack
How to Find a Needle in the Haystack
 
Choosing the right software for your research study : an overview of leading ...
Choosing the right software for your research study : an overview of leading ...Choosing the right software for your research study : an overview of leading ...
Choosing the right software for your research study : an overview of leading ...
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001
 
Poster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History CollectionsPoster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History Collections
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
 
D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...
 
D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
 
Presentation of Mediamap @Ebu Production Technology Seminar
Presentation of Mediamap @Ebu Production Technology SeminarPresentation of Mediamap @Ebu Production Technology Seminar
Presentation of Mediamap @Ebu Production Technology Seminar
 
Intro to Digitization Projects
Intro to Digitization ProjectsIntro to Digitization Projects
Intro to Digitization Projects
 
Amersfoort 2016 koch_wg_v02
Amersfoort 2016 koch_wg_v02Amersfoort 2016 koch_wg_v02
Amersfoort 2016 koch_wg_v02
 
Corrib.org - OpenSource and Research
Corrib.org - OpenSource and ResearchCorrib.org - OpenSource and Research
Corrib.org - OpenSource and Research
 
A Distributed Audio Personalization Framework over Android
A Distributed Audio Personalization Framework over AndroidA Distributed Audio Personalization Framework over Android
A Distributed Audio Personalization Framework over Android
 
It's all semantics! -The premises and promises of the semantic web
It's all semantics! -The premises and promises of the semantic webIt's all semantics! -The premises and promises of the semantic web
It's all semantics! -The premises and promises of the semantic web
 
CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things
 

More from IMPACT Centre of Competence

More from IMPACT Centre of Competence (20)

Session6 01.helmut schmid
Session6 01.helmut schmidSession6 01.helmut schmid
Session6 01.helmut schmid
 
Session1 03.hsian-an wang
Session1 03.hsian-an wangSession1 03.hsian-an wang
Session1 03.hsian-an wang
 
Session7 03.katrien depuydt
Session7 03.katrien depuydtSession7 03.katrien depuydt
Session7 03.katrien depuydt
 
Session7 02.peter kiraly
Session7 02.peter kiralySession7 02.peter kiraly
Session7 02.peter kiraly
 
Session6 04.giuseppe celano
Session6 04.giuseppe celanoSession6 04.giuseppe celano
Session6 04.giuseppe celano
 
Session6 03.sandra young
Session6 03.sandra youngSession6 03.sandra young
Session6 03.sandra young
 
Session6 02.jeremi ochab
Session6 02.jeremi ochabSession6 02.jeremi ochab
Session6 02.jeremi ochab
 
Session5 04.evangelos varthis
Session5 04.evangelos varthisSession5 04.evangelos varthis
Session5 04.evangelos varthis
 
Session5 03.george rehm
Session5 03.george rehmSession5 03.george rehm
Session5 03.george rehm
 
Session5 02.tom derrick
Session5 02.tom derrickSession5 02.tom derrick
Session5 02.tom derrick
 
Session5 01.rutger vankoert
Session5 01.rutger vankoertSession5 01.rutger vankoert
Session5 01.rutger vankoert
 
Session4 04.senka drobac
Session4 04.senka drobacSession4 04.senka drobac
Session4 04.senka drobac
 
Session3 04.arnau baro
Session3 04.arnau baroSession3 04.arnau baro
Session3 04.arnau baro
 
Session3 03.christian clausner
Session3 03.christian clausnerSession3 03.christian clausner
Session3 03.christian clausner
 
Session3 02.kimmo ketunnen
Session3 02.kimmo ketunnenSession3 02.kimmo ketunnen
Session3 02.kimmo ketunnen
 
Session3 01.clemens neudecker
Session3 01.clemens neudeckerSession3 01.clemens neudecker
Session3 01.clemens neudecker
 
Session2 04.ashkan ashkpour
Session2 04.ashkan ashkpourSession2 04.ashkan ashkpour
Session2 04.ashkan ashkpour
 
Session2 03.juri opitz
Session2 03.juri opitzSession2 03.juri opitz
Session2 03.juri opitz
 
Session2 02.christian reul
Session2 02.christian reulSession2 02.christian reul
Session2 02.christian reul
 
Session2 01.emad mohamed
Session2 01.emad mohamedSession2 01.emad mohamed
Session2 01.emad mohamed
 

Recently uploaded

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Recently uploaded (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 

8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer. larin

  • 1. (Semantic) Interoperability in the CLARIN Infrastructure Menzo Windhouwer The Language Archive - DANS menzo.windhouwer@dans.knaw.nl Succeed Interoperability Workshop 2nd October 2014 KB, The Hague, Netherlands
  • 2. CLARIN  CLARIN = Common Language Resources and Technology Infrastructure = an european ESFRI infrastructure project  Aims at providing easy and sustainable access for scholars in the humanities and social sciences to digital language data (in written, spoken, video or multimodal form) and advanced tools to discover, explore, exploit, annotate, analyze or combine them, independent of where they are located. http://www.clarin.eu/
  • 3. CLARIN & Interoperability  CLARIN Centre Committee  Sets technical standards for the infrastructure to operate  CLARIN Standards Committee  Standards/Recommendations for linguistic resources  http://clarin.ids-mannheim.de/standards/  Various (national) coordinators  Recommendations and best practices for specific topics  CMDI and ISOcat  My own focus: achieve a level of semantic interoperability
  • 4. CLARIN and Semantics Metadata Concept Registry Resources (aka data)  (meta)data schema building blocks have semantics  Make semantics explicit and share them, i.e., identify common (emerging) concepts  The concept registries used by CLARIN are lightweight  CLARIN doesn’t try to construct a ‘global’ domain ontology  Relationships are on the CLARIN scale suitable for resource discovery, not for formal reasoning  Technology  Resources are taken as they are, i.e., no CLARIN scale conversion to RDF  Metadata is XML-based, but some RDF experiments are underway
  • 5. Component Metadata Infrastructure OAI-PMH Data provider OAI-PMH Service provider Local metadata repository Joint metadata repository metadata modeler metadata user metadata creator component registry & editor metadata editor metadata curator metadata curator metadata catalogue Relation Registry search & semantic mapping DATA ISOcat
  • 6. CMDI Concept Registry Name Age Actor Sex (male, female) Technical Metadata Metadata Profile Language Sample frequency Format Size Language Name Id (aaa … zzj) Location Continent Country Address Project Name Contact
  • 7. ISOcat Concept Registry http://www.isocat.org/
  • 8. CMDI: Component Editor http://catalog.clarin.eu/ds/ComponentRegistry/#
  • 9. CMDI: Semantic Mapping http://clarin.aac.ac.at/smc-browser
  • 10. Virtual Language Observatory http://catalog.clarin.eu/vlo/
  • 11. Good vs Bad & the future + The CMD Infrastructure is very flexible with regard to metadata structures, but also provides an integrated light weight semantic layer to achieve semantic interoperability and overcome the structural differences   But sometimes more context needs to be taken into account - Do deal with emerging semantics and specific needs many registries are open, which leads to proliferation  - The ISOcat registry operated together with ISO TC 37 is too complex   Move to a simpler model based on SKOS  Move to a simpler workflow based on CLARIN recommendations  The organisation problem is harder than the technical one!  CLARIN-NL experimented with the same kind of semantic interoperability for linguistic resources  Might be exploited by a higher level of the CLARIN Federated Content Search (currently supports full text search)  How to make semantic annotation a part of a researcher’s workflow?  Embed interaction with the registries in their tools