SlideShare a Scribd company logo
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and
DANS Research Data Repositories
Slava Tykhonov; Jerry de Vries; Eko Indarto; Andrea Scharnhorst; Femmy Admiraal; Mike
Priddy (DANS-KNAW)
Presentation at ISKO Knowledge Organisation Research Observatory
24 Nov 2021
RESEARCH REPOSITORIES AND DATAVERSE: NEGOTIATING METADATA, VOCABULARIES
AND DOMAIN NEEDS
Content
- Introduction
- Who we are
- CMDI in the ‘wild’ - CLARIN data collections in EASY
- Creating FAIR metadata and semantic services - case of CMDI Pipeline
- Extract-Transform-Load (CMDI) metadata into Dataverse
- Workflow for linking external concepts to (CMDI) metadata values to make metadata FAIR
- Lessons learned and pointers
Introduction
DANS - Royal Netherlands Academy of Arts and
Sciences - research data expertise center - long-term
preservation - archive (www.easy.dans.knaw;
Dataverse.nl; Narcis.nl)
CLARIAH.nl Large Scale Infrastructure Project for
Humanities (CLARIN+DARIAH)
DARIAH: Digital Research Infrastructure for the Arts and
Humanities
CLARIN: Common Language Resources and
Technology Infrastructure
CMDI: Component MetaData Infrastructure: a framework
to describe and reuse metadata blueprints
SKOSMOS: Open source web-based SKOS browser
and publishing tool
Challenge
How to make the datasets from the CLARIN
community in the long-term archive discoverable in
the CLARIN infrastructure?
How to search ‘in data’ ? = How to achieve richer
indexing of metadata?
The problem - on a generic level
Vision: Semantic interoperability on the infrastructure level
We envision a situation where thousands of Dataverse instances (due to EOSC) on the
web can be simultaneously search for data.
The old dream of Federated search/Universal catalogue can only be realised if:
(1) Cross -walks; mapping across different metadata schemes are implemented
(2) In metadata schemes we seek for ways to enrich indexes with values from controlled
vocabularies
Standard response = standardisation and harmonisation = repository software, certain
metadata standards, or certain controlled vocabularies
New response = explore agile solutions (Proof of Concept) which can be implemented by
different communities (even smaller ones), so we keep variety and still enable integration.
The problem - on a concrete level
CMDI ‘in the wild’
Content
- Introduction
- Who we are
- CMDI in the ‘wild’ - CLARIN data collections in EASY
- Creating FAIR metadata and semantic services - case of CMDI Pipeline
- Extract-Transform-Load (CMDI) metadata into Dataverse
- Workflow for linking external concepts to (CMDI) metadata values to make metadata FAIR
- Lessons learned and pointers
Conceptual approach: Semantic interoperability on the
infrastructure level - building common solutions for everyone
Dataverse Semantic API in release 5.6: https://github.com/IQSS/dataverse/releases/tag/v5.6
“Dataset metadata can be retrieved, set, and updated using a new, flatter JSON-LD format -
following the format of an OAI-ORE export (RDA-conformant Bags), allowing for easier transfer of
metadata to/from other systems (i.e. without needing to know Dataverse's metadata block and field
storage architecture). This new API also allows for the update of terms metadata“.
External controlled vocabularies support is being developed by DANS in SSHOC project and
already integrated in Dataverse core in release 5.7.
Proposal: https://docs.google.com/document/d/1txdcFuxskRx_tLsDQ7KKLFTMR_r9IBhorDu3V_r445w/
Interfaces: http://github.com/gdcc/dataverse-external-vocab-support
Integrations: Wikidata, ORCID, MeSH, Skosmos vocabularies
CMDI Pipeline
- Backbone of our pipeline: Extract-Transform-Load (CMDI) metadata into Dataverse
- One block relevant for semantic services: Mapping across metadata standards
- Another block: Look-up for values in controlled vocabulary registers - enrich indexing
SEMAF: A Proposal for a Flexible Semantic Mapping Framework
Proposal: https://zenodo.org/record/4651421#.YT9lyC8RpZI
POC: https://github.com/Dans-labs/semaf-poc
Coming close to the implementation
1. Use Data Catalog Vocabulary (DCAT) mappings for CMDI metadata fields
2. Simple Knowledge Organization System (SKOS) to model a thesauri-like
resources with simple skos:broader, skos:narrower and skos:related
properties
3. Load CMDI properties and attributes and build a Knowledge Graph out of all
elements
4. Enrich the Knowledge Graph with concept URIs from various controlled
vocabularies like Skosmos hosted or Wikidata
5. Use different format data-serialization formats suitable for the integration with
different systems. For example, json-ld suitable for Dataverse, turtle for Jena
Fuseki, RDF for LoD frameworks
Complexity of CMDI is unfolding
After the implementation
● Complexity in CMDI becomes more
visible
● Identify core concepts which can be
mapped to standard bibliographic
schemes as DCAT (red box)
● Possibility to match values of CMDI
concepts to other controlled
vocabularies (green box)
How does it look when implemented in Dataverse?
Every field can be linked to the appropriate controlled vocabularies in FAIR way!
Greater vision:
Dataverse metadata schemas ingested into a Knowledge Graph
Compound keyword field with SKOS
We use SKOS relationships to keep the
hierarchy and relationships between
metadata fields
Other Dataverse schemas: https://github.com/Dans-labs/semaf-client/tree/cmdi/schema
Once in a Knowledge graph: what can we do?
Pipeline managed to establish some relationships
to Wikidata concepts and automatically updated the
dataset with new conceptURIs!
The example of automatic enrichment with Wikidata
Content
- Introduction
- Who we are
- CMDI in the ‘wild’ - CLARIN data collections in EASY
- Creating FAIR metadata and semantic services - case of CMDI Pipeline
- Extract-Transform-Load (CMDI) metadata into Dataverse
- Workflow for linking external concepts to (CMDI) metadata values to make metadata FAIR
- Lessons learned and pointers
Lessons learned (I)
Scientific communities and archives have different perspectives on standardisation, and semantic services.
In research formalisation (including KOS, ontologies, any ‘model’) is a heuristic device, agile to new research
questions, and so intrinsically ‘not interoperable’. In other words, there is a difference between research needs
and information needs.
Lessons learned (II)
- We provided a solution for our CMDI problem - by creating a CLARIN
compatible Dataverse solution, which via an API can be harvested by the
CLARIN search service; we also created another perspective on the CDMI
‘challenge’
- We used Dataverse is a platform due to an open active community;
- The examples we showed you some of are results of a ‘Vision Lab’ - proof of
concepts - funded in projects as SSHOC, CLARIAH, EOSC
- The results are envisioned be implemented locally.
- But, in principle the solutions are platform agnostic.
Lessons learned (III)
- In the future, repositories might become nodes in a large searchable
knowledge graph and semantic links might enable pathways for
contextual/semantic search.
- Part of this future will be automatically supported semantic enrichment at the
local instantiations (automatic indexing in a net instead of in an index)
- Problem: keep provenance, authority (trust) - governance between those
(micro-service) providers need to be organised. What can we learn from
history?
References - pointers
CMDI exploration tool DANS CMDI converter github
CMDI properties frequency VLO top profiles
CMDI core metadata proposal Core metadata components design for use cases
DANS CMDI metadata generator CMDI metadata model published as TSV files
Convertor can extract and show the hierarchy of all fields
CLARIAH compliant Dataverse Docker module Dataverse Docker with CMDI metadata schema
Core metadata components design guidelines Guidelines link
Semantic Gateway as plugin app Dataverse gateway Semantic Gateway API
Dataverse metadata schema ingested into
Graph
https://github.com/Dans-labs/semaf-client/tree/cmdi/sche
ma
References - pointers
Dataverse 5.7 https://github.com/IQSS/dataverse/releases/tag/v5.7
Semantic Gateway: https://github.com/Dans-labs/semantic-gateway
SSHOC task 5.2 http://github.com/SSHOC
SEMAF client https://github.com/Dans-labs/semaf-client
CMDI data model and namespaces: M. Windhouwer, E. Indarto, D. Broeder. CMD2RDF: Building a Bridge from
CLARIN to Linked Open Data
Flexible Metadata Schemes for Research Data Repositories. / de Vries, Jerry; Tykhonov, Vyacheslav;
Scharnhorst, Andrea; Admiraal, Femmy; Indarto, Eko; Priddy, Mike.
2021. Abstract from CLARIN Annual Conference 2021.
https://www.clarin.eu/content/programme-clarin-annual-conference-2021
Questions?
Slava Tykhonov <vyacheslav.tykhonov@dans.knaw.nl>
Andrea Scharnhorst <andrea.scharnhorst@dans.knaw.nl>

More Related Content

What's hot

5 years of Dataverse evolution
5 years of Dataverse evolution 5 years of Dataverse evolution
5 years of Dataverse evolution
vty
 
Building COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhyBuilding COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhy
vty
 
Setting up Dataverse repository for research data
Setting up Dataverse repository for research dataSetting up Dataverse repository for research data
Setting up Dataverse repository for research data
vty
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
vty
 
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
vty
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes
vty
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Project
vty
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
vty
 
DataverseEU as multilingual repository
DataverseEU as multilingual repositoryDataverseEU as multilingual repository
DataverseEU as multilingual repository
vty
 
Library Web Services for Discovery and Delivery of Scientific Information
Library Web Services for Discovery and Delivery of Scientific InformationLibrary Web Services for Discovery and Delivery of Scientific Information
Library Web Services for Discovery and Delivery of Scientific Information
Richard Akerman
 
D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...
FAO
 
WhatIsData-Blitz
WhatIsData-BlitzWhatIsData-Blitz
WhatIsData-Blitz
pharvener
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop Environment
IRJET Journal
 
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
4Science
 
BrainSpa Paper
BrainSpa PaperBrainSpa Paper
BrainSpa Paper
adina toderas
 
Transient and persistent RDF views over relational databases in the context o...
Transient and persistent RDF views over relational databases in the context o...Transient and persistent RDF views over relational databases in the context o...
Transient and persistent RDF views over relational databases in the context o...
Nikolaos Konstantinou
 
Leveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformationLeveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformation
John Archer
 
Fast Synchronization In IVR Using REST API For HTML5 And AJAX
Fast Synchronization In IVR Using REST API For HTML5 And AJAXFast Synchronization In IVR Using REST API For HTML5 And AJAX
Fast Synchronization In IVR Using REST API For HTML5 And AJAX
IJERA Editor
 
LCI2009-Tutorial
LCI2009-TutorialLCI2009-Tutorial
LCI2009-Tutorial
tutorialsruby
 

What's hot (19)

5 years of Dataverse evolution
5 years of Dataverse evolution 5 years of Dataverse evolution
5 years of Dataverse evolution
 
Building COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhyBuilding COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhy
 
Setting up Dataverse repository for research data
Setting up Dataverse repository for research dataSetting up Dataverse repository for research data
Setting up Dataverse repository for research data
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
 
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Project
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
 
DataverseEU as multilingual repository
DataverseEU as multilingual repositoryDataverseEU as multilingual repository
DataverseEU as multilingual repository
 
Library Web Services for Discovery and Delivery of Scientific Information
Library Web Services for Discovery and Delivery of Scientific InformationLibrary Web Services for Discovery and Delivery of Scientific Information
Library Web Services for Discovery and Delivery of Scientific Information
 
D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...
 
WhatIsData-Blitz
WhatIsData-BlitzWhatIsData-Blitz
WhatIsData-Blitz
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop Environment
 
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
 
BrainSpa Paper
BrainSpa PaperBrainSpa Paper
BrainSpa Paper
 
Transient and persistent RDF views over relational databases in the context o...
Transient and persistent RDF views over relational databases in the context o...Transient and persistent RDF views over relational databases in the context o...
Transient and persistent RDF views over relational databases in the context o...
 
Leveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformationLeveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformation
 
Fast Synchronization In IVR Using REST API For HTML5 And AJAX
Fast Synchronization In IVR Using REST API For HTML5 And AJAXFast Synchronization In IVR Using REST API For HTML5 And AJAX
Fast Synchronization In IVR Using REST API For HTML5 And AJAX
 
LCI2009-Tutorial
LCI2009-TutorialLCI2009-Tutorial
LCI2009-Tutorial
 

Similar to Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the DANS EASY Research Data Repository

Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21
vty
 
20230525_mmc_seminar.pdf
20230525_mmc_seminar.pdf20230525_mmc_seminar.pdf
20230525_mmc_seminar.pdf
Miel Vander Sande
 
Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
vty
 
The NIH Data Commons - BD2K All Hands Meeting 2015
The NIH Data Commons -  BD2K All Hands Meeting 2015The NIH Data Commons -  BD2K All Hands Meeting 2015
The NIH Data Commons - BD2K All Hands Meeting 2015
Vivien Bonazzi
 
Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.
Menzo Windhouwer
 
What is SDMX-RDF?
What is SDMX-RDF?What is SDMX-RDF?
What is SDMX-RDF?
Richard Cyganiak
 
DICE & Cloudify – Quality Big Data Made Easy
DICE & Cloudify – Quality Big Data Made EasyDICE & Cloudify – Quality Big Data Made Easy
DICE & Cloudify – Quality Big Data Made Easy
Cloudify Community
 
CDMI For Swift
CDMI For SwiftCDMI For Swift
CDMI For Swift
Mark Carlson
 
Peter Tiernan - Preservation and Trust at DRI - OR2015
Peter Tiernan - Preservation and Trust at DRI - OR2015Peter Tiernan - Preservation and Trust at DRI - OR2015
Peter Tiernan - Preservation and Trust at DRI - OR2015
dri_ireland
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
vty
 
Semantic MediaWiki as Knowledge Graph Interface
Semantic MediaWiki as Knowledge Graph InterfaceSemantic MediaWiki as Knowledge Graph Interface
Semantic MediaWiki as Knowledge Graph Interface
Bernhard Krabina
 
Plannen Code Jam OpenSocial gadgets
Plannen Code Jam OpenSocial gadgets Plannen Code Jam OpenSocial gadgets
Plannen Code Jam OpenSocial gadgets
kirstenveelo
 
Buildvoc Introduction to linked data digital construction week 2018
Buildvoc Introduction to linked data digital construction week 2018Buildvoc Introduction to linked data digital construction week 2018
Buildvoc Introduction to linked data digital construction week 2018
Phil Stacey ICIOB
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
Marin Dimitrov
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
EUCLID project
 
NISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector AdministrationNISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector Administration
National Information Standards Organization (NISO)
 
Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases
Zakaria Zubi
 
EUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan BroederEUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan Broeder
OpenAIRE
 
8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....
8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....
8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....
IMPACT Centre of Competence
 
Linked Data to Improve the OER Experience
Linked Data to Improve the OER ExperienceLinked Data to Improve the OER Experience
Linked Data to Improve the OER Experience
The Open Education Consortium
 

Similar to Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the DANS EASY Research Data Repository (20)

Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21
 
20230525_mmc_seminar.pdf
20230525_mmc_seminar.pdf20230525_mmc_seminar.pdf
20230525_mmc_seminar.pdf
 
Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
 
The NIH Data Commons - BD2K All Hands Meeting 2015
The NIH Data Commons -  BD2K All Hands Meeting 2015The NIH Data Commons -  BD2K All Hands Meeting 2015
The NIH Data Commons - BD2K All Hands Meeting 2015
 
Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.
 
What is SDMX-RDF?
What is SDMX-RDF?What is SDMX-RDF?
What is SDMX-RDF?
 
DICE & Cloudify – Quality Big Data Made Easy
DICE & Cloudify – Quality Big Data Made EasyDICE & Cloudify – Quality Big Data Made Easy
DICE & Cloudify – Quality Big Data Made Easy
 
CDMI For Swift
CDMI For SwiftCDMI For Swift
CDMI For Swift
 
Peter Tiernan - Preservation and Trust at DRI - OR2015
Peter Tiernan - Preservation and Trust at DRI - OR2015Peter Tiernan - Preservation and Trust at DRI - OR2015
Peter Tiernan - Preservation and Trust at DRI - OR2015
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
 
Semantic MediaWiki as Knowledge Graph Interface
Semantic MediaWiki as Knowledge Graph InterfaceSemantic MediaWiki as Knowledge Graph Interface
Semantic MediaWiki as Knowledge Graph Interface
 
Plannen Code Jam OpenSocial gadgets
Plannen Code Jam OpenSocial gadgets Plannen Code Jam OpenSocial gadgets
Plannen Code Jam OpenSocial gadgets
 
Buildvoc Introduction to linked data digital construction week 2018
Buildvoc Introduction to linked data digital construction week 2018Buildvoc Introduction to linked data digital construction week 2018
Buildvoc Introduction to linked data digital construction week 2018
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
NISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector AdministrationNISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector Administration
 
Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases Knowledge Discovery in Remote Access Databases
Knowledge Discovery in Remote Access Databases
 
EUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan BroederEUDAT data architecture and interoperability aspects – Daan Broeder
EUDAT data architecture and interoperability aspects – Daan Broeder
 
8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....
8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....
8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer....
 
Linked Data to Improve the OER Experience
Linked Data to Improve the OER ExperienceLinked Data to Improve the OER Experience
Linked Data to Improve the OER Experience
 

More from Andrea Scharnhorst

The Polifonia portal: a confluence of user stories, research pilots, data man...
The Polifonia portal: a confluence of user stories, research pilots, data man...The Polifonia portal: a confluence of user stories, research pilots, data man...
The Polifonia portal: a confluence of user stories, research pilots, data man...
Andrea Scharnhorst
 
Floating classifications - Knowledge Organization Systems in past, present an...
Floating classifications - Knowledge Organization Systems in past, present an...Floating classifications - Knowledge Organization Systems in past, present an...
Floating classifications - Knowledge Organization Systems in past, present an...
Andrea Scharnhorst
 
Digging into the Knowledge Graph (2017-2020)
Digging into the Knowledge Graph (2017-2020)Digging into the Knowledge Graph (2017-2020)
Digging into the Knowledge Graph (2017-2020)
Andrea Scharnhorst
 
Dilemmata of research infrastructures
Dilemmata of research infrastructuresDilemmata of research infrastructures
Dilemmata of research infrastructures
Andrea Scharnhorst
 
DARIAH Contributions 2019
DARIAH Contributions 2019DARIAH Contributions 2019
DARIAH Contributions 2019
Andrea Scharnhorst
 
Data curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research processData curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research process
Andrea Scharnhorst
 
SUSTAINABILITY BEYOND GUIDELINES
SUSTAINABILITY BEYOND GUIDELINESSUSTAINABILITY BEYOND GUIDELINES
SUSTAINABILITY BEYOND GUIDELINES
Andrea Scharnhorst
 
Information science in practice - research at a Trusted Digital Archive
Information science in practice - research at a Trusted Digital ArchiveInformation science in practice - research at a Trusted Digital Archive
Information science in practice - research at a Trusted Digital Archive
Andrea Scharnhorst
 
How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...
Andrea Scharnhorst
 
Bibliometrics, Webometrics, Altmetrics, Alternative metrics.
Bibliometrics, Webometrics, Altmetrics, Alternative metrics.Bibliometrics, Webometrics, Altmetrics, Alternative metrics.
Bibliometrics, Webometrics, Altmetrics, Alternative metrics.
Andrea Scharnhorst
 
Why do we need to model the science system?
Why do we need to model the science system?Why do we need to model the science system?
Why do we need to model the science system?
Andrea Scharnhorst
 
Humanities and ICT
Humanities and ICTHumanities and ICT
Humanities and ICT
Andrea Scharnhorst
 
Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...
Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...
Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...
Andrea Scharnhorst
 
Between  information  retrieval  services  and bibliometrics  research. New  ...
Between  information  retrieval  services  and bibliometrics  research. New  ...Between  information  retrieval  services  and bibliometrics  research. New  ...
Between  information  retrieval  services  and bibliometrics  research. New  ...
Andrea Scharnhorst
 
Knowledge maps for libraries and archives - uses and use cases
Knowledge maps for libraries and archives - uses and use casesKnowledge maps for libraries and archives - uses and use cases
Knowledge maps for libraries and archives - uses and use cases
Andrea Scharnhorst
 
Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...
Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...
Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...
Andrea Scharnhorst
 
Rare (and emergent) disciplines in the light of science studies
Rare (and emergent) disciplines in the light of science studiesRare (and emergent) disciplines in the light of science studies
Rare (and emergent) disciplines in the light of science studies
Andrea Scharnhorst
 
Drowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research fundingDrowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research funding
Andrea Scharnhorst
 
Digital Humanities as Innovation: ‘constant revolution’ or ‘moving to the su...
Digital Humanities as Innovation:  ‘constant revolution’ or ‘moving to the su...Digital Humanities as Innovation:  ‘constant revolution’ or ‘moving to the su...
Digital Humanities as Innovation: ‘constant revolution’ or ‘moving to the su...
Andrea Scharnhorst
 
Mapping Digital Humanities projects. A pilot of a DH project registry for The...
Mapping Digital Humanities projects. A pilot of a DH project registry for The...Mapping Digital Humanities projects. A pilot of a DH project registry for The...
Mapping Digital Humanities projects. A pilot of a DH project registry for The...
Andrea Scharnhorst
 

More from Andrea Scharnhorst (20)

The Polifonia portal: a confluence of user stories, research pilots, data man...
The Polifonia portal: a confluence of user stories, research pilots, data man...The Polifonia portal: a confluence of user stories, research pilots, data man...
The Polifonia portal: a confluence of user stories, research pilots, data man...
 
Floating classifications - Knowledge Organization Systems in past, present an...
Floating classifications - Knowledge Organization Systems in past, present an...Floating classifications - Knowledge Organization Systems in past, present an...
Floating classifications - Knowledge Organization Systems in past, present an...
 
Digging into the Knowledge Graph (2017-2020)
Digging into the Knowledge Graph (2017-2020)Digging into the Knowledge Graph (2017-2020)
Digging into the Knowledge Graph (2017-2020)
 
Dilemmata of research infrastructures
Dilemmata of research infrastructuresDilemmata of research infrastructures
Dilemmata of research infrastructures
 
DARIAH Contributions 2019
DARIAH Contributions 2019DARIAH Contributions 2019
DARIAH Contributions 2019
 
Data curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research processData curation and data archiving at different stages of the research process
Data curation and data archiving at different stages of the research process
 
SUSTAINABILITY BEYOND GUIDELINES
SUSTAINABILITY BEYOND GUIDELINESSUSTAINABILITY BEYOND GUIDELINES
SUSTAINABILITY BEYOND GUIDELINES
 
Information science in practice - research at a Trusted Digital Archive
Information science in practice - research at a Trusted Digital ArchiveInformation science in practice - research at a Trusted Digital Archive
Information science in practice - research at a Trusted Digital Archive
 
How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...
 
Bibliometrics, Webometrics, Altmetrics, Alternative metrics.
Bibliometrics, Webometrics, Altmetrics, Alternative metrics.Bibliometrics, Webometrics, Altmetrics, Alternative metrics.
Bibliometrics, Webometrics, Altmetrics, Alternative metrics.
 
Why do we need to model the science system?
Why do we need to model the science system?Why do we need to model the science system?
Why do we need to model the science system?
 
Humanities and ICT
Humanities and ICTHumanities and ICT
Humanities and ICT
 
Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...
Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...
Comparison of methods – an unloved duty? Examples from an ongoing bibliometri...
 
Between  information  retrieval  services  and bibliometrics  research. New  ...
Between  information  retrieval  services  and bibliometrics  research. New  ...Between  information  retrieval  services  and bibliometrics  research. New  ...
Between  information  retrieval  services  and bibliometrics  research. New  ...
 
Knowledge maps for libraries and archives - uses and use cases
Knowledge maps for libraries and archives - uses and use casesKnowledge maps for libraries and archives - uses and use cases
Knowledge maps for libraries and archives - uses and use cases
 
Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...
Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...
Digital Humanities in The Netherlands DARIAH, CLARIN, CLARIAH, … DHx.0 A pers...
 
Rare (and emergent) disciplines in the light of science studies
Rare (and emergent) disciplines in the light of science studiesRare (and emergent) disciplines in the light of science studies
Rare (and emergent) disciplines in the light of science studies
 
Drowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research fundingDrowning in information – the need of macroscopes for research funding
Drowning in information – the need of macroscopes for research funding
 
Digital Humanities as Innovation: ‘constant revolution’ or ‘moving to the su...
Digital Humanities as Innovation:  ‘constant revolution’ or ‘moving to the su...Digital Humanities as Innovation:  ‘constant revolution’ or ‘moving to the su...
Digital Humanities as Innovation: ‘constant revolution’ or ‘moving to the su...
 
Mapping Digital Humanities projects. A pilot of a DH project registry for The...
Mapping Digital Humanities projects. A pilot of a DH project registry for The...Mapping Digital Humanities projects. A pilot of a DH project registry for The...
Mapping Digital Humanities projects. A pilot of a DH project registry for The...
 

Recently uploaded

How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
Celine George
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
TechSoup
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
Celine George
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Dr. Vinod Kumar Kanvaria
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Excellence Foundation for South Sudan
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
RitikBhardwaj56
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
Community pharmacy- Social and preventive pharmacy UNIT 5
Community pharmacy- Social and preventive pharmacy UNIT 5Community pharmacy- Social and preventive pharmacy UNIT 5
Community pharmacy- Social and preventive pharmacy UNIT 5
sayalidalavi006
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
Celine George
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
Nguyen Thanh Tu Collection
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
eBook.com.bd (প্রয়োজনীয় বাংলা বই)
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
simonomuemu
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Akanksha trivedi rama nursing college kanpur.
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 

Recently uploaded (20)

How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...
 
Your Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective UpskillingYour Skill Boost Masterclass: Strategies for Effective Upskilling
Your Skill Boost Masterclass: Strategies for Effective Upskilling
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...The simplified electron and muon model, Oscillating Spacetime: The Foundation...
The simplified electron and muon model, Oscillating Spacetime: The Foundation...
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Community pharmacy- Social and preventive pharmacy UNIT 5
Community pharmacy- Social and preventive pharmacy UNIT 5Community pharmacy- Social and preventive pharmacy UNIT 5
Community pharmacy- Social and preventive pharmacy UNIT 5
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 

Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the DANS EASY Research Data Repository

  • 1. Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DANS Research Data Repositories Slava Tykhonov; Jerry de Vries; Eko Indarto; Andrea Scharnhorst; Femmy Admiraal; Mike Priddy (DANS-KNAW) Presentation at ISKO Knowledge Organisation Research Observatory 24 Nov 2021 RESEARCH REPOSITORIES AND DATAVERSE: NEGOTIATING METADATA, VOCABULARIES AND DOMAIN NEEDS
  • 2. Content - Introduction - Who we are - CMDI in the ‘wild’ - CLARIN data collections in EASY - Creating FAIR metadata and semantic services - case of CMDI Pipeline - Extract-Transform-Load (CMDI) metadata into Dataverse - Workflow for linking external concepts to (CMDI) metadata values to make metadata FAIR - Lessons learned and pointers
  • 3. Introduction DANS - Royal Netherlands Academy of Arts and Sciences - research data expertise center - long-term preservation - archive (www.easy.dans.knaw; Dataverse.nl; Narcis.nl) CLARIAH.nl Large Scale Infrastructure Project for Humanities (CLARIN+DARIAH) DARIAH: Digital Research Infrastructure for the Arts and Humanities CLARIN: Common Language Resources and Technology Infrastructure CMDI: Component MetaData Infrastructure: a framework to describe and reuse metadata blueprints SKOSMOS: Open source web-based SKOS browser and publishing tool Challenge How to make the datasets from the CLARIN community in the long-term archive discoverable in the CLARIN infrastructure? How to search ‘in data’ ? = How to achieve richer indexing of metadata?
  • 4. The problem - on a generic level
  • 5. Vision: Semantic interoperability on the infrastructure level We envision a situation where thousands of Dataverse instances (due to EOSC) on the web can be simultaneously search for data. The old dream of Federated search/Universal catalogue can only be realised if: (1) Cross -walks; mapping across different metadata schemes are implemented (2) In metadata schemes we seek for ways to enrich indexes with values from controlled vocabularies Standard response = standardisation and harmonisation = repository software, certain metadata standards, or certain controlled vocabularies New response = explore agile solutions (Proof of Concept) which can be implemented by different communities (even smaller ones), so we keep variety and still enable integration.
  • 6. The problem - on a concrete level CMDI ‘in the wild’
  • 7. Content - Introduction - Who we are - CMDI in the ‘wild’ - CLARIN data collections in EASY - Creating FAIR metadata and semantic services - case of CMDI Pipeline - Extract-Transform-Load (CMDI) metadata into Dataverse - Workflow for linking external concepts to (CMDI) metadata values to make metadata FAIR - Lessons learned and pointers
  • 8. Conceptual approach: Semantic interoperability on the infrastructure level - building common solutions for everyone Dataverse Semantic API in release 5.6: https://github.com/IQSS/dataverse/releases/tag/v5.6 “Dataset metadata can be retrieved, set, and updated using a new, flatter JSON-LD format - following the format of an OAI-ORE export (RDA-conformant Bags), allowing for easier transfer of metadata to/from other systems (i.e. without needing to know Dataverse's metadata block and field storage architecture). This new API also allows for the update of terms metadata“. External controlled vocabularies support is being developed by DANS in SSHOC project and already integrated in Dataverse core in release 5.7. Proposal: https://docs.google.com/document/d/1txdcFuxskRx_tLsDQ7KKLFTMR_r9IBhorDu3V_r445w/ Interfaces: http://github.com/gdcc/dataverse-external-vocab-support Integrations: Wikidata, ORCID, MeSH, Skosmos vocabularies
  • 9. CMDI Pipeline - Backbone of our pipeline: Extract-Transform-Load (CMDI) metadata into Dataverse - One block relevant for semantic services: Mapping across metadata standards - Another block: Look-up for values in controlled vocabulary registers - enrich indexing
  • 10. SEMAF: A Proposal for a Flexible Semantic Mapping Framework Proposal: https://zenodo.org/record/4651421#.YT9lyC8RpZI POC: https://github.com/Dans-labs/semaf-poc
  • 11. Coming close to the implementation 1. Use Data Catalog Vocabulary (DCAT) mappings for CMDI metadata fields 2. Simple Knowledge Organization System (SKOS) to model a thesauri-like resources with simple skos:broader, skos:narrower and skos:related properties 3. Load CMDI properties and attributes and build a Knowledge Graph out of all elements 4. Enrich the Knowledge Graph with concept URIs from various controlled vocabularies like Skosmos hosted or Wikidata 5. Use different format data-serialization formats suitable for the integration with different systems. For example, json-ld suitable for Dataverse, turtle for Jena Fuseki, RDF for LoD frameworks
  • 12. Complexity of CMDI is unfolding After the implementation ● Complexity in CMDI becomes more visible ● Identify core concepts which can be mapped to standard bibliographic schemes as DCAT (red box) ● Possibility to match values of CMDI concepts to other controlled vocabularies (green box)
  • 13. How does it look when implemented in Dataverse? Every field can be linked to the appropriate controlled vocabularies in FAIR way!
  • 14. Greater vision: Dataverse metadata schemas ingested into a Knowledge Graph Compound keyword field with SKOS We use SKOS relationships to keep the hierarchy and relationships between metadata fields Other Dataverse schemas: https://github.com/Dans-labs/semaf-client/tree/cmdi/schema
  • 15. Once in a Knowledge graph: what can we do? Pipeline managed to establish some relationships to Wikidata concepts and automatically updated the dataset with new conceptURIs! The example of automatic enrichment with Wikidata
  • 16. Content - Introduction - Who we are - CMDI in the ‘wild’ - CLARIN data collections in EASY - Creating FAIR metadata and semantic services - case of CMDI Pipeline - Extract-Transform-Load (CMDI) metadata into Dataverse - Workflow for linking external concepts to (CMDI) metadata values to make metadata FAIR - Lessons learned and pointers
  • 17. Lessons learned (I) Scientific communities and archives have different perspectives on standardisation, and semantic services. In research formalisation (including KOS, ontologies, any ‘model’) is a heuristic device, agile to new research questions, and so intrinsically ‘not interoperable’. In other words, there is a difference between research needs and information needs.
  • 18. Lessons learned (II) - We provided a solution for our CMDI problem - by creating a CLARIN compatible Dataverse solution, which via an API can be harvested by the CLARIN search service; we also created another perspective on the CDMI ‘challenge’ - We used Dataverse is a platform due to an open active community; - The examples we showed you some of are results of a ‘Vision Lab’ - proof of concepts - funded in projects as SSHOC, CLARIAH, EOSC - The results are envisioned be implemented locally. - But, in principle the solutions are platform agnostic.
  • 19. Lessons learned (III) - In the future, repositories might become nodes in a large searchable knowledge graph and semantic links might enable pathways for contextual/semantic search. - Part of this future will be automatically supported semantic enrichment at the local instantiations (automatic indexing in a net instead of in an index) - Problem: keep provenance, authority (trust) - governance between those (micro-service) providers need to be organised. What can we learn from history?
  • 20. References - pointers CMDI exploration tool DANS CMDI converter github CMDI properties frequency VLO top profiles CMDI core metadata proposal Core metadata components design for use cases DANS CMDI metadata generator CMDI metadata model published as TSV files Convertor can extract and show the hierarchy of all fields CLARIAH compliant Dataverse Docker module Dataverse Docker with CMDI metadata schema Core metadata components design guidelines Guidelines link Semantic Gateway as plugin app Dataverse gateway Semantic Gateway API Dataverse metadata schema ingested into Graph https://github.com/Dans-labs/semaf-client/tree/cmdi/sche ma
  • 21. References - pointers Dataverse 5.7 https://github.com/IQSS/dataverse/releases/tag/v5.7 Semantic Gateway: https://github.com/Dans-labs/semantic-gateway SSHOC task 5.2 http://github.com/SSHOC SEMAF client https://github.com/Dans-labs/semaf-client CMDI data model and namespaces: M. Windhouwer, E. Indarto, D. Broeder. CMD2RDF: Building a Bridge from CLARIN to Linked Open Data Flexible Metadata Schemes for Research Data Repositories. / de Vries, Jerry; Tykhonov, Vyacheslav; Scharnhorst, Andrea; Admiraal, Femmy; Indarto, Eko; Priddy, Mike. 2021. Abstract from CLARIN Annual Conference 2021. https://www.clarin.eu/content/programme-clarin-annual-conference-2021
  • 22. Questions? Slava Tykhonov <vyacheslav.tykhonov@dans.knaw.nl> Andrea Scharnhorst <andrea.scharnhorst@dans.knaw.nl>