External CV support in Dataverse 5.7

CV support in the next Dataverse release
Slava Tykhonov
lead software engineer
DANS-KNAW R&D
CESSDA Tools Open Hour: Dataverse, 18.11.2021

DANS Data Stations - Future DANS Data Services
Dataverse is an API-based data platform and a key framework for Open Innovation!

FAIR and Dataverse
Source: Mercè Crosas, “FAIR principles and beyond: implementation in Dataverse”

Out of the box CV support in Dataverse (1)
Source: Dataverse Metadata Schema

Out of the box CV support in Dataverse (2)
Internal vocabularies are stored in Dataverse itself; we need more CVs!

Semantic interoperability on the infrastructure level
Dataverse Semantic API in release 5.6: https://github.com/IQSS/dataverse/releases/tag/v5.6
“Dataset metadata can be retrieved, set, and updated using a new, flatter JSON-LD format -
following the format of an OAI-ORE export (RDA-conformant Bags), allowing for easier transfer of
metadata to/from other systems (i.e. without needing to know Dataverse's metadata block and field
storage architecture). This new API also allows for the update of terms metadata.”
External controlled vocabulary support is being developed by DANS within the SSHOC project and
is already integrated into the Dataverse core in release 5.7.
Proposal: https://docs.google.com/document/d/1txdcFuxskRx_tLsDQ7KKLFTMR_r9IBhorDu3V_r445w/
Interfaces: http://github.com/gdcc/dataverse-external-vocab-support
Integrations: Wikidata, ORCID, MeSH, Skosmos vocabularies
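
As a rough illustration of the flatter JSON-LD format quoted above, the sketch below builds (but does not send) a metadata update request. The endpoint path `/api/datasets/{id}/metadata`, the `replace=true` parameter, and the `X-Dataverse-key` header are assumptions based on the Dataverse Semantic API guide and should be checked against the documentation of your release. The server URL, dataset id, and token are placeholders, and `build_jsonld_update` is a hypothetical helper name.

```python
import json
import urllib.request


def build_jsonld_update(server_url, dataset_id, api_token, jsonld, replace=True):
    """Build (but do not send) a PUT request carrying dataset metadata as JSON-LD."""
    # Assumed endpoint layout; verify against the Semantic API guide.
    url = f"{server_url.rstrip('/')}/api/datasets/{dataset_id}/metadata"
    if replace:
        url += "?replace=true"
    return urllib.request.Request(
        url,
        data=json.dumps(jsonld).encode("utf-8"),
        method="PUT",
        headers={
            "X-Dataverse-key": api_token,
            "Content-Type": "application/ld+json",
        },
    )


# Placeholder values for illustration only.
request = build_jsonld_update(
    "https://demo.dataverse.example",
    42,
    "xxxxxxxx-xxxx",
    {
        "@context": {"title": "http://purl.org/dc/terms/title"},
        "title": "Updated title",
    },
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The JSON-LD body names the field by its vocabulary URI in `@context`, so the caller does not need to know which metadata block stores the title.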

Building block: Skosmos to host ontologies
● Skosmos is developed in Europe by the National Library of Finland (NLF)
● active global user community
● search and browsing interface for SKOS concepts
● multilingual vocabulary support
● used for different use cases (publishing vocabularies, building discovery systems, vocabulary visualization)

Skosmos API with python module
pip install skosmos-client
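
The skosmos-client package wraps the Skosmos REST API. The sketch below shows roughly what a search call looks like underneath, using only the standard library. It assumes the public Finto instance at `https://api.finto.fi/rest/v1/` and the `search` endpoint with `query`, `vocab`, and `lang` parameters. The helper names `search_url`, `parse_results`, and `search` are hypothetical and not part of skosmos-client.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.finto.fi/rest/v1/"  # assumed public Skosmos instance


def search_url(query, vocab=None, lang="en", api_base=API_BASE):
    """Build the URL of a Skosmos search request."""
    params = {"query": query, "lang": lang}
    if vocab:
        params["vocab"] = vocab
    return api_base.rstrip("/") + "/search?" + urllib.parse.urlencode(params)


def parse_results(payload):
    """Extract (uri, prefLabel) pairs from a decoded search response."""
    return [(hit["uri"], hit.get("prefLabel", "")) for hit in payload.get("results", [])]


def search(query, **kwargs):
    """Run the search over the network and return (uri, prefLabel) pairs."""
    with urllib.request.urlopen(search_url(query, **kwargs)) as response:
        return parse_results(json.load(response))


print(search_url("cat", vocab="yso", lang="en"))
# search("cat", vocab="yso", lang="en") performs the actual HTTP call.
```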

SKOSMOS API for GRID ontology

Dataverse deposit form with connection to ontologies
Every field can be linked to the appropriate controlled vocabularies in a FAIR way!

One metadata field can be linked to many ontologies
Language switch in Dataverse will change the language of suggested terms!
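
The behaviour described on this slide can be sketched as a lookup over several vocabularies that returns labels in the current interface language. This is an illustration only, not the Dataverse implementation: the function `suggest`, the data layout, the fallback to English, and the example.org URIs are all assumptions made up for the example.

```python
def suggest(term, vocabularies, ui_lang, fallback="en"):
    """Return (vocabulary, uri, label) for concepts whose label contains term.

    Labels are taken in ui_lang when available, otherwise in the fallback language.
    """
    needle = term.casefold()
    suggestions = []
    for vocab_name, concepts in vocabularies.items():
        for concept in concepts:
            labels = concept["labels"]
            label = labels.get(ui_lang) or labels.get(fallback)
            if label and needle in label.casefold():
                suggestions.append((vocab_name, concept["uri"], label))
    return suggestions


# Made-up sample data: two vocabularies attached to the same metadata field.
VOCABS = {
    "vocab-a": [
        {"uri": "http://example.org/a/1", "labels": {"en": "Education", "es": "Educación"}},
    ],
    "vocab-b": [
        {"uri": "http://example.org/b/7", "labels": {"en": "Educational policy", "es": "Política educativa"}},
    ],
}

for row in suggest("educ", VOCABS, ui_lang="es"):
    print(row)
```

Switching `ui_lang` changes only the labels shown to the depositor; the concept URIs that end up in the metadata stay the same.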

Configuration to add external controlled vocabularies
Pull Request to Dataverse core https://github.com/IQSS/dataverse/pull/7712

JavaScript interface
The CV interface is implemented in JavaScript and placed outside of the Dataverse application.
Internal:
"js-url": "/resources/js/cvoc-interface.js"
External:
"js-url": "https://raw.githubusercontent.com/Dans-labs/semantic-gateway/main/static/js/interface.js"

Example of the CV configuration in Dataverse
Configuration in pluggable JavaScript:
● Field cvocDemo is connected to the “unesco” controlled vocabulary hosted by Skosmos
● 4 languages available (en, fr, es, ru)
● js-url points to the JavaScript gateway that reads and transforms output from the external API endpoint
● every Skosmos concept is cached internally in Dataverse to improve sustainability
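
A configuration entry along the lines of the bullets above might be assembled as follows. Only the `js-url` key and the values (cvocDemo, unesco, the four languages) come from the slides. The other key names (`field-name`, `languages`, `vocabs`) are assumptions and should be checked against the examples in the dataverse-external-vocab-support repository. `make_cvoc_entry` is a hypothetical helper.

```python
import json


def make_cvoc_entry(field_name, vocab_names, languages, js_url):
    """Assemble one configuration entry for a CV-enabled metadata field."""
    return {
        "field-name": field_name,                      # assumed key name
        "js-url": js_url,                              # key shown on the slides
        "languages": ", ".join(languages),             # assumed key name and format
        "vocabs": {name: {} for name in vocab_names},  # assumed key; per-vocabulary details omitted
    }


config = [
    make_cvoc_entry(
        "cvocDemo",
        ["unesco"],
        ["en", "fr", "es", "ru"],
        "/resources/js/cvoc-interface.js",
    )
]
print(json.dumps(config, indent=2))
```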

We created the Semantic Gateway as a plugin app
Source: Dataverse gateway

Semantic Gateway for Skosmos and NDE

Suggestions for the usage of FAIR CVs
● Dutch Digital Heritage Network https://netwerkdigitaalerfgoed.nl
● Skosmos instances, for example, https://bartoc-skosmos.unibas.ch/en/
Skosmos client to access vocabularies https://pypi.org/project/skosmos-client/
● ORCID API to link CMDI records to identifiers of researchers https://info.orcid.org
● CESSDA CV Service https://vocabularies.cessda.eu
More are coming! https://github.com/CLARIAH/awesome-humanities-ontologies

Known issues with support of external CVs
● how CV support could be applied to any field
● support and ownership of available vocabularies
● backward compatibility with fields from the old metadata schema
● clean UI experience (one selection can fill 1, 2 or 4 child fields)
● can we use non-managed vocabularies or free-text values in the same field
● concept drift (the change of meaning of concepts)
● interoperability across all Dataverse instances
● how to ensure CVs are coming from authoritative services

Future plans
● Dataverse will be offered as an easy to install and maintain “archive in the box” solution available for all data providers
● External controlled vocabularies will be available out of the box and will be included within the CESSDA Metadata Model (CMM) and CLARIN CMDI
● Dataverse administrators should be able to turn on external CV support for any specific metadata field
● The same functionality will be implemented at the data file level to get variables linked to external CVs

Future plans: linking data (files) to external CVs
Source: Scholars Portal’s Data Curation Tool (Canada)

Questions?
Slava Tykhonov (DANS-KNAW)
vyacheslav.tykhonov@dans.knaw.nl
References:
Dataverse 5.7 https://github.com/IQSS/dataverse/releases/tag/v5.7
Semantic Gateway: https://github.com/Dans-labs/semantic-gateway
SSHOC task 5.2 http://github.com/SSHOC
