SlideShare a Scribd company logo
dans.knaw.nl
DANS is een instituut van KNAW en NWO
PIDs in CESSDA DataverseEU
Vyacheslav Tykhonov
Senior Information Scientist (DANS),
DataverseEU lead developer
CESSDA PID workshop,
20.03.2018
DataverseEU development model
• We’re not going to create new fork of Dataverse, our
contributions should go to the master branch hosted by
IQSS at Harvard
• Delivered as Docker images and deployed in Google Cloud
as CESSDA Dataverse repository
• Any Service Provider can host separate Dataverse instance
in its own cloud if it’s required
• Metadata from other CESSDA repositories will be harvested
by central DataverseEU repository
• Easy to add new languages if more partners will join during
or after the project
DataverseEU tasks overview
• Development of multilingual web interface (German,
Slovenian, Swedish, Hungarian, Italian, French, Spanish)
• Support of localized metadata model corresponding to
CESSDA CMM
• Design and development of PID plugin mechanism to allow
service providers to choose PID service of their preference
• Development of APIs for CESSDA CVs and Topic
Classification based on CESSDA CV Manager services
Multilingual web interface
Multilingual web interface: Spanish
PID structure in Dataverse
Every PID contains:
• Prefix: unique authority (ID of institution or
organization)
• Separator
• Sequence of characters or numbers for dataset ID
identification
Examples:
<PID> ::= <Naming Authority> "/" <Handle Local Name>
doi:10.4232/1.0001 (DOI)
hdl:10411/KL0X8C (handle)
Dataverse PID Plugin requirements
• We need flexible way to switch between PID service
providers (da|ra, DataCite, handle)
• Registering DOIs with da|ra will give data providers a
greater visibility and recognition as data references will be
integrated in da|ra search index
• Different data archives can get separate prefixes within the
same Dataverse instance and increase their visibility and
recognition
• PID Plugin can be used in combination with external storage
configuration (based on Swift) to host data locally in national
infrastructures
Current implementation of PID service
• out-of-the-box support of DOIs (DataCite) and handles
(handle.net)
• one Dataverse instance can be only bundled or to DOI, or to
handle
• there is no possibility to use own prefixes for different
organisations (DataverseNL is hdl:10411 for all partners)
• switching between DOIs and handles can be done by
executing API requests:
curl -X PUT -d hdl "http://localhost:8080/api/admin/settings/:Protocol"
curl -X PUT -d 10411 “http://localhost:8080/api/admin/settings/:Authority”
curl -X PUT -d doi "http://localhost:8080/api/admin/settings/:Protocol"
curl -X PUT -d 10.5072/FK2 "http://localhost:8080/api/admin/settings/:Authority"
The Dataset Lifecycle
Destroyed Updated URL
create
update
destroy
Published versions can
be de-accessed at any
time.
Unpublished versions
(drafts) can also be
deleted.
publish update
Credits: Felix Bensmann (GESIS). Supporting New PID Providers in Dataverse
PID assigning strategies
• Different PID for every new version of a dataset (da|ra)
• The same PID for the dataset, shared by all versions (DOI,
handle)
The central idea of PID plugin: every service provider can
choose the strategy for assigning PIDs that will fit the best to
their needs.
Warning: different communities need various strategies!
PID strategies based on community needs
• Sharing data via the Archive
Dataset files aren’t changing, PIDs are different
• Research Data deposit
(Ph.D. students): obligation to make data of thesis or
study publicly available, work in progress, PID is the same
The same PID for all versions of a dataset:
Example on difference between versions
In the same time there is no support for granularity, for example:
https://dataverse.nl/dataset.xhtml?persistentId=hdl:10411/KL0X8C&version=2.0
is not https://dataverse.nl/dataset.xhtml?persistentId=hdl:10411/KL0X8C.2
PID Plugin features
• Developed by GESIS and designed as extra module that can
be added to running Dataverse application
• Functionality provided by the PID Plugin triggered by
events:
• Creation (onCreate)
• Update (onUpdate)
• Publication (onPublish)
• Deaccession
• Destruction
• Lookup (lookupDoi, getProviderName)
• All settings are controlled via Dataverse API (suffix)
PID Plugin registration with da|ra
• Own XML schema
• Mandatory and optional fields
• Every dataset update will create new PID
• Metadata schema of service providers
should be synchronized with da|ra schema
<xml>
<metadata>
<ID>ABCDE</ID>
<version>v1.0</version>
<doi>
auth.ority/DV/ABCDE
</doi>
<title>title</title>
<url>https://…</url>
</metadata>
</xml>

More Related Content

What's hot

Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020
Julien Le Dem
 
Data Virtualization in the Cloud: Accelerating Data Virtualization Adoption
Data Virtualization in the Cloud: Accelerating Data Virtualization AdoptionData Virtualization in the Cloud: Accelerating Data Virtualization Adoption
Data Virtualization in the Cloud: Accelerating Data Virtualization Adoption
Denodo
 
Dataverse in the European Open Science Cloud
Dataverse in the European Open Science CloudDataverse in the European Open Science Cloud
Dataverse in the European Open Science Cloud
vty
 
HDF-EOS Software Developer/Vendor Workshop Wrapup
HDF-EOS Software Developer/Vendor Workshop WrapupHDF-EOS Software Developer/Vendor Workshop Wrapup
HDF-EOS Software Developer/Vendor Workshop Wrapup
The HDF-EOS Tools and Information Center
 
IoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management ThingsIoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management Things
DataWorks Summit
 
Improve your SQL workload with observability
Improve your SQL workload with observabilityImprove your SQL workload with observability
Improve your SQL workload with observability
OVHcloud
 
Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)
vty
 
Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...
Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...
Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...
VMware Tanzu
 
GDAL Enhancement for ESDIS Project
GDAL Enhancement for ESDIS ProjectGDAL Enhancement for ESDIS Project
GDAL Enhancement for ESDIS Project
The HDF-EOS Tools and Information Center
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
Alluxio, Inc.
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
HDF Product Designer
HDF Product DesignerHDF Product Designer
Hedvig slides from VMworld 2016
Hedvig slides from VMworld 2016Hedvig slides from VMworld 2016
Hedvig slides from VMworld 2016
Eric Carter
 
Ultralight data movement for IoT with SDC Edge. Guglielmo Iozzia - Optum
Ultralight data movement for IoT with SDC Edge. Guglielmo Iozzia - OptumUltralight data movement for IoT with SDC Edge. Guglielmo Iozzia - Optum
Ultralight data movement for IoT with SDC Edge. Guglielmo Iozzia - Optum
Data Driven Innovation
 
Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...
Research Data Alliance
 
HDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the CloudHDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the Cloud
The HDF-EOS Tools and Information Center
 
Hierarchical Data Formats (HDF) Update
Hierarchical Data Formats (HDF) UpdateHierarchical Data Formats (HDF) Update
Hierarchical Data Formats (HDF) Update
The HDF-EOS Tools and Information Center
 
Big Data Quickstart Series 3: Perform Data Integration
Big Data Quickstart Series 3: Perform Data IntegrationBig Data Quickstart Series 3: Perform Data Integration
Big Data Quickstart Series 3: Perform Data Integration
Alibaba Cloud
 
SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...
SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...
SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...
KDZ - Zentrum für Verwaltungsforschung
 
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
DataWorks Summit
 

What's hot (20)

Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020Data platform architecture principles - ieee infrastructure 2020
Data platform architecture principles - ieee infrastructure 2020
 
Data Virtualization in the Cloud: Accelerating Data Virtualization Adoption
Data Virtualization in the Cloud: Accelerating Data Virtualization AdoptionData Virtualization in the Cloud: Accelerating Data Virtualization Adoption
Data Virtualization in the Cloud: Accelerating Data Virtualization Adoption
 
Dataverse in the European Open Science Cloud
Dataverse in the European Open Science CloudDataverse in the European Open Science Cloud
Dataverse in the European Open Science Cloud
 
HDF-EOS Software Developer/Vendor Workshop Wrapup
HDF-EOS Software Developer/Vendor Workshop WrapupHDF-EOS Software Developer/Vendor Workshop Wrapup
HDF-EOS Software Developer/Vendor Workshop Wrapup
 
IoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management ThingsIoFMT – Internet of Fleet Management Things
IoFMT – Internet of Fleet Management Things
 
Improve your SQL workload with observability
Improve your SQL workload with observabilityImprove your SQL workload with observability
Improve your SQL workload with observability
 
Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)
 
Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...
Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...
Data Warehouse embraces Kubernetes and Modernized Data Platforms with Pivotal...
 
GDAL Enhancement for ESDIS Project
GDAL Enhancement for ESDIS ProjectGDAL Enhancement for ESDIS Project
GDAL Enhancement for ESDIS Project
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
HDF Product Designer
HDF Product DesignerHDF Product Designer
HDF Product Designer
 
Hedvig slides from VMworld 2016
Hedvig slides from VMworld 2016Hedvig slides from VMworld 2016
Hedvig slides from VMworld 2016
 
Ultralight data movement for IoT with SDC Edge. Guglielmo Iozzia - Optum
Ultralight data movement for IoT with SDC Edge. Guglielmo Iozzia - OptumUltralight data movement for IoT with SDC Edge. Guglielmo Iozzia - Optum
Ultralight data movement for IoT with SDC Edge. Guglielmo Iozzia - Optum
 
Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...
 
HDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the CloudHDFCloud Workshop: HDF5 in the Cloud
HDFCloud Workshop: HDF5 in the Cloud
 
Hierarchical Data Formats (HDF) Update
Hierarchical Data Formats (HDF) UpdateHierarchical Data Formats (HDF) Update
Hierarchical Data Formats (HDF) Update
 
Big Data Quickstart Series 3: Perform Data Integration
Big Data Quickstart Series 3: Perform Data IntegrationBig Data Quickstart Series 3: Perform Data Integration
Big Data Quickstart Series 3: Perform Data Integration
 
SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...
SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...
SMW Use Cases at the Provincial Government of Lower Austria, Gerald Streimelw...
 
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
 

Similar to Persistent identifiers in DataverseEU project

5 years of Dataverse evolution
5 years of Dataverse evolution 5 years of Dataverse evolution
5 years of Dataverse evolution
vty
 
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
vty
 
DataverseEU as multilingual repository
DataverseEU as multilingual repositoryDataverseEU as multilingual repository
DataverseEU as multilingual repository
vty
 
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Sri Ambati
 
Tutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaborationTutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaboration
PascalDesmarets1
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
Stéphane Fréchette
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
RTTS
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps Ironfan
Jim Kaskade
 
The Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingThe Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and Streaming
Timothy Spann
 
How AD has been re-engineered to extend to the cloud
How AD has been re-engineered to extend to the cloudHow AD has been re-engineered to extend to the cloud
How AD has been re-engineered to extend to the cloud
LDAPCon
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes
vty
 
FIWARE and Smart Data Models
FIWARE and Smart Data ModelsFIWARE and Smart Data Models
FIWARE and Smart Data Models
Fernando Lopez Aguilar
 
DevOps Digital Transformation: A real life use case enabled by Alien4Cloud
DevOps Digital Transformation: A real life use case enabled by Alien4CloudDevOps Digital Transformation: A real life use case enabled by Alien4Cloud
DevOps Digital Transformation: A real life use case enabled by Alien4Cloud
Cloudify Community
 
The Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryThe Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud Foundry
VMware Tanzu
 
SOLID Programming with Portable Class Libraries
SOLID Programming with Portable Class LibrariesSOLID Programming with Portable Class Libraries
SOLID Programming with Portable Class Libraries
Vagif Abilov
 
EOSC-hub service portfolio
EOSC-hub service portfolioEOSC-hub service portfolio
EOSC-hub service portfolio
EOSC-hub project
 
Big Data Infrastructure
Big Data InfrastructureBig Data Infrastructure
Big Data Infrastructure
Trivadis
 
A Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data VirtualizationA Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data Virtualization
Denodo
 
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life EasierWebinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
DataStax
 
Data as a service
Data as a serviceData as a service
Data as a service
Khushbu Joshi
 

Similar to Persistent identifiers in DataverseEU project (20)

5 years of Dataverse evolution
5 years of Dataverse evolution 5 years of Dataverse evolution
5 years of Dataverse evolution
 
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
 
DataverseEU as multilingual repository
DataverseEU as multilingual repositoryDataverseEU as multilingual repository
DataverseEU as multilingual repository
 
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
Nanda Vijaydev, BlueData - Deploying H2O in Large Scale Distributed Environme...
 
Tutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaborationTutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaboration
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps Ironfan
 
The Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and StreamingThe Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and Streaming
 
How AD has been re-engineered to extend to the cloud
How AD has been re-engineered to extend to the cloudHow AD has been re-engineered to extend to the cloud
How AD has been re-engineered to extend to the cloud
 
The world of Docker and Kubernetes
The world of Docker and Kubernetes The world of Docker and Kubernetes
The world of Docker and Kubernetes
 
FIWARE and Smart Data Models
FIWARE and Smart Data ModelsFIWARE and Smart Data Models
FIWARE and Smart Data Models
 
DevOps Digital Transformation: A real life use case enabled by Alien4Cloud
DevOps Digital Transformation: A real life use case enabled by Alien4CloudDevOps Digital Transformation: A real life use case enabled by Alien4Cloud
DevOps Digital Transformation: A real life use case enabled by Alien4Cloud
 
The Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryThe Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud Foundry
 
SOLID Programming with Portable Class Libraries
SOLID Programming with Portable Class LibrariesSOLID Programming with Portable Class Libraries
SOLID Programming with Portable Class Libraries
 
EOSC-hub service portfolio
EOSC-hub service portfolioEOSC-hub service portfolio
EOSC-hub service portfolio
 
Big Data Infrastructure
Big Data InfrastructureBig Data Infrastructure
Big Data Infrastructure
 
A Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data VirtualizationA Successful Journey to the Cloud with Data Virtualization
A Successful Journey to the Cloud with Data Virtualization
 
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life EasierWebinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
Webinar: DataStax Enterprise 5.0 What’s New and How It’ll Make Your Life Easier
 
Data as a service
Data as a serviceData as a service
Data as a service
 

More from vty

Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
vty
 
Decentralisation and knowledge graphs
Decentralisation and knowledge graphs Decentralisation and knowledge graphs
Decentralisation and knowledge graphs
vty
 
Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure
vty
 
Dataverse repository for research data in the COVID-19 Museum
Dataverse repository for research data  in the COVID-19 MuseumDataverse repository for research data  in the COVID-19 Museum
Dataverse repository for research data in the COVID-19 Museum
vty
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
vty
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
vty
 
External CV support in Dataverse 5.7
External CV support in Dataverse 5.7External CV support in Dataverse 5.7
External CV support in Dataverse 5.7
vty
 
Building COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhyBuilding COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhy
vty
 
CLARIN CMDI use case and flexible metadata schemes
CLARIN CMDI use case and flexible metadata schemes CLARIN CMDI use case and flexible metadata schemes
CLARIN CMDI use case and flexible metadata schemes
vty
 
Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21
vty
 
Controlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repositoryControlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repository
vty
 
Automated CI/CD testing, installation and deployment of Dataverse infrastruct...
Automated CI/CD testing, installation and deployment of Dataverse infrastruct...Automated CI/CD testing, installation and deployment of Dataverse infrastruct...
Automated CI/CD testing, installation and deployment of Dataverse infrastruct...
vty
 
Fighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial IntelligenceFighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial Intelligence
vty
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Project
vty
 
External controlled vocabularies support in Dataverse
External controlled vocabularies support in DataverseExternal controlled vocabularies support in Dataverse
External controlled vocabularies support in Dataverse
vty
 
Setting up Dataverse repository for research data
Setting up Dataverse repository for research dataSetting up Dataverse repository for research data
Setting up Dataverse repository for research data
vty
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
vty
 
Ontologies, controlled vocabularies and Dataverse
Ontologies, controlled vocabularies and DataverseOntologies, controlled vocabularies and Dataverse
Ontologies, controlled vocabularies and Dataverse
vty
 
CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse
vty
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
vty
 

More from vty (20)

Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
 
Decentralisation and knowledge graphs
Decentralisation and knowledge graphs Decentralisation and knowledge graphs
Decentralisation and knowledge graphs
 
Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure
 
Dataverse repository for research data in the COVID-19 Museum
Dataverse repository for research data  in the COVID-19 MuseumDataverse repository for research data  in the COVID-19 Museum
Dataverse repository for research data in the COVID-19 Museum
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
 
External CV support in Dataverse 5.7
External CV support in Dataverse 5.7External CV support in Dataverse 5.7
External CV support in Dataverse 5.7
 
Building COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhyBuilding COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhy
 
CLARIN CMDI use case and flexible metadata schemes
CLARIN CMDI use case and flexible metadata schemes CLARIN CMDI use case and flexible metadata schemes
CLARIN CMDI use case and flexible metadata schemes
 
Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21
 
Controlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repositoryControlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repository
 
Automated CI/CD testing, installation and deployment of Dataverse infrastruct...
Automated CI/CD testing, installation and deployment of Dataverse infrastruct...Automated CI/CD testing, installation and deployment of Dataverse infrastruct...
Automated CI/CD testing, installation and deployment of Dataverse infrastruct...
 
Fighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial IntelligenceFighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial Intelligence
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Project
 
External controlled vocabularies support in Dataverse
External controlled vocabularies support in DataverseExternal controlled vocabularies support in Dataverse
External controlled vocabularies support in Dataverse
 
Setting up Dataverse repository for research data
Setting up Dataverse repository for research dataSetting up Dataverse repository for research data
Setting up Dataverse repository for research data
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
 
Ontologies, controlled vocabularies and Dataverse
Ontologies, controlled vocabularies and DataverseOntologies, controlled vocabularies and Dataverse
Ontologies, controlled vocabularies and Dataverse
 
CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
 

Recently uploaded

Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
MaheshaNanjegowda
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
RDhivya6
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
PirithiRaju
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
LengamoLAppostilic
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
Vandana Devesh Sharma
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
Scintica Instrumentation
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
Areesha Ahmad
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Selcen Ozturkcan
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
Shashank Shekhar Pandey
 

Recently uploaded (20)

Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdfwaterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
waterlessdyeingtechnolgyusing carbon dioxide chemicalspdf
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
 

Persistent identifiers in DataverseEU project

  • 1. dans.knaw.nl DANS is een instituut van KNAW en NWO PIDs in CESSDA DataverseEU Vyacheslav Tykhonov Senior Information Scientist (DANS), DataverseEU lead developer CESSDA PID workshop, 20.03.2018
  • 2. DataverseEU development model • We’re not going to create new fork of Dataverse, our contributions should go to the master branch hosted by IQSS at Harvard • Delivered as Docker images and deployed in Google Cloud as CESSDA Dataverse repository • Any Service Provider can host separate Dataverse instance in its own cloud if it’s required • Metadata from other CESSDA repositories will be harvested by central DataverseEU repository • Easy to add new languages if more partners will join during or after the project
  • 3. DataverseEU tasks overview • Development of multilingual web interface (German, Slovenian, Swedish, Hungarian, Italian, French, Spanish) • Support of localized metadata model corresponding to CESSDA CMM • Design and development of PID plugin mechanism to allow service providers to choose PID service of their preference • Development of APIs for CESSDA CVs and Topic Classification based on CESSDA CV Manager services
  • 6. PID structure in Dataverse Every PID contains: • Prefix: unique authority (ID of institution or organization) • Separator • Sequence of characters or numbers for dataset ID identification Examples: <PID> ::= <Naming Authority> "/" <Handle Local Name> doi:10.4232/1.0001 (DOI) hdl:10411/KL0X8C (handle)
  • 7. Dataverse PID Plugin requirements • We need flexible way to switch between PID service providers (da|ra, DataCite, handle) • Registering DOIs with da|ra will give data providers a greater visibility and recognition as data references will be integrated in da|ra search index • Different data archives can get separate prefixes within the same Dataverse instance and increase their visibility and recognition • PID Plugin can be used in combination with external storage configuration (based on Swift) to host data locally in national infrastructures
  • 8. Current implementation of PID service • out-of-the-box support of DOIs (DataCite) and handles (handle.net) • one Dataverse instance can be only bundled or to DOI, or to handle • there is no possibility to use own prefixes for different organisations (DataverseNL is hdl:10411 for all partners) • switching between DOIs and handles can be done by executing API requests: curl -X PUT -d hdl "http://localhost:8080/api/admin/settings/:Protocol" curl -X PUT -d 10411 “http://localhost:8080/api/admin/settings/:Authority” curl -X PUT -d doi "http://localhost:8080/api/admin/settings/:Protocol" curl -X PUT -d 10.5072/FK2 "http://localhost:8080/api/admin/settings/:Authority"
  • 9. The Dataset Lifecycle Destroyed Updated URL create update destroy Published versions can be de-accessed at any time. Unpublished versions (drafts) can also be deleted. publish update Credits: Felix Bensmann (GESIS). Supporting New PID Providers in Dataverse
  • 10. PID assigning strategies • Different PID for every new version of a dataset (da|ra) • The same PID for the dataset, shared by all versions (DOI, handle) The central idea of PID plugin: every service provider can choose the strategy for assigning PIDs that will fit the best to their needs. Warning: different communities need various strategies!
  • 11. PID strategies based on community needs • Sharing data via the Archive Dataset files aren’t changing, PIDs are different • Research Data deposit (Ph.D. students): obligation to make data of thesis or study publicly available, work in progress, PID is the same
  • 12. The same PID for all versions of a dataset: Example on difference between versions In the same time there is no support for granularity, for example: https://dataverse.nl/dataset.xhtml?persistentId=hdl:10411/KL0X8C&version=2.0 is not https://dataverse.nl/dataset.xhtml?persistentId=hdl:10411/KL0X8C.2
  • 13. PID Plugin features • Developed by GESIS and designed as extra module that can be added to running Dataverse application • Functionality provided by the PID Plugin triggered by events: • Creation (onCreate) • Update (onUpdate) • Publication (onPublish) • Deaccession • Destruction • Lookup (lookupDoi, getProviderName) • All settings are controlled via Dataverse API (suffix)
  • 14. PID Plugin registration with da|ra • Own XML schema • Mandatory and optional fields • Every dataset update will create new PID • Metadata schema of service providers should be synchronized with da|ra schema <xml> <metadata> <ID>ABCDE</ID> <version>v1.0</version> <doi> auth.ority/DV/ABCDE </doi> <title>title</title> <url>https://…</url> </metadata> </xml>