SlideShare a Scribd company logo
1 of 29
Download to read offline
Persistent Identifiers for Scientific Data at CSCS
Persistent Identifiers in Research – ETH Zürich event
Mario Valle, CSCS
September 13, 2019
Founded in 1991, CSCS, the Swiss
National Supercomputing Centre,
develops and provides the key
supercomputing capabilities
required to solve important problems
to science and/or society. …
Persistent Identifiers for Scientific Data at CSCS 2
CSCS Mission
Founded in 1991, CSCS, the Swiss
National Supercomputing Centre,
develops and provides the key
supercomputing capabilities
required to solve important problems
to science and/or society. …
Persistent Identifiers for Scientific Data at CSCS 3
CSCS Mission
Means also: Data Management, Data Analytics, FAIR support…
 Data should be unambiguously and
certainly identified (by something that
depends on data content and not location)
 A persistent identifier is a handle for any
type of dataset. Identifies data objects
regardless of their physical location
 A persistent identifier should be
permanent and reliably associated to its
dataset
 A persistent identifier carries more
information than a generic UUID because
it can associate metadata to the dataset
Persistent Identifiers for Scientific Data at CSCS 4
Prerequisite for good data science
Base of every handle system (DOI, PID, URN, ARK, PURL, ISBN…)
Persistent Identifiers for Scientific Data at CSCS 5
HANDLE
DATA
HANDLE
RESOLVER
HANDLE URL
URL
DOC
The Handle System is “a general purpose distributed information
system that provides efficient, extensible, and secure identifier
and resolution services for use on networks such as the Internet.”
The DOI resolving process
Persistent Identifiers for Scientific Data at CSCS 6
DOI
10.1038/nature07786
DOI
RESOLVER
DOI URL
URL
DOC
The PID resolving process
Persistent Identifiers for Scientific Data at CSCS 7
PID
21.17101/MY-SUFFIX-123
HANDLE
RESOLVER DATA
HANDLE URL
URL
The 21. identifies ePIC PID; 10. identifies DOI. Both handled by DONA foundation (https://www.dona.net/mpas)
ePIC consortium for Persistent Identifiers (PID)
Persistent Identifiers for Scientific Data at CSCS 8
“The eResearch Persistent Identifier Consortium (ePIC)
offers a service to create, manage, and resolve persistent
identifiers (PID). The increasing amount of research data,
the variety of the usage profiles and the international
exchange within different infrastructures demand to
uniquely assign the data with a PID with a high degree of
flexibility and robustness. ePIC offers a reliable mechanism
to guarantee these features of persistent identifiers.”
Excerpt from a poster at RDA 3rd Plenary Meeting
https://www.pidconsortium.eu/
CSCS is part of the ePIC consortium (since Sept. 2018)
Persistent Identifiers for Scientific Data at CSCS 9
CSCS will provide services to generate and
manage a certain range of PID assigned to
Switzerland and to resolve any PID
 For reliability and availability CSCS has
three handle servers integrated in the ePIC
infrastructure
 To resolve a PID use the master resolver:
https://hdl.handle.net/21.17101/SUFFIX
 We manage two prefixes:
 21.17101 (persistent)
 21.T17999 (testing)
 We could provide to research institutions
they own prefix:
 From 21.17102 ↑ counting up
 From 21.T17998 ↓ counting down
Persistent Identifiers for Scientific Data at CSCS 10
Infrastructure at CSCS
 Through the portal pid.cscs.ch (currently
on invitation only)
 The PID system could be accessed through
its API (docs on: http://www.handle.net/)
 Creation and management of PIDs through
the portal and through the API
Persistent Identifiers for Scientific Data at CSCS 11
User access to CSCS PID infrastructure
Persistent Identifiers for Scientific Data at CSCS 12
Next use case: long term storage at CSCS
PID
Simulation
results
CSCS Long-term archiving
user interface
Access simulation
results using PID
Persistent Identifiers for Scientific Data at CSCS 13
Next use case: long term storage at CSCS
CSCS Long-term archiving
user interfaceAccess simulation
results using PID
Data
migration
Persistent Identifiers for Scientific Data at CSCS 14
Next Use Case: Automatically associate PID to files
Result
location
PID
 Sufficiently flexible to cover ample set of
scientific data needs
 Make use of ontologies to extend the
metadata set
 Enhance the “Interoperable & Reusable” in
FAIR
 Strong dependency on science domains
 The user could associate a set of metadata
to a PID
 The user can run queries on metadata to
obtain a list of PID
Persistent Identifiers for Scientific Data at CSCS 15
Last Use Case: scientific metadata
DOI comes with an established set of metadata
Your data deserves a PID 16
 I’m the point of contact for PID ideas,
suggestions and project specific requests
 I’m collecting use cases to suggest how this
technology could help Swiss scientist’s work
 Contact me at: mvalle@cscs.ch
Persistent Identifiers for Scientific Data at CSCS 17
Main CSCS goal: find Use Cases, experiment with users
 All handle systems (PID, DOI, etc.) are
glorified DNS systems
 DNS receives www.google.com and returns:
172.217.16.36
 DOI receives 10.1038/nature07786 and
returns:
https://www.nature.com/articles/nature07786
 Who enforces that returned URL is valid?
 Who enforces file content has not tampered
with? Versioning?
 Who enforces file has not moved without
updating its PID or DOI?
 Who protects file (or publication) from being
deleted?
Persistent Identifiers for Scientific Data at CSCS 18
Where all Handle Systems stops?
A human (cultural) problem needs a human solution
Data mining:
“my data is mine,
and your data is mine”
Persistent Identifiers for Scientific Data at CSCS 19
PID future at CSCS
 We want to create awareness and, hopefully, build a Swiss community interested
in this aspect of data management
 Continue collecting and implementing PID Use Cases
 Continue working with other ePIC Consortium members to contribute to PID
maturity and to simplify PID usage and metadata management and search
 Study ePIC collaborations:
 With ORCID Project RIPEN. This project proposes the use of JSON Web tokens (JWTs) for
collecting authenticated user permissions and to delegate them from one system to another.
 With EU Project FREYA. The project aims to extend the infrastructure for persistent identifiers
(PIDs) as a core component of open research, in the EU and globally.
Persistent Identifiers for Scientific Data at CSCS 20
Thank you for your attention.
cscsch
PID needs a social infrastructure
 PID Infrastructure maintained by a dedicated and reliable team
 Provided by a non-profit organization
 Governed by international boards
 Based on open standards
22Persistent Identifiers for Scientific Data at CSCS
A human problem needs a human solution
Persistent Identifiers for Scientific Data at CSCS 23
We
moved
things
around
We
deleted
it
Wrong
or lost
content
http://www.phdcomics.com/comics/archive.php?comicid=1323
Not to say data management
leaves (often) a lot to be desired…
Persistent Identifiers for Scientific Data at CSCS 24
Persistent Identifiers for Scientific Data at CSCS 25
Next use case: long term storage at CSCS
Persistent Identifiers for Scientific Data at CSCS 26
Next use case: long term storage at CSCS
Persistent Identifiers for Scientific Data at CSCS 27
Next use case: long term storage at CSCS
PID
Simulation
results
CSCS Long-term archiving
user interface
 DOI for publications, PID for data
 DOI has metadata for publications, PID more general
 (not sure of this): DOI: managed by International DOI Foundation (IDF), a not-for-
profit membership organization that is the governance and management body for
the federation of Registration Agencies providing Digital Object Identifier (DOI)
services and registration. PID is managed by the ePIC consortium.
 Both depend from DONA for the first part of prefix (10. vs. 21.)
Persistent Identifiers for Scientific Data at CSCS 28
Handle frameworks comparison
Persistent Identifiers for Scientific Data at CSCS 29

More Related Content

Similar to Persistent Identifiers for Scientific Data at CSCS

Komatsoulis internet2 executive track
Komatsoulis internet2 executive trackKomatsoulis internet2 executive track
Komatsoulis internet2 executive trackGeorge Komatsoulis
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesASIS&T
 
Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs vty
 
BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands Vivien Bonazzi
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Projectvty
 
Data Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemData Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemWarren Kibbe
 
Research Data Management, Challenges and Tools - Per Öster
Research Data Management, Challenges and Tools - Per Öster Research Data Management, Challenges and Tools - Per Öster
Research Data Management, Challenges and Tools - Per Öster LEARN Project
 
Building COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhyBuilding COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhyvty
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataversevty
 
Managing and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsManaging and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsHong-Linh Truong
 
eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...
eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...
eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...Fòrum Català d’Informació i Salut
 
DSpace-CRIS, anticipating innovation
DSpace-CRIS, anticipating innovationDSpace-CRIS, anticipating innovation
DSpace-CRIS, anticipating innovation4Science
 
Facilitating Scientific Collaborations by Delegating Identity Management
Facilitating Scientific Collaborations by Delegating Identity ManagementFacilitating Scientific Collaborations by Delegating Identity Management
Facilitating Scientific Collaborations by Delegating Identity Management Von Welch
 
Introduction to eudat and its services
Introduction to eudat and its servicesIntroduction to eudat and its services
Introduction to eudat and its servicesEUDAT
 
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...Kathleen Jagodnik
 
Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...
  Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...  Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...
Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...Carmen Holotescu
 
Building a Blockchain-based Reputation Infrastructure for Open Research. Case...
Building a Blockchain-based Reputation Infrastructure for Open Research. Case...Building a Blockchain-based Reputation Infrastructure for Open Research. Case...
Building a Blockchain-based Reputation Infrastructure for Open Research. Case...Carmen Holotescu
 
Cytoscape ci chapter 1
Cytoscape ci chapter 1Cytoscape ci chapter 1
Cytoscape ci chapter 1bdemchak
 

Similar to Persistent Identifiers for Scientific Data at CSCS (20)

Komatsoulis internet2 executive track
Komatsoulis internet2 executive trackKomatsoulis internet2 executive track
Komatsoulis internet2 executive track
 
Hughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication RepositoriesHughes RDAP11 Data Publication Repositories
Hughes RDAP11 Data Publication Repositories
 
Freya, en förutsättning för öppna vetenskapssystem
Freya, en förutsättning för öppna vetenskapssystemFreya, en förutsättning för öppna vetenskapssystem
Freya, en förutsättning för öppna vetenskapssystem
 
Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
 
BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands
 
Building COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science ProjectBuilding COVID-19 Museum as Open Science Project
Building COVID-19 Museum as Open Science Project
 
Data Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health SystemData Harmonization for a Molecularly Driven Health System
Data Harmonization for a Molecularly Driven Health System
 
Sinnott Paper
Sinnott PaperSinnott Paper
Sinnott Paper
 
Research Data Management, Challenges and Tools - Per Öster
Research Data Management, Challenges and Tools - Per Öster Research Data Management, Challenges and Tools - Per Öster
Research Data Management, Challenges and Tools - Per Öster
 
Building COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhyBuilding COVID-19 Knowledge Graph at CoronaWhy
Building COVID-19 Knowledge Graph at CoronaWhy
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
 
Managing and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsManaging and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and Clouds
 
eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...
eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...
eHealth Area by Juan Carlos Castro per la XI Jornada Fòrum Catalalà d’Informa...
 
DSpace-CRIS, anticipating innovation
DSpace-CRIS, anticipating innovationDSpace-CRIS, anticipating innovation
DSpace-CRIS, anticipating innovation
 
Facilitating Scientific Collaborations by Delegating Identity Management
Facilitating Scientific Collaborations by Delegating Identity ManagementFacilitating Scientific Collaborations by Delegating Identity Management
Facilitating Scientific Collaborations by Delegating Identity Management
 
Introduction to eudat and its services
Introduction to eudat and its servicesIntroduction to eudat and its services
Introduction to eudat and its services
 
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
FAIRness Assessment of the Library of Integrated Network-based Cellular Signa...
 
Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...
  Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...  Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...
Building a Blockchain-based Reputation Infrastructure for Open Research. Ca...
 
Building a Blockchain-based Reputation Infrastructure for Open Research. Case...
Building a Blockchain-based Reputation Infrastructure for Open Research. Case...Building a Blockchain-based Reputation Infrastructure for Open Research. Case...
Building a Blockchain-based Reputation Infrastructure for Open Research. Case...
 
Cytoscape ci chapter 1
Cytoscape ci chapter 1Cytoscape ci chapter 1
Cytoscape ci chapter 1
 

More from ETH-Bibliothek

17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...
17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...
17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...ETH-Bibliothek
 
10 YearsDOI Desk at ETH Zurich
10 YearsDOI Desk at ETH Zurich10 YearsDOI Desk at ETH Zurich
10 YearsDOI Desk at ETH ZurichETH-Bibliothek
 
OriginStamp: Trusted Time Stamping via the Bitcoin Blockchain
OriginStamp: Trusted Time Stamping via the Bitcoin BlockchainOriginStamp: Trusted Time Stamping via the Bitcoin Blockchain
OriginStamp: Trusted Time Stamping via the Bitcoin BlockchainETH-Bibliothek
 
Tracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsTracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsETH-Bibliothek
 
Building Open Research Infrastructure with PIDs
Building Open Research Infrastructure with PIDsBuilding Open Research Infrastructure with PIDs
Building Open Research Infrastructure with PIDsETH-Bibliothek
 
DataCite and its Members: Connecting Research and Identifying Knowledge
DataCite and its Members: Connecting Research and Identifying KnowledgeDataCite and its Members: Connecting Research and Identifying Knowledge
DataCite and its Members: Connecting Research and Identifying KnowledgeETH-Bibliothek
 
Bilder online recherchieren – Tipps und Tricks
Bilder online recherchieren – Tipps und TricksBilder online recherchieren – Tipps und Tricks
Bilder online recherchieren – Tipps und TricksETH-Bibliothek
 
Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...
Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...
Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...ETH-Bibliothek
 
Herausforderungen im Datenmanagement von Metadaten
Herausforderungen im Datenmanagement von MetadatenHerausforderungen im Datenmanagement von Metadaten
Herausforderungen im Datenmanagement von MetadatenETH-Bibliothek
 
Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...
Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...
Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...ETH-Bibliothek
 
Data Management in Research –WhyandHow?
Data Management in Research –WhyandHow?Data Management in Research –WhyandHow?
Data Management in Research –WhyandHow?ETH-Bibliothek
 
Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....
Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....
Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....ETH-Bibliothek
 
CitizenScience - Freiwillige lokalisieren Bilder im virtuellen Globus
CitizenScience - Freiwillige lokalisieren Bilder im virtuellen GlobusCitizenScience - Freiwillige lokalisieren Bilder im virtuellen Globus
CitizenScience - Freiwillige lokalisieren Bilder im virtuellen GlobusETH-Bibliothek
 
FORUM - Das Bottom-up Gremium der ETH-Bibliothek
FORUM - Das Bottom-up Gremium der ETH-BibliothekFORUM - Das Bottom-up Gremium der ETH-Bibliothek
FORUM - Das Bottom-up Gremium der ETH-BibliothekETH-Bibliothek
 
Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...
Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...
Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...ETH-Bibliothek
 
„Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek
„Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek „Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek
„Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek ETH-Bibliothek
 
Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...
Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...
Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...ETH-Bibliothek
 
Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...
Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...
Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...ETH-Bibliothek
 
The ETH Zurich DOI Desk
The ETH Zurich DOI Desk The ETH Zurich DOI Desk
The ETH Zurich DOI Desk ETH-Bibliothek
 

More from ETH-Bibliothek (20)

17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...
17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...
17:15 Kolloquium – Donnerstag, 27. Februar 2020 – Das Büro darf nicht nur Mit...
 
ETH Zurich's DOI Desk
ETH Zurich's DOI DeskETH Zurich's DOI Desk
ETH Zurich's DOI Desk
 
10 YearsDOI Desk at ETH Zurich
10 YearsDOI Desk at ETH Zurich10 YearsDOI Desk at ETH Zurich
10 YearsDOI Desk at ETH Zurich
 
OriginStamp: Trusted Time Stamping via the Bitcoin Blockchain
OriginStamp: Trusted Time Stamping via the Bitcoin BlockchainOriginStamp: Trusted Time Stamping via the Bitcoin Blockchain
OriginStamp: Trusted Time Stamping via the Bitcoin Blockchain
 
Tracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDsTracking Citations to Research Software via PIDs
Tracking Citations to Research Software via PIDs
 
Building Open Research Infrastructure with PIDs
Building Open Research Infrastructure with PIDsBuilding Open Research Infrastructure with PIDs
Building Open Research Infrastructure with PIDs
 
DataCite and its Members: Connecting Research and Identifying Knowledge
DataCite and its Members: Connecting Research and Identifying KnowledgeDataCite and its Members: Connecting Research and Identifying Knowledge
DataCite and its Members: Connecting Research and Identifying Knowledge
 
Bilder online recherchieren – Tipps und Tricks
Bilder online recherchieren – Tipps und TricksBilder online recherchieren – Tipps und Tricks
Bilder online recherchieren – Tipps und Tricks
 
Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...
Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...
Transkribus. Eine Forschungsplattform für die automatisierte Digitalisierung,...
 
Herausforderungen im Datenmanagement von Metadaten
Herausforderungen im Datenmanagement von MetadatenHerausforderungen im Datenmanagement von Metadaten
Herausforderungen im Datenmanagement von Metadaten
 
Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...
Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...
Gamification und Game Design: Theorie und Praxis jenseits der Heilsversprechu...
 
Data Management in Research –WhyandHow?
Data Management in Research –WhyandHow?Data Management in Research –WhyandHow?
Data Management in Research –WhyandHow?
 
Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....
Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....
Openness, exchange, FAIR DATA – oh brave new world that has such vision! (Dr....
 
CitizenScience - Freiwillige lokalisieren Bilder im virtuellen Globus
CitizenScience - Freiwillige lokalisieren Bilder im virtuellen GlobusCitizenScience - Freiwillige lokalisieren Bilder im virtuellen Globus
CitizenScience - Freiwillige lokalisieren Bilder im virtuellen Globus
 
FORUM - Das Bottom-up Gremium der ETH-Bibliothek
FORUM - Das Bottom-up Gremium der ETH-BibliothekFORUM - Das Bottom-up Gremium der ETH-Bibliothek
FORUM - Das Bottom-up Gremium der ETH-Bibliothek
 
Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...
Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...
Digitaler Zugang zu Lesespuren - Das Projekt „Thomas Mann Nachlassbibliothek“...
 
„Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek
„Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek „Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek
„Ex meis libris“ - Die Provenienzdatenbank der ETH-Bibliothek
 
Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...
Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...
Wenn Algorithmen Zeitschriften lesen - Vom Mehrwert automatisierter Textanrei...
 
Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...
Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...
Die Research Collection der ETH Zürich - Ein Repositorium für Publikationen u...
 
The ETH Zurich DOI Desk
The ETH Zurich DOI Desk The ETH Zurich DOI Desk
The ETH Zurich DOI Desk
 

Recently uploaded

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 

Recently uploaded (20)

SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 

Persistent Identifiers for Scientific Data at CSCS

  • 1. Persistent Identifiers for Scientific Data at CSCS Persistent Identifiers in Research – ETH Zürich event Mario Valle, CSCS September 13, 2019
  • 2. Founded in 1991, CSCS, the Swiss National Supercomputing Centre, develops and provides the key supercomputing capabilities required to solve important problems to science and/or society. … Persistent Identifiers for Scientific Data at CSCS 2 CSCS Mission
  • 3. Founded in 1991, CSCS, the Swiss National Supercomputing Centre, develops and provides the key supercomputing capabilities required to solve important problems to science and/or society. … Persistent Identifiers for Scientific Data at CSCS 3 CSCS Mission Means also: Data Management, Data Analytics, FAIR support…
  • 4.  Data should be unambiguously and certainly identified (by something that depends on data content and not location)  A persistent identifier is a handle for any type of dataset. Identifies data objects regardless of their physical location  A persistent identifier should be permanent and reliably associated to its dataset  A persistent identifier carries more information than a generic UUID because it can associate metadata to the dataset Persistent Identifiers for Scientific Data at CSCS 4 Prerequisite for good data science
  • 5. Base of every handle system (DOI, PID, URN, ARK, PURL, ISBN…) Persistent Identifiers for Scientific Data at CSCS 5 HANDLE DATA HANDLE RESOLVER HANDLE URL URL DOC The Handle System is “a general purpose distributed information system that provides efficient, extensible, and secure identifier and resolution services for use on networks such as the Internet.”
  • 6. The DOI resolving process Persistent Identifiers for Scientific Data at CSCS 6 DOI 10.1038/nature07786 DOI RESOLVER DOI URL URL DOC
  • 7. The PID resolving process Persistent Identifiers for Scientific Data at CSCS 7 PID 21.17101/MY-SUFFIX-123 HANDLE RESOLVER DATA HANDLE URL URL The 21. identifies ePIC PID; 10. identifies DOI. Both handled by DONA foundation (https://www.dona.net/mpas)
  • 8. ePIC consortium for Persistent Identifiers (PID) Persistent Identifiers for Scientific Data at CSCS 8 “The eResearch Persistent Identifier Consortium (ePIC) offers a service to create, manage, and resolve persistent identifiers (PID). The increasing amount of research data, the variety of the usage profiles and the international exchange within different infrastructures demand to uniquely assign the data with a PID with a high degree of flexibility and robustness. ePIC offers a reliable mechanism to guarantee these features of persistent identifiers.” Excerpt from a poster at RDA 3rd Plenary Meeting https://www.pidconsortium.eu/
  • 9. CSCS is part of the ePIC consortium (since Sept. 2018) Persistent Identifiers for Scientific Data at CSCS 9 CSCS will provide services to generate and manage a certain range of PID assigned to Switzerland and to resolve any PID
  • 10.  For reliability and availability CSCS has three handle servers integrated in the ePIC infrastructure  To resolve a PID use the master resolver: https://hdl.handle.net/21.17101/SUFFIX  We manage two prefixes:  21.17101 (persistent)  21.T17999 (testing)  We could provide to research institutions they own prefix:  From 21.17102 ↑ counting up  From 21.T17998 ↓ counting down Persistent Identifiers for Scientific Data at CSCS 10 Infrastructure at CSCS
  • 11.  Through the portal pid.cscs.ch (currently on invitation only)  The PID system could be accessed through its API (docs on: http://www.handle.net/)  Creation and management of PIDs through the portal and through the API Persistent Identifiers for Scientific Data at CSCS 11 User access to CSCS PID infrastructure
  • 12. Persistent Identifiers for Scientific Data at CSCS 12 Next use case: long term storage at CSCS PID Simulation results CSCS Long-term archiving user interface Access simulation results using PID
  • 13. Persistent Identifiers for Scientific Data at CSCS 13 Next use case: long term storage at CSCS CSCS Long-term archiving user interfaceAccess simulation results using PID Data migration
  • 14. Persistent Identifiers for Scientific Data at CSCS 14 Next Use Case: Automatically associate PID to files Result location PID
  • 15.  Sufficiently flexible to cover ample set of scientific data needs  Make use of ontologies to extend the metadata set  Enhance the “Interoperable & Reusable” in FAIR  Strong dependency on science domains  The user could associate a set of metadata to a PID  The user can run queries on metadata to obtain a list of PID Persistent Identifiers for Scientific Data at CSCS 15 Last Use Case: scientific metadata
  • 16. DOI comes with an established set of metadata Your data deserves a PID 16
  • 17.  I’m the point of contact for PID ideas, suggestions and project specific requests  I’m collecting use cases to suggest how this technology could help Swiss scientist’s work  Contact me at: mvalle@cscs.ch Persistent Identifiers for Scientific Data at CSCS 17 Main CSCS goal: find Use Cases, experiment with users
  • 18.  All handle systems (PID, DOI, etc.) are glorified DNS systems  DNS receives www.google.com and returns: 172.217.16.36  DOI receives 10.1038/nature07786 and returns: https://www.nature.com/articles/nature07786  Who enforces that returned URL is valid?  Who enforces file content has not tampered with? Versioning?  Who enforces file has not moved without updating its PID or DOI?  Who protects file (or publication) from being deleted? Persistent Identifiers for Scientific Data at CSCS 18 Where all Handle Systems stops?
  • 19. A human (cultural) problem needs a human solution Data mining: “my data is mine, and your data is mine” Persistent Identifiers for Scientific Data at CSCS 19
  • 20. PID future at CSCS  We want to create awareness and, hopefully, build a Swiss community interested in this aspect of data management  Continue collecting and implementing PID Use Cases  Continue working with other ePIC Consortium members to contribute to PID maturity and to simplify PID usage and metadata management and search  Study ePIC collaborations:  With ORCID Project RIPEN. This project proposes the use of JSON Web tokens (JWTs) for collecting authenticated user permissions and to delegate them from one system to another.  With EU Project FREYA. The project aims to extend the infrastructure for persistent identifiers (PIDs) as a core component of open research, in the EU and globally. Persistent Identifiers for Scientific Data at CSCS 20
  • 21. Thank you for your attention. cscsch
  • 22. PID needs a social infrastructure  PID Infrastructure maintained by a dedicated and reliable team  Provided by a non-profit organization  Governed by international boards  Based on open standards 22Persistent Identifiers for Scientific Data at CSCS
  • 23. A human problem needs a human solution Persistent Identifiers for Scientific Data at CSCS 23 We moved things around We deleted it Wrong or lost content
  • 24. http://www.phdcomics.com/comics/archive.php?comicid=1323 Not to say data management leaves (often) a lot to be desired… Persistent Identifiers for Scientific Data at CSCS 24
  • 25. Persistent Identifiers for Scientific Data at CSCS 25 Next use case: long term storage at CSCS
  • 26. Persistent Identifiers for Scientific Data at CSCS 26 Next use case: long term storage at CSCS
  • 27. Persistent Identifiers for Scientific Data at CSCS 27 Next use case: long term storage at CSCS PID Simulation results CSCS Long-term archiving user interface
  • 28.  DOI for publications, PID for data  DOI has metadata for publications, PID more general  (not sure of this): DOI: managed by International DOI Foundation (IDF), a not-for- profit membership organization that is the governance and management body for the federation of Registration Agencies providing Digital Object Identifier (DOI) services and registration. PID is managed by the ePIC consortium.  Both depend from DONA for the first part of prefix (10. vs. 21.) Persistent Identifiers for Scientific Data at CSCS 28
  • 29. Handle frameworks comparison Persistent Identifiers for Scientific Data at CSCS 29