Slides of a presentation for the CLARIAH community on ideas for making controlled vocabularies sustainable and FAIR (Findable, Accessible, Interoperable, Reusable) with the help of Decentralized Identifiers (DIDs).
Decentralised identifiers and knowledge graphs - vty
Building an Operating System for Open Science: data integration challenges, Dataverse data repository and knowledge graphs. Lecture by Slava Tykhonov, DANS-KNAW, for the Journées Scientifiques de Rochebrune 2023 (JSR'23).
Building collaborative Machine Learning platform for Dataverse network. Lecture by Slava Tykhonov (DANS-KNAW, the Netherlands), DANS seminar series, 29.03.2022
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes... - Dr. Arif Wider
A talk presented by Max Schultze from Zalando and Arif Wider from ThoughtWorks at NDC Oslo 2020.
Abstract:
The Data Lake paradigm is often considered the scalable successor of the more curated Data Warehouse approach when it comes to democratization of data. However, many who went out to build a centralized Data Lake came out with a data swamp of unclear responsibilities, a lack of data ownership, and sub-par data availability.
At Zalando - Europe's biggest online fashion retailer - we realised that accessibility and availability at scale can only be guaranteed when moving more responsibilities to those who pick up the data and have the respective domain knowledge - the data owners - while keeping only data governance and metadata information central. Such a decentralized and domain-focused approach has recently been coined a Data Mesh.
The Data Mesh paradigm promotes the concept of Data Products which go beyond sharing of files and towards guarantees of quality and acknowledgement of data ownership.
This talk will take you on a journey of how we went from a centralized Data Lake to a distributed Data Mesh architecture, and will outline the ongoing efforts to make the creation of data products as simple as applying a template.
The document discusses several Azure network architectures including:
1) An Azure landing zone with firewall/WAF that includes hub-spoke VNets with web, business, and data tiers separated across spokes connected to an on-premises network.
2) An Azure network architecture deployed to a primary region including production and non-production subscriptions, VNets, and resource groups separated by function and connected to an on-premises network via VPN.
3) A hub-spoke network topology with shared services and subnets in a central hub VNet and workloads separated across spoke VNets connected to the hub.
KB Kookmin Card - A Cloud-Based Analytics Platform Innovation Journey - Speakers: Park Chang-yong, Manager, Data Strategy Division, AI Innovation Department, KB Card | Kang Byung-eok, Soluti... - Amazon Web Services Korea
On-premises analytics platforms face limitations in many respects, including the cost of expanding resources, the cost of managing them, and the lead time for introducing new resources and configuring environments. KB Kookmin Card therefore designed and adopted a cloud-based analytics platform that overcomes the limitations of its existing analytics platform while creating synergy with it. This case study presents KB Kookmin Card's data innovation journey and the lessons learned.
ElasticSearch introduction talk. Overview of the API, functionality, and use cases. What can be achieved, and how to scale? What is Kibana, and how can it benefit your business?
Introduction to AWS Glue: Data Analytics Week at the SF Loft - Amazon Web Services
Introduction to AWS Glue: Data Analytics Week at the San Francisco Loft
AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, your data is immediately searchable, queryable, and available for ETL. AWS Glue generates the code to execute your data transformations and data loading processes.
Level: Intermediate
Speakers:
John Mallory - Principal Business Development Manager, Storage, AWS
Asim Kumar Sasmal - Big Data Consultant, AWS Professional Services
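To make the Glue flow described above concrete, here is a minimal sketch using boto3. The crawler name, IAM role ARN, database name, and S3 path are hypothetical placeholders, not values from the talk.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# Point Glue at data stored on S3; the crawler infers the schema and
# stores the table definitions in the AWS Glue Data Catalog.
glue.create_crawler(
    Name="sales-crawler",  # hypothetical
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",  # hypothetical
    DatabaseName="analytics",
    Targets={"S3Targets": [{"Path": "s3://example-bucket/sales/"}]},
)
glue.start_crawler(Name="sales-crawler")

# Once the crawler has finished, the cataloged tables are searchable,
# queryable, and available for ETL.
tables = glue.get_tables(DatabaseName="analytics")
for t in tables["TableList"]:
    print(t["Name"])
```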
The document summarizes a meetup about NoSQL databases hosted by AWS in Sydney in 2012. It includes an agenda with presentations on Introduction to NoSQL and using EMR and DynamoDB. NoSQL is introduced as a class of databases that don't use SQL as the primary query language and are focused on scalability, availability and handling large volumes of data in real-time. Common NoSQL databases mentioned include DynamoDB, BigTable and document databases.
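In the spirit of the DynamoDB topic above, a minimal read/write sketch with boto3; the table name and attributes are invented, and the table is assumed to already exist with `event_id` as its key.

```python
import boto3

dynamodb = boto3.resource("dynamodb", region_name="ap-southeast-2")
table = dynamodb.Table("events")  # hypothetical table

# NoSQL-style access: write and read by key, no SQL involved.
table.put_item(Item={"event_id": "e-001", "source": "meetup", "count": 1})
item = table.get_item(Key={"event_id": "e-001"})["Item"]
print(item["source"])
```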
Logical Data Fabric: Architectural Components - Denodo
Watch full webinar here: https://bit.ly/39MWm7L
Is the Logical Data Fabric one monolithic technology, or does it comprise various components? If so, what are they? In this presentation, Denodo CTO Alberto Pan will elucidate what components make up the logical data fabric.
Building a Logical Data Fabric using Data Virtualization (ASEAN) - Denodo
Watch full webinar here: https://bit.ly/3FF1ubd
In the recent Building the Unified Data Warehouse and Data Lake report by the industry analyst firm TDWI, 64% of organizations stated that the objective of a unified Data Warehouse and Data Lake is to get more business value, and 84% of organizations polled felt that a unified approach to Data Warehouses and Data Lakes was either extremely or moderately important.
In this session, you will learn how your organization can apply a logical data fabric, and how the associated technologies of machine learning, artificial intelligence, and data virtualization can reduce time to value, increasing the overall business value of your data assets.
KEY TAKEAWAYS:
- How a Logical Data Fabric is the right approach to assist organizations to unify their data.
- The advanced features of a Logical Data Fabric that assist with the democratization of data, providing an agile and governed approach to business analytics and data science.
- How a Logical Data Fabric with Data Virtualization enhances your legacy data integration landscape to simplify data access and encourage self-service.
This document discusses data mesh, a distributed data management approach for microservices. It outlines the challenges of implementing microservice architecture including data decoupling, sharing data across domains, and data consistency. It then introduces data mesh as a solution, describing how to build the necessary infrastructure using technologies like Kubernetes and YAML to quickly deploy data pipelines and provision data across services and applications in a distributed manner. The document provides examples of how data mesh can be used to improve legacy system integration, batch processing efficiency, multi-source data aggregation, and cross-cloud/environment integration.
The Business Case for Semantic Web Ontology & Knowledge Graph - Cambridge Semantics
This document discusses how semantic web ontologies and knowledge graphs can help reduce high IT costs by providing a common schema and linking data across systems. It introduces AnzoGraph DB, a graph database built on semantic web standards that can perform both analytics and graph algorithms on large datasets. The document demonstrates how public flight delay data can be converted to a knowledge graph and analyzed using techniques like PageRank, shortest paths, and querying for delayed flights. Overall, it argues that semantic technologies can help address the problem of data integration costs by enabling linked and standardized data.
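Tool specifics aside, the convert-and-query idea can be sketched generically with rdflib (this is not AnzoGraph's own API); the namespace and flight record below are invented for illustration.

```python
from rdflib import Graph, Literal, Namespace, RDF

# Generic illustration: turn a flight-delay record into RDF triples.
EX = Namespace("http://example.org/flights#")  # hypothetical namespace
g = Graph()
g.add((EX.f123, RDF.type, EX.Flight))
g.add((EX.f123, EX.origin, Literal("JFK")))
g.add((EX.f123, EX.delayMinutes, Literal(42)))

# Query for delayed flights, analogous to the queries in the talk.
q = """
PREFIX ex: <http://example.org/flights#>
SELECT ?flight ?delay WHERE {
  ?flight a ex:Flight ; ex:delayMinutes ?delay .
  FILTER(?delay > 15)
}
"""
for row in g.query(q):
    print(row.flight, row.delay)
```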
Data is our Product: Thoughts on LOD Sustainability - Robert Sanderson
The document discusses the sustainability of cultural heritage linked open data products. It defines sustainability as the state in which running costs are less than value plus shutdown costs. Running costs include technology, content, and staffing. Value includes income, benefits to mission, and intangible benefits. Building sustainability requires maximizing usage, usability, trust, and loyalty among users. Usability, trust, and loyalty develop through community engagement and ensuring the data meets user needs. Sustainability ultimately depends on having champions who build, support, and use the product.
Intuit's Data Mesh - Data Mesh Learning Community meetup 5.13.2021 - Tristan Baker
Past, present and future of data mesh at Intuit. This deck describes a vision and strategy for improving data worker productivity through a Data Mesh approach to organizing data and holding data producers accountable. Delivered at the inaugural Data Mesh Learning meetup on 5/13/2021.
Databricks: A Tool That Empowers You To Do More With Data - Databricks
In this talk we will present how Databricks has enabled the author to achieve more with data: one person can build a coherent data project with data engineering, analysis, and science components, with better collaboration, better productionalization methods, larger datasets, and faster turnaround.
The talk will include a demo that will illustrate how the multiple functionalities of Databricks help to build a coherent data project with Databricks jobs, Delta Lake and auto-loader for data engineering, SQL Analytics for Data Analysis, Spark ML and MLFlow for data science, and Projects for collaboration.
This is Part 4 of the GoldenGate series on Data Mesh - a series of webinars helping customers understand how to move off of old-fashioned monolithic data integration architecture and get ready for more agile, cost-effective, event-driven solutions. The Data Mesh is a kind of Data Fabric that emphasizes business-led data products running on event-driven streaming architectures, serverless, and microservices based platforms. These emerging solutions are essential for enterprises that run data-driven services on multi-cloud, multi-vendor ecosystems.
Join this session to get a fresh look at Data Mesh; we'll start with core architecture principles (vendor agnostic) and transition into detailed examples of how Oracle's GoldenGate platform is providing capabilities today. We will discuss essential technical characteristics of a Data Mesh solution, and the benefits that business owners can expect by moving IT in this direction. For more background on Data Mesh, Part 1, 2, and 3 are on the GoldenGate YouTube channel: https://www.youtube.com/playlist?list=PLbqmhpwYrlZJ-583p3KQGDAd6038i1ywe
Webinar Speaker: Jeff Pollock, VP Product (https://www.linkedin.com/in/jtpollock/)
Mr. Pollock is an expert technology leader for data platforms, big data, data integration and governance. Jeff has been CTO at California startups and a senior exec at Fortune 100 tech vendors. He is currently Oracle VP of Products and Cloud Services for Data Replication, Streaming Data and Database Migrations. While at IBM, he was head of all Information Integration, Replication and Governance products, and previously Jeff was an independent architect for US Defense Department, VP of Technology at Cerebra and CTO of Modulant – he has been engineering artificial intelligence based data platforms since 2001. As a business consultant, Mr. Pollock was a Head Architect at Ernst & Young’s Center for Technology Enablement. Jeff is also the author of “Semantic Web for Dummies” and "Adaptive Information,” a frequent keynote at industry conferences, author for books and industry journals, formerly a contributing member of W3C and OASIS, and an engineering instructor with UC Berkeley’s Extension for object-oriented systems, software development process and enterprise architecture.
Azure DataBricks for Data Engineering by Eugene Polonichko - Dimko Zhluktenko
This document provides an overview of Azure Databricks, an Apache Spark-based analytics platform optimized for Microsoft Azure cloud services. It discusses key components of Azure Databricks including clusters, workspaces, notebooks, visualizations, jobs, alerts, and the Databricks File System. It also outlines how data engineers can leverage Azure Databricks for scenarios like running ETL pipelines, streaming analytics, and connecting business intelligence tools to query data.
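As a rough sketch of the ETL scenario mentioned above, a few lines of PySpark of the kind one might run in a Databricks notebook (where a `spark` session is normally provided); the paths and column names are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw JSON events (hypothetical mount path).
df = spark.read.json("/mnt/raw/events/")

# Transform: drop incomplete rows and derive a partition column.
clean = (df.filter(F.col("user_id").isNotNull())
           .withColumn("day", F.to_date("timestamp")))

# Load: write curated output as Parquet.
clean.write.mode("overwrite").parquet("/mnt/curated/events/")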
Azure Data Factory is a cloud-based data integration service that orchestrates and automates the movement and transformation of data. In this session we will learn how to create data integration solutions using the Data Factory service and ingest data from various data stores, transform/process the data, and publish the result data to the data stores.
An introduction to self-service data with Dremio. Dremio reimagines analytics for modern data. Created by veterans of open source and big data technologies, Dremio is a fundamentally new approach that dramatically simplifies and accelerates time to insight. Dremio empowers business users to curate precisely the data they need, from any data source, then accelerate analytical processing for BI tools, machine learning, data science, and SQL clients. Dremio starts to deliver value in minutes, and learns from your data and queries, making your data engineers, analysts, and data scientists more productive.
Data Catalog as the Platform for Data Intelligence - Alation
Data catalogs are in wide use today across hundreds of enterprises as a means to help data scientists and business analysts find and collaboratively analyze data. Over the past several years, customers have increasingly used data catalogs in applications beyond their search & discovery roots, addressing new use cases such as data governance, cloud data migration, and digital transformation. In this session, the founder and CEO of Alation will discuss the evolution of the data catalog, the many ways in which data catalogs are being used today, the importance of machine learning in data catalogs, and discuss the future of the data catalog as a platform for a broad range of data intelligence solutions.
Presentation given at Macquarie University in support of the ARDC 'institutional role in the data commons' project on "Implementing FAIR: Standards in Research Data Management" https://ardc.edu.au/news/data-and-services-discovery-activities-successful-applicants/
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots so you can see exactly how it works.
This document provides an agenda and overview for a workshop on building a data lake on AWS. The agenda includes reviewing data lakes, modernizing data warehouses with Amazon Redshift, data processing with Amazon EMR, and event-driven processing with AWS Lambda. It discusses how data lakes extend traditional data warehousing approaches and how services like Redshift, EMR, and Lambda can be used for analytics in a data lake on AWS.
Learn to Use Databricks for Data Science - Databricks
Data scientists face numerous challenges throughout the data science workflow that hinder productivity. As organizations continue to become more data-driven, a collaborative environment is more critical than ever — one that provides easier access and visibility into the data, reports and dashboards built against the data, reproducibility, and insights uncovered within the data. Join us to hear how Databricks’ open and collaborative platform simplifies data science by enabling you to run all types of analytics workloads, from data preparation to exploratory analysis and predictive analytics, at scale — all on one unified platform.
Building the Data Lake with Azure Data Factory and Data Lake Analytics - Khalid Salama
In essence, a data lake is a commodity distributed file system that acts as a repository to hold raw data file extracts of all the enterprise source systems, so that it can serve the data management and analytics needs of the business. A data lake system provides means to ingest data, perform scalable big data processing, and serve information, in addition to managing, monitoring, and securing the IT environment. In these slides, we discuss building data lakes using Azure Data Factory and Data Lake Analytics. We delve into the architecture of the data lake and explore its various components. We also describe the various data ingestion scenarios and considerations. We introduce the Azure Data Lake Store, then we discuss how to build an Azure Data Factory pipeline to ingest data into the data lake. After that, we move into big data processing using Data Lake Analytics, and we delve into U-SQL.
Elasticsearch Tutorial | Getting Started with Elasticsearch | ELK Stack Train... - Edureka!
( ELK Stack Training - https://www.edureka.co/elk-stack-trai... )
This Edureka Elasticsearch Tutorial will help you understand the fundamentals of Elasticsearch along with its practical usage and build a strong foundation in the ELK Stack. The video covers the following topics (a minimal Query DSL sketch follows the list):
1. What Is Elasticsearch?
2. Why Elasticsearch?
3. Elasticsearch Advantages
4. Elasticsearch Installation
5. API Conventions
6. Elasticsearch Query DSL
7. Mapping
8. Analysis
9. Modules
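Touching on topics 5 and 6 above, here is a minimal Query DSL request against the REST API of a local Elasticsearch node, sent with the requests library; the index name and field are hypothetical.

```python
import requests

# A match query against a hypothetical "articles" index on localhost.
resp = requests.post(
    "http://localhost:9200/articles/_search",
    json={"query": {"match": {"title": "elasticsearch"}}},
)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_id"], hit["_source"].get("title"))
```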
Decentralized Identifiers (DIDs) are self-sovereign identifiers for individuals, organizations, and things: persistent, dereferenceable, decentralized, and cryptographically verifiable identifiers registered in a blockchain or other decentralized network. Each DID method must have a specification and a resolver implementation. A DID resolves to a DID document containing public keys, service endpoints, and other metadata. DIDs enable verifiable credentials and authentication through challenge-response protocols using the DID document. Standards groups are working on further developing DIDs, verifiable credentials, and rebooting the web of trust through decentralized identity.
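For orientation, the shape of a DID document as described above can be sketched as a plain Python dict; the did:example method and key material follow the illustrative examples in the W3C DID specification and are not a real registered identifier.

```python
# A schematic DID document: public keys, authentication references,
# and service endpoints, keyed by the DID itself.
did_document = {
    "@context": "https://www.w3.org/ns/did/v1",
    "id": "did:example:123456789abcdefghi",
    "verificationMethod": [{
        "id": "did:example:123456789abcdefghi#keys-1",
        "type": "Ed25519VerificationKey2018",
        "controller": "did:example:123456789abcdefghi",
        "publicKeyBase58": "H3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV",
    }],
    "authentication": ["did:example:123456789abcdefghi#keys-1"],
    "service": [{
        "id": "did:example:123456789abcdefghi#vcs",
        "type": "VerifiableCredentialService",  # illustrative service type
        "serviceEndpoint": "https://example.com/vc/",
    }],
}
print(did_document["id"])
```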
Blockchain R&D to Decentralized Identity Deployment - Anil John
DHS S&T SVIP Presentation on the DHS work using W3C Verifiable Credentials and W3C Decentralized Identifiers at the EU/EC NGI's ESSIF-Lab Final Event in Brussels on 12 Dec, 2022.
The document provides an overview of decentralized identifiers (DIDs) and the DID universal resolver. It discusses how DIDs enable self-sovereign identity and describes their key properties. It then explains how the universal resolver allows looking up DID documents across different DID methods through configurable drivers. Finally, it outlines several potential applications of DIDs like verifiable credentials, authentication, and decentralized key management.
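A hedged sketch of what such a universal-resolver lookup can look like over HTTP, assuming the public development instance at dev.uniresolver.io and an example did:sov identifier of the kind used in its documentation; the response shape is an assumption based on the resolver's usual output.

```python
import requests

# Resolve a DID through a universal resolver instance; method-specific
# drivers produce the DID document plus resolution metadata.
DID = "did:sov:WRfXPg8dantKVubE3HX8pw"  # example identifier
resp = requests.get(f"https://dev.uniresolver.io/1.0/identifiers/{DID}")
result = resp.json()

# Assumed response layout: a "didDocument" key holding the document.
print(result.get("didDocument", result).get("id"))
```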
Part 1: Introduction to Self-Sovereign Identity (SSI), Verifiable Credentials, and standards defined by the Decentralized Identity Foundation and W3C.
Part 2: How to use it with Corda to develop scalable, decentralised applications that use smart contracts and SSI to orchestrate complex, multi-party processes.
DevDay: Extending CorDapps with Self-Sovereign Identity: Technology Deepdive ... - R3
The document discusses self-sovereign identity and decentralized identifiers (DIDs). It introduces emerging standards for decentralized identity being developed by the Decentralized Identity Foundation. Key concepts discussed include DIDs, DID documents, verifiable credentials, and implementations like uPort, Ethereum, IPFS, Blockstack, and Sovrin/Indy. The document provides examples of how DIDs can represent identities and how verifiable credentials issued to DID subjects can be verified.
OSCON 2018 Getting Started with Hyperledger Indy - Tracy Kuhrt
Presented at OSCON 2018. Hyperledger Indy is a distributed ledger built for decentralized identity and is one of the open source frameworks hosted by Hyperledger. It provides tools, libraries, and reusable components for creating and using independent digital identities rooted on blockchains or other distributed ledgers. In this presentation, I introduce The Linux Foundation and Hyperledger. We look at Decentralized Identity Concepts -- identity models, decentralized identity, zero-knowledge proofs, and verifiable credentials. We look at a demo that utilizes Hyperledger Indy and these concepts. We then look at Hyperledger Indy's software stack and roadmap and touch on how you can get involved.
The Web of Linked Open Data, or LOD, is the most relevant achievement of the Semantic Web. Initially proposed by Tim Berners-Lee in a seminal paper published in Scientific American in 2001, the Semantic Web envisions a web where software agents can interact with large volumes of structured, easy-to-process data. Users now have at their disposal the first mature results of this vision. Among them, and probably the most significant, are the different LOD initiatives and projects that publish open data in standard formats like RDF.
This presentation provides an overview and comparison of different LOD initiatives in the area of patent information, and analyses potential opportunities for building new information services based on widely available datasets of patent information. Information is based on different interviews conducted with innovation agents and on the analysis of professional bibliography and current implementations.
LOD opportunities are not restricted to information aggregators; they also extend to end users and innovation agents that need to deal with large amounts of data. In both cases, the opportunities offered by LOD need to be assessed, as LOD has become a standard, universal method to distribute, share, and access data.
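To ground the discussion, a minimal sketch of consuming published LOD with rdflib: dereferencing a resource URI (RDF obtained via content negotiation) and listing a few of its triples. The DBpedia resource chosen is an illustrative example, not one from the presentation.

```python
from rdflib import Graph

# Dereference a Linked Open Data resource; rdflib negotiates for RDF
# and parses whatever serialization the server returns.
g = Graph()
g.parse("http://dbpedia.org/resource/Patent")

# Print a few of the triples describing the resource.
for s, p, o in list(g)[:5]:
    print(s, p, o)
```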
The digital object identifier (DOI) system provides a persistent unique identifier for digital objects. A DOI name consists of a prefix assigned to a registrant and a suffix chosen by the registrant. The DOI remains permanently linked to the object even if its location changes. Resolving a DOI provides current metadata and links to access the object.
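As a rough illustration of that resolution step, the sketch below dereferences a DOI through the doi.org proxy and then requests machine-readable metadata via content negotiation; the DOI shown is a common documentation example, not one from this text.

```python
import requests

doi = "10.1038/nphys1170"  # example DOI

# Resolution: the doi.org proxy redirects to the object's current location,
# even if that location has changed since registration.
resp = requests.get(f"https://doi.org/{doi}", allow_redirects=True)
print(resp.url)

# Content negotiation: ask the resolver for citation metadata, not HTML.
meta = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},
)
print(meta.json()["title"])
```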
This document provides an overview of authentication and authorization with federated identity services. It defines key concepts like authentication vs authorization, federated identity, assertions, OpenID, OAuth, Active Directory Federation Services, OpenID Connect, Security Assertion Markup Language, JSON Web Tokens, and FIDO U2F. It also discusses user experience wins, threat modeling considerations, example attacks to consider, and questions from the audience.
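Of the building blocks listed above, JSON Web Tokens are the easiest to show in a few lines. A minimal sketch with the PyJWT library, assuming a shared HMAC secret; the secret and claims are illustrative.

```python
import jwt  # PyJWT

secret = "demo-secret"  # illustrative; use a real key in practice

# Issue a token carrying a subject and audience claim.
token = jwt.encode({"sub": "alice", "aud": "example-app"}, secret,
                   algorithm="HS256")

# Verify the signature and the expected audience on the receiving side.
claims = jwt.decode(token, secret, algorithms=["HS256"],
                    audience="example-app")
print(claims["sub"])
```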
Decentralized Identifiers (DIDs): The Fundamental Building Block of Self-Sove... - SSIMeetup
In our second webinar, "Decentralized Identifiers (DIDs) - Building Block of Self-Sovereign Identity (SSI)", Drummond Reed, Chief Trust Officer at Evernym, gives us the background on how DIDs work, where they come from, and why they are important for blockchain-based digital identity.
The document discusses technical issues and opportunities for improving the Global Biodiversity Information Facility's (GBIF) registry and portals for discovering biodiversity resources. It analyzes GBIF's past use of the UDDI registry and data portal, and outlines challenges in developing a new graph-based registry model to better represent the network of institutions, collections, and relationships. The new registry aims to improve discoverability by associating automated and human-generated metadata, uniquely identifying resources, and defining services and vocabularies.
This is a talk I was asked to give at the What is Universe? event at the University of Oregon (on their Portland campus). I cover the history of the Internet Identity Workshop and talk about its core nature as a torus/bowl, a feminine form, and how this has resulted in the innovation of Self-Sovereign Identity.
Decentralized identity aims to give users control over their digital identities and data. However, decentralized identity systems also introduce new attack surfaces. Attackers could abuse protocols to access sensitive user data or present fake credentials. Successful attacks could undermine user trust and adoption of decentralized identity. Ongoing research and adoption of security best practices are needed to strengthen decentralized identity systems against current and future threats.
Introduction to Self-Sovereign Identity - Karyl Fowler
Juan Caballero from Spherity and Karyl Fowler from Transmute co-presented the Introduction to Self-Sovereign Identity (SSI) session at the 30th Internet Identity Workshop (IIW) in April 2020, demonstrating to newcomers the difference between the values associated with the "SSI movement" and the "collection of technologies" that power applications embodying some of those values.
Returning to Online Privacy - W3C/ANU Future of the Web Roadshow 20190221 - David Wood
This document discusses decentralized identifiers (DIDs) and verifiable credentials. It begins by explaining problems with current online identifiers, such as being controlled by centralized entities and not belonging to individuals. It then introduces DIDs as a new type of globally unique identifier that is owned by individuals, stored on a decentralized ledger, and cryptographically verified. DIDs resolve to DID documents containing public keys and authentication mechanisms. The document discusses the W3C efforts to standardize DIDs and verifiable credentials that can be issued and verified using DIDs. It provides examples of DID syntax and components of DID documents.
A system for distributed minting and management of persistent identifiers - Lukasz Bolikowski
The document proposes a system called Peer-Minted Persistent Identifiers (PMPIs) to decentralize the minting and management of persistent identifiers. PMPIs would allow anyone to mint and manage their own IDs by storing the full database of IDs and revisions across many copies in a distributed network, similar to Bitcoin and currencies backed by central banks. The system aims to provide long-term persistence of IDs even if managing organizations cease to exist, through properties like integrity verification of the ID database and no single party being able to shut down the system. Next steps include finding stakeholders, securing funding, and implementing a prototype to evaluate design decisions like proof-of-work calibration and authorized key roles.
Similar to Decentralised identifiers for CLARIAH infrastructure
Dataverse repository for research data in the COVID-19 Museum - vty
The COVID-19 Museum has an ambition to create a platform to deposit, consult, aggregate, and study heterogeneous data about the pandemic using features of a distributed web service. To achieve this purpose, Dataverse has been selected as a reliable FAIR data repository with a built-in search engine and functionality that allows adding computing resources to explore archived resources, both data and metadata. Presentation by Slava Tykhonov, DANS-KNAW (The Royal Netherlands Academy of Arts and Sciences). Université Paris Cité, 19 April 2022.
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN... - vty
Presentation at ISKO Knowledge Organisation Research Observatory. RESEARCH REPOSITORIES AND DATAVERSE: NEGOTIATING METADATA, VOCABULARIES AND DOMAIN NEEDS
The presentation for the W3C Semantic Web in Health Care and Life Sciences community group by Slava Tykhonov, DANS-KNAW, the Royal Netherlands Academy of Arts and Sciences (October 2020). The recording is available https://www.youtube.com/watch?v=G9oiyNM_RHc
CLARIN CMDI use case and flexible metadata schemes - vty
Presentation for the CLARIAH IG Linked Open Data on the latest developments for the Dataverse FAIR data repository. Building a SEMAF workflow with external controlled vocabularies support and a Semantic API. Using TRIZ, the theory of inventive problem solving, for further innovation in Linked Data.
Flexible metadata schemes for research data repositories - CLARIN Conference'21 - vty
The development of the Common Framework in Dataverse and the CMDI use case. Building an AI/ML-based workflow for predicting and linking concepts from external controlled vocabularies to CMDI metadata values.
Controlled vocabularies and ontologies in Dataverse data repository - vty
This document discusses supporting external controlled vocabularies in Dataverse. It proposes implementing a JavaScript interface to allow linking metadata fields to terms from external vocabularies accessed via SKOSMOS APIs. Several challenges are identified, such as applying support to any field, backward compatibility, and ensuring vocabularies come from authoritative sources. Caching concepts and linking dataset files directly to terms are also proposed to improve interoperability.
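A minimal sketch of the kind of SKOSMOS API lookup described above, using the public Finto.fi instance purely for illustration; a Dataverse installation would point at its own authoritative vocabulary service.

```python
import requests

# Search a SKOSMOS instance for concepts matching a label; each result
# carries a dereferenceable URI that a metadata field could link to.
resp = requests.get(
    "https://finto.fi/rest/v1/search",
    params={"query": "archaeology", "lang": "en"},
)
for concept in resp.json().get("results", []):
    print(concept["uri"], concept.get("prefLabel"))
```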
Automated CI/CD testing, installation and deployment of Dataverse infrastruct... - vty
This document summarizes a presentation about automating CI/CD testing, installation, and deployment of Dataverse in the European Open Science Cloud. It discusses using Docker and Kubernetes for deployment, a community-driven QA plan using pyDataverse for test automation, and providing quality assurance as a service. The presentation also covers topics like the CESSDA maturity model, integrating Dataverse on Google Cloud, and using serverless computing for some Dataverse applications and services.
Building COVID-19 Museum as Open Science Project - vty
This document discusses building a COVID-19 Museum as an open science project. It describes the speaker's background working on various data management projects. It discusses moving towards open science and sharing data according to FAIR principles. It outlines the Time Machine project for digitizing historical documents and its approach to data management. The rest of the document discusses using the Dataverse platform to build repositories, linking metadata to ontologies, using tools like Weblate for translations, and exploring the use of artificial intelligence and machine learning to enhance metadata and facilitate human-in-the-loop review processes.
External controlled vocabularies support in Dataverse - vty
This presentation discusses adding support for external controlled vocabularies to the Dataverse data repository platform. It describes how ontologies like SKOS can be used to represent vocabularies and allow linking metadata fields in Dataverse to terms. The presentation proposes developing a Semantic Gateway plugin for Dataverse that would allow browsing and linking to external vocabularies hosted in the SKOSMOS framework via its API. This could improve metadata by allowing standardized, linked terms and help make data more FAIR.
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse - vty
This presentation is about external CVs support in Dataverse, an open-source data repository. Data Archiving and Networked Services (DANS-KNAW) decided to use Dataverse as a basic technology to build Data Stations and provide FAIR data services for various Dutch research communities.
This document discusses the five-year evolution of Dataverse, an open source data repository platform. It began as a tool for collaborative data curation and sharing within research teams. Over time, features were added like dataset version control, APIs, and integration with other systems. The document outlines challenges around maintenance and sustainability. It also covers efforts to improve Dataverse's interoperability, such as integrating metadata standards and controlled vocabularies, and making datasets FAIR compliant. The goal is to establish Dataverse as a core component of the European Open Science Cloud by improving areas like software quality, integration with tools, and standardization.
Ontologies, controlled vocabularies and Dataverse - vty
Presentation on Semantic Web technologies for the Dataverse Metadata Working Group run by the Institute for Quantitative Social Science (IQSS) of Harvard University.
Dataverse can be deployed using Docker containers to improve maintainability and portability. The document discusses how Docker can isolate applications and their dependencies into portable containers. It provides an example of deploying Dataverse as a set of microservices within Docker containers. Instructions are included on building Docker images, running containers, and managing the containers and images through commands and tools like Docker Desktop, Docker Hub, and Docker Compose.
Technical integration of data repositories: status and challenges - vty
This document discusses technical integration of data repositories, including:
- Previous integration initiatives focused on metadata integration using the OAI-PMH and ResourceSync protocols, as well as aggregators like OpenAIRE (a minimal harvesting sketch follows this list).
- Challenges to integration include different levels of software/service maturity, maintenance of distributed applications, and use of common standards and vocabularies.
- Potential integration efforts could focus on improving FAIRness, metadata/data flexibility, and connections between repositories, software, and computing resources to better enable reuse of EOSC data and services.
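To make the OAI-PMH item above concrete, here is a minimal harvesting sketch over plain HTTP; the base URL is a hypothetical repository endpoint, while the verb, metadata prefix, and XML namespace follow the OAI-PMH 2.0 protocol.

```python
import requests
import xml.etree.ElementTree as ET

# Ask a repository for its records in Dublin Core via OAI-PMH.
BASE = "https://repository.example.org/oai"  # hypothetical endpoint
resp = requests.get(BASE, params={"verb": "ListRecords",
                                  "metadataPrefix": "oai_dc"})

# List the record identifiers from the response headers.
root = ET.fromstring(resp.content)
for ident in root.iter("{http://www.openarchives.org/OAI/2.0/}identifier"):
    print(ident.text)
```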
SSHOC Dataverse in the European Open Science Cloud - vty
This project summary covers the SSHOC project, which aims to create a social sciences and humanities section of the European Open Science Cloud by maximizing data reuse through open science principles. The project will interconnect existing and new infrastructures through a clustered cloud, establish governance for SSH-EOSC, and provide a research data repository service for SSH institutions through further developing the Dataverse platform on EOSC. The project involves 47 partners across 20 beneficiaries and 27 linked third parties with a budget of €14,455,594.08 over 40 months to achieve these objectives.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many features provide convenience and capability at the expense of security. This best-practices guide outlines steps users can take to better protect personal devices and information.
OpenID AuthZEN Interop Read Out - Authorization - David Brossard
During Identiverse 2024 and EIC 2024, members of the OpenID AuthZEN WG got together and demoed their authorization endpoints conforming to the AuthZEN API
How to Get CNIC Information System with Paksim Ga.pptx - danishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
GraphRAG for Life Science to increase LLM accuracy - Tomaz Bratanic
GraphRAG for the life science domain, where you retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers.
UiPath Test Automation using UiPath Test Suite series, part 6 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf - Malak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications. A minimal query sketch follows below.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
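A hedged sketch of an Atlas Vector Search aggregation of the kind the deck covers; the connection string, index name, field, and query vector are placeholders, and the vector index is assumed to have been created in Atlas beforehand.

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
coll = client["shop"]["products"]  # hypothetical database/collection

# $vectorSearch finds the nearest neighbors of the query vector; in
# practice queryVector would come from an embedding model.
pipeline = [{
    "$vectorSearch": {
        "index": "product_vectors",   # hypothetical Atlas index
        "path": "embedding",          # field holding stored vectors
        "queryVector": [0.12, -0.07, 0.33],
        "numCandidates": 100,
        "limit": 5,
    }
}]
for doc in coll.aggregate(pipeline):
    print(doc.get("name"))
```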
Monitoring and Managing Anomaly Detection on OpenShift.pdf - Tosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions (a minimal consumer sketch follows this list).
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
Van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Driving Business Innovation: Latest Generative AI Advancements & Success Story, by Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence, by IndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
TrustArc Webinar - 2024 Global Privacy Survey, by TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Generating privacy-protected synthetic data using Secludy and Milvus, by Zilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers, by akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
2. Using Decentralized identifiers (DIDs) for any type of content
Source: Wikipedia
We are considering an experimental implementation of decentralized identifiers for controlled vocabularies, and an extension to other content types, in order to archive various kinds of content.
DIDs can be assigned to any artefact, including images, audio, and video: for example, to store and link metadata records and provenance information next to the digitized content.
3. DOI costs
The DataCite agency charges data providers a fee that depends on the number of identifiers, and it can become a significant amount starting from 1 million DOIs. What about DIDs?
4. Typical problems of “centralized” identifiers
Disambiguation and authorship issues:
● two authors with the same name are mentioned in different papers: how do you know who is who?
● it is very difficult to assign a paper to a specific person via ORCID without already knowing that they are the original author
● some people can make false (fraudulent) authorship claims
A centralized entity can be considered a single point of failure.
Typical questions:
● can an email address be considered an identifier?
● what to do when an email address changes because the domain name changes, and the identifier disappears or is no longer resolvable?
● how reliable is the ORCID database?
5. “Centralized” controlled vocabularies
The European Language Social Science Thesaurus (ELSST) is hosted in Skosmos by various data providers such as CESSDA and ODISSEI. CESSDA hosts an updated version with more language properties.
How do we handle versions of vocabularies, changes to concepts, and concept drift?
6. Decentralized identifiers as a possible solution
We envision a near future where it will be possible to create a decentralized system that does not depend on any specific registry, provider, or authority: all connections will be established in a peer-to-peer network, yet remain persistent at the same time.
The resolution of a global decentralized identifier (DID) should be cryptographically verifiable, to prove the identity and ownership of that identifier.
Core DID features are listed below:
1. A permanent (persistent) identifier (it never changes)
2. A resolvable identifier (you can look it up to discover metadata)
3. A cryptographically-verifiable identifier (with private and public keys)
4. A decentralized identifier (no centralized authority)
DIDs should bring control of all provenance and metadata back to their owners instead of giving it away. At the same time, the public part need not be very different from other persistent identifiers such as DOIs, and could even replace them for specific use cases such as sharing sensitive data.
7. The place of DIDs as a unified resource
Source: “Self-Sovereign Identity”. by Alex Preukschat, Drummond Reed
DIDs can be considered a “replacement” for domain names and DNS in the “centralized” network.
8. Example of DID with private and public key, and service endpoints
Service endpoints tell you exactly how to interact with the subject: which protocols and network endpoints are available to connect to, for example, an agent that represents the data subject, so that you can then exchange credentials or other messages.
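For illustration, a minimal DID Document can be sketched as follows (a Python sketch with a hypothetical did:example identifier and placeholder key material; the field names follow the W3C DID Core data model):

import json

# A minimal DID Document sketch (hypothetical values; W3C DID Core field names).
did_document = {
    "@context": "https://www.w3.org/ns/did/v1",
    "id": "did:example:123456789abcdefghi",
    # Public key material only; the private key never appears in the document.
    "verificationMethod": [{
        "id": "did:example:123456789abcdefghi#key-1",
        "type": "Ed25519VerificationKey2020",
        "controller": "did:example:123456789abcdefghi",
        "publicKeyMultibase": "z6Mk...",  # placeholder key, elided
    }],
    # Service endpoints describe how to interact with the DID subject.
    "service": [{
        "id": "did:example:123456789abcdefghi#agent",
        "type": "DIDCommMessaging",
        "serviceEndpoint": "https://agent.example.org/messages",
    }],
}

print(json.dumps(did_document, indent=2))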
10. DID URLs with parameters
Source: Decentralized identifiers (DIDs) fundamentals and deep dive, SSIMeetup
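Such parameters can be handled with ordinary URL tooling; a minimal Python sketch (the DID URL itself is hypothetical; versionId and service are parameters listed in the W3C DID Specification Registries):

from urllib.parse import urlsplit, parse_qs

# A hypothetical DID URL with query parameters and a fragment.
did_url = "did:example:123456789abcdefghi?versionId=4&service=agent#key-1"

parts = urlsplit(did_url)
print(parts.scheme)           # did
print(parts.path)             # example:123456789abcdefghi (method + method-specific id)
print(parse_qs(parts.query))  # {'versionId': ['4'], 'service': ['agent']}
print(parts.fragment)         # key-1 (points at a key inside the DID Document)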
11. “Decentralized” technology is not the same as “Blockchain” technology
“Blockchain is a digitally distributed database that is shared among nodes, which are computers in the blockchain network, that makes
it difficult or impossible to change, hack, or cheat the system”.
Blockchain parties:
- Holder (owner of the Verifiable Credential)
- Issuer (provides a credential to a holder and signs it with their private key)
- Verifier (checks the blockchain to ensure that the issued credential belongs to whom it was issued)
It is not necessary to use a blockchain to issue decentralized identifiers: about 100 DID methods are being developed by various companies and organizations around the world. They implement the same interface specification in different ways, with standardized input and output.
The OYDID method was developed in Vienna and provides a self-sustained environment for managing decentralized identifiers (DIDs). The did:oyd method links the identifier cryptographically to the DID Document and, through cryptographically linked provenance information in a public log, ensures resolution to the latest valid version of the DID Document.
12. Universal Resolver for DIDs
Try this! https://dev.uniresolver.io
curl https://dev.uniresolver.io/1.0/identifiers/did:oyd:zQmdQvLdpogfEf5EHK7778EM9xoxFMVFdJgRD7SdYRcCHeL
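The same lookup can be scripted; a minimal Python sketch (it assumes the resolver returns the usual Universal Resolver JSON envelope with a didDocument field):

import requests

# Resolve a DID through the public Universal Resolver instance.
did = "did:oyd:zQmdQvLdpogfEf5EHK7778EM9xoxFMVFdJgRD7SdYRcCHeL"
resp = requests.get(f"https://dev.uniresolver.io/1.0/identifiers/{did}", timeout=30)
resp.raise_for_status()

result = resp.json()
# The envelope field name is an assumption based on Universal Resolver conventions.
did_document = result.get("didDocument", result)
print(did_document.get("id"))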
13. OYDID methods explained
“OYDID (Own Your Decentralized IDentifier) takes the approach to not maintain DID and DID Document on a public ledger
but on one or more local storages (that usually are publicly available). Through cryptographically linking the DID identifier
to the DID Document, and furthermore linking the DID Document to a chained provenance trail, the same security and
validation properties as a traditional DID are maintained while avoiding highly redundant storage and general public access.”
(from OYDID docs)
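The core idea of linking the identifier cryptographically to the document can be sketched as content addressing: hash a canonical form of the DID Document and derive the identifier from the digest. The Python sketch below is illustrative only; the real did:oyd encoding uses multibase/multihash conventions rather than a plain hex digest.

import hashlib
import json

def content_address(did_document: dict) -> str:
    # Canonicalize: sorted keys and fixed separators give a stable byte string.
    canonical = json.dumps(did_document, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"did:oyd:{digest}"

doc = {"@context": "https://www.w3.org/ns/did/v1", "service": []}
print(content_address(doc))  # changing any byte of the document changes the DID

Because the identifier is derived from the document itself, any tampering with a stored DID Document becomes detectable at resolution time.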
14. DIDs for controlled vocabularies
A generic problem of CVs: most controlled vocabularies are published and distributed in an unsustainable way and often do not even have persistent identifiers resolving to their concepts.
Possible solution for CLARIAH FAIR vocabularies:
● assign a DID to every vocabulary concept and use the built-in “update” mechanism to keep all revisions in a chain of linked DIDs, each resolving to the archived version of every change (see the sketch after this list)
● metadata records can be linked in a distributed way to DID identifiers corresponding to a specific version of a concept preserved in the data ledger
● this approach is more sustainable by design and can be considered a step towards FAIR vocabularies; it should also yield high scores in FAIR assessment
● vocabulary management and updates stay in the hands of the vocabulary owner/creator; a separate private key will be generated for every concept and should be stored in a secure place
● extra properties and attributes could be added to the DID documents representing specific vocabulary concepts, such as provenance information containing the date of creation or modification, the authors, the name of the ontology, and relations to other ontologies. Concepts can even have their own labels.
● statistics on concept usage, linkages, relations, and other metrics will be available directly from the DID chains
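A minimal sketch of the revision-chain idea from the first bullet above (all names here are hypothetical; a real implementation would use the update mechanism of the chosen DID method and a real ledger instead of an in-memory dict):

from dataclasses import dataclass
from typing import Optional

@dataclass
class ConceptRevision:
    did: str                        # DID of this revision of the concept
    pref_label: str                 # SKOS prefLabel of the concept
    previous: Optional[str] = None  # DID of the previous revision, if any

ledger: dict[str, ConceptRevision] = {}  # stand-in for the distributed ledger

def publish(revision: ConceptRevision) -> None:
    ledger[revision.did] = revision

def history(did: Optional[str]) -> list[ConceptRevision]:
    # Walk the chain from a DID back to the original concept revision.
    chain = []
    while did is not None:
        revision = ledger[did]
        chain.append(revision)
        did = revision.previous
    return chain

publish(ConceptRevision("did:oyd:v1-example", "Unemployment"))
publish(ConceptRevision("did:oyd:v2-example", "Unemployment (revised)", previous="did:oyd:v1-example"))

for rev in history("did:oyd:v2-example"):
    print(rev.did, rev.pref_label)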
15. CoronaWhy Proof of Concept on DIDs
A Dataverse repository with information on the 2022 Monkeypox outbreak uses DIDs as persistent identifiers:
https://datasets.coronawhy.org
16. Graph Network Sustainability with DIDs
COVID-19 Museum Knowledge Graph. Q142 in Wikidata: France@en, Frankrijk@nl, Frankreich@de, Франція@uk, France@fr
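For illustration, the multilingual labels shown above can be fetched from the public Wikidata API (a minimal Python sketch using the standard wbgetentities action):

import requests

# Fetch multilingual labels for Q142 (France) from the public Wikidata API.
resp = requests.get(
    "https://www.wikidata.org/w/api.php",
    params={
        "action": "wbgetentities",
        "ids": "Q142",
        "props": "labels",
        "languages": "en|nl|de|uk|fr",
        "format": "json",
    },
    timeout=30,
)
resp.raise_for_status()

labels = resp.json()["entities"]["Q142"]["labels"]
for lang, label in labels.items():
    print(f"{label['value']}@{lang}")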