Joint webinar FREYA and OpenAIRE: New
developments in the field of Persistent Identifiers
FREYA-WP3: New PID developments
By
Ketil Koop-Jakobsen
PANGAEA,
Bremen University
Germany
PANGAEA® Data Publisher
The FREYA project
Connected Open Identifiers for Discovery,
Access and Use of Research Resources
www.project-freya.eu | twitter: @freya_eu
FREYA in a nutshell
• FREYA = persistent identifiers
– “… To extend the infrastructure for persistent identifiers (PIDs) as a core component of open research, in the EU and globally.”
• H2020 project funded by the European Commission
• Builds on THOR (which in turn built on ODIN)
• Started 1 December 2017
• www.project-freya.eu
• twitter: @freya_eu
FREYA characteristics:
FREYA works across disciplines and draws on expertise from a very diverse group of data repositories, publishers, research institutions, PID providers and libraries
Extended PID graph with New PIDs
New PID Types (WP3)
[Figure: PID graph. Nodes: Author 1, Author 2, Author, Publication, Institution, Dataset, Software, Sample, Data repository, Organization, Instrument, Grant, Conference. Links: affiliation, funding, authorship, citation, reuse, input.]
Where do we start?
Which PID is needed the most?
Which new PID can be expanded?
[Figure: the PID graph again, with question marks in place of most of the new PID types; Conference is still labelled]
What we do in FREYA:
FREYA's work on new PIDs assesses the state of the art for PIDs, identifies gaps, and specifies use cases and requirements for potential new PID types.
• Assess the current state of the PID landscape (DONE)
• Identify needs for new PIDs and their requirements (Almost DONE)
• Develop prototypes for new PID services (In Progress)
The Process of
identifying new PIDs
services
Defining the PID landscape
• Determining current PID developments, existing
PIDs, PID initiatives, maturity of individual PIDs
25 entities having or needing a PID
Significant overlap among disciplines
Complicates determination of PID maturity
PID MATURITY INDEX
Deliverable 3.1 available on the FREYA webpage: https://doi.org/10.5281/zenodo.1324296
Only three entities (researchers, publications and data) have services that are deemed fully mature. The remaining entities are either emerging or immature.
Identifying needs and requirements for new PIDs
Methods:
• Collecting Use-cases from the community
• Collecting Use-cases at conferences
• Identifying New PIDs in high demand
• Identifying requirements for the progress of new PIDs
• Matching needs and requirements with FREYA expertise
What is a Use-case?
A Use-case in FREYA describes a scenario where a PID is needed, identifying the User, the Goal and the Benefit.
General Use-case template
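The template on this slide is an image and is not reproduced in the text. As a minimal sketch, assuming the three parts named above (user, goal, benefit) and the "As a ..., I want ..., so that ..." wording used in the examples of this deck, a use case could be recorded like this. The class and field names are hypothetical, not part of any FREYA tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """One use case: who needs a PID, what for, and what they gain."""
    user: str
    goal: str
    benefit: str

    def as_sentence(self) -> str:
        # Render the three parts in the "As a ..., I want ..., so that ..." form.
        return f"As a {self.user}, I want {self.goal}, so that {self.benefit}."


# Wording taken from the software example in this deck.
software_case = UseCase(
    user="software author",
    goal="to see the citations of my software aggregated across all versions",
    benefit="I see a complete picture of reuse",
)
print(software_case.as_sentence())
```

Keeping the three parts as separate fields makes it easy to group collected use cases by user type or by the entity that needs a PID.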
What is a Use-case? Use-case examples
Use-case collection:
72 Use-cases were collected in total
30 Use-cases revolved around “New PIDs”
What is a Use-case? Example: Research Cruises
As a funding agency, I would like to trace the outcome of my financial contribution to a marine research cruise (Cruise ID) by tracking the data generated (data-PID), the articles published (publication-PID) and the physical samples taken (IGSN). I would also like to track the future data and publications generated from my cruise and the samples it collected.
Image: MARUM©
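The tracing described in this use case amounts to following links outward from the cruise in a PID graph. The sketch below is one possible illustration, not FREYA's implementation: the identifiers are made-up placeholders and the link table is a hypothetical stand-in for whatever PID metadata service would supply the connections.

```python
from collections import deque

# Hypothetical PID graph: each key links to the PIDs derived from it.
# All identifiers below are made-up placeholders, not real PIDs.
LINKS = {
    "cruise:EXAMPLE-1": ["data:d1", "sample:s1"],
    "data:d1": ["publication:p1"],
    "sample:s1": ["data:d2"],
    "data:d2": ["publication:p2"],
}


def trace_outputs(start: str, links: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning every PID reachable from `start`."""
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        current = queue.popleft()
        for target in links.get(current, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found


outputs = trace_outputs("cruise:EXAMPLE-1", LINKS)
print(outputs)
```

Because the walk follows links transitively, data and publications that arise later from a sample (here `data:d2` and `publication:p2`) are found as soon as they are linked to the sample's PID.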
What is a Use-case? Example: Software
As a software author, I want to be able to see the citations of my software aggregated across all versions, so that I see a complete picture of reuse.
Image from: https://opentextbc.ca/selfpublishguide/chapter/screenshots-of-software/
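One way to picture this aggregation, assuming each software version carries its own PID and points to a shared PID for the software as a whole, is sketched below. The version-to-software mapping, the citation list and all identifiers are hypothetical sample data.

```python
# Hypothetical records: each software version has its own PID and points to
# a shared "concept" PID for the software as a whole. All values are made up.
VERSION_OF = {
    "sw:tool-v1.0": "sw:tool",
    "sw:tool-v1.1": "sw:tool",
    "sw:tool-v2.0": "sw:tool",
}

# Hypothetical citation links: (citing work, cited software PID).
CITATIONS = [
    ("pub:a", "sw:tool-v1.0"),
    ("pub:b", "sw:tool-v1.1"),
    ("pub:c", "sw:tool-v2.0"),
    ("pub:a", "sw:tool-v2.0"),  # the same paper cites a second version
]


def citing_works(concept: str) -> set[str]:
    """Collect distinct works citing `concept` or any of its versions."""
    return {
        citing
        for citing, cited in CITATIONS
        if cited == concept or VERSION_OF.get(cited) == concept
    }


works = citing_works("sw:tool")
print(len(works), sorted(works))
```

Counting distinct citing works rather than raw citation links avoids counting a paper twice when it cites more than one version.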
What is a Use-case? Example: Policy
As a research manager, I
want to have policy IDs, so
that I can easily identify
relevant policies and
assess the compatibility
between different policies.
Identifying PID needs in Use-cases
Entity        Popularity
Instrument    10
Data          8
Article       6
Person        5
Repository    5
Organisation  4
Sample        4
Software      4
Grants        3
Project       1
Study         1
Conference    1
The upcoming deliverable 3.2 will
provide detailed information on
1. Instruments
2. Repositories
3. Organizations
4. Physical samples
5. Grants
6. Software
7. Research Campaigns
8. Data management plans
9. Facilities
Of the 25 PIDs identified in the landscape analysis, 9 were chosen for further analysis and matched with expertise within FREYA
Prioritization of future work on PIDs
1. Identify needs based on user-stories: Why do people want this PID?
2. Validate current status: What does it take to expand this PID type?
3. Validate the extent of existing PIDs: Cross-disciplinary approach
4. Identify expertise: Who can move this PID-type forward?
Example: Physical samples
• User-stories:
1) Information about a sediment core/sample in a core repository.
2) Tracing/relocating misplaced cultural artefacts.
3) Identifying samples of bacterial/viral/fungal strains.
• PID systems:
IGSN: Geological samples
RRID: Research Resource Identifiers for antibodies, cell lines and model organisms
ARTs: No universal identifiers
Prioritization of future work on PIDs
The four steps (needs, current status, extent of existing PIDs, expertise) are used to prioritize which PIDs to move forward for prototyping within FREYA.
Report will be available by the end of February
CASE-STUDY:
EXPANSION OF PANGAEA PID GRAPH
THROUGH IMPLEMENTATION OF PIDS FOR
PHYSICAL SAMPLES (IGSNs):
How implementation of new PIDs provides the user with additional information
Photo: MARUM – Center for Marine Environmental
Sciences, University of Bremen; V. Diekamp
SCENARIO:
I am a geologist interested in sediment cores.
It has come to my knowledge that interesting research is going on in lakes of the French Alps.
I search PANGAEA for data and find the work of Bajard et al. 2015.
What kind of information does PANGAEA's use of PIDs provide me?
The work of Bajard et al. 2015 found on the PANGAEA webpage
[Screenshot: PANGAEA dataset page with the mature, actionable PIDs available from PANGAEA marked: Author-PID, Data-PID, Article-PID]
[Screenshot: the same page with a Sample-PID marked in addition to the Data-PID, Article-PID and Author-PID]
Conclusion:
Our use-case-oriented method gives a practical indication of user demand for new PIDs.
In particular, PIDs for instruments, organizations, physical samples, grants and software are sought by the community.
Implementation of some of the new PIDs will improve users' access to additional information.
H2020 project, grant no. 777523
Rank maturity of PIDs for disciplines
Matching PID needs with FREYA expertise

Entity        Popularity  EBI  DataCite  BL  DANS  CERN  STFC  Pangaea
Instrument    10          2-3  1         1   1     1     1     2
Data          8           5    5         5   5     5     3     5
Article       6           5    5         5   5     5     5     5
Person        5           5    5         5   5     4-5   3     4-5
Repository    5           1    2         1   1     1     1     1
Organisation  4           2    2         2   2     1-2   1     2
Sample        4           5    2         1   1     1     1     2-3
Software      4           2    5         1   2     5     2     1
Grants        3           3    2         2   2     1     1     1
Project       1           2    1-2       1   2     1     1     1-2
Study         1           2    1         1   2     1     5     1
Conference    1           1    2         1   1     1     1     1

Maturity key: 1 non-existent, 2 nascent, 3 emerging, 4 in pilot, 5 mature
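As an illustration of how such a table could be used to match a PID need with partner expertise, the sketch below finds, for three rows copied from the table, the partners reporting the highest maturity. Reading a range such as "2-3" by its upper bound is an assumption made only for this example, and the slides do not state that the matching was done this way.

```python
PARTNERS = ["EBI", "DataCite", "BL", "DANS", "CERN", "STFC", "Pangaea"]

# Three rows copied from the table above; ranges such as "2-3" are kept as text.
MATURITY = {
    "Instrument": ["2-3", "1", "1", "1", "1", "1", "2"],
    "Sample": ["5", "2", "1", "1", "1", "1", "2-3"],
    "Software": ["2", "5", "1", "2", "5", "2", "1"],
}


def upper_bound(score: str) -> int:
    """Read "2-3" as 3 and "5" as 5 (an assumption made for this sketch)."""
    return int(score.split("-")[-1])


def most_experienced(entity: str) -> list[str]:
    """Partners whose maturity score for `entity` equals the row maximum."""
    scores = [upper_bound(s) for s in MATURITY[entity]]
    best = max(scores)
    return [p for p, s in zip(PARTNERS, scores) if s == best]


for entity in MATURITY:
    print(entity, most_experienced(entity))
```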
