
Role of PIDs in connecting scholarly works

Presentation from a joint FREYA and OpenAIRE webinar, "New developments in the field of Persistent Identifiers", by Dr. Amir Aryani, Director, Research Graph Foundation

Published in: Science

Role of PIDs in connecting scholarly works

  1. 1. Role of PIDs in connecting scholarly works Dr. Amir Aryani Director, Research Graph Foundation https://orcid.org/0000-0002-4259-9774 1
  2. 2. Agenda • Background: DDRI / Research Graph • Augmenting the graph of scholarly works using PID • PID Graph 2
  3. 3. Background: Challenge of finding related datasets. Similar datasets in other repositories?
  4. 4. Background: Initiated from the Data Description Registry Interoperability Working Group (Research Data Alliance). Goal: enabling cross-platform discovery between “Research Data Infrastructures”. https://www.rd-alliance.org/groups/data-description-registry-interoperability.html
  5. 5. Background https://researchdata.ands.org.au/idmm-immunome-database-for-marsupials-and-monotremes/11139
  6. 6. Background https://researchdata.ands.org.au/idmm-immunome-database-for-marsupials-and-monotremes/11139
  7. 7. Background Show 105 more publications
  8. 8. Background http://dx.doi.org/10.1371/journal.pone.0079092 One of the 105 articles …
  9. 9. doi:10.5061/dryad.4qq0v Authors: Wong ESW, Nichol S, Warren WC, Belov K Dryad Dataset http://datadryad.org/resource/doi:10.5061/dryad.4qq0v Background
  10. 10. Background Dataset Researcher Publication Dataset We found another dataset from the same author…
  11. 11. ResearchDataSwitchboard Background
  12. 12. • D1 −[:citedBy]→ P1 ←[:citedBy]− D2 • D1 −[:fundedBy]→ G1 ←[:fundedBy]− D2 • D1 −[:createdBy]→ R1 ←[:createdBy]− D2
  13. 13. Role of DOI: In a limited graph cluster (~4 million nodes) • 70.6% of publications have a DOI • 46% of datasets have a DOI https://www.nature.com/articles/sdata201899
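The three patterns on slide 12 can be read as: two datasets are related when they point to the same publication, grant, or researcher. The following is a minimal sketch of that idea in Python, assuming a hypothetical in-memory edge list. The identifiers D1, D2, P1, G1 and R1 follow the slide; D3, R2 and the function name are illustrative only and not part of the Research Graph software.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical edges as (source, relation, target), mirroring slide 12.
EDGES = [
    ("D1", "citedBy", "P1"),
    ("D2", "citedBy", "P1"),
    ("D1", "fundedBy", "G1"),
    ("D2", "fundedBy", "G1"),
    ("D1", "createdBy", "R1"),
    ("D2", "createdBy", "R1"),
    ("D3", "createdBy", "R2"),  # illustrative dataset with no shared neighbour
]


def related_datasets(edges):
    """Return {(dataset_a, dataset_b): [(relation, shared_node), ...]}."""
    # Group sources by the (relation, target) pair they point to.
    by_target = defaultdict(set)
    for source, relation, target in edges:
        by_target[(relation, target)].add(source)

    # Any two sources sharing a (relation, target) pair are related.
    related = defaultdict(list)
    for (relation, target), sources in sorted(by_target.items()):
        for a, b in combinations(sorted(sources), 2):
            related[(a, b)].append((relation, target))
    return dict(related)
```

Calling `related_datasets(EDGES)` reports D1 and D2 as related through P1, G1 and R1, while D3 shares no neighbour and does not appear.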
  14. 14. Augment API 14
  15. 15. 15
  16. 16. Peter Oke 1. NCI Original Record: Dr. Peter Oke has multiple datasets in the NCI Geonetwork Catalogue. 2. Augmented Graph: The Augment API linked this record to the related ORCID profile. 3. External Records: All the publications from the ORCID profile have been linked to the new record in the NCI graph.
  17. 17. Finding connection to a dataset Dataset Researcher RelatedTo 17
  18. 18. Finding connection to a dataset Dataset Researcher ORCID: 0000-1111-0000-1111 Researcher knownAs Publication RelatedTo P P 18
  19. 19. Finding connection to a dataset D R knownAs R P P P Dataset Publication 19
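Slides 17 to 19 describe a path from a dataset to publications that passes through a local researcher record and the ORCID record it is knownAs. A rough sketch of that traversal is a breadth-first search over the augmented graph. The node names and adjacency list below are hypothetical and only mirror the shape drawn on the slides.

```python
from collections import deque

# Hypothetical augmented graph, following the shape of slides 17-19.
GRAPH = {
    "dataset:D": ["researcher:local"],
    "researcher:local": ["dataset:D", "researcher:orcid"],  # knownAs link
    "researcher:orcid": ["researcher:local", "publication:P1", "publication:P2"],
    "publication:P1": ["researcher:orcid"],
    "publication:P2": ["researcher:orcid"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

Without the knownAs link the dataset and the publications sit in separate components, so the search returns `None`; with it, the path runs dataset, local researcher, ORCID researcher, publication.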
  20. 20. 1. ETL: Transform the Geonetwork repository into a Neo4j graph database. 2. Augment: Use the Research Graph Service (in the Amazon cloud) to augment the NCI graph with ORCID and other repositories. 3. Visualisation: Create optimised GraphML from the augmented Neo4j graph.
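The slides do not say how the GraphML in step 3 was produced. As an illustration of the target format only, here is a small sketch that writes nodes and edges as GraphML using the Python standard library. The function name, the `label` and `relation` keys, and the sample data are assumptions, and this does not reflect the actual Neo4j export used in the project.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """nodes: {node_id: label}; edges: [(source, target, relation)]. Returns XML text."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    # Declare the data keys used on nodes and edges (names are illustrative).
    ET.SubElement(root, "key", {"id": "label", "for": "node",
                                "attr.name": "label", "attr.type": "string"})
    ET.SubElement(root, "key", {"id": "relation", "for": "edge",
                                "attr.name": "relation", "attr.type": "string"})
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    for node_id, label in nodes.items():
        node = ET.SubElement(graph, "node", id=node_id)
        ET.SubElement(node, "data", key="label").text = label
    for source, target, relation in edges:
        edge = ET.SubElement(graph, "edge", source=source, target=target)
        ET.SubElement(edge, "data", key="relation").text = relation
    return ET.tostring(root, encoding="unicode")
```

For example, `to_graphml({"D1": "Dataset", "R1": "Researcher"}, [("D1", "R1", "createdBy")])` yields a document that graph tools accepting GraphML should be able to load.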
  21. 21. NCI Original Graph NCI Augmented Graph
  22. 22. Lessons Learned 22
  23. 23. L1: Power of Trusted Persistent Identifiers • Persistent identifiers (PIDs) save money • Disambiguation without PIDs is expensive and inefficient • Trusted PIDs create trusted connections • Without PIDs, we need to rely on AI to map relationships between scholarly works. This usually requires probabilistic models with accuracy < 100%.
  24. 24. L2: Sailing the Data Ocean. Working with scholarly communication is traversing Dynamic Big Data. We need • fast, • trusted, and • sustainable services.
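The contrast in L1 can be made concrete with a small sketch: matching two researcher records by a shared PID is an exact comparison, while matching by name needs a similarity score and a threshold, and can be wrong either way. The records, the placeholder identifiers, and the 0.8 threshold below are hypothetical; `difflib.SequenceMatcher` stands in for whatever probabilistic model a real service would use.

```python
from difflib import SequenceMatcher

# Hypothetical records; the identifier values are placeholders, not real ORCID iDs.
REPO_RECORD = {"name": "Jane Q. Smith", "orcid": "HYPOTHETICAL-ORCID-1"}
ORCID_RECORD = {"name": "J. Smith", "orcid": "HYPOTHETICAL-ORCID-1"}
NO_PID_RECORD = {"name": "J. Smith", "orcid": None}


def same_person_by_pid(a, b):
    """Exact match; only meaningful when both records carry a PID."""
    return a.get("orcid") is not None and a.get("orcid") == b.get("orcid")


def name_similarity(a, b):
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def probably_same_person(a, b, threshold=0.8):
    """Fallback without PIDs: a guess that depends on the chosen threshold."""
    return name_similarity(a, b) >= threshold
```

With a PID on both records the answer is a plain yes or no. Without one, the result depends on the threshold, and two different people with similar names can score as high as one person with two name spellings.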
  25. 25. 25 L3: We need to connect instead of collect
  26. 26. PID Graph 26
  27. 27. PID Graph The PID Graph is a network of connections between PIDs available in the form of a set of federated RESTful JSON APIs. Applications:
 1. Disciplinary applications: Integrate mature PID Graph functionality in disciplinary contexts, i.e. trusted author-article-data linking and software identification and citation workflows.
 2. European Open Science Cloud applications: Connect European Open Science Cloud demonstrators to the PID Graph. Build the required knowledge-based resources that enable European Open Science Cloud stakeholders to exploit and contribute to the PID Graph.
 3. Graph visualisation: Exploit the PID Graph API to create an informative graph visualisation and exploration tool. The tool will show the research connections for a research enterprise, such as the citation network for a data repository, institution, or funder.
  28. 28. Examples • Track the citation of a dataset across all versions • D{1.1, 1.2, 1.3} − [: citedBy] → {P1 , P2 , P3 , P1} • Impact of funding • G1 ← [: fundedBy] − {D1 , P1 , P2 , P3 , … } • More effective linking of data to publications • As a researcher I want (easy ways) to (more effectively) link all data to publications. As a reader I want to be able to easily find all data related to a publication. 28
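The first two examples on slide 28 amount to set operations over graph edges. The sketch below assumes hypothetical in-memory mappings (which version is cited by which publication, and which works a grant funded); the repeated P1 on the slide is read here as one publication citing more than one version, which is an interpretation rather than something the slide states.

```python
# Hypothetical data: publications citing each version of dataset D.
CITED_BY = {
    "D1.1": ["P1"],
    "D1.2": ["P2", "P3"],
    "D1.3": ["P1"],
}

# Hypothetical fundedBy edges as (work, grant).
FUNDED_BY = [("D1", "G1"), ("P1", "G1"), ("P2", "G1"), ("P3", "G1"), ("P4", "G2")]


def citations_across_versions(cited_by):
    """Union of citing publications over all versions, without duplicates."""
    return sorted({pub for pubs in cited_by.values() for pub in pubs})


def funded_outputs(funded_by, grant):
    """All datasets and publications that point to the given grant."""
    return sorted(work for work, g in funded_by if g == grant)
```

Here `citations_across_versions(CITED_BY)` counts P1 once even though it cites two versions, and `funded_outputs(FUNDED_BY, "G1")` lists every work linked to grant G1.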
  29. 29. 29 User stories (Use cases) • What API? • Potential users? • How complicated to implement and expensive to operate?
  30. 30. What next? • Research Data Graph BoF, RDA’s 13th Plenary Meeting (P13), April 2-4, Philadelphia • Project FREYA https://www.project-freya.eu • Research Graph https://www.researchgraph.org • Contact info: amir.aryani@researchgraph.org
