ODIN Project Presentation to CLOSER Leadership Team
Transcript

  • 1. ODIN – ORCID and DataCite Interoperability Network. Presentation to CLOSER Leadership Team, November 2012. John Kaye – British Library. www.slideshare.net/johnkayebl www.odin-project.eu Funded by the European Union Seventh Framework Programme.
  • 2. Overview
      • Overview
      • Project Structure
      • Humanities and Social Science Proof of Concept
      • High Energy Physics Proof of Concept
      • Results
      • Commonalities
      • Risks
  • 3. Overview
      • 2-year project funded under EC FP7 Coordination and Action Programme
      • ORCID (Open Researcher and Contributor ID Initiative)
      • DataCite Consortium – BL is UK registration agent
      • Partners: ORCID, DataCite, BL, CERN, Dryad, arXiv, ANDS
      • Build on ORCID and DataCite initiatives to uniquely identify and connect scientists and datasets
      • ‘Datasets’ has a broad definition (anything but journals), so it can include grey literature, presentations, code etc.
      • Connect information across multiple services and infrastructures for scholarly communications
  • 4. Overview
      • Infrastructure already exists for researchers to build up an open portfolio of research objects
      • Register an ORCID iD at www.orcid.org and link published papers using ORCID’s tools
      • Non-published outputs (working papers, datasets) can be deposited in figshare (http://figshare.com/), given a DataCite DOI, then linked back and added to the ORCID profile
      • ODIN wants to expand on this principle and engage with data centres and institutional repositories to allow easier, more open discovery of non-traditional research outputs
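
Because the workflow on slide 4 depends on identifiers being exchanged between services, it helps that an ORCID iD can be checked mechanically before it is stored or linked. The sketch below is not part of the presentation; it assumes the ISO 7064 MOD 11-2 check character that ORCID documents for its 16-character identifiers. The function names `orcid_check_digit` and `is_valid_orcid` are hypothetical.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has the ORCID iD shape and a correct check character."""
    compact = orcid.replace("-", "").upper()
    # 15 ASCII digits followed by a digit or 'X' as the check character.
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

A repository or data centre could run a check like this when a depositor types in an iD, so that mistyped identifiers are caught before they are attached to a DOI record.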
  • 5. Project Structure
  • 6. Proofs of Concept Objectives
      • Develop two disciplinary proofs of the concept of open and interoperable persistent identifiers of data and contributors in scholarly communication, in a variety of current and future scenarios. Specific goals:
      • Prove the ability to navigate across data and contributors in the Humanities and Social Sciences (HSS), where data and contributors are separated in space and time, with curators bridging the gap
      • Prove the ability to navigate across data and contributors in High-Energy Physics (HEP), where multiple versions of articles in preliminary and final form, with several thousand contributors, need to be associated with a corresponding dataset hosted in different systems
      • Identify, by a critical analysis of the proofs of concept, common issues in open and interoperable permanent identifiers of data and contributors, by establishing a common cross-disciplinary view on the relevant workflows
  • 7. Deliverables and Time frames
      • D3.1 HSS Proof of Concept – Aug 2013
      • D3.2 HEP Proof of Concept – Aug 2013
      • D3.3 Commonalities – Sept 2014
      • MS5 Commonalities Identified – Jan 2014
      • D3.1 and D3.2 validated by the community at 1st year event
      • Input from ANDS and arXiv
  • 8. Humanities and Social Sciences
  • 9. HSS: Birth Cohort Studies
      • Why Birth Cohort Studies?
          • Investment
          • Established/Long history
          • Tradition of data curation
          • High re-use
          • Derived data
          • Multi-disciplinary
          • BL involvement in CLOSER (Cohort and Longitudinal Studies Enhancement Resource)
  • 10. HSS: Current Status
      • HSS British Birth Cohort characteristics:
          • High re-use of data
          • Data analysed across cohorts (e.g. 1958 questions alongside 2000)
          • Derived data often kept outside original repository
          • Lots of ‘grey literature’ (working papers, pre-prints etc.)
          • Different publication spaces (publishers, institutional repositories)
      • Challenges:
          • Uniquely associate articles/datasets with authors/contributors from a range of data sources
          • Authors/creators/researchers go back a long way (could be as early as 1946)
          • How to deal with non-digital research outputs
          • How to deal with cross-cohort analysis (multiple datasets, derived datasets)
          • Associate datasets with articles and track impact of data re-use
          • Survey questions often more important to identify than actual survey (survey contains thousands of variables)
  • 11. HSS: Objectives
      • Identify workflows and develop conceptual model
      • Provide technical solutions for identifying and connecting data creators, authors, researchers, contributors and research objects related to British Birth Cohort Studies
      • Identify, use and link existing identifiers and data sources where possible
      • Identify deficiencies in identification or relationship data and develop or propose solutions
      • Work with the research community to develop user case studies and data collection and enhancement
      • Create an open and interoperable network linking people and research objects to allow Impact Tracking and Resource Discovery
  • 12–15. HSS Proof of Concept (diagram; the same content appears on all four slides)
      • Legend: Data Creator, Researcher, Author; Birth Cohort Study dataset; Non-Birth Cohort Study dataset; Derived dataset; Grey Literature; Published article; Citation
      • Elements shown: 1958 and 1970 cohort datasets; Data Creator; Derived Data Creator; External Data input (Census, Health etc.); Author: Grey lit; Author: Article
  • 16. HSS: Identifiers and Data Sources
      • Researchers etc.: ORCID, ISNI, JISC Names, SCOPUS, Surveys, Citation DBs, UK Data Service, Catalogue metadata
      • Source Datasets: DataCite DOIs, ESDS
      • Derived Data: DataCite DOIs, Institutional IDs, No IDs, ESDS, Surveys, Institutional Repositories
      • ‘External’ Data: DataCite DOIs, Institutional IDs, No IDs, ESDS, Other datacentres, NHS, Institutional etc.
      • Grey Literature: DataCite DOIs, Institutional IDs, No IDs, Surveys, ESDS, Institutions
      • Published Literature: CrossRef DOIs, Institutional IDs, No IDs, SCOPUS, Surveys, ESDS, Institutions, Citation DBs, Catalogue metadata
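
The HSS slides describe a network that links people and research objects so that re-use of a cohort dataset can be tracked. As a rough illustration only, and not the project's actual model, such a network can be held as a set of typed links between identifiers. Everything named here is an assumption: the `Network` class, the relation names (`created`, `authored`, `derived_from`, `cites`) and the placeholder identifiers, which stand in for ORCID iDs and DataCite or CrossRef DOIs.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    # Each link is a (subject identifier, relation, object identifier) triple.
    links: set = field(default_factory=set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self.links.add((subject, relation, obj))

    def downstream(self, identifier: str) -> set:
        """Objects that directly or indirectly derive from or cite `identifier`."""
        found, frontier = set(), [identifier]
        while frontier:
            current = frontier.pop()
            for subject, relation, obj in self.links:
                if (obj == current and relation in {"derived_from", "cites"}
                        and subject not in found):
                    found.add(subject)
                    frontier.append(subject)
        return found

    def contributors(self, identifiers: set) -> set:
        """People linked as creator or author to any of the given objects."""
        return {s for s, r, o in self.links
                if r in {"created", "authored"} and o in identifiers}


# Hypothetical placeholder identifiers, loosely following the slide 12-15 diagram.
net = Network()
net.add("person:A", "created", "dataset:cohort-1958")
net.add("dataset:derived-1", "derived_from", "dataset:cohort-1958")
net.add("dataset:derived-1", "derived_from", "dataset:census")
net.add("person:B", "created", "dataset:derived-1")
net.add("paper:working-1", "cites", "dataset:derived-1")
net.add("person:C", "authored", "paper:working-1")
net.add("article:1", "cites", "dataset:cohort-1970")

impact = net.downstream("dataset:cohort-1958")
people = net.contributors(impact)
```

Here `impact` collects the derived dataset and the working paper that builds on it, while the article citing the 1970 cohort is left out. That is the kind of impact tracking and resource discovery the objectives on slide 11 ask for.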
  • 17. High Energy Physics
  • 18. Current status (I)
      • HEP (High-Energy Physics) field specificities:
          • Multiversioning: from preprint versions until final publications
          • Hyperauthorship: hundreds/thousands of scientists signing the same article
          • Data levels of abstraction (CERN, Inspire, HEPData)
          • Different publication spaces (arXiv, Inspire, publishers)
      • Challenges:
          • Author identification, improvement of the disambiguation process done in place
          • Uniquely associate articles/datasets with authors/contributors
          • Version management during the long publication process
  • 19. Current status (II) Current Inspire interface
  • 20. Current status (III)
      • Disambiguation process among thousands of authors:
          • Names and affiliations
          • Different ways to write the same information
          • Clustering algorithm
      • Figure: current Inspire interface
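
Slide 20 names the ingredients of author disambiguation: names, affiliations, variant spellings of the same information, and a clustering step. The sketch below is a deliberately simple stand-in and makes no claim to be the algorithm Inspire uses. It normalises spelling variants and then groups signatures that share a surname, first initial and affiliation. The helper names (`normalise`, `name_key`, `cluster`) and the sample signatures are hypothetical.

```python
import unicodedata
from collections import defaultdict


def normalise(text: str) -> str:
    """Lower-case, strip accents and punctuation so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() or c.isspace() else " "
                      for c in no_accents.lower())
    return " ".join(cleaned.split())


def name_key(name: str) -> str:
    """Reduce 'Surname, Given' or 'Given Surname' to 'surname initial'."""
    if "," in name:
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        parts = name.split() or [""]
        surname, given = parts[-1], " ".join(parts[:-1])
    surname, given = normalise(surname), normalise(given)
    return f"{surname} {given[:1]}".strip()


def cluster(signatures):
    """Group (name, affiliation) signatures sharing a name key and affiliation."""
    groups = defaultdict(list)
    for name, affiliation in signatures:
        groups[(name_key(name), normalise(affiliation))].append((name, affiliation))
    return dict(groups)


# Made-up signatures: three spellings of one author plus a different author.
signatures = [
    ("Müller, Anna", "CERN"),
    ("A. Muller", "cern"),
    ("Anna Müller", "Cern"),
    ("Muller, Bernd", "CERN"),
]
groups = cluster(signatures)
```

With these made-up signatures the three spellings of the first author fall into one group and the second author into another. Exact-key grouping of this kind would over-merge common names and split authors who change affiliation, which is one reason a persistent identifier such as an ORCID iD is attractive for papers with thousands of signatories.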
  • 21. Phase 2: Results and Commonalities
      • Results to feed into Hackathon event and strategy
      • Assessment and validation by research community and international partners
      • BL and CERN come together to find commonalities in the disciplines to inform WP4 (interoperability)
          • This process will incorporate knowledge from the results of the Hackathon as well as the conceptual model for global interoperability of data and contributor identifiers developed in WP4
          • This task will result in a more comprehensive view on disciplinary and interdisciplinary needs, and will produce information, internally transferred to the other work packages
  • 22. Questions?
      John Kaye – Lead Curator Digital Social Sciences
      The British Library
      96 Euston Road
      London NW1 2DB
      john.kaye@bl.uk
      Twitter: @johnkayebl
      Telephone: 020 7412 7450
      Project Website: http://odin-project.eu/
      Blog: http://britishlibrary.typepad.co.uk/socialscience/