Building data networks: exploring trust and interoperability between authors, repositories and journals. Varsha Khodiyar, Scientific Data; Neil Chue Hong, Journal of Open Research Software; Rachael Kotarski, DataCite; Peter McQuilton, BioSharing; Reza Salek, Metabolights. At Repository Fringe 2015
1. • Varsha Khodiyar, Scientific Data
• Neil Chue Hong, Journal of Open Research Software
• Rachael Kotarski, DataCite
• Peter McQuilton, BioSharing
• Reza Salek, Metabolights
Building data networks: exploring trust and
interoperability between authors, repositories
and journals
Repository Fringe 2015
4. What do data journals require?
Our general criteria
1. Recognized within their scientific community
2. Long-term preservation of datasets
3. Implement relevant reporting standards
4. Allow confidential review of submitted datasets
5. Stable identifiers for submitted datasets
6. Allow public access to data without unnecessary restrictions
Questionnaire online for new repositories requesting listing:
http://www.nature.com/sdata/data-policies#repo-suggest
List of repositories:
http://www.nature.com/sdata/data-policies/repositories
5. Software Sustainability Institute
www.software.ac.uk
Neil Chue Hong
Director, Software Sustainability Institute
Editor-in-Chief, Journal of Open Research Software
Repository Fringe 2015, Edinburgh, 3-4 August 2015
Neil Chue Hong (@npch), Software Sustainability Institute
ORCID: 0000-0002-8876-7606 | N.ChueHong@software.ac.uk
9. www.bl.uk
DataCite UK
• 52 data centres, universities and other organisations using DataCite in the UK
• Assigning DOIs to data, theses and software, among other things
10. British Library
Leverage the Library’s collections and expertise to drive
innovation in large-scale data analytics, for the wider
benefit of UK research
• Providing our digital collections as ‘data’
– http://labs.bl.uk/
– http://www.bl.uk/bibliographic/datafree.html
• Alan Turing Institute will be physically hosted in the
British Library building
14. A web-based, curated and searchable portal where biological
standards and databases are registered, linked and discoverable.
We monitor the development and evolution of standards, their use in
databases and the adoption of both in data policies.
15. Researchers, developers and curators lack support and guidance on which format or checklist standards to use, or which database to deposit their data in.
Journal publishers, funders and librarians do not have enough information to make informed decisions on which content standards or databases to recommend in policies, or to fund or implement.
Our mission: To help people make the right choice
Scientific Data launched in May 2014, introducing a new type of content called the Data Descriptor designed to make data more discoverable, interpretable and reusable. Check out our first publications online.
Our Data Descriptors fall broadly into two categories
First descriptions of datasets
These often describe valuable, unpublished datasets that may be hard to fit into a traditional research article context. See our first publications for clear demonstrations that Scientific Data can help motivate scientists to share valuable datasets that might not otherwise have seen the light of day.
Follow-up articles
These articles provide fuller descriptions and more complete releases of datasets analysed in previous publications. In these cases, the value of the underlying datasets is often already well demonstrated, but for groundbreaking studies, where there are no established standards or data repositories, a substantial amount of additional information is often needed before others can actually reuse the data. Data Descriptors at Scientific Data help motivate the authors to release datasets more fully, and the Data Descriptor manuscripts can provide more detailed descriptions of the data collection methods and the data file formats, which is essential information for others who may wish to reuse the data.
Visibility for repository
Subject-specific data is stored with other related data, which makes discovery easier for researchers
A metajournal that encourages the publication of information that supports the reuse of software.
A way of using the current tools and practices to make software better recognised.
I am part of the team running DataCite in the UK. We work with organisations to provide persistent identifiers, in the form of DOIs, for their research data, although DOIs are applied to other objects as well.
My other role is generally looking at the data activities of the Library. This is important for the Library right now, as one of the key aims of the Library’s current strategy (‘Living Knowledge’) is this: