The Now and Future of
Data Publishing
Oxford University – 22nd May
Ruth Wilson
Publisher
Nature Publishing Group

Wilson-npg-scientific data-nfdp13

Presentation by Ruth Wilson on Nature Publishing Group's Scientific Data journal given at the Now and Future of Data Publishing Symposium, 22 May 2013, Oxford, UK

Published in: Technology, Education

  • This is a very broad theme and there is not much time, so I will concentrate on two aspects of linking publications and research data at NPG. One is a new product, Scientific Data, a new data-focused, open access, peer-reviewed platform; the other is the evolution of practices for existing journals. First, a small amount of context.
  • What are the existing challenges? We know that much research data is stored in drawers, if it is stored at all.
  • In response to these challenges, NPG is launching Scientific Data, focused on data interpretation and reuse. Calling for submissions in Fall 2013, launching in Spring 2014. Six Principles. The Scientific Brand: an innovative new publishing brand from NPG. Open-access, community-driven. Feature-rich. Complements the more tradition-bound Nature titles.
  • New layer in between traditional journal articles and Repositories. We don’t store the data.
  • Supplementary Information (SI): a blessing and a curse.
    – Underutilised: publishers (including NPG) do little with SI other than present it with the article in PDF form; not being in XML/HTML format makes it hard to index and find.
    – Growth is difficult to manage: publishers are struggling with the growing amount of SI in the life sciences. Since 2010 the Journal of Neuroscience no longer accepts SI, as it felt it was adversely affecting peer review. In 2009 Cell restricted the number and volume of SI.
    – Increasing volumes at NPG: the number of pieces of SI in Nature Genetics (NG) grew by about 65% between 2008 and 2011. There were 1515 pieces of SI in Nature Genetics (including figures and tables) in the first half of 2011, compared with 915 in the same period in 2008. The volume of SI across NPG grew from 5299 files in 2008 to 6469 in 2009 and 7120 in 2010 (increases of 22% and 10%). These figures do not break out individual figures and tables but count the number of PDF files, doc files, xls files, etc. Approximately 60% are PDF files.
  • Source data has been on EMBO and MSB for some time. It is linked to the Nature journals' updated editorial policies, which aim to improve transparency and reproducibility by:
    – requiring much more precise description of statistics and employing the expertise of a statistics consultant where needed;
    – increasing the lengths of Methods sections in journals to allow authors to be much more descriptive and facilitate replication of their findings; and
    – publishing source data: first the actual data points, that is, tabular source data, behind figures; next, additional forms of source data.
    To this point we have been citing data sets in an online Accession Codes section of our articles by listing the repository name and, via the persistent identifier, linking to the data set entry in the repository. We will further formalize Data Citations by having them appear in a similar manner to bibliographic references, including ensuring that data set authors are more granularly credited for their work (and including the date, minimally the year, of data deposition).
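As a rough illustration of the data citation elements named in the note above (data set authors, year of deposition, repository name, persistent identifier), here is a minimal Python sketch. The `DataCitation` class, its field names, the `title` field, the output layout, and every example value are assumptions made for illustration only; the talk does not specify a citation format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements mentioned in the talk."""
    authors: List[str]   # data set authors, credited individually
    year: int            # year of data deposition (the stated minimum for the date)
    title: str           # data set title (assumed field, not named in the talk)
    repository: str      # name of the repository holding the data set
    identifier: str      # persistent identifier linking to the repository entry

    def format(self) -> str:
        """Render the citation in a reference-list style (layout is an assumption)."""
        if not self.authors:
            raise ValueError("a data citation needs at least one author")
        names = ", ".join(self.authors)
        return f"{names} ({self.year}). {self.title}. {self.repository}. {self.identifier}"


# Placeholder values only; none of these refer to a real data set.
example = DataCitation(
    authors=["A. Researcher", "B. Colleague"],
    year=2013,
    title="Example dataset",
    repository="Example Repository",
    identifier="doi:10.0000/example",
)
print(example.format())
```

Keeping the authors as a list rather than a single string is one way to support the more granular crediting of data set authors that the note describes.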
  • Transcript of "Wilson-npg-scientific data-nfdp13"

    1. The Now and Future of Data Publishing. Oxford University – 22nd May. Ruth Wilson, Publisher, Nature Publishing Group
    2. Overview: Context; Scientific Data (Concept, Data descriptor, Licenses, Team); Evolution (Better integration of SI, Source data, Data citations)
    3. Data, data, data. Two important factors are driving efforts to make research data more available and reusable: • To ensure the scientific process is transparent and can be scrutinised, and research results reproduced • To speed the scientific process, lead to new insights and reduce duplicated and repeated work. To achieve this, research data needs to be: – Available – Findable – Interpretable – Re-usable – Citable
    4. Existing challenges • Data producers do not necessarily get appropriate credit for their work • Traditional publications are focused on hypotheses/conclusions • The peer review process at many research journals is not focused on ensuring data release and data standards • Data and information about datasets often end up in supplementary material • Potentially valuable datasets are not released
    5. Calling for submissions in Fall 2013, launching in Spring 2014. nature.com/scientificdata
    6. What is Scientific Data? • Scientific Data is an Open Access, online-only platform containing data descriptors that describe and explain datasets, supported by an APC model. • Data descriptors are a new type of content and can be viewed as ‘secondary’ material aimed at increasing the visibility and usability of datasets and aiding research reproducibility • For all types of data the descriptor will be peer reviewed
    7. What is Scientific Data? (continued) • As part of the peer review process we will check that the data is publicly available in an approved data repository and follows community guidelines • All content will be published open access with the author able to select from a number of options. In addition the descriptor metadata will be available under CC0. • An in-house editorial team and new authoring tools are being developed to ensure the creation, submission, curation and publication of data descriptors is as simple as possible • The external advisory board will represent different stakeholder views and provide feedback on key services.
    8. Data Descriptors: a new publication type for describing scientifically valuable datasets. [Diagram: SciData DD structured content; export to various formats (ISA_tab, RDF, etc.); datasets; interoperate with community resources; code; workflows; advanced search and discovery functions; link to related content (Nature Methods, Scientific Reports, Nature Genetics)]
    9. Narrative content complements both journal articles and repository records. Includes: – Highly detailed, reproducible methods descriptions – Quality control & technical validation experiments – Searchable, machine-readable metadata. Does not include: – In-depth analysis or tests of hypotheses – New scientific conclusions – Exploratory analysis (e.g. clustering)
    10. Structured content. It will be based on and compatible with ISA-tab and undergo technical review by biocuration/standards referees. Submit ISA-tab files directly OR use submission tools and simple templates that help authors provide the information without special tools. An in-house curator standardizes the structured content
    11. License types. Data: the raw datasets will reside in public repositories and are likely to be CC0, similar to Figshare and Dryad etc. Data descriptor – Metadata: as NPG has already done with its existing Linked Data Portal, the metadata about data descriptors in Scientific Data will be CC0. Data descriptor – Narrative/Figures: the narrative describing the methodology of data generation/collection and processing will be licensed under either of the following, by author choice:
    12. Who are we? Susanna-Assunta Sansone - Honorary Academic Editor; Andrew L Hufton - Managing Editor. Advisory Panel / Supported by: Joseph R. Ecker, Salk Institute, USA; Mark Forster, Syngenta, UK; Stephen Friend, Sage Bionetworks, USA; Pascale Gaudet, Swiss Institute of Bioinformatics, Switzerland; Anne-Claude Gavin, EMBL, Germany; Albert J. R. Heck, Utrecht University, The Netherlands; Wolfram Horstmann, University of Oxford, UK; Johanna McEntyre, EMBL-EBI, European Bioinformatics Institute, UK; Anthony Rowe, Johnson & Johnson, USA; Richard H. Scheuermann, J. Craig Venter Institute, USA; Caroline Shamu, Harvard Medical School, USA; Jessica Tenenbaum, Duke Translational Medicine Institute, USA; Weida Tong, National Center for Toxicological Research, FDA, USA; Judith A. Blake, The Jackson Laboratory, USA; Chris Bowler, IBENS, France; Piero Carninci, RIKEN Omics Science Center, Japan; David Carr, Wellcome Trust, UK; Stephen Chanock, National Cancer Institute, USA; Simon Hodson, Jisc, UK
    13. Contacts. Call for submissions in Fall 2013, launching in Spring 2014 • www.nature.com/scientificdata • Email: scientificdata@nature.com • Twitter: @ScientificData
    14. Evolution
    15. Evolution - SI • Greater accessibility/visibility • Greater discoverability • Currently about to be piloted on: • Nature Structural and Molecular Biology • Nature Cell Biology
    16. Evolution: Source Data. About to be implemented on Nature-branded life science journals. Initially data behind figures. Data Citations
    17. Thank you