Preparing your data for sharing and publishing

Varsha Khodiyar, PhD
MRC Cognition and Brain Sciences Unit
Open Science Day 20.11.2018
How chameleons change colour

1
What are researchers concerned about when sharing data?
7719 respondents
White paper available from
https://doi.org/10.6084/m9.figshare.5975011
Survey data available from

5
Make sure your data are well organised
• Data files and folders labelled in an understandable way.
• Data files and folders organised in a logical, easy-to-follow manner.
• Any acronyms used for data file/folder names clearly
defined, ideally in a README file.
• Data files in a format that is easy for others to reuse,
ideally the standard format used by your discipline.
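One way to act on the README advice above is to generate the file listing automatically and add the acronym definitions by hand. The following is a minimal sketch, not a Springer Nature tool: the function name `build_readme`, the folder name and the example acronym are all hypothetical.

```python
import tempfile
from pathlib import Path


def build_readme(data_dir, acronyms):
    """Return README text listing every file under data_dir plus an acronym glossary."""
    root = Path(data_dir)
    lines = [f"README for {root.name}", "", "Files:"]
    # List files by their path relative to the dataset folder.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        lines.append(f"  {path.relative_to(root).as_posix()} ({path.stat().st_size} bytes)")
    lines += ["", "Acronyms:"]
    for short, meaning in sorted(acronyms.items()):
        lines.append(f"  {short}: {meaning}")
    return "\n".join(lines) + "\n"


# Demo on a throwaway folder (hypothetical names) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "blood_gas_study"
    (root / "raw").mkdir(parents=True)
    (root / "raw" / "co2_levels.csv").write_text("sample,co2_kpa\n1,5.1\n")
    readme_text = build_readme(root, {"CO2": "carbon dioxide"})

print(readme_text)
```

In practice the returned text would be saved as README.txt at the top of the dataset folder and extended with a short description of each file.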

6
Increasing reproducibility
• Include any additional information needed to understand the data,
methods, parameters, e.g. which instrument (make and model) was
used to measure blood carbon dioxide levels?
• Include availability statements for any code that was used to view,
parse or analyse the data, in support of the conclusions.
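The extra information listed above can be kept in a small machine-readable record stored next to the data. This is a sketch under stated assumptions: the field names and example values are hypothetical and do not follow any particular metadata standard.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DatasetMetadata:
    # All field names are illustrative, not a formal schema.
    title: str
    measurement: str
    instrument_make: str
    instrument_model: str
    code_availability: str

    def missing_fields(self):
        """Names of fields left blank, so gaps are caught before sharing."""
        return [name for name, value in asdict(self).items() if not value.strip()]


record = DatasetMetadata(
    title="Example blood gas measurements",
    measurement="blood carbon dioxide level",
    instrument_make="ExampleCorp",  # placeholder manufacturer
    instrument_model="",            # left blank on purpose to show the check
    code_availability="Analysis scripts are available from the authors on request",
)

print(record.missing_fields())
print(json.dumps(asdict(record), indent=2))
```

The check only flags empty fields; whether the values are detailed enough for someone else to reproduce the work still needs a human read.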

7
Help to organise your data is available
Springer Nature Research Data Support
1. Researchers submit their data files securely
2. The Research Data team curates the data and metadata
3. The data are published and linked to the author’s paper
More information is available on our website here:
http://www.springernature.com/gb/group/data-policy/data-support-services

8
Before data curation: a researcher’s dataset in a desktop folder
• The dataset is stored as an Excel file in a desktop folder
• The file title is not comprehensible to anyone but the creator
• No description or keywords available
• No one other than the creator can access the data, or even knows that it exists

9
Before curation begins
Once received, we check to make sure that the dataset is suitable for our
curation services. Multiple files in any format are accepted.
Pre-curation data checks:
✓ The data aren’t sensitive
✓ The data don’t include direct or indirect human identifiers
✓ The data shouldn’t be in a community repository
After making these checks, we begin the curation process. If necessary,
we may recommend that the dataset is split into smaller groups or collections.

10
After Springer Nature Research Data Support
Working with the researcher’s manuscript or published paper, we draft a comprehensive
metadata record for the dataset which is sent to the researcher for approval before
being published. Embargoes can be applied if necessary.
The curated dataset will be published with
its own metadata record which includes
rich descriptive information, reuse
conditions, licence, DOI, metrics and
keywords
(this example is
415)

11
Choosing a repository to store data

12
Selecting a repository for your data
Considerations:
1. Is there a discipline-specific repository for the type of data you
have generated?
2. Will access to the data need to be controlled?
3. If no discipline-specific repository is available for the data,
does your funder or institute mandate deposition to a
particular repository?
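The three considerations above can be read as a simple decision order. The sketch below is one possible reading, not an official policy: the function name, parameter names and advice wording are hypothetical.

```python
def repository_advice(discipline_repo=None, controlled_access=False, mandated_repo=None):
    """Return a list of suggestions following the order of the considerations above."""
    advice = []
    if discipline_repo:
        # 1. A discipline-specific repository takes priority.
        advice.append(f"Deposit in the discipline-specific repository: {discipline_repo}")
    elif mandated_repo:
        # 3. Otherwise follow any funder or institute mandate.
        advice.append(f"Deposit in the funder- or institute-mandated repository: {mandated_repo}")
    else:
        advice.append("No obvious home: consult a repository index or curated list")
    if controlled_access:
        # 2. Access control is a requirement on whichever repository is chosen.
        advice.append("Check that the repository supports controlled access")
    return advice


example = repository_advice(mandated_repo="Institutional Repository", controlled_access=True)
for line in example:
    print(line)
```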

13
Sources to help choose a data repository
• Indexing services
• Curated lists: www.nature.com/sdata/data-policies/repositories
• NEW! Tools to help select repositories: https://repositoryfinder.test.datacite.org/

15
Collecting sensitive data
• Consider an appropriate patient consent framework
  ▪ Consent to use data in current study
  ▪ Consent to use data for future research
  ▪ Consent to share data for use by other research groups
• Don’t collect more than you need

16
Sharing sensitive data
• Remove direct identifiers
• Aggregate indirect identifiers into groups where possible
• Anonymization or de-identification?
• Use controlled access repositories, and consider:
  ▪ Data use agreement?
  ▪ Data access conditions?
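The first two points on sharing sensitive data, removing direct identifiers and grouping indirect ones, can be sketched in a few lines. The column names and the ten-year age bands below are assumptions chosen for illustration, and a step like this does not by itself make a dataset anonymous.

```python
DIRECT_IDENTIFIERS = {"name", "email", "patient_id"}  # hypothetical column names


def age_band(age, width=10):
    """Map an exact age to a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(rows):
    """Drop direct identifiers and coarsen age in a list of dict records."""
    cleaned = []
    for row in rows:
        kept = {key: value for key, value in row.items() if key not in DIRECT_IDENTIFIERS}
        if "age" in kept:
            # Replace the exact age (an indirect identifier) with a group.
            kept["age_band"] = age_band(int(kept.pop("age")))
        cleaned.append(kept)
    return cleaned


records = [
    {"name": "A. Example", "email": "a@example.org", "age": 34, "co2_kpa": 5.1},
    {"name": "B. Example", "email": "b@example.org", "age": 61, "co2_kpa": 4.8},
]
shared = deidentify(records)
print(shared)
```

Combinations of the remaining fields can still identify someone in a small cohort, which is why the slide pairs these steps with controlled access repositories.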

17
Scholarly credit for generating and sharing research data

18
Data Journals at Springer Nature
Data Descriptor (www.nature.com/scientificdata)
• Open access
• Sound science
• Emphasis on enabling data reuse
• Data peer review
Data Note (https://bmcresnotes.biomedcentral.com)
• Open access
• Sound science
• Short format

19
Scientific Data, a Nature Research journal
• Data Descriptor: primary article type; sound science and facilitates data reuse
• Analysis: new analyses or meta-analyses of existing data
• Article: original reports on advances in data sharing & reuse
• Comment: announcements of broad interest; usually invited
www.nature.com/scientificdata

20
Under the hood of a Data Descriptor
• Context for data generation (background)
• How was data generated?
• How was data processed?
• Where is the data?
• Synthesis
• Analysis
• Conclusions

21
Data peer review
www.nature.com/sdata/policies/for-referees
Experimental Rigor and Technical Data Quality
• Were data produced in a sound manner?
• Technical quality of data: appropriate statistical analyses?
• Experimental rigor: appropriate depth, coverage?
Completeness of the Description
• Sufficient detail to allow others to reproduce these steps?
• Sufficient detail to allow others to reuse this data?
• Consistent with relevant minimum reporting standards?
Integrity of the Data Files and Repository Record
• Do data files appear complete and match manuscript descriptions?
• Are data archived to the most appropriate repository?

22
What types of data can be published?
Any data that the researcher finds valuable and that others might find useful too
• Decades old dataset
• Standalone dataset
• Data that has been used in an analysis article
• Large consortium dataset
• Data from a single experiment
• Data associated with a high impact analysis article

23
When can a data paper be published?
Data papers can be submitted and published at any point in the research
workflow, i.e. whenever it makes most sense for your data
• Before the analysis has been published
• Publication alongside analysis article
• After data analysis has been published
• Authors not intending to analyse data

24
Still unsure about research data?

25
What does research data training offer?
• Directly addresses the main challenges of data sharing
• Part of the Nature Research Academies, offering trusted quality
and value
• A unique perspective and trusted experience within the realm of
research data
• Training for both researchers & information professionals,
appropriate for all levels
• Courses are customised to meet your needs, and are brought to
you (and your researchers)
Springer Nature Research Data Training
Source: https://doi.org/10.6084/m9.figshare.5975011

26
Springer Nature Research Data Helpdesk
• Queries are answered within two business days
• Run by members of the Springer Nature Research Data team
• Expertise in data curation and management, archiving and
digital preservation, copyright and licensing, Open Access
publishing
• Always encourage best practices, e.g. the use of community
repositories for specific data types
Email: researchdata@springernature.com
http://www.springernature.com/gp/group/data-policy/helpdesk

27
The story behind the image
How chameleons change colour
Chameleons are well known for their ability to
change colour, but recent research on panther
chameleons is the first to find two layers of
crystal-containing cells, each with a potentially
different purpose. Researchers from the
University of Geneva have speculated that the
deeper crystal-containing cells may help with the
regulation of temperature, whilst the more
superficial layer of colour-changing cells could be
responsible for camouflage or mating displays.
Thank you for listening
Varsha Khodiyar, PhD
Data Curation Manager, Springer Nature
(Data Curation Editor, Scientific Data)
varsha.khodiyar@nature.com
