2. About Me
Catherine (Cathy) Giffi is
Director, Strategic Market
Analysis, for Wiley.
Her team of talented analysts
is charged with producing
groundbreaking research on
issues impacting librarians,
societies, and researchers,
including Open Access,
Researcher Workflow, Data
Sharing, Society Member
Benefits, and Reviewer
Benefits.
She holds a Master's degree in
Publishing from NYU and, in
addition to Wiley, has led
large-scale research projects for the
Sundance Film Festival and
VOGUE magazine.
8. Survey Responses By the Numbers
Our objective was to establish a
baseline view of data sharing
practices, attitudes, and
motivations globally, with
participation from researchers in
every scholarly field.
• 90,000
researchers
invited to
participate.
• 3,000
responses
recorded.
• 203 fields of
study were
recorded.
• 85 countries
participated.
• 14 days to
participate.
9. Key Findings
Most researchers are sharing their data.
Data shared is typically <10 GB.
The most common type of data that is being
shared is flat, tabular data (.csv, .txt, .xl)
Data is usually “archived” on hard drives.
Those not sharing have a variety of
reasons.
10. Where Did You Make Your Data Publicly Available?
• Supplemental material (67%)
• At a conference (57%)
• Informal paths/by request (42%)
• Personal, institutional, or project webpage (37%)
• Institutional data repository (26%)
• Discipline-specific data repository (19%)
• General purpose data repository, e.g. Dryad,
figshare (6%)
• Other (5%)
Of those surveyed, 66% have made data publicly
available (ever).
11. Why Researchers Do Not Share Data
IP or confidentiality issues (83%)
Research might be “scooped” (70%)
Concerns about misinterpretation (32%)
Insufficient time/resources (32%)
No mandate from Funder/Institution (13%)
Unsure how, where to share (8%)
12. Variation by Field of Research
Life Science*
• Concerns that their
research will be
scooped (56%)
• Intellectual property or
confidentiality issues
(54%)
• Concerns about
misinterpretation or
misuse (43%)
Health Science
• Intellectual property or
confidentiality issues
(68%)
• Ethical concerns (36%)
• Concerns about
misinterpretation or
misuse (36%)
*Most likely to share data
13. Variation by Field of Research
Physical Science
• Intellectual property or
confidentiality issues
(47%)
• No funder or institutional
requirement (29%)
• Concerns that their
research will be scooped
(27%)
Social Science & Humanities*
• Intellectual property or
confidentiality issues
(47%)
• Concerns about being
scooped (30%)
• No funder or institutional
requirement (28%)
*Least likely to share data
15. Echoes of Our Findings
“[Researchers] cite lack of time, money and universally
agreed upon standards, as well as technical barriers, as
the main reasons they hold data back. Of course, there are
psychological and cultural reasons, too: a sense of
ownership over such a hard-won resource and a fear of
scrutiny and of being ‘scooped.’”
Neurodata Without Borders
August 2014
16. Echoes of Our Findings
“Twenty potential barriers were identified
and classified in six categories:
Technical, Motivational, Economic,
Political, Legal, Ethical.”
BMC Public Health
November 2014
17. Takeaways
• Sharing data is crucial for the
advancement of science.
• Recognizing the barriers to
sharing is important.
• Some barriers can be more
easily overcome.
• Others will take the support
of the scholarly community.
18. For More Information
To Share or Not to Share, That is the Research Data Question
Scholarly Kitchen
How and Why Researchers Share Data, and Why They Don’t
Exchanges
Cathy Giffi
Director, Strategic
Market Analysis, Wiley
cgiffi@wiley.com
Thank you for having me – I’ve been asked to speak about some of the barriers researchers face in sharing their data. In particular, I’ll be outlining some of the results from a survey Wiley ran last year that identified the what, why, and why not behind data sharing. What data is shared, why it is shared, and why it might not be shared.
First, a bit about me. I’m director of Strategic Market Analysis for Wiley, and I run a group based out of Hoboken that conducts primary research studies on issues impacting the scholarly community. Our research is typically global and multi-disciplinary…as is our company.
Founded in 1807, we have more than 1600 scholarly journals in our portfolio in the life, physical, health, and social sciences. We have publishing offices in North America, Australia, Europe, Asia, the Middle East and South America.
We work with CODATA, the World Data System, the Research Data Alliance, DataCite and NISO to advance initiatives which will enable research data to be used, re-used, cited, and accredited.
We are a core partner of the PREPARDE (Peer REview for Publication & Accreditation of Research data in the Earth sciences) project, which is capturing the processes and procedures required to publish a scientific dataset, ranging from ingestion into a data repository, through to formal publication in a data journal.
What led to our prioritizing data sharing in the fall of 2013 and winter of 2014?
The 208-year-old company I work for is a baby compared with data sharing.
One of the earliest records is from the Royal Society pictured here – for continuity, I have included an image from Todd Vision’s presentation from Research Data Publication Part I, which nicely illustrates how long data (tide information, in this case) has been shared.
There was certainly a turning point when digital research accelerated with the migration of journal content online in the 1990s. By 2003, the NIH recognized the importance of data sharing in an FAQ that has stood the test of time remarkably well. The NIH now requires data management plans for certain types of grant proposals.
In 2011, NSF began mandating data management plans be submitted with all grant applications. In October of last year, the Department of Energy began requiring data management plans, and in February of this year, NASA adopted a similar policy.
In January of this year, the Gates Foundation mandated data sharing.
From Wiley’s perspective, we are interested in advancing initiatives which will enable research data to be used, re-used, cited, and accredited. Understanding and being responsive to the needs of researchers is key for our business.
Neurodata Without Borders, a collaborative year-long project, will focus on standardizing a subset of neuroscience data, making this research simpler for scientists to share.
Technical barriers meaning metadata was not collected, or standards were not followed.
Motivational barriers meaning the difficulty of sharing and the chance of being scooped make sharing less appealing.
Economic barriers like lack of funding for human and technical resources.
Political barriers like lack of trust. Example: the Indonesian government refused to share H5N1 influenza samples with the international community during the 2007 pandemic due to a lack of trust over the potential use of these samples for financial gain.
Legal barriers like ownership, copyright, and privacy.
Ethical barriers like lack of reciprocity.
Development of standards can help remove technical barriers. Legal barriers may be overcome with technology solutions that facilitate copyright protection and alleviate privacy concerns. Political and motivational barriers will be more challenging.