BARRIERS TO DATA
SHARING
Cathy Giffi
Director, Strategic Market Analysis
April 20, 2015
About Me
Catherine (Cathy) Giffi is
Director, Strategic Market
Analysis, for Wiley.
Her team of talented analysts
is charged with producing
groundbreaking research on
issues impacting librarians,
societies, and researchers,
including Open Access,
Researcher Workflow, Data
Sharing, Society Member
Benefits, and Reviewer
Benefits.
She holds a Master's degree in
Publishing from NYU and, in
addition to Wiley, has led
large-scale research projects for the
Sundance Film Festival and
VOGUE magazine.
WHY STUDY DATA
SHARING NOW?
Data Sharing Is Older Than Wiley
NIH Recognizes Importance of Sharing
Accessed April 17, 2015
NSF Mandates Data Management Plans
Accessed April 17, 2015
WILEY RESEARCHER
DATA SHARING SURVEY
March 2014
Survey
Responses By
the Numbers
Our objective was to establish a
baseline view of data sharing
practices, attitudes, and
motivations globally, with
participation from researchers in
every scholarly field.
• 90,000
researchers
invited to
participate.
• 3,000
responses
recorded.
• 203 fields of
study were
recorded.
• 85 countries
participated.
• 14 days to
participate.
Key Findings
Most researchers are sharing their data.
Data shared is typically <10 GB.
The most common type of data that is being
shared is flat, tabular data (.csv, .txt, .xl)
Data is usually “archived” on hard drives.
Those not sharing have a variety of
reasons.
Where Did
You Make Your
Data Publicly
Available?
• Supplemental material (67%)
• At a conference (57%)
• Informal paths/by request (42%)
• Personal, institutional, or project webpage (37%)
• Institutional data repository (26%)
• Discipline-specific data repository (19%)
• General purpose data repository, e.g. Dryad,
figshare (6%)
• Other (5%)
Of those surveyed, 66% have made data publicly
available (ever).
Why
Researchers Do
Not Share Data
IP or confidentiality issues (83%)
Research might be “scooped” (70%)
Concerns about misinterpretation (32%)
Insufficient time/resources (32%)
No mandate from Funder/Institution (13%)
Unsure how, where to share (8%)
Variation by Field of Research
Life Science*
• Concerns that their
research will be
scooped (56%)
• Intellectual property or
confidentiality issues
(54%)
• Concerns about
misinterpretation or
misuse (43%)
Health Science
• Intellectual property or
confidentiality issues
(68%)
• Ethical concerns (36%)
• Concerns about
misinterpretation or
misuse (36%)
*Most likely to share data
Variation by Field of Research
Physical Science
• Intellectual property or
confidentiality issues
(47%)
• No funder or institutional
requirement (29%)
• Concerns that their
research will be scooped
(27%)
Social Science & Humanities*
• Intellectual property or
confidentiality issues
(47%)
• Concerns about being
scooped (30%)
• No funder or institutional
requirement (28%)
*Least likely to share data
Variation by Country
Echoes of Our Findings
“[Researchers] cite lack of time, money and universally
agreed upon standards, as well as technical barriers, as
the main reasons they hold data back. Of course, there are
psychological and cultural reasons, too: a sense of
ownership over such a hard-won resource and a fear of
scrutiny and of being ‘scooped.’”
Neurodata Without Borders
August 2014
Echoes of Our Findings
“Twenty potential barriers were identified
and classified in six categories:
Technical, Motivational, Economic,
Political, Legal, Ethical.”
BMC Public Health
November 2014
Takeaways
• Sharing data is crucial for the
advancement of science.
• Recognizing the barriers to
sharing is important.
• Some barriers can be more
easily overcome.
• Others will take the support
of the scholarly community.
For More
Information
To Share or Not to Share, That is the Research Data
Question
Scholarly Kitchen
How and Why Researchers Share Data, and Why
They Don’t
Exchanges
Cathy Giffi
Director, Strategic
Market Analysis, Wiley
cgiffi@wiley.com
THANKS!

Editor's Notes

  • #2 Thank you for having me – I’ve been asked to speak about some of the barriers researchers face in sharing their data. In particular, I’ll be outlining some of the results from a survey Wiley ran last year that identified the what, why, and why not behind data sharing. What data is shared, why it is shared, and why it might not be shared.
  • #3 First, a bit about me. I’m Director of Strategic Market Analysis for Wiley, and I run a group based out of Hoboken that conducts primary research studies on issues impacting the scholarly community. Our research is typically global and multi-disciplinary…as is our company. Founded in 1807, we have more than 1600 scholarly journals in our portfolio in the life, physical, health, and social sciences. We have publishing offices in North America, Australia, Europe, Asia, the Middle East and South America. We work with CODATA, the World Data System, the Research Data Alliance, DataCite and NISO to advance initiatives which will enable research data to be used, re-used, cited, and accredited. We are a core partner of the PREPARDE (Peer REview for Publication & Accreditation of Research data in the Earth sciences) project, which is capturing the processes and procedures required to publish a scientific dataset, ranging from ingestion into a data repository through to formal publication in a data journal.
  • #4 What led to our prioritizing data sharing in the fall of 2013 and winter of 2014?
  • #5 The 208-year-old company I work for is a baby compared with data sharing. One of the earliest records is from the Royal Society, pictured here – for continuity, I have included an image from Todd Vision’s presentation from Research Data Publication Part I, which nicely illustrates the sharing of data (tide information, in this case).
  • #6 There was certainly a turning point when digital research accelerated with the migration of journal content online in the 1990s. By 2003, the NIH recognized the importance of data sharing in an FAQ that has remarkably stood the test of time. The NIH now requires data management plans for certain types of grant proposals.
  • #7 In 2011, NSF began mandating that data management plans be submitted with all grant applications. In October of last year, the Department of Energy began requiring data management plans, and in February of this year, NASA adopted a similar policy. In January of this year, the Gates Foundation mandated data sharing. From Wiley’s perspective, we are interested in advancing initiatives which will enable research data to be used, re-used, cited, and accredited. Understanding and being responsive to the needs of researchers is key for our business.
  • #16 Neurodata Without Borders, a collaborative year-long project, will focus on standardizing a subset of neuroscience data, making this research simpler for scientists to share.
  • #17 Technical barriers mean that metadata was not collected or standards were not followed. Motivational barriers mean that the difficulty of sharing and the chance of being scooped make sharing less appealing. Economic barriers include lack of funding for human and technical resources. Political barriers include lack of trust; for example, the Indonesian government refused to share H5N1 influenza samples with the international community during the 2007 pandemic due to a lack of trust over the potential use of these samples for financial gain. Legal barriers include ownership, copyright, and privacy. Ethical barriers include lack of reciprocity.
  • #18 Development of standards can help remove technical barriers. Legal barriers may be overcome with technology solutions that facilitate copyright protection and alleviate privacy concerns. Political and motivational barriers will be more challenging.