06/09/2016
Research Data Network Meeting
Understanding researchers’ needs
2016 DAF survey findings – Rob Johnson (Research Consulting) @rschconsulting
The DAF Toolkit
The Data Asset Framework (DAF) toolkit allows institutions to:
»Identify, locate, and describe digital assets
»Assess how they are managed
06/09/2016 Jisc Shared Research Data Pilot Meeting
Towards a refined DAF survey
Research Consulting was tasked with the development of a refined version of the DAF survey.
Timeline, April to August 2016:
»1/4/2016: Analysis of existing DAF surveys from pilot institutions
»25/4/2016: Begin development of a refined DAF survey
»26/5/2016: Pilot institution feedback
»28/6/2016: Final version of the refined DAF survey
»4/7/2016: Launch of the DAF survey at 6 pilot institutions
»5/8/2016: Analysis of the survey results and reporting
Profile of respondents by Institution
»The survey gathered a total of 1,185 responses.
»Share of responses by institution (pie chart):
› The University of Cambridge: 37%
› St Andrews University: 25%
› Plymouth University: 21%
› Lancaster University: 10%
› CREST: 5%
› The Royal College of Music (RCM): 1%
»Full anonymised dataset is now online:
» Johnson, Rob; Chiarelli, Andrea; Parsons, Tom (2016): Data asset framework (DAF) survey results 2016. figshare.
»http://dx.doi.org/10.6084/m9.figshare.3796305
»Or just google it!
Top 10 types of digital research data
Bar chart (x-axis: percentage of respondents, 0% to 80%). Data types shown:
› Audio files (e.g. interviews, music)
› Models/algorithms
› Simulation data, models & software code
› Observational data
› Text files (e.g. .txt)
› Digital photographs and other images
› Data collected from sensors/instruments (e.g. microscopes)
› Data automatically generated from or by computer programs
› Spreadsheets
› Documents or reports (e.g. Word, PDF, etc.)
Research Data Management
40%
37%
23%
No
Yes
Not sure
»Do researchers have a research data management plan?
Sensitive research data
»What types of sensitive data do researchers hold?
Bar chart (x-axis: percentage of respondents, 0% to 25%). Categories shown:
› Patient identifiable data
› Other types of confidential/restricted data
› Commercially sensitive data
› Sensitive personal data
› Personal data about identifiable living individuals
Sensitive research data
“It would be helpful to clarify the rules for storing
anonymised data on cloud services. My departmental rules
say this is never OK, however this seems to contradict
University rules.”
Location of research data
»Professors vs. PGR Students
PGR Students (N=443), bar chart (x-axis: percentage of respondents, 0% to 100%). Locations shown:
› University-managed network storage
› Cloud service – Dropbox
› Hard disk drive of a computer owned by the University
› Hard disk drive of a privately-owned computer
› External hard drive or memory stick/USB/Flash drive
Professors (N=105), bar chart (x-axis: percentage of respondents, 0% to 100%). Locations shown:
› Hard disk drive of a privately-owned computer
› University-managed network storage
› Cloud service – Dropbox
› External hard drive or memory stick/USB/Flash drive
› Hard disk drive of a computer owned by the University
University services to support RDM
“Support is woeful in the university currently, in particular
long-term data archiving is critically required. Most of my
non-current data is rotting on CD's and hard-drives.”
Impacts of research data loss
»17% of respondents had lost data, resulting in…
Bar chart (x-axis: percentage of respondents with lost data, 0% to 80%). Impacts shown:
› Failure to meet regulatory requirements
› Failure to meet funder requirements
› Reputational damage
› Reduction in quality of research outputs
› Delay to publication
› Wasted research effort
Preservation of research data
»How much data has long-term value?
Bar chart comparing two series, "Data owned at present" and "Data expected to have long term value" (x-axis: percentage of respondents, 0% to 40%). Size bands shown:
› Not sure
› More than 1 TB
› 501 GB-1 TB
› 101-500 GB
› 51-100 GB
› <50 GB
Preservation of research data
»How would respondents expect to preserve their data?
Bar chart (x-axis: percentage of respondents, 0% to 80%). Options shown:
› General data repository
› Discipline-specific data repository
› Other (please specify)
› Institutional data repository
Preservation of research data
“I currently spend about £1,200 pa on data storage from my
own salary. I have the highest data needs in my School, and
there is no plan in place for storing my data.”
Preservation of research data
»For how long do respondents expect their data to be preserved?
Bar chart (x-axis: percentage of respondents, 0% to 45%). Periods shown:
› 5-10 years
› >10 years
› I don't know
› 1-5 years
Preservation of research data
»Do you follow guidelines for metadata?
› No: 48%
› Not sure: 34%
› Yes: 18%
Sharing research data
»68% of respondents either already share data or expect to do so in the future. What motivates them?
Bar chart (x-axis: percentage of respondents, 0% to 80%). Motivations shown:
› University Research Data Policy
› Saves time and effort of sharing results with individuals
› My funder requires data sharing
› Safeguards research integrity
› Increases citation and impact
› Verification of research findings
› Potential for others to re-use the data
› Research is a public good and should be open to all
University services to support RDM
»Do researchers use university services to support data management and sharing?
› I don't know what services are available: 35%
› I don't currently use these services, but I expect to in future: 29%
› I already use these services: 16%
› I don't expect to use these services: 10%
› Not sure: 10%
› There are no services available: 0%
Training needs on Research Data Management
Bar chart (x-axis: percentage of respondents, 0% to 50%). Training topics shown:
› Technical support for data processing, e.g. database design, High Performance Computing (HPC)
› Ethics, consent and legal issues with research data
› Copyright and intellectual property rights within a data context
› Funder requirements for research data management
› Guidance on costing data management in grant applications
› Publishing research data
› Security of data
› Collaboration and sharing of data
› Developing a research data management plan for a funding application
› Long-term storage of your data
University services to support RDM
“Please, individualise the support. Workshop are useless,
emails with information are useless, brochures are useless,
posters are useless.”
Lessons learned
»Incentives
› voucher for the first N respondents + prize draw for the rest of the respondents
› a larger number of smaller vouchers
»Dissemination
› direct emails
› weekly staff newsletter
› library blog
› library tweets
› research office/research staff blog
› staff portal
› PGR portal
› link on RDM guidance page/newsletter
› targeted reminders to “missing” departments
Focus groups
»What are the aims of the focus groups?
1. To allow researchers to make Jisc aware of their issues and concerns
2. To collect use cases for the RDSS
3. To inform and stimulate discussion on important data and metadata issues
»Timeline of focus groups
Focus groups
Timeline chart, May to October 2016, with one entry marked "Oct/Nov TBC". Institutions shown:
› University of Surrey
› Lancaster University
› St George's Hospital Medical School
› University of York
› University of St Andrews
› Cardiff University
Focus groups
»Sample use cases:
Business development managers
“I want to create news stories around the data sets, so as to use them as impact case studies.”
Researchers
“I want to be able to encrypt data uploaded to the repository, so sensitive or commercial data can be safely stored.”
“I want to know who is reusing my data, so that I can collaborate and learn more about their use.”
Reusers of data
“I want to know licence and policy for reuse, so that I am clear what I can do with the data.”
Conclusions
»Filling a gap
› 75% of respondents look first to their institution to preserve their data
»Advocacy
› Only 16% of respondents are currently accessing university RDM support services
»Public datasets
› >70% recognise that research is a public good and should be publicly released
»Metadata
› Only 18% of respondents say they follow established metadata guidelines
»Sensitive data
› 41% of respondents have some form of sensitive data
»Uptake of RDM
› Only 40% of respondents have a Research Data Management plan
The DAF dataset
»The data used for this analysis are available as a CSV dataset at:
»http://dx.doi.org/10.6084/m9.figshare.3796305
Contact: rob.johnson@researchconsulting.co.uk
@rschconsulting
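For anyone wanting to reproduce percentages like those on the slides from the CSV export, a minimal Python sketch follows. It is only an illustration under stated assumptions: the column name "rdm_plan" and the sample rows are hypothetical and are not taken from the published dataset, whose actual column headers should be checked first.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the published CSV. The column name
# "rdm_plan" and these rows are illustrative assumptions, not survey data.
SAMPLE_CSV = """respondent_id,rdm_plan
1,Yes
2,No
3,Not sure
4,No
5,
"""


def answer_shares(csv_text, column):
    """Return {answer: percentage of non-blank responses} for one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        answer = (row.get(column) or "").strip()
        if answer:  # skip respondents who left the question blank
            counts[answer] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {answer: round(100 * n / total, 1) for answer, n in counts.items()}


shares = answer_shares(SAMPLE_CSV, "rdm_plan")
for answer, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{answer}: {pct}%")
```

Blank answers are excluded from the denominator here; whether the slide percentages were computed that way is not stated, so that choice is also an assumption.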
Additional slides
Profile of respondents by Role
»The survey respondents had 9 different roles.
› Postgraduate student (e.g. MA, MSc, MEng, PhD, etc.): 38%
› Lecturer/Research Fellow: 18%
› Research Assistant/Post Doc: 16%
› Senior Lecturer/Senior Research Fellow: 9%
› Professor: 9%
› Assistant/Associate Professor: 4%
› Other: 4%
› Administrative/Professional: 1%
› Technician: 1%
Volume of research data
»How much data do researchers hold?
Bar chart comparing three series: Professors (N=105), PGR students (N=442), and all respondents (x-axis: percentage of respondents, 0% to 50%). Size bands shown:
› Not sure
› More than 1 TB
› 501 GB-1 TB
› 101-500 GB
› 51-100 GB
› <50 GB
Location of research data
»Where is research data stored (Top 5)?
Bar chart for all respondents (x-axis: percentage of respondents, 0% to 80%). Locations shown:
› Cloud service – Dropbox
› University-managed network storage
› Hard disk drive of a privately-owned computer
› External hard drive or memory stick/USB/Flash drive
› Hard disk drive of a computer owned by the University
Loss of research data
»Have researchers ever lost data?
› No: 83%
› Yes: 17%
Top 3 causes for loss of data
1. Hardware failure
2. Human error
3. Equipment stolen
Sharing research data
»How do you share data with other researchers?
Bar chart (x-axis: percentage of respondents, 0% to 80%). Methods shown:
› By upload to a web site or FTP server accessible to that researcher
› Institutional file-sharing service
› Share it on an academic social network (e.g. Academia, ResearchGate, Mendeley)
› Using portable storage such as CDs, DVDs, memory sticks etc.
› Using a cloud storage service e.g. Dropbox, Google Drive etc.
› By emailing data files
Editor's Notes
DAF surveys
Pilot focus groups
The Data Asset Framework has been developed with JISC funding in a project led by HATII at the University of Glasgow in conjunction with the Digital Curation Centre.
Response rates:
Plymouth: 12.4%
Cambridge: 11%
Lancaster: 10.3%
St Andrews: 15%
RCM: 33%
Mixture of career stages – 38% Postgrad students, but good mix of postdocs, lecturers, profs
SOPs, genomic data, videos etc
TOP REASONS FOR YES (N=100)
72% Good research practice
53% Required by project funder
TOP REASONS FOR NO (N=171)
47% Not required/appropriate to field of research
45% Not required by project funder
32% Lack of knowledge or experience on creating data management plan
25% Unaware of any tools or guidance that can help create data management plan
59% of 1167 respondents said none of the above.
Charts represent locations where at least SOME data is stored.
All the same across career
Qualitative analysis from FREE TEXT responses:
Hardware failure 43%
Human error 33%
Stolen hardware (e.g. laptop) 6%
Do researchers expect to move their data when a project ends?
44% Yes
39% No
15% Unsure
2% Would delete
Most mentioned repositories:
NCBI Sequence Read Archive (SRA)
GenBank
EMBL Nucleotide Sequence Database (ENA)
Open Science Framework
GitHub
Dryad Digital Repository
OTHER – PLEASE SPECIFY (most common responses)
external hard drive
cloud storage
lab drive
university backup
own backup solution
Period over which the data was collected (All respondents):
32% 1-3 years
26% Within the last 12 months
20% 3-5 years
11% 5-10 years
MORE THAN 1 TB: Most are between 1TB and 10 TB. There are also peaks of 100 TB, 300 TB, 500 TB.
Chart represents locations where at least SOME data is stored.
Use of cloud services (Dropbox, Box, Google Drive, OneDrive)
63% Personal account
22% Combination of personal and institutional cloud services
13% Institutional account
These are the approaches with at least 15% of responses.