Research data management

For SUSPLACE 20 April 2016
Hugo Besemer www.slideshare.net/hugobesemer

Data management planning – for whom?
 For yourself – to make it easier to find things again and
know what they are
 For your colleagues
 For the SUSPLACE program
 For your funder

What data should you store?
 Raw data
 Final data
 Papers
but also
 Intermediate data
 Drafts of papers
 Methods
 Equipment and materials
 Research notes
 ...

What do you choose to store?
 Everything you need to be able to do your work
 Everything your colleagues need to do their work
 Everything required by your funding organisation
 Everything required by your journal
 Everything necessary to reproduce your results

Short term storage – what are the issues?
 Space
 Access
● From where?
● By whom?
 Versioning
 Backups
 Finding it again!

Storage: where?
Personal computer/laptop
  Advantages: always available; portable
  Disadvantages: what if it breaks or is stolen? What if you are ill or away?
  Suitable for: temporary storage
Network drive, managed file servers
  Advantages: regularly backed up and maintained; stored securely; stored centrally
  Disadvantages: costs; may not be accessible from everywhere or by everyone
  Suitable for: master copy (if enough space is provided)
External storage devices (USB, flash etc.)
  Advantages: low cost; portable
  Disadvantages: easily damaged or lost; insecure
  Suitable for: temporary storage
Cloud services (Dropbox, Figshare, SkyDrive etc.)
  Advantages: automatic sync (some services); easy access
  Disadvantages: is it secure? No control over backup procedure
  Suitable for: data sharing
  Question: are there agreements for SUSPLACE?

Storage during research: basic tips
 Versioning
● Use a file in one (online) location as the “master”, and do
all your modifications and processing on copies of that
master
● When you have consolidated your changes and do not
want to lose them, replace the master file with the
consolidated file
● Keep track of ‘milestone files’
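The master-and-copies routine above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names check_out and consolidate, the folder layout and the dated milestone naming are all hypothetical and not part of the workshop material.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def check_out(master: Path, work_dir: Path) -> Path:
    """Copy the master file into a working folder and return the copy."""
    work_dir.mkdir(parents=True, exist_ok=True)
    working_copy = work_dir / master.name
    shutil.copy2(master, working_copy)
    return working_copy


def consolidate(working_copy: Path, master: Path, milestone_dir: Path,
                today: Optional[date] = None) -> Path:
    """Keep the old master as a dated milestone, then replace the master."""
    milestone_dir.mkdir(parents=True, exist_ok=True)
    stamp = (today or date.today()).strftime("%Y%m%d")
    # Assumption: a milestone is the previous master with a yyyymmdd suffix.
    milestone = milestone_dir / f"{master.stem}_{stamp}{master.suffix}"
    shutil.copy2(master, milestone)
    # The consolidated working copy becomes the new master.
    shutil.copy2(working_copy, master)
    return milestone
```

All edits happen on the copy returned by check_out; the master only changes when consolidate is called, and the previous state stays available in the milestone folder.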

Folder structure
DO:
 Stable and scalable
 Think about the interaction with filenames: should an
element be a folder, or part of the filename?
[Figure: two example placements of a project picture]
Project_Files, Pictures, ??: UB_users_mktproj_01032015.tif = project file (picture)
Project_Files, Pictures: UB_users_mktproj_20150103.tif = project file (picture)
Taken from: Data Management Workshop for Researchers
by Tessa Pronk (Utrecht University Library)
If you use, for example, Atlas.ti or NVivo for qualitative
data, the software takes care of some of this

Folder structure
DON'T:
 Too flat or too deep a structure
 Folders with overlapping content

Example: folder structure
From: ‘Setting up an Organised Folder Structure for Research Projects’,
blog post by Nikola Vukovic, 4 June 2014
Don't forget the folder with your
literature (and EndNote or
Mendeley libraries)!
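A project skeleton like this can be created in one go, so every project starts with the same layout. The sketch below is an assumption-laden illustration: the folder names are hypothetical and are not the ones from the blog post; adapt them to your own project.

```python
from pathlib import Path

# Hypothetical folder names, not taken from the blog post.
SKELETON = [
    "01_admin",
    "02_literature",       # papers plus EndNote or Mendeley libraries
    "03_data/raw",
    "03_data/processed",
    "04_analysis",
    "05_writing",
]


def create_skeleton(project_root: Path, folders=SKELETON) -> list:
    """Create each folder under project_root and return the paths."""
    created = []
    for relative in folders:
        path = project_root / relative
        path.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(path)
    return created
```

The numeric prefixes keep the folders in a fixed order when sorted by name, which is one way of keeping the structure stable as the project grows.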

Filename conventions
DO:
 Note in a separate document what the element codes in
your filenames mean
 Keep it short and relevant, about 25 characters.
 Go from generic to specific (handy for sorting and
finding)
 Use ‘_’ or ‘-’
Use fixed elements in your filename:
version number, date, description of content, project
number, name of researcher/team.
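One way to stick to fixed elements is to let a small function build the name. This is a minimal sketch, not a prescribed convention: the function build_filename, the order of the elements, the yyyymmdd date format and the two-digit version suffix are all assumptions made for illustration.

```python
from datetime import date


def build_filename(description: str, project: str, day: date,
                   version: int, ext: str) -> str:
    """Join fixed elements with '_', going from generic to specific."""
    parts = [
        description,             # short code for the content, e.g. "MA"
        project,                 # project number, e.g. "NTC023"
        day.strftime("%Y%m%d"),  # assumed date format: yyyymmdd
        f"v{version:02d}",       # assumed version format: v01, v02, ...
    ]
    return "_".join(parts) + "." + ext.lstrip(".")


print(build_filename("MA", "NTC023", date(2014, 10, 31), 1, "xls"))
# prints MA_NTC023_20141031_v01.xls
```

Whatever elements you choose, record their meaning in the separate document mentioned above.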

How would you name the file?
a. MA_NTC023_20141031.xls
b. MA@NTC#23~20141031.xls
c. MicroArrayData_NetherlandsToxicogenomicsCentreProject023_20141031.xls
d. microarrayntc02320141031.xls
e. MA_NTC023_31102014.xls
f. MA/NTC/Project23/OCT31st/data.xls

DON'T:
 Use special characters (&%$#), dots or whitespace.
 Name your files 'new_version', 'newer_version',
'newest_version'.
 Duplicate files in different folders
 Rely on the metadata your computer stores with the file
TIP: batch renaming software exists for most
operating systems
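Some of these rules can be checked automatically before files are shared or archived. The sketch below is a hypothetical helper, not an existing tool: check_filename, the hard cut-off at 25 characters and the wording of the messages are assumptions, and it does not look at date order or at whether the element codes are documented.

```python
import re

MAX_LENGTH = 25  # "about 25 characters"; the exact cut-off is an assumption
ALLOWED = re.compile(r"^[A-Za-z0-9_-]+$")


def check_filename(name: str) -> list:
    """Return a list of rule violations; an empty list means none found."""
    problems = []
    stem, dot, _ext = name.rpartition(".")
    if not dot:  # no extension at all
        stem = name
    if len(name) > MAX_LENGTH:
        problems.append("longer than about 25 characters")
    if not ALLOWED.match(stem):
        problems.append("special characters, dots or whitespace")
    if "_" not in stem and "-" not in stem:
        problems.append("no '_' or '-' separating the elements")
    return problems


for candidate in ["MA_NTC023_20141031.xls", "MA@NTC#23~20141031.xls"]:
    print(candidate, check_filename(candidate))
```

A check like this only catches the mechanical rules; whether a name is meaningful still needs a human reader.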

Very good vs. less good
a. MA_NTC023_20141031.xls
b. MA@NTC#23~20141031.xls
c. MicroArrayData_NetherlandsToxicogenomicsCentreProject023_20141031.xls
d. microarrayntc02320141031.xls
e. MA_NTC023_31102014.xls
f. MA/NTC/Project23/OCT31st/data.xls

Long term storage or ...
 For WUR: contact our data librarian
(datamanagement.support@wur.nl)
● support with storage in DANS-EASY and 3TU
● advice on other repositories
 find a suitable discipline-specific repository
● provided by journal (e.g. Dryad)
● search re3data.org
 use a free generic repository
● figshare
● Mendeley.Data
● Harvard Dataverse
● Zenodo
Help! I need a DOI for my manuscript!

Documentation
 document your dataset at project, file and parameter
level
 add a readme file
● describe the data that each file contains;
● define column headings and row labels, data codes
(including missing data) and measurement units for
tabular data;
● list whether associated data files are available and if so,
where they're available;
● list whom to contact with questions
 describe the data collection process/method in a
methodology file (or refer to the publication)
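A readme with these elements can be generated from a few structured values, which makes it harder to forget one of them. The following is a minimal sketch under assumptions: build_readme, the layout of the text and every example value (file name, columns, missing data code, contact) are hypothetical.

```python
def build_readme(title, files, columns, missing_code, associated, contact):
    """Assemble a plain-text readme from the elements listed above."""
    lines = [title, "", "Files:"]
    lines += [f"  {name}: {desc}" for name, desc in files.items()]
    lines += ["", "Columns (tabular data):"]
    lines += [f"  {col}: {meaning} [{unit}]"
              for col, (meaning, unit) in columns.items()]
    lines += ["",
              f"Missing data code: {missing_code}",
              f"Associated data: {associated}",
              f"Contact: {contact}"]
    return "\n".join(lines) + "\n"


# Hypothetical example values.
readme = build_readme(
    title="Diet study, intake measurements",
    files={"intake_2015.csv": "daily food intake per participant"},
    columns={"weight_kg": ("body weight", "kg"),
             "protein_g": ("protein intake", "g/day")},
    missing_code="-999",
    associated="none",
    contact="project data manager",
)
print(readme)
```

The resulting text can be saved as a readme file next to the data files it describes.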

For yourself
 For data processing and analysis
 Help in writing reports and papers
 Reference for the future
● Will you still understand it in 2 months, 6 months, 2
years..?

Thank you for
your participation!
More info?
Go to: Wageningen UR Data
Management Support Hub
Or contact us via:
datamanagement.support@wur.nl
And say you're from the WUR-coordinated SUSPLACE

Data documentation
Context is essential!

Example
Study to examine the effects of diet on health
- Conducted over 3 years by 3 researchers – Peter, Lisa
and Anna
There are many ways to organise the data. We will look at
three:
- By researcher
- By year
- By activity

Example
It is now the summer holidays in 2016. Peter and Anna
are on holiday, and Lisa has received some urgent
questions from the reviewers. They need to know:
 the procedure used to produce the high protein diet
 which bureau measured the data
 what sort of preprocessing was carried out on the data.

Organisation by year/researcher
You need to know what was done when, or by whom

Example – Organising by activity
Easy to navigate through, for each question you
quickly find the right folder
- even if you had no prior knowledge.

Example – Organising by activity
Still need to do quite a lot of detective work to find the
information
– have to rely on good names, guesswork, and ...
...read through the content of the files.

Descriptions and links
 Enter a brief description for each activity (folder)
 It may help to identify types of files (e.g. dataset,
procedure, sample, document)
 Linking to items produced in other activities allows you
to:
● follow the workflow
● reuse items
● avoid problems due to multiple copies
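Descriptions and links can be kept in a very simple structure. The sketch below only illustrates the idea: the Item class, the trace function and the three example items are hypothetical, and it assumes the links never form a cycle.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    kind: str         # e.g. dataset, procedure, sample, document
    description: str
    derived_from: list = field(default_factory=list)  # names of linked items


def trace(item_name: str, items: dict) -> list:
    """Follow the links back to the starting items, depth first."""
    chain = [item_name]
    for parent in items[item_name].derived_from:
        chain.extend(trace(parent, items))
    return chain


# Hypothetical items for a diet study.
items = {
    "diet_procedure": Item("diet_procedure", "procedure",
                           "how the high protein diet was produced"),
    "raw_measurements": Item("raw_measurements", "dataset",
                             "measurements as delivered",
                             ["diet_procedure"]),
    "preprocessed_data": Item("preprocessed_data", "dataset",
                              "data after preprocessing",
                              ["raw_measurements"]),
}

print(trace("preprocessed_data", items))
# prints ['preprocessed_data', 'raw_measurements', 'diet_procedure']
```

Because each item is stored once and referred to by name, other activities can reuse it without making another copy.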

Example – Organising by activity plus
descriptions and links
Easy to navigate through, for each question you
quickly find the right folder
- even if you had no prior knowledge.
Descriptions help you to find and understand the
data
Links make the whole process traceable
