Research data management
part 1: what and why
Course 0HV90
TU/e, 22-11-2017
l.osinski@tue.nl, TU/e IEC/Library
Available under a CC BY-SA license, which permits copying and redistributing the material
in any medium or format and adapting the material for any purpose, provided the original
author and source are credited and the adapted material is distributed under the same
license as the original
Research data management [RDM]
what #1
Essence of RDM: “… tracking back to what you did 7 years ago and recovering it (...)
immediately in a re-usable manner.” (Henry Rzepa)
Research data management [RDM]
what #2
RDM: caring for your data with the purpose to:
1. protect their mere existence: data loss, data authenticity
(RDM basics)
2. share them with others
a. for reasons of reuse: in the same context or in a different context; during
research and after research
b. for reasons of reproducibility checks → scientific integrity; data quality
RDM = good data practices that make your data easy to work with,
understandable, and available to other scientists
Source: Research Data
Netherlands / Marina
Noordegraaf
Outline
1. Research data management [RDM]: what and why
2. Sharing your data, or making your data findable and accessible
a. data protection: back up, file naming, organizing data
b. data sharing: via collaboration platforms, data archives
3. Caring for your data, or making your data usable and interoperable
a. tidy data
b. metadata/documentation
c. licenses
d. open data formats
Why sharing research data? #1
• Because you work together with other researchers → collaborative science
• Because of re-using results: data-driven science → open science
• Because of scientific integrity: validating data analysis by reproducibility checks
requires the data and the code used to clean, process and analyze the data and
to produce the final outputs → reproducible science
Additional reasons
• Because your data are unique / not easily repeatable
(long-term observational data)
• Because it’s required… by funders like NWO and the EC, and by
your university
Why sharing research data? #2
TU/e code of scientific conduct
Reasons not to share your data
• Preparing my data for sharing takes time and effort
But research data management also increases your research efficiency
• My data are confidential
But you can anonymize or pseudonymize your data
• My data still need to yield publications
But you can publish your data under an embargo, and by publishing your data you
establish priority and can get credit for it
• My data can be misused or misinterpreted
But the best defense against malicious use is to refer to an archival copy of your
data that is guaranteed to be exactly as you meant it to be
• My data are only interesting for me
But sharing your data may be required by a funder / journal, or your data may be
requested to validate your results
Research data management
part 2: protecting and organizing
your data
RDM: caring for your data with the purpose to:
1. protect their mere existence: data loss, data authenticity
(RDM basics)
2. share them with others
a. for reasons of reuse: in the same context or in a different context;
during research and after research
b. for reasons of reproducibility checks → scientific integrity; data quality
Research data management
Protecting your data #1
good data practices during your research
Be safe
• storage, backup → data safety, protecting against loss
+ 3-2-1 rule: save 3 copies of your data, on 2 different devices, and 1 copy off-site
+ use local ICT infrastructure (university network servers) if possible
• access control → data security, protecting against unauthorized use:
with SURFdrive, Open Science Framework or DataverseNL, for example, you can
arrange access to your data
Protecting your data #2
good data practices during your research
Be organized, or: you (and others) should be able to tell
what’s in a file without opening it
• file naming
• organizing data in folders
“…we can copy everything and do not manage it well.” (Indra Sihar)
File naming #1
File naming
Good file names are:
• consistent (use file-naming conventions)
• unique (distinguishing a file from files with similar subjects as well as from
different versions of the file), and
• meaningful (use descriptive names).
+ Avoid special characters in file names
+ Include dates (format YYYYMMDD) and a version number in file names
Source: Best practices for file naming (Stanford University Libraries)
File naming #2
• Order by date:
2013-04-12_interview-recording_THD.mp3
2013-04-12_interview-transcript_THD.docx
2012-12-15_interview-recording_MBD.mp3
2012-12-15_interview-transcript_MBD.docx
• Order by subject:
MBD_interview-recording_2012-12-15.mp3
MBD_interview-transcript_2012-12-15.docx
THD_interview-recording_2013-04-12.mp3
THD_interview-transcript_2013-04-12.docx
• Order by type:
Interview-recording_MBD_2012-12-15.mp3
Interview-recording_THD_2013-04-12.mp3
Interview-transcript_MBD_2012-12-15.docx
Interview-transcript_THD_2013-04-12.docx
• Forced order with numbering:
01_THD_interview-recording_2013-04-12.mp3
02_THD_interview-transcript_2013-04-12.docx
03_MBD_interview-recording_2012-12-15.mp3
04_MBD_interview-transcript_2012-12-15.docx
Think about the ordering of elements in a filename
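Whatever ordering you choose, it helps to generate names programmatically rather than by hand. Below is a minimal R sketch of such a helper; the function name, the date_subject_type ordering and the version suffix are illustrative assumptions, not a prescribed TU/e or TIER convention.

```r
# Build a file name as <ISO date>_<subject>_<type>_v<nn>.<ext>
# (hypothetical helper; adapt the ordering to your own convention)
make_filename <- function(date, subject, type, version = 1, ext = "docx") {
  date <- format(as.Date(date))                    # normalise to YYYY-MM-DD
  subject <- gsub("[^A-Za-z0-9-]", "-", subject)   # avoid special characters
  sprintf("%s_%s_%s_v%02d.%s", date, subject, type, version, ext)
}

make_filename("2013-04-12", "THD", "interview-transcript")
#> [1] "2013-04-12_THD_interview-transcript_v01.docx"
```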
Organizing your data in folders #1
Source: Haselager, G.J.T., Aken, M.A.G. van
(2000): Personality and Family
Relationships. DANS.
http://dx.doi.org/10.17026/dans-xk5-y7vc
Organizing your data in folders #2
TIER documentation protocol
Guiding principles of TIER documentation protocol
1. keep your raw or original data raw
+ save your raw data read-only in its original format in a separate folder
+ make a working copy of your raw data (input data, used for
processing and analysis)
2. keep the command files (files containing code written in the syntax of the
(statistical) software you use for the study) apart from the data
3. keep the analysis files (the fully cleaned and processed data files that you
use to generate the results reported in your paper) in a separate folder
4. store the metadata (codebook, description of variables, etc.) in a separate
folder, apart from the data itself
Organizing your data in folders #3
TIER documentation protocol
1. Main project folder (name of your research project/working title of your
paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
1.2.3. Analysis files
1.3. Documents
1.4. Literature
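The skeleton above can be created in one go at the start of a project; here is a minimal R sketch (the slugged folder names and the project root "my-project" are placeholders, rename them to match your own project):

```r
# Create a TIER-style folder skeleton (names mirror the outline above)
folders <- c(
  "original-data-and-metadata/original-data",
  "original-data-and-metadata/metadata/supplements",
  "processing-and-analysis-files/importable-data-files",
  "processing-and-analysis-files/command-files",
  "processing-and-analysis-files/analysis-files",
  "documents",
  "literature"
)
paths <- file.path("my-project", folders)   # "my-project" is a placeholder
invisible(lapply(paths, dir.create, recursive = TRUE, showWarnings = FALSE))
```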
1. Main project folder (name of your research project/working title of your
paper)
1.1. Original data and metadata
1.1.1. Original data (raw data, obtained/gathered data)
Any data that were necessary for any part of the processing and/or
analysis you reported in your paper.
Copies of all your original data files, saved in exactly the format in which
you first obtained them. The names of the original data files may be
changed.
Keep these data read-only!
1.1.2. Metadata
1.1.2.1. Supplements
Organizing your data in folders #4
TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
The Metadata Guide: document that provides information about each of your
original data files. Applies especially to obtained data files
• A bibliographic citation of each original data file, including the date you
downloaded or obtained it and any unique identifiers that have been assigned to it
• Information about how to obtain a copy of the original data file
• Whatever additional information is needed to understand and use the data in the
original data file
1.1.2.1. Supplements
Additional information about an original data file that is not written by
you but is found in existing supplementary documents, such as
users’ guides and code books that accompany the original data file
Organizing your data in folders #5
TIER documentation protocol
Organizing your data in folders #6
TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files (the data you work with, input data, suitable for
processing and analysis)
A corresponding version for each of the original data files. This version can be identical
to the original version, or in some cases it will be a modified version.
For example, modifications required to allow your software to read the file (converting
the file to another format, removing unusable data or explanatory notes from a table)
• The original and importable versions of a data file should be given different names
• The importable data file should be as nearly identical to the original as possible
• The changes you make to your original data files to create the corresponding
importable data files should be described in a Readme file
1.2.2. Command files
1.2.3. Analysis files
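For example, if an original file arrived as an Excel workbook, creating its importable counterpart might look like the R sketch below; the readxl package, the file names and the skipped header rows are assumptions made for illustration, and the exact changes should be recorded in your Readme file.

```r
# Hypothetical example: derive an importable .csv from an original .xlsx,
# leaving the original file untouched and giving the copy a different name.
library(readxl)                                   # assumption: readxl is installed

original   <- "original-data/survey-2017.xlsx"    # placeholder file names
importable <- "importable-data-files/survey-2017.csv"

raw <- read_excel(original, skip = 2)             # e.g. drop two explanatory rows
write.csv(raw, importable, row.names = FALSE, na = "NA")
```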
Organizing your data in folders #7
TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
One or more files containing code written in the syntax of the (statistical) software you use
for the study
• Importing phase: commands to import or read the files and save them in a format that
suits your software
• Processing phase: commands that execute all the processing required to transform the
importable version of your files into the final data files that you will use in your analysis
(i.e. cleaning, recoding, joining two or more data files, dropping variables or cases,
generating new variables)
• Generating the results: commands that open the analysis data file(s), and then
generate the results reported in your paper.
1.2.3. Analysis files
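A set of command files covering these three phases could be laid out roughly as sketched below in R; the file and variable names are invented for illustration, and any other scripted language would serve just as well.

```r
## 01_import.R -- importing phase: read the importable file, save in R's own format
survey <- read.csv("importable-data-files/survey-2017.csv", na.strings = "NA")
saveRDS(survey, "importable-data-files/survey-2017.rds")

## 02_process.R -- processing phase: clean, recode and derive the analysis file
survey <- readRDS("importable-data-files/survey-2017.rds")
survey <- subset(survey, !is.na(age))                    # drop cases with missing age
survey$age_group <- cut(survey$age, c(0, 30, 60, Inf))   # generate a new variable
saveRDS(survey, "analysis-files/survey-analysis.rds")

## 03_results.R -- generating the results reported in the paper
analysis <- readRDS("analysis-files/survey-analysis.rds")
summary(lm(score ~ age_group, data = analysis))          # a reported result
```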
Organizing your data in folders #8
TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
1.2.3. Analysis files
• The fully cleaned and processed data files that you use to generate the
results reported in your paper
• The Data Appendix: a codebook for your analysis data files: a brief description
of the analysis data file(s), a complete definition of each variable (including
coding and/or units of measurement), the name of the original data files
from which the variable was extracted, the number of valid observations for
the variable, and the number of cases with missing values
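Part of such a Data Appendix can be generated directly from the analysis file; the R sketch below tabulates, per variable, its type and its numbers of valid and missing observations (the file names are the hypothetical ones used earlier; variable definitions and units still have to be written by hand).

```r
# Generate the quantitative part of a codebook for an analysis data file
analysis <- readRDS("analysis-files/survey-analysis.rds")   # placeholder name

codebook <- data.frame(
  variable = names(analysis),
  type     = vapply(analysis, function(x) class(x)[1], character(1)),
  valid    = vapply(analysis, function(x) sum(!is.na(x)), integer(1)),
  missing  = vapply(analysis, function(x) sum(is.na(x)), integer(1))
)
write.csv(codebook, "original-data-and-metadata/metadata/codebook.csv",
          row.names = FALSE)
```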
Organizing your data in folders #9
TIER documentation protocol
1. Main project folder (name of your research project/working title of your paper)
1.1. Original data and metadata
1.1.1. Original data
1.1.2. Metadata
1.1.2.1. Supplements
1.2. Processing and analysis files
1.2.1. Importable data files
1.2.2. Command files
1.2.3. Analysis files
1.3. Documents
• An electronic copy of your complete final paper
• The Readme file for your replication documentation, stating:
+ which statistical software or other computer programs are needed to run the
command files
+ the structure of the hierarchy of folders in which the documentation is stored
+ precisely which changes you made to your original data files to create the
corresponding importable data files
+ step-by-step instructions for using your documentation to replicate the
statistical results reported in your paper
1.4. Literature
• Retrieved relevant literature
Research data management
part 3: sharing your data
RDM: caring for your data with the purpose to:
1. protect their mere existence: data loss, data authenticity
(RDM basics)
2. share them with others
a. for reasons of reuse: in the same context or in a different context;
during research and after research
b. for reasons of reproducibility checks → scientific integrity; data quality
Research data management
[Diagram: services arranged by phase (during research / after research) and by provider (institution / discipline / local ICT services)]
Overview research data sharing
and storage services
Data sharing per se is pretty straightforward
Sharing your data
during your research versus after your research
During your research via collaboration or sharing platforms
• Data sharing is (more) aimed at collaboration, at working
together on data
• Being able to control access to data is crucial
After your research via data archives or repositories
• Publishing and keeping an archival copy of data is essential
• Long-term preservation of your data is important
Sharing your data
during your research: general data sharing platforms
General data sharing platforms:
• SURFdrive [TU/e only]: Dutch academic Dropbox, 250 GB of storage,
maximum data transfer 16 GB
Students can request a SURFdrive account
• Google Drive, Dropbox: don’t use these to store sensitive data
Sharing your data
during your research: DataverseNL
DataverseNL [TU/e only]: data sharing platform for active
research data [based on Harvard’s Dataverse Project] where you
can:
• store your data in an organized and safe way
• clearly describe your data
• keep your data under version control
• arrange access to your data
Students may use DataverseNL
Go to: https://dataverse.nl/
Click ‘Log in’ (at the top right); under Institutional
account click SURFconext
Select Eindhoven University of Technology and log
on with your TU/e username and password
When asked for it, give permission to share your
data by answering Yes or clicking this tab
When asked to create an account, answer Yes or
click this tab
Once you have created an account, your
username is the prefix of your email address
Click 4TU dataverse → Eindhoven dataverse → Add
data: you can now create and publish data sets,
upload files and assign access rights to data sets or
files.
Sharing your data
during your research: Open Science Framework
On request
“I'd like to thank E.J. Masicampo and Daniel LaLande for sharing and allowing me to share
their data…”
Daniël Lakens (2014), What p-hacking really looks like: A comment on Masicampo & LaLande (2012)
On a (personal) website
“Let me start by saying that the reason why I put all excel files online, including all the
detailed excel formulas about data constructions and adjustments, is precisely because I
want to promote an open and transparent debate about these important and sensitive
measurement issues.”
Thomas Piketty, My response to the Financial Times, HuffPost The Blog, 29-05-2014 ;
originally published as Addendum: Response to FT, 28-05-2014
A data journal
Journal of open psychology data, Geoscience data journal,
Data in brief, Scientific data, Data reports
Sharing your data
after your research
Source: www.aukeherrema.nl
Sharing your data
after your research has ended, via an archive or repository
Choose a repository where other researchers in your discipline are sharing
their data
If none is available, use a multidisciplinary or general repository that at least
assigns a persistent identifier (DOI) to your data and requires that you provide
adequate metadata
• for example Zenodo, Figshare, DANS, B2SHARE, or:
• 4TU.ResearchData
+ DOIs [persistent identifiers for citability and retrievability]
+ open access
+ long-term availability, Data Seal of Approval
+ self-upload (single data sets < 3 GB)
Link your data to your publication
Sharing your data
link your data to your publication
Archiving your data
after your research has ended
The Human Technology Interaction group of the department of
Industrial Engineering & Innovation Sciences requires that an
archival copy of the results (‘replication package’) be
deposited in ARCHIE
• the copy cannot be changed
• the copy is only accessible to the researcher
Research data management
part 4: making your data usable
The welfare consequences… / by Jonathan J. Cooper et al.,
http://dx.doi.org/10.1371/journal.pone.0102722
• Word.doc
Before data can be reusable, they first have to be usable
Research data management
Making your data usable and interoperable with good data practices
• tidy data
• metadata/documentation
• licenses
• open data formats
Tidy data
what
Tidy data is about the structure of a table / data set.
Tidy data ≠ clean data. It’s a step towards clean data
+ Each variable you measure is in one column
+ Column headers are variable names
+ Each observation is in a different row
+ Every cell contains only one piece of information
Tidy data
why: making your data easy to handle for computers
Tidy data allow your data to be easily:
+ imported by data management systems
+ analyzed by analysis software
+ visualized, modelled, transformed
+ combined with other data (interoperability)
Tidy data versus messy data

Tidy data:
1. Each variable you measure is in one column
2. Column headers are variable names
3. Each observation is in a different row
4. Every cell contains only one piece of information

Messy data:
1. More than one variable in a single column (‘clumped data’)
2. Column headers are values, or: one variable over many columns (‘wide data’)
3. Variables are in rows and columns
4. More pieces of information in one cell (cells are highlighted or colored;
values and measurement units in one cell)
Tidy data versus messy data
example

‘Wide’ data: one variable over many columns

patient_id  drug_a  drug_b
1           67      56
2           80      90
3           64      50
4           85      75

Tidy data

patient_id  drug  heart_rate
1           a     67
2           a     80
3           a     64
4           a     85
1           b     56
2           b     90
3           b     50
4           b     75
Tools for tidying data
OpenRefine
• download OpenRefine: http://openrefine.org/download.html
• runs on your computer (not in the cloud), inside the Firefox browser (not in
IE); no web connection is needed
• captures all steps done to your raw data; the original data set is not modified;
steps are easily reversed
R, tidyr package
• a scripted language (R (free), Matlab, SAS…) to process data (tidying,
cleaning, etc.), run the analysis and produce the final outputs
versus
• Excel: data provenance and documentation of data processing with a
graphical user interface is bad because it doesn’t leave a record
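As an illustration, reshaping the ‘wide’ patient table shown above into its tidy form takes a single tidyr call; a minimal sketch, assuming the tidyr package is installed:

```r
library(tidyr)

# The 'wide' table: one column per drug, heart rates in the cells
wide <- data.frame(
  patient_id = 1:4,
  drug_a = c(67, 80, 64, 85),
  drug_b = c(56, 90, 50, 75)
)

# Tidy it: one observation per row, with 'drug' and 'heart_rate' as variables
tidy <- pivot_longer(wide,
                     cols = c(drug_a, drug_b),
                     names_to = "drug",
                     names_prefix = "drug_",
                     values_to = "heart_rate")
tidy   # 8 rows: drug "a" and "b" for each patient_id
```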
Documentation / metadata
making your data understandable for humans #1
The table or data set itself
+ columns: use clear, descriptive variable names (no hard-to-understand
abbreviations); avoid special characters (they can cause problems with some
software)
+ rows: if possible, use standard names within cells (derived from a taxonomy,
for example: standard species names, the CAS registry for chemical
substances, standard date formats, …)
+ try to avoid coding categorical or ordinal data as numbers
+ missing data: use NA
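For example, if missing values arrive coded in several ad-hoc ways, they can be normalised to NA at import time; a minimal R sketch (the file name and the codes "" and "-99" are assumptions for illustration):

```r
# Read a table and treat the listed strings as missing values (NA)
obs <- read.csv("importable-data-files/observations.csv",
                na.strings = c("NA", "", "-99"))
colSums(is.na(obs))   # number of missing values per variable
```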
Documentation / metadata
making your data understandable for humans #2
The table or data set as a whole
A description (documentation) that at least mentions:
+ the size of the data set: number of observations and variables
+ information about the variables and their measurement units
(code book)
+ what’s included in and excluded from the data set, and why data are
missing
+ a description of how you collected the data (study design) and of the
data manipulation steps (provenance)
+ when your data consist of multiple files organized in a folder
structure, an explanation of the structure and naming of the files
“Research outputs that are poorly documented are like canned goods with the label
removed (…)” (Carly Strasser)
Documentation / metadata
metadata standards
Sometimes there are metadata standards for the
documentation of your data set, but where no standard
exists, a simple readme file can be good enough
Raw data:
https://www.amstat.org/publications/jse/datasets/titanic.dat.txt
Documentation accompanying the data:
https://www.amstat.org/publications/jse/datasets/titanic.txt
Based on: The "Unusual Episode" Data Revisited / by Robert J. MacG. Dawson,
in: Journal of Statistics Education, vol. 3 (1995), issue 3
But sometimes
data sets are so
complex that a
readme file is
insufficient
Documentation / metadata
making your data findable for humans and search engines
Descriptive metadata, mainly for the discovery and identification of
your data:
+ creator
+ title
+ short description + key words
+ date(s) of data collection
+ publication year
+ related publications
+ DOI (assigned by data archive)
+ etc.
When uploading your data to a data archive like 4TU.ResearchData,
you will be asked to enter these metadata.
A DOI is assigned by the data archive.
User license
making clear that other people are allowed to use your data
Let other people know in advance what they are
allowed to do with your data by attaching a user license
to it
+ Creative Commons license for data sets
+ GNU General Public License (GPL) for software
+ License selector
Open data formats
ensuring the ‘longevity’ of your data
+ open (non-proprietary) data formats best ensure that the data
will remain usable and ‘legible’ for computers in the future
+ they are easy to use in a variety of software, e.g. .csv for
tabular data
+ check the data formats that are supported by a data
archive like 4TU.ResearchData
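For tabular data this can be as simple as exporting the analysis files to .csv next to (or instead of) the software-specific format; a minimal R sketch with placeholder file names:

```r
# Export an analysis data frame to an open, plain-text format (.csv, UTF-8)
analysis <- readRDS("analysis-files/survey-analysis.rds")   # placeholder name
write.csv(analysis, "analysis-files/survey-analysis.csv",
          row.names = FALSE, na = "NA", fileEncoding = "UTF-8")
```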
Research data management
recommended reading on ‘good data practices’
1. Goodman, A., Pepe, A., Blocker, A.W., et al. (2014) Ten simple rules for the care
and feeding of scientific data. PLOS Computational Biology, 10(4), e1003542.
https://doi.org/10.1371/journal.pcbi.1003542
2. Eugene Barsky (2017), Good enough research data management: a very brief
guide
3. Dynamic Ecology (2016), Ten commandments for good data management.
https://dynamicecology.wordpress.com/2016/08/22/ten-commandments-for-good-data-management/
4. Ellis, S.E., Leek, J.T. (2017) How to share data for collaboration. PeerJ
Preprints, 5:e3139v1. https://doi.org/10.7287/peerj.preprints.3139v1
5. Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., Teal, T.K. (2017) Good enough
practices in scientific computing. PLOS Computational Biology, 13(6): e1005510.
https://doi.org/10.1371/journal.pcbi.1005510
Support
Data Coach [website]
Online course with web lectures
More Related Content

What's hot

The Brain Imaging Data Structure and its use for fNIRS
The Brain Imaging Data Structure and its use for fNIRSThe Brain Imaging Data Structure and its use for fNIRS
The Brain Imaging Data Structure and its use for fNIRS
Robert Oostenveld
 
The Donders Repository
The Donders RepositoryThe Donders Repository
The Donders Repository
Robert Oostenveld
 
Good Practice in Research Data Management
Good Practice in Research Data ManagementGood Practice in Research Data Management
Good Practice in Research Data Management
Historic Environment Scotland
 
Organizing EEG data using the Brain Imaging Data Structure
Organizing EEG data using the Brain Imaging Data Structure Organizing EEG data using the Brain Imaging Data Structure
Organizing EEG data using the Brain Imaging Data Structure
Robert Oostenveld
 
Brain Imaging Data Structure and Center for Reproducible Neuroscince
Brain Imaging Data Structure and Center for Reproducible NeuroscinceBrain Imaging Data Structure and Center for Reproducible Neuroscince
Brain Imaging Data Structure and Center for Reproducible Neuroscince
Krzysztof Gorgolewski
 
Brain Imaging Data Structure
Brain Imaging Data StructureBrain Imaging Data Structure
Brain Imaging Data Structure
Krzysztof Gorgolewski
 
Using Open Science to advance science - advancing open data
Using Open Science to advance science - advancing open data Using Open Science to advance science - advancing open data
Using Open Science to advance science - advancing open data
Robert Oostenveld
 
Research data management for historians
Research data management for historiansResearch data management for historians
Research data management for historians
Jessica Parland-von Essen
 
Donders neuroimage toolkit - open science and good practices
Donders neuroimage toolkit -  open science and good practicesDonders neuroimage toolkit -  open science and good practices
Donders neuroimage toolkit - open science and good practices
Robert Oostenveld
 
Data Archiving and Processing
Data Archiving and ProcessingData Archiving and Processing
Data Archiving and Processing
CRRC-Armenia
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchers
Jez Cope
 
Data carving using artificial headers info sec conference
Data carving using artificial headers   info sec conferenceData carving using artificial headers   info sec conference
Data carving using artificial headers info sec conference
Robert Daniel
 
Conquering Chaos in the Age of Networked Science: Research Data Management
Conquering Chaos in the Age of Networked Science: Research Data ManagementConquering Chaos in the Age of Networked Science: Research Data Management
Conquering Chaos in the Age of Networked Science: Research Data Management
Kathryn Houk
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?
DataONE
 
DataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management Planning
DataONE
 
Donders Repository - removing barriers for management and sharing of research...
Donders Repository - removing barriers for management and sharing of research...Donders Repository - removing barriers for management and sharing of research...
Donders Repository - removing barriers for management and sharing of research...
Robert Oostenveld
 
Reuse of Repository Data
Reuse of Repository DataReuse of Repository Data
Reuse of Repository Data
Valerie Enriquez
 
Introduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate studentsIntroduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate students
Marieke Guy
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
DataONE
 
Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...
GarethKnight
 

What's hot (20)

The Brain Imaging Data Structure and its use for fNIRS
The Brain Imaging Data Structure and its use for fNIRSThe Brain Imaging Data Structure and its use for fNIRS
The Brain Imaging Data Structure and its use for fNIRS
 
The Donders Repository
The Donders RepositoryThe Donders Repository
The Donders Repository
 
Good Practice in Research Data Management
Good Practice in Research Data ManagementGood Practice in Research Data Management
Good Practice in Research Data Management
 
Organizing EEG data using the Brain Imaging Data Structure
Organizing EEG data using the Brain Imaging Data Structure Organizing EEG data using the Brain Imaging Data Structure
Organizing EEG data using the Brain Imaging Data Structure
 
Brain Imaging Data Structure and Center for Reproducible Neuroscince
Brain Imaging Data Structure and Center for Reproducible NeuroscinceBrain Imaging Data Structure and Center for Reproducible Neuroscince
Brain Imaging Data Structure and Center for Reproducible Neuroscince
 
Brain Imaging Data Structure
Brain Imaging Data StructureBrain Imaging Data Structure
Brain Imaging Data Structure
 
Using Open Science to advance science - advancing open data
Using Open Science to advance science - advancing open data Using Open Science to advance science - advancing open data
Using Open Science to advance science - advancing open data
 
Research data management for historians
Research data management for historiansResearch data management for historians
Research data management for historians
 
Donders neuroimage toolkit - open science and good practices
Donders neuroimage toolkit -  open science and good practicesDonders neuroimage toolkit -  open science and good practices
Donders neuroimage toolkit - open science and good practices
 
Data Archiving and Processing
Data Archiving and ProcessingData Archiving and Processing
Data Archiving and Processing
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchers
 
Data carving using artificial headers info sec conference
Data carving using artificial headers   info sec conferenceData carving using artificial headers   info sec conference
Data carving using artificial headers info sec conference
 
Conquering Chaos in the Age of Networked Science: Research Data Management
Conquering Chaos in the Age of Networked Science: Research Data ManagementConquering Chaos in the Age of Networked Science: Research Data Management
Conquering Chaos in the Age of Networked Science: Research Data Management
 
DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?
 
DataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management PlanningDataONE Education Module 03: Data Management Planning
DataONE Education Module 03: Data Management Planning
 
Donders Repository - removing barriers for management and sharing of research...
Donders Repository - removing barriers for management and sharing of research...Donders Repository - removing barriers for management and sharing of research...
Donders Repository - removing barriers for management and sharing of research...
 
Reuse of Repository Data
Reuse of Repository DataReuse of Repository Data
Reuse of Repository Data
 
Introduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate studentsIntroduction to Research Data Management for postgraduate students
Introduction to Research Data Management for postgraduate students
 
DataONE Education Module 07: Metadata
DataONE Education Module 07: MetadataDataONE Education Module 07: Metadata
DataONE Education Module 07: Metadata
 
Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...
 

Similar to Research data management: course 0HV90, Behavioral Research Methods

Research data management : [part of] PROOF course Finding and controlling sci...
Research data management : [part of] PROOF course Finding and controlling sci...Research data management : [part of] PROOF course Finding and controlling sci...
Research data management : [part of] PROOF course Finding and controlling sci...
Leon Osinski
 
Research data management : Open Research Data pilot, data management (plans),...
Research data management : Open Research Data pilot, data management (plans),...Research data management : Open Research Data pilot, data management (plans),...
Research data management : Open Research Data pilot, data management (plans),...
Leon Osinski
 
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
kulibrarians
 
Documentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM BootcampDocumentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM Bootcamp
Sherry Lake
 
Course Research Data Management
Course Research Data ManagementCourse Research Data Management
Course Research Data Management
Maarten Van Bentum
 
A Guide for Reproducible Research
A Guide for Reproducible ResearchA Guide for Reproducible Research
A Guide for Reproducible Research
Yasmin AlNoamany, PhD
 
Introduction to RDM for trainee physicians
Introduction to RDM for trainee physiciansIntroduction to RDM for trainee physicians
Introduction to RDM for trainee physicians
Historic Environment Scotland
 
Data Science Process.pptx
Data Science Process.pptxData Science Process.pptx
Data Science Process.pptx
WidsoulDevil
 
Data Management for Undergraduate Researchers
Data Management for Undergraduate ResearchersData Management for Undergraduate Researchers
Data Management for Undergraduate Researchers
Rebekah Cummings
 
Data management
Data management Data management
Data management
Graça Gabriel
 
Degonto file management
Degonto file managementDegonto file management
Degonto file management
Degonto Islam
 
Keep Calm and Curate
Keep Calm and CurateKeep Calm and Curate
Keep Calm and Curate
GarethKnight
 
A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...
Leon Osinski
 
Research data management during and after your research ; an introduction / L...
Research data management during and after your research ; an introduction / L...Research data management during and after your research ; an introduction / L...
Research data management during and after your research ; an introduction / L...
Leon Osinski
 
Effective research data management
Effective research data managementEffective research data management
Effective research data management
Catherine Gold
 
Make your data great now
Make your data great nowMake your data great now
Make your data great now
Daniel JACOB
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycle
Marieke Guy
 
Data Life Cycle
Data Life CycleData Life Cycle
Data Life Cycle
Jason Henderson
 
Managing the research life cycle
Managing the research life cycleManaging the research life cycle
Managing the research life cycle
Sherry Lake
 
Data management
Data management Data management

Similar to Research data management: course 0HV90, Behavioral Research Methods (20)

Research data management : [part of] PROOF course Finding and controlling sci...
Research data management : [part of] PROOF course Finding and controlling sci...Research data management : [part of] PROOF course Finding and controlling sci...
Research data management : [part of] PROOF course Finding and controlling sci...
 
Research data management : Open Research Data pilot, data management (plans),...
Research data management : Open Research Data pilot, data management (plans),...Research data management : Open Research Data pilot, data management (plans),...
Research data management : Open Research Data pilot, data management (plans),...
 
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
 
Documentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM BootcampDocumentation and Metdata - VA DM Bootcamp
Documentation and Metdata - VA DM Bootcamp
 
Course Research Data Management
Course Research Data ManagementCourse Research Data Management
Course Research Data Management
 
A Guide for Reproducible Research
A Guide for Reproducible ResearchA Guide for Reproducible Research
A Guide for Reproducible Research
 
Introduction to RDM for trainee physicians
Introduction to RDM for trainee physiciansIntroduction to RDM for trainee physicians
Introduction to RDM for trainee physicians
 
Data Science Process.pptx
Data Science Process.pptxData Science Process.pptx
Data Science Process.pptx
 
Data Management for Undergraduate Researchers
Data Management for Undergraduate ResearchersData Management for Undergraduate Researchers
Data Management for Undergraduate Researchers
 
Data management
Data management Data management
Data management
 
Degonto file management
Degonto file managementDegonto file management
Degonto file management
 
Keep Calm and Curate
Keep Calm and CurateKeep Calm and Curate
Keep Calm and Curate
 
A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...A basic course on Research data management, part 4: caring for your data, or ...
A basic course on Research data management, part 4: caring for your data, or ...
 
Research data management during and after your research ; an introduction / L...
Research data management during and after your research ; an introduction / L...Research data management during and after your research ; an introduction / L...
Research data management during and after your research ; an introduction / L...
 
Effective research data management
Effective research data managementEffective research data management
Effective research data management
 
Make your data great now
Make your data great nowMake your data great now
Make your data great now
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycle
 
Data Life Cycle
Data Life CycleData Life Cycle
Data Life Cycle
 
Managing the research life cycle
Managing the research life cycleManaging the research life cycle
Managing the research life cycle
 
Data management
Data management Data management
Data management
 

More from Leon Osinski

Articles and research data : DML Update, 08-10-2020
Articles and research data : DML Update, 08-10-2020Articles and research data : DML Update, 08-10-2020
Articles and research data : DML Update, 08-10-2020
Leon Osinski
 
PROOF course Writing articles and abstracts in English, part: Copyright in ac...
PROOF course Writing articles and abstracts in English, part: Copyright in ac...PROOF course Writing articles and abstracts in English, part: Copyright in ac...
PROOF course Writing articles and abstracts in English, part: Copyright in ac...
Leon Osinski
 
Research data management at TU Eindhoven
Research data management at TU EindhovenResearch data management at TU Eindhoven
Research data management at TU Eindhoven
Leon Osinski
 
How to make your research data open : presentation held at the VU Open Scienc...
How to make your research data open : presentation held at the VU Open Scienc...How to make your research data open : presentation held at the VU Open Scienc...
How to make your research data open : presentation held at the VU Open Scienc...
Leon Osinski
 
Discussion CC licenses for data
Discussion CC licenses for dataDiscussion CC licenses for data
Discussion CC licenses for data
Leon Osinski
 
Be open: what funders want you to do with your publications and research data
Be open: what funders want you to do with your publications and research dataBe open: what funders want you to do with your publications and research data
Be open: what funders want you to do with your publications and research data
Leon Osinski
 
A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4
Leon Osinski
 
A basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and whyA basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and why
Leon Osinski
 
Research data management
Research data managementResearch data management
Research data management
Leon Osinski
 
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
Leon Osinski
 
How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...
How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...
How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...
Leon Osinski
 
( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...
( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...
( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...
Leon Osinski
 
3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...
3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...
3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...
Leon Osinski
 
Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...
Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...
Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...
Leon Osinski
 
Copyright and citation issues : PROOF course Writing articles and abstracts /...
Copyright and citation issues : PROOF course Writing articles and abstracts /...Copyright and citation issues : PROOF course Writing articles and abstracts /...
Copyright and citation issues : PROOF course Writing articles and abstracts /...
Leon Osinski
 
Be prepared to share your research data / Leon Osinski
Be prepared to share your research data / Leon OsinskiBe prepared to share your research data / Leon Osinski
Be prepared to share your research data / Leon Osinski
Leon Osinski
 
Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...
Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...
Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...
Leon Osinski
 
OA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon Osinski
OA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon OsinskiOA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon Osinski
OA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon Osinski
Leon Osinski
 
Wat als alle artikelen open access beschikbaar zijn? / Leon Osinski
Wat als alle artikelen open access beschikbaar zijn? / Leon OsinskiWat als alle artikelen open access beschikbaar zijn? / Leon Osinski
Wat als alle artikelen open access beschikbaar zijn? / Leon Osinski
Leon Osinski
 
Open access : recente ontwikkelingen / Leon Osinski
Open access : recente ontwikkelingen / Leon OsinskiOpen access : recente ontwikkelingen / Leon Osinski
Open access : recente ontwikkelingen / Leon Osinski
Leon Osinski
 

More from Leon Osinski (20)

Articles and research data : DML Update, 08-10-2020
Articles and research data : DML Update, 08-10-2020Articles and research data : DML Update, 08-10-2020
Articles and research data : DML Update, 08-10-2020
 
PROOF course Writing articles and abstracts in English, part: Copyright in ac...
PROOF course Writing articles and abstracts in English, part: Copyright in ac...PROOF course Writing articles and abstracts in English, part: Copyright in ac...
PROOF course Writing articles and abstracts in English, part: Copyright in ac...
 
Research data management at TU Eindhoven
Research data management at TU EindhovenResearch data management at TU Eindhoven
Research data management at TU Eindhoven
 
How to make your research data open : presentation held at the VU Open Scienc...
How to make your research data open : presentation held at the VU Open Scienc...How to make your research data open : presentation held at the VU Open Scienc...
How to make your research data open : presentation held at the VU Open Scienc...
 
Discussion CC licenses for data
Discussion CC licenses for dataDiscussion CC licenses for data
Discussion CC licenses for data
 
Be open: what funders want you to do with your publications and research data
Be open: what funders want you to do with your publications and research dataBe open: what funders want you to do with your publications and research data
Be open: what funders want you to do with your publications and research data
 
A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4A basic course on Research data management: part 1 - part 4
A basic course on Research data management: part 1 - part 4
 
A basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and whyA basic course on Research data management, part 1: what and why
A basic course on Research data management, part 1: what and why
 
Research data management
Research data managementResearch data management
Research data management
 
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
Auteursrecht in academische omgeving: DPO Professionaliseringsbijeenkomst, 23...
 
How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...
How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...
How to get FUN out of sharing your data : FUN meeting, 02-04-2015 by Leon Osi...
 
( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...
( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...
( Dutch ) Dataverse Network : Workshop (Dutch) Dataverse Network voor 3TU.Dat...
 
3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...
3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...
3TU.Datacentrum: presentation for OpenML Workshop (III) at Eindhoven, 22-10-2...
 
Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...
Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...
Horizon 2020 and research data : info meeting Horizon 2020 @ TUe, 07-10-2014 ...
 
Copyright and citation issues : PROOF course Writing articles and abstracts /...
Copyright and citation issues : PROOF course Writing articles and abstracts /...Copyright and citation issues : PROOF course Writing articles and abstracts /...
Copyright and citation issues : PROOF course Writing articles and abstracts /...
 
Be prepared to share your research data / Leon Osinski
Be prepared to share your research data / Leon OsinskiBe prepared to share your research data / Leon Osinski
Be prepared to share your research data / Leon Osinski
 
Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...
Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...
Onderzoeksdata-bepalingen van financiers van universitair onderzoek in NL: Ma...
 
OA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon Osinski
OA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon OsinskiOA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon Osinski
OA beleid subscriptie-uitgevers / Saskia Woutersen-Windhouwer, Leon Osinski
 
Wat als alle artikelen open access beschikbaar zijn? / Leon Osinski
Wat als alle artikelen open access beschikbaar zijn? / Leon OsinskiWat als alle artikelen open access beschikbaar zijn? / Leon Osinski
Wat als alle artikelen open access beschikbaar zijn? / Leon Osinski
 
Open access : recente ontwikkelingen / Leon Osinski
Open access : recente ontwikkelingen / Leon OsinskiOpen access : recente ontwikkelingen / Leon Osinski
Open access : recente ontwikkelingen / Leon Osinski
 

Recently uploaded

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
yuvarajkumar334
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
bmucuha
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
mkkikqvo
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
SaffaIbrahim1
 
UofT毕业证如何办理
UofT毕业证如何办理UofT毕业证如何办理
UofT毕业证如何办理
exukyp
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
Márton Kodok
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
slg6lamcq
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
y3i0qsdzb
 

Recently uploaded (20)

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCAModule 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
Module 1 ppt BIG DATA ANALYTICS_NOTES FOR MCA
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
一比一原版(CU毕业证)卡尔顿大学毕业证如何办理
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
 
UofT毕业证如何办理
UofT毕业证如何办理UofT毕业证如何办理
UofT毕业证如何办理
 
Build applications with generative AI on Google Cloud
Build applications with generative AI on Google CloudBuild applications with generative AI on Google Cloud
Build applications with generative AI on Google Cloud
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
一比一原版巴斯大学毕业证(Bath毕业证书)学历如何办理
 

Research data management: course 0HV90, Behavioral Research Methods

  • 1. Research data management part 1: what and why Course 0HV90 TU/e, 22-11-2017 l.osinski@tue.nl, TU/e IEC/Library Available under CC BY-SA license, which permits copying and redistributing the material in any medium or format & adapting the material for any purpose, provided the original author and source are credited & you distribute the adapted material under the same license as the original
  • 2. Research data management [RDM] what #1 Essence of RDM: “… tracking back to what you did 7 years ago and recovering it (...) immediately in a re- usable manner.” (Henry Rzepa)
  • 3. Research data management [RDM] what #2 RDM: caring for your data with the purpose to: 1. protect their mere existence: data loss, data authenticity (RDM basics) 2. share them with others a. for reasons of reuse: in the same context or in a different context; during research and after research b. for reasons of reproducibility checks  scientific integrity; data quality RDM = good data practices that make your data easy to work with, understandable, and available to other scientists
  • 4. Source: Research Data Netherlands / Marina Noordegraaf Outline 1. Research data management [RDM]: what and why 2. Sharing your data, or making your data findable and accessible a. data protection: back up, file naming, organizing data b. data sharing: via collaboration platforms, data archives 3. Caring for your data, or making your data usable and interoperable a. tidy data b. metadata/documentation c. licenses d. open data formats
  • 5.  Because you work together with other researchers  collaborative science  Because of re-using results: data-driven science  open science  Because of scientific integrity: validating data analysis by reproducibility checks requires data and the code that is used to clean, process and analyze the data and to produce the final outputs  reproducible science Additional reasons  Because your data are unique / not easily repeatable (long term observational data)  Because it’s required… by funders like NWO and EC and your university Why sharing research data? #1
  • 6. Why sharing research data? #2 TU/e code of scientific conducts
  • 7. Reasons not to share your data  Preparing my data for sharing takes time and effort But research data management also increases your research efficiency  My data are confidential But you can anonymize or pseudonymize your data  My data still need to yield publications But you can publish your data under an embargo and by publishing your data you establish priority and you can get credits for it  My data can be misused or misinterpret But the best defense against malicious use is to refer to an archival copy of your data which is guaranteed exactly as you mean it to be  My data are only interesting for me But sharing your data may be required by a funder / journal or your data may be requested to validate your results
  • 8. Research data management part 2: protecting and organizing your data
  • 9. RDM: caring for your data with the purpose to: 1. protect their mere existence: data loss, data authenticity (RDM basics) 2. share them with others a. for reasons of reuse: in the same context or in a different context; during research and after research b. for reasons of reproducibility checks  scientific integrity; data quality Research data management
  • 10. Be safe  storage, backup  data safety, protecting against loss + 3-2-1 rule: save 3 copies of your data, on 2 different devices, and 1 copy off site + use local ICT infrastructure (university network servers) if possible  access control  data security, protecting against unauthorized use: with SURFdrive, Open Science Framework or DataverseNL for example you can arrange access to your data Protecting your data #1 good data practices during your research
  • 11. Protecting your data #2: good data practices during your research.
    Be organized, or: you (and others) should be able to tell what's in a file without opening it:
    • file naming
    • organizing data in folders
    "…we can copy everything and do not manage it well." (Indra Sihar)
  • 12. File naming #1
    Good file names are:
    • consistent (use file-naming conventions)
    • unique (distinguish a file from files with similar subjects as well as from different versions of the file), and
    • meaningful (use descriptive names).
    + Avoid using special characters in file names
    + Include dates (format YYYYMMDD) and a version number in file names
    Source: Best practices for file naming (Stanford University Libraries)
  • 13. File naming #2: think about the ordering of elements in a file name
    • Order by date:
      2013-04-12_interview-recording_THD.mp3
      2013-04-12_interview-transcript_THD.docx
      2012-12-15_interview-recording_MBD.mp3
      2012-12-15_interview-transcript_MBD.docx
    • Order by subject:
      MBD_interview-recording_2012-12-15.mp3
      MBD_interview-transcript_2012-12-15.docx
      THD_interview-recording_2013-04-12.mp3
      THD_interview-transcript_2013-04-12.docx
    • Order by type:
      Interview-recording_MBD_2012-12-15.mp3
      Interview-recording_THD_2013-04-12.mp3
      Interview-transcript_MBD_2012-12-15.docx
      Interview-transcript_THD_2013-04-12.docx
    • Forced order with numbering:
      01_THD_interview-recording_2013-04-12.mp3
      02_THD_interview-transcript_2013-04-12.docx
      03_MBD_interview-recording_2012-12-15.mp3
      04_MBD_interview-transcript_2012-12-15.docx
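    As a hedged illustration (an editorial addition, not part of the original slides), a minimal Python sketch of such a convention, composing names in the "order by subject" pattern above; the function name and parameters are invented for the example:

        import re
        from datetime import date

        def make_filename(subject, content, recorded, ext, version=None):
            """Compose a consistent, meaningful file name: subject_content_YYYY-MM-DD[_vNN].ext"""
            parts = [subject, content, recorded.isoformat()]
            if version is not None:
                parts.append("v%02d" % version)
            name = "_".join(parts) + "." + ext
            # replace anything that is not a letter, digit, dot, hyphen or underscore
            return re.sub(r"[^A-Za-z0-9._-]", "-", name)

        print(make_filename("THD", "interview-recording", date(2013, 4, 12), "mp3"))
        # -> THD_interview-recording_2013-04-12.mp3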
  • 14. Organizing your data in folders #1
    [Slide shows a screenshot of an example folder structure]
    Source: Haselager, G.J.T., Aken, M.A.G. van (2000): Personality and Family Relationships. DANS. http://dx.doi.org/10.17026/dans-xk5-y7vc
  • 15. Organizing your data in folders #2: TIER documentation protocol
    Guiding principles of the TIER documentation protocol:
    1. keep your raw or original data raw
       + save your raw data read-only in its original format in a separate folder
       + make a working copy of your raw data (input data, used for processing and analysis)
    2. keep the command files (files containing code written in the syntax of the (statistical) software you use for the study) apart from the data
    3. keep the analysis files (the fully cleaned and processed data files that you use to generate the results reported in your paper) in a separate folder
    4. store the metadata (codebook, description of variables, etc.) in a separate folder, apart from the data itself
  • 16. Organizing your data in folders #3: TIER documentation protocol
    1. Main project folder (name of your research project/working title of your paper)
       1.1. Original data and metadata
            1.1.1. Original data
            1.1.2. Metadata
                   1.1.2.1. Supplements
       1.2. Processing and analysis files
            1.2.1. Importable data files
            1.2.2. Command files
            1.2.3. Analysis files
       1.3. Documents
       1.4. Literature
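    One possible way to set up this hierarchy in one go is sketched below in Python (an editorial addition, not part of the TIER protocol itself); the folder names are free choices here, only the structure follows the outline above:

        from pathlib import Path

        # Sub-folders mirroring the TIER outline; the project name is a placeholder.
        TIER_FOLDERS = [
            "original-data-and-metadata/original-data",
            "original-data-and-metadata/metadata/supplements",
            "processing-and-analysis-files/importable-data-files",
            "processing-and-analysis-files/command-files",
            "processing-and-analysis-files/analysis-files",
            "documents",
            "literature",
        ]

        def create_tier_skeleton(project_root):
            """Create the empty folder skeleton for a new project."""
            root = Path(project_root)
            for folder in TIER_FOLDERS:
                (root / folder).mkdir(parents=True, exist_ok=True)

        create_tier_skeleton("my-research-project")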
  • 17. Organizing your data in folders #4: TIER documentation protocol
    1.1.1. Original data (raw data, obtained/gathered data)
    Any data that were necessary for any part of the processing and/or analysis you reported in your paper. Copies of all your original data files, saved in exactly the format they were in when you first obtained them. The name of an original data file may be changed. Keep these data read-only!
  • 18. Organizing your data in folders #5: TIER documentation protocol
    1.1.2. Metadata
    The Metadata Guide: a document that provides information about each of your original data files. Applies especially to obtained data files.
    • A bibliographic citation of the original data files, including the date you downloaded or obtained them and any unique identifiers that have been assigned to them
    • Information about how to obtain a copy of the original data file
    • Whatever additional information is needed to understand and use the data in the original data file
    1.1.2.1. Supplements
    Additional information about an original data file that is not written by yourself but is found in existing supplementary documents, such as users' guides and code books that accompany the original data file.
  • 19. Organizing your data in folders #6: TIER documentation protocol
    1.2.1. Importable data files (the data you work with, input data, suitable for processing and analysis)
    A corresponding version of each of the original data files. This version can be identical to the original, or in some cases a modified version, for example when modifications are required to allow your software to read the file (converting the file to another format, removing unusable data or explanatory notes from a table).
    • The original and importable versions of a data file should be given different names
    • The importable data file should be as nearly identical to the original as possible
    • The changes you make to your original data files to create the corresponding importable data files should be described in a Readme file
  • 20. Organizing your data in folders #7: TIER documentation protocol
    1.2.2. Command files
    One or more files containing code written in the syntax of the (statistical) software you use for the study:
    • Importing phase: commands to import or read the files and save them in a format that suits your software
    • Processing phase: commands that execute all the processing required to transform the importable version of your files into the final data files that you will use in your analysis (i.e. cleaning, recoding, joining two or more data files, dropping variables or cases, generating new variables)
    • Generating the results: commands that open the analysis data file(s) and then generate the results reported in your paper
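    A minimal sketch of what one such command file could look like, assuming pandas as the analysis software (an editorial addition: the file names, variables and statistic are invented for the example and are not part of the original course material):

        import pandas as pd

        # Importing phase: read the importable data file
        raw = pd.read_csv("importable-data-files/survey_2017.csv")

        # Processing phase: cleaning, dropping cases, generating new variables
        df = raw.dropna(subset=["age"])
        df["age_group"] = pd.cut(df["age"], bins=[0, 30, 60, 120],
                                 labels=["young", "middle", "older"])
        df.to_csv("analysis-files/survey_2017_analysis.csv", index=False)

        # Generating the results: open the analysis file and produce the reported numbers
        analysis = pd.read_csv("analysis-files/survey_2017_analysis.csv")
        print(analysis.groupby("age_group")["satisfaction"].mean())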
  • 21. Organizing your data in folders #8: TIER documentation protocol
    1.2.3. Analysis files
    • The fully cleaned and processed data files that you use to generate the results reported in your paper
    • The Data Appendix: a codebook for your analysis data files, with a brief description of the analysis data file(s), a complete definition of each variable (including coding and/or units of measurement), the name of the original data file from which the variable was extracted, the number of valid observations for the variable, and the number of cases with missing values
  • 22. Organizing your data in folders #9: TIER documentation protocol
    1.3. Documents
    • An electronic copy of your complete final paper
    • The Readme file for your replication documentation:
      + which statistical software or other computer programs are needed to run the command files
      + an explanation of the structure of the hierarchy of folders in which the documentation is stored
      + a precise description of any changes you made to your original data files to create the corresponding importable data files
      + step-by-step instructions for using your documentation to replicate the statistical results reported in your paper
    1.4. Literature
    • Retrieved relevant literature
  • 23. Research data management part 3: sharing your data
  • 24. RDM: caring for your data with the purpose to: 1. protect their mere existence: data loss, data authenticity (RDM basics) 2. share them with others a. for reasons of reuse: in the same context or in a different context; during research and after research b. for reasons of reproducibility checks  scientific integrity; data quality Research data management
  • 25. Overview of research data sharing and storage services
    [Diagram: services arranged by phase (during research versus after research) and by scope (institution, discipline, local ICT services)]
    Data sharing per se is pretty straightforward
  • 26. Sharing your data during your research versus after your research
    During your research: via collaboration or sharing platforms
    • Data sharing is (more) aimed at collaboration, at working together on data
    • Being able to control access to data is crucial
    After your research: via data archives or repositories
    • Publishing and keeping an archival copy of data is essential
    • Long-term preservation of your data is important
  • 27. Sharing your data during your research: general data sharing platforms
    • SURFdrive [TU/e only]: Dutch academic Dropbox, 250 GB, maximum data transfer 16 GB. Students can request a SURFdrive account.
    • Google Drive, Dropbox: don't use these to store sensitive data
  • 28. Sharing your data during your research: DataverseNL
    DataverseNL [TU/e only]: data sharing platform for active research data [based on Harvard's Dataverse Project] where you can:
    • store your data in an organized and safe way
    • clearly describe your data
    • keep your data under version control
    • arrange access to your data
    Students may use DataverseNL:
    1. Go to https://dataverse.nl/
    2. Click 'Log in' (at the top right); under Institutional account click SURFconext
    3. Select Eindhoven University of Technology and log on with your TU/e username and password
    4. When asked, give permission to share your data by answering Yes
    5. When asked to create an account, answer Yes. Once the account has been created, your username is the prefix of your email address
    6. Click 4TU dataverse → Eindhoven dataverse → Add data: you can now create and publish data sets, upload files and assign access rights to data sets or files
  • 29. Sharing your data during your research: Open Science Framework
  • 30. Sharing your data after your research
    On request: "I'd like to thank E.J. Masicampo and Daniel LaLande for sharing and allowing me to share their data…" (Daniël Lakens (2014), What p-hacking really looks like: A comment on Masicampo & LaLande (2012))
    On a (personal) website: "Let me start by saying that the reason why I put all excel files online, including all the detailed excel formulas about data constructions and adjustments, is precisely because I want to promote an open and transparent debate about these important and sensitive measurement issues." (Thomas Piketty, My response to the Financial Times, HuffPost The Blog, 29-05-2014; originally published as Addendum: Response to FT, 28-05-2014)
    In a data journal: Journal of Open Psychology Data, Geoscience Data Journal, Data in Brief, Scientific Data, Data Reports
    Source: www.aukeherrema.nl
  • 31. Sharing your data after your research has ended, via an archive or repository
    Choose a repository where other researchers in your discipline are sharing their data. If none is available, use a multidisciplinary or general repository that at least assigns a persistent identifier to your data (DOI) and requires that you provide adequate metadata:
    • for example Zenodo, Figshare, DANS, B2SHARE, or
    • 4TU.ResearchData
      + DOIs [persistent identifiers for citability and retrievability]
      + open access
      + long-term availability, Data Seal of Approval
      + self-upload (single data sets < 3 GB)
  • 32. Sharing your data: link your data to your publication
  • 33. Archiving your data after your research has ended
    The Human Technology Interaction group of the department of Industrial Engineering & Innovation Sciences requires that an archival copy of the results ('replication package') be deposited in ARCHIE:
    • the copy cannot be changed
    • the copy is only accessible to the researcher
  • 34. Research data management part 4: making your data usable
  • 35. Before data can be reusable, it first has to be usable
    Example: The welfare consequences… / by Jonathan J. Cooper et al., http://dx.doi.org/10.1371/journal.pone.0102722 → Word.doc
  • 36. Research data management: making your data usable and interoperable with good data practices
    • tidy data
    • metadata/documentation
    • licenses
    • open data formats
  • 37. Tidy data: what
    Tidy data is about the structure of a table / data set. Tidy data ≠ clean data; it's a step towards clean data.
    + Each variable you measure is in one column
    + Column headers are variable names
    + Each observation is in a different row
    + Every cell contains only one piece of information
  • 38. Tidy data: why (making your data easy to handle for computers)
    Tidy data allow your data to be easily:
    + imported by data management systems
    + analyzed by analysis software
    + visualized, modelled, transformed
    + combined with other data (interoperability)
  • 39. Tidy data versus messy data
    Messy data:
    1. More than one variable in a single column ('clumped data')
    2. Column headers are values, or: one variable over many columns ('wide data')
    3. Variables are in rows and columns
    4. More pieces of information in one cell (cells are highlighted or colored; values and measurement units in one cell)
    Tidy data:
    1. Each variable you measure is in one column
    2. Column headers are variable names
    3. Each observation is in a different row
    4. Every cell contains only one piece of information
  • 40. Tidy data versus messy data: example
    'Wide' data (one variable over many columns):
      patient_id  drug_a  drug_b
      1           67      56
      2           80      90
      3           64      50
      4           85      75
    Tidy data:
      patient_id  drug  heart_rate
      1           a     67
      2           a     80
      3           a     64
      4           a     85
      1           b     56
      2           b     90
      3           b     50
      4           b     75
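    As an editorial illustration of this reshaping step (not from the original slides), a minimal pandas sketch that turns the wide table above into the tidy form; pandas is an assumption here, the slides themselves only mention R, Matlab and SAS as scripted options:

        import pandas as pd

        # The 'wide' table from the example: one column per drug
        wide = pd.DataFrame({
            "patient_id": [1, 2, 3, 4],
            "drug_a": [67, 80, 64, 85],
            "drug_b": [56, 90, 50, 75],
        })

        # Melt into tidy (long) form: one row per observation
        tidy = wide.melt(id_vars="patient_id", var_name="drug", value_name="heart_rate")
        tidy["drug"] = tidy["drug"].str.replace("drug_", "", regex=False)
        print(tidy)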
  • 41. Tools for tidying data
    • OpenRefine
      + download OpenRefine: http://openrefine.org/download.html
      + runs locally on your computer (not in the cloud), in a web browser such as Firefox (not in IE); no web connection is needed
      + captures all steps done to your raw data; the original dataset is not modified; steps are easily reversed
    • R with the tidyr package
      + a scripting language (R (free), Matlab, SAS…) to process data (tidying, cleaning, etc.), run the analysis and produce the final outputs
    versus
    • Excel: data provenance and documentation of data processing are poor when you work through a graphical user interface, because it doesn't leave a record
  • 42. Documentation / metadata: making your data understandable for humans #1
    The table or data set itself:
    + columns: use clear, descriptive variable names (no hard-to-understand abbreviations), avoid special characters (they can cause problems with some software)
    + rows: if possible, use standard names within cells (derived from a taxonomy, for example: standard species names, CAS registry numbers for chemical substances, standard date formats, …)
    + try to avoid coding categorical or ordinal data as numbers
    + missing data: use NA
  • 43. Documentation / metadata: making your data understandable for humans #2
    The table or data set as a whole: a description (documentation) that at least mentions:
    + the size of the data set: number of observations and variables
    + information about the variables and their measurement units (code book)
    + what's included and excluded in the data set, and why data are missing
    + a description of how you collected the data (study design) and of the data manipulation steps (provenance)
    + when your data consist of multiple files organized in a folder structure, an explanation of the structure and naming of the files
    "Research outputs that are poorly documented are like canned goods with the label removed (…)" (Carly Strasser)
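    Part of such a code book can be generated rather than typed. As a hedged sketch (an editorial addition; the helper below is invented for the example), counting valid and missing observations per variable with pandas; definitions and measurement units still have to be written by hand:

        import pandas as pd

        def basic_codebook(df):
            """Per-variable facts for a code book: name, type, valid and missing counts."""
            return pd.DataFrame({
                "variable": df.columns,
                "dtype": [str(t) for t in df.dtypes],
                "valid_observations": df.notna().sum().values,
                "missing_values": df.isna().sum().values,
            })

        example = pd.DataFrame({"patient_id": [1, 2, 3], "heart_rate": [67, None, 64]})
        print(basic_codebook(example))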
  • 44. Documentation / metadata: metadata standards
    Sometimes there are metadata standards for the documentation of your data set, but where no standard exists, a simple readme file can be good enough.
  • 45. But sometimes data sets are so complex that a readme file is insufficient
    Raw data: https://www.amstat.org/publications/jse/datasets/titanic.dat.txt
    Documentation accompanying the data: https://www.amstat.org/publications/jse/datasets/titanic.txt
    Based on: The "Unusual Episode" Data Revisited / by Robert J. MacG. Dawson, in: Journal of Statistics Education vol. 3 (1995), issue 3
  • 46. Documentation / metadata: making your data findable for humans and search engines
    Descriptive metadata, mainly for discovery and identification of your data:
    + creator
    + title
    + short description
    + key words
    + date(s) of data collection
    + publication year
    + related publications
    + DOI (assigned by the data archive)
    + etc.
    When uploading your data to a data archive like 4TU.ResearchData, you will be asked to enter these metadata. A DOI is assigned by the data archive.
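    A minimal, machine-readable version of such a record could look as follows (an editorial sketch: the field names simply follow the list above and the values are invented; this is not a formal metadata standard or the archive's actual upload format):

        import json

        metadata = {
            "creator": "A. Researcher",
            "title": "Heart rate under drug A and drug B",
            "description": "Heart-rate measurements for four patients under two conditions.",
            "keywords": ["heart rate", "experiment", "tidy data"],
            "dates_of_data_collection": "2017-01 to 2017-06",
            "publication_year": 2017,
            "related_publications": [],
            # the DOI of the data set itself is assigned by the data archive on upload
        }

        with open("metadata.json", "w", encoding="utf-8") as f:
            json.dump(metadata, f, indent=2, ensure_ascii=False)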
  • 47. User license: making clear that other people are allowed to use your data
    Let other people know in advance what they are allowed to do with your data by attaching a user license to it:
    + Creative Commons license for data sets
    + GNU General Public License (GPL) for software
    + License selector
  • 48. Open data formats: ensuring the 'longevity' of your data
    + open (non-proprietary) data formats give the best guarantee that the data will remain usable and 'legible' for computers in the future
    + they are easy to use in a variety of software, like .csv for tabular data
    + check the data formats that are supported by a data archive like 4TU.ResearchData
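    For instance, tabular data kept in a proprietary spreadsheet can be written out as plain UTF-8 CSV before archiving; a minimal sketch (an editorial addition, with an invented file name):

        import pandas as pd

        df = pd.DataFrame({"patient_id": [1, 2], "drug": ["a", "b"], "heart_rate": [67, 56]})

        # Plain UTF-8 encoded CSV: readable by practically any software, now and in the future
        df.to_csv("heart_rate.csv", index=False, encoding="utf-8")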
  • 49. Research data management: recommended reading on 'good data practices'
    1. Goodman, A., Pepe, A., Blocker, A.W., et al. (2014) Ten simple rules for the care and feeding of scientific data. PLOS Computational Biology, 10(4), e1003542. https://doi.org/10.1371/journal.pcbi.1003542
    2. Eugene Barsky (2017), Good enough research data management: a very brief guide
    3. Dynamic Ecology (2016), Ten commandments for good data management. https://dynamicecology.wordpress.com/2016/08/22/ten-commandments-for-good-data-management/
    4. Ellis, S.E., Leek, J.T. (2017) How to share data for collaboration. PeerJ Preprints 5:e3139v1. https://doi.org/10.7287/peerj.preprints.3139v1
    5. Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., Teal, T.K. (2017) Good enough practices in scientific computing. PLOS Computational Biology, 13(6): e1005510. https://doi.org/10.1371/journal.pcbi.1005510
  • 50. Support
    • Data Coach [website]
    • Online course with web lectures