10/30/2020
Data management planning
Training for trainers, part II
26.10.2020
Mari Elisa Kuusniemi
ORCID: 0000-0002-7675-287X
Tuuli Office
Topics of a Data Management Plan (DMP)
• General description of the data
• Ethical and legal compliance
• Documentation and metadata
• Storage and backup during the research project
• Opening, publishing and archiving the data after the
research project
• Data management responsibilities and resources
Reference: Tuuli-project. (2020, January 24). General Finnish DMP guidance (Version 2020).
Zenodo. http://doi.org/10.5281/zenodo.3630309
During the training
• We go through the main topics of the data management plan
• We discuss the training objectives of each topic
Storage and backup
Where to store data during the research project?
Storage and backup during the research project
• Where will your data be stored, and how will it be backed
up?
• Who will be responsible for controlling access to your data,
and how will the secured access be controlled?
When choosing storage for your research data, you have to consider several things:
• What kind of research data will you produce and how will it be processed? (The type and amount of research data may rule out some storage services.)
• How are you going to save, store, use, back up and transfer your data?
• With whom are you going to share the data?
• What kind of access control do you need?
• Are you going to actively modify your data?
• Is your data sensitive? Does it contain personal data?
Where to store?
• PC hard drive
• External hard drive
• USB stick
• Cloud service
• University servers/network drives, with automatic
back-up & versioning
Storage solutions
Participants do not want to hear about every possible technical solution. They want to know which solution is best for their own data.
There are lots of storage systems, and they are constantly evolving. How can you keep up and know them all?
It is challenging to find a person who can explain the most relevant storage solutions clearly and in a way that is easy to understand.
How can you explain versioning or databases with logs to a person with no technical background (= most researchers)?
It is difficult to say how storage systems and access control should be described in the DMP. Do technical details need to be described, or is the level of the storage policy enough?
Opening, publishing and archiving
Where does the data go after the project?
Opening, publishing and archiving the data after the research project
• What part of the data can be made openly available or
published?
• Where and when will the data, or its metadata, be made
available?
• Where will the data with long-term value be archived, and
for how long?
Repositories and data archives
• Choose a data repository or an archive the way you would choose a journal in which to publish your articles.
• A repository or an archive should
  • be well established in your research field (or for the data type).
  • be curated, if possible (comparable to peer review).
  • be certified.
  • provide persistent identifiers (such as DOIs), which make the data easy to cite in a publication.
  • offer secure archiving for sensitive data (certified, if available).
If a participant cannot see what data they are producing, how can they figure out where to share the data?
How can I explain that each data type needs to be deposited in a different repository if the goal is FAIR? There are so many kinds of data in the world; how can I know which repository is best for each one? Which ones should I choose to promote?
A participant explains that in their research area there is no need to store or share data. How can you be sure this is the most reasonable choice?
Archiving is quite often described as taking place on the researcher's own computer or a network drive. How can you explain that this is not a recommended solution for archiving data?
20 answers
Data management responsibilities and resources
Who is responsible?
How much does RDM cost?
Data management responsibilities and resources
• Who (for example role, position, and institution) will be
responsible for data management (i.e., the data steward)?
• What resources will be required for your data management
procedures to ensure that the data can be opened and
preserved according to FAIR principles (Findable,
Accessible, Interoperable, Re-usable)?
Tasks                                                    Resources
Data management planning                                 1 week
Agreements (consortium, transfer of rights)              2-4 weeks
Data privacy (GDPR) administration                       2-4 weeks
Data documentation and cleaning                          1-2 hours/week/person (~5% of the project FTE)
Data publishing (including checking the anonymization)   1-2 week(s)/data set (8 main data sets)
Storage space for sensitive data                         10 TB = 2 000 €/year
Archiving and deleting data                              1-2 week(s)/data set (5 unpublished data sets)
Expert help for data management, preservation and sharing tasks is provided by University of Helsinki Data Support.
DMP, 2019
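The person-week figures in the table above can be totalled with a short script. This is only a sketch: it sums the ranges exactly as given on the slide, leaves out the hourly documentation work because the table states no project length or team size, and the names `Task`, `total_weeks` and `storage_cost_per_tb` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    min_weeks: float  # lower bound per unit, copied from the table
    max_weeks: float  # upper bound per unit, copied from the table
    units: int = 1    # e.g. number of data sets


# Figures copied from the table; hourly work and storage fees are left out.
TASKS = [
    Task("Data management planning", 1, 1),
    Task("Agreements (consortium, transfer of rights)", 2, 4),
    Task("Data privacy (GDPR) administration", 2, 4),
    Task("Data publishing", 1, 2, units=8),
    Task("Archiving and deleting data", 1, 2, units=5),
]


def total_weeks(tasks):
    """Return (minimum, maximum) person-weeks summed over all tasks."""
    low = sum(t.min_weeks * t.units for t in tasks)
    high = sum(t.max_weeks * t.units for t in tasks)
    return low, high


def storage_cost_per_tb(total_eur_per_year=2000, terabytes=10):
    """Yearly price of one terabyte, derived from '10 TB = 2 000 €/year'."""
    return total_eur_per_year / terabytes


if __name__ == "__main__":
    low, high = total_weeks(TASKS)
    print(f"Person-weeks: {low}-{high}")
    print(f"Storage: {storage_cost_per_tb():.0f} €/TB/year")
```

With the slide's figures this comes to 18 to 35 person-weeks, and 200 € per terabyte per year for sensitive storage.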
5 answers
How could a course participant describe the resources needed if they do not know what data management or FAIR means in practice?
Course participants do not know why costs and resources should be listed. It has not been required before, so why is it required now?
No one has seen a good example answer to this question. What does a good answer look like?
Without a clear list of services with prices, it is difficult to plan costs.
Course participants do not know what kinds of responsibilities there are around data management. Which responsibilities are relevant and should be listed in the DMP?
Focus on the mission
Why on earth should I know how to write a data management plan?
Original photo: Tyyne Savia, Finnish Heritage Agency
Learning objectives
• Beginning of the studies
• Undergraduate
• Graduate student (PhD student)
• Postdoctoral researcher
• Senior researcher
Group work
• Random small groups in breakout rooms (click to join the
group)
• Working time: about 15 min
Task: Learning objectives
• One topic per group.
• Your group number tells you which learning objective/goal you work with (you see the number when you join the group).
• Write your answers in the Google Doc: http://bit.ly/DMPlearningobjectives
• After the group work, each group will present its discussion and findings to everyone.
Thanks!
Mari Elisa ”Mek” Kuusniemi
mari.elisa.kuusniemi@helsinki.fi
