Long-term storage – will it fill up with 
the good stuff, or the big, bad, and ugly? 
Can checklists make a difference? 
Angus Whyte, DCC 
‘Research Data Storage and Preservation Strategies’ 
University of Edinburgh 27 October 2014 
a.whyte@ed.ac.uk
Long-term storage – will it fill up with the 
good stuff, or just the big, bad, and ugly? 
Will checklists encourage researchers to decide?
RDM Service Components 
www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services
But more support needed! 
Top 3 support needs for institutions * 
1. Defining what to retain 
2. Specifying tools/infrastructure
3. Supporting metadata creation for
research data discovery
*DCC 2014 RDM Survey of 61 institutions, March 2014
Data available at: zenodo.org/collection/user-dcc-rdm-2014
Data Asset Surveys 
Some institutions have estimated storage requirements from these 
About your data and 
its lifecycle…? 
1. File type
2. Volumes
3. Density
4. Update frequency
5. Usage frequency
6. Availability req’d
7. Sensitivity
Active storage 
Archival storage 
Data Asset Framework Implementation guide 
www.data-audit.eu/docs/DAF_Implementation_Guide.pdf
But if you provide it will researchers use it, at what cost?
Practical checklists 
key points in research cycle 
Data Mgmt Plan 
1. Collection 
2. Documentation 
3. Ethics & legal 
4. Storage & backup 
5. Selection & preserve
6. Data sharing 
7. Responsibilities 
Repository 
selection 
1. Policy & legal 
2. Discoverable 
3. Preservation 
4. Reports 
5. Trust 
Archival storage Active storage 
Data Selection 
5 Steps to decide what 
to keep 
1. Could - benefit 
2. Must - risks 
3. Should - value 
4. Cost factors 
5. Weigh-up 1-4 
Catalogue 
Metadata 
1. Name 
2. Description 
3. Identifier 
4. Subject 
5. URL 
6. Date 
7. Creator 
8. Rights 
9. Spatial 
10. Publisher
Start 
Write-up
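The ten catalogue metadata fields above could be captured as a simple record. The following is a minimal sketch only: the class name, the field types, and the `missing_fields` helper are assumptions for illustration and are not part of the checklist itself.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class CatalogueRecord:
    """One catalogue entry with the ten checklist fields (types are assumed)."""
    name: str
    description: str
    identifier: str
    subject: str
    url: str
    date: str                # assumed to be a date string, e.g. "2014-10-27"
    creator: str
    rights: str
    spatial: Optional[str]   # spatial coverage, if any
    publisher: str

    def missing_fields(self) -> List[str]:
        """Return the names of fields left empty, in checklist order."""
        return [key for key, value in asdict(self).items() if not value]
```

A record with gaps can then be flagged before it goes into the catalogue, for example by calling `record.missing_fields()` and asking the researcher to fill in whatever is listed.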
Data selection checklist 
Preview at: this Google doc
Data selection checklist 
Straightforward steps to guide researchers 
① Could this data be re-used?
② Must it be kept to manage compliance risk?
③ Should it be kept for its potential value and…
④ Considering costs
⑤ Will ✔ or won’t ✗ it be kept, and shared on what terms?
Institution or 
external 
repository 
Step 1 (?) What ‘must’ be kept? 
Research record includes data as evidence for e.g. … 
• Audit purposes 
• Health & Safety (Lab book) 
• Contractual requirement 
Compliance also about data that won’t be kept, or 
may only be shared with approved researchers… 
Research Ethics, Duty of Confidentiality, Data Protection Act, Human Rights Act, Statistics & 
Registration Services Act. UK Data Archive: 
http://www.data-archive.ac.uk/create-manage/consent-ethics/legal 
Jisc Infonet Guidance on Managing Research Records 
tools.jiscinfonet.ac.uk/downloads/bcs-rrs/managing-research-records.pdf
Available choices depend on what purposes the data serves
Step 1 (?) What ‘must’ be kept? 
But what about funder & journal data policies? 
“Data with acknowledged long-term value”
RCUK Common Principles on Data Policy 
“Data, information and other electronic resources of long-term interest” 
ESRC UK Data Archive Collections Development Policy 
“Where data underpins published research there is much greater 
expectation that it will be kept” 
Ben Ryan, EPSRC 
“An inherent principle of publication is that others should be able to
replicate and build upon the authors' published claims.” Nature
Still researchers’ judgement- what purposes the data may serve
So make thinking about that the first step
Step 1 (formerly Step 2) What could it be reused for?
Any angles the researcher has not already considered? 
1. Verification 
2. Further analysis 
3. Reputation building 
4. Resource development 
5. Further publications inc. data articles 
6. Learning and teaching materials 
7. Private reference
Then, relative to these, which data must be kept?
Step 3 What data should have value?
Any two of these fit? 
1. Good quality data and description 
complete, accurate, reliable, valid, representative etc 
2. High demand 
known users, integration potential, reputation, recommendation, appeal 
3. High effort to replicate 
difficult, costly, or impossible to reproduce 
4. Low barriers to reuse 
legal/ ethical, copyright non-restrictive terms and conditions 
5. Rarity value 
unique copy or other copies at risk 
Then what else, e.g. software, does it depend on?
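The "any two of these" test in Step 3 can be sketched as a simple count. This is an illustration only: the criterion keys, the function name, and the reading of "any two" as a threshold of two are assumptions, and judging whether each criterion fits stays with the researcher.

```python
from typing import Dict

# Hypothetical keys for the five Step 3 criteria.
VALUE_CRITERIA = (
    "good_quality",
    "high_demand",
    "high_effort_to_replicate",
    "low_barriers_to_reuse",
    "rarity_value",
)


def should_keep_for_value(answers: Dict[str, bool], threshold: int = 2) -> bool:
    """Return True if at least `threshold` of the five criteria are marked as fitting."""
    met = [c for c in VALUE_CRITERIA if answers.get(c, False)]
    return len(met) >= threshold
```

Criteria that are not answered count as not met, so an incomplete form errs towards "not yet shown to have value" rather than towards keeping everything.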
Step 4 Cost factors 
Why? 
• Costs incurred during project may add to value 
• Post-project costs must be covered 
1. Creation, collection & cleaning 
2. Short-term storage & backup 
3. Short-term access & security 
4. Team communication & development 
5. Preservation & long-term access 
So what action is needed to stay on budget?
Step 5 Bring it all together 
Balance risks, costs and value 
Document the choices made 
1. Name, contributors, description, sensitivity - metadata 
2. Reuse purposes and value – the ‘reuse case’ 
3. Risk of non-compliance and costs shortfall 
4. Justification to keep or dispose 
5. Actions to prepare for preservation or disposal
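The five items to document could be held together in one record per dataset. A minimal sketch, assuming hypothetical field names and a one-line summary format that do not come from the checklist; it records the outcome of the weigh-up and does not make the decision.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SelectionDecision:
    """Documents the outcome of the Step 5 weigh-up for one dataset."""
    # 1. Metadata
    name: str
    contributors: List[str]
    description: str
    sensitivity: str
    # 2. The 'reuse case': purposes and value
    reuse_case: str
    # 3. Risk of non-compliance and costs shortfall
    risks: str
    # 4. Justification to keep or dispose
    keep: bool
    justification: str
    # 5. Actions to prepare for preservation or disposal
    actions: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """One-line statement of the decision and its justification."""
        verdict = "KEEP" if self.keep else "DISPOSE"
        return f"{self.name}: {verdict} ({self.justification})"
```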
But will this work?
From the research perspective, will active selection mean bureaucracy?
But will it work?
Easier to avoid selecting the good and let someone else deal with de-allocation?
Data Mgmt Plan - enough to identify which project this data relates to
The ugly: “don’t know its value or where else to put it”
Archival storage Active storage
The bad: can’t share as nobody knows its sensitivity
The “too big for anywhere else”
Thank you

 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Natural Language Processing (NLP), RAG and its applications .pptx
Natural Language Processing (NLP), RAG and its applications .pptxNatural Language Processing (NLP), RAG and its applications .pptx
Natural Language Processing (NLP), RAG and its applications .pptx
 
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(Harvard毕业证书)哈佛大学毕业证如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 

Long-term storage – will it fill up with the good stuff, or the big, bad, and ugly? Can checklists make a difference?

  • 1. Long-term storage – will it fill up with the good stuff, or the big, bad, and ugly? Can checklists make a difference? Angus Whyte, DCC ‘Research Data Storage and Preservation Strategies’ University of Edinburgh 27 October 2014 a.whyte@ed.ac.uk
  • 2. Long-term storage – will it fill up with the good stuff, or just the big, bad, and ugly? Will checklists encourage researchers to decide?
  • 3.
  • 4. RDM Service Components www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services
  • 5. RDM Service ‘Components’ www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services
  • 6. But more support needed! Top 3 support needs for institutions*: 1. Defining what to retain 2. Specifying tools/infrastructure 3. Supporting metadata creation for research data discovery. *DCC 2014 RDM Survey of 61 institutions, March 2014. Data available at: zenodo.org/collection/user-dcc-rdm-2014
  • 7. Data Asset Surveys Some institutions have estimated storage requirements from these About your data and its lifecycle…? 1. File type 2. Volumes 3. Density 4. Update frequency 5. Usage frequency 6. Availability req’d 7. Sensitivity Active storage Archival storage Data Asset Framework Implementation guide www.data-audit.eu/docs/DAF_Implementation_Guide.pdf
  • 8. Data Asset Surveys Some institutions have estimated storage requirements from these About your data and its lifecycle…? 1. File type 2. Volumes 3. Density 4. Update frequency 5. Usage frequency 6. Availability req’d 7. Sensitivity Active storage Archival storage But if you provide it, will researchers use it, and at what cost? Data Asset Framework Implementation guide www.data-audit.eu/docs/DAF_Implementation_Guide.pdf
  • 9. Practical checklists key points in research cycle Data Mgmt Plan 1. Collection 2. Documentation 3. Ethics & legal 4. Storage & backup 5. Selection & preserve 6. Data sharing 7. Responsibilities Repository selection 1. Policy & legal 2. Discoverable 3. Preservation 4. Reports 5. Trust Archival storage Active storage Data Selection 5 Steps to decide what to keep 1. Could - benefit 2. Must - risks 3. Should - value 4. Cost factors 5. Weigh-up 1-4 Catalogue Metadata 1. Name 2. Description 3. Identifier 4. Subject 5. URL 6. Date 7. Creator 8. Rights 9. Spatial 10. Publisher Start Write-up
  • 10. Data selection checklist Preview at: this Google doc
  • 11. Data selection checklist Straightforward steps to guide researchers ① Could this data be re-used ② Must it be kept to manage compliance risk ③ Should it be kept for its potential value and… ④ Considering costs ⑤ Will ✔ or won’t ✗ it be kept, shared on what terms Institution or external repository Data Selection 5 Steps to decide what to keep 1. Could - benefit 2. Must - risks 3. Should - value 4. Cost factors 5. Weigh-up 1-4 Repository selection 1. Policy & legal 2. Discoverable 3. Preservation 4. Reports 5. Trust
  • 12. Step 1 (?) What ‘must’ be kept? Research record includes data as evidence for e.g. … • Audit purposes • Health & Safety (Lab book) • Contractual requirement. Compliance also about data that won’t be kept, or may only be shared with approved researchers… Research Ethics, Duty of Confidentiality, Data Protection Act, Human Rights Act, Statistics & Registration Services Act. UK Data Archive: http://www.data-archive.ac.uk/create-manage/consent-ethics/legal Jisc Infonet Guidance on Managing Research Records tools.jiscinfonet.ac.uk/downloads/bcs-rrs/managing-research-records.pdf
  • 13. Step 1 (?) What ‘must’ be kept? Research record includes data as evidence for e.g. … • Audit purposes • Health & Safety (Lab book) • Contractual requirement. Compliance also about data that won’t be kept, or may only be shared with approved researchers… Research Ethics, Duty of Confidentiality, Data Protection Act, Human Rights Act, Statistics & Registration Services Act. UK Data Archive: http://www.data-archive.ac.uk/create-manage/consent-ethics/legal Available choices depend on what purposes the data serves. Jisc Infonet Guidance on Managing Research Records tools.jiscinfonet.ac.uk/downloads/bcs-rrs/managing-research-records.pdf
  • 14. Step 1 (?) What ‘must’ be kept? But what about funder & journal data policies? “Data with acknowledged long-term value” (RCUK Common Principles on Data Policy). “Data, information and other electronic resources of long-term interest” (ESRC UK Data Archive Collections Development Policy). “Where data underpins published research there is much greater expectation that it will be kept” (Ben Ryan, EPSRC). “An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims.” (Nature)
  • 15. Step 1 (?) What ‘must’ be kept? But what about funder & journal data policies? “Data with acknowledged long-term value” (RCUK Common Principles on Data Policy). “Data, information and other electronic resources of long-term interest” (ESRC UK Data Archive Collections Development Policy). “Where data underpins published research there is much greater expectation that it will be kept” (Ben Ryan, EPSRC). “An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims.” (Nature) Still researchers’ judgement: what purposes the data may serve.
  • 16. Step 1 (?) What ‘must’ be kept? But what about funder & journal data policies? “Data with acknowledged long-term value” (RCUK Common Principles on Data Policy). “Data, information and other electronic resources of long-term interest” (ESRC UK Data Archive Collections Development Policy). “Where data underpins published research there is much greater expectation that it will be kept” (Ben Ryan, EPSRC). “An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims.” (Nature) Still researchers’ judgement: what purposes the data may serve. So make thinking about that the first step.
  • 17. Step 1 (previously Step 2): What could it be reused for? Any angles the researcher has not already considered? 1. Verification 2. Further analysis 3. Reputation building 4. Resource development 5. Further publications inc. data articles 6. Learning and teaching materials 7. Private reference
  • 18. Step 1 (previously Step 2): What could it be reused for? Any angles the researcher has not already considered? 1. Verification 2. Further analysis 3. Reputation building 4. Resource development 5. Further publications inc. data articles 6. Learning and teaching materials 7. Private reference. Then, relative to these, which data must be kept?
  • 19. Step 3: What data should have value? Any two of these fit? 1. Good quality data and description: complete, accurate, reliable, valid, representative etc. 2. High demand: known users, integration potential, reputation, recommendation, appeal 3. High effort to replicate: difficult, costly, or impossible to reproduce 4. Low barriers to reuse: legal/ethical, copyright, non-restrictive terms and conditions 5. Rarity value: unique copy or other copies at risk. Then what else, e.g. software, does it depend on?
  • 20. Step 4: Cost factors. Why? • Costs incurred during project may add to value • Post-project costs must be covered 1. Creation, collection & cleaning 2. Short-term storage & backup 3. Short-term access & security 4. Team communication & development 5. Preservation & long-term access. So what action is needed to stay on budget?
  • 21. Step 5: Bring it all together. Balance risks, costs and value. Document the choices made: 1. Name, contributors, description, sensitivity - metadata 2. Reuse purposes and value – the ‘reuse case’ 3. Risk of non-compliance and costs shortfall 4. Justification to keep or dispose 5. Actions to prepare for preservation or disposal
  • 22. But will this work? From a research perspective, will active selection mean bureaucracy? Data Mgmt Plan 1. Collection 2. Documentation 3. Ethics & legal 4. Storage & backup 5. Selection & preserve 6. Data sharing 7. Responsibilities Repository selection 1. Policy & legal 2. Discoverable 3. Preservation 4. Reports 5. Trust Archival storage Active storage Data Selection 5 Steps to decide what to keep 1. Could - benefit 2. Must - risks 3. Should - value 4. Cost factors 5. Weigh-up 1-4 Catalogue Metadata 1. Name 2. Description 3. Identifier 4. Subject 5. URL 6. Date 7. Creator 8. Rights 9. Spatial 10. Publisher
  • 23. But will it work? Easier to avoid selecting the good and let someone else deal with de-allocation? Data Mgmt Plan - enough to identify which project this data relates to. “The ugly”: “don't know its value or where else to put it”. Archival storage Active storage “The bad”: can’t share as nobody knows its sensitivity. The “too big for anywhere else”.
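
The five-step selection checklist in slides 17 to 21 can be read as a small decision procedure. The sketch below is illustrative only and is not taken from the DCC checklist itself: the class, field names, criterion labels, and the "review" outcome are hypothetical, and the way step 5 weighs the earlier steps is an assumption, since the slides leave that balance to the researcher's judgement. Only the "any two of these fit" value test comes directly from slide 19.

```python
from dataclasses import dataclass, field

# Hypothetical labels paraphrasing the five value tests on slide 19.
VALUE_CRITERIA = {
    "good_quality",
    "high_demand",
    "high_effort_to_replicate",
    "low_barriers_to_reuse",
    "rarity",
}


@dataclass
class Dataset:
    name: str
    reuse_purposes: list = field(default_factory=list)  # Step 1: what could it be reused for?
    must_keep: bool = False                              # Step 2: kept for compliance reasons
    value_criteria: set = field(default_factory=set)     # Step 3: which value tests fit
    costs_covered: bool = False                          # Step 4: post-project costs budgeted


def decide(ds: Dataset) -> dict:
    """Step 5: weigh up steps 1-4 and record a justification (assumed weighting)."""
    unknown = ds.value_criteria - VALUE_CRITERIA
    if unknown:
        raise ValueError(f"unknown value criteria: {sorted(unknown)}")

    has_value = len(ds.value_criteria) >= 2  # slide 19: "Any two of these fit?"

    if ds.must_keep:
        decision, reason = "keep", "required for compliance"
    elif ds.reuse_purposes and has_value and ds.costs_covered:
        decision, reason = "keep", "reusable, valuable, and costs covered"
    elif ds.reuse_purposes and has_value:
        decision, reason = "review", "valuable, but post-project costs not covered"
    else:
        decision, reason = "dispose", "no reuse case or too few value criteria"

    return {"name": ds.name, "decision": decision, "justification": reason}


survey = Dataset(
    name="survey-2014",
    reuse_purposes=["verification", "further analysis"],
    value_criteria={"good_quality", "rarity"},
    costs_covered=True,
)
print(decide(survey))
```

Returning the justification alongside the decision mirrors slide 21's point that the choices made should be documented, whichever way they go.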