Managing Your Research Data


Catherine Pink (UKOLN)
Jez Cope (DTC)



       www.bath.ac.uk/rdso/datamanagement.html
After this workshop you will be able to:

•   Understand what research data are and to whom they belong

•   Appreciate that managing and storing research data are
    responsibilities of those who generate them

•   Identify research data management strategies and tools

•   Determine how much data needs to be managed

•   Gauge how long research data need to be maintained
What do you already know?
2 minute discussion topics
     – 1 per group

 What do you understand these to mean?
What are Data?
•   The lowest level of abstraction from which information and
    knowledge are derived

•   Research data are collected, observed or created, for the purposes
    of analysis to produce and validate original research results

•   Both analogue and digital materials are 'data'

•   Digital data can be:
     • created in a digital form ("born digital")
     • converted to a digital form (digitised)
Research Process
Data Lifecycles & Data Management Plans

Data lifecycle (cyclical): 1. Create → 2. Active Use → 3. Documentation →
4. Publication & Deposit → 5. Preservation & Re-Use

1. What data will you produce?
2. How will you organise the data?
3. Can you/others understand the data?
4. What data will be deposited and where?
5. Who will be interested in re-using the data?
Data Management Plans (DMPs)
DMPs are a framework:
    • They ensure you’ve addressed all areas of data management
    • DMPs do not check or validate your answers!

Required by the Code of Good Practice in Research

DMP Online (www.dmponline.dcc.ac.uk)
    • Ideal for funding applications
    • Based on the DCC checklist

Postgraduate DMP template:
http://blogs.bath.ac.uk/research360/files//www/vhosts/bathblogs/wp-content/blogs.dir/969/files/2012/03/Data-M
1. What data will you produce?

•   What type of data will you produce?

•   What types of file format?

•   How easy is it to create or reproduce?

•   Who owns it and is responsible for it?
Data Types
Data Type                     Value                        Example
Observational data            Usually irreplaceable        Sensor readings, telemetry,
captured around the time of                                survey results, neuro-images
the event

Experimental data from lab    Often reproducible but can   Gene sequences,
equipment                     be expensive                 chromatograms, toroid
                                                           magnetic field readings

Simulation data generated     Model and metadata           Climate models, economic
from test models              (inputs) more important      models
                              than output data.
                              Large models can take a
                              lot of computer time to
                              reproduce

Derived or compiled data      Reproducible                 Text and data mining,
                              (but very expensive)         compiled databases, 3D
                                                           models
Data can take many forms
•   Notebooks & lab books
•   Instrument measurements
•   Experimental observations
•   Still images, video & audio
•   Survey results & interview transcripts
•   Consent forms

•   Text corpuses
•   Models & software
Who owns or is responsible for your data?
Ownership
•   Data ownership is complex, often defined on a case-by-case basis
•   May be dependent on individual contractual agreements
•   Contracts define needs of the University, staff, students, funders,
    collaborators

Management
•   Defined in the University of Bath Code of Good Practice in Research:
    http://www.bath.ac.uk/opp/research/
Who owns or is responsible for your data?
In practice
•   Everyone plays their part

•   If you’re generating and using data, you should:
     •   Comply with guidelines from your group, department, faculty, collaborators
     •   Make sure your data is securely stored and backed up
     •   Describe your data so that you/others can understand it in future

•   If you’re managing a project, you should:
     •   Be fully aware of funder, collaborator and publisher requirements
     •   Ensure you have access to group data
     •   Assess what should be published and/or archived


More info: http://www.data-archive.ac.uk/create-manage
2. How will you look after your data?

• Is your data safe?

• Is your data organised?

• Can you find your data?
Storage and Security
3… 2… 1… Backup!
   • at least 3 copies of a file
   • on at least 2 different media
   • with at least 1 offsite

Test file recovery
   • At set up time and on a regular basis

Access
   • Protect your hardware
   • If the data are sensitive, use file encryption
   • Keep passwords safe (e.g. KeePass)
   • At least 2 people should have access to your data

Ask BUCS for advice

Photo credits: Harvey Rutt, http://www.ecs.soton.ac.uk/regenesis/pictures/
More info: http://www.data-archive.ac.uk/create-manage/storage
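
Part of testing file recovery is confirming that a backup copy still matches the original. The following is a minimal sketch in Python that compares SHA-256 checksums of every file in a working folder against a backup folder. The function names `sha256_of` and `verify_backup` and the folder layout are illustrative assumptions, not part of any BUCS service.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(original_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    original_dir, backup_dir = Path(original_dir), Path(backup_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue  # skip directories
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(copy) != sha256_of(source):
            problems.append(relative.as_posix())
    return problems
```

An empty list from `verify_backup("project", "/path/to/backup/project")` suggests every file was copied intact. It does not replace an occasional full restore test.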
Storage and Security – Back up options
Media              Advantages                            Disadvantages
CDs or DVDs        •   Useful for quick restore in the   •   Static capture of data
                       event of minor disaster           •   Not built to last
                                                         •   Vulnerable to theft
                                                         •   Physical loss of media

External hard      •   Dynamic capture of data           •   Must store securely and
drives             •   Useful for quick restore in the       remotely from original copy
                       event of minor disaster           •   Vulnerable to theft
                                                         •   Must use file encryption if
                                                             sensitive

BUCS server        •   Resilient backup                  •   Lack of offline access
                   •   X:drive (1 TB free per project)   •   Must have a BUCS account
                   •   Safety net for major disaster

Digital scans of   •   Easy to do on a daily basis at    •   Manipulation of page
lab books              any campus printer                    content difficult
                   •   Automatically save daily page
                       scans to your H:drive
Can you find your data?
If not, have you considered…

A Clear Directory Structure
• Top level folder and substructure

File Version Control
• Discard obsolete versions if no longer needed after making backups
• Manage using:
     • File naming (see below)
     • Version control software (e.g. Git, Mercurial, SVN)

File Naming Conventions
http://www.jiscdigitalmedia.ac.uk/crossmedia/advice/choosing-a-file-name/
• Record any naming conventions or abbreviations used
     • E.g. [Experiment]_[Reagent]_[Instrument]_[YYYYMMDD].dat
• Date/time stamp or use a separate ID (e.g. v1) for each version
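
A naming convention is easier to keep to when file names are generated rather than typed by hand. The sketch below, in Python, builds and parses names following the [Experiment]_[Reagent]_[Instrument]_[YYYYMMDD].dat pattern above. The function names and the rule for stripping characters are assumptions made for illustration. Adapt them to whatever convention your group has recorded.

```python
import re
from datetime import date, datetime


def build_filename(experiment, reagent, instrument, when, ext="dat"):
    """Build a name of the form Experiment_Reagent_Instrument_YYYYMMDD.ext."""
    parts = [experiment, reagent, instrument]
    # Keep only letters, digits and hyphens so "_" stays an unambiguous separator.
    cleaned = [re.sub(r"[^A-Za-z0-9-]", "", part) for part in parts]
    if not all(cleaned):
        raise ValueError("each name component must not be empty after cleaning")
    return "_".join(cleaned + [when.strftime("%Y%m%d")]) + "." + ext


def parse_filename(name):
    """Split a name built by build_filename back into its components."""
    stem, ext = name.rsplit(".", 1)
    experiment, reagent, instrument, stamp = stem.split("_")
    return {
        "experiment": experiment,
        "reagent": reagent,
        "instrument": instrument,
        "date": datetime.strptime(stamp, "%Y%m%d").date(),
        "ext": ext,
    }
```

Putting the date last in YYYYMMDD form means files from the same experiment sort chronologically in a directory listing.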
3. Documenting data

•   Do you still understand your older work?

•   Is the file structure / naming understandable to others?

•   Which data will be kept?

•   Which data can be discarded?
Understanding your data
•   Students:

     • Will you be able to write up your methods at the end of your
       studies?

•   Project leads:

     • Will you be able to respond to reviewers' comments?

     • Will you be able to find the information you need for final project
       reports?

•   Can you reproduce your work if you need to?

•   What information would someone else need to replicate your work?
Understanding your data
Do you know how you generated your data?
       •   Equipment or software used
       •   Experimental protocol
       •   Other things included in (e.g.) a lab notebook
       •   Can reference a published article, if it covers everything

Are you able to give credit to external sources of data?
       • Include details of where the data are held, identified & accessed
       • Cite a publication describing the data
       • Cite the data itself e.g.
Metadata
  •   Contextual information for data is called metadata — literally data
      about data

  •   Data repositories & archives require some generic metadata, e.g.
           • author, title, publication date

  •   For data to be useful, it will also need subject-specific metadata e.g.
           • reagent names, experimental conditions, population demographic

  •   Record contextual information in a text file (such as a ‘read me’ file)
      in the same directory as the data e.g.
           • codes for categorical survey responses
           • ‘999 indicates a dummy value in the data’


More info: http://www.data-archive.ac.uk/create-manage/document
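
A 'read me' file of the kind described above can be generated from the same codebook that your analysis uses, so the documentation and the code do not drift apart. The following is a minimal Python sketch. The field names, the example codes and the functions `render_readme` and `decode` are hypothetical and chosen only for illustration.

```python
def render_readme(title, author, date_created, codebook, missing_code):
    """Return README text holding generic metadata and the survey codes."""
    lines = [
        f"Title: {title}",
        f"Author: {author}",
        f"Date created: {date_created}",
        "",
        "Codes for categorical survey responses:",
    ]
    for code, meaning in sorted(codebook.items()):
        lines.append(f"  {code} = {meaning}")
    lines += ["", f"{missing_code} indicates a dummy value in the data"]
    return "\n".join(lines) + "\n"


def decode(value, codebook, missing_code):
    """Translate a raw code into its meaning; return None for the dummy value."""
    if value == missing_code:
        return None
    return codebook[value]
```

For example, `render_readme("Commuting survey", "A. Researcher", "2012-03-05", {1: "agree", 2: "neutral", 3: "disagree"}, 999)` returns text that can be saved as README.txt in the same directory as the data.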
4. What data will be deposited and where?

•   Are you expected to share your data?

•   Are you allowed to share your data?

•   Define the core data set of the project

•   Which data will be included in your publication / thesis?
Data Sharing – Why share your data?
  • Share with your future self – avoid repeating research!

  • Promote your research – get cited!

  • Enable new discoveries

  • Replication

  • Store your data in a reliable archive

  • Comply with funding requirements
Requirements to share your data
    Some journal publishers have a policy on data availability:
         • Are you making any of your data available as supplementary
           information?
         • Is there sufficient information with the data so that it can be understood
           and reused?

    Most UK funders now expect research data to be made publicly
    available

    Common Principles on Data Policy:
    “Publicly funded research data are a public good, produced in the
    public interest, which should be made openly available with as few
    restrictions as possible in a timely and responsible manner that
    does not harm intellectual property.”


Find your funder’s policy: http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
Restrictions on sharing your data
Are there privacy requirements from the funders or commercial
partners?
   •   e.g. personal data, high security data

You might not have the right to share data collected from other
sources
   • It depends upon whether those data were licensed and have terms of use
   • Most databases are licensed and prohibit redistribution of data without
     permission

If you are uncertain as to your rights to disseminate data, contact
research-data@bath.ac.uk
How to share your data
•   Deposit in a data repository, e.g. GenBank

•   Data can be licensed

•   Culture of data sharing: you can make your data available under a
    CC-BY license or CC0 declaration to make this explicit
        • A CC-BY license permits reuse but requires attribution
        • A CC0 declaration is a waiver of copyright
        • Laws about data vary between countries

•   You may have rights to first use of the data or to exploit them commercially

• How to license research data:
    http://www.dcc.ac.uk/resources/how-guides/license-research-data
5. Preservation and Re-use

•   How long will your data be reusable for?

•   Do you need to prepare your data for long term archive?

•   Which data do you need to keep?
Data retention and archiving
How permanent are the data?
    • Short term (e.g. 3-5 years)
    • Long term (e.g. 10 years)
    • Indefinite

Should discarded data be destroyed?
    • Keep all versions? Just final version? First and last?

What are the re-processing costs?
    • If the data are cheap to regenerate, you may only need to keep the
      software and protocol/methodology information

Are there tools/software needed to create, process or visualise the
data?
    •   Archive these with your data
File formats for long-term access
 •   Unencrypted
 •   Uncompressed
 •   Not proprietary or patent-encumbered
 •   Open, documented standard
 •   Standard representation (ASCII, Unicode)

        Type                  Recommended                 Avoid for data sharing
 Tabular data        CSV, TSV, SPSS portable              Excel
 Text                Plain text, HTML, RTF                Word
                     PDF/A only if layout matters
 Media               Container: MP4, Ogg                  QuickTime
                     Codec: Theora, Dirac, FLAC           H.264
 Images              TIFF, JPEG2000, PNG                  GIF, JPG
 Structured data     XML, RDF                             RDBMS
Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table
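
Tabular data held in a script or a proprietary application can be exported to CSV with a Unicode encoding, which matches several of the criteria listed above (uncompressed, open and documented, standard representation). The following is a minimal sketch using only the Python standard library. The function names and the example column names are assumptions for illustration.

```python
import csv


def write_csv(path, header, rows):
    """Write a header and rows to a UTF-8 encoded CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def read_csv(path):
    """Read the file back as a list of rows; every value is returned as text."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))
```

Reading the file back after export, as `read_csv` does, is a quick check that special characters and embedded commas survived the conversion.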
Summary
•   Data management is important at all stages of a project

•   There are tools available to help you

•   Keep your data safe
     • Back up your data
     • Test your back-ups

•   Keep your data organised
     • Find it – good formats and file names
     • Understand it – check documentation and metadata

•   Consider publishing your data so that you can get recognition for
    your work

•   Ask for help: research-data@bath.ac.uk
What do you think now?
Further Information
• http://go.bath.ac.uk/research-data
• research-data@bath.ac.uk

Contacts
• Cathy Pink, Institutional Data Scientist, c.j.pink@bath.ac.uk
• Jez Cope, Technical Data Coordinator, j.cope@bath.ac.uk
Acknowledgements
This course was based on:

Research Data MANTRA [online course]
Created by EDINA and Data Library, University of Edinburgh
Available at http://datalib.edina.ac.uk/mantra

DataTrainArchaeology: Teaching Material Downloads

and work done by:

• Professor Richard H. Guy, Dept. of Pharmacy & Pharmacology
• http://libraries.mit.edu/data-management

Operations Management - Book1.p - Dr. Abdulfatah A. Salem
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfDanh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
 
Gyanartha SciBizTech Quiz slideshare.pptx
Gyanartha SciBizTech Quiz slideshare.pptxGyanartha SciBizTech Quiz slideshare.pptx
Gyanartha SciBizTech Quiz slideshare.pptx
 
[GDSC YCCE] Build with AI Online Presentation
[GDSC YCCE] Build with AI Online Presentation[GDSC YCCE] Build with AI Online Presentation
[GDSC YCCE] Build with AI Online Presentation
 
The Benefits and Challenges of Open Educational Resources
The Benefits and Challenges of Open Educational ResourcesThe Benefits and Challenges of Open Educational Resources
The Benefits and Challenges of Open Educational Resources
 
50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...
50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...
50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...
 

University of Bath Research Data Management training for researchers

  • 1. Managing Your Research Data Catherine Pink (UKOLN) Jez Cope (DTC) www.bath.ac.uk/rdso/datamanagement.html
  • 2. After this workshop you will be able to: • Understand what research data are and to whom they belong • Appreciate that managing and storing research data is the responsibility of those who generate them • Describe research data management strategies and tools • Determine how much data needs to be managed • Gauge how long research data need to be maintained
  • 3. What do you already know?
  • 4. 2 minute discussion topics – 1 per group What do you understand these to mean?
  • 5. What are Data? • The lowest level of abstraction from which information and knowledge are derived • Research data are collected, observed or created, for the purposes of analysis to produce and validate original research results • Both analogue and digital materials are 'data' • Digital data can be: • created in a digital form ("born digital") • converted to a digital form (digitised)
  • 7. Data Lifecycles & Data Management Plans
      Lifecycle stages: 1. Create · 2. Active Use · 3. Documentation · 4. Publication & Deposit · 5. Preservation & Re-Use
      1. What data will you produce?
      2. How will you organise the data?
      3. Can you/others understand the data?
      4. What data will be deposited and where?
      5. Who will be interested in re-using the data?
  • 8. Data Management Plans (DMPs) DMPs are a framework: • They ensure you’ve addressed all areas of data management • DMPs do not check or validate your answers! Required by the Code of Good Practice in Research DMP Online (www.dmponline.dcc.ac.uk) • Ideal for funding applications • Based on the DCC checklist Postgraduate DMP template: http://blogs.bath.ac.uk/research360/files//www/vhosts/bathblogs/wp-content/blogs.dir/969/files/2012/03/Data-M
  • 9. 1. What data will you produce? [Data lifecycle diagram] • What type of data will you produce? • What types of file format? • How easy is it to create or reproduce? • Who owns it and is responsible for it?
  • 10. Data Types (data type: value; examples)
      • Observational data captured around the time of the event: usually irreplaceable; e.g. sensor readings, telemetry, survey results, neuro-images
      • Experimental data from lab equipment: often reproducible but can be expensive; e.g. gene sequence, chromatograms, toroid magnetic field readings
      • Simulation data generated from test models: model and metadata (inputs) are more important than output data, and large models can take a lot of computer time to reproduce; e.g. climate models, economic models
      • Derived or compiled data: reproducible (but very expensive); e.g. text and data mining, compiled databases, 3D models
  • 11. Data can take many forms • Notebooks & lab books • Instrument measurements • Experimental observations • Still images, video & audio • Survey results & interview transcripts • Consent forms • Text corpuses • Models & software
  • 12. Who owns or is responsible for your data? Ownership • Data ownership is complex, often defined on a case-by-case basis • May be dependent on individual contractual agreements • Contracts define needs of the University, staff, students, funders, collaborators Management • Defined in the University of Bath Code of Good Practice in Research: http://www.bath.ac.uk/opp/research/
  • 13. Who owns or is responsible for your data? In practice • Everyone plays their part • If you’re generating and using data, you should: • Comply with guidelines from your group, department, faculty, collaborators • Make sure your data is securely stored and backed up • Describe your data so that you/others can understand it in future • If you’re managing a project, you should: • Be fully aware of funder, collaborator and publisher requirements • Ensure you have access to group data • Assess what should be published and/or archived More info: http://www.data-archive.ac.uk/create-manage
  • 14. 2. How will you look after your data? [Data lifecycle diagram] • Is your data safe? • Is your data organised? • Can you find your data?
  • 15. Storage and Security
      3… 2… 1… Backup!
      • at least 3 copies of a file
      • on at least 2 different media
      • with at least 1 offsite
      Test file recovery
      • At set up time and on a regular basis
      Access (ask BUCS for advice)
      • Protect your hardware
      • If sensitive, use file encryption
      • Keep passwords safe (e.g. KeePass)
      • At least 2 people should have access to your data
      Photo credits: Harvey Rutt http://www.ecs.soton.ac.uk/regenesis/pictures/
      More info: http://www.data-archive.ac.uk/create-manage/storage
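One way to act on "test file recovery" is to restore a file from a backup and compare its checksum with the original. The sketch below is only an illustration, not part of the workshop material: the function names `sha256_of` and `backup_matches` are hypothetical, and it assumes both copies are readable as ordinary files.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, restored: Path) -> bool:
    """True if the restored copy is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)
```

Running a check like this at set-up time and then on a regular schedule would catch a silently corrupted or incomplete backup before the copy is actually needed.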
  • 16. Storage and Security – Back up options
      • CDs or DVDs
          Advantages: useful for quick restore in the event of minor disaster
          Disadvantages: static capture of data; not built to last; vulnerable to theft; physical loss of media
      • External hard drives
          Advantages: dynamic capture of data; useful for quick restore in the event of minor disaster
          Disadvantages: must be stored securely and remotely from the original copy; vulnerable to theft; must use file encryption if sensitive
      • BUCS server
          Advantages: resilient backup; X:drive (1 TB free per project); safety net for major disaster
          Disadvantages: lack of offline access; must have a BUCS account
      • Digital scans of lab books
          Advantages: easy to do on a daily basis at any campus printer; automatically save daily page scans to your H:drive
          Disadvantages: manipulation of page content difficult
  • 17. 2. How will you look after your data? [Data lifecycle diagram] • Is your data safe? • Is your data organised? • Can you find your data?
  • 18. Can you find your data? If not, have you considered… A Clear Directory Structure •Top level folder and substructure File Version Control •Discard obsolete versions if no longer needed after making backups •Manage using: File naming (see below) Version control software (e.g. Git, Mercurial, SVN) File Naming Conventions http://www.jiscdigitalmedia.ac.uk/crossmedia/advice/choosing-a-file-name/ • Record any naming conventions or abbreviations used • E.g. [Experiment]_[Reagent]_[Instrument]_[YYYYMMDD].dat • Date/time stamp or use a separate ID (e.g. v1) for each version
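The naming pattern on this slide, [Experiment]_[Reagent]_[Instrument]_[YYYYMMDD].dat, can be generated rather than typed by hand, which keeps names consistent across a project. This is a minimal sketch under stated assumptions: the function name `data_filename` and the example values are hypothetical, and replacing unusual characters with hyphens is one possible choice, not a rule from the slides.

```python
import re
from datetime import date


def data_filename(experiment: str, reagent: str, instrument: str,
                  day: date, ext: str = "dat") -> str:
    """Build a name of the form Experiment_Reagent_Instrument_YYYYMMDD.ext."""
    parts = [experiment, reagent, instrument]
    # Keep letters, digits and hyphens only, so that the underscore
    # remains an unambiguous separator between the name components.
    cleaned = [re.sub(r"[^A-Za-z0-9-]+", "-", p.strip()).strip("-")
               for p in parts]
    if not all(cleaned):
        raise ValueError("every component needs at least one letter or digit")
    return "_".join(cleaned + [day.strftime("%Y%m%d")]) + "." + ext


# Hypothetical example values:
print(data_filename("Dissolution", "Buffer pH7", "HPLC2", date(2012, 3, 14)))
# prints Dissolution_Buffer-pH7_HPLC2_20120314.dat
```

Putting the date last in YYYYMMDD form means files for the same experiment sort chronologically in a directory listing.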
  • 19. 3. Documenting data [Data lifecycle diagram] • Do you still understand your older work? • Is the file structure / naming understandable to others? • Which data will be kept? • Which data can be discarded?
  • 20. Understanding your data • Students: • Will you be able to write up your methods at the end of your studies? • Project leads: • Will you be able to respond to reviewers' comments? • Will you be able to find the information you need for final project reports? • Can you reproduce your work if you need to? • What information would someone else need to replicate your work?
  • 21. Understanding your data Do you know how you generated your data? • Equipment or software used • Experimental protocol • Other things included in (e.g.) a lab notebook • Can reference a published article, if it covers everything Are you able to give credit to external sources of data? • Include details of where the data are held, identified & accessed • Cite a publication describing the data • Cite the data itself e.g.
  • 22. Metadata • Contextual information for data is called metadata — literally data about data • Data repositories & archives require some generic metadata, e.g. • author, title, publication date • For data to be useful, it will also need subject-specific metadata e.g. • reagent names, experimental conditions, population demographic • Record contextual information in a text file (such as a ‘read me’ file) in the same directory as the data e.g. • codes for categorical survey responses • ‘999 indicates a dummy value in the data’ More info: http://www.data-archive.ac.uk/create-manage/document
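Documenting a dummy value in a read-me file pays off when the data are loaded later, because the loader can treat that code as missing instead of as a real measurement. The following is a hedged sketch: the column name `age`, the constant `MISSING_CODE` and the function `read_scores` are hypothetical, and 999 is simply the example code quoted on the slide.

```python
import csv
import io

# Assumed dummy value, as it would be recorded in the read-me file.
MISSING_CODE = "999"


def read_scores(csv_text: str, column: str) -> list:
    """Return one column as integers, mapping the dummy code to None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [None if row[column] == MISSING_CODE else int(row[column])
            for row in reader]


sample = "id,age\n1,34\n2,999\n3,51\n"
print(read_scores(sample, "age"))
# prints [34, None, 51]
```

Without the read-me entry, a later user could easily include 999 in an average and distort the result.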
  • 23. 4. What data will be deposited and where? [Data lifecycle diagram] • Are you expected to share your data? • Are you allowed to share your data? • Define the core data set of the project • Which data will be included in your publication / thesis?
  • 24. Data Sharing – Why share your data? • Share with your future self – avoid repeating research! • Promote your research – get cited! • Enable new discoveries • Replication • Store your data in a reliable archive • Comply with funding requirements
  • 25. Requirements to share your data Some journal publishers have a policy on data availability: • Are you making any of your data available as supplementary information? • Is there sufficient information with the data so that it can be understood and reused? Most UK funders now expect research data to be made publicly available Common Principles on Data Policy “Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.” Find your funder’s policy: http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
  • 26. Restrictions on sharing your data Are there privacy requirements from the funders or commercial partners? • e.g. personal data, high security data You might not have the right to share data collected from other sources • It depends upon whether those data were licensed and have terms of use • Most databases are licensed and prohibit redistribution of data without permission If you are uncertain as to your rights to disseminate data, contact research-data@bath.ac.uk
  • 27. How to share your data • Deposit in a data repository, e.g. GenBank • Data can be licensed • Culture of data sharing: you can make your data available under a CC-BY licence or CC0 declaration to make this explicit • A CC-BY licence permits reuse but requires attribution • A CC0 declaration is a waiver of copyright • Laws about data vary between countries • You may have rights to first use or to commercially exploit data • How to license research data: http://www.dcc.ac.uk/resources/how-guides/license-research-data
  • 28. 5. Preservation and Re-use [Data lifecycle diagram] • How long will your data be reusable for? • Do you need to prepare your data for long term archive? • Which data do you need to keep?
  • 29. Data retention and archiving How permanent are the data? • Short term (e.g. 3-5 years) • Long term (e.g. 10 years) • Indefinite Should discarded data be destroyed? • Keep all versions? Just final version? First and last? What are the re-processing costs? • Keep only software and protocol/methodology information Are there tools/software needed to create, process or visualise the data? • Archive these with your data
  • 30. File formats for long-term access
      • Unencrypted
      • Uncompressed
      • Non-proprietary and not patent-encumbered
      • Open, documented standard
      • Standard representation (ASCII, Unicode)
      Recommended formats, and formats to avoid for data sharing:
      • Tabular data: recommended CSV, TSV, SPSS portable; avoid Excel
      • Text: recommended plain text, HTML, RTF (PDF/A only if layout matters); avoid Word
      • Media: recommended containers MP4, Ogg and codecs Theora, Dirac, FLAC; avoid Quicktime, H264
      • Images: recommended TIFF, JPEG2000, PNG; avoid GIF, JPG
      • Structured data: recommended XML, RDF; avoid RDBMS
      Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table
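As an illustration of the "open, documented standard" and "standard representation" criteria for tabular data, the sketch below writes a small table as UTF-8 encoded CSV using only the Python standard library. The function name `write_csv` and the column names are hypothetical examples, not part of the slides.

```python
import csv
from pathlib import Path


def write_csv(path: Path, header: list, rows: list) -> None:
    """Write a header and data rows as UTF-8 encoded CSV."""
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Hypothetical example table:
write_csv(Path("masses.csv"), ["sample", "mass_mg"], [["A1", 12.5], ["A2", 13.1]])
```

A file written this way can be opened by spreadsheet software, statistics packages and plain text editors alike, which is the reason for preferring it to a proprietary format for sharing.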
  • 31. Summary • Data management is important at all stages of a project • There are tools available to help you • Keep your data safe • Back up your data • Test your back-ups • Keep your data organised • Find it – good formats and file names • Understand it - check documentation and metadata • Consider publishing your data so that you can get recognition for your work • Ask for help: research-data@bath.ac.uk
  • 32. What do you think now?
  • 33. Further Information • http://go.bath.ac.uk/research-data • research-data@bath.ac.uk Contacts Cathy Pink Jez Cope Institutional Data Scientist Technical Data Coordinator c.j.pink@bath.ac.uk j.cope@bath.ac.uk
  • 34. Acknowledgements This course was based on: Research Data MANTRA [online course] Created by EDINA and Data Library, University of Edinburgh Available at http://datalib.edina.ac.uk/mantra DataTrainArchaeology: Teaching Material Downloads and work done by: •Professor Richard H. Guy, Dept. of Pharmacy & Pharmacology •http://libraries.mit.edu/data-management