Research data management
Chris Awre
Library and Learning Innovation
Staff Development event, 10th March 2014
RDM staff development event presentation 140310

A presentation used for a staff development event on research data management at the University of Hull, 10th March 2014


  1. Research data management
     Chris Awre
     Library and Learning Innovation
     Staff Development event, 10th March 2014
  2. Agenda
     • 9:15 – Introduction / structure for the session
     • 9:20 – Research data – what is it and why manage it?
     • 9:50 – Data management lifecycle and planning
     • 10:15 – Data management during the research
     • 10:45 – Break
     • 11:00 – Data management after the research
     • 11:30 – Trends and external developments
     • 11:50 – Summary and wrap-up
     Research Data Management @ Hull | 10 March 2014 | 2
  3. Aims for the day
     • To show how data can be managed throughout the research lifecycle
     • To highlight good practice in research data management
     • To explore data sharing and data preservation
     • To highlight how data management can be embedded
  4. Research data
  5. Starting points
     • Management of research data already happens
       – Existing activity is acknowledged
     • Current research data management initiatives are based on three trends
       – The amount of data is growing
       – Data management is required by multiple disciplines
       – Increasing perception of the value of data
  6. Research Data Management @ Hull
     • What?
     • Why?
     • Where?
     • When?
     • How?
  7. RDM @ Hull – What?
     • Research data takes many forms, for example:
       – Computer-generated data from experiments
       – Survey data
       – Compilations of historical facts
     • The scope of research data encompasses the materials and/or information that are created or gathered to underpin research analysis
     • Completed work / work in progress
       – Data management can, and probably should, encompass both
  8. What is data? Exercise
     • Please list what data you work with or need to manage
       – Use the flipchart sheets provided
     • Include anything you feel fits the definition:
       – Research data encompasses the materials and/or information that are created or gathered to underpin research analysis
     • Next, list issues and concerns you have in managing this data
  9. RDM @ Hull – Why?
     • Data as research output
       – Data itself can be a valid (REF) research output and needs to be well managed for presentation and assessment
     • Transparency of research
       – Good data management allows the process of research to be transparent, adding validity and integrity
     • Data security and accuracy
       – Data management is not just for outputs, but can support research practice itself
     • Data sharing
       – Foster collaboration and increase the value of the data through making it available for others to use
  10. RDM @ Hull – Where?
      • Local research data management does not necessarily mean local provision of storage
        – Local management, making use of services as required wherever they may be
      • Options for storage
        – Local
        – Cloud
        – Disciplinary, national or international data centre
        – Publishers (traditional journals, but also data journals)
      • All have associated services
        – What criteria determine which option(s)?
  11. RDM @ Hull – When?
      • When to manage data?
        – At all stages of the research lifecycle
      • Key to this is making it easy to embed
        – Minimise effort, whilst demonstrating benefits of effort undertaken
      • Timetable?
        – Your decision
        – Starting data management at the outset of a research process reduces problems later
  12. RDM @ Hull – How?
      • Research has a lifecycle
        – Data management can also follow this cycle
        – Helps to identify when data management activities are required
        – Helps to plan data management within a project
      • Data management planning
        – Living document that acts as a guide to data management throughout a project
        – Helps define specific actions to undertake, and how they get acted upon
  13. Questions…?
  14. Data management guides
  15. Data management lifecycle - UKDA
  16. Data management planning
      • Data management plan template
        – Can be used as a full guide
        – Can be used as a checklist
        – It is a tool to support data management and a prompt for the issues you need to consider
      • Online data management planning
        – DMPOnline tool
          • http://dmponline.dcc.ac.uk
          • Generic questions / Funder requirements
        – The importance of data management planning
          • https://www.youtube.com/watch?v=PXr14Urf268
  17. Questions…?
  18. During the research
  19. Creating data
      • Design research
      • Plan data management
      • Plan consent (if required)
      • Plan storage
      • Locate existing data
      • Collect data
        – Experiment, observe, measure, etc.
      • Capture and/or create metadata
  20. Organising your data
      • Make sure the files are named and structured in a way that is meaningful to you and colleagues
        – Effort at the time can save time when you need to retrieve data
        – How do you manage version control?
        – Be consistent in practice
      • Important for collaborative research
      • Would you know what you had if returning to it in a year's time?
  21. Benefits of consistent file naming and organisation
      • Data files are not accidentally overwritten or deleted
      • Data files are distinguishable from each other within their containing folder
      • Data file naming prevents confusion when multiple people are working on shared files
      • Data files are easier to locate and browse
      • Data files can be retrieved both by creator and by other users
      • Data files can be sorted in logical sequence
      • Different versions of data files can be identified
      • If data files are moved to another storage platform, their names will retain useful context
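One way to make file naming consistent is to generate names from a fixed pattern rather than typing them by hand. The sketch below is illustrative only: the pattern (project, description, ISO date, zero-padded version) and the function name `build_filename` are assumptions, not a convention prescribed in the slides. ISO dates and zero-padded versions are used because they sort in logical sequence.

```python
import re
from datetime import date


def build_filename(project, description, when, version, extension):
    """Build a name of the form project_description_YYYY-MM-DD_vNN.ext (assumed pattern)."""

    def slug(text):
        # Lowercase the text and collapse every run of other characters into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{slug(project)}_{slug(description)}_"
        f"{when.isoformat()}_v{version:02d}.{extension.lstrip('.')}"
    )


# Hypothetical example values, chosen only to show the pattern.
name = build_filename("HMAP", "Catch records", date(2014, 3, 10), 2, ".csv")
print(name)  # hmap_catch-records_2014-03-10_v02.csv
```

Because the version is part of the name, saving a new version produces a new file instead of overwriting the old one.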
  22. File formats
      • Be aware of the file formats your data exists in
        – Does this format require a specific type of software?
        – Can others access the data in this format?
        – Can alternative formats be used?
          • Would transfer between formats lead to any loss of data?
      • Using widely available or open formats maximises the chances of your data being stable and usable
      • Watch out for backwards compatibility if software is upgraded
  23. Ethics and consent
      • Where people or animals are involved, ethical consent will be required
      • Data management requirements can be embedded within ethical/research approval
        – Single workflow
        – Ensures data management is at the heart of research planning
      • Consent needs to be carefully considered
        – What is being requested?
        – Think ahead, beyond the research, for implications
  24. Processing data
      • Manage and store data
      • Check, validate, clean data
      • Anonymise data (if applicable)
      • Describe data
        – Distinct from metadata
        – Covers more detail, e.g., methodology of creation and structure
        – Aim is to enable a third party to make sense of it
  25. Metadata/data description
      • Information that says what the data file is
      • Metadata is a brief summary
        – Title, brief description, date, creator, etc.
      • Data description is a fuller explanation of the data
        – Origins of the data, methodology used, structure, why it was captured, etc.
      • Metadata aids future retrieval, to give assurance the correct file has been found
      • Data description communicates the purpose of the data to others
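The distinction between brief metadata and a fuller data description can be sketched as a small record stored alongside the data file. This is a minimal sketch: the field names follow the examples on the slide, while the function name `describe_dataset`, the JSON layout, and every value shown are assumptions made for illustration.

```python
import json


def describe_dataset(title, summary, created, creator,
                     origin, methodology, structure, purpose):
    """Bundle brief metadata and a fuller data description into one record."""
    return {
        # Brief summary that helps someone confirm they have found the right file.
        "metadata": {
            "title": title,
            "description": summary,
            "date": created,
            "creator": creator,
        },
        # Fuller explanation that lets a third party make sense of the data.
        "data_description": {
            "origin": origin,
            "methodology": methodology,
            "structure": structure,
            "purpose": purpose,
        },
    }


# All values below are made-up placeholders.
record = describe_dataset(
    title="Example survey responses",
    summary="Responses to a staff survey",
    created="2014-03-10",
    creator="A. Researcher",
    origin="Online questionnaire",
    methodology="Self-selected respondents, one response per person",
    structure="One row per response, one column per question",
    purpose="To identify priorities for support",
)
sidecar_text = json.dumps(record, indent=2)
print(sidecar_text)
```

Writing `sidecar_text` to a file next to the data keeps the description with the data if both are moved together.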
  26. Storage and security
      • Data storage
        – How much storage will you need? Can you access this?
      • Data security
        – Data confidentiality
        – Data corruption and loss
        – How do you protect against these?
      • Data backup
        – Good practice to have three copies of the data in different locations
        – Backup policy
          • What needs backing up? Who has access to backups? When can backups be deleted?
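Keeping three copies only protects against corruption and loss if the copies are checked against each other. A minimal sketch of such a check, assuming the copies are ordinary files reachable by path, compares SHA-256 checksums; the function names are illustrative and not part of any University service.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """Return True only when every path is an existing file and all checksums agree."""
    paths = [Path(p) for p in paths]
    if not paths or not all(p.is_file() for p in paths):
        return False
    return len({sha256_of(p) for p in paths}) == 1
```

A call such as `copies_match([local_copy, network_copy, offsite_copy])`, with three hypothetical paths, returns `False` as soon as one copy is missing or differs. It does not say which copy is the good one.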
  27. Anonymisation
      • If data includes personal data, it may need to be anonymised to protect individuals
        – Information Commissioner's Office guidance
          • http://ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
      • Data Protection Act
        – Useful to know about the principles of this Act
        – Research exemptions
          • Data can be kept for the long term to aid ongoing research, so long as there is no impact on the individuals concerned
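As a first step towards anonymisation, direct identifiers can be replaced with neutral codes. The sketch below shows this under stated assumptions: records are plain dictionaries, and the function and field names are hypothetical. Strictly, this is pseudonymisation rather than full anonymisation, because the remaining fields may still identify a person in combination, and the returned key must be stored securely and separately from the data.

```python
def pseudonymise(rows, id_field, direct_identifiers):
    """Replace id_field with sequential codes and drop the listed identifier fields.

    Returns (cleaned_rows, key), where key maps each original id to its code.
    """
    key = {}
    cleaned = []
    for row in rows:
        original = row[id_field]
        # Reuse the existing code for a repeated id, otherwise issue the next one.
        code = key.setdefault(original, f"P{len(key) + 1:03d}")
        kept = {
            field: value
            for field, value in row.items()
            if field != id_field and field not in direct_identifiers
        }
        cleaned.append({"participant": code, **kept})
    return cleaned, key


# Made-up example records.
rows = [
    {"name": "Alice Example", "email": "alice@example.org", "score": 4},
    {"name": "Bob Example", "email": "bob@example.org", "score": 2},
    {"name": "Alice Example", "email": "alice@example.org", "score: follow-up": 5},
]
cleaned, key = pseudonymise(rows, id_field="name", direct_identifiers={"email"})
print(cleaned)
```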
  28. Analysing data
      • Interpret data
      • Derive data
      • Produce research outputs
      • Author publications based on data
  29. Data as output
      • Supplement to journal article
        – Many journal publishers now insist on this
      • If the data is going to be accessed, is the metadata/data description in place?
      • Data journals
        – Journals where the paper is just about the production of the dataset
          • Examples include Geoscience Data Journal and Biodiversity Data Journal, amongst others
      • Developing field of data science
  30. Questions…?
  31. Break
  32. After the research
  33. Preserving data
      • Migrate data to best format
      • Migrate data to suitable medium
      • Back-up and store data
      • Create metadata and documentation
      • Archive data
  34. Preservation starts at the beginning
      • By carrying out the actions recommended during the research, preservation has already started
        – Preservation is not a post-research activity, but a continuous one
      • Any actions at this stage affect both preservation and sharing/access
        – Good management during the research provides a good platform for actions now
  35. Preservation considerations
      • Is the file format that was used during the research the most appropriate for long-term preservation?
      • What is the best medium for preservation?
      • What metadata and documentation will aid long-term preservation of the data?
        – Is this different to that already created?
      • What lifespan are you planning for?
        – When can data be deleted?
  36. Long-term storage and management
      • Local
        – Discuss early with ICTD
        – Hydra digital repository (http://hydra.hull.ac.uk)
      • Cloud
        – Who manages this? Who would act as proxy/backup?
      • National/international
        – What disciplinary data centres could you use?
          • E.g., NERC data centres, re3data.org
      • Offline
  37. Giving access to data
      • Distribute data
      • Share data
      • Control access
      • Establish copyright
      • Promote data
  38. Sharing data
      • Reasons to share data (other than the funder telling you to)
        – Scientific integrity
        – Impact
        – Collaboration
        – Innovation
        – Preservation (for your own use)
        – Teaching
        – Public record
      • “The coolest thing to do with your data will be thought of by someone else.” (Rufus Pollock, Open Knowledge Foundation)
  39. RCUK data principles
      • Data are a public good
      • Data management plans should adhere to standards and best practice. Data should be preserved where appropriate
      • Data should have metadata to facilitate discovery and understanding
      • Researchers should assess barriers preventing sharing
      • Data should be exploitable by the creators prior to sharing (within reason)
      • Sources of data should always be acknowledged
      • Public funds can be requested to assist with the management of data
  40. Hydra digital repository
      • A University service to aid management of digital content, as required by staff
        – Theses, exam papers, committee minutes, data, etc.
        – http://hydra.hull.ac.uk
      • Holds, presents and preserves files
      • Access can be open or controlled
      • Data can be structured to demonstrate links between files
      • Data catalogue
        – Data stored elsewhere can also be recorded here as a University output
          • EPSRC requirement
  41. Copyright and licensing
      • Who owns the copyright in the data?
      • For employees, the University owns copyright by default
        – But check the funding agreement for any claims
      • Beware third-party copyright within data
      • Datasets can have database copyright
        – Copyright in the whole, not just the parts
      • Data licensing
        – Open Data Commons, http://opendatacommons.org
        – Creative Commons, http://creativecommons.org
        – Open Government Licence
  42. Citation
      • Data can be cited
      • No standard format
        – Different journals/repositories will have different recommendations
        – Be clear about the information you provide
          • Title, creator, date, link, etc.
      • Example
        – M. Haines, ed. 'Newfoundland, 1698-1833' in M.G Barnard and J.H Nicholls (comp.), 2010, HMAP Data Pages (www.hull.ac.uk/hmap)
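Since there is no standard format, the practical point is to record the same core elements every time. The sketch below assembles a citation string from those elements. The layout it produces is one arbitrary choice made for illustration, not a format recommended by any journal or repository, and the function name and example values are hypothetical.

```python
def format_data_citation(creator, title, year, publisher, link):
    """Join the core citation elements into one string (illustrative layout only)."""
    return f"{creator} ({year}). {title}. {publisher}. {link}"


# Placeholder values, not a real dataset.
citation = format_data_citation(
    creator="A. Researcher",
    title="Example dataset",
    year=2014,
    publisher="Example Repository",
    link="https://example.org/dataset/1",
)
print(citation)
# A. Researcher (2014). Example dataset. Example Repository. https://example.org/dataset/1
```

Keeping the elements as separate fields means the same record can be re-rendered in whatever format a given journal or repository asks for.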
  43. Re-using data
      • Follow-up research
      • New research
      • Undertake research reviews
      • Scrutinise findings
      • Teach and learn
  44. Building on work already done
      • Easy access to data for further work
      • Ability to examine data from others
      • Have your own data validated
      • Research-informed teaching
        – Ability to use data within teaching
  45. Questions…?
  46. Quick exercise
      • What services / help would you value having access to internally?
        – Please take 5 minutes to list priorities
        – Data will be used to help shape future assistance
  47. Trends and external developments
  48. Why the emphasis on data management?
      • Data as research output
        – Data itself can be a valid (REF) research output and needs to be well managed for presentation and assessment
      • Transparency of research
        – Good data management allows the process of research to be transparent, adding validity and integrity
      • Data security and accuracy
        – Data management is not just for outputs, but can support research practice itself
      • Data sharing
        – Foster collaboration and increase the value of the data through making it available for others to use
  49. Government driver
      • “The Government, in line with our overarching commitment to transparency and open data, is committed that publicly-funded research should be accessible free of charge. Free and open access to taxpayer-funded research offers significant social and economic benefits by spreading knowledge, raising the prestige of UK research and encouraging technology transfer”
        Innovation and Research Strategy for Growth, Department for Business, Innovation & Skills, 2011
  50. The Royal Society
      • Science as an Open Enterprise
        – Scientists should communicate data, using open access where possible, and a repository where justifiable
        – Universities should support an open data culture, and be rewarded for this
        – Funders should support data management financially
        – Publishers should insist on data availability
        – Government should foster policies for open data
        – What is shared should be proportionate to the research and in the public interest, and using shared protocols
      • http://royalsociety.org/policy/projects/science-public-enterprise/report/
  51. OECD / Horizon 2020
      • OECD principles and guidelines for access to research data from public funding
        – http://www.oecd.org/science/sci-tech/38500813.pdf
      • Horizon 2020
        – All data outputs must be open access
        – http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
          • Needs UKRO account for access
        – Funding applications must address data management
  52. EPSRC data management roadmap
      (Diagram labels: Awareness; Policies and processes; Data curation; Resourcing; Nine expectations; Preserving; Non-digital data; Metadata; Publication link to data; Manage access)
  53. RDM @ Hull: What?
  54. DIY RDM? We are not alone
      • Research data management is happening across HE
        – Experience and expertise are growing
      • The initiatives underway are:
        – Making use of available tools and adapting/exploiting them for Hull
        – Taking us down a path that many others are following
          • We can all learn from each other
      • Need to map local needs against available external support, and fill in the gaps locally
        – Develop tailored University of Hull RDM support
  55. Link to open access
      • Open access publication refers largely to document outputs
        – Articles, conference contributions, books, reports, etc.
      • Open access to data is being treated separately
        – To avoid confusion
        – Because datasets need specific attention
      • Now seeing them being brought together more
        – Horizon 2020
        – The 'open' agenda will see this continuing
  56. Thank you
      Chris Awre, c.awre@hull.ac.uk
