Digital Curation 101 (University of Glamorgan)

Draft slides for a Digital Curation 101 session, University of Glamorgan, 21 January 2013

  1. … because good research needs good data
      Digital Curation 101
      University of Glamorgan, 21 January 2013
      Michael Day
      Digital Curation Centre
      UKOLN, University of Bath
      m.day@ukoln.ac.uk
      http://www.dcc.ac.uk/
      Funded by: DCC 101, University of Glamorgan, 21 January 2013
  2. Agenda
      • Part 1: Introduction to research data management: activities, roles and requirements
      • Exercise: Data management quiz
      • Part 2: Developing data policies and services
      • Exercise: Developing a roadmap
      • Part 3: DMP Online tool and guidance
      • With thanks to Joy Davidson, Sarah Jones and Kerry Miller (DCC)
  3. Introduction to Research Data Management: activities, roles and requirements
      Michael Day and Kerry Miller
      Digital Curation Centre
      UKOLN, University of Bath
      m.day@ukoln.ac.uk
      http://www.dcc.ac.uk/
  4. A Quick Introduction
      • What is research data management?
      • Who is involved and how?
      • What skills and support are needed?
  5. What is Research Data Management?
      • Caring for,
      • Facilitating access to,
      • Preserving and
      • Adding value to
      digital research data throughout its lifecycle.
  6. Typical Activities
      • Creation and sharing of data
      • File naming and description
      • Dealing appropriately with sensitive data
      • Data storage
      • Appraisal, selection and disposal
      • Data licensing
      • Data management planning
  7. What are the main drivers?
      • National and international policy development
          • The Organisation for Economic Co-operation and Development describes data as a public good that should be made available
          • Research Councils UK in its Code of Good Research Conduct says data should be preserved and accessible for 10 years +
          • The data management policies of funding bodies are increasingly demanding of institutional commitment and provisions ...
      • The needs of
          • Researchers
          • Institutions
  8. Benefits to researchers
      • Scholarly communication/access to data
      • Re-purposing and re-use of data
      • Stimulating new networks/collaborations and new research
      • Knowledge transfer to industry
      • Verification of research/research integrity
      • Re-purposing data for new audiences
      • Secure storage for data intensive research
      • Availability of data underpinning journal articles
      • Increased visibility/citation
      Source: Keeping Research Data Safe Factsheet, http://www.beagrie.com/KRDS_Factsheet_0910.pdf
  9. The researcher perspective
      • Managing and sharing data is simply part of good research:
      • Adhering to disciplinary and/or institutional codes of practice and policies
      • Has been practised since the advent of modern science, but not always consistently; data intensive research makes it even more critical
      • Meeting the specific requirements of funding bodies
      • Reputational risks if data management is not handled properly
  10. Institutional drivers
      • Safeguarding research integrity
      • Increasing number of FOI requests for data
      • Adhering to existing codes of research practice and ethics
      • Developing new institution-wide strategies, policies and services for data storage and management
      • Increased institutional focus on research management (e.g., in response to REF)
      • Benchmarking: self-assessing infrastructure and planning for improvement
      • More demands but fewer resources to work with
  11. Research codes of practice (1)
      • UK Research Integrity Office Code of Practice for Research (2009)
          • Data management planning is an essential part of research design
          • Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form [3.12.5]
  12. Research codes of practice (2)
      • RCUK Code of Conduct on the Governance of Good Research Conduct (2011)
          • Primary data and research evidence [should be made] accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 years (in some cases 20 years or longer)
          • Responsibility for proper management and preservation of data and primary materials is shared between the researcher and the research organisation [although deposit within national collections is endorsed]
  13. Research funding bodies
      • UK Research Councils
          • Help fund some data archives, e.g.: Archaeology Data Service, European Bioinformatics Institute, the NERC data centres, UK Data Archive
          • Support for JISC (and DCC)
      • RCUK Common Principles on Data Policy
          • Recognises that data are a critical output of the research process
      http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
  14. RCUK Principles (in a nutshell)
      • Publicly funded research data should be made openly available
      • Data with acknowledged long-term value should be preserved and remain accessible and usable for future research
      • Sufficient metadata should be recorded to enable other researchers to find and understand the research and so re-use it; published results should always include information on how to access the supporting data
      • Recognition that there may be legal, ethical and commercial constraints
      • Recognition that researchers may need privileged use of data for a limited period
      • All users of research data should acknowledge their sources
      • It is appropriate to use public funds to support managing research data (MRD)
  15. Funder expectations
      • Institutions need to inform themselves about main funder policies (mandates) with respect to research data management
      • There is an explicit link between research income and appropriate data management infrastructures
  16. Funder policies
      http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-poli
  17. EPSRC expectations (1)
      • EPSRC policy (2011) expected all institutions receiving grant funding:
          • To develop a roadmap aligning their policies and processes with EPSRC’s expectations by 1st May 2012
          • To be fully compliant with these expectations by 1st May 2015
  18. EPSRC expectations (2)
      • Appropriate metadata (including unique IDs) to be made freely available on the Internet within 12 months of data generation
      • Data not generated in digital format should be stored in a manner to facilitate it being shared
      • Data should be securely preserved for a minimum of 10 years after privileged access expires or the last date access was requested by a third party
      • Adequate resources from existing funding streams
      • EPSRC will monitor progress and compliance, and reserves the right to impose appropriate sanctions
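The 10-year retention expectation on the slide above can be sketched as a small date calculation. This is only one reading of the slide: it assumes the later of the two trigger dates (end of privileged access, or the last third-party access request) starts the clock. The function and parameter names are hypothetical and are not part of any EPSRC or DCC tool.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 10  # minimum retention period stated on the slide


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only reachable for 29 Feb when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def earliest_disposal_date(privileged_access_ends: date,
                           last_third_party_request: Optional[date] = None) -> date:
    """Earliest date the 10-year minimum is met.

    Assumption (not stated explicitly on the slide): whichever trigger
    date is later governs the retention period.
    """
    start = privileged_access_ends
    if last_third_party_request is not None:
        start = max(start, last_third_party_request)
    return add_years(start, RETENTION_YEARS)


if __name__ == "__main__":
    # Hypothetical dataset: privileged access ended in 2014,
    # and a third party last requested access in mid-2016.
    print(earliest_disposal_date(date(2014, 1, 21), date(2016, 6, 1)))
```

A later access request pushes the disposal date back, so in practice the date would need recomputing whenever a new request is logged.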
  19. Implications for researchers
      • Increasing number of research councils and funding bodies with data management and sharing requirements
      • Potential loss of research income if these mandates are not met
      • Need to determine the costs associated with short and longer-term management and curation and to request funds as part of grant
      • Responsibility for infrastructure shifting more to HEIs and less to centralised data archives, but institutional infrastructures and services are still emerging
      • Need guidance: some good external support
      • But also need more local support; often fragmented (need to draw upon existing channels within your institution wherever possible)
  20. Activities, roles, requirements (1)
      • Requirements gathering
          • Identifying researchers’ data requirements
          • Developing a shared understanding of what needs to be done (e.g., identifying where data exist, its form and scale, any existing retention requirements)
          • Identifying good practice within the institution (and the opposite)
          • Methods: surveys, focus groups, case studies, joint R&D projects, assessment tools (e.g. DAF)
  21. Activities, roles, requirements (2)
      • Identifying motivations and benefits
          • For researchers, support services, the institution
      • Identifying risks
          • Data loss (institution, research group, individual)
          • Increased costs (lack of planning, service inefficiency, data loss)
          • Legal compliance (research funder, H&S, ethics, FoI)
          • Reputation (institution, unit, individual)
      • Identifying costs
          • Keeping Research Data Safe (KRDS) toolkit
  22. Activities, roles, requirements (3)
      • Assessing institutional preparedness
          • Identifying institutional stakeholders, existing data support services, gaps
          • Benchmarking and planning for the future
          • Skills audit
          • DCC CARDIO tool
      • Policy development
          • Policies: approval by senior management is just the start; policies need to be embedded in research practice and responsive to changing requirements
      • Data management planning
          • DMP Online, DCC How to Develop a Data Management Plan guide
  23. Activities, roles, requirements (4)
      • Implementation and service development
          • Integrating where possible with existing services, e.g. IR, CRIS, VRE, HPC, cloud services, social media, etc.
          • Appraisal, deciding what needs to be kept and for how long
          • Storage choices: no one-size-fits-all solution, e.g. Bristol’s BluePeta petascale storage facility, Bath’s X-Drive approach, cloud approaches
          • Data documentation and metadata: layered approaches, i.e. top-level discovery (core metadata, collection/experiment-level?), role of standards like DCMI, CERIF, DDI, etc.
  24. Activities, roles, requirements (5)
      • Data issues:
          • Appraisal: selection criteria, retention periods (who decides?)
              • DCC How to appraise and select research data for curation guide
          • Documentation: metadata, schema, semantics
          • Formats: proprietary formats, community standards, etc.
          • Provenance and authenticity
          • Citation (assignment of persistent IDs?)
          • Access (embargo policies?)
          • Licensing
              • DCC How to license research data guide
  25. Who is involved?
      • Funding bodies
      • Archives / long-term data repositories
      • At institutions:
          • Senior management
          • Researcher(s)
          • Research support officers / project staff
          • Lab technicians
          • Librarians / Data Centre staff
          • Faculty ethics committees
          • Institutional legal / IP advisors
          • FOI officer / DPA officer / records manager
          • Computing support
          • Institutional compliance officers
  26. Approaching the Issue
      • What data exist and are being created?
      • Where are the greatest returns on investment available?
          • Training?
          • Storage?
          • Policy development
      • What are the requirements?
      • Who needs to be involved?
  27. Making the most of what we’ve got
      • Local expertise is more widespread than you think
          • Ethics committees
          • Data protection office
          • IT Services
          • Repository Service
      • If you need help, ask!
      From University of Glasgow’s Data Management micro-site
  28. Data management planning
      • A plan to address critical data management issues:
          • What data will be created (format, types) and how?
          • How will the data be documented and described?
          • How will ethics and intellectual property considerations be addressed?
          • What are the plans for data sharing and access?
          • What is the strategy for long-term preservation?
  29. Integrating is a tricky business
      • Make a sound case for investing in data management training
      • Draw upon existing policies and mandates wherever you can
      • Spend some time identifying current data holdings, researchers’ practice and future training needs
      • Make sure you are putting your effort where it will count
      • Don’t reinvent the wheel: augment or adapt existing training and support materials with data management aspects
  30. What the DCC can help with
      • Needs assessment
          • CARDIO Tool: collaborative assessment & benchmarking of RDM strengths/weaknesses
          • Data Asset Framework: interviews to scope current RDM practice and recommend improvements
          • Workflow assessment: methodology for analysing current RDM workflows
      • Developing strategic institutional RDM framework
          • Strategy development: getting key people together to discuss/plan for RDM
          • Policy development: scoping, defining, embedding research data policies
          • Costing: assist with the development of costing and pricing for RDM services
          • Risk management: identify risks in RDM practice and recommend mitigations
          • Institutional data catalogues: recommend options for exposing metadata about your research data via CRIS systems, repositories, or a mix of these
      • Delivering support
          • Customised Data Management Plans: templates / guidance to be added to DMP Online
          • Training: institutional/disciplinary tailored courses, online resources
          • Incremental: repackaging existing support to raise awareness and make guidance more meaningful to researchers
  31. Exercise: How are you performing?
      • Individually, complete the quick data management quiz (5 mins)
      • Compare results; in the areas where you consider yourself weaker, try to learn from those who are confident (10 mins)
      • Based on your group’s discussions...
          • Write down one practical thing you can do at work in order to edge towards an A.
  32. Part 2: Developing data policies and services
      Based on a presentation prepared by Sarah Jones (Digital Curation Centre)
      sarah.jones@glasgow.ac.uk
  33. Outline
      • Who is responsible for RDM?
      • What are the components of a data service?
      • Learning lessons from other HEIs
      • Developing roadmaps
  34. Who is responsible for RDM?
      [Diagram labels:]
      • Funders
      • Advisory bodies
      • Data centres
      • Research organisations
      • Support services
      • Publishers
      • Researchers
  35. Components of a research data service?
      [Diagram labels:]
      • Tools
      • Support staff & services
      • Metadata and documentation
      • Research environment & systems
      • Archive
      • Storage
      • Preserve
      • Back-up
      • Share
      • Access
      • RDM policies & advocacy (senior mgmt & researcher)
  36. Data storage: Bristol example
      • £2m funding to date
      • Petascale facility, expandable
      • 3 machine rooms for resilience (tape archive 2012)
      • Available to all researchers for research data
      Blue Peta at Bristol
      1st 5TB free per Data Steward, then £400 per TB p.a. for disk storage; tape backup £40 per TB
      http://data.bris.ac.uk
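The Bristol charging model quoted above lends itself to a short worked calculation. The sketch below uses only the three figures on the slide. It assumes the free 5 TB allowance applies once per Data Steward, and that the tape charge is a flat rate on whatever volume is backed up (the slide does not say whether it is annual or how it interacts with the free tier). All function names are hypothetical.

```python
FREE_TB_PER_STEWARD = 5.0      # "1st 5TB free per Data Steward"
DISK_RATE_GBP_PER_TB = 400.0   # "£400 per TB p.a. for disk storage"
TAPE_RATE_GBP_PER_TB = 40.0    # "tape backup £40 per TB"


def annual_disk_cost(total_tb: float) -> float:
    """Yearly disk charge in GBP for one Data Steward's allocation."""
    if total_tb < 0:
        raise ValueError("storage volume cannot be negative")
    # Only the volume above the free allowance is charged.
    chargeable_tb = max(0.0, total_tb - FREE_TB_PER_STEWARD)
    return chargeable_tb * DISK_RATE_GBP_PER_TB


def tape_backup_cost(backed_up_tb: float) -> float:
    """Tape charge in GBP.

    Assumption: a flat rate on the backed-up volume; the slide gives
    no further detail on how this charge is applied.
    """
    if backed_up_tb < 0:
        raise ValueError("backup volume cannot be negative")
    return backed_up_tb * TAPE_RATE_GBP_PER_TB


if __name__ == "__main__":
    # A hypothetical project holding 12 TB: 7 TB chargeable on disk.
    print(annual_disk_cost(12))   # 2800.0
    print(tape_backup_cost(12))   # 480.0
```

Under these assumptions a project at or below 5 TB pays nothing for disk, which is why the free tier matters when estimating costs for a grant application.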
  37. Tools: an ‘academic dropbox’
      • Piloted at Lincoln & Edinburgh
      • www.dataflow.ox.ac.uk
      • http://tiny.cc/owncloud-pilot
      • National level negotiation via Janet brokerage?
  38. Archiving: institutional data repositories
      • Not intended to replace national, subject or other established data collections
      • Acknowledgment of hybrid environment
      • Essex-RDR and DataPool at Southampton
      • http://datashare.is.ed.ac.uk
      • www.dspace.cam.ac.uk/
      • https://databank.ora.ox.ac.uk
  39. Archiving: external data centres
      • Research funders’ data centres…
      • Structured databases
      • Disciplinary & community initiatives
      • List of data centres: http://databib.org
  40. Data catalogues (metadata)
      • DataFinder at Oxford
      • DDI metadata by ResearchData@Essex
      • Develop a research data extension to the CERIF standard: http://cerif4datasets.wordpress.com
      • JISC & DCC planning national coordination
      • Can we learn lessons from overseas?
  41. Guidance and training
      • Collate guidance: www.gla.ac.uk/datamanagement
      • Online training: http://datalib.edina.ac.uk/mantra
      • Embed into curriculum via Doctoral Training Centres, e.g. Research360@Bath: http://blogs.bath.ac.uk/research360
  42. Disciplinary training (RDMTrain)
      www.dcc.ac.uk/training/train-trainer/disciplinary-rdm-training
  43. Early research data policies
      [Annotations on example policies:]
      • “Statement of commitment”
      • Legal compliance style
      • Infrastructure → policy
      • A section in uni DM policy
      • Useful guide as appendix
      • “10 commandments”
      • Mutual promises
      • Aspirational
      • Based on Edin. with a few additions
      • Baseline of RCUK Code + procedures & support
      www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies
  44. How are others developing policies?
      • Theme from MRD workshop in Leeds: High level policy (ratified) + User guides, practical support + RDM Infrastructure
      • http://tiny.cc/MRD-policy-workshop
      • Developing data policies: a trend for 2012
      • http://tiny.cc/PolicyNews (news post from Dec 2011)
  45. Policy development
      “EPSRC expects all those it funds to have developed a clear roadmap to align their policies and processes with EPSRC’s expectations by 1st May 2012, and to be fully compliant with these expectations by 1st May 2015.”
      www.epsrc.ac.uk/about/standards/researchdata/Pages/impact.aspx
  46. What is the EPSRC looking for?
      • Know what you hold: publish metadata
      • Link publications and data
      • Share data wherever possible
      • Curate and preserve valuable data
      http://tiny.cc/EPSRC-data-policy
      The same as other funders (i.e. good research practice), so think broadly when you develop your strategy
  47. Exercise: Developing a roadmap for RDM
      Think about the potential components of an RDM service. Based on the strengths/weaknesses you identified in the quiz:
      • Draft a list of actions needed at your institution
      • Attempt to prioritise your list and pencil in timeframes (consider quick wins!)
      • Decide who needs to be involved to make this happen
  48. Part 3: DMP Online tool and guidance
      Based on a presentation prepared by Sarah Jones and Joy Davidson (DCC)
      sarah.jones@glasgow.ac.uk
  49. Funders have DMP requirements
      http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
  50. Funding body requirements
      • Typically a short (c. 1-2 pp) statement, covering:
          • What data will be created (format, types, volume, avoidance of duplication)
          • Standards and methodologies to be used (including metadata)
          • How ethics and Intellectual Property will be addressed
          • Plans for data sharing and access
          • Strategy for long-term preservation
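The five headings on the slide above can be treated as a simple completeness checklist for a draft plan. The sketch below is illustrative only: the section keys and the `missing_sections` helper are hypothetical names chosen for this example and do not reflect DMP Online's data model or any funder's template.

```python
from typing import Dict, List

# Section keys paraphrase the five headings listed on the slide.
REQUIRED_SECTIONS = (
    "data_to_be_created",
    "standards_and_methodologies",
    "ethics_and_intellectual_property",
    "data_sharing_and_access",
    "long_term_preservation",
)


def missing_sections(plan: Dict[str, str]) -> List[str]:
    """Return the required sections that are absent or blank in a draft plan."""
    return [
        section
        for section in REQUIRED_SECTIONS
        if not plan.get(section, "").strip()
    ]


if __name__ == "__main__":
    # Hypothetical bid-stage draft with only one section filled in.
    draft = {
        "data_to_be_created": "Survey responses in CSV, roughly 2 GB",
        "data_sharing_and_access": "   ",  # whitespace only counts as blank
    }
    for section in missing_sections(draft):
        print("Still to write:", section)
```

A check like this fits the bid-stage workflow described later in the deck, where a plan starts from a funder's questions and is fleshed out once the project is funded.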
  51. DCC support
      • Guidance
      • Examples
      • Tools
  52. What is DMP Online?
      • A web-based tool to help researchers write plans
      • It features:
          • Templates based on different requirements
          • Tailored guidance (disciplinary, funder etc.)
          • Customised exports to a variety of formats
          • Ability to share DMPs with others
      • https://dmponline.dcc.ac.uk
  53. Start a plan
      • Pick relevant funder template
      • Get a list of their specific questions
  54. Create a plan at the bid stage
      ... answer the questions based on initial research ideas
  55. Once funded, flesh the plan out (roles, etc.)
      ... answer the questions based on detailed workplan
  56. When project is finished
      ... answer the questions based on the outputs that are being kept
  57. Institutional customisation
      • Add your logo, URL, colours
      • Profile local support, boilerplate text
      • Select desired questions
      http://www.dcc.ac.uk/blog/tailoring-dmp-online-for-your-institution
  58. Links to specific examples
      • Think about why the questions are being asked: what are funders looking for?
      • Give examples, local if possible
      http://www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/framewo
  59. Top tips
      • Encourage researchers to start early and not wait until the last minute!
      • The plan will, and should, change over the life of the project.
      • Get other support staff involved: ethics, IT, library, RM, DP/FoI
      • Update the plan with project updates
      • Use the plan as a communication tool with partners, funding bodies and yourself!
  60. Thank you! Any questions?
      Michael Day, Digital Curation Centre
      UKOLN, University of Bath
      m.day@ukoln.ac.uk
      http://www.dcc.ac.uk/
