Digital Curation for PGR Students<br />Joy Davidson and Sarah Jones<br />HATII and the Digital Curation Centre<br />email@example.com<br />firstname.lastname@example.org<br />DATUM for Health <br />
Session aims and objectives<br />This session is aimed at PGRs. We’ll use the context of starting a PhD to:<br />introduce the curation lifecycle model as a means of contextualizing the roles and activities required to maintain access to data over time<br />highlight some of the curation tools and approaches available and provide pointers to further information and support<br />help participants prepare the curation aspects of their PhD study<br />We hope you leave able to explain why data curation is important and what roles PGR students / researchers play<br />
What is data curation?<br />“the active management and appraisal of data <br />over the lifecycle of scholarly and scientific interest”<br />Data have importance as the evidential base <br />of scholarly conclusions<br />Curation is part of good research practice<br />
Why curate: requirements<br />Code of good research conduct<br />data should be preserved and accessible for 10 years +<br />declaration<br />data are a public good and should be openly available<br />Funders’ data policies<br />www.dcc.ac.uk/resources/policy-and-legal/funders-data-policies<br />Common principles on data policy<br />www.rcuk.ac.uk/research/Pages/DataPolicy.aspx<br />
Why curate: rewards<br />Prevent data loss<br />More citations: 69% ↑ <br />(Piwowar, 2007 in PLoS)<br />Validation of results<br />New research opportunities and collaborations<br />Easier to do your research<br />
Conceptualise: planning what to do<br />Activities<br /><ul><li> define a research question and design your methodology
bid for funding (incl. data management and sharing plans)
plan data creation (capture methods, standards, formats)</li></ul>Roles<br />PGR student, supervisory team, sponsors / funding bodies, IT, research governance, ethics panel <br />Decisions made now have an impact on every other stage of the lifecycle, so it is worth getting things right from the start! <br />
Specific issues to consider<br /><ul><li> Defining your method – access to software, equipment, skills
What storage needs do you anticipate - enough capacity?
What are your university’s / sponsor’s requirements?
Are there likely to be any embargoes on data publication?</li></ul>Tools and resources<br />DCC Helpdesk: www.dcc.ac.uk/helpdesk<br />If you need further assistance at this stage, please don’t hesitate to drop us a line via our helpdesk and we’ll make every effort to support your curation activity.<br />DCC Policy and Legal Pages: www.dcc.ac.uk/resources/policy-and-legal<br />Our table shows the curation-related requirements of particular funding bodies. These may not be relevant to you now, but will be in your future research career.<br />
Data Management and Sharing Plans<br />Funders typically want a short statement covering:<br /><ul><li>What data will be created (format, types) and how?
How will the data be documented and described?
How will you manage ethics and Intellectual Property?
What are the plans for data sharing and access?
What is the strategy for long-term preservation?</li></ul>DMP guidance: www.dcc.ac.uk/resources/data-management-plans<br />DMP online: http://dmponline.dcc.ac.uk/<br />
ESRC data management and sharing plan<br />The ESRC currently asks five set questions covering:<br />review of existing datasets for reuse;<br />what data will be produced;<br />any difficulties with mandatory archiving;<br />potential users of the dataset;<br />cost of preparing and documenting data for archiving<br />www.esds.ac.uk/aandp/create/esrcfaq.asp<br />A new requirement comes into force in spring 2011<br />It sets out nine themes that a plan is expected to cover<br />www.esrc.ac.uk/_images/Research_Data_Policy_2010_tcm8-4595.pdf<br />
MRC data sharing and preservation strategy<br />Expected to provide a succinct summary of:<br />Type(s) of qualitative or quantitative data that will be generated<br />Further intended and/or foreseeable research uses for the dataset(s)<br />Plans for preparing and documenting data for preservation and sharing<br />Applicants requesting funds to extend existing data should also explain:<br />The distinctive added value that the new data would provide in relation to existing studies, databases or datasets in the same field<br />How data sharing would provide opportunities for coordination or collaboration<br />www.mrc.ac.uk/Ourresearch/Ethicsresearchguidance/Datasharinginitiative/Policy/index.htm<br />
Wellcome Trust data management and sharing plan<br />Consider as briefly and unambiguously as possible:<br />What data outputs will your research generate and what data will have value to other researchers?<br />When will you share the data?<br />Where will you make the data available?<br />How will other researchers be able to access the data?<br />Are any limits to data sharing required – for example, to either safeguard research participants or to gain appropriate intellectual property protection?<br />How will you ensure key datasets are preserved to ensure their long-term value?<br />What resources will you require to deliver your plan?<br />www.wellcome.ac.uk/About-us/Policy/Spotlight-issues/Data-sharing/Guidance-for-researchers/<br />
Create: collect, capture and analyse data<br />Activities<br /><ul><li> data capture
creation of administrative, descriptive, structural and technical metadata
clarification of intellectual property rights (IPR) and consent agreements to prevent disputes</li></ul>Roles<br />PGR student, supervisory team, information specialists, technical support<br />Your data could be expensive or impossible to recapture, so it is essential to ensure it has context for long-term comprehensibility and reuse.<br />
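As an illustration of the metadata activity above, here is a minimal sketch of recording administrative, descriptive, structural and technical metadata for one data file. It assumes a JSON "sidecar" file stored next to the data; the function names and field names are hypothetical and do not follow any particular metadata standard, so check what your discipline or data centre expects.

```python
import json
from pathlib import Path


def build_metadata(data_file, title, creator, licence):
    """Group metadata under the four headings (field names are hypothetical)."""
    path = Path(data_file)
    return {
        "administrative": {"creator": creator, "licence": licence},
        "descriptive": {"title": title},
        "structural": {"file_name": path.name},
        # File format taken from the extension, e.g. "csv"
        "technical": {"format": path.suffix.lstrip(".").lower()},
    }


def write_sidecar(data_file, metadata):
    """Write the metadata next to the data file as <name>.metadata.json."""
    sidecar = Path(str(data_file) + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar
```

Calling `build_metadata("interviews/interview_01.csv", "Pilot interview", "A. Student", "CC BY 4.0")` returns a dictionary that `write_sidecar` can save alongside the data file. All the values in that call are made up for illustration.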
Specific issues to consider<br /><ul><li> What do you want to do with your research data?
What do you want others to be able to do with your data?
Will you want to reuse / share your data in the future?
Who has rights over the data? How will it be licensed?
What contextual metadata will you record and how?
How will you handle file naming and version control?
What level of data quality do you need to achieve?
Will you make use of any data collection policies?
Do you have access to training and support for the above?</li></ul>Tools and resources<br />Your university’s policies and handbooks e.g. ethics handbook<br />www.northumbria.ac.uk/static/5007/respdf/ethics_handbook_2.pdf<br />UKDA guide on managing and sharing data<br />www.data-archive.ac.uk/media/2894/managingsharing.pdf<br />DCC guidance: www.dcc.ac.uk/resources<br />Creative Commons: http://creativecommons.org/<br />
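One way to approach the file naming and version control question is to generate file names from a fixed pattern rather than typing them by hand. The sketch below assumes a pattern of project, description, ISO date and two-digit version number; the pattern and the function names are illustrative assumptions, not a DCC or UKDA recommendation.

```python
import re
from datetime import date


def build_filename(project, description, created, version, extension):
    """Build a name such as 'project_description_YYYY-MM-DD_vNN.ext'."""
    def slug(text):
        # Lower-case and replace runs of non-alphanumeric characters with hyphens
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{slug(project)}_{slug(description)}_"
            f"{created.isoformat()}_v{version:02d}.{extension}")


def next_version(filename):
    """Bump the _vNN part of a name produced by build_filename."""
    match = re.search(r"_v(\d+)\.", filename)
    if match is None:
        raise ValueError(f"no version number found in {filename!r}")
    bumped = int(match.group(1)) + 1
    return filename[:match.start()] + f"_v{bumped:02d}." + filename[match.end():]
```

For example, `build_filename("DATUM", "Interview notes", date(2011, 3, 1), 2, "csv")` gives `datum_interview-notes_2011-03-01_v02.csv`, and passing that name to `next_version` gives the `_v03` name. The date and description here are made up for illustration.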
Appraise: select what you need to keep<br />Activities<br /><ul><li> discuss with your supervisory team which data it is legal, appropriate, and valuable to curate over the long term
decide how your data will be kept e.g. data centre deposit
dispose of data no longer needed in an appropriate manner</li></ul>Roles<br />PGR students, supervisory team, information specialists, funding bodies<br />While storage space and costs may not be an issue for some, the ‘keep everything’ approach may not be viable in the longer term. As the volume of retained data increases, efficient search and retrieval of relevant data becomes more and more difficult.<br />
Specific issues to consider<br /><ul><li> What is the minimum you need to keep for your data findings and publications to be supported over time?
What do your sponsors expect you to keep and where?
What consent about data management have you obtained from your research participants?
Are there any data that you, by law, are not allowed to keep?
Has enough contextual information been collected to make an informed decision about which data to keep?
Do you have access to expertise in your supervisory team or at your institution to assist with selection and appraisal?</li></ul>
Tools and resources<br />DCC briefing paper<br />www.dcc.ac.uk/resources/briefing-papers/introduction-curation/appraisal-and-selection<br />How-to guide:<br />www.dcc.ac.uk/resources/how-guides/appraise-select-research-data<br />
Appraise and select exercise<br />Work in small groups and use your PhD studies as an example<br />Discuss the issues and how they affect selection decisions<br />Epistemological and methodological aspects of qualitative research that might prevent reuse / sharing<br />Ethical constraints, consent requirements, IPR, organisational confidentiality…<br />How to provide sufficient metadata and context so the data is meaningful to the researchers themselves in the future, and to other researchers<br />Long term value of the data: to the researcher themselves, to other researchers, to ‘society’ etc<br />
CHECKLISTS: Conceptualise<br />Get into the habit of equating data curation with good research.<br />Know what your funding body expects you to do with your data and for how long. Assess your ability to meet these expectations (i.e. do you need additional funding or staff?)<br />Determine intellectual property rights from the outset and ensure they are documented.<br />Identify any anticipated publication requirements (embargoes, restrictions on publishing over multiple sites)<br />Identify and document specific roles and responsibilities as early as possible.<br />