Digital curation for postgraduate students

Slides from a DC101 workshop run at the University of Northumbria on behalf of the JISC-funded DATUM for Health project.

Published in: Education, Technology

  • 1. Digital Curation for PGR Students
    Joy Davidson and Sarah Jones
    HATII and the Digital Curation Centre
    DATUM for Health
  • 2. Session aims and objectives
    Aimed at PGRs, we’ll use the context of starting a PhD to:
    introduce the curation lifecycle model as a means of contextualizing the roles and activities required to maintain access to data over time
    highlight some of the curation tools and approaches available and provide pointers to further information and support
    help participants prepare the curation aspects of their PhD study
    We hope you leave able to explain why data curation is important and
    what roles PGR students / researchers play
  • 3. What is data curation?
    “the active management and appraisal of data
    over the lifecycle of scholarly and scientific interest”
    Data have importance as the evidential base
    of scholarly conclusions
    Curation is part of good research practice
  • 4. Why curate: requirements
    Code of good research conduct
    data should be preserved and accessible for 10 years +
    data are a public good and should be openly available
    Funders’ data policies
    Common principles on data policy DataPolicy.aspx
  • 5. Why curate: rewards
    Prevent data loss
    More citations: 69% ↑
    (Piwowar et al., 2007, in PLoS ONE)
    Validation of results
    New research opportunities and collaborations
    Easier to do your research
  • 6. DCC curation lifecycle model
  • 7. Conceptualise: planning what to do
    • define a research question and design your methodology
    • bid for funding (incl. data management and sharing plans)
    • plan data creation (capture methods, standards, formats)
    PGR student, supervisory team, sponsors / funding bodies, IT, research governance, ethics panel
    Decisions made now have an impact on every other stage of the lifecycle, so it is worth getting things right from the start!
  • 10. Specific issues to consider
    • Defining your method – access to software, equipment, skills
    • What storage needs do you anticipate – enough capacity?
    • What are your university’s / sponsor’s requirements?
    • What ethical approval do you require?
    • What agreements do you have to establish at the outset?
    • Will you make use of any existing data – licences needed?
    • Can the data be shared?
    • Are there any legal or ethical restrictions?
    • Are there likely to be any embargoes on data publication?
  • Tools and resources
    DCC Helpdesk:
    If you need further assistance at this stage, please don’t hesitate to drop us a line via our helpdesk and we’ll make every effort to support your curation activity.
    DCC Policy and Legal Pages:
    Our table shows the curation-related requirements of particular funding bodies. This may not be relevant to you now, but will be in your future research career.
  • 19. Data Management and Sharing Plans
    Typically want a short statement covering:
    • What data will be created (format, types) and how?
    • How will the data be documented and described?
    • How will you manage ethics and Intellectual Property?
    • What are the plans for data sharing and access?
    • What is the strategy for long-term preservation?
    DMP guidance:
    DMP online:
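The five questions on this slide can be treated as a checklist while drafting a plan. The following is a minimal sketch only, not a funder template: the section keys, the `missing_sections` and `render_plan` helpers, and the sample answer are all hypothetical names introduced for illustration.

```python
# Section prompts are copied from the slide; the dictionary keys are
# hypothetical labels chosen for this sketch.
DMP_SECTIONS = {
    "data_creation": "What data will be created (format, types) and how?",
    "documentation": "How will the data be documented and described?",
    "ethics_and_ip": "How will you manage ethics and Intellectual Property?",
    "sharing_and_access": "What are the plans for data sharing and access?",
    "preservation": "What is the strategy for long-term preservation?",
}


def missing_sections(answers):
    """Return the keys of sections that have no non-blank answer yet."""
    return [key for key in DMP_SECTIONS if not answers.get(key, "").strip()]


def render_plan(answers):
    """Lay out each question followed by its answer or a placeholder."""
    lines = []
    for key, question in DMP_SECTIONS.items():
        answer = answers.get(key, "").strip() or "[to be completed]"
        lines.append(question)
        lines.append("  " + answer)
    return "\n".join(lines)


if __name__ == "__main__":
    draft = {"data_creation": "Interview audio (WAV) and typed transcripts"}
    print(render_plan(draft))
    print("Still to write:", ", ".join(missing_sections(draft)))
```

Keeping the questions in one place makes it easy to see which parts of the short statement are still unwritten before a bid is submitted.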
  • 24. ESRC data management and sharing plan
    The ESRC currently asks five set questions covering:
    review of existing datasets for reuse;
    what data will be produced;
    any difficulties with mandatory archiving;
    potential users of the dataset;
    cost of preparing and documenting data for archiving
    A new requirement comes into force in spring 2011
    A plan will be expected to cover nine given themes
  • 25. MRC data sharing and preservation strategy
    Expected to provide a succinct summary of:
    Type(s) of qualitative or quantitative data that will be generated
    Further intended and/or foreseeable research uses for the dataset(s)
    Plans for preparing and documenting data for preservation and sharing
    Applicants requesting funds to extend existing data should also explain:
    The distinctive added value that the new data would provide in relation to
    existing studies, databases or datasets in the same field
    How data sharing would provide opportunities for coordination or collaboration /Datasharinginitiative/Policy/index.htm
  • 26. Wellcome Trust data management and sharing plan
    Consider as briefly and unambiguously as possible:
    What data outputs will your research generate and what data will have value to
    other researchers?
    When will you share the data?
    Where will you make the data available?
    How will other researchers be able to access the data?
    Are any limits to data sharing required – for example, to either safeguard
    research participants or to gain appropriate intellectual property protection?
    How will you ensure key datasets are preserved to ensure their long-term value?
    What resources will you require to deliver your plan?
  • 27. Create: collect, capture and analyse data
    • data capture
    • creation of administrative, descriptive, structural and technical metadata
    • clarification of IPR and consent agreements to prevent disputes
    PGR student, supervisory team, information specialists, technical support
    Your data could be expensive or impossible to recapture so it is essential to ensure it has context for long-term comprehensibility and reuse.
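One way to give a data file context at the point of creation is to write a small metadata record alongside it. The sketch below groups fields under the four metadata types named on this slide. It is an illustration under stated assumptions, not a recognised metadata standard: the field names, the `.metadata.json` sidecar convention, and the helper functions are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Compute a SHA-256 checksum so later copies can be verified."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_metadata(path, title, creator, rights, part_of):
    """Group illustrative fields under the four metadata types."""
    return {
        "descriptive": {"title": title, "creator": creator},
        "administrative": {
            "rights": rights,
            "recorded": datetime.now(timezone.utc).isoformat(),
        },
        "structural": {"part_of": part_of, "file_name": path.name},
        "technical": {
            "format": path.suffix.lstrip(".").lower(),
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        },
    }


def write_sidecar(path, record):
    """Save the record next to the data file (hypothetical naming convention)."""
    sidecar = path.with_name(path.name + ".metadata.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

A record like this would be written once per file, for example `write_sidecar(p, build_metadata(p, "Interview 3", "A. Student", "Consent form v2", "Pilot study"))`, where every argument value is a made-up example.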
  • 30. Specific issues to consider
    • What do you want to do with your research data?
    • What do you want others to be able to do with your data?
    • Will you want to reuse / share your data in the future?
    • Who has rights over the data? How will it be licensed?
    • What contextual metadata will you record and how?
    • How will you handle file naming and version control?
    • What level of data quality do you need to achieve?
    • Will you make use of any data collection policies?
    • Do you have access to training and support for the above?
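For the file naming and version control question above, one possible approach is to agree a naming pattern at the start and generate names from it rather than typing them by hand. The sketch below assumes a made-up convention of `project_description_YYYYMMDD_vNN.ext`; the pattern, the function names, and the example values are hypothetical, and your university or data centre may recommend something different.

```python
import re
from datetime import date


def slug(text):
    """Lower-case the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def versioned_name(project, description, created, version, extension):
    """Build a name of the form project_description_YYYYMMDD_vNN.ext."""
    ext = extension.lstrip(".").lower()
    return f"{slug(project)}_{slug(description)}_{created:%Y%m%d}_v{version:02d}.{ext}"


# Matches names produced by versioned_name (two-digit versions only).
NAME_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$")


def next_version(file_name):
    """Return the same name with its version number increased by one."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"not in the expected naming pattern: {file_name!r}")
    bumped = int(match["version"]) + 1
    return f"{match['stem']}_v{bumped:02d}.{match['ext']}"


if __name__ == "__main__":
    first = versioned_name("DATUM", "Interview 03 transcript", date(2011, 2, 14), 1, ".DOCX")
    print(first)
    print(next_version(first))
```

Saving each revision under a new version number, instead of overwriting the previous file, keeps earlier states of the data recoverable without any special software.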
  • Tools and resources
    Your university’s policies and handbooks e.g. ethics handbook
    UKDA guide on managing and sharing data
    Creative Commons
  • 39. Appraise: select what you need to keep
    • discuss with your supervisory team which data it is legal, appropriate, and valuable to curate over the long term
    • decide how your data will be kept e.g. data centre deposit
    • dispose of data no longer needed in an appropriate manner
    PGR students, supervisory team, information specialists, funding bodies
    While storage space and costs may not be an issue for some, the ‘keep everything’ approach may not be viable in the longer term. As the volume of retained data increases, efficient search and retrieval of relevant data becomes more and more difficult.
  • 41. Specific issues to consider:
    • What is the minimum you need to keep for your data findings and publications to be supported over time?
    • What do your sponsors expect you to keep and where?
    • What consent about data management have you obtained from your research participants?
    • Are there any data that you, by law, are not allowed to keep?
    • Has enough contextual information been collected to make an informed decision about which data to keep?
    • Do you have access to expertise in your supervisory team or at your institution to assist with selection and appraisal?
  • 44. Tools and resources
    DCC briefing paper appraisal-and-selection
    How to guide:
  • 45. Appraise and select exercise
    Work in small groups and use your PhD studies as an example
    Discuss the issues and how they affect selection decisions
    Epistemological and methodological aspects of qualitative research that might prevent reuse / sharing
    Ethical constraints, consent requirements, IPR, organisational confidentiality…
    How to provide sufficient metadata and context so the data is meaningful to the researchers themselves in the future, and to other researchers
    Long term value of the data: to the researcher themselves, to other researchers, to ‘society’ etc
  • 46. CHECKLISTS: Conceptualise
    Get into the habit of equating data curation with good research.
    Know what your funding body expects you to do with your data and for how long. Assess your ability to meet these expectations (i.e., do you need additional funding or staff?)
    Determine intellectual property rights from the outset and ensure they are documented.
    Identify any anticipated publication requirements (embargoes, restrictions on publishing over multiple sites)
    Identify and document specific roles and responsibilities as early as possible.
  • 47. CHECKLISTS: Create and/or Receive
  • 48. CHECKLISTS: Select and Appraise