Managing data throughout the research lifecycle
 

Presentation given at University of Northampton, 20th February 2013

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

  • Think of all the different types of information users (and you!) will need to understand the data in the future. If these aren’t captured at the time, it’s very hard to do later. Using standards can make it easier to share or combine data later.
  • A model to help you think about: a) the activities involved throughout the life of your data e.g. creating data, storing it, access; b) who plays a part e.g. storage will involve IT, preservation may be undertaken by a repository.
  • Typically, not every one of the Dublin Core (DC) elements is required to document this resource. Although not apparent in this example, repetition of the Creator element is commonly found in metadata describing performing arts materials, which very often have multiple creators. The Creator element is further refined within this record with the use of a Qualifier.
  • Do you know what’s required in the long term and what support you can draw on? How do you decide what to keep? You may not be allowed to keep everything: see data protection (DP) and freedom of information (FoI) legislation and any participant consent agreements. Various subject-specific data centres are available, and there may be local support.
  • Decisions made at this stage have an impact on what can happen later on so it is worth planning to get things right from the start.
  • These seem to be the five main questions asked across the board by the research councils (RCs). The first link takes you to a document that compares what each funder asks for, and the DCC link is to our guidance on data planning. We’re also providing an online tool to help in the formulation of data management and sharing plans.

Presentation Transcript

  • Managing data throughout the research lifecycle: considerations and pointers to support
    University of Northampton, 20th February 2013
    Marieke Guy, DCC, University of Bath, m.guy@ukoln.ac.uk
    This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
  • Today’s Talk… Managing data throughout the research lifecycle
      • What is the research lifecycle?
      • How do you manage data?
      • What questions does managing data raise?
  • What is the research lifecycle?
      • Research activity often takes place in stages which form a ‘lifecycle’
      • Data is created at points during this lifecycle
      • The data created has its own lifespan
    “Data often have a longer lifespan than the research project that creates them. Researchers may continue to work on data after funding has ceased, follow-up projects may analyse or add to the data, and data may be re-used by other researchers.” UKDA
  • Example 1: DCC lifecycle model
    A model to show the activities and people involved in managing data.
    Diagram labels: Plan, Create data, Add documentation; Researchers, IT, Data centre
  • Example 2: Research360 lifecycle Research Process
  • Example 3: UK Data Archive
  • Example 4: UK Data Archive
  • Key ideas from the research lifecycle
      • Different research lifecycles suit different researchers
      • Research is a circular process
      • Certain stages are likely to be familiar to many researchers: conceptualisation/planning, creation, active use/documentation, publication etc.
      • Certain stages are likely to be familiar to fewer researchers: sharing, re-use etc.
      • Data may be created at many stages during the process (intervention points)
      • Data is likely to need management at many stages during the process
  • Key Qs from the research lifecycle
      1. What data will you produce?
      2. How will you organise the data?
      3. Can you/others understand the data?
      4. What data will be deposited and where?
      5. Who will be interested in re-using the data?
    Lifecycle stages (diagram): 1. Create; 2. Active Use; 3. Documentation; 4. Publication & Deposit; 5. Preservation & Re-Use
  • What is data curation?
    “the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
    Data management is part of good research practice
    Diagram labels: Manage, Share
  • Good data management is about making informed decisions
  • http://xkcd.com/949
  • How do you manage data?
    Key questions to consider when:
      - Creating data
      - Documenting data
      - Storing data
      - Sharing data
      - Preserving data
      - Planning data management
    Examples and pointers to support
  • Creating data: questions
    What formats will you use?
      - determined by the instruments / software you have to use
      - common, widespread formats to enable reuse
    How will you create your data?
      - What methodologies and standards will you use?
      - How will you address ethical concerns and protect participants?
      - Will you control variations to provide quality assurance?
      - What external data sets will you use?
    (See the BL Social Science Collection guide to Management and Business studies datasets)
  • Creating data: advice
    Different formats are good for different things
      - open, lossless formats are more sustainable e.g. rtf, xml, tif, wav
      - proprietary and/or compressed formats are less preservable but are often in widespread use e.g. doc, jpg, mp3
    May choose one format for analysis then convert to a standard format for preservation / sharing
    Excellent guidance on creating data & managing ethics in: www.data-archive.ac.uk/media/2894/managingsharing.pdf
  • File formats for long-term access
      • Unencrypted
      • Uncompressed
      • Non-proprietary, not patent-encumbered
      • Open, documented standard
      • Standard representation (ASCII, Unicode)

    Type             Recommended                                          Avoid for data sharing
    Tabular data     CSV, TSV, SPSS portable                              Excel
    Text             Plain text, HTML, RTF; PDF/A only if layout matters  Word
    Media            Container: MP4, Ogg; Codec: Theora, Dirac, FLAC      Quicktime; H264
    Images           TIFF, JPEG2000, PNG                                  GIF, JPG
    Structured data  XML, RDF                                             RDBMS

    Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table
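As a rough illustration of the file formats table, the sketch below sorts a file name into "recommended", "avoid" or "unknown" by its extension. The mapping from extensions to the slide's format names is an assumption made for this example (for instance `.por` for SPSS portable and `.mov` for Quicktime), and `format_advice` is a hypothetical helper, not part of any DCC or UK Data Archive tool. It cannot tell PDF/A from ordinary PDF, so PDF is left out.

```python
from pathlib import Path

# Assumed extension lists, read off the slide's table; not an official registry.
RECOMMENDED = {
    ".csv", ".tsv", ".por",           # tabular data
    ".txt", ".html", ".rtf",          # text
    ".mp4", ".ogg", ".flac",          # media
    ".tif", ".tiff", ".jp2", ".png",  # images
    ".xml", ".rdf",                   # structured data
}
AVOID = {".xls", ".xlsx", ".doc", ".docx", ".mov", ".gif", ".jpg", ".jpeg"}


def format_advice(filename: str) -> str:
    """Return 'recommended', 'avoid' or 'unknown' for a file name."""
    suffix = Path(filename).suffix.lower()
    if suffix in RECOMMENDED:
        return "recommended"
    if suffix in AVOID:
        return "avoid"
    return "unknown"


if __name__ == "__main__":
    for name in ["survey.csv", "interview.docx", "notes.odt"]:
        print(name, "->", format_advice(name))
```

A check like this only flags candidates for conversion; the decision about which format to keep still depends on the discipline and the chosen data centre.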
  • Documenting data: questions
    What information do users need to understand the data?
      - descriptions of all variables / fields and their values
      - code labels, classification schema, abbreviations list
      - information about the project and data creators
      - tips on usage e.g. exceptions, quirks, questionable results
    How will you capture this?
    Are there standards you can use?
  • Dublin Core metadata example
      Creator: Donald Cooper (Role=Photographer)
      Subject: Shakespeare, William, 1564-1616, Antony and Cleopatra [LC]
      Description: Vanessa Redgrave as Cleopatra
      Date: 1973-08-09
      Type: Image
      Format: JPEG
      Identifier: 4150 [catalogue no]
      Source: negative no 235
      Relation: Antony and Cleopatra: Thompson/73-8 (IsPartOf)
      Coverage: Bankside Globe (Role=Spatial)
      Rights: Donald Cooper
    http://www.ahds.ac.uk/performingarts
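The Dublin Core record on this slide could be captured in machine-readable form roughly as follows. This is a minimal sketch using Python's standard `xml.etree.ElementTree` and the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The `record` wrapper element and the `role`, `scheme` and `qualifier` attribute names are assumptions made for illustration; the original AHDS record may have encoded its qualifiers differently.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# (element, value, attributes); attribute names are hypothetical.
FIELDS = [
    ("creator", "Donald Cooper", {"role": "Photographer"}),
    ("subject", "Shakespeare, William, 1564-1616, Antony and Cleopatra",
     {"scheme": "LC"}),
    ("description", "Vanessa Redgrave as Cleopatra", {}),
    ("date", "1973-08-09", {}),
    ("type", "Image", {}),
    ("format", "JPEG", {}),
    ("identifier", "4150", {}),
    ("source", "negative no 235", {}),
    ("relation", "Antony and Cleopatra: Thompson/73-8",
     {"qualifier": "IsPartOf"}),
    ("coverage", "Bankside Globe", {"role": "Spatial"}),
    ("rights", "Donald Cooper", {}),
]


def build_record(fields):
    """Build a <record> element holding one dc:* child per field."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, value, attrs in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}", attrs)
        element.text = value
    return record


record = build_record(FIELDS)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because Creator is an ordinary list entry here, a resource with several creators would simply repeat the `creator` tuple.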
  • Storing data: questions
    What is available to you?
    What facilities do you need?
      - remote access
      - file sharing with colleagues
      - high levels of security
    How will the data be backed up?
  • Storing data: advice
    Speak to the Northampton IT Team for advice (TUNDRA2)
    Remember that all storage is fallible, so you need to back up
      - keep 2+ copies on different types of media in different locations
      - manage back-ups (migrate media, test integrity)
    Choose appropriate methods to transfer / share data
      - email, dropbox, ftp, encrypted media, filestore, VREs...
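One common way to "test integrity" of back-ups is to compare checksums of the original file and each copy. The sketch below uses SHA-256 from Python's standard `hashlib`; the function names `sha256_of` and `find_damaged_copies` are hypothetical and chosen for this example only, and the slides do not prescribe any particular method.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged_copies(original, copies):
    """Return the copies that are missing or differ from the original."""
    expected = sha256_of(original)
    return [
        str(copy)
        for copy in copies
        if not Path(copy).is_file() or sha256_of(copy) != expected
    ]
```

For example, `find_damaged_copies("data.csv", ["/mnt/backup/data.csv", "/media/usb/data.csv"])` (both paths hypothetical) would return an empty list when every copy matches the original.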
  • Sharing data: questions
    Does your funder expect you to share data?
    Which data can be shared?
    How will you share your data?
    What do you get from sharing?
      - citations, recognition...
  • Sharing data: advice
    Where possible, make your data available via repositories, data centres and structured databases
      • Northampton Electronic Collection of Theses and Research (NECTAR) http://nectar.northampton.ac.uk/
      • http://datacite.org/repolist
      • http://databib.org/
  • Preserving data: questions
    Are you required to preserve (or destroy) your data?
    How will you select what to keep?
    Is there somewhere you can archive your data?
    How can you support the reuse of your data?
  • Preserving data: advice
    How to select and appraise research data: www.dcc.ac.uk/resources/how-guides/appraise-select-research-data
    How to license research data: www.dcc.ac.uk/resources/how-guides/license-research-data
    How to cite datasets and link to publications: www.dcc.ac.uk/resources/how-guides/cite-datasets
  • Planning data management
    What do you (and others) want to do with the data? Your decisions should bear this in mind and make it feasible.
    Remember:
    Data management is about making informed decisions
    Talk to colleagues and support staff to see which option works best
  • Data Management and Sharing Plans
    Funders typically want a short statement covering:
      - What data will be created (format, types) and how?
      - How will the data be documented and described?
      - How will you manage ethics and Intellectual Property?
      - What are the plans for data sharing and access?
      - What is the strategy for long-term preservation?
    DMP tool: https://dmponline.dcc.ac.uk/
    How to write a DMP: www.dcc.ac.uk/resources/how-guides/develop-data-plan
  • Thanks - any questions?
    Acknowledgements: Thanks to DCC staff, UK Data Archive and Research360 for slides