Research Data Management
Presentation given at the University of East London on 1st May 2013.


Usage Rights: CC Attribution-NonCommercial-NoDerivs License


Research Data Management: Presentation Transcript

  • Research Data Management
    University of East London, 1st May 2013
    Sarah Jones, Digital Curation Centre
    sarah.jones@glasgow.ac.uk
    Twitter: sjDCC
  • Why are you here?
    • You’re managing data (your own or your group’s)
    • Or you think you maybe should be
    • You’re not sure why it matters
    • You’re not sure how best to do it
    • You’d like to know whether you’re on the right track
    Photo by Orijinal: http://www.flickr.com/photos/orijinal/3539418133
  • Why manage your data?
  • What if your data fell into the wrong hands?
    http://news.bbc.co.uk/1/hi/uk/8332445.stm
  • What if you had to produce your data?
  • What if this was your desk?
    http://www.computerweekly.com
  • Why YOU need a Data Management Plan
    What if this was your backpack?
    http://blogs.ch.cam.ac.uk/pmr/2011/08/01/why-you-need-a-data-management-plan
  • Good data management is about making informed decisions
  • http://xkcd.com/949
  • Why manage research data?
    • To make your research easier!
    • To stop yourself drowning in irrelevant stuff
    • In case you need the data later
    • To avoid accusations of fraud or bad science
    • To share your data for others to use and learn from
    • To get credit for producing it
    • Because somebody else said to do so
  • RDM policy at UEL
    http://www.uel.ac.uk/wwwmedia/services/library/lls/resources/rspresearchtools/Research-Data-Management-policy-for-UEL-FINAL.pdf
  • Expectations of public access
    “Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.”
    RCUK Common Principles on Data Policy: http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
  • ...open data
    http://www.bis.gov.uk/innovatingforgrowth
  • ...personal data
  • Benefits of sharing data (1): ... scientific breakthroughs
    “It was unbelievable. It’s not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.”
    Dr John Trojanowski, University of Pennsylvania
    www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0
  • Benefits of sharing data (2): ... validation of results
    “It was a mistake in a spreadsheet that could have been easily overlooked: a few rows left out of an equation to average the values in a column. The spreadsheet was used to draw the conclusion of an influential 2010 economics paper: that public debt of more than 90% of GDP slows down growth. This conclusion was later cited by the International Monetary Fund and the UK Treasury to justify programmes of austerity that have arguably led to riots, poverty and lost jobs.”
    www.guardian.co.uk/politics/2013/apr/18/uncovered-error-george-osborne-austerity
  • Benefits of sharing data (3): ... more citations
    “There is evidence that studies that make their data available do indeed receive more citations than similar studies that do not.”
    Piwowar H. and Vision T.J. 2013, “Data reuse and the open data citation advantage”, https://peerj.com/preprints/1.pdf
    9% - 30% increase
  • Things to think about...
    Photo by @boetter: http://www.flickr.com/photos/jakecaptive/3205277810
  • What is data management?
    “the active management and appraisal of data over the lifecycle of scholarly and scientific interest” (Digital Curation Centre)
    Data management is just part of good research practice
  • What is involved in RDM?
    • Data Management Planning
    • Creating data
    • Documenting data
    • Accessing / using data
    • Storage and backup
    • Sharing data
    • Preserving data
  • If you plan to share your data...
    • Have you got consent for sharing?
    • Do any licences you’ve signed permit sharing?
    • Is your data in suitable formats?
    Decisions made early on affect what you can do later
  • File formats for long-term access
    • Unencrypted
    • Uncompressed
    • Non-proprietary and not patent-encumbered
    • Open, documented standard
    • Standard representation (ASCII, Unicode)

    Type              Recommended                                            Avoid for data sharing
    Tabular data      CSV, TSV, SPSS portable                                Excel
    Text              Plain text, HTML, RTF; PDF/A only if layout matters    Word
    Media             Container: MP4, Ogg; Codec: Theora, Dirac, FLAC        Quicktime, H264
    Images            TIFF, JPEG2000, PNG                                    GIF, JPG
    Structured data   XML, RDF                                               RDBMS

    Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table
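As a rough illustration of the "Tabular data" recommendation, the sketch below writes a small table as plain, uncompressed CSV using only the Python standard library. The function name `rows_to_csv`, the column names and the sample values are hypothetical and are not part of the presentation.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialise tabular data as plain, uncompressed CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data, used only to show the shape of the output.
header = ["site", "date", "temperature_c"]
rows = [["A", "2013-05-01", 11.5], ["B", "2013-05-01", 12.0]]
csv_text = rows_to_csv(header, rows)

# To save it, write with an explicit Unicode encoding, for example:
# with open("readings.csv", "w", encoding="utf-8", newline="") as f:
#     f.write(csv_text)
```

Putting the unit in the column name (here `temperature_c`) is one simple way to keep units attached to the data, though a separate codebook is usually still needed.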
  • Documentation
    What would someone unfamiliar with your data need in order to find, evaluate, understand, and reuse them?
    Consider the differences between someone inside your research group, someone outside your group but in your field, and someone outside your field.
    Two parts: metadata and methods
  • Metadata
    • About the project
      – Title, people, key dates, funders and grants
    • About the data
      – Title, key dates, creator(s), subjects, rights, included files, format(s), versions, checksums
    • Keep this with the data
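One possible way to record "included files, format(s), checksums" and keep them with the data is sketched below. The file name `metadata.json`, the field names and the helper functions are assumptions made for illustration; the presentation does not prescribe a particular layout, and a disciplinary metadata standard would normally take precedence.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_metadata(data_dir, title, creators):
    """Describe every file in data_dir (hypothetical field names)."""
    files = []
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file() and path.name != "metadata.json":
            files.append({
                "name": path.name,
                "format": path.suffix.lstrip("."),
                "sha256": sha256_of(path),
            })
    return {"title": title, "creators": creators, "files": files}


def write_metadata(data_dir, record):
    """Store the record next to the data it describes."""
    target = Path(data_dir) / "metadata.json"
    target.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return target


# Demonstration on a throwaway directory with one made-up data file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "readings.csv").write_text("site,temperature_c\nA,11.5\n", encoding="utf-8")
    record = build_metadata(tmp, title="Example dataset", creators=["A. Researcher"])
    metadata_path = write_metadata(tmp, record)
    saved = json.loads(metadata_path.read_text(encoding="utf-8"))
```

Recomputing the checksums later and comparing them with the stored values is a cheap way to detect silent corruption of a copy.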
  • Methods
    • Reason #1 for not reusing someone else’s data: “I don’t know enough about how it was gathered to trust it.”
    • Document what you did. (A published article may not be enough.)
    • Document any limitations of what you did.
    • If you ran code on the data, document the code and keep it with the data.
    • Need a codebook? Or a data dictionary?
      – If I can’t identify at sight what each bit of your dataset means, yes, you do need a codebook or data dictionary.
      – DO NOT FORGET THE UNITS!
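A codebook can be checked mechanically. The sketch below, under the assumption that the codebook is a simple mapping from column name to a description and units, reports columns that are undocumented or have no units. The function name, the dictionary layout and the sample columns are all hypothetical.

```python
def codebook_problems(columns, codebook):
    """List columns that lack a codebook entry, a description, or units."""
    problems = []
    for name in columns:
        entry = codebook.get(name)
        if entry is None:
            problems.append(f"{name}: no codebook entry")
            continue
        if not entry.get("description"):
            problems.append(f"{name}: missing description")
        # Unitless variables should say so explicitly, e.g. "n/a".
        if not entry.get("units"):
            problems.append(f"{name}: missing units")
    return problems


# Made-up dataset columns and a deliberately incomplete codebook.
columns = ["site", "temperature", "depth"]
codebook = {
    "site": {"description": "Sampling site identifier", "units": "n/a"},
    "temperature": {"description": "Water temperature at the surface"},
}
problems = codebook_problems(columns, codebook)
```

In this example `problems` flags `temperature` for missing units and `depth` for having no entry at all.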
  • Standards
    • Why reinvent the wheel? If there’s a standard format for your data or how to describe it, use that!
    • The tricky part is finding the right standard.
      – Standards are like toothbrushes...
      – But using standards is good hygiene!
      – Your librarian can often help you find relevant standards.
      – Also check out the DCC catalogue of disciplinary metadata: http://www.dcc.ac.uk/resources/metadata-standards
  • Where to store your data?
    • Your own drive (PC, server, flash drive, etc.)
      – And if you lose it? Or it breaks?
    • Somebody else’s drive
    • Departmental drive
    • “Cloud” drive
      – Do they care as much about your data as you do?
  • How to back up?
    • 3… 2… 1… backup!
      – at least 3 copies of a file
      – on at least 2 different media
      – with at least 1 offsite
    • Use managed services where possible, e.g. University filestores rather than local or external hard drives
    • Ask central IT team for advice
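The 3-2-1 rule above can be expressed as a small check over an inventory of copies. This is a minimal sketch: the `Copy` record, its fields and the example locations are invented for illustration and do not describe any particular backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str   # where the copy lives (free text)
    medium: str     # kind of storage, e.g. "internal disk"
    offsite: bool   # True if stored away from the primary site


def check_3_2_1(copies):
    """Evaluate each part of the 3-2-1 rule separately."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len({c.medium for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


def satisfies_3_2_1(copies):
    """True only if all three conditions hold."""
    return all(check_3_2_1(copies).values())


# Hypothetical inventory for a single file.
copies = [
    Copy("office PC", "internal disk", offsite=False),
    Copy("university filestore", "network storage", offsite=False),
    Copy("drive kept at home", "external disk", offsite=True),
]
result = check_3_2_1(copies)
```

Returning the three conditions separately makes it clear which part of the rule a given setup fails.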
  • What to keep?
    It’s not possible to keep everything. Select based on:
      – What has to be kept, e.g. data underlying publications
      – What can’t be recreated, e.g. environmental recordings
      – What is potentially useful to others
      – What has scientific, cultural or historical value
      – What legally must be destroyed
      – ...
    How to select and appraise research data: www.dcc.ac.uk/resources/how-guides/appraise-select-research-data
  • How to share/preserve data?
    • What is required?
      – By your funder
      – By your publisher
      – By your uni
    • What subject repositories, data centres and structured databases are available? http://databib.org
  • Putting the pieces together...
    Photo by Dread Pirate Jeff: http://www.flickr.com/photos/justageek/2851643792
  • Data Management Plans
    DMPs are often submitted with grant applications, but are useful whenever you are creating data to:
    • Make informed decisions to anticipate and avoid problems
    • Avoid duplication, data loss and security breaches
    • Develop procedures early on for consistency
    • Ensure data are accurate, complete, reliable and secure
    • Save time and effort – make your life easier!
  • Which funders require a DMP?
    www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
  • What do research funders want?
    • A brief plan submitted in grant applications, and in the case of NERC, a more detailed plan once funded
    • 1-3 sides of A4 as attachment or a section in Je-S form
    • Typically a prose statement covering suggested themes
    • An outline of data management and sharing plans, justifying decisions and any limitations
  • Five common themes
    1. Description of data to be collected / created (i.e. content, type, format, volume...)
    2. Standards / methodologies for data collection & management
    3. Ethics and Intellectual Property (highlight any restrictions on data sharing, e.g. embargoes, confidentiality)
    4. Plans for data sharing and access (i.e. how, when, to whom)
    5. Strategy for long-term preservation
  • A useful framework to get started
    • Think about why the questions are being asked
    • Look at examples to get an idea of what to include
    www.icpsr.umich.edu/icpsrweb/content/datamanagement/dmp/framework.html
  • Help from the DCC
    • https://dmponline.dcc.ac.uk: a web-based tool to help you write DMPs according to different requirements
    • www.dcc.ac.uk/resources/how-guides/develop-data-plan
  • How DMP Online works
    Create a plan based on relevant funder / institutional templates...
    ...and then answer the questions using the guidance provided
  • Example plans
    • Technical plan submitted to AHRC by Bristol Uni: http://data.bris.ac.uk/files/2013/02/data.bris-AHRC-Technical-Plan-v21.pdf
    • Rural Economy & Land Use (RELU) programme examples: http://relu.data-archive.ac.uk/data-sharing/planning/examples
    • UCSD example DMPs (20+ scientific plans for NSF): http://rci.ucsd.edu/dmp/examples.html
    • My DMP – a satire (what not to write!): http://ivory.idyll.org/blog/data-management.html
  • Tips on writing DMPs
    • Keep it simple, short and specific
    • Seek advice - consult and collaborate
    • Base plans on available skills and support
    • Make sure implementation is feasible
    • Justify any resources or restrictions needed
    http://www.youtube.com/watch?v=7OJtiA53-Fk
  • Acknowledgement
    Thanks in particular to Dorothea Salo, Ryan Schryver and colleagues for content from the “Escaping Datageddon” presentation, available at: http://www.slideshare.net/cavlec/escaping-datageddon
    And to the Research360 project at the University of Bath for the “Managing your research data” presentation, available at: http://opus.bath.ac.uk/32296
  • Thanks – any questions?
    DCC guidance, tools and case studies: www.dcc.ac.uk/resources
    Follow us on twitter: @digitalcuration and #ukdcc
  • Exercise
    • Use the template to start drafting a DMP
    • Discuss your ideas in groups to identify available support and decide the best approaches to follow for your context