Research Lifecycles and RDM
Marieke Guy
Digital Curation Centre (DCC)

About this presentation
Managing data throughout the research lifecycle
• What is the research lifecycle?
• How do you manage data?
• What questions does managing data raise?

What is the research lifecycle?
• Research activity often takes place in stages which form a ‘lifecycle’
• Data is created at points during this lifecycle
• The data created has its own lifespan
“Data often have a longer lifespan than the research project that creates them. Researchers may continue to work on data after funding has ceased, follow-up projects may analyse or add to the data, and data may be re-used by other researchers.” UKDA

Example 1: DCC lifecycle model
A model to show the activities and people involved in managing data.
[Diagram labels: Plan, Create data, Add documentation; Researchers, IT, Data centre]

Key ideas from the research lifecycle
• Different research lifecycles suit different researchers
• Research is a circular process
• Certain stages are likely to be familiar to many researchers – conceptualisation/planning, creation, active use/documentation, publication etc.
• Certain stages are likely to be familiar to fewer researchers – sharing, re-use etc.
• Data may be created at many stages during the process (intervention points)
• Data is likely to need management at many stages during the process

Key Qs from the research lifecycle
Stages:
1. Create
2. Active Use
3. Documentation
4. Publication & Deposit
5. Preservation & Re-Use
Questions:
1. What data will you produce?
2. How will you organise the data?
3. Can you/others understand the data?
4. What data will be deposited and where?
5. Who will be interested in re-using the data?

What is data curation?
“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
Data management is part of good research practice
[Diagram labels: Manage, Share]

Good data management is about making informed decisions

How do you manage data?
Key questions to consider when:
- Creating data
- Documenting data
- Storing data
- Sharing data
- Preserving data
- Planning data management
Examples and pointers to support

Creating data: questions
What formats will you use?
- determined by the instruments / software you have to use
- common, widespread formats to enable reuse
How will you create your data?
- What methodologies and standards will you use?
- How will you address ethical concerns and protect participants?
- Will you control variations to provide quality assurance?
- What external data sets will you use?
(See the BL Social Science Collection guide to Management and Business studies datasets)

Creating data: advice
Different formats are good for different things
- open, lossless formats are more sustainable e.g. rtf, xml, tif, wav
- proprietary and/or compressed formats are less preservable but are often in widespread use e.g. doc, jpg, mp3
May choose one format for analysis then convert to a standard format for preservation / sharing
Excellent guidance on creating data & managing ethics in:
www.data-archive.ac.uk/media/2894/managingsharing.pdf

File formats for long-term access
• Unencrypted
• Uncompressed
• Non-proprietary, not patent-encumbered
• Open, documented standard
• Standard representation (ASCII, Unicode)

Type             Recommended                                          Avoid for data sharing
Tabular data     CSV, TSV, SPSS portable                              Excel
Text             Plain text, HTML, RTF; PDF/A only if layout matters  Word
Media            Container: MP4, Ogg; Codec: Theora, Dirac, FLAC      Quicktime, H264
Images           TIFF, JPEG2000, PNG                                  GIF, JPG
Structured data  XML, RDF                                             RDBMS

Further examples: http://www.data-archive.ac.uk/create-manage/format/formats-table

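As a rough illustration of how the format table above could be applied to a project folder, the sketch below flags files whose extensions suggest a format the table lists under "Avoid for data sharing" and names the recommended alternative. The extension-to-format mapping and the function name `flag_formats` are assumptions made for this example; the table itself names formats, not file extensions.

```python
from pathlib import PurePath

# Assumed mapping from file extension to the alternative recommended in the
# table. The slide lists formats (Excel, Word, ...), not extensions.
AVOID_FOR_SHARING = {
    ".xls": "CSV, TSV or SPSS portable",
    ".xlsx": "CSV, TSV or SPSS portable",
    ".doc": "plain text, HTML or RTF (PDF/A only if layout matters)",
    ".docx": "plain text, HTML or RTF (PDF/A only if layout matters)",
    ".mov": "an MP4 or Ogg container",
    ".gif": "TIFF, JPEG2000 or PNG",
    ".jpg": "TIFF, JPEG2000 or PNG",
    ".jpeg": "TIFF, JPEG2000 or PNG",
}


def flag_formats(filenames):
    """Return {filename: suggested alternative} for files in formats to avoid."""
    flagged = {}
    for name in filenames:
        # Compare extensions case-insensitively, so "scan.JPG" matches ".jpg".
        suffix = PurePath(name).suffix.lower()
        if suffix in AVOID_FOR_SHARING:
            flagged[name] = AVOID_FOR_SHARING[suffix]
    return flagged


if __name__ == "__main__":
    for name, alternative in flag_formats(["survey.xlsx", "notes.txt", "scan.JPG"]).items():
        print(f"{name}: consider {alternative}")
```

A check like this only looks at file names; it does not inspect file contents or perform any conversion.
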
Documenting data: questions
What information do users need to understand the data?
- descriptions of all variables / fields and their values
- code labels, classification schema, abbreviations list
- information about the project and data creators
- tips on usage e.g. exceptions, quirks, questionable results
How will you capture this?
Are there standards you can use?

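One possible way to start capturing variable-level documentation is to generate a codebook skeleton from the data file itself and then fill in the descriptions by hand. The sketch below is an illustration only: the function name `codebook_skeleton`, the output layout and the sample data are all hypothetical, not part of any standard.

```python
import csv
import io


def codebook_skeleton(csv_text, max_values=5):
    """List each variable in a CSV with a few observed values and a blank description."""
    reader = csv.DictReader(io.StringIO(csv_text))
    observed = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            # Skip surplus columns (key None) and missing cells (value None).
            if name not in observed or value is None:
                continue
            seen = observed[name]
            if value not in seen and len(seen) < max_values:
                seen.append(value)
    return [
        {"variable": name, "description": "", "observed_values": seen}
        for name, seen in observed.items()
    ]


if __name__ == "__main__":
    sample = "id,sex\n1,F\n2,M\n3,F\n"  # made-up example data
    for entry in codebook_skeleton(sample):
        print(entry)
```

The blank `description` fields are where the code labels, abbreviations and usage tips listed above would be written in.
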
Dublin Core metadata example
Creator: Donald Cooper; Role=Photographer
Subject: Shakespeare, William, 1564-1616, Antony and Cleopatra [LC]
Description: Vanessa Redgrave as Cleopatra
Date: 1973-08-09
Type: Image
Format: JPEG
Identifier: 4150 [catalogue no]
Source: negative no 235
Relation: Antony and Cleopatra: Thompson/73-8; IsPartOf
Coverage: Bankside Globe; Role=Spatial
Rights: Donald Cooper
http://www.ahds.ac.uk/performingarts

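For readers who want to see how a record like the one above might look in machine-readable form, here is a minimal sketch that serialises a few of its fields as XML using the Dublin Core element namespace. The wrapper element `record` and the function name `dublin_core_record` are assumptions for this example, and the qualifiers shown on the slide (Role, IsPartOf) are left out for simplicity.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialise (element name, value) pairs as simple Dublin Core XML."""
    root = ET.Element("record")  # hypothetical wrapper element, not part of Dublin Core
    for name, value in fields:
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(dublin_core_record([
        ("creator", "Donald Cooper"),
        ("description", "Vanessa Redgrave as Cleopatra"),
        ("date", "1973-08-09"),
        ("type", "Image"),
        ("format", "JPEG"),
        ("rights", "Donald Cooper"),
    ]))
```

A repository or archive will usually have its own schema and submission format, so this is only a starting point for understanding the structure.
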
Storing data: questions
What is available to you?
What facilities do you need?
- remote access
- file sharing with colleagues
- high levels of security
How will the data be backed up?

Storing data: advice
Speak to the Northampton IT Team for advice – TUNDRA2
Remember that all storage is fallible – need to back up
- keep 2+ copies on different types of media in different locations
- manage back-ups (migrate media, test integrity)
Choose appropriate methods to transfer / share data
- email, dropbox, ftp, encrypted media, filestore, VREs...

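One common way to "test integrity" of back-up copies is to compare checksums against the original file. The sketch below uses SHA-256 from the Python standard library; the function names `sha256_of` and `verify_copies` are made up for this example, and the choice of SHA-256 is an assumption rather than something the slide prescribes.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original, copies):
    """Return the copies that are missing or whose checksum differs from the original."""
    expected = sha256_of(original)
    failed = []
    for copy in copies:
        copy_path = Path(copy)
        if not copy_path.is_file() or sha256_of(copy_path) != expected:
            failed.append(str(copy))
    return failed
```

Calling `verify_copies("data.csv", ["/mnt/backup1/data.csv", "/mnt/backup2/data.csv"])` (hypothetical paths) returns an empty list when every copy matches. Running such a check on a schedule helps catch silent corruption before the last good copy is lost.
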
Sharing data: questions
Does your funder expect you to share data?
Which data can be shared?
How will you share your data?
What do you get from sharing?
- citations, recognition...

Sharing data: advice
Where possible, make your data available via repositories, data centres and structured databases
http://datacite.org/repolist
http://databib.org/
Northampton Electronic Collection of Theses and Research (NECTAR)
http://nectar.northampton.ac.uk/

Preserving data: questions
Are you required to preserve (or destroy) your data?
How will you select what to keep?
Is there somewhere you can archive your data?
How can you support the reuse of your data?

Preserving data: advice
How to select and appraise research data:
www.dcc.ac.uk/resources/how-guides/appraise-select-research-data
How to license research data:
www.dcc.ac.uk/resources/how-guides/license-research-data
How to cite datasets and link to publications:
www.dcc.ac.uk/resources/how-guides/cite-datasets

Planning data management
What do you (and others) want to do with the data?
- your decisions should bear this in mind and make it feasible
Remember:
Data management is about making informed decisions
Talk to colleagues and support staff to see which option works best

Data Management and Sharing Plans
Funders typically want a short statement covering:
- What data will be created (format, types) and how?
- How will the data be documented and described?
- How will you manage ethics and Intellectual Property?
- What are the plans for data sharing and access?
- What is the strategy for long-term preservation?
DMP tool: https://dmponline.dcc.ac.uk/
How to write a DMP:
www.dcc.ac.uk/resources/how-guides/develop-data-plan

The research process at Cardiff
Take 10 minutes to think about one of the academic departments you work with
- How would you characterise the subject as a whole?
- What do you know about the social organisation of the discipline?
- What do you know about the data they create?
- Are you familiar with the research process in this department?
- When might they require your help?

Thanks - any questions?
Acknowledgements:
Thanks to DCC staff, UK Data Archive and Research360 for slides