Wednesday 6 March 2013, KAPTUR Project Conference, RIBA, London
Supporting Research Data Management in Universities: the Jisc Managing Research Data Programme
Simon Hodson, JISC Programme Manager, Managing Research Data
Why is managing research data important?
JISC considers it a priority to support universities in improving the way research data is managed and, where appropriate, made available for reuse.
Research funder policies, legislative frameworks, good practice, open data agenda
– The outputs of publicly funded research should be publicly available.
– The evidence underpinning research findings should be available for validation.
Good data management is good for research
– More efficient research process, avoidance of data loss, benefits of data reuse.
Alignment with university missions
– Universities want to provide excellent research infrastructure.
– Universities want to have better oversight of research outputs.
What is Jisc doing?
Jisc Managing Research Data Programme: developing capacity and good practice
– First MRD Programme, 2009-11: http://bit.ly/jiscmrd2009-11
– Selected outputs from the first programme: http://bit.ly/jiscmrd2009-11-outputs
– Second JISC MRD Programme, 2011-13: http://bit.ly/jiscmrd2011-13
– Programme Manager Blog: http://researchdata.jiscinvolve.org/
Digital Curation Centre: ‘because good research needs good data’
– Advice, guidance, advocacy, training in RDM: http://www.dcc.ac.uk/
– How to Guides: http://www.dcc.ac.uk/resources/how-guides
Janet Brokerage: Collaborative purchasing, B2B brokerage.
– Suite of services (generic research tools, cloud storage): https://www.ja.net/products-services/janet-brokerage
STOP What do we mean by research data? The digital and other artifacts that are created during the process of research, and which through analysis form the evidence that underpins research findings.
Data management and good research practice
Good data management is good practice
– Avoidance of data loss.
– Effective research: file naming, annotation etc: how do you find your data, how do you understand it?
– ‘The first person with whom you share your data is your future self’!
Data sharing / data publication is good for research
– Verification of research findings / Deterrence of fraud.
– Reproducibility of research / Science as a self-correcting process.
– Benefits of data reuse: asking new questions of old data.
– Return on investment.
– Metastudies/systematic review: greater statistical value of integrated results.
– Integration of data in interdisciplinary research: the grand challenges require multiple data sets.
DUDs The data centre under the desk (or in a backpack) is not adequate.
Can we quantify the benefits of reducing data loss?
Jisc Managing Research Data Programme project surveys have uncovered evidence of data loss. One survey found that 23.3% of respondents had lost research data:
– 0.5% had suffered catastrophic loss of all their research data as it had not been backed up.
– 7.5% had lost one week’s work.
– 8% had lost one day’s work.
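A rough sketch of how the survey figures above could be turned into a headline number. It assumes the three sub-figures are shares of all respondents (the slide does not say so explicitly), and the headcount of 1000 researchers and the five-day working week are hypothetical values chosen only for illustration, not survey results.

```python
# Survey percentages quoted on the slide, read here as shares of all
# respondents (an assumption; the slide does not state the base).
LOST_ANY = 23.3
BREAKDOWN = {
    "catastrophic loss of all data": 0.5,
    "one week's work": 7.5,
    "one day's work": 8.0,
}

# Hypothetical institution, for illustration only.
HEADCOUNT = 1000
WORKING_DAYS_PER_WEEK = 5


def affected(headcount: int, percent: float) -> float:
    """Expected number of researchers affected at a given percentage."""
    return headcount * percent / 100


# Share of respondents covered by the three itemised categories,
# and the remainder of the 23.3% that the slide does not break down.
itemised = sum(BREAKDOWN.values())
unitemised = round(LOST_ANY - itemised, 1)

# Working days lost from the two time-bounded categories only; the
# catastrophic losses cannot be expressed in days from these figures.
days_lost = (
    affected(HEADCOUNT, BREAKDOWN["one week's work"]) * WORKING_DAYS_PER_WEEK
    + affected(HEADCOUNT, BREAKDOWN["one day's work"])
)

print(f"Itemised categories: {itemised}% of respondents")
print(f"Not itemised on the slide: {unitemised}% of respondents")
print(f"Working days lost per {HEADCOUNT} researchers: {days_lost:.0f}")
```

Under these assumptions the time-bounded losses alone come to several hundred working days per 1000 researchers, before counting the researchers who lost everything.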
Royal Society Science as an Open Enterprise Report, 2012
‘how the conduct and communication of science needs to adapt to this new era of information technology’.
‘As a first step towards this intelligent openness, data that underpin a journal article should be made concurrently available in an accessible database. We are now on the brink of an achievable aim: for all science literature to be online, for all of the data to be online and for the two to be interoperable.’
Royal Society June 2012, Science as an Open Enterprise, http://royalsociety.org/policy/projects/science-public-enterprise/report/
Science as an Open Enterprise Report: six key changes
1. a shift away from a research culture where data is viewed as a private preserve;
2. expanding the criteria used to evaluate research to give credit for useful data communication and novel ways of collaborating;
3. the development of common standards for communicating data;
4. mandating intelligent openness for data relevant to published scientific papers;
5. strengthening the cohort of data scientists needed to manage and support the use of digital data (which will also be crucial to the success of private sector data analysis and the government’s Open Data strategy);
6. the development and use of new software tools to automate and simplify the creation and exploitation of datasets.
Royal Society 2012, Science as an Open Enterprise, http://royalsociety.org/policy/projects/science-public-enterprise/report/
Drivers: Research Funder Policies
RCUK Common Principles on Data Policy: http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
1. Public good: Publicly funded research data are produced in the public interest and should be made openly available with as few restrictions as possible.
2. Planning for preservation: Institutional and project-specific data management policies and plans are needed to ensure valued data remains usable.
3. Discovery: Metadata should be available and discoverable; published results should indicate how to access supporting data.
4. Confidentiality: Research organisation policies and practices should ensure that legal, ethical and commercial constraints are assessed; the research process should not be damaged by inappropriate release.
5. First use: Provision for a period of exclusive use, to enable research teams to publish results.
6. Recognition: Data users should acknowledge data sources and terms & conditions of access.
7. Public funding: Use of public funds for RDM infrastructure is appropriate and must be efficient and cost-effective.
DCC Overview of Funder Data Policies: http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
EPSRC Research Data Policy Expectations
Policy and expectations: http://www.epsrc.ac.uk/about/standards/researchdata/Pages/policyframework.aspx
– Research organisations to have RDM policy, advocacy and support functions. (i, iii)
– Research data to be effectively managed and curated throughout the life-cycle. (viii)
– Research organisations to maintain public catalogue of research data holdings, adequate metadata and permanent identifier. (v)
– Publications to indicate how research data can be accessed. (ii)
– Data to be retained for 10 years from last access. (vii)
– Research data management to be adequately resourced from appropriate funding streams. (ix)
Roadmap in place by 1 May 2012.
Compliance by 1 May 2015.
Barriers to data sharing…
Researchers’ concerns:
– Concern that data may be misused or misunderstood.
– Concern that they will lose their scientific edge if they share data before it is fully exploited.
– Desire to retain control of a professional asset.
– Concern that they will not be credited.
– Lack of career rewards for data publication.
See ODE report, using Parse.Insight findings: http://www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODE-ReportOnIntegrationOfDataAndPublications-1_1.pdf
RIN Report, ‘To Share or not to share’, http://www.rin.ac.uk/our-work/data-management-and-curation/share-or-not-share-research-data-outputs
Professional benefits of data sharing
“48% of trials with publicly available microarray data received 85% of the aggregate citations” -- Piwowar HA, Day RS, Fridsma DB (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308.
“We find strong and consistent evidence that data sharing, both formal and informal, increases research productivity across a wide range of publication metrics. Data archiving, in particular, yields the greatest returns on investment with research productivity (number of publications) being greater when data are archived. Not sharing data, either formally or informally, limits severely the number of publications tied to research data.” -- Pienta, Alter, Lyle (2010) The Enduring Value of Science Research: The Use and Reuse of Primary Research Data.
“authors who make data from their articles available are cited twice as frequently as articles with ‘no data but otherwise equivalent credentials, including degree of formalization.’” -- Glenditsch, Petter, Metelits, and Strand (2003: 92)
Slide credit, Joss Winn, University of Lincoln
Research data are an asset! Imagine the significance of the research collections of key departments/research groups, departed alumni. Don’t underestimate the research value of the stuff that underpins your research, that you make during your research.
Building Institutional Capacity: Second MRD Programme, 2011-13
– Encouraged to reuse outputs from first programme and elsewhere.
– Mix of pilot projects and embedding projects.
– Holistic institutional approach to RDM.
– Ownership: High level ownership of the problem, senior manager on steering.
– Sustainability: Large institutional contributions. Develop business cases to sustain work.
[Diagram: programme strands. Institutional RDM Infrastructure Services: 17 projects. RDM Planning: 10 projects. RDM Training: 5 projects. Data Publication: 3 projects.]
Second JISC MRD Programme, 2011-13: http://bit.ly/jiscmrd2011-13
Jisc MRD RDM Infrastructure Projects
Components of research data management support services
– RDM Policy and Roadmap
– Business Plan and Sustainability
– Research Data Registry
– Data Management Planning
– Managing Active Data
– Processes for selection and retention
– Deposit / Handover
– Data Repositories/Catalogues
– Guidance, Training and Support
[Diagram: Jisc / Jisc-mediated products mapped onto the components of an Institutional RDM Support Service.]
Components: RDM Policy and Roadmap; Business Plan and Sustainability; Research Data Registry; Data Management Planning; Managing Active Data; Selection and Retention; Deposit / Handover; Data Repositories/Catalogues; Advocacy, Guidance, Training and Support.
Jisc / Jisc-mediated products: DMPonline; DataStage; Academic Dropbox; SWORD Protocol; Easy Uploader; Archival Storage; Active Storage; Metadata; Identifiers; Guidance; Templates; Good Practice; Coordination; Case Studies; Training and Advocacy Resources.
Products map to components of RDM support services. Arrows in indicate products delivered. Red arrows out indicate data hosting or metadata transfer to an external service.
University RDM Guidance Pages: http://www.gla.ac.uk/services/datamanagement/
University RDM Guidance Pages: http://www.admin.ox.ac.uk/rdm/
University RDM Guidance Pages: http://www.bath.ac.uk/research/data
University RDM Guidance Pages: http://www.southampton.ac.uk/library/research/researchdata/
University RDM Guidance Pages: http://www2.le.ac.uk/services/research-data
Institutional Policies and Roadmaps
Institutional Research Data Management Policies: http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies/uk-institutional-data-policies
Institutional Roadmaps to meet EPSRC Expectations on Research Data: http://www.dcc.ac.uk/resources/policy-and-legal/epsrc-institutional-roadmaps
Data Management Planning
Jez Cope, University of Bath, R360 Project: http://opus.bath.ac.uk/30772/
Detailed guidance on funder requirements for DMPs from DCC: http://www.dcc.ac.uk/sites/default/files/documents/resource/policy/FundersDataPlanReqs_v4%204.pdf
DCC How to Develop a Data Management and Sharing Plan: http://www.dcc.ac.uk/resources/how-guides/develop-data-plan
DCC DMPonline tool: https://dmponline.dcc.ac.uk/
JISCMRD Training Projects Phase 1 and 2
Need for subject-focussed research data management / curation training, integrated with PG studies.
Five projects in the first programme to design and pilot (reusable) discipline-focussed training units for postgraduate courses: http://www.jisc.ac.uk/whatwedo/programmes/mrd/rdmtrain.aspx
– Health studies; creative arts; archaeology and social anthropology; psychological sciences; social sciences and geographical sciences: http://www.dcc.ac.uk/training/train-trainer/disciplinary-rdm-training/disciplinary-rdm-training
Four projects in the second programme: http://researchdata.jiscinvolve.org/wp/2012/08/23/research-data-management-training-five-new-jiscmrd-projects/
– Psychology and computer science; digital music; physics and astronomy; subject and liaison librarians.
MANTRA Training Materials, University of Edinburgh
Online course built using the open-source Xerte toolkit. Sections include:
– DMPs
– Organising Data
– File Formats and Transformation
– Documentation and Metadata
– Storage and Security
– Data Protection
– Preservation, sharing and licensing
Also software practicals for users of SPSS, R, ArcGIS, NVivo.
Research Data MANTRA: http://datalib.edina.ac.uk/mantra/
Lincoln Orbital Project: Joining up Institutional Systems: http://orbital.blogs.lincoln.ac.uk/2012/12/06/orbital-deposit-of-dataset-records-to-the-lincoln-repository-workflow/
University Data Repositories: https://ore.exeter.ac.uk/repository/handle/10871/502
University Data Repositories: http://data.bris.ac.uk/datasets/12mjtnrtsdjfs17sl4pq2ucqrk/
University Data Repositories: https://databank.ouls.ox.ac.uk/general/datasets/Tick1AudioCorpus
Metadata Schema for Institutional Data Repositories: http://www.data-archive.ac.uk/media/375386/rde_eprints_metadataprofile.pdf
Development of Institutional RDM Capacity
The Royal Society Science as an Open Enterprise report recommended that the JISC Managing Research Data Programme ‘should be expanded beyond the pilot 17 institutions within the next five years.’ [Royal Society 2012, Science as an Open Enterprise, p.73]
You and research data/research outputs…
1. Does your institution have an RDM policy and a set of guidance pages supporting it?
2. Does your institution provide support for data management during your research?
3. Does your institution have a repository for research data?
4. Do you know how to prepare a data management plan?
5. Which data do you retain at the end of a research project?
6. Would you reference data in your published research?
7. Which data would you retain at the end of a project and how would you make this available?