RDM in higher education


Presentation given at the Inside Government data management forum on 25th October 2012 - http://www.insidegovernment.co.uk/other/managing-data

  • I’ll start by giving a brief introduction to the DCC, then I’ll cover what the drivers are for research data management in the HE sector. The main part of the talk will focus on key initiatives and trends in our sector – data management planning, the development of policies and strategies by universities and the infrastructure to support RDM. Finally I’ll cover how this work is being supported by JISC, the DCC and data centres.
  • The DCC is a JISC-funded centre to support universities with Research Data Management (RDM). Our aim is to build capacity, capability and skills across the HE research community by working with libraries, IT services, research offices and researchers.
  • We build this capacity in a number of ways: by providing guidance, running training and events targeted at different audiences, developing tools such as DMP Online which I’ll explain later, and also by supporting the JISC, particularly the MRD programmes that it funds.
  • There are a number of policy requirements pushing for research data management and sharing. The OECD declaration released in 2007 is the main overarching one. This essentially states that data are a public good – if they’ve been funded by the public purse they should be openly available wherever possible. RCUK has adopted this principle and released codes of conduct for research and more recently a set of common principles on data policy. Universities are starting to develop their own data policies too, a trend I’ll touch on later. So there are various policy drivers, as well as legislation such as the DPA and FoI which have an impact, but there are also calls from the community to manage and share data.
  • There’s a real push for increased openness and sharing from certain parts of the research community. There was an issue of Science in 2011 which noted the importance of RDM and the need to re-skill and develop infrastructure. Initiatives such as the Open Knowledge Foundation and the Panton Principles have also emerged to promote open sharing – researchers have recognised that it leads to better science.
  • There are various benefits or rewards to managing data. In the HE sector you’ll often hear people talking about the ‘data behind the graph’. Researchers publish results but you need to be able to access the underlying data to validate these findings. And various studies have shown that researchers get more citations when they share data, receiving the recognition and rewards that they need. If data are managed well it’s also much easier for researchers to do their work so there’s a self-interest argument too.
  • The need for DMPs is being driven by research funders. Most RCUK funders and some key health funders expect researchers to submit a short Data Management and Sharing plan in their grant application.
  • Typically researchers are asked to cover what data will be created and how to ensure it’s reusable. They should note any anticipated difficulties, for example if ethical concerns mean that some of the data can’t be shared or if they need an embargo period to file for a patent, so that these limits to sharing are justified. The longer term plans for sharing and preservation also need to be outlined. Essentially the DMP is to reassure the funder that researchers will comply with their policies and share data wherever possible.
  • The DCC has provided a variety of support in this area: we’ve developed a checklist of possible questions you could be asked in a DMP; our how-to guide offers practical guidance to walk researchers through the process of writing a plan, explaining what funders are looking for in each question; and DMP Online is a tool to help researchers write plans.
  • At the outset in DMP Online researchers pick templates to suit their needs e.g. a funder template if applying for funding, or an institutional template if their uni has requirements. They then answer the questions based on the guidance and any suggested answers provided. We’re increasingly developing institutional customisations to build in details of local support.
  • The development of policies and strategies is also being driven by funder requirements, particularly the EPSRC data policy. Rather than asking researchers to write DMPs, the EPSRC has placed the onus on institutions to make sure that the appropriate infrastructure is in place to support RDM. They’ve asked unis to develop a roadmap by May this year and then to ensure the support and infrastructure is developed in the next 3 years. This has caused a real stir, particularly among university senior management. Lots are now very engaged and investing in this area for fear of losing funding.
  • The DCC has been collating resources to support unis to meet this requirement. We developed a series of blog posts earlier in the year, providing guidance and profiling emerging approaches. We’ve also linked to roadmaps that have been made public and more recently conducted a survey to identify current practice and what support needs there will be over the next 3 years.
  • Allied to the development of strategies is RDM policy. Edinburgh released the first policy in May last year and this has been much copied. Lots of unis have released policies since and c.15 more are currently being ratified so the trend will continue. Many of these policies reference the university’s commitment to supporting RDM by developing infrastructure and support, so that’s the final trend I’ll touch on.
  • Lots of unis have done surveys of existing RDM practice and one of the most alarming results across the board was how researchers are storing data. Often they’re not using the centrally provided networked storage, either because it lacks capacity or because it isn’t convenient for accessing and sharing their data. Practice is very ad hoc with researchers storing data on USBs, external HDDs or their laptops. Many unis are now investing heavily in research data storage. At Bristol they’re using their HPC facility, a petascale data store called Blue Peta. They’re offering the first 5TB for free per nominated ‘data steward’ - researchers have to sign up as data stewards to make sure that someone is responsible for the data, can say who can have access and how long it needs to be kept. There’s a clear, upfront charging model too so any additional capacity needed can be costed into grant applications. Having the storage is only part of the issue though. You need the right applications layered on top of this and that’s where the problem often lies.
  • One prevalent issue is the need for researchers to have easy access to their data, so you’ll hear talk of an academic dropbox. There’s a blog post from the University of Bath which outlines this issue very clearly. Researchers are mobile. They’re often not working in their office or on campus, they’re in the field, at conferences, at other unis with collaborators... and they may not have a good network connection so need a local copy of their data for fast, anytime access. Many have turned to Dropbox because it’s so easy to use. It provides a local copy that will sync back and they can share data with collaborators easily - no need to contact sysadmins or encounter difficulties providing access to anyone outside the uni. There are security and legal issues for universities if researchers use Dropbox though – where is the data being held, under what licence, is it secure?
  • Different solutions are being piloted. At Oxford they’ve developed their own software. DataFlow has two components – DataStage and DataBank. DataStage provides the Dropbox-like facility. At Lincoln they’ve been piloting ownCloud, an open source tool that provides the same features as Dropbox.
  • The holy grail that everyone is aiming for is to bring these different services together. You can see here how they’re proposing to do this at Oxford. Very few institutions are anywhere near this stage though, just Oxford, Edinburgh, Southampton... and in no case is it embedded and fully functioning – these are all pilots. Most unis are still at the stage of identifying their problems and developing components of an overall service.
  • I’ve spoken about lots of work going on in the JISC MRD programmes. This is really the main source of seed funding in this area. Simon Hodson is the programme manager if you want to find out more, and there’ll be a closing conference next March where they’ll showcase the outputs.
  • The DCC also has a few strands of activity to support this work. We’ve run a series of roadshows to profile good RDM practice and get key stakeholders from an institution together to start to plan their strategy.
  • The main strand of activity for DCC is our institutional engagements. We’re working intensively with 20 unis, providing 3 months of consultancy effort to each to support them in any areas they like – it may be developing strategies, piloting services, delivering training programmes...
  • All of the lessons from MRD and DCC work are currently being pulled together in a how-to guide on developing RDM services.
  • And finally I just want to note that it’s a mixed model of support in HE. I’ve spoken about what unis and the DCC are doing but there are various other national services and particularly data centres, which play a key role. The UK Data Archive provides excellent guidance and support for social scientists, NERC funds a number of disciplinary data centres in the environmental sciences and there are many more.
  • Thank you! If you want to find out more about DCC we have lots of resources online and you can follow us on Twitter.

    1. Research Data Management in Higher Education. Sarah Jones, Digital Curation Centre. sarah.jones@glasgow.ac.uk Twitter: sjDCC
    2. Outline • Introduction to the DCC • Higher Education drivers for RDM • Three key initiatives and trends: 1. Data Management Planning 2. Institutional policy and strategy 3. Research data storage and tools • How is RDM being supported?
    3. What is the DCC? A JISC-funded centre to support universities with Research Data Management (RDM). “Helping to build capacity, capability and skills in data management and curation across the UK’s higher education research community.” - DCC Phase 3 Business Plan. Funded by: [logos] Find us at: www.dcc.ac.uk
    4. What do we do? • Offer guidance – helpdesk, briefing papers, how-to guides • Run training & events – DC101, roadshow, RDMF, IDCC • Develop tools – CARDIO, DAF, DRAMBORA, DMP Online • Support the JISC – esp. via the Managing Research Data programmes
    5. Outline • Introduction to the DCC • Higher Education drivers for RDM • Three key initiatives and trends: 1. Data Management Planning 2. Institutional policy and strategy 3. Research data storage and tools • How is RDM being supported?
    6. Why manage data: requirements • OECD declaration: data are a public good and should be openly available • Common principles on data policy: www.rcuk.ac.uk/research/Pages/DataPolicy.aspx • Code of good research conduct: data should be preserved and accessible for 10 years + • Institutional data policies: http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies
    7. Increased openness and sharing. “For science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made open” Surfing the Tsunami, Science, 11 February 2011
    8. Why manage data: rewards • Recognition • Prevent data loss • More citations: 69% ↑ (Piwowar, 2007 in PLoS) • Validation of results: ‘data behind the graph’ • New research opportunities and collaborations • Easier to do your research…
    9. Outline • Introduction to the DCC • Higher Education drivers for RDM • Three key initiatives and trends: 1. Data Management Planning 2. Institutional policy and strategy 3. Research data storage and tools • How is RDM being supported?
    10. 1. Data Management Planning: Most RCUK and key health funders expect researchers to submit a short (c.2 page) Data Management and Sharing plan in their grant application. www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
    11. Expected coverage of DMPs. Funders typically want a short statement covering: • What data will be created (format, types, volume) • Standards and methodologies to be used (incl. metadata) • How ethics and Intellectual Property will be addressed • Plans for data sharing and access • Strategy for long-term preservation
    12. Help from the DCC: http://www.dcc.ac.uk/resources/data-management-plans
    13. How does DMP Online work? Create a plan based on relevant funder / institutional templates... ...and then answer the questions using the guidance provided
    14. Outline • Introduction to the DCC • Higher Education drivers for RDM • Three key initiatives and trends: 1. Data Management Planning 2. Institutional policy and strategy 3. Research data storage and tools • How is RDM being supported?
    15. 2. Institutional policy and strategy: EPSRC expects all those institutions it funds: • to develop a roadmap that aligns their policies and processes with EPSRC’s expectations by 1st May 2012; • to be fully compliant with these expectations by 1st May 2015. • Compliance will be monitored and non-compliance investigated. • Failure to share data could result in the imposition of sanctions. www.epsrc.ac.uk/about/standards/researchdata/Pages/default.aspx
    16. Developing strategies and roadmaps • A series of blog posts: www.dcc.ac.uk/news • Links to roadmaps: http://tiny.cc/EPSRCroadmaps • A survey on practice: http://tiny.cc/RoadmapSurvey
    17. Institutional RDM policies: www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies
    18. Outline • Introduction to the DCC • Higher Education drivers for RDM • Three key initiatives and trends: 1. Data Management Planning 2. Institutional policy and strategy 3. Research data storage and tools • How is RDM being supported?
    19. 3. Research data storage and tools: Blue Peta at Bristol • £2m funding to date • Petascale facility – expandable • 3 machine rooms – resilience (tape archive 2012) • Available to all researchers for research data • 1st 5TB free per Data Steward, then £400 per TB p.a. for disk storage; tape backup £40 per TB http://data.bris.ac.uk
    20. The academic dropbox. What’s the problem? Researchers need access to their data wherever they are, but don’t always have a reliable network connection. Many researchers turn to Dropbox as it is easy to use and requires no user interaction beyond the initial setup. http://blogs.bath.ac.uk/research360/2012/05/mrd-hack-days-file-backup-sync-and-versioning-or-the-academic-dropbox
    21. Solutions being piloted • DataStage is a secure personal local file management environment for use at the research group level – private, shared and collaborative directories – web access – automatic backup – and more... www.dataflow.ox.ac.uk • ownCloud: being deployed at the University of Lincoln on the Orbital project – open source tool – provides same features as Dropbox and more https://orbital.blogs.lincoln.ac.uk/2012/08/06/owncloud-an-academic-dropbox
    22. Bringing it all together into a service. Diagram courtesy of Sally Rumsey, University of Oxford
    23. Outline • Introduction to the DCC • Higher Education drivers for RDM • 3 key initiatives and trends – Data Management Planning – Institutional policies and strategies for RDM – Research data storage and tools • How is RDM being supported?
    24. JISC MRD programmes • MRD 01: October 2009 – July 2011 – £4.3 million investment – www.jisc.ac.uk/whatwedo/programmes/mrd.aspx • MRD 02: October 2011 – July 2013 – £4.6 million investment – www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata.aspx • Programme Manager: Simon Hodson s.hodson@jisc.ac.uk • Close of programme conference in March 2013
    25. DCC data management roadshows: to allow every institution in the UK to prepare for effective research data management, and understand more about how the DCC can help. “The roadshow as a whole will feed into the implementation plan we are developing after passing our RDM policy” “I was looking for a foundation in the issues for a librarian. Spot on!” www.dcc.ac.uk/events/data-management-roadshows
    26. DCC Institutional Engagements. With funding from HEFCE we’re: • Working intensively with 20 HEIs to increase RDM capability – 60 days of effort per HEI drawn from a mix of DCC staff – Deploy DCC & external tools, approaches & best practice • Support varies based on what each institution wants/needs • Lessons & examples will be shared with the community www.dcc.ac.uk/community/institutional-engagements
    27. How to develop RDM services (in development!) • Why develop services? • Roles and responsibilities • Process of service development • The components / building blocks: Policy, Data Management Planning, Storage, Data registry... • Getting started • Examples and case studies from MRD programme, DCC work & overseas
    28. Mixed model of support in HE • National services e.g. DCC, JANET • Data Centres: www.data-archive.ac.uk, www.nerc.ac.uk/research/sites/data • HEIs • List of repositories: http://datacite.org/repolist
    29. Thanks - any questions? For DCC guidance, tools and case studies see: www.dcc.ac.uk/resources Follow us on twitter @digitalcuration and #ukdcc
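
The Bristol slide on research data storage quotes a charging model (first 5TB free per data steward, then £400 per TB per year for disk, tape backup £40 per TB), and the notes mention that extra capacity can be costed into grant applications. As a rough illustration of that arithmetic, here is a minimal Python sketch. The function names are hypothetical, and it assumes things the talk does not state: that the tape charge is annual, that it applies only to the chargeable volume, and that part-TBs are billed pro rata. Bristol's actual billing rules may differ.

```python
# Sketch of the Blue Peta charging model quoted in the talk.
# ASSUMPTIONS (not from the talk): tape backup is charged per TB per year,
# only on storage above the free allowance, and part-TBs are billed pro rata.

FREE_TB_PER_STEWARD = 5       # "1st 5TB free per Data Steward"
DISK_GBP_PER_TB_YEAR = 400    # "£400 per TB p.a. for disk storage"
TAPE_GBP_PER_TB_YEAR = 40     # "tape backup £40 per TB" (annual is an assumption)


def annual_storage_cost(total_tb: float, with_tape_backup: bool = True) -> float:
    """Estimate the yearly charge in GBP for one data steward's allocation."""
    if total_tb < 0:
        raise ValueError("total_tb must be non-negative")
    # Only storage above the free allowance is charged.
    chargeable_tb = max(0.0, total_tb - FREE_TB_PER_STEWARD)
    rate = DISK_GBP_PER_TB_YEAR + (TAPE_GBP_PER_TB_YEAR if with_tape_backup else 0)
    return chargeable_tb * rate


def grant_storage_cost(total_tb: float, years: int, with_tape_backup: bool = True) -> float:
    """Estimate the storage line to cost into a grant lasting `years` years."""
    return annual_storage_cost(total_tb, with_tape_backup) * years
```

Under these assumptions, a project needing 8TB for three years would be charged for 3TB a year, so `grant_storage_cost(8, 3)` gives the figure to put in the application, while anything at or under 5TB costs nothing.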