Imperial College London - journey to open scholarship


Talk given at the 2016 Open Repositories conference in Dublin, Ireland. This paper follows the journey of a research-intensive university towards making its outputs openly available, discusses the approaches it has taken and identifies problems in the global scholarly communications landscape.

Published in: Education

  1. 1. Imperial College London – journey to open scholarship Open Repositories 2016, Dublin, 15th June 2016 Dr Torsten Reimer, Scholarly Communications Officer / @torstenreimer Imperial College London
  2. 2. Imperial College London • Faculties of Engineering, Medicine, Natural Sciences and the Business School • Ranked 3rd in Europe / 8th in the world (THE 2015-16 rankings) • Net income (2015): £969m, incl. £428m research grants/contracts • ~15,000 students, ~8,000 staff, incl. ~3,900 academic & research staff • Staff publish 10-12,000 scholarly articles per year • 2015 Article Processing Charges (APC) commitment: £1.7m • Largest data traffic into Janet network of all UK universities
  3. 3. Decision to go open: College policies • College support for Open Access dates back more than a decade • 2012: Open Access (OA) Mandate: “Imperial College London is committed to disseminating its research and scholarship as widely as possible. […] The College has implemented an open access mandate for all research publications […], authors are required to upload their final peer reviewed copy of the paper into Spiral.” • 2015: Research Data Management (RDM) Policy: “[...] free and timely open access to data so that they are intelligible, assessable and usable by others. [...] The minimum requirement is to share all relevant data to support and underpin published findings including e-theses. [...] Principal Investigators must deposit their shareable research data in a publicly-available repository of their choosing no later than the time of publication of the findings.”
  4. 4. Sample of UK funder requirements • Higher Education Funding Councils: research assessment brings College ~£100m/year; all articles deposited within 3 months of acceptance • Research Councils UK: provide funding for Gold OA to universities; 100% open access to scholarly articles by 2018 • Engineering & Physical Sciences Research Council: College able to track location of all data assets; ideally all research data made available publicly
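The "within 3 months of acceptance" deposit rule lends itself to a mechanical check. The following is a minimal sketch, not part of any College system: it assumes the window is counted in calendar months and that a deposit on the deadline day still counts. The function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 30 November plus 3 months gives the last day of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def deposit_deadline(accepted: date, window_months: int = 3) -> date:
    """Latest compliant deposit date (assumption: the deadline day is inclusive)."""
    return add_months(accepted, window_months)


def is_deposit_compliant(accepted: date, deposited: date, window_months: int = 3) -> bool:
    """True if the manuscript was deposited on or before the deadline."""
    return deposited <= deposit_deadline(accepted, window_months)


if __name__ == "__main__":
    accepted = date(2016, 6, 15)
    print(deposit_deadline(accepted))
    print(is_deposit_compliant(accepted, date(2016, 9, 16)))
```

A real compliance report would also need the funder's exact definition of the acceptance date and any exceptions, which this sketch ignores.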
  5. 5. College decisions for the route to “open” • Set up a governance structure with senior College representation • Open Access Publishing and Research Data Management working groups, chaired by Associate Provost / delegate of VP Research • OA Implementation Group, chaired by Scholarly Communications Officer • Close collaboration between Library, IT and Research Office • Establish a new role to coordinate across College • Scholarly Communications Officer • Enhance support capacity in College Library • 6 full time posts for OA, 2 for RDM, 1 for licensing (previously only part-time posts); led by Head of Scholarly Communications Management • Focus on improving systems and workflows for (and with) academics • Aim to be ahead of funders (where sensible)
  6. 6. Simplify compliance: combine green & gold workflow [Diagram: on-acceptance workflow in Elements: deposit (DSpace), apply for APC (ASK OA), link funding, reporting] Single open access workflow to meet College and funder requirements – covers gold and green OA in one action. • User interface: Symplectic Elements • Repository: Spiral (DSpace) • Gold OA: ASK OA, dedicated APC (Article Processing Charge) management system • Minimise manual input • 2012-2015: deposits increased 18x; support staff ~3x
  7. 7. ASK OA (cloud-based APC management system)
  8. 8. Move before the funders: College ORCID project College became ORCID member in 2014: • Raise awareness and uptake • Issue researchers with an iD Approach: • Capture existing iDs (in Symplectic) • Create new iDs on behalf of academics • Encourage academics to link iD to Symplectic Outcomes: • ~75% of iDs claimed • Academics linked 1,800 iDs to Symplectic • Ongoing awareness raising and work with ORCID community (Imperial hosted 1st UK ORCID (HE) members meeting in 2015)
  9. 9. Towards an automated “on acceptance” workflow • Author: links ORCID with CRIS; shares ORCID iD with publisher; shares funder information with publisher • Publisher: mints DOI on acceptance; shares iD and funder details with CrossRef • CRIS: pulls data from CrossRef, using ORCID iD • Jisc Publications Router: manuscript (link via iD) • CRIS = Current Research Information System (Symplectic Elements at Imperial)
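The step in which the CRIS pulls data from CrossRef using the ORCID iD can be sketched in a few lines. This is an illustration under stated assumptions, not how Symplectic Elements works internally: it validates an iD with the ISO 7064 MOD 11-2 check digit that ORCID iDs carry, then builds a query URL for the public Crossref REST API using its `orcid` filter. The function names are hypothetical, and the iD shown is the fictitious example researcher used in ORCID documentation. No network request is made.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length and check digit of an iD written as 0000-0000-0000-0000."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdecimal():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


def crossref_query_url(orcid: str, rows: int = 20) -> str:
    """Build a Crossref works query restricted to records carrying this iD."""
    if not is_valid_orcid(orcid):
        raise ValueError(f"not a valid ORCID iD: {orcid!r}")
    query = urlencode({"filter": f"orcid:{orcid}", "rows": rows})
    return f"{CROSSREF_WORKS}?{query}"


if __name__ == "__main__":
    print(crossref_query_url("0000-0002-1825-0097"))
```

The query only finds articles for which the publisher actually passed the author's iD to CrossRef, which is why the workflow depends on authors sharing their iD with the publisher.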
  10. 10. Process of RDM policy development • Set up a governance structure, coordination across College • Aim: guide academics through funder requirements and to best practice • Policy not to be implemented until College can support compliance • Lack of reliable data (on data storage needs, scale in particular) • Concerns about cost of maintaining infrastructure • Concerns about uncertainties and changing market / policy landscape • Approach • RDM Green Shoots: 6 bottom-up, academic projects (2nd half of 2014) • RDM investigation (Oct 2014-Jan 2015) • Online survey (academics; 390 responses), in-depth interviews with academics (~40), workshops (academics & data managers) • Investigation into flexible, cost-effective infrastructure components • Deliver a solution that’s good enough for the 80% who (usually) don’t have specialised requirements
  11. 11. College RDM workflow 1. Make a data management plan: use DMPOnline 2. Store your data management plan centrally: use InfoEd 3. Store your live data securely and safely: use Box 4. Store your final data (and/or code) for 10+ years, making it publicly available: use Zenodo 5. Tell the College where your data (and/or code) is published or stored: use Symplectic 6. Reference your funding and your data in the publications it underpins: tell your publisher (5 is a similar process to OA manuscript deposit; 6 is linked with OA deposit process) RDM Workflow, College Library Services 10.5281/zenodo.54000
  12. 12. Towards compliance as by-product of good workflows Working towards: • One workflow for data generation, publishing, reporting and curation • Link data generation directly to storage (log into facility, data “at your desk” before you are out of the “lab”) • Automate reporting and generating / sharing of metadata [Diagram, data flow:] • Facilities write (meta)data into Box • Data processed / analysed from Box • Machine-learning adds metadata • Publish to repository from Box, with reference • Metadata directly or indirectly (ORCID) to CRIS [Diagram, identifier flow:] • Author links ORCID with CRIS; shares ORCID iD with repository; publishes dataset • DataCite DOI linked to ORCID iD • CRIS pulls metadata from ORCID / DataCite / Repository
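To illustrate how a CRIS might link a published dataset to its authors via ORCID iDs, the sketch below reads creator identifiers from a record shaped like the JSON returned by DataCite's REST API. The record here is hand-written and simplified, the DOI and staff identifier are made up, and the field names should be checked against the current DataCite schema. `orcids_from_datacite` and `match_to_cris` are hypothetical helpers, not part of any CRIS product.

```python
from __future__ import annotations

# Hand-written, simplified record in the style of a DataCite API response.
SAMPLE_RECORD = {
    "data": {
        "id": "10.1234/example-dataset",
        "attributes": {
            "doi": "10.1234/example-dataset",
            "titles": [{"title": "Example dataset"}],
            "creators": [
                {
                    "name": "Carberry, Josiah",
                    "nameIdentifiers": [
                        {
                            "nameIdentifier": "https://orcid.org/0000-0002-1825-0097",
                            "nameIdentifierScheme": "ORCID",
                        }
                    ],
                },
                {"name": "Other, Anne", "nameIdentifiers": []},
            ],
        },
    }
}


def orcids_from_datacite(record: dict) -> list[str]:
    """Collect the ORCID iDs attached to a record's creators, without duplicates."""
    attributes = record.get("data", {}).get("attributes", {})
    found: list[str] = []
    for creator in attributes.get("creators", []):
        for ident in creator.get("nameIdentifiers", []):
            if ident.get("nameIdentifierScheme", "").upper() != "ORCID":
                continue
            # Accept both bare iDs and https://orcid.org/... URLs.
            orcid = ident.get("nameIdentifier", "").rstrip("/").rsplit("/", 1)[-1]
            if orcid and orcid not in found:
                found.append(orcid)
    return found


def match_to_cris(record: dict, cris_people: dict[str, str]) -> dict:
    """Link a dataset to local staff records via a mapping of ORCID iD to staff id."""
    attributes = record.get("data", {}).get("attributes", {})
    titles = attributes.get("titles", [])
    return {
        "doi": attributes.get("doi"),
        "title": titles[0].get("title") if titles else None,
        "staff_ids": [cris_people[o] for o in orcids_from_datacite(record) if o in cris_people],
    }


if __name__ == "__main__":
    print(match_to_cris(SAMPLE_RECORD, {"0000-0002-1825-0097": "staff-001"}))
```

Creators without an iD, like the second one in the sample, cannot be matched this way, which is one reason for encouraging academics to link their iD.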
  13. 13. Moving on: research software • College RDM policy requires academics to archive the particular version of code developed in a project to generate or analyse data. • College-funded PyRDM project developed library to automate this process. • College-recommended repository Zenodo offers GitHub integration. • College launched survey on distributed version control systems (DVCS): 274 responses, 82% use Git • Decision: College to provide GitHub Enterprise to all staff • College survey on distributed version control • Software Sustainability Institute – I am a fellow
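The policy requirement to archive the particular version of code behind a dataset comes down to recording exactly which commit produced an output. The sketch below is a generic illustration and does not use or reproduce PyRDM's API. It assumes the code lives in a Git repository with the default 40-character SHA-1 commit ids, and the function names are hypothetical.

```python
import subprocess
from datetime import datetime, timezone


def current_commit(repo_dir: str = ".") -> str:
    """Ask Git for the commit id of the checked-out code (requires git on PATH)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def is_full_sha1(commit: str) -> bool:
    """True for a full 40-character hexadecimal commit id."""
    return len(commit) == 40 and all(c in "0123456789abcdef" for c in commit.lower())


def provenance_record(commit: str, output_file: str) -> dict:
    """Describe which code version produced an output file."""
    if not is_full_sha1(commit):
        raise ValueError(f"expected a full 40-character commit id, got {commit!r}")
    return {
        "code_commit": commit.lower(),
        "output_file": output_file,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
```

Typical use would be `provenance_record(current_commit(), "results.csv")`, with the record stored next to the data so that the archived code version can be cited when the dataset is published.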
  14. 14. Communications, Communications, Communications Coordinated comms plan across whole College Driven by Library (good cop) and Research Office (bad cop) Supported by departments, central communications, strategic planning etc. • E-mails to all staff • Electronic staff briefings • OA & RDM roadshows • Departmental meetings • Drop-in sessions • OA & RDM lunches • Engagement through departmental liaison librarians • Leaflets, flyers, calendars • Website, blog, social media • Funder (policy) news • Alerts sent from systems • Compliance reports to departments and faculties • Briefings for senior academics • Etc.
  15. 15. Results [Charts: Open Access outputs (deposits and APCs), 2013-2015; ORCIDs in Symplectic, 2013-2016; “Box” users, 12-15 to 02-16] • College meets funder targets • 18x increase in deposits 2012-2015 • 04/2016: 3x deposits of total 2012 • >1TB research data added to Box daily
  16. 16. Average citations for articles in journals published 2011-2015 Imperial Data: Citations sourced from Scopus® [Chart: average citations by publication year, 2011-2015, in four categories] • Open Access: In Spiral and/or DOAJ • Likely to be Open Access: In Europe PubMed Central only • Possibly Open Access: In arXiv only • Not known to be in an Open Access Source. Data provided by Josie Lewis-Gibbs, College ICT, January 2016
  17. 17. Fixing the underlying problem: academic authors sign away rights to publishers • This restricts academics’ reuse of their own scholarly outputs for teaching and research. • This means universities retain no rights to most of the scholarly outputs of their academics. • This makes compliance with funder open access mandates more difficult or more expensive* – and in some cases impossible. • Management of embargoes adds to the workload of university OA services. • This prevents or delays open access, limiting the availability and impact of research. * College pays ~50% more for hybrid open access, and hybrid is >80% of articles
  18. 18. Solution: the UK Scholarly Communications Licence • Inspired by Harvard OA Policy, adapted to UK legal and policy context • Academics grant university a non-exclusive licence to scholarly outputs • University will make accepted manuscripts available (CC BY NC) UK consultation on implementation of UK-SCL: • Led by Imperial College London (Chris Banks and Torsten Reimer) • Discussions involve 70+ organisations across the UK • Core group of “first movers” looking at implementation • International partners are expressing an interest too
  19. 19. Conclusion • Going open pays off, not just for funder compliance • Key is for universities to want to “own” the process • Governance structure, coordinated activity across the university • Engage with academic requirements • Changing culture takes time and (communications) effort • Make “compliance” a by-product of good workflows • Aim to simplify and automate workflows • Academics should only interact with each output once • Publishers can add value by providing good metadata on acceptance • Don’t wait for the perfect solution: good enough is a good enough start • If there are problems, try to fix the “root causes”, not the symptoms