Talk given at the 2016 Open Repositories conference in Dublin, Ireland. This paper follows the journey of a research-intensive university towards making its outputs openly available, discusses the approaches taken and identifies problems in the global scholarly communications landscape.
Imperial College London - journey to open scholarship
1. Imperial College London – journey to open scholarship
Open Repositories 2016, Dublin, 15th June 2016
Dr Torsten Reimer, Scholarly Communications Officer
http://orcid.org/0000-0001-8357-9422 / @torstenreimer
Imperial College London
2. Imperial College London
• Faculties of Engineering,
Medicine, Natural Sciences
and the Business School
• Ranked 3rd in Europe / 8th in the
world (THE 2015-16 rankings)
• Net income (2015): £969m, incl.
£428m research grants/contracts
• ~15,000 students, ~8,000 staff, incl. ~3,900 academic & research staff
• Staff publish 10-12,000 scholarly articles per year
• 2015 Article Processing Charges (APC) commitment: £1.7m
• Largest data traffic into Janet network of all UK universities
3. Decision to go open: College policies
College support for Open Access dates back more than a decade
2012: Open Access (OA) Mandate
“Imperial College London is committed to disseminating its research and
scholarship as widely as possible. […] The College has implemented an
open access mandate for all research publications […], authors are
required to upload their final peer reviewed copy of the paper into Spiral.”
2015: Research Data Management (RDM) Policy
“[...] free and timely open access to data so that they are intelligible,
assessable and usable by others. [...] The minimum requirement is to
share all relevant data to support and underpin published findings
including e-theses. [...] Principal Investigators must deposit their
shareable research data in a publicly-available repository of their
choosing no later than the time of publication of the findings.”
www.imperial.ac.uk/scholarly-communication
4. Sample of UK funder requirements
Higher Education Funding Councils
• Research assessment brings College ~£100m/year
• All articles deposited within 3 months of acceptance
Research Councils UK
• Provide funding for Gold OA to universities
• 100% open access to scholarly articles by 2018
Engineering & Physical Sciences Research Council
• College able to track location of all data assets
• Ideally all research data made available publicly
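The "deposited within 3 months of acceptance" requirement is simple date arithmetic. The following is a minimal sketch only, assuming "3 months" means three calendar months from the acceptance date; the function names are hypothetical and do not describe any College or funder system.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 30 November + 3 months gives 28 or 29 February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def deposit_deadline(accepted: date, months: int = 3) -> date:
    """Latest deposit date under the assumed calendar-month reading."""
    return add_months(accepted, months)


def deposited_in_time(accepted: date, deposited: date, months: int = 3) -> bool:
    """True if the deposit happened on or before the deadline."""
    return deposited <= deposit_deadline(accepted, months)
```

For example, an article accepted on 15 June 2016 would need to be deposited by `deposit_deadline(date(2016, 6, 15))`.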
5. College decisions for the route to “open”
• Set up a governance structure with senior College representation
• Open Access Publishing and Research Data Management working groups,
chaired by Associate Provost / delegate of VP Research
• OA Implementation Group, chaired by Scholarly Communications Officer
• Close collaboration between Library, IT and Research Office
• Establish a new role to coordinate across College
• Scholarly Communications Officer
• Enhance support capacity in College Library
• 6 full time posts for OA, 2 for RDM, 1 for licensing (previously only part-time
posts); led by Head of Scholarly Communications Management
• Focus on improving systems and workflows for (and with) academics
• Aim to be ahead of funders (where sensible)
6. Simplify compliance: combine green & gold workflow
[Diagram: “On acceptance” workflow in Elements: Deposit (DSpace), Apply for APC (ASK OA), Link funding, Reporting]
Single open access workflow to meet College
and funder requirements – covers gold and
green OA in one action.
• User interface: Symplectic Elements
• Repository: Spiral (DSpace)
• Gold OA: ASK OA, dedicated APC (Article
Processing Charge) management system
• Minimise manual input
• 2012-2015: deposits increased 18x;
support staff ~3x
8. Move before the funders: College ORCID project
College became ORCID member in 2014:
• Raise awareness and uptake
• Issue researchers with an iD
Approach:
• Capture existing iDs (in Symplectic)
• Create new iDs on behalf of academics
• Encourage academics to link iD to
Symplectic
Outcomes:
• ~75% of iDs claimed
• Academics linked 1,800 iDs to Symplectic
• Ongoing awareness raising and work with
ORCID community (Imperial hosted 1st UK
ORCID (HE) members meeting in 2015)
https://www.imperial.ac.uk/orcid
https://dx.doi.org/10.1629/uksg.268
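When capturing existing iDs, a basic sanity check is possible because the last character of an ORCID iD is a check digit (ISO 7064 MOD 11-2). The sketch below is illustrative only; the function names are hypothetical, and it is not a description of how Symplectic or the College project validated iDs.

```python
import re

# Four hyphen-separated groups of four; the final character may be "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", re.ASCII)


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and the check digit of a hyphenated ORCID iD."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]
```

A call such as `is_valid_orcid("0000-0001-8357-9422")` checks the iD shown on the title slide. A passing check only shows that the iD is well formed, not that it belongs to a given person.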
9. Towards an automated “on acceptance” workflow
[Diagram: automated “on acceptance” workflow]
1. Author links ORCID with CRIS
2. …shares ORCID iD with publisher
3. …shares funder information with publisher
4. Publisher mints DOI on acceptance
5. …shares iD and funder details with CrossRef
6. CRIS pulls data from CrossRef, using ORCID iD
Jisc Publications Router: manuscript, linked via iD
CRIS = Current Research Information System (Symplectic Elements at Imperial)
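The step in which a CRIS pulls data from CrossRef using an ORCID iD might look roughly like the sketch below. It assumes the public Crossref REST API `works` endpoint with its `orcid` filter; the sample response is invented for illustration, and nothing here describes how Symplectic Elements is actually implemented.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works"


def crossref_query_url(orcid: str, rows: int = 20) -> str:
    """Build a works query filtered by the author's ORCID iD."""
    params = {"filter": f"orcid:{orcid}", "rows": rows}
    return f"{CROSSREF_WORKS}?{urlencode(params, safe=':')}"


def fetch_works(orcid: str) -> dict:
    """Fetch matching records (needs network access; not called here)."""
    with urlopen(crossref_query_url(orcid)) as response:
        return json.load(response)


def summarise_works(response: dict) -> list:
    """Reduce a Crossref response to DOI, title and funder names."""
    summaries = []
    for item in response.get("message", {}).get("items", []):
        titles = item.get("title") or [""]
        funders = [f.get("name", "") for f in item.get("funder", [])]
        summaries.append(
            {"doi": item.get("DOI"), "title": titles[0], "funders": funders}
        )
    return summaries


# Invented sample in the shape of a Crossref response (not real data).
SAMPLE_RESPONSE = {
    "message": {
        "items": [
            {
                "DOI": "10.1234/example",
                "title": ["An example article"],
                "funder": [{"name": "Example Funder", "award": ["EX/1"]}],
            }
        ]
    }
}
```

This only works if the publisher has passed the iD and funder details to CrossRef on acceptance, which is the point of steps 2 to 5.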
10. Process of RDM policy development
• Set up a governance structure, coordination across College
• Aim: guide academics through funder requirements and to best practice
• Policy not to be implemented until College can support compliance
• Lack of reliable data (on data storage needs, scale in particular)
• Concerns about cost of maintaining infrastructure
• Concerns about uncertainties and changing market / policy landscape
• Approach
• RDM Green Shoots: 6 bottom-up, academic projects (2nd half of 2014)
• RDM investigation (Oct 2014-Jan 2015)
• Online survey (academics; 390 responses), in-depth interviews with
academics (~40), workshops (academics & data managers)
• Investigation into flexible, cost-effective infrastructure components
Deliver a solution that’s good enough for the 80% who (usually) don’t
have specialised requirements
11. College RDM workflow
1. Make a data management plan: use
DMPOnline
2. Store your data management plan
centrally: use InfoEd
3. Store your live data securely and
safely: use Box
4. Store your final data (and/or code)
for 10+ years, making it publicly
available: use Zenodo
5. Tell the College where your data
(and/or code) is published or stored:
use Symplectic
6. Reference your funding and your
data in the publications it
underpins: tell your publisher
(5 is a similar process to OA manuscript
deposit; 6 is linked with OA deposit process)
RDM Workflow, College Library Services
10.5281/zenodo.54000
12. Towards compliance as by-product of good workflows
Working towards:
• One workflow for data generation,
publishing, reporting and curation
• Link data generation directly to storage
(log into facility, data “at your desk”
before you are out of the “lab”)
• Automate reporting and generating /
sharing of metadata
[Diagram: data workflow]
1. Facilities write (meta)data into Box
2. Data processed / analysed from Box
3. Machine-learning adds metadata
4. Publish to repository from Box, with reference
5. Metadata directly or indirectly (ORCID) to CRIS
[Diagram: metadata flow via ORCID]
1. Author links ORCID with CRIS
2. …shares ORCID iD with repository
3. …publishes dataset
4. DataCite DOI linked to ORCID iD
5. CRIS pulls metadata from ORCID / DataCite / Repository
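Pulling dataset metadata from DataCite and matching it to authors by ORCID iD could be sketched as follows. This assumes the public DataCite REST API `dois` endpoint and its JSON layout of creators with name identifiers; the sample record is invented, and the helper names are hypothetical rather than part of any CRIS.

```python
from urllib.parse import quote

DATACITE_DOIS = "https://api.datacite.org/dois"


def datacite_url(doi: str) -> str:
    """Build the metadata URL for a single DOI."""
    return f"{DATACITE_DOIS}/{quote(doi, safe='/')}"


def orcids_from_datacite(record: dict) -> list:
    """Collect the ORCID iDs attached to a record's creators."""
    attributes = record.get("data", {}).get("attributes", {})
    found = []
    for creator in attributes.get("creators", []):
        for ident in creator.get("nameIdentifiers", []):
            if ident.get("nameIdentifierScheme", "").upper() == "ORCID":
                # iDs may be stored as full URLs; keep only the iD part.
                value = ident.get("nameIdentifier", "")
                found.append(value.rsplit("/", 1)[-1])
    return found


# Invented sample in the shape of a DataCite record (not real data).
SAMPLE_RECORD = {
    "data": {
        "attributes": {
            "doi": "10.1234/example-dataset",
            "titles": [{"title": "An example dataset"}],
            "creators": [
                {
                    "name": "Researcher, Example",
                    "nameIdentifiers": [
                        {
                            "nameIdentifier": "https://orcid.org/0000-0002-1825-0097",
                            "nameIdentifierScheme": "ORCID",
                        }
                    ],
                }
            ],
        }
    }
}
```

A CRIS could then match the returned iDs against those its academics have linked, so that a published dataset is recorded without further manual input.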
13. Moving on: research software
College RDM policy requires
academics to archive the
particular version of code
developed in a project to
generate or analyse data.
College-funded PyRDM project
developed library to automate
this process. College-
recommended repository
Zenodo offers GitHub
integration.
College launched survey on distributed version control systems (DVCS)
– 274 responses, 82% use Git
Decision: College to provide
GitHub Enterprise to all staff
College survey on distributed version control
Software Sustainability Institute – I am a fellow
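Archiving "the particular version of code" comes down to recording exactly which commit produced a dataset. The sketch below is an illustration under stated assumptions: it is not the PyRDM API, the function and field names are hypothetical, and the snapshot link assumes a GitHub-style `/tree/<commit>` URL.

```python
import subprocess


def current_commit(repo_dir: str = ".") -> str:
    """Return the commit hash checked out in `repo_dir` (requires git)."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def code_version_record(dataset_id: str, repo_url: str, commit: str) -> dict:
    """Link a dataset identifier to the exact code version used."""
    commit = commit.lower()
    is_hex = all(c in "0123456789abcdef" for c in commit)
    # Full hashes are 40 (SHA-1) or 64 (SHA-256) hex characters.
    if len(commit) not in (40, 64) or not is_hex:
        raise ValueError("expected a full git commit hash")
    return {
        "dataset": dataset_id,
        "repository": repo_url,
        "commit": commit,
        "snapshot": f"{repo_url.rstrip('/')}/tree/{commit}",
    }
```

Storing such a record alongside the dataset metadata lets a reader find the code version behind the published data, whichever repository holds the archived copy.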
14. Communications, Communications, Communications
Coordinated comms plan
across whole College
Driven by Library (good cop)
and Research Office (bad
cop)
Supported by departments,
central communications,
strategic planning etc.
• E-mails to all staff
• Electronic staff briefings
• OA & RDM roadshows
• Departmental meetings
• Drop-in sessions
• OA & RDM lunches
• Engagement through
departmental liaison librarians
• Leaflets, flyers, calendars
• Website, blog, social media
• Funder (policy) news
• Alerts sent from systems
• Compliance reports to
departments and faculties
• Briefings for senior academics
• Etc.
15. Results
[Chart: Open Access outputs, 2013-2015; series: Deposits, APCs; y-axis 0-6,000]
[Chart: ORCIDs in Symplectic, 2013-2016; y-axis 0-2,000]
[Chart: “Box” users, 12-15 to 02-16; y-axis 0-5,000]
College meets funder targets
18x increase in deposits 2012-2015
04/2016: 3x deposits of total 2012
>1TB research data added to Box daily
16. Average citations for articles in journals published 2011-2015
Imperial Data: Citations sourced from Scopus®
[Chart: average citations (y-axis 0-35) by publication year, 2015 back to 2011; series:]
• Open Access: In Spiral and/or DOAJ
• Likely to be Open Access: In Europe PubMed Central only
• Possibly Open Access: In arXiv only
• Not known to be in an Open Access Source
Data provided by Josie Lewis-Gibbs, College ICT, January 2016
17. Fixing the underlying problem: academic authors sign
away rights to publishers
• This restricts academics’ reuse of their own
scholarly outputs for teaching and research.
• This means universities retain no rights to
most of the scholarly outputs of their
academics.
• This makes compliance with funder open
access mandates more difficult or more
expensive* – and in some cases impossible.
• Management of embargoes adds to the
workload of the university OA services
• This prevents or delays open access, limiting
the availability and impact of research.
* College pays ~50% more for hybrid open
access, and hybrid is >80% of articles
18. Solution: the UK Scholarly Communications Licence
• Inspired by Harvard OA Policy, adapted to UK legal and policy context
• Academics grant university a non-exclusive licence to scholarly outputs
• University will make accepted manuscripts available (CC BY-NC)
UK consultation on implementation of UK-SCL:
• Led by Imperial College London (Chris Banks and Torsten Reimer)
• Discussions involve 70+ organisations across the UK
• Core group of “first movers” looking at implementation
• International partners are expressing an interest too
19. Conclusion
• Going open pays off, not just for funder compliance
• Key is for universities to want to “own” the process
• Governance structure, coordinated activity across the university
• Engage with academic requirements
• Changing culture takes time and (communications) effort
• Make “compliance” a by-product of good workflows
• Aim to simplify and automate workflows
• Academics should only interact with each output once
• Publishers can add value by providing good metadata on acceptance
• Don’t wait for the perfect solution: good enough is a good enough start
• If there are problems, try to fix the “root causes”, not the symptoms