‘Making Open Access count: Creating standards to measure the use of Open Access resources’ - Joseph Greene (University College Dublin)
1. UCD Library
University College Dublin,
Belfield, Dublin 4, Ireland
Making Open Access count: Creating standards to measure the use of Open Access resources
CONUL 2017
Athlone, 31 May
Joseph Greene
Research Repository Librarian
University College Dublin
joseph.greene@ucd.ie
http://researchrepository.ucd.ie
4. Tipping point
More than 50% of recent papers (2011-2013) were found to be Open Access
Archambault, E. et al. (2014). Proportion of Open Access Papers Published in Peer-Reviewed Journals at the European and
World Levels: 1996–2013 (41p.). Produced for the European Commission DG Research & Innovation.
6. OA citation advantage
• At least 40 separate studies show that
Open Access increases citations1,2
• Wide variations between disciplines
• 35% increase in mathematics3
• 500% increase in citations in
physics/astronomy3
• Most recent study: 3.3 million papers2
• Average: OA = 50% more citations
• (Green OA is overall the better strategy)
1Wagner, B. (2010) ‘Open Access Citation Advantage: An Annotated Bibliography’. DOI: 10.5062/F4Q81B0W
2Archambault, E. (2016) ‘Research impact of paywalled versus open access papers’
3Swan, A. (2010) ‘The Open Access citation advantage: Studies and results to date’. https://eprints.soton.ac.uk/268516/
12. Background research
• Up to 85% of OA repository
downloads come from non-human
agents1
• Even with robot detection, there is
room for improvement2
• DSpace stats: 62% human
• EPrints stats: 55% human
• U. Minho DSpace stats: 59-73%
human
1Greene, J. (2016) 'Web robot detection in scholarly Open Access institutional repositories'. Library Hi Tech, 34 (3):500-520
2Greene, J. (2016) 'How Accurate are IR Usage Statistics?’. Open Repositories (OR2016) Dublin, 13-16 June 2016
13. Problems
• Many ways to do robot detection
• (At least 23 in the literature, not to
mention combinations)
• Nothing resembling a standard
available
• Cross-platform comparison and
aggregation impossible
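One of the many detection methods in the literature is matching user-agent strings against a list of known robot patterns. A minimal sketch of that approach, assuming an illustrative pattern list (a production list, such as COUNTER's, contains hundreds of entries):

```python
import re

# Illustrative robot user-agent substrings; NOT a complete or official list
ROBOT_PATTERNS = [r"bot", r"crawl", r"spider", r"curl", r"wget"]
robot_re = re.compile("|".join(ROBOT_PATTERNS), re.IGNORECASE)

def is_robot(user_agent: str) -> bool:
    """Return True if the user-agent matches any known robot pattern."""
    return bool(robot_re.search(user_agent))

# Example: keep only the apparently human downloads
agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
]
human = [ua for ua in agents if not is_robot(ua)]
```

Because each platform maintains its own pattern list, two repositories running different lists will report incomparable figures, which is exactly the standardisation gap described above.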
14. Addressing the problem
• COUNTER Robots Working Group
• Joseph Greene, UCD, RIAN (chair)
• Lorraine Estelle, Project COUNTER
• Paul Needham, IRUS-UK/COUNTER
• Representatives from EBSCO, Elsevier, Wiley,
ScholarlyIQ, DSpace, EPrints, DigitalCommons,
OpenAIRE, Base Bielefeld and Open Journal
Systems
“…to devise ‘adaptive filtering systems’ that will allow publishers/repositories/services to
follow a common set of rules to dynamically identify and filter out unusual usage and
robot activity”
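The "adaptive filtering" in the group's remit could, in its simplest form, mean flagging sources whose activity is unusually high. A toy sketch, assuming a fixed per-IP threshold (a real adaptive system would derive the threshold from the data itself):

```python
from collections import Counter

def flag_unusual(ip_events, max_per_ip=100):
    """Flag IPs with an unusual number of downloads as likely robots.

    `ip_events` is an iterable of IP strings, one per download.
    The fixed `max_per_ip` threshold is a simplification; an adaptive
    filter would set it dynamically, e.g. from the observed distribution.
    """
    counts = Counter(ip_events)
    return {ip for ip, n in counts.items() if n > max_per_ip}
```

For example, an IP responsible for 150 downloads in a short window would be flagged, while one with a handful would not.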
16. Usage data sources
• Bielefeld/OJS (x3): .csv, 233,000 lines
• IRUS-UK (97 IRs): .csv, 1.9 million lines
• Wiley: .txt, several million lines
• Combined into a PostgreSQL database (several million rows)
• Period: 3-9 October 2016
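Merging heterogeneous .csv/.txt exports into one queryable table might look like the sketch below. The column names are my assumptions (the slides do not give a schema), and SQLite stands in for the PostgreSQL database used in the actual study:

```python
import csv
import sqlite3

# In-memory SQLite stands in for PostgreSQL; schema is hypothetical
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE usage_events (
        source TEXT, timestamp TEXT, ip TEXT, user_agent TEXT, item_id TEXT
    )
""")

def load_csv(path: str, source: str) -> None:
    """Load one provider's export, tagging each row with its source."""
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            conn.execute(
                "INSERT INTO usage_events VALUES (?, ?, ?, ?, ?)",
                (source, row["timestamp"], row["ip"],
                 row["user_agent"], row["item_id"]),
            )
    conn.commit()
```

Tagging each row with its source keeps the datasets separable, so robot-detection rules can later be compared per platform.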
17. Robot detection
• Simple random sample taken
• 202-204 downloads for each dataset
• 95% confidence level
• Each download manually classified as
human or robot
• Self-named
• Unusual number of downloads or items
downloaded
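Sample sizes of 202-204 per dataset are consistent with the standard proportion sample-size formula at 95% confidence; the ~6.9% margin of error below is my back-fitted assumption, not a figure from the slides:

```python
import math
from typing import Optional

def sample_size(z: float = 1.96, margin: float = 0.069,
                p: float = 0.5, population: Optional[int] = None) -> int:
    """Minimum sample size for estimating a proportion.

    Standard formula n0 = z^2 * p * (1 - p) / e^2, with an optional
    finite-population correction. z = 1.96 gives 95% confidence;
    p = 0.5 is the most conservative (largest-sample) choice.
    """
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

# 95% confidence, ~6.9% margin of error -> roughly 202 downloads
print(sample_size())
```

Passing the dataset's total line count as `population` applies the finite-population correction, which is why the required sample barely varies across datasets of very different sizes.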