Open Metrics for Open Repositories
 20x20 Pecha Kucha delivered at OR2012 on Tuesday 10th July based on the
     unpublished paper available from http://opus.bath.ac.uk/30226/


Brian Kelly (1), Nick Sheppard (2), Jenny Delasalle (3), Mark Dewey (1), Owen Stephens (4), Gareth J Johnson (5) and Stephanie Taylor (1)
(1) UKOLN, University of Bath, Bath, UK {B.Kelly, M.Dewey, S.Taylor}@ukoln.ac.uk
(2) UKCoRR/Leeds Metropolitan University, Leeds, UK {N.E.Sheppard@leedsmet.ac.uk}
(3) University of Warwick, Warwick, UK {J.Delasalle@warwick.ac.uk}
(4) Consultant, UK {owen@ostephens.com}
(5) UKCoRR/University of Leicester, Leicester, UK {gjj6@le.ac.uk}
OA still in its infancy?
• Need for metrics
• Understand how IRs are
  being used
• Policy decisions
• Technical infrastructure
• Business data
• Not just research
• Open Data / OER
• Open landscape
•   New, alternative metrics
•   Use in evaluation and assessment
•   Exploit opportunities
•   Usage / bibliographic data
•   Make openly available
•   Adoption of altmetrics
•   The BOAI at 10 – Alma Swan (video)
The Finch report
•   Executive Summary
•   Gold vs Green
•   Estimated additional £50-60M
•   Green has failed!
•   What does failure look like?
•   What would success look like?
•   Evidence
Practice what you preach
• Open it up
• Article level (ideal)
• Software / fragmentation
• Wrangle your software
• Need for accurate
  aggregation
• Technical challenges




Image credit: Julian Kleyn (2008), http://www.flickr.com/photos/juliankleyn/2604103124/
The Institutional Picture
• Total number of records
• Full-text / metadata only
• Raw figure / Proportion of
  total
• Metadata records accessed
• Full-text downloaded
• How records are accessed
   – Search/browse
   – Search engine referral
   – Referral from aggregation
   – Social media referral
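
The measures on this slide could be derived from a list of records and a list of referrer URLs. The following is a minimal sketch only: the `has_full_text` field, the host lists and the function names are all hypothetical illustrations, not taken from the paper or from any repository platform.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical, deliberately incomplete host lists, for illustration only.
SEARCH_ENGINES = {"www.google.com", "scholar.google.com", "www.bing.com"}
AGGREGATORS = {"core.ac.uk"}
SOCIAL_MEDIA = {"twitter.com", "t.co", "www.facebook.com"}


def full_text_proportion(records):
    """Return the share of records that carry full text (0.0 if there are none)."""
    if not records:
        return 0.0
    with_full_text = sum(1 for record in records if record["has_full_text"])
    return with_full_text / len(records)


def access_route(referrer, repository_host):
    """Classify one referrer URL into the access routes named on the slide."""
    host = urlparse(referrer).netloc.lower()
    if host == repository_host:
        return "search/browse"
    if host in SEARCH_ENGINES:
        return "search engine referral"
    if host in AGGREGATORS:
        return "referral from aggregation"
    if host in SOCIAL_MEDIA:
        return "social media referral"
    return "other"


def route_counts(referrers, repository_host):
    """Count accesses per route for a list of referrer URLs."""
    return Counter(access_route(referrer, repository_host) for referrer in referrers)
```

Reporting the proportion alongside the raw figure keeps a large repository with mostly metadata-only records distinguishable from a small one that is mostly full text.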
EPrints
•   Article level metrics
•   Open interface
•   www.eprints.ac.uk/cgi/irstats.cgi
•   Use as advocacy tool
DSpace (GA plug-in)
• More reliable than existing DSpace plug-ins
• Lost stats / inflated page views from robots
• Stats available at three levels:
   – item
   – collection
   – repository
http://ukcorr.org/2012/03/22/using-google-analytics-statistics-within-dspace-2/
https://github.com/seesmith/Dspace-googleanalytics
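
To illustrate why robot traffic matters and what reporting at three levels involves, here is a minimal sketch. It is not the plug-in's code and does not use Google Analytics. The hit structure (`item`, `collection`, `user_agent`) and the short list of robot markers are assumptions made for the example.

```python
from collections import Counter

# Illustrative substrings only; real robot lists are far longer.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent):
    """Return True if the user agent string contains a known robot marker."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in ROBOT_MARKERS)


def page_views_by_level(hits):
    """Aggregate page views at item, collection and repository level.

    hits: iterable of dicts with 'item', 'collection' and 'user_agent' keys.
    Hits that look like robots are skipped so they do not inflate the counts.
    """
    per_item = Counter()
    per_collection = Counter()
    repository_total = 0
    for hit in hits:
        if is_robot(hit["user_agent"]):
            continue
        per_item[hit["item"]] += 1
        per_collection[hit["collection"]] += 1
        repository_total += 1
    return per_item, per_collection, repository_total
```

Substring matching of this kind will both miss some robots and misclassify some real visitors, which is one reason figures from different tools rarely agree.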
The Big Picture
•   OPuS - Bath (EPrints)
•   Leeds Metropolitan
•   Proportion as measure?
•   RepUK / aggregations
•   Software issues
     – EPrints
     – DSpace
     – Also ran
Number of Full Text
• Few IRs can easily provide data
• RepUK (national aggregation service)
• CORE
• Discrepancies
• Proportion as measure?
• Full text only exemplars (19): Brunel, City University London, Cranfield, Loughborough,
  QMUL, RHUL, School of Advanced Study, Aberdeen, Birmingham, Cambridge, East London,
  Edinburgh, Exeter, Hull, St Andrews, Stirling, Surrey, Warwick, Nottingham
  (Should be HTML?)
• Developing Research Management Infrastructure
• CRIS
RepUK
Deposits
Oai_dc node occurrences
Oai_dc node occurrences over time
Oai_dc percentage node occurrences over time
Metadata quality – validity of XML and DRIVER compliance
dc:subject classification system occurrence
dc:language occurrence across UK repositories
Matches against IANA MIME types
•   Technical challenges
•   OpenDOAR
•   OAI-PMH
•   Data visualisation
•   Trends revealed
(Edinburgh Research Archive)


154 IRs aggregated (28/06/12)
•   397790 PDF
•   20578 MS Word
•   44018 jpeg
•   122764 html
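
The four counts above can be turned into shares of the listed total. This is a minimal sketch that assumes these four formats are the only ones of interest; the slide does not say whether other formats were found, so the shares are relative to the listed formats only.

```python
# Counts as reported on the slide (154 IRs aggregated, 28/06/12).
FORMAT_COUNTS = {"PDF": 397790, "MS Word": 20578, "jpeg": 44018, "html": 122764}


def format_shares(counts):
    """Return each format's share of the total across the listed formats."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}
```

Calling `format_shares(FORMAT_COUNTS)` gives one fraction per format, which can be printed as a percentage with `f"{share:.1%}"`.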
The Researchers’ Requirements
•   May have effect on deposit choice
•   PLoS/arXiv display article-level metrics
•   Numbers aggregated (locations / versions)
•   PLoS show visitor numbers from PMC
•   May not deposit in IR to maximise numbers
    at preferred location
•   Few publishers display article-level metrics
•   Opportunity for IRs to engage with authors
•   Measuring impact (REF, RCUK)
•   altmetrics
Third Party Services

•   Mechanisms to harvest metadata/full-text
•   OAI-PMH
•   The search party didn’t turn up?
•   Slow growth
•   Google (Scholar)
•   CiteSeerX, CORE cache copies
•   Affects repository metrics
•   PIRUS2 / article level metrics
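
As an illustration of the OAI-PMH harvesting mechanism mentioned above, the sketch below walks a `ListRecords` result set in `oai_dc` and collects record identifiers. The function names and the injectable `fetch_page` parameter are assumptions made for the example. The base URL must be supplied by the caller, and error handling (OAI-PMH error responses, retries) is omitted.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_list_records(xml_data):
    """Return (record identifiers, resumption token or None) from one response."""
    root = ET.fromstring(xml_data)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty resumptionToken element means the list is complete.
    has_token = token_element is not None and token_element.text
    return identifiers, token_element.text if has_token else None


def fetch(base_url, params):
    """Fetch one OAI-PMH response and return its raw bytes."""
    with urlopen(base_url + "?" + urlencode(params)) as response:
        return response.read()


def harvest_identifiers(base_url, fetch_page=fetch):
    """Collect all record identifiers, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    identifiers = []
    while True:
        page_identifiers, token = parse_list_records(fetch_page(base_url, params))
        identifiers.extend(page_identifiers)
        if token is None:
            return identifiers
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Passing a different `fetch_page` callable makes it possible to test the paging logic without contacting a live repository.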
COnnecting REpositories (CORE)
• Repository Analytics
• Increase the visibility of content
• Provide applications to aid
  content discovery
• Enrich metadata
Conclusions
    • Yes it’s complicated!
    • Stats don’t tell the
      whole picture
    • Greatest value for
      operational / strategic
      purposes
    • Senior management
    • IR managers should be
      proactive
    • Culture of openness

Editor's Notes

  • #2 We need metrics to support a variety of business needs for a variety of stakeholders:
      – National / international funders and policy makers
      – Institutional policy makers
      – IR managers
      – Content owners (researchers / OER creators)
      – Developer community (increasingly important as APIs are developed and open data becomes available)
    The paper illustrates a number of examples of how the needs of these stakeholders are being addressed. IR managers should take a proactive role in providing open access to metrics associated with their services. "Yes it's complicated and stats don't tell the whole picture": this will undermine repository managers' work in promoting open access (which, as we know, also has complexities).
  • #3 A measure of an organization's activities and performance. Performance metrics should support a range of stakeholders' needs, from customers and shareholders to employees. While traditionally many metrics are finance-based, inwardly focusing on the performance of the organization, metrics may also focus on performance against customer requirements and value. Metrics:
      – help to inform policy decisions on future investment
      – help to inform technical policy decisions on enhancements to the technical infrastructure
      – help to inform operational decisions by practitioners
      – demonstrate value of investment
      – inform decisions on deprecating aspects of the services
      – monitor effectiveness of open access activities
  • #7 "two key purposes of a repository are (1) maximising access to research publications and (2) ensuring long-term preservation of research publications" (Kelly, B. 2011) http://ukwebfocus.wordpress.com/2011/02/24/how-do-we-measure-the-effectiveness-of-institutional-repositories/
  • #18 JISC PIRUS