Bibliometrics in the library


Chances and pitfalls

Wouter Gerritsma, Wageningen UR
Evaluation cycle at universities


 Supervised by VSNU/QANU
     ● 6-year cycle for external peer reviews
     ● Midterm review after 3 years
     ● Unit of analysis (in Wageningen): Graduate schools

 Citation analyses are not stipulated in the current
 Standard Evaluation Protocol, but they have become
 mandatory at Wageningen UR, including at the social
 sciences department and the research institutes
SEP criteria



 quality (including international academic reputation and
 PhD training)
 productivity (the relationship between input and output)
 societal relevance (including valorisation)
 vitality and feasibility (the ability to react adequately to
 important changes in the environment).
Current Research Information Systems


 Metis is a current research information system (CRIS)
    ● Information on all labour relations of all faculty and
      staff
    ● Information on all projects
    ● Information on all outputs (metadata of
      publications)
    ● Data entry at the chair group level
    ● Quality control by the library (inclusion of DOI)
Repository or Institutional Bibliography?

 Wageningen Yield (WaY) is the repository of Wageningen
 UR
      ● Synchronized overnight with the updates from Metis
      ● WaY contains metadata descriptions of all
       Wageningen UR publication output
      ● WaY is our OA repository
      ● WaY is our tool for citation analyses
How do we compare numbers?


 Scientist Z. Math has a publication from 2001 with 17 citations
 Scientist M. Biology has a publication from 2007 with 32 citations
Baselines for Mathematics
Baselines for Molecular Biology
For a single publication

 Zee, F.P.v.d., G. Lettinga & J.A. Field (2001) Azo dye
 decolourisation by anaerobic granular sludge.
 Chemosphere 44:1169-1176.
     ● Citations from WoS: 94
 Journal: Chemosphere
 Categorised by ESI in Environment/Ecology
 Baseline data for Environment/Ecology.
     ● Article from 2001 in Environment/Ecology:
     ● On average: 19.36 citations;
     ● Top 10%: 44 citations; Top 1%: 141 citations
 Relative Impact: 94 / 19.36 = 4.9
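
 The arithmetic on this slide can be sketched in a few lines of Python. The numbers are the ones quoted above; the function names are illustrative assumptions and do not correspond to anything in Metis, WaY, Web of Science or ESI. Treating the thresholds as inclusive (a paper with exactly 44 citations counts as top 10%) is also an assumption.

```python
# Baseline values for 2001 articles in Environment/Ecology, as quoted on the slide.
BASELINE_MEAN = 19.36
TOP_10_THRESHOLD = 44
TOP_1_THRESHOLD = 141


def relative_impact(citations: int, baseline_mean: float) -> float:
    """Citations divided by the world average for the same field and year."""
    return citations / baseline_mean


def percentile_class(citations: int, top10: int, top1: int) -> str:
    """Classify a paper against the top 10% and top 1% thresholds.

    Assumption: thresholds are inclusive (>=).
    """
    if citations >= top1:
        return "top 1%"
    if citations >= top10:
        return "top 10%"
    return "below top 10%"


citations = 94  # WoS citations for Zee, Lettinga & Field (2001)
ri = relative_impact(citations, BASELINE_MEAN)
paper_class = percentile_class(citations, TOP_10_THRESHOLD, TOP_1_THRESHOLD)
print(f"Relative impact: {ri:.1f}")
print(f"Class: {paper_class}")
```

 With 94 citations the paper clears the top 10% threshold (44) but not the top 1% threshold (141), which is the kind of context a raw citation count alone does not give.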
Advanced bibliometric indicators

 Follow Moed (1995) as closely as possible, but...
 Web of Science is used for citation data
     ● We can’t make corrections for self-citations
 Essential Science Indicators for baseline data (World
 average, Top 10% and Top 1%)
     ● Limited number of research fields (22)

 We can determine the representativeness of the citation
 analysis!
Representativeness
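
 The figure for this slide is not preserved in the text. As a minimal sketch, assuming "representativeness" means the share of a unit's recorded output (as held in Metis / WaY) that could be matched to Web of Science and therefore enters the citation analysis, it could be computed as below. The `Publication` class, the `in_wos` flag and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    title: str
    in_wos: bool  # hypothetical flag: record was matched to Web of Science


def representativeness(publications: list[Publication]) -> float:
    """Share of recorded publications that were matched in the citation database."""
    if not publications:
        return 0.0
    matched = sum(1 for p in publications if p.in_wos)
    return matched / len(publications)


# Made-up output list for one unit, for illustration only.
output = [
    Publication("Journal article A", True),
    Publication("Journal article B", True),
    Publication("Book chapter C", False),
    Publication("Report D", False),
    Publication("Journal article E", True),
]
share = representativeness(output)
print(f"{share:.0%} of the recorded output is covered by the citation analysis")
```

 A low share signals that citation indicators describe only a small part of what the unit actually produces, which matters when units from different disciplines are compared.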
How to aggregate from a single publication
to an oeuvre?




  "CI"-like indicator
  "MNCS"-like indicator
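
 The formulas on this slide are not preserved in the text. The sketch below assumes that "CI" refers to the crown indicator, a ratio of sums (total citations divided by total expected citations), and that "MNCS" refers to the mean normalised citation score, a mean of ratios (the average of each paper's own relative impact). The function names and the three-paper oeuvre are hypothetical.

```python
def ci_like(citations: list[float], baselines: list[float]) -> float:
    """Ratio of sums: total citations over total expected citations."""
    return sum(citations) / sum(baselines)


def mncs_like(citations: list[float], baselines: list[float]) -> float:
    """Mean of ratios: average of each paper's citations / baseline."""
    ratios = [c / b for c, b in zip(citations, baselines)]
    return sum(ratios) / len(ratios)


# Hypothetical oeuvre: citations per paper and the matching field/year baseline.
citations = [30, 2, 10]
baselines = [10.0, 4.0, 5.0]

ci = ci_like(citations, baselines)
mncs = mncs_like(citations, baselines)
print(f"CI-like:   {ci:.2f}")
print(f"MNCS-like: {mncs:.2f}")
```

 For this made-up oeuvre the two aggregates differ (about 2.21 versus 1.83): the ratio of sums gives more weight to papers from fields and years with high baselines, while the mean of ratios weights every paper equally.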
Sources of citation data



 Web of Science
 Scopus
 Google Scholar
 Microsoft Academic

 SciFinder; PsycINFO
 arXiv; CiteSeer; other open access repositories

 Other altmetrics initiatives
Web of Science

 Citation data (also includes citations from other
 databases on WoK)
 API to download citation data
 Baselines from ESI
 "New" product InCites
     ● Nijmegen has licensed InCites
Scopus

 Citation data obtainable through an API
 Benchmarking with SciVal Strata
 Not yet fully developed
Google Scholar

 Give them a few more years
 Coverage?
 Ghost citations
 Content duplication
 Benchmarking?
Benchmarking in GS?




      Wouters, P. & R. Costas (2012). Users, narcissism and control. Utrecht, NL: SURFfoundation.
      http://www.surffoundation.nl/en/publicaties/Pages/Users_narcissism_control.aspx.
Comment on GS by Jacsó




 Jacsó, P. (2011). Google Scholar duped and deduped – the aura of “robometrics”. Online Information Review,
 35(1): 154-160. http://dx.doi.org/10.1108/14684521111113632
Altmetrics

 Quickly developing
    ● ScienceCard
    ● Total-Impact
    ● Readermeter
    ● Microsoft Academic Search
    ● etc.

      Wouters, P. & R. Costas (2012). Users, narcissism and control. Utrecht, NL: SURFfoundation.
      http://www.surffoundation.nl/en/publicaties/Pages/Users_narcissism_control.aspx.
Why in the library?



 Library is the functional manager of Metis / WaY because
 of wide experience with bibliographic metadata
 Library manages contracts with publisher(s) of external
 databases that are being used
 Library has experience in developing and maintaining
 large databases
 Library has ample experience in searching complicated
 databases such as Web of Science
Advantage of using Metis / WaY


 Improvements in publication lists, etc. are recorded
 Knowledge of, and experience with, bibliometric analyses
 are better institutionalized
 Clarity / transparency for researchers
 Analysis of a single unit of the institute offers
 advantages for the whole institute
 Better understanding of our own researchers
     ● We know where they publish
     ● We know what they cite
     ● We know something about their impact
Library outreach



 Improvement of the (meta)data quality in the repository
 Many presentations for research groups during the
 preparation for peer reviews
 Presentations based on detailed studies of single groups
 Library gives advice on publication strategies for groups
 and individuals
     ● there is a huge demand for these presentations
 Developed writing & citing courses with graduate schools
Closing the circle: Collection analysis

 By coupling publications with WoS, we have gained
 insight into the relation
    ● Research group – Researchers – Publications –
      Reference list
    ● It is feasible to assign journal usage at faculty
      level, or in more detail (chair groups)
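
 Assigning journal usage to chair groups can be sketched as a simple count over reference lists. The record layout, the group names and the sample data below are hypothetical; the sketch only assumes that each publication is linked to a chair group and to the journals in its reference list, as the chain above describes.

```python
from collections import Counter, defaultdict

# Hypothetical records: (chair group, journals cited in one publication's reference list)
publications = [
    ("Group A", ["Chemosphere", "Water Research", "Chemosphere"]),
    ("Group A", ["Water Research"]),
    ("Group B", ["Nature", "Chemosphere"]),
]


def journal_usage_by_group(records):
    """Count how often each journal is cited, per chair group."""
    usage = defaultdict(Counter)
    for group, cited_journals in records:
        usage[group].update(cited_journals)
    return usage


usage = journal_usage_by_group(publications)
for group, counts in sorted(usage.items()):
    # most_common() lists journals from most to least cited within the group
    print(group, counts.most_common())
```

 Summing the per-group counters gives the same picture at faculty or institute level, which is the kind of input a collection decision would draw on.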
Lessons learned



 Start small, gain experience
 Show you can pull it off
 How much is your university spending on CWTS?
 Invest those resources in your own systems
Thank you!




On the Web:
@wowter
wowter.net
www.slideshare.net/wowter
