Introduction to bibliometric data sources – GOOGLE SCHOLAR –
Introduction to bibliometric data sources
Part 3 (Google Scholar)
Presentation prepared for the EUROPEAN SUMMER SCHOOL
FOR SCIENTOMETRICS by
Nicolas Robinson-Garcia and Daniel Torres-Salinas
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License
http://nrobinsongarcia.com
@nrobinsongarcia
http://sl.ugr.es/torressalinas
@torressalinas
CONTENTS
▪ Description of data source
▪ Getting data from Google Scholar
▪ Final considerations
1.
DESCRIPTION OF THE
DATA SOURCE
Main characteristics of Google Scholar
GOOGLE SCHOLAR, NO INTRODUCTION NEEDED!
Researchers’ Use of Academic Libraries and their Services (RIN, 2007)
Researchers of Tomorrow: the research behavior of Generation Y doctoral students (2012)
Preferred by students…
“When beginning research, more than 75% are likely or extremely likely to start with Google, followed by Google Scholar, the online catalog, the article databases, and Wikipedia in that order.”
Bagget 2012
… better than experts
“Perhaps surprisingly, the simple searches in Google Scholar had consistently higher recall and precision than the expert searches.”
Walters 2011
SOME FACTS
▪ Launched in 2004
▪ It is a search engine, not a database
▪ Indexes material from academic and scientific web domains
▪ Simplicity as a theme
▪ Offers full-text access and citations
But provides very limited metadata!
MAIN CHARACTERISTICS
Document types
▪ Journal articles
▪ Books
▪ Working papers
▪ Proceedings papers
▪ Technical reports
▪ Presentations
▪ …
Geographical coverage
▪ The largest geographic coverage
▪ Greater coverage of non-English sources
Disciplinary coverage
▪ Larger coverage in peripheral fields, mainly SSH
MAIN CHARACTERISTICS
[Figure: size estimates; values shown: 170-175 M, 57 M, 53 M]
Orduña-Malea et al 2015
THE GOOGLE FAMILY
GS CITATION PROFILES
▪ Easy to create and update
▪ Alerts on received citations
▪ Some institutions promote their use ~ ethical issues?
GS METRICS
▪ Journal rankings based on the h5-index
▪ Rankings by fields (top100) (only English) and languages (top20)
▪ Journals with fewer than 100 publications are excluded
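The h5-index behind the GS Metrics rankings is the h-index restricted to the articles a journal published in the last five complete years: the largest number h such that h of those articles have at least h citations each. A minimal Python sketch of that calculation follows; the function name `h_index` and the citation counts are made up for illustration and are not taken from Google Scholar.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    # Rank the citation counts from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the first `rank` items all have >= rank citations
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


# Hypothetical citation counts for the articles a journal published
# within the five-year window (illustrative data only).
five_year_citations = [25, 8, 5, 3, 3]
print(h_index(five_year_citations))  # prints 3
```

Because h can never exceed the number of articles considered, a journal that publishes few articles has a low ceiling regardless of how well cited each one is.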
Our own Juan in GS Profiles
Searching for scientometrics gurus with label:
Looking at the scientific community @ KU Leuven
Ethical issues, transparency and responsibility.
Forcing scholars to use a commercial tool, and its dangers
Delgado López-Cózar et al 2014
STRENGTHS
o Easy to set up (just search for your papers)
o Terrific tool for comparing researchers within a field or department
o Automatically updated basic bibliometric data
o No restrictions on source, language or area
o Everyone can measure their performance (and their colleagues')
WEAKNESSES
o Data can be easily manipulated (a researcher can self-claim non-authored papers)
o Can stimulate vanity and ego
o Can generate unfair comparisons (for example, researchers from different areas in a single university)
o Can generate unfair analysis by non-bibliometric experts
GS Metrics overall ranking
Top journals in German according to GS Metrics
STRENGTHS
o Free product to compare and rank journals
o We can get impact information about non-JCR journals and about national and SSH publications
o Transparency: citations for every paper that contributes to the h-index can be checked
o High correlation with JCR Impact Factor (0.82)
o Can be easily replicated
WEAKNESSES
o Methodological inconsistencies such as comparing journals from different areas
o Lack of proper bibliographic control (duplicates, “dirty” data, …)
o No selection criteria
o No action against data manipulation
o Just top results are presented
o H-index favours highly productive journals
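The lack of bibliographic control mentioned above (duplicates, "dirty" data) usually means that records must be cleaned before any counting. One possible approach, sketched below in Python, is to normalise titles and merge records that collapse to the same key. This is an assumption about how one might clean such data, not a description of what Google Scholar or any particular tool does; the function names and the sample records are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalise_title(title):
    """Build a crude matching key: no accents, lowercase, no punctuation."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace anything that is not a letter, digit or space with a space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    # Collapse runs of whitespace.
    return " ".join(text.split())


def merge_duplicates(records):
    """Merge (title, citations) pairs whose normalised titles match.

    Summing the counts is a simplification: if the same citing document
    is attached to both versions, it will be counted twice.
    """
    merged = defaultdict(int)
    for title, cites in records:
        merged[normalise_title(title)] += cites
    return dict(merged)


# Hypothetical records (illustrative data only).
records = [
    ("The h-index: A Review", 10),
    ("The H-Index – a review", 4),
    ("Análisis de citas", 2),
    ("Analisis de citas", 1),
]
print(merge_duplicates(records))
```

Exact matching on a normalised key only catches trivial variants; titles with typos or truncation would need fuzzy matching and manual checking.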
MAIN TOOLS WITHIN GOOGLE SCHOLAR
Alerts
Email result updates from:
▪ Saved search queries
▪ Citing documents
▪ GS Author profiles
▪ Citations to GS profiles
Updates
Alerts of papers based on:
▪ Search history
▪ Citations profile
▪ References
▪ Title
▪ Co-authors
Library + Cite
▪ Like a reference manager
▪ Allows use of labels
▪ Chrome add-on allows searching GS anywhere
▪ Also offers citation in any format
2.
GETTING DATA FROM
GOOGLE SCHOLAR
Google Scholar as a bibliometric data
source
GETTING THE FACTS STRAIGHT
▪ Google Scholar is currently the most powerful academic search engine
▪ It also offers great functionalities for citizen bibliometrics
▪ It is possible to use GS for bibliometric studies and reports, but it is neither practical nor advisable
GETTING THE FACTS STRAIGHT
✓ Greater coverage for SSH
✓ Wider range of document types
✓ Reliable at the macro level
✓ Better language and geographical coverage
✗ Limited metadata (no API)
✗ Not replicable
✗ No quality controls
✗ Lack of transparency
“Google Scholar contains valuable information that is not available from any other database, but it is impractical to rely on it for large-scale analyses.”
Alberto Martín-Martín, interviewed by Holly Else, 2018
But…
“[…] the information in GS, once retrieved on the basis of existing publication data and cleaning of citation sources, indicate acceptable levels of reliability in terms of source.”
Prins et al., 2014
3rd PARTY TOOLS TO GATHER DATA
▪ Harzing’s Publish or Perish
MORE ON THURSDAY!
3.
FINAL CONSIDERATIONS
Wrapping up!
THE CRITICAL VIEW
▪ A goldmine of scholarly data
▪ Free does not mean open!
▪ Professional bibliometrics… is it worth the effort?
THE OPTIMISTIC VIEW
▪ The world is much bigger than WoS and Scopus
▪ A challenge and an opportunity to explore marginalised areas of science (non-English literature, SSH)
▪ Giving it back to the people, a trigger for citizen bibliometrics
THANKS!
Introduction to bibliometric data
sources – Part 3 (Google Scholar)