This document provides a summary of a presentation on bibliometrics. It discusses the basics of bibliometrics including who cites whom and patterns of scholarly research. It then covers major citation databases like Web of Science, Scopus, and Google Scholar and compares their coverage. Free online sources of bibliometric data are also presented including Eigenfactor, SJR, and Publish or Perish. Emerging altmetrics and new bibliometric indicators are discussed. The future of bibliometrics and how to follow new developments are addressed at the end.
3. Bibliometrics 101
• Bibliometrics Basics
• Introduction to Citation Databases
– WoS, Scopus, GS
• Free Web Sources with Bibliometric Indicators
• The Future!
4. Bibliometrics??
• Who cited whom
• Patterns in scholarly research
• Evolution of knowledge
• Measures of scholarly impact, productivity, prestige
5. Keep In Mind
• Journal Quality ≠ Article Quality
• Citing a work ≠ Agreement with findings
• Self Citations
• Citation Patterns Differ Between Subjects
9. Total Citation Counts
Figure 1: Patterns of overlap and unique citations (number and percentage of total citations).
Lasda Bergman, EM (2012). Finding Citations to Social Work Literature: The Relative Benefits of Using Web of Science, Scopus, or Google
Scholar, The Journal of Academic Librarianship, http://dx.doi.org/10.1016/j.acalib.2012.08.002
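The overlap and unique-citation counts that Figure 1 summarizes can be sketched with plain set operations. This is only an illustration, not the method of the cited study: the function name `overlap_report` and the sample identifiers are hypothetical, and it assumes each citing reference has already been matched to one shared identifier across the three databases.

```python
def overlap_report(wos, scopus, gs):
    """Count citing references by the combination of databases that found them.

    Each argument is a set of identifiers (assumed already de-duplicated and
    matched across databases, which in practice is the hard part).
    Returns {region: (count, percent of all distinct citing references)}.
    """
    regions = {
        "all three": wos & scopus & gs,
        "WoS and Scopus only": (wos & scopus) - gs,
        "WoS and GS only": (wos & gs) - scopus,
        "Scopus and GS only": (scopus & gs) - wos,
        "WoS only": wos - scopus - gs,
        "Scopus only": scopus - wos - gs,
        "GS only": gs - wos - scopus,
    }
    total = len(wos | scopus | gs)
    report = {}
    for name, refs in regions.items():
        percent = round(100 * len(refs) / total, 1) if total else 0.0
        report[name] = (len(refs), percent)
    return report


# Hypothetical citing-reference identifiers for one cited article.
wos = {"a", "b", "c"}
scopus = {"b", "c", "d"}
gs = {"c", "d", "e", "f"}
report = overlap_report(wos, scopus, gs)
for region, (count, percent) in report.items():
    print(f"{region}: {count} ({percent}%)")
```

The seven regions are mutually exclusive, so their counts add up to the number of distinct citing references across all three sources.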
10. Source Types of Citing References
Web of Science: Journal Articles 99.7%, Miscellaneous 0.4%
Scopus: Journal Articles 83.8%, Reviews 11.7%, Series 4.5%
Google Scholar: Journal Articles 59.6%, Dissertations/theses 13.5%, Books 9.7%, Foreign Language 8.6%, Miscellaneous 8.5%
Figure 2. Source types of all citing references.
11. Unique Citing References for Each Journal
Figure 6. Distribution of unique citing references for each journal.
13. LIS Faculty (Meho, et al.)
• Overlap and coverage for LIS faculty
– all three needed
• Rankings of small scale and large scale bodies of LIS research
– Scopus for small scale rankings, either for large scale (GS not used)
• Coverage of human computer interaction research
– Scopus preferable (GS not used)
____________________________________
Meho, L. I., & Sugimoto, C. R. (2009). Assessing the scholarly impact of information studies: A tale of two citation databases-Scopus and Web of Science. Journal of the American Society for Information Science and Technology, 60(12), 2499–2508.
Meho, L. I., & Rogers, Y. (2008). Citation counting, citation ranking, and h-index of human-computer interaction researchers: A comparison of Scopus and Web of Science. Journal of the American Society for Information Science and Technology, 59(11), 1711–1726.
Meho, L. I., & Yang, K. (2007). Impact of data sources on citation counts and rankings of LIS faculty: Web of Science versus Scopus and Google Scholar. Journal of the American Society for Information Science and Technology, 58(13), 2105–2125.
14. Earth Science (Mikki)
• Web of Science Preferable to Google Scholar
– GS covers 85% of what is in WoS
– Additional citations in GS form a “long tail” of minor and irrelevant citations
– Did not compare Scopus
Mikki, S. (2010). Comparing Google Scholar and ISI Web of Science for earth sciences. Scientometrics, 82(2), 321–331.
15. Business and Economics (Levine-Clark & Gil)
• Scopus has higher citation counts than WoS
• Non-scholarly citations in GS still demonstrate impact
• Google Scholar OK to use if WoS/Scopus not available
Levine-Clark, M., & Gil, E. L. (2009). A comparative citation analysis of Web of Science, Scopus, and Google Scholar. Journal of Business and Finance Librarianship, 14(1), 32–46.
16. Medicine (Kulkarni, et al.)
• Variations in coverage
• Higher citation counts in GS and Scopus than in WoS
• No one citation database preferable for all of medicine
Kulkarni, A. V., Aziz, B., Shams, I., & Busse, J. W. (2009). Comparisons of Citations in Web of Science, Scopus, and Google Scholar for Articles Published in General Medical Journals. JAMA: The Journal of the American Medical Association, 302(10), 1092–1096. doi:10.1001/jama.2009.1307
17. Publish or Perish Book
Harzing, A.-W. (2010). The Publish or Perish Book: Your Guide to Effective and Responsible Citation Analysis (1st ed.).
Melbourne: Tarma Software Research Pty Ltd.
20. Influence of Google PageRank
Source: http://commons.wikimedia.org/wiki/File:PageRank-hi-res.png#file
created by Felipe Micaroni Lalli
21. Influence of Google PageRank
• Eigenvector analysis:
– “The probability that a researcher, in documenting his or her research, goes from a journal to another selecting a random reference in a research article of the first journal. Values obtained after the whole process represent a ‘random research walk’ that starts from a random journal to end in another after following an infinite process of selecting random references in research articles. A random jump factor is added to represent the probability that the researcher chooses a journal by means other than following the references of research articles.” (González-Pereira et al., 2010)
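The "random research walk" in the quotation can be approximated with a short power iteration. The sketch below is a simplified PageRank-style calculation, not the published SJR or Eigenfactor formula. The function name, the jump probability of 0.15, the fixed iteration count, and the three-journal citation matrix are all assumptions made for illustration.

```python
def random_walk_prestige(citations, jump=0.15, iterations=100):
    """Approximate where a random reference-following reader ends up.

    citations[i][j] is the number of references in journal i that point to
    journal j. With probability `jump` the reader picks any journal at
    random; otherwise the reader follows a random reference.
    Returns one score per journal; the scores sum to 1.
    """
    n = len(citations)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [jump / n] * n
        for i, row in enumerate(citations):
            outgoing = sum(row)
            for j in range(n):
                # A journal with no outgoing references spreads its weight evenly.
                share = row[j] / outgoing if outgoing else 1.0 / n
                new_scores[j] += (1 - jump) * scores[i] * share
        scores = new_scores
    return scores


# Hypothetical reference counts among journals A, B, and C (rows cite columns).
citations = [
    [0, 5, 1],  # A cites B five times and C once
    [2, 0, 1],  # B cites A twice and C once
    [3, 4, 0],  # C cites A three times and B four times
]
scores = random_walk_prestige(citations)
for name, score in zip("ABC", scores):
    print(f"Journal {name}: {score:.3f}")
```

In this toy matrix journal B receives the largest share of references from the other two journals, so the walk spends the most time there.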
23. Leydesdorff, L. (forthcoming). “Betweenness Centrality” as an Indicator of the “Interdisciplinarity” of Scientific Journals. Journal of the American Society for Information Science and Technology. http://www.leydesdorff.net/betweenness/index.htm
30. SJR vs Article Influence/JIF
González-Pereira, B., Guerrero-Bote, V., & Moya-Anegón, F. (2009). The SJR indicator: A new indicator of journals’ scientific prestige. arXiv preprint arXiv:0912.4141, p. 8. Retrieved from http://arxiv.org/abs/0912.4141
32. Quick Comparison
Indicator | Publication Window | Self Citations | Subject Field Normalization | Underlying Database | Effect of Extent of Database Coverage
SNIP | 3 years | Included | Yes | Scopus | Corrects for differences in coverage of subjects
SJR | 3 years | Maximum 33% | Yes | Scopus | More prestige when database coverage is more extensive
AI | 5 years | Not Included | Yes | JCR (WoS) | More prestige when database coverage is more extensive
JIF | 2 years | Included | No | JCR (WoS) | Does not correct for differences in coverage of subjects
Journal Metrics (2011). The evolution of journal assessment, p 11 http://www.journalmetrics.com/documents/Journal_Metrics_Whitepaper.pdf
37. PoP Metrics
• Papers
• Citations
• Cites/paper
• Cites/author
• Papers/Author
• Authors/Paper
• H index
• G index
• Hc Index
• HI index
• HI, Norm
• Hm Index
• E-index
• AWCR
• Per Author AWCR
50. Follow the Discussion!
• Twitter Hashtag #altmetrics
• Blog search:
http://www.google.com/blogsearch?hl=en
– Search Bibliometrics, Citations, etc.
• Chronicle of Higher Education
• Scientometrics
51. Thank You for coming
• Elaine Lasda Bergman, University at Albany
• elasdabergman@albany.edu
• http://www.slideshare.net/librarian68/
Editor's Notes
Web of Science: 1970’s, Eugene Garfield, panel of expert reviewers for inclusion of journals. Scopus: 1996, panel of reviewers; beware if non-Elsevier, all cited references may not be linked yet. Google Scholar: 1990’s, no panel of reviewers, whatever the world puts up on the web.
No one source tracks ALL citations. Other databases (subject specific) and Google Books sometimes provide the information, but there is no way to obtain a be-all-end-all citation count.
A November 2012 article analyzed the coverage of each of the three databases with regard to these five social work journals.
Less than a quarter are included in all sources. 55% are in both Scopus and WoS. Many of the
Almost nothing other than journal articles in WoS; reviews and conference proceedings, dissertations, books, and foreign-language (fl) items in GS are a sizeable chunk of scholarly material.
A wide variety of scholars have done similar research. Here are some examples.
Meho and Yang: citations of prominent LIS scholars; found that all three were needed to get the most accurate citation count. Meho and Sugimoto: compared rankings, not counts, of articles, scholars, and journals and found that the rankings did change depending on whether WoS or Scopus was used. Rankings of large scale bodies of research, such as entire countries and research domains, are comparable when either is used. Meho and Rogers: Scopus had better citation coverage for HCI scholars than WoS.
Pubs of 29 Earth science faculty. GS covers 85% of what is in WoS. Coverage in GS > 2000 did not increase. Many citations in GS constitute a “long tail” of minor and irrelevant citations. For each overlapping cited reference, WoS provided more citing references. Did not compare Scopus.
15 business and economics journals. Scopus higher counts than WoS. GS non-scholarly citations still demonstrate impact. GS reasonable to use if WoS/Scopus are not available.
Articles in 3 major medical journals. Variations in coverage for subgroups of articles. Higher citations (scholarly articles only) in GS & Scopus than WoS. Other tools are needed aside from citation count to measure impact. Did not measure overlap.
This book explains how to use Publish or Perish, a tool we will look at later which draws upon Google Scholar citation data to generate metrics. But the book has great background on citation searching, relative coverage of each database, and some discussion of various disciplines in relation to citation analysis and bibliometrics. A great introduction to bibliometric concepts.
Traditional: Journal Impact Factor, citation count, half-life. Citation count does not indicate patterns in citing references over time or distributed across publications. Journal Impact Factor and half-life measure only very recent impact.
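To make the "very recent impact" point concrete, here is a minimal sketch of the standard two-year Journal Impact Factor calculation: citations received in the report year to items from the two preceding years, divided by the citable items published in those two years. The function name and all counts are hypothetical.

```python
def journal_impact_factor(citations_by_pub_year, items_by_pub_year, report_year):
    """Two-year impact factor for `report_year`.

    citations_by_pub_year: {publication year: citations received during
        report_year by items published in that year}
    items_by_pub_year: {publication year: number of citable items published}
    Only the two years before report_year are counted; anything older is
    ignored, which is why the measure reflects recent impact only.
    """
    window = (report_year - 1, report_year - 2)
    cites = sum(citations_by_pub_year.get(year, 0) for year in window)
    items = sum(items_by_pub_year.get(year, 0) for year in window)
    return cites / items if items else 0.0


# Hypothetical journal: the citations to its 2005 items are outside the window.
citations_in_2012 = {2011: 150, 2010: 250, 2005: 900}
citable_items = {2011: 100, 2010: 100, 2005: 80}
jif_2012 = journal_impact_factor(citations_in_2012, citable_items, 2012)
print(jif_2012)
```

The heavily cited 2005 items make no difference to the result, which illustrates why older but influential work is invisible to this indicator.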
The yellow page is ranked as more relevant because many pages contain links to it. A lot of pages link to the blue page, which makes it more influential in the yellow page’s ranking.
Some citations are more important than others! Look at article a. Articles which cite Article a are weighted more heavily in Article a’s eigenfactor if those articles have themselves been cited frequently.
Eigenvector analysis: Similar to Google PageRank, “chain of citations”
Eigenfactor: JOURNAL METRIC
– “Prestige” of a journal
– Takes into account the total amount of “citation traffic” appearing in JCR
– Influence of the citing journal, divided by the total number of citations appearing in that journal
– Larger journals with more articles will have larger Eigenfactors
– Past 5 years, instead of 2 years like JIF
– Takes into account all citations indexed in WoS that year
Article Influence Score:
– Average influence, per article, of the papers in a journal
– Comparable to the Impact Factor
– Corrects for the issues of journal size in the raw Eigenfactor score by dividing by number of articles published in the journal
– Corrects for journal self citations
– Neurology’s 2006 Article Influence score = 2.01, meaning that an avg. article in Neurology is 2X as influential as an avg. article in all of JCR
Web of Science: Journal Citation Reports
“Quick and dirty” articles on hot researchers, trending research topics, institutions and journals. Promotes analytical products being sold by Thomson; no longer free. Hit or miss information, not searchable.
Current “average prestige per paper”. Uses Scopus data. Citation time window is 3 years (instead of 2 for JIF and 5 for Eigenfactor). Corrects for self citations. Because it is per paper, it accounts for differences in journal size.
Source Normalized Impact Per Paper. Uses Scopus data. Citation potential: total number of citing references in all journals which have cited this journal. Instead of in the entire Scopus database like Eigenfactor stats, it is only within the articles that have cited the journal. This presumably will allow for apples to apples comparisons from subject fields. Recently updated: SNIP2. “The ratio of a journal’s citation count per paper and the citation potential in its subject field.” Allows direct comparisons between journals with different subjects. Larger journals
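The quoted ratio can be illustrated with a few lines of arithmetic. This sketch takes the citation potential as a ready-made relative number; how Scopus actually derives it is more involved and is not reproduced here. The function name and the two example journals are hypothetical.

```python
def snip(citations, papers, citation_potential):
    """Ratio of a journal's citations per paper to its field's citation potential.

    citation_potential is treated here as a given input (an assumption):
    larger values stand for fields where reference lists are longer and
    citations are therefore easier to come by.
    """
    raw_impact_per_paper = citations / papers
    return raw_impact_per_paper / citation_potential


# Two hypothetical journals with very different raw citation rates.
math_journal = snip(citations=300, papers=200, citation_potential=0.5)
biology_journal = snip(citations=1200, papers=200, citation_potential=2.0)
print(math_journal, biology_journal)
```

The biology journal collects four times as many citations per paper, yet both journals land on the same normalized value once the difference in field citation habits is divided out.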
Subject field normalization: corrects for differences in citation rates in a given subject. Effect of database coverage: corrects or doesn’t correct for the difference in coverage of Scopus, WoS, and GS in indexing the journals pertaining to that subject.
Provides a variety of metrics for measuring scholarly impact and output. More useful for metrics on authors than journals or institutions. Uses Google Scholar citation information. Useful for interdisciplinary topics, fields relying heavily on conference papers or reports, non-English language sources, new journals, etc. Continuously updated since 2006.
• Total number of papers, total number of citations
• Avg # citations per paper, citations per author, papers per author, authors per paper
• H index: e.g. an h index of 47 means the author has 47 papers that were cited a MINIMUM of 47 times each
• G index: gives greater weight to highly cited articles: the top g articles together have received a minimum of g squared citations (3, 9)
• Hc-index: puts weight on newer articles
• Hi-index: individual h index, accounts for co-authors by dividing by # of authors
• Hi, norm: alternative individual h index, accounts for co-authors by dividing # of citations by average # authors per paper
• Hm-index (Schreiber’s): uses fractional paper counts instead of splitting up authors or citations
• E-index: greater weight to highly cited articles. Compute the h index and h squared; then count how many citations the top h papers received beyond h squared; then take the square root of that
• AWCR: age weighted citation rate, and AW index: account for variations in citation over time. A variation of the h-index which gives greater weight to newer articles with high citations
• Per author AWCR: AWCR divided by number of authors for each paper
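Three of these definitions are easy to check by hand, so a small sketch may help. It follows the commonly published definitions of the h, g, and e indexes and is not Publish or Perish's own code; the function names and the citation counts are hypothetical.

```python
import math


def h_index(cites):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(cites, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(cites):
    """Largest g such that the top g papers together have at least g*g citations.

    This version caps g at the number of papers.
    """
    ranked = sorted(cites, reverse=True)
    running_total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if running_total >= rank * rank:
            g = rank
    return g


def e_index(cites):
    """Square root of the citations in the top h papers beyond h squared."""
    h = h_index(cites)
    top_h = sorted(cites, reverse=True)[:h]
    return math.sqrt(sum(top_h) - h * h)


# Hypothetical author with five papers.
cites = [10, 8, 5, 4, 3]
print(h_index(cites), g_index(cites), round(e_index(cites), 2))
```

For this author four papers have at least four citations each (h = 4), while all five papers together have 30 citations, which is at least 25 (g = 5). The top four papers hold 27 citations, 11 more than h squared, so e is the square root of 11.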
Citation Impact Discerning Self Citations. Measures output of authors for prestige and influence. Similar to PoP. Corrects for self-citations. Uses Google Scholar data. Citations per year, h-index, g-index, total citations, avg cites per paper, self citations included and excluded, etc.
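Reporting counts with self-citations "included and excluded" comes down to checking whether a citing paper shares an author with the cited one. The sketch below shows that idea only; it is not how the tool itself works. It assumes author names are already normalized to identical strings, which real data rarely is, and the function name and author names are hypothetical.

```python
def count_citations(cited_authors, citing_papers):
    """Count citations with and without self-citations.

    cited_authors: author names of the cited paper.
    citing_papers: list of author-name lists, one per citing paper.
    A citing paper counts as a self-citation when it shares at least one
    author name with the cited paper (exact string match, an assumption).
    """
    cited = set(cited_authors)
    total = len(citing_papers)
    self_citations = sum(1 for authors in citing_papers if cited & set(authors))
    return {
        "included": total,
        "excluded": total - self_citations,
        "self": self_citations,
    }


# Hypothetical cited paper and the author lists of four papers that cite it.
counts = count_citations(
    ["Author A", "Author B"],
    [
        ["Author C"],
        ["Author A", "Author D"],  # shares Author A, so a self-citation
        ["Author E", "Author F"],
        ["Author B"],  # shares Author B, so a self-citation
    ],
)
print(counts)
```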
Open source identifier; works with Scopus and WoS IDs. You put it on grants, websites, etc.
Altmetrics: gathers information from ORCID, GS, DOI, PMID, Web pages, Slideshare. Items saved, downloaded, recommended, cited, viewed, discussed, tweeted.
Uses interesting visualizations to show networks of institutions, authors and co authors, interdisciplinarity, as well as h index, etc. Ask for 2 universities
DIRTY DATA!!! Citations, H index, i10 index (# of pubs with at least 10 citations). Can be made public or private.
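The i10 index defined in this note is a simple count, sketched below with hypothetical citation counts and a hypothetical function name.

```python
def i10_index(cites):
    """Number of publications with at least 10 citations."""
    return sum(1 for count in cites if count >= 10)


# Hypothetical profile with five publications.
profile_cites = [25, 10, 9, 3, 12]
print(i10_index(profile_cites))
```

Because the threshold is fixed, a single mis-attributed or duplicated record near 10 citations can change the value, which is one reason the "dirty data" warning matters.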
Measure of a publication career, what libraries hold books, Ranganathan
Firefox or Chrome browser extension; it needs to be installed on the computer. Then type in Trudi Jacobson, and key in the information science and library science, and information literacy tags. Many citation based impact measures have been proposed. Each has advantages and limitations. Scholarometer computes some of the most established impact factors. Hirsch's original h-index (doi:10.1073/pnas.0507655102) is defined as the maximum number of articles h such that each has received at least h citations. Egghe's g-index gives more weight to publications with many citations (doi:10.1007/s11192-006-0144-7). Schreiber's hm index apportions citations fairly for papers with multiple authors (doi:10.1088/1367-2630/10/4/040201). Finally, Radicchi, Fortunato and Castellano's universal h-index hf allows quantitative comparison of the impact of authors in different disciplines, with different citation patterns (doi:10.1073/pnas.0806977105).