Bibliometrics:
Essential Concepts and Tools

                  Elaine M. Lasda Bergman
                  Bibliographer for Social Welfare
                  and Dewey Reference
                  Dewey Graduate Library
What is bibliometrics?


Scholarly communication: tracing the history and
evolution of ideas from one scholar to another
Measures the scholarly influence of
articles, journals, scholars
The birth of citation analysis


Eugene Garfield, the “father of citation analysis,”
developed the first bibliometric index tools
Citation indexes and Journal Citation Reports
  “ISI Indexes”: Science Citation Index, Social Sciences
  Citation Index, Arts & Humanities Citation Index
Better coverage of the hard sciences than of the social
sciences, and worse still of the humanities
Garfield’s metrics


Citation count
Impact Factor
Immediacy Index
Citation Half-Life
Citation count


Number of times cited within a given time period
  Author
  Journal

Does not take into account
  Materials not included in citation database
  Self citations
Impact factor


Measures the “impact” of a journal (not an article) within
a given subject
Formula is a ratio:
  Number of citations received in a given year by articles
  the journal published in the previous 2 years, divided by
  the number of scholarly articles the journal published in
  those 2 years
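
As a worked illustration of the ratio above, here is a minimal sketch. The function name and the sample counts are hypothetical and are not taken from any real journal.

```python
def impact_factor(citations_to_prior_two_years: int,
                  articles_in_prior_two_years: int) -> float:
    """Citations received this year by articles from the previous two years,
    divided by the number of articles published in those two years."""
    if articles_in_prior_two_years <= 0:
        raise ValueError("journal published no articles in the two-year window")
    return citations_to_prior_two_years / articles_in_prior_two_years


# Hypothetical journal: 300 citations in 2010 to its 2008-2009 articles,
# of which there were 120.
example_if = impact_factor(300, 120)
print(example_if)
```

With these made-up counts the ratio works out to 2.5, i.e. an average of two and a half citations per recent article.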
Concerns with impact factor


Cannot be used to compare journals across disciplines (per
Garfield himself) because rates of publication and citation
differ
Two-year time frame is not adequate for non-scientific
disciplines
Coverage of some disciplines is not sufficient in the ISI
databases
Is a measure of “impact” a measure of “quality”?
Immediacy index


What it’s supposed to measure: how quickly articles in
a given journal have an impact on the discipline
Formula: the average number of times an article in a
journal in a given year was cited in that same year
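
The same-year average can be sketched the same way. The names and numbers below are illustrative assumptions only.

```python
def immediacy_index(same_year_citations: int, articles_published: int) -> float:
    """Citations received in a year by articles published in that same year,
    divided by the number of articles published that year."""
    if articles_published <= 0:
        raise ValueError("journal published no articles that year")
    return same_year_citations / articles_published


# Hypothetical journal: 120 articles published in 2010, cited 30 times in 2010.
example_immediacy = immediacy_index(30, 120)
print(example_immediacy)
```

Here the result is 0.25: on average, one in four of the year's articles was cited before the year ended.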
Citation Half-Life


What it’s supposed to measure: duration of relevance
of articles in a given journal
Formula: median age of the journal’s articles that were
cited in a given year
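
A simplified sketch of the median-age idea follows. It returns a whole number of years (the first age at which at least half of the year's citations are accounted for); published half-life figures are typically fractional, so treat this as an approximation. The function name and the citation counts are hypothetical.

```python
from typing import Mapping


def citation_half_life(citations_by_age: Mapping[int, int]) -> int:
    """Return the smallest article age (in years) by which at least half of
    all citations received this year are accounted for."""
    total = sum(citations_by_age.values())
    if total == 0:
        raise ValueError("no citations recorded")
    running = 0
    for age in sorted(citations_by_age):
        running += citations_by_age[age]
        if 2 * running >= total:
            return age
    raise AssertionError("unreachable: cumulative count always reaches total")


# Hypothetical journal: citations received this year, grouped by the age
# of the cited article (1 = published this year, 2 = last year, ...).
example_half_life = citation_half_life({1: 10, 2: 30, 3: 25, 4: 20, 5: 15})
print(example_half_life)
```

In this made-up distribution, the cumulative counts are 10, 40, and 65 out of 100, so the halfway point falls among articles three years old.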
Twenty-first century tools
Influence of Google PageRank


 Eigenvector analysis:
    “The probability that a researcher, in documenting his or
    her research, goes from a journal to another selecting a
    random reference in a research article of the first
    journal. Values obtained after the whole process
    represent a ‘random research walk’ that starts from a
    random journal to end in another after following an
    infinite process of selecting random references in
    research articles. A random jump factor is added to
    represent the probability that the researcher chooses a
    journal by means other than following the references of
    research articles.” (González-Pereira et al., 2010)
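
The “random research walk” in the quotation can be sketched as a small PageRank-style power iteration. This is an illustrative toy, not the algorithm used by any of the tools in this presentation: the three journals, their citation counts, the damping value of 0.85, and the function name are all assumptions.

```python
def journal_pagerank(citations, damping=0.85, iterations=100):
    """citations[a][b] = number of references in journal a that point to
    journal b. Every cited journal must also appear as a key."""
    journals = sorted(citations)
    n = len(journals)
    rank = {j: 1.0 / n for j in journals}
    for _ in range(iterations):
        # Random jump: with probability (1 - damping) the reader picks
        # a journal by some means other than following a reference.
        new_rank = {j: (1.0 - damping) / n for j in journals}
        for source in journals:
            out_total = sum(citations[source].values())
            if out_total == 0:
                # A journal with no outgoing references spreads its
                # weight evenly over all journals.
                for target in journals:
                    new_rank[target] += damping * rank[source] / n
                continue
            for target, count in citations[source].items():
                # Follow a random reference from the source journal.
                new_rank[target] += damping * rank[source] * count / out_total
        rank = new_rank
    return rank


# Hypothetical citation counts among three journals.
toy_citations = {
    "A": {"B": 3, "C": 1},
    "B": {"A": 1, "C": 1},
    "C": {"A": 1},
}
toy_ranks = journal_pagerank(toy_citations)
for name, score in sorted(toy_ranks.items(), key=lambda kv: -kv[1]):
    print(name, round(score, 3))
```

The scores sum to 1 and can be read as the share of time the random reader spends at each journal. A citation from a highly ranked journal passes along more weight than one from a low-ranked journal, which is the key difference from a raw citation count.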
Sources Using ISI Data
Journalranking.com


Journalranking.com applies an eigenvector (PageRank)
algorithm to ISI data and lets users create their own
categories
  Can assign different weights to citations from the same
  journal, the same category, other categories, or only
  within a specific list
  Not updated since 2005
  http://libguides.library.albany.edu/content.php?pid=60086&sid=441804
Eigenfactor.org


  http://libguides.library.albany.edu/content.php?pid=60086&sid=441804
  Uses ISI data
  Similar to PageRank
  Listed in JCR as of 2009
  Eigenfactor Score:
     Each citation is weighted by the influence of the citing
     journal divided by the total number of citations
     appearing in that journal
  Example: Neurology (2006): score of 0.204 = an estimated
  0.2% of all citation traffic of journals in JCR (Bergstrom &
  West, 2008).
  Larger journals will have more citations and therefore will
  have larger Eigenfactor scores
Article Influence Score


From Eigenfactor: measure of the prestige of a journal
Average influence, per article, of the papers in a
journal
Comparable to the Impact Factor
Corrects for the effect of journal size on the raw
Eigenfactor score
Neurology’s 2006 Article Influence Score = 2.01, meaning
that an average article in Neurology is about 2x as influential
as an average article in all of JCR
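
The size correction can be sketched as follows, assuming the commonly published definition in which the Article Influence Score is 0.01 times the Eigenfactor score divided by the journal's share of all articles in the database. The article share used below is a hypothetical value chosen so that the result lands near the 2.01 quoted above; it is not a figure from the source.

```python
def article_influence(eigenfactor_score: float, article_share: float) -> float:
    """Per-article influence: Eigenfactor score (a percentage of total
    citation traffic) scaled by the journal's fraction of all articles.
    A result of 1.0 corresponds to the average article in the database."""
    if article_share <= 0:
        raise ValueError("article share must be positive")
    return 0.01 * eigenfactor_score / article_share


# Eigenfactor score from the Neurology example; the article share
# (about 0.1% of all articles) is an assumed, illustrative value.
example_ai = article_influence(0.204, 0.001015)
print(round(example_ai, 2))
```

Read this way, a journal that draws 0.204% of citation traffic while publishing only about 0.1% of the articles has roughly twice the average per-article influence.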
ScienceWatch


Provides “quick and dirty” articles on hot
researchers, trending research topics, institutions and
journals
Much of this site (in-cites, etc.) is now part of
analytical products sold by Thomson Reuters; no longer
free
There are still some good articles, but the site is not
searchable, so information is hit or miss
http://sciencewatch.com/dr/sci/11/
New sources for citation information


  Google Scholar
  Scopus
Scopus:
alternate database of citation data


Review panel, i.e., quality control
Broader coverage than ISI: covers all the journals in WoS and
more
Strongest in the “hard” sciences; ostensibly improved
social science coverage; arts and humanities are
“getting there”
Algorithmically determined with human editing
Google Scholar
alternate database of citation data


No rhyme or reason to what is included
Biggest source of citation data
Foreign language sources
Sources other than scholarly journals
Entirely algorithmically determined, no human editing
Scopus analytics


SNIP
SJR/SCImago
Author Evaluator
SNIP
(Source Normalized Impact Per Paper)


Journal ranking based on citation analysis, adjusted
for how frequently the other journals within the field
cite (the field is all journals
citing this particular journal)
SNIP is defined as the ratio of the journal’s citation
count per paper and the citation potential in its
subject field. (Moed, 2009)
http://www.scopus.com/home.url
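
The ratio in the definition above can be illustrated with a short sketch. The function name and both input values are hypothetical; how the citation potential of a field is actually estimated is not covered here.

```python
def snip(citations_per_paper: float, field_citation_potential: float) -> float:
    """Journal's citations per paper divided by the citation potential
    of its subject field (both supplied by the caller)."""
    if field_citation_potential <= 0:
        raise ValueError("citation potential must be positive")
    return citations_per_paper / field_citation_potential


# Two hypothetical journals with the same citations per paper but in
# fields with different citation habits.
snip_low_citing_field = snip(3.0, 0.75)
snip_high_citing_field = snip(3.0, 1.5)
print(snip_low_citing_field, snip_high_citing_field)
```

The journal in the field where citations are scarce ends up with the higher value (4.0 versus 2.0), which is the point of normalizing by source.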
SJR: SCImago Journal Rank


What it’s supposed to measure: current “average
prestige per paper”
SCImago website uses journal/citation data from
Scopus; SJR is also available within the Scopus database
Formula: citation time window is 3 years instead of the 2
used by the JIF
Corrections for self citations
Strong correlation to JIF
SCImago Journal Rank


Prestige factors include the number of journals in the
database, the number of papers from the journal in the
database, and the citation numbers and “importance”
received from other journals
Size dependent: larger journals have greater prestige values
Normalized by the number of significant works
published by the journal, which helps correct for size
variations
Corrections made for journal self-citations
Scopus Author Evaluator


Breakdown of documents by source
H-index
Citations per year (graph)
Google Scholar


Publish or Perish
CIDS
Publish or Perish


Provides a variety of metrics for measuring
scholarly impact and output.
More useful for metrics on authors than journals
or institutions
Uses Google Scholar citation information
Useful for interdisciplinary topics, fields relying
heavily on conference papers or reports, non-
English language sources, new journals, etc.
Continuously updated since 2006
Publish or Perish Metrics


Basic metrics:
  Number of papers, number of citations, active years, years since
  first publication, average number of citations per paper, average
  number of citations per year, average number of citations per author, etc.
Complex metrics:
  h-index and its many variations: m-quotient, g-index (corrects
  the h-index for variations in citation patterns), AR-index,
  AW-index
Does not have any corrections for self-citations
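
The two best-known author metrics in this list can be sketched from their standard definitions: the h-index is the largest h such that h papers each have at least h citations, and the g-index is the largest g such that the top g papers together have at least g squared citations. This is not Publish or Perish's own code, and the citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least
    g * g citations (capped at the number of papers)."""
    ranked = sorted(citations, reverse=True)
    g = 0
    running = 0
    for position, cites in enumerate(ranked, start=1):
        running += cites
        if running >= position * position:
            g = position
    return g


# Hypothetical author with seven papers, one of them highly cited.
paper_citations = [25, 8, 5, 3, 3, 1, 0]
print(h_index(paper_citations), g_index(paper_citations))
```

For this author the h-index is 3 but the g-index is 6, because the single paper with 25 citations counts toward the g-index's running total while the h-index ignores everything above the threshold.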
CIDS


Measures output of authors for prestige and
influence
Similar to PoP
Corrects for Self-Citations
Uses Google Scholar data
CIDS metrics


Citations per year, h-index, g-index, total
citations, average citations per paper, self-citations included
and excluded, etc.
http://cids.di.fc.ul.pt/cids_3_0/info.php?acc=25201514041114103161
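
Reporting counts with self-citations “included and excluded” can be sketched as below. This is a simplified illustration, not how CIDS works: it treats a citation as a self-citation when the cited author's name appears in the citing paper's author list, and every name in it is made up.

```python
def count_citations(citing_author_lists, cited_author):
    """Return (total citations, citations excluding self-citations).
    A self-citation is any citing paper that lists the cited author."""
    total = len(citing_author_lists)
    self_cites = sum(1 for authors in citing_author_lists
                     if cited_author in authors)
    return total, total - self_cites


# Hypothetical author lists of four papers that cite a paper by "Smith".
citing_papers = [
    ["Smith", "Lee"],
    ["Jones"],
    ["Smith"],
    ["Garcia", "Chen"],
]
with_self, without_self = count_citations(citing_papers, "Smith")
print(with_self, without_self)
```

Half of this paper's four citations come from its own author, so the count drops from 4 to 2 once self-citations are excluded. Matching on names alone is fragile in practice because different authors can share a name.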
Mesur


Metric based on usage, citation, and bibliographic
data
Uses its own databases of
documents/metadata/references, users and
authors, “usage events,” and citations
Project seems to be dead?
Considerations


Don’t measure an individual article’s impact by the
metrics for the entire journal
Cluster of years of citations
Negative citations
A few high-impact citations or a lot of low-impact
citations
Source of citing documents
  Foreign, conference proceedings, traditional


Editor's Notes

  • #28 g-index; contemporary h-index (factors in the age of articles); individual h-index (per author); hm-index (corrects for multiple authors by reducing paper counts)