Bibliometrics: From Garfield to Google Scholar


A presentation on newer bibliometric indicators such as the h-index, Eigenfactor, SNIP, and SJR; the Publish or Perish tool; and the use of Google Scholar and Scopus for citation analysis.

  • The “Father of Bibliometrics” wanted to trace scholarly thought. Bibliometrics are an empirical measurement: the importance of a scholar, journal, etc. can be gauged either by reputation or by bibliometric measurement. Both have merits.
  • Consider the number of journals each database covers and how they are evaluated for inclusion; there are pros and cons to each. Google Scholar has more foreign-language material, conference proceedings, government reports, unpublished manuscripts, dissertations, and theses. Scopus is not consistent before 1996 (and even then, questionable!); Google Scholar covers roughly the same time frame; WoS goes back very far.
  • First measurement, developed by E. Garfield
  • A JIF greater than 1 means the journal received more citing references that year than the number of articles it published in the previous two years.
  • Carl Bergstrom and team at the University of Washington. Scaled so that the sum of all “citation traffic” appearing in JCR for that year = 100; a journal’s influence is thus its share of the total citations appearing in JCR.
  • Henk Moed, Leiden University, Netherlands. The idea here is: how much is a journal cited relative to how much it “could have been” cited? Different disciplines have different generally accepted citation potentials.
  • “Normalizing” the impact by dividing by citation potential accounts for discrepancies in citation rates between disciplines. (Need an example to do live.)
  • SCImago was generated from a think tank at the University of Granada in Spain
  • Library Science
  • Most often used for promotion and tenure dossiers
  • A g-index of 3 means the top 3 articles together received at least 9 (3²) citations. E-index: h² is the theoretical minimum number of citations required to obtain an h-index of h; if there are highly cited articles, there will be a large surplus of citations beyond that minimum. The contemporary h-index is a concern for those with career interruptions (birth of a child, etc.).
  • In certain disciplines co-authors are a detriment, in others it doesn’t matter
  • Measures the number of citations for a corpus of work, adjusted for age: the number of citations for each paper is divided by the age of the paper, and the weighted citation counts are added up to give the AWCR. The AW-index takes the square root of the AWCR; this is done to allow comparison with the h-index: if the rate of citation is stable, it will approximate the h-index.
  • Knowing the differences in these databases’ coverage, and what each metric can provide, can help bring the strong points of a journal’s/author’s/article’s/institution’s scholarly record to light

    1. 1. Bibliometrics: From Garfield to Google Scholar. Elaine M. Lasda Bergman, University at Albany. Upstate NY SLA Spring Meeting, April 20, 2012
    2. 2. What we’re going to cover • What is the study of bibliometrics? • Bibliometrics that assess entire journals – JIF, Eigenfactor, SNIP, SJR • Bibliometrics that assess authors, articles, institutions – citation count, h-index, e-index, etc.
    3. 3. What is bibliometrics? • Scholarly communication: Eugene Garfield traced the history and evolution of ideas from one scholar to another • Measures the scholarly influence of articles, journals, scholars, institutions
    4. 4. Three sources for citation data
    5. 5. Three sources for citation data• Citation data overlaps, but not completely• Unique citing references in all three databases• Unique metrics developed using each database – Metrics could be computed in any one of these but most are tied to a particular source
    7. 7. What is measured?• Journal Ranking – “Quality” or “Importance” of journal relative to other journals • Usually within a given field of study• There are many ways to measure “quality,” “importance”
    8. 8. “Impact” • Journal Impact Factor (JIF) • Web of Science – Journal Citation Reports • Basically: “how fast are ideas spreading from this journal to other publications?” • Formula is a ratio: the number of citations in a given year to articles the journal published in the previous 2 years, divided by the number of scholarly articles the journal published in those 2 years
    9. 9. Journal Impact Factor • Journal of Hypothetical Examples: citing references appearing in 2010 to articles published in the journal in 2008 and 2009 = 100; total number of articles published in the journal in 2008 and 2009 = 200; JIF = 100 / 200 = 0.50
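The ratio on this slide can be checked with a few lines of code. This is a minimal sketch only; the function and variable names are illustrative, not part of any JCR tooling.

```python
# Minimal sketch of the Journal Impact Factor calculation, using the
# hypothetical numbers from the slide above.

def impact_factor(citations_this_year, articles_prior_two_years):
    """Citations received in a given year to items the journal
    published in the previous two years, divided by the number of
    articles published in those two years."""
    return citations_this_year / articles_prior_two_years

# Journal of Hypothetical Examples, 2010:
jif = impact_factor(100, 200)  # 100 citing references / 200 articles
print(jif)  # 0.5
```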
    10. 10. Concerns with impact factor• Cannot be used to compare across disciplines• Two year time frame not adequate for social sciences, humanities• Coverage of some disciplines not sufficient in Web of Science• Is a measure of “impact” a measure of “quality”?
    11. 11. “Influence” • Eigenfactor • Web of Science: Journal Citation Reports • Eigenvector analysis: similar to Google PageRank, a “chain of citations” • Takes into account the total amount of “citation traffic” appearing in JCR: the influence of each citing journal, divided by the total number of citations appearing in that journal
    12. 12. “Influence”• Journal Impact Factor: – All citing references weighted equally• Eigenfactor: – SOME CITING REFERENCES ARE MORE IMPORTANT THAN OTHERS • The citing articles from journals that are heavily cited themselves demonstrate greater influence
    13. 13. Considerations• Eigenfactor will always be bigger if a journal is larger, i.e., publishes more articles• Article Influence Score: corrects for journal size – takes the journal’s Eigenfactor score and further divides it by the number of articles in the journal. – Correlation to the JIF
    14. 14. Examples • For the year 2011, Neurology had an Eigenfactor score of .159; this number is the journal’s percentage of all citation traffic to articles in the JCR • For the year 2011, Neurology had an Article Influence score of 2.57, meaning an average article in this journal is roughly 2½ times more influential than the average article in all of JCR
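The “chain of citations” idea behind the Eigenfactor can be sketched as a toy power iteration over a journal citation matrix. This is only an illustration of the eigenvector analysis, not the real algorithm: the actual Eigenfactor runs over the full JCR citation network and adds refinements such as teleportation and the removal of self-citations, and the matrix below is invented.

```python
# Toy power-iteration sketch of the eigenvector ("chain of citations")
# idea behind the Eigenfactor. The citation matrix is invented.

def influence(cite_matrix, iterations=100):
    """cite_matrix[i][j] = citations from journal j to journal i.
    Returns influence weights that sum to 1."""
    n = len(cite_matrix)
    # Each citing journal spreads its influence over the journals it
    # cites, in proportion to its citation counts (column-normalize).
    col_sums = [sum(cite_matrix[i][j] for i in range(n)) for j in range(n)]
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(cite_matrix[i][j] * w[j] / col_sums[j] for j in range(n))
             for i in range(n)]
    return w

# Journal 0 draws heavy citations, including from influential journal 1,
# so it ends up with the highest weight.
m = [[0, 8, 6],
     [2, 0, 1],
     [1, 2, 0]]
weights = influence(m)
print([round(x, 3) for x in weights])
```

The key contrast with the JIF, as the next slide notes, is that a citation from a heavily cited journal raises the cited journal’s weight more than a citation from an obscure one.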
    15. 15. “Citation Potential” • SNIP: Source Normalized Impact per Paper • Uses Scopus data • Citation potential = total number of citing references in all journals that have cited this journal • SNIP is the ratio of the journal’s average citation count per paper to the citation potential in its subject field
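The normalization described on this slide reduces to a simple ratio. This is a hedged sketch of the idea only; the real SNIP is computed from three-year Scopus windows with a more precise definition of citation potential, and all numbers and field labels below are invented.

```python
# Sketch of the SNIP idea: divide a journal's average citations per
# paper by the citation potential of its field, so heavily citing
# fields don't automatically outscore lightly citing ones.

def snip(avg_citations_per_paper, field_citation_potential):
    return avg_citations_per_paper / field_citation_potential

# Two journals with the same raw citations per paper:
print(snip(4.0, 2.0))  # 2.0 -- field that cites sparingly (e.g. math)
print(snip(4.0, 8.0))  # 0.5 -- field that cites heavily (e.g. biomed)
```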
    16. 16. Pros and cons of SNIP• Can compare SNIP scores across disciplines• Aggregate of a journal, so larger journals automatically have higher scores than smaller journals
    17. 17. “Prestige” • SJR: SCImago Journal Rank • Uses Scopus data • Measures “current average prestige per paper.” Prestige factors include: the number of journals in the Scopus database, the number of articles in Scopus from this journal, citation count, eigenvector analysis of important citing references, corrections for self-citations, and normalization by the number of significant works published in the journal.
    18. 18. Pros and Cons of SJR • Corrects for self-citations • Correlated to JIF • Scores can be compared across disciplines • Web version provides data on countries • Three-year window not good for social sciences
    19.–24. Examples in Scopus (screenshot slides)
    26. 26. Citation count• Number of times cited within a given time period – Journals, Authors, Articles, etc.• Does not take into account – Materials not included in citation database – Self citations – Variations in citation patterns/rates
    27. 27. Citation count• Citation counts will vary depending on which database you use• It is very difficult to get a complete count of all citing references
    28. 28. H-index • Scopus, Google Scholar, WoS? • Meant to account for differences in citation patterns (i.e., “one-hit wonders” vs. a consistent record of scholarship) • “A scientist has index h if h of his/her Np papers have at least h citations each and the other (Np − h) papers have no more than h citations each” (Hirsch 2005)
    29. 29. H-index Example • Scholar A’s seven articles have 10, 10, 9, 8, 7, 6, and 6 citations: 56 citations in total, h-index = 6 • Scholar B’s seven articles have 27, 12, 5, 4, 4, 2, and 2 citations: 56 citations in total, h-index = 4
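Hirsch’s definition on the previous slide can be computed directly. The data below is the two hypothetical scholars from this example.

```python
# The h-index computed from its definition, using the two
# hypothetical scholars on this slide.

def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(cites, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

scholar_a = [10, 10, 9, 8, 7, 6, 6]
scholar_b = [27, 12, 5, 4, 4, 2, 2]
print(sum(scholar_a), h_index(scholar_a))  # 56 6
print(sum(scholar_b), h_index(scholar_b))  # 56 4
```

Both scholars have identical total citation counts; the h-index is what separates the consistent record from the one-hit wonder.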
    30. 30. Variations on the H-index • G-index (Egghe 2006): gives greater weight to highly cited articles – the top g articles have received a combined total of at least g² citations • E-index (Zhang 2009): gives greater weight to highly cited articles – the square root of the surplus of citations in the h-set beyond h² • Contemporary h-index (Sidiropoulos et al. 2006): gives greater weight to newer articles – “parameterized”: citations from the current year count 4 times, from four years ago 1 time, from six years ago 4/6 times
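Two of these variants, the g-index and e-index, can be sketched directly from the definitions above, reusing the hypothetical scholars from the previous slide.

```python
# Sketches of the g-index (Egghe 2006) and e-index (Zhang 2009)
# from their definitions; scholar data is the hypothetical example
# from the preceding slide.
import math

def g_index(citations):
    """Largest g such that the top g papers together have at least
    g^2 citations (capped at the number of papers in this sketch)."""
    cites = sorted(citations, reverse=True)
    total, g = 0, 0
    for i, c in enumerate(cites, start=1):
        total += c
        if total >= i * i:
            g = i
    return g

def e_index(citations):
    """Square root of the surplus of citations in the h-core beyond
    the theoretical minimum of h^2."""
    cites = sorted(citations, reverse=True)
    h = sum(1 for i, c in enumerate(cites, start=1) if c >= i)
    surplus = sum(cites[:h]) - h * h
    return math.sqrt(surplus)

scholar_a = [10, 10, 9, 8, 7, 6, 6]  # h-index 6
scholar_b = [27, 12, 5, 4, 4, 2, 2]  # h-index 4
print(g_index(scholar_a), g_index(scholar_b))  # 7 7
print(round(e_index(scholar_a), 2))  # 3.74
print(round(e_index(scholar_b), 2))  # 5.66
```

Note how Scholar B’s single highly cited paper, invisible to the h-index, produces the larger e-index.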
    31. 31. Variations on the H-index• Individual h-index (Batista, et al. 2006)accounts for co-authors – Divides the h-index by the average number of authors per paper• Alternative individual h-index (Harzing): accounts for co-authors – Normalizes citation counts: divides # of citations by average # of authors per each paper and then computes the h-index• Another alternative individual h-index (Schreiber 2006): accounts for co-authors – Divides by fractions of papers instead of # of authors, keeps full citation count
    32. 32. Variations on the H-index • Age-weighted citation rate and AW-index (Jin 2007): accounts for variations in citation patterns over time – AWCR = the sum of age-weighted citation counts (each paper’s citations divided by its age) over all papers – AW-index = the square root of the AWCR – Per-author AWCR: AWCR divided by the number of authors for each paper
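The age-weighting above can be sketched as follows. This follows the AWCR/AW-index as implemented in Harzing’s Publish or Perish (summed over all papers); the paper data is invented for illustration.

```python
# Sketch of the age-weighted citation rate (AWCR) and AW-index.
import math

def awcr(papers, current_year=2012):
    """Sum of age-weighted citation counts: each paper's citations
    divided by its age in years. papers: (year, citations) pairs."""
    return sum(cites / (current_year - year + 1) for year, cites in papers)

def aw_index(papers, current_year=2012):
    """Square root of the AWCR, making it comparable to the h-index."""
    return math.sqrt(awcr(papers, current_year))

# Invented example: AWCR = 4/1 + 12/4 + 21/7 = 10.0
papers = [(2012, 4), (2009, 12), (2006, 21)]
print(awcr(papers))                # 10.0
print(round(aw_index(papers), 2))  # 3.16
```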
    33. 33. Publish or Perish • Google Scholar citation information • Interdisciplinary topics, fields relying on conference papers or reports • Greatest variety of metrics • Dirty data • Unverified data • Nonscholarly sources
    34. 34. Differences in H-index: Scopus vs. Google Scholar (PoP). The Case of Eugene Garfield
    35. 35. PoP Interface
    36.–37. PoP Search for Garfield (screenshot slides)
    38. 38. An aside: Why I don’t like PoP for Journal Metrics
    39.–47. Scopus Search for Garfield (screenshot slides: author search results, the citation overview, and the link to graphic information next to the citation overview)
    48. 48. Google Scholar Citations
    49. 49. Microsoft Academic
    50. 50. Considerations • Don’t measure an individual article’s impact by the metrics for the entire journal • Do I need a comparison within a discipline or across disciplines? • Does the citation pattern matter, or just the count? • Does the database being used cover my subject as thoroughly as possible? • To what degree does my subject area rely on non-journal scholarly publications? • Not all citing references are positive!
    51. 51. Questions?? Elaine Lasda