Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

The Google Scholar Revolution: a big data bibliometric tool

718 views

Published on

The two faces of Google Scholar
Opening the academic Pandora’s Box
Why do we call it a big data bibliometric tool?
Drawbacks Google Scholar, Google Scholar Citations, Google Scholar Metrics

Published in: Technology
  • If we are speaking about saving time and money this site ⇒ www.WritePaper.info ⇐ is going to be the best option!! I personally used lots of times and remain highly satisfied.
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Posso recomendar um site. Ele realmente me ajudou. Chama-se ⇒ www.boaaluna.club ⇐ Eles me ajudaram a escrever minha dissertação.
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • If you’re looking for a great essay service then you should check out HelpWriting.net. A friend of mine asked them to write a whole dissertation for him and he said it turned out great! Afterwards I also ordered an essay from them and I was very happy with the work I got too.
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • DOWNLOAD FULL BOOKS, INTO AVAILABLE FORMAT ......................................................................................................................... ......................................................................................................................... 1.DOWNLOAD FULL. PDF EBOOK here { https://tinyurl.com/y6a5rkg5 } ......................................................................................................................... 1.DOWNLOAD FULL. EPUB Ebook here { https://tinyurl.com/y6a5rkg5 } ......................................................................................................................... 1.DOWNLOAD FULL. doc Ebook here { https://tinyurl.com/y6a5rkg5 } ......................................................................................................................... 1.DOWNLOAD FULL. PDF EBOOK here { https://tinyurl.com/y6a5rkg5 } ......................................................................................................................... 1.DOWNLOAD FULL. EPUB Ebook here { https://tinyurl.com/y6a5rkg5 } ......................................................................................................................... 1.DOWNLOAD FULL. doc Ebook here { https://tinyurl.com/y6a5rkg5 } ......................................................................................................................... ......................................................................................................................... ......................................................................................................................... .............. Browse by Genre Available eBooks ......................................................................................................................... Art, Biography, Business, Chick Lit, Children's, Christian, Classics, Comics, Contemporary, Cookbooks, Crime, Ebooks, Fantasy, Fiction, Graphic Novels, Historical Fiction, History, Horror, Humor And Comedy, Manga, Memoir, Music, Mystery, Non Fiction, Paranormal, Philosophy, Poetry, Psychology, Religion, Romance, Science, Science Fiction, Self Help, Suspense, Spirituality, Sports, Thriller, Travel, Young Adult,
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

The Google Scholar Revolution: a big data bibliometric tool

  1. 1. The Google Scholar Revolution: a big data bibliometric tool Google Scholar Day: Changing current evaluation paradigms Cybermetrics Lab (IPP–CSIC) Madrid, 20 February 2017 Enrique Orduña-Malea, Alberto Martín-Martín, Juan M. Ayllón, Emilio Delgado López-Cózar EC3 Research Group-Scholar Division
  2. 2. EC3 Research Group – Google Scholar Division
  3. 3. searchengine Bibliographicsearchtool Bibliometrictool The two faces of Google Scholar
  4. 4. The google search familly. Its goal: finding information Scholar Search Scholar Citations Scholar Metrics
  5. 5. Simple Easy Fast Easy to understand and use Universal, international, global Multilingual Free Why is it successful?
  6. 6. 20122011 2004 Google’s incursion in Bibliometrics
  7. 7. 7 Studying it from the bibliometric perspective: EC3-Scholar Division
  8. 8. Opening the academic Pandora’s Box 2008-
  9. 9. 9 Journals Authors Publishers Multifaceted model Library & Information Sciences (Spain) http://www.biblioteconomia-documentacion-española.infoec3.es Bibliometrics & Scientometrics (International) http://www.scholar-mirrors.infoec3.es What have we analyzed? Intensive, extensive, and obsessive work Conferences
  10. 10. 11
  11. 11. 12 • Publication data about 49,930 A&H and SS professors working in public Spanish universities was extracted from Google Scholar in 2012 • Only authors in the first tercile are displayed • 68 discipline rankings (49 in Social Sciences and Law, 39 in Arts and Humanities)
  12. 12. 13 Indicators
  13. 13. 14 Collecting data Max. number of hits per page February 2013: from 100 to 20
  14. 14. 15
  15. 15. 16
  16. 16. 17 Sample of highly cited books (top 3%) published by ~49k A&H and SS professors working in public Spanish universities Data collected from Google Scholar in 2012 (n ~ 7203) 68 discipline rankings (49 in Social Sciences and Law, 39 in Arts and Humanities)
  17. 17. 18 Indicators: Nº of books, and sum of citations (relative to highest element in the ranking)
  18. 18. 19
  19. 19. Indicators H5 Index H5 Median
  20. 20. 22 Spanish Journals ± 2.500 SJR-Scopus 506 JCR 114Google Scholar Metrics 1299
  21. 21. 23
  22. 22. 24
  23. 23. 25
  24. 24. 26
  25. 25. 27 Indicators Extracted directly from Google Scholar Metrics Computed using the article and citation data available in Google Scholar Metrics H Index of documents published in the last 5 years Median of citation counts for articles published in last 5 years Sum of citations for articles above h5-index threshold
  26. 26. 28 Classification Core Related
  27. 27. 29 Coverage IMPORTANT: Google Scholar Metrics only covers journals that are indexed in Google Scholar, have published at least 100 articles in the last 5-year period, and have received at least 1 citation
  28. 28. 30 PROCEEDINGS
  29. 29. 31
  30. 30. 32 Data sources Citations Multifaceted model
  31. 31. 33 LIS researchers in Spain 336 authors in GSC 68 not in GSC Other sources ResearcherID (WoS) ResearchGate Indicators Sum of citations H Index Nº of documents RG Score Impact Points Aggregating data Highly cited docs (HCD), % of HCD by journal, book publisher, and institution
  32. 32. 34 The «Mirrors» approach There are many platforms that reflect (mirror) scientific activity on the Web. An inclusive study of the impact of scientific activity must contemplate as many of them as possible.
  33. 33. 35
  34. 34. Bibliometric potential of Google Scholar We have proved Yes, we can What do we know about Google Scholar? Its strengths and weaknesses
  35. 35. 38 It’s the most used academic search engine
  36. 36. Why do we call it a big data bibliometric tool? Source COVERAGE All documents Geographic COVERAGE All countries Linguistic COVERAGE All languages Discipline COVERAGE All of them GROWTH Faster than WoS and Scopus FAST TRACK CITATIONS Newly published documents indexed in a matter of days Big Data Big Data SIZE Largest bibliographic database in the world
  37. 37. The search engine with the largest coverage Size matters Orduña-Malea, E., Ayllón, J. M., Martín-Martín, A., Delgado López-Cózar, E.. (2014). About the size of Google Scholar: playing the numbers. arXiv preprint arXiv:1407.6239. EC3 Working Papers 18 Orduna-Malea, E., Ayllón, J. M., Martín-Martín, A., Delgado López-Cózar, E. Methods for estimating the size of Google Scholar. Scientometrics 104 (3), 931-949 2015
  38. 38. The search engine with the largest coverage Size matters 2017 Nº documents
  39. 39. 2017
  40. 40. Larger coverage, larger citation graph 0 0,5 1 1,5 2 2,5 3 Art & Humanities Social Sciences Sciences TOTAL RatioofGSCitationstoWoS Citations(avg) Documents published in 2009 Documents published in 2014 Analysis of most documents with a DOI published in 2009 and 2014 covered by Web of Science (~2.5 million documents)
  41. 41. 45 International
  42. 42. 46 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 1800 1805 1810 1815 1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870 1875 1880 1885 1890 1895 1900 1905 1910 1915 1920 1925 1930 1935 1940 1945 1950 1955 1960 1965 1970 1975 1980 1985 1990 1995 2000 2005 2010 2015 Proportion of documents covered by Google Scholar by language over the years Turkish Dutch Chinese Polish Korean Italian Japanese Portuguese Spanish French Simplified Chinese German English
  43. 43. 47 Multilingual Documents covered by Google Scholar, Web of Science & Scopus by 13 languages (1800-2016)
  44. 44. 48 Multilingual
  45. 45. 49 Covers all document typologies
  46. 46. Thealternative
  47. 47. 51 Google Scholar offers a different vision of scientific production
  48. 48. Web of Science documents (2009/2014) found in Google Scholar found in GS 96% not found in GS 4% 96% of the searched WoS documents were found in GS. 98% if we only consider journal articles. The rest might have been found as well if alternative search strategies had been used.
  49. 49. Martín-Martín, A., Orduña-Malea, E., Ayllón, J. M., Delgado López-Cózar, E. (2014). Does Google Scholar contain all highly cited documents (1950-2013)?. arXiv preprint arXiv:1410.8464. Highly cited documents in Google Scholar (1950-2013) Half of them are not in WoS The ones who are in WoS: very high citation correlation
  50. 50. 54 Confirmation PRELIMINARY RESULTS: Analysis of most documents with a DOI published in 2009 covered by Web of Science (~1 million documents) Citation Index N spearman.cor p.value prop.cited.gs prop.cited.wos ratio of gs_cit to wos_cit (avg) Sciences 863801 0,94 0,00 0,97 0,95 1,68 Social Sciences 109232 0,90 0,00 0,97 0,94 2,58 Art & Humanities 13487 0,83 0,00 0,84 0,69 2,52 Sciences Social Sciences Art & Humanities
  51. 51. Logical when you see their sources elsevier.com 7.200.000 wiley.com 4,590,000 springer 3.290.000 taylor and francis 1.440.000 lww.com 1.030.000 sagepub.com 781.000 nature.com 428.000 bmj.com 370.000 Routledge 293.000 karger.com 188.000 degruyter.com 196.000 biomedcentral.com 121.000 liebertpub.com 90.900 emerald 167.000 books.google.com 14.000.000 ieee.org 2.990.000 jstor.org 2.210.000 acs.org 987.000 cambridge.org 893.000 oxfordjournals.org 872.000 acm.org 472.000 iop.org 462000 aip.org 451.000 arxiv.org 355.000 pnas.org 101.000 ams.org 98.000 sciencemag.org 62.600 nber.org 26.900 Scopus 2009 Web of Science Google Scholar
  52. 52. 56 It has a better coverage of areas like Humanities, Social Sciences, Engineering…
  53. 53. GS measures scientific impact and more Scientific Professional Educational Social
  54. 54. What impact does it measure?
  55. 55. A professional journal in Google Scholar
  56. 56. We question ourselves Drawbacks Google Scholar
  57. 57. 61 Lack of transparency There isn’t a public master list of the sources Google Scholar indexes (publishers, repositories, catalogues, bibliographic databases and repertoires, aggregators, journals…) There is no accurate method to estimate the size of Google Scholar
  58. 58. 62 Similarly, even if a document has received more than 1000 citations, only the first 1000 can be displayed when clicking the “Cited by” link We have no control over the results we get Are the relevant results for my needs among those 1000 results? Usually yes… thanks to the ranking algorithm they use Only(!) returns 1000 results for any given query Is this really a bibliographic problem? Who is interesed in bibliographic searches of that size? Weaknesses
  59. 59. 63 How does Google Scholar rank results?
  60. 60. 64 There is no native method to easily extract bibliographic data massively. Only one by one. Weaknesses There is no API. What can we use for large downloads? A scraper
  61. 61. 65 Weaknesses Limited filtering and sorting options (year and relevance) ≠
  62. 62. 66 Weaknesses The advanced search form is limited to four search dimensions: keywords, authors, source of publication (journal, conference…), and year of publication
  63. 63. 67 No quality control of sources indexed. Peer-reviewed documents coexist with documents that haven’t gone through that process. Is that really a problem? But… Google Scholar also shows which documents are covered by the Web of Science, and which of them are available from your library. YOUR CHOICE… Weaknesses It shows richness rather than a flaw
  64. 64. 68 Weaknesses • Institutional affiliation of the authors of the documents is available (institution, country) • The language in which documents are written • The typology of each document is not clear (book, journal article, conference communication, thesis, report…). Only books are marked as such, usually when they have been found on Google Books • Not all documents have an abstract • The author-supplied keywords are not available • The list of cited references in each article is not available either It doesn’t offer information regarding
  65. 65. 69 Greatest danger: manipulation Delgado López‐Cózar, E., Robinson‐García, N., Torres‐Salinas, D. (2014). The Google Scholar experiment: How to index false papers and manipulate bibliometric indicators. Journal of the Association for Information Science and Technology, 65(3), 446-454.
  66. 66. 70 Errors in the data Enough quality? Even with «dirty» data, it measures more and better Large units of analysis: no problem Individuals: check data first
  67. 67. 71
  68. 68. The Googledependency
  69. 69. Drawbacks: Google Scholar Metrics
  70. 70. 74
  71. 71. 75
  72. 72. Drawbacks: Google Scholar Citations
  73. 73. 77
  74. 74. 78
  75. 75. 79
  76. 76. 80 Google Scholar Citations: Laissez faire laissez passer Be wary of CUT / COPY – BUTTON / COMAND bibliometric products Don’t mix apples and oranges
  77. 77. A final consideration… To what end are we measuring scientific activity? Spreading light where there wasThank you very much!

×