Accuracy of citation data in Web of Science and Scopus

Presentation at the 16th International Conference on Scientometrics & Informetrics, Wuhan, China, October 19, 2017.

We present a large-scale analysis of the accuracy of citation data in the Web of Science and Scopus databases. The analysis is based on citations given in publications in Elsevier journals. We reveal significant data quality problems for both databases. Missing and incorrect references are important problems in Web of Science. Duplicate publications are a serious problem in Scopus.
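One way to picture the comparison described above is to classify each linked publication by how its reference count in a database compares with the count in the Elsevier full text, using the four categories that appear on slide 8 of the deck. The Python sketch below is only an illustration under stated assumptions: the function names and the sample counts are hypothetical, and reading "No references" as "the database lists none although the source has some" is an assumed interpretation that the slides do not spell out.

```python
from collections import Counter


def classify_reference_count(n_source: int, n_database: int) -> str:
    """Compare a database's reference count with the source publication's count.

    Category labels follow slide 8; the exact rules are an assumption.
    """
    if n_database == 0 and n_source > 0:
        return "No references"
    if n_database == n_source:
        return "Equal no. of references"
    if n_database > n_source:
        return "More references"
    return "Fewer references"


def category_shares(pairs):
    """Return the percentage of publications per category, rounded to 0.1."""
    counts = Counter(classify_reference_count(s, d) for s, d in pairs)
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Hypothetical (source, database) reference counts for five linked publications.
example_pairs = [(30, 30), (30, 30), (25, 27), (40, 12), (18, 0)]
shares = category_shares(example_pairs)
```

Counting references is a coarse check: two lists of equal length can still differ in content, which is why the deck goes on to analyze linked references individually.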

Published in: Science

  1. Accuracy of citation data in Web of Science and Scopus
     Nees Jan van Eck and Ludo Waltman
     Centre for Science and Technology Studies, Leiden University, Leiden (The Netherlands)
     16th International Conference on Scientometrics & Informetrics
     Wuhan, China, October 19, 2017
  2. Introduction
     • Question: Can we trust citation counts in WoS and Scopus?
     • Aim: To determine accuracy of citation data in WoS and Scopus
       – Accuracy of reference data
       – Accuracy of citation matching
     • Approach: Comparison of references in full text of Elsevier publications with references in WoS and Scopus
  3. Approach
     [Diagram: the reference list of the original (Elsevier) publication shown alongside the reference lists recorded for that publication in WoS and in Scopus. Each list carries the same example entries: [1] Hirsch, JE (2005) PNAS, 102, p.16569; [2] Egghe, L (2006) Scientist, 20, p.15.]
  4. Elsevier data
     • Elsevier ScienceDirect Article Retrieval API
     • Subscription-based journal publications in period 1987–2016
     • Publication and reference data in XML format
  5. Linking Elsevier data with WoS and Scopus

                                          WoS               Scopus
     Time period                          1987–2016         1996–2015
     Document types                       article, review   article, review, conference paper
     No. of linked publications           6M                5M
     No. of references in Elsevier data   207M              172M
     No. of references in WoS/Scopus      203M              170M
     No. of linked references             136M              84M
  6. Number of linked publications
  7. Analysis based on number of references in publication
  8. Linked publications classified based on number of references

                               WoS     Scopus
     Equal no. of references   77.2%   96.4%
     More references           2.7%    1.2%
     Fewer references          19.3%   1.2%
     No references             0.8%    1.2%
  9. More references in WoS
  10. Analysis based on linked references
  11. Linked references without corresponding citation relation
  12. Validation of linked references without corresponding citation relation
      • Random sample from 2015 publications
      • WoS (60 cases)
        – Missing reference: 33 (55.0%)
        – Incorrect reference: 10 (16.7%)
        – Error in reference: 16 (26.7%)
        – No problem: 1 (1.7%)
      • Scopus (30 cases)
        – Missing reference: 6 (20.0%)
        – Duplicate publications: 9 (30.0%)
        – Citation matching problem: 15 (50.0%)
  13. Missing references in WoS (1)
  14. Missing references in WoS (1) [figure annotation: ???]
  15. Missing references in WoS (2)
  16. Missing references in WoS (2) [figure annotations: ??? ???]
  17. Incorrect references in WoS (1)
  18. Incorrect references in WoS (1)
  19. Incorrect references in WoS (1)
  20. Incorrect references in WoS (2)

      Original reference in publication: J. Wang, J.K. Carson, M.F. North, D.J. Cleland, Int. J. Heat Mass Transfer 49 (17) (2006) 3075–3083.
      WoS reference: WANG J, 2006, CHINESE CHEM LETT, V17, P49

      Original reference in publication: Kanber B, Hartshorne TC, Horsfield MA, Naylor AR, Robinson TG, Ramnarine KV. Dynamic variations in the ultrasound gray-scale median of carotid artery plaques. Cardiovasc Ultrasound 2013a;11:21.
      WoS reference: KANBER B, 2013, CEREBROVASC DIS S2, V35, P21

      Original reference in publication: Evans PD, Chowdhury MJA. Photoprotection of wood using polyester-type UVabsorbers derived from the reaction of 2 hydroxy-4(2,3-epoxypropoxy)- benzophenone with dicarboxylic acid anhydrides. J Wood Chem Technol 2010;30:186e204.
      WoS reference: EVANS P, 2010, TLS-TIMES LIT S 0326, P30

      Original reference in publication: X. Li, S. Wang, Y. Chen, G. Liu, X. Yang, Overexpression of CD40 in sacral chordomas and its correlation with low tumor recurrence, Onkologie 36 (10) (2013) 567–571
      WoS reference: LI XY, 2013, NANJING NONGYE DAXUE, V36, P36

      Original reference in publication: K. Zhang, H. Chen, G. Wu, K. Chen, H. Yang, High expression of SPHK1 in sacral chordoma and association with patients’ poor prognosis, Med. Oncol. 31 (11) (2014) 247.
      WoS reference: ZHANG K, 2014, IEEE T PATTERN ANAL, V1, P1
  21. Missing references in Scopus
  22. Missing references in Scopus [figure annotation: No references]
  23. Duplicate publications in Scopus
  24. Duplicate publications in Scopus
  25. Duplicate publications in Scopus
  26. Duplicate publications in Scopus
  27. Citation matching problem in Scopus
  28. Citation matching problem in Scopus
  29. Citation matching problem in Scopus
  30. Large inaccuracies in citation counts of individual publications
  31. Interesting case
      • Citation count (October 16, 2017):
        – Scopus: 5,204
        – WoS: 172
  32. Differences in citation counts between two versions of WoS

      Newman, M.E.J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.
        WoS Dec 2016: 2,073   WoS Jun 2017: 139

      Newman, M.E.J. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69(6), 066133.
        WoS Dec 2016: 436     WoS Jun 2017: 1,070

      Clauset, A., Newman, M.E.J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111.
        WoS Dec 2016: 1,156   WoS Jun 2017: 2,627
  33. Conclusions
      • Citation data suffers from significant inaccuracies both in WoS and in Scopus
      • WoS
        – Incorrect references
        – Missing references
      • Scopus
        – Duplicate publications
        – Citation matching problems
      • Both WoS and Scopus have inaccuracies in about 1% of references
  34. Thank you for your attention!
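
As an illustration of the incorrect WoS references on slide 20, the first example there records V17 and P49 where the original reference gives volume 49, issue 17, and first page 3075. The Python sketch below parses a WoS-style reference string and reports which fields disagree with the original. It is a rough sketch under assumptions: the comma-separated "AUTHOR, YEAR, SOURCE, Vnn, Pnn" layout is inferred only from the examples on slide 20, and the names `CitedReference`, `parse_cited_reference`, and `mismatched_fields` are hypothetical helpers, not part of any WoS tooling.

```python
import re
from typing import NamedTuple, Optional


class CitedReference(NamedTuple):
    author: str
    year: int
    source: str
    volume: Optional[str]
    page: Optional[str]


def parse_cited_reference(text: str) -> CitedReference:
    """Split a WoS-style reference string into fields (layout is assumed)."""
    parts = [part.strip() for part in text.split(",")]
    author, year, source = parts[0], int(parts[1]), parts[2]
    volume = page = None
    # Volume and page are optional and marked by a leading V or P.
    for part in parts[3:]:
        if re.fullmatch(r"V\d+", part):
            volume = part[1:]
        elif re.fullmatch(r"P\d+", part):
            page = part[1:]
    return CitedReference(author, year, source, volume, page)


def mismatched_fields(parsed: CitedReference, expected_volume: str, expected_page: str):
    """Return the names of fields that differ from the original reference."""
    problems = []
    if parsed.volume != expected_volume:
        problems.append("volume")
    if parsed.page != expected_page:
        problems.append("page")
    return problems


# First example from slide 20: original volume 49, first page 3075.
wos = parse_cited_reference("WANG J, 2006, CHINESE CHEM LETT, V17, P49")
problems = mismatched_fields(wos, expected_volume="49", expected_page="3075")
```

A source title that itself contains a comma would break this simple split, so a real check would need a more careful parser.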
