Large-scale analysis of bibliometric networks

Presentation at the International Conference on Data-driven Discovery: "When Data Science Meets Information Science",
Beijing, China, June 20, 2016.

Published in: Science

  1. Large-scale analysis of bibliometric networks
     Nees Jan van Eck, Centre for Science and Technology Studies (CWTS), Leiden University
     International Conference on Data-driven Discovery: When Data Science Meets Information Science
     Beijing, China, June 20, 2016
  2. Bibliographic databases: ‘Big data’

                      Web of Science   Scopus
     Journals         12,000           20,000
     Publications     45 million       35 million
     Citations        1 billion        0.9 billion
  3. Bibliometric networks
     Networks derived from a bibliographic database (Web of Science, Scopus):
     • Citation network of publications / authors / journals
     • Co-authorship network of authors / organizations
     • Co-citation network of publications / authors / journals
     • Co-occurrence network of keywords / terms
     • Bibliographic coupling network of publications / authors / journals
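As a minimal sketch of how two of these networks relate to direct citations, the block below derives co-citation and bibliographic coupling counts from a list of (citing, cited) pairs. The function name `derive_networks` and the toy data are hypothetical and chosen only for illustration; this is not how VOSviewer or CitNetExplorer implement the construction.

```python
from collections import Counter
from itertools import combinations


def derive_networks(citations):
    """Derive co-citation and bibliographic coupling counts from direct citations.

    citations: iterable of (citing_publication, cited_publication) pairs.
    """
    references = {}  # citing publication -> set of publications it cites
    citers = {}      # cited publication -> set of publications citing it
    for citing, cited in citations:
        references.setdefault(citing, set()).add(cited)
        citers.setdefault(cited, set()).add(citing)

    # Co-citation: A and B are linked once for every publication citing both.
    co_citation = Counter()
    for cited_set in references.values():
        for pair in combinations(sorted(cited_set), 2):
            co_citation[pair] += 1

    # Bibliographic coupling: A and B are linked once for every shared reference.
    coupling = Counter()
    for citing_set in citers.values():
        for pair in combinations(sorted(citing_set), 2):
            coupling[pair] += 1

    return co_citation, coupling


# Hypothetical toy data: C and D both cite A and B; E cites only A.
toy_citations = [("C", "A"), ("C", "B"), ("D", "A"), ("D", "B"), ("E", "A")]
co_citation, coupling = derive_networks(toy_citations)
```

In the toy data, A and B are co-cited by two publications, and C and D are coupled through two shared references, so both link weights equal 2.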
  4. Outline
     • Software tools
     • Network analysis techniques
     • Analysis of data science
  5. Software tools
  6. Software tools
     • VOSviewer (www.vosviewer.com)
       – Tool for constructing and visualizing bibliometric networks
     • CitNetExplorer (www.citnetexplorer.nl)
       – Tool for visualizing and analyzing citation networks of publications
     • Both tools have been developed together with my colleague Ludo Waltman
  7. VOSviewer
  8. VOSviewer: Overview
     • Software tool for visualizing (bibliometric) networks
     • Built-in support for popular bibliographic databases
     • Text mining functionality
     • Layout and clustering techniques
     • Advanced visualization features:
       – Smart labeling algorithm
       – Overlay visualizations
       – Density visualizations (‘heat map’)
     • Users:
       – Researchers
       – Professional users (e.g., universities, libraries, funders, publishers)
  9. Map of university co-authorship network
  10. Map of journal citation network
  11. CitNetExplorer
  12. VOSviewer:
      • Any type of bibliometric network
      • Co-authorship, direct citations, co-citation, and bibliographic coupling
      • Time dimension is ignored
      • Networks of at most ~10,000 nodes are supported
      CitNetExplorer:
      • Only citation networks of publications
      • Direct citation between publications
      • Time dimension is explicitly considered
      • Millions of publications are supported
  13. Network analysis techniques
  14. Network analysis techniques
      Layout:
      • Assigning the nodes in a network to locations in a (usually 2D) space (a.k.a. mapping)
      • Visualization of similarities (VOS)
      Clustering:
      • Partitioning the nodes in a network into a number of groups (a.k.a. community detection)
      • Weighted modularity
      • Smart local moving algorithm
  15. Clustering can be seen as mapping in a restricted space
  16. Clustering can be seen as mapping in a restricted space
  17. Unified approach to mapping and clustering
      Minimize

      $$Q(x_1, \ldots, x_n) = \sum_{i<j} \frac{2m}{k_i k_j} A_{ij} d_{ij}^2 - \sum_{i<j} d_{ij}$$

      where
      • $n$: number of nodes in the network
      • $m$: total weight of all edges in the network
      • $A_{ij}$: weight of edge between nodes $i$ and $j$
      • $k_i$: total weight of all edges of node $i$

      Mapping
      • $x_i$: vector denoting the location of node $i$ in a $p$-dimensional space

      $$d_{ij} = \lVert x_i - x_j \rVert = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$$

      Clustering
      • $x_i$: integer denoting the community to which node $i$ belongs
      • $\gamma$: resolution parameter

      $$d_{ij} = \begin{cases} 0 & \text{if } x_i = x_j \\ 1/\gamma & \text{if } x_i \neq x_j \end{cases}$$
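The following sketch evaluates the objective for both cases, assuming it reads Q = Σ_{i<j} (2m / (k_i k_j)) A_ij d_ij² − Σ_{i<j} d_ij, with d_ij the Euclidean distance for mapping and d_ij = 0 or 1/γ for clustering. The function names and the four-node path network are hypothetical; the code only evaluates Q for a given solution and does not attempt to minimize it.

```python
import math
from itertools import combinations


def quality(edges, nodes, dist):
    """Evaluate Q = sum_{i<j} (2m / (k_i k_j)) A_ij d_ij^2 - sum_{i<j} d_ij.

    edges: dict mapping an undirected node pair (i, j) to its weight A_ij,
           with each edge listed once.
    nodes: list of all nodes.
    dist:  function (i, j) -> d_ij.
    """
    m = sum(edges.values())          # total weight of all edges
    k = {v: 0.0 for v in nodes}      # total edge weight per node
    for (i, j), w in edges.items():
        k[i] += w
        k[j] += w
    # Attraction term: only pairs connected by an edge contribute.
    attraction = sum(
        2 * m * w / (k[i] * k[j]) * dist(i, j) ** 2 for (i, j), w in edges.items()
    )
    # Repulsion term: every pair of nodes contributes.
    repulsion = sum(dist(i, j) for i, j in combinations(nodes, 2))
    return attraction - repulsion


def clustering_distance(community, gamma=1.0):
    """d_ij = 0 within a community, 1/gamma between communities."""
    return lambda i, j: 0.0 if community[i] == community[j] else 1.0 / gamma


def mapping_distance(position):
    """d_ij = Euclidean distance between the locations of i and j."""
    return lambda i, j: math.dist(position[i], position[j])


# Hypothetical path network 0 - 1 - 2 - 3 with unit edge weights.
nodes = [0, 1, 2, 3]
edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}

q_two_groups = quality(edges, nodes, clustering_distance({0: 0, 1: 0, 2: 1, 3: 1}))
q_one_group = quality(edges, nodes, clustering_distance({0: 0, 1: 0, 2: 0, 3: 0}))
q_singletons = quality(edges, nodes, clustering_distance({0: 0, 1: 1, 2: 2, 3: 3}))
q_line_layout = quality(
    edges, nodes, mapping_distance({0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (3, 0)})
)
```

Because Q is minimized, the split into {0, 1} and {2, 3} scores better (lower) than either putting all nodes in one community or leaving each node on its own. The same function handles the mapping case simply by swapping the distance function, which is the sense in which the approach is unified.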
  18. Smart local moving algorithm
      Figure: original network; local moving heuristic; local moving heuristic in subnetworks; reduced network. Quality values shown in the figure: Q = 0.3791 and Q = 0.4198.
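To illustrate only the basic local moving heuristic named on this slide, here is a deliberately simplified sketch. It is not the smart local moving algorithm itself: it omits the subnetwork and network reduction steps, and it recomputes modularity from scratch for every candidate move, which is clear but slow. It assumes the standard modularity definition with a resolution parameter; all function names and the two-triangle example network are hypothetical.

```python
def modularity(adj, community, gamma=1.0):
    """Modularity Q = sum_c [ in_c / 2m - gamma * (tot_c / 2m)^2 ].

    adj: dict node -> dict neighbor -> weight (symmetric, no self-loops).
    community: dict node -> community label.
    """
    two_m = sum(sum(nbrs.values()) for nbrs in adj.values())
    internal = {}  # community -> edge weight inside it, counted in both directions
    total = {}     # community -> summed weighted degree of its nodes
    for v, nbrs in adj.items():
        c = community[v]
        total[c] = total.get(c, 0.0) + sum(nbrs.values())
        for u, w in nbrs.items():
            if community[u] == c:
                internal[c] = internal.get(c, 0.0) + w
    return sum(
        internal.get(c, 0.0) / two_m - gamma * (total[c] / two_m) ** 2 for c in total
    )


def local_moving(adj, gamma=1.0):
    """Move nodes to neighboring communities until no move improves modularity."""
    community = {v: v for v in adj}  # start with every node in its own community
    improved = True
    while improved:
        improved = False
        for v in adj:
            original = community[v]
            best_c, best_q = original, modularity(adj, community, gamma)
            for c in {community[u] for u in adj[v]} - {original}:
                community[v] = c  # try the move
                q = modularity(adj, community, gamma)
                if q > best_q + 1e-12:  # accept strict improvements only
                    best_c, best_q = c, q
            community[v] = best_c
            if best_c != original:
                improved = True
    return community


# Hypothetical example: two triangles (0, 1, 2) and (3, 4, 5) joined by edge 2 - 3.
toy_adj = {
    0: {1: 1.0, 2: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {0: 1.0, 1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0, 5: 1.0},
    4: {3: 1.0, 5: 1.0},
    5: {3: 1.0, 4: 1.0},
}
toy_communities = local_moving(toy_adj)
toy_q = modularity(toy_adj, toy_communities)
```

On this toy network the heuristic recovers the two triangles as communities. Every accepted move strictly increases modularity, so the loop is guaranteed to stop, but in general it stops at a local optimum, which is what the additional steps of the full algorithm are meant to improve on.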
  19. Algorithmically constructed classification system of science
      • 17.8 million publications from the period 2000–2015 indexed in Web of Science
      • 282.4 million citation relations
      • Classification system of 3 hierarchical levels:
        – 27 broad disciplines
        – 817 fields
        – 4,113 subfields
  20. Breakdown of scientific literature into 817 fields
      Figure legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering
  21. Publications in scientometrics subfield
  22. Time-line map of highly cited scientometrics publications
  23. Analysis of data science
  24. What is data science?
      • Empirical operationalization of data science based on publications with ‘data’ in title or abstract
      Wikipedia: “Data Science is an interdisciplinary field about processes and systems to extract knowledge or insights from data … which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics”
      LCDS: “Data Science … deals with finding, analyzing and validating complex patterns in data. Data Science methods are indispensable for maintaining a competitive edge in all disciplines in science”
  25. Growth of data-driven research
      Figure: line chart of the percentage of publications (vertical axis, 0% to 20%) per year (horizontal axis, 1990 to 2015), with two series: % ‘data’ publications and % ‘theory’ publications.
  26. Breakdown of scientific literature into 817 fields
      Figure legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering
  27. Data-driven nature of different scientific fields
      Figure legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering
      Indicator shown: % of publications with ‘data’ in title or abstract
  28. Data-driven nature of different scientific fields
      Labels shown in the figure: artificial intelligence statistics bioinformatics neuroimaging pattern recognition astronomy earth water climate remote sensing nutrition obesity addiction accident analysis
      Indicator shown: % of publications with ‘data’ in title or abstract
  29. Data science fields (at least 25% ‘data’ publications)
      Figure legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering
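The operationalization used in this part of the talk (a publication counts as a ‘data’ publication when ‘data’ appears in its title or abstract, and a field counts as a data science field when at least 25% of its publications do) could be sketched as follows. How the original analysis matched the word is not stated; the case-insensitive whole-word match below is an assumption, and the function names and toy records are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: match 'data' as a whole word, ignoring case.
DATA_WORD = re.compile(r"\bdata\b", re.IGNORECASE)


def is_data_publication(title, abstract):
    """True if 'data' occurs as a word in the title or the abstract."""
    return bool(DATA_WORD.search(title) or DATA_WORD.search(abstract))


def data_science_fields(publications, threshold=0.25):
    """Return {field: share of 'data' publications} for fields at or above threshold.

    publications: iterable of (field, title, abstract) records.
    """
    counts = defaultdict(lambda: [0, 0])  # field -> [data publications, all publications]
    for field, title, abstract in publications:
        counts[field][1] += 1
        if is_data_publication(title, abstract):
            counts[field][0] += 1
    shares = {field: data / total for field, (data, total) in counts.items()}
    return {field: share for field, share in shares.items() if share >= threshold}


# Hypothetical toy records, not taken from the presentation.
toy_publications = [
    ("remote sensing", "Satellite data for crop mapping", ""),
    ("remote sensing", "Cloud detection", "We use Landsat data."),
    ("algebra", "A note on finite groups", "We prove a theorem."),
    ("algebra", "Database-free proofs", "No measurements are used."),
    ("nutrition", "Diet survey data", ""),
    ("nutrition", "Vitamin intake", "A cohort study."),
    ("nutrition", "Fiber and health", "A review."),
    ("nutrition", "Sugar consumption", "An overview."),
]
toy_fields = data_science_fields(toy_publications)
```

With the whole-word assumption, a title such as "Database-free proofs" does not count as a ‘data’ publication, so the toy algebra field stays below the threshold while the other two fields meet it.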
  30. Term map of data science fields
  31. China’s publication output in data science fields
      Figure legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering
  32. China’s publication output in data science fields
      Labels shown in the figure: artificial intelligence pattern recognition high energy earth atmospheres weather remote sensing
  33. Chinese institutes with most publications in data science fields (2011-2015)
      • Chinese Academy of Sciences
      • Peking University
      • Tsinghua University
      • China University of Geosciences
      • Zhejiang University
      • Nanjing University
      • Shanghai Jiao Tong University
      • University of Science and Technology of China
      • Beijing Normal University
      • University of Hong Kong
  34. CAS publication output in data science fields
      Labels shown in the figure: earth atmospheres weather remote sensing vegetation astronomy high energy
  35. Term map based on CAS publications in data science fields
  36. CAS (Beijing Branch) publication output in data science fields
      Labels shown in the figure: astronomy earth atmospheres weather remote sensing vegetation high energy
  37. CAS (Shanghai Branch) publication output in data science fields
      Labels shown in the figure: bioinformatics genetics astronomy nuclear
  38. Do it yourself!
      www.vosviewer.com
      www.citnetexplorer.nl
  39. Thank you for your attention!
