
Bibliometric network analysis: Software tools, techniques, and an analysis of network science at Leiden University

Presentation at the LCN2 seminar on November 27, 2015.

We provide an introduction to the research program on bibliometric network analysis at Leiden University’s Centre for Science and Technology Studies (CWTS). We demonstrate two popular software tools for bibliometric network analysis developed at CWTS: VOSviewer (www.vosviewer.com) and CitNetExplorer (www.citnetexplorer.nl). We also discuss the techniques that we have developed for network layout and community detection. Finally, we use bibliometric network analysis to study the field of network science and the contributions made to this field by researchers at Leiden University.

  1. 1. Bibliometric network analysis: Software tools, techniques, and an analysis of network science at Leiden University • Ludo Waltman and Nees Jan van Eck, Centre for Science and Technology Studies (CWTS), Leiden University • LCN2 Seminar, Leiden, November 27, 2015
  2. 2. Centre for Science and Technology Studies (CWTS) • Research center at Leiden University focusing on science and technology studies • About 30 staff members • History of more than 25 years in bibliometric and scientometric research • Contract research • Full access to large bibliographic databases (Web of Science and Scopus)
  3. 3. Bibliographic databases: ‘Big data’ • Web of Science: 12,000 journals, 45 million publications, 1 billion citations • Scopus: 20,000 journals, 35 million publications, 0.9 billion citations
  4. 4. Bibliometric networks • Bibliographic database: Web of Science, Scopus • Citation network of publications • Co-authorship network of authors / organizations • Co-citation network of publications / authors / journals • Co-occurrence network of terms • Bibliographic coupling network of publications / authors / journals
  5. 5. Outline • Software tools • Network analysis techniques • Analysis of network science
  6. 6. Software tools
  7. 7. Software tools • VOSviewer (www.vosviewer.com) – Tool for constructing and visualizing bibliometric networks • CitNetExplorer (www.citnetexplorer.nl) – Tool for visualizing and analyzing citation networks of publications
  8. 8. VOSviewer
  9. 9. Map of university co-authorship network
  10. 10. Map of journal citation network
  11. 11. CitNetExplorer
  12. 12. VOSviewer: • Any type of bibliometric network • Co-authorship, co-citation, and bibliographic coupling • Time dimension is ignored • Networks of at most ~10,000 nodes are supported CitNetExplorer: • Only citation networks of publications • Direct citation relations • Time dimension is explicitly considered • Millions of publications are supported
  13. 13. Network analysis techniques
  14. 14. Network analysis techniques Layout: • Visualization of similarities (VOS) Community detection: • Weighted modularity • Smart local moving algorithm
  15. 15. Clustering can be seen as mapping in a restricted space
  16. 16. Clustering can be seen as mapping in a restricted space
  17. 17. Unified approach to mapping and clustering • Minimize $Q(x_1, \ldots, x_n) = \sum_{i<j} \frac{2m}{k_i k_j} A_{ij} d_{ij}^2 - \sum_{i<j} d_{ij}$, where $n$ is the number of nodes in the network, $m$ is the total weight of all edges in the network, $A_{ij}$ is the weight of the edge between nodes $i$ and $j$, and $k_i$ is the total weight of all edges of node $i$ • Mapping: $x_i$ is a vector denoting the location of node $i$ in a $p$-dimensional space, and $d_{ij} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$ • Clustering: $x_i$ is an integer denoting the community to which node $i$ belongs, $\gamma$ is the resolution parameter, and $d_{ij} = 0$ if $x_i = x_j$ and $d_{ij} = 1/\gamma$ if $x_i \neq x_j$
  18. 18. Unified approach: Clustering • Equivalent to a weighted variant of modularity-based community detection (Waltman et al., 2010) • Maximize $\hat{Q}(x_1, \ldots, x_n) = \frac{1}{2m} \sum_{i<j} \delta(x_i, x_j) \, w_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right)$, where $\delta(x_i, x_j)$ equals 1 if $x_i = x_j$ and 0 otherwise, and $w_{ij} = \frac{2m}{k_i k_j}$
  19. 19. Unified approach: Mapping • Equivalent to the VOS (visualization of similarities) technique (Van Eck & Waltman, 2007) • Limit case of multidimensional scaling (Van Eck et al., 2010) • VOS: $Q = \sum_{i<j} \frac{2m}{k_i k_j} A_{ij} \|x_i - x_j\|^2 - \sum_{i<j} \|x_i - x_j\|$ • MDS: $\sum_{i<j} W_{ij} \left( \|x_i - x_j\| - D_{ij} \right)^2$ with $W_{ij} = \frac{2m}{k_i k_j} A_{ij}$ and $D_{ij} = \frac{k_i k_j}{2m A_{ij}}$
  20. 20. Unified approach • Commonly used clustering technique (modularity) and commonly used mapping technique (MDS) can be brought together in a unified framework [Diagram labels: Unified approach; Modularity (weighted); VOS; MDS (limit case)]
  21. 21. Louvain algorithm • The ‘Louvain algorithm’ (Blondel et al., 2008) is the most popular heuristic algorithm for large-scale modularity optimization
  22. 22. Louvain algorithm [Figure labels: Original network; Local moving heuristic; Reduced network; Local moving heuristic; Q = 0.3791; Q = 0.4151]
  23. 23. Smart local moving algorithm • The smart local moving algorithm extends the Louvain algorithm in two ways: 1. Multiple algorithm iterations, with the output of one iteration serving as input for the next iteration 2. Recursive application of the local moving heuristic
  24. 24. Smart local moving algorithm [Figure labels: Original network; Local moving heuristic; Local moving heuristic in subnetworks; Reduced network; Q = 0.3791; Q = 0.4198]
  25. 25. Empirical comparison (large networks) • 6 networks • Algorithms: – Louvain (1 iteration) – Louvain (10 iterations) – Smart local moving (10 iterations) • 10 algorithm runs using different random numbers
  26. 26. Empirical comparison (large networks)

     | Network | Measure | Louvain | Louvain (iterative) | Smart local moving |
     | --- | --- | --- | --- | --- |
     | Amazon (0.5M / 0.9M) | Qmin | 0.9257 | 0.9293 | 0.9335 |
     | | Qmax | 0.9264 | 0.9299 | 0.9338 |
     | | t | 6 | 9 | 28 |
     | DBLP (0.4M / 1.0M) | Qmin | 0.8203 | 0.8243 | 0.8357 |
     | | Qmax | 0.8227 | 0.8271 | 0.8367 |
     | | t | 7 | 9 | 26 |
     | IMDb (0.4M / 15.0M) | Qmin | 0.6976 | 0.6994 | 0.7050 |
     | | Qmax | 0.7041 | 0.7052 | 0.7077 |
     | | t | 18 | 26 | 100 |
     | LiveJournal (4.0M / 34.7M) | Qmin | 0.7441 | 0.7578 | 0.7676 |
     | | Qmax | 0.7557 | 0.7658 | 0.7720 |
     | | t | 350 | 566 | 1,549 |
     | WoS (10.6M / 104.5M) | Qmin | 0.7714 | 0.7851 | 0.7918 |
     | | Qmax | 0.7786 | 0.7902 | 0.7957 |
     | | t | 6,800 | 8,398 | 19,994 |
     | Web uk-2005 (39.5M / 783.0M) | Qmin | 0.9793 | 0.9796 | 0.9801 |
     | | Qmax | 0.9795 | 0.9797 | 0.9801 |
     | | t | 11,006 | 11,736 | 17,074 |

  27. 27. Large-scale analysis of the structure of science
  28. 28. Algorithmic classification systems of science • Publications (not journals) are clustered into research areas based on citation relations • Research areas are defined at different levels of granularity and are organized hierarchically • Clustering is performed using the smart local moving algorithm (improved Louvain algorithm; Waltman & Van Eck, 2013)
  29. 29. Algorithmically constructed classification system of science • 16.2 million publications from the period 2000–2014 indexed in Web of Science • 241.7 million citation relations • Classification system of 3 hierarchical levels: – 28 broad disciplines – 813 fields – 3,822 subfields
  30. 30. Breakdown of scientific literature into 3,822 subfields [Figure labels: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Physical sciences and engineering; Mathematics and computer science]
  31. 31. Publications in scientometrics subfield
  32. 32. Time-line map of highly cited scientometrics publications
  33. 33. Application: Exploring the interface between physical and medical sciences
  34. 34. Application: Emerging research areas in physics [Figure labels: Particle physics; Astronomy and astrophysics; Optics; Applied physics; Atomic, molecular, and chemical physics; Condensed matter physics]
  35. 35. CWTS Leiden Ranking
  36. 36. Analyzing the structure and evolution of network science
  37. 37. Network science according to Wikipedia: Network science is an interdisciplinary academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks. The field draws on theories and methods including graph theory from mathematics, statistical mechanics from physics, data mining and information visualization from computer science, inferential modeling from statistics, and social structure from sociology.
  38. 38. Networks textbook by Mark Newman: The scientific study of networks, including computer networks, social networks, and biological networks, has received an enormous amount of interest in the last few years. (...) The study of networks is broadly interdisciplinary and important developments have occurred in many fields, including mathematics, physics, computer and information sciences, biology, and the social sciences.
  39. 39. Journal of Complex Networks: The journal covers everything from the basic mathematical, physical and computational principles needed for studying complex networks to their applications leading to predictive models in molecular, biological, ecological, informational, engineering, social, technological and other systems.
  40. 40. Network Science journal: Network Science is a new journal for a new discipline - one using the network paradigm, focusing on actors and relational linkages, to inform research, methodology, and applications from many fields across the natural, social, engineering and informational sciences.
  41. 41. Popular network terms • neural network • social network • wireless sensor network • complex network • wireless network • regulatory network
  42. 42. Network publications • Web of Science database • Time period 1992–2014 • Research articles and review articles • ‘network’ or ‘graph’ in title or abstract • 0.7 million publications
  43. 43. Number of network publications per year
  44. 44. Co-occurrence relations between terms in network publications [Figure labels: Biology; Neuroscience; Social science; Chemistry; Mathematics; Computer science]
  45. 45. Co-occurrence relations between terms in network publications [Figure labels: Biology; Neuroscience; Social science; Chemistry; Mathematics; Computer science]
  46. 46. Network fields • Network publications are clustered into fields • Based on 3.1 million citation relations between network publications • Clustering methodology of Waltman and Van Eck (2012, 2013) • Publications in the same journal are assigned to the same cluster, except for multidisciplinary journals • 13 main clusters, covering 97% of all 0.7 million network publications
  47. 47. Number of network publications per field
  48. 48. Citation relations between journals with ≥ 100 network publications [Figure labels: Computer science; Mathematics; Physics; Neuroscience; Biology; Chemistry]
  49. 49. Convergence toward an integrated network science field? Number of citations between network fields (× 100; 5-year citation window) [Figure: two network diagrams, 2004 and 2014, with nodes Physics, Math, CS, Biology, SS, and Neuro]
  50. 50. Convergence toward an integrated network science field? % of publications in each of two fields citing at least one publication in the other field (5-year citation window) [Figure: two network diagrams, 2004 and 2014, with nodes Physics, Math, CS, Biology, SS, and Neuro]
  51. 51. Convergence of social science and physics
  52. 52. Citation relations between journals at the SS-physics interface (2005–2014) [Figure labels: Scientometrics; Economics; Sociology and SNA; Physica A; PRE; PRL; PLOS ONE; PNAS; Nature; Science; Sci. Rep.; JSTAT; EPL; EPJ B]
  53. 53. Leiden University’s institutes with most publications on network science • LUMC • Leiden Institute of Advanced Computer Science (Science) • Leiden Institute of Chemistry (Science) • Leiden Institute of Physics (Science) • Institute of Psychology (FSW) • Mathematical Institute (Science) • Leiden Observatory (Science) • Institute of Biology Leiden (Science) • Centre for Science and Technology Studies (FSW)
  54. 54. Citation relations between journals with ≥ 100 network publications [Figure labels: Computer science; Mathematics; Physics; Neuroscience; Biology; Chemistry]
  55. 55. Leiden University’s publication output in network science journals
  56. 56. Leiden University’s publication output in network science journals [Figure labels: CWTS; Leiden Institute of Chemistry; LIACS; Leiden Institute of Physics; Institute of Psychology; LUMC; Institute of Biology Leiden; Mathematical Institute]
  57. 57. Conclusions • Network research has increased tremendously during the past 10–15 years • Network research covers many fields of science, but there is only limited evidence of increasing integration • Network research in social science and physics is becoming more connected • Leiden University contributes to all major areas of network research, although the contribution in the area of computer science is somewhat modest
  58. 58. Do it yourself! • www.vosviewer.com • www.citnetexplorer.nl
  59. 59. Thank you for your attention!
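
The clustering case of the unified objective (slide 17) and the weighted modularity function (slide 18) can be compared numerically on a toy network. The following is a minimal sketch, not the authors' implementation: it assumes a small symmetric weight matrix with a zero diagonal and no isolated nodes, and the function names and the example graph (two triangles joined by one edge) are illustrative choices that do not appear in the slides.

```python
from itertools import combinations


def node_weights(A):
    """k_i: total weight of all edges of node i."""
    return [sum(row) for row in A]


def total_weight(A):
    """m: total weight of all edges (each undirected edge counted once)."""
    return sum(A[i][j] for i, j in combinations(range(len(A)), 2))


def unified_objective(A, x, gamma=1.0):
    """Q from slide 17 in the clustering case (to be minimized).

    d_ij is 0 if nodes i and j share a community and 1/gamma otherwise.
    """
    k, m = node_weights(A), total_weight(A)
    q = 0.0
    for i, j in combinations(range(len(A)), 2):
        d = 0.0 if x[i] == x[j] else 1.0 / gamma
        q += 2.0 * m / (k[i] * k[j]) * A[i][j] * d * d - d
    return q


def weighted_modularity(A, x, gamma=1.0):
    """Q-hat from slide 18 (to be maximized)."""
    k, m = node_weights(A), total_weight(A)
    q = 0.0
    for i, j in combinations(range(len(A)), 2):
        if x[i] == x[j]:
            w = 2.0 * m / (k[i] * k[j])
            q += w * (A[i][j] - gamma * k[i] * k[j] / (2.0 * m))
    return q / (2.0 * m)


# Illustrative network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

triangles = [0, 0, 0, 1, 1, 1]
one_cluster = [0] * 6
singletons = list(range(6))

for name, x in [("triangles", triangles), ("one cluster", one_cluster),
                ("singletons", singletons)]:
    print(name, unified_objective(A, x), weighted_modularity(A, x))
```

Substituting the clustering distances into Q gives Q = C − (2m/γ²)·Q̂, where C does not depend on the partition, so a partition with lower Q always has higher Q̂. On the toy network the two-triangle partition scores best under both functions.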

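The local moving heuristic that both the Louvain algorithm and the smart local moving algorithm build on (slides 21 to 24) can be sketched in a few lines. This is a deliberately naive illustration under stated assumptions, not the published algorithm: it uses standard unweighted modularity rather than the weighted variant of slide 18, visits nodes in a fixed order instead of a random one, recomputes modularity from scratch for every candidate move, and omits the network reduction and subnetwork steps. The names `modularity` and `local_moving` and the example graph are hypothetical.

```python
def modularity(adj, x):
    """Standard modularity of partition x on an undirected, unweighted graph.

    adj maps each node to the set of its neighbours; x maps node -> community.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())
    internal = {}  # community -> number of edge ends that stay inside it
    degree = {}    # community -> total degree of its nodes
    for i, nbrs in adj.items():
        c = x[i]
        degree[c] = degree.get(c, 0) + len(nbrs)
        internal[c] = internal.get(c, 0) + sum(1 for j in nbrs if x[j] == c)
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


def local_moving(adj, x=None, tol=1e-12):
    """Move single nodes to neighbouring communities while modularity improves."""
    x = dict(x) if x is not None else {i: i for i in adj}  # start from singletons
    best_q = modularity(adj, x)
    improved = True
    while improved:
        improved = False
        for i in sorted(adj):
            current = x[i]
            best_c = current
            # Candidate communities: those of the neighbours of node i.
            for c in sorted({x[j] for j in adj[i]} - {current}):
                x[i] = c
                q = modularity(adj, x)
                if q > best_q + tol:
                    best_q, best_c = q, c
            x[i] = best_c  # keep the best move, or restore the original community
            if best_c != current:
                improved = True
    return x, best_q


# Illustrative network: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
partition, q = local_moving(adj)
print(partition, q)
```

Recomputing modularity for every candidate move keeps the sketch short and easy to check, but it is far too slow for networks of the sizes reported on slide 26; practical implementations update the quality function incrementally.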