Applications of community detection in bibliometric network analysis

In this talk, we focus on the analysis of bibliometric networks, and in particular on the detection of communities in these networks. We start by demonstrating VOSviewer, a popular software tool for visualizing bibliometric networks. We discuss the techniques used by VOSviewer for visualizing bibliometric networks and for detecting communities in these networks. We pay special attention to the close relationship between visualization and community detection, and we discuss the unified approach to visualization and community detection that is implemented in VOSviewer. We then shift our attention to community detection in very large citation networks, including millions of publications and hundreds of millions of citation relations. We show how community detection techniques can be used to construct highly detailed classification systems of science. We also discuss applications of such classification systems to science policy questions. Finally, we demonstrate CitNetExplorer, a new software tool in which community detection techniques are used to support the large-scale analysis of citation networks. We use CitNetExplorer to analyze the citation network of publications on network science and in particular on community detection.

  1. Applications of community detection in bibliometric network analysis
     Nees Jan van Eck, Centre for Science and Technology Studies (CWTS), Leiden University
     EURANDOM workshop “Networks with community structure”, Eindhoven, January 24, 2014
  2. Outline
     • Bibliometric network analysis at CWTS
     • VOSviewer
     • Unified approach to visualization and community detection
     • Community detection in large citation networks
     • CitNetExplorer
  3. Bibliometric network analysis at CWTS
  4. Bibliometric network analysis at CWTS
     • In-house databases:
       – Thomson Reuters Web of Science
       – Elsevier Scopus
     • Bibliometric networks:
       – Publication citation networks
       – Journal co-citation/bibliographic coupling networks
       – Term co-occurrence networks
       – Co-authorship networks
       – Etc.
     • Applications:
       – Research institutions: Research assessment
       – Scientific publishers: Journal profiling
       – Funding agencies: Science policy analyses
  5. VOSviewer (www.vosviewer.com)
  6. VOSviewer (Van Eck & Waltman, Scientometrics, 2010)
  7. Citation network of fields in Web of Science
  8. Co-occurrence network of terms in clinical neurology
  9. Unified approach to visualization and community detection
  10. Visualization vs. community detection
      • Visualization (‘mapping’):
        – Assigning the nodes in a network to locations in a (usually two-dimensional) space
      • Community detection (‘clustering’):
        – Partitioning the nodes in a network into a number of groups
  11. Community detection seen as visualization in a restricted space
  12. Community detection seen as visualization in a restricted space
  13. Unified approach to visualization and community detection
      Minimize
      $$Q(x_1, \ldots, x_n) = \sum_{i<j} \frac{2m A_{ij}}{k_i k_j} d_{ij}^2 - \sum_{i<j} d_{ij}$$
      where
      • $n$: number of nodes in the network
      • $m$: total weight of all edges in the network
      • $A_{ij}$: weight of the edge between nodes $i$ and $j$
      • $k_i$: total weight of all edges of node $i$
      Visualization:
      • $x_i$: vector denoting the location of node $i$ in a $p$-dimensional space
      • $d_{ij} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$
      Community detection:
      • $x_i$: integer denoting the community to which node $i$ belongs
      • $d_{ij} = 0$ if $x_i = x_j$ and $d_{ij} = 1/\gamma$ if $x_i \neq x_j$
      • $\gamma$: resolution parameter
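The objective on slide 13 can be evaluated directly for a small network. The sketch below is illustrative only: it assumes a symmetric weight matrix without isolated nodes, uses the symbol gamma for the resolution parameter, and the toy network (two triangles joined by one edge) and all function names are hypothetical, not taken from VOSviewer.

```python
import math
from itertools import combinations


def unified_quality(A, d):
    """Q = sum_{i<j} (2m A_ij / (k_i k_j)) d_ij^2 - sum_{i<j} d_ij."""
    n = len(A)
    k = [sum(row) for row in A]   # k_i: total weight of all edges of node i
    m = sum(k) / 2.0              # m: total weight of all edges
    q = 0.0
    for i, j in combinations(range(n), 2):
        s_ij = 2.0 * m * A[i][j] / (k[i] * k[j])
        q += s_ij * d(i, j) ** 2 - d(i, j)
    return q


def community_distance(x, gamma=1.0):
    """d_ij for community detection: 0 within a community, 1/gamma across."""
    return lambda i, j: 0.0 if x[i] == x[j] else 1.0 / gamma


def layout_distance(x):
    """d_ij for visualization: Euclidean distance between node locations."""
    return lambda i, j: math.dist(x[i], x[j])


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5}, bridged by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

q_two = unified_quality(A, community_distance([0, 0, 0, 1, 1, 1]))
q_one = unified_quality(A, community_distance([0, 0, 0, 0, 0, 0]))
q_single = unified_quality(A, community_distance([0, 1, 2, 3, 4, 5]))

# The same function scores a two-dimensional layout of the nodes.
positions = [(0.0, 0.0), (0.0, 1.0), (0.5, 0.5), (1.5, 0.5), (2.0, 0.0), (2.0, 1.0)]
q_layout = unified_quality(A, layout_distance(positions))

print(q_two, q_one, q_single, q_layout)
```

Since Q is minimized, the partition into the two triangles scores better than either a single community or six singleton communities on this toy network.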
  14. Unified approach: Community detection
      Equivalent to a weighted variant of modularity-based community detection (Waltman et al., 2010):
      Maximize
      $$\hat{Q}(x_1, \ldots, x_n) = \frac{1}{2m} \sum_{i<j} \delta(x_i, x_j)\, w_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right)$$
      where $\delta(x_i, x_j)$ equals 1 if $x_i = x_j$ and 0 otherwise, and
      $$w_{ij} = \frac{2m}{k_i k_j}$$
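The equivalence claimed on slide 14 can be checked in a few lines. The derivation below is a sketch: the shorthand $s_{ij}$ and the constant $C$ are introduced here for convenience and do not appear on the slides, and $\gamma$ denotes the resolution parameter. It substitutes the community-detection distances ($d_{ij} = 0$ within a community, $d_{ij} = 1/\gamma$ across communities) into the objective $Q$ of slide 13.

```latex
\begin{align*}
Q &= \sum_{i<j} s_{ij}\, d_{ij}^2 - \sum_{i<j} d_{ij},
    \qquad s_{ij} = \frac{2m A_{ij}}{k_i k_j} = w_{ij} A_{ij} \\
  &= \sum_{i<j} \bigl(1 - \delta(x_i, x_j)\bigr)
     \left( \frac{s_{ij}}{\gamma^2} - \frac{1}{\gamma} \right) \\
  &= \frac{1}{\gamma^2} \left[ \sum_{i<j} (s_{ij} - \gamma)
     - \sum_{i<j} \delta(x_i, x_j)\,(s_{ij} - \gamma) \right] \\
  &= \frac{1}{\gamma^2} \left[ C - 2m\, \hat{Q} \right],
    \qquad C = \sum_{i<j} (s_{ij} - \gamma)
\end{align*}
```

The last step uses $s_{ij} - \gamma = w_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right)$. Because $C$ does not depend on the community assignment, minimizing $Q$ over assignments is the same as maximizing $\hat{Q}$.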
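  15. Unified approach: Visualization
      • Equivalent to the VOS (visualization of similarities) technique (Van Eck & Waltman, 2007)
      • Limit case of multidimensional scaling (Van Eck et al., 2010)
      VOS: minimize
      $$Q = \sum_{i<j} \frac{2m A_{ij}}{k_i k_j} \|x_i - x_j\|^2 - \sum_{i<j} \|x_i - x_j\|$$
      MDS: minimize
      $$\sum_{i<j} W_{ij} \left( D_{ij} - \|x_i - x_j\| \right)^2$$
      with
      $$D_{ij} = \frac{k_i k_j}{2m A_{ij}}, \qquad W_{ij} = \frac{2m A_{ij}}{k_i k_j}$$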
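The link between the VOS and MDS objectives on slide 15 can be made concrete by expanding the MDS objective. This is a sketch restricted to pairs with $A_{ij} > 0$, for which $D_{ij} = 1/W_{ij}$ is well defined; pairs with $A_{ij} = 0$ are not covered by this algebra. The rescaling $x_i = 2 y_i$ is introduced here for illustration.

```latex
\begin{align*}
\sum_{i<j} W_{ij} \bigl( D_{ij} - \|x_i - x_j\| \bigr)^2
  &= \sum_{i<j} W_{ij} D_{ij}^2
     - 2 \sum_{i<j} W_{ij} D_{ij} \|x_i - x_j\|
     + \sum_{i<j} W_{ij} \|x_i - x_j\|^2 \\
  &= \sum_{i<j} \frac{1}{W_{ij}}
     - 2 \sum_{i<j} \|x_i - x_j\|
     + \sum_{i<j} W_{ij} \|x_i - x_j\|^2 \\
  &= \text{const}
     + 4 \left[ \sum_{i<j} W_{ij} \|y_i - y_j\|^2
     - \sum_{i<j} \|y_i - y_j\| \right],
     \qquad x_i = 2 y_i
\end{align*}
```

The bracketed term is the VOS objective $Q$ evaluated at the locations $y_i$, so on these pairs the two objectives share their minimizers up to a uniform rescaling of the map by a factor of 2.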
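  16. Unified approach
      The most commonly used community detection technique (modularity) and the most commonly used visualization technique (MDS) can be brought together in a unified framework.
      [Diagram: the unified approach connects to modularity (weighted) on the community detection side and to VOS on the visualization side, with VOS in turn connected to MDS (limit case).]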
  17. Community detection in large citation networks
  18. Classification systems of scientific publications
      • Web of Science/Scopus classification systems:
        – Scientific fields defined at the level of journals rather than individual publications
        – Difficulties with multidisciplinary journals
        – High level of aggregation
        – Sometimes outdated or inaccurate
      • Disciplinary classification systems:
        – E.g., CA, JEL, MeSH, PACS
        – Not available for all disciplines
        – Sometimes outdated or inaccurate
  19. Algorithmically constructed classification systems
      • Publications (not journals) are clustered into fields based on citation relations
      • Fields are defined at different levels of granularity and are organized hierarchically
      • Community detection based on a variant of the standard modularity function that accounts for differences in citation practices across fields
      • Optimization using the smart local moving algorithm
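To give a feel for what optimizing a modularity-type function by moving nodes between communities involves, here is a minimal sketch of a plain local moving heuristic. It is not the smart local moving algorithm named on slide 19, and it assumes the weighted quality function of slide 14 rather than the citation-practice-aware variant, which the slide does not spell out. The function name, the toy network, and the dense-matrix representation are hypothetical and only suit small examples.

```python
def local_moving(A, gamma=1.0, max_sweeps=100):
    """Greedy node moves that increase sum_{i<j} delta(x_i, x_j) (s_ij - gamma),
    with s_ij = 2m A_ij / (k_i k_j). Assumes no isolated nodes."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = float(sum(k))
    x = list(range(n))        # start with every node in its own community
    size = [1] * n            # size[c]: number of nodes with label c
    for _ in range(max_sweeps):
        moved = False
        for i in range(n):
            # Sum of s_ij from node i to each community it is linked to.
            link = {}
            for j in range(n):
                if j != i and A[i][j] > 0:
                    s_ij = two_m * A[i][j] / (k[i] * k[j])
                    link[x[j]] = link.get(x[j], 0.0) + s_ij
            old = x[i]
            size[old] -= 1    # temporarily remove node i from its community
            # Contribution of node i when placed in community c:
            # sum over members j of c of (s_ij - gamma).
            best, best_gain = old, link.get(old, 0.0) - gamma * size[old]
            for c, s in link.items():
                gain = s - gamma * size[c]
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            if best_gain < 0:
                # Being alone (contribution 0) beats every existing community.
                best = next(c for c in range(n) if size[c] == 0)
            x[i] = best
            size[best] += 1
            moved = moved or best != old
        if not moved:
            break
    return x


# Hypothetical toy network: triangles {0, 1, 2} and {3, 4, 5}, bridged by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

labels = local_moving(A, gamma=1.0)
print(labels)
```

Raising gamma makes joining an existing community more costly and so tends to produce more, smaller communities, which is one way to obtain different levels of granularity.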
  20. Example (Waltman & Van Eck, 2012)
      • 10.2 million publications from the period 2001–2010 indexed in Web of Science
      • 97.6 million citation relations
      • Classification system of 3 hierarchical levels:
        – 20 broad disciplines
        – 672 fields
        – 22,412 subfields
  21. Visualization of 672 research areas at level 2 of the classification system
  22. Visualization of 417 publications in research area 4.30.10
  23. Application in a science policy context
  24. CitNetExplorer (www.citnetexplorer.nl)
  25. Exploring citation networks
      • Macro-level applications:
        – Studying the development of a research field over time
        – Identifying research areas
      • Micro-level applications:
        – Studying the publication oeuvre of a researcher
        – Supporting systematic literature reviewing
  26. HistCite
      • Timeline visualization of publications and their citation relations, referred to as algorithmic historiography by Eugene Garfield
  27. CitNetExplorer
      • New software tool for analyzing and visualizing citation networks
      • Freely available on www.citnetexplorer.nl
      • Runs on any system that offers Java support
      • Citation networks can be constructed directly based on data downloaded from Web of Science
      • Interactive functionality for drilling down into a citation network
      • Very large citation networks can be handled, with millions of publications and tens of millions of citation relations
  28. Demonstration
      • Database: Web of Science
      • Fields: Physics and multidisciplinary (Nature, PLoS ONE, PNAS, Science, etc.)
      • Time period: 1998–2012
      • Number of publications: ~1.8 million
      • Number of citation relations: ~15.1 million
  29. CitNetExplorer
  30. References
      • Van Eck, N.J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523-538.
      • Van Eck, N.J., & Waltman, L. (2011). Text mining and visualization using VOSviewer. ISSI Newsletter, 7(3), 50-54.
      • Van Eck, N.J., Waltman, L., Dekker, R., & Van den Berg, J. (2010). A comparison of two techniques for bibliometric mapping: Multidimensional scaling and VOS. JASIST, 61(12), 2405-2416.
      • Waltman, L., & Van Eck, N.J. (2012). A new methodology for constructing a publication-level classification system of science. JASIST, 63(12), 2378-2392.
      • Waltman, L., & Van Eck, N.J. (2013). A smart local moving algorithm for large-scale modularity-based community detection. European Physical Journal B, 86(11), 471.
      • Waltman, L., Van Eck, N.J., & Noyons, E.C.M. (2010). A unified approach to mapping and clustering of bibliometric networks. Journal of Informetrics, 4(4), 629-635.
