VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s... - Nees Jan van Eck
This talk introduces two software tools that have been developed for bibliometric analysis of scientific publications: VOSviewer (www.vosviewer.com) and CitNetExplorer (www.citnetexplorer.nl). VOSviewer is a popular tool that can be used for visualizing bibliometric networks of citation relations between publications, authors, and journals. In addition, the tool can be used for creating so-called term map visualizations based on a text mining analysis of the titles and abstracts of publications. The most important terms occurring in titles and abstracts are identified and the co-occurrence relations between these terms are visualized. CitNetExplorer is a tool for the visualization and analysis of citation networks of scientific publications. The tool can be used to explore in detail how publications build on each other, as indicated by citation links. It is also possible to drill down into specific areas within a citation network, making it possible to perform micro-level analyses of the development of a particular area of research. In this talk, special attention will be paid to possible applications of VOSviewer and CitNetExplorer in humanities research, focusing in particular on the use of advanced text mining, network analysis, and visualization techniques for analyzing large quantities of textual data.
Nees Jan van Eck is a researcher at the Centre for Science and Technology Studies (CWTS) of Leiden University. His research focuses on the quantitative analysis of scientific research based on large amounts of bibliographic data and using sophisticated techniques from fields such as network analysis, statistics, and machine learning. Together with his colleague Ludo Waltman, Nees Jan has developed the VOSviewer and CitNetExplorer tools.
This tutorial deals with two software tools: VOSviewer and CitNetExplorer. VOSviewer (www.vosviewer.com) is a freely available tool for constructing and visualizing bibliographic coupling, co-citation, co-authorship, and term co-occurrence networks. These networks can be constructed based on data downloaded from Web of Science or Scopus. CitNetExplorer (www.citnetexplorer.nl) is a freely available tool for analyzing and visualizing citation networks of publications.
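As a rough illustration of what constructing such networks involves, the following Python sketch derives co-authorship and bibliographic coupling links from a handful of records. The record layout and the names `records`, `co_authorship`, and `bibliographic_coupling` are assumptions made for this example; they are not the Web of Science or Scopus export formats, and this is not how VOSviewer itself is implemented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records; real Web of Science or Scopus exports use
# their own file formats and field names.
records = [
    {"id": "p1", "authors": ["A", "B"], "refs": ["r1", "r2"]},
    {"id": "p2", "authors": ["B", "C"], "refs": ["r2", "r3"]},
    {"id": "p3", "authors": ["A", "B", "C"], "refs": ["r1", "r2", "r3"]},
]

def co_authorship(records):
    """Count, for each author pair, the publications they wrote together."""
    links = Counter()
    for rec in records:
        for a, b in combinations(sorted(set(rec["authors"])), 2):
            links[(a, b)] += 1
    return links

def bibliographic_coupling(records):
    """Count, for each publication pair, the cited references they share."""
    links = Counter()
    for x, y in combinations(records, 2):
        shared = len(set(x["refs"]) & set(y["refs"]))
        if shared:
            links[(x["id"], y["id"])] = shared
    return links

coauthor_links = co_authorship(records)
coupling_links = bibliographic_coupling(records)
```

In both cases the link weight is a simple count: the number of joint publications for co-authorship, and the number of shared cited references for bibliographic coupling.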
The aim of the tutorial is to provide the participants with a basic knowledge of VOSviewer and CitNetExplorer. Given time constraints, it will not be possible to explore the two tools in a fully comprehensive way, but the tutorial will offer a thorough introduction to the most essential features of the tools. This should be sufficient for the participants to perform all basic analyses that can be done using VOSviewer and CitNetExplorer. In addition, it should allow the participants to independently explore the tools in more detail.
The lecturers are Nees Jan van Eck and Ludo Waltman, both affiliated with the Centre for Science and Technology Studies (CWTS) of Leiden University. Nees Jan and Ludo are the developers of VOSviewer and CitNetExplorer, and they therefore have an in-depth knowledge of both software tools. Nees Jan and Ludo regularly organize courses and workshops on VOSviewer and CitNetExplorer (see for instance www.cwts.nl/Bibliometric-Network-Analysis-and-Science-Mapping-Using-VOSviewer), so they have extensive experience in training people in the use of these tools.
Applications of community detection in bibliometric network analysis - Nees Jan van Eck
In this talk, we focus on the analysis of bibliometric networks, and in particular on the detection of communities in these networks. We start by demonstrating VOSviewer, a popular software tool for visualizing bibliometric networks. We discuss the techniques used by VOSviewer for visualizing bibliometric networks and for detecting communities in these networks. We pay special attention to the close relationship between visualization and community detection, and we discuss the unified approach to visualization and community detection that is implemented in VOSviewer. We then shift our attention to community detection in very large citation networks, including millions of publications and hundreds of millions of citation relations. We show how community detection techniques can be used to construct highly detailed classification systems of science. We also discuss applications of such classification systems to science policy questions. Finally, we demonstrate CitNetExplorer, a new software tool in which community detection techniques are used to support the large-scale analysis of citation networks. We use CitNetExplorer to analyze the citation network of publications on network science and in particular on community detection.
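To make the idea of community detection slightly more concrete, the sketch below scores a given partition of a small network with modularity, a standard quality function. The abstract does not say which quality function VOSviewer or CitNetExplorer optimizes, so using modularity here is an assumption, and the network and the names `edges`, `two_groups`, and `modularity` are hypothetical.

```python
from collections import Counter

def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted network.

    edges: list of (node, node) pairs, each edge listed once.
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = Counter()  # number of edges inside each community
    degree = Counter()    # sum of node degrees in each community
    for a, b in edges:
        degree[community[a]] += 1
        degree[community[b]] += 1
        if community[a] == community[b]:
            internal[community[a]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical toy network: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(edges, two_groups)  # 5/14, roughly 0.357
q_one = modularity(edges, one_group)   # 0.0
```

Splitting the network into its two triangles scores higher than putting all nodes in one community, which is the kind of comparison a community detection algorithm makes when searching over partitions.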
This document discusses large-scale visualization of science using bibliometric networks and analysis techniques. It summarizes two bibliographic data sources, Web of Science and Scopus, that can be used to construct citation, co-authorship, keyword co-occurrence, and other networks. Software tools like VOSviewer and CitNetExplorer are presented that enable interactive visualization and analysis of these networks to gain insights into the structure and evolution of science. Specific examples demonstrate classification systems of science fields, network analyses of the scientometrics subfield and the graphene subfield, and visualizations of China's and Wuhan University's scholarly output over time.
Presentation at the Data Science seminar at the Faculty of Social and Behavioural Sciences, Leiden University, Leiden, The Netherlands, December 7, 2018.
Presentation at the Workshop on Open Citations, University of Bologna, Bologna, Italy, September 4, 2018.
I will demonstrate the use of the VOSviewer software (www.vosviewer.com), of which I am one of the developers, for creating bibliometric visualizations of science based on openly available bibliographic data sources. Both the use of Crossref data and the use of data from the OpenCitations Corpus will be demonstrated. In addition, I will show how data from Dimensions can be used. The possibilities and limitations of the currently available open data sources will be discussed, also in comparison with more established data sources such as Web of Science and Scopus. Finally, I will provide my perspective on future developments, focusing especially on the integration of open data sources and visual analysis tools.
A systematic empirical comparison of different approaches for normalizing cit... - Nees Jan van Eck
We address the question of how citation-based bibliometric indicators can best be normalized to ensure fair comparisons between publications from different scientific fields and different years. In a systematic large-scale empirical analysis, we compare a traditional normalization approach based on a field classification system with three source normalization approaches. We pay special attention to the selection of the publications included in the analysis. Publications in national scientific journals, popular scientific magazines, and trade magazines are not included. Unlike earlier studies, we use algorithmically constructed classification systems to evaluate the different normalization approaches. Our analysis shows that a source normalization approach based on the recently introduced idea of fractional citation counting does not perform well. Two other source normalization approaches generally outperform the classification-system-based normalization approach that we study. Our analysis therefore offers considerable support for the use of source-normalized bibliometric indicators.
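The two families of approaches can be illustrated with a minimal Python sketch. It assumes that classification-based normalization divides a publication's citation count by the mean citation count of its field and year, and that fractional citation counting weights each citation by one over the number of references in the citing publication. The abstract does not define the exact indicators used in the study, so these definitions, the toy data, and the function names are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (publication, field, year, citation count).
pubs = [
    ("p1", "math", 2010, 2),
    ("p2", "math", 2010, 6),
    ("p3", "biology", 2010, 20),
    ("p4", "biology", 2010, 60),
]

def field_normalized_scores(pubs):
    """Citations divided by the mean citations of the same field and year."""
    groups = defaultdict(list)
    for _, field, year, cites in pubs:
        groups[(field, year)].append(cites)
    expected = {key: mean(values) for key, values in groups.items()}
    return {
        pid: (cites / expected[(field, year)] if expected[(field, year)] else 0.0)
        for pid, field, year, cites in pubs
    }

def fractional_citation_counts(citing_refs):
    """Weight each citation by 1 / (number of references in the citing work)."""
    scores = defaultdict(float)
    for refs in citing_refs.values():
        for cited in refs:
            scores[cited] += 1 / len(refs)
    return dict(scores)

normalized = field_normalized_scores(pubs)
# Hypothetical citing publications and their reference lists.
fractional = fractional_citation_counts({"c1": ["p1", "p2"], "c2": ["p1"]})
```

In the toy data, p2 and p4 receive the same normalized score of 1.5 even though p4 has ten times as many citations, because each is compared only with publications from its own field and year.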
Advanced citation matching and large-scale cited reference extraction - Nees Jan van Eck
This document summarizes research comparing the accuracy of citation matching algorithms from Web of Science (WoS), CWTS, and iFQ. It also analyzes the accuracy of cited references extracted from WoS using Elsevier ScienceDirect as a benchmark. The research found that CWTS and iFQ algorithms had higher recall rates than WoS, with a small decrease in precision. Analysis of WoS cited references showed about 0.3% were missing, 0.2% had minor errors, and 0.1% had major errors when compared to the original publication references. Overall, WoS performed well but room for improvement was identified.
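The trade-off between recall and precision mentioned above can be made concrete with a small Python sketch. It treats a matching algorithm's output as a set of (cited reference, matched publication) pairs and compares it with a manually verified set. The pair representation, the toy data, and the function name `precision_recall` are assumptions for illustration; they are not the evaluation procedure used in the study.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted matches against verified matches."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical verified matches: (cited reference, publication).
gold = {("r1", "p1"), ("r2", "p2"), ("r3", "p3"), ("r4", "p4")}

# A strict matcher finds fewer matches but makes no errors.
strict = {("r1", "p1"), ("r2", "p2")}
# A lenient matcher finds more matches at the cost of one wrong match.
lenient = {("r1", "p1"), ("r2", "p2"), ("r3", "p3"), ("r5", "p9")}

strict_scores = precision_recall(strict, gold)    # (1.0, 0.5)
lenient_scores = precision_recall(lenient, gold)  # (0.75, 0.75)
```

The lenient matcher in this toy example shows the pattern described in the summary: higher recall with some loss of precision.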
Bibliometric network analysis: Software tools, techniques, and an analysis o... - Nees Jan van Eck
This document summarizes a presentation about bibliometric network analysis tools and techniques. It discusses two software tools, VOSviewer and CitNetExplorer, that are used to construct and visualize bibliometric networks. It also outlines various network analysis techniques, including layout algorithms, community detection methods, and a unified approach to mapping and clustering networks. Finally, it provides an analysis of the structure and evolution of the field of network science based on a large bibliometric dataset.
Large-scale analysis of bibliometric data sources - Nees Jan van Eck
This document summarizes a presentation about analyzing large bibliometric data sources. It discusses the speaker's background in bibliometrics and their research center CWTS. CWTS has access to large bibliographic databases and focuses on bibliometric and scientometric research. Software tools for constructing and analyzing bibliometric networks are described, including VOSviewer and CitNetExplorer which were developed by the speaker. Network analysis techniques like community detection and layout algorithms are also covered. Finally, the document analyzes the field of data science using bibliometric methods by identifying publications related to data and mapping the growth and structure of data-driven research fields.
Visual exploration of scientific literature using VOSviewer and CitNetExplorer - Nees Jan van Eck
Presentation at the International Conference on ICT enhanced Social Sciences and Humanities 2020, July 1, 2020.
It is essential for researchers to have an up-to-date understanding of the literature in their research field. However, keeping up with all relevant literature is highly time consuming. Bibliometric visualizations can support this task. These visualizations provide intuitive overviews of the literature in a research field, enabling researchers to obtain a better understanding of the structure and development of a field and to get an impression of the most significant contributions made in the field.
In this talk, I will give an introduction to two software tools for bibliometric visualization: VOSviewer (www.vosviewer.com) and CitNetExplorer (www.citnetexplorer.nl). VOSviewer is a popular tool for visualizing bibliometric networks of publications, authors, journals, and keywords. CitNetExplorer is a tool for the visualization and analysis of citation networks of scientific publications. I will pay special attention to applications of VOSviewer and CitNetExplorer in the social sciences and humanities, focusing in particular on the use of advanced text mining, network analysis, and visualization techniques for analyzing large amounts of textual data.
A new software tool for large-scale analysis of citation networks - Nees Jan van Eck
This document describes a new software tool called Citation Network Explorer that allows users to explore and visualize large-scale citation networks over time in a dynamic way. It summarizes the motivation for developing this tool, which is the limited availability of software that can handle the visualization of the evolution of science. The document then provides an overview of the tool's capabilities and demonstrates it on two sample citation network datasets, concluding with a list of references for related research.
This document summarizes a presentation on scientometric approaches to classification. It discusses:
- Bibliographic databases like Web of Science and Scopus and their coverage.
- Types of classification systems for scientific literature including mono-disciplinary vs multidisciplinary and journal-level vs publication-level classifications.
- The CWTS publication-level classification system which uses a fully algorithmic approach to cluster over 21 million publications into a hierarchical structure of disciplines, fields, and subfields.
- Applications of the CWTS classification system including field normalization, field delineation, research strength analysis, and identification of interdisciplinary areas.
- Studies that have evaluated aspects of the quality and accuracy of classification systems.
Network visualization: Fine-tuning layout techniques for different types of n... - Nees Jan van Eck
An important issue in network visualization is the problem of obtaining a good layout for a network. For a given network, which may be either weighted or unweighted, the problem is to position the nodes in the network in a two-dimensional space in such a way that an attractive layout is obtained. Many layout techniques have been proposed [1]. In the visualization of bibliometric networks, multidimensional scaling and the layout technique of Kamada and Kawai [2] have, for instance, been widely used. More recently, the VOS (visualization of similarities) layout technique [3], implemented in our VOSviewer software (www.vosviewer.com) [4], has often been used for bibliometric network visualization.
There is no layout technique that is generally considered to give optimal results. One reason for this is that comparisons between layouts produced by different techniques involve considerable subjectivity. Someone may consider one layout to be more attractive than another, while someone else may hold the opposite opinion. In addition, the attractiveness of a layout may depend on the type of visualization that is needed. For instance, some layouts may be more attractive for interactive visualizations (e.g., in a software tool with zooming functionality), while other layouts may be more attractive for static visualizations. Furthermore, different types of networks may benefit from different layout techniques.
In recent studies [5, 6], the idea of parameterized layout techniques has been introduced. Parameterized layout techniques produce different types of layouts depending on the values chosen for their parameters. In this research, we present a comprehensive study of a parameterized version of our VOS layout technique. Two parameters are included. As in [5], these are referred to as attraction and repulsion parameters. We compare the layouts obtained for different parameter values. Comparisons are made both subjectively using the VOSviewer software (i.e., which layout do we find most appealing?) and more objectively using so-called meta-criteria [6, 7]. Sensitivity to local optima is taken into account as well. Comparisons are made for all important types of bibliometric networks, in particular co-authorship, citation, co-citation, bibliographic coupling, and co-occurrence networks. Both smaller and larger networks are considered.
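The abstract does not state the objective function of the parameterized layout technique. As a hedged sketch only, the Python function below evaluates one common energy form for layouts with an attraction and a repulsion parameter: linked node pairs are pulled together by a term with the attraction exponent, and all node pairs are pushed apart by a term with the repulsion exponent. The formula, the default values, and the name `layout_energy` are assumptions for illustration and may differ from the technique studied.

```python
import math
from itertools import combinations

def layout_energy(positions, weights, attraction=2, repulsion=1):
    """Energy of a 2D layout under an assumed attraction/repulsion model.

    positions: dict mapping node -> (x, y).
    weights: dict mapping sorted node pairs (i, j) -> link weight.
    Assumes attraction > repulsion > 0; lower energy means a better layout.
    """
    energy = 0.0
    for i, j in combinations(sorted(positions), 2):
        d = math.dist(positions[i], positions[j])
        w = weights.get((i, j), 0.0)
        # Attraction acts only on linked pairs, repulsion on all pairs.
        energy += w * d ** attraction / attraction - d ** repulsion / repulsion
    return energy

# Hypothetical two-node network with a single link of weight 2.
weights = {("a", "b"): 2.0}
e_near = layout_energy({"a": (0.0, 0.0), "b": (0.5, 0.0)}, weights)  # -0.25
e_unit = layout_energy({"a": (0.0, 0.0), "b": (1.0, 0.0)}, weights)  # 0.0
e_far = layout_energy({"a": (0.0, 0.0), "b": (2.0, 0.0)}, weights)   # 2.0
```

Under this assumed form, changing the two exponents changes how strongly linked nodes cluster relative to how evenly all nodes spread out, which is the kind of variation the parameter comparison in the abstract examines.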
Bibliometric visualization using VOSviewer - Ludo Waltman
Presentation at the workshop Research Output & Impact – New Tools and Concepts, organized at the Technical University of Denmark, Lyngby, Denmark, September 14, 2017.
Large-scale visualization of science: Methods, tools, and applications - Ludo Waltman
Presentation at the International Workshop on Data-driven Science Mapping, organized on the occasion of the 60th anniversary of the Department of Library and Information Science at Yonsei University, Seoul, South Korea, June 3, 2017.
To ensure that publications are assigned to clusters in a meaningful way, we introduce the notion of stable clusters. Essentially, a cluster is stable if it is insensitive to small changes in the underlying data. Bootstrapping is used to make small changes in the data. It is shown that an accurate and detailed clustering cannot comprehensively cover all publications: publications that do not clearly belong to one of the main topics in a field cannot be assigned to a cluster.
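The notion of stability under bootstrapping can be sketched in Python as follows. The abstract does not specify the clustering method, what is resampled, or the stability criterion, so this example makes three assumptions: clusters are connected components of the citation network (a simple stand-in for a real clustering algorithm), citation links are resampled with replacement, and a cluster counts as recovered only if exactly the same set of publications reappears.

```python
import random

def connected_components(nodes, edges):
    """Cluster nodes by connected components, using union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return list(clusters.values())

def cluster_stability(nodes, edges, n_boot=200, seed=0):
    """Fraction of bootstrap samples in which each cluster reappears exactly."""
    rng = random.Random(seed)
    reference = connected_components(nodes, edges)
    hits = [0] * len(reference)
    for _ in range(n_boot):
        # Resample the citation links with replacement.
        sample = [rng.choice(edges) for _ in edges]
        boot = {frozenset(c) for c in connected_components(nodes, sample)}
        for i, cluster in enumerate(reference):
            if frozenset(cluster) in boot:
                hits[i] += 1
    return [(sorted(c), hits[i] / n_boot) for i, c in enumerate(reference)]

# Hypothetical toy citation network: a triangle and a single linked pair.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
stability = cluster_stability(nodes, edges, n_boot=100, seed=1)
```

A cluster with a low score depends on a few specific links; under the approach described above, publications in such clusters are candidates for being left unassigned.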
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an... - Nees Jan van Eck
Tutorial at the Workshop on Open Citations: Opportunities and Ongoing Developments, ISSI2019 conference, Sapienza University, Rome, Italy, September 2, 2019.
This document summarizes the research of Ludo Waltman on the field of research on research. It discusses algorithms and tools developed by Waltman's group like the Louvain and Leiden algorithms for community detection in networks. It also summarizes Waltman's work analyzing the landscape of science through Dimensions data and identifying the subset of publications focused on research on research. Finally, it shows term maps and analyses of research on research literature in areas like scientometrics, science and technology studies, and innovation studies.
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib... - Ludo Waltman
This document summarizes the evolving landscape of bibliometric data sources and opportunities for bibliometric visualization. It discusses how alternative data sources like Dimensions, Crossref, and OpenCitations Corpus provide more open citation data than traditional sources like Web of Science and Scopus. While coverage varies, Dimensions and Crossref provide reasonably complete publication and citation data. Discrepancies between sources are due to reference inaccuracies and inconsistencies in citation matching. VOSviewer software supports network analysis and visualization using multiple data sources. The document calls for expanding open citation indexing to further open science.
This document compares several bibliographic data sources and finds substantial discrepancies between them. It analyzes the coverage of publications and citations in Web of Science, Scopus, Dimensions, and Crossref. Dimensions and Scopus have the most complete coverage of publications, while Crossref is incomplete due to closed or missing citations. Pairwise comparisons reveal millions of citations that are unique to each source. The causes of discrepancies include reference inaccuracies, versioning issues, and different matching algorithms. Examples demonstrate problems caused by group authors and supplements in Web of Science.
Accuracy of citation data in Web of Science and Scopus - Nees Jan van Eck
This document analyzes the accuracy of citation data in Web of Science (WoS) and Scopus by comparing reference data from Elsevier journal publications to the reference data in WoS and Scopus. The analysis found inaccuracies in about 1% of references in both WoS and Scopus. For WoS, there were issues with missing and incorrect references. For Scopus, problems included duplicate publications and citation matching errors. There were also large discrepancies in citation counts between the two databases and over time for some highly-cited publications. The document concludes that citation data in both WoS and Scopus suffers from significant inaccuracies.
COVID-19 and its implications for the scholarly communication system - Ludo Waltman
COVID-19 has implications for the scholarly communication system by affecting dissemination, accessibility, quality control, and findability of research. The pandemic led to a large increase in preprint submissions as researchers sought rapid dissemination of COVID-19 findings. Preprint servers helped make research widely accessible. While traditional peer review was bypassed, some preprints received rapid or post-publication peer review. Open access publishing also increased accessibility. The value of preprint servers and next-generation search technologies was demonstrated, but literature databases not covering preprints may lose value. COVID-19 provides a strong argument for open access and shows preprints can play an important role in biomedical research dissemination.
Scholars’ Perceptions of Relevance in Bibliography-Based People Recommender S... - Ekaterina Olshannikova
Collaboration and social networking are increasingly important for academics, yet identifying relevant collaborators requires remarkable effort. While there are various networking services optimized for seeking similarities between the users, the scholarly motive of producing new knowledge calls for assistance in identifying people with complementary qualities. However, there is little empirical understanding of how academics perceive relevance, complementarity, and diversity of individuals in their profession and how these concepts can be optimally embedded in social matching systems. This paper aims to support the development of diversity-enhancing people recommender systems by exploring senior researchers’ perceptions of recommended other scholars at different levels on a similar–different continuum. To conduct the study, we built a recommender system based on topic modeling of scholars’ publications in the DBLP computer science bibliography. A study of 18 senior researchers comprised a controlled experiment and semi-structured interviewing, focusing on their subjective perceptions regarding relevance, similarity, and familiarity of the given recommendations, as well as participants’ readiness to interact with the recommended people. The study implies that the homophily bias (behavioral tendency to select similar others) is strong despite the recognized need for complementarity. While the experiment indicated consistent and significant differences between the perceived relevance of most similar vs. other levels, the interview results imply that the evaluation of the relevance of people recommendations is complex and multifaceted. Despite the inherent bias in selection, the participants could identify highly interesting collaboration opportunities on all levels of similarity.
Being an Open Scholar in a Connected World - Stian Håklev
This document discusses the benefits of open scholarship in a connected world. It argues that open access to research articles makes information more accessible to broader audiences, including the general public and students. When data and research notes are openly shared online, it can enable unexpected reuse and collaboration. However, the current academic publishing and reward systems may not fully incentivize open scholarship. The document calls for exploring new models of peer review, metrics of impact, and ways of publishing research to make the scholarly process more transparent and collaborative.
A systematic empirical comparison of different approaches for normalizing cit...Nees Jan van Eck
We address the question how citation-based bibliometric indicators can best be normalized to ensure fair comparisons between publications from different scientific fields and different years. In a systematic large-scale empirical analysis, we compare a traditional normalization approach based on a field classification system with three source normalization approaches. We pay special attention to the selection of the publications included in the analysis. Publications in national scientific journals, popular scientific magazines, and trade magazines are not included. Unlike earlier studies, we use algorithmically constructed classification systems to evaluate the different normalization approaches. Our analysis shows that a source normalization approach based on the recently introduced idea of fractional citation counting does not perform well. Two other source normalization approaches generally outperform the classification-system-based normalization approach that we study. Our analysis therefore offers considerable support for the use of source-normalized bibliometric indicators.
Advanced citation matching and large-scale cited reference extractionNees Jan van Eck
This document summarizes research comparing the accuracy of citation matching algorithms from Web of Science (WoS), CWTS, and iFQ. It also analyzes the accuracy of cited references extracted from WoS using Elsevier ScienceDirect as a benchmark. The research found that CWTS and iFQ algorithms had higher recall rates than WoS, with a small decrease in precision. Analysis of WoS cited references showed about 0.3% were missing, 0.2% had minor errors, and 0.1% had major errors when compared to the original publication references. Overall, WoS performed well but room for improvement was identified.
Bibliometric network analysis: Software tools, techniques, and an analysis o...Nees Jan van Eck
This document summarizes a presentation about bibliometric network analysis tools and techniques. It discusses two software tools, VOSviewer and CitNetExplorer, that are used to construct and visualize bibliometric networks. It also outlines various network analysis techniques, including layout algorithms, community detection methods, and a unified approach to mapping and clustering networks. Finally, it provides an analysis of the structure and evolution of the field of network science based on a large bibliometric dataset.
Large-scale analysis of bibliometric data sourcesNees Jan van Eck
This document summarizes a presentation about analyzing large bibliometric data sources. It discusses the speaker's background in bibliometrics and their research center CWTS. CWTS has access to large bibliographic databases and focuses on bibliometric and scientometric research. Software tools for constructing and analyzing bibliometric networks are described, including VOSviewer and CitNetExplorer which were developed by the speaker. Network analysis techniques like community detection and layout algorithms are also covered. Finally, the document analyzes the field of data science using bibliometric methods by identifying publications related to data and mapping the growth and structure of data-driven research fields.
Visual exploration of scientific literature using VOSviewer and CitNetExplorer - Nees Jan van Eck
Presentation at the International Conference on ICT enhanced Social Sciences and Humanities 2020, July 1, 2020.
It is essential for researchers to have an up-to-date understanding of the literature in their research field. However, keeping up with all relevant literature is highly time consuming. Bibliometric visualizations can support this task. These visualizations provide intuitive overviews of the literature in a research field, enabling researchers to obtain a better understanding of the structure and development of a field and to get an impression of the most significant contributions made in the field.
In this talk, I will give an introduction to two software tools for bibliometric visualization: VOSviewer (www.vosviewer.com) and CitNetExplorer (www.citnetexplorer.nl). VOSviewer is a popular tool for visualizing bibliometric networks of publications, authors, journals, and keywords. CitNetExplorer is a tool for the visualization and analysis of citation networks of scientific publications. I will pay special attention to applications of VOSviewer and CitNetExplorer in the social sciences and humanities, focusing in particular on the use of advanced text mining, network analysis, and visualization techniques for analyzing large amounts of textual data.
A new software tool for large-scale analysis of citation networks - Nees Jan van Eck
This document describes a new software tool called Citation Network Explorer that allows users to explore and visualize large-scale citation networks over time in a dynamic way. It summarizes the motivation for developing this tool, which is the limited availability of software that can handle the visualization of the evolution of science. The document then provides an overview of the tool's capabilities and demonstrates it on two sample citation network datasets, concluding with a list of references for related research.
This document summarizes a presentation on scientometric approaches to classification. It discusses:
- Bibliographic databases like Web of Science and Scopus and their coverage.
- Types of classification systems for scientific literature including mono-disciplinary vs multidisciplinary and journal-level vs publication-level classifications.
- The CWTS publication-level classification system which uses a fully algorithmic approach to cluster over 21 million publications into a hierarchical structure of disciplines, fields, and subfields.
- Applications of the CWTS classification system including field normalization, field delineation, research strength analysis, and identification of interdisciplinary areas.
- Studies that have evaluated aspects of the quality and accuracy of classification systems.
Network visualization: Fine-tuning layout techniques for different types of n... - Nees Jan van Eck
An important issue in network visualization is the problem of obtaining a good layout for a network. For a given network, which may be either weighted or unweighted, the problem is to position the nodes in the network in a two-dimensional space in such a way that an attractive layout is obtained. Many layout techniques have been proposed [1]. In the visualization of bibliometric networks, multidimensional scaling and the layout technique of Kamada and Kawai [2] have for instance been used a lot. More recently, the VOS (visualization of similarities) layout technique [3], implemented in our VOSviewer software (www.vosviewer.com) [4], is often used for bibliometric network visualization.
There is no layout technique that is generally considered to give optimal results. One reason for this is that comparisons between layouts produced by different techniques involve a lot of subjectiveness. Someone may consider one layout to be more attractive than another, but someone else may have an opposite opinion on this. In addition, the attractiveness of a layout may depend on the type of visualization that is needed. For instance, some layouts may be more attractive for interactive visualizations (e.g., in a software tool with zooming functionality), while other layouts may be more attractive for static visualizations. Furthermore, different types of networks may benefit from different layout techniques.
In recent studies [5, 6], the idea of parameterized layout techniques has been introduced. Parameterized layout techniques produce different types of layouts depending on the values chosen for their parameters. In this research, we present a comprehensive study of a parameterized version of our VOS layout technique. Two parameters are included. Like in [5], these are referred to as attraction and repulsion parameters. We compare the layouts obtained for different parameter values. Comparisons are made both subjectively using the VOSviewer software (i.e., which layout do we find most appealing?) and more objectively using so-called meta-criteria [6, 7]. Sensitivity to local optima is taken into account as well. Comparisons are made for all important types of bibliometric networks, in particular co-authorship, citation, co-citation, bibliographic coupling, and co-occurrence networks. Both smaller and larger networks are considered.
Bibliometric visualization using VOSviewer - Ludo Waltman
Presentation at the workshop Research Output & Impact – New Tools and Concepts, organized at Technical University Denmark. Lyngby, Denmark, September 14, 2017.
Large-scale visualization of science: Methods, tools, and applications - Ludo Waltman
Presentation at the International Workshop on Data-driven Science Mapping, organized on the occasion of the 60th anniversary of the Department of Library and Information Science at Yonsei University. Seoul, South Korea, June 3, 2017.
To ensure that publications are assigned to clusters in a meaningful way, we introduce the notion of stable clusters. Essentially, a cluster is stable if it is insensitive to small changes in the underlying data. Bootstrapping is used to make small changes in the data. It is shown that if we want to have an accurate and detailed clustering, we need to be satisfied with a clustering that doesn’t comprehensively cover all publications. Publications that do not clearly belong to one of the main topics in a field cannot be assigned to a cluster.
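The bootstrapping idea described above can be illustrated with a small sketch. It is not the authors' procedure: the clustering step is a stand-in (connected components of the resampled link set), and the publication identifiers, links, sample count, and function names are all hypothetical. It only shows the general pattern of resampling the data, reclustering, and measuring how often two publications end up in the same cluster.

```python
import random
from collections import defaultdict
from itertools import combinations


def connected_components(nodes, edges):
    """Toy clustering (an assumption): each connected component is a cluster."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    label = {}
    for start in nodes:
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in label:
                    label[nxt] = start
                    stack.append(nxt)
    return label


def co_assignment_frequency(nodes, edges, n_samples=200, seed=0):
    """Fraction of bootstrap samples in which each pair shares a cluster."""
    rng = random.Random(seed)
    together = defaultdict(int)
    for _ in range(n_samples):
        # Bootstrap step: resample the links with replacement.
        sample = [rng.choice(edges) for _ in edges]
        label = connected_components(nodes, sample)
        for a, b in combinations(sorted(nodes), 2):
            if label[a] == label[b]:
                together[(a, b)] += 1
    # Pairs that never share a cluster are simply absent from the result.
    return {pair: count / n_samples for pair, count in together.items()}


# Hypothetical data: two separate groups of publications.
nodes = ["p1", "p2", "p3", "q1", "q2", "q3"]
edges = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"),
         ("q1", "q2"), ("q2", "q3"), ("q1", "q3")]
freq = co_assignment_frequency(nodes, edges)
```

Pairs with a frequency close to 1 stay together under small changes in the data. Under this sketch, a publication whose frequencies are low for every partner would be a candidate for being left unassigned.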
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an... - Nees Jan van Eck
Tutorial at the Workshop on Open Citations: Opportunities and Ongoing Developments, ISSI2019 conference, Sapienza University, Rome, Italy, September 2, 2019.
This document summarizes the research of Ludo Waltman on the field of research on research. It discusses algorithms and tools developed by Waltman's group like the Louvain and Leiden algorithms for community detection in networks. It also summarizes Waltman's work analyzing the landscape of science through Dimensions data and identifying the subset of publications focused on research on research. Finally, it shows term maps and analyses of research on research literature in areas like scientometrics, science and technology studies, and innovation studies.
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib... - Ludo Waltman
This document summarizes the evolving landscape of bibliometric data sources and opportunities for bibliometric visualization. It discusses how alternative data sources like Dimensions, Crossref, and OpenCitations Corpus provide more open citation data than traditional sources like Web of Science and Scopus. While coverage varies, Dimensions and Crossref provide reasonably complete publication and citation data. Discrepancies between sources are due to reference inaccuracies and inconsistencies in citation matching. VOSviewer software supports network analysis and visualization using multiple data sources. The document calls for expanding open citation indexing to further open science.
This document compares several bibliographic data sources and finds substantial discrepancies between them. It analyzes the coverage of publications and citations in Web of Science, Scopus, Dimensions, and Crossref. Dimensions and Scopus have the most complete coverage of publications, while Crossref is incomplete due to closed or missing citations. Pairwise comparisons reveal millions of citations that are unique to each source. The causes of discrepancies include reference inaccuracies, versioning issues, and different matching algorithms. Examples demonstrate problems caused by group authors and supplements in Web of Science.
Accuracy of citation data in Web of Science and Scopus - Nees Jan van Eck
This document analyzes the accuracy of citation data in Web of Science (WoS) and Scopus by comparing reference data from Elsevier journal publications to the reference data in WoS and Scopus. The analysis found inaccuracies in about 1% of references in both WoS and Scopus. For WoS, there were issues with missing and incorrect references. For Scopus, problems included duplicate publications and citation matching errors. There were also large discrepancies in citation counts between the two databases and over time for some highly-cited publications. The document concludes that citation data in both WoS and Scopus suffers from significant inaccuracies.
COVID-19 and its implications for the scholarly communication system - Ludo Waltman
COVID-19 has implications for the scholarly communication system by affecting dissemination, accessibility, quality control, and findability of research. The pandemic led to a large increase in preprint submissions as researchers sought rapid dissemination of COVID-19 findings. Preprint servers helped make research widely accessible. While traditional peer review was bypassed, some preprints received rapid or post-publication peer review. Open access publishing also increased accessibility. The value of preprint servers and next-generation search technologies was demonstrated, but literature databases not covering preprints may lose value. COVID-19 provides a strong argument for open access and shows preprints can play an important role in biomedical research dissemination.
Scholars’ Perceptions of Relevance in Bibliography-Based People Recommender S... - Ekaterina Olshannikova
Collaboration and social networking are increasingly important for academics, yet identifying relevant collaborators requires remarkable effort. While there are various networking services optimized for seeking similarities between the users, the scholarly motive of producing new knowledge calls for assistance in identifying people with complementary qualities. However, there is little empirical understanding of how academics perceive relevance, complementarity, and diversity of individuals in their profession and how these concepts can be optimally embedded in social matching systems. This paper aims to support the development of diversity-enhancing people recommender systems by exploring senior researchers’ perceptions of recommended other scholars at different levels on a similar–different continuum. To conduct the study, we built a recommender system based on topic modeling of scholars’ publications in the DBLP computer science bibliography. A study of 18 senior researchers comprised a controlled experiment and semi-structured interviewing, focusing on their subjective perceptions regarding relevance, similarity, and familiarity of the given recommendations, as well as participants’ readiness to interact with the recommended people. The study implies that the homophily bias (behavioral tendency to select similar others) is strong despite the recognized need for complementarity. While the experiment indicated consistent and significant differences between the perceived relevance of most similar vs. other levels, the interview results imply that the evaluation of the relevance of people recommendations is complex and multifaceted. Despite the inherent bias in selection, the participants could identify highly interesting collaboration opportunities on all levels of similarity.
Being an Open Scholar in a Connected World - Stian Håklev
This document discusses the benefits of open scholarship in a connected world. It argues that open access to research articles makes information more accessible to broader audiences, including the general public and students. When data and research notes are openly shared online, it can enable unexpected reuse and collaboration. However, the current academic publishing and reward systems may not fully incentivize open scholarship. The document calls for exploring new models of peer review, metrics of impact, and ways of publishing research to make the scholarly process more transparent and collaborative.
This document discusses challenges with the current scientific publishing system and proposes a vision for next generation scientific publishing (NGSP). Some key problems include retractions due to misconduct, lack of reproducibility, and non-reusable data and methods. NGSP would feature transparent and computable data and methods, open annotation of narratives and objects, and no restrictions on text mining or remixing. It would move information more quickly and allow verification through an open, service-oriented system without walled gardens. Taking NGSP forward will require collaboration across stakeholders in research communications.
An empirical examination of the structure of scholarship in the Society for the Psychological Study of Social Issues (SPSSI) grounded in network analyses of shared citations (bibliographic couplings)
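Bibliographic coupling, as mentioned above, links two publications by the references they share. A minimal sketch of how coupling strengths could be computed follows. The publication labels, reference identifiers, and function name are hypothetical, and the study itself may weight or normalize the links differently.

```python
from itertools import combinations

# Hypothetical reference lists: publication -> set of cited works.
references = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3", "r4"},
    "C": {"r5"},
}


def coupling_strengths(refs):
    """Return {(pub1, pub2): number of shared references} for coupled pairs."""
    strengths = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:  # keep only pairs that share at least one reference
            strengths[(a, b)] = shared
    return strengths


links = coupling_strengths(references)
```

Here A and B share two references and are therefore coupled, while C shares none with either and stays unlinked.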
OpenMinTeD: Making Sense of Large Volumes of Data - openminted_eu
The document discusses making scientific content more accessible and useful through text and data mining. It notes that the global research community generates over 1.5 million new articles per year but many are never read or cited. Emerging solutions like machine reading, understanding and predicting can help structure and mine textual data to extract meaningful insights. The OpenMinted project aims to establish an open text and data mining platform and infrastructure for researchers to collaboratively work with scientific sources. It outlines challenges around content, services and processing as well as main routes to make content more accessible through metadata, transfer protocols and licensing. The project involves various partners and use cases across domains like scholarly communication, life sciences, agriculture and social sciences.
What is Open Science and what role does it play in Development? - Leslie Chan
The talk begins with a review of current understanding of open science and its alleged role in providing new opportunities for addressing long-standing development challenges. I then introduce the newly launched Open and Collaborative Science in Development Network, funded by IDRC Canada, and in collaboration with iHub Nairobi, Kenya. The rationale, funding modalities, and the short and long term objectives of the network will be discussed.
This document summarizes a presentation about using NodeXL to visualize social network data from Twitter, MapBox, and other sources. It introduces NodeXL as a free and easy-to-use tool for social network analysis built on Excel. The presentation discusses key concepts in social network theory like nodes, edges, centrality, and clustering. It then shows examples of using NodeXL to analyze innovation networks and visualize data from CrunchBase on capital flows between startups, accelerators, and investors. The purpose is to demonstrate how network analysis and visualization with NodeXL can provide insights into clusters and connections in social and innovation ecosystems.
Quantitative Research Methods
1. What is scientific research? What is quantitative research?
2. Why do we need research?
3. Who is conducting the research?
4. What is the research process?
5. What is the language of research?
Network analyses of psychological science - Kevin Lanning
The document analyzes citation networks in psychology using network science methods. It finds that (1) citation networks form small worlds where ideas spread rapidly, (2) different centrality measures reveal influential individuals and ideas as well as scholarly communities, and (3) while proximity in networks is ambiguous, distance provides clearer insights. The analysis is preliminary and larger datasets/advanced methods may provide deeper understanding of influence and relationships in psychological scholarship.
Centrality in Time-Dependent Networks - Mason Porter
My slides for my keynote talk at the NetSci 2018 (#NetSci2018) conference in Paris, France (June 2018). This talk will take place on Thursday 13 June in the morning.
This document discusses the potential for integrating network science concepts and tools into undergraduate curriculum. It describes how network analysis can be applied to various academic subjects from biology to history. Sample modules are presented that were implemented in sociology and theater classes at Suffolk University, where students mapped networks in texts to better understand characters and plot lines. The document outlines different delivery methods for network analysis education, from workshops to full courses. It also provides references to network analysis software, tutorials, course materials and introductory books that make the field accessible.
The Web in Science and Research: A tour through four topics - Open Knowledge Maps
Slides for my talk at the KMi Podium on July 24, 2012. The video can be found here: http://stadium.open.ac.uk/stadia/preview.php?s=29&whichevent=2011&option=both&record=0
The document discusses correlating scholarly networks derived from citations and social networks derived from tweets mentioning academic papers. It presents the motivation to study this correlation and describes collecting over 17,000 tweets referencing papers from ArXiv.org. Networks are constructed connecting papers mentioned by the same users, with edge weights based on time between tweets. Several network analysis metrics and case studies are discussed, finding multi-disciplinary papers are most discussed in both networks and core-periphery community structures. Areas for further work include integrating multiple social networks and modeling network dynamics over time.
This document summarizes research conducted to identify emerging research fields at a university through community detection in scientific collaboration networks. The researchers created a scientific collaboration network using publication and grant data from 2011-2015, detected communities using the Louvain method, and identified keywords and topics for each community to determine emerging fields. They analyzed faculty profiles and conducted interviews to understand community characteristics and perceptions. The results provide insight into the composition and structure of emerging interdisciplinary research fields at the university.
This document summarizes a presentation about the history and future of digital repositories and text analysis tools. It discusses how text collections have evolved from non-digital and dispersed, to digitized but dispersed, to full text collections in repositories, and finally to texts organized into corpora. However, many challenges remain, such as incomplete digitization and a lack of tools for combining close and distant reading. The document envisions a future of distributed infrastructure that connects dispersed data and tools. However, careful interpretation of results will still be needed to understand what texts are included or missing and make valid claims.
Evolving and emerging scholarly communication services in libraries: public a... - Claire Stewart
This document provides an overview of a guest lecture about evolving scholarly communication services in libraries and their role in supporting public access compliance and assessing research impact. It discusses challenges libraries face in helping researchers comply with public access policies from funders. It also explores metrics and indicators used to measure research impact, noting limitations, and how libraries can help address this complex issue by leveraging their expertise in managing scholarly information and data.
This document provides an overview of Katrien Verbert's research topics and career. It summarizes her work from 2003 to the present on topics like flexible content reuse, semi-automatic content assembly, recommender systems, and learning analytics visualization. The document outlines her positions at Vrije Universiteit Brussel and various research visits and collaborations. It also includes timelines of her research topics and lists some of her key publications.
Evolution and state-of-the art of Altmetric research: Insights from network a... - Aravind Sesagiri Raamkumar
Evolution and state-of-the art of Altmetric research: Insights from network analysis and altmetric analysis
Authors: Hiran Lathabai, Thara Prabhakaran, Manoj Changat
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
The document discusses the role and importance of theory in research. It states that theory provides guidance for research by pointing to potentially fruitful areas of study. Theory helps narrow the scope of research by selecting relevant variables and relationships to examine. Research both tests and develops theory in a reciprocal relationship - theory informs research and empirical findings refine theory. Deductive research tests existing theories while inductive research builds theory from data. Overall, theory plays a key role in framing research questions and interpreting results.
Crossref as a source of open bibliographic metadata - Nees Jan van Eck
Presentation at the 18th International Conference of the International Society for Scientometrics and Informetrics, July 12-15, 2021.
Several initiatives have been taken to promote the open availability of bibliographic metadata of scholarly publications in Crossref. We present an up-to-date overview of the availability of six metadata elements in Crossref: reference lists, abstracts, ORCIDs, author affiliations, funding information, and license information. Our analysis shows that the availability of these metadata elements has improved over time. However, it also shows that many publishers need to make additional efforts to realize full openness of bibliographic metadata. To illustrate the value of open metadata, we use the metadata in Crossref to construct and visualize a large citation network of scholarly journals.
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera... - Nees Jan van Eck
This document provides an overview of bibliometric visualizations using VOSviewer software. It discusses the explosive growth of scientific literature and available bibliographic data sources. VOSviewer allows visualization of co-authorship, citation-based, and term co-occurrence networks. Hands-on demonstrations are provided for creating co-authorship maps, citation maps of publications and journals, and term maps. Bibliometric maps provide insights into the structure and relationships within a research field.
A scientometric perspective on university ranking - Nees Jan van Eck
This document discusses responsible use of university rankings. It summarizes a presentation given by Nees Jan van Eck of CWTS about their Leiden university ranking methodology. The presentation outlines principles for responsible ranking design, interpretation, and use. It emphasizes using transparent, field-normalized bibliometric indicators to measure research impact rather than composite scores. Comparisons should consider size and subject differences between universities. Ranks are less important than underlying indicator values. Non-research metrics are also important to consider.
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking - Nees Jan van Eck
This document summarizes a presentation about the CWTS Leiden Ranking, a university ranking produced by the Centre for Science and Technology Studies (CWTS) at Leiden University. It provides details about CWTS, the Leiden Ranking methodology, indicators, selection of universities, and differences from other rankings. The presentation emphasizes the importance of using bibliometric indicators, fractional counting of publications, and focusing on highly cited publications. It concludes with principles for the responsible use and interpretation of rankings to avoid simplistic comparisons and ensure rankings are used appropriately.
Using full-text data to create improved term maps - Nees Jan van Eck
1) The document discusses using full-text data rather than just metadata to create improved term maps for visualizing topics in scientific literature.
2) It compares different approaches for creating term maps using full-text data from publications in the Journal of Informetrics, including using titles/abstracts vs full text, binary vs full counting of term co-occurrences, and mapping at the publication level vs paragraph level.
3) The results show that full-text data yields richer maps than just titles and abstracts, and that full counting is preferable to binary counting when using full text. Paragraph-level maps provide more fine-grained structure but areas may not always represent literature topics.
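The difference between binary and full counting in the list above can be made concrete with a short sketch. It rests on an assumption that the summary does not confirm: under full counting, a document contributes the product of the two terms' occurrence counts to their co-occurrence weight, while under binary counting it contributes 1 whenever both terms are present. The function name and sample documents are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(documents, full_counting):
    """Sum co-occurrence weights of term pairs over all documents."""
    weights = Counter()
    for terms in documents:
        counts = Counter(terms)
        for a, b in combinations(sorted(counts), 2):
            if full_counting:
                # Assumed definition: every occurrence pair is counted.
                weights[(a, b)] += counts[a] * counts[b]
            else:
                # Binary counting: only presence in the document matters.
                weights[(a, b)] += 1
    return weights


# Hypothetical documents, already reduced to lists of extracted terms.
docs = [
    ["citation", "network", "citation", "map"],
    ["citation", "map"],
]
binary = cooccurrence(docs, full_counting=False)
full = cooccurrence(docs, full_counting=True)
```

With these two documents, the pair ("citation", "map") has weight 2 under binary counting and 3 under the assumed full counting, because "citation" occurs twice in the first document. Repeated occurrences are much more common in full texts than in titles and abstracts, so the choice of counting method has a larger effect there.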
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking - Nees Jan van Eck
The document summarizes the CWTS Leiden Ranking, which provides bibliometric indicators to rank the scientific performance of universities based on Web of Science data. It uses an advanced methodology including: (1) percentile-based indicators to account for skewed citation distributions, (2) exclusion of non-core publications, and (3) field normalization through a publication-level classification system and fractional counting of co-authored publications. This methodology differs from other rankings by solely focusing on scientific performance without aggregating other dimensions and not relying on survey or self-reported data.
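Fractional counting and a percentile-based indicator can be sketched together in a few lines. This is a simplified illustration, not the Leiden Ranking's actual procedure: each publication is split equally over the universities on its byline, the "top 10%" flag is supplied as ready-made input rather than derived from citation data, and the university names and records are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (universities on the byline, is in the top 10% most cited)
publications = [
    (["Uni A"], True),
    (["Uni A", "Uni B"], True),
    (["Uni B"], False),
    (["Uni A", "Uni B", "Uni C", "Uni D"], False),
]

total = defaultdict(float)  # fractionally counted publications
top = defaultdict(float)    # fractionally counted top 10% publications

for universities, is_top in publications:
    share = 1.0 / len(universities)  # equal split (a simplifying assumption)
    for uni in universities:
        total[uni] += share
        if is_top:
            top[uni] += share

# Share of each university's output that belongs to the top 10%.
pp_top = {uni: top[uni] / total[uni] for uni in total}
```

Uni A and Uni B both end up with 1.75 fractionally counted publications, but their top 10% shares differ (about 0.86 versus 0.29). This kind of contrast is what a percentile-based indicator exposes and what a raw publication count hides.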
The debris of the ‘last major merger’ is dynamically young - Sérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the ‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space, because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia DR3 have positive caustic velocities, making them fundamentally different from the phase-mixed chevrons found in simulations at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based on a simple phase-mixing model, the observed number of caustics is consistent with a merger that occurred 1–2 Gyr ago. We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data 1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’ did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within the last few Gyr, consistent with the body of work surrounding the VRM.
A thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Authoring a personal GPT for your research and practice: How we created the Q... - Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants who have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and a slide deck that participants will be able to use to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, only for trying out personal GPTs during it.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
PPT on Direct Seeded Rice presented at the three-day 'Training and Validation Workshop on Modules of Climate Smart Agriculture (CSA) Technologies in South Asia' on April 22, 2024.
Describing and Interpreting an Immersive Learning Case with the Immersion Cub... - Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
Immersive Learning That Works: Research Grounding and Paths ForwardLeonel Morgado
We will metaverse into the essence of immersive learning, into its three dimensions and conceptual models. This approach encompasses elements from teaching methodologies to social involvement, through organizational concerns and technologies. Challenging the perception of learning as knowledge transfer, we introduce a 'Uses, Practices & Strategies' model operationalized by the 'Immersive Learning Brain' and ‘Immersion Cube’ frameworks. This approach offers a comprehensive guide through the intricacies of immersive educational experiences and spotlighting research frontiers, along the immersion dimensions of system, narrative, and agency. Our discourse extends to stakeholders beyond the academic sphere, addressing the interests of technologists, instructional designers, and policymakers. We span various contexts, from formal education to organizational transformation to the new horizon of an AI-pervasive society. This keynote aims to unite the iLRN community in a collaborative journey towards a future where immersive learning research and practice coalesce, paving the way for innovative educational research and practice landscapes.
The technology uses reclaimed CO₂ as the dyeing medium in a closed loop process. When pressurized, CO₂ becomes supercritical (SC-CO₂). In this state CO₂ has a very high solvent power, allowing the dye to dissolve easily.
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Intermediacy of publications
1. Intermediacy of publications
Lovro Šubelj¹, Ludo Waltman², Vincent Traag², and Nees Jan van Eck²
¹ Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia
² Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
17th International Conference on Scientometrics & Informetrics
Rome, Italy, September 4, 2019
2. Introduction
• Citation networks offer insights into the development of science
• Historiography: tracing the development of a scientific field
• What publications have been important in that development?
• We propose a new measure called intermediacy
3. Existing approaches
• Main path analysis
– Relies on traversal counts of citation links
– Selects citation path(s) that have a high sum of traversal counts
– Rewards relatively long paths
– Conceptually unclear, not always clear results
• Shortest or longest paths
– Shortest paths typically do not include most important publications
– Longest paths typically include many irrelevant publications
4. Main idea of intermediacy
• Given a citation network with a source (s) and a target (t) publication
• Intermediacy relies on citation links to identify important intermediate publications
• Important intermediate publications should be well connected
• The more important the role of a publication in connecting source s to target t, the higher the intermediacy of that publication
5. Illustration
• Only some citations are active
• Each citation is active with probability p
• Is there a path (of active citations) through a publication?
6. Formal notation
• Each citation is active with probability p
• Intermediacy is the probability that publication u lies on a path from s to t
• Intermediacy of publication u from s to t is
$\phi_u = \Pr(X_{st}^{u}) = \Pr(X_{su}) \Pr(X_{ut})$
where $\Pr(X_{ij})$ is the probability that there is a path from i to j
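The definition above can be checked by brute force on a very small network. The following is a minimal sketch, not the authors' implementation: the function names, the toy network, and the convention that each link is a (citing, cited) pair are assumptions made for illustration. It enumerates every active/inactive pattern of the citation links, so it is only feasible for a handful of links.

```python
from itertools import product


def has_path(edges, src, dst):
    """Depth-first search: is dst reachable from src over the given directed links?"""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def exact_intermediacy(edges, s, t, u, p):
    """Sum the probability of every link pattern in which u lies on an active s-t path.

    Exponential in the number of links, so intended for toy networks only.
    """
    total = 0.0
    for pattern in product([True, False], repeat=len(edges)):
        active = [e for e, on in zip(edges, pattern) if on]
        if has_path(active, s, u) and has_path(active, u, t):
            k = len(active)
            total += p**k * (1 - p) ** (len(edges) - k)
    return total


# Hypothetical toy network: each pair is (citing publication, cited publication).
toy_edges = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "a")]
print(exact_intermediacy(toy_edges, "s", "t", "a", 0.5))  # 0.3125
print(exact_intermediacy(toy_edges, "s", "t", "b", 0.5))  # 0.125
```

In this toy network the values agree with the product form: for a, Pr(X_sa) = 1 - (1 - p)(1 - p²) = 0.625 and Pr(X_at) = 0.5; for b, Pr(X_sb) = 0.5 and Pr(X_bt) = 0.25.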
7. How does intermediacy behave?
For p → 0, shortest paths are most important
For p → 1, the number of independent paths is most important
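A short worked example may make the two limits concrete. The two configurations below are hypothetical and stylized, not taken from the slides: publication u lies on a single path with a links from s to u and b links from u to t, while publication v is reached by k link-disjoint paths of a links each from s and has k link-disjoint paths of b links each to t.

```latex
\begin{align*}
\phi_u &= p^{a}\, p^{b} = p^{a+b} \\
\phi_v &= \bigl[1 - (1 - p^{a})^{k}\bigr]\,\bigl[1 - (1 - p^{b})^{k}\bigr] \\
p \to 0:&\quad \phi_v \approx k^{2}\, p^{a+b}
  \quad\text{(the exponent, i.e. the path length } a + b\text{, dominates)} \\
p = 1 - \varepsilon \to 1:&\quad 1 - \phi_v \approx (a^{k} + b^{k})\,\varepsilon^{k}
  \quad\text{(the number of independent paths } k \text{ dominates)}
\end{align*}
```

Under these assumptions, a publication on a shorter path always wins for sufficiently small p, whatever k is, and a publication with more independent paths always wins for p sufficiently close to 1, whatever the path lengths are.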
8. Properties of intermediacy
• Path addition and contraction increase intermediacy
• Intuition: path from source to target becomes “easier”
11. Approximate algorithm
• Simple Monte Carlo simulation algorithm by sampling
• Runs in linear time using probabilistic depth-first search
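One way such a sampling scheme could look is sketched below. This is an illustrative reading of the two bullets above, not the code from the authors' repository: the function name, its parameters, and the (citing, cited) link convention are assumptions. In each sample, a depth-first search from s and a reverse depth-first search from t follow each link only if it is active, with the state of a link drawn lazily the first time it is examined, so one sample costs time linear in the number of links.

```python
import random


def estimate_intermediacy(edges, s, t, p, samples=10000, seed=0):
    """Monte Carlo estimate of the intermediacy of every publication (a sketch)."""
    rng = random.Random(seed)
    succ, pred = {}, {}
    for a, b in edges:
        succ.setdefault(a, []).append(b)  # a cites b
        pred.setdefault(b, []).append(a)  # b is cited by a
    counts = {node: 0 for edge in edges for node in edge}

    for _ in range(samples):
        state = {}  # lazily sampled active/inactive state of each link

        def is_active(edge):
            if edge not in state:
                state[edge] = rng.random() < p
            return state[edge]

        def reach(start, neighbours, as_edge):
            """Depth-first search that only follows active links."""
            seen, stack = {start}, [start]
            while stack:
                node = stack.pop()
                for nxt in neighbours.get(node, []):
                    if nxt not in seen and is_active(as_edge(node, nxt)):
                        seen.add(nxt)
                        stack.append(nxt)
            return seen

        from_s = reach(s, succ, lambda a, b: (a, b))  # reachable from s
        to_t = reach(t, pred, lambda a, b: (b, a))    # can reach t
        for node in from_s & to_t:  # publications on an active s-t path
            counts[node] += 1

    return {node: c / samples for node, c in counts.items()}


# Hypothetical toy network: each pair is (citing publication, cited publication).
toy_edges = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "a")]
print(estimate_intermediacy(toy_edges, "s", "t", p=0.5, samples=20000))
```

For this toy network the exact values at p = 0.5 are 0.3125 for a and 0.125 for b, so the estimates should land close to those numbers, with the error shrinking as the number of samples grows.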
12. Use case: community detection in scientometrics
Source: Klavans & Boyack (2017), Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge?, JASIST, 68(4), 984-998.
Target: Newman & Girvan (2004), Finding and evaluating community structure in networks, Phys. Rev. E, 69(2), 026113.
14. Conclusions
• Intermediacy as a new measure of importance of publications
• Conceptually clear and provable behavior in extreme cases
• Favors short paths and many independent paths
• Shows promising results in case studies
• Future work:
– Implementation in tool
– Applicability to other types of networks
16. Questions?
Lovro Šubelj
University of Ljubljana
lovro.subelj@fri.uni-lj.si
http://lovro.lpt.fri.uni-lj.si
Vincent Traag
Leiden University
v.a.traag@cwts.leidenuniv.nl
www.traag.net
Ludo Waltman
Leiden University
waltmanlr@cwts.leidenuniv.n
www.ludowaltman.nl
Nees Jan van Eck
Leiden University
ecknjpvan@cwts.leidenuniv.n
www.neesjanvaneck.nl
Paper available on arXiv: arxiv.org/abs/1812.08259
Code available on GitHub: github.com/lovre/intermediacy