SlideShare a Scribd company logo
Large-scale analysis of bibliometric
networks
Nees Jan van Eck
Centre for Science and Technology Studies (CWTS), Leiden University
International Conference on Data-driven Discovery:
When Data Science Meets Information Science
Beijing, China, June 20, 2016
Bibliographic databases: ‘Big data’
1
Web of Science Scopus
Journals 12,000 20,000
Publications 45 million 35 million
Citations 1 billion 0.9 billion
Bibliometric networks
2
Web of
Science
Scopus
Citation network
of pubs / authors / journals
Co-authorship network
of authors / organizations
Co-citation network
of pubs / authors / journals
Co-occurrence network
of keywords / terms
Bibliographic coupling network
of pubs / authors / journals
Bibliographic
database
Outline
• Software tools
• Network analysis techniques
• Analysis of data science
3
Software tools
4
Software tools
• VOSviewer (www.vosviewer.com)
– Tool for constructing and visualizing bibliometric networks
• CitNetExplorer (www.citnetexplorer.nl)
– Tool for visualizing and analyzing citation networks of
publications
• Both tools have been developed together
with my colleague Ludo Waltman 5
VOSviewer
6
VOSviewer: Overview
• Software tool for visualizing (bibliometric) networks
• Built-in support for popular bibliographic databases
• Text mining functionality
• Layout and clustering techniques
• Advanced visualization features:
– Smart labeling algorithm
– Overlay visualizations
– Density visualizations (‘heat map’)
• Users:
– Researchers
– Professional users (e.g., universities, libraries, funders,
publishers)
7
Map of university co-authorship
network
8
Map of journal citation network
9
CitNetExplorer
10
• Any type of bibliometric
network
• Co-authorship, direct citations,
co-citation, and bibliographic
coupling
• Time dimension is ignored
• Networks of at most ~10,000
nodes are supported
• Only citation networks of
publications
• Direct citation between
publications
• Time dimension is explicitly
considered
• Millions of publications are
supported
11
VOSviewer CitNetExplorer
Network
analysis
techniques
12
Network analysis techniques
13
Layout:
• Assigning the nodes in a network to
locations in a (usually 2d) space
(a.k.a. mapping)
• Visualization of similarities (VOS)
Clustering:
• Partitioning the nodes in a network
into a number of groups (a.k.a.
community detection)
• Weighted modularity
• Smart local moving algorithm
1414
Clustering can be seen as mapping
in a restricted space
1515
Clustering can be seen as mapping
in a restricted space
Unified approach to mapping and
clustering
Minimize
where
n: number of nodes in the network
m: total weight of all edges in the network
Aij: weight of edge between nodes i and j
ki: total weight of all edges of node i
16
 

ji
ij
ji
ijij
ji
n ddA
kk
m
xxQ 2
1
2
),,( 
Mapping
xi: vector denoting the location
of node i in a p-dimensional
space


p
k
jkikjiij xxxxd
1
2
)(
Clustering
xi: integer denoting the
community to which node i
belongs
: resolution parameter






ji
ji
ij
xx
xx
d
if1
if0

Smart local moving algorithm
17
Q = 0.4198
Q = 0.3791
Reduced
network
Local moving
heuristic in
subnetworks
Local moving heuristic
Original
network
Algorithmically constructed
classification system of science
• 17.8 million publications from the period 2000–
2015 indexed in Web of Science
• 282.4 million citation relations
• Classification system of 3 hierarchical levels:
– 27 broad disciplines
– 817 fields
– 4,113 subfields
18
Breakdown of scientific literature into
817 fields
19
Social sciences
and humanitiesBiomedical and
health sciences
Life and earth
sciences
Mathematics and
computer science
Physical
sciences and
engineering
Publications in scientometrics
subfield
20
Time-line map of highly cited
scientometrics publications
21
Analysis of
data science
22
What is data science?
• Empirical operationalization of data science based
on publications with ‘data’ in title or abstract
23
Wikipedia: “Data Science is an interdisciplinary field
about processes and systems to extract knowledge
or insights from data … which is a continuation of
some of the data analysis fields such as statistics,
data mining, and predictive analytics”
LCDS: “Data Science … deals with finding, analyzing
and validating complex patterns in data. Data
Science methods are indispensable for maintaining a
competitive edge in all disciplines in science”
Growth of data-driven research
24
0%
2%
4%
6%
8%
10%
12%
14%
16%
18%
20%
1990 1995 2000 2005 2010 2015
Percentageofpublications
% 'data' publications % 'theory' publications
Breakdown of scientific literature into
817 fields
25
Social sciences
and humanitiesBiomedical and
health sciences
Life and earth
sciences
Mathematics and
computer science
Physical
sciences and
engineering
Data-driven nature of different
scientific fields
26
Social sciences
and humanitiesBiomedical and
health sciences
Life and earth
sciences
Mathematics and
computer science
Physical
sciences and
engineering
% pub. with ‘data’ in title or abstract
Data-driven nature of different
scientific fields
27
artificial
intelligence
statisticsbioinformatics
neuroimaging pattern
recognition
astronomy
earth
water
climate
remote
sensing
nutrition
obesity
addiction
accident
analysis
% pub. with ‘data’ in title or abstract
Data science fields (at least 25% ‘data’
publications)
28
Social sciences
and humanitiesBiomedical and
health sciences
Life and earth
sciences
Mathematics and
computer science
Physical
sciences and
engineering
Term map of data science fields
29
China’s publication output in data
science fields
30
Social sciences
and humanitiesBiomedical and
health sciences
Life and earth
sciences
Mathematics and
computer science
Physical
sciences and
engineering
China’s publication output in data
science fields
31
artificial
intelligence
pattern
recognition
high
energy
earth
atmospheres
weather
remote
sensing
Chinese institutes with most publications
in data science fields (2011-2015)
• Chinese Academy of Sciences
• Peking University
• Tsinghua University
• China University of Geosciences
• Zhejiang University
• Nanjing University
• Shanghai Jiao Tong University
• University of Science and Technology of China
• Beijing Normal University
• University of Hong Kong
32
CAS publication output in data
science fields
33
earth
atmospheres
weather
remote
sensing
vegetation
astronomy
high energy
Term map based on CAS publications in
data science fields
34
CAS (Beijing Branch) publication
output in data science fields
35
astronomy
earth
atmospheres
weather
remote
sensing
vegetation
high energy
CAS (Shanghai Branch) publication
output in data science fields
36
bioinformatics
genetics
astronomy
nuclear
Do it yourself!
37
www.vosviewer.com www.citnetexplorer.nl
Thank you for your attention!
38

More Related Content

What's hot

Bibliometric network analysis: Software tools, techniques, and an analysis o...
Bibliometric network analysis: Software tools, techniques, and an analysis o...Bibliometric network analysis: Software tools, techniques, and an analysis o...
Bibliometric network analysis: Software tools, techniques, and an analysis o...
Nees Jan van Eck
 
Multiple perspectives on bibliometric data
Multiple perspectives on bibliometric dataMultiple perspectives on bibliometric data
Multiple perspectives on bibliometric data
Nees Jan van Eck
 
VOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer TutorialVOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer Tutorial
Nees Jan van Eck
 
A new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networksA new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networks
Nees Jan van Eck
 
Intermediacy of publications
Intermediacy of publicationsIntermediacy of publications
Intermediacy of publications
Nees Jan van Eck
 
A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...
Nees Jan van Eck
 
Advanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extractionAdvanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extraction
Nees Jan van Eck
 
Visual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorerVisual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorer
Nees Jan van Eck
 
Large-scale visualization of science
Large-scale visualization of scienceLarge-scale visualization of science
Large-scale visualization of science
Nees Jan van Eck
 
Using full-text data to create improved term maps
Using full-text data to create improved term mapsUsing full-text data to create improved term maps
Using full-text data to create improved term maps
Nees Jan van Eck
 
Cluster stability
Cluster stabilityCluster stability
Cluster stability
Nees Jan van Eck
 
On cluster stability
On cluster stabilityOn cluster stability
On cluster stability
Nees Jan van Eck
 
Getting started with CitNetExplorer
Getting started with CitNetExplorerGetting started with CitNetExplorer
Getting started with CitNetExplorer
Nees Jan van Eck
 
Visualizing science based on open data sources
Visualizing science based on open data sourcesVisualizing science based on open data sources
Visualizing science based on open data sources
Nees Jan van Eck
 
Scientometric approaches to classification
Scientometric approaches to classificationScientometric approaches to classification
Scientometric approaches to classification
Nees Jan van Eck
 
Bibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewerBibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewer
Ludo Waltman
 
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Nees Jan van Eck
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
Nees Jan van Eck
 
Large-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applicationsLarge-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applications
Ludo Waltman
 
The landscape of research on research
The landscape of research on researchThe landscape of research on research
The landscape of research on research
Ludo Waltman
 

What's hot (20)

Bibliometric network analysis: Software tools, techniques, and an analysis o...
Bibliometric network analysis: Software tools, techniques, and an analysis o...Bibliometric network analysis: Software tools, techniques, and an analysis o...
Bibliometric network analysis: Software tools, techniques, and an analysis o...
 
Multiple perspectives on bibliometric data
Multiple perspectives on bibliometric dataMultiple perspectives on bibliometric data
Multiple perspectives on bibliometric data
 
VOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer TutorialVOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer Tutorial
 
A new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networksA new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networks
 
Intermediacy of publications
Intermediacy of publicationsIntermediacy of publications
Intermediacy of publications
 
A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...
 
Advanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extractionAdvanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extraction
 
Visual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorerVisual exploration of scientific literature using VOSviewer and CitNetExplorer
Visual exploration of scientific literature using VOSviewer and CitNetExplorer
 
Large-scale visualization of science
Large-scale visualization of scienceLarge-scale visualization of science
Large-scale visualization of science
 
Using full-text data to create improved term maps
Using full-text data to create improved term mapsUsing full-text data to create improved term maps
Using full-text data to create improved term maps
 
Cluster stability
Cluster stabilityCluster stability
Cluster stability
 
On cluster stability
On cluster stabilityOn cluster stability
On cluster stability
 
Getting started with CitNetExplorer
Getting started with CitNetExplorerGetting started with CitNetExplorer
Getting started with CitNetExplorer
 
Visualizing science based on open data sources
Visualizing science based on open data sourcesVisualizing science based on open data sources
Visualizing science based on open data sources
 
Scientometric approaches to classification
Scientometric approaches to classificationScientometric approaches to classification
Scientometric approaches to classification
 
Bibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewerBibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewer
 
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
Visualizing science using VOSviewer based on Crossref, Microsoft Academic, an...
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
 
Large-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applicationsLarge-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applications
 
The landscape of research on research
The landscape of research on researchThe landscape of research on research
The landscape of research on research
 

Similar to Large-scale analysis of bibliometric networks

AHM 2014: Governance and Cyberinfrastructure in the Earth System Sciences
AHM 2014: Governance and Cyberinfrastructure in the Earth System SciencesAHM 2014: Governance and Cyberinfrastructure in the Earth System Sciences
AHM 2014: Governance and Cyberinfrastructure in the Earth System Sciences
EarthCube
 
What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care?
Robert Grossman
 
Scratchpads introductory presentation 45mins
Scratchpads introductory presentation   45minsScratchpads introductory presentation   45mins
Scratchpads introductory presentation 45mins
Dimitrios Koureas
 
Networks, Deep Learning (and COVID-19)
Networks, Deep Learning (and COVID-19)Networks, Deep Learning (and COVID-19)
Networks, Deep Learning (and COVID-19)
tm1966
 
Cyberinfrastructure for Einstein's Equations and Beyond
Cyberinfrastructure for Einstein's Equations and BeyondCyberinfrastructure for Einstein's Equations and Beyond
Cyberinfrastructure for Einstein's Equations and Beyond
University of Illinois at Urbana-Champaign
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
AbcdDcba12
 
The Science of Data Science
The Science of Data Science The Science of Data Science
The Science of Data Science
James Hendler
 
Digital Science: Reproducibility and Visibility in Astronomy
Digital Science: Reproducibility and Visibility in AstronomyDigital Science: Reproducibility and Visibility in Astronomy
Digital Science: Reproducibility and Visibility in Astronomy
Jose Enrique Ruiz
 
Enabling Data-Intensive Science Through Data Infrastructures
Enabling Data-Intensive Science Through Data InfrastructuresEnabling Data-Intensive Science Through Data Infrastructures
Enabling Data-Intensive Science Through Data Infrastructures
LIBER Europe
 
Beyond Meta-Data: Nano-Publications Recording Scientific Endeavour
Beyond Meta-Data: Nano-Publications Recording Scientific EndeavourBeyond Meta-Data: Nano-Publications Recording Scientific Endeavour
Beyond Meta-Data: Nano-Publications Recording Scientific Endeavour
KNOWeSCAPE2014
 
Descobrindo o tesouro escondido nos seus dados usando grafos.
Descobrindo o tesouro escondido nos seus dados usando grafos.Descobrindo o tesouro escondido nos seus dados usando grafos.
Descobrindo o tesouro escondido nos seus dados usando grafos.
Ana Appel
 
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
Paolo Missier
 
06 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.201406 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.2014
VinothkumaR Ramu
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
Herbert Van de Sompel
 
How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...
Andrea Scharnhorst
 
unit 1 DATA MINING.ppt
unit 1 DATA MINING.pptunit 1 DATA MINING.ppt
unit 1 DATA MINING.ppt
BREENAHICETSTAFFCSE
 
Building Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVFBuilding Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVF
Olga Scrivner
 
New ways to communicate in science: perspectives from biodiversity research
New ways to communicate in science: perspectives from biodiversity researchNew ways to communicate in science: perspectives from biodiversity research
New ways to communicate in science: perspectives from biodiversity research
Vince Smith
 
20200901 ECCB M. Kutmon
20200901 ECCB M. Kutmon20200901 ECCB M. Kutmon
20200901 ECCB M. Kutmon
Martina Summer-Kutmon
 
Network Science: Theory, Modeling and Applications
Network Science: Theory, Modeling and ApplicationsNetwork Science: Theory, Modeling and Applications
Network Science: Theory, Modeling and Applications
Biocomplexity Institute of Virginia Tech
 

Similar to Large-scale analysis of bibliometric networks (20)

AHM 2014: Governance and Cyberinfrastructure in the Earth System Sciences
AHM 2014: Governance and Cyberinfrastructure in the Earth System SciencesAHM 2014: Governance and Cyberinfrastructure in the Earth System Sciences
AHM 2014: Governance and Cyberinfrastructure in the Earth System Sciences
 
What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care?
 
Scratchpads introductory presentation 45mins
Scratchpads introductory presentation   45minsScratchpads introductory presentation   45mins
Scratchpads introductory presentation 45mins
 
Networks, Deep Learning (and COVID-19)
Networks, Deep Learning (and COVID-19)Networks, Deep Learning (and COVID-19)
Networks, Deep Learning (and COVID-19)
 
Cyberinfrastructure for Einstein's Equations and Beyond
Cyberinfrastructure for Einstein's Equations and BeyondCyberinfrastructure for Einstein's Equations and Beyond
Cyberinfrastructure for Einstein's Equations and Beyond
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
The Science of Data Science
The Science of Data Science The Science of Data Science
The Science of Data Science
 
Digital Science: Reproducibility and Visibility in Astronomy
Digital Science: Reproducibility and Visibility in AstronomyDigital Science: Reproducibility and Visibility in Astronomy
Digital Science: Reproducibility and Visibility in Astronomy
 
Enabling Data-Intensive Science Through Data Infrastructures
Enabling Data-Intensive Science Through Data InfrastructuresEnabling Data-Intensive Science Through Data Infrastructures
Enabling Data-Intensive Science Through Data Infrastructures
 
Beyond Meta-Data: Nano-Publications Recording Scientific Endeavour
Beyond Meta-Data: Nano-Publications Recording Scientific EndeavourBeyond Meta-Data: Nano-Publications Recording Scientific Endeavour
Beyond Meta-Data: Nano-Publications Recording Scientific Endeavour
 
Descobrindo o tesouro escondido nos seus dados usando grafos.
Descobrindo o tesouro escondido nos seus dados usando grafos.Descobrindo o tesouro escondido nos seus dados usando grafos.
Descobrindo o tesouro escondido nos seus dados usando grafos.
 
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...Quo vadis, provenancer? Cui prodest? our own trajectory: provenance of data...
Quo vadis, provenancer?  Cui prodest?  our own trajectory: provenance of data...
 
06 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.201406 e science-bio diversity@ pacc 18.07.2014
06 e science-bio diversity@ pacc 18.07.2014
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
 
How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...How to use science maps to navigate large information spaces? What is the lin...
How to use science maps to navigate large information spaces? What is the lin...
 
unit 1 DATA MINING.ppt
unit 1 DATA MINING.pptunit 1 DATA MINING.ppt
unit 1 DATA MINING.ppt
 
Building Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVFBuilding Effective Visualization Shiny WVF
Building Effective Visualization Shiny WVF
 
New ways to communicate in science: perspectives from biodiversity research
New ways to communicate in science: perspectives from biodiversity researchNew ways to communicate in science: perspectives from biodiversity research
New ways to communicate in science: perspectives from biodiversity research
 
20200901 ECCB M. Kutmon
20200901 ECCB M. Kutmon20200901 ECCB M. Kutmon
20200901 ECCB M. Kutmon
 
Network Science: Theory, Modeling and Applications
Network Science: Theory, Modeling and ApplicationsNetwork Science: Theory, Modeling and Applications
Network Science: Theory, Modeling and Applications
 

More from Nees Jan van Eck

Crossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadataCrossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadata
Nees Jan van Eck
 
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Nees Jan van Eck
 
Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...
Nees Jan van Eck
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university ranking
Nees Jan van Eck
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university ranking
Nees Jan van Eck
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
Nees Jan van Eck
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
Nees Jan van Eck
 
Accuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and ScopusAccuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and Scopus
Nees Jan van Eck
 
How to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparisonHow to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparison
Nees Jan van Eck
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
Nees Jan van Eck
 

More from Nees Jan van Eck (10)

Crossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadataCrossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadata
 
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
Bibliometrische visualisaties voor het bijhouden van wetenschappelijke litera...
 
Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...Community detection using citation relations and textual similarities in a la...
Community detection using citation relations and textual similarities in a la...
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university ranking
 
A scientometric perspective on university ranking
A scientometric perspective on university rankingA scientometric perspective on university ranking
A scientometric perspective on university ranking
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
 
Accuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and ScopusAccuracy of citation data in Web of Science and Scopus
Accuracy of citation data in Web of Science and Scopus
 
How to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparisonHow to design a ranking system: Criteria and opportunities for a comparison
How to design a ranking system: Criteria and opportunities for a comparison
 
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university rankingCWTS Leiden Ranking: An advanced bibliometric approach to university ranking
CWTS Leiden Ranking: An advanced bibliometric approach to university ranking
 

Recently uploaded

ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
hozt8xgk
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
Vandana Devesh Sharma
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
Anagha Prasad
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
RDhivya6
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
RitabrataSarkar3
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
Areesha Ahmad
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
Texas Alliance of Groundwater Districts
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
Gokturk Mehmet Dilci
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
Sciences of Europe
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Selcen Ozturkcan
 

Recently uploaded (20)

ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
 
molar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptxmolar-distalization in orthodontics-seminar.pptx
molar-distalization in orthodontics-seminar.pptx
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
Eukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptxEukaryotic Transcription Presentation.pptx
Eukaryotic Transcription Presentation.pptx
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
 

Large-scale analysis of bibliometric networks

  • 1. Large-scale analysis of bibliometric networks Nees Jan van Eck Centre for Science and Technology Studies (CWTS), Leiden University International Conference on Data-driven Discovery: When Data Science Meets Information Science Beijing, China, June 20, 2016
  • 2. Bibliographic databases: ‘Big data’ 1 Web of Science Scopus Journals 12,000 20,000 Publications 45 million 35 million Citations 1 billion 0.9 billion
  • 3. Bibliometric networks 2 Web of Science Scopus Citation network of pubs / authors / journals Co-authorship network of authors / organizations Co-citation network of pubs / authors / journals Co-occurrence network of keywords / terms Bibliographic coupling network of pubs / authors / journals Bibliographic database
  • 4. Outline • Software tools • Network analysis techniques • Analysis of data science 3
  • 6. Software tools • VOSviewer (www.vosviewer.com) – Tool for constructing and visualizing bibliometric networks • CitNetExplorer (www.citnetexplorer.nl) – Tool for visualizing and analyzing citation networks of publications • Both tools have been developed together with my colleague Ludo Waltman 5
  • 8. VOSviewer: Overview • Software tool for visualizing (bibliometric) networks • Built-in support for popular bibliographic databases • Text mining functionality • Layout and clustering techniques • Advanced visualization features: – Smart labeling algorithm – Overlay visualizations – Density visualizations (‘heat map’) • Users: – Researchers – Professional users (e.g., universities, libraries, funders, publishers) 7
  • 9. Map of university co-authorship network 8
  • 10. Map of journal citation network 9
  • 12. • Any type of bibliometric network • Co-authorship, direct citations, co-citation, and bibliographic coupling • Time dimension is ignored • Networks of at most ~10,000 nodes are supported • Only citation networks of publications • Direct citation between publications • Time dimension is explicitly considered • Millions of publications are supported 11 VOSviewer CitNetExplorer
  • 14. Network analysis techniques 13 Layout: • Assigning the nodes in a network to locations in a (usually 2d) space (a.k.a. mapping) • Visualization of similarities (VOS) Clustering: • Partitioning the nodes in a network into a number of groups (a.k.a. community detection) • Weighted modularity • Smart local moving algorithm
  • 15. 1414 Clustering can be seen as mapping in a restricted space
  • 16. 1515 Clustering can be seen as mapping in a restricted space
  • 17. Unified approach to mapping and clustering Minimize where n: number of nodes in the network m: total weight of all edges in the network Aij: weight of edge between nodes i and j ki: total weight of all edges of node i 16    ji ij ji ijij ji n ddA kk m xxQ 2 1 2 ),,(  Mapping xi: vector denoting the location of node i in a p-dimensional space   p k jkikjiij xxxxd 1 2 )( Clustering xi: integer denoting the community to which node i belongs : resolution parameter       ji ji ij xx xx d if1 if0 
  • 18. Smart local moving algorithm 17 Q = 0.4198 Q = 0.3791 Reduced network Local moving heuristic in subnetworks Local moving heuristic Original network
  • 19. Algorithmically constructed classification system of science • 17.8 million publications from the period 2000– 2015 indexed in Web of Science • 282.4 million citation relations • Classification system of 3 hierarchical levels: – 27 broad disciplines – 817 fields – 4,113 subfields 18
  • 20. Breakdown of scientific literature into 817 fields 19 Social sciences and humanitiesBiomedical and health sciences Life and earth sciences Mathematics and computer science Physical sciences and engineering
  • 22. Time-line map of highly cited scientometrics publications 21
  • 24. What is data science? • Empirical operationalization of data science based on publications with ‘data’ in title or abstract 23 Wikipedia: “Data Science is an interdisciplinary field about processes and systems to extract knowledge or insights from data … which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics” LCDS: “Data Science … deals with finding, analyzing and validating complex patterns in data. Data Science methods are indispensable for maintaining a competitive edge in all disciplines in science”
  • 25. Growth of data-driven research 24 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 1990 1995 2000 2005 2010 2015 Percentageofpublications % 'data' publications % 'theory' publications
  • 26. Breakdown of scientific literature into 817 fields 25 Social sciences and humanitiesBiomedical and health sciences Life and earth sciences Mathematics and computer science Physical sciences and engineering
  • 27. Data-driven nature of different scientific fields 26 Social sciences and humanitiesBiomedical and health sciences Life and earth sciences Mathematics and computer science Physical sciences and engineering % pub. with ‘data’ in title or abstract
  • 28. Data-driven nature of different scientific fields 27 artificial intelligence statisticsbioinformatics neuroimaging pattern recognition astronomy earth water climate remote sensing nutrition obesity addiction accident analysis % pub. with ‘data’ in title or abstract
  • 29. Data science fields (at least 25% ‘data’ publications) 28 Social sciences and humanitiesBiomedical and health sciences Life and earth sciences Mathematics and computer science Physical sciences and engineering
  • 30. Term map of data science fields 29
  • 31. China’s publication output in data science fields 30 Social sciences and humanitiesBiomedical and health sciences Life and earth sciences Mathematics and computer science Physical sciences and engineering
  • 32. China’s publication output in data science fields 31 artificial intelligence pattern recognition high energy earth atmospheres weather remote sensing
  • 33. Chinese institutes with most publications in data science fields (2011-2015) • Chinese Academy of Sciences • Peking University • Tsinghua University • China University of Geosciences • Zhejiang University • Nanjing University • Shanghai Jiao Tong University • University of Science and Technology of China • Beijing Normal University • University of Hong Kong 32
  • 34. CAS publication output in data science fields 33 earth atmospheres weather remote sensing vegetation astronomy high energy
  • 35. Term map based on CAS publications in data science fields 34
  • 36. CAS (Beijing Branch) publication output in data science fields 35 astronomy earth atmospheres weather remote sensing vegetation high energy
  • 37. CAS (Shanghai Branch) publication output in data science fields 36 bioinformatics genetics astronomy nuclear
  • 38. Do it yourself! 37 www.vosviewer.com www.citnetexplorer.nl
  • 39. Thank you for your attention! 38