Scientometric approaches to classification
Nees Jan van Eck
Centre for Science and Technology Studies (CWTS), Leiden University
Colloquium Research Information Systems and Science Classifications: Revisiting the NARCIS Classification
Museum Meermanno, The Hague, The Netherlands
September 28, 2018
Outline
• Bibliographic databases
• Classification systems of scientific literature
• CWTS publication-level classification system of science
– Methodology
– Structure
– Applications
• Quality of classification systems
Bibliographic databases
               Web of Science   Scopus
Journals       20,000           24,000
Publications   55 million       45 million
Citations      1.2 billion      1.2 billion
Classification systems of scientific literature
• Mono-disciplinary vs. multidisciplinary
• Journal-level vs. publication-level
• Manual vs. algorithmic
Classification systems of scientific literature
• Mono-disciplinary:
– Chemical Abstracts: 80 different sections and 5 broad headings
– EconLit: Journal of Economic Literature (JEL) classification system
– PubMed: Medical Subject Headings (MeSH)
• Multidisciplinary:
– Web of Science: 250 categories
– Scopus (ASJC): bottom level has 304 categories and top level includes 27 categories
– Science-Metrix: 176 categories
– National Science Foundation (NSF): 125 categories
– University of California, San Diego (UCSD): more than 500 categories
– Australian and New Zealand Standard Research Classification (FoR): 3 hierarchical levels
CWTS publication-level classification system of science
Algorithmic classification system of science
• First version created in 2012
• Publications (not journals) are clustered into research areas based on citation
relations
• Research areas are defined at different levels of granularity and are
organized hierarchically
• Clustering is performed using the smart local moving algorithm (improved
Louvain algorithm; Waltman & Van Eck, 2013)
Objectives
To create a classification system
• in a fully algorithmic manner
• covering all sciences and social sciences
• at the level of individual publications
• with a hierarchical structure
• using transparent, freely available algorithms
• without excessive computational requirements
Main challenges
• Dealing with huge volumes of data
• Avoiding disciplinary biases
• Reaching a high level of accuracy
• Being flexible in terms of number of hierarchical levels and size of research
areas
• Obtaining proper labels for the research areas
• Keeping the methodology reasonably simple and transparent
Dealing with huge volumes of data
• Linking publications based on direct citations only; no co-citations,
bibliographic coupling, or word co-occurrences
• Efficient clustering algorithm based on ideas taken from:
– Newman (2004): Modularity-based clustering
– Blondel et al. (2008): ‘Louvain method’
– Waltman et al. (2010): VOS clustering technique
– Rotta & Noack (2011): Multilevel local search algorithms
Avoiding disciplinary biases
• c_ij: Relatedness of publications i and j, i.e., 1 if there is a direct citation
relation between i and j, 0 otherwise
• a_ij: Normalized relatedness of publications i and j, defined as
  $a_{ij} = \frac{c_{ij}}{\sum_k c_{ik}}$
• Similar to fractional citation counting (Small & Sweeney, 1985)
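The normalization above can be illustrated with a minimal Python sketch. It assumes that the citation relation is treated as symmetric (c_ij = 1 if either publication cites the other) and that the input is a plain list of (citing, cited) pairs; the function name `normalized_relatedness` and the toy identifiers are hypothetical, not part of the CWTS software.

```python
from collections import defaultdict


def normalized_relatedness(citations):
    """Return {(i, j): a_ij} for an iterable of (citing, cited) pairs."""
    # Assumption: c_ij is symmetric, so each citation links both publications.
    neighbors = defaultdict(set)
    for citing, cited in citations:
        if citing != cited:
            neighbors[citing].add(cited)
            neighbors[cited].add(citing)

    # a_ij = c_ij / sum_k c_ik. The denominator is the number of
    # publications that i has a direct citation relation with.
    a = {}
    for i, linked in neighbors.items():
        for j in linked:
            a[(i, j)] = 1.0 / len(linked)
    return a


# Toy data (hypothetical): p1 is linked to p2, p3, and p4.
pairs = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p4", "p1")]
a = normalized_relatedness(pairs)
```

Each publication's outgoing weights sum to 1, so a publication in a field with long reference lists does not count for more than one in a field with short reference lists. Note that a_ij is generally not equal to a_ji.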
Reaching a high level of accuracy
• Clustering technique based on maximization of a quality function:
  $\sum_i \sum_j \delta(x_i, x_j)\,(a_{ij} - r)$
• x_i denotes the cluster (research area) to which publication i is assigned
• δ(x_i, x_j) = 1 if x_i = x_j and 0 otherwise
• r denotes a resolution parameter
• Quality function is maximized with respect to x_1, ..., x_n
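As a rough illustration of how the resolution parameter shapes the result, the sketch below evaluates the quality function for a given assignment of publications to clusters. It only scores an assignment; it is not the smart local moving algorithm. The function name `quality`, the toy values, and the choice to skip pairs with i = j are assumptions made for this example.

```python
def quality(assignment, a, r):
    """Sum of (a_ij - r) over ordered pairs i != j in the same cluster.

    assignment maps publication -> cluster label (x_i),
    a maps (i, j) -> normalized relatedness a_ij (missing pairs count as 0),
    r is the resolution parameter.
    """
    total = 0.0
    pubs = list(assignment)
    for i in pubs:
        for j in pubs:
            # Assumption: pairs with i == j are excluded from the sum.
            if i != j and assignment[i] == assignment[j]:
                total += a.get((i, j), 0.0) - r
    return total


# Toy chain p1 - p2 - p3 - p4 with a_ij = 1 / (number of relations of i).
a = {
    ("p1", "p2"): 1.0, ("p2", "p1"): 0.5,
    ("p2", "p3"): 0.5, ("p3", "p2"): 0.5,
    ("p3", "p4"): 0.5, ("p4", "p3"): 1.0,
}
split = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}
merged = {"p1": 0, "p2": 0, "p3": 0, "p4": 0}

high_r = (quality(split, a, 0.2), quality(merged, a, 0.2))
low_r = (quality(split, a, 0.05), quality(merged, a, 0.05))
```

In this toy example the two-cluster assignment scores higher at r = 0.2, while the single merged cluster scores higher at r = 0.05. This is the sense in which a higher resolution yields more, smaller research areas.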
Being flexible in terms of number of hierarchical levels
and size of research areas
• Three types of parameters:
– Number of hierarchical levels
– Each level’s resolution parameter
– Each level’s minimum number of publications per research area
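The three types of parameters could be captured in a small configuration structure, sketched below. The class `LevelParameters`, the helper `undersized_areas`, and all numeric values are hypothetical and serve only to show the shape of such a configuration; the slides do not state the values actually used, nor how undersized areas are handled.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelParameters:
    resolution: float       # resolution parameter r for this level
    min_publications: int   # minimum number of publications per research area


# The number of hierarchical levels is simply the length of this list.
# Hypothetical values, ordered from the broadest to the most detailed level.
levels = [
    LevelParameters(resolution=1e-7, min_publications=100_000),
    LevelParameters(resolution=1e-6, min_publications=1_000),
    LevelParameters(resolution=1e-5, min_publications=50),
]


def undersized_areas(assignment, params):
    """Return the research areas that fall below the level's minimum size."""
    sizes = Counter(assignment.values())
    return sorted(area for area, size in sizes.items()
                  if size < params.min_publications)


# Toy usage: area "B" has one publication, below a minimum of two.
toy_assignment = {"p1": "A", "p2": "A", "p3": "B"}
too_small = undersized_areas(toy_assignment, LevelParameters(0.1, 2))
```

Keeping one parameter pair per level makes it easy to add or remove a level without touching the clustering code.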
Obtaining proper labels for the research areas
1. Identification of terms in titles and abstracts of articles using part-of-speech
tagging
2. Calculation of term relevance scores based on a combination of a term’s
absolute and relative frequency of occurrence
3. Selection of the most relevant terms based on term relevance scores
combined with a filter for removing similar terms
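Steps 2 and 3 can be sketched as follows, assuming the terms have already been extracted (step 1 is omitted). The scoring rule used here, absolute frequency multiplied by the share of a term's occurrences that fall inside the research area, is an assumption made for illustration and not the formula used by CWTS. The substring-based similarity filter and the name `label_terms` are likewise hypothetical.

```python
def label_terms(area_counts, overall_counts, top_n=3):
    """Pick up to top_n label terms for one research area.

    area_counts: term -> number of publications in the area using the term
    overall_counts: term -> number of publications overall using the term
    """
    scored = []
    for term, in_area in area_counts.items():
        # Relative frequency: share of the term's occurrences inside the area.
        relative = in_area / overall_counts[term]
        # Assumed combination of absolute and relative frequency.
        scored.append((in_area * relative, term))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))

    selected = []
    for _, term in scored:
        # Assumed similarity filter: skip a term contained in, or containing,
        # a term that has already been selected.
        if any(term in kept or kept in term for kept in selected):
            continue
        selected.append(term)
        if len(selected) == top_n:
            break
    return selected


# Toy counts (hypothetical).
area_counts = {"impact factor": 40, "h index": 30,
               "citation": 60, "journal impact factor": 20}
overall_counts = {"impact factor": 50, "h index": 30,
                  "citation": 600, "journal impact factor": 20}
labels = label_terms(area_counts, overall_counts)
```

In the toy data, "citation" is frequent in the area but even more frequent elsewhere, so it ranks below the more specific terms, and "journal impact factor" is dropped as too similar to "impact factor".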
CWTS publication-level classification system of
science
• 21.2 million publications from the period 2000–2017 indexed in Web of
Science
• 374.1 million citation relations
• Classification system of 3 hierarchical levels:
– 22 broad disciplines
– 868 fields
– 4,047 subfields
• Computational performance: less than 2 hours
Breakdown of scientific literature into 22 broad disciplines
[Figure: visualization of the 22 broad disciplines, with labeled areas: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering]
Breakdown of scientific literature into 868 fields
[Figure: visualization of the 868 fields, with labeled areas: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering]
Breakdown of scientific literature into 4,047 subfields
[Figure: visualization of the 4,047 subfields, shown twice, with labeled areas: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering. The second view additionally labels the Scientometrics subfield]
Summary of scientometrics subfield
Cluster: 145
No. publications: 16,312

Top 5 terms                No. pubs
bibliometric analysis      852
impact factor              495
h index                    264
peer review                515
citation                   642

Top 5 publications                                                        No. cits
hirsch, je (2005). an index to quantify an individual's scientific research output. p natl acad sci usa, 102(46), 16569-16572.    2,635
wuchty, s; et al. (2007). the increasing dominance of teams in production of knowledge. science, 316(5827), 1036-1039.    699
egghe, l (2006). theory and practise of the g-index. scientometrics, 69(1), 131-152.    609
king, da (2004). the scientific impact of nations. nature, 430(6997), 311-316.    496
newman, mej (2004). coauthorship networks and patterns of scientific collaboration. p natl acad sci usa, 101, 5200-5205.    488

Top 5 authors      No. pubs
bornmann, l        221
thelwall, m        202
leydesdorff, l     175
rousseau, r        161
egghe, l           133

Top 5 journals                                                                 No. pubs
scientometrics                                                                 2,865
journal of informetrics                                                        700
journal of the american society for information science and technology        613
plos one                                                                       339
research evaluation                                                            324

Top 5 institutes       No. pubs
univ granada           316
kathol univ leuven     256
leiden univ            249
indiana univ           246
univ wolverhampton     216

Top 5 departments                                     No. pubs
sch lib & informat sci (indiana univ)                 106
amsterdam sch commun res ascor (univ amsterdam)       97
ctr sci & technol studies (leiden univ)               90
sch publ policy (georgia inst technol - atlanta)      88
trend res ctr (asia univ)                             84
[Figure: Publications in scientometrics subfield; y-axis: No. publications (0 to 1,600), x-axis: year (ticks from 2000 to 2016)]
Term map of scientometrics subfield
[Figure: term map with labeled groups of terms: Peer review, OA, careers, and gender; Collaboration; Scientometric indicators and networks; Medical research; Country-level analyses]
Time-line map of highly cited scientometrics
publications
Overlay visualizations
[Figures: two overlay visualizations titled "Time trend", with labeled areas: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering. Two subfields are additionally labeled: MicroRNA and Graphene]
Summary of graphene subfield
Cluster: 9
No. publications: 27,771

Top 5 terms                          No. pubs
bilayer graphene                     836
epitaxial graphene                   491
silicene                             401
graphene nanoribbon                  1,035
graphene field effect transistor     207

Top 5 publications                                                        No. cits
novoselov, ks; et al. (2004). electric field effect in atomically thin carbon films. science, 306(5696), 666-669.    27,743
geim, ak; et al. (2007). the rise of graphene. nat mater, 6(3), 183-191.    20,073
novoselov, ks; et al. (2005). two-dimensional gas of massless dirac fermions in graphene. nature, 438(7065), 197-200.    11,359
castro neto, ah; et al. (2009). the electronic properties of graphene. rev mod phys, 81(1), 109-162.    11,368
zhang, yb; et al. (2005). experimental observation of the quantum hall effect and berry's phase in graphene. nature, 438(7065), 201-204.    8,110

Top 5 authors      No. pubs
watanabe, k        249
taniguchi, t       240
peeters, fm        233
lin, mf            178
katsnelson, mi     177

Top 5 journals                No. pubs
physical review b             4,013
applied physics letters       1,834
carbon                        994
nano letters                  906
journal of applied physics    841

Top 5 institutes        No. pubs
chinese acad sci        1,394
russian acad sci        778
peking univ             557
natl univ singapore     482
tsing hua univ          458

Top 5 departments                                              No. pubs
dept phys (natl univ singapore)                                257
inst phys (chinese acad sci)                                   226
inst mol & mat (radboud univ nijmegen)                         216
dept phys (mit)                                                209
dept phys (univ calif berkeley and berkeley national lab)      206
[Figure: number of publications per year, shown after the graphene subfield summary; y-axis: No. publications (0 to 4,000), x-axis: year (ticks from 2000 to 2016)]
[Figures: further overlay visualizations titled "Open access" and "University profiles" (Delft University of Technology, Leiden University), with labeled areas: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering]
Applications
• Field normalization
– CWTS Leiden Ranking/U-Multirank
– Dutch University Medical Centers
• Field delineation
– European research funders
• High-resolution research strengths analysis
– European universities
– European research funders
• Identification of interdisciplinary and emerging research areas
– UK Engineering and Physical Sciences Research Council
Adopters and potential adopters
• Adopters:
– CWTS
– SciTech Strategies (e.g. SciVal)
– KTH Royal Institute of Technology, Stockholm
• Potential adopters:
– Chinese Academy of Sciences
– European Research Council
– Max Planck
Quality of classification systems
Empirical micro study using papers on overall water
splitting
• Haunschild et al. (2018)
• Case study comparing CWTS classification to
journal-based and manually constructed
classifications
• Ability of CWTS classification to distinguish
between fields is questioned
Accuracy of the journal classification systems of Web
of Science and Scopus
• Wang and Waltman (2016)
• Two criteria to identify journals with questionable
classifications:
– journals that have weak connections with their assigned
categories
– journals that are not assigned to categories with which they
have strong connections
• Web of Science performs significantly better than
Scopus
Field classification of publications in Dimensions
• Bornmann (2018)
• Field classification in Dimensions:
– Based on Fields of Research (FOR) from Australian and New
Zealand Standard Research Classification (ANZSRC)
– Machine learning approach
– Each publication is assigned to at least one field
• Based on Bornmann’s own publications
• Questions reliability and validity of Dimensions
classification
Response from Dimensions
• Herzog and Lunn (2018)
• Implementation at launch was first step and
requires improvements:
– Improvement of training sets
– Adding new subcategories to FOR system
Large-scale system to organize publications into
hierarchical concept structure
• Shen et al. (2018)
• Core component in Microsoft Academic
• Iterative approach to:
– concept discovery (Wikipedia)
– concept tagging to publications (both textual data and graph
structure are considered)
– concept hierarchy construction
• Based on 2000 initial seed concepts, over 228K
concepts have been identified
• Concepts are organized in six-level hierarchy
• 1 billion publication-concept relations
Conclusions
• Algorithmic approaches can be used to construct large-scale classifications
• Algorithmic classifications at the level of publications are gaining popularity
• Algorithmic possibilities depend on data availability
• Algorithmic classifications may have the disadvantage of mixing up different
principles for classifying items (e.g., research topic, research method,
scientific community, theoretical tradition, basic vs. applied)
Thank you for your attention!
43

More Related Content

What's hot

VOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer TutorialVOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer Tutorial
Nees Jan van Eck
 
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
Nees Jan van Eck
 
Large-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applicationsLarge-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applications
Ludo Waltman
 
Advanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extractionAdvanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extraction
Nees Jan van Eck
 
Science Mapping and Research Positioning
Science Mapping and Research PositioningScience Mapping and Research Positioning
Science Mapping and Research Positioning
Nees Jan van Eck
 
Intermediacy of publications
Intermediacy of publicationsIntermediacy of publications
Intermediacy of publications
Nees Jan van Eck
 
VOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literatureVOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literature
Nees Jan van Eck
 
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Ludo Waltman
 
Bibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewerBibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewer
Ludo Waltman
 
Applications of community detection in bibliometric network analysis
Applications of community detection in bibliometric network analysisApplications of community detection in bibliometric network analysis
Applications of community detection in bibliometric network analysis
Nees Jan van Eck
 
Large-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sourcesLarge-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sources
Nees Jan van Eck
 
Scientometrics for research assessment
Scientometrics for research assessmentScientometrics for research assessment
Scientometrics for research assessment
Ludo Waltman
 
Scientific information retrieval: Challenges and opportunities
Scientific information retrieval: Challenges and opportunitiesScientific information retrieval: Challenges and opportunities
Scientific information retrieval: Challenges and opportunities
Ludo Waltman
 
Crossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadataCrossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadata
Nees Jan van Eck
 
Comparing bibliographic data sources
Comparing bibliographic data sourcesComparing bibliographic data sources
Comparing bibliographic data sources
Ludo Waltman
 
The landscape of research on research
The landscape of research on researchThe landscape of research on research
The landscape of research on research
Ludo Waltman
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
Nees Jan van Eck
 
A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...
Nees Jan van Eck
 
Comparing scientific performance across disciplines: Methodological and conce...
Comparing scientific performance across disciplines: Methodological and conce...Comparing scientific performance across disciplines: Methodological and conce...
Comparing scientific performance across disciplines: Methodological and conce...
Ludo Waltman
 
Multiple perspectives on bibliometric data
Multiple perspectives on bibliometric dataMultiple perspectives on bibliometric data
Multiple perspectives on bibliometric data
Nees Jan van Eck
 

What's hot (20)

VOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer TutorialVOSviewer and CitNetExplorer Tutorial
VOSviewer and CitNetExplorer Tutorial
 
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
VOSviewer and CitNetExplorer: Software tools for bibliometric analysis of s...
 
Large-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applicationsLarge-scale visualization of science: Methods, tools, and applications
Large-scale visualization of science: Methods, tools, and applications
 
Advanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extractionAdvanced citation matching and large-scale cited reference extraction
Advanced citation matching and large-scale cited reference extraction
 
Science Mapping and Research Positioning
Science Mapping and Research PositioningScience Mapping and Research Positioning
Science Mapping and Research Positioning
 
Intermediacy of publications
Intermediacy of publicationsIntermediacy of publications
Intermediacy of publications
 
VOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literatureVOSviewer: A software tool for analyzing and visualizing scientific literature
VOSviewer: A software tool for analyzing and visualizing scientific literature
 
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
Web of Science, Scopus, Dimensions, and beyond: The evolving landscape of bib...
 
Bibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewerBibliometric visualization using VOSviewer
Bibliometric visualization using VOSviewer
 
Applications of community detection in bibliometric network analysis
Applications of community detection in bibliometric network analysisApplications of community detection in bibliometric network analysis
Applications of community detection in bibliometric network analysis
 
Large-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sourcesLarge-scale analysis of bibliometric data sources
Large-scale analysis of bibliometric data sources
 
Scientometrics for research assessment
Scientometrics for research assessmentScientometrics for research assessment
Scientometrics for research assessment
 
Scientific information retrieval: Challenges and opportunities
Scientific information retrieval: Challenges and opportunitiesScientific information retrieval: Challenges and opportunities
Scientific information retrieval: Challenges and opportunities
 
Crossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadataCrossref as a source of open bibliographic metadata
Crossref as a source of open bibliographic metadata
 
Comparing bibliographic data sources
Comparing bibliographic data sourcesComparing bibliographic data sources
Comparing bibliographic data sources
 
The landscape of research on research
The landscape of research on researchThe landscape of research on research
The landscape of research on research
 
Open data sources in VOSviewer
Open data sources in VOSviewerOpen data sources in VOSviewer
Open data sources in VOSviewer
 
A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...A systematic empirical comparison of different approaches for normalizing cit...
A systematic empirical comparison of different approaches for normalizing cit...
 
Comparing scientific performance across disciplines: Methodological and conce...
Comparing scientific performance across disciplines: Methodological and conce...Comparing scientific performance across disciplines: Methodological and conce...
Comparing scientific performance across disciplines: Methodological and conce...
 
Multiple perspectives on bibliometric data
Multiple perspectives on bibliometric dataMultiple perspectives on bibliometric data
Multiple perspectives on bibliometric data
 

Similar to Scientometric approaches to classification

MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
Herbert Van de Sompel
 
Value-added services for the Wageningen Institutional Repository (WaY)
Value-added services for the Wageningen Institutional Repository (WaY)Value-added services for the Wageningen Institutional Repository (WaY)
Value-added services for the Wageningen Institutional Repository (WaY)
AIMS (Agricultural Information Management Standards)
 
Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...
Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...
Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...
Nadine Rons
 
Using Bibliometrics in the Library
Using Bibliometrics in the LibraryUsing Bibliometrics in the Library
Using Bibliometrics in the Library
State Of Innovation
 
Paper 6: World University's Evaluation (Qiu & Zhao)
Paper 6: World University's Evaluation (Qiu & Zhao)Paper 6: World University's Evaluation (Qiu & Zhao)
Paper 6: World University's Evaluation (Qiu & Zhao)Kent Business School
 
Bibliometric analysis tools on top of the university’s bibliographic database...
Bibliometric analysis tools on top of the university’s bibliographic database...Bibliometric analysis tools on top of the university’s bibliographic database...
Bibliometric analysis tools on top of the university’s bibliographic database...Wouter Gerritsma
 
A new role for libraries in research assessments
A new role for libraries in research assessmentsA new role for libraries in research assessments
A new role for libraries in research assessmentsWouter Gerritsma
 
Where to publish_130709
Where to publish_130709Where to publish_130709
Where to publish_130709
opl10
 
Publication strategy for LEI
Publication strategy for LEIPublication strategy for LEI
Publication strategy for LEIWouter Gerritsma
 
Presentation of a bibliometric Analysis of Quantum machine Learning.ppt
Presentation of a bibliometric Analysis of Quantum machine Learning.pptPresentation of a bibliometric Analysis of Quantum machine Learning.ppt
Presentation of a bibliometric Analysis of Quantum machine Learning.ppt
aliasgharahmadikia77
 
Broad altmetric analysis of Mendeley readerships through the ‘academic status...
Broad altmetric analysis of Mendeley readerships through the ‘academic status...Broad altmetric analysis of Mendeley readerships through the ‘academic status...
Broad altmetric analysis of Mendeley readerships through the ‘academic status...
Zohreh Zahedi
 
What is your h-index and other measures of impact
What is your h-index and other measures of impactWhat is your h-index and other measures of impact
What is your h-index and other measures of impact
Berenika Webster
 
Towards Automatic Classification of LOD Datasets
Towards Automatic Classification of LOD DatasetsTowards Automatic Classification of LOD Datasets
Towards Automatic Classification of LOD Datasets
Blerina Spahiu
 
PLOS Visualization Project
PLOS Visualization ProjectPLOS Visualization Project
PLOS Visualization Project
Access Innovations, Inc.
 
THOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOSTHOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOS
Maaike Duine
 
Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...
Jakaria Rahman
 
بنك المعرفة-المصرى
بنك المعرفة-المصرىبنك المعرفة-المصرى
بنك المعرفة-المصرى
ghadeermagdy
 
A new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networksA new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networksNees Jan van Eck
 
بنك المعرفة المصرى Egyptian knowledge bank
بنك المعرفة المصرى  Egyptian knowledge bankبنك المعرفة المصرى  Egyptian knowledge bank
بنك المعرفة المصرى Egyptian knowledge bank
sameh shalash
 

Similar to Scientometric approaches to classification (20)

MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
 
Value-added services for the Wageningen Institutional Repository (WaY)
Value-added services for the Wageningen Institutional Repository (WaY)Value-added services for the Wageningen Institutional Repository (WaY)
Value-added services for the Wageningen Institutional Repository (WaY)
 
Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...
Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...
Investigation of Partition Cells as a Structural Basis Suitable for Assessmen...
 
Using Bibliometrics in the Library
Using Bibliometrics in the LibraryUsing Bibliometrics in the Library
Using Bibliometrics in the Library
 
Paper 6: World University's Evaluation (Qiu & Zhao)
Paper 6: World University's Evaluation (Qiu & Zhao)Paper 6: World University's Evaluation (Qiu & Zhao)
Paper 6: World University's Evaluation (Qiu & Zhao)
 
Bibliometric analysis tools on top of the university’s bibliographic database...
Bibliometric analysis tools on top of the university’s bibliographic database...Bibliometric analysis tools on top of the university’s bibliographic database...
Bibliometric analysis tools on top of the university’s bibliographic database...
 
A new role for libraries in research assessments
A new role for libraries in research assessmentsA new role for libraries in research assessments
A new role for libraries in research assessments
 
Where to publish_130709
Where to publish_130709Where to publish_130709
Where to publish_130709
 
Öppen data och forskningens genomslag
Öppen data och forskningens genomslagÖppen data och forskningens genomslag
Öppen data och forskningens genomslag
 
Publication strategy for LEI
Publication strategy for LEIPublication strategy for LEI
Publication strategy for LEI
 
Presentation of a bibliometric Analysis of Quantum machine Learning.ppt
Presentation of a bibliometric Analysis of Quantum machine Learning.pptPresentation of a bibliometric Analysis of Quantum machine Learning.ppt
Presentation of a bibliometric Analysis of Quantum machine Learning.ppt
 
Broad altmetric analysis of Mendeley readerships through the ‘academic status...
Broad altmetric analysis of Mendeley readerships through the ‘academic status...Broad altmetric analysis of Mendeley readerships through the ‘academic status...
Broad altmetric analysis of Mendeley readerships through the ‘academic status...
 
What is your h-index and other measures of impact
What is your h-index and other measures of impactWhat is your h-index and other measures of impact
What is your h-index and other measures of impact
 
Towards Automatic Classification of LOD Datasets
Towards Automatic Classification of LOD DatasetsTowards Automatic Classification of LOD Datasets
Towards Automatic Classification of LOD Datasets
 
PLOS Visualization Project
PLOS Visualization ProjectPLOS Visualization Project
PLOS Visualization Project
 
THOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOSTHOR Workshop - Data Publishing PLOS
THOR Workshop - Data Publishing PLOS
 
Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...
 
بنك المعرفة-المصرى
بنك المعرفة-المصرىبنك المعرفة-المصرى
بنك المعرفة-المصرى
 
A new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networksA new software tool for large-scale analysis of citation networks
A new software tool for large-scale analysis of citation networks
 
بنك المعرفة المصرى Egyptian knowledge bank
بنك المعرفة المصرى  Egyptian knowledge bankبنك المعرفة المصرى  Egyptian knowledge bank
بنك المعرفة المصرى Egyptian knowledge bank
 

Scientometric approaches to classification

  • 1. Scientometric approaches to classification
      Nees Jan van Eck
      Centre for Science and Technology Studies (CWTS), Leiden University
      Colloquium Research Information Systems and Science Classifications: Revisiting the NARCIS Classification
      Museum Meermanno, The Hague, The Netherlands
      September 28, 2018
  • 2. Outline
      • Bibliographic databases
      • Classification systems of scientific literature
      • CWTS publication-level classification system of science
          – Methodology
          – Structure
          – Applications
      • Quality of classification systems
  • 5. Bibliographic databases
                      Web of Science    Scopus
      Journals        20,000            24,000
      Publications    55 million        45 million
      Citations       1.2 billion       1.2 billion
  • 7. Classification systems of scientific literature
      • Mono-disciplinary vs. multidisciplinary
      • Journal-level vs. publication-level
      • Manual vs. algorithmic
  • 8. Classification systems of scientific literature
      • Mono-disciplinary:
          – Chemical Abstracts: 80 different sections and 5 broad headings
          – EconLit: Journal of Economic Literature (JEL) classification system
          – PubMed: Medical Subject Headings (MeSH)
      • Multidisciplinary:
          – Web of Science: 250 categories
          – Scopus (ASJC): bottom level has 304 categories and top level includes 27 categories
          – Science-Metrix: 176 categories
          – National Science Foundation (NSF): 125 categories
          – University of California, San Diego (UCSD): more than 500 categories
          – Australian and New Zealand Standard Research Classification (FoR): 3 hierarchical levels
  • 10. Algorithmic classification system of science
      • First version created in 2012
      • Publications (not journals) are clustered into research areas based on citation relations
      • Research areas are defined at different levels of granularity and are organized hierarchically
      • Clustering is performed using the smart local moving algorithm (improved Louvain algorithm; Waltman & Van Eck, 2013)
  • 11. Objectives
      To create a classification system
      • in a fully algorithmic manner
      • covering all sciences and social sciences
      • at the level of individual publications
      • with a hierarchical structure
      • using transparent, freely available algorithms
      • without excessive computational requirements
  • 12. Main challenges
      • Dealing with huge volumes of data
      • Avoiding disciplinary biases
      • Reaching a high level of accuracy
      • Being flexible in terms of number of hierarchical levels and size of research areas
      • Obtaining proper labels for the research areas
      • Keeping the methodology reasonably simple and transparent
  • 13. Dealing with huge volumes of data
      • Linking publications based on direct citations only; no co-citations, bibliographic coupling, or word co-occurrences
      • Efficient clustering algorithm based on ideas taken from:
          – Newman (2004): Modularity-based clustering
          – Blondel et al. (2008): ‘Louvain method’
          – Waltman et al. (2010): VOS clustering technique
          – Rotta & Noack (2011): Multilevel local search algorithms
  • 14. Avoiding disciplinary biases
      • $c_{ij}$: Relatedness of publications i and j, i.e., 1 if there is a direct citation relation between i and j, 0 otherwise
      • $a_{ij}$: Normalized relatedness of publications i and j, defined as
        $$a_{ij} = \frac{c_{ij}}{\sum_k c_{ik}}$$
      • Similar to fractional citation counting (Small & Sweeney, 1985)
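The normalization on this slide can be illustrated with a few lines of Python. This is a minimal sketch, not the CWTS implementation: the function name `normalized_relatedness`, the input format (a list of citing/cited pairs), and the decision to skip self-links are all assumptions made for the example.

```python
from collections import defaultdict


def normalized_relatedness(citations):
    """Return {(i, j): a_ij} for every pair linked by a direct citation.

    citations: iterable of (citing, cited) publication identifiers.
    """
    # c_ij is symmetric: it is 1 whenever i cites j or j cites i.
    neighbors = defaultdict(set)
    for citing, cited in citations:
        if citing == cited:
            continue  # skip self-links (an assumption of this sketch)
        neighbors[citing].add(cited)
        neighbors[cited].add(citing)

    # a_ij = c_ij / sum_k c_ik = 1 / (number of citation relations of i).
    return {
        (i, j): 1.0 / len(linked)
        for i, linked in neighbors.items()
        for j in linked
    }


# Toy data: A has three citation relations, B has only one.
toy = [("A", "B"), ("A", "C"), ("D", "A")]
a = normalized_relatedness(toy)
print(a[("A", "B")], a[("B", "A")])
```

In the toy data the relatedness of A to B is 1/3 while that of B to A is 1, so each publication's relatedness values sum to 1 regardless of how long reference lists are in its field. Note that the normalized values are not symmetric.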
  • 15. Reaching a high level of accuracy
      • Clustering technique based on maximization of a quality function:
        $$\sum_i \sum_j \delta(x_i, x_j)\,(a_{ij} - r)$$
      • $x_i$ denotes the cluster (research area) to which publication i is assigned
      • $\delta(x_i, x_j) = 1$ if $x_i = x_j$ and 0 otherwise
      • r denotes a resolution parameter
      • Quality function is maximized with respect to $x_1, \ldots, x_n$
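To make the role of the resolution parameter concrete, the sketch below evaluates the quality function for three candidate assignments of a four-publication toy network. It only scores given assignments by brute force; it is not the smart local moving algorithm, which searches for the maximizing assignment. The helper names, the toy network, the value r = 0.2, and the choice to sum over ordered pairs with i ≠ j are assumptions of this example.

```python
from collections import defaultdict


def relatedness(edges):
    """a_ij = c_ij / sum_k c_ik for a list of undirected citation relations."""
    nbrs = defaultdict(set)
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    return {(i, j): 1.0 / len(ns) for i, ns in nbrs.items() for j in ns}


def quality(a, assignment, r):
    """Sum of (a_ij - r) over ordered pairs i != j assigned to the same cluster."""
    total = 0.0
    for i, xi in assignment.items():
        for j, xj in assignment.items():
            if i != j and xi == xj:  # delta(x_i, x_j) = 1
                total += a.get((i, j), 0.0) - r
    return total


# Toy chain of citation relations: 1 - 2 - 3 - 4.
a = relatedness([(1, 2), (2, 3), (3, 4)])
r = 0.2  # illustrative resolution value, chosen only for this example

candidates = {
    "two areas": {1: "x", 2: "x", 3: "y", 4: "y"},
    "one area": {1: "x", 2: "x", 3: "x", 4: "x"},
    "singletons": {1: "w", 2: "x", 3: "y", 4: "z"},
}
scores = {name: quality(a, x, r) for name, x in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

With this value of r, splitting the chain into two areas scores higher than one large area or four singletons. Raising r penalizes every within-cluster pair more strongly and therefore favors smaller research areas.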
  • 16. Being flexible in terms of number of hierarchical levels and size of research areas
      • Three types of parameters:
          – Number of hierarchical levels
          – Each level’s resolution parameter
          – Each level’s minimum number of publications per research area
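The three types of parameters could be captured in a small configuration object, sketched below. The class and function names are hypothetical, the numeric values are placeholders rather than the settings used by CWTS, and the coarse-to-fine ordering check is an assumption of this sketch (a higher resolution yields smaller research areas, so finer levels are expected to use a higher resolution and a lower minimum size).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelConfig:
    resolution: float       # resolution parameter r used at this level
    min_publications: int   # minimum number of publications per research area


def validate(levels):
    """Check that levels run from coarse to fine; return the number of levels."""
    if not levels:
        raise ValueError("at least one hierarchical level is required")
    for coarse, fine in zip(levels, levels[1:]):
        if fine.resolution <= coarse.resolution:
            raise ValueError("finer levels need a higher resolution")
        if fine.min_publications > coarse.min_publications:
            raise ValueError("finer levels need a smaller minimum size")
    return len(levels)


# Placeholder values for a three-level system (not the CWTS settings).
levels = [
    LevelConfig(resolution=0.001, min_publications=10_000),
    LevelConfig(resolution=0.01, min_publications=1_000),
    LevelConfig(resolution=0.1, min_publications=50),
]
n_levels = validate(levels)
print(n_levels)
```

The number of hierarchical levels is simply the length of the list, so adding or removing a level does not require any other change.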
  • 17. Obtaining proper labels for the research areas
      1. Identification of terms in titles and abstracts of articles using part-of-speech tagging
      2. Calculation of term relevance scores based on a combination of a term’s absolute and relative frequency of occurrence
      3. Selection of the most relevant terms based on term relevance scores combined with a filter for removing similar terms
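Steps 2 and 3 can be sketched as follows, assuming the terms of step 1 have already been extracted for each publication. The slide does not give the scoring formula, so the score used here (absolute frequency in the research area multiplied by the ratio of the term's share inside the area to its share in the whole collection) is an assumption, as are the function name `rank_terms` and the substring-based similarity filter.

```python
from collections import Counter


def rank_terms(cluster_docs, all_docs, top_n=5):
    """Rank terms of one research area by an assumed relevance score.

    cluster_docs: list of term sets, one per publication in the area.
    all_docs: list of term sets for the whole collection (must include cluster_docs).
    """
    in_cluster = Counter(t for doc in cluster_docs for t in doc)
    overall = Counter(t for doc in all_docs for t in doc)

    scores = {}
    for term, count in in_cluster.items():
        share_in = count / len(cluster_docs)       # relative frequency in the area
        share_all = overall[term] / len(all_docs)  # relative frequency overall
        scores[term] = count * (share_in / share_all)  # assumed combination

    selected = []
    for term in sorted(scores, key=lambda t: (-scores[t], t)):
        # Assumed similarity filter: skip terms that contain, or are contained in,
        # a term that has already been selected.
        if any(term in s or s in term for s in selected):
            continue
        selected.append(term)
        if len(selected) == top_n:
            break
    return selected


area = [{"h index", "citation"}, {"h index", "impact factor"}, {"h index", "citation"}]
rest = [{"citation", "graphene"}, {"graphene"}, {"citation"}]
labels = rank_terms(area, area + rest, top_n=2)
print(labels)
```

In the toy data "h index" occurs in every publication of the area and nowhere else, so it ranks first, while "citation" is frequent in the area but equally common outside it.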
  • 18. CWTS publication-level classification system of science
      • 21.2 million publications from the period 2000–2017 indexed in Web of Science
      • 374.1 million citation relations
      • Classification system of 3 hierarchical levels:
          – 22 broad disciplines
          – 868 fields
          – 4,047 subfields
      • Computational performance: less than 2 hours
  • 19. Breakdown of scientific literature into 22 broad disciplines (map; legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering)
  • 21. Breakdown of scientific literature into 868 fields (map; same legend)
  • 22. Breakdown of scientific literature into 4,047 subfields (map; same legend)
  • 23. Breakdown of scientific literature into 4,047 subfields (map; same legend, with the Scientometrics subfield highlighted)
  • 24. Summary of scientometrics subfield
      Cluster: 145
      No. publications: 16,312

      Top 5 terms              No. pubs
      bibliometric analysis    852
      impact factor            495
      h index                  264
      peer review              515
      citation                 642

      Top 5 publications (No. cits in parentheses)
      hirsch, je (2005). an index to quantify an individual's scientific research output. p natl acad sci usa, 102(46), 16569-16572. (2,635)
      wuchty, s; et al. (2007). the increasing dominance of teams in production of knowledge. science, 316(5827), 1036-1039. (699)
      egghe, l (2006). theory and practise of the g-index. scientometrics, 69(1), 131-152. (609)
      king, da (2004). the scientific impact of nations. nature, 430(6997), 311-316. (496)
      newman, mej (2004). coauthorship networks and patterns of scientific collaboration. p natl acad sci usa, 101, 5200-5205. (488)

      Top 5 authors     No. pubs
      bornmann, l       221
      thelwall, m       202
      leydesdorff, l    175
      rousseau, r       161
      egghe, l          133

      Top 5 journals                                                             No. pubs
      scientometrics                                                             2,865
      journal of informetrics                                                    700
      journal of the american society for information science and technology    613
      plos one                                                                   339
      research evaluation                                                        324

      Top 5 institutes      No. pubs
      univ granada          316
      kathol univ leuven    256
      leiden univ           249
      indiana univ          246
      univ wolverhampton    216

      Top 5 departments                                  No. pubs
      sch lib & informat sci (indiana univ)              106
      amsterdam sch commun res ascor (univ amsterdam)    97
      ctr sci & technol studies (leiden univ)            90
      sch publ policy (georgia inst technol - atlanta)   88
      trend res ctr (asia univ)                          84

      Chart: No. publications per year, 2000–2016 (vertical axis from 0 to 1,600)
  • 26. Term map of scientometrics subfield (cluster labels: Peer review, OA, careers, and gender; Collaboration; Scientometric indicators and networks; Medical research; Country-level analyses)
  • 27. Time-line map of highly cited scientometrics publications
  • 28. Overlay visualizations (map; legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering)
  • 29. Time trend (map; same legend)
  • 31. Summary of graphene subfield
      Cluster: 9
      No. publications: 27,771

      Top 5 terms                         No. pubs
      bilayer graphene                    836
      epitaxial graphene                  491
      silicene                            401
      graphene nanoribbon                 1,035
      graphene field effect transistor    207

      Top 5 publications (No. cits in parentheses)
      novoselov, ks; et al. (2004). electric field effect in atomically thin carbon films. science, 306(5696), 666-669. (27,743)
      geim, ak; et al. (2007). the rise of graphene. nat mater, 6(3), 183-191. (20,073)
      novoselov, ks; et al. (2005). two-dimensional gas of massless dirac fermions in graphene. nature, 438(7065), 197-200. (11,359)
      castro neto, ah; et al. (2009). the electronic properties of graphene. rev mod phys, 81(1), 109-162. (11,368)
      zhang, yb; et al. (2005). experimental observation of the quantum hall effect and berry's phase in graphene. nature, 438(7065), 201-204. (8,110)

      Top 5 authors     No. pubs
      watanabe, k       249
      taniguchi, t      240
      peeters, fm       233
      lin, mf           178
      katsnelson, mi    177

      Top 5 journals               No. pubs
      physical review b            4,013
      applied physics letters      1,834
      carbon                       994
      nano letters                 906
      journal of applied physics   841

      Top 5 institutes       No. pubs
      chinese acad sci       1,394
      russian acad sci       778
      peking univ            557
      natl univ singapore    482
      tsing hua univ         458

      Top 5 departments                                            No. pubs
      dept phys (natl univ singapore)                              257
      inst phys (chinese acad sci)                                 226
      inst mol & mat (radboud univ nijmegen)                       216
      dept phys (mit)                                              209
      dept phys (univ calif berkeley and berkeley national lab)    206

      Chart: No. publications per year, 2000–2016 (vertical axis from 0 to 4,000)
  • 32. Open access (map; legend: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Mathematics and computer science; Physical sciences and engineering)
  • 33. University profiles (maps: Delft University of Technology; Leiden University)
  • 34. Applications
      • Field normalization
          – CWTS Leiden Ranking/U-Multirank
          – Dutch University Medical Centers
      • Field delineation
          – European research funders
      • High-resolution research strengths analysis
          – European universities
          – European research funders
      • Identification of interdisciplinary and emerging research areas
          – UK Engineering and Physical Sciences Research Council
  • 35. Adopters and potential adopters
      • Adopters:
          – CWTS
          – SciTech Strategies (e.g., SciVal)
          – Royal Institute of Technology (KTH), Stockholm
      • Potential adopters:
          – Chinese Academy of Sciences
          – European Research Council
          – Max Planck
  • 37. Empirical micro study using papers on overall water splitting
      • Haunschild et al. (2018)
      • Case study comparing CWTS classification to journal-based and manually constructed classifications
      • Ability of CWTS classification to distinguish between fields is questioned
  • 38. Accuracy of the journal classification systems of Web of Science and Scopus
      • Wang and Waltman (2016)
      • Two criteria to identify journals with questionable classifications:
          – journals that have weak connections with their assigned categories
          – journals that are not assigned to categories with which they have strong connections
      • Web of Science performs significantly better than Scopus
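The two criteria can be expressed as a simple check per journal. The sketch below is illustrative only: it assumes that a journal's connection with each category has already been summarized as a share between 0 and 1, and the function name `flag_journal` and the two thresholds are hypothetical choices, not the values used by Wang and Waltman (2016).

```python
def flag_journal(assigned, connection, weak=0.1, strong=0.3):
    """Apply both criteria to one journal.

    assigned: set of categories the journal is assigned to.
    connection: dict mapping category -> strength of the journal's connection
                with that category, as a share between 0 and 1 (assumed input).
    Returns (weakly connected assigned categories, strongly connected missing ones).
    """
    # Criterion 1: assigned categories with which the journal is weakly connected.
    weakly_connected = sorted(c for c in assigned if connection.get(c, 0.0) < weak)
    # Criterion 2: strongly connected categories the journal is not assigned to.
    missing = sorted(
        c for c, share in connection.items() if share >= strong and c not in assigned
    )
    return weakly_connected, missing


# Hypothetical journal assigned to two categories.
assigned = {"Information Science", "Computer Science"}
connection = {"Information Science": 0.55, "Computer Science": 0.05, "Management": 0.35}
result = flag_journal(assigned, connection)
print(result)
```

For the hypothetical journal, "Computer Science" is flagged under the first criterion and "Management" under the second, so its classification would be considered questionable on both counts.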
  • 39. Field classification of publications in Dimensions
      • Bornmann (2018)
      • Field classification in Dimensions:
          – Based on Fields of Research (FoR) from Australian and New Zealand Standard Research Classification (ANZSRC)
          – Machine learning approach
          – Each publication is assigned to at least one field
      • Based on Bornmann’s own publications
      • Questions reliability and validity of Dimensions classification
  • 40. Response from Dimensions
      • Herzog and Lunn (2018)
      • Implementation at launch was a first step and requires improvements:
          – Improvement of training sets
          – Adding new subcategories to FoR system
  • 41. Large-scale system to organize publications into hierarchical concept structure
      • Shen et al. (2018)
      • Core component in Microsoft Academic
      • Iterative approach to:
          – concept discovery (Wikipedia)
          – concept tagging to publications (both textual data and graph structure are considered)
          – concept hierarchy construction
      • Based on 2000 initial seed concepts, over 228K concepts have been identified
      • Concepts are organized in a six-level hierarchy
      • 1 billion publication-concept relations
  • 43. Conclusions
      • Algorithmic approaches can be used to construct large-scale classifications
      • Algorithmic classifications at the level of publications are gaining popularity
      • Algorithmic possibilities depend on data availability
      • Algorithmic classifications may have the disadvantage of mixing up different principles for classifying items (e.g., research topic, research method, scientific community, theoretical tradition, basic vs. applied)
  • 44. Thank you for your attention!