Using full-text data to create
improved term maps
Nees Jan van Eck¹, Ludo Waltman¹, Min Song², and Yoo Kyung Jeong²
¹Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
²Department of Library and Information Science, Yonsei University, Seoul, Republic of Korea
16th International Conference on Scientometrics & Informetrics
Wuhan, China, October 19, 2017
Introduction
• Traditionally, bibliometric analyses are based on
metadata of scientific publications
• Full text of scientific publications is increasingly
becoming available in structured formats
• We study different approaches for creating
VOSviewer term maps using full-text data
• We perform comparisons with a traditional
approach based on titles and abstracts
VOSviewer term maps
Interpretation of a term map
• Size:
– The larger a term, the higher the frequency of occurrence of the
term
• Distance:
– In general, the smaller the distance between two terms, the
higher the relatedness of the terms, as measured by
co-occurrences
– Horizontal and vertical axes have no special meaning
• Colors:
– Colors indicate clusters of closely related terms
Creating a term map
1. Input English-language text corpus
2. Identify terms
3. Count co-occurrences of terms
4. Create layout and clustering
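Steps 2 and 3 can be sketched roughly as follows. This is a toy illustration, not VOSviewer's actual procedure: the stop-word filter, the n-gram rule, and the names `identify_terms`, `count_cooccurrences`, and `min_docs` are all assumptions made for the example. Step 4 (layout and clustering) is left out.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; a real pipeline would more likely use
# part-of-speech tagging to pick out noun phrases.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on",
              "the", "to", "using", "we", "with"}

def identify_terms(text, max_len=2):
    """Step 2 (toy version): word n-grams that contain no stop word."""
    words = re.findall(r"[a-z]+", text.lower())
    terms = []
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if not any(w in STOP_WORDS for w in gram):
                terms.append(" ".join(gram))
    return terms

def count_cooccurrences(documents, min_docs=2):
    """Step 3: for each pair of terms, count the documents containing both."""
    doc_terms = [set(identify_terms(d)) for d in documents]
    doc_freq = Counter(t for terms in doc_terms for t in terms)
    # Keep only terms that occur in at least `min_docs` documents.
    kept = {t for t, f in doc_freq.items() if f >= min_docs}
    pairs = Counter()
    for terms in doc_terms:
        for a, b in combinations(sorted(terms & kept), 2):
            pairs[(a, b)] += 1
    return pairs

# Step 1: a tiny English-language corpus (made up for the example).
corpus = [
    "Citation analysis of journals",
    "Citation analysis using full text",
    "Full text of journals",
]
cooc = count_cooccurrences(corpus)
```

The resulting `cooc` counter maps alphabetically ordered term pairs to co-occurrence counts, which is the kind of input a layout and clustering step would take.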
Counting co-occurrences of terms
• Full counting:
– All occurrences of a term in a document are counted
• Binary counting:
– Only the presence or absence of a term matters
– Number of occurrences of a term is not taken into account
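The difference between the two counting methods can be illustrated with a minimal sketch. It assumes each document has already been reduced to a list of identified terms; the function name `term_occurrences` and the toy data are hypothetical.

```python
from collections import Counter

def term_occurrences(documents, counting="full"):
    """Total occurrence count per term over a list of documents.

    Each document is a list of already-identified terms (repeats allowed).
    """
    totals = Counter()
    for terms in documents:
        if counting == "full":
            totals.update(terms)        # every occurrence is counted
        elif counting == "binary":
            totals.update(set(terms))   # at most one count per document
        else:
            raise ValueError("counting must be 'full' or 'binary'")
    return totals

docs = [
    ["citation", "citation", "citation", "journal"],
    ["citation", "h-index", "h-index"],
]
full = term_occurrences(docs, "full")
binary = term_occurrences(docs, "binary")
```

Here "citation" gets a count of 4 under full counting but only 2 under binary counting, because it is present in two documents.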
Data
• Full text of publications in Journal of Informetrics
• 688 publications in the period 2007-2016
• Downloaded in XML format using the Elsevier
ScienceDirect Article Retrieval API
              Average per pub.
Sections      6.0
Paragraphs    42.1
Sentences     191.1
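Averages of this kind could be computed along the following lines once each publication has been parsed into sections and paragraphs. This is only a sketch: the nested-list representation, the naive sentence splitter, and the names `split_sentences` and `structure_stats` are assumptions, and the toy data are unrelated to the Journal of Informetrics corpus.

```python
import re
from statistics import mean

def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def structure_stats(publications):
    """Average number of sections, paragraphs and sentences per publication.

    Each publication is a list of sections; each section is a list of
    paragraph strings.
    """
    rows = []
    for sections in publications:
        paragraphs = [p for sec in sections for p in sec]
        sentences = [s for p in paragraphs for s in split_sentences(p)]
        rows.append((len(sections), len(paragraphs), len(sentences)))
    return tuple(mean(col) for col in zip(*rows))

# Two made-up publications.
pubs = [
    [["First sentence. Second sentence.", "Third."], ["Fourth one? Yes."]],
    [["Only one paragraph here. Two sentences."]],
]
avg_sections, avg_paragraphs, avg_sentences = structure_stats(pubs)
```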
Term maps
Titles and abstracts / binary counting
Full text, publication level / full counting
Full text, paragraph level / full counting
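The publication-level and paragraph-level variants differ only in the unit within which co-occurrences are counted. The sketch below illustrates this under one possible convention for full counting, namely that a pair's weight in a unit is the product of the two terms' occurrence counts there. That convention, the function name `cooccurrences`, and the toy data are assumptions, not necessarily what VOSviewer does.

```python
from collections import Counter
from itertools import combinations

def cooccurrences(units):
    """Full-counting co-occurrences over a list of units (lists of terms)."""
    pairs = Counter()
    for terms in units:
        counts = Counter(terms)
        for a, b in combinations(sorted(counts), 2):
            # Assumed convention: weight = product of occurrence counts.
            pairs[(a, b)] += counts[a] * counts[b]
    return pairs

# One toy publication, given as paragraphs of already-identified terms.
publication = [
    ["citation", "citation", "journal"],
    ["h-index", "author"],
]

# Publication level: all paragraphs are merged into a single unit.
pub_level = cooccurrences([[t for para in publication for t in para]])
# Paragraph level: every paragraph is its own unit.
para_level = cooccurrences(publication)
```

At the publication level "citation" and "h-index" co-occur, while at the paragraph level they do not, since they never appear in the same paragraph.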
Conclusions
• Full text vs. titles and abstracts:
– Full text yields richer maps than titles and abstracts
– Richer maps may be useful for interactive visualization, but
perhaps not for static visualization
• Full counting vs. binary counting:
– When using full-text data, full counting is preferable to binary
counting
• Paragraph level vs. publication level:
– Paragraph-level maps have more fine-grained structure than
publication-level maps
– However, areas in paragraph-level maps do not always represent
topics in the literature
Future research
• Use full-text data for creating other types of maps,
in particular co-citation maps
Thank you for your attention!

Bioconversion of sago waste and oil cakes into biobutanol using Environmental...
 
16. 20240529_Ailin Molosag_MARIGREEN_SS_Day3_Ailin.pdf
16. 20240529_Ailin Molosag_MARIGREEN_SS_Day3_Ailin.pdf16. 20240529_Ailin Molosag_MARIGREEN_SS_Day3_Ailin.pdf
16. 20240529_Ailin Molosag_MARIGREEN_SS_Day3_Ailin.pdf
 
Accessing Data to Support Pesticide Residue and Emerging Contaminant Analysis...
Accessing Data to Support Pesticide Residue and Emerging Contaminant Analysis...Accessing Data to Support Pesticide Residue and Emerging Contaminant Analysis...
Accessing Data to Support Pesticide Residue and Emerging Contaminant Analysis...
 
Phytoremediation: Harnessing Nature's Power with Phytoremediation
Phytoremediation: Harnessing Nature's Power with PhytoremediationPhytoremediation: Harnessing Nature's Power with Phytoremediation
Phytoremediation: Harnessing Nature's Power with Phytoremediation
 
SOFIA/HAWC+ FAR-INFRARED POLARIMETRIC LARGE-AREA CMZ EXPLORATION (FIREPLACE) ...
SOFIA/HAWC+ FAR-INFRARED POLARIMETRIC LARGE-AREA CMZ EXPLORATION (FIREPLACE) ...SOFIA/HAWC+ FAR-INFRARED POLARIMETRIC LARGE-AREA CMZ EXPLORATION (FIREPLACE) ...
SOFIA/HAWC+ FAR-INFRARED POLARIMETRIC LARGE-AREA CMZ EXPLORATION (FIREPLACE) ...
 
A NICER VIEW OF THE NEAREST AND BRIGHTEST MILLISECOND PULSAR: PSR J0437−4715
A NICER VIEW OF THE NEAREST AND BRIGHTEST MILLISECOND PULSAR: PSR J0437−4715A NICER VIEW OF THE NEAREST AND BRIGHTEST MILLISECOND PULSAR: PSR J0437−4715
A NICER VIEW OF THE NEAREST AND BRIGHTEST MILLISECOND PULSAR: PSR J0437−4715
 
Traditional, current and future use of fish and seaweed for fertilisation - ...
Traditional, current and future use of fish and seaweed for fertilisation -  ...Traditional, current and future use of fish and seaweed for fertilisation -  ...
Traditional, current and future use of fish and seaweed for fertilisation - ...
 
From Seeds to Supermassive Black Holes: Capture, Growth, Migration, and Pairi...
From Seeds to Supermassive Black Holes: Capture, Growth, Migration, and Pairi...From Seeds to Supermassive Black Holes: Capture, Growth, Migration, and Pairi...
From Seeds to Supermassive Black Holes: Capture, Growth, Migration, and Pairi...
 
Celebrity Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl S...
Celebrity Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl S...Celebrity Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl S...
Celebrity Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl S...
 
ellipticytescausesprognosistreatment-240622051139-23d50b05.pptx
ellipticytescausesprognosistreatment-240622051139-23d50b05.pptxellipticytescausesprognosistreatment-240622051139-23d50b05.pptx
ellipticytescausesprognosistreatment-240622051139-23d50b05.pptx
 
Fish in the Loop: Exploring RAS - Julie Hansen Bergstedt
Fish in the Loop: Exploring RAS - Julie Hansen BergstedtFish in the Loop: Exploring RAS - Julie Hansen Bergstedt
Fish in the Loop: Exploring RAS - Julie Hansen Bergstedt
 
Potential of Marine renewable and Non renewable energy.pptx
Potential of Marine renewable and Non renewable energy.pptxPotential of Marine renewable and Non renewable energy.pptx
Potential of Marine renewable and Non renewable energy.pptx
 
17. 20240529_Ingrid Olesen_MariGreen summer school.pdf
17. 20240529_Ingrid Olesen_MariGreen summer school.pdf17. 20240529_Ingrid Olesen_MariGreen summer school.pdf
17. 20240529_Ingrid Olesen_MariGreen summer school.pdf
 
NuGOweek 2024 Ghent programme__flyer.pdf
NuGOweek 2024 Ghent programme__flyer.pdfNuGOweek 2024 Ghent programme__flyer.pdf
NuGOweek 2024 Ghent programme__flyer.pdf
 
Types of Hypersensitivity Reactions.pptx
Types of Hypersensitivity Reactions.pptxTypes of Hypersensitivity Reactions.pptx
Types of Hypersensitivity Reactions.pptx
 
All-domain Anomaly Resolution Office Supplement to Oak Ridge National Laborat...
All-domain Anomaly Resolution Office Supplement to Oak Ridge National Laborat...All-domain Anomaly Resolution Office Supplement to Oak Ridge National Laborat...
All-domain Anomaly Resolution Office Supplement to Oak Ridge National Laborat...
 

Using full-text data to create improved term maps

  • 1. Using full-text data to create improved term maps
      Nees Jan van Eck¹, Ludo Waltman¹, Min Song², and Yoo Kyung Jeong²
      ¹ Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
      ² Department of Library and Information Science, Yonsei University, Seoul, Republic of Korea
      16th International Conference on Scientometrics & Informetrics, Wuhan, China, October 19, 2017
  • 2. Introduction
      • Traditionally, bibliometric analyses are based on metadata of scientific publications
      • Full text of scientific publications is increasingly becoming available in structured formats
      • We study different approaches for creating VOSviewer term maps using full-text data
      • We perform comparisons with a traditional approach based on titles and abstracts
  • 4. Interpretation of a term map
      • Size:
          – The larger a term, the higher the frequency of occurrence of the term
      • Distance:
          – In general, the smaller the distance between two terms, the higher the relatedness of the terms, as measured by co-occurrences
          – Horizontal and vertical axes have no special meaning
      • Colors:
          – Colors indicate clusters of closely related terms
  • 5. Creating a term map
      1. Input English-language text corpus
      2. Identify terms
      3. Count co-occurrences of terms
      4. Create layout and clustering
  • 6. Counting co-occurrences of terms
      • Full counting:
          – All occurrences of a term in a document are counted
      • Binary counting:
          – Only the presence or absence of a term matters
          – Number of occurrences of a term is not taken into account
  • 7. Data
      • Full text of publications in Journal of Informetrics
      • 688 publications in the period 2007-2016
      • Downloaded in XML format using the Elsevier ScienceDirect Article Retrieval API
      • Average per publication:
          Sections      6.0
          Paragraphs   42.1
          Sentences   191.1
  • 10. Titles and abstracts / binary counting
  • 11. Full text, publication level / full counting
  • 12. Full text, paragraph level / full counting
  • 13. Conclusions
      • Full text vs. titles and abstracts:
          – Full text yields richer maps than titles and abstracts
          – Richer maps may be useful for interactive visualization, perhaps not for static visualization
      • Full counting vs. binary counting:
          – When using full-text data, full counting is preferable to binary counting
      • Paragraph level vs. publication level:
          – Paragraph-level maps have more fine-grained structure than publication-level maps
          – However, areas in paragraph-level maps do not always represent topics in the literature
  • 14. Future research
      • Use full-text data for creating other types of maps, in particular co-citation maps
  • 15. Thank you for your attention!
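
As an illustration of the counting schemes on slide 6 and the publication-level versus paragraph-level distinction drawn in the conclusions, here is a minimal Python sketch. It is not VOSviewer's implementation. Several things are assumptions made only for this example: the function name `cooccurrences`, the use of single whitespace-separated words as terms (the slides do not say how terms are identified), the toy two-paragraph publication, and the convention that under full counting a pair of terms contributes the product of their occurrence counts within a text unit.

```python
from collections import Counter
from itertools import combinations


def cooccurrences(units, terms, binary=True):
    """Count term co-occurrences over text units (publications or paragraphs).

    Assumption for this sketch: with full counting, a pair contributes the
    product of the two terms' occurrence counts in the unit; with binary
    counting, it contributes 1 if both terms are present.
    """
    counts = Counter()
    for text in units:
        tokens = text.lower().split()
        # Occurrences of each term of interest in this unit.
        occ = Counter(t for t in tokens if t in terms)
        if binary:
            # Only presence or absence matters.
            occ = Counter({t: 1 for t in occ})
        for a, b in combinations(sorted(occ), 2):
            counts[(a, b)] += occ[a] * occ[b]
    return counts


# Hypothetical toy publication consisting of two paragraphs.
paragraphs = [
    "citation impact citation analysis",
    "journal impact",
]
terms = {"citation", "impact", "journal"}

# Publication level: the whole text is one unit.
publication = [" ".join(paragraphs)]

pub_binary = cooccurrences(publication, terms, binary=True)
pub_full = cooccurrences(publication, terms, binary=False)
# Paragraph level: each paragraph is its own unit.
par_full = cooccurrences(paragraphs, terms, binary=False)

print(dict(pub_binary))
print(dict(pub_full))
print(dict(par_full))
```

In this toy case, "citation" and "journal" co-occur at the publication level but never at the paragraph level, which is one way to see why paragraph-level counts can yield a more fine-grained structure.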