Citation Graph Analysis to Identify Memes in
Scientific Literature
Tobias Kuhn and Matjaz Perc and Dirk Helbing
http://www.tkuhn.ch
@txkuhn
ETH Zurich
NetSci 2014 — Network Science Conference
5 June 2014
Citation Graph of Scientific Publications
Entire giant component (33 million nodes) of the citation
graph of Thomson Reuters' Web of Science dataset.
Legend:
Natural/Agricultural Sciences
(except Physical Sciences)
Physical Sciences
Engineering and Technology
Medical and Health Sciences
Social Sciences / Humanities
Tobias Kuhn, ETH Zurich Citation Graph Analysis to Identify Memes in Scientific Literature 2 / 15
Citation Graph: American Physical Society
Citation graph of the Physical Review journals (463k nodes).
Legend:
A: Atomic, molecular,
optical phys.
B: Condensed matter,
materials phys.
C: Nuclear phys.
D: Particles, fields, gravitation,
cosmology
E: Statistical, nonlinear,
soft matter phys.
other journals
Citation Graph: Memes
Specific phrases or “memes”
localize to specific regions in
the citation graph.
Legend:
quantum
fission
graphene
self-organized criticality
traffic flow
Scientific Memes
“Meme” was coined by Richard Dawkins:
“Just as genes propagate themselves in the gene pool by leaping from body
to body via sperm or eggs, so memes propagate themselves in the meme pool
by leaping from brain to brain via a process which, in the broad sense, can
be called imitation.” [Dawkins, The Selfish Gene]
Examples of memes:
• Melodies
• Recipes
• Cultural habits
• Scientific concepts
Genes/Memes as Network Patterns!
Dawkins’ Definition of “Gene”:
“I am using the word gene to mean a genetic unit that is small enough to last
for a number of generations and to be distributed around in many copies.”
[Dawkins, The Selfish Gene]
Our Working Definition of “Scientific Meme”:
A scientific meme is a short unit of text in a publication that is replicated in
citing publications and thereby distributed around in many copies.
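As a rough illustration of the working definition, candidate memes can be enumerated as word n-grams of each publication's text. This is a minimal sketch, not the authors' pipeline: the function name `candidate_ngrams`, the tokenizer regex, and the `max_n` cutoff are assumptions made here for illustration.

```python
import re


def candidate_ngrams(text, max_n=3):
    """Return the set of word n-grams (1 <= n <= max_n) found in text.

    Hypothetical helper: the tokenizer and the max_n cutoff are illustrative
    assumptions, not taken from the slides.
    """
    # Keep letters, digits and hyphens together so that terms such as
    # "self-organized" or "MgB2" survive as single tokens.
    tokens = re.findall(r"[A-Za-z0-9-]+", text)
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


if __name__ == "__main__":
    print(sorted(candidate_ngrams("traffic flow models", max_n=2)))
```

Each n-gram found this way is only a candidate; whether it behaves like a meme depends on how its occurrences line up with the citation graph.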
Propagation Score
Propagation score P quantifies the degree to which a meme’s
occurrence aligns with the citation graph:
$$P_m = \frac{\text{sticking factor}}{\text{sparking factor}} = \frac{d_{m \to m} \,/\, d_{\to m}}{d_{m \to \not m} \,/\, d_{\to \not m}}$$
To prevent infrequent phrases from getting a high propagation score by chance, we can
add a small amount of controlled noise δ (we use δ = 3):
$$P_m = \frac{d_{m \to m}}{d_{\to m} + \delta} \Bigg/ \frac{d_{m \to \not m} + \delta}{d_{\to \not m} + \delta}$$
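The propagation score can be sketched in a few lines of Python. The slide does not spell out the four counts, so this sketch assumes the following reading: d_{→m} counts publications that cite at least one meme-carrying publication, d_{m→m} counts those among them that carry the meme themselves, d_{→¬m} counts publications that cite no meme-carrying publication, and d_{m→¬m} counts those among them that nevertheless carry the meme. The function name `propagation_score` and the dictionary-based graph representation are hypothetical.

```python
def propagation_score(cites, carriers, delta=3.0):
    """Propagation score of one meme (sticking factor / sparking factor).

    cites:    dict mapping each publication id to the ids it cites
    carriers: set of publication ids whose text contains the meme
    delta:    controlled noise term (the slides use delta = 3)

    The interpretation of the four counts is an assumption of this sketch.
    """
    d_m_to_m = d_to_m = d_m_to_not_m = d_to_not_m = 0
    for pub, refs in cites.items():
        has_meme = pub in carriers
        cites_meme = any(ref in carriers for ref in refs)
        if cites_meme:
            d_to_m += 1
            d_m_to_m += has_meme      # meme "sticks" to a citing paper
        else:
            d_to_not_m += 1
            d_m_to_not_m += has_meme  # meme "sparks" without being cited
    sticking = d_m_to_m / (d_to_m + delta)
    sparking = (d_m_to_not_m + delta) / (d_to_not_m + delta)
    return sticking / sparking


if __name__ == "__main__":
    # Toy graph: b and c cite a, d cites b; a, b and d carry the meme.
    cites = {"a": [], "b": ["a"], "c": ["a"], "d": ["b"], "e": []}
    carriers = {"a", "b", "d"}
    print(propagation_score(cites, carriers))           # with delta = 3
    print(propagation_score(cites, carriers, delta=0))  # without noise
```

With `delta=0` the function raises `ZeroDivisionError` when no publication cites a carrier, or when every publication that cites no carrier also lacks the meme, which is one practical reason to keep the noise term positive.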
Frequency/Propagation Score for APS Data
[Figure: density of n-grams by propagation score and relative frequency (log scales) for the APS data, n = 1,372,365. Highlighted terms: quantum, fission, graphene, self-organized criticality, traffic flow.]
Randomized Network
[Figure: density of n-grams by propagation score and relative frequency (log scales) for the APS data with a randomized (time preserving) network, n = 89,356.]
Meme Score
The meme score M is the product of relative frequency f and
propagation score P:
$$M_m = f_m P_m$$
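A small sketch of how the product behaves when ranking candidate terms. The relative frequency is taken here to be the share of publications that contain the term, which is an assumption of this sketch. The term names and all numbers below are made-up toy values, not results from the APS data, and `meme_score` and `rank_by_meme_score` are hypothetical helper names.

```python
def meme_score(n_carriers, n_publications, propagation):
    """Meme score M_m = f_m * P_m.

    f_m is computed as n_carriers / n_publications (assumed definition of
    relative frequency); propagation is the propagation score P_m.
    """
    return (n_carriers / n_publications) * propagation


def rank_by_meme_score(candidates, n_publications):
    """Sort term names by descending meme score.

    candidates: dict mapping term -> (n_carriers, propagation score)
    """
    return sorted(
        candidates,
        key=lambda t: meme_score(candidates[t][0], n_publications, candidates[t][1]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy values only: a frequent but weakly propagating term, a rare but
    # strongly propagating term, and a rare term with a low score.
    toy = {
        "frequent_generic": (5000, 1.2),
        "rare_sticky": (40, 900.0),
        "rare_noise": (3, 2.0),
    }
    print(rank_by_meme_score(toy, n_publications=100_000))
```

Because the score is a product, a term ranks highly only if it is both reasonably frequent and strongly aligned with the citation graph; in the toy values the rare but strongly propagating term ranks ahead of the frequent generic one.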
Top 20 Memes:
1. loop quantum cosmology +*
2. unparticle +*
3. sonoluminescence +*
4. MgB2 +
5. stochastic resonance +*
6. carbon nanotubes +*
7. NbSe3 +
8. black hole +*
9. nanotubes +
10. lattice Boltzmann +*
11. dark energy +*
12. Rashba
13. CuGeO3 +
14. strange nonchaotic
15. in NbSe3
16. spin Hall +
17. elliptic flow +*
18. quantum Hall +*
19. CeCoIn5 +
20. inflation +
+ annotators agreed that this is an interesting and important physics concept
* also found on the list of terms extracted from Wikipedia
Manual Annotation
• Two annotators (A1, A2): PhD students with physics degree
• Annotation with respect to (1) physics concept or not and (2)
linguistic category
• Randomly extracted phrases for comparison
[Figure: annotations by A1 and A2 for terms selected by meme score, random terms, and weighted random terms (axis: terms, 30 to 150). Categories: physics concept / not a physics concept; noun phrase / verb / adjective or adverb / other.]
Comparison to Alternative Metrics
[Figure: area under curve A (axis from 0 to 0.5) for each metric: meme score, frequency, max. absolute change over time, max. relative change over time, max. absolute difference across journals, max. relative difference across journals.]
[Figure: percentage of Wikipedia terms (0 to 100) among the top x terms by meme score (x from 10^1 to 10^3).]
40% of the top 50 terms are found on the Wikipedia list.
Evolution over Time
[Figure: meme score (0 to 12) against publication count (0.5 to 4.5 × 10^5), with year marks from 1940 to 2008. Labeled terms: graphene, entanglement, MgB2, nanotubes, carbon nanotubes, quark, neutrino, Bose-Einstein, quantum Hall, black, C60, Hubbard model, quantum wells, graphite, reactions, photoemission, black hole, tricritical, Kondo, superconducting, fission, MeV, diffuse scattering.]
Conclusions
Inheritance patterns of memes in the scientific citation graph reveal a
simple mathematical regularity.
This regularity can be formalized by the meme score.
The meme score allows memes to be studied in an exhaustive manner.
Thank you for your Attention!
Twitter: @txkuhn
Pre-print article:
http://arxiv.org/abs/1404.3757

Celebrity Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl S...
 
Detection of the elusive dangling OH ice features at ~2.7 μm in Chamaeleon I ...
Detection of the elusive dangling OH ice features at ~2.7 μm in Chamaeleon I ...Detection of the elusive dangling OH ice features at ~2.7 μm in Chamaeleon I ...
Detection of the elusive dangling OH ice features at ~2.7 μm in Chamaeleon I ...
 

Citation Graph Analysis to Identify Memes in Scientific Literature

  • 1. Citation Graph Analysis to Identify Memes in Scientific Literature. Tobias Kuhn, Matjaz Perc, and Dirk Helbing. http://www.tkuhn.ch, @txkuhn, ETH Zurich. NetSci 2014 (Network Science Conference), 5 June 2014.
  • 2. Citation Graph of Scientific Publications: Entire giant component (33 million nodes) of the citation graph of Thomson Reuters' Web of Science dataset. Legend: Natural/Agricultural Sciences (except Physical Sciences); Physical Sciences; Engineering and Technology; Medical and Health Sciences; Social Sciences / Humanities.
  • 3. Citation Graph: American Physical Society. Citation graph of the Physical Review journals (463k nodes). Legend: A: Atomic, molecular, optical phys.; B: Condensed matter, materials phys.; C: Nuclear phys.; D: Particles, fields, gravitation, cosmology; E: Statistical, nonlinear, soft matter phys.; other journals.
  • 4. Citation Graph: Memes. Specific phrases or “memes” localize to specific regions in the citation graph. Legend: quantum; fission; graphene; self-organized criticality; traffic flow.
  • 5. Scientific Memes: “Meme” was coined by Richard Dawkins: “Just as genes propagate themselves in the gene pool by leaping from body to body via sperm or eggs, so memes propagate themselves in the meme pool by leaping from brain to brain via a process which, in the broad sense, can be called imitation.” [Dawkins, The Selfish Gene] Examples of memes: melodies, recipes, cultural habits, scientific concepts.
  • 6. Genes/Memes as Network Patterns! Dawkins’ Definition of “Gene”: “I am using the word gene to mean a genetic unit that is small enough to last for a number of generations and to be distributed around in many copies.” [Dawkins, The Selfish Gene] Our Working Definition of “Scientific Meme”: A scientific meme is a short unit of text in a publication that is replicated in citing publications and thereby distributed around in many copies.
  • 7. Propagation Score: The propagation score P quantifies the degree to which a meme’s occurrence aligns with the citation graph: $P_m = \frac{\text{sticking factor}}{\text{sparking factor}} = \frac{d_{m \to m} / d_{\to m}}{d_{m \to \not m} / d_{\to \not m}}$. To keep infrequent phrases from getting a high propagation score by chance, we can add a small amount of controlled noise δ (we use δ = 3): $P_m = \frac{d_{m \to m} / (d_{\to m} + \delta)}{(d_{m \to \not m} + \delta) / (d_{\to \not m} + \delta)}$.
  • 8. Frequency/Propagation Score for APS Data: [Figure: density of n-grams over relative frequency and propagation score for the APS dataset, n = 1,372,365; highlighted memes: quantum, fission, graphene, self-organized criticality, traffic flow.]
  • 9. Randomized Network: [Figure: density of n-grams over relative frequency and propagation score for the randomized (time-preserving) APS network, n = 89,356.]
  • 10. Meme Score: The meme score M is the product of relative frequency f and propagation score P: $M_m = f_m P_m$. Top 20 Memes:
      1. loop quantum cosmology + *
      2. unparticle + *
      3. sonoluminescence + *
      4. MgB2 +
      5. stochastic resonance + *
      6. carbon nanotubes + *
      7. NbSe3 +
      8. black hole + *
      9. nanotubes +
      10. lattice Boltzmann + *
      11. dark energy + *
      12. Rashba
      13. CuGeO3 +
      14. strange nonchaotic
      15. in NbSe3
      16. spin Hall +
      17. elliptic flow + *
      18. quantum Hall + *
      19. CeCoIn5 +
      20. inflation +
    Key: + annotators agreed that this is an interesting and important physics concept; * also found on the list of terms extracted from Wikipedia.
  • 11. Manual Annotation: Two annotators (A1, A2), PhD students with physics degrees. Annotation with respect to (1) physics concept or not and (2) linguistic category. Randomly extracted phrases for comparison. [Figure: annotations by A1 and A2 for terms selected by meme score, random terms, and weighted random terms; categories: physics concept, not a physics concept, noun phrase, verb, adjective or adverb, other.]
  • 12. Comparison to Alternative Metrics: [Figure: area under curve A (scale 0 to 0.5) for meme score, frequency, max. absolute change over time, max. relative change over time, max. absolute difference across journals, and max. relative difference across journals.] [Figure: percentage of Wikipedia terms among the top x terms by meme score.] 40% of the top 50 terms are found on the Wikipedia list.
  • 13. Evolution over Time: [Figure: meme score against publication count, with year markers from 1940 to 2008; labeled memes: graphene, entanglement, MgB2, nanotubes, carbon nanotubes, quark, neutrino, Bose-Einstein, quantum Hall, black, C60, Hubbard model, quantum wells, graphite, reactions, photoemission, black hole, tricritical, Kondo, superconducting, fission, MeV, diffuse scattering.]
  • 14. Conclusions: Inheritance patterns of memes in the scientific citation graph reveal a simple mathematical regularity. This regularity can be formalized by the meme score. The meme score allows memes to be studied in an exhaustive manner.
  • 15. Thank you for your Attention! Twitter: @txkuhn. Pre-print article: http://arxiv.org/abs/1404.3757
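
The propagation score (slide 7) and meme score (slide 10) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the following reading of the counts, which the slides do not spell out: d_{m→m} is the number of publications that carry the meme and cite at least one meme-carrying publication, d_{→m} is the number of all publications citing at least one meme-carrying publication, and the "not m" counts are the corresponding numbers for publications that cite no meme-carrying publication. It also assumes that the relative frequency f is the fraction of publications carrying the meme. The function names, the dictionary representation of the citation graph, and the toy data are hypothetical.

```python
from typing import Dict, Set


def propagation_score(cites: Dict[str, Set[str]], carriers: Set[str],
                      delta: float = 3.0) -> float:
    """Propagation score P_m with controlled noise delta (slide 7).

    cites maps each publication id to the set of ids it cites.
    carriers is the set of publication ids that contain the meme.
    The interpretation of the four counts is an assumption (see lead-in).
    With delta = 0 this can raise ZeroDivisionError on small graphs.
    """
    d_m_to_m = d_to_m = d_m_to_not_m = d_to_not_m = 0
    for pub, refs in cites.items():
        has_meme = pub in carriers
        cites_meme = any(ref in carriers for ref in refs)
        if cites_meme:
            d_to_m += 1
            d_m_to_m += int(has_meme)
        else:
            d_to_not_m += 1
            d_m_to_not_m += int(has_meme)
    sticking = d_m_to_m / (d_to_m + delta)
    sparking = (d_m_to_not_m + delta) / (d_to_not_m + delta)
    return sticking / sparking


def meme_score(cites: Dict[str, Set[str]], carriers: Set[str],
               delta: float = 3.0) -> float:
    """Meme score M_m = f_m * P_m (slide 10).

    f_m is taken to be the fraction of publications carrying the meme,
    which is an assumption about how relative frequency is measured.
    """
    n_carrying = sum(1 for pub in cites if pub in carriers)
    frequency = n_carrying / len(cites)
    return frequency * propagation_score(cites, carriers, delta)


# Hypothetical toy citation graph: six publications, three carry the meme.
toy_cites = {
    "a": set(),
    "b": {"a"},
    "c": {"b"},
    "d": {"a"},
    "e": set(),
    "f": {"e"},
}
toy_carriers = {"a", "b", "c"}

p = propagation_score(toy_cites, toy_carriers)  # uses delta = 3 as on slide 7
m = meme_score(toy_cites, toy_carriers)
print(f"P = {p:.3f}, M = {m:.3f}")
```

In the toy graph the four counts are 2, 3, 1 and 3, so with δ = 3 the sticking factor is 2/6 and the sparking factor is 4/6. The noise term pulls the score of such a small example down noticeably, which is the intended effect for infrequent phrases.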