Being able to rapidly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. The literature presents several approaches to identifying the emergence of new research topics, which rely on the assumption that the topic is already exhibiting a certain degree of popularity and consistently referred to by a community of researchers. However, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this dissertation, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the ‘ancestors’ of the new topic. Based on this understanding, we developed Augur, a novel approach to effectively detecting the emergence of new research topics. Augur analyses the diachronic relationships between research areas and is able to detect clusters of topics that exhibit dynamics correlated with the emergence of new research topics. Here we also present the Advanced Clique Percolation Method (ACPM), a new community detection algorithm developed specifically for supporting this task. Augur was evaluated on a gold standard of 1,408 debutant topics in the 2000-2011 timeframe and outperformed four alternative approaches in terms of both precision and recall.
2. Augur
Here we present a novel framework for identifying the
appearance of new topics at an embryonic stage:
• analyses networks of research topics
• detects areas exhibiting a significant increase in the pace of
collaboration
• produces clusters of topics correlated with the future
emergence of new research areas
4. Background
«[…] successive transition from one
paradigm to another via revolution is the
usual developmental pattern of mature
science.»
Thomas Kuhn - The Structure of Scientific
Revolutions
5. Background
«As the work and the points of view grow
more specialised, men in different
disciplines have fewer things in common, in
their background and in their daily
problems»
Clark - The study of Campus Cultures
«Sometimes, of course, friendly relations
may be established to mutual benefit ...»
Becher and Trowler - Academic Tribes and
Territories
6. Is it possible to detect a new research topic at the embryonic
stage before it is consistently recognised by a research
community (e.g., there is an established label for it)?
7. Related questions
1. Is it possible to precisely define the notion of established
topic?
2. How early in the topic lifecycle is it possible to identify an
emerging topic?
3. What are the indicators that can be exploited to predict the
emergence of new topics?
4. Is it possible to develop an effective computational method
that can support this task?
5. Are there commonalities between our approach to
predicting the emergence of new research topics and
epistemological theories of research dynamics?
6. What evaluation mechanisms are appropriate for this task?
Study #1: Chapter 4
Study #2: Chapters 5/6
8. Initial 3 questions
Research questions
1. Is it possible to precisely define the notion of established topic?
2. How early in the topic lifecycle is it possible to identify an emerging topic?
3. What are the indicators that can be exploited to predict the emergence of new topics?
Hypotheses
1. Before being labelled and recognised by research communities, new topics go through an embryonic stage, in which researchers from different fields start to work on them.
2. The emergence of a new research topic is anticipated by an increased rate of interaction among the pre-existing topics involved in developing the new area while it is still in its embryonic stage.
9. Study #1: Analysis of dynamics
• Selection Phase
• Treatment group (debutant topics) vs.
• Control group (established or non-debutant topics)
• Analysis Phase
• Statistical analysis of the pace of collaboration (diachronic
activity of triangles of collaborating topics) and changes in
network density over the two populations (treatment and
control groups)
[Plots: pace of collaboration; network density]
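The two indicators named on this slide can be illustrated with a minimal sketch. It is not the analysis used in the study: it assumes each year is available as a dictionary mapping an undirected topic pair to a co-occurrence count, measures density as existing edges over possible edges, and summarises the pace of collaboration of one triangle of topics as the least-squares slope of a per-year score. The score used here (the weakest of the three links) is a hypothetical stand-in, and the yearly data is made up.

```python
from itertools import combinations


def density(nodes, edges):
    """Undirected graph density: |E| / (|V| * (|V| - 1) / 2)."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(edges) / (n * (n - 1) / 2)


def slope(values):
    """Least-squares slope of values against 0, 1, 2, ... (one value per year)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def triangle_strength(weights, a, b, c):
    """Hypothetical per-year score: the weakest link among the three topic pairs."""
    pairs = [frozenset(p) for p in combinations((a, b, c), 2)]
    return min(weights.get(p, 0) for p in pairs)


# Toy yearly snapshots (made-up data): topic pair -> number of shared papers.
years = [
    {frozenset({"A", "B"}): 1, frozenset({"B", "C"}): 1},
    {frozenset({"A", "B"}): 2, frozenset({"B", "C"}): 2, frozenset({"A", "C"}): 1},
    {frozenset({"A", "B"}): 4, frozenset({"B", "C"}): 3, frozenset({"A", "C"}): 2},
]
nodes = {"A", "B", "C"}

densities = [density(nodes, snapshot) for snapshot in years]
strengths = [triangle_strength(snapshot, "A", "B", "C") for snapshot in years]
pace = slope(strengths)  # positive slope = collaboration is intensifying
```

A positive `pace` together with a rising `densities` series is the kind of signal the treatment group would be expected to show more strongly than the control group.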
10. Answering first 3 questions
Research questions
1. Is it possible to precisely define the notion of established topic?
Yes, but only in a practical manner, as there is no widely accepted definition of topic. We therefore relied on the number of years of activity and the number of published papers.
2. How early in the topic lifecycle is it possible to identify an emerging topic?
We can predict the emergence of a topic in its embryonic stage. As we will see later, we can do so with good precision, up to 2 years.
3. What are the indicators that can be exploited to predict the emergence of new topics?
We found that the pace of collaboration and network density are good indicators.
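The practical definition in the first answer (years of activity plus number of papers) can be sketched as a simple predicate. The thresholds and the `TopicStats` record below are hypothetical, chosen only for illustration; the slides do not state the values actually used.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; the actual values are not given here.
MIN_YEARS_ACTIVE = 5
MIN_PAPERS = 100


@dataclass
class TopicStats:
    label: str
    first_year: int  # first year the label appears in the corpus
    papers: int      # papers tagged with the label up to the reference year


def is_established(topic: TopicStats, as_of: int) -> bool:
    """A topic counts as established if it is both old enough and large enough."""
    years_active = as_of - topic.first_year + 1
    return years_active >= MIN_YEARS_ACTIVE and topic.papers >= MIN_PAPERS


old_topic = TopicStats("long-running topic", first_year=1990, papers=5000)
new_topic = TopicStats("recent label", first_year=1999, papers=12)
```

Under these assumed thresholds, `is_established(old_topic, 2000)` holds while `is_established(new_topic, 2000)` does not, which is the split between control and treatment groups in spirit.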
12. We asked 3 further questions
Research questions
4. Is it possible to develop an effective computational method that can support this task?
5. Are there commonalities between our approach to predicting the emergence of new research topics and epistemological theories of research dynamics?
6. What evaluation mechanisms are appropriate for this task?
Hypotheses
3. It is possible to create an automatic approach for detecting new emerging topics in their embryonic stage by analysing the dynamics of existing topics (i.e., observing their patterns of collaboration).
13. Study #2: Early detection of topics
• We evaluated Augur against a gold standard of 1,408 topics
• We compared ACPM against four other clustering algorithms: FG, LE, Fuzzy C-Means and CPM
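For readers unfamiliar with the CPM baseline, the following is a minimal sketch of the plain clique percolation method, not of ACPM, whose modifications are not described on these slides. It finds all k-cliques by brute force (suitable only for small graphs), treats two k-cliques as adjacent when they share k-1 nodes, and returns the union of each connected group of cliques as a community. The helper names and the toy edge list are illustrative assumptions.

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> neighbours mapping."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_cliques(adj, k):
    """Yield every k-clique as a frozenset (brute force; fine for small graphs)."""
    for group in combinations(sorted(adj), k):
        if all(v in adj[u] for u, v in combinations(group, 2)):
            yield frozenset(group)


def clique_percolation(adj, k=3):
    """Plain CPM: merge k-cliques that share k-1 nodes into communities."""
    cliques = list(k_cliques(adj, k))
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(communities.values(), key=lambda c: sorted(c))


# Toy topic graph: two triangles sharing an edge, a bridge, and a separate triangle.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("b", "d"),
    ("d", "e"),
    ("e", "f"), ("f", "g"), ("e", "g"),
]
communities = clique_percolation(build_adjacency(edges), k=3)
```

In this toy graph the bridge `d`-`e` belongs to no triangle, so the two groups of topics stay in separate communities.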
14. Answering second 3 questions
4. Is it possible to develop an effective computational method that can support this task?
Yes, Augur uses both pace of collaboration (PoC) and network density (ND) to identify topics in their embryonic stage, up to two years, with high precision.
5. Are there commonalities between our approach to predicting the emergence of new research topics and epistemological theories of research dynamics?
Yes, we were inspired by those theories. In addition, our results provide further empirical evidence, highlighting the role of multidisciplinarity in the creation of new research areas.
6. What evaluation mechanisms are appropriate for this task?
We created a gold standard of 1,408 debutant topics. We extracted their ancestors and compared them against the clusters produced by Augur.
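The comparison between predicted clusters and gold-standard ancestors can be sketched as follows. This is an illustrative scheme under stated assumptions, not the evaluation protocol of the study: a cluster is counted as matching a debutant topic when the Jaccard similarity between the cluster and the topic's ancestor set reaches a hypothetical threshold; precision is the share of clusters with a match, recall the share of gold topics matched. The topic names are toy data.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of topic labels."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def evaluate(clusters, gold_ancestors, threshold=0.5):
    """Return (precision, recall) of clusters against gold ancestor sets.

    `threshold` is a hypothetical matching cut-off chosen for illustration.
    """
    matched_clusters = set()
    matched_topics = set()
    for i, cluster in enumerate(clusters):
        for topic, ancestors in gold_ancestors.items():
            if jaccard(cluster, ancestors) >= threshold:
                matched_clusters.add(i)
                matched_topics.add(topic)
    precision = len(matched_clusters) / len(clusters) if clusters else 0.0
    recall = len(matched_topics) / len(gold_ancestors) if gold_ancestors else 0.0
    return precision, recall


# Toy data: debutant topic -> set of ancestor topics (not taken from the gold standard).
gold = {
    "topic X": {"world wide web", "knowledge representation", "artificial intelligence"},
    "topic Y": {"grid computing", "virtualization", "web services"},
}
clusters = [
    {"world wide web", "knowledge representation", "information retrieval"},
    {"robotics", "computer vision"},
]
precision, recall = evaluate(clusters, gold)
```

Here the first cluster overlaps the ancestors of "topic X" on two of four distinct labels, so it matches at the 0.5 threshold, while the second cluster matches nothing.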
15. Limitations & Future Work
Limitations
• Gold Standard
• Scope
Future Work
• Develop a new version of Augur and test it on new data
• Test the new version of CSO
• Investigate other dynamics