Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Detection of Embryonic Research Topics
by Analysing Semantic Topic Networks
Angelo Antonio Salatino, Enrico Motta
@angelos...
Detecting Topic Trends
• In a recognised research area we can find
two main stages:
– initial stage
– recognised
• Can we ...
Hypothesis
• We hypothesise the existence of an earlier
embryonic phase:
– The topic itself has no label, but
– We theoriz...
Experiment
• Dataset
– Semantically-enhanced co-occurrence graph
• Selection Phase
– Debutant topics vs. Control group
• A...
Experiment: Dataset
• From the topic network
we selected two groups
of topics:
– debutant group: topics
that made their de...
Experiment: Selection Phase
For each testing topic we have:
Experiment: Analysis Phase
Clique metric:
• Harmonic
mean
• Arithmetic
mean
Timeline metric:
• Linear regression of the ti...
Findings
• We performed two evaluations over 3 million
publications
• Preliminary Evaluation:
– 2 topics in the debutant g...
Findings: Preliminary Evaluation
• AM-N: arithmetic mean and the
difference between the two
extreme values;
• AM-CF: arith...
Findings: Interesting Insights
Findings: Evaluation
• We used different
sizes of the subgraph
associated to each
testing topic
p-values ≤ 1.28•10-51
Conclusion
• Our findings confirm the initial hypothesis
• Next step:
– Automatic detection of embryonic topics by
analysi...
Detection of Embryonic Research Topics by Analysing Semantic Topic Networks
Upcoming SlideShare
Loading in …5
×

Detection of Embryonic Research Topics by Analysing Semantic Topic Networks

603 views

Published on

Being aware of new research topics is an important asset for anybody involved in the research environment, including researchers, academic publishers and institutional funding bodies. In recent years, the amount of scholarly data available on the web has increased steadily, allowing the development of several approaches for detecting emerging research topics and assessing their trends. However, current methods focus on the detection of topics which are already associated with a label or a substantial number of documents. In this paper, we address instead the issue of detecting embryonic topics, which do not possess these characteristics yet. We suggest that it is possible to forecast the emergence of novel research topics even at such early stage and demonstrate that the emergence of a new topic can be anticipated by analysing the dynamics of pre-existing topics. We present an approach to evaluate such dynamics and an experiment on a sample of 3 million research papers, which confirms our hypothesis. In particular, we found that the pace of collaboration in sub-graphs of topics that will give rise to novel topics is significantly higher than the one in the control group.

Published in: Science
  • Be the first to comment

  • Be the first to like this

Detection of Embryonic Research Topics by Analysing Semantic Topic Networks

  1. 1. Detection of Embryonic Research Topics by Analysing Semantic Topic Networks Angelo Antonio Salatino, Enrico Motta @angelosalatino SAVE-SD @ WWW2016
  2. 2. Detecting Topic Trends • In a recognised research area we can find two main stages: – initial stage – recognised • Can we intervene before?
  3. 3. Hypothesis • We hypothesise the existence of an earlier embryonic phase: – The topic itself has no label, but – We theorize that they can be detected by analysing the dynamics of already established research areas
  4. 4. Experiment • Dataset – Semantically-enhanced co-occurrence graph • Selection Phase – Debutant topics vs. Control group • Analysis Phase – Statistical analysis of the two populations
  5. 5. Experiment: Dataset • From the topic network we selected two groups of topics: – debutant group: topics that made their debut in the period between 2000 and 2010 – control group: already existing in the decade 2000-10Semantic Topic Networks using Klink-2 by Osborne et al. @ ISWC 2015 semantic web technology semantic web semantic web technologies ontology mapping ontology matching case study knowledge management systems knowledge management system linked datum linked data fast implementation
  6. 6. Experiment: Selection Phase For each testing topic we have:
  7. 7. Experiment: Analysis Phase Clique metric: • Harmonic mean • Arithmetic mean Timeline metric: • Linear regression of the time series • Difference between the extreme values 𝛼 slope
  8. 8. Findings • We performed two evaluations over 3 million publications • Preliminary Evaluation: – 2 topics in the debutant group (Semantic Web and Cloud Computing) – Tested all the combination of the mentioned techniques • Evaluation: – 50 topic in both debutant and non-debutant group
  9. 9. Findings: Preliminary Evaluation • AM-N: arithmetic mean and the difference between the two extreme values; • AM-CF: arithmetic mean and the linear interpolation; • HM-N: harmonic mean and the difference between the first and the last values; • HM-CF: harmonic mean and the linear interpolation. Splittedbyyear p-value = 7.0280•10-12Semantic Web Cloud Computing
  10. 10. Findings: Interesting Insights
  11. 11. Findings: Evaluation • We used different sizes of the subgraph associated to each testing topic p-values ≤ 1.28•10-51
  12. 12. Conclusion • Our findings confirm the initial hypothesis • Next step: – Automatic detection of embryonic topics by analysing the topic network and identify sub- graps exhibiting such dynamics – Analyse dynamics in other networks (e.g., authors and venues)

×