
Forecasting the Spreading of Technologies in Research Communities @ K-CAP 2017

Technologies such as algorithms, applications and formats are an
important part of the knowledge produced and reused in the
research process. Typically, a technology is expected to originate
in the context of a research area and then spread and contribute to several other fields. For example, Semantic Web technologies
have been successfully adopted by a variety of fields, e.g.,
Information Retrieval, Human Computer Interaction, Biology, and
many others. Unfortunately, the spreading of technologies across
research areas may be a slow and inefficient process, since it is
easy for researchers to be unaware of potentially relevant
solutions produced by other research communities. In this paper,
we hypothesise that it is possible to learn typical technology
propagation patterns from historical data and to exploit this
knowledge i) to anticipate where a technology may be adopted
next and ii) to alert relevant stakeholders about emerging and
relevant technologies in other fields. To do so, we propose the
Technology-Topic Framework, a novel approach which uses a
semantically enhanced technology-topic model to forecast the
propagation of technologies to research areas. A formal evaluation
of the approach on a set of technologies in the Semantic Web and
Artificial Intelligence areas has produced excellent results,
confirming the validity of our solution.

Published in: Science

  1. Francesco Osborne, Andrea Mannocci, Enrico Motta • Knowledge Media Institute, The Open University, United Kingdom • K-CAP 2017 • Forecasting the Spreading of Technologies in Research Communities
  2. Standing on the Shoulders of Giants (and Technologies) • We constantly reuse ideas, technologies, methods and materials. • Technologies usually appear in one research community and then spread to other research areas in the following years. – e.g., SW technologies were created in the fields of AI, KBS, and WWW, and then spread to Information Retrieval, HCI, Biology and so on. • This process is often inefficient and may take several years. • Currently there are no methods to predict technology spreading across research areas.
  3. Standing on the shoulders of giants (and technologies) • We constantly reuse ideas, technologies, methods and materials. • Technologies usually appear in one research community and then spread to other research areas in the following years. – e.g., SW technologies were created in the fields of AI, KBS, and WWW, and then spread to Information Retrieval, HCI, Biology and so on. • This process is often inefficient and may take several years. • Currently there are no methods to predict technology spreading across research areas. • How can we improve technology transfer? • How can we help researchers to track down relevant technologies?
  4. Predict Technology Spreading: The Technology-Topic Framework (TTF) is a novel approach for predicting the technologies that will be adopted in a research field. • It is based on the hypothesis that technologies that exhibit similar spreading patterns will be adopted by similar communities.
  5. Input Knowledge Bases: TTF takes as input three knowledge bases: 1) A dataset of research papers, described by means of their titles, abstracts, and keywords; – A dump of the Scopus database in the 1990-2013 period, containing about 16 million papers
  6. Input Knowledge Bases: TTF takes as input three knowledge bases: 1) A dataset of research papers, described by means of their titles, abstracts, and keywords; – A dump of the Scopus database in the 1990-2013 period, containing about 16 million papers 2) A list of input technologies, associated with the relevant publications in the research paper dataset; – 1,118 technologies, extracted with TechMiner and from Wikipedia, which appeared in more than 10 publications in the Scopus dataset. – We focused on three categories: • algorithms/approaches (e.g., Support Vector Machines, Particle Swarm Optimisation, Latent Semantic Analysis) • formats (e.g., Rule Interchange Format, OWL 2, Systems Modeling Language) • applications (e.g., OntoClean, Taverna, Annotea).
  7. Input Knowledge Bases 2: 3) An ontology of research areas, describing topics and their relationships. • Computer Science Ontology (CSO), automatically created by the Klink-2 algorithm and currently trialled by Springer Nature to classify proceedings. It includes 15K concepts and 70K relationships. • Reference: Osborne, F. and Motta, E. (2015) Klink-2: integrating multiple web sources to generate semantic topic networks. ISWC 2015.
  8. Generation of Technology-Topic Matrices • We associated each paper with the relevant areas in CSO by exploiting the skos:broaderGeneric and relatedEquivalent relationships. – e.g., a publication associated with SPARQL will be tagged with topics such as RDF, Linked Data, SW, WWW, and Computer Science. • We produced a sequence of matrices, one per year, each containing the number of publications that mention a given technology within a given topic in that year.
  9. Technology Propagation Forecasting: Forecasting is treated as a set of separate classification problems, one for each topic.
  10. Evaluation • We evaluated TTF on 1,118 technologies and 173 topics in the field of Computer Science during the 1990-2013 period. • Two main goals: – Confirming that it is possible to forecast technology propagation – Comparing the performance of several ML algorithms • We tested six ML algorithms: Logistic Regression, Random Forest, Decision Tree, Support Vector Machine, Neural Network, and Gradient Boosting. • On average, each topic classifier was trained on 5,136 ± 240 examples and evaluated on 679 ± 90 examples.
  11. Precision
  12. Recall
  13. Best research areas - Random Forest - F1 score
  14. Example of forecasted topics
  15. Conclusions • It is possible to forecast technology spreading, at least for some categories of topics. • TTF performed well, but it would be interesting to evaluate it with a larger set of technologies and topics. • We could use this approach to alert researchers about promising new technologies relevant to their research and to shorten the technology transfer time.
  16. Next Steps • Collecting data about more technologies to perform larger-scale experiments. • Creating an ontology of technologies and incorporating it in the analysis. • Enriching the forecasting model by considering text-generated features and external knowledge bases. • Expanding the scope of our work by including other research fields, such as Biology, Social Science, and Engineering.
  17. Francesco Osborne, Andrea Mannocci, Enrico Motta • Email: francesco.osborne@open.ac.uk • Twitter: FraOsborne • Site: people.kmi.open.ac.uk/francesco • Osborne, F., Mannocci, A. and Motta, E. (2017) Forecasting the Spreading of Technologies in Research Communities. K-CAP 2017, Austin, Texas, USA.
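
The matrix generation step described in the slides (tag each paper with its topics plus their broader areas, then count publications per technology, topic and year) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy `BROADER` map stands in for the CSO relationships, and the paper records, field names and helper names (`expand_topics`, `build_matrices`) are hypothetical rather than the authors' implementation.

```python
from collections import defaultdict

# Hypothetical toy ontology: topic -> directly broader topics.
# This is an assumed stand-in for CSO, not real CSO data.
BROADER = {
    "SPARQL": ["RDF"],
    "RDF": ["Linked Data", "Semantic Web"],
    "Linked Data": ["Semantic Web"],
    "Semantic Web": ["World Wide Web"],
    "World Wide Web": ["Computer Science"],
}


def expand_topics(topics):
    """Return the given topics plus every broader topic reachable from them."""
    seen = set()
    stack = list(topics)
    while stack:
        topic = stack.pop()
        if topic in seen:
            continue
        seen.add(topic)
        stack.extend(BROADER.get(topic, []))
    return seen


def build_matrices(papers):
    """Count papers per year, technology and (expanded) topic.

    Returns a nested mapping: matrices[year][technology][topic] -> count.
    """
    matrices = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for paper in papers:
        topics = expand_topics(paper["topics"])
        for tech in paper["technologies"]:
            for topic in topics:
                matrices[paper["year"]][tech][topic] += 1
    return matrices


# Hypothetical paper records (assumed structure, invented values).
papers = [
    {"year": 2010, "technologies": ["OWL 2"], "topics": ["SPARQL"]},
    {"year": 2010, "technologies": ["OWL 2"], "topics": ["Linked Data"]},
    {"year": 2011, "technologies": ["OWL 2"], "topics": ["RDF"]},
]
matrices = build_matrices(papers)
```

In this toy data, both 2010 papers contribute to the count for "Semantic Web", because each paper's topics are expanded upward through the ontology before counting.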

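
The idea of treating forecasting as one classification problem per topic can be sketched as follows. This is only an illustration under stated assumptions: the slides do not specify the features or labels, so the choices here (features are a technology's publication counts in the other topics, the label says whether it later appears in the target topic, and technologies already present in the target topic are skipped) are hypothetical, as are the function name `per_topic_datasets` and all the counts.

```python
def per_topic_datasets(history, future, topics):
    """Build one binary classification dataset per topic.

    history: {technology: {topic: publication count in the observed years}}
    future:  {technology: set of topics the technology appears in later}
    Returns {topic: [(features, label), ...]}.
    """
    datasets = {}
    for target in topics:
        examples = []
        for tech, counts in history.items():
            # Assumption: a technology already present in the target topic
            # is not a forecasting candidate for that topic.
            if counts.get(target, 0) > 0:
                continue
            # Features: counts in every other topic, in a fixed order.
            features = [counts.get(t, 0) for t in topics if t != target]
            # Label: 1 if the technology later reaches the target topic.
            label = int(target in future.get(tech, set()))
            examples.append((features, label))
        datasets[target] = examples
    return datasets


# Hypothetical inputs (invented counts, for illustration only).
topics = ["Semantic Web", "Information Retrieval", "Biology"]
history = {
    "SPARQL": {"Semantic Web": 12, "Information Retrieval": 3},
    "OWL 2": {"Semantic Web": 20},
}
future = {"SPARQL": {"Biology"}, "OWL 2": set()}
datasets = per_topic_datasets(history, future, topics)
```

Each entry of `datasets` could then be passed to any binary classifier, such as the Random Forest or Logistic Regression models listed in the evaluation slide, and scored per topic with precision, recall and F1.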