The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas

Ontologies of research areas are important tools for characterising, exploring, and analysing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and the last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 15K topics and 70K semantic relationships. It was created by applying the Klink-2 algorithm on a very large dataset of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications. CSO powers several tools adopted by the editorial team at Springer Nature and has been used to enable a variety of solutions, such as classifying research publications, detecting research communities, and predicting research trends. To facilitate the uptake of CSO we have developed the CSO Portal, a web application that enables users to download, explore, and provide granular feedback on CSO at different levels. Users can use the portal to rate topics and relationships, suggest missing relationships, and visualise sections of the ontology. The portal will support the publication of and access to regular new releases of CSO, with the aim of providing a comprehensive resource to the various communities engaged with scholarly data.

Published in: Science

  1. 1. The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas Angelo A. Salatino, Thiviyan Thanapalasingam, Andrea Mannocci, Francesco Osborne, Enrico Motta @angelosalatino Knowledge Media Institute The Open University United Kingdom
  2. 2. Ontologies of Research Areas I. making sense of the research dynamics II. classifying publications III. identifying research communities IV. forecasting research trends
  3. 3. Ontologies and Taxonomies of Research Areas Mathematics Subject Classification – MSC2010 Physics and Astronomy Classification Scheme (PACS) JEL Classification System Library of Congress Classification (LCC) Computing Classification System (CCS)
  4. 4. The Computer Science Ontology • Ontology of research areas, automatically generated using the Klink-2* algorithm on a dataset of 16 million publications, mainly in Computer Science • The current version of CSO includes 14K topics and 143K relationships • Main roots include Computer Science, Linguistics, Mathematics, Geometry, Semantics, and so on. *Francesco Osborne and Enrico Motta. "Klink-2: integrating multiple web sources to generate semantic topic networks." In ISWC 2015, Bethlehem, PA (USA).
  5. 5. Data Model The CSO data model includes seven semantic relations:
     • skos:broaderGeneric, which indicates that a topic is a sub-area of another one (e.g., Linked Data, Semantic Web).
     • relatedEquivalent, which indicates that two topics can be treated as equivalent for the purpose of exploring research data (e.g., Ontology Matching, Ontology Alignment).
     • contributesTo, which indicates that the research outputs of one topic contribute to another. For instance, research in Ontology Engineering contributes to the Semantic Web, but arguably Ontology Engineering is not a sub-area of the Semantic Web; that is, there is plenty of research in Ontology Engineering outside the Semantic Web area.
     • owl:sameAs, which indicates that a research concept is identical to an external resource. We used DBpedia Spotlight to connect research concepts to DBpedia.
     • primaryLabel, which states the main label for topics belonging to a cluster of relatedEquivalent topics. For instance, the topics Ontology Matching and Ontology Alignment will both have their primaryLabel set to Ontology Matching.
     • rdf:type, which states that a resource is an instance of a class. For example, a resource in our ontology is an instance of topic.
     • rdfs:label, which provides a human-readable version of a resource's name.
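The relations above can be illustrated with a minimal Python sketch. It assumes the ontology is held as plain (subject, predicate, object) tuples; the `TRIPLES` list and the helper names `build_index`, `primary_label`, and `sub_areas` are hypothetical and are not part of CSO or the CSO Portal. The direction of skos:broaderGeneric (broader topic as subject) follows the "Predicates shown" slide.

```python
from collections import defaultdict

# Hypothetical sample data that reuses the examples from the data model slide.
# Subject of skos:broaderGeneric is the broader topic, object is the sub-area.
TRIPLES = [
    ("semantic web", "skos:broaderGeneric", "linked data"),
    ("ontology matching", "relatedEquivalent", "ontology alignment"),
    ("ontology alignment", "relatedEquivalent", "ontology matching"),
    ("ontology matching", "primaryLabel", "ontology matching"),
    ("ontology alignment", "primaryLabel", "ontology matching"),
    ("ontology engineering", "contributesTo", "semantic web"),
]


def build_index(triples):
    """Group objects by (subject, predicate) for quick lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].append(obj)
    return index


def primary_label(index, topic):
    """Return the main label of the topic's relatedEquivalent cluster.

    Falls back to the topic itself when no primaryLabel is recorded.
    """
    labels = index.get((topic, "primaryLabel"), [])
    return labels[0] if labels else topic


def sub_areas(index, topic):
    """Return the direct sub-areas of a topic."""
    return list(index.get((topic, "skos:broaderGeneric"), []))


index = build_index(TRIPLES)
print(primary_label(index, "ontology alignment"))
print(sub_areas(index, "semantic web"))
```

Collapsing each relatedEquivalent cluster to its primaryLabel in this way is what lets equivalent topics be treated as one when exploring research data.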
  6. 6. CSO Generation Klink-2 is an approach for learning large-scale ontologies of research topics from corpora of scientific articles and knowledge sources on the web. Given a pair of keywords, it infers their semantic relationship: • skos:broaderGeneric • contributesTo • relatedEquivalent. Francesco Osborne and Enrico Motta. "Klink-2: integrating multiple web sources to generate semantic topic networks." In ISWC 2015, Bethlehem, PA (USA).
  7. 7. In brief
     ACM Computing Classification Scheme: • Manually crafted • Evolves slowly • Coarse-grained • High correctness • Low completeness
     Computer Science Ontology: • Automatically generated • Frequent updates • Fine-grained • Lower correctness • High completeness
  8. 8. ISWC 2018 - Call for Papers
     Topics (word cloud): database, internet, reasoning, knowledge base, artificial intelligence, access control, social networks, data mining, ontology, machine learning, semantics, privacy, knowledge representation, natural language processing, semantic web, data stream, information retrieval, ontology-based data access, web data mining, cloud environments, information visualization, mobile platform, ontology merging, ontology matching, geo-spatial data, data cleaning, semantic data, blockchain, ontology mapping, ontology engineering, question answering, linked data, data mining techniques, knowledge discovery, information extraction
     About 50% of these topics are not in the ACM Computing Classification Scheme.
     Legend: Not available in ACM CCS / Available in ACM CCS
  9. 9. Smart Topic Miner The Smart Topic Miner (STM) is a semantic application that supports the Springer Nature editorial team in classifying scholarly publications in the field of Computer Science. Francesco Osborne, Angelo Salatino, Aliaksandr Birukou, and Enrico Motta. "Automatic classification of springer nature proceedings with smart topic miner." In ISWC 2016. Kobe, Japan. http://rexplore.kmi.open.ac.uk/STM_demo
  10. 10. Smart Book Recommender Smart Book Recommender (SBR) is a web application that takes a conference as input and suggests books, proceedings, and journals that address similar topics. It helps the Springer Nature editorial team in marketing books. Thiviyan Thanapalasingam, Francesco Osborne, Aliaksandr Birukou, and Enrico Motta. "Ontology-Based Recommendation of Editorial Products." ISWC 2018. Monterey, CA (USA). http://rexplore.kmi.open.ac.uk/SBR_demo
  11. 11. Augur – Early Detection of Research Topics Augur is a method for detecting the emergence of research areas at an embryonic stage, i.e., before the topic has been consistently labelled by researchers and associated with several publications. Angelo Salatino, Francesco Osborne, and Enrico Motta. "AUGUR: Forecasting the Emergence of New Research Topics." In JCDL’18. Fort Worth, Texas, USA.
  12. 12. CSO through CSO Portal I. Browse II. Download • https://cso.kmi.open.ac.uk/downloads • or https://w3id.org/cso/downloads • It is available in OWL, Turtle and CSV format. III. Provide granular feedback This work is licensed under a Creative Commons Attribution 4.0 International License.
  13. 13. CSO Ecosystem – Let’s keep humans in the loop New Systems Use CSO Feedback Explore / Download Computer Science Ontology Update CSO Portal Community of researchers
  14. 14. CSO Portal Architecture (diagram). Visit CSO Portal: https://cso.kmi.open.ac.uk
      Actors and sources: Registered Users; Editorial Board; Rexplore Dataset; DBpedia; Klink; Computer Science Ontology
      Ontology Feedback: Topic Feedback; Relationship Feedback; Suggest New Relationship
      Revision and Update Framework: Version x.y; Snapshot of Feedbacks; Revision and Analysis of Feedbacks; Minor Revision (create version x.(y+1)); Major Revision (create version (x+1).0)
      Other labels: Annotation Ontology Browsing Ontology Users Ontology Generation Download Ontology Check Dashboard/Contributions
  18. 18. Browsing research concepts Three views allowing you to seamlessly browse CSO: • Graph • Compact • Detailed
  19. 19. Predicates shown
      Shown predicate | Ontology predicate | Example (shown form; underlying triple)
      parent of | skos:broaderGeneric | semantic web parent of linked data; semantic web skos:broaderGeneric linked data
      alternative label | relatedEquivalent | computer network alternative label of computer networks; computer network relatedEquivalent computer networks
      child of | inverseOf(skos:broaderGeneric) | semantic web child of world wide web; world wide web skos:broaderGeneric semantic web
      same as | owl:sameAs | semantic web same as dbpedia:Semantic_Web; semantic web owl:sameAs dbpedia:Semantic_Web
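The mapping from ontology predicates to the predicates shown in the browser can be sketched as follows. This is an illustration only, not the portal's code: the function name `shown_statements` and the `DISPLAY` table are hypothetical, and the phrase "alternative label of" for relatedEquivalent is an assumed wording.

```python
# Hypothetical display names for forward (subject-side) predicates.
DISPLAY = {
    "skos:broaderGeneric": "parent of",
    "relatedEquivalent": "alternative label of",  # assumed wording
    "owl:sameAs": "same as",
}


def shown_statements(topic, triples):
    """Render the statements about a topic using display predicates.

    A skos:broaderGeneric triple whose object is the topic is rendered
    through the inverse predicate, "child of".
    """
    statements = []
    for subject, predicate, obj in triples:
        if subject == topic and predicate in DISPLAY:
            statements.append(f"{subject} {DISPLAY[predicate]} {obj}")
        elif obj == topic and predicate == "skos:broaderGeneric":
            statements.append(f"{obj} child of {subject}")
    return statements


# Sample triples taken from the examples on the slide.
triples = [
    ("world wide web", "skos:broaderGeneric", "semantic web"),
    ("semantic web", "skos:broaderGeneric", "linked data"),
    ("semantic web", "owl:sameAs", "dbpedia:Semantic_Web"),
]
for line in shown_statements("semantic web", triples):
    print(line)
```

Only skos:broaderGeneric needs an explicit inverse in this sketch, because the slide lists "child of" as the one shown predicate without a stored counterpart.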
  20. 20. Browsing research concepts: content negotiation. The CSO Portal supports content negotiation to serve different representations of the same resource (URI).
      Format | Header | Resource
      HTML | text/html | https://cso.kmi.open.ac.uk/topics/semantic web
      RDF/XML | application/rdf+xml | https://cso.kmi.open.ac.uk/topics/semantic web.rdf, https://cso.kmi.open.ac.uk/topics/semantic web.xml
      Turtle | text/turtle | https://cso.kmi.open.ac.uk/topics/semantic web.ttl
      JSON-LD | application/json or application/ld+json | https://cso.kmi.open.ac.uk/topics/semantic web.json, https://cso.kmi.open.ac.uk/topics/semantic web.jsonld
      N-Triples | application/n-triples | https://cso.kmi.open.ac.uk/topics/semantic web.nt
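A client could ask for a given representation either through the Accept header or through the file suffix. The sketch below uses only the Python standard library and builds the requests without sending them. The helper names `negotiated_request` and `suffixed_url` are hypothetical, and percent-encoding the space in the topic name is an assumption (the slide shows the URLs with a raw space).

```python
from urllib.parse import quote
from urllib.request import Request

BASE = "https://cso.kmi.open.ac.uk/topics/"

# Media types as listed on the slide, keyed by a short format name.
ACCEPT = {
    "html": "text/html",
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
    "ntriples": "application/n-triples",
}


def negotiated_request(topic, fmt):
    """Build a request for the topic URI with the matching Accept header."""
    return Request(BASE + quote(topic), headers={"Accept": ACCEPT[fmt]})


def suffixed_url(topic, extension):
    """Build the suffix-based URL, e.g. extension 'ttl' for Turtle."""
    return f"{BASE}{quote(topic)}.{extension}"


request = negotiated_request("semantic web", "turtle")
print(request.full_url, request.get_header("Accept"))
print(suffixed_url("semantic web", "ttl"))
# To fetch the data, pass the request to urllib.request.urlopen(request).
```

With header-based negotiation the same URI serves every format, while the suffixed URLs are convenient for links and for tools that cannot set headers.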
  21. 21. Providing Feedback Users can offer four kinds of feedback: • Topic • Relationship • Suggest new relationship • Entire ontology
  22. 22. Editorial Panel Some functionalities are already available: • Add/Remove topic • Add/Remove relationship • Change cluster’s primary label • Check Ontology Consistency • Check Ontology state • Check History operations • Deploy Ontology
  23. 23. Release cycle • Minor revisions • Correcting specific errors • Add/Remove relationships • Add/Remove topic • Major revisions • Expanding ontology by re-running Klink-2 • New recent corpus of publications • Considering user feedback
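The version numbering implied by the release cycle (the architecture slide labels a minor revision as version x.(y+1) and a major revision as version (x+1).0) can be sketched as a small function. The name `next_version` and the "x.y" string format are assumptions made for illustration.

```python
def next_version(current, revision):
    """Return the version that follows `current` ("x.y") after a revision.

    A minor revision (error corrections, added or removed topics and
    relationships) bumps y; a major revision (re-running Klink-2 on a
    newer corpus) bumps x and resets y to 0.
    """
    major, minor = (int(part) for part in current.split("."))
    if revision == "minor":
        return f"{major}.{minor + 1}"
    if revision == "major":
        return f"{major + 1}.0"
    raise ValueError(f"unknown revision type: {revision!r}")


print(next_version("2.3", "minor"))
print(next_version("2.3", "major"))
```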
  24. 24. CSO Classifier (beta) http://w3id.org/cso/classify
  25. 25. Future Work • Currently we are working on Klink v3.0 • Extract further information from abstracts • Can take into account the feedback gathered through the portal • We plan to release ontologies in other fields of Science • Engineering • Medicine • Producing external links to other resources • E.g., mapping to other available taxonomies • Developing new features for the CSO Portal • Relevant papers and authors associated with each research topic
  26. 26. Angelo Salatino Thiviyan Thanapalasingam Andrea Mannocci Francesco Osborne Enrico Motta https://cso.kmi.open.ac.uk/about Sign Up to CSO Portal and contribute!