
II-SDV 2017: The "International Chemical Ontology Network"



The "International Chemical Ontology Network" has been established by major chemical companies to elaborate and understand the connections within the Big Data collections of chemistry and pharmacology. These collections come from different natural and human sciences and can only be explained through their overlapping structures and connections. On this basis, new innovative approaches are to be defined, new knowledge generated, and products produced and marketed. The whole value creation process can thus be simulated and defined in the initial phase of a project, making it possible to counteract unfavorable developments at a very early stage. The clear structure and the incorporation of all relevant and common rules will help to improve the understanding of overlapping structures in the research, development and production processes. This should lead to considerable savings in costs and resources and unlock the full innovation potential of the Industry 4.0 approach.


  1. 1. INCOT.NET: International Chemical Ontology Network. René Deplanque, at the International Information Conference on Search, Data Mining and Visualization.
  3. 3. SUMMARY • The final goal of this project is to build a system that is available to all members of the network. • It will be developed to make Big Data collections from various fields of chemistry / pharmacology manageable using parent ontologies. • This will simplify and accelerate the development of an innovative Industry 4.0 approach.
  4. 4. PAIN POINTS OF TODAY'S DATA COLLECTIONS: • Access issues: problems with finding and/or getting access to data. • Audience issues: who is looking at the data, how they perceive it, perspectives, language of the discipline. • Chemical structure representation issues: which areas are problematic (inorganic, organometallic, large molecules, mixtures, chiral centers). • Community issues: policies, procedures, and best practices we need to adopt to move things forward. • Data issues: standardization/interoperability, metadata, gaps, scale, sharing, dark data. • Ontology/vocabulary issues: consensus on terms, maintenance, versions, optimal vocabularies, areas where they are needed. • Tools to help with data/metadata capture: adding metadata, feedback, consistency, synchronization.
  5. 5. [Chart: Technology trends of the coming 3-5 years, each rated Important / Unchanged / Unimportant. Basis: 3,700 managers worldwide. Technologies: Internet of Things, AI, 3D printing, virtual reality, cloud computing, social media, mobility, analytics, security. Industries: energy/utilities, consumer goods, entertainment/media, administration, insurance, IT technologies, pharmaceuticals, production industries, trade, telecommunication, banks.]
  6. 6. Source: Krallinger, M. et al. (2005) Text-mining approaches in molecular biology and biomedicine. DDT 10(6) 440
  7. 7. Ontology Defined Google Definitions on the web • An ontology is a controlled vocabulary that describes objects and the relations between them in a formal way, and has a grammar for using the vocabulary terms to express something meaningful within a specified domain of interest. Source: • Ontology is the newest label attached to some KOSs. Ontologies are being developed as specific concept models by the Knowledge Management community. They can represent complex relationships between objects, and include the rules and axioms missing from semantic networks. Ontologies that describe knowledge in a specific area are often connected with systems for data mining and knowledge management. Source:
  8. 8. CREATING A COMPUTABLE CHEMICAL TAXONOMY REQUIRES THREE KEY COMPONENTS: • A well-defined hierarchical taxonomic structure. • A dictionary of chemical classes (with full definitions and category mappings). • Computable rules or algorithms for assigning chemicals to taxonomic categories.
  9. 9. Semantic Web The Semantic Web "layer cake" as presented by Tim Berners-Lee. Source: Hendler, J. (2001) Agents and the semantic web.
  10. 10. KNOWN CLASSIFICATION SYSTEMS OF CHEMICAL SUBSTANCES • Classification as defined by EU regulations: Regulation (EC) No 1272/2008 on classification, labelling and packaging of substances and mixtures (the 'CLP Regulation'). • Classification as defined by the UBA (Germany's environmental protection agency): these criteria and limiting values help to determine hazardous physical-chemical properties as well as health and environmental hazards. • The Anatomical Therapeutic Chemical Classification System (ATC/DDD) of the World Health Organization (WHO): the purpose of the ATC/DDD system is to serve as a tool for drug utilization research in order to improve the quality of drug use.
  11. 11. • GUIDANCE ON THE CLASSIFICATION OF HAZARDOUS CHEMICALS UNDER THE WHS REGULATIONS: This guidance is intended for manufacturers and importers of substances, mixtures and articles who have a duty under the Work Health and Safety (WHS) Act and Regulations to classify them. Classification is based on the Globally Harmonised System of Classification and Labelling of Chemicals (the GHS). The WHS Regulations also implement the harmonised hazard communication elements of the GHS that are to appear on labels and safety data sheets (SDS). • The Chemical Fragmentation Coding system: It was developed in 1963 by the Derwent World Patents Index (DWPI) to facilitate the manual classification of chemical compounds reported in patents. The system consists of 2200 numerical codes corresponding to a set of pre-defined, chemically significant structure fragments.
  12. 12. Tools for developing Chemical Ontologies • HOSE (Hierarchical Organisation of Spherical Environments) code: this hierarchical substructure system allows one to automatically characterize atoms and complete rings in terms of their spherical environment. • Chemical Ontology (CO) system: one of the first open-source, automated functional group ontologies to be formalized. CO functional groups can be automatically assigned to a given structure by Checkmol, a freely available program. CO's assignment of functional groups is accurate and consistent, and it has been applied to several small datasets. However, the CO system is limited to just ~200 chemical groups. • SODIAC tool for automatic compound classification: it uses a comprehensive chemical ontology and an elegant structure-based reasoning logic. The underlying chemical ontology can be freely downloaded, and the SODIAC software, which is closed-source, is free for academics.
  13. 13. WHAT ARE THE MAJOR PROBLEMS ➢ In contrast to biology, geology, and many other scientific disciplines, the world of chemistry still lacks a standardized chemical ontology or taxonomy. ➢ The chemical classification of a compound could help predict its metabolic fate in humans, its druggability, or potential hazards associated with it. ➢ The sheer number (tens of millions of compounds) and complexity of chemical structures is such that any manual classification effort would prove to be nearly impossible.
  14. 14. FRAGMENT OF CHEMICAL ONTOLOGY (grouped_by_chemistry) [Figure: class hierarchy with structure drawings of morphine and morphinan linked by IsA. Class labels, in extraction order: molecules, organic molecules, heterocyclic compounds, bridged-ring heterocyclic compounds, morphinans, morphine; and two-ring heterocyclic compounds, isoquinolines, isoquinoline alkaloids, morphinans, morphine.] Source: Ennis, M. (2004) ChEBI: A Dictionary of Chemical Entities with an Associated Ontology. SOFG-2, Philadelphia, October 23-26 2004
  15. 15. [Figure: ontology relationship types illustrated with amino acid structure drawings (amino acid, L-amino acid, D-amino acid, and their charged and neutral forms). Relationship types shown: is_a, is_part_of, is_enantiomer_of, is_conjugate_base_of, is_tautomer_of.] Source: Ennis, M. (2004) ChEBI: A Dictionary of Chemical Entities with an Associated Ontology. SOFG-2, Philadelphia, October 23-26 2004
  16. 16. AUTOMATED CHEMICAL CLASSIFICATION SYSTEM However, as in PubChem, the annotation is incomplete. Class assignments to “clavams” and “azetidines”, among others, are missing
  18. 18. HOW TO WORK WITH ONTOLOGIES Michael Büttner Ontology Learning
  19. 19. THE ONTOLOGY DEVELOPMENT PROCESS Michael Büttner Ontology Learning
  20. 20. WHAT DO WE HAVE - WHAT DO WE NEED ➢ Chemists have a standardized nomenclature (IUPAC, CAS, Reaxys). ➢ Chemists have standardized methods for drawing or exchanging chemical structures. ➢ Chemistry still lacks a standardized, comprehensive, and clearly defined chemical taxonomy or chemical ontology.
  21. 21. WHAT WAS DONE ➢ Chemists have developed domain-specific ontologies. ➢ Medicinal chemists classify according to pharmaceutical activity (antibacterial, antihypertensive). ➢ Biochemists classify according to biosynthetic origin (nucleic acids, terpenoids). ➢ These domain-specific schemes do not fit together. ➢ In the PubChem database only 0.12% of the >91,000,000 compounds (as of June 2016) are classified via the MeSH thesaurus.
  22. 22. WHO AND WHAT IS INCOT.NET • The problem of defining overlapping ontologies is of such a magnitude that it cannot be solved by a single organisation. • INCOT.NET is an organisation based on an idea, need and interest of major chemical companies. • It is organized as an independent partnership. • It is attempting to coordinate a large variety of organisations to solve major pre-production problems. • One of the prototype problems will be the use of ontologies in the development of new methodologies for the development of new antibiotics.
  23. 23. Thank you for your patience; you will need it for your future.
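
The three components named on slide 8 (a hierarchical structure, a dictionary of classes, and computable assignment rules) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the one-line definitions, the feature keys such as `carbon_atoms`, and the helper names `lineage` and `classify` do not come from the slides or from any existing tool. A real system would derive the structural features with a cheminformatics toolkit rather than take them as a hand-written dictionary.

```python
# Component 1: hierarchical taxonomic structure (child class -> parent class).
# Hypothetical, deliberately tiny hierarchy.
PARENT = {
    "organic compounds": "chemical entities",
    "heterocyclic compounds": "organic compounds",
    "alcohols": "organic compounds",
}

# Component 2: dictionary of chemical classes with (simplified) definitions.
DEFINITIONS = {
    "chemical entities": "Any compound.",
    "organic compounds": "Compounds containing carbon (simplified).",
    "heterocyclic compounds": "Compounds with a ring containing a non-carbon atom.",
    "alcohols": "Compounds with a hydroxy group on a saturated carbon.",
}

# Component 3: computable rules, one predicate per class.
# Each rule reads precomputed structural features (assumed input format).
RULES = {
    "chemical entities": lambda f: True,
    "organic compounds": lambda f: f.get("carbon_atoms", 0) > 0,
    "heterocyclic compounds": lambda f: f.get("rings_with_heteroatom", 0) > 0,
    "alcohols": lambda f: f.get("hydroxyl_on_sp3_carbon", 0) > 0,
}


def lineage(cls):
    """Return the class followed by all of its ancestors, up to the root."""
    chain = [cls]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def classify(features):
    """Assign every class whose own rule and all ancestor rules are satisfied."""
    return sorted(
        cls for cls in RULES
        if all(RULES[ancestor](features) for ancestor in lineage(cls))
    )


# Hand-written feature set standing in for ethanol.
ethanol = {"carbon_atoms": 2, "hydroxyl_on_sp3_carbon": 1}
print(classify(ethanol))  # ['alcohols', 'chemical entities', 'organic compounds']
```

Requiring the ancestor rules to hold keeps assignments consistent with the hierarchy: a feature set that reports a heteroatom ring but no carbon is not labelled a heterocyclic compound, because it fails the rule for the parent class.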
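
The relationship types listed on slide 15 (is_a, is_enantiomer_of, is_conjugate_base_of, and so on) can be stored as typed triples, with only is_a followed transitively when collecting the parent classes of a term. The sketch below is a minimal illustration under stated assumptions: the specific triples, including the alanine entries, are examples chosen here and are not read from the slide, and the helpers `build_index` and `ancestors` are hypothetical names.

```python
from collections import defaultdict

# Illustrative (subject, relation, object) triples; assumed for this example.
TRIPLES = [
    ("L-amino acid", "is_a", "amino acid"),
    ("D-amino acid", "is_a", "amino acid"),
    ("L-alanine", "is_a", "L-amino acid"),
    ("D-alanine", "is_a", "D-amino acid"),
    ("L-alanine", "is_enantiomer_of", "D-alanine"),
    ("L-alaninate", "is_conjugate_base_of", "L-alanine"),
]


def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def ancestors(index, term):
    """Collect all classes reachable from `term` via is_a edges only."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in index.get(current, {}).get("is_a", ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


INDEX = build_index(TRIPLES)
print(sorted(ancestors(INDEX, "L-alanine")))  # ['L-amino acid', 'amino acid']
```

Keeping the relation type on every edge is what separates classification from other chemical relationships: in this sketch `L-alaninate` has no is_a ancestors, because being the conjugate base of something does not make it a subclass of that thing.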