From chemicals to minds: Integrated ontologies in the search for scientific understanding


Presented at the 2012 Interdisciplinary Ontology (InterOntology) Conference in Tokyo, February 24th 2012. This presentation gives a whirlwind tour of some "reports from the front lines" of practical bio-ontology development in ChEBI and in the Mental Functioning and Emotion Ontology projects.

Published in: Technology, Education
  • By the end of this talk, I hope to have convinced you that studying chemicals and minds, and doing so via bio-ontology, is a perfectly sensible combination. I will give an overview, a “report from the front lines”, of some of the practical bio-ontology projects that I am involved in, distributed across computer science, philosophy and domain science, and ranging from the relatively mature ChEBI project to the fledgling MFO project.
  • I am not a chemist… ChEBI is my external chemical brain!
  • ChEBI was introduced in order to address the standardisation of chemical annotation across bioinformatics databases… i.e. the equivalent of the Gene Ontology but for chemistry within a biological context. And it does serve that purpose: many different databases use ChEBI as their chemical annotation resource. But like other bio-ontologies, it serves many other purposes by now. One of those is to be my external chemistry brain. I don’t know chemistry, and I don’t have to (for my purposes): ChEBI knows chemistry, and I know how to ask ChEBI questions.
  • ChEBI is manually curated. Chemicals are given a structure-based classification and are linked to the role ontology via the has_role relationship.
  • From Swainston et al., Subliminal toolbox ( also:
  • ChEBI is manually curated and growth rate is slow… show a picture of an unhappy person waiting in the line to get their chemical annotated…
  • This is a good argument for why ontology is useful in chemistry. Chemoinformatics is full of systems to automatically classify compounds based on their structural features. The problem is that you need a new algorithm – or a new trained statistical model – for each different problem, and this does not in any way render the result accessible to domain experts, nor provide explanations of predictions.
  • Higher expressivity is not necessarily required for question answering, since the inferred hierarchy can be exported to OWL-EL for question answering.
  • This is BFO terminology
  • The history of bio-ontologies is part of the open data movement in bioinformatics. The Gene Ontology: most successful bio-ontology. Many others exist: phenotypes, chemicals, anatomy, cells, proteins… The OBO Foundry: a coordinating organisation which brings together bio-ontologists to address matters of interoperability and integration. (Controlled vocabularies give you the benefit of standards without the benefits of AI.) So that researchers don’t have to spend their time doing data integration.
  • There are 134 hits for ‘has role’ some psychotropic in ChEBI in February 2012. This screenshot (inter alia) shows lithium (a mood stabilizer); chlorpromazine (an antipsychotic); valproate (antimanic); 5-methoxy-N,N-dimethyltryptamine (hallucinogen).
  • Mental functioning related anatomical structure: an anatomical structure in which there inheres the disposition to be the agent of a mental process. Behaviour inducing state: a bodily quality inhering in a mental functioning related anatomical structure which leads to behaviour of some sort. Affective representation: a cognitive representation sustained by an organism about its own emotions. Cognitive representation: a representation which specifically depends on an anatomical structure in the cognitive system of an organism. Mental process: a bodily process which brings into being, sustains or modifies a cognitive representation or a behaviour inducing state.
  • (Not the million dollar question, but the many billion dollars question!) We’re drowning in data and starving for knowledge! Not only different domains BUT different methods and different subjects (model organisms etc). Huge piles of different sorts of information coming out of different research areas. DIFFERENT PERSPECTIVES: if you try to get people to agree on names, they just don’t. But give them semantics-free identifiers and their own preferred (scoped) synonyms and you can get agreement on the definitions. Nobody is an expert in everything; most scientists are stuck in their narrow area of focus and expertise (which is a good thing for progress because you HAVE to become that specialised).
  • Different interpretations of the same results can ensue, based on the underlying theory of mental functioning. Linking the theory directly to the paradigm (tests) and the research results allows more straightforward generation of testable hypotheses for evaluating different theories… getting away from conceptual arguments, or at least helping to resolve them. (Explicit logical formulation.)
  • SNOMED, MeSH, ICD, ICF, Cognitive Atlas, Cognitive Paradigm Ontology. We will build on these vocabulary resources as sources, but maintain links so that we don’t lose mappings which have already been annotated to these sources. Most of these sources maintain controlled vocabularies but not real ontologies. There is a shortage of explicit relationships and formal (computable) definitions, so you can’t infer anything from annotations and you can’t link between different resources.
  • Canonical fear also involves an action tendency to fight-or-flight, a bad (powerless, negative, anxious) feeling, a behavioural response to the emotion that includes a characteristic fearful facial expression
  • Cognitive neuroscience uses research “paradigms” – experimental designs intended to allow comparison of brain activation between different conditions. The subtraction of the brain activation for the control condition from the brain activation for the test condition then gives the “net” activation, which is what is reported on in the literature, subject to statistical analysis.
  • This is, of course, just one tiny part of the story. The overall story would have to be built up out of many, many cross-ontology links.
  • Depression and bipolar disorder are paradigm affective disorders.
  • Linking entities in ontologies describing mental disease to the entities described in ontologies for the underlying mechanism of action such as chemicals and proteins thus allows automated retrieval of biological knowledge in relevant databases and automated linking of these data to the corresponding medical and psychiatric data into addiction. For one example of the enhanced querying capability that results from the above described chain of interlinkages to describe the biochemistry and neurobiology of addiction: rather than querying the pathway databases for heroin alone, a query can retrieve results for all molecules that act with the same mechanism of action (22 different molecules are annotated as has role 'μ-opioid receptor agonist' (CHEBI:55322) in the January 2012 release of ChEBI). The life sciences still hold many mysteries at all of the different levels from the very small to the very large: from the processes at the biochemical level that control DNA replication and cellular metabolism to the complicated synchronization of the functioning of whole organisms. Alongside the need to interpret data at all levels, there is a need for integration between the different levels, to achieve a holistic view across everything that is currently known. This is the vision of whole-systems biology, and computational processing is essential in making that vision into a reality. But the computational processing needs to be guided by a very special structure, that is, it needs to be guided by our best understanding of what the entities are that the science is about. Only by focusing on what the science is about, on what is known to hold in the world that science tries to study, can scientific results be integrated across different technological platforms, across different research programs, across different mountains of raw data, and across conflicting and sometimes bewildering results. A further necessary condition is that ontologies for different domains are able to work well together.
  • Interlinking ontologies provides many good things: automated bridging across levels of granularity for representation of modes of action; indexing; querying; aggregation; comparison of results across disciplines.
  • Scientists are ants, each contributing a tiny amount to the knowledge that humans collectively possess about the world. Bio-ontology aims to computably represent that knowledge since it is greater than any one person can amass – across all the domains, across all the different fields, across all the different levels and granularities. Computers become our external minds – but they need to be much better at being external minds – and they need to be able to do it in a global, cross-disciplinary fashion.
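
The role-based query described in the notes above (retrieving every molecule that shares a role with heroin, rather than heroin alone) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ChEBI's actual API: the dictionaries `ROLE_PARENT` and `HAS_ROLE`, the function names, and the toy annotations are all hypothetical and are not copied from any ChEBI release.

```python
# Toy data, NOT taken from ChEBI: a tiny role hierarchy (child -> parent)
# and a few has_role annotations, invented for illustration only.
ROLE_PARENT = {
    "mu-opioid receptor agonist": "opioid receptor agonist",
    "opioid receptor agonist": "agonist",
}
HAS_ROLE = {
    "heroin": {"mu-opioid receptor agonist"},
    "morphine": {"mu-opioid receptor agonist"},
    "example kappa agonist": {"opioid receptor agonist"},  # hypothetical entry
    "lithium": {"mood stabilizer"},
}


def role_ancestors(role):
    """Return the role itself followed by all of its is_a ancestors."""
    chain = []
    while role is not None and role not in chain:
        chain.append(role)
        role = ROLE_PARENT.get(role)
    return chain


def molecules_with_role(target):
    """Molecules annotated with `target` or with any subtype of it."""
    return sorted(
        molecule
        for molecule, roles in HAS_ROLE.items()
        if any(target in role_ancestors(r) for r in roles)
    )


def expand_query(molecule):
    """All molecules sharing at least one asserted role with `molecule`."""
    result = set()
    for role in HAS_ROLE.get(molecule, ()):
        result.update(molecules_with_role(role))
    return sorted(result)


if __name__ == "__main__":
    print(expand_query("heroin"))
    print(molecules_with_role("opioid receptor agonist"))
```

Walking up the role hierarchy is what lets a query for a general role also return molecules annotated only with a more specific one; a real system would delegate this step to an ontology reasoner or a precomputed closure.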

    1. InterOntology @ Tokyo, February 2012. From Chemicals to Minds: Integrated ontologies in the search for scientific understanding. Janna Hastings (1, 2). (1) Cheminformatics and Metabolism, European Bioinformatics Institute, UK; (2) Swiss Center for Affective Sciences, University of Geneva, Switzerland
    2. I want… I think… Oxytocin is believed to play a role in various behaviors, including orgasm, social recognition, pair bonding, anxiety … it is sometimes referred to as the "love hormone". The inability to secrete oxytocin and feel empathy is linked to sociopathy, psychopathy, narcissism and general manipulativeness. Tuesday, February 28, 2012
    3. Bio-ontologies serve many purposes. Standards … for automated data exchange in rapidly changing scientific environments. Categorisation of entities in the domain … for data-driven research such as functional analysis of gene transcription. Facilitating interdisciplinary research through enabling comparison of results across disciplines. Representing what we know about science.
    5. ChEBI is an ontology of small molecules. [Diagram of the ChEBI ontology. Node labels: chemical entity, role, chemical substance, biological role, molecular entity, application, group, chemical role, carbonyl compound, pharmaceutical, solvent, carboxy group, carboxylic acid, antibacterial drug, cyclooxygenase inhibitor. Relationships: has part, has role. Example entry: cefpodoxime (CHEBI:606443).]
    6. Why do people want their chemicals annotated in ChEBI? ChEBI is the only freely available chemical database with high-quality manual curation. ChEBI IDs are stable and maintained. ChEBI ontology allows automatic traversal and retrieval of chemical knowledge, e.g. for metabolic network reconstruction, e.g. for scaffold hopping in drug discovery.
    7. Pathways and metabolic network reconstructions encode dynamic biochemical knowledge
    8. ChEBI and Metabolic Network Reconstruction. (1) Fuzzy merging between different models for integrated view on patchy knowledge; uses nearest shared ancestor. (2) Protonation state of metabolites important for charge balancing; uses conjugate base / acid relationships. (3) Proper treatment of biomass reactions (searches for “is a” lipid relationship). [Swainston et al, Manchester]
    9. ChEBI is manually maintained by a team of chemists. [Chart: ChEBI growth, no. of entries (axis from 0 to 30,000), January 2008 to October 2011.]
    10. Classification practices in chemistry lead to massive multiple inheritance. ChEBI ontology
    12. Desiderata for structure-based automated classification. Class definitions should be expressed in a language or formalism which is accessible to domain experts (chemists). It should be possible to combine different elementary features into sophisticated class definitions using compositionality. The specification of class definitions should allow automatic arrangement of those classes into a hierarchy. Mid-level groupings should be semantic, i.e. they should make sense to chemists and be named.
    13. Logical definitions enable automatic classification. hydrocarbon equivalentTo molecule and has_atom only (carbon atom or hydrogen atom); peptide cation equivalentTo peptide and has_charge some double[> 0.0]
    14. tricarboxylic acid equivalentTo molecule and has_functional_group exactly 3 carboxy group. Beyond OWL: structured object representation & reasoning (Magka et al., Oxford); hybrid reasoning with second-order features of symmetric graphs such as fullerenes (Kutz et al., Bremen).
    15. What about non-structural classes? All sulphuric acid molecules have a sulphur atom and four oxygen atoms arranged in a certain bonding pattern at all times that they exist. But any given molecule may or may not ever be involved in acting as a strong acid.
    16. ChEBI ‘roles’ represent how chemicals act. Subatomic particle: parts of atoms. Chemical entity: parts and structural features of molecules. ‘Has role’. Role ontology: active properties of chemical entities.
    17. ChEBI ‘roles’ are BFO realizables (mostly). Properties that we ascribe to things because of what can happen under certain circumstances (future-pointing) are called realizable entities. The processes (/events) in which they display those properties are called realizations (the property, however, exists all the time).
    18. Examples of chemical dispositions: buffer; catalyst; hydrogen donor / acceptor; acid / base; surfactant; antioxidant; detergent; de-aminating agent; radical scavenger.
    19. Biological functions: epitope; mitogen; hormone; growth regulator; toxin; nutrient; COX inhibitor; cholinesterase reactivator. ChEBI functions are the ‘other side’ of the GO molecular functions (which have protein bearers). Both functions are realized in the same process.
    20. Artefactual functions: label; fragrance; pesticide; fuel; dye; detergent; probe; reagent; agrochemical. Chemicals are designed synthetically or selected by chemists in order to perform certain functions outside of biological evolution. But what about drugs, e.g. for treatment of headaches?
    21. Thalidomide is not a drug for treating morning sickness (anymore). Originally introduced as a sedative and hypnotic for treatment of morning sickness in 1957, thalidomide was withdrawn from use in the early 1960s after it was shown to produce severe teratogenic effects. It was subsequently found that the (R)-enantiomer is effective against morning sickness, whereas the (S)-enantiomer is teratogenic. However, as the enantiomers can interconvert in vivo, administering only the (R)-enantiomer would not prevent the teratogenic effect. Image credit: Hildeenmikey
    22. A harmless metabolite in one organism is food to another and toxin to a third. Paracetamol treats pain and fever in humans and is safe enough to give to babies, but it kills cats.
    23. Bridging from chemistry to biology. Some ChEBI roles are realized in biological processes, but beware: NOT toxin realized_in ‘response to toxin’. Life cycle of an organism: insecticide realized_in process ‘death’ and has_organism some ‘insect’ (a kind of participation).
    24. ChEBI is now the (behind-the-scenes) chemical representation of all the chemical biology in GO (to be published soon!). GO:0051610: The directed movement of serotonin into a cell.
    25. A very interesting class of molecules: those that alter mental functioning
    26. Mental Functioning Ontology (MFO). [Class diagram; legend: BFO, MFO. Node labels: BFO:Entity, BFO:Continuant, BFO:Occurrent, BFO:Independent Continuant, BFO:Dependent Continuant, BFO:Process, Bodily Process, Organism, BFO:Disposition, Cognitive Representation, BFO:Quality, Mental Functioning Related Anatomical Structure, Mental Process, Behaviour inducing state, Affective Representation.]
    27. How does mental functioning actually work? [Word cloud of disciplines, subjects and methods: EEG, Biology, Mouse, Psychology, Human, Physics, fMRI, Genetic profiling, Gene expression analysis, Neuroscience, Psychiatry, Metabolic analysis, Chemistry, Self-reports, Questionnaires.]
    28. Theories of mental functioning. Abducted! Replaced! Capgras delusion: a disorder in which a person holds a delusion that a friend, spouse, parent, or other close family member has been replaced by an identical-looking impostor. Faulty perception? Normal perception, faulty reasoning? Faulty emotional reaction to perception? Overactive imagination? TESTABLE IMPLICATIONS
    29. Existing vocabularies don’t include computable definitions
    30. The Emotion Ontology (MFO-EM). [Class diagram; legend: BFO, MFO, MFO-EM. Node labels: BFO:Entity, BFO:Continuant, BFO:Occurrent, BFO:Independent Continuant, BFO:Dependent Continuant, BFO:Process, Organism, BFO:Disposition, Bodily Process, Physiological Response to Emotion Process, Mental Process, Cognitive Representation, Appraisal Process, Emotional Action Tendencies, Affective Representation, Appraisal, Emotional Behavioural Process, Subjective Emotional Feeling, Emotion Occurrent. Relationships: inheres_in, is_output_of, has_part, agent_of.]
    31. Types of emotion
    32. To define the characteristics of different emotions, start with canonical emotions. Emotion types (such as fear) show enormous variance across instances, just as do anatomical types, e.g. human bodies. Ontology expresses what is always true… but aims to say something useful for representation of domain knowledge. Solution: encode such knowledge in ‘canonical’ types. [Diagram: canonical fear has part appraisal process; appraisal process has output appraisal of dangerousness.] Canonical fear results from an appraisal of dangerousness.
    33. Canonical fear. [Diagram: canonical fear is a subtype of fear.] Emotion component, with what is characteristic for fear: action tendency: fight-or-flight; subjective emotional feeling: negative, tense, powerless; behavioural response: characteristic fearful facial expression; characteristic appraisal: something is dangerous to me. The Emotion Ontology (ICBO 2011)
    34. Canonical and non-canonical fear. Canonical fear gives rise to action tendencies that are conformant to the perceived danger. Phobia = disposition giving rise to non-canonical fear. Laridaphobia: intense fear of seagulls.
    35. Cognitive Neuroscience of Emotion. Task, with its classification in MFO/MFOEM: recognition of gender in emotional facial expressions: visual perception of emotional facial expressions (subClassOf perception); recall of personal emotional memories with instructions to try re-create feeling: memory of emotional episodes (subClassOf memory); listening to emotional sounds (e.g. grunts of disgust): auditory perception of emotional stimuli (subClassOf perception); viewing emotional film extracts: visual and auditory perception of emotional stimuli (subClassOf perception). Paradigms selected based on study of random sample of papers from BrainMap database. Conclusion… Cognitive Neuroscience does not usually study canonical emotions! The link from perception of emotional fear in facial expressions to canonical fear is subject to empirical research.
    36. (Part of) the biochemical basis of emotion is in ChEBI. Emotions are effected in part by neurotransmitters such as dopamine, tryptophan. [Diagram. Upper row: molecular entity (CHEBI:25375), biological role (CHEBI:24432), Molecular function (GO:0003674), emotion (MFOEM:1). Lower row: dopamine (CHEBI:25375), neurotransmitter (CHEBI:25512), neurotransmitter receptor activity (GO:0030594), happiness (MFOEM:42). Relationships: subtype, has role, realized in, part of.]
    37. Disorders of affect. Some mental diseases involve altered emotional functioning (e.g. depression, bipolar disorder). [Diagram. Disposition: mental disease, depression. Process: emotion, non-canonical sadness; biological process, down-regulation of dopaminergic system (GO:0032227). Mechanism of action: complex disturbances in underlying systems. Relationships: realized in, has part.]
    38. Interlinked ontologies accelerate the search for scientific understanding. Linking diseases to mechanisms of action to treatment drugs and biomarkers to genetic factors giving rise to predispositions … Different disciplines and methods – one reality being investigated. Different types of data – one ontology.
    39. Good ontology design facilitates interlinking. Successfully interlinking ontologies depends on interoperability of the underlying ontologies: shared technological platform; common upper level; non-overlapping content; agreed bridging relationships; …
    40. Conclusions. Bio-ontology is inescapably interdisciplinary. It needs to be guided by the experts in each domain. The collective goal is to build an interlinked framework of ontologies which describe the best of what is known across the sciences… in order to provide a knowledge-based backbone for increasingly intelligent and sophisticated computational data analysis and processing techniques.
    41. Acknowledgements. Mental Functioning and Emotion Ontologies: Kevin Mulligan (UNIGE), Barry Smith & Werner Ceusters (Buffalo). ChEBI: Paula de Matos, Christoph Steinbeck, Colin Batchelor (RSC), Stefan Schulz (Graz). Funding: BBSRC (UK), NSF (CH), EU-OPENSCREEN
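
The logical class definitions shown on slides 13 and 14 (hydrocarbon, peptide cation, tricarboxylic acid) can be mimicked with plain Python predicates to show how a definition drives automatic classification. This is only a sketch under stated assumptions: the `Molecule` data class, its fields, and the `classify` helper are hypothetical, the check is closed-world (it inspects only what is recorded on the object), and it is not a substitute for an OWL reasoner.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Hypothetical, simplified molecule record used only for this sketch."""
    name: str
    atoms: Counter                      # element symbol -> count
    functional_groups: Counter = field(default_factory=Counter)
    charge: float = 0.0
    is_peptide: bool = False


# Each entry mirrors one definition from the slides:
#   hydrocarbon        = molecule and has_atom only (carbon or hydrogen)
#   tricarboxylic acid = molecule and has_functional_group exactly 3 carboxy group
#   peptide cation     = peptide and has_charge some double[> 0.0]
DEFINITIONS = {
    "hydrocarbon": lambda m: set(m.atoms) <= {"C", "H"},
    "tricarboxylic acid": lambda m: m.functional_groups["carboxy group"] == 3,
    "peptide cation": lambda m: m.is_peptide and m.charge > 0.0,
}


def classify(molecule):
    """Return the names of all defined classes whose definition the molecule meets."""
    return sorted(name for name, test in DEFINITIONS.items() if test(molecule))


methane = Molecule("methane", Counter({"C": 1, "H": 4}))
ethanol = Molecule("ethanol", Counter({"C": 2, "H": 6, "O": 1}))
citric_acid = Molecule(
    "citric acid",
    Counter({"C": 6, "H": 8, "O": 7}),
    functional_groups=Counter({"carboxy group": 3}),
)

if __name__ == "__main__":
    for m in (methane, ethanol, citric_acid):
        print(m.name, classify(m))
```

Because every class is defined by a test rather than by hand-placed membership, adding a new definition to `DEFINITIONS` classifies all existing molecules at once, which is the practical benefit the slides attribute to logical definitions.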