Modelling metabolite concentrations in OWL using Pronto
OWLED 2011 presentation on the topic of modelling metabolite concentrations

  • While genomic and proteomic information describe the overall cellular machinery available to an organism, the metabolic profile of an individual at a given time provides a picture of the current physiological state. Concentration levels of relevant metabolites vary under different conditions, in particular in the presence or absence of different disorders.
  • 780 chemical entities with chemical structures are associated with the role ‘metabolite’. This information is somewhat useful for the clustering of molecules (at least it allows us to distinguish those molecules that can be metabolites from those that cannot), but it is far too general. We can do much better.
  • ChEBI roles represent active properties of chemical entities: what chemicals do in biological contexts. This information is enhanced by specific representation of the context in which the chemicals are so active. For metabolites, contextual information includes: which organism (taxonomy); how much of the metabolite is usually (normally) present in different body fluids; and which disorders are associated with abnormal levels in different body fluids.
  • The HMDB is a database collecting together knowledge about all known human metabolites, including physicochemical, spectral, clinical, biochemical and genomic information. For each metabolite, HMDB includes measured concentration values taken from human samples of different biofluids (such as blood, urine, cerebrospinal fluid), from persons of different ages and with different underlying conditions.
  • In some cases the link between certain concentration values and the associated disorders can be close to certain; consider, for example, pregnancy testing.
  • HMDB data is parsed from its MetaboCards download format. We extract metabolite concentrations from HMDB where there are both a normal and an abnormal (associated with some disease) concentration level for an adult subject. The difference between the normal and abnormal concentration values indicates a threshold between these scenarios. We want to infer the likelihood of presence of a disorder by virtue of the numeric concentration value being closer to the known disordered concentration than to the known normal concentration.
  • uM = micromolar (1e-6 M)
  • Problem with standard OWL inferences and uncertainty...Introduce probabilistic DL...
  • Lukasiewicz, T.: Probabilistic description logics for the semantic web. TU Vienna infsys research report (2007)
  • We create classes for the categories of low, medium and high risk of having the given disorder. Note that the variation of risk with concentration value can be thought of, as a simplifying assumption, as a continuously valued function ranging over all possible concentration values. However, as Pronto constraints take the form of intervals associated with classes (or instances), to create a finite number of OWL classes and associate probability intervals to them, it is necessary to discretize the probability function into fixed ranges.
  • What is the risk that Barry has diabetes?
  • Pronto’s strategy for combining two constraints, in the absence of a conflict, resembles a data union operation. When multiple constraints conflict, Pronto prefers more specific statements to less specific ones. We evaluated this behaviour by changing the medium risk constraint to overlap with the high risk constraint, setting the upper bound for medium to 0.55 instead of 0.54. In this case, Pronto concludes that the probability for Barry having diabetes is [0.55;0.55], the most specific (narrowest) resolution. If the medium risk ranges to 0.6, Pronto entails the range [0.55;0.6] for Barry. Thus, it seems that the behaviour on conflict (at least for the two-axiom scenario we test here) resembles an intersection of the two underlying data ranges.
  • Neither of these results is an optimal representation of the intuitive requirement driven by the use case: it would be better if the probabilistic combination of different types of evidence for the same conclusion increased the certainty of the conclusion. However, Pronto does allow for overriding inherited constraints in more specific subclasses. Thus, we can specify a new risk subclass for Barry's combined risk categories, and associate this with the disease with a new probability range (e.g. [0.54;0.85]). This approach is in general somewhat cumbersome, as it would require adding many more classes and constraints to the knowledge base for all interesting combinations of risk factors.
  • Transcript

    • 1. OWLED 2011
      Modelling threshold phenomena in OWL: Metabolite concentrations as evidence for disorders
      Janna Hastings 1,2
      Ludger Jansen 3,4
      Christoph Steinbeck 1
      Stefan Schulz 5
      1 Chemoinformatics and Metabolism, European Bioinformatics Institute, UK
      2 Swiss Centre for Affective Sciences, University of Geneva, Switzerland
      3 Department of Philosophy, University of Rostock, Germany
      4 Department of Philosophy, RWTH Aachen University, Germany
      5 Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria
    • 2. Motivation
      How do we link chemical entities to diseases?
      Chemicals can be used as drugs to treat diseases
      But also, chemicals infuse living organisms as metabolites: by-products of metabolic processes that indicate which of those processes have taken place
      Wednesday, June 08, 2011
      2
      Metabolite concentrations as evidence for disorders (OWLED 2011)
    • 3. ChEBI
      ChEBI is an ontology for chemicals which appear in a biological context
      Chemical entities, such as molecules and ions, are classified structurally and assigned to one or more roles
      Examples: antioxidant, analgesic drug, cyclooxygenase inhibitor, ... metabolite
    • 4. ChEBI Roles
    • 5. Metabolites in ChEBI
    • 6. Contextual information
      In which organism(s) is the molecule a metabolite?
      How much (what concentration) of this metabolite is normally present in different bio-fluids of those organisms?
      Which disorders are associated with abnormal levels (increased or decreased) of this metabolite?
    • 7. Human Metabolome DB
      Database of human metabolites and associated contextual information
      Includes measured concentration values from different human samples under different conditions (specified as free text!)
      Wishart DS, Knox C, Guo AC, et al. HMDB: a knowledgebase for the human metabolome. Nucleic Acids Res. 2009 37(Database issue):D603-610.
    • 8. Metabolite concentrations and OWL
      Numeric data (OWL data ranges; DL concrete domains)
      Link between concentrations and disorders is not certain, but a concentration of some metabolite above a certain threshold is considered evidence for the presence of a disorder
      Threshold between normal and abnormal levels is vague (no definite cut-off)
    • 9. Data extraction
      We extract:
      Metabolite concentration values
      for metabolites found in ChEBI
      where both a normal and an abnormal value are present for an adult subject
      The difference between the normal and abnormal concentration indicates a threshold between these scenarios
    • 10. Reasoning with OWL data ranges
      Can we use the ontology to automatically differentiate normal from abnormal concentrations?
      [Figure: metabolite concentration axis for D-glucose in blood, showing the measured normal value (4440 uM, normal adult), the measured abnormal value (7000 uM, adult with diabetes), a threshold between them, and the abnormal region]
    • 11. Generated ontology
      `concentration of D-glucose in Blood associated with Diabetes mellitus type 2'
      equivalentTo ( `concentration in blood'
      and (hasMetabolite some `portion of D-glucose')
      and (hasConcentrationValue some double[>= 5700.0]) )
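The data restriction in the class definition above can be read as a simple numeric test. The following is a minimal Python sketch and not part of the original slides: the function and constant names are hypothetical, the 5700.0 lower bound is taken from the class definition, and 4440 uM and 7000 uM are the normal and diabetic values quoted for D-glucose in blood.

```python
# Illustrative only: mirrors the restriction
#   hasConcentrationValue some double[>= 5700.0]
# The names below are hypothetical and are not taken from META.owl.
GLUCOSE_BLOOD_DIABETES_MIN_UM = 5700.0  # lower bound from the class definition, in uM


def matches_diabetes_concentration_class(concentration_um: float) -> bool:
    """Return True if the value satisfies the >= 5700.0 data restriction."""
    return concentration_um >= GLUCOSE_BLOOD_DIABETES_MIN_UM


# Values quoted on the slides for D-glucose in blood.
normal_adult = matches_diabetes_concentration_class(4440.0)
adult_with_diabetes = matches_diabetes_concentration_class(7000.0)
print(normal_adult, adult_with_diabetes)
```

In the ontology itself this test is performed by the OWL reasoner over the data range; the sketch only shows what the restriction means for a single measured value.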
    • 12. Uncertainty
      Individual differences mean that we can’t straightforwardly associate an abnormal metabolite concentration with a disorder
      Rather, we want to infer the likelihood (risk) that a patient has a given disorder, given their metabolite concentration value
    • 13. Probabilistic DLs
      Probabilistic DLs extend traditional DLs with the ability to associate with each axiom in the ontology a probability value which represents the degree of certainty of the axiom.
      Probabilistic knowledge consists of conditional constraints:
      (v | j) [l, u]
      with l, u real numbers in the range [0, 1]
      encodes that j is a subclass of v with probability between l and u.
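A conditional constraint of the form (v | j) [l, u] can be pictured as a small record. The sketch below is illustrative Python, not Pronto's data model: the class and field names are hypothetical, and the check that l does not exceed u is an added assumption (the slide only states that both lie in [0, 1]). The example bounds are the high risk range used later in the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalConstraint:
    """(conclusion | evidence)[lower, upper]; all names here are hypothetical."""

    conclusion: str  # v: the superclass
    evidence: str    # j: the subclass
    lower: float     # l
    upper: float     # u

    def __post_init__(self) -> None:
        # Assumption: a usable interval needs 0 <= l <= u <= 1.
        if not (0.0 <= self.lower <= self.upper <= 1.0):
            raise ValueError("bounds must satisfy 0 <= lower <= upper <= 1")


# "High risk is a subclass of Disorder with probability between 0.55 and 1.00"
high_risk = ConditionalConstraint("Disorder", "High risk", 0.55, 1.00)
print(high_risk)
```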
    • 14. PRONTO
      A probabilistic, non-monotonic extension to Pellet
      Accepts probabilistic axioms of the form
      X subClassOf Y [l, u]
      (as annotations: pronto:certainty)
      Version 0.2 with slight modification: upgraded to the latest Pellet and OWL API releases
      Klinov, P.: Pronto: A Non-monotonic Probabilistic Description Logic Reasoner. Lecture Notes in Computer Science, vol. 5021, chap. 66, pp. 822-826.
    • 15. Discretization
      We assume disorder risk varies continuously with metabolite concentration
      However, Pronto accepts only discrete ranges
      [Figure: probability of associated disorder (low to high) plotted against metabolite concentration, with the measured normal value, the threshold and the measured abnormal value marked, and the curve divided into low risk, medium risk and high risk ranges]
    • 16. Reasoning with probabilities
      1. What risk category is this concentration? (reasoning based on data restrictions)
      2. What is the likelihood that this person has this disorder? (reasoning based on probabilistic constraints)
      Probability of the disorder for each risk category of a concentration in blood:
      Low risk: 0.00 to 0.24
      Medium risk: 0.25 to 0.54
      High risk: 0.55 to 1.00
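The two reasoning steps can be sketched as a lookup in plain Python. This is only an illustration of the idea, not how Pellet or Pronto compute it. The probability intervals are those given on the slide; the concentration cut points between categories are invented for the example, since the slides do not state them, and all names are hypothetical.

```python
from typing import Dict, Tuple

# Probability intervals per risk category, as given on the slide.
RISK_INTERVALS: Dict[str, Tuple[float, float]] = {
    "low": (0.00, 0.24),
    "medium": (0.25, 0.54),
    "high": (0.55, 1.00),
}

# ASSUMPTION: these cut points (in uM) are invented for illustration only;
# the slides do not say where the category boundaries lie.
MEDIUM_FROM_UM = 5000.0
HIGH_FROM_UM = 5700.0


def risk_category(concentration_um: float) -> str:
    """Step 1: classify a concentration by numeric range (data restriction)."""
    if concentration_um >= HIGH_FROM_UM:
        return "high"
    if concentration_um >= MEDIUM_FROM_UM:
        return "medium"
    return "low"


def disorder_probability(concentration_um: float) -> Tuple[float, float]:
    """Step 2: look up the probability interval attached to the category."""
    return RISK_INTERVALS[risk_category(concentration_um)]


print(risk_category(7000.0), disorder_probability(7000.0))
```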
    • 17. Results
    • 18. Combining different evidence
      Can we accumulate the evidence (i.e. increase the likelihood) for the presence of a given disorder if there are multiple metabolite concentration values pointing towards it?
      [Figure labels: BARRY; concentration of D-glucose in blood; concentration of Acetoacetic acid in blood; Diabetes]
    • 19. Results: no conflict (Union)
      Pronto will combine the probabilistic constraints
      medium risk [0.25; 0.54]
      and
      high risk [0.55; 1.00]
      Barry’s risk of having diabetes is in [0.25; 1.00]
    • 20. Results: conflict (Intersection)
      What happens if Pronto combines probabilistic constraints that overlap?
      medium [0.25; 0.55] and high risk [0.55; 1.00]
      → Barry’s risk: [0.55; 0.55]
      medium [0.25; 0.60] and high risk [0.55; 1.00]
      → Barry’s risk: [0.55; 0.60]
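The outcomes reported on the last two slides can be reproduced with simple interval arithmetic. The Python sketch below only mimics those reported results for two constraints; it is not Pronto's reasoning algorithm, and the function name is hypothetical. Overlapping intervals yield their intersection, while disjoint intervals yield the range spanning both.

```python
from typing import Tuple

Interval = Tuple[float, float]


def combine_as_observed(a: Interval, b: Interval) -> Interval:
    """Mimic the two-constraint outcomes reported on the slides."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    if lo <= hi:
        # The intervals overlap: the result is their intersection.
        return (lo, hi)
    # No overlap: the result spans both intervals.
    return (min(a[0], b[0]), max(a[1], b[1]))


high = (0.55, 1.00)
print(combine_as_observed((0.25, 0.54), high))  # no overlap
print(combine_as_observed((0.25, 0.55), high))  # overlap at a single point
print(combine_as_observed((0.25, 0.60), high))  # overlap over a range
```

With the intervals from the slides, the three calls give [0.25; 1.00], [0.55; 0.55] and [0.55; 0.60] respectively, matching the reported results.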
    • 21. Discussion
      Our intuitive requirement is not met:
      multiple forms of evidence for the same conclusion strengthen the likelihood of that conclusion
      To address this, Pronto allows creating overriding constraints in sub-classes
      [Figure: Medium risk + High risk = Medium-high risk, e.g. [0.54;0.85]]
    • 22. Limitations
      We did not attempt:
      Combined reasoning with more than two conflicting or non-conflicting constraints;
      Linking the generated ontology to the rest of ChEBI and to a relevant disease ontology;
      Applying probabilistic constraints to all possible diseases and metabolites in generated ontology; and
      Systematic performance evaluation of Pronto for this use case
    • 23. OWL and uncertainty
      Accurately modelling the association between metabolites and disorders requires a semantics for uncertainty
      Reasoner behaviour when combining different constraints is crucial for adequate applicability to different use cases
      Future work will involve evaluating alternative probabilistic DLs based on Bayesian networks
    • 24. Conclusion
      OWL 2 (with data properties and restrictions) and probabilistic DL (as implemented in Pronto) CAN be used to represent the chemical-disease association via metabolite concentration values
      The ontology (META.owl) and software (META.zip) are available for download from http://www.ebi.ac.uk/~hastings/concentrations/.
    • 25. Acknowledgements
      Funding
      BBSRC, grant agreement number BB/G022747/1 within the "Bioinformatics and biological resources" fund; and
      DFG, grant agreement number JA 1904/2-1, SCHU 2515/1-1 GoodOD (Good Ontology Design).