Modelling metabolite concentrations in OWL using Pronto

OWLED 2011 presentation on the topic of modelling metabolite concentrations

  • While genomic and proteomic information describe the overall cellular machinery available to an organism, the metabolic profile of an individual at a given time provides a picture of the current physiological state. Concentration levels of relevant metabolites vary under different conditions, in particular in the presence or absence of different disorders.
  • 780 chemical entities with chemical structures are associated with the role ‘metabolite’. This information is somewhat useful for clustering molecules, since it at least lets us distinguish molecules that can be metabolites from those that cannot, but it is far too general. We can do much better.
  • ChEBI roles represent active properties of chemical entities: what chemicals do in biological contexts. This information is enhanced by a specific representation of the context in which the chemicals are active. For metabolites, contextual information includes: which organism (taxonomy); how much of the metabolite is usually (normally) present in different body fluids; and which disorders are associated with abnormal levels in different body fluids.
  • The HMDB is a database that collects knowledge about all known human metabolites, including physicochemical, spectral, clinical, biochemical and genomic information. For each metabolite, HMDB includes measured concentration values taken from human samples of different biofluids (such as blood, urine and cerebrospinal fluid), from persons of different ages and with different underlying conditions.
  • In some cases the link between certain concentration values and the associated disorders can be close to certain: consider pregnancy testing.
  • HMDB data is parsed from its MetaboCards download format. We extract metabolite concentrations from HMDB where there are both a normal and an abnormal (associated with some disease) concentration level for an adult subject. The difference between the normal and abnormal concentration values indicates a threshold between these scenarios. We want to infer the likelihood that a disorder is present from the numeric concentration value being closer to the known disordered concentration than to the known normal concentration.
  • uM = micromolar (1e-6 M)
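As a rough illustration of the nearest-value idea described in the data-extraction note, the sketch below takes the two D-glucose reference values from slide 10 and treats their midpoint as the cut-off. The midpoint rule and the function names are assumptions made for this example; the transcript does not state how the authors derived their cut-off, and slide 11 uses 5700.0 rather than the 5720.0 that this rule gives.

```python
# Reference values read off the D-glucose-in-blood slide (micromolar).
NORMAL_UM = 4440.0      # normal adult
ABNORMAL_UM = 7000.0    # adult with diabetes


def midpoint_threshold(normal: float, abnormal: float) -> float:
    """Assumed cut-off: the value equidistant from both reference values."""
    return (normal + abnormal) / 2.0


def closer_to_abnormal(value: float, normal: float, abnormal: float) -> bool:
    """True when value lies nearer the abnormal reference than the normal one."""
    return abs(value - abnormal) < abs(value - normal)


threshold = midpoint_threshold(NORMAL_UM, ABNORMAL_UM)
print(threshold)                                            # 5720.0
print(closer_to_abnormal(6000.0, NORMAL_UM, ABNORMAL_UM))   # True
print(closer_to_abnormal(5000.0, NORMAL_UM, ABNORMAL_UM))   # False
```

A hard cut-off like this is exactly what the later slides move away from, since the boundary between normal and abnormal levels is described as vague.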
  • Problem with standard OWL inferences and uncertainty... Introduce probabilistic DLs...
  • Lukasiewicz, T.: Probabilistic description logics for the semantic web. TU Vienna infsys research report (2007)
  • We create classes for the categories of low, medium and high risk of having the given disorder. Note that the variation of risk with concentration value can be thought of, as a simplifying assumption, as a continuously valued function ranging over all possible concentration values. However, as Pronto constraints take the form of intervals associated with classes (or instances), to create a finite number of OWL classes and associate probability intervals to them, it is necessary to discretize the probability function into fixed ranges.
  • What is the risk that Barry has diabetes?
  • Pronto’s strategy for combining two constraints, in the absence of a conflict, resembles a data union operation. When multiple constraints conflict, Pronto prefers more specific statements to less specific ones. We evaluated this behaviour by changing the medium risk constraint to overlap with the high risk constraint, setting the upper bound for medium to 0.55 instead of 0.54. In this case, Pronto concludes that the probability of Barry having diabetes is [0.55;0.55], the most specific (narrowest) resolution. If the medium risk ranges to 0.6, Pronto entails the range [0.55;0.6] for Barry. Thus, it seems that the behaviour on conflict (at least for the two-axiom scenario we test here) resembles an intersection of the two underlying data ranges.
  • Neither of these results is an optimal representation of the intuitive requirement driven by the use case: it would be better if the probabilistic combination of different types of evidence for the same conclusion increased the certainty of the conclusion. However, Pronto does allow for overriding inherited constraints in more specific subclasses. Thus, we can specify a new risk subclass for Barry's combined risk categories, and associate this with the disease with a new probability range (e.g. [0.54;0.85]). This approach is in general somewhat cumbersome, as it would require adding many more classes and constraints to the knowledge base, one for each interesting combination of risk factors.

Transcript

  • 1. OWLED 2011
    Modelling threshold phenomena in OWL: Metabolite concentrations as evidence for disorders
    Janna Hastings 1,2
    Ludger Jansen 3,4
    Christoph Steinbeck 1
    Stefan Schulz 5
    1 Chemoinformatics and Metabolism, European Bioinformatics Institute, UK
    2 Swiss Centre for Affective Sciences, University of Geneva, Switzerland
    3 Department of Philosophy, University of Rostock, Germany
    4 Department of Philosophy, RWTH Aachen University, Germany
    5 Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria
  • 2. Motivation
    How do we link chemical entities to diseases?
    Chemicals can be used as drugs to treat diseases
    But also, chemicals infuse living organisms as metabolites: by-products of metabolic processes that indicate which of those processes have taken place
    Wednesday, June 08, 2011
    2
    Metabolite concentrations as evidence for disorders (OWLED 2011)
  • 3. ChEBI
    ChEBI is an ontology for chemicals which appear in a biological context
    Chemical entities, such as molecules and ions, are classified structurally, and assigned to one or more roles
    Examples: antioxidant, analgesic drug, cyclooxygenase inhibitor, ... metabolite
  • 4. ChEBI Roles
  • 5. Metabolites in ChEBI
  • 6. Contextual information
    In which organism(s) is the molecule a metabolite?
    How much (what concentration) of this metabolite is normally present in different bio-fluids of those organisms?
    Which disorders are associated with abnormal levels (increased or decreased) of this metabolite?
  • 7. Human Metabolome DB
    Database of human metabolites and associated contextual information
    Includes measured concentration values from different human samples under different conditions (specified as free text!)
    Wishart DS, Knox C, Guo AC, et al. HMDB: a knowledgebase for the human metabolome. Nucleic Acids Res. 2009 37(Database issue):D603-610.
  • 8. Metabolite concentrations and OWL
    Numeric data (OWL data ranges; DL concrete domains)
    Link between concentrations and disorders is not certain, but a concentration of some metabolite above a certain threshold is considered evidence for the presence of a disorder
    Threshold between normal and abnormal levels is vague (no definite cut-off)
  • 9. Data extraction
    We extract:
    Metabolite concentration values
    for metabolites found in ChEBI
    where both a normal and an abnormal value are present for an adult subject
    The difference between the normal and abnormal concentration indicates a threshold between these scenarios
  • 10. Reasoning with OWL data ranges
    Can we use the ontology to automatically differentiate normal from abnormal concentrations?
    [Figure: D-glucose in blood on a metabolite concentration axis. Measured value (normal): 4440 uM (normal adult); measured value (abnormal): 7000 uM (adult with diabetes); a threshold lies between the two, with the region beyond it labelled abnormal.]
  • 11. Generated ontology
    `concentration of D-glucose in Blood associated with Diabetes mellitus type 2'
    equivalentTo ( `concentration in blood'
    and (hasMetabolite some `portion of D-glucose')
    and (hasConcentrationValue some double[>= 5700.0]) )
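The defined class above follows a fixed pattern, so one such class per metabolite, biofluid and disorder could be produced from a template. The sketch below is a hypothetical string-building helper, not the authors' generator: the function names are invented here, it uses straight single quotes and `xsd:double` in place of the slide's rendering, and the resulting text has not been validated against an OWL parser.

```python
def quote(name: str) -> str:
    """Wrap a multi-word entity name in single quotes, Manchester-style."""
    return f"'{name}'"


def concentration_class(metabolite: str, fluid: str, disorder: str,
                        threshold_um: float) -> str:
    """Build the text of one defined class in the shape shown on the slide."""
    class_name = (
        f"concentration of {metabolite} in {fluid} associated with {disorder}"
    )
    return "\n".join([
        f"Class: {quote(class_name)}",
        "    EquivalentTo:",
        f"        {quote('concentration in ' + fluid)}",
        f"        and (hasMetabolite some {quote('portion of ' + metabolite)})",
        f"        and (hasConcentrationValue some xsd:double[>= {threshold_um}])",
    ])


axiom = concentration_class(
    "D-glucose", "blood", "Diabetes mellitus type 2", 5700.0
)
print(axiom)
```

Only the threshold and the three names vary between classes, which is what makes bulk generation from the extracted HMDB values practical.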
  • 12. Uncertainty
    Individual differences mean that we can’t straightforwardly associate an abnormal metabolite concentration with a disorder
    Rather, we want to infer the likelihood (risk) that a patient has a given disorder, given their metabolite concentration value
  • 13. Probabilistic DLs
    Probabilistic DLs extend traditional DLs with the ability to associate with each axiom in the ontology a probability value which represents the degree of certainty of the axiom.
    Probabilistic knowledge consists of conditional constraints:
    (v | j) [l, u]
    with l, u real numbers in the range [0, 1]
    encodes that j is a subclass of v with probability between l and u.
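To make the shape of a conditional constraint concrete, here is a minimal sketch of it as a small record with the bounds check implied by the definition above. The class and field names (and the example class names `Diabetes` and `HighRiskConcentration`) are illustrative assumptions; this is not Pronto's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalConstraint:
    """(conclusion | evidence)[lower, upper].

    Reads as: evidence is a subclass of conclusion with a probability
    between lower and upper.
    """
    conclusion: str
    evidence: str
    lower: float
    upper: float

    def __post_init__(self) -> None:
        # Both bounds must lie in [0, 1] and be correctly ordered.
        if not 0.0 <= self.lower <= self.upper <= 1.0:
            raise ValueError("need 0 <= lower <= upper <= 1")

    def __str__(self) -> str:
        return (f"({self.conclusion} | {self.evidence}) "
                f"[{self.lower}, {self.upper}]")


constraint = ConditionalConstraint("Diabetes", "HighRiskConcentration", 0.55, 1.0)
print(constraint)  # (Diabetes | HighRiskConcentration) [0.55, 1.0]
```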
  • 14. PRONTO
    A probabilistic, non-monotonic extension to Pellet
    Accepts probabilistic axioms of the form
    X subClassOf Y [l, u]
    (as annotations: pronto:certainty)
    Version 0.2 with slight modification: upgraded to the latest Pellet and OWL API releases
    Klinov, P.: Pronto: A Non-monotonic Probabilistic Description Logic Reasoner. Lecture Notes in Computer Science, vol. 5021, chap. 66, pp. 822-826.
  • 15. Discretization
    We assume disorder risk varies continuously with metabolite concentration
    However, Pronto accepts only discrete ranges
    [Figure: probability of associated disorder (low to high) plotted against metabolite concentration, marking the measured normal value, the threshold and the measured abnormal value, and divided into low risk, medium risk and high risk ranges.]
  • 16. Reasoning with probabilities
    [Figure: two-step reasoning from a concentration in blood to a disorder.]
    Step 1: what risk category is this concentration? (reasoning based on data restrictions)
    Step 2: what is the likelihood that this person has this disorder? (reasoning based on probabilistic constraints)
    Low risk: [0.00; 0.24]
    Medium risk: [0.25; 0.54]
    High risk: [0.55; 1.00]
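The two steps on this slide can be sketched outside the reasoner as two lookups. The probability intervals below are the ones given on the slide; the concentration cut-offs separating the categories are hypothetical values chosen for this example, since the transcript does not state them.

```python
from bisect import bisect_right
from typing import Tuple

# Hypothetical D-glucose cut-offs in micromolar (NOT taken from the slides).
CUT_OFFS_UM = [5000.0, 6400.0]
CATEGORIES = ["low risk", "medium risk", "high risk"]

# Probability intervals for the disorder, as given on the slide.
RISK_INTERVALS = {
    "low risk": (0.00, 0.24),
    "medium risk": (0.25, 0.54),
    "high risk": (0.55, 1.00),
}


def risk_category(concentration_um: float) -> str:
    """Step 1: place a concentration in a category (data restriction)."""
    return CATEGORIES[bisect_right(CUT_OFFS_UM, concentration_um)]


def disorder_probability(concentration_um: float) -> Tuple[float, float]:
    """Step 2: read off the interval attached to that category."""
    return RISK_INTERVALS[risk_category(concentration_um)]


print(risk_category(4440.0), disorder_probability(4440.0))
print(risk_category(5700.0), disorder_probability(5700.0))
print(risk_category(7000.0), disorder_probability(7000.0))
```

In the ontology, step 1 is handled by classification over data ranges and step 2 by the probabilistic constraints; the dictionary lookup here only stands in for that second inference.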
  • 17. Results
  • 18. Combining different evidence
    Can we accumulate the evidence (i.e. increase the likelihood) for the presence of a given disorder if there are multiple metabolite concentration values pointing towards it?
    [Figure: for the patient BARRY, a concentration of D-glucose in blood and a concentration of Acetoacetic acid in blood both point to Diabetes.]
  • 19. Results: no conflict (Union)
    Pronto will combine the probabilistic constraints
    medium risk [0.25; 0.54]
    and
    high risk [0.55; 1.00]
    Barry’s risk of having diabetes is in [0.25; 1.00]
  • 20. Results: conflict (Intersection)
    What happens if Pronto combines probabilistic constraints that overlap?
    medium risk [0.25; 0.55] and high risk [0.55; 1.00]
    Barry’s risk: [0.55; 0.55]
    medium risk [0.25; 0.60] and high risk [0.55; 1.00]
    Barry’s risk: [0.55; 0.60]
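The three reported outcomes (one without overlap, two with) can be reproduced by a simple interval rule. The sketch below only mimics the observed results for two constraints; it is not Pronto's reasoning procedure, and the function name is an assumption for this example.

```python
from typing import Tuple

Interval = Tuple[float, float]


def combine(a: Interval, b: Interval) -> Interval:
    """Combine two probability intervals the way the slides report.

    Overlapping intervals yield their intersection (the narrowest range);
    disjoint intervals yield the range spanning both (the union-like case).
    """
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    if lower <= upper:
        return (lower, upper)
    return (min(a[0], b[0]), max(a[1], b[1]))


HIGH = (0.55, 1.00)
print(combine((0.25, 0.54), HIGH))  # (0.25, 1.0)
print(combine((0.25, 0.55), HIGH))  # (0.55, 0.55)
print(combine((0.25, 0.60), HIGH))  # (0.55, 0.6)
```

Neither branch ever pushes the lower bound above that of the strongest single constraint, which is the shortcoming raised in the discussion: two pieces of evidence do not reinforce each other.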
  • 21. Discussion
    Our intuitive requirement is not met:
    multiple forms of evidence for the same conclusion should strengthen the likelihood of that conclusion
    To address this, Pronto allows overriding constraints to be created in sub-classes
    [Figure: Medium risk + High risk = Medium-high risk, e.g. [0.54; 0.85]]
  • 22. Limitations
    We did not attempt:
    Combined reasoning with more than two conflicting or non-conflicting constraints;
    Linking the generated ontology to the rest of ChEBI and to a relevant disease ontology;
    Applying probabilistic constraints to all possible diseases and metabolites in generated ontology; and
    Systematic performance evaluation of Pronto for this use case
  • 23. OWL and uncertainty
    Accurately modelling the association between metabolites and disorders requires a semantics for uncertainty
    Reasoner behaviour when combining different constraints is crucial for adequate applicability to different use cases
    Future work will involve evaluating alternative probabilistic DLs based on Bayesian networks
  • 24. Conclusion
    OWL 2 (with data properties and restrictions)
    and probabilistic DL (as implemented in Pronto)
    CAN be used to represent the chemical—disease association
    via metabolite concentration values
    The ontology (META.owl) and software (META.zip) are available for download from http://www.ebi.ac.uk/~hastings/concentrations/.
  • 25. Acknowledgements
    Funding
    BBSRC, grant agreement number BB/G022747/1 within the "Bioinformatics and biological resources" fund; and
    DFG, grant agreement number JA 1904/2-1, SCHU 2515/1-1 GoodOD (Good Ontology Design).