My ontology is better than yours! Building and evaluating ontologies for integrative research

Published in: Education, Technology

  1. Outline: Introduction; Biomedical ontology; Use case: pharmacogenomics; Outlook. My ontology is better than yours! Building and evaluating ontologies for integrative research. Robert Hoehndorf, Department of Genetics, University of Cambridge. Bio-Ontology SIG.
  2. Translational research. National Cancer Institute: "Translational research transforms scientific discoveries arising from laboratory, clinical, or population studies into clinical applications to reduce [disease] incidence, morbidity, and mortality."
  3. Slide by Robert Stevens
  4. Biomedical ontologies. Gruber (1993): "An ontology is the explicit specification of a conceptualization of a domain." Controlled vocabularies; hierarchically organized; facilitate data integration.
  5. Biomedical ontologies
  6. Biomedical ontologies. [Figure: overview of biomedical ontologies. Labels recovered from the figure: Individual, Physical object, Quality, Function, Process; Molecule, Gene, Transcript, Organelle, Cell, Tissue, Organ, Body, Population; ChEBI Ontology, Sequence Ontology, GO-CC, Celltype, Gene Ontology, Phenotype Ontology, Anatomy Ontology.]
  7. Biomedical ontologies. How can we find the “best” ontology? How can we develop the “best” ontology?
  8. Biomedical ontologies. Ontology evaluation
  9. Biomedical ontologies. Evaluation criteria: ontology design principles rooted in best practices (philosophy; logic; ontology engineering; linguistics); community agreement (community requests; peer review)
  10. Ontology. Ontology evaluation: definitions; singular nouns; common relations; single is-a hierarchy; orthogonality; realism; ...
  11. Biomedical ontologies. Most ontology evaluation criteria are intrinsic criteria and evaluate what ontologies are.
  12. Biomedical ontologies. Most ontology evaluation criteria are intrinsic criteria and evaluate what ontologies are. How can we evaluate what ontologies do?
  13. Biomedical ontologies. A functional perspective
  14. Biomedical ontologies. Evaluation criteria: criteria from software engineering, etc. (user study; unit tests; complexity; ...)
  15. Biomedical ontologies. A functional perspective
  16. Biomedical ontologies. Evaluation criteria: criteria from biology (experiments; statistics (p-values); comparison to gold/silver standard; ...)
  17. Pharmacogenomics. Pharmacogenomics databases
  18. Research questions: drug discovery; drug repurposing; drug response; drug pathways; disease pathways; causal mutations
  19. Research questions: drug discovery; drug repurposing; drug response; drug pathways; disease pathways; causal mutations
  20. Traditional approaches to drug repurposing: drug target identification; models of drug binding; experiment design and execution (e.g., binding assays); analysis and interpretation of experiment results
  21. Integrative approaches to drug repurposing. SIDER: text mining of drug labels; side-effect similarity; UMLS. PREDICT: disease–disease similarity; drug–drug similarity; disease phenotypes, gene functions, side effects, chemical structure, protein interactions, text mining; HPO, MeSH, GO. OFFSIDES: adverse event reports; ATC, UMLS.
  22. Pharmacogenomics. Can we get some novel information about drug indications (and causal mutations) by analyzing experimental data from animal models?
  23. Approach
  24. Approach
  25. Relevant ontologies. Mammalian Phenotype Ontology: 9,161 classes; manually developed; annotation of animal models; formal (EQ) definitions. Human Phenotype Ontology: 9,796 classes; manually developed; annotation of diseases; formal (EQ) definitions.
  26. Challenges. (1) Comparison of human and mouse phenotypes: cross-species integration; how do we represent phenotypes? (2) Computation of similarity: semantic similarity based on ontology taxonomy; which ontology do we use for computing similarity?
  27. Cross-species phenotype integration: representation of MP and HPO phenotypes; PATO-based formal definitions; GO; homologous and analogous anatomical structures (UBERON); aim: cross-species integration of phenotypes
  28. What are phenotypes and how do we represent them (for cross-species integration)? Abnormal appendix: E=Appendix, Q=Abnormal. Representation: appendix with quality Abnormal; quality Abnormal of some appendix; organism with appendix that has quality Abnormal; ... Inheritance of phenotypes across parthood: Abnormality of tip of appendix subclass of Abnormality of appendix? Absence of appendix.
  29. Semantic similarity. Semantic similarity results depend on: the number of distinctions made by ontology developers; the kind of distinctions made by ontology developers; the data that is analyzed; the similarity measure
  30. Semantic similarity. Should we compute phenotypic similarity based on the Human or the Mammalian Phenotype Ontology (or both)? How can we compare the results?
  31. Ontology design decisions can be resolved empirically! No a priori “right” way to represent phenotypes; focus on scientific results, not representation; evaluation: empirical, objective, quantitative, external
  32. Ontology design decisions can be resolved empirically! Finish the analysis; use known gene–disease associations as gold standard; use FDA-approved drug indications as gold standard; compare analysis results against gold standard
  33. Semantic similarity over phenotype ontologies measures phenotypic similarity: semantic similarity; pairwise comparison of disease and animal phenotypes; $\mathrm{sim}(P, D) = \frac{\sum_{x \in Cl(P) \cap Cl(D)} IC(x)}{\sum_{y \in Cl(P) \cup Cl(D)} IC(y)}$
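The similarity measure on this slide divides the summed information content (IC) of the classes shared by P and D by the summed IC of all classes that occur in either. A minimal Python sketch follows. It rests on assumptions the slide does not state: Cl(.) is read as the annotated classes plus all their superclasses, IC(c) is taken as -log of the fraction of annotated entities whose closure contains c, and the toy hierarchy `PARENTS` and the `CORPUS` annotations are hypothetical.

```python
import math

# Hypothetical toy is-a hierarchy (child -> parents); not taken from MP or HPO.
PARENTS = {
    "abnormal appendix": ["abnormal gut"],
    "large appendix": ["abnormal appendix"],
    "absent appendix": ["abnormal appendix"],
    "abnormal gut": ["phenotype"],
    "abnormal heart": ["phenotype"],
    "phenotype": [],
}

def closure(classes):
    """Cl(.), assumed here to be the given classes plus all their superclasses."""
    seen = set()
    stack = list(classes)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS[c])
    return seen

def information_content(annotations):
    """IC(c) = -log p(c), where p(c) is the fraction of annotated entities
    whose closure contains c (an assumed definition of IC)."""
    closures = [closure(a) for a in annotations]
    ic = {}
    for c in PARENTS:
        count = sum(1 for cl in closures if c in cl)
        if count:
            ic[c] = -math.log(count / len(closures))
    return ic

def sim(p, d, ic):
    """Summed IC over Cl(P) intersect Cl(D), divided by summed IC over the union."""
    cp, cd = closure(p), closure(d)
    num = sum(ic.get(x, 0.0) for x in cp & cd)
    den = sum(ic.get(y, 0.0) for y in cp | cd)
    return num / den if den else 0.0

# Hypothetical annotation corpus: one set of phenotype classes per entity.
CORPUS = [{"large appendix"}, {"absent appendix"}, {"abnormal heart"}, {"abnormal gut"}]
ic = information_content(CORPUS)

print(sim({"large appendix"}, {"absent appendix"}, ic))  # shares the appendix branch
print(sim({"large appendix"}, {"abnormal heart"}, ic))   # shares only the root class
```

In this sketch the root class has IC 0, so two phenotype sets that share nothing but the root score 0. Adding or removing intermediate classes in `PARENTS` changes the scores, which is the dependence on ontology design that the talk discusses.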
  34. PhenomeNET: compares phenotypes across species; ranking of genes for each disease; candidate genes for disease
  35. Statistical testing to rank drug–disease pairs: one-sided Wilcoxon signed rank test; result: ranking of drugs for each disease based on p-value; low p-value: mutations in mouse genes associated with a drug result in phenotypes that are very similar to a disease phenotype; high p-value: genes uniformly distributed across ranks
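The slide does not say how the signed rank test is set up, so the following Python sketch is one possible reading rather than the method used in the talk. It assumes that, for one disease, every gene has a normalized rank in (0, 1) with 0 meaning most similar, and it tests whether the ranks of the genes associated with a drug lie below 0.5, the median expected if they were spread uniformly. The p-value is the exact one-sided signed rank p-value, valid only without zero differences or tied absolute values. All gene and drug names are hypothetical.

```python
def signed_rank_pvalue_less(diffs):
    """Exact one-sided p-value P(W+ <= observed) of the Wilcoxon signed rank test.
    Assumes no zero differences and no ties among absolute values."""
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    # W+ is the sum of the ranks (by absolute value) of the positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    # counts[s] = number of subsets of ranks {1..n} that sum to s (null distribution).
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    return sum(counts[: w_plus + 1]) / 2 ** n

def rank_drugs(gene_rank, drug_genes):
    """Order drugs for one disease by p-value, smallest first.
    gene_rank: gene -> normalized rank in (0, 1); drug_genes: drug -> list of genes."""
    pvalues = {
        drug: signed_rank_pvalue_less([gene_rank[g] - 0.5 for g in genes])
        for drug, genes in drug_genes.items()
    }
    return sorted(pvalues.items(), key=lambda kv: kv[1])

# Hypothetical normalized gene ranks for a single disease.
GENE_RANK = {
    "g1": 0.01, "g2": 0.03, "g3": 0.06, "g4": 0.10, "g5": 0.02,
    "g6": 0.15, "g7": 0.45, "g8": 0.80, "g9": 0.62, "g10": 0.93,
}
DRUG_GENES = {
    "drugA": ["g1", "g2", "g3", "g4", "g5"],   # genes near the top of the ranking
    "drugB": ["g6", "g7", "g8", "g9", "g10"],  # genes spread across the ranking
}

for drug, p in rank_drugs(GENE_RANK, DRUG_GENES):
    print(drug, p)
```

Here `drugA` receives the smallest p-value possible for five genes (1/32) because all of its genes rank near the top, while `drugB`, whose genes are spread across the ranking, receives a high p-value.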
  36. Receiver Operating Characteristic. [Figure; source: Wikipedia]
  37. Gene-disease associations, PhenomeNET initial. [Figure: ROC curve, True Positive Rate against False Positive Rate. AUC (original): 0.68]
  38. Gene-disease associations, PhenomeNET improved. [Figure: ROC curves, True Positive Rate against False Positive Rate. AUC (original): 0.68; AUC (latest): 0.89]
  39. Gene-drug associations, PhenomeDrug initial. [Figure: ROC curve, True Positive Rate against False Positive Rate. AUC (original): 0.61]
  40. Gene-drug associations, PhenomeDrug improved. [Figure: ROC curves, True Positive Rate against False Positive Rate. AUC (original): 0.61; AUC (latest): 0.67]
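The AUC values on these slides summarize how well a ranking recovers a gold standard. A minimal Python sketch of that kind of evaluation follows, assuming every candidate has a distinct score (ties are not handled) and that the gold standard contains at least one positive and leaves at least one negative. The two score tables and the gold standard set are hypothetical and only illustrate comparing two versions of a method.

```python
def roc_points(scores, positives):
    """Return ROC points (false positive rate, true positive rate).
    scores: candidate -> score, higher means more confident;
    positives: set of candidates in the gold standard."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_pos = sum(1 for c in ranked if c in positives)
    n_neg = len(ranked) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for c in ranked:
        if c in positives:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical gold standard and two hypothetical score tables.
GOLD = {"a", "c"}
ORIGINAL = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.5}
LATEST = {"a": 0.9, "c": 0.8, "b": 0.7, "d": 0.6, "e": 0.5}

print("original:", auc(roc_points(ORIGINAL, GOLD)))
print("latest:  ", auc(roc_points(LATEST, GOLD)))
```

Because both rankings are scored against the same gold standard, the difference in AUC is an external, quantitative criterion for preferring one representation or similarity measure over another.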
  41. Representation of phenotypes for cross-species integration. 'Abnormality of appendix' EquivalentTo: has-part some (part-of some (Appendix and has-quality some Quality)). Organism-centric approach (has-part some); transitivity over parthood (part-of some); Quality used as indicator of abnormality; use of OWL EL
  42. Representation of phenotypes for cross-species integration. 'Large appendix' EquivalentTo: has-part some (Appendix and has-quality some 'Increased size'). Organism-centric approach (has-part some); no transitivity over parthood; use of OWL EL
  43. Absence. 'Absence of appendix' EquivalentTo: has-part some (Appendix and has-quality some Absent). Subclass of Abnormality of appendix; use of OWL EL
  44. Semantic similarity. Should we compute phenotypic similarity based on the Human or the Mammalian Phenotype Ontology (or both)? How can we compare the results?
  45. Semantic similarity. Computation of semantic similarity using the Mammalian Phenotype Ontology improves the analysis results. Problem specific: depending on mouse data; depending on the approach; depending on similarity measure; depending on gold standard dataset
  46. Conclusion. Quantitative, external evaluation can improve ontologies and ontology-based analysis methods.
  47. Annotation. Definitions. Intrinsic: having definitions; Aristotelian definitions. External: having definitions that are easily understandable; having definitions that improve annotation consistency. Criteria: measure annotation consistency; user study. Reference: Dolan, M. E., et al. A procedure for assessing GO annotation consistency. Bioinformatics 21, i136–i143 (2005).
  48. Annotation. Labels. Intrinsic: singular nouns; reference to universals. External: use of common, widely used terms; use of unambiguous terms. Criteria: measure annotation consistency; user study; recall in text. Reference: Yao, L., et al. Benchmarking Ontologies: Bigger or Better? PLoS Comput Biol 7, e1001055 (Jan. 2011).
  49. Knowledge bases and querying. Queries. Intrinsic: use of OWL; use of specific relations; use of upper level ontology; consistency. External: retrieve correct answers; retrieve relevant answers. Criteria: user study (to evaluate query answers); test set; comparison to gold standard. Reference: Boeker, M., et al. Unintended consequences of existential quantifications in biomedical ontologies. BMC Bioinformatics 12, 456 (2011).
  50. Conclusions. My ontology is better than yours.
  51. Conclusions. My ontology is better than yours. My ontology can do some things better than your ontology.
  52. Conclusions. Quantitative criteria: Empirical, objective, quantitative, application-based evaluation will allow us to systematically improve ontologies for science.
  53. Thank you for your attention
  54. Semantic similarity
  55. Semantic similarity: 1/12
  56. Semantic similarity
  57. Semantic similarity: 4/12
