Ontologies: Necessary, but not sufficient

Keynote talk at Bio-Ontologies 2017, Prague

Published in: Science

  1. Ontologies: Necessary, but not Sufficient. Robert Stevens, School of Computer Science, The University of Manchester, Manchester, United Kingdom, M13 9PL. Robert.Stevens@Manchester.ac.UK
  2. Knowing what we’re talking about
      • This is what ontologies are for
      • For humans and machines
  3. Number of PubMed papers per year for 1998 to 2016, shown without and with normalisation. PubMed search: (ontology[All Fields] OR ontologies[All Fields])
  4. Number of PubMed papers per year for 1998 to 2016, shown without and with normalisation. PubMed search: ("gene ontology"[MeSH Terms] OR ("gene"[All Fields] AND "ontology"[All Fields]) OR "gene ontology"[All Fields])
  5. PubMed papers from 1998 to 2016 matching “ontology OR ontologies”: GO 75%, other 25%
  6. Word cloud for all the citations of the 1998 search
      • Total of 35 citations
      • Text from all fields
      • No data cleaning
      • Word cloud made via https://www.jasondavies.com/wordcloud/
      • PubMed search: (ontology[All Fields] OR ontologies[All Fields]) AND ("1998/01/01"[PDAT] : "1998/12/31"[PDAT])
  7. Word cloud for the first 200 citations of the 2005 search
      • First 200 citations of a total of 516 for 2005
      • Text from all fields
      • No data cleaning
      • Word cloud made via https://www.jasondavies.com/wordcloud/
      • PubMed search: (ontology[All Fields] OR ontologies[All Fields]) AND ("2005/01/01"[PDAT] : "2005/12/31"[PDAT])
  8. Word cloud for the first 200 citations of the 2016 search
      • First 200 citations of a total of 2647 for 2016
      • Text from all fields
      • No data cleaning
      • Word cloud made via https://www.jasondavies.com/wordcloud/
      • PubMed search: (ontology[All Fields] OR ontologies[All Fields]) AND ("2016/01/01"[PDAT] : "2016/12/31"[PDAT])
  9. Top ten mentioned resources in the PMC full-text corpus (Duck et al, PLOS ONE 2016)
      • R
      • Gene Ontology
      • GenBank
      • BLAST
      • PDB
      • KEGG
      • GEO
      • Ensembl
      • ABA
      • Cluster
  10. The OBO Library http://www.obofoundry.org
      • Anatomy & development: zfs, wbbt, wbls, uberon, tgma, tads, spd, pdumdv, plana, poro, opl, olatdv, oarcs, mmusdv, mfmo, ma, xao, zfa, aeo, PO, caro, ceph, cmf, cteno, ddanat, ehdaa2, emap, emapa, fao, fbbt, fbdv, fma, hao, hsapdv
      • How we do science: mro, xco, zeco, uo, stato, swo, obcs, ms, mamo, kisao, cheminf, chmo, mmo, sbo, sep, sepio, obi, agro, bcgo, cdao, cmo, duo, eaglei, fbbi, fix
      • Phenotypes and disease: ogsf, ohd, wbphenotype, vt, oba, to, micro, ncit, mondo, mfomd, doid, ppo, upheno, miro, nbo, sibo, omp, pato, geno, apo, bspo, cvdo, ddpheno, dpo, flopo, hp, mp, mpath, ido, idomal
      • Molecules, Macromolecules: GO MF & BP, pr, ncro, mirnao, mod, rnao, mop, xl, rex, omit, chebi, mi
      • Clinical: epo, ogms, omrse, ontoneo, oostt, ovae, pdro, symp, vo, aero, dideo, dinto, dron, exo, genepio, ico, oae
      • Data sources: fbcv, omiabis, bco, cio, miapa, iao, obib
      • Cells and their parts: GO CC, cl, clo, bto
      • Environment: envo, eo, ero, geo
      • Mental phenomena: mf, mfoem
      • Species and populations: tto, vto, ncbitaxon, pco, rs, taxran
      • Genes and Genomes: ogg, ogi, so, hom, vario
  11. The GO Challenge: “Computer scientists have made significant contributions to linguistic formalisms and computational tools for developing complex vocabulary systems using reason-based structures, and we hope that our ontologies will be useful in providing a well-developed data set for this community to test their systems.” (Ashburner et al, Nat. Genet. 2000)
  12. Ontologies don’t do Biology
      • We have riches in annotations
      • We should do more than Gene Expression Analysis
      • We need software that uses ontologies to draw conclusions
      • That can be automated reasoning
      • …but also via levels of indirection
  13. A rich description of the common buttercup
      Class: "Ranunculus Repens"
      SubClassOf: Flower
        and (hasFlowerSymmetry some RadialSymmetry)
        and (hasPart some (Androecium and (hasAndroecialFusion some Apostemonous) and (hasPart some (Stamen and (hasPart some Filament) and (hasPart some (Anther and (hasAntherAttachment some AdnateAntherAttachment) and (hasDehiscenceType some LongitudinalDehiscence)))))))
        and (hasPart some (Gynoecium and (hasGynoecialFusion some Apocarpous) and (hasPart some (Pistil and (hasPart some Carpel) and (hasPart some Style) and (hasPart some (Stigma and (hasStickiness some Stickiness) and (hasStigmaShape some HookedStigmaShape))) and (hasPart only (Carpel or Stigma or Style)))) and (hasSexualPartArrangement some SpiralArrangement)))
        and (hasPart exactly 1 (Perianth
          and (hasPart some (Calyx and (hasPart exactly 5 (Sepal and (hasColour some Green) and (hasRegion some (BaseRegion and (hasForm some Truncate))) and (hasRegion some (MarginRegion and (hasSepalPetalFeature some Entire) and (hasSepalPetalFeature some Membranous))) and (hasRegion some (SurfaceRegion and (hasSepalPetalFeature some Pubescent) and (hasSurfaceSelector some LowerSurfaceSelector))) and (hasRegion some (SurfaceRegion and (hasSepalPetalFeature some Smooth) and (hasSurfaceSelector some UpperSurfaceSelector))) and (hasRegion some (TipRegion and (hasForm some Truncate))) and (hasSepalPetalFeature some PalmatelyNetted) and (hasSepalPetalShape some Ovate) and (hasSepalousity some Aposepalos)))))
          and (hasPart some (Corolla and (hasPart exactly 5 (Petal and (hasColour some Yellow) and (hasPetalousity some Apopetalos) and (hasRegion some (BaseRegion and (hasForm some Acute))) and (hasRegion some (MarginRegion and (hasSepalPetalFeature some Entire))) and (hasRegion some (TipRegion and (hasForm some Acute))) and (hasSepalPetalFeature some PalmatelyNetted) and (hasSepalPetalShape some Obovate) and (hasPart exactly 1 Nectary)))))
          and (hasPerianthArrangement some AlternatingPerianthArrangement)
          and (hasPart only (Calyx or Corolla))))
  14. Ontology driven user interfaces. Diagram labels: Ontology (Class: "Ranunculus Repens" SubClassOf: Flower and (hasFlowerSymmetry some RadialSymmetry) and (hasPart some … … …); generate menus; Graphical User Interface (GUI); generate axioms for a flower
  15. More axioms
      • Better maintenance, better use
      • More queries
      • Moving away from vocabulary as the sole deliverable
      • Sampling across ontologies to deliver a particular use case
      • Still a challenge to reasoning tools
  16. My favourite tenet of Agile methods: maximising the work not done
  17. Going programmatic. Diagram labels: Ontology; Ontology; Manual creation; Manual curation; Software; programmatic intervention; creates; updates; Program; Visual Inspection; Visual Inspection
  18. Making Ontology Development programmatic
      • Making ontologies by hand is hard
      • Pattern based development is the way
      • Programmatic first, rather than programmatic after
      • Programmatic only
  19. Views Over Ontologies
      • Coping with custom, practice and differing views
          – Carbohydrates (Mungall et al JBI 2011)
          – Genes/proteins; what matters about chemicals (OpenPhacts, Batchelor et al, ISWC 2014)
      • Different answers for different communities; the right answer is not always what people want
      • Navigation within Knowledge – constipation in the Read codes
      • Using other forms of knowledge representation, such as SKOS, RDF graphs, knowledge graphs, and so on; it’s all knowledge in some form
      • Avoid the ontological hammer
  20. Experimental Factor Ontology: a view on the world’s bio-ontologies. Diagram labels: External ontologies; Disease; BioAssays; Cell lines; Cell types; Small molecules; Evidence; Taxonomy; Drugs; Adverse events; Information; Gene function; Plant anatomy; Mouse anatomy; Phenotype; Applications (EVA, Expression Atlas, GWAS catalog, Array Express); 1 million+ terms; 20,000 terms
  21. Reuse and request. Diagram labels: Source ontologies; Client ontology; import and update; entity request
  22. Data Driven Ontology Content
      • Ontologies represent the data we describe
      • Our data should guide us as to what to describe
      • FCA, ML approaches to analysing data and KB content
      • Let our data help us improve our ontologies
      • And let our ontologies improve our data mining
  23. The rest of the world
      • SNOMED, MeSH, ICD, UMLS
      • A whole host of medical and clinical vocabularies
      • They will continue to exist
      • We need to work with them
      • Rather than just ontology, we should talk about knowledge representations
  24. OBOPedia entry for Golgi apparatus: http://www.obopedia.org.uk
  25. Ontology as Tutorial
      • We have a huge amount of knowledge captured in our ontologies
      • Particularly rich with natural language definitions and vocabulary
      • It should be usable as a learning resource
  26. What we need to do (at least)
      • Make ontology development industrial
      • Make our ontologies axiomatically rich
      • Enable effective sampling of ontologies
      • Enable differing views over knowledge
      • (At some point) stop creating new ontologies
      • Think about knowledge ecosystems and not just ontologies
      • Use ontologies to do some biology
  27. Knowledge in Biology
      • Bio-ontologies should be “Knowledge in Biology” (thanks Phil)
      • Knowledge in some kind of computational form is vital
      • Ontologies are not the only knowledge fruit
      • …but they are a vital, necessary component
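
Slides 14 and 18 describe generating axioms from structured input rather than writing them by hand. The following is a minimal sketch of that idea, assuming a single fixed pattern. The `PetalSpec` record and the `flower_axiom` function are hypothetical names introduced here for illustration, and the pattern covers only a small part of the buttercup description on slide 13. It is not the tooling used in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PetalSpec:
    """One row of input data (hypothetical structure, for illustration only)."""
    species: str
    petal_count: int
    petal_colour: str
    symmetry: str


def flower_axiom(spec: PetalSpec) -> str:
    """Fill a fixed Manchester-syntax pattern from one row of data."""
    return (
        f'Class: "{spec.species}"\n'
        f"    SubClassOf: Flower\n"
        f"        and (hasFlowerSymmetry some {spec.symmetry})\n"
        f"        and (hasPart some (Corolla\n"
        f"            and (hasPart exactly {spec.petal_count} (Petal\n"
        f"                and (hasColour some {spec.petal_colour})))))"
    )


# Values taken from the buttercup description on slide 13.
rows = [
    PetalSpec("Ranunculus Repens", 5, "Yellow", "RadialSymmetry"),
]

axioms = [flower_axiom(row) for row in rows]
print(axioms[0])
```

Keeping the pattern in one function means a change to the modelling decision is made once and every generated class follows it, which is one way to read "programmatic first, rather than programmatic after".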

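Slide 22 names FCA (formal concept analysis) as one way to let data suggest ontology content. The sketch below shows the basic computation on a toy context, assuming a brute-force enumeration that is only practical for very small inputs. The gene names and annotation labels are invented for illustration, and `formal_concepts` is a hypothetical helper, not part of any library mentioned in the talk.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a small context.

    `context` maps each object to a frozenset of its attributes.
    Brute force over object subsets, so exponential in the number of objects.
    """
    objects = sorted(context)
    all_attrs = frozenset().union(*context.values())

    def intent(objs):
        # Attributes shared by every object in objs (all attributes if objs is empty).
        sets = [context[o] for o in objs]
        return frozenset.intersection(*sets) if sets else all_attrs

    def extent(attrs):
        # Objects that carry every attribute in attrs.
        return frozenset(o for o in objects if attrs <= context[o])

    concepts = set()
    for size in range(len(objects) + 1):
        for objs in combinations(objects, size):
            shared = intent(objs)
            concepts.add((extent(shared), shared))
    return concepts


# Toy data: invented genes and annotations, not real biology.
context = {
    "gene1": frozenset({"membrane", "transport"}),
    "gene2": frozenset({"membrane", "signalling"}),
    "gene3": frozenset({"membrane", "transport", "signalling"}),
}

concepts = formal_concepts(context)
for ext, itt in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

Each concept pairs a set of objects with exactly the attributes they all share. A concept whose attribute combination has no matching class in the ontology is a candidate for a new class, or at least for a closer look at the annotations.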