Use of Uberon in the Bgee database: How to deal with a complex, large, dynamic ontology?

Presentation of the methods used to simplify the display of the Uberon ontology, and to maintain up-to-date annotations to it.
Presented at the Biocuration 2013 conference.

Published in: Education, Health & Medicine

  1. 1. Use of Uberon in the Bgee database: How to deal with a complex, large, dynamic ontology?
      Frederic Bastian, Biocuration 2013
  2. 2. A biocurator nightmare?
      Ontologies now regularly include thousands of terms.
      Complex relations are used, e.g., “transitively proximally connected to”.
      Curators are expected to provide complex annotations, e.g., post-composition of terms.
      => How can we simplify the use of complex ontologies?
      © 2013 SIB
  3. 3. The Bgee database http://bgee.unil.ch
  4. 4. The Bgee database http://bgee.unil.ch
      Description of anatomy and development
  5. 5. The Bgee database http://bgee.unil.ch
      Expression data; description of anatomy and development
  6. 6. The Bgee database http://bgee.unil.ch
      Expression data; description of anatomy and development; homology
  7. 7. The Bgee database … http://tinyurl.com/bgee12-hoxa5a
  10. 10. Use of anatomical ontologies in Bgee
      Several species-specific ontologies were used:
      • ZFA
      • XAO
      • FBbt
      • EMAPA, MA
      • EHDAA, EV
  11. 11. Use of anatomical ontologies in Bgee
      Several species-specific ontologies were used:
      • ZFA
      • XAO
      • FBbt
      • EMAPA, MA
      • EHDAA, EV
      => Limits the addition of new species
      => Inconsistent anatomical descriptions, different formalisms adopted, etc.
  12. 12. Homology relations between anatomical ontologies
      To perform automated comparisons:
      • We built groups of homologous organs
      • We organized these groups into an ontology
      VHOG:0000157 brain
        EHDAA:2629 brain
        EHDAA:300 brain
        EHDAA:830 future brain
        EMAPA:16089 future brain
        EMAPA:16894 brain
        EV:0100164 brain
        MA:0000168 brain
        XAO:0000010 brain
        ZFA:0000008 brain
        ZFA:0000146 presumptive brain
  13. 13. Homology relations between anatomical ontologies
      To perform automated comparisons:
      • We built groups of homologous organs
      • We organized these groups into an ontology
      => vHOG ontology
      vHOG, a multispecies vertebrate ontology of homologous organs groups. Bioinformatics (2012) 28(7): 1017-1020.
  14. 14. Homology relations between anatomical ontologies
      To perform automated comparisons:
      • We built groups of homologous organs
      • We organized these groups into an ontology
      => vHOG ontology
      To add a species:
      • All groups need to be re-evaluated
      • The graph structure needs to be updated
      => Not maintainable in the long run
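
To make the idea of homology groups concrete, here is a minimal Python sketch of how such a group could drive an automated comparison. The group identifier and member identifiers are the brain group shown on slide 12; the function names and the one-group-per-term simplification are assumptions made for illustration, not a description of the Bgee implementation.

```python
# Brain group from slide 12: one vHOG identifier, many species-specific terms.
HOMOLOGY_GROUPS = {
    "VHOG:0000157": {
        "EHDAA:2629", "EHDAA:300", "EHDAA:830",
        "EMAPA:16089", "EMAPA:16894",
        "EV:0100164", "MA:0000168", "XAO:0000010",
        "ZFA:0000008", "ZFA:0000146",
    },
}


def build_term_index(groups):
    """Map each species-specific term to its group (assumes one group per term)."""
    index = {}
    for group_id, members in groups.items():
        for term in members:
            index[term] = group_id
    return index


def are_homologous(term_a, term_b, index):
    """Two terms are comparable when they belong to the same homology group."""
    group = index.get(term_a)
    return group is not None and group == index.get(term_b)


INDEX = build_term_index(HOMOLOGY_GROUPS)
```

With this index, `are_homologous("MA:0000168", "ZFA:0000008", INDEX)` compares the mouse and zebrafish brain terms. The sketch also hints at the maintenance problem raised on slide 14: every new species means revisiting every group by hand.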
  15. 15. And then came Uberon …
      [Diagram labels: only_in_taxon, UBERON: bone, Vertebrata; UBERON: tibia; Drosophila melanogaster, fruit fly FBbt ‘tibia’; Homo sapiens, human FMA ‘tibia’; edges labeled is_a and part_of]
  17. 17. And then came Uberon …
      Uberon also provides a composite ontology: it merges in terms from species-specific ontologies when a term is not present in Uberon.
      ....
      is_a UBERON:0003059 ! presomitic mesoderm
      devf UBERON:0002329 ! somite
      is_a ZFA:0000073 ! somite 5 (zebrafish)
      is_a ZFA:0000982 ! somite 6 (zebrafish)
      is_a EHDAA2:0001853 ! somite 05 (embryonic human)
      is_a EHDAA2:0001854 ! somite 06 (embryonic human)
      => Allows importing data from Model Organism Databases.
  18. 18. And then came Uberon …
      BUT Uberon is complex:
      • About 22 000 terms in the composite ontology
  19. 19. And then came Uberon …
      BUT Uberon is complex:
      • About 22 000 terms in the composite ontology
      • Use of advanced constructs, supported only in OWL
      • Use of high-level abstract terms for interoperability
  20. 20. And then came Uberon …
      BUT Uberon is complex:
      • About 22 000 terms in the composite ontology
      • Use of advanced constructs, supported only in OWL
      • Use of high-level abstract terms for interoperability
      • Frequently updated, highly responsive
      • Structure changes when any imported species-specific ontology changes => even more frequent updates
  21. 21. Uberon cannot be easily browsed
  22. 22. First step: ontology simplification
  23. 23. First step: ontology simplification
      1. Simplification of the relations
      Keep only is_a, part_of, develops_from.
      Map all other relations to their ancestor among these, e.g.: develops_directly_from => develops_from
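
The relation mapping on slide 23 can be sketched in Python as a walk up a relation hierarchy until one of the three kept relations is reached. Only the develops_directly_from => develops_from pair comes from the slide; the other entries in the parent table are illustrative assumptions, and dropping relations that have no kept ancestor is one possible reading of "keep only".

```python
KEPT = {"is_a", "part_of", "develops_from"}

# Hypothetical relation hierarchy: child relation -> parent relation.
RELATION_PARENT = {
    "develops_directly_from": "develops_from",  # from the slide
    "proper_part_of": "part_of",                # assumption, for illustration
    "adjacent_to": None,                        # assumption: no kept ancestor
}


def simplify_relation(rel, parents=RELATION_PARENT):
    """Return the closest kept ancestor of rel, or None if there is none."""
    seen = set()
    while rel is not None and rel not in KEPT:
        if rel in seen:  # guard against a cyclic parent table
            return None
        seen.add(rel)
        rel = parents.get(rel)
    return rel


def simplify_edges(edges):
    """Rewrite (child, relation, parent) edges; drop those with no kept ancestor."""
    simplified = set()
    for child, rel, parent in edges:
        mapped = simplify_relation(rel)
        if mapped is not None:
            simplified.add((child, mapped, parent))
    return simplified
```

For example, `simplify_edges({("A", "develops_directly_from", "B")})` yields the same edge relabeled with develops_from.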
  24. 24. First step: ontology simplification
      2. Removal of redundant relations
      A is_a B; B is_a C => A is_a C is redundant.
  25. 25. First step: ontology simplification
      2. Removal of redundant relations
      A is_a B; B is_a C => A is_a C is redundant.
      But we consider part_of and is_a relations as equivalent:
      A part_of B; B is_a C => A part_of C and A is_a C are considered redundant.
      This removes almost all “is_a anatomical entity” relations.
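
The redundancy rule on slides 24 and 25 amounts to a transitive reduction in which is_a and part_of are treated as one edge type. The following Python sketch shows one way to do that, assuming the graph is acyclic and that edges are stored as (child, relation, parent) tuples; leaving develops_from edges untouched is an assumption, since the slides do not say how they are handled.

```python
from collections import defaultdict

HIERARCHY = {"is_a", "part_of"}  # treated as equivalent for redundancy checks


def remove_redundant(edges):
    """Drop is_a/part_of edges whose target is also reachable by a longer path."""
    edges = set(edges)
    parents = defaultdict(set)
    for child, rel, parent in edges:
        if rel in HIERARCHY:
            parents[child].add(parent)

    def reachable_indirectly(start, target):
        # Search from the other direct parents of start: any hit on target
        # means there is a path of at least two edges from start to target.
        stack = [p for p in parents.get(start, ()) if p != target]
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            for p in parents.get(node, ()):
                if p == target:
                    return True
                stack.append(p)
        return False

    return {
        (child, rel, parent)
        for child, rel, parent in edges
        if rel not in HIERARCHY or not reachable_indirectly(child, parent)
    }


example = {
    ("A", "part_of", "B"),
    ("B", "is_a", "C"),
    ("A", "part_of", "C"),
    ("A", "is_a", "C"),
}
reduced = remove_redundant(example)
```

On the example from slide 25, both `A part_of C` and `A is_a C` are dropped because C is already reachable from A through B.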
  26. 26. First step: ontology simplification
      3. Removal of relations to upper_level terms
      upper_level subset: “abstract upper-level terms not directly useful for analysis”
      Almost all terms useful for analysis are placed under “upper_level” terms, which is confusing.
      => remove relations to “upper_level” terms if the term does not become an orphan
  27. 27. First step: ontology simplification
      3. Removal of relations to upper_level terms
      upper_level subset: “abstract upper-level terms not directly useful for analysis”
      Almost all terms useful for analysis are placed under “upper_level” terms, which is confusing.
      => remove relations to “upper_level” terms if the term does not become an orphan
      [Term]
      id: MA:0000747
      name: lymph organ (mouse)
      is_a: UBERON:0001062 ! anatomical entity
      relationship: part_of UBERON:0002465 ! lymphoid system
  29. 29. First step: ontology simplification
      3. Removal of relations to upper_level terms
      upper_level subset: “abstract upper-level terms not directly useful for analysis”
      Almost all terms useful for analysis are placed under “upper_level” terms, which is confusing.
      => remove relations to “upper_level” terms if the term does not become an orphan
      [Term]
      id: UBERON:0007502
      name: epithelial plexus
      is_a: UBERON:0000480 ! anatomical group
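
The two stanzas on slides 27 and 29 suggest how the "non-orphan" condition plays out. The Python sketch below encodes one reading of it: a term loses its relations to upper_level terms only when it keeps at least one other relation. Treating UBERON:0001062 and UBERON:0000480 as members of the upper_level subset is an assumption based on how the slides use them, and the function name is hypothetical.

```python
from collections import defaultdict


def drop_upper_level_relations(edges, upper_level):
    """Remove edges pointing to upper_level terms unless the child would be orphaned."""
    by_child = defaultdict(list)
    for edge in edges:
        by_child[edge[0]].append(edge)

    kept = set()
    for child_edges in by_child.values():
        informative = [e for e in child_edges if e[2] not in upper_level]
        # Keep the upper-level relations only when nothing else remains.
        kept.update(informative if informative else child_edges)
    return kept


# Assumed members of the upper_level subset, for this example only.
UPPER_LEVEL = {"UBERON:0001062", "UBERON:0000480"}

edges = {
    ("MA:0000747", "is_a", "UBERON:0001062"),     # lymph organ is_a anatomical entity
    ("MA:0000747", "part_of", "UBERON:0002465"),  # lymph organ part_of lymphoid system
    ("UBERON:0007502", "is_a", "UBERON:0000480"), # epithelial plexus is_a anatomical group
}
simplified = drop_upper_level_relations(edges, UPPER_LEVEL)
```

Here the lymph organ keeps only its part_of relation to the lymphoid system, while the epithelial plexus keeps its single is_a relation because removing it would leave the term without any parent.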
  30. 30. First step: ontology simplification
      4. Generate species-specific versions
      To simplify the “composite-metazoan” ontology even further, generate a version for each species used in Bgee.
  31. 31. First step: ontology simplification
  32. 32. Second step: track ontology changes
      1. Store annotation status
      - “Perfect” annotation: will not need to be refined as long as the term used is not obsoleted.
      - “Missing granularity” annotation: a term is missing in the ontology, e.g., vastus lateralis. If a new child is later added to the term used, refine the annotation.
  33. 33. Second step: track ontology changes
      2. Track ontology changes
      - Compare the versions used between two annotation cycles.
      - If a term used in a “missing granularity” annotation has new children, refine the annotation.
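
The tracking step on slides 32 and 33 can be sketched as a comparison of the children of each annotated term between two ontology versions. In this Python sketch the record layout, the function name, and the example terms other than vastus lateralis are assumptions for illustration; term labels stand in for real identifiers.

```python
def annotations_to_review(annotations, old_children, new_children, obsolete=frozenset()):
    """Return (annotation id, reason, new child terms) for annotations to revisit."""
    to_review = []
    for ann in annotations:
        term = ann["term"]
        if term in obsolete:
            # Any annotation, even a "perfect" one, must be revisited.
            to_review.append((ann["id"], "term obsoleted", []))
        elif ann["status"] == "missing granularity":
            added = sorted(new_children.get(term, set()) - old_children.get(term, set()))
            if added:
                to_review.append((ann["id"], "new children", added))
    return to_review


# Illustrative data: annotation 1 was made to a parent term because
# "vastus lateralis" did not exist yet in the previous ontology version.
annotations = [
    {"id": 1, "term": "quadriceps femoris", "status": "missing granularity"},
    {"id": 2, "term": "brain", "status": "perfect"},
]
old_children = {"quadriceps femoris": set()}
new_children = {"quadriceps femoris": {"vastus lateralis"}, "brain": {"forebrain"}}

review = annotations_to_review(annotations, old_children, new_children)
```

Only annotation 1 is flagged: the "perfect" annotation to brain is left alone even though its term gained a child, which is what lets curators focus on the annotations that can actually be refined.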
  34. 34. Conclusion 1/2
      To manage a complex, frequently updated ontology:
      1. Provide a formal version for the reasoning, and a simplified view for the end-user.
      2. Store annotation status, to focus only on annotations which need to be updated.
  35. 35. Conclusion 2/2
      Major update of Bgee coming in fall 2013:
      - All expression data annotations are being transferred to Uberon.
      - All homology information is being transferred from vHOG to Uberon, using an external file.
  36. 36. Conclusion 2/2
      Major update of Bgee coming in fall 2013:
      - All expression data annotations are being transferred to Uberon.
      - All homology information is being transferred from vHOG to Uberon, using an external file.
      And also:
      - Besides present/absent calls, Bgee will include: overexpression calls; biologically significant expression.
      - Revamped interfaces, webservices, APIs, …
  37. 37. Advertisement! Other Bgee-related work
      Poster 145: Average rank IQR: a new improved method for Affymetrix microarray quality control for meta-analyses and database curation. Marta Rosikiewicz
      Database biocuration virtual issue: Uncovering hidden duplicated content in public transcriptomics data. Marta Rosikiewicz, Aurélie Comte, Anne Niknejad, Marc Robinson-Rechavi, and Frederic B. Bastian. Database Vol. 2013, bat010; doi:10.1093/database/bat010
  38. 38. Thank You
      Aurélie Comte, Sébastien Moretti, Anne Niknejad, Marta Rosikiewicz, Marc Robinson-Rechavi, Komal Sanjeev, Mathieu Seppey
      And also:
      • Melissa Haendel
      • Chris Mungall
