Bringing reason to phenotype diversity, character change, and common descent


Talk I gave in the National Center for Biomedical Ontology (NCBO) Webinar Series on Nov 17, 2010.

Abstract, bio, and video recording are at the NCBO website:

Published in: Education, Technology


  1. Bringing reason to phenotype diversity, character change, and common descent. Hilmar Lapp, National Evolutionary Synthesis Center (NESCent). NCBO Webinar, Nov 17, 2010
  2. Life has evolved a stunning diversity of phenotypes. Sources: Regier et al (2010; Parfrey et al (2010, Parfrey & Katz. Images: Web Tree of Life (
  3. Large body of evolutionary phenotype documentation
  4. Phenotype changes inform phylogenetic reconstruction. Figures: Chen & Mayden (2010); Mabee (2000); Sereno (1999); from: Understanding Evolution
  5. As complex free text, phenotypes are resistant to computing (Lundberg and Akama 2005)
  6. Finding similar information in free-text is difficult
      • “lacrymal bone...flat” (Mayden 1989)
      • “lacrimal...small, flat” (Grande and Poyato-Ariza 1999)
      • “lacrimal...triangular” (Royero 1999)
      • “first infraorbital (lachrimal) shape...flattened” (Kailola 2004)
      • “fourth infraorbital...anterior and posterior parallel” (Zanata and Vari 2005)
  7. Computing example: Search by Similarity (Fig. 1 and Fig. 3, Washington et al (2009))
  8. Computing example: Search by Similarity. Trogloglanis pattersoni, a blind catfish (Fig. 1 and Fig. 3, Washington et al (2009))
  9. Integrating across studies? (Fig. 6 and Fig. 7, Sereno (2009))
  10. Computing over comparative morphology? Cyprinus carpio, Pangio anguillaris, Nemacheilus fasciatus, Catostomus commersoni, Gyrinocheilus aymonieri, Phenacogrammus interruptus
  11. Knowledge mining & hypothesis generation
      • Model Organism: Mutagenesis → Mutant or missing protein at specific developmental stage → Phenotype change(s) to wildtype
      • Non-model organisms: Mutation, selection, drift, gene flow → Altered expression or function of protein → Phenotype changes between evolutionary lineages
      • Images: Laue et al (2008); Order Siluriformes, Pimelodus maculatus (labels: middle nuchal plate, predorsal spinelet, spine, anterior nuchal plate); Order Characiformes, Catoprion mento (label: abdominal scutes; scale: 2 cm)
  12. Phenoscape
      • Collaboration between P. Mabee (PI, U. South Dakota), M. Westerfield (ZFIN), and Todd Vision (UNC, NESCent)
      • Aim: Foster devo-evo synthesis by
          • Prototyping a database of curated, machine-interpretable evolutionary phenotypes.
          • Integrating these with mutant phenotypes from model organisms.
          • Enabling data-mining and discovery for candidate genes of evolutionary phenotype transitions.
      • Informatics for the project is developed and hosted at NESCent
  13. Entity-Quality Model for Evolutionary Phenotypes. Character: supraorbital bone shape. State: bent. Entity (TAO): supraorbital bone. Quality (PATO): bent
  14. Entity-Quality Model for Evolutionary Phenotypes. Character: supraorbital bone shape. State: bent. Entity (TAO): supraorbital bone. Quality (PATO): bent
  15. Entity-Quality Model for Evolutionary Phenotypes. Character: supraorbital bone shape. State: bent. Entity (TAO): supraorbital bone. Quality (PATO): bent. Entity and Quality together: Phenotype
  16. Phenotype Assertion: Brycinus brevis exhibits (bent inheres_in some supraorbital bone)
      • Brycinus brevis: Taxon ontology term
      • bent: Phenotypic Quality ontology term
      • supraorbital bone: Anatomy ontology term
      • exhibits: Links a taxon to a phenotype
      • inheres_in: Links a quality to the entity that is its bearer
  17. Phenotype Assertion: Brycinus brevis exhibits (bent inheres_in some supraorbital bone)
      • Brycinus brevis: Taxon ontology term
      • bent: Phenotypic Quality ontology term
      • supraorbital bone: Anatomy ontology term
      • exhibits: Links a taxon to a phenotype
      • inheres_in: Links a quality to the entity that is its bearer
      • Attached to the assertion: Evidence Code, Specimen, Publication
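
A phenotype assertion of this shape could be held in a small record type. The following is a minimal sketch, not Phenoscape's actual data model: the class and field names (`Phenotype`, `PhenotypeAssertion`, `render`) are hypothetical, and terms are plain strings rather than ontology identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    """An Entity-Quality (EQ) phenotype: a quality that inheres in an entity."""
    quality: str  # phenotypic quality term (PATO), e.g. "bent"
    entity: str   # anatomy term (TAO), e.g. "supraorbital bone"

    def render(self) -> str:
        return f"{self.quality} inheres_in some {self.entity}"


@dataclass(frozen=True)
class PhenotypeAssertion:
    """Links a taxon to a phenotype via 'exhibits' (hypothetical record layout)."""
    taxon: str
    phenotype: Phenotype

    def render(self) -> str:
        return f"{self.taxon} exhibits ({self.phenotype.render()})"


assertion = PhenotypeAssertion("Brycinus brevis", Phenotype("bent", "supraorbital bone"))
print(assertion.render())
```

Keeping the phenotype as its own immutable value means the same EQ can be shared by assertions for many taxa and compared by equality.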
  18. Ontology development (Dahdul et al (2009); cover art: K. Luckenbill)
      • Ontologies for
          • Teleost Anatomy
          • Teleost Taxonomy
          • Phenotypic Quality (PATO)
  19. Curation (Dahdul et al., 2010 PLoS ONE)
      • 1. Students: gather publications (scan hard copies, produce OCR PDFs)
      • 2. Students: Manual entry of free text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex
      • 3. Character annotation by experts: Entry of phenotypes and homology assertions using Phenex
      • 4. Consistency checks, upload of data to public view of Phenoscape KB
      • ~ 5 person years
      • Curators: Wasila Dahdul, Miles Coburn, Jeff Engemen, Terry Grande, Eric Hilton, John Lundberg, Paula Mabee, Richard Mayden, Mark Sabaj Pérez
  20. Curated 4,208 characters in 2,310 species from 51 papers; 333,987 evolutionary phenotype assertions; 11,267 phenotype statements about 2,953 genes
  21. Phenoscape Knowledgebase
  22. Full workflow: free-text → EQ → integrated KB
      • Legacy free-text character data (Kailola (2004)): EQ = body lacks all parts of type scale
      • Mutant phenotype (Harris et al. (2008)): EQ = body has fewer parts of type scale. Source text: “Here, we describe the phenotypic and molecular characterization of a set of mutants showing loss of adult structures of the dermal skeleton, such as the rays of the fins and the scales, as well as the pharyngeal teeth. The mutations represent adult-viable, loss of function alleles in the ectodysplasin (eda) and ectodysplasin receptor (edar) genes.”
      • [Diagram: integrated graph with columns Taxon, Gene, Anatomical Entity, Quality. Nodes: Teleostei, Siluriformes, Gasterosteiformes, Ictalurus punctatus, Apeltes quadracus; eda, edadt3S243X/+; anatomical structure, scale, body; has number of, lacks all parts of type, has fewer parts of type. Edges: is_a, inheres_in, towards, variant_of, exhibits, influences. Photo © Jean Ricardo Simões Vitule]
  23. System architecture
      • Knowledgebase User Interface: Web Application for Exploration & Mining (Ruby on Rails, JavaScript); External web sites and client applications
      • Knowledgebase Data Services API (REST)
      • OBD Programming API; OBD Reasoner (Java)
      • Knowledgebase (OBD) (PostgreSQL)
      • Inputs from the OBO Library: Teleost Taxonomy Ontology (TTO); Phenotypic Quality Ontology (PATO); Anatomy Ontologies (ZFA, TAO)
      • Inputs from the Zebrafish Model Organism Database: Genes & genotypes; Mutant EQ phenotypes
      • Further inputs: Homology assertions; Evolutionary EQ Phenotypes (through annotation), as NeXML from Phenex (Evolutionary EQ annotation) of Skeletal Character Data (from phylogenetic treatments in literature)
  24. KB is based on OBD (Ontology-Based Database) (C. Mungall, LBL)
  25. [Diagram of entities and relations. Nodes: TAO:taxon, Taxon, Phenotype (class expression), PATO:quality, TAO:entity, Genotype, Gene, Measurement (value/max/min, unit), ECO:evidence, ZFIN:Publication (uid = ZFIN ID), PHENO:Publication (dc:abstract, dc:bibliographicCitation, dc:date), curator(s), Specimen (dwc:individualID, dwc:collectionID, dwc:catalogID), CDAO:TU, CDAO:CharacterStateDataMatrix, CDAO:Character (name = character text), CDAO:CharacterStateDatum, CDAO:CharacterStateDomain (name = state text), comments (literal text). Relations: PHENO:exhibits, OBO_REL:is_a, OBO_REL:inheres_in, OBO_REL:towards, OBO_REL:influences, OBO_REL:variant_of, OBO_REL:posited_by, has_measurement, has_evidence, dc:creator, PHENO:asserted_for_otu, PHENO:has_taxon, PHENO:has_publication, PHENO:has_comment, CDAO:has_TU, CDAO:has_Character, CDAO:has_Datum, CDAO:has_State]
  26. Phenoscape OBD reasoner
      • All OBD built-in rules:
          • is_transitive(R), X R Y, Y R Z → X R Z
          • is_reflexive(R) → X R X
          • X is_a Y, Y R some Z → X R some Z
          • Y is_a Z, X R some Y → X R some Z
          • X R Y, Y S Z, transitive_over(R,S) → X R Z
          • X R1 Y, Y R2 Z, holds_over_chain(R,R1,R2) → X R Z
      • Additional Phenoscape rule:
          • T exhibits P, T is_a T’ → T’ exhibits P
      • Consistent with OWL due to instance quantification
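
Rules of this kind can be applied by forward chaining over a set of triples until nothing new is derived. The sketch below is illustrative only and is not the OBD reasoner: it implements just two of the rules listed above (transitivity and the additional Phenoscape rule), the function name `saturate` is hypothetical, and the direct `is_a` link from species to order skips intermediate ranks for brevity.

```python
def saturate(triples, transitive_relations=("is_a",)):
    """Forward-chain two of the rules above until no new triples are derived."""
    facts = set(triples)
    while True:
        derived = set()
        for (x, r, y) in facts:
            for (y2, s, z) in facts:
                # is_transitive(R), X R Y, Y R Z -> X R Z
                if y == y2 and r == s and r in transitive_relations:
                    derived.add((x, r, z))
                # Phenoscape rule: T exhibits P, T is_a T' -> T' exhibits P
                if x == y2 and r == "exhibits" and s == "is_a":
                    derived.add((z, "exhibits", y))
        if derived <= facts:
            return facts
        facts |= derived


scale_loss = "body lacks all parts of type scale"
kb = saturate({
    ("Ictalurus punctatus", "is_a", "Siluriformes"),  # simplified taxonomy (assumption)
    ("Siluriformes", "is_a", "Teleostei"),
    ("Ictalurus punctatus", "exhibits", scale_loss),
})
print(sorted(t for t in kb if t[1] == "exhibits"))
```

After saturation, a query for taxa exhibiting scale loss also returns Siluriformes and Teleostei, which is what lets a phenotype annotated at species level be found from a higher taxon.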
  27. Major taxonomic groups have similar distribution of entities among phenotypes
  28. Substantial overlap between model organism and evolutionary phenotypes: 4,217 zebrafish phenotypes; 3,405 evolutionary characters. [Bar chart: evolutionary characters and zebrafish phenotypes (axis 0 to 2500) by system: hematopoietic, reproductive, musculature, liver and biliary, respiratory, renal, endocrine, immune, digestive, cardiovascular, skeletal, sensory, nervous]
  29. Hypothesis generation: Genetic basis for scale loss in Siluriformes. Mutation of eda gene in Danio: Harris et al., 2008. Ictalurus punctatus: Copyright © Jean Ricardo Simões Vitule, All Rights Reserved
  30. Hypothesis generation: Genetic basis for absence of the basihyal bone in Siluriformes. Mutation of brpf1 gene in Danio: Laue et al (2008). Ictalurus punctatus
  31. Making PATO usable for evolutionary data (Attribute: Example Qualities)
      • Color: black, colorless
      • Composition: cartilaginous, ossified
      • Count: present, absent
      • Position: horizontal, vertical
      • Shape: concave, interdigitated, lobed, triangular
      • Size: increased length, decreased length
      • Texture: wrinkled, smooth
      • Relational Spatial: anterior to, lateral to
      • Relational Structural: fused with, overlap with, separated from
      • Quality: open, closed, flexible
      • Relational Shape: protruding into
  32. Getting PATO right is a challenge
      • PATO is “single-inheritance” - what is the right axis of classification?
          • relational shape vs monadic shape
          • relational spatial vs position
          • shape and size vs natural language: “Interopercle shape: expanded posteroventrally”
      • Different ways to observe or generate a phenotypic quality
          • Color as color hue (radiation quality) or pigmentation (structural quality)
      • Relative sizes don’t have a universal reference
      • Negation (“not round”, “unelongated”): means complement under attribute ‘(shape and not(round))’
  33. Mapping EQs back to characters is a challenge
      • Properties of “good” phylogenetic characters:
          • Exclusivity of states
          • Distinguishability of states
          • Independence of characters
      • Finding exclusive states requires incompatible phenotypes. How to determine incompatibility?
          • Two phenotypes are incompatible iff they cannot both inhere in the same specimen.
          • Two qualities are incompatible iff an entity cannot bear both.
  34. Which EQs and qualities are incompatible?
      • Incompatible Qs
          • present vs. absent
          • triangular vs. round
          • absent vs. any other quality
      • Compatible Qs
          • present vs. any other quality (except absent)
          • serrated vs. round
          • some colors
      • Incompatible EQs
          • (Q inheres_in bone E) vs (cartilage E absent)
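
The quality-level cases above can be written down as a small predicate. This is a sketch under stated assumptions rather than the project's method: the name `qualities_incompatible` and the `EXCLUSIVE_PAIRS` table are hypothetical, only the pair "triangular vs. round" comes from the slide, and the "some colors" and EQ-level cases are not covered.

```python
# Pairs of qualities treated as mutually exclusive. Only "triangular vs. round"
# is taken from the slide; a real table would need expert curation (assumption).
EXCLUSIVE_PAIRS = {frozenset({"triangular", "round"})}


def qualities_incompatible(q1: str, q2: str) -> bool:
    """Return True if one entity cannot bear both qualities."""
    if q1 == q2:
        return False
    pair = frozenset({q1, q2})
    if "absent" in pair:
        return True   # absent vs. any other quality, including present
    if "present" in pair:
        return False  # present vs. any other quality (except absent)
    return pair in EXCLUSIVE_PAIRS


print(qualities_incompatible("present", "absent"))    # incompatible
print(qualities_incompatible("serrated", "round"))    # compatible
```

Checking `absent` before `present` matters: it makes "present vs. absent" fall under the incompatible case before the more permissive rule for `present` is reached.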
  35. Detecting phenotype change and variation
      • Hemiodus argenteus: {shape:bent inheres_in supraorbital bone, count:absent inheres_in upper pharyngeal 5 tooth}
      • Hemiodus unimaculatus: {shape:straight inheres_in supraorbital bone, count:present inheres_in upper pharyngeal 5 tooth}
      • Hemiodus: {shape:bent inheres_in supraorbital bone, shape:straight inheres_in supraorbital bone, count:absent inheres_in upper pharyngeal 5 tooth, count:present inheres_in upper pharyngeal 5 tooth}
      • Result: {Change in: shape inheres_in supraorbital bone, Change in: count inheres_in upper pharyngeal 5 tooth}
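
The Hemiodus example can be reproduced with plain sets: the genus profile is the union of its species' phenotypes, and a change is flagged wherever the same attribute of the same entity carries more than one quality. This is a minimal sketch; the function names are hypothetical, and "more than one quality" is used as a simple stand-in for a proper incompatibility check.

```python
from collections import defaultdict

# Each phenotype is (attribute, quality, entity), following the slide's notation
# "shape:bent inheres_in supraorbital bone".
species = {
    "Hemiodus argenteus": {
        ("shape", "bent", "supraorbital bone"),
        ("count", "absent", "upper pharyngeal 5 tooth"),
    },
    "Hemiodus unimaculatus": {
        ("shape", "straight", "supraorbital bone"),
        ("count", "present", "upper pharyngeal 5 tooth"),
    },
}


def clade_profile(children):
    """Union of the phenotypes exhibited by the child taxa."""
    return set().union(*children.values())


def changed_characters(profile):
    """Return (attribute, entity) pairs that carry more than one quality."""
    qualities = defaultdict(set)
    for attribute, quality, entity in profile:
        qualities[(attribute, entity)].add(quality)
    return {key for key, values in qualities.items() if len(values) > 1}


hemiodus = clade_profile(species)
print(sorted(changed_characters(hemiodus)))
```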
  36. Visualizing phenotype profiles on a tree
      • Phenotypic profile: Query taxa with these phenotypes: dorsal fin absent (including parts); adipose fin absent (including parts); opercle triangular (including parts). Option: Include inferred phenotypes
      • Phenotypic profile tree: Taxon color indicates the greatest level of match of specified phenotype(s) found within a species in the clade
      • Phenotype match legend: 100%, 75%, 50%, <50%
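
One plausible way to compute the match level behind the taxon colors is the fraction of queried phenotypes a species exhibits, taking the maximum over the species in a clade. The sketch below assumes exactly that; the function names and the species annotations are hypothetical, and "including parts" and inferred phenotypes are not modeled.

```python
def match_level(species_phenotypes, query):
    """Fraction of the queried phenotypes that one species exhibits."""
    return len(query & species_phenotypes) / len(query)


def clade_match(clade, query):
    """Greatest match level found within any species in the clade."""
    return max(match_level(phenotypes, query) for phenotypes in clade.values())


query = {("dorsal fin", "absent"), ("adipose fin", "absent"), ("opercle", "triangular")}
clade = {  # hypothetical species and annotations, for illustration only
    "species A": {("dorsal fin", "absent"), ("opercle", "triangular")},
    "species B": {("adipose fin", "absent")},
}
print(f"{clade_match(clade, query):.0%}")
```

Taking the maximum rather than pooling phenotypes across species keeps a clade from scoring 100% when no single species shows the whole profile.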
  37. Navigating phenotype variation on a tree
      • Entity term: basihyal bone
      • Taxonomic distribution of [...] for basihyal bone
      • Options: Limit tree to Cypriniformes; Show taxa without phenotype data; Show taxa with unspecified shape phenotypes
      • Tree taxa: Ostariophysi, Cypriniformes, Cyprinidae, Balitoridae, Gyrinocheilidae, Cobitidae, Vaillantellidae, Botiidae, Catostomidae, Psilorhynchidae
      • Sets of taxa with matching Phenotype Profiles (Phenotype: Taxa, Phenotype Profiles)
          • triangular: 1, 1
          • Y-shaped: 1, 1
          • shape: 8, 3
  38. Reasoning over homology (image by Kyle Luckenbill, ANSP). Rows give Entity 1 (Taxon 1), Relationship, Entity 2 (Taxon 2); Evidence; Reference(s)
      • scaphium (Otophysi) homologous_to neural arch 1 (Teleostei); IDS, IMS, IPS; (Fink and Fink, 1981; Rosen and Greenwood, 1970)
      • intercalarium (Otophysi) homologous_to neural arch 2 (ventral portion) (Teleostei); IDS, IMS, IPS; (Rosen and Greenwood, 1970)
      • intercalarium (Otophysi) homologous_to neural arch 2 (Teleostei); NAS; (Fink and Fink, 1981)
      • intercalarium (Otophysi) homologous_to neural arch 2 (Teleostei); IMS; (Hora, 1922)
      • intercalarium (Otophysi) homologous_to rib of vertebra 2 (Teleostei); TAS; (Hora 1922)
      • tripus (Otophysi) homologous_to parapophysis + rib of vertebra 3 (Teleostei); IDS, IMS, IPS; (Fink and Fink, 1981; Rosen and Greenwood, 1970)
  39. Formalizing homology relationships
      • Formal pattern is ternary: E1 in_taxon T1 homologous_to E2 in_taxon T2 as E3 in_taxon T3
      • Classifying homology relationships
          • 1-1 homology (phylogenetic homology)
          • serial homology
      • A iso_homologous_to B as C: all A derived_by_descent_from some (C and has_derived_by_descendent some B) and all B derived_by_descent_from some (C and has_derived_by_descendent some A)
      • shares_ancestor_with as a relation chain: derived_by_descent_from o has_derived_by_descendent
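
The relation chain in the last bullet is ordinary relation composition. The sketch below shows it over explicit pairs; it is a simplification, since the slide's axioms are class-level OWL expressions while this works on individual pairs. The helper names, the treatment of has_derived_by_descendent as the inverse of derived_by_descent_from, and the structure "ancestral neural arch 1" are all assumptions made for illustration.

```python
def compose(r, s):
    """Relation chain r o s: pairs (a, c) such that a r b and b s c for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def inverse(r):
    """Swap the two sides of every pair in the relation."""
    return {(b, a) for (a, b) in r}


# "ancestral neural arch 1" is a hypothetical ancestral structure.
derived_by_descent_from = {
    ("scaphium", "ancestral neural arch 1"),
    ("neural arch 1", "ancestral neural arch 1"),
}
# Assumption: has_derived_by_descendent is the inverse relation.
has_derived_by_descendent = inverse(derived_by_descent_from)

shares_ancestor_with = compose(derived_by_descent_from, has_derived_by_descendent)
print(("scaphium", "neural arch 1") in shares_ancestor_with)
```

Because the chain passes through a shared ancestor in both directions, the resulting relation is symmetric, and each structure also shares an ancestor with itself.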
  40. Option 1: Asserting homology at higher-level taxa
  41. Option 2: Asserting homology at species level
  42. Validation through standard OWL-DL reasoning
  43. Opening descriptive biological data to computing can enable new science
      • Descriptive biology: Phenotypes, Traits, Function, Behavior, Habitat, Life Cycle, Reproduction, Conservation Threats
      • Connected fields: Taxonomy, Species ID; Conservation Biology; Biodiversity (Specimens, Occurrence records); Ecology; Physiology; Genetics, Genetic variation; Genomics, Gene expression
  44. Acknowledgements
      • Phenoscape Personnel & PIs: P. Mabee, M. Westerfield, T. Vision, J. Balhoff, C. Kothari, W. Dahdul, P. Midford
      • Phenoscape curators & workshop participants
      • Berkeley Bioinformatics & Ontologies Project (BBOP): C. Mungall, S. Lewis
      • National Evolutionary Synthesis Center (NESCent)
      • NSF (DBI 0641025)