Bringing reason to
     phenotype diversity,
    character change, and
       common descent

                Hilmar Lapp
National Evolutionary Synthesis Center (NESCent)
           NCBO Webinar, Nov 17, 2010
Regier et al (2010)
                                            Parfrey et al (2010, Parfrey & Katz


                       Life has
                      evolved a
                      stunning
                     diversity of
                     phenotypes


                                    Images: Web Tree of Life (http://tolweb.org)
Large body of evolutionary
phenotype documentation
Phenotype changes inform
phylogenetic reconstruction

Figures: Chen & Mayden (2010); Mabee (2000); Sereno (1999)
from: Understanding Evolution
As complex free text, phenotypes
   are resistant to computing




              (Lundberg and Akama 2005)
Finding similar information
       in free-text is difficult
  “lacrymal bone...flat”                                   Mayden 1989
  “lacrimal...small, flat”                                 Grande and Poyato-Ariza 1999
  “lacrimal...triangular”                                  Royero 1999
  “first infraorbital (lachrimal) shape...flattened”       Kailola 2004
  “fourth infraorbital...anterior and
   posterior margins...in parallel”                        Zanata and Vari 2005
Computing
 example:
Search by
Similarity
                                                                            Fig. 3, Washington et al (2009)




                                           Trogloglanis pattersoni - a blind catfish
                                                                            http://tolweb.org/Trogloglanis/69910

         Fig. 1, Washington et al (2009)
Integrating across studies?




Fig. 7, Sereno (2009)




                        Fig. 6, Sereno (2009)
Computing over
comparative morphology?



  Cyprinus carpio         Pangio anguillaris        Nemacheilus fasciatus




Catostomus commersoni   Gyrinocheilus aymonieri   Phenacogrammus interruptus
Knowledge mining &
        hypothesis generation

Model Organism
  • Mutagenesis
  • Mutant or missing protein at specific developmental stage
  • Phenotype change(s) to wildtype

Non-model organisms
  • Mutation, selection, drift, gene flow
  • Altered expression or function of protein
  • Phenotype changes between evolutionary lineages

[Figures: Laue et al (2008); Pimelodus maculatus (Order Siluriformes) and Catoprion mento (Order Characiformes), scale bar 2 cm. Labeled structures: anterior nuchal plate, middle nuchal plate, spinelet, predorsal spine, abdominal scutes]
Phenoscape
• Collaboration between P. Mabee (PI, U. South
  Dakota), M. Westerfield (ZFIN), and Todd Vision
  (UNC, NESCent)

• Aim: Foster devo-evo synthesis by
  • Prototyping a database of curated, machine-
     interpretable evolutionary phenotypes.

  • Integrating these with mutant phenotypes from
     model organisms.
  • Enabling data-mining and discovery for candidate
     genes of evolutionary phenotype transitions.

• Informatics for the project is developed and hosted
  at NESCent
Entity-Quality Model for
Evolutionary Phenotypes

        Character                       State
   supraorbital bone shape              bent

      Entity (TAO)                  Quality (PATO)
   supraorbital bone                    bent

   Entity + Quality together form the Phenotype
Phenotype Assertion

   Brycinus brevis   exhibits some   (bent inheres_in supraorbital bone)

 • Brycinus brevis: Taxon ontology term
 • exhibits: Links a taxon to a phenotype
 • bent: Phenotypic Quality ontology term
 • inheres_in: Links a quality to the entity that is its bearer
 • supraorbital bone: Anatomy ontology term

 Attached to each assertion: Evidence Code, Specimen, Publication
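
The assertion pattern above can be sketched as a small data structure. This is only an illustration under stated assumptions, not Phenoscape's actual schema: the class names `Phenotype` and `PhenotypeAssertion` and all field names are hypothetical, and the evidence code, specimen, and publication are left as optional plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Phenotype:
    """An EQ phenotype: a quality (PATO term) inhering in an entity (anatomy term)."""
    quality: str
    entity: str

    def __str__(self) -> str:
        return f"{self.quality} inheres_in {self.entity}"


@dataclass(frozen=True)
class PhenotypeAssertion:
    """Links a taxon to a phenotype, with optional supporting metadata."""
    taxon: str
    phenotype: Phenotype
    evidence_code: Optional[str] = None  # hypothetical field names
    specimen: Optional[str] = None
    publication: Optional[str] = None

    def __str__(self) -> str:
        return f"{self.taxon} exhibits some ({self.phenotype})"


assertion = PhenotypeAssertion(
    taxon="Brycinus brevis",
    phenotype=Phenotype(quality="bent", entity="supraorbital bone"),
)
print(assertion)
```

Keeping the phenotype as its own value makes it possible to attach the same EQ to several taxa, or to a genotype, without repeating the entity and quality terms.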
Ontology development

Ontologies for
• Teleost Anatomy
• Teleost Taxonomy
• Phenotypic Quality (PATO)

Dahdul et al (2009)
Cover art: K. Luckenbill
Curation
Dahdul et al., 2010 PLoS ONE

1. Students: gather publications (scan hard copies, produce OCR PDFs)
2. Students: manual entry of free-text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex
3. Character annotation by experts: entry of phenotypes and homology assertions using Phenex
4. Consistency checks, upload of data to public view of Phenoscape KB

~ 5 person years

Curators:
Wasila Dahdul, Miles Coburn, Jeff Engemen, Terry Grande, Eric Hilton,
John Lundberg, Paula Mabee, Richard Mayden, Mark Sabaj Pérez
• Curated 4,208 characters in 2,310 species
  from 51 papers

• 333,987 evolutionary phenotype assertions
• 11,267 phenotype statements about 2,953
  genes
Phenoscape Knowledgebase
Full workflow:
              free-text → EQ → integrated KB

legacy free-text character data: Kailola (2004)
    (photo © Jean Ricardo Simões Vitule)
    EQ = body lacks all parts of type scale

mutant phenotype: Harris et al. (2008)
    Here, we describe the phenotypic and molecular characterization of a set of
    mutants showing loss of adult structures of the dermal skeleton, such as the
    rays of the fins and the scales, as well as the pharyngeal teeth. The mutations
    represent adult-viable, loss of function alleles in the ectodysplasin (eda) and
    ectodysplasin receptor (edar) genes.
    EQ = body has fewer parts of type scale

integrated KB (graph figure):
  • Taxon: Siluriformes is_a Teleostei; Gasterosteiformes is_a Teleostei;
    Ictalurus punctatus is_a Siluriformes; Apeltes quadracus is_a Gasterosteiformes
  • Gene: eda^dt3S243X/+ variant_of eda
  • Anatomical Entity: body is_a anatomical structure; scale is_a anatomical structure
  • Quality: lacks all parts of type is_a has number of;
    has fewer parts of type is_a has number of
  • Phenotypes (each inheres_in body, towards scale):
    Ictalurus punctatus exhibits body lacks all parts of type scale
    Apeltes quadracus exhibits body lacks all parts of type scale
    eda^dt3S243X/+ influences body has fewer parts of type scale
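
The integrated graph suggests how a query might connect an evolutionary phenotype to a candidate gene: both EQs share the entity `body` and the related entity `scale`, and both qualities are subtypes of `has number of`. The following is a minimal in-memory sketch of that idea, assuming hypothetical function names (`ancestors`, `similar`, `candidate_genes`) and collapsing the genotype to its gene for brevity. It is not the query the knowledgebase actually runs.

```python
# Quality hierarchy (child -> parent), transcribed from the slide's graph.
QUALITY_IS_A = {
    "lacks all parts of type": "has number of",
    "has fewer parts of type": "has number of",
}

# (subject, relation, (entity, quality, related entity)), from the slide.
STATEMENTS = [
    ("Ictalurus punctatus", "exhibits", ("body", "lacks all parts of type", "scale")),
    ("Apeltes quadracus", "exhibits", ("body", "lacks all parts of type", "scale")),
    ("eda", "influences", ("body", "has fewer parts of type", "scale")),
]


def ancestors(quality):
    """Return the quality itself plus all of its is_a ancestors."""
    found = {quality}
    while quality in QUALITY_IS_A:
        quality = QUALITY_IS_A[quality]
        found.add(quality)
    return found


def similar(eq1, eq2):
    """Two EQs match if their entities agree and their qualities share an ancestor."""
    (e1, q1, r1), (e2, q2, r2) = eq1, eq2
    return e1 == e2 and r1 == r2 and bool(ancestors(q1) & ancestors(q2))


def candidate_genes(taxon):
    """Genes whose mutant phenotypes resemble a phenotype exhibited by the taxon."""
    taxon_eqs = [eq for s, rel, eq in STATEMENTS if s == taxon and rel == "exhibits"]
    return sorted({
        s for s, rel, eq in STATEMENTS
        if rel == "influences" and any(similar(eq, t) for t in taxon_eqs)
    })


print(candidate_genes("Ictalurus punctatus"))  # ['eda']
```

In a full ontology every quality eventually shares a root term, so a realistic matcher would need to weight shared ancestors by how specific they are. The toy hierarchy here sidesteps that by having only one level.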
System architecture

• Clients
  • Knowledgebase User Interface: Web Application for Exploration & Mining (Ruby on Rails, JavaScript)
  • External web sites and client applications
• Knowledgebase Data Services API (REST)
• OBD Programming API (Java) and OBD Reasoner
• Knowledgebase (OBD) (PostgreSQL)
• Data loaded into the knowledgebase
  • Genes & genotypes and Mutant EQ phenotypes from Zebrafish Model Organism Database
  • Homology assertions
  • Evolutionary EQ Phenotypes (through annotation), delivered as NeXML from Phenex (Evolutionary EQ annotation) of Skeletal Character Data (from phylogenetic treatments in literature)
  • Ontologies from the OBO Library: Teleost Taxonomy Ontology (TTO), Anatomy Ontologies (ZFA, TAO), Phenotypic Quality Ontology (PATO)
KB is based on OBD
(Ontology-Based Database)
        (C. Mungall, LBL)
[Figure: Phenoscape data model in OBD]
• Core nodes: TAO:taxon, Phenotype (class expression), PATO:quality, TAO:entity, Genotype, Gene
• Annotation metadata: ECO:evidence, curator(s), Measurement (value/max/min, unit), ZFIN:Publication (uid = ZFIN ID)
• Character matrix nodes: CDAO:CharacterStateDataMatrix, CDAO:Character (name = character text), CDAO:CharacterStateDatum, CDAO:CharacterStateDomain (name = state text), CDAO:TU (name = Publication Taxon)
• Specimen nodes: Specimen, COLLECTION, catalog number (literal text)
• Publication nodes: PHENO:Publication (dc:abstract, dc:bibliographicCitation, dc:date), publication notes (literal text), comments (literal text)
• Relations: PHENO:exhibits, OBO_REL:is_a, OBO_REL:inheres_in, OBO_REL:towards, OBO_REL:influences, OBO_REL:variant_of, OBO_REL:posited_by, has_evidence, has_measurement, dc:creator, PHENO:asserted_for_otu, PHENO:has_taxon, PHENO:has_publication, PHENO:has_comment, CDAO:has_TU, CDAO:has_Character, CDAO:has_Datum, CDAO:has_State, dwc:individualID, dwc:collectionID, dwc:catalogID
Phenoscape OBD reasoner
 • All OBD built-in rules:
   •   is_transitive(R), X R Y, Y R Z → X R Z
   •   is_reflexive(R) → X R X
   •   X is_a Y, Y R some Z → X R some Z
   •   Y is_a Z, X R some Y → X R some Z
   •   X R Y, Y S Z, transitive_over(R,S) → X R Z
   •   X R1 Y, Y R2 Z, holds_over_chain(R,R1,R2) → X R Z
 • Additional Phenoscape rule:
   • T exhibits P, T is_a T’ → T’ exhibits P
     • Consistent with OWL due to instance quantification
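
Rules of this kind can be applied by simple forward chaining: keep deriving new triples until nothing changes. The sketch below is not the OBD reasoner. It implements only two of the rules listed above (transitivity of `is_a`, and the Phenoscape rule that propagates `exhibits` up the taxonomy); the function name `saturate` and the triple representation are assumptions made for illustration.

```python
def saturate(facts):
    """Apply the two rules until no new (subject, relation, object) triple appears."""
    facts = set(facts)
    while True:
        new = set()
        for (x, r, y) in facts:
            if r != "is_a":
                continue
            for (a, s, b) in facts:
                # is_transitive(is_a): X is_a Y, Y is_a Z -> X is_a Z
                if s == "is_a" and a == y:
                    new.add((x, "is_a", b))
                # Phenoscape rule: T exhibits P, T is_a T' -> T' exhibits P
                if s == "exhibits" and a == x:
                    new.add((y, "exhibits", b))
        if new <= facts:
            return facts
        facts |= new


asserted = {
    ("Ictalurus punctatus", "is_a", "Siluriformes"),
    ("Siluriformes", "is_a", "Teleostei"),
    ("Ictalurus punctatus", "exhibits", "body lacks all parts of type scale"),
}
closure = saturate(asserted)
for triple in sorted(closure - asserted):
    print(triple)
```

Here `exhibits` on a higher taxon reads as "some member of this taxon exhibits the phenotype", which is why propagating it upward does not claim that every species in Teleostei lacks scales.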
Major taxonomic groups have
similar distribution of entities
      among phenotypes
Substantial overlap between
                 model organism and
               evolutionary phenotypes
•4,217 zebrafish phenotypes
•3,405 evolutionary characters

[Bar chart: number of evolutionary characters and of zebrafish phenotypes per anatomical system, x-axis from 0 to 2500. Systems shown: hematopoietic, reproductive, musculature, liver and biliary, respiratory, renal, endocrine, immune, digestive, cardiovascular, skeletal, sensory, nervous]
Hypothesis generation:
    Genetic basis for scale loss in
            Siluriformes

Mutation of eda gene in Danio:       Ictalurus punctatus:

               Harris et al. (2008)


                                     Copyright © Jean Ricardo Simões Vitule, All Rights Reserved
Hypothesis generation:
  Genetic basis for absence of the
   basihyal bone in Siluriformes
Mutation of brpf1 gene in Danio:  Ictalurus punctatus:




              Laue et al (2008)
Making PATO usable for evolutionary data




   Attribute               Example Qualities
   Color                   black, colorless
   Composition             cartilaginous, ossified
   Count                   present, absent
   Position                horizontal, vertical
   Quality                 open, closed, flexible
   Relational Shape        protruding into
   Relational Spatial      anterior to, lateral to
   Relational Structural   fused with, overlap with, separated from
   Shape                   concave, interdigitated, lobed, triangular
   Size                    increased length, decreased length
   Texture                 wrinkled, smooth
Getting PATO right is a challenge
 • PATO is “single-inheritance” - what is the right
   axis of classification?

   • relational shape vs monadic shape
   • relational spatial vs position
   • shape and size vs natural language
      “Interopercle shape: expanded posteroventrally”

 • Different ways to observe or generate a
   phenotypic quality

   • Color as color hue (radiation quality) or
      pigmentation (structural quality)

 • Relative sizes don’t have a universal reference
 • Negation (“not round”, “unelongated”): means
   complement under attribute ‘(shape and not(round))’
Mapping EQs back to
 characters is a challenge
• Properties of “good” phylogenetic characters:
  • Exclusivity of states
  • Distinguishability of states
  • Independence of characters
• Finding exclusive states requires incompatible
  phenotypes. How to determine incompatibility?

  • Two phenotypes are incompatible iff they
     cannot both inhere in the same specimen.

  • Two qualities are incompatible iff an entity
     cannot bear both.
Which EQs and qualities
  are incompatible?
• Incompatible Qs
  • present vs. absent
  • triangular vs. round
  • absent vs. any other quality
• Compatible Qs
  • present vs. any other quality (except absent)
  • serrated vs. round
  • some colors
• Incompatible EQs
  • (Q inheres_in bone E) vs (cartilage E absent)
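
The quality-level examples above could be encoded as a small lookup. This is a sketch under assumptions: the function name `qualities_incompatible` and the explicit pair table are hypothetical, only the pairs named on the slide are included, and the EQ-level case (bone E vs. cartilage E absent) is not covered.

```python
# Explicitly declared incompatible pairs (unordered), taken from the slide.
INCOMPATIBLE_PAIRS = {
    frozenset({"present", "absent"}),
    frozenset({"triangular", "round"}),
}


def qualities_incompatible(q1, q2):
    """Return True if, under the rules above, no single entity can bear both."""
    if q1 == q2:
        return False
    # 'absent' is incompatible with any other quality.
    if "absent" in (q1, q2):
        return True
    return frozenset({q1, q2}) in INCOMPATIBLE_PAIRS


print(qualities_incompatible("triangular", "round"))
print(qualities_incompatible("serrated", "round"))
```

A pair table is used rather than a blanket "same attribute means incompatible" rule, because the slide lists serrated vs. round and some colors as compatible.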
Detecting phenotype
        change and variation
Hemiodus argenteus     {shape:bent inheres_in supraorbital bone,
                        count:absent inheres_in upper pharyngeal 5 tooth}

Hemiodus unimaculatus  {shape:straight inheres_in supraorbital bone,
                        count:present inheres_in upper pharyngeal 5 tooth}

Hemiodus               {shape:bent inheres_in supraorbital bone,
                        shape:straight inheres_in supraorbital bone,
                        count:absent inheres_in upper pharyngeal 5 tooth,
                        count:present inheres_in upper pharyngeal 5 tooth}

Variation detected within Hemiodus:
      {Change in: shape inheres_in supraorbital bone,
       Change in: count inheres_in upper pharyngeal 5 tooth}
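
The Hemiodus example can be reproduced with two small set operations: the genus profile is the union of the species profiles, and a change is flagged wherever one attribute of one entity carries more than one value. The tuple layout `(attribute, value, entity)` and the function names are assumptions made for this sketch.

```python
def parent_profile(*child_profiles):
    """A higher taxon collects every phenotype found in any of its children."""
    return set().union(*child_profiles)


def changed_characters(profile):
    """Return (attribute, entity) pairs that carry more than one quality value."""
    values = {}
    for attribute, value, entity in profile:
        values.setdefault((attribute, entity), set()).add(value)
    return {key for key, vals in values.items() if len(vals) > 1}


argenteus = {
    ("shape", "bent", "supraorbital bone"),
    ("count", "absent", "upper pharyngeal 5 tooth"),
}
unimaculatus = {
    ("shape", "straight", "supraorbital bone"),
    ("count", "present", "upper pharyngeal 5 tooth"),
}

hemiodus = parent_profile(argenteus, unimaculatus)
for attribute, entity in sorted(changed_characters(hemiodus)):
    print(f"Change in: {attribute} inheres_in {entity}")
```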
Visualizing phenotype profiles on a tree
Phenotypic profile:
  Phenotypes
  • dorsal fin absent (including parts)
  • adipose fin absent (including parts)
  • opercle triangular (including parts)
  Option: Include inferred phenotypes
  Action: Query taxa with these phenotypes.

Phenotypic profile tree
  Taxon color indicates the greatest level of match of specified phenotype(s)
  found within a species in the clade.
  Phenotype match legend: 100%, 75%, 50%, <50%
Navigating phenotype variation on a tree

    entity term: basihyal bone
    Taxonomic distribution of [...] for basihyal bone

    Limit tree to: Cypriniformes
    Options:
      • Show taxa without phenotype data
      • Show taxa with unspecified shape phenotypes

    Tree path: Ostariophysi > Cypriniformes

    Phenotype      Taxa    Phenotype Profiles
    triangular     1       1
    Y-shaped       1       1
    shape          8       3

    Sets of taxa with matching Phenotype Profiles:
    Cyprinidae, Balitoridae, Gyrinocheilidae, Cobitidae, Vaillantellidae,
    Botiidae, Catostomidae, Psilorhynchidae
Reasoning over homology

Entity 1        Taxon 1    Relationship    Entity 2                            Taxon 2     Evidence         Reference(s)
scaphium        Otophysi   homologous_to   neural arch 1                       Teleostei   IDS, IMS, IPS    (Fink and Fink, 1981; Rosen and Greenwood, 1970)
intercalarium   Otophysi   homologous_to   neural arch 2 (ventral portion)     Teleostei   IDS, IMS, IPS    (Rosen and Greenwood, 1970)
intercalarium   Otophysi   homologous_to   neural arch 2                       Teleostei   NAS              (Fink and Fink, 1981)
intercalarium   Otophysi   homologous_to   neural arch 2                       Teleostei   IMS              (Hora, 1922)
intercalarium   Otophysi   homologous_to   rib of vertebra 2                   Teleostei   TAS              (Hora, 1922)
tripus          Otophysi   homologous_to   parapophysis + rib of vertebra 3    Teleostei   IDS, IMS, IPS    (Fink and Fink, 1981; Rosen and Greenwood, 1970)

 image by Kyle Luckenbill, ANSP
Formalizing homology
        relationships
• Formal pattern is ternary:
    E1 in_taxon T1 homologous_to E2 in_taxon T2 as E3 in_taxon T3

• Classifying homology relationships
    • 1-1 homology (phylogenetic homology)
    • serial homology
•   A iso_homologous_to B as C
       all A derived_by_descent_from some
                 (C and has_derived_by_descendent some B)
       and
       all B derived_by_descent_from some
               (C and has_derived_by_descendent some A)

• shares_ancestor_with as a relation chain:
    derived_by_descent_from o has_derived_by_descendent
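
The relation chain in the last bullet can be illustrated with plain set composition. This sketch works on individual structures rather than on classes with all/some quantifiers, and it assumes that `has_derived_by_descendent` is the inverse of `derived_by_descent_from`; the names A to E are placeholders, not ontology terms.

```python
def compose(r1, r2):
    """Relation chain r1 o r2: pairs (x, z) with x r1 y and y r2 z for some y."""
    return {(x, z) for (x, y1) in r1 for (y2, z) in r2 if y1 == y2}


def inverse(relation):
    """Swap subject and object of every pair."""
    return {(y, x) for (x, y) in relation}


# Placeholder structures: A and B both derive from ancestral structure C.
derived_by_descent_from = {("A", "C"), ("B", "C"), ("D", "E")}
has_derived_by_descendent = inverse(derived_by_descent_from)  # assumed inverse

shares_ancestor_with = compose(derived_by_descent_from, has_derived_by_descendent)
print(sorted(shares_ancestor_with))
```

Because the chain goes up to an ancestor and back down, every structure with an ancestor also shares that ancestor with itself, so the derived relation is symmetric and includes pairs such as (A, A).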
Option 1: Asserting homology
     at higher-level taxa
Option 2: Asserting homology
       at species level
Validation through standard
     OWL-DL reasoning
Opening descriptive biological data to
 computing can enable new science
[Diagram: Descriptive biology at the center, linked to surrounding fields]

Descriptive biology
  - Phenotypes
  - Traits
  - Function
  - Behavior
  - Habitat
  - Life Cycle
  - Reproduction
  - Conservation Threats

Surrounding fields
  - Taxonomy, Species ID
  - Conservation Biology
  - Biodiversity (Specimens, Occurrence records)
  - Ecology
  - Physiology
  - Genetics
  - Genetic variation
  - Genomics, Gene expression
Acknowledgements

• Phenoscape Personnel & PIs:
  P. Mabee, M. Westerfield, T. Vision, J. Balhoff, C. Kothari, W. Dahdul, P. Midford

• Phenoscape curators & workshop participants

• Berkeley Bioinformatics & Ontologies Project (BBOP):
  C. Mungall, S. Lewis

• National Evolutionary Synthesis Center (NESCent)

• NSF (DBI 0641025)

Bringing reason to phenotype diversity, character change, and common descent

  • 1.
    Bringing reason to phenotype diversity, character change, and common descent Hilmar Lapp National Evolutionary Synthesis Center (NESCent) NCBO Webinar, Nov 17, 2010
  • 2.
    Regier et al(2010 Parfrey et al (2010, Parfrey & Katz Life has evolved a stunning diversity of phenotypes Images: Web Tree of Life (http://tolweb.org)
  • 3.
    Large body ofevolutionary phenotype documentation
  • 4.
    Chen & Mayden(2010) Phenotype Mabee (2000) changes inform phylogenetic reconstruction from: Understanding Evolution Sereno (1999)
  • 5.
    As complex, freetext phenotypes are resistant to computing (Lundberg and Akama 2005)
  • 6.
    Finding similar information in free-text is difficult “lacrymal bone...flat’’ Mayden 1989 Grande and Poyato- “lacrimal...small, flat” Ariza 1999 “lacrimal...triangular’’ Royero 1999 “first infraorbital (lachrimal) Kailola 2004 shape...flattened” “fourth infraorbital...anterior and Zanata and Vari 2005 posterior margins...in parallel”
  • 7.
    Computing example: Search by Similarity Fig. 3, Washington et al (2009) Fig. 1, Washington et al (2009)
  • 8.
    Computing example: Search by Similarity Fig. 3, Washington et al (2009) Trogloglanis pattersoni - a blind catfish http://tolweb.org/Trogloglanis/69910 Fig. 1, Washington et al (2009)
  • 9.
    Integrating across studies? Fig.7, Sereno (2009) Fig. 6, Sereno (2009)
  • 10.
    Computing over comparative morphology? Cyprinus carpio Pangio anguillaris Nemacheilus fasciatus Catostomus commersoni Gyrinocheilus aymonieri Phenacogrammus interruptus
  • 11.
    Knowledge mining & hypothesis generation Model Organism Non-model organisms Mutagenesis Mutation, selection, drift, gene flow Mutant or missing protein at Altered expression or specific developmental stage function of protein Phenotype change(s) Phenotype changes between to wildtype evolutionary lineages middle nuchal plate predorsal spinelet spine anterior nuchal plate Order Siluriformes Laue et al (2008) Pimelodus maculatus 2 cm abdominal Order Characiformes scutes Catoprion mento
  • 12.
    Phenoscape • Collaboration betweenP. Mabee (PI, U. South Dakota), M. Westerfield (ZFIN), and Todd Vision (UNC, NESCent) • Aim: Foster devo-evo synthesis by • Prototyping a database of curated, machine- interpretable evolutionary phenotypes. • Integrating these with mutant phenotypes from model organisms. • Enabling data-mining and discovery for candidate genes of evolutionary phenotype transitions. • Informatics for the project is developed and hosted at NESCent
  • 13.
    Entity-Quality Model for Evolutionary Phenotypes
    • Character: supraorbital bone shape
    • State: bent
    • Entity (TAO): supraorbital bone
    • Quality (PATO): bent
  • 15.
    Entity-Quality Model for Evolutionary Phenotypes
    • Character: supraorbital bone shape
    • State: bent
    • Entity (TAO): supraorbital bone
    • Quality (PATO): bent
    • Entity + Quality = Phenotype
  • 16.
    Phenotype Assertion
    • Statement: Brycinus brevis exhibits bent inheres_in some supraorbital bone
    • exhibits: Links a taxon to a phenotype
    • inheres_in: Links a quality to the entity that is its bearer
    • Brycinus brevis: Taxon ontology term
    • bent: Phenotypic Quality ontology term
    • supraorbital bone: Anatomy ontology term
  • 17.
    Phenotype Assertion
    • Statement: Brycinus brevis exhibits bent inheres_in some supraorbital bone
    • exhibits: Links a taxon to a phenotype
    • inheres_in: Links a quality to the entity that is its bearer
    • Brycinus brevis: Taxon ontology term
    • bent: Phenotypic Quality ontology term
    • supraorbital bone: Anatomy ontology term
    • Also shown: Evidence Code, Specimen, Publication
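    The assertion pattern on this slide can be written down as a small data structure. This is only an illustrative sketch: the class and field names (Phenotype, Evidence, PhenotypeAssertion) are hypothetical and are not taken from the Phenoscape code base, and plain strings stand in for ontology term identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Phenotype:
    """An EQ phenotype: a quality (PATO) that inheres_in an entity (TAO)."""
    quality: str
    entity: str

    def __str__(self) -> str:
        return f"{self.quality} inheres_in some {self.entity}"


@dataclass(frozen=True)
class Evidence:
    """Supporting information for an assertion; every field is optional here."""
    specimen: Optional[str] = None
    publication: Optional[str] = None
    code: Optional[str] = None


@dataclass(frozen=True)
class PhenotypeAssertion:
    """Links a taxon to a phenotype through the 'exhibits' relation."""
    taxon: str
    phenotype: Phenotype
    evidence: Optional[Evidence] = None

    def __str__(self) -> str:
        return f"{self.taxon} exhibits {self.phenotype}"


# The example from the slide; no evidence is attached because the slide
# does not give concrete specimen or publication values.
assertion = PhenotypeAssertion(
    taxon="Brycinus brevis",
    phenotype=Phenotype(quality="bent", entity="supraorbital bone"),
)
print(assertion)
```

    Keeping the phenotype separate from the taxon means the same EQ expression can be reused across many taxa, which matches how the slide treats the phenotype as a unit of its own.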
  • 18.
    Ontology development
    • Ontologies for
      • Teleost Anatomy
      • Teleost Taxonomy
      • Phenotypic Quality (PATO)
    • Dahdul et al (2009). Cover art: K. Luckenbill
  • 19.
    Curation (Dahdul et al., 2010 PLoS ONE)
    • 1. Students: gather publications (scan hard copies, produce OCR PDFs)
    • 2. Students: Manual entry of free text character descriptions, matrix, taxon list, specimens and museum numbers using Phenex
    • 3. Character annotation by experts: Entry of phenotypes and homology assertions using Phenex
    • 4. Consistency checks, upload of data to public view of Phenoscape KB
    • Effort: ~ 5 person years
    • Curators: Wasila Dahdul, Miles Coburn, Jeff Engemen, Terry Grande, Eric Hilton, John Lundberg, Paula Mabee, Richard Mayden, Mark Sabaj Pérez
  • 21.
    • Curated 4,208 characters in 2,310 species from 51 papers
    • 333,987 evolutionary phenotype assertions
    • 11,267 phenotype statements about 2,953 genes
  • 22.
  • 27.
    Full workflow: free-text → EQ → integrated KB
    • Legacy free-text character data (Kailola 2004) → EQ = body lacks all parts of type scale
    • Mutant phenotype free text (Harris et al. 2008): “Here, we describe the phenotypic and molecular characterization of a set of mutants showing loss of adult structures of the dermal skeleton, such as the rays of the fins and the scales, as well as the pharyngeal teeth. The mutations represent adult-viable, loss of function alleles in the ectodysplasin (eda) and ectodysplasin receptor (edar) genes.” → EQ = body has fewer parts of type scale
    • [Diagram: integrated KB graph with columns Taxon, Gene, Anatomical Entity, Quality. Taxa shown: Teleostei, Siluriformes, Gasterosteiformes, Ictalurus punctatus, Apeltes quadracus. Gene: eda, with genotype edadt3S243X/+ (variant_of eda). Entities: anatomical structure, body, scale. Qualities: has number of, has fewer parts of type, lacks all parts of type. Link types: is_a, exhibits, influences, inheres_in, towards.]
    • Image © Jean Ricardo Simões Vitule
  • 28.
    System architecture
    • Knowledgebase User Interface: Web Application for Exploration & Mining (Ruby on Rails, JavaScript)
    • External web sites and client applications
    • Knowledgebase Data Services API (REST)
    • OBD Programming API and OBD Reasoner (Java)
    • Knowledgebase (OBD) (PostgreSQL), holding:
      • Teleost Taxonomy Ontology (TTO)
      • Phenotypic Quality Ontology (PATO)
      • Anatomy Ontologies (ZFA, TAO)
      • Genes & genotypes
      • Homology assertions
      • Mutant EQ phenotypes
      • Evolutionary EQ phenotypes
    • Data sources:
      • OBO Library
      • Zebrafish Model Organism Database
      • Phenex (Evolutionary EQ annotation), NeXML
      • Skeletal Character Data (from phylogenetic treatments in literature), through annotation
  • 29.
    KB is based on OBD (Ontology-Based Database) (C. Mungall, LBL)
  • 30.
    [Diagram of KB node and link types; the layout is not recoverable from the extracted text.]
    • Node types: PATO:quality; TAO:entity; TAO:taxon; Taxon; Phenotype (class expression); Genotype; Gene; Measurement (value/max/min, unit); ZFIN:Publication (uid = ZFIN ID); ECO:evidence; curator(s); Specimen; COLLECTION (literal text); catalog number (literal text); comment (literal text); comments; publication notes (literal text)
    • Character matrix types: CDAO:CharacterStateDataMatrix; CDAO:Character (name = character text); CDAO:CharacterStateDomain (name = state text); CDAO:CharacterStateDatum; CDAO:TU (name = Publication Taxon); PHENO:Publication (dc:abstract, dc:bibliographicCitation, dc:date)
    • Link types: OBO_REL:is_a; OBO_REL:inheres_in; OBO_REL:towards; OBO_REL:influences; OBO_REL:variant_of; OBO_REL:posited_by; PHENO:exhibits; PHENO:asserted_for_otu; PHENO:has_taxon; PHENO:has_publication; PHENO:has_comment; has_measurement; has_evidence; dc:creator; CDAO:has_Character; CDAO:has_State; CDAO:has_Datum; CDAO:has_TU; dwc:individualID; dwc:collectionID; dwc:catalogID
  • 31.
    Phenoscape OBD reasoner
    • All OBD built-in rules:
      • is_transitive(R), X R Y, Y R Z → X R Z
      • is_reflexive(R) → X R X
      • X is_a Y, Y R some Z → X R some Z
      • Y is_a Z, X R some Y → X R some Z
      • X R Y, Y S Z, transitive_over(R,S) → X R Z
      • X R1 Y, Y R2 Z, holds_over_chain(R,R1,R2) → X R Z
    • Additional Phenoscape rule:
      • T exhibits P, T is_a T’ → T’ exhibits P
      • Consistent with OWL due to instance quantification
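    To make the rules above concrete, here is a minimal forward-chaining sketch over (subject, relation, object) triples. It is not the OBD reasoner (which the slides describe as a Java component) and it covers only three of the listed rules: is_a transitivity, "Y is_a Z, X R some Y → X R some Z", and the additional Phenoscape rule that propagates exhibits up the taxonomy. The function name saturate and the flattened taxonomy (species placed directly under an order or under Teleostei) are assumptions made for illustration.

```python
IS_A = "is_a"
EXHIBITS = "exhibits"


def saturate(facts):
    """Naive forward chaining: apply the rules until no new triple appears."""
    closure = set(facts)
    while True:
        inferred = set()
        for (x, r, y) in closure:
            for (a, s, b) in closure:
                if s != IS_A:
                    continue
                if a == y:
                    # X R Y, Y is_a B  ->  X R B
                    # (is_a transitivity when R is is_a; otherwise the
                    #  "Y is_a Z, X R some Y -> X R some Z" rule)
                    inferred.add((x, r, b))
                if r == EXHIBITS and a == x:
                    # Phenoscape rule: T exhibits P, T is_a T' -> T' exhibits P
                    inferred.add((b, EXHIBITS, y))
        if inferred <= closure:
            return closure
        closure |= inferred


SCALE_LOSS = "body lacks all parts of type scale"

# Simplified hierarchy for illustration; intermediate ranks are omitted.
FACTS = {
    ("Ictalurus punctatus", IS_A, "Siluriformes"),
    ("Siluriformes", IS_A, "Teleostei"),
    ("Danio rerio", IS_A, "Teleostei"),
    ("Ictalurus punctatus", EXHIBITS, SCALE_LOSS),
}

closure = saturate(FACTS)
for triple in sorted(closure - FACTS):
    print(triple)
```

    In this sketch the inferred statement "Teleostei exhibits ..." should be read as "some member of Teleostei exhibits ...", which is why the phenotype does not propagate back down to Danio rerio.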
  • 32.
    Major taxonomic groups have similar distribution of entities among phenotypes
  • 33.
    Substantial overlap between model organism and evolutionary phenotypes
    • 4,217 zebrafish phenotypes
    • 3,405 evolutionary characters
    • [Bar chart: number of evolutionary characters and zebrafish phenotypes (axis 0 to 2500) per anatomical system: hematopoietic, reproductive, musculature, liver and biliary, respiratory, renal, endocrine, immune, digestive, cardiovascular, skeletal, sensory, nervous.]
  • 34.
    Hypothesis generation: Genetic basis for scale loss in Siluriformes
    • Mutation of eda gene in Danio: Harris et al., 2007
    • Ictalurus punctatus: image Copyright © Jean Ricardo Simões Vitule, All Rights Reserved
  • 35.
    Hypothesis generation: Genetic basis for absence of the basihyal bone in Siluriformes
    • Mutation of brpf1 gene in Danio: Laue et al (2008)
    • Ictalurus punctatus
  • 36.
    Making PATO usable for evolutionary data
    Attribute: Example Qualities
    • Color: black, colorless
    • Composition: cartilaginous, ossified
    • Count: present, absent
    • Position: horizontal, vertical
    • Shape: concave, interdigitated, lobed, triangular
    • Size: increased length, decreased length
    • Texture: wrinkled, smooth
    • Relational Spatial: anterior to, lateral to
    • Relational Structural: fused with, overlap with, separated from
    • Quality: open, closed, flexible
    • Relational Shape: protruding into
  • 37.
    Getting PATO right is a challenge
    • PATO is “single-inheritance” - what is the right axis of classification?
      • relational shape vs monadic shape
      • relational spatial vs position
      • shape and size vs natural language: “Interopercle shape: expanded posteroventrally”
    • Different ways to observe or generate a phenotypic quality
      • Color as color hue (radiation quality) or pigmentation (structural quality)
    • Relative sizes don’t have a universal reference
    • Negation (“not round”, “unelongated”): means complement under attribute ‘(shape and not(round))’
  • 38.
    Mapping EQs back to characters is a challenge
    • Properties of “good” phylogenetic characters:
      • Exclusivity of states
      • Distinguishability of states
      • Independence of characters
    • Finding exclusive states requires incompatible phenotypes. How to determine incompatibility?
      • Two phenotypes are incompatible iff they cannot both inhere in the same specimen.
      • Two qualities are incompatible iff an entity cannot bear both.
  • 39.
    Which EQs and qualities are incompatible?
    • Incompatible Qs
      • present vs. absent
      • triangular vs. round
      • absent vs. any other quality
    • Compatible Qs
      • present vs. any other quality (except absent)
      • serrated vs. round
      • some colors
    • Incompatible EQs
      • (Q inheres_in bone E) vs (cartilage E absent)
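    The quality-level rules on this slide can be sketched as a small predicate. This is an illustration only: the function names and the MUTUALLY_EXCLUSIVE table are hypothetical, the table contains just the one pair named on the slide, and the EQ-level case (bone vs. cartilage) is not covered.

```python
from itertools import combinations

# Hypothetical lookup of quality groups that an entity cannot bear together.
# Only the pair given on the slide is listed.
MUTUALLY_EXCLUSIVE = [frozenset({"triangular", "round"})]


def qualities_incompatible(q1: str, q2: str) -> bool:
    """Return True if one entity cannot bear both qualities (per the slide)."""
    if q1 == q2:
        return False
    pair = {q1, q2}
    if "absent" in pair:
        # absent vs. any other quality (including present) is incompatible
        return True
    if "present" in pair:
        # present vs. any other quality (except absent) is compatible
        return False
    return any(pair <= group for group in MUTUALLY_EXCLUSIVE)


def states_exclusive(states) -> bool:
    """Character states are exclusive if every pair of them is incompatible."""
    return all(qualities_incompatible(a, b) for a, b in combinations(states, 2))


print(qualities_incompatible("present", "absent"))
print(qualities_incompatible("serrated", "round"))
print(states_exclusive(["triangular", "round"]))
```

    Under these assumptions, a candidate character whose states are "serrated" and "round" would fail the exclusivity check, because one entity can bear both.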
  • 40.
    Detecting phenotype change and variation
    • Hemiodus argenteus: {shape:bent inheres_in supraorbital bone, count:absent inheres_in upper pharyngeal 5 tooth}
    • Hemiodus unimaculatus: {shape:straight inheres_in supraorbital bone, count:present inheres_in upper pharyngeal 5 tooth}
    • Hemiodus: {shape:bent inheres_in supraorbital bone, shape:straight inheres_in supraorbital bone, count:absent inheres_in upper pharyngeal 5 tooth, count:present inheres_in upper pharyngeal 5 tooth}
    • Result: {Change in: shape inheres_in supraorbital bone, Change in: count inheres_in upper pharyngeal 5 tooth}
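    The comparison on this slide can be sketched with plain sets. This is a simplified illustration, not the Phenoscape implementation: each phenotype is encoded as an (attribute, quality, entity) tuple, and the helper names are hypothetical. The genus-level profile is taken to be the union of the species profiles, and a change is reported wherever the same attribute of the same entity takes more than one quality.

```python
from collections import defaultdict

# Species profiles from the slide, as (attribute, quality, entity) tuples.
PROFILES = {
    "Hemiodus argenteus": {
        ("shape", "bent", "supraorbital bone"),
        ("count", "absent", "upper pharyngeal 5 tooth"),
    },
    "Hemiodus unimaculatus": {
        ("shape", "straight", "supraorbital bone"),
        ("count", "present", "upper pharyngeal 5 tooth"),
    },
}


def parent_profile(child_profiles):
    """Profile of the parent taxon: the union of all child profiles."""
    merged = set()
    for profile in child_profiles.values():
        merged |= profile
    return merged


def changes(child_profiles):
    """(attribute, entity) pairs that take more than one quality among children."""
    qualities = defaultdict(set)
    for profile in child_profiles.values():
        for attribute, quality, entity in profile:
            qualities[(attribute, entity)].add(quality)
    return {key for key, values in qualities.items() if len(values) > 1}


hemiodus = parent_profile(PROFILES)
varying = changes(PROFILES)
for attribute, entity in sorted(varying):
    print(f"Change in: {attribute} inheres_in {entity}")
```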
  • 41.
    Visualizing phenotype profiles on a tree
    • [Screenshot: Phenotypic profile tree.]
    • Phenotypic profile (query): dorsal fin absent (including parts); adipose fin absent (including parts); opercle triangular (including parts). Option: Include inferred phenotypes. Action: Query taxa with these phenotypes.
    • Taxon color indicates the greatest level of match of the specified phenotype(s) found within a species in the clade.
    • Phenotype match legend: 100%, 75%, 50%, <50%
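    One way to compute the match level described above is sketched below. The scoring and the bin boundaries are assumptions (the slide only shows the legend values 100%, 75%, 50%, <50%), and the species names and their phenotype sets are made-up placeholders, not curated data.

```python
QUERY = {
    ("dorsal fin", "absent"),
    ("adipose fin", "absent"),
    ("opercle", "triangular"),
}


def match_fraction(query, species_phenotypes):
    """Fraction of the query phenotypes that a single species exhibits."""
    if not query:
        raise ValueError("query profile must not be empty")
    return len(query & species_phenotypes) / len(query)


def match_level(fraction):
    """Map a fraction to a legend bin (assumed thresholds)."""
    if fraction >= 1.0:
        return "100%"
    if fraction >= 0.75:
        return "75%"
    if fraction >= 0.5:
        return "50%"
    return "<50%"


def clade_level(query, clade):
    """Greatest match level found within any species of the clade."""
    best = max(match_fraction(query, phenotypes) for phenotypes in clade.values())
    return match_level(best)


# Placeholder species with invented phenotype sets, for illustration only.
CLADE = {
    "species A": {("dorsal fin", "absent"), ("adipose fin", "absent")},
    "species B": {("opercle", "triangular")},
}

print(clade_level(QUERY, CLADE))
```

    Taking the maximum over species mirrors the slide's wording that the color reflects the greatest match found within a species in the clade, rather than a match pooled across species.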
  • 42.
    Navigating phenotype variation on a tree
    • [Screenshot: Taxonomic distribution of phenotypes for entity term: basihyal bone.]
    • Options: Limit tree to Cypriniformes; Show taxa without phenotype data; Show taxa with unspecified shape phenotypes
    • Taxa shown: Ostariophysi, Cypriniformes, Cyprinidae, Balitoridae, Gyrinocheilidae, Cobitidae, Vaillantellidae, Botiidae, Catostomidae, Psilorhynchidae
    • Sets of taxa with matching Phenotype Profiles (Phenotype | Taxa | Profiles):
      • triangular | 1 | 1
      • Y-shaped | 1 | 1
      • shape | 8 | 3
  • 43.
    Reasoning over homology
    Entity 1 | Taxon 1 | Relationship | Entity 2 | Taxon 2 | Evidence | Reference(s)
    scaphium | Otophysi | homologous_to | neural arch 1 | Teleostei | IDS, IMS, IPS | (Fink and Fink, 1981; Rosen and Greenwood, 1970)
    intercalarium | Otophysi | homologous_to | neural arch 2 (ventral portion) | Teleostei | IDS, IMS, IPS | (Rosen and Greenwood, 1970)
    intercalarium | Otophysi | homologous_to | neural arch 2 | Teleostei | NAS | (Fink and Fink, 1981)
    intercalarium | Otophysi | homologous_to | neural arch 2 | Teleostei | IMS | (Hora, 1922)
    intercalarium | Otophysi | homologous_to | rib of vertebra 2 | Teleostei | TAS | (Hora, 1922)
    tripus | Otophysi | homologous_to | parapophysis + rib of vertebra 3 | Teleostei | IDS, IMS, IPS | (Fink and Fink, 1981; Rosen and Greenwood, 1970)
    Image by Kyle Luckenbill, ANSP
  • 44.
    Formalizing homology relationships
    • Formal pattern is ternary: E1 in_taxon T1 homologous_to E2 in_taxon T2 as E3 in_taxon T3
    • Classifying homology relationships
      • 1-1 homology (phylogenetic homology)
      • serial homology
    • A iso_homologous_to B as C:
      • all A derived_by_descent_from some (C and has_derived_by_descendent some B), and
      • all B derived_by_descent_from some (C and has_derived_by_descendent some A)
    • shares_ancestor_with as a relation chain: derived_by_descent_from o has_derived_by_descendent
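    The relation chain in the last bullet can be illustrated by composing binary relations represented as sets of pairs. This sketch works on individual placeholder items A, B, and C rather than on ontology classes, and it assumes that has_derived_by_descendent is the inverse of derived_by_descent_from; the helper names compose and inverse are hypothetical.

```python
def compose(r, s):
    """Relation composition r o s: x relates to z if x r y and y s z for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def inverse(r):
    """Inverse relation: swap the two ends of every pair."""
    return {(y, x) for (x, y) in r}


# A and B are both derived by descent from the ancestral structure C.
derived_by_descent_from = {("A", "C"), ("B", "C")}

# Assumption: has_derived_by_descendent is the inverse relation.
has_derived_by_descendent = inverse(derived_by_descent_from)

# shares_ancestor_with = derived_by_descent_from o has_derived_by_descendent
shares_ancestor_with = compose(derived_by_descent_from, has_derived_by_descendent)

for pair in sorted(shares_ancestor_with):
    print(pair)
```

    Note that the composed relation also relates A to itself and B to itself, since each trivially shares the ancestor C with itself.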
  • 45.
    Option 1: Asserting homology at higher-level taxa
  • 46.
    Option 2: Asserting homology at species level
  • 47.
  • 49.
    Opening descriptive biological data to computing can enable new science
    • Descriptive biology: Phenotypes, Traits, Function, Behavior, Habitat, Life Cycle, Reproduction, Conservation Threats
    • [Diagram: descriptive biology shown alongside Taxonomy, Species ID; Conservation Biology; Biodiversity (Specimens, Occurrence records); Ecology; Physiology; Genetics, Genetic variation; Genomics, Gene expression.]
  • 50.
    Acknowledgements
    • Phenoscape Personnel & PIs: P. Mabee, M. Westerfield, T. Vision, J. Balhoff, C. Kothari, W. Dahdul, P. Midford
    • Phenoscape curators & workshop participants
    • Berkeley Bioinformatics & Ontologies Project (BBOP): C. Mungall, S. Lewis
    • National Evolutionary Synthesis Center (NESCent)
    • NSF (DBI 0641025)