Of Trees and Owl:

The challenges of reasoning over
the semantics of shared descent
Hilmar Lapp
Duke University
US2TS 2019 in Durham, NC
https://commons.wikimedia.org/wiki/File:Dobzhansky_Evolution_Notre_Dame.jpg
Phylogenetic trees express
hypotheses of common descent
Brochu 2003
Most data use Linnaean taxonomy
Linnaean names suffer from
fundamental shortfalls
• Names ≠ Identifiers
• Underlying taxon concepts (= semantics) shift
over time, and are computationally inaccessible
• At least 90-95% of estimated biodiversity doesn’t
have a name, and most will never receive one
• The better the Tree of Life is known, the more
groups of organisms are left without a name
The Phyloreferencing Project
• Funded by US National Science Foundation
since 2015
• Goal: Capture and make computable the
semantics of phylogenetic clade definitions for
computational integration of biological data.
• Major expected products:
• Ontology, specification, and tooling for authoring
phyloreferences.
• Ontology of phylogenetic clade definitions
• Online tools for using phyloreferences to retrieve data
phyloref.org
Anatomy of a phylogenetic
clade definition
Alligatoroidea =

Alligator mississippiensis and
all crocodylians closer to it than
to Crocodylus niloticus or
Gavialis gangeticus
Crocodylia =
Last common ancestor of
Gavialis gangeticus, Alligator
mississippiensis, and
Crocodylus niloticus and all of
its descendants
Brochu 2003
“Specifiers”
Type of definition
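The two parts of a clade definition named above (its specifiers and its type) can be captured in a small record. This is only a sketch: the class name, the field names, and the "internal"/"external" split of specifiers are assumptions made for illustration, not the Phyloreferencing Project's data model.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CladeDefinition:
    """Hypothetical record for a phylogenetic clade definition."""
    name: str
    kind: str                       # "minimum" or "maximum" clade definition
    internal: Tuple[str, ...]       # specifiers the clade must include
    external: Tuple[str, ...] = ()  # specifiers the clade must exclude

    def specifiers(self) -> Tuple[str, ...]:
        """All specifiers, internal ones first."""
        return self.internal + self.external


# The two definitions quoted on the slide (Brochu 2003).
ALLIGATOROIDEA = CladeDefinition(
    name="Alligatoroidea",
    kind="maximum",
    internal=("Alligator mississippiensis",),
    external=("Crocodylus niloticus", "Gavialis gangeticus"),
)
CROCODYLIA = CladeDefinition(
    name="Crocodylia",
    kind="minimum",
    internal=(
        "Gavialis gangeticus",
        "Alligator mississippiensis",
        "Crocodylus niloticus",
    ),
)
```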
Axiomatization of a tree
A phyloreference
Crocodylus niloticus and
all crocodylians

closer to it than to
Alligator mississippiensis
includes_TU some tco:Crocodylus niloticus and
excludes_TU some tco:Alligator mississippiensis
Resolving a phyloreference:

axiomatized tree + phyloreference + reasoner
Crocodylus niloticus and
all crocodylians

closer to it than to
Alligator mississippiensis
includes_TU some tco:Crocodylus niloticus and
excludes_TU some tco:Alligator mississippiensis
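What resolution computes can be illustrated procedurally, without OWL or a reasoner. In the sketch below the toy topology, the internal node names n1 and n2, and the extra taxon Crocodylus porosus are assumptions made for illustration. The sketch reads the phyloreference as "the most inclusive node that includes the first specifier but not the second", which is one reading of the class expression above.

```python
# Toy tree as a child -> parent map (hypothetical topology and node names).
PARENT = {
    "Gavialis gangeticus": "root",
    "n1": "root",
    "Alligator mississippiensis": "n1",
    "n2": "n1",
    "Crocodylus niloticus": "n2",
    "Crocodylus porosus": "n2",
}


def lineage(node):
    """Return the node followed by all of its ancestors, up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def resolve_max_clade(internal, external):
    """Most inclusive node that includes `internal` but not `external`."""
    excluded = set(lineage(external))  # every node that includes `external`
    result = None
    for node in lineage(internal):     # walk from the specifier towards the root
        if node in excluded:
            break
        result = node
    return result


def members(node):
    """Sorted leaf taxa descending from (or equal to) `node`."""
    leaves = set(PARENT) - set(PARENT.values())
    return sorted(leaf for leaf in leaves if node in lineage(leaf))


clade = resolve_max_clade("Crocodylus niloticus", "Alligator mississippiensis")
```

On this toy tree, `clade` is the node n2, and `members(clade)` lists the two Crocodylus species. The same function applied to a different tree may resolve to a different node, which is what makes the reference portable between phylogenetic hypotheses.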
Phyloreferences have major
advantages for data integration
• Unambiguous and computable semantics
• Portable between phylogenetic trees, and hence
competing or evolving phylogenetic hypotheses
• Computationally reproducible
• Can be constructed for any clade, and hence
enables unrestricted communication
Challenges with OWL-DL:

No set size or path length
Crocodylus niloticus and
all crocodylians

closer to it than to
Alligator mississippiensis
“Maximum clade definition”
Challenges with OWL-DL:

No set size or path length
Last common ancestor of
Alligator mississippiensis
and Crocodylus niloticus
and all of its descendants
“Minimum clade definition”
includes_TU some tco:Crocodylus niloticus and
includes_TU some tco:Alligator mississippiensis
has_Child some

(includes_TU some tco:Crocodylus niloticus and
excludes_TU some tco:Alligator mississippiensis)
• However, this will fail for the general case. Writing S1 ~ S2 for
(includes_TU some S1 and excludes_TU some S2):
MaxClade(S1, S2) := S1 ~ S2
Parent(S) := has_Child some S
LCA(S1, S2) := Parent(MaxClade(S1, S2))
• If S2 has_Ancestor some (includes_TU some S1) then
MaxClade(S1, S2) == {}, but LCA should be S1
• I.e., in this approach a specifier SN cannot itself be a phyloreference
(LCA of a clade).
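The failure case can be reproduced on a toy tree in plain Python. This is a sketch under assumptions: the topology and the node names n1 and n2 are hypothetical, specifiers are allowed to be internal nodes, and MaxClade is read as "the most inclusive node that includes S1 but not S2". It is not the OWL encoding itself.

```python
# Toy tree as a child -> parent map (hypothetical topology and node names).
PARENT = {
    "Gavialis gangeticus": "root",
    "n1": "root",
    "Alligator mississippiensis": "n1",
    "n2": "n1",
    "Crocodylus niloticus": "n2",
    "Crocodylus porosus": "n2",
}


def lineage(node):
    """Return the node followed by all of its ancestors, up to the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def max_clade(s1, s2):
    """Most inclusive node including s1 but not s2; None if there is none."""
    excluded = set(lineage(s2))
    result = None
    for node in lineage(s1):
        if node in excluded:
            break
        result = node
    return result


def lca_via_parent(s1, s2):
    """LCA(S1, S2) := Parent(MaxClade(S1, S2)), as defined on the slide."""
    child = max_clade(s1, s2)
    return None if child is None else PARENT.get(child)


def lca_direct(s1, s2):
    """Reference answer: first ancestor-or-self of s2 that also includes s1."""
    seen = set(lineage(s1))
    return next(node for node in lineage(s2) if node in seen)


# Two leaf specifiers: both routes agree.
leaf_case = (
    lca_via_parent("Crocodylus niloticus", "Alligator mississippiensis"),
    lca_direct("Crocodylus niloticus", "Alligator mississippiensis"),
)

# S1 is itself a clade (n1) containing S2: MaxClade is empty, so the
# Parent(MaxClade(...)) route finds nothing, although the LCA is S1.
nested_case = (
    lca_via_parent("n1", "Crocodylus niloticus"),
    lca_direct("n1", "Crocodylus niloticus"),
)
```

Here `leaf_case` is the pair (n1, n1), while `nested_case` is (None, n1): the direct walk finds S1, but the Parent(MaxClade) construction has nothing to take the parent of.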
Challenge: Multiple specifiers
• Naïve recursive approach:

LCA(S1,…,SN) = LCA(LCA(S1,…,SN-1),SN)
• However, as shown for LCA(S1, S2), S1 or S2
cannot itself be an LCA
Last common ancestor
of Gavialis gangeticus,
Alligator mississippiensis,
and Crocodylus niloticus
and all of its descendants
Parent(LCA(Cn, Am) ~ Gg) OR Parent(LCA(Cn, Gg) ~ Am) OR Parent(LCA(Am, Gg) ~ Cn)
• The number of rooted binary tree topologies for n
leaves grows very fast, as (2n − 3)!!, and is already 105 for n = 5
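The growth can be checked with a few lines of Python. The sketch assumes the count meant here is that of rooted, leaf-labelled binary trees, which is (2n − 3)!! for n ≥ 2. The function name is made up for illustration.

```python
def rooted_binary_topologies(n_leaves):
    """Number of rooted, leaf-labelled binary trees on n_leaves leaves.

    Equals the double factorial (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3).
    """
    if n_leaves < 1:
        raise ValueError("need at least one leaf")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for k in range(3, 2 * n_leaves - 2, 2):
        count *= k
    return count


counts = {n: rooted_binary_topologies(n) for n in (3, 4, 5, 10)}
```

For three specifiers this gives 3, matching the three alternatives in the disjunction above. For five it gives 105, and for ten it already exceeds 34 million, so enumerating one disjunct per topology quickly becomes impractical.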
Challenge: Scalability
• Eventually we want to apply this to very large
trees, with >1M leaf nodes.
• Only OWL-EL reasoners (e.g., ELK) scale well to
this level.
• However, restricting to OWL-EL prevents use of disjunction.
Challenge: Scalability
• As a kludge, can use multiple equivalency axioms
instead.
• However, this in essence makes false assertions.
• Can result in unexpected subclass inferences for
other phyloreferences.
equivalentClass: Parent(LCA(Cn, Am) ~ Gg)
equivalentClass: Parent(LCA(Cn, Gg) ~ Am)
equivalentClass: Parent(LCA(Am, Gg) ~ Cn)
Challenge: “Qualifiers”
• “Qualifiers” are specifiers that are required to be
included or excluded by the clade.
• “Kill switches” – tests that do not alter the
semantics of the clade definition, but render it
invalid if it fails any of the tests
• External qualifiers as property chain:

has_Ancestor o excludes_TU -> excludes_qualifying_TU
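The effect of such a property chain can be sketched as relational composition over sets of pairs. The facts below (node names n1 and n2, and which node excludes which taxon) are hypothetical and chosen only to show the inference pattern. A reasoner would derive the same pairs from the chain axiom.

```python
def compose(r, s):
    """Relational composition: (x, z) whenever (x, y) is in r and (y, z) in s."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


# Hypothetical asserted facts about a small tree.
has_ancestor = {("n2", "n1"), ("n2", "root"), ("n1", "root")}
excludes_tu = {("n1", "Gavialis gangeticus")}

# has_Ancestor o excludes_TU -> excludes_qualifying_TU
excludes_qualifying_tu = compose(has_ancestor, excludes_tu)
```

With these facts, n2 inherits the exclusion asserted on its ancestor n1, so the only derived pair is (n2, Gavialis gangeticus).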
Summary
• Reproducible large-scale comparative biology
requires taxon concepts with fully computable
semantics
• Phylogenetic clade definitions have well-defined
semantics in the form of necessary and sufficient
conditions for clade membership
• Clade semantics are well expressible in OWL, but
OWL lacks constructs needed for inferring last
common ancestors generically and scalably.
Acknowledgements
• Nico Cellinese, Gaurav Vaidya, Anna
Becker (U. Florida)
• Pascal Hitzler and DaSe Lab alumni &
collaborators (see Carral et al. WOP
2017, arXiv:1710.05096)
• Funded by the US National Science
Foundation (DBI-1458484, DBI-1458604)
How to find us
• Web: http://phyloref.org (includes link to full
grant proposal)
• GitHub: http://github.com/phyloref

Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 

Of Trees and Owl: 
The challenges of reasoning over the semantics of shared descent

  • 1. Of Trees and Owl:
 The challenges of reasoning over the semantics of shared descent Hilmar Lapp Duke University US2TS 2019 in Durham, NC
  • 3. Phylogenetic trees express hypotheses of common descent Brochu 2003
  • 5. Most data use Linnaean taxonomy
  • 6. Linnaean names suffer from fundamental shortfalls • Names ≠ Identifiers • Underlying taxon concepts (= semantics) shift over time, and are computationally inaccessible • At least 90-95% of estimated biodiversity doesn’t have a name, and most will never receive one • The better the Tree of Life is known, the more groups of organisms there are without a name
  • 7. The Phyloreferencing Project • Funded by US National Science Foundation since 2015 • Goal: Capture and make computable the semantics of phylogenetic clade definitions for computational integration of biological data. • Major expected products: • Ontology, specification, and tooling for authoring phyloreferences. • Ontology of phylogenetic clade definitions • Online tools for using phyloreferences to retrieve data phyloref.org
  • 8. Anatomy of a phylogenetic clade definition Alligatoroidea =
 Alligator mississippiensis and all crocodylians closer to it than to Crocodylus niloticus or Gavialis gangeticus. Crocodylia = Last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus and all of its descendants. (Brochu 2003)
  • 9. Anatomy of a phylogenetic clade definition Alligatoroidea =
 Alligator mississippiensis and all crocodylians closer to it than to Crocodylus niloticus or Gavialis gangeticus. Crocodylia = Last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus and all of its descendants. (Brochu 2003) “Specifiers”
  • 10. Anatomy of a phylogenetic clade definition Alligatoroidea =
 Alligator mississippiensis and all crocodylians closer to it than to Crocodylus niloticus or Gavialis gangeticus. Crocodylia = Last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus and all of its descendants. (Brochu 2003) Type of definition
  • 14. A phyloreference Crocodylus niloticus and all crocodylians
 closer to it than to Alligator mississippiensis includes_TU some tco:Crocodylus niloticus and excludes_TU some tco:Alligator mississippiensis
  • 15. Resolving a phyloreference:
 axiomatized tree + phyloreference + reasoner Crocodylus niloticus and all crocodylians
 closer to it than to Alligator mississippiensis includes_TU some tco:Crocodylus niloticus and excludes_TU some tco:Alligator mississippiensis
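What the reasoner has to find for this phyloreference can be sketched procedurally. The following is a minimal Python sketch, not the project's OWL-based implementation: the toy tree, its internal node names (`n_root`, `n_brevi`, `n_allig`, `n_croc`) and the two extra taxa are illustrative assumptions, and "excludes" is read here as "the external specifier is not among the node's descendant leaves".

```python
# Toy tree as a child -> parent map. The topology and the internal node
# names are illustrative assumptions, not taken from the slides.
PARENT = {
    "Gavialis gangeticus": "n_root",
    "n_brevi": "n_root",
    "n_allig": "n_brevi",
    "n_croc": "n_brevi",
    "Alligator mississippiensis": "n_allig",
    "Caiman crocodilus": "n_allig",
    "Crocodylus niloticus": "n_croc",
    "Osteolaemus tetraspis": "n_croc",
}


def leaves_under(node):
    """Return the set of leaf names at or below `node`."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return {node}
    leaves = set()
    for child in children:
        leaves |= leaves_under(child)
    return leaves


def max_clade(internal, external):
    """Largest node that includes `internal` but not `external`."""
    if internal not in PARENT or external not in PARENT:
        raise ValueError("both specifiers must be non-root nodes of the tree")
    node = internal
    # Climb while the parent still excludes the external specifier.
    while node in PARENT and external not in leaves_under(PARENT[node]):
        node = PARENT[node]
    return node


clade = max_clade("Crocodylus niloticus", "Alligator mississippiensis")
print(clade, sorted(leaves_under(clade)))
```

Because the function only looks at specifiers and tree shape, the same call can be applied unchanged to a different tree containing the same two taxa.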
  • 18. Phyloreferences have major advantages for data integration • Unambiguous and computable semantics • Portable between phylogenetic trees, and hence between competing or evolving phylogenetic hypotheses • Computationally reproducible • Can be constructed for any clade, and hence enable unrestricted communication
  • 19. Challenges with OWL-DL:
 No set size or path length Crocodylus niloticus and all crocodylians
 closer to it than to Alligator mississippiensis “Maximum clade definition”
  • 20. Challenges with OWL-DL:
 No set size or path length Last common ancestor of Alligator mississippiensis and Crocodylus niloticus and all of its descendants “Minimum clade definition”
  • 21. Challenges with OWL-DL:
 No set size or path length Last common ancestor of Alligator mississippiensis and Crocodylus niloticus
 and all of its descendants includes_TU some tco:Crocodylus niloticus and includes_TU some tco:Alligator mississippiensis
  • 24. Challenges with OWL-DL:
 No set size or path length Last common ancestor of Alligator mississippiensis and Crocodylus niloticus
 and all of its descendants has_Child some
 (includes_TU some tco:Crocodylus niloticus and excludes_TU some tco:Alligator mississippiensis)
 • However, this will fail for the general case:
 MaxClade(S1, S2) := S1 ~ S2
 Parent(S) := has_Child some S
 LCA(S1, S2) := Parent(MaxClade(S1, S2))
 • If S2 has_Ancestor some (includes_TU some S1), then MaxClade(S1, S2) == {}, but the LCA should be S1
 • I.e., in this approach a specifier SN cannot itself be a phyloreference (LCA of a clade).
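The failure case can be reproduced outside of OWL. The sketch below is a hedged Python illustration under stated assumptions: the toy tree and its node names are hypothetical, `S1 ~ S2` is read as "includes S1 and excludes S2", and a node is taken to include itself and everything below it. It compares the `Parent(MaxClade(S1, S2))` construction with a direct ancestor-walk LCA.

```python
# Toy tree as a child -> parent map (illustrative assumption).
PARENT = {
    "Gavialis gangeticus": "n_root",
    "n_brevi": "n_root",
    "n_allig": "n_brevi",
    "n_croc": "n_brevi",
    "Alligator mississippiensis": "n_allig",
    "Caiman crocodilus": "n_allig",
    "Crocodylus niloticus": "n_croc",
    "Osteolaemus tetraspis": "n_croc",
}


def ancestors_or_self(node):
    """Return [node, parent, grandparent, ..., root]."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def max_clade(s1, s2):
    """Largest node that includes s1 and excludes s2, or None if none exists."""
    includes_s2 = set(ancestors_or_self(s2))
    candidates = [n for n in ancestors_or_self(s1) if n not in includes_s2]
    return candidates[-1] if candidates else None


def lca_via_parent(s1, s2):
    """LCA(S1, S2) := Parent(MaxClade(S1, S2)), as on the slide."""
    clade = max_clade(s1, s2)
    return PARENT.get(clade) if clade is not None else None


def lca_direct(s1, s2):
    """Reference answer: first node on s1's ancestor chain that includes s2."""
    includes_s2 = set(ancestors_or_self(s2))
    return next(n for n in ancestors_or_self(s1) if n in includes_s2)


# Two leaf specifiers: both approaches agree.
print(lca_via_parent("Crocodylus niloticus", "Alligator mississippiensis"))
# A specifier that is itself a clade containing the other specifier:
# MaxClade is empty, so the Parent construction yields nothing.
print(lca_via_parent("n_croc", "Osteolaemus tetraspis"))
print(lca_direct("n_croc", "Osteolaemus tetraspis"))
```

The second call is the situation that a recursive definition runs into as soon as one of its arguments is the result of an earlier LCA step.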
  • 25. Challenge: Multiple specifiers • Naïve recursive approach:
 LCA(S1,…,SN) = LCA(LCA(S1,…,SN-1),SN) • However, as shown for LCA(S1, S2), neither S1 nor S2 can itself be an LCA • Example: Last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus and all of its descendants
  • 27. Challenge: Multiple specifiers Last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus and all of its descendants: Parent(LCA(Cn, Am) ~ Gg)
  • 28. Challenge: Multiple specifiers Last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus and all of its descendants: Parent(LCA(Cn, Am) ~ Gg) OR Parent(LCA(Cn, Gg) ~ Am) OR Parent(LCA(Am, Gg) ~ Cn)
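For three specifiers, the disjuncts can be generated mechanically: one per choice of which specifier branches off first. This is a small Python sketch that only builds the expression strings in the slide's notation; the function name `lca3_disjuncts` is a hypothetical helper, and nothing here produces actual OWL axioms.

```python
from itertools import combinations


def lca3_disjuncts(specifiers):
    """Build one Parent(LCA(a, b) ~ c) disjunct per pair among three specifiers."""
    if len(specifiers) != 3:
        raise ValueError("this sketch only covers exactly three specifiers")
    disjuncts = []
    for first, second in combinations(specifiers, 2):
        # The specifier left out of the pair is the one excluded by the inner clade.
        (outgroup,) = [s for s in specifiers if s not in (first, second)]
        disjuncts.append(f"Parent(LCA({first}, {second}) ~ {outgroup})")
    return disjuncts


print(" OR ".join(lca3_disjuncts(["Cn", "Am", "Gg"])))
```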
  • 29. Challenge: Multiple specifiers • The number of binary tree topologies for n leaves grows very fast, and is > 100 for n = 5 • Example: Last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus and all of its descendants
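The growth rate can be checked with the standard count of rooted binary topologies on n labeled leaves, (2n-3)!! = 1 · 3 · 5 · … · (2n-3). The sketch below assumes the slide refers to rooted trees, which fits the "> 100 for 5" figure; the function name is illustrative.

```python
def rooted_binary_topologies(n_leaves):
    """Count rooted binary tree topologies on n labeled leaves: (2n-3)!!."""
    if n_leaves < 1:
        raise ValueError("need at least one leaf")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-3 (empty product for n <= 2).
    for odd in range(3, 2 * n_leaves - 2, 2):
        count *= odd
    return count


for n in (3, 4, 5, 10):
    print(n, rooted_binary_topologies(n))
```

For n = 3 this gives 3, one topology per choice of the specifier that branches off first, and for n = 5 it gives 105.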
  • 30. Challenge: Scalability • Eventually we want to apply this to very large trees, with >1M leaf nodes. • Only OWL-EL reasoners (e.g., ELK) scale well to this level. • However, this prevents use of disjunction.
  • 31. Challenge: Scalability • As a kludge, can use multiple equivalency axioms instead. • However, this in essence makes false assertions. • Can result in unexpected subclass inferences for other phyloreferences. • equivalentClass: Parent(LCA(Cn, Am) ~ Gg) • equivalentClass: Parent(LCA(Cn, Gg) ~ Am) • equivalentClass: Parent(LCA(Am, Gg) ~ Cn)
  • 32. Challenge: “Qualifiers” • “Qualifiers” are specifiers that are required to be included or excluded by the clade. • “Kill switches” – tests that do not alter the semantics of the clade definition, but render it invalid if it fails any of the tests • External qualifiers as property chain:
 has_Ancestor o excludes_TU -> excludes_qualifying_TU
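The effect of this property chain can be illustrated as a composition of two binary relations. The Python sketch below uses hypothetical toy assertions (the node names and the single `excludes_TU` fact are assumptions) and computes only what the chain itself entails: whenever x has_Ancestor y and y excludes_TU z, then x excludes_qualifying_TU z.

```python
def compose(first, second):
    """Relational composition: (x, z) whenever (x, y) in first and (y, z) in second."""
    return {(x, z) for (x, y1) in first for (y2, z) in second if y1 == y2}


# Hypothetical assertions for a toy tree (names are illustrative only).
has_ancestor = {
    ("n_croc", "n_brevi"),
    ("n_croc", "n_root"),
    ("n_allig", "n_brevi"),
    ("n_allig", "n_root"),
    ("n_brevi", "n_root"),
}
excludes_tu = {("n_brevi", "Gavialis gangeticus")}

# has_Ancestor o excludes_TU -> excludes_qualifying_TU
excludes_qualifying_tu = compose(has_ancestor, excludes_tu)
print(sorted(excludes_qualifying_tu))
```

In this toy data both clades nested inside `n_brevi` inherit the exclusion of Gavialis gangeticus, which is the kind of fact an external qualifier test would check.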
  • 33. Summary • Reproducible large-scale comparative biology requires taxon concepts with fully computable semantics • Phylogenetic clade definitions have well-defined semantics in the form of necessary and sufficient conditions for clade membership • Clade semantics are well expressible in OWL, but OWL lacks constructs needed for inferring last common ancestors generically and scalably.
  • 34. Acknowledgements • Nico Cellinese, Gaurav Vaidya, Anna Becker (U. Florida) • Pascal Hitzler and DaSe Lab alumni & collaborators (see Carral et al. WOP 2017, arXiv:1710.05096) • Funded by the US National Science Foundation (DBI-1458484, DBI-1458604)
  • 35. How to find us • Web: http://phyloref.org (includes link to full grant proposal) • Github: http://github.com/phyloref