Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
1. Aligning insect phylogenies:
Perelleschus and other cases
Nico M. Franz 1,2
Arizona State University
http://taxonbytes.org/
1 Concepts and tools developed jointly with members of the Ludäscher Lab (UC Davis & UIUC):
Mingmin Chen, Parisa Kianmajd, Shizhuo Yu, Shawn Bowers & Bertram Ludäscher
2 Systematics, Evolution and Biodiversity Section, Ten Minute Papers
Annual Meeting of the Entomological Society of America
November 18, 2014 - Portland, Oregon
On-line @ http://www.slideshare.net/taxonbytes/franz-2014-esa-aligning-insect-phylogenies-perelleschus-and-other-cases-41654235
2. Research motivation: 1
How can we represent, and reason over,
taxonomic concept provenance,
based on varying input classifications
and differentially sampled phylogenies?
1 This presentation concentrates on the "how?"; though the "why?" is addressed in the References (listed at the end).
3. Definitional preliminaries, 1
Taxonomic concept: 1
The circumscription of a perceived
(or, more accurately, hypothesized)
taxonomic group, as advocated by
a particular author and source.
1 Not the same as species concepts, which are theories about what species are, and/or how they are recognized.
4. Definitional preliminaries, 2
Provenance: 1
Information describing the origin, derivation,
history, custody, or context of an entity (etc.).
Provenance establishes the authenticity, integrity
and trustworthiness of information about entities.
1 See, e.g.: http://www.w3.org/2005/Incubator/prov/wiki/What_Is_Provenance
5. Definitional preliminaries, 3
Alignment ("merge"):
A comprehensive, logically consistent, and
(where possible) well-specified reconciliation
of shared and unique Euler regions that result from
integrating two or more taxonomic concept
hierarchies ("trees") with RCC-5 articulations.1
1 RCC-5 = Region Connection Calculus (set theory relationships: congruence, inclusion, overlap, exclusion, etc.).
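The five RCC-5 relations can be illustrated with ordinary finite sets. The following is a minimal sketch and not part of Euler/X: the function name `rcc5` and the specimen identifiers are hypothetical, and the symbols follow those used on later slides (==, <, >, ><, |).

```python
def rcc5(a: set, b: set) -> str:
    """Return the RCC-5 relation of region a to region b (both non-empty)."""
    if not a or not b:
        raise ValueError("RCC-5 regions must be non-empty")
    if a == b:
        return "=="   # congruence
    if a < b:
        return "<"    # a is properly included in b
    if a > b:
        return ">"    # a properly includes b
    if a & b:
        return "><"   # overlap: shared and unique parts on both sides
    return "|"        # exclusion: nothing shared


# Hypothetical specimen identifiers standing in for the members of a concept.
print(rcc5({"s1", "s2"}, {"s1", "s2"}))        # ==
print(rcc5({"s1"}, {"s1", "s2"}))              # <
print(rcc5({"s1", "s2"}, {"s2", "s3"}))        # ><
print(rcc5({"s1"}, {"s2"}))                    # |
```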
8. Perelleschus salpinflexus Cardona-Duque & Franz sec. Franz & Cardona-Duque (2013)
• Habitus, mouthparts
One might call this string a Taxonomic Concept Label.
[Figure panels: female habitus; labium; maxilla]
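A label of this kind has two parts separated by "sec." (secundum, "according to"): the name with its authorship, and the source whose circumscription is meant. As a minimal sketch, the label in the slide title can be split into these parts; the class and function names below are hypothetical and are not taken from Euler/X.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonomicConceptLabel:
    name: str     # taxonomic name with authorship
    source: str   # author and source advocating this circumscription


def parse_label(label: str) -> TaxonomicConceptLabel:
    """Split 'name sec. source' at the first ' sec. ' separator."""
    name, sep, source = label.partition(" sec. ")
    if not sep:
        raise ValueError("label has no ' sec. ' separator")
    return TaxonomicConceptLabel(name.strip(), source.strip())


label = parse_label(
    "Perelleschus salpinflexus Cardona-Duque & Franz "
    "sec. Franz & Cardona-Duque (2013)"
)
print(label.name)    # Perelleschus salpinflexus Cardona-Duque & Franz
print(label.source)  # Franz & Cardona-Duque (2013)
```

Keeping the source as a separate field is what lets the same name appear in several concepts without being treated as the same circumscription.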
9. Perelleschus salpinflexus Cardona-Duque & Franz sec. Franz & Cardona-Duque (2013)
• Male & female terminalia, showing putative synapomorphies
Synapomorphy (genus-level): Spermatheca
with an acute, sclerotized appendix at
insertion of the collum (character 17:1).
"11"
Synapomorphy (subclade-level):
Aedeagus with endophallic
sclerites extending in apical
half of aedeagus (character
11:1).
"17"
19. Introducing the Euler/X software toolkit (Open Source)
"A toolkit for consistently aligning
sets of hierarchically arranged entities
under (relaxable) logic constraints,
and using RCC-5 articulations."
Desktop tool @ https://bitbucket.org/eulerx
Euler server @ http://euler.asu.edu
22. Euler/X uses Answer Set Programming.
The reasoner asks, and solves, the question:
"Which possible worlds can be generated
that satisfy (i.e., are consistent with)
a given set of input constraints?" 1
1 Input constraints:
• T1 − taxonomy 1
• T2 − taxonomy 2
• A − user-asserted articulations
• C − additional 'tree' constraints
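The "possible worlds" question can be made concrete with a brute-force sketch. This is explicitly not Answer Set Programming and not how Euler/X is implemented; it enumerates every combination of non-empty Venn regions for a handful of concepts and keeps those consistent with the asserted articulations. The concept names X, Y, Z and both function names are hypothetical, and tree constraints (C) are left out for brevity.

```python
from itertools import combinations, product


def relation(world, a, b):
    """RCC-5 relation of concept a to concept b in a world.

    A world is a list of non-empty regions; each region is the frozenset
    of concepts that contain it.
    """
    both = any(a in r and b in r for r in world)
    a_only = any(a in r and b not in r for r in world)
    b_only = any(b in r and a not in r for r in world)
    if not both:
        return "|"
    if a_only and b_only:
        return "><"
    if a_only:
        return ">"
    if b_only:
        return "<"
    return "=="


def possible_worlds(concepts, articulations):
    """Yield every world in which all concepts are non-empty and all
    asserted articulations (a, rel, b) hold. Exponential; toy sizes only."""
    regions = [
        frozenset(combo)
        for n in range(1, len(concepts) + 1)
        for combo in combinations(concepts, n)
    ]
    for flags in product([False, True], repeat=len(regions)):
        world = [r for r, keep in zip(regions, flags) if keep]
        if not all(any(c in r for r in world) for c in concepts):
            continue  # some concept would be empty
        if all(relation(world, a, b) == rel for a, rel, b in articulations):
            yield world


asserted = [("X", "<", "Y"), ("Y", "|", "Z")]
worlds = list(possible_worlds(["X", "Y", "Z"], asserted))
print(len(worlds))                      # 1
print(relation(worlds[0], "X", "Z"))    # |
```

With these two asserted articulations only one world survives, and the relation X | Z is obtained without having been asserted. That is the sense in which a reasoner can return logically implied articulations.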
24. Alignment 1 - Perelleschus sec. WOB (1986) versus sec. FOB (2001)
T1: Perelleschus sec. 1986
• Traditional classification
• 1 genus-level concept
• 3 species-level concepts
T2: Perelleschus sec. 2001
• Phylogenetic revision
• 2 genus-level concepts
• 7 clade-level concepts
• 9 species-level concepts
25. Format for alignment input file (constraints: T1, T2, A, C)
[Figure: annotated input file. Callouts: year and source; T2 (parent concept, child concepts); T1; T2-to-T1 articulations (as provided by the user)]
26. Input visualization
Six user-asserted input articulations (pink lines) are sufficient to yield a single,
well-specified alignment.1
1 Actually, three (species-level) articulations are sufficient to achieve this for the 2001/1986 alignment.
31. Alignment (merge) visualization
Reasoner infers 66 additional, logically implied articulations (MIR).1
2001.Perelleschus >< 1986.Perelleschus; provenance of overlapping articulation
is explained in the merge taxonomy.
3 congruent 2001/1986 species-level concepts.
6 species-level concepts unique sec. FOB (2001).
6 clade-level concepts unique to FOB (2001).
2001.PER & 2001.PHY in overlap with 1986.PER.
1 MIR = Maximally Informative Relations (among paired concepts of T1, T2).
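The figure of 66 inferred articulations is consistent with a simple count over the concept numbers given for T1 and T2 on the earlier slide. The sketch below shows that arithmetic only; reading the MIR as one relation per T1/T2 concept pair is an interpretation of the footnote, not a statement about Euler/X internals, and the variable names are hypothetical.

```python
# Concept counts as listed for Alignment 1.
t1_concepts = 1 + 3        # sec. 1986: genus-level + species-level
t2_concepts = 2 + 7 + 9    # sec. 2001: genus-, clade-, and species-level

concept_pairs = t1_concepts * t2_concepts   # one MIR per T1/T2 pair
asserted = 6                                # user-asserted articulations
inferred = concept_pairs - asserted

print(concept_pairs)  # 72
print(inferred)       # 66
```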
32. Alignment (merge) visualization
Reasoner infers 66 additional, logically implied articulations (MIR).1
2001.Perelleschus >< 1986.Perelleschus; provenance of overlapping articulation
is explained in the merge taxonomy.
We can 'zoom in' on the overlap
and resolve the resulting subregions
in the "merge concept view".
1 MIR = Maximally Informative Relations (among paired concepts of T1, T2).
33. Merge concept view (in part)
"2001.PER and 1986.PER share a region (2001.PER * 1986.PER) constituted (at lower
levels) by 2001/1986.P_rectirostris; this latter region is that which is entailed in
1986.PER and excluded from 2001.PHY (1986.PER \ 2001.PHY)."
2001 concepts
2001/1986 concepts
34. Merge concept view (in part)
"2001.PHYsubcin/1986.Psubcin differentially 'participates' in 2001.PHY and
1986.PER; but not 2001.PER (or any of its children)."
2001 concepts
2001/1986 concepts
36. Alignment 2 - Perelleschus sec. FOB (2001) versus sec. F (2006)
T1: Perelleschus sec. 2001
• Phylogenetic revision
• 8 ingroup species concepts
• 2 outgroup concepts
• 18 concepts total
T2: Perelleschus sec. 2006
• Exemplar analysis
• 2 ingroup species concepts
• 1 outgroup concept
• 7 concepts total
37. Logic representation challenge:
Perelleschus sec. 2001 & 2006 concepts
have incongruent sets of subordinate members,
yet each concept has congruent synapomorphies.
39. Definitional preliminaries, 4 1
Ostensive alignment: the congruence among higher-level
concepts is assessed in relation to their entailed members.
Ostension: giving meaning through an act of pointing out.
Intensional alignment: the congruence among higher-level
concepts is assessed in relation to their properties.
Intension: giving meaning through the specification of properties.
1 See Bird & Tobin. 2012. Natural Kinds. URL: http://plato.stanford.edu/archives/win2012/entries/natural-kinds/
42. Ostensive alignment – members are all that counts
Challenge 1: Ostensive alignment
Solution: 11 ingroup concept articulations
are coded ostensively – either as
<, ><, or | – to represent non-congruence
in the representation
of child concepts
Result: 2006.PER < 2001.PER
2006.PER | 2001.[5 species concepts]
etc.
Input constraints
Ostensive alignment
2001 & 2006
5 x |
2 x ><
45. Intensional alignment – representation of congruent synapomorphies
Input constraints
Challenge 2: Intensional alignment
Solution: An Implied Child (_IC) concept is
added to the undersampled (2006)
clade concept; and the (5) "missing"
species-level concepts are included
within this Implied Child
11 ingroup concept articulations are
coded intensionally – as == or > –
to reflect congruent synapomorphies
(chars. 11, 17) of 2001 & 2006
Intensional alignment
2001 & 2006
"17"
"11"
46. Intensional alignment – representation of congruent synapomorphies
Input constraints
Challenge 2: Intensional alignment
Result: The genus- and ingroup clade-level
concepts are inferred as congruent:
2006.PER == 2001.PER
2006.PcarPeve == 2001.PcarPsul
etc.
Intensional alignment
2001 & 2006
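The effect of an Implied Child can be sketched with sets. In the example below the species identifiers and counts are placeholders rather than the actual Perelleschus concepts, and the helper name is hypothetical. In the talk the _IC concept is asserted by the user; here it is derived from the reference taxonomy purely for illustration.

```python
def with_implied_child(sampled: set, reference: set) -> dict:
    """Return the children of the undersampled concept, adding an implied
    child that holds every reference member the later analysis left out."""
    return {"sampled": set(sampled), "PER_IC": reference - sampled}


members_2001 = {"sp_a", "sp_b", "sp_c", "sp_d", "sp_e"}   # placeholder members
sampled_2006 = {"sp_a", "sp_b"}                           # exemplars only

children_2006 = with_implied_child(sampled_2006, members_2001)
extension_2006 = set().union(*children_2006.values())

print(sorted(children_2006["PER_IC"]))   # ['sp_c', 'sp_d', 'sp_e']
print(extension_2006 == members_2001)    # True
```

Once the implied child is included, the 2006 concept has the same extension as the 2001 concept, which is what allows the two to be articulated as congruent (==).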
48. Review – representing ostensive versus intensional alignments
Ostensive alignment
2001.PER includes more
species-level concepts
than 2006.PER [>].
Intensional alignment
2006.PER reconfirms the
synapomorphies inferred
in 2001.PER [==].
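The two readings can be contrasted by comparing different things for the same pair of concepts: sampled members for the ostensive reading, diagnostic properties for the intensional one. This is a minimal sketch; the member identifiers are placeholders, the helper `rcc5` is hypothetical, and character 17:1 is used as the genus-level synapomorphy mentioned earlier in the talk.

```python
def rcc5(a: set, b: set) -> str:
    """RCC-5 relation of a to b for non-empty finite sets."""
    if a == b:
        return "=="
    if a < b:
        return "<"
    if a > b:
        return ">"
    return "><" if a & b else "|"


# Ostensive reading: compare the species-level concepts actually included.
members_2001 = {"sp_a", "sp_b", "sp_c", "sp_d"}   # placeholder members
members_2006 = {"sp_a", "sp_b"}                   # exemplar sample

# Intensional reading: compare the properties that define the concept.
synapomorphies_2001 = {"character 17:1"}
synapomorphies_2006 = {"character 17:1"}

print(rcc5(members_2001, members_2006))                  # >
print(rcc5(synapomorphies_2001, synapomorphies_2006))    # ==
```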
51. Use case: Alternative phylogenetic schemes of higher-level weevils
T1: Curculionoidea sec. Kuschel (1995)
• Cladistic analysis
• 41 concepts
T2: Curculionoidea sec. Marvaldi &
Morrone (2000)
• Cladistic analysis
• 25 concepts
52. Alignment: Curculionoidea sec. K (1995) versus sec. MM (2000)
Initial visual impression: Lots of green rectangles, yellow octagons, and overlap (><).
Much taxonomic concept incongruence.
53. Use case: Dwarf lemurs sec. 1993 & 2005 1
Chirogaleus furcifer sec. Mühel (1890) – Brehms Tierleben.
Public Domain: http://books.google.com/books?id=sDgQAQAAMAAJ
1 Franz et al. 2014. Taxonomic provenance: Two influential primate classifications logically aligned. (in preparation)
54. The 2nd & 3rd Editions of the Mammal Species of the World
1993 2005
Primates sec. Groves (1993)
317 taxonomic concepts,
233 at the species level.
Primates sec. Groves (2005)
483 taxonomic concepts,
376 at the species level.
Δ = 143
species-level
concepts
55. Alignment of Primates sec. Groves 1993 / 2005
Primates: 800 concepts
402
articulations
153,111 MIR
~ 380x information gain!
Strepsirrhini sec. MSW3
Haplorrhini sec. MSW3
Catarrhini sec. MSW3
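The numbers on the two Primates slides can be cross-checked with a few lines of arithmetic. The sketch below assumes that the MIR count corresponds to one relation per 1993/2005 concept pair, which matches the figure on the slide; the variable names are hypothetical.

```python
concepts_1993, species_1993 = 317, 233   # Primates sec. Groves (1993)
concepts_2005, species_2005 = 483, 376   # Primates sec. Groves (2005)
input_articulations = 402

total_concepts = concepts_1993 + concepts_2005    # 800
species_delta = species_2005 - species_1993       # 143
concept_pairs = concepts_1993 * concepts_2005     # 153111
gain = concept_pairs / input_articulations

print(total_concepts, species_delta, concept_pairs)  # 800 143 153111
print(f"{gain:.1f}")  # 380.9, in line with the slide's "~ 380x"
```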
56. Taxonomic provenance: quantifying name/meaning dissociation
'Dissociation' means that either non-identical names are paired with congruent concepts,
or that identical names are paired with incongruent concepts.
"Reliable names" "Unreliable names"
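The definition of dissociation translates directly into a test on aligned concept pairs. The following sketch uses invented placeholder names and relations, not results from the primate alignment, and the function name is hypothetical. Treating dissociated pairs as the "unreliable names" group is one reading of the slide.

```python
def is_dissociated(name_t1: str, name_t2: str, relation: str) -> bool:
    """True when name identity and concept congruence disagree."""
    same_name = name_t1 == name_t2
    congruent = relation == "=="
    return same_name != congruent


# (name in T1, name in T2, RCC-5 articulation): placeholder data
pairs = [
    ("Aus bus", "Aus bus", "=="),   # same name, same meaning
    ("Aus bus", "Aus bus", ">"),    # same name, different meaning
    ("Aus cus", "Dus cus", "=="),   # different name, same meaning
]

unreliable = [p for p in pairs if is_dissociated(*p)]
print(len(unreliable), "of", len(pairs), "pairs are dissociated")
# 2 of 3 pairs are dissociated
```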
57. In summary (1) − What this approach can provide:
So, given an input set of [T1, T2, A, C], one gains:
(1) Logical consistency in the alignment;
(2) Intended degree of alignment resolution;
(3) Additional, logically implied articulations;
(4) Visualizations of taxonomic provenance;
(5) Quantifications of name/meaning relations.
60. In summary (2) − Representation and reasoning abilities
• Compatibility with contemporary Linnaean nomenclature (and PhyloCode too);
• Integration of many-to-many name/circumscription relationships across taxonomies;
• Reconciliation of traditional classifications with fully bifurcated phylogenies;
• Representation of monotypic concept lineages with congruent taxonomic extensions;
• Accounting for insufficiently specified higher-level entities:
• Undersampled outgroup entities;
• Differentially sampled ingroup entities;
• Resolution of taxonomically overlapping entities and merge concepts;
• Differentiation of ostensive versus intensional readings of concept articulations;
• Representation of topologically localized resolution versus ambiguity in alignments.
• Next critical step(s): accessible, scalable, usable, integrated web instance of Euler/X
61. In summary (3) − Take-home message
We can explain (much of)
taxonomy's legacy to computers (e.g.)
for superior name/meaning resolution.
Well, then, should we?
And at what cost?
65. Select references on concept taxonomy and the Euler/X toolkit
• Franz et al. 2008. On the use of taxonomic concepts in support of biodiversity
research and taxonomy. In: The New Taxonomy; pp. 63–86.
• Franz & Peet. 2009. Towards a language for mapping relationships among
taxonomic concepts. Systematics and Biodiversity 7: 5–20.
• Franz & Thau. 2010. Biological taxonomy and ontology development: Scope and
limitations. Biodiversity Informatics 7: 45–66.
• Chen et al. 2014. Euler/X: a toolkit for logic-based taxonomy integration. WFLP
2013 – 22nd International Workshop on Functional and (Constraint) Logic
Programming.
• Chen et al. 2014. A hybrid diagnosis approach combining Black-Box and
White-Box reasoning. Lecture Notes in Computer Science 8620: 127–141.
• Franz et al. 2014. Names are not good enough: Reasoning over taxonomic change in
the Andropogon complex. Semantic Web – Interoperability, Usability, Applicability –
Special Issue on Semantics for Biodiversity. (in press)
• Franz et al. 2014. Reasoning over taxonomic change: Exploring alignments for the
Perelleschus use case. PLoS ONE. (in press)
• Franz et al. 2015. Taxonomic provenance: Two influential primate classifications
logically aligned. (in preparation)
72. R32 lattice of RCC-5 articulations (lighter color = less certainty)
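The name R32 fits the count of all disjunctions that can be formed from the five base relations, since 2^5 = 32. As a sketch under that assumption (the variable names are hypothetical), the lattice elements can be enumerated as subsets; a disjunction with more members says less about the true relation.

```python
from itertools import combinations

BASE_RELATIONS = ["==", "<", ">", "><", "|"]

# Every subset of the base relations, from the empty set (no relation
# possible) up to the full set (nothing known about the pair).
r32 = [
    frozenset(combo)
    for size in range(len(BASE_RELATIONS) + 1)
    for combo in combinations(BASE_RELATIONS, size)
]

print(len(r32))                                   # 32
print(sum(1 for rel in r32 if len(rel) == 1))     # 5 fully certain relations
```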
73. The other piece in the puzzle: Concept-to-voucher identifications
Source: Baskauf & Webb. 2014. Darwin-SW. URL: http://www.semantic-web-journal.net/system/files/swj635.pdf