Franz. 2014. Explaining taxonomy's legacy to computers – how and why?

Slides presented on the Euler/X project (http://taxonbytes.org/prior-work-on-concept-taxonomy-2013/ & https://bitbucket.org/eulerx/euler-project) for the conference "The Meaning of Names: Naming Diversity in the 21st Century", CU Natural History Museum, September 30, 2014.

  1. 1. Explaining taxonomy's legacy to computers – how and why? Nico M. Franz 1,2 Arizona State University http://taxonbytes.org/ 1 Concepts and tools developed jointly with members of the Ludäscher Lab (UC Davis & UIUC): Mingmin Chen, Parisa Kianmajd, Shizhuo Yu, Shawn Bowers & Bertram Ludäscher 2 The Meaning of Names: Naming Diversity in the 21st Century September 30, 2014; Museum of Natural History, University of Colorado On-line @ http://www.slideshare.net/taxonbytes/franz-2014-explaining-taxonomys-legacy-to-computers-how-and-why
  2. 2. Alternative title: Concept taxonomy – now with logic reasoning.
  3. 3. Definitional preliminaries, 1 Taxonomic concept: 1 The circumscription of a perceived (or, more accurately, hypothesized) taxonomic group, as advocated by a particular author and source. 1 Not the same as species concepts, which are theories about what species are, and/or how they are recognized.
  4. 4. Definitional preliminaries, 2 Provenance: 1 Information describing the origin, derivation, history, custody, or context of an entity (etc.). Provenance establishes the authenticity, integrity and trustworthiness of information about entities. 1 See, e.g.: http://www.w3.org/2005/Incubator/prov/wiki/What_Is_Provenance
  5. 5. Concept taxonomy in three introductory phrases • An emerging solution to the challenge of tracking stability and change across multiple taxonomic name usages.
  6. 6. Concept taxonomy in three introductory phrases • An emerging solution to the challenge of tracking stability and change across multiple taxonomic name usages. • Fully compatible with Linnaean nomenclature (Codes).
  7. 7. Concept taxonomy in three introductory phrases • An emerging solution to the challenge of tracking stability and change across multiple taxonomic name usages. • Fully compatible with Linnaean nomenclature (Codes). • The focus is on building sound provenance chains amenable to computational representation and reasoning; irrespective of whether the nomenclatural/taxonomic history of a perceived lineage of organisms was perfectly stable since the times of Linnaeus, or continues to undergo major alterations.
  8. 8. Overview of today's presentation • The challenge (1.0): Limitations of the name → taxon reference model. • The challenge (2.0): How to track taxonomic concept provenance? • Introducing Euler/X – overview of workflow and user/reasoner interaction. ~ 8 mins.
  9. 9. Overview of today's presentation • The challenge (1.0): Limitations of the name → taxon reference model. • The challenge (2.0): How to track taxonomic concept provenance? • Introducing Euler/X – overview of workflow and user/reasoner interaction. • How does it work? • Use case 1: Dwarf lemur classifications sec. 1993 & 2005. • From simple to complex merge taxonomies. • How can we represent taxonomic concept overlap? • Scalability & information gain: How many articulations? • Why? Insights into the performance of names as concept identifiers. • Use case 2: Andropogon glomeratus sec. auctorum. • In conclusion – feasibility, accessibility, and what it means. ~ 8 mins. ~ 15 mins.
  10. 10. The challenge (1.0): Often, we make statements like this:
  11. 11. "Andropogon glomeratus is a species of grass (Poaceae) that occurs in the Southern U.S." Photo by Max Licher (ASU Herbarium); Cottonwood, Arizona. http://swbiodiversity.org/seinet/imagelib/imgdetails.php?imgid=431755
  12. 12. Thereby we stipulate a direct name → taxon reference relationship.
  13. 13. Proposition 1: names refer (directly) to taxa "Andropogon glomeratus Taxonomic name is a species of grass (Poaceae) Taxon (species) that occurs in the Southern U.S." Biological data Reference relation: name refers to entity
  14. 14. Proposition 1: names refer (directly) to taxa "Andropogon glomeratus Taxonomic name is a species of grass (Poaceae) Taxon (species) that occurs in the Southern U.S." Biological data Reference relation: name refers to entity Data transmission: facilitated by name
  15. 15. Yet, the legacy of taxonomy is more complicated: the name → taxon relationship can change.1 This poses some representation challenges… 1 See Franz et al. 2008. On the use of taxonomic concepts in support of biodiversity research and taxonomy; pp. 63–86. In: The New Taxonomy, Systematics Association Special Volume 74. Taylor & Francis, Boca Raton.
  16. 16. Challenge 1: by necessity, a name refers only to a type (specimen) "Andropogon glomeratus Taxonomic name is a species of grass (Poaceae) that occurs in the Southern U.S." Identity of the name/reference relation is regulated by Codes (e.g., Typification)
  17. 17. Challenge 2: the discovery of 'true' taxon boundaries is contingent "Andropogon glomeratus Taxonomic name is a species of grass (Poaceae) Taxon (species) that occurs in the Southern U.S." Identity of the name/reference relation is regulated by Codes (e.g., Typification) The boundaries of taxon identity have the property of contingent, scientific hypotheses = concepts
  18. 18. Challenge 3: name/taxon (concept) changes are semi-independent "Andropogon glomeratus Taxonomic name is a species of grass (Poaceae) Taxon (species) that occurs in the Southern U.S." Identity of the name/reference relation is regulated by Codes (e.g., Typification) Precise, reliable mapping? The boundaries of taxon identity have the property of contingent, scientific hypotheses = concepts
  19. 19. Consequence: the name → taxon reference model is often too simple "Andropogon glomeratus Taxonomic name is a species of grass (Poaceae) Taxon (species) that occurs in the Southern U.S." Biological data Identity of the name/reference relation is regulated by Codes (e.g., Typification) Precise, reliable mapping? The boundaries of taxon identity have the property of contingent, scientific hypotheses = concepts Reference limitations! Name-based data transmission: reliability is also contingent
  20. 20. If we accept a contingent, changing name → concept → taxon reference model, then perhaps we should always say this:
  21. 21. Proposition 2: concept labels refer (directly) to taxonomic concepts "Andropogon glomeratus ..is the (Latin) name (string), nomenclaturally anchored with a type specimen, that can participate in the (more precisely individuated) concept label "Andropogon glomeratus sec. Barkworth et al. 2014" (reference: Manual of Grasses for North America), which in turn refers to.. is a species of grass (Poaceae) that occurs in the Southern U.S."
  22. 22. Proposition 2: concept labels refer (directly) to taxonomic concepts "Andropogon glomeratus ..is the (Latin) name (string), nomenclaturally anchored with a type specimen, that can participate in the (more precisely individuated) concept label "Andropogon glomeratus sec. Barkworth et al. 2014" (reference: Manual of Grasses for North America), which in turn refers to.. is a species of grass (Poaceae) ..a feature-based circumscription ("Plants cespitose, upper portion dense, … Pedicellate spikelets vestigial or absent, sterile. 2n = 20.") – the taxonomic concept as advocated by this reference – which may or may not align accurately with a (presumably existing and) relatively stable evolutionary lineage of organisms in nature for which.. that occurs in the Southern U.S."
  23. 23. Proposition 2: concept labels refer (directly) to taxonomic concepts "Andropogon glomeratus ..is the (Latin) name (string), nomenclaturally anchored with a type specimen, that can participate in the (more precisely individuated) concept label "Andropogon glomeratus sec. Barkworth et al. 2014" (reference: Manual of Grasses for North America), which in turn refers to.. is a species of grass (Poaceae) ..a feature-based circumscription ("Plants cespitose, upper portion dense, … Pedicellate spikelets vestigial or absent, sterile. 2n = 20.") – the taxonomic concept as advocated by this reference – which may or may not align accurately with a (presumably existing and) relatively stable evolutionary lineage of organisms in nature for which.. that occurs in the Southern U.S." ..biological occurrence data are on hand.
  24. 24. Hence: The challenge (2.0): If we individuate taxonomic concepts and their labels consistently, ..
  25. 25. 1889 1933 1948 1950 1968 1979 1983 2006 2014 Chain of A. glomeratus concepts, 1889-2014.
  26. 26. ..then how can we track concept provenance?
  27. 27. 1889 1933 1948 1950 1968 1979 1983 2006 2014 ? Provenance representation challenge: How is each concept articulated to another?
  28. 28. Proposed solution: We articulate them with (RCC-5) concept-to-concept relationships..
  29. 29. 1889 1933 1948 1950 1968 1979 1983 2006 2014 Congruence [==] Congruence [==] Proper inclusion [>] Inverse proper inclusion [<] Overlap [><] Congruence [==] Exclusion [|] Future Floras: Congruence? [==] RCC-5 = Region Connection Calculus with five basic relations.
  30. 30. …and utilize logic reasoning to infer consistent merge taxonomies.
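The five RCC-5 relations named in the preceding slides can be illustrated by treating each taxonomic concept as a non-empty set of members. This is a minimal Python sketch under that assumption; the function name `rcc5` and the member labels are hypothetical and are not taken from Euler/X, which works on logic constraints rather than explicit member sets.

```python
from typing import AbstractSet


def rcc5(a: AbstractSet[str], b: AbstractSet[str]) -> str:
    """Return the RCC-5 symbol relating two non-empty member sets."""
    if not a or not b:
        raise ValueError("RCC-5 regions must be non-empty")
    if a == b:
        return "=="   # congruence
    if a > b:
        return ">"    # proper inclusion: a properly includes b
    if a < b:
        return "<"    # inverse proper inclusion: a is properly included in b
    if a & b:
        return "><"   # overlap: shared members, plus unique members on each side
    return "|"        # exclusion: no shared members


# Hypothetical member labels, for illustration only.
older = {"m1", "m2", "m3"}
newer = {"m2", "m3", "m4"}
print(rcc5(older, newer))          # "><"
print(rcc5(older, {"m1", "m2"}))   # ">"
```

Exactly one of the five symbols holds for any pair of non-empty sets, which is why a single maximally informative relation can be reported per concept pair.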
  31. 31. Merge – A. glomeratus sec. Blomquist (1948) / sec. Campbell (1983) Congruence [==] Merge View Legend
  32. 32. We now have a tool for this: Euler/X https://bitbucket.org/eulerx
  33. 33. Euler/X toolkit in a single screenshot (desktop version, IX-2014)
  34. 34. Euler/X applies logic reasoning to support the following workflow:
  35. 35. User/reasoner interaction: achieving well-specified alignments T1 = Taxonomy 1 T2 = Taxonomy 2 A = Input articulations [==, >, <, ><, |] C = Taxonomic constraints
  36. 36. User/reasoner interaction: achieving well-specified alignments T1 = Taxonomy 1 T2 = Taxonomy 2 A = Input articulations [==, >, <, ><, |] C = Taxonomic constraints  Articulations are asserted by taxonomic experts.
  37. 37. Data format for an Euler/X alignment input file T2 Year Author
  38. 38. T2 Year Author Parent concept Child concepts Data format for an Euler/X alignment input file
  39. 39. Data format for an Euler/X alignment input file T2 Year Author Parent concept Child concepts T1
  40. 40. Data format for an Euler/X alignment input file T2 Year Author Parent concept Child concepts T1 T2 to T1 Articulations (as provided by the user)
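The parts labeled on these slides (a taxonomy with year and author, parent and child concepts, and T2-to-T1 articulations) can be modeled as a small data structure. The sketch below is not the Euler/X file syntax; the class names, the `validate` helper, and the 1993 name `Microcebus_coquereli` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

RELATIONS = {"==", ">", "<", "><", "|"}  # the five RCC-5 symbols


@dataclass
class Taxonomy:
    year: str
    author: str
    # Maps each parent concept to its list of child concepts.
    children: Dict[str, List[str]] = field(default_factory=dict)

    def concepts(self) -> Set[str]:
        """All concept names that occur as a parent or as a child."""
        names = set(self.children)
        for kids in self.children.values():
            names.update(kids)
        return names


@dataclass(frozen=True)
class Articulation:
    t2_concept: str
    relation: str
    t1_concept: str


def validate(t2: Taxonomy, t1: Taxonomy, arts: List[Articulation]) -> List[str]:
    """Return a list of problems; an empty list means the input is usable."""
    problems = []
    for art in arts:
        if art.relation not in RELATIONS:
            problems.append(f"unknown relation: {art.relation}")
        if art.t2_concept not in t2.concepts():
            problems.append(f"not in T2: {art.t2_concept}")
        if art.t1_concept not in t1.concepts():
            problems.append(f"not in T1: {art.t1_concept}")
    return problems


t2 = Taxonomy("2005", "Groves", {"Mirza": ["Mirza_coquereli"]})
t1 = Taxonomy("1993", "Groves", {"Microcebus": ["Microcebus_coquereli"]})
arts = [Articulation("Mirza_coquereli", "==", "Microcebus_coquereli")]
print(validate(t2, t1, arts))  # []
```

Checking the input before reasoning catches misspelled concept names early, which would otherwise surface as missing or unexpected relations in the merge.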
  41. 41. User/reasoner interaction: achieving well-specified alignments
  42. 42. Input visualization of the 2005/1993 concept trees & articulations Input articulations 2005 concepts 1993 concepts
  43. 43. User/reasoner interaction: achieving well-specified alignments No!
  44. 44. No Possible World merge [empty canvas, nothing to report]
  45. 45. User/reasoner interaction: achieving well-specified alignments No!
  46. 46. User/reasoner interaction: achieving well-specified alignments No! Yes
  47. 47. Nine Possible World merges for an under-specified use case input
  48. 48. User/reasoner interaction: achieving well-specified alignments No! Yes
  49. 49. User/reasoner interaction: achieving well-specified alignments Yes Yes
  50. 50. User/reasoner interaction: achieving well-specified alignments MIR = Maximally Informative Relations [==, >, <, ><, |] for each concept pair Yes Yes
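The decision flow on these slides (no Possible World, several Possible Worlds, or exactly one) can be summarized in a few lines. This sketch assumes the reasoner's output is already available as a list of distinct worlds, each mapping a concept pair to one RCC-5 symbol; the function name, the status strings, and the example concept labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]    # (T2 concept, T1 concept)
World = Dict[Pair, str]   # one RCC-5 symbol per concept pair


def summarize_worlds(worlds: List[World]) -> Tuple[str, Dict[Pair, Set[str]]]:
    """Classify an alignment and collect the relations seen for each pair."""
    if not worlds:
        # The input constraints contradict each other: nothing to merge.
        return "inconsistent", {}
    relations: Dict[Pair, Set[str]] = {}
    for world in worlds:
        for pair, relation in world.items():
            relations.setdefault(pair, set()).add(relation)
    status = "well-specified" if len(worlds) == 1 else "under-specified"
    return status, relations


pair = ("2005.A", "1993.A")
status, relations = summarize_worlds([{pair: "=="}, {pair: ">"}])
print(status)           # under-specified
print(relations[pair])  # the pair is still ambiguous between == and >
```

Pairs whose relation set has more than one symbol show where further input articulations are needed to reach a single, well-specified merge.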
  51. 51. Use case 1: dwarf lemurs sec. 1993 & 2005 1 Chirogaleus furcifer sec. Mühel (1890) – Brehms Tierleben. Public Domain: http://books.google.com/books?id=sDgQAQAAMAAJ 1 Franz et al. 2014. Two influential primate classifications logically aligned. (unpublished)
  52. 52. The 2nd & 3rd Editions of the Mammal Species of the World 1993 2005 Primates sec. Groves (1993)  317 taxonomic concepts, 233 at the species level. Primates sec. Groves (2005)  483 taxonomic concepts, 376 at the species level. Δ = 143 species-level concepts
  53. 53. Primate 1993/2005 concept alignments: From simple to complex merge taxonomies.
  54. 54. Microcebus rufus sec. 2005 – same name, congruent concepts [==] 1. Input concepts & articulations Merge View Legend
  55. 55. Microcebus rufus sec. 2005 – same name, congruent concepts [==] 1. Input concepts & articulations 2. Merge visualization Grey rectangle, round corners  Taxonomic congruence Merge View Legend
  56. 56. Mirza coquereli sec. 2005 – name change, congruent concepts [==] 1. Input concepts & articulations 2. Merge visualization Merge View Legend
  57. 57. Microcebus murinus (et al.) sec. 2005 – "lumping / splitting" [> , <] 1. Input concepts & articulations Merge View Legend
  58. 58. Microcebus murinus (et al.) sec. 2005 – "lumping / splitting" [> , <] 1. Input concepts & articulations 2. Merge visualization Yellow octagon  Unique to T1 (1993) Green rectangle  Unique to T2 (2005) Merge View Legend
  59. 59. Microcebus (part) & Mirza sec. 2005 – monotypic parent concepts 1. Input concepts & articulations Mirza & M. coquereli sec. Groves (2005) are two co-extensional concepts in T2
  60. 60. Microcebus (part) & Mirza sec. 2005 – monotypic parent concepts 1. Input concepts & articulations 2. Merge visualization Mirza & M. coquereli sec. Groves (2005) are two co-extensional concepts in T2 Three concepts are congruent!
  61. 61. How can we represent concept overlap?
  62. 62. Microcebus (all) & Mirza sec. 2005 – concept overlap [><] Merge visualization: containment, with overlap [-e mnpw --rcgo] Dashed blue line  Overlap [><]
  63. 63. Microcebus (all) & Mirza sec. 2005 – concept overlap [><] Merge visualization: containment, with overlap [-e mnpw --rcgo] Unique to 1993.Microcebus (2005  Mirza/coquereli) Dashed blue line  Overlap [><]
  64. 64. Microcebus (all) & Mirza sec. 2005 – concept overlap [><] Merge visualization: containment, with overlap [-e mnpw --rcgo] Unique to 1993.Microcebus (2005  Mirza/coquereli) Unique to 2005.Microcebus (1993  undescribed) Dashed blue line  Overlap [><]
  65. 65. Microcebus (all) & Mirza sec. 2005 – concept overlap [><] Merge visualization: containment, with overlap [-e mnpw --rcgo] Unique to 1993.Microcebus (2005  Mirza/coquereli) Unique to 2005.Microcebus (1993  undescribed) Dashed blue line  Overlap [><] Shared, congruent child concepts
  66. 66. We can resolve the merge overlap products.
  67. 67. Microcebus (all) & Mirza sec. 2005 – concept overlap [><] Merge visualization: "merge concept" representation [-e mncb] Red lines → Newly inferred articulations (to and from merge concepts)
  68. 68. Microcebus (all) & Mirza sec. 2005 – concept overlap [><] Merge visualization: "merge concept" representation [-e mncb] Red lines → Newly inferred articulations (to and from merge concepts) 2005.Microcebus*1993.Microcebus → Shared merge concept
  69. 69. Microcebus (all) & Mirza sec. 2005 – concept overlap [><] Merge visualization: "merge concept" representation [-e mncb] 1993.Microcebus\2005.Microcebus → Merge concept unique to 1993 2005.Microcebus\1993.Microcebus → Merge concept unique to 2005 2005.Microcebus*1993.Microcebus → Shared merge concept Red lines → Newly inferred articulations (to and from merge concepts)
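The three merge concepts for an overlapping pair correspond to a set intersection and two set differences. The sketch below assumes concepts are given as member sets; the function name and the member labels are hypothetical, and writing the difference labels with a backslash is an assumption about the notation.

```python
from typing import AbstractSet, Dict, Set


def merge_regions(name_t2: str, t2: AbstractSet[str],
                  name_t1: str, t1: AbstractSet[str]) -> Dict[str, Set[str]]:
    """Split two overlapping concepts into labeled, non-empty merge regions."""
    regions = {
        name_t2 + "*" + name_t1: set(t2) & set(t1),    # shared merge concept
        name_t1 + "\\" + name_t2: set(t1) - set(t2),   # unique to T1
        name_t2 + "\\" + name_t1: set(t2) - set(t1),   # unique to T2
    }
    # Empty regions are dropped: a region with no members is not a concept.
    return {label: members for label, members in regions.items() if members}


# Placeholder members: "old_only" stands for what only the 1993 concept
# includes, "new_only" for what only the 2005 concept includes.
m2005 = {"shared_1", "shared_2", "new_only"}
m1993 = {"shared_1", "shared_2", "old_only"}
for label, members in merge_regions("2005.Microcebus", m2005,
                                    "1993.Microcebus", m1993).items():
    print(label, sorted(members))
```

When two concepts are congruent, only the shared region survives; when they exclude each other, only the two difference regions do.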
  70. 70. Scalability & information gain: How many input articulations are sufficient?
  71. 71. Cheirogaleoidae sec. 2005 – how many articulations are sufficient? T2: 27 concepts; T1: 14 concepts; 22 input articulations
  72. 72. Cheirogaleoidae sec. 2005 – how many articulations are sufficient? T2: 27 concepts; T1: 14 concepts; 22 input articulations 17 'non-new' 2005 species-level concepts  Articulated to 1993 species-level concepts
  73. 73. Cheirogaleoidae sec. 2005 – how many articulations are sufficient? T2: 27 concepts; T1: 14 concepts; 22 input articulations 4 'new' 2005 species-level concepts  Exclusion (|) from 1993 family-level concept
  74. 74. Cheirogaleoidae sec. 2005 – how many articulations are sufficient? T2: 27 concepts; T1: 14 concepts; 22 input articulations 1 additional highest-level articulation  2005.Cheirogaleoidae > 1993.Cheirogaleidae  Eliminates 15 additional Possible Worlds
  75. 75. Cheirogaleoidae sec. 2005 – how many articulations are sufficient? T2: 27 concepts; T1: 14 concepts; 22 input articulations No genus-/subfamily level articulations are needed
  76. 76. Cheirogaleoidae sec. 2005 – how many articulations are sufficient? Well-specified merge: 378 Maximally Informative Relations → ~ 17x information gain through reasoning
  77. 77. Cheirogaleoidae sec. 2005 – how many articulations are sufficient? Well-specified merge: 378 Maximally Informative Relations → ~ 17x information gain through reasoning Primates: 483 + 317 = 800 concepts; 402 articulations; 153,111 MIR → ~ 380x information gain!
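The information-gain figures follow from simple arithmetic: a well-specified merge yields one Maximally Informative Relation per T2/T1 concept pair, and the gain is that count divided by the number of input articulations. A short sketch, with a hypothetical function name, reproduces the numbers quoted on the slides.

```python
from typing import Tuple


def information_gain(n_t2: int, n_t1: int, n_articulations: int) -> Tuple[int, float]:
    """Return (number of MIR, MIR per input articulation)."""
    mir = n_t2 * n_t1  # one relation for every T2/T1 concept pair
    return mir, mir / n_articulations


# Dwarf lemur use case: 27 x 14 concepts, 22 input articulations.
mir, gain = information_gain(27, 14, 22)
print(mir, round(gain, 1))   # 378 17.2

# Full primate alignment: 483 x 317 concepts, 402 input articulations.
mir, gain = information_gain(483, 317, 402)
print(mir, round(gain, 1))   # 153111 380.9
```

Because the number of MIR grows with the product of the two taxonomy sizes while the expert input grows roughly with their sum, the gain increases with the size of the alignment.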
  78. 78. Why? Performance of names as concept identifiers.
  79. 79. MSW 2nd/3rd Edition name/concept identity relations  56.4% of the paired name lineages are taxonomically reliable.  Computers need concept resolution to track taxonomic provenance.
  80. 80. Use case 2: And Andropogon glomeratus sec. auctorum? 1 "Andropogon glomeratus is a species of grass (Poaceae) that occurs in the Southern U.S." Photo by Max Licher (ASU Herbarium); Cottonwood, Arizona. http://swbiodiversity.org/seinet/imagelib/imgdetails.php?imgid=431755 1 See Franz et al. 2014. Names are not good enough: reasoning over taxonomic change in the Andropogon complex. Semantic Web – Interoperability, Usability, Applicability – Special Issue on Semantics for Biodiversity. (in press)
  81. 81. In brief: Things are very messy.
  82. 82. Question 1: Which concept labels have included the name string "Andropogon glomeratus" in the past eight classifications? Tabular alignment of eight Andropogon classifications: 1889 to 2006 → 6 / 8 classifications are taxonomically unique for the concept of A. glomeratus sec. auctorum. → No two concepts including the "A. glomeratus" name string are taxonomically congruent.
  83. 83. Question 2: Which previously named concepts are congruent with Andropogon glomeratus sec. Weakley (2006)? Tabular alignment of eight Andropogon classifications: 1889 to 2006 → What Weakley (2006) refers to as "A. glomeratus" was previously referred to as: 1889: A. macrourus var. hirsutior + A. macrourus var. abbreviatus 1933: A. glomeratus (in part, I) 1948: A. glomeratus (?) 1950: A. virginicus var. hirsutior + A. glomeratus (in part, II) 1968: A. virginicus (in part) 1979: A. virginicus var. abbreviatus (in part) 1983: A. glomeratus (in part, I)
  84. 84. Logic representation: Easy!
  85. 85. Case 1: 1948.Blomquist vs. 1950.Hitchcock & Chase (Δ = 2 years) T2: 7 concepts (1950); T1: 7 concepts (1948) – containment view Merge: 3 congruent regions, 3 with same name 6 unique regions, 4 with non-unique name
  86. 86. Case 1: 1948.Blomquist vs. 1950.Hitchcock & Chase (Δ = 2 years) T2: 7 concepts (1950); T1: 7 concepts (1948) – containment view Merge: 3 congruent regions, 3 with same name 6 unique regions, 4 with non-unique name  A. glomeratus sec. 1950 and A. glomeratus sec. 1948 are overlapping, as each concept includes a non-congruent variety-level concept.  Interestingly, the shared concept region has no unique name in either taxonomy. It is 'un-named', at least within the context of the 1950/1948 classifications.
  87. 87. Case 1: 1948.Blomquist vs. 1950.Hitchcock & Chase (Δ = 2 years) T2: 7 concepts (1950); T1: 7 concepts (1948) – merge concept view Merge: 3 congruent regions, 3 with same name 6 unique regions, 4 with non-unique name  The shared, overlapping region is more informatively resolved and labeled in the merge concept visualization; the region 1950.A._glomeratus * 1948.A_glomeratus contains no subelements that carry the name "A. virginicus" in either classification.
  88. 88. Case 2: 1889.Hackel vs. 2006.Weakley (Δ = 117 years) T2: 12 concepts (2006); T1: 12 concepts (1889) Merge: 8 congruent regions, 0 with same name (!) 5 unique regions, 1 with non-unique name
  89. 89. Case 2: 1889.Hackel vs. 2006.Weakley (Δ = 117 years) T2: 12 concepts (2006); T1: 12 concepts (1889) Merge: 8 congruent regions, 0 with same name (!) 5 unique regions, 1 with non-unique name  Hackel & Weakley agree very substantively on what entities are 'out there in nature'; however, more than a century of Code-compliant name changes has obscured their agreements.
  90. 90. Case 3: 1983.Campbell vs. 2006.Weakley (Δ = 23 years) T2: 12 concepts (2006); T1: 14 concepts (1983) – containment view Merge: 9 congruent regions, 5 with same name 6 unique regions, 4 with non-unique name
  91. 91. Case 3: 1983.Campbell vs. 2006.Weakley (Δ = 23 years) T2: 12 concepts (2006); T1: 14 concepts (1983) – containment view Merge: 9 congruent regions, 5 with same name 6 unique regions, 4 with non-unique name  One of the simpler merge taxonomies in this use case, although 8 / 15 merge regions have taxonomically misleading names (i.e., congruence/different names; non-congruence/same names).  This ratio is near-average through nine pairwise alignments.
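The "8 / 15" figure can be recovered from the merge counts given for each case. The sketch assumes that a congruent region without a shared name counts as "congruence/different names" and that a unique region with a non-unique name counts as "non-congruence/same names"; the function name is hypothetical.

```python
from typing import Tuple


def misleading_names(congruent: int, congruent_same_name: int,
                     unique: int, unique_nonunique_name: int) -> Tuple[int, int]:
    """Return (regions with misleading names, total merge regions)."""
    different_name_congruent = congruent - congruent_same_name
    misleading = different_name_congruent + unique_nonunique_name
    return misleading, congruent + unique


# Counts as stated on the slides for the three Andropogon cases.
print(misleading_names(3, 3, 6, 4))   # Case 1, 1948 vs. 1950: (4, 9)
print(misleading_names(8, 0, 5, 1))   # Case 2, 1889 vs. 2006: (9, 13)
print(misleading_names(9, 5, 6, 4))   # Case 3, 1983 vs. 2006: (8, 15)
```

Under this reading, Case 3 reproduces the 8 / 15 ratio reported on the slide, which supports the interpretation of the two count categories.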
  92. 92. In conclusion:
  93. 93. In conclusion – feasibility, accessibility, and what it means. • Feasibility of tracking taxonomic concept provenance in computational logic: • We are making leaps and bounds in feasibility (and in scalability) right now. • However, many interesting challenges remain (e.g., user/reasoner interaction).
  94. 94. In conclusion – feasibility, accessibility, and what it means. • Feasibility of tracking taxonomic concept provenance in computational logic: • We are making leaps and bounds in feasibility (and in scalability) right now. • However, many interesting challenges remain (e.g., user/reasoner interaction). • Accessibility and acceptance of the RCC-5/reasoning approach: • We need more use cases, and users – the Euler/X approach works! • It can be applied to any new or legacy systematic publication, biodiversity database, checklist, classification, phylogeny, or other kinds of taxonomic syntheses (print or virtual) and versions thereof; complementing the Linnaean system while providing superior individuation of taxonomic content. • Having a sound web service is the next critical step in advancing the approach.
  95. 95. In conclusion – feasibility, accessibility, and what it means. • Feasibility of tracking taxonomic concept provenance in computational logic: • We are making leaps and bounds in feasibility (and in scalability) right now. • However, many interesting challenges remain (e.g., user/reasoner interaction). • Accessibility and acceptance of the RCC-5/reasoning approach: • We need more use cases, and users – the Euler/X approach works! • It can be applied to any new or legacy systematic publication, biodiversity database, checklist, classification, phylogeny, or other kinds of taxonomic syntheses (print or virtual) and versions thereof; complementing the Linnaean system while providing superior individuation of taxonomic content. • Having a sound web service is the next critical step in advancing the approach. • What does it all mean? • The legacy of taxonomic name and concept authoring is amenable to computational logic and provenance tracking. We can likely derive much data integration power from further developments in this direction.
  96. 96. Acknowledgments • Robert Guralnick, Susanna Drogsvold & all CU Museum of Natural History "The Meaning of Names" conference organizers! • Euler/X team: Mingmin Chen, Parisa Kianmajd, Shizhuo Yu, Shawn Bowers & Bertram Ludäscher • Juliana Cardona-Duque (weevils), Naomi Pier (primates) & Alan Weakley (grasses) • taxonbytes lab members: Andrew Johnston & Guanyang Zhang • NSF DEB–1155984 & DBI–1342595 (PI Franz); IIS–118088 & DBI–1147273 (PI Ludäscher) Franz Lab: http://taxonbytes.org/ https://sols.asu.edu/
  97. 97. Select references on concept taxonomy and the Euler/X toolkit • Franz & Peet. 2009. Towards a language for mapping relationships among taxonomic concepts. Systematics and Biodiversity 7: 5–20. • Chen et al. 2014. Euler/X: a toolkit for logic-based taxonomy integration. WFLP 2013 – 22nd International Workshop on Functional and (Constraint) Logic Programming. • Chen et al. 2014. A hybrid diagnosis approach combining Black-Box and White-Box reasoning. Lecture Notes in Computer Science 8620: 127–141. • Franz et al. 2014. Names are not good enough: reasoning over taxonomic change in the Andropogon complex. Semantic Web – Interoperability, Usability, Applicability – Special Issue on Semantics for Biodiversity. (in press) • Franz et al. 2014. Reasoning over taxonomic change: exploring alignments for the Perelleschus use case. PLoS ONE. (in review) • Euler/X toolkit: https://bitbucket.org/eulerx/euler-project • Euler web service (in progress): http://euler.asu.edu/ • Concept taxonomy @ taxonbytes: http://taxonbytes.org/tag/concept-taxonomy/
  98. 98. Miscellaneous appended slides
  99. 99. The good: names refer to type specimens necessarily Source: Witteveen. 2014. Biology & Philosophy. (in press)
  100. 100. The challenge: names refer to non-type specimens contingently Names Non-types Source: Dubois. 2005. Zoosystema 27: 365-426.
  101. 101. We may categorize kinds of nomenclatural and taxonomic change, and opportunities, to track each, as follows:
  102. 102. Nomenclatural/taxonomic change & provenance tracking square E.g.: - A binomial name is formed incorrectly. - A homonym is discovered, requiring name change.
  103. 103. Nomenclatural/taxonomic change & provenance tracking square E.g.: - A type specimen is lost, a neotype must be designated. - "One fungus (a-/sexual), one name" – Melbourne Code.
  104. 104. Nomenclatural/taxonomic change & provenance tracking square E.g.: - A heterotypic synonymy is established (inferred). - A priority-carrying name is newly 'transferred'.
  105. 105. Nomenclatural/taxonomic change & provenance tracking square E.g.: - A junior genus-level name is transferred among tribes. - An informal clade name is redefined across treatments.
  106. 106. Nomenclatural/taxonomic change & provenance tracking square Many changes Some changes Many changes MOST CHANGES ??? Question: Which changes are most common in a particular group? Answer: Concept-level resolution is needed to assess this.
  107. 107. Question: What is the proper scope of reference for representing our progress in inferring the tree of life?
  108. 108. Suggested answer: Even though the name → taxon mapping is the ultimate aim..
  109. 109. ..in effect we only need to represent the name → concept mapping. Congruence over time will suggest that we are 'getting taxa right'.
  110. 110. R32 lattice of RCC-5 articulations (lighter color = less certainty)
  111. 111. Higher-level primate classifications – 1993 versus 2005: Many recurrent names, little taxonomic congruence.
  112. 112. Primates sec. 1993 & 2005 Order to Subfamily-level  Not much is grey.
  113. 113. Strepsirrhini sec. 2005 Haplorrhini sec. 2005 Catarrhini sec. 2005
  114. 114. Use case 2: Perelleschus sec. 2001 & 2006 1 Perelleschus salpinflexus sec. Franz & Cardona-Duque (2013) DOI:10.1080/14772000.2013.806371 1 Input articulations: Franz & Cardona-Duque. 2013. Description of two new species and phylogenetic reassessment of Perelleschus Wibmer & O'Brien, 1986 (Coleoptera: Curculionidae), with a complete taxonomic concept history of Perelleschus sec. Franz & Cardona-Duque, 2013. Systematics and Biodiversity 11: 209–236. Merge analyses: Franz et al. 2014. Reasoning over taxonomic change: exploring alignments for the Perelleschus use case. PLoS ONE. (in press)
  115. 115. Goal: align two phylogenies with differential taxon sampling T1: Perelleschus sec. 2001 • Phylogenetic revision • 8 ingroup species concepts • 2 outgroup concepts • 18 concepts total
  116. 116. Goal: align two phylogenies with differential taxon sampling T1: Perelleschus sec. 2001 • Phylogenetic revision • 8 ingroup species concepts • 2 outgroup concepts • 18 concepts total T2: Perelleschus sec. 2006 • Exemplar analysis • 2 ingroup species concepts • 1 outgroup concept • 7 concepts total
  117. 117. Logic representation challenge: Perelleschus sec. 2001 & 2006 concepts have incongruent sets of subordinate members, yet each concept has congruent synapomorphies.
  118. 118. Definitional preliminaries 1 Ostensive alignment: the congruence among higher-level concepts is assessed in relation to their entailed members.  Ostension: giving meaning through an act of pointing out. 1 See Bird & Tobin. 2012. Natural Kinds. URL: http://plato.stanford.edu/archives/win2012/entries/natural-kinds/
  119. 119. Definitional preliminaries 1 Ostensive alignment: the congruence among higher-level concepts is assessed in relation to their entailed members.  Ostension: giving meaning through an act of pointing out. Intensional alignment: the congruence among higher-level concepts is assessed in relation to their properties.  Intension: giving meaning through the specification of properties. 1 See Bird & Tobin. 2012. Natural Kinds. URL: http://plato.stanford.edu/archives/win2012/entries/natural-kinds/
  120. 120. Ostensive alignment – members are all that counts Challenge 1: Differential outgroup sampling (2 / 1 concepts) T2: 2006.PHY & 2006.PHYsubcin T1: 2006.PHY only Input constraints Ostensive alignment 2001 & 2006
  121. 121. Ostensive alignment – members are all that counts Challenge 1: Differential outgroup sampling (2 / 1 concepts) T2: 2006.PHY & 2006.PHYsubcin T1: 2006.PHY only Solution: Locally relax coverage with "nc" = "no coverage" Input constraints Ostensive alignment 2001 & 2006
  122. 122. Ostensive alignment – members are all that counts Challenge 1: Differential outgroup sampling (2 / 1 concepts) T2: 2006.PHY & 2006.PHYsubcin T1: 2006.PHY only Solution: Locally relax coverage with "nc" = "no coverage" Result: 2006.PHY == 2001.PHY  Outgroups are held congruent. Input constraints Ostensive alignment 2001 & 2006
  123. 123. Ostensive alignment – members are all that counts Input constraints Challenge 2: Ostensive alignment Ostensive alignment 2001 & 2006
  124. 124. Ostensive alignment – members are all that counts Challenge 2: Ostensive alignment Solution: 11 ingroup concept articulations are coded ostensively – either as <, ><, or | – to represent non-congruence in the representation of child concepts Input constraints Ostensive alignment 2001 & 2006
  125. 125. Ostensive alignment – members are all that counts Challenge 2: Ostensive alignment Solution: 11 ingroup concept articulations are coded ostensively – either as <, ><, or | – to represent non-congruence in the representation of child concepts Result: 2006.PER < 2001.PER 2006.PER | 2001.[5 species concepts] etc. Input constraints Ostensive alignment 2001 & 2006 5 x | 2 x ><
  126. 126. Intensional alignment – representation of congruent synapomorphies Input constraints Challenge 3: Intensional alignment Intensional alignment 2001 & 2006
  127. 127. Intensional alignment – representation of congruent synapomorphies Input constraints Intensional alignment 2001 & 2006 Challenge 3: Intensional alignment Solution: An Implied Child (_IC) concept is added to the undersampled (2006) clade concept; and the (5) "missing" species-level concepts are included within this Implied Child
  128. 128. Intensional alignment – representation of congruent synapomorphies Input constraints Intensional alignment 2001 & 2006 Challenge 3: Intensional alignment Solution: An Implied Child (_IC) concept is added to the undersampled (2006) clade concept; and the (5) "missing" species-level concepts are included within this Implied Child 11 ingroup concept articulations are coded intensionally – as == or > – to reflect congruent synapomorphies of 2001 & 2006
  129. 129. Intensional alignment – representation of congruent synapomorphies Input constraints Intensional alignment 2001 & 2006 Challenge 3: Intensional alignment Result: The genus- and ingroup clade-level concepts are inferred as congruent: 2006.PER == 2001.PER 2006.PcarPeve == 2001.PcarPsul etc.
  130. 130. Review – representing ostensive versus intensional alignments Ostensive alignment 2001.PER includes more species-level concepts than 2006.PER [>].
  131. 131. Review – representing ostensive versus intensional alignments Ostensive alignment 2001.PER includes more species-level concepts than 2006.PER [>]. Intensional alignment 2006.PER reconfirms the synapomorphies inferred in 2001.PER [==].
  132. 132. The other piece in the puzzle: Concept-to-voucher identifications Source: Baskauf & Webb. 2014. Darwin-SW. URL: http://www.semantic-web-journal.net/system/files/swj635.pdf
