Taxonomic 'data' exchange as expression and synthesis of phylogenetic claims


'Slides' from a 5-minute presentation at iEvoBio 2014.

  1. Taxonomic 'data' exchange as expression and synthesis of phylogenetic claims. Jonathan A. Rees, National Evolutionary Synthesis Center. iEvoBio, 25 June 2014.
  2. Synergy: CoL, IRMNG, NCBI, GBIF, EOL, Union4, Treebase, OpenTree... Finding inconsistencies = good but hard. Collecting information is useful.
  3. 'Data' – BAH! 'data', 'information', 'representation', 'format', 'nomenclature': how bland. Distracting. Claims, not data. Consequential.
  4. Terminology. Taxon: a set determined by a membership rule ['taxon concept']: character based, descent based, or conspecificity based. Taxonomy: a collection of taxa that form a hierarchy. Some taxonomies are phylogenetic (all clades).
  5. Taxonomies are collections of claims. Example: a node X with children A, B, C. X includes A, B, and C. A, B, C are mutually disjoint. X, A, B, and C are clades, if phylogenetic.
  6. The important claims are about biology: X includes Y; X1, X2, X3, … are mutually disjoint; X is a clade; X is a species.
  7. Claims about taxa. We have to designate taxa somehow when we express a claim. Many taxon names are polysemous. To be clear, always say 'in the sense of' some static document (article or database snapshot). For example: X = Mammalia sensu http://dx.doi.org/10.1126/science.1211028 (if the name is used multiple ways in some document, give further qualification).
  8. Reasoning with claims. X includes Y and Y includes Z → X includes Z. X includes Y → X and Y are not disjoint. X and Y are clades → one includes the other, or they are disjoint.
  9. Two ways to be wrong: wrong about designation, wrong about science.
  10. 'Alignment' = estimating coreference. Alignment claims: X = Y (X and Y are the same taxon), e.g. Mammalia sensu http://dx.doi.org/10.1126/science.1211028 = Mammalia sensu NCBI.20140515. Heuristics based on properties and relations (including names...). Manual 'curation' if necessary.
  11. Incertae sedis. Confusing. 'X is incertae sedis in A' means (1) A includes X, and (2) it's not known which of A's non-incertae-sedis 'children' X belongs to, if any. (2) is not a claim about biology. Logical content = (1).
  12. 'Data exchange'. Taxonomies - NP
  13. Exchanging 'corrections'. 'Rozella belongs in Fungi.' 'Rhodophyceae is the same as Rhodophyta.' 'SILVA's Morganella isn't the same as Index Fungorum's Morganella.' 'Anolis isn't a clade unless Norops is merged into it.'
  14. Interpreting advice. “Rozella is in Fungi.” Rozella sensu SILVA115 and Fungi sensu SILVA115 belong to a clade disjoint from the other SILVA115 children of Nucletmycea. How about we apply the label 'Fungi' to such a clade, and not to Fungi sensu SILVA115?
  15. Notation not so important, but for example: includes(X, Y); disjoint(A, B, C, …); clade(X); node(X, A, B, C, …) (an abbreviation); species(X); same(X, Y); notSame(X, Y); sensu('Name', source); + nomenclatural claims.
  16. On and on: synthesis; identifier stability; alignment details; compare 'macrotaxonomy' and 'microtaxonomy'; defense of scruffy; compare Rod's github proposal; philosophy of language.
  17. Bottom line. Separate science from nomenclature. Use logic to do science. Always use names with sensu. Use heuristics to prevent paralysis. Don't 'represent data' – express claims! https://github.com/OpenTreeOfLife/reference-taxonomy/wiki/Expressing-phylogenetic-claims
  18. Ack: Nico Franz, David Thau, Rod Page. Open Tree: Karen Cranston, Stephen Smith, Mark Holder, and legions of others. Gerald Jay Sussman. Jonathan A. Rees 2014. Copyright waived, CC0 1.0.
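
The claim notation and the first two reasoning rules from the slides can be tried out in a few lines of Python. This is only a rough sketch, not the Open Tree implementation. It assumes that sensu('Name', source) is just a (name, source) pair and that node(X, A, B, C, …) abbreviates the includes and pairwise-disjointness claims shown for the example node. The helper names close_includes and find_conflicts and the source label "example-source" are made up for illustration. The clade rule (nested or disjoint) is left out.

```python
from itertools import combinations


def sensu(name, source):
    """Designate a taxon by a name plus the document fixing its sense (assumed encoding)."""
    return (name, source)


def node(parent, *children):
    """Expand node(X, A, B, ...) into includes claims and pairwise disjointness claims."""
    includes = {(parent, child) for child in children}
    disjoint = {frozenset(pair) for pair in combinations(children, 2)}
    return includes, disjoint


def close_includes(includes):
    """Apply 'X includes Y and Y includes Z -> X includes Z' until nothing new appears."""
    closed = set(includes)
    while True:
        derived = {(x, z) for (x, y) in closed for (y2, z) in closed if y == y2}
        if derived <= closed:
            return closed
        closed |= derived


def find_conflicts(includes, disjoint):
    """Return pairs claimed disjoint although one is (transitively) claimed to include the other."""
    closed = close_includes(includes)
    return sorted(
        tuple(sorted(pair))
        for pair in disjoint
        if any((a, b) in closed for a in pair for b in pair if a != b)
    )


SRC = "example-source"  # hypothetical static document, not a real snapshot
X, A, B, C, Y = (sensu(n, SRC) for n in "XABCY")

includes, disjoint = node(X, A, B, C)
includes.add((A, Y))  # one more claim: A includes Y

derived = close_includes(includes)
consistent = find_conflicts(includes, disjoint)
# A 'correction' from elsewhere claiming B includes A contradicts disjoint(A, B).
clash = find_conflicts(includes | {(B, A)}, disjoint)
print((X, Y) in derived, consistent, clash)
```

Because every taxon is designated with an explicit source, a reported clash can be examined as either kind of error from the "two ways to be wrong" slide: the two claims may be about different taxa that share a name, or the sources may really disagree about the biology.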
