2. ¡ What is Taxonomy?
§ CBD – “Taxonomy is the science of naming, describing and
classifying organisms and includes all plants, animals and
microorganisms of the world”
§ Using morphological, behavioral, genetic and biochemical
observations, taxonomists identify, describe and arrange
species into classifications, including those that are new to
science.
¡ Taxonomy involves:
§ Identifying an organism
§ Placing the organism in context with the rest of living
organisms
TAXONOMY – WHAT IS IT?
3. ¡ Taxonomy is based on names
¡ Humans have always given names
¡ Binomial nomenclature
¡ Define individuals and groups
¡ Each name defines a taxon
TAXONOMY – TAXONOMIC NAMES
4. ¡ Organization and classification of organisms
¡ According to common features
¡ Taxonomic classification
TAXONOMY - HIERARCHIES
http://wp.lps.org/jbenson2/blog/2012/01/18/january-18-taxonomy-chart-lab
5. ¡ Taxonomy has a strong subjective component
¡ Classifications depend on the expertise and point of
view of the specialist
¡ Frequent cases of:
§ Name removals
§ Taxon splits
§ Taxon merges
§ Different arrangements depending on the features chosen
¡ Some cases…
TAXONOMY – NAMES AND TAXONOMIES
6. ¡ Two different names are applied to the same organism
¡ An expert argues that two taxa originally described as different are the same
¡ Generally one name remains valid; the other is considered a
synonym and is no longer valid
TAXONOMY - SYNONYMY
Photo: Arthur Chapman
Antilocapra americana
Ord, 1815
Antilocapra anteflexa
Gray, 1855
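A minimal sketch of how a synonymy could be applied to a list of names, assuming a hand-made synonym table. The `SYNONYMS` dictionary and the `accepted_name` helper are hypothetical names used only for illustration; a real table would come from an authoritative source.

```python
# Hypothetical synonym table: synonym -> accepted name.
SYNONYMS = {
    "Antilocapra anteflexa": "Antilocapra americana",
}

def accepted_name(name: str) -> str:
    """Return the accepted name, or the name itself if no synonym is recorded."""
    return SYNONYMS.get(name, name)

records = ["Antilocapra anteflexa", "Antilocapra americana"]
resolved = [accepted_name(n) for n in records]
print(resolved)
```

Both records end up under the same accepted name, which is what makes them comparable afterwards.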
8. ¡ The same name is applied to two different organisms
¡ A new description uses a name that is “already taken”
¡ Generally, the oldest name prevails and the newer one has to change
TAXONOMY - HOMONYMY
Echidna
Cuvier, 1797
Echidna
Forster, 1777
Photo: David R
Photo: Petr Baum
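Homonyms can be spotted in a dataset when the same name string appears with more than one authorship. The sketch below assumes names are stored as (name, authorship) pairs; `find_homonyms` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

def find_homonyms(usages):
    """Group (name, authorship) pairs by name; keep names with several authorships."""
    by_name = defaultdict(set)
    for name, authorship in usages:
        by_name[name].add(authorship)
    return {name: sorted(auths) for name, auths in by_name.items() if len(auths) > 1}

usages = [
    ("Echidna", "Forster, 1777"),
    ("Echidna", "Cuvier, 1797"),
    ("Tachyglossus", "Illiger, 1811"),
]
homonyms = find_homonyms(usages)
print(homonyms)
```

This only flags candidates; deciding which name prevails still requires the publication dates and a taxonomist's judgment.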
10. Photo: Petr Baum
¡ The older name prevails: Echidna Forster, 1777 stays in use
¡ The newer homonym, Echidna Cuvier, 1797, is replaced by
Tachyglossus Illiger, 1811
TAXONOMY - HOMONYMY
11. ¡ Taxonomic classifications are subjective
¡ Based on common features
¡ Different experts select different features
¡ Scientific names might remain the same
¡ Higher level taxa or groups might differ
¡ See example…
TAXONOMY – ALTERNATE CLASSIFICATIONS
13. ¡ Because of these issues, taxonomic names
alone are not enough to identify a
taxon reliably
¡ New term: Taxon concept
¡ Name – A string of characters
¡ Concept – Name + context
¡ Even if the name is the same, the
concept is different when it applies
to different organisms
TAXONOMY – NAME VS CONCEPT
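The name/concept distinction can be sketched as a small data structure. This is an illustration only: the `TaxonConcept` class and its `according_to` field are assumed names, and the authorship string stands in for the fuller context a real concept would carry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaxonConcept:
    name: str          # the name: just a string of characters
    according_to: str  # the context: the source that circumscribes the taxon

eel = TaxonConcept("Echidna", "Forster, 1777")
mammal = TaxonConcept("Echidna", "Cuvier, 1797")

same_name = eel.name == mammal.name   # compares the strings only
same_concept = eel == mammal          # compares name and context together
print(same_name, same_concept)
```

Comparing names alone would treat the two records as the same taxon; comparing concepts keeps them apart.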
14. TAXONOMY - STANDARDS
¡ Taxonomic names: Scientific name and all higher taxa
¡ Taxon concept: taxonConceptID, nameAccordingTo,
namePublishedIn…
15. TAXONOMY - STANDARDS
¡ nameAccordingTo: the source in which the specific taxon concept
circumscription is defined or implied
¡ For taxa that result from identifications, a reference
to the keys, monographs, experts and other sources
should be given
17. ¡ One of the most common issues
¡ Random alteration of one or more characters in a
name
¡ Possible causes:
§ Purely accidental
§ Limited knowledge of the name
¡ Tend to appear at the time of digitization
NOISE - MISSPELLINGS
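One possible way to catch misspellings is approximate string matching against a reference list. The sketch below uses `difflib.get_close_matches` from the Python standard library; the reference names, the `suggest_name` helper and the 0.85 cutoff are assumptions chosen for illustration, not recommended values.

```python
import difflib

# Illustrative reference list; a real one would come from an authoritative source.
REFERENCE_NAMES = ("Antilocapra americana", "Tachyglossus aculeatus")

def suggest_name(name, reference=REFERENCE_NAMES, cutoff=0.85):
    """Return the name if it is in the list, the closest match above cutoff, or None."""
    if name in reference:
        return name
    matches = difflib.get_close_matches(name, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(suggest_name("Antilocapra amercana"))
print(suggest_name("Homo sapiens"))
```

A suggestion is only a candidate correction: two valid names can differ by a single character, so matches should be reviewed before they are applied.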
19. ¡ Misidentification
§ A more obscure type of error
§ The organism is assigned to the wrong taxon
§ Can only be solved through close examination by an
expert taxonomist
§ Might not be resolvable at all
¡ Emptiness (missing values)
§ Seriousness depends on which level or levels are missing
§ Importance decreases as taxonomic rank increases
§ Is the scientific name missing?
§ Special cases: homonymies, synonymies…
NOISE – MISIDENTIFICATIONS & EMPTINESS
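Emptiness, unlike misidentification, is easy to detect automatically. A minimal sketch, assuming each record is a dictionary keyed by rank; the `RANKS` list and the `missing_ranks` helper are hypothetical, with ranks ordered from highest to lowest so that gaps later in the result are the more serious ones.

```python
# Ranks from highest to lowest; field names are an assumption for this sketch.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "scientificName"]

def missing_ranks(record):
    """Return the ranks that are absent or blank in the record."""
    return [rank for rank in RANKS if not (record.get(rank) or "").strip()]

record = {
    "kingdom": "Animalia",
    "phylum": "Chordata",
    "class": "Mammalia",
    "order": "",
    "family": "Antilocapridae",
    "genus": "Antilocapra",
    "scientificName": "Antilocapra americana",
}
gaps = missing_ranks(record)
print(gaps)
```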
20. ¡ Not stating the taxonomy used
§ Can have the same effect as having only the scientific name
§ We might complete the hierarchy, but how reliable would it be?
§ Provide the taxonomy employed (taxon concept)
§ Use identification qualifiers: “Sensu Otegui, 2013”, or “Sensu
Biologia Centrali-Americana”
¡ Synonymies and homonymies
§ Again, background information (metadata, taxon concept) is
needed
§ Use identification qualifiers
NOISE – NATURE OF TAXONOMY
21. ¡ Instability of taxonomic identifications
¡ Background information greatly helps
¡ So does recording the source of each change
NOISE – NATURE OF TAXONOMY
22. ¡ Aims of taxonomic assessments
§ Correct issues
§ Reconcile taxonomies
§ Complete hierarchies
¡ Basic general process – controlled name list
§ Take a name
§ Check whether it exists in a reliable list of names
§ Extract the related information
§ Apply it to our dataset
ASSESSMENTS
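The basic process above can be sketched as follows. Everything here is an assumption made for illustration: the `REFERENCE` dictionary stands in for a reliable list of names, and the `assess` helper and its `assessment` field are hypothetical.

```python
# Hypothetical reference list: name -> related information.
REFERENCE = {
    "Antilocapra americana": {
        "class": "Mammalia",
        "order": "Artiodactyla",
        "family": "Antilocapridae",
    },
}

def assess(record, reference=REFERENCE):
    """Look the record's name up in the reference list and fill empty fields."""
    name = record.get("scientificName", "").strip()  # 1. take a name
    info = reference.get(name)                       # 2. check it exists in the list
    if info is None:
        return {**record, "assessment": "not found"}
    updated = dict(record)
    for field, value in info.items():                # 3. extract related information
        if not updated.get(field):                   # 4. apply it, filling only gaps
            updated[field] = value
    updated["assessment"] = "matched"
    return updated

result = assess({"scientificName": "Antilocapra americana", "family": ""})
print(result)
```

The sketch fills only empty fields and never overwrites existing values, a conservative choice that leaves conflicting information for a person to review.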
25. ¡ General Databases
§ Ideally, global high-quality information
§ Not complete
§ Rely on taxon-specific sources and their completeness
¡ Thematic databases and regional checklists
§ If our collection is taxon-specific or location-specific
§ Gather all available knowledge on their topic
§ Reliable authoritative sources
¡ Taxonomic Literature
§ Most specific source
§ Very high reliability
§ Hard to retrieve relevant literature
§ Some processing needed
ASSESSMENTS – SOURCES OF DATA
26. ¡ Free of misspellings
§ From the start, or at least reduced to a minimum
§ Tools such as Refine or Excel processing can help accomplish
this
§ Taxonomic reconciliation depends on this requirement
¡ Completeness
§ At least up to a certain point
§ The minimum is the scientific name
§ But the scientific name alone might not be enough
¡ Helpful metadata
§ Not related to the organism, but to the process of identification
§ The person who identified it, the taxonomic classification used
ASSESSMENTS - REQUIREMENTS
27. ¡ Manual
§ Removing inconsistencies, updating wrong information
§ Taxonomy is an interpretation of explicit and implicit knowledge
§ Explicit knowledge – records
§ Implicit knowledge – human deduction
§ Machines are not good at interpreting implicit knowledge
§ Prone to errors; an automated approach is recommended
¡ Automatic
§ Large amounts of data
§ Repetitive tasks
§ Removal of misspellings, checking against a source, updating
§ Handles only explicit knowledge, so explicit metadata is mandatory
ASSESSMENTS - METHODS
29. ¡ After cleaning, validate output
¡ Check:
§ The data that has been corrected
§ The data that could not be corrected
§ The data that might have gotten worse
¡ Taxonomic validation:
§ Expertise
§ Mixture of explicit and implicit knowledge
§ Not completely automatable
¡ If assessments fail:
§ Our data – Document and report reliability
§ Distributed data – Flag and report
VALIDATION
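The three checks can be partly automated by comparing names before and after cleaning against the reference list. This is a sketch under stated assumptions: `validation_report` is a hypothetical helper, and "valid" here only means "present in the reference list", so a change from one listed name to another listed name goes unnoticed and still needs expert review.

```python
def validation_report(before, after, reference):
    """Classify each (old, new) pair as corrected, uncorrected or worsened."""
    report = {"corrected": [], "uncorrected": [], "worsened": []}
    for old, new in zip(before, after):
        old_ok, new_ok = old in reference, new in reference
        if not old_ok and new_ok:
            report["corrected"].append((old, new))
        elif old_ok and not new_ok:
            report["worsened"].append((old, new))
        elif not new_ok:
            report["uncorrected"].append((old, new))
    return report

reference = {"Antilocapra americana"}
before = ["Antilocapra amercana", "Antilocapra sp.", "Antilocapra americana"]
after = ["Antilocapra americana", "Antilocapra sp.", "Antilocapra american"]
report = validation_report(before, after, reference)
print(report)
```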