5. What the phylogenomic era taught us
1) Insects are ‘crustaceans’
INSECT PHYLOGENOMICS MODELLING HETEROGENEITY THE SYNTHESIS
✓ Remipedia + Cephalocarida is shown to be an artifact of long-branch attraction
✓ Remipedia is the sister group of Hexapoda
Schwentner et al. 2017 Curr. Biol.
Remipedia
6. What the molecular era taught us
2) Termites are cockroaches
Inward et al. 2007 Biol. Lett. Evangelista et al. 2019 Proc. Roy. Soc. B
7. What the molecular era taught us
3) The origins of lice
Johnson et al. 2018 PNAS
“Psocoptera” is paraphyletic, while parasitism probably originated once in the ancestor of Phthiraptera
bark lice
Fahrenholzia pinnata
8. What the molecular era taught us
4) The ‘Strepsiptera problem’ resolved
Niehuis et al. 2012 Curr. Biol.
9. …but some tricky nodes
consistently defy
resolution
Despite the unprecedented
availability of data, some old
controversies remain resistant
to resolution.
Tihelka et al. 2021 Curr. Biol.
• Basal hexapods
• Monophyly of Palaeoptera
• Monophyly of Acercaria (Paraneoptera)
• Basal Polyneoptera
• Placement of fleas
10. The tricky nodes
• conflict with our traditional
morphological understanding of
insect evolution
• incongruent between different
studies
• some phylogenetic hypotheses
cannot be rejected
Tihelka et al. 2021 Curr. Biol.
11. The ‘silent phylogenomic revolution’: intro
Axiom 1
• Phylogenomic studies can give misleading results for a variety of
reasons.
Axiom 2
• Mitigating the most common sources of error can give us better trees.
12. Modelling nucleotide and amino acid
substitution: what is it?
• tree generation methods use an
evolutionary model to account
for nucleotide/amino acid
substitutions
• different substitutions will be
differently weighted when
calculating the likelihood of trees,
based on their rarity
• active and long-running area of
research -> ‘the silent revolution’
http://www.iqtree.org/
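To make the idea of "differently weighted" substitutions concrete, here is a minimal sketch using the closed-form substitution probabilities of the Kimura two-parameter model. The function names, the branch length `d` and the transition/transversion ratio `kappa` are illustrative assumptions, not values from any of the studies cited in these slides.

```python
import math

PURINES = {"A", "G"}  # the pyrimidines are C and T


def k2p_probability(x, y, d, kappa):
    """Probability of observing base y in place of base x under K2P.

    d     : branch length in expected substitutions per site
    kappa : transition/transversion rate ratio (kappa = 1 gives Jukes-Cantor)
    """
    e1 = math.exp(-4.0 * d / (kappa + 2.0))
    e2 = math.exp(-2.0 * d * (kappa + 1.0) / (kappa + 2.0))
    if x == y:
        return 0.25 + 0.25 * e1 + 0.5 * e2
    if (x in PURINES) == (y in PURINES):
        return 0.25 + 0.25 * e1 - 0.5 * e2  # transition (A<->G or C<->T)
    return 0.25 - 0.25 * e1                 # one specific transversion


def pair_log_likelihood(seq1, seq2, d, kappa):
    """Log-likelihood of two aligned sequences separated by distance d."""
    return sum(
        math.log(0.25 * k2p_probability(a, b, d, kappa))
        for a, b in zip(seq1, seq2)
    )


# Hypothetical 4-site alignments: one transition vs. one transversion.
with_transition = pair_log_likelihood("ACGT", "GCGT", d=0.1, kappa=5.0)
with_transversion = pair_log_likelihood("ACGT", "CCGT", d=0.1, kappa=5.0)
print(with_transition > with_transversion)
```

With `kappa` above 1 the rarer change (the transversion) lowers the likelihood more than the common one; with `kappa = 1` both kinds of change are weighted equally.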
13. In reality, the evolutionary process is highly heterogeneous…
Tihelka et al. 2021 Curr. Biol.
14. … and our models have to account for that
• Since the 1960s, there have been progressive attempts to
relax the assumption of absolute homogeneity and
account for at least some of these heterogeneities
• 1980: Kimura two-parameter model (K2P): accounts for the observation that
transitions and transversions occur at different rates due to the chemical
differences between purines and pyrimidines.
• 1985: Hasegawa-Kishino-Yano model (HKY85), where two rates of
nucleotide substitution were used in the replacement matrix (to account for
transitions and transversions) and the nucleotide frequencies were estimated
from the data rather than assumed to be equal.
• 1980s-1990s: General Time Reversible (GTR) matrix: assumes full heterogeneity of
the replacement process, with a separate exchange rate for each pair of nucleotides
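The nesting of these models can be sketched in a few lines: K2P and HKY85 are special cases of a GTR rate matrix with constrained exchange rates and, for K2P, equal base frequencies. The function names and the example frequencies below are assumptions made for illustration; this is not how any particular phylogenetics package builds its matrices.

```python
BASES = "ACGT"
PAIRS = ("AC", "AG", "AT", "CG", "CT", "GT")


def gtr_rate_matrix(exchange, freqs):
    """Build a normalised GTR rate matrix Q (rows and columns in ACGT order).

    exchange : dict mapping each unordered base pair to its exchange rate
    freqs    : dict mapping each base to its equilibrium frequency
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                q[i][j] = exchange["".join(sorted(x + y))] * freqs[y]
        q[i][i] = -sum(q[i])  # each row of a rate matrix sums to zero
    # Rescale so that one unit of time equals one expected substitution.
    mean_rate = -sum(freqs[x] * q[i][i] for i, x in enumerate(BASES))
    return [[v / mean_rate for v in row] for row in q]


def hky85(kappa, freqs):
    """HKY85: one rate for transitions, one for transversions, free frequencies."""
    exchange = {pair: 1.0 for pair in PAIRS}
    exchange["AG"] = exchange["CT"] = kappa
    return gtr_rate_matrix(exchange, freqs)


def k2p(kappa):
    """K2P: HKY85 with all base frequencies fixed at 1/4."""
    return hky85(kappa, {b: 0.25 for b in BASES})


# Hypothetical base frequencies, as if estimated from an alignment.
example_freqs = {"A": 0.4, "C": 0.1, "G": 0.2, "T": 0.3}
q_k2p = k2p(2.0)
q_hky = hky85(2.0, example_freqs)
```

Counting free parameters of the substitution process gives 1 for K2P, 4 for HKY85 (one rate ratio plus three frequencies) and 8 for GTR (five relative rates plus three frequencies), which is the price paid for each relaxation.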
15. We now have a suite of models for
addressing different evolutionary
heterogeneities
• Replacement-rate heterogeneity: GTR, WAG, LG …
• Across-site rate heterogeneity: Gamma (G / Γ) model
• Across-site compositional heterogeneity: CAT (PhyloBayes); UDM, C10–C60 (IQ-TREE)
• Heterotachy: GHOST (IQ-TREE)
• Across-lineages compositional heterogeneity: Node Discrete Compositional
Heterogeneity (NDCH) model, Correspondence and Likelihood Analysis (COaLA)
increasing computational complexity
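As a rough illustration of why across-site rate heterogeneity matters, the sketch below compares the expected proportion of differing sites under a single rate with the same quantity under a mixture of rate categories, using the Jukes-Cantor model for simplicity. The four relative rates are hypothetical values chosen to average 1.0; a real Gamma model would derive them from a fitted shape parameter.

```python
import math


def jc69_p_diff(d):
    """Probability that two sequences differ at a site after d
    expected substitutions per site (Jukes-Cantor model)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def mixture_p_diff(d, rates):
    """Same probability when sites fall into equally weighted
    rate categories with the given relative rates."""
    return sum(jc69_p_diff(d * r) for r in rates) / len(rates)


# Hypothetical relative rates with mean 1.0 (not from a fitted Gamma).
RATES = [0.1, 0.5, 1.0, 2.4]

homogeneous = jc69_p_diff(1.0)
heterogeneous = mixture_p_diff(1.0, RATES)
print(homogeneous, heterogeneous)
```

For the same true distance, the heterogeneous process produces fewer observed differences because fast sites saturate. A homogeneous model fitted to such data would therefore tend to underestimate long branches, which is one plausible route to long-branch attraction.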
16. Why care about complex heterogeneous
models?
• When the assumptions of the model you are using are violated
by the data, you may get systematic errors:
• The same error is repeatably and consistently recovered with
high support from different datasets composed of the same
type of data as long as the underlying biases remain
unmitigated.
• Model mis-specification is one of the most widespread causes
of error in phylogenomics
• Using heterogeneous models helps suppress common sources
of error such as long-branch attraction
• -> some nodes are very difficult and can only be solved with
well-fitting heterogeneous models
17. Why care about complex heterogeneous
models?
A lot of old controversies in animal evolution boil down to the use of
inappropriate models!
18. Model-dependency is present across all
levels of insect phylogeny
Tihelka et al. 2021 Curr. Biol.
19. More on modelling: a new classification for
Coleoptera
• beetles account for about a quarter of extant animal diversity
• studies based on morphology, few gene
markers, mitogenomes and
transcriptomes remain incongruent at
the deepest nodes
• makes reconstructing the timescale of
their evolution difficult
20. More on modelling: a new classification for Coleoptera
Previous study:
• 95 genes
• “phylogenetic noise”
• Site-homogeneous models
Present study:
• 68 genes, single-copy
• “noise” filtered
• Site-heterogeneous models
• 80 cores, ~18 months
Results:
• Clambiformia ser. nov.
• Clamboidea sensu nov.
• Rhinorhipiformia ser. nov.
• Nosodendriformia ser. nov.
• Staphyliniformia sensu nov.
• Cucujoidea divided into Erotyloidea stat. nov., Nitiduloidea stat. nov., and Cucujoidea sensu nov.
21. Updated beetle timetree
Assessment of fossil calibrations
Myxophagan beetle once identified as Polyphaga
Key fossil positions reevaluated
Fossil calibrations: a more precise timetree for Coleoptera
22. More on modelling: a new classification for
Coleoptera
Cai et al. 2021 biorxiv
https://www.biorxiv.org/content/10.1101/2021.09.22.461358v1
23. More on modelling: a new classification for
Coleoptera
Cai et al. 2021 biorxiv
24. And the list goes on…
-> The insect tree of life is ripe for a re-evaluation!
25. Summary
1. despite the unprecedented accumulation of genome-scale datasets,
many old controversies in insect evolution remain unresolved
2. some ancient nodes in the insect tree of life are very hard to resolve,
and can only be tackled with some heterogeneous models
3. with better modelling, we can recover results consistent with
morphology and the fossil record
4. insect phylogenomic studies have to move away from standard
pipelines and embrace a more experimental approach – model fit
testing
5. modelling is as important as generating new data for resolving the
beetle/insect tree of life
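One simple form of model fit testing can be sketched with information criteria, which trade the log-likelihood of a model against its number of free parameters. The log-likelihoods below are invented for illustration only, and the parameter counts cover the substitution process alone (branch lengths are assumed to be shared by all models). Whether information criteria are the right tool depends on the models compared; other approaches such as cross-validation may suit complex mixture models better.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def rank_models(fits, n_sites):
    """Return model names sorted by BIC, best-fitting first.

    fits : dict mapping model name -> (log-likelihood, number of parameters)
    """
    return sorted(fits, key=lambda name: bic(*fits[name], n_sites))


# Hypothetical fits to a 1000-site alignment (made-up log-likelihoods).
example_fits = {
    "JC69": (-10500.0, 0),
    "K2P": (-10320.0, 1),
    "HKY85": (-10250.0, 4),
    "GTR": (-10247.0, 8),
}
ranking = rank_models(example_fits, n_sites=1000)
print(ranking)
```

In this made-up example the extra parameters of GTR do not buy enough likelihood to beat HKY85, which shows why model choice is worth testing for each dataset instead of being fixed in a standard pipeline.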