So many different kinds of mistakes: Or why systematic error is the 21st century’s sampling error
1. So many different kinds of mistakes
Or why systematic error is the 21st century’s sampling error
Liliana M. Dávalos
Assistant Professor, Department of Ecology & Evolution
SUNY, Stony Brook
Grand Valley State University
10 April 2014
2. My lab’s research mission
Biological diversity
Diversification
Human impact
3. Two kinds of questions
Biological diversity
Diversification, speciation: increase
Habitat loss: decrease
4. So many kinds of mistakes
• Sampling error vs. systematic error
• In phylogenetics
• How phenotypes evolve
• In environmental change
• Why are we losing forests?
6. Thinking about errors
• Let’s say we want to answer a question:
• In a finite population, what is the frequency of an allele?
Sampling vs. systematic
7. How to answer this question
• We go out, get samples, genotype different individuals
• Then we count the alleles
• What is the main source of error?
8. This is sampling error
• We want to get a better estimate of the allele frequency
• => Sample more
• We could sample the entire population
• => Best possible estimate of allele frequency
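The allele-frequency scenario above can be sketched in a few lines of Python. This is a minimal illustration, not part of the talk: the population size, the 30% frequency, the sample sizes and the function names are all assumptions chosen for the example. It shows that the spread of the estimate shrinks as the sample grows, and that a full census recovers the true frequency.

```python
import random
import statistics

def estimate_allele_frequency(population, sample_size, rng):
    """Estimate the frequency of allele 'A' from a random sample of allele copies."""
    sample = rng.sample(population, sample_size)
    return sample.count("A") / sample_size

def spread_of_estimates(population, sample_size, replicates, rng):
    """Standard deviation of the estimate across repeated samples (sampling error)."""
    estimates = [estimate_allele_frequency(population, sample_size, rng)
                 for _ in range(replicates)]
    return statistics.pstdev(estimates)

# Hypothetical finite population: 2,000 allele copies, 30% of them 'A'.
population = ["A"] * 600 + ["a"] * 1400
true_frequency = population.count("A") / len(population)
rng = random.Random(42)

small = spread_of_estimates(population, 20, 500, rng)    # few individuals genotyped
large = spread_of_estimates(population, 500, 500, rng)   # many individuals genotyped
census = estimate_allele_frequency(population, len(population), rng)

print("spread with n=20: ", small)
print("spread with n=500:", large)
print("census estimate:  ", census, "true:", true_frequency)
```

Sampling the whole population leaves no sampling error at all, which is why "sample more" is the right remedy for this kind of mistake.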
9. Now let’s ask a different question
• We want to find out how these 3000 microbial lineages relate to one another
• We get their genomes, map out each of the single-copy genes, estimate a phylogeny
Lang, Darling, Eisen 2013 PLoS One
10. But our results don’t make sense
• Is it sampling error?
• Can we sample more than the whole genome?
• We discover the model of gene evolution we are using was wrong
• What kind of error is this?
Lang, Darling, Eisen 2013 PLoS One
11. This is systematic error
• Even sampling whole genomes won’t fix the problem
• Having more data can make the problem worse!
• As long as we don’t change the model, we will keep obtaining the wrong answer
Lang, Darling, Eisen 2013 PLoS One
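A small simulation can make the point about more data concrete. The sketch below is an illustration under stated assumptions, not the analysis from the cited study: two sequences are assumed to have diverged under the Jukes-Cantor model, the true distance of 1.0 substitutions per site is a made-up value, and all function names are hypothetical. The "naive" estimator ignores multiple substitutions at the same site (a misspecified model), while the second estimator matches the model that generated the data.

```python
import math
import random

def simulate_differences(true_distance, n_sites, rng):
    """Count differing sites between two sequences separated by
    true_distance substitutions per site under Jukes-Cantor."""
    p_differ = 0.75 * (1.0 - math.exp(-4.0 * true_distance / 3.0))
    return sum(rng.random() < p_differ for _ in range(n_sites))

def naive_distance(n_diff, n_sites):
    """Misspecified model: assumes no site changed more than once."""
    return n_diff / n_sites

def jukes_cantor_distance(n_diff, n_sites):
    """Matching model: corrects for multiple substitutions at a site."""
    p = n_diff / n_sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def standard_error(n_diff, n_sites):
    """Binomial standard error of the proportion of differing sites."""
    p = n_diff / n_sites
    return math.sqrt(p * (1.0 - p) / n_sites)

rng = random.Random(1)
true_distance = 1.0  # hypothetical value chosen for illustration
results = {}
for n_sites in (1_000, 100_000, 1_000_000):
    n_diff = simulate_differences(true_distance, n_sites, rng)
    results[n_sites] = (naive_distance(n_diff, n_sites),
                        jukes_cantor_distance(n_diff, n_sites),
                        standard_error(n_diff, n_sites))
    print(n_sites, results[n_sites])
```

With more sites the standard error shrinks, but the naive estimate settles near 0.75(1 - e^(-4/3)) ≈ 0.55 rather than 1.0. More data only makes us more confident in the wrong answer; changing the model is what removes the error.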
12. So many kinds of mistakes
• Sampling error vs. systematic error
• In phylogenetics
• How phenotypes evolve
• In environmental change
• Why are we losing forests?
13. Phylogenetics
Figure: phylogenetic tree of Mycobacterium, Nocardia, Gordonia, Rhodococcus, Segniliparus and Bifidobacterium strains with node support values; labels mark pathogenic and non-pathogenic Mycobacterium
• Testing relatedness
• All of comparative biology
• Historical biogeography
• Evolutionary aspects of community ecology
• Diagnostics and similar applications
Corthals...Dávalos 2012 PLoS One
How phenotypes evolve
14. Dated trees more important than ever
• Dated trees need fossils
• Why use dated trees?
• Trait evolution
• History of assemblages in time and space
• Key innovations
Dumont, Dávalos et al. 2012 P R Soc B
15. Fossils without genomes
• We use morphological characters
• How good are the models of evolution for morphological characters?
• Characteristics of the data
• Compare to models of molecular evolution
Dávalos & Russell 2012 Ecol Evol
16. These are morphological characters
• They look like this (figure: matrix of species by characters)
• Discontinuous between species
• Factors, not numbers
• Difficult to model
19. The trouble with morphological characters
• At first, the only model was parsimony
• Neutral Jukes-Cantor (1969) model implemented in 2001
• Current model has gamma variation across characters
• Applying this model does not resolve the conflict
Dávalos, Cirranello et al. 2012 Biol Rev
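To see what a "neutral Jukes-Cantor" model means for a discrete morphological character, here is a minimal sketch of its transition probabilities when every change between the k states is equally likely (a k-state generalization of Jukes-Cantor, often called the Mk model). The function name and the parameterization by expected changes per character are illustrative assumptions, and gamma rate variation is left out.

```python
import math

def mk_transition_probability(k, branch_length, same_state):
    """Transition probability for a character with k equally exchangeable states.

    branch_length is the expected number of changes along the branch.
    same_state=True gives the probability of ending in the starting state;
    same_state=False gives the probability of ending in one particular other state.
    """
    decay = math.exp(-k * branch_length / (k - 1))
    if same_state:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k

# A binary (presence/absence) character on a short and a very long branch.
for v in (0.1, 10.0):
    stay = mk_transition_probability(2, v, same_state=True)
    change = mk_transition_probability(2, v, same_state=False)
    print(f"branch length {v}: stay={stay:.4f}, change={change:.4f}")
```

With k = 4 the formula reduces to the Jukes-Cantor probabilities for nucleotides. Note what the model takes for granted: all changes are equally likely and each character evolves independently of the others.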
20. If the Jukes-Cantor model yields conflicting answers, could the model be inadequate given these data?
22. Homoplasy II: ecological convergence
Figure, panel A: percent of codons of CYTB in each codon type (background selection, selection shift), marked as support, reject, significantly support or significantly reject; panels labeled CYTB and COX1
Figure, panel B: significant support or rejection by amino acid position in alignment (0 to 500), with selection shift in Glossophaginae
• Can bring together unrelated, ecologically similar lineages
• This example: mt cytochrome b gene of nectar-feeding bats
• Association between adaptive molecular evolution and support for the wrong node
Dávalos, Cirranello et al. 2012 Biol Rev
23. Homoplasy III: correlated evolution
• Expected in protein-coding genes
• Models in use for codons, amino acids, ribosomal RNA secondary structure
Dávalos & Perkins 2008 Genomics
24. Might these affect morphological characters?
Reviewer 1:
I don't see the point. If the characters are good characters (meaning that they have some phylogenetic signal at some level), then there is nothing especially wrong with the fact that they are weighted a little more than other characters.
28. Models incur systematic error
• Morphology = phenotype
• Neutrality and independence are the wrong assumptions for these models
• Not neutral
• Not independent
Skelly et al. 2013 Genome Res
29. How does morphology evolve?
• Ordering: each character state gives rise to a finite range of states
• There are limits to states because of
• Development
• Natural selection
Dávalos, Cirranello et al. 2012 Biol Rev
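One common way to express ordering is through the cost of moving between character states. The sketch below contrasts an unordered character, where any state can change to any other in one step, with an ordered one, where a change must pass through the intermediate states. It assumes states are coded as consecutive integers; the function names are illustrative and not taken from the talk.

```python
def unordered_cost(state_from, state_to):
    """Any change between different states costs a single step."""
    return 0 if state_from == state_to else 1

def ordered_cost(state_from, state_to):
    """A change must pass through every intermediate state."""
    return abs(state_from - state_to)

def cost_matrix(n_states, cost):
    """Step costs for every pair of states, as a list of rows."""
    return [[cost(i, j) for j in range(n_states)] for i in range(n_states)]

# A hypothetical three-state character, e.g. absent (0), small (1), large (2).
print("unordered:", cost_matrix(3, unordered_cost))
print("ordered:  ", cost_matrix(3, ordered_cost))
```

Under ordering, going from state 0 to state 2 takes two steps instead of one, so each state directly gives rise only to its neighbors.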
30. Modeling selection in morphology
• Brownian motion vs. Ornstein-Uhlenbeck models
• Continuous phenotypic traits
• Might selection explain homoplasy in morphological data?
Butler & King 2004 Am Nat
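The contrast between the two models can be sketched by simulating a continuous trait along a single lineage. This is a simplified illustration, not the phylogenetic fitting procedure of the cited paper: the parameter values, the Euler-Maruyama discretization and the function names are assumptions. Under Brownian motion the variance among replicate lineages keeps growing with time; under Ornstein-Uhlenbeck, the pull toward an optimum (a stand-in for selection) keeps it bounded.

```python
import math
import random
import statistics

def simulate_endpoint(n_steps, dt, sigma, rng, alpha=0.0, optimum=0.0, start=0.0):
    """Euler-Maruyama simulation of one lineage's final trait value.

    alpha = 0 gives Brownian motion; alpha > 0 gives Ornstein-Uhlenbeck,
    with a pull of strength alpha toward the optimum.
    """
    x = start
    for _ in range(n_steps):
        pull = alpha * (optimum - x) * dt
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += pull + noise
    return x

def bm_variance(sigma, t):
    """Expected variance among lineages after time t under Brownian motion."""
    return sigma ** 2 * t

def ou_variance(sigma, alpha, t):
    """Expected variance among lineages after time t under Ornstein-Uhlenbeck."""
    return sigma ** 2 / (2.0 * alpha) * (1.0 - math.exp(-2.0 * alpha * t))

rng = random.Random(7)
sigma, alpha, dt, n_steps = 1.0, 2.0, 0.025, 200  # hypothetical values, total time 5
bm_ends = [simulate_endpoint(n_steps, dt, sigma, rng) for _ in range(1000)]
ou_ends = [simulate_endpoint(n_steps, dt, sigma, rng, alpha=alpha) for _ in range(1000)]
bm_var = statistics.pvariance(bm_ends)
ou_var = statistics.pvariance(ou_ends)

print("BM variance: simulated", bm_var, "expected", bm_variance(sigma, 5.0))
print("OU variance: simulated", ou_var, "expected", ou_variance(sigma, alpha, 5.0))
```

Lineages pulled toward the same optimum end up with similar trait values regardless of ancestry, which is one way selection could produce homoplasy.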
31. Engineering model of performance
Figure, panels A to D: bat genera (Ardops through Brachyphylla) labeled by diet category: nectarivorous; frugivorous (figs); strictly frugivorous (figs, short-faced bats); other
Dumont ... Dávalos 2014 Evolution
32. Three performance peaks
Figure: histogram of mechanical advantage (MA, 0.0 to 1.2) against frequency (count, 0 to 500), colored by diet: figs, figs only, nectar, other
• Performance related to diet
• Low mechanical advantage in nectar-feeding bats
• Convergence on this phenotype
• Analyzing function and integrating selection is better than ignoring them
Dumont ... Dávalos 2014 Evolution
34. The trouble with systematic error
• In sampling error mode
• More is more
• More characters
• = thousands of correlated phenotypes
• This will fail: we have systematic error
• Improve model
• Improve data
• Reduce data
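A back-of-the-envelope calculation shows why thousands of correlated phenotypes add less than they seem to. The sketch below uses the textbook effective sample size for the mean of equally correlated variables. Treating morphological characters this way is a simplifying assumption made for illustration, and the correlation of 0.1 is a made-up value.

```python
def effective_sample_size(n_characters, correlation):
    """Number of independent characters carrying the same information as
    n_characters characters that share one pairwise correlation (0 to 1)."""
    return n_characters / (1.0 + (n_characters - 1) * correlation)

# Hypothetical pairwise correlation of 0.1 among characters.
for n in (10, 100, 1_000, 10_000):
    print(n, "characters behave like", round(effective_sample_size(n, 0.1), 2))
```

With a correlation of 0.1 the effective number of characters can never exceed 1 / 0.1 = 10, however many are scored. Past that point, improving the model or the data helps more than adding characters.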
35. So many kinds of mistakes
• Sampling error vs. systematic error
• In phylogenetics
• How phenotypes evolve
• In environmental change
• Why are we losing forests?
36. My lab’s research mission
Biological diversity
Diversification
Human impact
37. Why do rainforests decline? Three hypotheses
Hamburger! (or steak): Kaimowitz et al. 2004 CIFOR
Coca: Dávalos et al. 2011 Environ Sci Technol
Land tenure and property: Hecht 1993 BioScience
Why lose forests?
39. The real drivers of habitat loss
Diagram labels: Forest, coca; Eradication: nothing; Urbanization & Development: decrease; Forest becomes Pasture & Cows (is property)
Dávalos et al. 2014 Biol Cons
40. These systematic errors are scary
• Models inform policy
• Real decisions are made based on these inadequate models
• Models influence what data we collect
• If we focus on cattle and the problem is palm, we are missing the real story
41. Shifting to the present
• 20th century challenge
• Collecting enough data
• i.e., sampling
• Still relevant in many cases
• New challenges
• Formulating models
• “Big” data
• Correlated data
• Otherwise biased data
Fjeldsa et al. 2005 Ambio
42. Thanks!
• Funding
• NSF–DEB, CIDER–SBU
• Speciation & diversification: A. Cirranello, A. Russell, N. Simmons, P. Velazco
• Functional evolution: E. Dumont, S. Rossiter, E. Teeling
• Conservation & policy: D. Armenteras, A. Bejarano, A. Corthals, L. Correa, J. Holmes, N. Rodriguez, C. Romero
• Dávalos Lab
• Phylogenetics: R. Dahan, S. DelSerra, A. Goldberg, O. Warsi, L. Yohe, X. Zhang
• Land use: P. Connell, M. Hall, E. Simola, G. Tudda, Y. Shah