Vincenzi_Talk_at_Catolica_Santiago_2016
1. Simone Vincenzi
EU Marie Curie Fellow
University of California Santa Cruz, US
Polytechnic of Milan, Italy
simonevincenzi.com
simon.vincenz@gmail.com
Santiago, 12/14/2016
Within and among-population variation in life histories and population dynamics in an increasingly extreme world: integrating genetics, demography, pedigree reconstruction, and life-history theory
2. Collaborators
University of California Santa Cruz
Stanford
University of Bergen
Slovenia
Marc Mangel
Hans Skaug
Giulio De Leo
Carlos Garza
Slovenian field team
Alain Crivelli
Dusan Jesensek
3. My research interests
• Evolution of life histories and consequences
on vital rates, genetic variation, population
dynamics, risk of extinction in animal species
• Effects of climate (-change) and habitat
variation on life histories at different scales
– Acute (extreme event)
– Delayed (carry-over effect of early
environment)
– Long-term (among populations, spatial
variation)
• Theory and simulations guiding empirical
studies (and then back to theory)
4. Ecology in the 21st century
[Diagram elements: Past environments; Evolutionary history; LH traits; Genetic variation; Climate change; Novel environment; Fitness landscape; Evolution of LH traits; Population size, traits and dynamics. Time axis: Past, Now, Now/Future]
5. Data and tools
• To understand how within- and among-
population variation emerge and to predict
future behavior we need
1. Long-term studies
2. Longitudinal data for the estimation of individual
and shared variation
3. Statistical models that tease apart different
contributions to the observed variation
4. Genetic data across space and through time
5. Theory-based hypotheses limiting “researcher
degrees of freedom”
• Comparative quantitative studies of trait
variation covering a substantial part of a
species’ geographic range are very rare
7. Marble trout
• Stream-living salmonid endemic to:
– Adriatic basin of Slovenia and ex-Yugoslavia
– Po river basin in Northern Italy
• High plasticity of body size, up to 20-25 kg
• Spawning in November
• Emergence in June
• Maximum age 10 to 15 years
• First reproduction at 1 to 4 years of age
• Low movement
15. [Figure: population density (fish/ha, axis 0-2000) by year (2000-2010), one panel per population: Huda, L Idrijca, U Idrijca, Lipovscek, Zadlascica, Trebuscica, Zakojska, Gacnik]
[Map of populations: Gacnik, Trebuscica, Idrijca, Studenc, Sevnica, Zakojska, Huda, Gorska, Lipovscek, Zadlascica. Population size classes: 30-70 fish, 10-300 fish, >1000 fish]
16. • Questions on genetic variation
– among populations → evolutionarily significant units
– within population → degree of inbreeding
[Diagram elements: Past environments; Evolutionary history; LH traits; Genetic variation; Climate change; Novel environment; Fitness landscape; Evolution of LH traits; Population size, traits and dynamics]
17. Genome
• By sequencing the genome we may investigate
– how genotype leads to phenotype (still very difficult
even in model systems like Drosophila and zebrafish)
– pressures and processes that shape diversity in
populations (easier, but still…)
Peterson, et al. (2012). Double digest RADseq: an inexpensive method for de novo SNP discovery
and genotyping in model and non-model species. PloS One, 7(5), e37135.
18. Genetic structure of marble trout
Fumagalli, et al. (2002). Extreme genetic differentiation among the remnant populations of
marble trout (Salmo marmoratus) in Slovenia. Molecular Ecology, 11, 2711–2716.
Idrijca drainage
All pairwise Fst 0.31-0.88 (14 microsatellites)
[Figure labels: Zadla, Huda, Lipo, Prede. Map of populations: Gacnik, Trebuscica, Idrijca, Studenc, Sevnica, Zakojska, Huda, Gorska, Lipovscek, Zadlascica]
20. Pipeline for marble sequencing
• Illumina MiSeq
• ddRAD sequencing
• Between 8 and 13 individuals genotyped for
each population
• Size selection ~ 500 bp
• Stacks for de novo assembly and genotyping
(we also tried pyRAD)
• Finding SNPs → variation in a single DNA base
within a sequence
• Developed ~230 markers and then genotyped
Davey, J. W. et al. (2011). Genome-wide genetic marker discovery and genotyping using next-generation
sequencing. Nature Reviews Genetics, 12(7), 499–510.
Catchen, J. M. et al. (2011). Stacks: building and genotyping loci de novo from short-read sequences. G3, 1(3),
171–82.
22. Structure
Pritchard, J. K., Stephens, M., & Donnelly, P. (2000). Inference of population structure using multilocus genotype
data. Genetics, 155, 945–959.
Main insight
New unit (Sevnica), possibly
evolutionarily significant
(“represents an important
component in the evolutionary
legacy of the species”)
23. Inbreeding
• Mating of individuals that are genetically related
• Increased homozygosity and more likely occurrence of
recessive traits (inbreeding depression)
• Individual inbreeding coefficients estimated from
genomic data (based on the observed vs. expected
number of homozygous genotypes)
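The observed-versus-expected comparison in the last bullet can be sketched in Python as a method-of-moments estimator, F = (O - E) / (N - E), where O is the observed number of homozygous genotypes, E the number expected under Hardy-Weinberg proportions, and N the number of genotyped loci. This is a minimal sketch under stated assumptions, not necessarily the estimator used in the talk: the function name, the 0/1/2 genotype coding, and the use of population allele frequencies as inputs are all illustrative choices.

```python
from typing import Optional, Sequence


def inbreeding_coefficient(
    genotypes: Sequence[Optional[int]],
    alt_allele_freqs: Sequence[float],
) -> float:
    """Method-of-moments F = (O - E) / (N - E) for one individual.

    genotypes: alternate-allele count per SNP (0, 1 or 2; None if missing).
    alt_allele_freqs: population frequency of the alternate allele per SNP
    (assumed to be estimated elsewhere; hypothetical input).
    """
    if len(genotypes) != len(alt_allele_freqs):
        raise ValueError("genotypes and allele frequencies differ in length")

    n_loci = 0            # N: loci with a called genotype
    observed_hom = 0      # O: observed homozygous genotypes
    expected_hom = 0.0    # E: expected homozygous genotypes under HWE
    for genotype, p in zip(genotypes, alt_allele_freqs):
        if genotype is None:
            continue      # skip missing calls
        n_loci += 1
        if genotype in (0, 2):
            observed_hom += 1
        # Under HWE the probability of a homozygote is 1 - 2p(1 - p).
        expected_hom += 1.0 - 2.0 * p * (1.0 - p)

    if n_loci == 0 or n_loci == expected_hom:
        raise ValueError("no informative loci")
    return (observed_hom - expected_hom) / (n_loci - expected_hom)
```

Values near 0 indicate genotypes close to Hardy-Weinberg expectations, values approaching 1 indicate an excess of homozygotes, and negative values indicate an excess of heterozygotes.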
24. Inbreeding
[Figure: individual inbreeding coefficient (mean ± sd across individuals, axis 0.4-0.8) by population: Huda, Lipo, U Idri, Zadla, Trebu, L Idri]
25. Points
• Strong genetic divergence (few examples of
populations so divergent at such a small
geographic scale)
– Possibly a new evolutionarily significant unit
• Little shared polymorphism
• High to very high inbreeding
• Present/future work: how much adaptive
divergence vs. drift?
26. • Questions on vital rates and life-history traits
– Variation in survival and its determinants
– Variation in growth and its determinants
– How growth and survival co-vary
[Diagram elements: Past environments; Evolutionary history; LH traits; Genetic variation; Climate change; Novel environment; Fitness landscape; Evolution of LH traits; Population size, traits and dynamics]
Vincenzi, S., M. Mangel, D. Jesensek, J. C. Garza, and A. J. Crivelli. 2016. Within and among-population
variation in vital rates and population dynamics in a variable environment. Ecological Applications 26:2086–2102
28. Survival probabilities
ϕ(x): survival; p(y): capture
model               npar   AIC       DeltaAIC
ϕ(~time)p(~time)    20     4698.23   0.00
ϕ(~time)p(~Age)     20     4698.75   0.52
ϕ(~time)p(~1)       19     4701.14   2.91
Laake, J. L., Johnson, D. S., & Conn, P. B. (2013). marked: An R package for maximum-likelihood and MCMC analysis of capture-recapture data. Methods in Ecology and Evolution, 4, 885–890.
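The DeltaAIC column in the model table is each model's AIC minus the lowest AIC among the candidates. A short Python sketch reproduces it from the AIC values on the slide. The Akaike weights are not on the slide; they are a common companion quantity, included here as an assumption about what a reader might want next. Function and variable names are illustrative.

```python
import math
from typing import Dict


def delta_aic(aic_by_model: Dict[str, float]) -> Dict[str, float]:
    """AIC difference of each model from the best (lowest-AIC) model."""
    best = min(aic_by_model.values())
    return {model: aic - best for model, aic in aic_by_model.items()}


def akaike_weights(aic_by_model: Dict[str, float]) -> Dict[str, float]:
    """Relative support for each model, normalized to sum to 1."""
    deltas = delta_aic(aic_by_model)
    relative = {model: math.exp(-0.5 * d) for model, d in deltas.items()}
    total = sum(relative.values())
    return {model: r / total for model, r in relative.items()}


# AIC values copied from the slide's model table.
aic_table = {
    "phi(~time)p(~time)": 4698.23,
    "phi(~time)p(~Age)": 4698.75,
    "phi(~time)p(~1)": 4701.14,
}

deltas = delta_aic(aic_table)
weights = akaike_weights(aic_table)
for model in aic_table:
    print(f"{model:22s} dAIC={deltas[model]:.2f} weight={weights[model]:.2f}")
```

With differences this small between the top two models, the data do not strongly favor one capture-probability structure over the other.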
29. Survival
[Figure: annual survival (mean ± 95% CI, axis 0.2-0.6) by population: Gac, Huda, L Idri, Lipo, Stu, Sve, Trebu, U Idri, Zadla, Zak]
Determinants
1. Year of birth
2. Sampling occasion
30. Growth
[Figure: length (mm, axis 100-500) at age (1-9), one panel per population: Gacnik, Zakojska, Huda, Lipo, L Idri, U Idri, Zadla, Trebu]
31. Growth model
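\[
\begin{cases}
k^{(ij)} = \alpha_0 + \alpha_1^{(j)} + \alpha_2 x_{ij} + \sigma_u u_{ij} \\
L_\infty^{(ij)} = \beta_0 + \beta_1^{(j)} + \beta_2 x_{ij} + \sigma_v v_{ij} \\
t_0^{(ij)} = \gamma_0
\end{cases}
\]
\[
L(t) = L_\infty \left(1 - e^{-k (t - t_0)}\right)
\]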
Vincenzi, S. et al. (2014). Determining individual variation in growth and its implication for life-history
and population processes using the Empirical Bayes method. PLoS Computational Biology, 10, e1003828.
Vincenzi, S. et al. (2016). Trade-offs between accuracy and interpretability in von Bertalanffy random-
effects models of growth. Ecological Applications 26:1535–1552.
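The growth model on this slide can be sketched in Python as follows: each of k and L∞ is built from an intercept, a group-level term, a covariate term, and a scaled individual random effect, and length at age then follows the von Bertalanffy curve. This is only an illustration of the equations as written on the slide. All numerical values are hypothetical, and the sketch does not estimate anything (the cited papers fit the model to data with an Empirical Bayes approach).

```python
import math


def linear_predictor(intercept: float, group_effect: float, slope: float,
                     x_ij: float, sigma: float, random_effect: float) -> float:
    """intercept + group term + slope * covariate + sigma * random effect."""
    return intercept + group_effect + slope * x_ij + sigma * random_effect


def vb_length(age: float, l_inf: float, k: float, t0: float) -> float:
    """von Bertalanffy length at age: L(t) = L_inf * (1 - exp(-k (t - t0)))."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))


# Hypothetical parameter values for one individual i in group j.
k = linear_predictor(intercept=0.3, group_effect=0.05, slope=0.0,
                     x_ij=0.0, sigma=0.1, random_effect=0.5)
l_inf = linear_predictor(intercept=350.0, group_effect=20.0, slope=0.0,
                         x_ij=0.0, sigma=30.0, random_effect=-1.0)
t0 = -0.5  # gamma_0: shared by all individuals in the slide's model

lengths = [vb_length(age, l_inf, k, t0) for age in range(1, 10)]
```

Because u and v differ among individuals, each fish gets its own growth trajectory while sharing the population-level coefficients.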
32. Why we need individual random effects
[Figure panels: No random effects; Random effects]
36. Points
• High variability in survival and growth among
populations → most variation explained by
year of birth
– Climatic vagaries, families, spatial clustering,
early density
• Likely trade-off between growth and survival
at the population level, probably mediated by
availability of food
• Next: does higher heterozygosity increase
growth/survival?
37. • Questions on climate change
– Testing theory-based hypotheses on the
demographic, life-history, and genetic
effects of extreme climate events
[Diagram elements: Past environments; Evolutionary history; LH traits; Genetic variation; Climate change; Novel environment; Fitness landscape; Evolution of LH traits; Population size, traits and dynamics]
38. Climate change and extreme events
[Figure: MaxFlow vs. time (yrs); labels: 100-yr flood, Catastrophes]
Climate change → increased intensity, altered
frequency and seasonality of extreme events
39. Extreme events are increasingly relevant
2014 flood in Parma
2014 NE US cold wave
2014 European floods
2014 California drought
40. Extreme event
• In September 2002, more than 20,000 Chinook salmon died in
the lower 50 km of the Klamath River in Northern California
due to a combination of low flows, high temperature,
subsequent crowding and proliferation of disease
Quinn TP (2005) The Behaviour and Ecology of Pacific Salmon and Trout. University of Washington
Press.
41. Extreme event
• Anoxic crisis in the California Current Ecosystem
– Demersal fish and benthic invertebrates in shallow shelf waters
not adapted to severe oxygen stress → massive mortality
Chan F et al. (2008) Emergence of anoxia in the California current large marine ecosystem.
Science 319:920.
[Figure legend, from Chan et al. (2008): blue = years up to 1999; green = 2000-2005; red = 2006]
48. Theoretical and empirical
problems for extremes
• Need to start from clearly defined, process-
based hypotheses
– Beyond anecdotes
– Avoid “garden of forking paths” and chasing noise
• It’s intrinsically complicated
– Finding the right model system (rare events)
– Before-after studies are rare
– Context-specific → difficult to generalize insights
and develop an overarching predictive framework
(depends on life cycle and recurrence interval)
50. A few general predictions
• Extreme events increase risk of
extinction by reducing population size (iron
law of conservation)
• Heterozygosity and gene polymorphism
predicted to decrease after extremes
(bottlenecks)
• Higher survival after the event due to
lower density and less cannibalism
• Transient faster life histories after the
extreme:
– faster growth early in life
– younger age at reproduction
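The bottleneck prediction for heterozygosity can be illustrated with the standard population-genetic expectation for an idealized diploid population, H_t = H_0 (1 - 1/(2 Ne))^t, where Ne is the effective population size and t the number of generations. This formula is a textbook result and does not appear on the slides; the sketch below only shows why a sharp reduction in population size is expected to erode heterozygosity faster. Function and parameter names are illustrative.

```python
def expected_heterozygosity(h0: float, effective_size: float,
                            generations: int) -> float:
    """Expected heterozygosity after drift: H_t = H_0 * (1 - 1/(2 Ne))^t.

    Assumes an idealized, closed, randomly mating diploid population
    (no mutation, migration or selection).
    """
    if effective_size <= 0:
        raise ValueError("effective_size must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return h0 * (1.0 - 1.0 / (2.0 * effective_size)) ** generations


# Hypothetical comparison: same starting heterozygosity, two population sizes.
large = expected_heterozygosity(h0=0.5, effective_size=500, generations=5)
small = expected_heterozygosity(h0=0.5, effective_size=10, generations=5)
```

In this comparison the smaller population loses heterozygosity much faster over the same number of generations.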
62. Parentage
• Molecular markers for parentage inference are highly
polymorphic
– Microsatellites (repetitive DNA)
• SNPs (variation in a single nucleotide)
– abundant
– low genotyping error rates
– scoring SNP genotypes is easy
• Assignment of trios (mother, father, offspring) is
probabilistic (genotyping error, chance)
• ~80-100 SNPs → reliable reconstruction of the
pedigree
Anderson, E. C., & Garza, J. C. (2006). The power of single-nucleotide
polymorphisms for large-scale parentage inference. Genetics, 172, 2567–82
63. Pedigree reconstruction
• Genotyped at 94 loci for Zak and 118 for Lipo
• Allowed us to identify when two or more allegedly different fish
were the same (genetic tagging)
• Minor allele frequency (mean±sd) of SNPs
• 0.22±0.15 for Lipo
• 0.28±0.13 for Zak
• Conservative, two-step approach:
• SNPPIT for parent-pairs
• FRANz for single parents and more parent-pairs
• We genotyped ~1800 marble trout from the
populations of Lipo and Zak
64. Zak
[Figure: count (axis 0-200) of genotyped fish by year class (1995-2013) and assignment type: Single, Pair, Not_assign]
• 61% of genotyped samples assigned to
parent-pairs
• 11% to single parents
• 27% were not assigned
[Second panel labels: 2003-2008, 2010-2014; assignment type: Single, Pair, Not_assign]
65. Lipo
[Figure: count (axis 0-300) of genotyped fish by year class (1999-2008, 2010-2014) and assignment type: Single, Pair, Not_assign]
• 36% of genotyped samples were assigned
to parent-pairs
• 31% to single parents
• 33% not assigned
[Second panel labels: 2003-2008, 2010-2014; assignment type: Single, Pair, Not_assign]
66. Parents-per-offspring
• Measure of variance in reproductive success
• Total parents / total offspring (maximum of 2, with a lower limit of 0)
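The parents-per-offspring ratio can be sketched in Python as the number of distinct parents divided by the number of offspring in a cohort. This is a minimal sketch, assuming each offspring comes with a (mother, father) assignment in which either parent may be unknown; how the talk handled single-parent and unassigned fish is not stated on the slide, so that treatment is an assumption, as are the function name and the identifiers.

```python
from typing import Iterable, Optional, Tuple

Assignment = Tuple[Optional[str], Optional[str]]  # (mother_id, father_id)


def parents_per_offspring(assignments: Iterable[Assignment]) -> float:
    """Distinct parents divided by number of offspring in one cohort.

    Equals 2 when every offspring has its own unique pair of parents and
    approaches 0 when a few parents produce many offspring.
    """
    parents = set()
    n_offspring = 0
    for mother, father in assignments:
        n_offspring += 1
        for parent in (mother, father):
            if parent is not None:  # unknown parents are not counted
                parents.add(parent)
    if n_offspring == 0:
        raise ValueError("no offspring in cohort")
    return len(parents) / n_offspring


# Hypothetical cohorts: evenly spread vs. dominated by a single pair.
even_cohort = [("m1", "f1"), ("m2", "f2"), ("m3", "f3")]
skewed_cohort = [("m1", "f1")] * 10
```

Low values therefore signal high variance in reproductive success, which is the pattern reported for the 2011 cohorts on this slide.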
[Figure: parents-per-offspring (PPO, axis 0-2) by year class (1999-2013); panels A and B, labeled Lipo and Zak; point size shows sample size (10, 50, 100)]
92% of the 2011 cohort of Zak was
produced by two parents
97% of 220 assigned fish of the 2011
cohort were the progeny of 8 parents
67. Heterozygosity and loss of alleles
[Figure: heterozygosity (axis 0.2-0.6) by year class (1995-2013); panels A and B, labeled Lipo and Zak; point size shows sample size (10, 50, 100, 150)]
Heterozygosity predicted
to decrease ✔
68. Survival before and after extreme
[Figure: survival (axis 0-0.9, y-axis labeled η), No Flood vs. Flood and Pre vs. Post, by stream: Zak, Sve, Zadla, Lipo]
Higher survival after the event ✔
69. Growth before and after extreme
[Figure: length (mm, axis 100-400) at age (1-7), before flood vs. after flood; panels: Zak, Lipo, Zadla]
Faster growth after the extreme event ✔
70. Age of parents
[Figure: age of parents (axis 2-8) by year class (1997-2013); panels: Lipo (B), Zak (D); point size shows sample size]
Younger age at
reproduction ✔
Not enough data
71. Final point
• With the right model system, data, and modern lab
and computational approaches, we can
answer questions that were previously
unanswerable
• Theoretical predictions are still needed