The ‘hitch-hiking’
effect
April 4, 2016
Motivation
• You want to sequence some individuals and
identify loci “subject to selection”, in some general,
vague sense.
• To make any progress in terms of theory, you first
have to formalize the question.
Define the question
• What is the effect of natural selection at “site A” on
the change in allele frequency of “site B”?
[Diagram: “selected site” A and “neutral locus” B, separated by recombination rate r]
In the absence of selection
$E[\Delta x] = 0$ (“HWE”)
$V[\Delta x] = \frac{x(1-x)}{2N}$ (“More drift in small populations”)
$E[H] = \frac{\theta}{1+\theta}$; $\theta = 4N_e\mu$ (“More variability in large populations”)
(See any introductory evolution/population genetics text)
In the absence of selection
$E[\pi] = \sum_{i=1}^{n-1} 2\,\frac{i}{n}\,\frac{n-i}{n-1}\,f(i) = \theta$
Expected mean # differences b/w all pairs of sequences in a sample of size n. Tajima (1983) Genetics
$\hat{\theta} = S \Big/ \sum_{i=1}^{n-1} \frac{1}{i}$
Watterson (1975) Theoretical Pop’n Biology
$E[S] = \theta \sum_{i=1}^{n-1} \frac{1}{i}$
Expected # of mutations in sample of size n.
$f(i) = \frac{\theta}{i};\quad 1 \le i < n$
Expected # of mutations where derived state occurs i times in a sample of size n. Tajima (1983), but see Hudson (2015) PLoS One for way easier derivation.
[Plot: expected number of mutations f(i) against i = 1, …, 5, for n = 6; $\theta$ = 10; y-axis from 0 to 10]
Figure from Hudson, 1990 “Gene genealogies and the coalescent process”
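The neutral expectations above can be checked numerically. The sketch below is only an illustration: the function names are made up here, and it feeds the expected spectrum f(i) = θ/i (with the n = 6, θ = 10 values from the plot) back into Watterson's estimator and into the pairwise-difference average, which should both return θ.

```python
import math

def expected_sfs(n, theta):
    """Expected number of mutations with derived count i, for i = 1..n-1."""
    return [theta / i for i in range(1, n)]

def watterson_theta(S, n):
    """Watterson's estimator: S divided by the harmonic sum over 1..n-1."""
    a_n = sum(1.0 / i for i in range(1, n))
    return S / a_n

def pi_from_sfs(sfs, n):
    """Mean pairwise differences: a site with derived count i differs in i*(n-i) pairs."""
    n_pairs = n * (n - 1) / 2.0
    return sum(i * (n - i) * count for i, count in enumerate(sfs, start=1)) / n_pairs

# Values taken from the plot on the slide.
n, theta = 6, 10.0
sfs = expected_sfs(n, theta)
S = sum(sfs)  # expected number of segregating mutations

print("expected SFS:", [round(v, 2) for v in sfs])
print("theta_W:", watterson_theta(S, n))
print("pi:", pi_from_sfs(sfs, n))
```

Under neutrality the two estimates agree; the sweep models later in the deck are interesting precisely because they pull them apart.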
How does selection change
these predictions?
• “Classic sweep” - new mutation, beneficial upon
origin. Genotype fitnesses are 1, 1+sh, 1+2s.
• “Soft sweep” - neutral or deleterious variant,
becomes beneficial later. “Selection on standing
variation.”
• Polygenic trait. This is quadratic selection based
on deviations from an optimum. Will not cover in
detail. I will make qualitative comments.
Define the question
• What is the effect of natural selection at “site A” on
the change in allele frequency of “site B”?
[Diagram: “selected site” A and “neutral locus” B, separated by recombination rate r]
Classic sweeps: intuition
[Diagram: two panels, r = 0 and r > 0]
Heterozygosity (= diversity) is reduced at the neutral locus due
to “hitch-hiking”. The magnitude of the effect
will depend on s and r.
[Figure: Fig. 2 of Maynard Smith & Haigh (1974), x-axis from -0.0008 to 0.0008. $4Q_\infty(1 - Q_\infty)$ is the final amount of heterozygosity at a locus, when initial frequencies of a, A are 0.5. The graph here, with $N = 10^6$ and s = 0.01, is calculated from (8).]
From Maynard-Smith & Haigh (1974)
(Remember this when
reading Kim and Stephan)
Quantifying the process
• Need trajectory of beneficial mutation from
frequency 1/(2N) to 1-1/(2N).
• Need a way to simulate a coalescent on top of that.
• This is the “structured coalescent”, introduced by
Dick Hudson and Norm Kaplan
• See recent Perspective by Barton in Genetics:
http://www.genetics.org/content/202/3/865
Trajectories
… satisfies the differential equation
$\frac{dx(t)}{dt} = s\,x(t)\,(1 - x(t)), \quad x(t_1) = \varepsilon.$ (2)
The solution of this differential equation is
$x(t) = \frac{\varepsilon}{\varepsilon + (1-\varepsilon)\,e^{-s(t - t_1)}}.$ (3a)
It is convenient to introduce a new variable $\tau = t - t_1$. The time it takes for the process to go from $\varepsilon$ to $1-\varepsilon$ is
$-2\ln(\varepsilon)/s.$ (3b)
In order to describe the effect of the selected mutation on a linked neutral locus, Ohta and Kimura (1975) divide the population into two parts. One part consists of chromosomes carrying the advantageous mutation B, the other one the disadvantageous allele b. Let $p_1$ be the frequency of allele A among chromosomes carrying the favorable mutation B, and $p_2$ the frequency of allele A among b-chromosomes. Note that these variables are different from the usual state space variables of two-locus, two-allele models. Furthermore, let $\phi(p_1, p_2, \tau)$ be the joint probability density function of $p_1$ and $p_2$ at time t > 0. Our goal is to compute the expectations of the frequencies $p_1$ and $p_2$ and their second-order moments $p_1^2$, $p_1 p_2$, and $p_2^2$ …
Deterministic.
From Stephan et al. (1992, TPB)
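A minimal sketch of the deterministic trajectory, assuming the logistic solution and the approximate duration -2 ln(ε)/s quoted from Stephan et al. The function names and the values of N and s are hypothetical, and ε is set to 1/(2N) to match the starting frequency used elsewhere in these slides.

```python
import math

def logistic_frequency(t, s, eps, t1=0.0):
    """Deterministic frequency x(t) with x(t1) = eps (logistic solution)."""
    return eps / (eps + (1.0 - eps) * math.exp(-s * (t - t1)))

def sweep_duration(s, eps):
    """Approximate time (generations) to go from eps to 1 - eps."""
    return -2.0 * math.log(eps) / s

# Hypothetical values, chosen only for illustration.
N = 10_000
s = 0.01
eps = 1.0 / (2.0 * N)

duration = sweep_duration(s, eps)
x_end = logistic_frequency(duration, s, eps)
print(f"duration ~ {duration:.0f} generations, x at that time = {x_end:.6f}")
```

Evaluating the trajectory at the approximate duration lands within about ε² of 1 - ε, which is why the approximation is usually good enough.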
… infinitesimal mean includes an additional term, which effectively gives the appropriate push toward the boundary on which we have conditioned.
Our approach also relies on the reversibility of the diffusion process (cf. Griffiths 2003). Specifically, we use the fact that the diffusion process looking backward in time from the present (i.e., toward the introduction of the allele) has the same distribution as a process forward in time conditional on absorption at zero. This conditional process $X^*_N(t)$ is the same as $X_N(t)$ but with $\mu_N(x)$ replaced by $\mu^*_N(x) = -x$ (Ewens 2004). Likewise, because we are only interested in beneficial alleles that eventually reach fixation, we consider the diffusion process conditional on the selected allele reaching a frequency of one. This conditional process $X^+_S(t)$ has an infinitesimal mean $\mu^+_S(x) = 2Nsx(1 - x)/\tanh(2Nsx)$ (Ewens 2004).
To generate a trajectory for allele A, we use a variable-sized jump random walk to approximate to the diffusion process. Given a current frequency x, at time intervals $\Delta t$, the frequency x jumps to either:
$x \to x + \mu(x)\Delta t - \sqrt{x(1 - x)\Delta t}$ or (2a)
$x \to x + \mu(x)\Delta t + \sqrt{x(1 - x)\Delta t}$ (2b)
with equal probability. The term $\mu(x)$ is replaced by the conditional infinitesimal mean of the phase in question (i.e., neutral or selective). This process has the correct diffusion limit, that is, the correct infinitesimal mean and variance are obtained and all higher moments are zero, as the time interval $\Delta t \to 0$ (Karlin and Taylor 1981). Hence, for small $\Delta t$, it provides a good approximation to the diffusion process. We verified this for our choice of $\Delta t = 1/(4N)$ by comparison to analytical expectations and to alternative methods of simulation …

… variation, we simulate samples from a linked, neutrally evolving region using a structured coalescent approach. Specifically, we generate a trajectory of allele A from introduction to fixation, then condition on this particular realization of the genealogical process to generate an ancestral recombination graph for our sample (Fig. 1). The trajectory of allele A is modeled stochastically, using a new approach (see Methods). Under the standard sweep model, $f = 1/(2N)$, while under the model of directional selection on standing variation, $f > 1/(2N)$.
Effect of f on Diversity Levels
Irrespective of the value of f, mean diversity levels are most distorted near the selected site and tend toward the neutral expectation with increasing genetic distance. This is illustrated in Figure 2A, using parameters that may be applicable to humans (e.g., Frisse et al. 2001). We present three summaries of diversity: $\theta_W$ (Watterson 1975), $\theta_H$ (Fay and Wu 2000), and $\pi$ (Tajima 1989). Under a neutral equilibrium model, these statistics provide an unbiased estimate of $\theta$, the population mutation rate ($\theta = 4N\mu$, where $\mu$ is the mutation rate per generation per base pair). For these parameters, a standard sweep leads to a reduction in the mean levels of variation throughout the 100-kb region (relative to the neutral expectation of $\theta = 0.001$ per base pair).
A very similar picture is expected so long as $f < 1/(2Ns)$ and selection is strong (Stephan et al. 1992). As an example, in Figure 2A the expected levels of variation are indistinguishable for $f = 1/(2N) = 5 \times 10^{-5}$ and $f = 1/(2Ns) = 10^{-3}$. As f increases, the substitution of a favored allele has a weaker effect on diversity at linked neutral sites (Innan and Kim 2004); for f = 0.20, the effect is hardly detectable. …
Stochastic. From Przeworski et al.
(2005); originally Coop & Griffiths (2004) TPB,
but kinda hard to dig out.
TL;DR - the stochastic trajectory is preferred. The deterministic one over-estimates
time to fixation by approximately two-fold.
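The jump random walk quoted from Przeworski et al. can be sketched in a few lines. This is an illustration under stated assumptions rather than their implementation: it runs forward in time from 1/(2N) instead of backward from fixation, it assumes time is measured in units of 2N generations (which is what the form of the conditional mean suggests), and the function names and parameter values are hypothetical.

```python
import math
import random

def conditional_mean(x, alpha):
    """Infinitesimal mean conditional on fixation, with alpha = 2Ns."""
    return alpha * x * (1.0 - x) / math.tanh(alpha * x)

def sweep_trajectory(N, s, rng, dt=None):
    """Jump random walk from 1/(2N) until the frequency reaches 1 - 1/(2N)."""
    alpha = 2.0 * N * s
    dt = 1.0 / (4.0 * N) if dt is None else dt  # step size used in the excerpt
    x = 1.0 / (2.0 * N)
    stop = 1.0 - 1.0 / (2.0 * N)
    path = [x]
    while x < stop:
        jump = math.sqrt(x * (1.0 - x) * dt)
        sign = 1.0 if rng.random() < 0.5 else -1.0  # up or down with equal probability
        x = min(1.0, x + conditional_mean(x, alpha) * dt + sign * jump)
        path.append(x)
    return path

# Hypothetical parameters, chosen so the example runs quickly.
N, s = 1000, 0.05
traj = sweep_trajectory(N, s, random.Random(1))
generations = (len(traj) - 1) * (1.0 / (4.0 * N)) * 2.0 * N
print(f"{len(traj) - 1} steps, about {generations:.0f} generations")
```

Averaging the printed duration over many seeds and comparing it with -2 ln(ε)/s for ε = 1/(2N) is one way to examine the discrepancy between the stochastic and deterministic treatments.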
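Kaplan et al.
[Diagram, after Kaplan, Hudson and Langley: frequency X of the selected allele rising from $\varepsilon$ to $1-\varepsilon$, with time running from past to present]
$\Pr(\text{escape}) \approx r/s$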
Kaplan et al.
[Figure from N. L. Kaplan, R. R. Hudson and C. H. Langley: E(T) plotted against $\tau$]
E[T] = E[total time on tree]
Recent sweep: var. reduced. Old sweep: var. not reduced.
Strong selection!!!!
$\alpha = 2Ns$
$R = 2Nr$
$\tau$ = (generations since fixation) / 2N
Kaplan et al.
[Figure 3 of Kaplan, Hudson and Langley. E(T), the expected size (measured in 2N generations) of the ancestral tree of a sample of two genes at a selectively neutral locus, plotted against $\tau$, the ancestral time of fixation of the selected substitution. (a) For different values of R, the expected number of crossovers between the neutral region and the selected locus per genome per 2N generations ($\alpha = 10^4$). (b) For different values of selection, $\alpha$ (R = 10); (see text for explanation).]
s/r
• Routinely mis-quoted as “distance at which a
sweep will affect variation”.
• Wrong! It is distance at which site has Pr(escape)
close to 1.
From Kaplan et al.: “… the population (or when the rare allele becomes selectively favored). In Figure 3a, $\alpha = 10^4$ and in Figure 3b, R = 10. It is not difficult to show from Figure 2 and Equation (12) that E(T) is an increasing function of $\tau$ and R and decreasing function of $\alpha$. In particular, as is seen in Figure 3, a and b, E(T) will differ significantly from 2, its neutral value, if $\tau < 0.1$ and $R/\alpha < 0.01$. This means that the expected level of variation will be substantially reduced for all the sites within a physical distance of $(0.01)\alpha/C$ base pairs of a locus at which a selected substitution has recently occurred. For example, if $2N = 10^8$, s = … and c = …, then the width of the affected region is only about 200 bp. But if s = … and c = …, then the expected variation is reduced in a region about 2000 bp wide. …”
(Remember this when reading Kim and Stephan)
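To make the s/r point concrete, here is a small sketch comparing two distances: the one at which Pr(escape) ≈ r/s reaches 1, and the one implied by the criterion R/α < 0.01 quoted from Kaplan et al. It assumes r = c·d for a per-base-pair recombination rate c and distance d, so that R/α = r/s. The function names and the values of s and c are hypothetical, not the ones from the paper.

```python
import math

def escape_distance_bp(s, c):
    """Distance at which r = c*d equals s, i.e. Pr(escape) ~ r/s is close to 1."""
    return s / c

def reduced_diversity_distance_bp(s, c, threshold=0.01):
    """Distance within which R/alpha = r/s stays below the threshold."""
    return threshold * s / c

# Hypothetical values, chosen only for illustration.
s = 0.01   # selection coefficient
c = 1e-8   # recombination rate per base pair per generation

d_escape = escape_distance_bp(s, c)
d_reduced = reduced_diversity_distance_bp(s, c)
print(f"s/c = {d_escape:.0f} bp; 0.01*s/c = {d_reduced:.0f} bp")
```

With any choice of s and c, the region of substantially reduced variation under this criterion is a hundred times narrower than s/c, which is the sense in which quoting s/r as the affected distance is misleading.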
Which SFS?? (Blue = no selection, for reference)
[Plot: site frequency spectra; x-axis ticks 1 to 9, y-axis ticks 0 to 20]
Single recent,
strong sweep.
Fay and Wu.
Sweeps occurring at some rate.
Braverman et al., Przeworski.
[Figure 4 of Braverman et al.: The average value of Tajima’s D as a function of $\Lambda_r$. The parameters are n = 50, S = 17, $\alpha = 10^3, 10^4, 10^5, 10^6$, or $10^7$, and $\Lambda_r$ ranges from zero to $\Lambda_{\max}$, which varies depending on $\alpha$.]
Fig. from Braverman et al. (1995) Genetics
“Pseudo-hitchhiking”
• Ignore the trajectory—assume fixation time close to
zero. This means assuming very strong selection.
• Hitch-hiking events occur at rate rho, at which time
all lineages coalesce (in absence of
recombination).
“Pseudo-hitchhiking”
$E[\Delta x] = 0$
$V[\Delta x] = \rho\,x(1-x)$
$\rho$ = Rate of hitch-hiking events
$N_e = \frac{N}{1 + 2N\rho y^2} \to \frac{1}{2\rho y^2}$ as $N \to \infty$
The extent to which “rho” (and y…) depend on N determines
the extent to which variation is constant across species.
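The saturation of N_e is easy to see numerically. The sketch below evaluates N / (1 + 2Nρy²) for increasing N and compares it with the large-N limit of that expression, 1/(2ρy²). The function name and the values of ρ and y are hypothetical.

```python
import math

def pseudohitchhiking_ne(N, rho, y):
    """Effective size under pseudo-hitchhiking: N / (1 + 2*N*rho*y^2)."""
    return N / (1.0 + 2.0 * N * rho * y * y)

# Hypothetical values, chosen only for illustration.
rho, y = 1e-3, 0.3
limit = 1.0 / (2.0 * rho * y * y)  # limit of the expression as N grows

sizes = [1e2, 1e4, 1e6, 1e8]
values = [pseudohitchhiking_ne(N, rho, y) for N in sizes]
for N, ne in zip(sizes, values):
    print(f"N = {N:.0e}  Ne = {ne:.1f}  (limit {limit:.1f})")
```

Once 2Nρy² is much larger than 1, census size drops out and N_e is set by ρ and y alone, which is the reason the dependence of ρ and y on N matters for the across-species argument.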
[Figure 7 of J. H. Gillespie: The average values of Tajima’s D for different sample sizes. Those E{D(n)} curves come from samples drawn from a direct simulation of the pseudohitchhiking model for a sample of size n. The D(n) curves come from a direct simulation of the coalescent using Equation 18. In both cases, $\rho = 0.138$, $y = 0.3$, and $\theta = 5 \times 10^{-4}$.]
Figure from Gillespie (2000)
Soft sweeps: intuition
Beneficial mutation initially on > 1 genetic background,
leading to prediction that reduction in diversity at linked,
neutral sites will not be as extreme as for a “classic” sweep
(now often called a “hard” sweep).
Soft sweep simulation
• Stochastic trajectories
• Neutral trajectory + selected
“stitched” together.
• Vary “ts”, f, etc.
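The “stitching” in the bullets above can be sketched as follows. This is only an illustration built from the two conditional means quoted earlier from Przeworski et al.: a neutral phase simulated backward in time from frequency f toward loss (mean -x), and a selected phase simulated forward from f toward fixation. Time units of 2N generations, the step size 1/(4N), the function names, and the parameter values are all assumptions made for the example.

```python
import math
import random

def step(x, mean, dt, rng):
    """One jump of the random walk: drift plus an up or down jump of equal probability."""
    jump = math.sqrt(x * (1.0 - x) * dt)
    return x + mean * dt + (jump if rng.random() < 0.5 else -jump)

def selected_phase(f, N, s, rng):
    """Forward in time from f until near fixation, conditional on fixation."""
    alpha, dt = 2.0 * N * s, 1.0 / (4.0 * N)
    stop = 1.0 - 1.0 / (2.0 * N)
    x, path = f, [f]
    while x < stop:
        mean = alpha * x * (1.0 - x) / math.tanh(alpha * x)
        x = min(1.0, step(x, mean, dt, rng))
        path.append(x)
    return path

def neutral_phase(f, N, rng):
    """Backward in time from f until the allele is at or below 1/(2N)."""
    dt, stop = 1.0 / (4.0 * N), 1.0 / (2.0 * N)
    x, path = f, [f]
    while x > stop:
        x = min(1.0, max(0.0, step(x, -x, dt, rng)))
        path.append(x)
    return path

def stitched_trajectory(f, N, s, rng):
    """Oldest-to-youngest trajectory: reversed neutral phase, then selected phase."""
    older = neutral_phase(f, N, rng)[::-1]
    younger = selected_phase(f, N, s, rng)
    return older + younger[1:]  # drop the duplicated junction point at f

# Hypothetical parameters, chosen so the example runs quickly.
N, s, f = 500, 0.05, 0.05
path = stitched_trajectory(f, N, s, random.Random(7))
print(f"{len(path)} points, from {path[0]:.4f} to {path[-1]:.4f}")
```

Setting f = 1/(2N) makes the neutral phase empty and recovers the classic sweep, so varying f in this sketch interpolates between the two cases.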
[Figure: Fig. 1 of Przeworski et al. A possible genealogy for six chromosomes at a neutral locus linked to a site where a beneficial allele … In this example, A has just fixed in the population (at time T = 0), so all lineages carry the favored allele. … is favored from T to $t_s$ then neutrally evolving from $t_s$ (when it is at frequency f) to $t_m$. The trajectories for … phases are shown in black and gray, respectively. The coalescent genealogy for the six chromosomes is depicted … recombination events between allelic classes are indicated with slanted arrows. Most coalescent events occur … frequency. Because A is neutrally evolving from $t_s$ to $t_m$, its sojourn time is longer than it would be under a standard … more opportunity for recombination. Note that, in this example, the most recent common ancestor has not been reached …]
Figure from Przeworski et al. (2005) Evolution
[Figure: Fig. 2 of Przeworski et al. Mean diversity levels as a function of distance from the selected site for different values of f, the frequency … is first favored. Diversity levels are summarized by the mean $\pi$ (dashed), $\theta_W$ (gray) and $\theta_H$ (black). Under the neutral … all three statistics are unbiased estimators of $\theta$, the population mutation rate. (A) Plausible parameters for … simulations were run for 100 chromosomes, with $N = 10^4$, s = 0.05, and $\theta = \rho = 10^{-3}$ per base pair ($\rho$ = 4… parameter definitions). The time since the fixation of the beneficial allele is zero. Under the neutral equilibrium … = E($\theta_H$) = 1 per kilobase. (B) Plausible parameters for Drosophila melanogaster. A total of $10^4$ simulations were … with $N = 10^6$, s = 0.01, $\theta$ = 0.01 per base pair, and $\rho$ = 0.1 per base pair. The time since the fixation of the … Under the neutral equilibrium model, E($\pi$) = E($\theta_W$) = E($\theta_H$) = 1 per 100 bp.]
f=0.05: reduction in variation not nearly as pronounced
f=0.2: effect would be very hard to detect
“Hard” sweep: variation strongly reduced near selected site
$\pi$ = dashed
$\theta_W$ = grey
$\theta_H$ = black
Take-home
• Sweeps are easiest to detect when they are:
• strong (s large relative to r)
• recent
• in regions of low recombination (r small relative to s)
• selected on when rare
Quantitative traits
• Any one beneficial mutation not guaranteed to fix.
• I.e., sweeps can “stall out”; see Chevin and
Hospital (2008) Genetics.
• B/c genetic background may move mean trait
value to optimum before fixation occurs.
• But, patterns of hitch-hiking should depend mostly
on whether or not the mutation was rare at onset
of selection.
Considerations
• Demographic null model matters. Methods we
read will have to “account for demography”.
• How much hitch-hiking (HH) does there need to be before this is
impossible? (This is an open question.)

More Related Content

What's hot

BlUP and BLUE- REML of linear mixed model
BlUP and BLUE- REML of linear mixed modelBlUP and BLUE- REML of linear mixed model
BlUP and BLUE- REML of linear mixed model
KyusonLim
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
Christian Robert
 
A brief history of generative models for power law and lognormal ...
A brief history of generative models for power law and lognormal ...A brief history of generative models for power law and lognormal ...
A brief history of generative models for power law and lognormal ...sugeladi
 
Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...
Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...
Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...
Nick Watkins
 
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear EquationsNumerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
inventionjournals
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Christian Robert
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
Christian Robert
 
"reflections on the probability space induced by moment conditions with impli...
"reflections on the probability space induced by moment conditions with impli..."reflections on the probability space induced by moment conditions with impli...
"reflections on the probability space induced by moment conditions with impli...
Christian Robert
 

What's hot (10)

BlUP and BLUE- REML of linear mixed model
BlUP and BLUE- REML of linear mixed modelBlUP and BLUE- REML of linear mixed model
BlUP and BLUE- REML of linear mixed model
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
 
A brief history of generative models for power law and lognormal ...
A brief history of generative models for power law and lognormal ...A brief history of generative models for power law and lognormal ...
A brief history of generative models for power law and lognormal ...
 
Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...
Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...
Hyderabad 2010 Distributions of extreme bursts above thresholds in a fraction...
 
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear EquationsNumerical Study of Some Iterative Methods for Solving Nonlinear Equations
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
 
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrapStatistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
Statistics (1): estimation, Chapter 2: Empirical distribution and bootstrap
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
 
7주차
7주차7주차
7주차
 
"reflections on the probability space induced by moment conditions with impli...
"reflections on the probability space induced by moment conditions with impli..."reflections on the probability space induced by moment conditions with impli...
"reflections on the probability space induced by moment conditions with impli...
 
Final Report 201045581
Final Report 201045581Final Report 201045581
Final Report 201045581
 

Viewers also liked

Green Hope Reserve, Nicaragua
Green Hope Reserve, NicaraguaGreen Hope Reserve, Nicaragua
Green Hope Reserve, Nicaragua
IUCNGPAP
 
OpenConext Workshop TNC2014
OpenConext Workshop TNC2014OpenConext Workshop TNC2014
OpenConext Workshop TNC2014
openconext
 
SEO & Web Redesign - Before and After
SEO & Web Redesign - Before and AfterSEO & Web Redesign - Before and After
SEO & Web Redesign - Before and AfterDavy Bour
 
MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...
MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...
MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...
Vatsal Shah
 
Ile-Alatau National Park, Kazakhstan
Ile-Alatau National Park, KazakhstanIle-Alatau National Park, Kazakhstan
Ile-Alatau National Park, Kazakhstan
IUCNGPAP
 
Seminar2015
Seminar2015Seminar2015
Seminar2015
Kevin Thornton
 
CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES
CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES
CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES
Mary Barrera Muñoz
 
Prezentace, která se měla promítat na stužkováku (14.11.2013)
Prezentace, která se měla promítat na stužkováku (14.11.2013)Prezentace, která se měla promítat na stužkováku (14.11.2013)
Prezentace, která se měla promítat na stužkováku (14.11.2013)
zluva
 
Hlášky 4.A
Hlášky 4.AHlášky 4.A
Hlášky 4.A
zluva
 
Именуем ресурсы для Windows 8 правильно
Именуем ресурсы для Windows 8 правильноИменуем ресурсы для Windows 8 правильно
Именуем ресурсы для Windows 8 правильно
slavabobik
 
Vivo vitrothingamajig
Vivo vitrothingamajigVivo vitrothingamajig
Vivo vitrothingamajig
Kevin Thornton
 
OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...
OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...
OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...
openconext
 
Mobistealth pro version
Mobistealth pro versionMobistealth pro version
Mobistealth pro versionfragrom
 

Viewers also liked (17)

Green Hope Reserve, Nicaragua
Green Hope Reserve, NicaraguaGreen Hope Reserve, Nicaragua
Green Hope Reserve, Nicaragua
 
OpenConext Workshop TNC2014
OpenConext Workshop TNC2014OpenConext Workshop TNC2014
OpenConext Workshop TNC2014
 
SEO & Web Redesign - Before and After
SEO & Web Redesign - Before and AfterSEO & Web Redesign - Before and After
SEO & Web Redesign - Before and After
 
MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...
MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...
MPLS -Novel approach of multi protocol label switching for Asynchronous Trans...
 
Ile-Alatau National Park, Kazakhstan
Ile-Alatau National Park, KazakhstanIle-Alatau National Park, Kazakhstan
Ile-Alatau National Park, Kazakhstan
 
Je
JeJe
Je
 
Seminar2015
Seminar2015Seminar2015
Seminar2015
 
CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES
CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES
CURSO DE SUPERACIÓN LENGUA CASTELLANA Y CIENCIAS NATURALES
 
Prezentace, která se měla promítat na stužkováku (14.11.2013)
Prezentace, která se měla promítat na stužkováku (14.11.2013)Prezentace, která se měla promítat na stužkováku (14.11.2013)
Prezentace, která se měla promítat na stužkováku (14.11.2013)
 
Hlášky 4.A
Hlášky 4.AHlášky 4.A
Hlášky 4.A
 
Именуем ресурсы для Windows 8 правильно
Именуем ресурсы для Windows 8 правильноИменуем ресурсы для Windows 8 правильно
Именуем ресурсы для Windows 8 правильно
 
Halloween powerpoint
Halloween powerpointHalloween powerpoint
Halloween powerpoint
 
Vivo vitrothingamajig
Vivo vitrothingamajigVivo vitrothingamajig
Vivo vitrothingamajig
 
OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...
OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...
OpenConext: Authentication & Authorization Infrastructure for Virtual Researc...
 
Happy life 8
Happy life 8Happy life 8
Happy life 8
 
Rich poor
Rich poorRich poor
Rich poor
 
Mobistealth pro version
Mobistealth pro versionMobistealth pro version
Mobistealth pro version
 

Similar to Hitch hiking journalclub

population genetics of gene function (talk)
population genetics of gene function (talk)population genetics of gene function (talk)
population genetics of gene function (talk)
Ignacio Gallo
 
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Shu Tanaka
 
Thomas Lenormand - Génétique des populations
Thomas Lenormand - Génétique des populationsThomas Lenormand - Génétique des populations
Thomas Lenormand - Génétique des populationsSeminaire MEE
 
Statistical analysis by iswar
Statistical analysis by iswarStatistical analysis by iswar
Combining Data in Species Distribution Models
Combining Data in Species Distribution ModelsCombining Data in Species Distribution Models
Combining Data in Species Distribution Models
Bob O'Hara
 
Probability Theory 9
Probability Theory 9Probability Theory 9
Probability Theory 9
Lakshmikanta Satapathy
 
CHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptx
CHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptxCHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptx
CHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptx
rathorebhagwan07
 
THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...
THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...
THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...ICARDA
 
Unit3
Unit3Unit3
ISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptx
ssuser1eba67
 
Review on probability distributions, estimation and hypothesis testing
Review on probability distributions, estimation and hypothesis testingReview on probability distributions, estimation and hypothesis testing
Review on probability distributions, estimation and hypothesis testing
Meselu Mellaku
 
Chi squared test
Chi squared testChi squared test
Chi squared test
Victoria Seymour
 
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
IJRES Journal
 
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
irjes
 
Challenges in predicting weather and climate extremes
Challenges in predicting weather and climate extremesChallenges in predicting weather and climate extremes
Challenges in predicting weather and climate extremes
IC3Climate
 
Chi square tests
Chi square testsChi square tests
Chi square tests
waqole
 
Genetic diversity clustering and AMOVA
Genetic diversityclustering and AMOVAGenetic diversityclustering and AMOVA
Genetic diversity clustering and AMOVA
FAO
 

Similar to Hitch hiking journalclub (20)

Poster_PingPong
Poster_PingPongPoster_PingPong
Poster_PingPong
 
The HKA Test
The HKA TestThe HKA Test
The HKA Test
 
population genetics of gene function (talk)
population genetics of gene function (talk)population genetics of gene function (talk)
population genetics of gene function (talk)
 
eatonmuirheadsoaita
eatonmuirheadsoaitaeatonmuirheadsoaita
eatonmuirheadsoaita
 
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
Network-Growth Rule Dependence of Fractal Dimension of Percolation Cluster on...
 
Thomas Lenormand - Génétique des populations
Thomas Lenormand - Génétique des populationsThomas Lenormand - Génétique des populations
Thomas Lenormand - Génétique des populations
 
Statistical analysis by iswar
Statistical analysis by iswarStatistical analysis by iswar
Statistical analysis by iswar
 
Combining Data in Species Distribution Models
Combining Data in Species Distribution ModelsCombining Data in Species Distribution Models
Combining Data in Species Distribution Models
 
Probability Theory 9
Probability Theory 9Probability Theory 9
Probability Theory 9
 
CHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptx
CHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptxCHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptx
CHI SQUARE DISTRIBUTIONdjfnbefklwfwpfioaekf.pptx
 
THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...
THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...
THEME – 2 On Normalizing Transformations of the Coefficient of Variation for a ...
 
Unit3
Unit3Unit3
Unit3
 
ISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptxISM_Session_5 _ 23rd and 24th December.pptx
ISM_Session_5 _ 23rd and 24th December.pptx
 
Review on probability distributions, estimation and hypothesis testing
Review on probability distributions, estimation and hypothesis testingReview on probability distributions, estimation and hypothesis testing
Review on probability distributions, estimation and hypothesis testing
 
Chi squared test
Chi squared testChi squared test
Chi squared test
 
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
 
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
A Mathematical Model for the Hormonal Responses During Neurally Mediated Sync...
 
Challenges in predicting weather and climate extremes
Challenges in predicting weather and climate extremesChallenges in predicting weather and climate extremes
Challenges in predicting weather and climate extremes
 
Chi square tests
Chi square testsChi square tests
Chi square tests
 
Genetic diversity clustering and AMOVA
Genetic diversityclustering and AMOVAGenetic diversityclustering and AMOVA
Genetic diversity clustering and AMOVA
 

Hitch hiking journalclub

  • 2. Motivation • You want to sequence some individuals and identify loci “subject to selection”, in some general, vague sense. • To make any progress in terms of theory, you first have to formalize the question.
  • 3. Define the question • What is the effect of natural selection at “site A” on the change in allele frequency of “site B”? A B r “Selected site” “Neutral locus”
  • 4. In the absence of selection • $E[\Delta x] = 0$ (“HWE”) • $V[\Delta x] = \frac{x(1-x)}{2N}$ (“More drift in small populations”) • $E[H] = \frac{\theta}{1+\theta}$; $\theta = 4N_e\mu$ (“More variability in large populations”) (See any introductory evolution/population genetics text)
  • 5. In the absence of selection • $E[\pi] = \sum_{i=1}^{n-1} 2\,\frac{i}{n}\,\frac{n-i}{n-1}\,f(i) = \theta$: expected mean # differences b/w all pairs of sequences in a sample of size n. Tajima (1983) Genetics • $\hat{\theta} = S \big/ \sum_{i=1}^{n-1} \frac{1}{i}$: Watterson (1975) Theoretical Pop’n Biology • $E[S] = \theta \sum_{i=1}^{n-1} \frac{1}{i}$: expected # of mutations in sample of size n • $f(i) = \frac{\theta}{i}$; $1 \le i < n$: expected # of mutations where derived state occurs i times in a sample of size n. Tajima (1983), but see Hudson (2015) PLoS One for way easier derivation. [Figure: f(i) for i = 1 to 5, with n = 6 and θ = 10]
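The neutral expectations on this slide can be checked numerically. The following is a minimal sketch, assuming the standard infinite-sites results quoted above (f(i) = θ/i, Watterson's estimator, and mean pairwise differences computed from the site frequency spectrum); the function names are illustrative and not from the slides.

```python
def harmonic(n):
    """Sum of 1/i for i = 1..n-1, the denominator of Watterson's estimator."""
    return sum(1.0 / i for i in range(1, n))


def expected_sfs(n, theta):
    """Expected number of sites whose derived state occurs i times: f(i) = theta / i."""
    return {i: theta / i for i in range(1, n)}


def watterson_theta(num_segregating, n):
    """Watterson's estimator: S divided by sum_{i=1}^{n-1} 1/i."""
    return num_segregating / harmonic(n)


def pi_from_sfs(sfs, n):
    """Mean number of pairwise differences implied by a site frequency spectrum."""
    return sum(2.0 * i * (n - i) / (n * (n - 1)) * count for i, count in sfs.items())


if __name__ == "__main__":
    n, theta = 6, 10.0  # the values used in the slide's figure
    sfs = expected_sfs(n, theta)
    expected_s = sum(sfs.values())  # E[S] = theta * sum 1/i
    print("f(i):", sfs)
    print("E[pi]:", pi_from_sfs(sfs, n))
    print("Watterson's theta from E[S]:", watterson_theta(expected_s, n))
```

Under neutrality both summaries recover θ, which is why departures between them (for example Tajima's D) are used as signals of selection or demography.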
  • 6. Figure from Hudson, 1990 “Gene genealogies and the coalescent process”
  • 7. How does selection change these predictions? • “Classic sweep” - new mutation, beneficial upon origin. Genotype fitnesses are 1, 1+sh, 1+2s. • “Soft sweep” - neutral or deleterious variant, becomes beneficial later. “Selection on standing variation.” • Polygenic trait. This is quadratic selection based on deviations from an optimum. Will not cover in detail. I will make qualitative comments.
  • 8. Define the question • What is the effect of natural selection at “site A” on the change in allele frequency of “site B”? A B r “Selected site” “Neutral locus”
  • 9. Classic sweeps: intuition • r = 0 vs. r > 0 • Heterozygosity ( = diversity) is reduced at neutral locus due to “hitch-hiking”. Magnitude of effect will depend on ‘s’ and ‘r’.
  • 10. [Figure: Fig. 2 of Maynard Smith & Haigh (1974), “Hitch-hiking effect of a gene”, p. 29. $4Q_\infty(1 - Q_\infty)$ is the final amount of heterozygosity at a locus, when initial frequencies of a, A are 0.5. The graph, with $N = 10^6$ and $s = 0.01$, is calculated from their equation (8).] From Maynard Smith & Haigh (1974) (Remember this when reading Kim and Stephan)
  • 11. Quantifying the process • Need trajectory of beneficial mutation from frequency 1/(2N) to 1-1/(2N). • Need a way to simulate a coalescent on top of that. • This is the “structured coalescent”, introduced by Dick Hudson and Norm Kaplan • See recent Perspective by Barton in Genetics: http://www.genetics.org/content/202/3/865
  • 12. Trajectories
      Deterministic. From Stephan et al. (1992, TPB): the frequency x(t) of the favored allele satisfies the differential equation $dx(t)/dt = s\,x(t)(1 - x(t))$, $x(t_1) = \varepsilon$. The solution is $x(t) = \frac{\varepsilon}{\varepsilon + (1-\varepsilon)e^{-s(t - t_1)}}$ (3a). The time it takes to go from $\varepsilon$ to $1-\varepsilon$ is $-2\ln(\varepsilon)/s$ (3b).
      Stochastic. From Przeworski et al. (2005), orig. Coop & Griffiths (2004) TPB but kinda hard to dig out: the diffusion conditional on the selected allele reaching a frequency of one has infinitesimal mean $\mu_S^+(x) = 2Nsx(1-x)/\tanh(2Nsx)$; the neutral process conditional on absorption at zero has $\mu_N^*(x) = -x$. Trajectories are generated with a jump random walk: given a current frequency x, at time intervals $\Delta t$ the frequency jumps to either $x + \mu(x)\Delta t - \sqrt{x(1-x)\Delta t}$ (2a) or $x + \mu(x)\Delta t + \sqrt{x(1-x)\Delta t}$ (2b) with equal probability, with $\Delta t = 1/(4N)$. Under the standard sweep model, f = 1/(2N), while under the model of directional selection on standing variation, f > 1/(2N). As f increases, the substitution of a favored allele has a weaker effect on diversity at linked neutral sites; for f = 0.20, the effect is hardly detectable.
      TL;DR - the latter is preferred. The former over-estimates time to fixation by approximately two-fold.
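The two kinds of trajectory on this slide can be sketched as follows. This is a rough illustration under stated assumptions, not the published implementation: the stochastic walk is run forward in time from 1/(2N) to 1 - 1/(2N) (the paper simulates backward from the present), time is measured in units of 2N generations so one step of 1/(4N) is half a generation, and all function names are hypothetical.

```python
import math
import random


def deterministic_frequency(t, s, eps):
    """Logistic solution (3a), with t measured from the moment x = eps."""
    return eps / (eps + (1.0 - eps) * math.exp(-s * t))


def deterministic_sweep_time(s, eps):
    """Approximate time (generations) to go from eps to 1 - eps, equation (3b)."""
    return -2.0 * math.log(eps) / s


def stochastic_trajectory(N, s, rng):
    """Jump random walk for the diffusion conditional on fixation.

    Assumption: run forward from 1/(2N) until the frequency reaches
    1 - 1/(2N). Returns the list of visited frequencies.
    """
    alpha = 2.0 * N * s
    dt = 1.0 / (4.0 * N)
    x = 1.0 / (2.0 * N)
    upper = 1.0 - 1.0 / (2.0 * N)
    path = [x]
    while x < upper:
        # Infinitesimal mean conditional on fixation.
        mean = alpha * x * (1.0 - x) / math.tanh(alpha * x)
        jump = math.sqrt(x * (1.0 - x) * dt)
        # Step up or down with equal probability, equations (2a) and (2b).
        x += mean * dt + (jump if rng.random() < 0.5 else -jump)
        x = min(x, 1.0)
        path.append(x)
    return path


if __name__ == "__main__":
    N, s = 10_000, 0.05
    eps = 1.0 / (2.0 * N)
    path = stochastic_trajectory(N, s, random.Random(1))
    print("deterministic sweep time (generations):", deterministic_sweep_time(s, eps))
    print("stochastic sweep time (generations):", 0.5 * (len(path) - 1))
```

Comparing the two printed durations over several seeds gives a feel for how much the deterministic approximation overstates the sojourn time.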
  • 13. Kaplan et al. [Figure from Kaplan, Hudson and Langley: frequency x of the selected allele between ε and 1 - ε, with time running from past to present.] $\Pr(\text{escape}) \approx r/s$
  • 14. Kaplan et al. [Figure from N. L. Kaplan, R. R. Hudson and C. H. Langley, p. 896.] E[T] = E[total time on tree]. Annotations: recent sweep, var. reduced; old sweep, var. not reduced; strong selection!!!! Definitions: $\alpha = 2Ns$; $R = 2Nr$; $\tau$ = (generations since fixation) / (2N).
  • 15. Kaplan et al. [Figure 3: E(T), the expected size (measured in 2N generations) of the ancestral tree of a sample of two genes at a selectively ...]
  • 16. s/r • Routinely mis-quoted as “distance at which a sweep will affect variation”. • Wrong! It is the distance at which a site has Pr(escape) close to 1.
  • 17. [Excerpt from Kaplan et al.; figure caption: E(T) plotted against $\tau$, the ancestral time of fixation of the selected substitution. a. For different values of R, the expected number of crossovers between the neutral region and the selected locus per genome per 2N generations ($\alpha = 10^4$), and b. For different values of selection, $\alpha$ (R = 10); (see text for explanation).] In Figure 3a, $\alpha = 10^4$ and in Figure 3b, R = 10. It is not difficult to show from Figure 2 and Equation (12) that E(T) is an increasing function of $\tau$ and R and a decreasing function of $\alpha$. In particular, as is seen in Figure 3, a and b, E(T) will differ significantly from 2, its neutral value, if $\tau < 0.1$ and $R/\alpha < 0.01$. This means that the expected level of variation will be substantially reduced for all the sites within a physical distance of $(0.01)\alpha/C$ base pairs of a locus at which a selected substitution has recently occurred. For example, if $2N = 10^8$, s = [illegible] and c = [illegible], then the width of the affected region is only about 200 bp. But if s = [illegible] and c = [illegible], then the expected variation is reduced in a region about 2000 bp wide. (Remember this when reading Kim and Stephan)
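The difference between the two distances discussed on the last few slides is easy to see with numbers. The sketch below assumes Pr(escape) ≈ r/s, a constant per-base-pair recombination rate c so that r = c × distance, and that C in the Kaplan et al. excerpt is 2Nc per base pair (so that (0.01)α/C = 0.01 s/c); the parameter values are hypothetical and chosen only for illustration.

```python
import math


def pr_escape(r, s):
    """Approximate probability that a lineage recombines off the sweep, capped at 1."""
    return min(1.0, r / s)


def escape_distance_bp(s, c_per_bp):
    """Distance (bp) at which r/s reaches 1, i.e. the 's/r' distance."""
    return s / c_per_bp


def reduced_variation_distance_bp(s, c_per_bp, threshold=0.01):
    """Distance (bp) within which R/alpha = r/s stays below the threshold."""
    return threshold * s / c_per_bp


if __name__ == "__main__":
    s, c = 0.01, 1e-8  # hypothetical selection coefficient and per-bp recombination rate
    print("Pr(escape) ~ 1 at about", escape_distance_bp(s, c), "bp")
    print("variation substantially reduced within about",
          reduced_variation_distance_bp(s, c), "bp")
    print("Pr(escape) at 10 kb:", pr_escape(c * 10_000, s))
```

With these values the region of substantially reduced variation is two orders of magnitude narrower than s/c, which is the point of the "mis-quoted" warning.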
  • 18. Which SFS?? (Blue = no selection, for reference) [Figure: site frequency spectra; x-axis 1 to 9, y-axis 0 to 20.] Panels: single recent, strong sweep (Fay and Wu); sweeps occurring at some rate (Braverman et al., Przeworski).
  • 19. [Figure 4 of Braverman et al., “Hitchhiking Effect”, p. 789: the average value of Tajima’s D as a function of $\Lambda_r$. The parameters are n = 50, S = 17, $\alpha = 10^3, 10^4, 10^5, 10^6$, or $10^7$, and $\Lambda_r$ ranges from zero to $\Lambda_{MAX}$, which varies depending on $\alpha$.] Fig. from Braverman et al. (1995) Genetics
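For readers who want to reproduce the statistic plotted in this figure, here is a minimal sketch of Tajima's D computed from the mean pairwise difference π, the number of segregating sites S, and the sample size n, using the standard constants from Tajima (1989). The function name and the example π value are assumptions for illustration; n = 50 and S = 17 are the values in the figure caption.

```python
import math


def tajimas_d(pi, num_segregating, n):
    """Tajima's D: (pi - S/a1) scaled by its estimated standard deviation."""
    if num_segregating < 1 or n < 2:
        raise ValueError("need at least one segregating site and two sequences")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    s = float(num_segregating)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1.0))


if __name__ == "__main__":
    n, S = 50, 17
    # A hypothetical pi well below S/a1, as expected after a recent sweep
    # (excess of rare variants), gives a negative D.
    print(tajimas_d(pi=1.0, num_segregating=S, n=n))
```

A value of π equal to S divided by the harmonic sum gives D = 0, the neutral expectation.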
  • 20. “Pseudo-hitchhiking” • Ignore the trajectory—assume fixation time close to zero. This means assuming very strong selection. • Hitch-hiking events occur at rate rho, at which time all lineages coalesce (in absence of recombination).
  • 21. “Pseudo-hitchhiking” • $E[\Delta x] = 0$ • $V[\Delta x] = \rho\,x(1-x)$; $\rho$ = rate of hitch-hiking events • $N_e = \frac{N}{1 + 2N\rho y^2} \to \frac{1}{2\rho y^2}$ as $N \to \infty$ • The extent to which “rho” (and y…) depend on N determines the extent to which variation is constant across species.
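The saturation of N_e under pseudo-hitchhiking can be seen directly from the expression N_e = N / (1 + 2Nρy²), whose limit as N grows is 1/(2ρy²). A small sketch follows; the parameter values are hypothetical and the function names are not from the slides.

```python
import math


def effective_size(N, rho, y):
    """N_e = N / (1 + 2 N rho y^2) under the pseudo-hitchhiking model."""
    return N / (1.0 + 2.0 * N * rho * y * y)


def effective_size_limit(rho, y):
    """Limit of N_e as N grows without bound: 1 / (2 rho y^2)."""
    return 1.0 / (2.0 * rho * y * y)


if __name__ == "__main__":
    rho, y = 0.01, 0.3  # hypothetical rate of hitch-hiking events and value of y
    for N in (1e2, 1e4, 1e6, 1e8):
        print(f"N = {N:.0e}  N_e = {effective_size(N, rho, y):.1f}")
    print("limit:", effective_size_limit(rho, y))
```

Once 2Nρy² is much larger than 1, further increases in census size barely change N_e, which is the sense in which variation can be nearly constant across species.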
  • 22. [Figure 7 of J. H. Gillespie: the average values of Tajima’s D for different sample sizes. The E{D(n)} curves come from samples drawn from a direct simulation of the pseudohitchhiking model for a sample of size n. The D(n) curves come from a direct simulation of the coalescent using Equation 18. In both cases, $\rho = 0.138$, $y = 0.3$, and $u = 5 \times 10^{-4}$.] Figure from Gillespie (2000)
  • 23. Soft sweeps: intuition Beneficial mutation initially on > 1 genetic background, leading to prediction that reduction in diversity at linked, neutral sites will not be as extreme as for a “classic” sweep (now often called a “hard” sweep).
  • 24. Soft sweep simulation • Stochastic trajectories • Neutral trajectory + selected “stitched” together. • Vary “ts”, f, etc. [Figure: FIG. 1 of Przeworski et al. A possible genealogy for six chromosomes at a neutral locus linked to a site where a beneficial alle... In this example, A has just fixed in the population (at time T = 0), so all lineages carry the favored allele. ... is favored from T to $t_s$ then neutrally evolving from $t_s$ (when it is at frequency f) to $t_m$. The trajectories fo... phases are shown in black and gray, respectively. The coalescent genealogy for the six chromosomes is depict... recombination events between allelic classes are indicated with slanted arrows. Most coalescent events occ... frequency. Because A is neutrally evolving from $t_s$ to $t_m$, its sojourn time is longer than it would be under a stan... more opportunity for recombination. Note that, in this example, the most recent common ancestor has not been re...] Figure from Przeworski et al. (2005) Evolution
  • 25. [Figure: FIG. 2. Mean diversity levels as a function of distance from the selected site for different values of f, the fre... is first favored. Diversity levels are summarized by the mean $\pi$ (dashed), $\theta_W$ (gray) and $\theta_H$ (black). Under the n... all three statistics are unbiased estimators of $\theta$, the population mutation rate. (A) Plausible parameters fo... simulations were run for 100 chromosomes, with $N = 10^4$, s = 0.05, and $\theta = \rho = 10^{-3}$ per base pair ($\rho$ = 4... parameter definitions). The time since the fixation of the beneficial allele is zero. Under the neutral equilibr... = $E(\theta_H)$ = 1 per kilobase. (B) Plausible parameters for Drosophila melanogaster. A total of $10^4$ simulations were... with $N = 10^6$, s = 0.01, $\theta$ = 0.01 per base pair, and $\rho$ = 0.1 per base pair. The time since the fixation of th... Under the neutral equilibrium model, $E(\pi) = E(\theta_W) = E(\theta_H)$ = 1 per 100 bp.] Annotations: • “Hard” sweep: variation strongly reduced near selected site • f=0.05: reduction in variation not nearly as pronounced • f=0.2: effect would be very hard to detect • $\pi$ = dashed, $\theta_W$ = grey, $\theta_H$ = black
  • 26. Take-home • For sweeps to be detectable, they should be: • strong (relative to r) • recent • in regions of low recombination (relative to s) • selected on when rare
  • 27. Quantitative traits • Any one beneficial mutation not guaranteed to fix. • I.e., sweeps can “stall out”; see Chevin and Hospital (2008) Genetics • B/c genetic background may move mean trait value to optimum before fixation occurs. • But, patterns of hitch-hiking should depend mostly on whether or not the mutation was rare at onset of selection.
  • 28. Considerations • Demographic null model matters. Methods we read will have to “account for demography” • How much HH does there need to be before this is impossible? (This is an open question.)