PCB5065 Advanced Genetics Population Genetics and Quantitative Genetics Instructor: Rongling Wu, Department of Statistics, 420A Genetics Institute Tel: 2-3806, Email: [email_address] Tues Nov 14 Population genetics - population structure Wed Nov 15 Population genetics - Hardy-Weinberg equilibrium Thurs Nov 16 Population genetics - effective population size Mon Nov 20 Population genetics - linkage disequilibrium Tues Nov 21 Population genetics - evolutionary forces Wed Nov 22 Genetic Parameters: Means Thurs Nov 24 Thanksgiving Holiday - No Class Mon Nov 27 Genetic Parameters: (Co)Variances Tues Nov 28 Mating Designs for Parameter Estimation Wed Nov 29 Experimental Designs for Parameter Estimation Thurs Nov 30 Heritability, Genetic Correlation and Gain from Selection Mon Dec4 Molecular Dissection of Quant. Variation – Linkage Analysis Tues Dec 5 Molecular Dissection of Quant. Variation – Association Studies Wed Dec 6 Take-home exam on population and quantitative genetics given- due in electronic format ( [email_address] ) by 5 PM Tuesday Dec 12
Teosinte and Maize Teosinte branched 1( tb1 ) is found to affect the differentiation in branch architecture from teosinte to maize (John Doebley 2001)
Identify the genetic architecture of the differences in morphology between maize and teosinte
Estimate the number of genes required for the evolution of a new morphological trait from teosinte to maize: few genes of large effect or many genes of small effect?
Doebley pioneered the use of quantitative trait locus (QTL) mapping approaches to successfully identify genomic regions that are responsible for the separation of maize from its undomesticated relatives.
According to The International HapMap Consortium (2003), the statistical analysis and modeling of the links between DNA sequence variants and phenotypes will play a pivotal role in the characterization of specific genes for various diseases and, ultimately, the design of personalized medications that are optimal for individual patients.
What knowledge is needed to perform such statistical analyses?
Population genetics and quantitative genetics, and others…
The International HapMap Consortium, 2003 The International HapMap Project. Nature 426: 789-94.
Liu, T., J. A. Johnson, G. Casella and R. L. Wu, 2004 Sequencing complex diseases with HapMap. Genetics 168: 503-511.
Whether or not the population deviates from HWE at a particular locus can be tested using a chi-square test.
If the population deviates from HWE (i.e., Hardy-Weinberg disequilibrium, HWD), this implies that the population is not randomly mating. Many evolutionary forces, such as mutation, genetic drift and population structure, may operate.
Therefore, the population does not deviate from HWE at this locus.
Why the degree of freedom = 1? Degree of freedom = the number of parameters contained in the alternative hypothesis – the number of parameters contained in the null hypothesis. In this case, df = 2 (p A or p a and D) – 1 (p A or p a ) = 1
How does D transmit from one generation (1) to the next (2)?
D(2) = (1-r) 1 D(1)
D(t+1) = (1-r) t D(1)
t , D(t+1) r
Conclusions: - D tends to be zero at the rate depending on the recombination fraction. - Linkage equilibrium PAB = pApB is approached gradually and without oscillation. - The larger r, the faster is the rate of convergence, the most rapid being (½)t for unlinked loci (r=0.5).
The four gametes randomly unite to form a zygote. The proportion 1-r of the gametes produced by this zygote are parental (or nonrecombinant) gametes and fraction r are nonparental (or recombinant) gametes. A particular gamete, say AB, has a proportion (1-r) in generation t+1 produced without recombination. The frequency with which this gamete is produced in this way is (1-r)p AB (t).
Also this gamete is generated as a recombinant from the genotypes formed by the gametes containing allele A and the gametes containing allele B. The frequencies of the gametes containing alleles A or B are p A (t) and p B (t), respectively. So the frequency with which AB arises in this way is rp A (t)p B (t).
Therefore the frequency of AB in the generation t+1 is
p AB (t+1) = (1-r)p AB (t) + rp A (t)p B (t)
By subtracting is p A (t)p B (t) from both sides of the above equation, we have
This means that when the population undergoes random mating, the LD decays exponentially in a proportion related to the recombination fraction.
(1) Population structure and evolution
Estimating D, D' and R the mating history of
The larger the D’ and R estimates, the more likely the population in nonrandom mating, the more likely the population to have a small size, the more likely the population to be affected by evolutionary forces.
Reich, D. E., M. Cargill, S. Bolk, J. Ireland, P. C. Sabeti, D. J. Richter, T. Lavery, R. Kouyoumjian, S. F. Farhadian, R. Ward and E. S. Lander, 2001 Linkage disequilibrium in the human genome. Nature 411: 199-204.
Dawson, E., G. R. Abecasis, S. Bumpstead, Y. Chen et al. 2002 A first-generation linkage disequilibrium map of human chromosome 22. Nature 418: 544-548.
LD curve for Swedish and Yoruban samples. To minimize ascertainment bias, data are only shown for marker comparisons involving the core SNP. Alleles are paired such that D' > 0 in the Utah population. D' > 0 in the other populations indicates the same direction of allelic association and D' < 0 indicates the opposite association. a , In Sweden, average D' is nearly identical to the average |D'| values up to 40-kb distances, and the overall curve has a similar shape to that of the Utah population (thin line in a and b ). b , LD extends less far in the Yoruban sample, with most of the long-range LD coming from a single region, HCF2 . Even at 5 kb, the average values of |D'| and D' diverge substantially. To make the comparisons between populations appropriate, the Utah LD curves are calculated solely on the basis of SNPs that had been successfully genotyped and met the minimum frequency criterion in both populations (Swedish and Yoruban) (Reich,te al. 2001)
as the inbreeding coefficient. Biologically, F measures the degree with which heterozygosity is reduced due to inbreeding, measured as a fraction relative to heterozygosity expected in a random-mating population.
Controlled cross (self fertilization): F = (P Aa – P 0Aa )/ P 0Aa = (1/2 – 1/4)/½ = 1
Natural population: AA Aa aa Total
Example 1 Obs 224 64 6 294
Freq .762 .218 .020 p A =.871, pa=.129
Exp (HWE) .769 .224 .017
F = (P 0Aa – P Aa )/ P 0Aa = (.224-.218)/.224=.027
Example 2 Obs 234 36 6 276
Freq .848 .130 .022 p A =.913, pa=..087
Exp (HWE) .719 .159 0
F = (P 0Aa – P Aa )/ P 0Aa = (.159-.130)/.159=.101
First of all , no HWE population exists in nature because many evolutionary forces may operate in a population, which cause the genotype frequencies in the population to change.
Secondly , even if a population is at HWE, this equilibrium may be quickly violated because of some particular evolutionary forces.
These so-called evolutionary forces that cause the structure and organization of a population to change include mutation , selection , admixture , division , migration , genetic drift … Next, we will talk about the roles of some of these evolutionary forces in shaping a population.
With the definition of mutation rate u ( a fraction u of A alleles undergo mutation and become a alleles, whereas a fraction 1-u of A alleles escape mutation and remain A ), we have allele frequency in the next generation t+1
2 is actually the difference between the genotype frequencies ( R S ) in the metapopulation (equal to the average genotype frequencies among the subpopulations) and the genotype frequencies ( R T ) that would be expected in a total population in HWE., i.e.,
The average frequency of homozygous recessive genotypes among a group of subpopulations is always greater than the frequency of homozygous recessive genotypes that would be expected with random mating, and excess is numerically equal to the variance in the recessive allele frequency .
The relationship R S = R T + 2 R T is called Wahlund’s principle
This is a fundamental relation in population genetics that connects the fixation index in a metapopulation with the variance in allele frequencies among the subpopulations . The fixation index can be interpreted in terms of the inbreeding coefficient. Thus, the genotype frequencies in a metapopulation are expressed as
AA: p - A 2 + p - A p - a F ST = p - A 2 (1-F ST ) + p - A F ST
Aa: 2p - A p - a - 2p - A p - a F ST = 2p - A p - a (1-F ST )
aa: p - a 2 + p - A p - a F ST = p - a 2 (1-F ST ) + p - a F ST