Day2 145pm Crawford
1. Association Analysis. University of Louisville, Center for Genetics and Molecular Medicine, January 11, 2008. Dana Crawford, PhD, Vanderbilt University, Center for Human Genetics Research.
2.
3.
4.
5. Recurrence Risks: The chance that a disease present in a family will recur in that family ("lightning striking twice"). If the recurrence risk is greater within the family than among unrelated individuals, the disease has a "genetic" component. Suggests familial aggregation.
6. Recurrence Risks: Measured using the risk ratio ($\lambda$). Sibling risk ratio: $\lambda_s = \dfrac{\text{sibling recurrence risk}}{\text{population prevalence}}$. Cystic fibrosis: $\lambda_s = 0.25/0.0004 = 625$. Huntington disease: $\lambda_s = 0.50/0.0001 = 5000$.
7. Recurrence Risks: Complex traits. Here $\lambda$ is for first-degree relatives. Merikangas and Risch (2003) Science 302:599-601.
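The ratio on this slide can be checked with a few lines of Python. This is a minimal sketch using the recurrence risks and prevalences quoted on the slide; the function name `risk_ratio` is illustrative and not from the talk. With the slide's inputs, the cystic fibrosis ratio evaluates to 625.

```python
def risk_ratio(recurrence_risk: float, population_prevalence: float) -> float:
    """Recurrence risk in relatives divided by population prevalence."""
    if population_prevalence <= 0:
        raise ValueError("population prevalence must be positive")
    return recurrence_risk / population_prevalence


# Inputs as quoted on the slide.
lambda_s_cf = risk_ratio(0.25, 0.0004)   # cystic fibrosis
lambda_s_hd = risk_ratio(0.50, 0.0001)   # Huntington disease

print(round(lambda_s_cf), round(lambda_s_hd))
```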
8. Heritability: The proportion of phenotypic variation in a population attributable to genetic variation. Applies to quantitative traits and is measured as $h^2$. Think "twin studies" (family studies can also be used).
9. Heritability and Quantitative Traits: Determined by genes and environment. Example: height, NHANES 1971-1974 versus NHANES 1999-2002. Freedman et al (2006) Obesity 14:301-308. [Figure: panels for boys and girls, each showing Mexican Americans, Blacks, and Whites.]
10. Heritability and Quantitative Traits
Trait variation = genetic + environment: $\sigma_T^2 = \sigma_G^2 + \sigma_E^2$
Genetic variation = additive + dominant: $\sigma_G^2 = \sigma_a^2 + \sigma_d^2$
Environmental variation = familial/household + random/individual: $\sigma_E^2 = \sigma_f^2 + \sigma_e^2$
Broad-sense heritability: $h_B^2 = \sigma_G^2 / \sigma_T^2$
Narrow-sense heritability: $h_N^2 = \sigma_a^2 / \sigma_T^2$
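To make the difference between broad-sense and narrow-sense heritability concrete, here is a small sketch of the variance decomposition on this slide. The variance components passed in are made-up numbers chosen for easy arithmetic, not estimates from any study, and the function name is hypothetical.

```python
def heritabilities(var_additive, var_dominant, var_family, var_individual):
    """Return (broad-sense, narrow-sense) heritability from variance components."""
    var_genetic = var_additive + var_dominant          # sigma_G^2
    var_environment = var_family + var_individual      # sigma_E^2
    var_total = var_genetic + var_environment          # sigma_T^2
    if var_total <= 0:
        raise ValueError("total variance must be positive")
    return var_genetic / var_total, var_additive / var_total


# Hypothetical variance components (illustration only).
broad, narrow = heritabilities(var_additive=4.0, var_dominant=1.0,
                               var_family=2.0, var_individual=3.0)
print(broad, narrow)
```

Because dominance variance counts toward the broad-sense numerator only, the narrow-sense value can never exceed the broad-sense value.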
11. Heritability and Twin Studies: $h^2 = 2(r_{MZ} - r_{DZ})$, where $r$ is the correlation coefficient of the trait within twin pairs. Monozygotic twins share the same genetic material (~100%); dizygotic twins share about half (~50%).
12. Heritability and Twin Studies
Trait | r(MZ) | r(DZ) | Reference
Cholesterol | 0.76 | 0.39 | Fenger et al
SBP | 0.60 | 0.32 | Evans et al
BMI | 0.67 | 0.32 | Schousboe et al
Perceived pitch | 0.67 | 0.44 | Drayna et al
13. Heritability: Is everything genetic?
Trait | r(MZ) | r(DZ) | Reference
Vote choice | 0.81 | 0.69 | Hatemi et al
Religiousness | 0.62 | 0.42 | Koenig et al
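As a worked illustration, the twin-study formula $h^2 = 2(r_{MZ} - r_{DZ})$ can be applied to the correlations listed on slides 12 and 13. The sketch below is not from the talk; the function and variable names are illustrative, and the results are only as good as the simple formula itself.

```python
def twin_h2(r_mz: float, r_dz: float) -> float:
    """Heritability estimate from twin correlations: 2 * (r_MZ - r_DZ)."""
    return 2 * (r_mz - r_dz)


# (r_MZ, r_DZ) pairs as listed on slides 12 and 13.
twin_correlations = {
    "Cholesterol": (0.76, 0.39),
    "SBP": (0.60, 0.32),
    "BMI": (0.67, 0.32),
    "Perceived pitch": (0.67, 0.44),
    "Vote choice": (0.81, 0.69),
    "Religiousness": (0.62, 0.42),
}

estimates = {trait: round(twin_h2(r_mz, r_dz), 2)
             for trait, (r_mz, r_dz) in twin_correlations.items()}

for trait, h2 in estimates.items():
    print(f"{trait}: h2 = {h2:.2f}")
```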
14. Other Evidence For A Genetic Component: Monogenic disorders. Example: Phenotype of interest is sensitivity to warfarin dosing, but there are no heritability estimates. Solution: Rare, familial disorder of warfarin resistance.
15. Other Evidence For A Genetic Component: Case reports. Example: Phenotype of interest is susceptibility to Neisseria meningitidis (prevalence: 1/100,000). Solution: Case report of recurrent N. meningitidis in a patient.
16.
17.
18. Target Phenotypes: Disease or quantitative trait? Note: SNPs associated with quantitative traits may not be associated with the clinical endpoint. Carlson et al. (2004) Nature 429:446-452. [Figure: diagram with nodes labeled LDLR, IL6, Diet, Acute Illness, LDL-C, CRP, and MI.]
19.
20.
21.
22.
23. Study Design: Power calculation example. Cases: Adverse reaction (wheezing) to flu vaccination. Controls: Vaccinated children with no adverse reactions. Calculated using Quanto 1.1.1. [Figure: plot with axis labeled MAF.]
24. Study Design: Power calculation example. Immunogenicity to influenza A (H5N1) vaccine. Calculated using Quanto 1.1.1.
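The slides report power computed in Quanto. For readers without that program, the sketch below gives a rough feel for how power depends on minor allele frequency (MAF). It is an assumption-laden stand-in, not Quanto's method: it uses a normal-approximation two-proportion test on allele counts, and the sample sizes and odds ratio in the loop are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def allelic_power(maf_controls, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test comparing allele frequencies.

    Assumption: normal-approximation z-test on allele counts
    (2 alleles per person); this is not the calculation Quanto performs.
    """
    if not 0 < maf_controls < 1:
        raise ValueError("maf_controls must be between 0 and 1")
    # Allele frequency in cases implied by the allelic odds ratio.
    odds_cases = odds_ratio * maf_controls / (1 - maf_controls)
    maf_cases = odds_cases / (1 + odds_cases)
    # Standard error of the frequency difference, counting chromosomes.
    se = sqrt(maf_cases * (1 - maf_cases) / (2 * n_cases)
              + maf_controls * (1 - maf_controls) / (2 * n_controls))
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z_effect = abs(maf_cases - maf_controls) / se
    # Probability of rejecting in either tail.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Hypothetical design: 500 cases, 500 controls, allelic OR of 1.5.
for maf in (0.05, 0.10, 0.20, 0.40):
    print(maf, round(allelic_power(maf, 1.5, 500, 500), 3))
```

With an odds ratio of 1 the function returns the significance level itself, which is a quick sanity check on the approximation.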
25.
26.
27.
28.
29. Direct Candidate Gene Association Study: Genotype "functional" SNPs. Example: Nonsynonymous SNPs. Collins et al (1997) Science 278:1580-1581
30. Direct Candidate Gene Association Study: Problem: We don't know what is functional and what is not. Botstein and Risch (2003) Nat Genet 33 Suppl:228-37.
31. Direct Candidate Gene Association Study: What would we miss? Functional synonymous SNPs in MDR1 alter P-glycoprotein activity. Komar (2007) Science 315:466-467
32.
33.
34. Indirect Candidate Gene Association Study: Linkage disequilibrium (LD), measured by $r^2$:
$$r^2 = \frac{[f(A_1B_1) - f(A_1)f(B_1)]^2}{f(A_1)f(A_2)f(B_1)f(B_2)}$$
$r^2 = 0$: SNPs are independent. $r^2 = 1$: SNPs are perfectly correlated AND have the same minor allele frequency.
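The $r^2$ definition on this slide translates directly into code. The sketch below assumes two biallelic SNPs, so that $f(A_2) = 1 - f(A_1)$ and $f(B_2) = 1 - f(B_1)$; the function name and the example frequencies are illustrative.

```python
def ld_r2(f_a1b1: float, f_a1: float, f_b1: float) -> float:
    """r^2 between two biallelic SNPs from haplotype and allele frequencies."""
    f_a2 = 1 - f_a1
    f_b2 = 1 - f_b1
    denominator = f_a1 * f_a2 * f_b1 * f_b2
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = f_a1b1 - f_a1 * f_b1   # deviation from independence
    return d * d / denominator


# Independent SNPs: the haplotype frequency equals the product of allele frequencies.
r2_independent = ld_r2(f_a1b1=0.3 * 0.2, f_a1=0.3, f_b1=0.2)

# Perfectly correlated SNPs: A1 always travels with B1, so the frequencies match.
r2_perfect = ld_r2(f_a1b1=0.3, f_a1=0.3, f_b1=0.3)

print(r2_independent, r2_perfect)
```

The second case also shows why $r^2 = 1$ requires equal allele frequencies: if $f(A_1)$ and $f(B_1)$ differed, the haplotype frequency could not equal both.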
35. Indirect Candidate Gene Association Study: Using LD to pick "tagSNPs". CRP, European-descent: 10 SNPs with MAF >5%. CRP, European-descent: 4 tagSNPs at $r^2 > 0.80$.
36. Indirect Candidate Gene Association Study: "tagSNPs" are population specific. CRP, European-descent: 4 tagSNPs. CRP, African-descent: 10 tagSNPs.
37.
38.
39.
40.
41.
42. Case/Control Study Designs: For either candidate gene or whole genome. Manolio et al. Nature Reviews Genetics 7, 812-820 (October 2006)
43. Case/Control Study Designs: Pros and Cons
Study | Pros | Cons
Case/Control | Easier to collect; less expensive | Subject to bias; no risk estimates
Prospective | Risk estimates | Harder to collect; more expensive; subject to bias
For rare outcomes, a case/control design may be the only option.
44.
45.
46.
47. Analysis Methods: The Case/Control Study
Allele | Case | Control
Minor allele | A | B
Major allele | C | D
Odds ratio (OR) = ratio of the odds of the minor allele in cases (A/C) to the odds in controls (B/D): $\text{OR} = \dfrac{A/C}{B/D} = \dfrac{A \times D}{B \times C}$
48. Analysis Methods: For genotypes, set the genotype homozygous for the major allele (AA) as the "referent" genotype, and calculate 2 odds ratios (the cell labels A, B, C, D are counts, distinct from the allele A):
Genotype | Case | Control
Aa | A | B
AA | C | D
Genotype | Case | Control
aa | A | B
AA | C | D
49. Analysis Methods: Case/control: Interpretation of Odds Ratio
OR = 1.0: Referent
OR > 1.0: Greater odds of disease compared with controls
OR < 1.0: Lesser odds of disease compared with controls
Confidence intervals: probably contain the true OR
OR does not measure risk*
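A small sketch of the odds ratio from a 2x2 table, with a confidence interval, may help with the interpretation above. The slides do not say how the interval is computed; the log-based (Woolf) interval used here is an assumption, and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(b*c) with an approximate 95% CI.

    Cells follow the slide's layout: a = minor allele in cases,
    b = minor allele in controls, c = major allele in cases,
    d = major allele in controls.
    Assumption: Woolf's log-based interval, which the slides do not specify.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - z * se_log_or)
    upper = exp(log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Hypothetical allele counts (illustration only).
estimate, lower, upper = odds_ratio_ci(a=60, b=40, c=140, d=160)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1.0 is usually read as evidence that the odds differ from the referent. The same function applies to the genotype tables on slide 48 by passing the genotype counts in place of allele counts.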
50.
51. Analysis Methods: Prospective cohort
Group | Case | Control | Total
Exposed | A | B | (A+B)
Unexposed | C | D | (C+D)
Risk Ratio (RR) = incidence of disease in the exposed, A/(A+B), divided by incidence in the unexposed, C/(C+D): $\text{RR} = \dfrac{A/(A+B)}{C/(C+D)}$
52. Analysis Methods: Prospective Study: Interpretation of Risk Ratio
RR = 1.0: Referent
RR > 1.0: Risk for disease increases
RR < 1.0: Risk for disease decreases
Confidence intervals: probably contain the true RR
*For rare diseases, OR ~ RR
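The footnote that OR approximates RR for rare diseases can be seen numerically. The sketch below computes both measures from the same cohort-style 2x2 table; all counts are invented for illustration, and the function names are hypothetical.

```python
def risk_ratio_2x2(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)] for exposed (a, b) and unexposed (c, d)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio_2x2(a, b, c, d):
    """OR = (a * d) / (b * c) from the same table."""
    return (a * d) / (b * c)


# Rare disease (hypothetical): 20 vs 10 cases per 10,000 people.
rare = (20, 9980, 10, 9990)
rr_rare = risk_ratio_2x2(*rare)
or_rare = odds_ratio_2x2(*rare)

# Common disease (hypothetical): 600 vs 300 cases per 1,000 people.
common = (600, 400, 300, 700)
rr_common = risk_ratio_2x2(*common)
or_common = odds_ratio_2x2(*common)

print(f"rare:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"common: RR = {rr_common:.3f}, OR = {or_common:.3f}")
```

In the rare-disease table the two measures nearly coincide, while in the common-disease table the OR is noticeably further from 1 than the RR.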
53. Analysis Methods: Case/control: Matching. Match on age, gender, race. Warning: Can "over match" and miss describing an interesting factor. Bad example: Cases: Adults with heart disease. Controls: Newborns without heart disease.
54. Analysis Methods: Case/control: Stratifying. Stratify by age, gender, race. Warning: Need sufficient sample size to stratify, that is, to split the data (for example, into males and females). Example: Cases with heart disease; age-matched controls without heart disease (exposure: smoking status); stratify for gender-specific risks.
55.
56.
57.
58.
59. Analysis Methods: Coding Genotypes
Genotype | Dominant | Additive | Recessive
GG | 0 | 0 | 0
AG | 1 | 1 | 0
AA | 1 | 2 | 1
Genotype can be re-coded in any number of ways for regression analysis
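The three coding schemes in the table can be produced by counting copies of the coded allele. This is a minimal sketch; the function name is illustrative, and it assumes genotypes are given as two-letter strings with A as the coded allele, as in the slide's example.

```python
def code_genotype(genotype: str, coded_allele: str = "A") -> dict:
    """Return dominant, additive, and recessive codes for one genotype."""
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    copies = genotype.count(coded_allele)    # 0, 1, or 2 copies
    return {
        "dominant": int(copies >= 1),        # at least one copy
        "additive": copies,                  # number of copies
        "recessive": int(copies == 2),       # two copies required
    }


for genotype in ("GG", "AG", "AA"):
    print(genotype, code_genotype(genotype))
```

The resulting codes can be used as a single predictor column in a regression model, one column per genetic model being tested.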
64. Analysis Methods: Whole genome in PLINK (pngu.mgh.harvard.edu/~purcell/plink/). Can adjust for population stratification. Can add covariates. Genome-wide significance: $P < 5 \times 10^{-8}$. Plenge et al 2007 NEJM. [Figure: plot with the $P = 5 \times 10^{-8}$ threshold marked; annotations: $P < 1 \times 10^{-100}$, MHC removed, $P < 2 \times 10^{-11}$.]
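The slide gives the genome-wide significance threshold without a derivation. One common way to motivate it, offered here as an assumption rather than something stated in the talk, is a Bonferroni correction of a 0.05 significance level for roughly one million independent tests. The helper name below is hypothetical.

```python
ALPHA = 0.05
N_INDEPENDENT_TESTS = 1_000_000   # assumption: approximate count, not from the slide

# Bonferroni-corrected per-test threshold.
genome_wide_threshold = ALPHA / N_INDEPENDENT_TESTS


def is_genome_wide_significant(p_value: float,
                               threshold: float = genome_wide_threshold) -> bool:
    """True when a single-SNP P value passes the corrected threshold."""
    return p_value < threshold


print(genome_wide_threshold)
print(is_genome_wide_significant(2e-11), is_genome_wide_significant(1e-6))
```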
65.
66.
67.
68. Statistical Replication: CRP SNPs and CRP levels in NHANES III. Crawford et al Circulation 2006; 114:2458-2465. Carlson et al. AJHG 2005; 77:64-77. Results consistent with CARDIA.