Epi519 Gwas Talk

A lecture for UW EPI 519 providing background for genome-wide association studies, a few examples of recent papers in the CVD GWAS literature, and some lessons and new directions. The talk was originally given in 2008 (in collaboration with a colleague); this version has been updated slightly for 2010 and includes references for further reading.

Some of the typefaces may have been mangled on conversion; the file download should be more reliable.


  1. 1. Genome-wide Association Studies. EPI 519, 3 November 2009. Joshua C. Bis, PhD, University of Washington, Cardiovascular Health Research Unit. (The Type 1 Diabetes Genetics Consortium. Nature Genetics, 2009 May 10)
  2. 2. genome-wide publication epidemic genome.gov/GWAStudies :: 2 November 2009
  3. 3. Manolio et al. J. Clin. Invest. 118:1590-1605 (2008).
  4. 4. rationale for association studies Balding. Nature Reviews Genetics. 2006; 7:781-791
  5. 5. candidate genes Manolio, Boerwinkle, O’Donnell, Wilson. Arterioscler Thromb Vasc Biol. 2004;24:1567-1577.
  6. 6. highly consistent associations (out of 600 gene–disease studies) Hirschhorn et al. Genet Med. 2002;4(2):45-61
  7. 7. “genomics”: The field within genetics concerned with the structure and function of the entire DNA sequence of an individual or population. -- Thomas Roderick, McDonald’s Raw Bar, 1986
  8. 8. genome-wide association study “… a study of common genetic variation across the entire human genome designed to identify genetic associations with observable traits.” -- National Institutes of Health, “Policy for sharing of data obtained in NIH-sponsored or conducted GWAS”
  9. 9. “A major strength of the genome-wide approach … has been its freedom from reliance on prior knowledge.” -- “A HapMap harvest of insights into the genetics of common disease” (Manolio, Brooks, Collins.)
  10. 10. Modified from http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpsnpfaq
  11. 11. haplotypes The International HapMap Consortium. Nature | Vol 437 | 27 October 2005
  12. 12. “… to create a public, genome-wide database of common human sequence variation, providing information needed as a guide to genetic studies of clinical phenotypes.” -- October 2002
  13. 13. Ben Fry, for Genome Research. November 2005
  14. 14. study design considerations • Case-control or cohort • Sample size • Phenotype definition • Comparability of cases and controls • Genotyping quality • Population substructure
  15. 15. (McCarthy et al., Nature Reviews Genetics, May 2008)
  16. 16. genotyping
  17. 17. genotyping
  18. 18. genotyping: raw data Ratio of intensities from two channels Calls = 2461 No calls = 27
  19. 19. analysis × 2.5 million
  20. 20. association study [Figure: grid of individual genotypes (CC, CT, TT) at one SNP, shown separately for controls and cases.] Odds ratio for C allele: 1.35, p = 6.3 × 10^-7
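The association-study slide reports an odds ratio and p-value for the C allele. As a minimal sketch of how such numbers can be obtained, the code below collapses genotype counts into allele counts and runs a 2×2 allelic chi-square test. The genotype counts are hypothetical (the per-person genotypes on the slide are not tallied here), and the function names are illustrative only; the slide does not say which test the author used.

```python
import math

def allele_counts(cc, ct, tt):
    """Return (C alleles, T alleles) from genotype counts."""
    return 2 * cc + ct, 2 * tt + ct

def allelic_test(case_genotypes, control_genotypes):
    """Odds ratio for the C allele, 1-df chi-square statistic, and p-value."""
    a, b = allele_counts(*case_genotypes)      # cases: C, T
    c, d = allele_counts(*control_genotypes)   # controls: C, T
    odds_ratio = (a * d) / (b * c)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical (CC, CT, TT) counts, NOT the data shown on the slide.
cases = (270, 500, 230)
controls = (200, 500, 300)
odds_ratio, chi2, p_value = allelic_test(cases, controls)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

The allelic test treats each chromosome as an independent observation, which is one common simplification; a genotype-based model such as logistic regression would be an alternative.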
  21. 21. Manhattan plot (McCarthy et al., Nature Reviews Genetics, May 2008)
  22. 22. p-value: the probability of seeing your data or more extreme data if the null hypothesis is true. By chance, with 1,000,000 statistical tests: • a threshold of p = 0.05 would show 50,000 “significant” associations; 360 cases : 360 controls • a threshold of p = 0.05/1,000,000 (5 × 10^-8) would show 0.05 “significant” associations; 1590 cases : 1590 controls.
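The arithmetic on the p-value slide can be checked directly: under the null hypothesis, the expected number of tests passing a threshold is the number of tests times the threshold. The short sketch below reproduces the two figures quoted there; the function names are illustrative, and the tests are assumed independent for simplicity.

```python
def expected_false_positives(n_tests, threshold):
    """Expected count of null tests with p below the threshold."""
    return n_tests * threshold

def bonferroni_threshold(family_alpha, n_tests):
    """Per-test threshold that keeps the expected false positives at family_alpha."""
    return family_alpha / n_tests

n_tests = 1_000_000
nominal = expected_false_positives(n_tests, 0.05)
genome_wide = bonferroni_threshold(0.05, n_tests)
corrected = expected_false_positives(n_tests, genome_wide)
print(nominal, genome_wide, corrected)
```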
  23. 23. population stratification requires both allele frequency and disease prevalence differences Balding. Nature Reviews Genetics. 2006; 7:781-791
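To illustrate why population stratification needs both conditions, the sketch below pools two hypothetical subpopulations. Within each one the C allele is unrelated to disease (odds ratio 1), yet the pooled odds ratio is inflated when allele frequency and prevalence both differ. All counts are invented for illustration.

```python
def odds_ratio(case_c, case_t, control_c, control_t):
    """Odds ratio for the C allele from allele counts."""
    return (case_c * control_t) / (case_t * control_c)

def pooled(*strata):
    """Add allele counts across subpopulations, ignoring ancestry."""
    return tuple(sum(column) for column in zip(*strata))

# Hypothetical allele counts: (case C, case T, control C, control T).
pop_a = (480, 320, 120, 80)            # C frequency 0.6, many cases
pop_b = (40, 160, 160, 640)            # C frequency 0.2, few cases
pop_b_same_freq = (120, 80, 480, 320)  # C frequency 0.6, few cases

print(odds_ratio(*pop_a), odds_ratio(*pop_b))   # no association within either group
print(odds_ratio(*pooled(pop_a, pop_b)))        # spurious association after pooling
print(odds_ratio(*pooled(pop_a, pop_b_same_freq)))  # prevalence differs, frequency does not
```

In the last case only prevalence differs between the groups, and the pooled odds ratio stays at 1.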
  24. 24. Q-Q plots (modified from McCarthy et al., Nature Reviews Genetics, May 2008)
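A Q-Q plot compares the observed p-values, sorted, against the values expected if every test were null. The sketch below computes the plotted coordinates on the usual -log10 scale. The plotting position (rank - 0.5)/n is one common convention and an assumption here, and the p-values are simulated rather than taken from any study.

```python
import math
import random

def qq_points(p_values):
    """Return (expected, observed) -log10 p pairs, most significant first."""
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected = (rank - 0.5) / n  # expected p for this rank under the null
        points.append((-math.log10(expected), -math.log10(p)))
    return points

random.seed(519)
# Simulated null p-values in (0, 1]; 1 - random() avoids taking log of zero.
null_p = [1.0 - random.random() for _ in range(10_000)]
points = qq_points(null_p)
print(points[:3])
```

Points that rise above the diagonal across the whole range suggest systematic inflation (for example from stratification), while departures only in the upper tail are what true associations would look like.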
  25. 25. power & sample size (Rice, personal communication)
  26. 26. reasons for larger sample size: • More genotypes / tests • More genotype error or misclassification • Higher heterogeneity of association • Lower effect size • Lower frequency of risk allele • Lower correlation between marker allele and risk allele.
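Several of the reasons listed for larger sample sizes can be seen in a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions, applied to allele frequencies in equal numbers of cases and controls. It is not the calculation behind the slides; the formula choice, the 80% power default, and the example inputs are all assumptions.

```python
import math
from statistics import NormalDist

def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)

def cases_needed(control_freq, odds_ratio, alpha, power=0.8):
    """Approximate cases (with equal controls) for a two-sided allelic test."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    return math.ceil(alleles_per_group / 2)  # two alleles per person

# Hypothetical inputs: risk-allele frequency 0.3, odds ratio 1.35.
print(cases_needed(0.3, 1.35, alpha=0.05))   # one test
print(cases_needed(0.3, 1.35, alpha=5e-8))   # stricter threshold for many tests
print(cases_needed(0.3, 1.20, alpha=5e-8))   # lower effect size
print(cases_needed(0.05, 1.35, alpha=5e-8))  # lower risk-allele frequency
```

Genotype error, heterogeneity, and imperfect correlation between marker and risk allele are not modeled here; each would act like a smaller effective odds ratio.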
  27. 27. meta-analysis Combine results from several studies to increase power using traditional methods of meta-analysis.
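One traditional approach to combining studies is fixed-effect, inverse-variance weighting of per-study effect estimates. The sketch below applies it to log odds ratios for a single SNP. The slide does not name a specific method, so this choice is an assumption, and the study estimates are hypothetical.

```python
import math

def fixed_effect_meta(estimates):
    """Pool (beta, standard error) pairs with inverse-variance weights."""
    weights = [1 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1 / total_weight)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return beta, se, z, p_value

# Hypothetical per-study log odds ratios and standard errors for one SNP.
studies = [(0.30, 0.10), (0.25, 0.08), (0.35, 0.12)]
beta, se, z, p_value = fixed_effect_meta(studies)
print(f"pooled OR = {math.exp(beta):.2f}, SE = {se:.3f}, p = {p_value:.2e}")
```

The pooled standard error is smaller than that of any single study, which is where the gain in power comes from.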
  28. 28. imputation Use patterns of variation from HapMap to impute genotypes. Increases power by allowing for association testing at untyped markers and allows comparisons across studies and platforms by using a common set of SNPs. Li, Willer, Sanna, Abecasis. Annu Rev Genomics Hum Genet. 2009;10:387-406
  29. 29. “There have been few, if any, similar bursts of discovery in the history of medical research” -- “Drinking from the fire hose …” (Hunter & Kraft)
  30. 30. Published Genome-Wide Associations through 6/2009: 439 published GWA at p < 5 × 10^-8. NHGRI GWA Catalog, www.genome.gov/GWAStudies
  31. 31. Sources / References / Reading
      1. The International HapMap Consortium.* A haplotype map of the human genome. Nature, 2005. 437(7063): p. 1299-320. [16255080].
      2. The Type 1 Diabetes Genetics Consortium. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nature Genetics, 2009 May 10. [19430480].
      3. The Wellcome Trust Case Control Consortium.* Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 2007. 447(7145): p. 661-78. [17554300].
      4. Balding, D.J., A tutorial on statistical methods for population association studies. Nat Rev Genet, 2006. 7(10): p. 781-91. [16983374].
      5. Christensen, K. and J.C. Murray, What genome-wide association studies can do for medicine. N Engl J Med, 2007. 356(11): p. 1094-7. [17360987].
      6. Frazer, K.A., et al., A second generation human haplotype map of over 3.1 million SNPs. Nature, 2007. 449(7164): p. 851-61. [17943122].
      7. Hirschhorn, J.N., et al., A comprehensive review of genetic association studies. Genet Med, 2002. 4(2): p. 45-61. [11882781].
      8. Hunter, D.J. and P. Kraft, Drinking from the fire hose--statistical issues in genomewide association studies. N Engl J Med, 2007. 357(5): p. 436-9. [17634446].
      9. Li, Y., C. Willer, S. Sanna, and G. Abecasis, Genotype imputation. Annu Rev Genomics Hum Genet, 2009. 10: p. 387-406. [19715440].
      10. Manolio, T.A., et al., Genetics of ultrasonographic carotid atherosclerosis. Arterioscler Thromb Vasc Biol, 2004. 24(9): p. 1567-77. [15256397].
      11. Manolio, T.A., L.D. Brooks, and F.S. Collins, A HapMap harvest of insights into the genetics of common disease. J Clin Invest, 2008. 118(5): p. 1590-605. [18451988].
      12. McCarthy, M.I., et al., Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet, 2008. 9(5): p. 356-69. [18398418].
      13. Pearson, T.A. and T.A. Manolio, How to interpret a genome-wide association study. JAMA, 2008. 299(11): p. 1335-44. [18349094].
