
10 Liu, Dajiang



Published in: Technology

  1. 1. Statistical Genetics Using Sequence Data Dajiang J. Liu Department of Statistics
  2. 2. Why We Study Statistical Genetics <ul><li>Statistics originated in genetics </li></ul><ul><li>R.A. Fisher: “The Correlation Between Relatives on the Supposition of Mendelian Inheritance” </li></ul><ul><ul><li>Introduced the concept of variance in this article </li></ul></ul><ul><li>Francis Galton: regression of human height toward the mean </li></ul><ul><ul><li>Introduced correlation and regression </li></ul></ul><ul><li>Karl Pearson: </li></ul><ul><ul><li>“Mendelism and the Problem of Mental Defect” </li></ul></ul><ul><ul><li>“Tuberculosis, Heredity and Environment” </li></ul></ul><ul><li>Why don’t we seek our roots? </li></ul><ul><li>To find disease genes in the genome, statistics is a must </li></ul>
  3. 3. Statistical Genetics <ul><li>Disease gene mapping: </li></ul><ul><ul><li>The determination of the sequence of genes and their relative distances from one another on a specific chromosome </li></ul></ul><ul><ul><li>Technology-driven field: </li></ul></ul><ul><ul><li>Mendel’s era: segregation analysis </li></ul></ul><ul><ul><li>- Patience: peas, fruit flies; inbreeding is necessary </li></ul></ul>Experimental Design
  4. 4. Statistical Genetics <ul><li>Modern era: </li></ul><ul><ul><li>Microsatellite markers: </li></ul></ul><ul><ul><ul><li>Genetic linkage analysis </li></ul></ul></ul><ul><ul><ul><ul><li>Extremely successful for mapping and identifying Mendelian traits </li></ul></ul></ul></ul><ul><ul><li>Single nucleotide polymorphism (SNP) markers: </li></ul></ul><ul><ul><ul><li>Case-control studies: </li></ul></ul></ul><ul><ul><ul><ul><li>Genome-wide association studies: to identify common variants involved in complex traits </li></ul></ul></ul></ul>Computational techniques for likelihood in pedigrees. Statistics plays a major role.
  5. 5. Statistical Genetics <ul><li>Sequencing era: </li></ul><ul><li>Study of diseases due to rare variants is emerging </li></ul>ABI SOLiD sequencer. Statistics is ALL for sequencing data.
  6. 6. Statistical Genetics <ul><li>Data we work with </li></ul>Human Genome Project, HapMap Project, 1000 Genomes Project
  7. 7. Multifactorial Disease Etiology Hypothesis <ul><li>Common Disease Common Variants (CD/CV) hypothesis: </li></ul><ul><ul><li>Common diseases are caused by a few common variants with moderate effect </li></ul></ul><ul><ul><li>E.g. age-related macular degeneration </li></ul></ul><ul><li>Common variants are likely to have lower odds ratios than rare variants </li></ul>
  8. 8. Multifactorial Disease Etiology Hypothesis <ul><li>Common Disease Rare Variants hypothesis: </li></ul><ul><ul><li>Common diseases are caused by multiple rare variants with large effect sizes </li></ul></ul><ul><ul><li>The discovery of rare variants will have a high impact on public health, since they will aid in risk prediction and treatment </li></ul></ul><ul><ul><ul><li>E.g. multiple rare alleles contribute to low plasma levels of HDL cholesterol </li></ul></ul></ul><ul><ul><ul><li>E.g. colorectal adenomas </li></ul></ul></ul>
  9. 9. Challenges for Statistical Methodologies <ul><li>Variant misclassification: </li></ul><ul><ul><li>Non-causal variants included: </li></ul></ul><ul><ul><ul><li>Huge number of mutations in the genome: </li></ul></ul></ul><ul><ul><ul><ul><li>Most of them do not cause the disease under study </li></ul></ul></ul></ul><ul><ul><li>Causal variants excluded: </li></ul></ul><ul><ul><ul><li>Intronic mutations </li></ul></ul></ul><ul><ul><ul><li>Intergenic regions </li></ul></ul></ul><ul><li>Unknown patterns of interactions: </li></ul><ul><ul><li>Within-gene interactions: e.g. Hirschsprung’s disease (RET gene) </li></ul></ul><ul><ul><li>Gene x gene interactions: e.g. breast cancer genes (BRCA1, BRCA2 x CHEK2) </li></ul></ul><ul><ul><li>Adaptive methods are needed </li></ul></ul>
  10. 10. Kernel-Based Adaptive Clustering <ul><li>Combines variant classification and association testing in one coherent framework </li></ul><ul><li>Applicable to population-based case/control studies using unrelated individuals </li></ul><ul><li>Robust against variant misclassification </li></ul><ul><li>Can handle gene x gene interactions and gene x environment interactions </li></ul>
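
To make the case/control setting on the slides concrete, here is a minimal sketch of a simple rare-variant collapsing test. It is not the Kernel Based Adaptive Clustering method itself, whose details are not given in this transcript. It only illustrates the baseline idea of pooling rare variants in a gene and comparing carrier frequencies between cases and controls with an odds ratio and a one-sided Fisher exact test. The function names, the genotype encoding (minor-allele counts per site), and the toy data are all assumptions made for illustration.

```python
from math import comb


def count_carriers(genotypes):
    """Count individuals carrying at least one rare allele.

    `genotypes` is a list with one entry per individual; each entry is a
    list of minor-allele counts at the rare sites of one gene (assumed encoding).
    """
    return sum(1 for sites in genotypes if any(count > 0 for count in sites))


def odds_ratio(case_carriers, n_cases, control_carriers, n_controls):
    """Carrier odds ratio, adding 0.5 to every cell to avoid division by zero."""
    a = case_carriers + 0.5
    b = n_cases - case_carriers + 0.5
    c = control_carriers + 0.5
    d = n_controls - control_carriers + 0.5
    return (a * d) / (b * c)


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """P(at least `case_carriers` carriers fall among cases) under the
    hypergeometric null of no association."""
    total_carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(total_carriers, n_cases)
    favorable = sum(
        comb(n_cases, k) * comb(n_controls, total_carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return favorable / comb(total, total_carriers)


# Toy data (hypothetical): 5 cases and 5 controls, 3 rare sites in one gene.
cases = [[0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 1], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 1, 0]]

case_carriers = count_carriers(cases)
control_carriers = count_carriers(controls)
or_estimate = odds_ratio(case_carriers, len(cases), control_carriers, len(controls))
p_value = fisher_one_sided(case_carriers, len(cases), control_carriers, len(controls))

print(f"carriers: {case_carriers} cases vs {control_carriers} controls")
print(f"odds ratio (0.5-corrected): {or_estimate:.2f}, one-sided p: {p_value:.4f}")
```

A collapsing test like this treats every rare variant as equally likely to be causal, which is exactly where the misclassification problem on slide 9 bites: non-causal variants dilute the signal. Adaptive methods such as the one named on slide 10 aim to weight variants or genotype patterns instead of pooling them uniformly.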