Generalized Regional Admixture Mapping (RAM) and Structured Association Testing (SAT) David T. Redden,  Associate Professo...
Acknowledgements <ul><li>Jose Fernandez </li></ul><ul><li>Jasmin Divers </li></ul><ul><li>Kelly Vaughan </li></ul><ul><li>...
The problem and the promise <ul><li>Admixture, the event of two or more populations with different allele frequencies inte...
The problem and the promise <ul><li>This admixture process can, under some circumstances, create disequilibrium between pa...
The problem and the promise <ul><li>The classic example is found in Knowler et al (1988).  They reported an association be...
The Problem  G I Y I A i
Structured Association Tests <ul><li>In response to this problem, many authors have proposed a collection of methods we wi...
The promise <ul><li>Regional admixture mapping (RAM) methods use genome wide ancestry and region specific admixture estima...
The promise (Illustrated) <ul><li>Hypothetical Segment of an Admixed Individual.  </li></ul>
Objective of the Paper <ul><li>To extend both Regional Admixture Mapping (RAM) and Structured Association Tests (SAT) into...
Regional Admixture Mapping <ul><li>Using a sample of admixed individuals, estimate each individual’s ancestry as well as e...
Regional Admixture Mapping <ul><li>By comparing regional admixture estimates to an individual’s total admixture estimate, ...
Structured Association Tests <ul><li>The SAT approach seeks to test for the association of a marker and phenotype after ma...
Structured Association Tests <ul><li>All examine for association between markers and case/control status.  The methods hav...
Our Proposed Model (RAM) <ul><li>Y i  =  β 0  + β 1 A i  + β 2 (P 1i *P 2i )  </li></ul><ul><li>+ β 3 A i,j,1 + β 4 A i,j,...
Controlling for Ancestry <ul><li>Let P 1  = ancestry of parent 1 from population D </li></ul><ul><li>Let P 2  = ancestry o...
Controlling for Ancestry <ul><li>P(V i  = 0| P1 P2) = (1-P1)*(1-P2)  </li></ul><ul><li>= 1 – P1-P2 +P1*P2 </li></ul><ul><l...
One Possible Model (RAM) <ul><li>Y i  =  β 0  + β 1 A i  + β 2 (P 1i *P 2i )  </li></ul><ul><li>+ β 3 A i,j,1 + β 4 A i,j,...
Proposed Model (SAT) <ul><li>Y i  =  β 0  + β 1 A i  +  β 2 (P 1i *P 2i )  </li></ul><ul><li>+ β 3 G i,j,1 + β 4 G i,j,2  ...
Further  Issues <ul><li>Literature confuses the terms ancestry and admixture.  </li></ul><ul><li>Individual admixture (W i...
Further  Issues <ul><li>In fact W i  = A i  + u i  where u i  is a combination of measurement error and biological effect....
The Big Problem  G I Y I W I u i A i
 
 
 
If we ignore the error in estimates… <ul><li>In regression, we  assume all covariates are measured without error. </li></u...
Measurement errors  <ul><li>Some knowledge about the distribution of the measurement error is required.  </li></ul><ul><li...
 
What controls the measurement error for admixture? <ul><li>How many of markers will be used to estimate admixture? </li></...
What controls the measurement error for admixture? <ul><li>How variable is admixture/ancestry within your sample? </li></ul>
Conclusion <ul><li>The research continues… </li></ul>
Upcoming SlideShare
Loading in...5
×

Generalized Regional Admixture Mapping (RAM) and Structured ...

339

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
339
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
2
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Generalized Regional Admixture Mapping (RAM) and Structured ...

  1. 1. Generalized Regional Admixture Mapping (RAM) and Structured Association Testing (SAT) David T. Redden, Associate Professor, Department of Biostatistics, University of Alabama at Birmingham
  2. 2. Acknowledgements <ul><li>Jose Fernandez </li></ul><ul><li>Jasmin Divers </li></ul><ul><li>Kelly Vaughan </li></ul><ul><li>Solomon Musani </li></ul><ul><li>Hemant Tiwari </li></ul><ul><li>Miguel Padilla </li></ul><ul><li>Michael B. Miller </li></ul>Rui Feng Nianjun Liu Guimin Gao T. Mark Beasley Robert P. Kimberly David B. Allison
  3. 3. The problem and the promise <ul><li>Admixture, the event of two or more populations with different allele frequencies intermating, creates offspring with linkage disequilibrium that spans a greater distance than in a panmictic population. </li></ul>
  4. 4. The problem and the promise <ul><li>This admixture process can, under some circumstances, create disequilibrium between pairs of unlinked loci and thus create confounding (spurious associations, inflated false positive results) in genetic association studies between trait and marker. </li></ul>
  5. 5. The problem and the promise <ul><li>The classic example is found in Knowler et al (1988). They reported an association between an HLA haplotype and diabetes for Pima Indians. When the analysis was repeated stratifying subjects by amount of European ancestry, the observed association between HLA haplotype and diabetes was not present. </li></ul>
  6. 6. The Problem G I Y I A i
  7. 7. Structured Association Tests <ul><li>In response to this problem, many authors have proposed a collection of methods we will collectively call SAT. </li></ul>
  8. 8. The promise <ul><li>Regional admixture mapping (RAM) methods use genome wide ancestry and region specific admixture estimates to identify specific regions of the genome potentially harboring loci influencing the trait. </li></ul>
  9. 9. The promise (Illustrated) <ul><li>Hypothetical Segment of an Admixed Individual. </li></ul>
  10. 10. Objective of the Paper <ul><li>To extend both Regional Admixture Mapping (RAM) and Structured Association Tests (SAT) into a regression modeling framework. </li></ul><ul><li>This would allow for tests of dominance, allow for either continuous or dichotomous outcomes, and allow for inclusion of covariates. </li></ul>
  11. 11. Regional Admixture Mapping <ul><li>Using a sample of admixed individuals, estimate each individual’s ancestry as well as estimate the ancestry of an individual’s alleles within specific genomic regions. </li></ul><ul><li>Given the increased linkage disequilibrium in admixed populations and assuming a disease/phenotype which is more prevalent within a parental population (P d ) genomic regions exhibiting a high number of alleles from P d may harbor/be linked to causative alleles. </li></ul>
  12. 12. Regional Admixture Mapping <ul><li>By comparing regional admixture estimates to an individual’s total admixture estimate, some authors (Zhu et al 2004, Patterson et al 2004, Montana and Pritchard 2004) have recommended case only designs. </li></ul><ul><li>Other authors have recommended comparing regional admixture estimates between cases and controls. </li></ul>
  13. 13. Structured Association Tests <ul><li>The SAT approach seeks to test for the association of a marker and phenotype after making adjustments for population stratification. </li></ul><ul><li>Examples include Devlin and Roeder (1999), Pritchard et al (2001), Satten et al (2001). </li></ul>
  14. 14. Structured Association Tests <ul><li>All examine for association between markers and case/control status. The methods have not been generalized to continuous outcomes. </li></ul>
  15. 15. Our Proposed Model (RAM) <ul><li>Y i = β 0 + β 1 A i + β 2 (P 1i *P 2i ) </li></ul><ul><li>+ β 3 A i,j,1 + β 4 A i,j,2 + ε i </li></ul><ul><li>A i is the ancestry for the i th individual (to be estimated) which is (P 1i +P 2i )/2 . </li></ul><ul><li>P 1i is the ancestry of Parent 1 and P 2i is the ancestry of parent 2. </li></ul>
  16. 16. Controlling for Ancestry <ul><li>Let P 1 = ancestry of parent 1 from population D </li></ul><ul><li>Let P 2 = ancestry of parent 2 from population D </li></ul><ul><li>Let V i be the number of alleles at a random loci that their child has from population D </li></ul>
  17. 17. Controlling for Ancestry <ul><li>P(V i = 0| P1 P2) = (1-P1)*(1-P2) </li></ul><ul><li>= 1 – P1-P2 +P1*P2 </li></ul><ul><li>P(V i = 1| P1 P2) = </li></ul><ul><li>(1-P1)*P2+(1-P2)*P1 = P1+P2-2*P1*P2 </li></ul><ul><li>P(V i = 2| P1 P2) = P1*P2 </li></ul>
  18. 18. One Possible Model (RAM) <ul><li>Y i = β 0 + β 1 A i + β 2 (P 1i *P 2i ) </li></ul><ul><li>+ β 3 A i,j,1 + β 4 A i,j,2 + ε i </li></ul><ul><li>A i,j,k is a (0,1) indicator variable indicating whether the i th individual inherited k allele at the j th locus from population D. </li></ul>
  19. 19. Proposed Model (SAT) <ul><li>Y i = β 0 + β 1 A i + β 2 (P 1i *P 2i ) </li></ul><ul><li>+ β 3 G i,j,1 + β 4 G i,j,2 + ε i </li></ul><ul><li>G i,j,k is a (0,1) indicator variable indicating whether the i th individual inherited k allele at the j th locus of type M. </li></ul>
  20. 20. Further Issues <ul><li>Literature confuses the terms ancestry and admixture. </li></ul><ul><li>Individual admixture (W i ) is the proportion of alleles in an individual’s genome that an individual has from population D. </li></ul><ul><li>Ancestry (A i ) is simply the midpoint of the parental ancestries. </li></ul>
  21. 21. Further Issues <ul><li>In fact W i = A i + u i where u i is a combination of measurement error and biological effect. </li></ul><ul><li>E[W i ] = A i . </li></ul><ul><li>All software (Structure, Admixmap) we are aware of provides estimates of individual admixture, which are error contaminated estimates of ancestry. </li></ul>
  22. 22. The Big Problem G I Y I W I u i A i
  23. 26. If we ignore the error in estimates… <ul><li>In regression, we assume all covariates are measured without error. </li></ul><ul><li>The admixture estimates violate this assumption and create the possibility of residual confounding and biased estimation. </li></ul><ul><li>We are currently investigating multiple models to address these issues. </li></ul>
  24. 27. Measurement errors <ul><li>Some knowledge about the distribution of the measurement error is required. </li></ul><ul><li>There are several approaches to the measurement error problem: </li></ul><ul><ul><ul><li>SIMEX </li></ul></ul></ul><ul><ul><ul><li>Multiple Imputation </li></ul></ul></ul><ul><ul><ul><li>Regression calibration </li></ul></ul></ul><ul><li>Regression Calibration methods </li></ul><ul><ul><ul><li>Regression calibration </li></ul></ul></ul><ul><ul><ul><li>Moment reconstruction method </li></ul></ul></ul><ul><ul><ul><li>Extended Regression calibration </li></ul></ul></ul>
  25. 29. What controls the measurement error for admixture? <ul><li>How many of markers will be used to estimate admixture? </li></ul><ul><li>How informative the markers are with regard to ancestry? </li></ul><ul><li>How many individuals within your study have pure ancestry from a founding population? </li></ul>
  26. 30. What controls the measurement error for admixture? <ul><li>How variable is admixture/ancestry within your sample? </li></ul>
  27. 31. Conclusion <ul><li>The research continues… </li></ul>
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×