Pathway talk for IGES 2009 Hawaii

Published in: Technology, Health & Medicine

  1. 1. Using pathways to discover complex disease models
     Gary Chen, Duncan Thomas
     Department of Preventive Medicine, USC
     October 20, 2009
     Outline: 1. Motivation; 2. A stochastic search variable selection algorithm; 3. Example using candidate genes; 4. Ideas for GWAS
  2. 2. An outline
     1. Motivation
     2. A stochastic search variable selection algorithm
     3. Example using candidate genes
     4. Ideas for GWAS
  4. 4. Common diseases have complex etiology
     GWAS have had great success in searching for genetic variants for common diseases.
     Recent successes: AMD, BMI/obesity, type 2 diabetes, breast cancer, prostate cancer.
     Marginal effects from single-SNP analyses do not explain all heritability.
     Can we move beyond the low-hanging fruit?
  6. 6. Use biological knowledge to help search for disease models
     Hierarchical modeling: stabilizes effect estimates β from an association test by assuming they come from a prior distribution derived from biological data.
     Examples in genetic epidemiology:
       Model selection: Conti et al. (Hum Hered, 2003), Baurley et al. (Stat Med, in review)
       GWAS: Lewinger et al. (Genet Epidemiol 2007), Chen and Witte (AJHG 2007)
       Review: Thomas et al. (Hum Genomics 2009)
  9. 9. Searching for independent main effects and their interactions
     Ideally, fit all predictors in a single model if N > P.
     Model selection, e.g. stepwise regression:
       P-values can be anti-conservative because they do not adjust for the number of tests.
       Can be computationally intractable.
     An alternative: Bayesian model averaging
       Probabilistically propose sub-models from a posterior distribution.
       Summary statistics of parameters are averaged across all proposed models.
       Appears to better control for multiple comparisons.
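As a rough illustration of the model-averaging idea on this slide, the sketch below summarizes a set of sampled sub-models by computing posterior inclusion probabilities and model-averaged coefficients. The sample format (one dict per iteration, mapping each included variable to its coefficient), the variable names, and the function name `model_average` are assumptions made for illustration; they are not taken from the talk.

```python
from collections import defaultdict


def model_average(samples):
    """Summarize sampled sub-models.

    samples: list of dicts, one per sampler iteration, mapping each variable
    included in that iteration's sub-model to its coefficient.
    Returns (inclusion_prob, averaged_beta), both dicts keyed by variable.
    """
    n = len(samples)
    counts = defaultdict(int)
    sums = defaultdict(float)
    for model in samples:
        for var, beta in model.items():
            counts[var] += 1
            sums[var] += beta
    # Fraction of sampled sub-models that contain each variable.
    inclusion_prob = {v: counts[v] / n for v in counts}
    # A variable absent from a sub-model contributes beta = 0 to the average.
    averaged_beta = {v: sums[v] / n for v in sums}
    return inclusion_prob, averaged_beta


# Hypothetical samples from four iterations.
samples = [
    {"G1": 0.5, "E1": 0.2},
    {"G1": 0.7},
    {"G1": 0.6, "G1xE1": 0.4},
    {"E1": 0.2},
]
inclusion_prob, averaged_beta = model_average(samples)
```

Averaging over all iterations, rather than only those that include a variable, shrinks the coefficients of rarely selected variables toward zero.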
  11. 11. The model form: a two-level hierarchical model
     First level: a linear model
       $\operatorname{logit} P(Y = 1 \mid \beta, X) \sim \beta_0 + \sum_{k=1}^{K} \beta_k X_k$
       X can be G, E, GxG, GxE, etc.
     Second level: a mixture prior on each β_k of univariate Gaussians
       $\beta_k \sim N\left(\phi \bar{\beta}_k + (1 - \phi) \pi^T Z_k,\ \phi \frac{\tau^2}{\mathrm{adj}_k} + (1 - \phi) \sigma^2\right)$
       1st component: neighborhood of gene k
       2nd component: pathway info on gene k
  12. 12. How the parameters fit together
     $\beta_k \sim N\left(\phi \bar{\beta}_k + (1 - \phi) \pi^T Z_k,\ \phi \frac{\tau^2}{\mathrm{adj}_k} + (1 - \phi) \sigma^2\right)$
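To make the roles of the hyperparameters concrete, here is a minimal sketch that evaluates the prior mean and variance from the second-level formula for a single effect k. The function name `prior_moments`, the argument names, and the numeric values are illustrative assumptions; only the formula itself comes from the slides.

```python
import numpy as np


def prior_moments(phi, beta_bar_k, pi, z_k, tau2, adj_k, sigma2):
    """Prior mean and variance of beta_k under the second-level model.

    phi        weight on the neighborhood (adjacency) component, in [0, 1]
    beta_bar_k mean of the effects of the neighbors of k
    pi, z_k    pathway coefficients and the row of Z for effect k
    tau2       neighborhood variance, divided by the neighbor count adj_k
    sigma2     variance of the pathway (Z) component
    """
    mean = phi * beta_bar_k + (1.0 - phi) * float(np.dot(pi, z_k))
    var = phi * tau2 / adj_k + (1.0 - phi) * sigma2
    return mean, var


# Hypothetical values: equal weight on both components, four neighbors.
mean, var = prior_moments(
    phi=0.5, beta_bar_k=0.4, pi=np.array([2.0]), z_k=np.array([0.3]),
    tau2=0.2, adj_k=4, sigma2=0.5,
)
```

With φ near 1 the prior is driven by the neighbors of k, and its variance shrinks as the number of neighbors grows; with φ near 0 it is driven by the pathway covariates in Z.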
  15. 15. Stochastic Search Variable Selection
     Propose a swap, addition, or deletion of a variable.
     Perform a reversible jump Metropolis-Hastings step comparing posterior probabilities (β' is the proposed state, β the current one):
       $H = \frac{P(Y = 1 \mid \beta', X)\, P(\beta' \mid Z, A, \pi, \sigma, \tau, \phi)}{P(Y = 1 \mid \beta, X)\, P(\beta \mid Z, A, \pi, \sigma, \tau, \phi)}$
     Accept the move with probability min(1, H).
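The two steps on this slide can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names are hypothetical, the log posterior values are assumed to be computed elsewhere, and the acceptance rule uses only the posterior ratio H shown on the slide (a full reversible jump sampler would also account for proposal probabilities when they are not symmetric).

```python
import math
import random


def propose_move(included, candidates, rng):
    """Propose an addition, deletion, or swap of one variable.

    included: set of variables currently in the sub-model.
    candidates: non-empty list of all variables that may enter the model.
    Returns (move_name, proposed_set).
    """
    included = set(included)
    excluded = [v for v in candidates if v not in included]
    moves = []
    if excluded:
        moves.append("add")
    if included:
        moves.append("delete")
    if included and excluded:
        moves.append("swap")
    move = rng.choice(moves)
    proposal = set(included)
    if move in ("delete", "swap"):
        proposal.remove(rng.choice(sorted(proposal)))
    if move in ("add", "swap"):
        proposal.add(rng.choice(excluded))
    return move, proposal


def accept_move(log_post_current, log_post_proposed, rng):
    """Accept with probability min(1, H), where log H is the difference
    between the proposed and current log posterior values."""
    log_h = log_post_proposed - log_post_current
    return log_h >= 0 or rng.random() < math.exp(log_h)


rng = random.Random(0)
move, proposal = propose_move({"G1"}, ["G1", "G2", "G3"], rng)
```

Working on the log scale avoids numerical underflow when the likelihood is a product over thousands of individuals.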
  17. 17. Folate pathway
     Reed et al., J Nutr. 2006 Oct;136(10):2653-61
  20. 20. Simulated data set
     Simulated data for 4000 individuals
       14 genes, 2 environmental variables
       Pathway enzymes: genotype-specific rates
     Simulating disease status
       Assign homocysteine as causal mechanism
       'Run' the pathway until steady state
       Probabilistically assign disease status conditional on metabolite concentration
     Priors
       Deposit half the genotypes into prior database
       Z matrix, causal metabolite(s): correlation of prior genotypes to candidate metabolite
       A matrix, network information: correlation of correlation profiles between two effects
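One possible reading of how the Z and A matrices are built from the prior database is sketched below. The interpretation of a "correlation profile" as the vector of correlations between one genotype and every metabolite is an assumption, as are the simulated data, the dimensions, and the function name; the talk does not give these details.

```python
import numpy as np


def correlation_profiles(genotypes, metabolites):
    """Pearson correlation of every genotype column with every metabolite
    column. Returns an array of shape (n_genotypes, n_metabolites)."""
    g = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    m = (metabolites - metabolites.mean(axis=0)) / metabolites.std(axis=0)
    return g.T @ m / genotypes.shape[0]


# Hypothetical prior database: 500 individuals, 4 genotypes, 6 metabolites.
rng = np.random.default_rng(0)
n, p, n_met = 500, 4, 6
genotypes = rng.integers(0, 3, size=(n, p)).astype(float)
metabolites = rng.normal(size=(n, n_met))
metabolites[:, 0] += 0.8 * genotypes[:, 0]  # assumed causal link for the demo

profiles = correlation_profiles(genotypes, metabolites)
Z = profiles[:, [0]]       # correlation with the candidate causal metabolite
A = np.corrcoef(profiles)  # correlation between correlation profiles
```

Under this reading, two effects are close in A when they perturb the metabolites in similar ways, even if neither is strongly correlated with the causal metabolite itself.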
  21. 21. Setting up the priors
  22. 22. Comparison
     Same interactions detected. Z matrix provides support.
  25. 25. Sensitivity analysis
     How does our prior on β affect posterior inference?
     Compare four special cases of the prior density:
       $\beta_k^{\text{prior}} \sim N\left(\phi \bar{\beta}_k + (1 - \phi) \pi^T Z_k,\ \phi \frac{\tau^2}{n_k} + (1 - \phi) \sigma^2\right)$
       1. Non-informative: constrain φ = 0, π = 0
       2. Z matrix: constrain φ = 0
       3. Adjacency info: constrain π = 0
       4. Z matrix and adjacency info: no constraints
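Substituting each constraint into the prior density gives the four special cases explicitly. This is a direct worked substitution using the slide's own symbols, shown here only to make the comparison easier to follow:

```latex
\begin{align*}
\text{1. } \phi = 0,\ \pi = 0: &\quad \beta_k^{\text{prior}} \sim N\left(0,\ \sigma^2\right) \\
\text{2. } \phi = 0: &\quad \beta_k^{\text{prior}} \sim N\left(\pi^T Z_k,\ \sigma^2\right) \\
\text{3. } \pi = 0: &\quad \beta_k^{\text{prior}} \sim N\left(\phi \bar{\beta}_k,\ \phi \frac{\tau^2}{n_k} + (1 - \phi) \sigma^2\right) \\
\text{4. no constraints}: &\quad \beta_k^{\text{prior}} \sim N\left(\phi \bar{\beta}_k + (1 - \phi) \pi^T Z_k,\ \phi \frac{\tau^2}{n_k} + (1 - \phi) \sigma^2\right)
\end{align*}
```

In cases 1 and 2 the term containing τ² vanishes, so τ² does not enter the prior at all when φ = 0.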
  26. 26. Model-averaged estimates of hyperparameters
     Results
       A prior incorporating only the information in the Z matrix appeared to explain residual variation better than the adjacency-only prior.
       π estimated at 1.86, consistent with simulated effect size.

       Scenario          σ̂²       τ̂²    φ̂
       Non-informative   .48      N/A   0
       Z matrix          .00459   N/A   0
       Adjacency         .48      .22   .56
       Z mat + Adj       .00731   .23   .05
  27. 27. Comparison among several priors
  28. 28. Summary of simulated example
     Biomarker data incorporated as priors
       Intermediate phenotypes believed to be causal in Z (mean) matrix
       Global-level pathway information encoded in A (adjacency) matrix
     Influence of prior estimated by observed data through π, τ, σ, φ
     Informative priors provided additional support for causal genes
  31. 31. Can be applied in genome-wide association study
     Proof of concept: GWAS of breast cancer
       2000 cases, 2000 controls, ∼1M SNPs
       Top SNP from each of 2755 genes, p < .05 from GWAS
     Gene Ontology used to define adjacency matrix and proposal kernel
       Considered the 22 GO terms under Biological Process (Level 3)
       A pair of SNPs is considered neighbors if they share at least one GO term
       Define a proposal density for new variable V_i as: $Q(V_i) = I(A_{ij,\, i \neq j} \neq 0)$
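The neighbor rule and the proposal kernel can be sketched as below. This follows one reading of the slide, namely that a new variable is proposed only if it is adjacent to a variable already in the model; that reading, the uniform choice among eligible neighbors, the function names, and the placeholder term labels are all assumptions for illustration.

```python
import numpy as np


def go_adjacency(go_terms):
    """Build a symmetric 0/1 adjacency matrix from annotations.

    go_terms: list of sets of term labels, one set per SNP.
    A[i, j] = 1 when i != j and SNPs i and j share at least one term.
    """
    n = len(go_terms)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            if go_terms[i] & go_terms[j]:
                A[i, j] = A[j, i] = 1
    return A


def propose_neighbor(A, included, rng):
    """Pick, uniformly at random, a variable outside the model that is
    adjacent to at least one included variable. Returns None if none exists."""
    idx = np.array(sorted(included), dtype=int)
    is_neighbor = A[idx].sum(axis=0) > 0
    is_neighbor[idx] = False
    candidates = np.flatnonzero(is_neighbor)
    return int(rng.choice(candidates)) if candidates.size else None


# Placeholder term labels, not real GO identifiers.
go_terms = [{"term_a", "term_b"}, {"term_b"}, {"term_c"}, {"term_c", "term_a"}]
A = go_adjacency(go_terms)
rng = np.random.default_rng(0)
new_var = propose_neighbor(A, {1}, rng)
```

Restricting proposals to annotated neighbors is what steers the search toward sub-models whose variables share annotation.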
  33. 33. Analysis
     Stepwise regression:
       Considered only first 100 SNPs
       Retained 83/100 SNPs
       Intractable for 2nd-order interactions
     Our proposed algorithm:
       Low posterior probability for interactions
       Most sub-models contained variables with shared annotation
  34. 34. Sensitivity analysis
     Compare non-informative prior to one using GO terms in A
       1. Non-informative: constrain φ = 0
       2. Adjacency info: no constraint on φ

       Scenario          σ̂²    τ̂²      φ̂
       Non-informative   .01   N/A     0
       Adjacency         .01   .0004   .86
  35. 35. Posterior inference
  36. 36. Scaling up to larger sub-models
     Need to test larger sub-models in GWAS settings
     Partition models into sub-models using ontology info
     Parallel processing: nodes fit sub-models
     A parallelized MCMC algorithm - Poster 190
  37. 37. Logical topology of sub-models
  38. 38. Hierarchical model
  40. 40. Summary for GWAS example
     External knowledge can be informative
       MLEs of β are smoothed towards pathway means
       Ontologies useful: WECARE study in breast cancer - Poster 189
       For GWAS: genome-wide expression potentially more biologically informative in Z matrix
       Priors can guide towards biologically relevant interactions
     Computational efficiency essential:
       Defining proposal kernel: e.g. expit(π^T Z)
       More parsimonious sub-models desirable (e.g. fused LASSO)
       Fisher scoring can be improved using parallel code (e.g. GPUs)
  41. 41. Acknowledgements
     James Baurley
     David Conti
     Dataset: African American Breast Cancer GWAS Collaborators
     Funding: R01 ES016813
