10. Limitations of logistic regression
• 1. Iterative method needed to solve for the parameters (time-consuming)
– PLINK is relatively fast
– But it does not solve the full logistic equation
• 2. b3 coefficient (interaction) term
11. Limitations of logistic regression
• Models without and with interaction
• Rearrange the interaction model
• The effect of x1 on y depends on the dosage of x2
• This limitation applies to both logistic regression and log-linear models
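Assuming the usual logistic model with coefficients b0 to b3 (b3 being the interaction coefficient tested by PLINK), the two models and the rearrangement can be written as below. The notation is an assumption chosen to match the b3 and x1, x2 symbols used on these slides.

```latex
\begin{align}
\text{Without interaction:}\quad & \operatorname{logit} P(y=1) = b_0 + b_1 x_1 + b_2 x_2 \\
\text{With interaction:}\quad & \operatorname{logit} P(y=1) = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2 \\
\text{Rearranged:}\quad & \operatorname{logit} P(y=1) = b_0 + (b_1 + b_3 x_2)\, x_1 + b_2 x_2
\end{align}
```

In the rearranged form the slope of x1 is b1 + b3 x2, so it changes linearly with the dosage x2.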
12. Limitations of logistic regression
• Existing analysis approaches only allow an allele-dosage-related interaction
• i.e. x2 = (0, 1, 2) for (RR, RQ, QQ)
• QQ is assumed to have double the interaction effect of RQ
[Figure: interaction effect is allele-dose related; x2 = 0, 1, 2]
13. Huge number of tests
• 330K chips
• Will have about 330,000^2 pairwise interactions
• About 1 x 10^11 tests (about 5.4 x 10^10 if each unordered SNP pair is counted once)
• Problems
• 1. Speed
• 2. Type 1 error
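The arithmetic behind these numbers can be checked with a short sketch. The significance level of 0.05 and the use of a Bonferroni correction are assumptions for illustration, not something stated on the slide.

```python
n_snps = 330_000  # markers on the chip, as on the slide

ordered_pairs = n_snps ** 2                # the slide's 330,000^2 figure
unique_pairs = n_snps * (n_snps - 1) // 2  # each unordered SNP pair counted once

alpha = 0.05  # assumed family-wise error rate
bonferroni_threshold = alpha / unique_pairs

print(f"ordered pairs: {ordered_pairs:.2e}")
print(f"unique pairs:  {unique_pairs:.2e}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold:.1e}")
```

Even when each pair is counted once, the per-test threshold falls below 1e-12, which is why both speed and type 1 error control become problems.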
14. Early attempts
• Yu Weichuan et al. at UST
• Information theory
• Some attempts
• Limitation: too many variables and not statistically based
15. Log-linear regression
• First developed by L. A. Goodman in the 1970s
• Further extended by Bishop, Fienberg & Holland 1975; Haberman 1975
• Models the counts (number of subjects in each cell of a table)
17. Log-linear regression
• Statistical inference by comparison of nested
models
• Simplest model: no association (independent)
• Saturated model: with association
      A1    A2
B1    π11   π12
B2    π21   π22
18. Log-linear regression
• The saturated model fits the data best
• Remove terms step by step
• Ask whether the simpler (nested) model fits significantly worse
Test statistic: likelihood ratio test
Follows a chi-square distribution
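For the 2 x 2 case, the comparison between the independence model and the saturated model can be sketched as below. The counts are hypothetical, and the helper names `g_statistic` and `chi2_sf_df1` are made up for this illustration. The p-value uses the exact relation between a 1-df chi-square tail and the complementary error function.

```python
import math


def g_statistic(table):
    """Likelihood ratio statistic G^2 for independence in a 2-D count table.

    Compares the independence model (expected = row total * column total / n)
    with the saturated model (expected = observed).
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # 0 * log(0) is taken as 0
                expected = row_totals[i] * col_totals[j] / n
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


counts = [[30, 10], [20, 40]]  # hypothetical 2 x 2 counts
g2 = g_statistic(counts)
p_value = chi2_sf_df1(g2)
print(f"G^2 = {g2:.2f}, p = {p_value:.2e}")
```

A small p-value means the independence model fits significantly worse than the saturated model, so the association term is kept.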
19. Log-linear regression
• 1. Can be expanded to n dimensions
– E.g. a 3-way table
– Allows potential study of n-SNP interactions
– Here we still restrict to 2-SNP interactions
– This is a 3-way table (3 x 3 x 2)
[Figure: 3 x 3 genotype tables for CASE and CONTROL]
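Tallying subjects into the 3 x 3 x 2 table can be sketched as follows. The genotype coding (0, 1, 2), the status coding (0 = control, 1 = case) and the toy data are assumptions for illustration.

```python
def build_table(snp1, snp2, status):
    """Count subjects into a 3 x 3 x 2 table indexed [g1][g2][status].

    snp1, snp2: genotype codes 0, 1, 2 for each subject
    status:     0 = control, 1 = case
    """
    table = [[[0, 0] for _ in range(3)] for _ in range(3)]
    for g1, g2, y in zip(snp1, snp2, status):
        table[g1][g2][y] += 1
    return table


# Hypothetical toy data for six subjects.
snp1 = [0, 1, 2, 1, 1, 0]
snp2 = [0, 1, 1, 1, 2, 0]
status = [0, 1, 1, 0, 1, 0]

table = build_table(snp1, snp2, status)

# The two 3 x 3 slices correspond to the CASE and CONTROL panels.
case_table = [[cell[1] for cell in row] for row in table]
control_table = [[cell[0] for cell in row] for row in table]
print("cases:   ", case_table)
print("controls:", control_table)
```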
20. Log-linear regression
• 2. Equivalent to logistic regression
Similar to PLINK (test for b3)
The test compares model Ms (saturated) with MH (homogeneous association)
No need to iterate to fit the model
Approximation by a closed-form equation (available for most log-linear models)
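One closed-form approximation of this kind is the Kirkwood superposition approximation (KSA), which the BOOST paper cited later in these slides uses in its screening stage. Whether this slide refers to exactly that formula is an assumption. The sketch below computes 2(L_S - L_KSA) for a 3 x 3 x 2 table without any iteration. The function names and the toy table are made up for illustration.

```python
import math


def ksa_statistic(table):
    """2 * (L_S - L_KSA) for a count table indexed [g1][g2][y].

    L_S is the log-likelihood of the saturated model. L_KSA uses the
    Kirkwood superposition approximation, a closed-form distribution with
    no three-way interaction term, so the result is an upper bound on the
    likelihood ratio statistic for Ms versus MH.
    """
    I, J, K = len(table), len(table[0]), len(table[0][0])
    cells = [(i, j, k) for i in range(I) for j in range(J) for k in range(K)]
    n = sum(table[i][j][k] for i, j, k in cells)

    # One-way and two-way marginal counts.
    n_i = [sum(table[i][j][k] for j in range(J) for k in range(K)) for i in range(I)]
    n_j = [sum(table[i][j][k] for i in range(I) for k in range(K)) for j in range(J)]
    n_k = [sum(table[i][j][k] for i in range(I) for j in range(J)) for k in range(K)]
    n_ij = [[sum(table[i][j]) for j in range(J)] for i in range(I)]
    n_ik = [[sum(table[i][j][k] for j in range(J)) for k in range(K)] for i in range(I)]
    n_jk = [[sum(table[i][j][k] for i in range(I)) for k in range(K)] for j in range(J)]

    # Unnormalised KSA: p(g1,g2) p(g1,y) p(g2,y) / (p(g1) p(g2) p(y)).
    raw = {}
    for i, j, k in cells:
        denom = n_i[i] * n_j[j] * n_k[k]
        raw[i, j, k] = n_ij[i][j] * n_ik[i][k] * n_jk[j][k] / denom if denom else 0.0
    eta = sum(raw.values())  # normalising constant

    stat = 0.0
    for i, j, k in cells:
        observed = table[i][j][k]
        if observed > 0:  # 0 * log(0) is taken as 0
            stat += 2.0 * observed * math.log(observed * eta / (n * raw[i, j, k]))
    return stat


def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square distribution with 4 df."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


# Hypothetical table: 10 subjects per cell, with extra cases among
# double heterozygotes (g1 = 1, g2 = 1, y = 1).
toy = [[[10, 10] for _ in range(3)] for _ in range(3)]
toy[1][1][1] = 60
stat = ksa_statistic(toy)
print(f"2(L_S - L_KSA) = {stat:.2f}, chi-square(4) tail = {chi2_sf_df4(stat):.3g}")
```

Because the statistic is an upper bound, a pair that fails a threshold here would also fail the exact Ms versus MH test, so it suits screening; pairs that pass would still need the exact test.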
21. Analysis of WTCCC dataset
• Type 1 diabetes (T1D)
• Known SNP-SNP interaction in HLA loci
• HLA region shown
• Single SNP association
• Interaction analysis (log-linear) shows interacting pairs within 31 Mb to 33 Mb, p < 1e-15
22. Example of an interaction pair
• (a) Case, (b) control, (c) Odds
• Double heterozygotes have increased risk
23. BOOST
• Boolean operation-based data coding
• Log-linear model test
Wan, X., Yang, C., Yang, Q., Xue, H., Fan, X., Tang, N. L., & Yu, W. (2010).
BOOST: A fast approach to detecting gene-gene interactions in genome-wide
case-control studies. American Journal of Human Genetics, 87(3), 325–340.
https://doi.org/10.1016/j.ajhg.2010.07.021
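The idea of Boolean data coding can be sketched as follows: each SNP is stored as one bit string per genotype and class, and a contingency-table cell is the number of set bits in the AND of two strings. Python integers stand in for machine words here, and the function names and toy data are assumptions for illustration rather than the BOOST implementation.

```python
def encode(genotypes, status):
    """Encode one SNP as bitsets.

    bits[y][g] has bit s set when subject s has class y
    (0 = control, 1 = case) and genotype g (0, 1, 2).
    """
    bits = [[0, 0, 0], [0, 0, 0]]
    for s, (g, y) in enumerate(zip(genotypes, status)):
        bits[y][g] |= 1 << s
    return bits


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def contingency_table(bits1, bits2):
    """3 x 3 x 2 counts indexed [g1][g2][y], via bitwise AND plus bit count."""
    return [[[popcount(bits1[y][g1] & bits2[y][g2]) for y in (0, 1)]
             for g2 in range(3)]
            for g1 in range(3)]


# Hypothetical toy data for six subjects.
snp1 = [0, 1, 2, 1, 1, 0]
snp2 = [0, 1, 1, 1, 2, 0]
status = [0, 1, 1, 0, 1, 0]

table = contingency_table(encode(snp1, status), encode(snp2, status))
print(table)
```

Each SNP is encoded once and reused for every pair it takes part in, so the per-pair cost reduces to a handful of AND and bit-count operations.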
24. • AJHG paper: narrow-sense interaction
• Broad-sense Interaction
• Ms – Mp