This document discusses correlated data structures and methods for analyzing correlated binary outcome data, specifically generalized estimating equations (GEE) and generalized linear mixed models (GLMM). It begins with examples of correlated data and an overview of GEE and GLMM. It then compares GEE and GLMM, noting that GEE makes population-level inferences while GLMM allows for individual-level inferences. The document concludes by stating that both GEE and GLMM can be applied to genome-wide association studies (GWAS) to account for genetic correlations.
GEE & GLMM in GWAS
1. Association Study: Binomial Case
GEE & GLMM
Jinseob Kim
GSPH, SNU
July 2, 2014
Jinseob Kim (GSPH, SNU) Association Study: Binomial Case July 2, 2014 1 / 45
2. Contents
1 Correlated = Not Independent
Concept
Example
2 GEE & GLMM Basic
Basic Linear Regression
GEE
GLMM
Comparison
3 GEE & GLMM in GWAS
Concepts of GWAS
Genetic Correlation
Use GEE & GLMM
4 Conclusion
4. Correlated = Not Independent
5. Correlated = Not Independent Concept
iid??
$\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, or $\epsilon \sim N(0, \sigma^2 I_n)$
Independent
Identically distributed
$\epsilon_i \sim N(0, \sigma_i^2)$
Independent
Not identically distributed
Not the same population!!
Next time..
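To make the two cases on this slide concrete, here is a minimal simulation sketch. The sample size, the seed, and the standard deviations (2 for the iid case, alternating 1 and 3 for the non-identical case) are illustrative assumptions, not values from the slides.

```python
import random
import statistics

random.seed(0)
n = 20000

# Case 1: iid errors, every observation shares the same N(0, sigma^2)
sigma = 2.0
iid_errors = [random.gauss(0.0, sigma) for _ in range(n)]

# Case 2: independent but NOT identically distributed,
# each observation has its own sigma_i (assumed here: 1 or 3)
sigmas = [1.0 if i % 2 == 0 else 3.0 for i in range(n)]
non_identical = [random.gauss(0.0, s) for s in sigmas]

sd_all = statistics.pstdev(iid_errors)          # close to 2
sd_even = statistics.pstdev(non_identical[0::2])  # close to 1
sd_odd = statistics.pstdev(non_identical[1::2])   # close to 3
print(round(sd_all, 2), round(sd_even, 2), round(sd_odd, 2))
```

In both cases the draws are independent; only the second mixes observations from populations with different spread.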
7. Correlated = Not Independent Example
Repeated Measure
8. Correlated = Not Independent Example
Clustered/Multilevel study
9. Correlated = Not Independent Example
Serial Correlation
10. Correlated = Not Independent Example
Familial structure in Genetic Study
14. Estimation in linear regression
1 Ordinary Least Squares (OLS): semi-parametric
2 Maximum Likelihood Estimator (MLE): parametric
15. GEE & GLMM Basic Basic Linear Regression
Least Squares
Minimize the sum of squares: no normality assumption on y is needed.
Figure. OLS Fitting
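As a small illustration of the least-squares idea on this slide, the sketch below fits a straight line by the closed-form solution that minimizes the residual sum of squares. The data points and the function names `ols_fit` and `rss` are made up for illustration.

```python
def ols_fit(x, y):
    """Return (intercept, slope) minimizing sum of (y_i - a - b * x_i)^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rss(x, y, intercept, slope):
    """Residual sum of squares for a candidate line."""
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, roughly on the line y = 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a_hat, b_hat = ols_fit(x, y)
print(a_hat, b_hat, rss(x, y, a_hat, b_hat))
```

No distribution for y enters the calculation; any other line gives a larger value of `rss`.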
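16. GEE & GLMM Basic Basic Linear Regression
Likelihood??
Likelihood vs. probability
Discrete: likelihood = probability - when rolling a die, the probability of getting a 1 is 1/6
Continuous: likelihood != probability - when drawing one number between 0 and 1, the probability that it is exactly 0.7 is 0...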
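The die and the "number between 0 and 1" examples can be checked numerically. This is a sketch under the assumption that the continuous draw is Uniform(0, 1); the helper names are illustrative.

```python
from fractions import Fraction

# Discrete: for a fair die, the likelihood of observing a 1 equals P(X = 1)
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
likelihood_of_one = die_pmf[1]


def uniform_pdf(x, low=0.0, high=1.0):
    """Density of Uniform(low, high) at x."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def uniform_prob(a, b, low=0.0, high=1.0):
    """P(a <= X <= b) for X ~ Uniform(low, high)."""
    a, b = max(a, low), min(b, high)
    return max(b - a, 0.0) / (high - low)


# Continuous: the likelihood (density) at 0.7 is positive,
# but the probability of drawing exactly 0.7 is zero
density_at_07 = uniform_pdf(0.7)
prob_exactly_07 = uniform_prob(0.7, 0.7)
print(likelihood_of_one, density_at_07, prob_exactly_07)
```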
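17. GEE & GLMM Basic Basic Linear Regression
Maximum likelihood estimator (MLE)
Maximum likelihood estimator: suppose observations 1, ..., n are independent.
1 Compute the likelihood function of each observation.
2 Multiplying all the likelihoods gives the likelihood of the whole event (because of independence).
3 Find the parameter values that maximize the likelihood.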
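The three steps can be sketched on the simplest possible case, a single Bernoulli probability rather than a regression coefficient. The data and the grid search are illustrative assumptions; real software maximizes the likelihood with iterative numerical methods.

```python
import math


def log_likelihood(p, data):
    """Steps 1 and 2: per-observation likelihoods are multiplied,
    which is the same as summing their logs (valid under independence)."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in data)


# Hypothetical binary outcomes: 6 successes out of 10
data = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]

# Step 3: pick the parameter value with the largest likelihood (grid search)
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: log_likelihood(p, data))
print(p_hat)
```

For this model the maximizer coincides with the sample proportion, which is a convenient sanity check.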
19. GEE & GLMM Basic Basic Linear Regression
MLE: maximum likelihood estimator
Maximize the probability that the observed data occur: the distribution of y is required.
20. GEE & GLMM Basic Basic Linear Regression
Logistic function: MLE
Figure. Fitting Logistic Function
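Fitting a logistic function by maximum likelihood can be sketched with a plain Newton-Raphson iteration for one covariate plus an intercept. The data below are invented, and `fit_logistic` is a hypothetical helper, not the routine used to produce the slide's figure.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson MLE for logit P(y = 1) = b0 + b1 * x."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Score: gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a 2 x 2 symmetric matrix
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + information^{-1} * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical, non-separable data
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0_hat, b1_hat = fit_logistic(x, y)
print(b0_hat, b1_hat)
```

At the solution the score is zero, which is what the test below the sketch checks.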
21. GEE & GLMM Basic Basic Linear Regression
LRT? Wald? Score?
Likelihood ratio test vs. Wald test vs. score test
1 These are methods for judging statistical significance.
2 Comparing likelihoods vs. comparing parameter values vs. comparing slopes
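For a feel of how the three tests differ, the sketch below computes all three statistics for the simplest case, testing a binomial proportion against a null value. The counts (60 successes in 100 trials, null value 0.5) are assumptions chosen for illustration; each statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math


def three_tests(k, n, p0):
    """LRT, Wald and score statistics for H0: p = p0, given k successes in n
    trials (requires 0 < k < n)."""
    p_hat = k / n
    # Wald: distance between estimate and null, scaled by variance at the estimate
    wald = (p_hat - p0) ** 2 / (p_hat * (1 - p_hat) / n)
    # Score: slope of the log-likelihood at the null, scaled by variance at the null
    score = (p_hat - p0) ** 2 / (p0 * (1 - p0) / n)
    # LRT: twice the gap between the two log-likelihoods
    lrt = 2 * (k * math.log(p_hat / p0)
               + (n - k) * math.log((1 - p_hat) / (1 - p0)))
    return lrt, wald, score


lrt, wald, score = three_tests(k=60, n=100, p0=0.5)
print(lrt, wald, score)
```

The three values are close but not identical, which matches the picture of three different ways of measuring the same departure from the null.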
22. GEE & GLMM Basic Basic Linear Regression
Comparison
Figure. Comparison
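23. GEE & GLMM Basic Basic Linear Regression
AIC
Let L be the likelihood of the fitted model.
1 AIC = -2 log(L) + 2k
2 k: number of explanatory variables (sex, age, annual income ...)
3 The smaller, the better the model!!!
We would pick the model with the larger likelihood, but.. too many explanatory variables are penalized!!!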
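The trade-off in the AIC formula can be shown with two hypothetical models. The log-likelihood values and variable counts below are invented for illustration only.

```python
def aic(log_lik, k):
    """AIC = -2 log(L) + 2k, where k is the number of explanatory variables."""
    return -2.0 * log_lik + 2.0 * k


# Hypothetical fits: the larger model gains only a little likelihood
small_model = aic(log_lik=-120.0, k=3)
large_model = aic(log_lik=-119.5, k=6)
preferred = "small" if small_model < large_model else "large"
print(small_model, large_model, preferred)
```

Here the larger model has the higher likelihood, but the penalty for three extra variables outweighs the gain, so the smaller model has the lower AIC.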
42. µ can be obtained (BLUP).
43. GEE & GLMM Basic Comparison
GEE example: Continuous
running glm to get initial regression estimate
(Intercept)         age         sex         BMI
-64.2956645   0.1811694 -42.3958662   8.5256257
gee(formula = TG ~ age + sex + BMI, id = FID, data = a, corstr = "exchangeable")
               Estimate Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept) -67.2665582 35.8624272 -1.8756834  35.9094269 -1.8732284
age           0.1751885  0.3340099  0.5245007   0.3996143  0.4383938
sex         -42.2905294 11.3716707 -3.7189372   8.3038131 -5.0929048
BMI           8.6744524  1.2930220  6.7086657   1.4041520  6.1777161
Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.2582559 0.2582559 0.2582559
[2,] 0.2582559 1.0000000 0.2582559 0.2582559
[3,] 0.2582559 0.2582559 1.0000000 0.2582559
[4,] 0.2582559 0.2582559 0.2582559 1.0000000
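Two features of this output can be reproduced by hand: the exchangeable working correlation is one common value off the diagonal, and each z value is the estimate divided by the corresponding standard error. The sketch below uses the numbers printed above; the function name `exchangeable_corr` is an illustrative assumption.

```python
def exchangeable_corr(size, rho):
    """Exchangeable working correlation: 1 on the diagonal, rho elsewhere."""
    return [[1.0 if i == j else rho for j in range(size)] for i in range(size)]


# Working correlation for a cluster (family) of 4 with the estimated rho
R = exchangeable_corr(4, 0.2582559)

# z = estimate / standard error, using the 'sex' row of the output above
estimate_sex = -42.2905294
naive_se = 11.3716707
robust_se = 8.3038131
naive_z = estimate_sex / naive_se
robust_z = estimate_sex / robust_se
print(R[0], naive_z, robust_z)
```

The estimate is the same in both columns; only the standard error, and hence z, changes between the naive and the robust (sandwich) version.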
44. GEE & GLMM Basic Comparison
GLMM example: Continuous
lmer(formula = TG ~ age + sex + BMI + (1 | FID), data = a)
              Estimate Std. Error    t value
(Intercept) -65.222107 35.8720093 -1.8181894
age           0.109564  0.3318413  0.3301699
sex         -41.942137 11.3684264 -3.6893529
BMI           8.648601  1.2917159  6.6954362
Groups   Name        Std.Dev.
FID      (Intercept) 39.356
Residual             72.007
Intraclass correlation: 39.356^2 / (39.356^2 + 72.007^2) = 0.23
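The last line of this slide is the share of total variance that sits between families. A minimal sketch of that arithmetic, using the two standard deviations printed above (the function name `icc` is illustrative):

```python
def icc(sd_between, sd_within):
    """Between-cluster variance divided by total variance."""
    var_between = sd_between ** 2
    var_within = sd_within ** 2
    return var_between / (var_between + var_within)


# Random-intercept SD for FID and residual SD from the lmer output
family_icc = icc(39.356, 72.007)
print(round(family_icc, 2))
```

The result is of the same order as the exchangeable working correlation (0.258) estimated by GEE for the same outcome, which is what one would expect when both describe within-family similarity.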
45. GEE & GLMM Basic Comparison
GEE example: Binomial
running glm to get initial regression estimate
 (Intercept)          age          sex          BMI
-5.457458529  0.009749659 -1.385819506  0.157734298
gee(formula = hyperTG ~ age + sex + BMI, id = FID, data = a,
family = binomial, corstr = "exchangeable")
                Estimate Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept) -5.453486897 1.10811194 -4.9214224  1.14198243 -4.7754561
age          0.008754136 0.00997040  0.8780125  0.01087413  0.8050421
sex         -1.337114934 0.53428456 -2.5026270  0.52621253 -2.5410169
BMI          0.158988089 0.03867076  4.1113256  0.04248749  3.7419975
Working Correlation
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.1942491 0.1942491 0.1942491
[2,] 0.1942491 1.0000000 0.1942491 0.1942491
[3,] 0.1942491 0.1942491 1.0000000 0.1942491
[4,] 0.1942491 0.1942491 0.1942491 1.0000000
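Because the model is logistic, the coefficients are log odds ratios. As a sketch, the BMI row can be converted to an odds ratio with an approximate 95% interval based on the robust standard error. The 1.96 multiplier and the function name are assumptions of this illustration, not part of the slide.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and approximate Wald interval from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# BMI estimate and robust S.E. from the GEE output above
or_bmi, or_low, or_high = odds_ratio_ci(0.158988089, 0.04248749)
print(round(or_bmi, 3), round(or_low, 3), round(or_high, 3))
```

Read at the population level, each additional BMI unit multiplies the odds of hyperTG by roughly this factor, averaged over families.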
46. GEE & GLMM Basic Comparison
GLMM example: Binomial
glmer(formula = hyperTG ~ age + sex + BMI + (1 | FID), data = a, family = binomial)
               Estimate Std. Error    z value     Pr(>|z|)
(Intercept) -6.65451749 1.48227814 -4.4893852 7.142904e-06
age          0.01052907 0.01206682  0.8725635 3.829010e-01
sex         -1.48506920 0.60773433 -2.4436158 1.454090e-02
BMI          0.19131619 0.05022612  3.8090977 1.394749e-04
Groups Name        Std.Dev.
FID    (Intercept) 1.1163
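The GLMM coefficients here are larger in magnitude than the GEE ones for the same binomial model, which reflects the individual-level (conditional) versus population-level (marginal) interpretation. A commonly used approximation for random-intercept logistic models (usually attributed to Zeger, Liang and Albert, 1988) shrinks a conditional coefficient by sqrt(1 + c^2 * sigma^2), with c = 16*sqrt(3)/(15*pi). The sketch below applies it to the BMI coefficient; it is a rough check under that approximation, not something computed in the slides.

```python
import math

# Constant from the logistic-normal approximation, about 0.588
C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)


def marginal_from_conditional(beta_conditional, sd_random_intercept):
    """Approximate population-averaged coefficient from a subject-specific one."""
    return beta_conditional / math.sqrt(1.0 + (C * sd_random_intercept) ** 2)


glmm_bmi = 0.19131619   # glmer estimate above
gee_bmi = 0.158988089   # gee estimate from the binomial GEE example
approx_marginal = marginal_from_conditional(glmm_bmi, 1.1163)
print(round(approx_marginal, 4), gee_bmi)
```

For BMI the attenuated GLMM coefficient lands close to the GEE estimate; the agreement is looser for the other covariates, as expected from an approximation.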
47. GEE & GLMM in GWAS
48. GEE & GLMM in GWAS Concepts of GWAS
Issues
Concepts
Samples vs. SNPs (3461 vs. 500,000)
Regression repeated more than 500,000 times...!!!!
Strict p-value threshold ($5 \times 10^{-8}$)
Issues
Computational burden.. speed!!
Complex correlation structure
Approximation techniques
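The strict threshold follows from the number of tests. As a sketch, a Bonferroni correction of a 0.05 significance level gives the conventional genome-wide value when about one million independent tests are assumed; that one-million figure is the usual convention, not a number from the slides.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Conventional genome-wide threshold: roughly 1,000,000 independent tests
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# With the 500,000 SNPs mentioned on the slide
for_500k = bonferroni_threshold(0.05, 500_000)
print(genome_wide, for_500k)
```

Either way, one regression per SNP means hundreds of thousands of model fits, which is why speed dominates the list of issues.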
49. GEE & GLMM in GWAS Genetic Correlation
GCM
Genetic Correlation Matrix
Correlation structure: already known. (Family structure vs. data)
It is complex. There is no regular pattern.
Computation...
51. GEE & GLMM in GWAS Use GEE & GLMM
Caution
There are no clusters. Each individual is its own cluster.
Prepare the GCM in advance.
52. GEE & GLMM in GWAS Use GEE & GLMM
GWAS example: GEE-continuous
running glm to get initial regression estimate
(Intercept)         age         sex         BMI   genecount
-63.0665181   0.1441694 -39.0676606   7.8280011  19.8533844
gee(formula = TG ~ age + sex + BMI + genecount, id = ID, data = a,
R = kin, corstr = "fixed")
               Estimate Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept) -63.0665181 35.4400639 -1.7795261  31.4650444 -2.0043359
age           0.1441694  0.3376881  0.4269307   0.3558302  0.4051635
sex         -39.0676606 11.2797186 -3.4635315   7.2549380 -5.3849751
BMI           7.8280011  1.2914399  6.0614519   1.3054881  5.9962258
genecount    19.8533844  6.2315166  3.1859635   5.8534124  3.3917624
Working Correlation
     [,1] [,2] [,3] [,4]
[1,]  1.0  0.5  0.5  0.5
[2,]  0.5  1.0  0.5  0.5
[3,]  0.5  0.5  1.0  0.0
[4,]  0.5  0.5  0.0  1.0
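The fixed working correlation above has the pattern expected for a nuclear family: two siblings (rows 1 and 2) and their two unrelated parents (rows 3 and 4). That reading of the matrix is an assumption. Under it, the matrix can be rebuilt from the pedigree with the standard kinship recursion, as in this sketch (the function and the individual labels are hypothetical):

```python
def relationship_matrix(pedigree):
    """pedigree: list of (id, father, mother) with parents listed before their
    children; founders have father = mother = None.
    Returns (ids, matrix) where matrix = 2 * kinship coefficient."""
    ids = [person[0] for person in pedigree]
    index = {pid: i for i, pid in enumerate(ids)}
    n = len(ids)
    phi = [[0.0] * n for _ in range(n)]
    for i, (_, father, mother) in enumerate(pedigree):
        if father is None or mother is None:
            phi[i][i] = 0.5  # founder: unrelated to everyone listed earlier
            continue
        f, m = index[father], index[mother]
        for j in range(i):
            # kinship with an earlier individual is the average over both parents
            phi[i][j] = phi[j][i] = 0.5 * (phi[f][j] + phi[m][j])
        phi[i][i] = 0.5 * (1.0 + phi[f][m])
    return ids, [[2.0 * value for value in row] for row in phi]


family = [
    ("father", None, None),
    ("mother", None, None),
    ("child1", "father", "mother"),
    ("child2", "father", "mother"),
]
ids, rel = relationship_matrix(family)

# Reorder to match the slide: children first, then parents
order = [ids.index(name) for name in ("child1", "child2", "father", "mother")]
kin_like = [[rel[i][j] for j in order] for i in order]
for row in kin_like:
    print(row)
```

Unlike the exchangeable structure, the entries differ by pair (0.5 for parent-child and sibling pairs, 0 for the two parents), so the matrix has to be supplied to the model rather than estimated as a single parameter.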
53. GEE & GLMM in GWAS Use GEE & GLMM
GWAS example: GEE-binomial
running glm to get initial regression estimate
 (Intercept)          age          sex          BMI    genecount
-5.482288956  0.009646267 -1.348154797  0.151819412  0.192508455
gee(formula = hyperTG ~ age + sex + BMI + genecount, id = ID,
data = a, R = kin, family = binomial, corstr = "fixed")
                Estimate Naive S.E.    Naive z Robust S.E.   Robust z
(Intercept) -5.482288957 1.10060632 -4.9811535  1.07919392 -5.0799850
age          0.009646267 0.01004073  0.9607134  0.01027862  0.9384789
sex         -1.348154801 0.53873048 -2.5024662  0.52100579 -2.5876004
BMI          0.151819412 0.03861585  3.9315312  0.04199752  3.6149615
genecount    0.192508455 0.18683677  1.0303564  0.19281252  0.9984230
Working Correlation
     [,1] [,2] [,3] [,4]
[1,]  1.0  0.5  0.5  0.5
[2,]  0.5  1.0  0.5  0.5
[3,]  0.5  0.5  1.0  0.0
[4,]  0.5  0.5  0.0  1.0
54. GEE & GLMM in GWAS Use GEE & GLMM
GWAS example: GLMM
Cannot be implemented with the lme4 package.
Possible with the hglm package.
GenABEL implements this in its polygenic_hglm function.
55. GEE & GLMM in GWAS Use GEE & GLMM
Limitation
Both GEE & GLMM
Slow. Family structure + binomial is especially bad..
Continuous: overcome by advances in approximation - FASTA, GRAMMAR, GEMMA..
Binomial: approximation not yet successful.. - the speed problem is not overcome.
56. Conclusion
58. Conclusion
END
Email : secondmath85@gmail.com
Office: (02)880-2473
H.P: 010-9192-5385