
# Introduction of mixed effect model

Event link: http://www.meetup.com/NYC-Open-Data/events/161342472/
A free R workshop given by SupStat Inc at New York R user group and NYC Open Data Meetup group

### Introduction of mixed effect model

1. Introduction of Mixed effect model: Learning by simulation. Supstat Inc. http://nycdatascience.com/mixed_effect_model_supstat/index.html#1 (1 of 34, 1/29/14, 10:51 PM)
2. Outline

   - What is a mixed effect model
   - Fixed effect model
   - Mixed effect model
     - Random Intercept model
     - Random Intercept and Slope model
   - General Mixed effect model
   - Case study
3. What is a mixed effect model
4. Classical normal linear model

   Formulation: Yi = b0 + b1*Xi + ei

   - Yi is the response from subject i.
   - Xi are the covariates.
   - b0 and b1 are the parameters that we want to estimate.
   - ei are the random terms in the model, and are assumed to be independently and identically distributed as Normal(0, sigma^2).

   It is very important that there is no structure in ei: it represents the variation that could not be controlled in our studies.
5. Violation of the independence assumption

   In many cases, responses are not independent of each other. Such data usually have some cluster structure:

   - Repeated measures, where measurements are taken multiple times from the same subjects (clustered by subject).
   - A survey of all the members of a family (clustered by family).
   - A survey of students from 20 classrooms in a high school (clustered by classroom).
   - Longitudinal data, also known as panel data, where several responses are collected from the same subjects over time (clustered by subject).

   We need a new tool: the mixed effect model.
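To see why clustering breaks the independence assumption, here is a minimal simulation sketch in base R. The names (`subject`, `u`, `icc_hat`), the number of subjects, and the standard deviations are illustrative assumptions, not values taken from the slides. The idea is to fit an ordinary linear model to clustered data and then check whether residuals from the same subject resemble each other.

```r
set.seed(42)

n_subjects <- 30   # number of clusters (assumed for illustration)
n_repeats  <- 10   # measurements per subject (assumed)

subject <- rep(seq_len(n_subjects), each = n_repeats)
x <- runif(n_subjects * n_repeats, min = 1, max = 5)

# Each subject gets its own shift u, shared by all of its measurements
u <- rnorm(n_subjects, mean = 0, sd = 3)
y <- 1 + 2 * x + u[subject] + rnorm(length(x), mean = 0, sd = 1)

# Fit an ordinary linear model that ignores the clustering
fit <- lm(y ~ x)
res <- residuals(fit)

# One-way ANOVA of the residuals by subject: if the residuals were
# independent, the between-subject mean square would be close to the
# within-subject mean square
aov_tab <- anova(lm(res ~ factor(subject)))
ms_between <- aov_tab[["Mean Sq"]][1]
ms_within  <- aov_tab[["Mean Sq"]][2]

# ANOVA estimate of the intraclass correlation of the residuals
icc_hat <- (ms_between - ms_within) /
  (ms_between + (n_repeats - 1) * ms_within)
print(icc_hat)
```

An intraclass correlation well above zero means that residuals within a subject move together, which is exactly the structure that the classical model assumes away.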
6. Mixed effect model

   Mixed effect model = Fixed effect + Random effect

   - Fixed effects
     - are expected to have a systematic and predictable influence on your data.
     - exhaust "the levels of a factor". Think of sex (male/female).
   - Random effects
     - are expected to have a non-systematic, unpredictable, or "random" influence on your data.
     - have factor levels that are drawn from a large population, but we do not know exactly how or why they differ.
7. Example of Fixed effects and Random effects

   | FIXED EFFECTS | RANDOM EFFECTS |
   | --- | --- |
   | Male or female | Individuals with repeated measures |
   | Insecticide sprayed or not | Block within a field |
   | Upland or lowland | Brood |
   | One country versus another | Split plot within a plot |
   | Wet versus dry | Family |
8. Fixed effect model
9. Fixed effect model

   The fixed effect model is just the linear model that you may already know.

   Yi = b0 + b1*Xi + ei, for 1 <= i <= n, where n is the number of samples

   - Yi: response variable
   - b0: fixed intercept
   - b1: fixed slope
   - Xi: explanatory variable (fixed effect)
   - ei: noise (error)
10. Data generation of fixed effect model

        set.seed(1)
        # generate x
        x <- seq(1, 5, length.out = 100)
        # generate error
        noise <- rnorm(n = 100, mean = 0, sd = 1)
        b0 <- 1
        b1 <- 2
        # generate y
        y <- b0 + b1 * x + noise
11. Data generation of fixed effect model

        plot(y ~ x)
12. Coefficient estimation of fixed effect model

        model <- lm(y ~ x)
        summary(model)

    Output:

        Call:
        lm(formula = y ~ x)

        Residuals:
            Min      1Q  Median      3Q     Max
        -2.3401 -0.6058  0.0155  0.5851  2.2975

        Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
        (Intercept)   1.1424     0.2491    4.59  1.3e-05 ***
        x             1.9888     0.0774   25.70  < 2e-16 ***
        ---
        Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

        Residual standard error: 0.903 on 98 degrees of freedom
13. Plot of fixed effect model

        plot(y ~ x)
        abline(model)
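A quick way to check that the fit recovers the values used in the simulation is to look at confidence intervals rather than point estimates alone. The sketch below repeats the data generation from the slides and uses base R's `confint()`; the `covers` vector is a helper name introduced here for illustration.

```r
set.seed(1)
x <- seq(1, 5, length.out = 100)
noise <- rnorm(n = 100, mean = 0, sd = 1)
b0 <- 1
b1 <- 2
y <- b0 + b1 * x + noise

model <- lm(y ~ x)

# 95% confidence intervals for the intercept and the slope
ci <- confint(model, level = 0.95)
print(ci)

# Do the intervals cover the values used to generate the data?
covers <- c(
  intercept = ci["(Intercept)", 1] <= b0 && b0 <= ci["(Intercept)", 2],
  slope     = ci["x", 1] <= b1 && b1 <= ci["x", 2]
)
print(covers)
```

Because the data really do follow the assumed model here, intervals built this way should cover the true parameters in roughly 95% of repeated simulations.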
14. Mixed effect model
15. Random Intercept model

    Suppose there are several people, indexed by i, and each person is measured repeatedly, with measurements indexed by j. People differ from one another in ways we do not observe, so there is a random effect caused by the person, and there is a separate random noise caused by each measurement of that person.

    Yij = b0 + b1*Xij + bi + eij

    - b0: fixed intercept
    - b1: fixed slope
    - Xij: fixed effect
    - bi: random effect (influences the intercept)
    - eij: noise
16. Data generation of Random Intercept model

        b0 <- 9.9
        b1 <- 2
        # number of repeated measurements for each of 6 people
        n <- c(13, 14, 14, 15, 12, 13)
        npeople <- length(n)
        set.seed(1)
        # generate x (fixed effect)
        x <- matrix(rep(0, length.out = max(n) * npeople), ncol = npeople)
        for (i in 1:npeople) {
          x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
          x[1:n[i], i] <- sort(x[1:n[i], i])
        }
        # random effect
        bi <- rnorm(npeople, mean = 0, sd = 10)
17. Data generation of Random Intercept model

        xall <- NULL
        yall <- NULL
        peopleall <- NULL
        for (i in 1:npeople) {
          xall <- c(xall, x[1:n[i], i])  # combine x
          # generate y
          y <- rep(b0 + bi[i], length.out = n[i]) +
            b1 * x[1:n[i], i] +
            rnorm(n[i], mean = 0, sd = 2)  # noise
          yall <- c(yall, y)  # combine y
          people <- rep(i, length.out = n[i])
          peopleall <- c(peopleall, people)
        }
        # final dataset
        data1 <- data.frame(yall, peopleall, xall)
18. Coefficient estimation of Random Intercept model

        library(nlme)
        # xall is the fixed effect
        # bi shifts the intercept for each person
        lme1 <- lme(yall ~ xall, random = ~ 1 | peopleall, data = data1)
        summary(lme1)

    Output:

        Linear mixed-effects model fit by REML
         Data: data1
          AIC BIC logLik
          358 368   -175

        Random effects:
         Formula: ~1 | peopleall
                (Intercept) Residual
        StdDev:         7.3     1.77

        Fixed effects: yall ~ xall
19. Plot of Random Intercept model
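The plot itself did not survive in this transcript. One way such a figure could be drawn is sketched below, assuming the `nlme` package is installed. The data are re-created in a compact, vectorized form, so the random draws (and therefore the numbers) differ from those on the slides; `person_coef` is a helper name introduced here.

```r
library(nlme)

set.seed(1)
b0 <- 9.9
b1 <- 2
n <- c(13, 14, 14, 15, 12, 13)
npeople <- length(n)

# Compact re-creation of the simulated data (draws differ from the slides)
peopleall <- rep(seq_len(npeople), times = n)
xall <- runif(sum(n), min = 1, max = 5)
bi <- rnorm(npeople, mean = 0, sd = 10)
yall <- b0 + bi[peopleall] + b1 * xall + rnorm(sum(n), mean = 0, sd = 2)
data1 <- data.frame(yall, peopleall, xall)

lme1 <- lme(yall ~ xall, random = ~ 1 | peopleall, data = data1)

# coef() returns one row per person: that person's intercept
# and the slope shared by everyone
person_coef <- coef(lme1)

plot(data1$xall, data1$yall, col = data1$peopleall, pch = 19,
     xlab = "x", ylab = "y", main = "Random intercept model")
for (i in seq_len(npeople)) {
  abline(a = person_coef[i, "(Intercept)"], b = person_coef[i, "xall"], col = i)
}
# Population-level (fixed effect) line, drawn dashed and thicker
abline(a = fixef(lme1)["(Intercept)"], b = fixef(lme1)["xall"], lwd = 3, lty = 2)
```

The fitted lines are parallel because only the intercept varies by person in this model.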
20. Random Intercept and slope model

    Yij = b0 + (b1 + si)*Xij + bi + eij

    - b0: fixed intercept
    - b1: fixed slope
    - Xij: fixed effect
    - bi: random effect (influences the intercept)
    - eij: noise
    - si: random effect (influences the slope)
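The model on this slide can be made fully explicit by writing down the distributions of the random terms. The block below is a sketch: the symbols $\sigma_b$, $\sigma_s$, $\rho$ and $\sigma$ are notation introduced here, not used in the slides, and the joint normal form is the assumption usually made when such a model is fitted.

$$
Y_{ij} = (b_0 + b_i) + (b_1 + s_i)\,X_{ij} + e_{ij},
\qquad
\begin{pmatrix} b_i \\ s_i \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_b^2 & \rho\,\sigma_b\sigma_s \\ \rho\,\sigma_b\sigma_s & \sigma_s^2 \end{pmatrix}
\right),
\qquad
e_{ij} \sim N(0, \sigma^2)
$$

In the simulation that follows, $b_i$ and $s_i$ are drawn independently ($\rho = 0$) with $\sigma_b = 10$, $\sigma_s = 0.5$ and $\sigma = 0.5$.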
21. Data generation of Random Intercept and slope model

        a0 <- 9.9
        a1 <- 2
        n <- c(12, 13, 14, 15, 16, 13)
        npeople <- length(n)
        set.seed(1)
        si <- rnorm(npeople, mean = 0, sd = 0.5)  # random slope
        x <- matrix(rep(0, length.out = max(n) * npeople), ncol = npeople)
        for (i in 1:npeople) {
          x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
          x[1:n[i], i] <- sort(x[1:n[i], i])
        }
22. Data generation of Random Intercept and slope model

        bi <- rnorm(npeople, mean = 0, sd = 10)  # random intercept
        xall <- NULL
        yall <- NULL
        peopleall <- NULL
        for (i in 1:npeople) {
          xall <- c(xall, x[1:n[i], i])
          y <- rep(a0 + bi[i], length.out = n[i]) +
            (a1 + si[i]) * x[1:n[i], i] +
            rnorm(n[i], mean = 0, sd = 0.5)
          yall <- c(yall, y)
          people <- rep(i, length.out = n[i])
          peopleall <- c(peopleall, people)
        }
        # generate final dataset
        data2 <- data.frame(yall, peopleall, xall)
23. Coefficient estimation of Random Intercept and slope model

        # bi influences the intercept and si influences the slope of the model
        lme2 <- lme(yall ~ xall, random = ~ 1 + xall | peopleall, data = data2)
        print(summary(lme2))

    Output:

        Linear mixed-effects model fit by REML
         Data: data2
          AIC BIC logLik
          179 194  -83.6

        Random effects:
         Formula: ~1 + xall | peopleall
         Structure: General positive-definite, Log-Cholesky parametrization
                    StdDev Corr
        (Intercept) 11.593 (Intr)
        xall         0.464 0.044
        Residual     0.445
24. Plot of Random Intercept and slope model
25. What if we just use a linear model? Complete pooling

        # wrong estimation
        lm1 <- lm(yall ~ xall, data = data2)
        summary(lm1)

    Output:

        Call:
        lm(formula = yall ~ xall, data = data2)

        Residuals:
           Min     1Q Median     3Q    Max
        -17.80  -6.27  -3.67   2.19  24.33

        Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
        (Intercept)     6.86       3.72    1.84  0.06874 .
        xall            4.31       1.15    3.76  0.00032 ***
        ---
26. What if we just use a linear model? No pooling

        # wrong estimation: wastes too many degrees of freedom, and we don't care about the exact differences of pe
        lm2 <- lm(yall ~ xall + factor(peopleall) + xall * factor(peopleall), data = data1)
        summary(lm2)

    Output:

        Call:
        lm(formula = yall ~ xall + factor(peopleall) + xall * factor(peopleall),
            data = data1)

        Residuals:
           Min     1Q Median     3Q    Max
        -2.983 -1.194  0.054  1.092  4.238

        Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
        (Intercept)   18.818      1.342   14.02  < 2e-16 ***
        xall           0.929      0.413    2.25    0.028 *
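The three approaches can be compared side by side. The sketch below is a simplified version, not the slides' own code: it uses a random-intercept data set generated in vectorized form (so the numbers differ from the slides), gives the no-pooling model a separate intercept per person but a common slope, and assumes the `nlme` package is installed. The names `dat`, `comparison`, `se_pooled` and `se_mixed` are introduced here for illustration.

```r
library(nlme)

set.seed(1)
n <- c(12, 13, 14, 15, 16, 13)
npeople <- length(n)
peopleall <- rep(seq_len(npeople), times = n)
xall <- runif(sum(n), min = 1, max = 5)
bi <- rnorm(npeople, mean = 0, sd = 10)  # random intercepts
yall <- 9.9 + bi[peopleall] + 2 * xall + rnorm(sum(n), mean = 0, sd = 2)
dat <- data.frame(yall, peopleall, xall)

# Complete pooling: one intercept and one slope for everyone
fit_pooled <- lm(yall ~ xall, data = dat)

# No pooling: a separate intercept for each person, common slope
fit_nopool <- lm(yall ~ 0 + factor(peopleall) + xall, data = dat)

# Partial pooling: person intercepts drawn from a common distribution
fit_mixed <- lme(yall ~ xall, random = ~ 1 | peopleall, data = dat)

# Per-person intercepts under each approach
comparison <- data.frame(
  person           = seq_len(npeople),
  no_pooling       = unname(coef(fit_nopool)[seq_len(npeople)]),
  partial_pooling  = coef(fit_mixed)[, "(Intercept)"],
  complete_pooling = unname(coef(fit_pooled)["(Intercept)"])
)
print(comparison)

# Standard error of the slope: complete pooling versus the mixed model
se_pooled <- summary(fit_pooled)$coefficients["xall", "Std. Error"]
se_mixed  <- summary(fit_mixed)$tTable["xall", "Std.Error"]
print(c(complete_pooling = se_pooled, partial_pooling = se_mixed))
```

Complete pooling pushes all of the between-person variation into the residual, which inflates the uncertainty of the slope. The mixed model separates the two sources of variation while spending only one variance parameter on the people.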
27. General Mixed effect model
28. Logistic Mixed effect model

    P(Yij = 1) = exp(eta_ij) / (1 + exp(eta_ij))

    eta_ij = b0 + b1*Xij + bi

    - b0: fixed intercept
    - b1: fixed slope
    - Xij: fixed effect
    - bi: random effect (influences the intercept)
    - Yij: binary response, drawn as a Bernoulli variable with the probability above (the data generation that follows adds no separate noise term to eta_ij)
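The transformation exp(eta)/(1 + exp(eta)) is the inverse logit. As a small illustration, base R provides it as `plogis()`, with `qlogis()` as its inverse. The values in `eta` below are arbitrary and chosen only for the example.

```r
# Linear predictor values to convert into probabilities (arbitrary examples)
eta <- c(-6, -2, 0, 2, 6)

# Inverse logit written out as on the slide
p_manual <- exp(eta) / (1 + exp(eta))

# plogis() computes the same function and is numerically safer for large |eta|
p_builtin <- plogis(eta)

print(cbind(eta, p_manual, p_builtin))

# qlogis() is the logit (log-odds) function, the inverse of plogis()
eta_back <- qlogis(p_builtin)
print(eta_back)
```

A linear predictor of 0 corresponds to a probability of 0.5, and the probability moves toward 0 or 1 as eta becomes strongly negative or positive.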
29. Data generation of Logistic Mixed effect model

        b0 <- -6
        b1 <- 2.1
        set.seed(1)
        n <- c(12, 13, 14, 15, 16, 13)
        npeople <- length(n)
        x <- matrix(rep(0, length.out = max(n) * npeople), ncol = npeople)
        bi <- rnorm(npeople, mean = 0, sd = 1.5)
        for (i in 1:npeople) {
          x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
          x[1:n[i], i] <- sort(x[1:n[i], i])
        }
30. Data generation of Logistic Mixed effect model

        xall <- NULL
        yall <- NULL
        peopleall <- NULL
        for (i in 1:npeople) {
          xall <- c(xall, x[1:n[i], i])
          y <- NULL
          for (j in 1:n[i]) {
            eta1 <- b0 + b1 * x[j, i] + bi[i]
            y <- c(y, rbinom(n = 1, size = 1, prob = exp(eta1) / (exp(eta1) + 1)))
          }
          yall <- c(yall, y)
          people <- rep(i, length.out = n[i])
          peopleall <- c(peopleall, people)
        }
        data3 <- data.frame(xall, peopleall, yall)
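The nested loops above can also be written in vectorized form. The following is a sketch of that alternative, not the slides' code: it draws the random numbers in a different order and does not sort x within each person, so it will not reproduce the slides' exact data set. The name `data3_vec` is introduced here to avoid confusion with `data3`.

```r
set.seed(1)
b0 <- -6
b1 <- 2.1
n <- c(12, 13, 14, 15, 16, 13)
npeople <- length(n)

bi <- rnorm(npeople, mean = 0, sd = 1.5)       # random intercepts
peopleall <- rep(seq_len(npeople), times = n)  # person label for every row
xall <- runif(sum(n), min = 1, max = 5)

# Linear predictor and success probability for every observation at once
eta <- b0 + b1 * xall + bi[peopleall]
yall <- rbinom(n = length(eta), size = 1, prob = plogis(eta))

data3_vec <- data.frame(xall, peopleall, yall)
str(data3_vec)
```

Indexing `bi` by `peopleall` expands the six person-level intercepts to one value per observation, which is what the inner loop does one element at a time.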
31. Coefficient estimation of Logistic Mixed effect model

        library(lme4)
        # the formula syntax is different: the random effect is written inside the formula
        lmer3 <- glmer(yall ~ xall + (1 | peopleall), data = data3, family = binomial)
        print(summary(lmer3))

    Output:

        Generalized linear mixed model fit by maximum likelihood ['glmerMod']
         Family: binomial ( logit )
        Formula: yall ~ xall + (1 | peopleall)
           Data: data3

             AIC      BIC   logLik deviance
            69.8     77.1    -31.9     63.8

        Random effects:
         Groups    Name        Variance Std.Dev.
         peopleall (Intercept) 3.94     1.98
        Number of obs: 83, groups: peopleall, 6
32. Plot of Logistic Mixed effect model
33. Case study