# Linear regression with R 2

Published in: Education

### Linear regression with R 2

1. Linear Regression with R 2: Model selection. 2012-12-10 @HSPH. Kazuki Yoshida, M.D., MPH-CLE student
2. Group website: http://rpubs.com/kaz_yos/useR_at_HSPH
3. Previously in this group: Introduction; Reading Data into R (1); Reading Data into R (2); Descriptive, continuous; Descriptive, categorical; Deducer; Graphics; Groupwise, continuous; Linear regression
4. Menu: Linear regression: Model selection
5. Ingredients. Statistics: selection methods. Programming: step(), drop1(), add1(), leaps::regsubsets()
6. Open RStudio
7. Open the saved script that we created last time. See also the "Linear Regression with R 1" slides.
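8. Create full & null models

       ## Full model with all candidate predictors
       lm.full <- lm(bwt ~ age + lwt + smoke + ht + ui + ftv.cat + race.cat + preterm,
                     data = lbw)
       ## Null model: intercept only
       lm.null <- lm(bwt ~ 1, data = lbw)

9. Compare two models. In the output, lm.full is Model 1 and lm.null is Model 2.

       anova(lm.full, lm.null)

10. Reading the anova() output: the two models are listed at the top. Each row gives the residual degrees of freedom and the residual sum of squares. The second row adds the difference in residual SS and the partial F-test, which is significant here.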
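11. Backward elimination. Specify the full model as the starting point; lm.step.bw is the final model object.

        lm.step.bw <- step(lm.full, direction = "backward")

12. Reading the step() output: it starts with the initial AIC for the full model. At the first step, removing ftv.cat makes AIC smallest. At the next step, removing age makes AIC smallest. At the last step, doing nothing makes AIC smallest.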
13. Forward selection. Specify the null model as the starting point; scope is the formula for possible variables; lm.step.fw is the final model object.

        lm.step.fw <- step(lm.null,
                           scope = ~ age + lwt + smoke + ht + ui + ftv.cat + race.cat + preterm,
                           direction = "forward")

14. Reading the step() output: it starts with the initial AIC for the null model. Adding ui makes AIC smallest, then adding race.cat makes AIC smallest, then adding smoke makes AIC smallest. Still goes on ...
15. Stepwise selection/elimination. Specify the null model as the starting point; scope is the formula for possible variables; lm.step.both is the final model object.

        lm.step.both <- step(lm.null,
                             scope = ~ age + lwt + smoke + ht + ui + ftv.cat + race.cat + preterm,
                             direction = "both")

16. Reading the step() output: it starts with the initial AIC for the null model. Adding ui makes AIC smallest. Adding race.cat then makes AIC smallest; removing is also considered. Adding smoke then makes AIC smallest; removing is also considered. Still goes on ...
17. F-test using drop1()

        ## age is the least significant by partial F test
        drop1(lm.full, test = "F")
        ## After elimination, ftv.cat is the least significant
        drop1(update(lm.full, ~ . -age), test = "F")
        ## After elimination, preterm is the least significant at p = 0.12
        drop1(update(lm.full, ~ . -age -ftv.cat), test = "F")
        ## After elimination, all variables are significant at p < 0.1
        drop1(update(lm.full, ~ . -age -ftv.cat -preterm), test = "F")
        ## Show summary for final model
        summary(update(lm.full, ~ . -age -ftv.cat -preterm))

18. Updating models

        ## Remove age from full model: all variables (.) minus age
        lm.age.less <- update(lm.full, ~ . -age)
        ## Add ui to null model: all variables (.) plus ui
        lm.ui.only <- update(lm.null, ~ . +ui)

19. Reading the drop1() output: first test the full model, where age is the least significant; each F-test compares the age-in model to the age-out model. Then remove age and test again, where ftv.cat is the least significant. Then remove age and ftv.cat.
20. F-test using add1()

        ## ui is the most significant variable
        add1(lm.null,
             scope = ~ age + lwt + race.cat + smoke + preterm + ht + ui + ftv.cat,
             test = "F")
        ## After inclusion, race.cat is the most significant
        add1(update(lm.null, ~ . +ui),
             scope = ~ age + lwt + race.cat + smoke + preterm + ht + ui + ftv.cat,
             test = "F")
        ## After inclusion, smoke is the most significant
        add1(update(lm.null, ~ . +ui +race.cat),
             scope = ~ age + lwt + race.cat + smoke + preterm + ht + ui + ftv.cat,
             test = "F")
        ## After inclusion, ht is the most significant
        add1(update(lm.null, ~ . +ui +race.cat +smoke),
             scope = ~ age + lwt + race.cat + smoke + preterm + ht + ui + ftv.cat,
             test = "F")
        ...

21. Reading the add1() output: first test the null model, where ui is the most significant; each F-test compares the ui-out model to the ui-in model. Then add ui and test again, where race.cat is the most significant. Then add ui and race.cat.
22. All-subset regression using the leaps package
23. Run regsubsets() on the full model and summarize the result.

        library(leaps)
        regsubsets.out <- regsubsets(bwt ~ age + lwt + smoke + ht + ui + ftv.cat + race.cat + preterm,
                                     data = lbw,
                                     nbest = 1,
                                     nvmax = NULL,
                                     force.in = NULL, force.out = NULL,
                                     method = "exhaustive")
        summary(regsubsets.out)

24. Reading the regsubsets() call: regsubsets.out is the result object; the formula is the full model; nbest is how many best models to keep; nvmax is the maximum model size; force.in and force.out are the forced variables.
25. Reading the summary() output: it lists the forced variables, then the variable combination for each model size, from the best 1 predictor model through the best 7 predictor model to the best 10 predictor model.
26. Plot adjusted R^2 for the best models: the higher the better. Models marked on the plot, from top to bottom: ~ lwt + smoke + ht + ui + race.cat + preterm; ~ smoke + ht + ui + race; ~ ui.

        plot(regsubsets.out, scale = "adjr2", main = "Adjusted R^2")

27. The same statistic plotted with car::subsets(). Model marked on the plot: ~ lwt + smoke + ht + ui + race.cat + preterm.

        library(car)
        subsets(regsubsets.out, statistic = "adjr2", legend = FALSE, min.size = 5,
                main = "Adjusted R^2")

28. Mallows' Cp. The model marked on the plot is the first model for which Mallows' Cp is less than the number of regressors + 1: ~ lwt + smoke + ht + ui + race.cat + preterm.

        subsets(regsubsets.out, statistic = "cp", legend = FALSE, min.size = 5,
                main = "Mallow Cp")
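
The partial F-test that anova() reports on the "Compare two models" slides can be reproduced by hand from the residual sums of squares and residual degrees of freedom. The sketch below is illustrative only: the recoded `lbw` data frame from the slides is not available here, so it assumes `MASS::birthwt` as a stand-in with a reduced predictor set, and the names `dat`, `lm.big`, and `lm.small` are hypothetical.

```r
## Stand-in data (assumption): MASS::birthwt instead of the slides' recoded lbw
dat <- MASS::birthwt
dat$race.cat <- factor(dat$race, labels = c("white", "black", "other"))

## A larger model and the intercept-only model nested inside it
lm.big   <- lm(bwt ~ age + lwt + smoke + ht + ui + race.cat, data = dat)
lm.small <- lm(bwt ~ 1, data = dat)

## Residual sum of squares and residual degrees of freedom for each model
rss.big   <- deviance(lm.big)
rss.small <- deviance(lm.small)
df.big    <- df.residual(lm.big)
df.small  <- df.residual(lm.small)

## Partial F statistic: drop in residual SS per dropped degree of freedom,
## divided by the residual mean square of the larger model
f.stat <- ((rss.small - rss.big) / (df.small - df.big)) / (rss.big / df.big)
p.val  <- pf(f.stat, df.small - df.big, df.big, lower.tail = FALSE)

## The same test from anova(); the second row holds F and its p-value
tab <- anova(lm.small, lm.big)
print(tab)
print(c(F = f.stat, p = p.val))
```

The hand-computed `f.stat` and `p.val` should match the second row of the anova() table.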
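
The AIC values that step() prints on the backward elimination slides come from extractAIC(), and one round of candidate removals can be inspected with drop1(). This is a minimal sketch under the assumption that `MASS::birthwt` stands in for the slides' `lbw` data frame; `dat` and `lm.big` are hypothetical names and the predictor set is reduced.

```r
## Stand-in data (assumption): MASS::birthwt instead of the slides' recoded lbw
dat <- MASS::birthwt
dat$race.cat <- factor(dat$race, labels = c("white", "black", "other"))
lm.big <- lm(bwt ~ age + lwt + smoke + ht + ui + race.cat, data = dat)

## AIC as used by step() for a linear model: n * log(RSS / n) + 2 * edf,
## where edf is the number of estimated regression coefficients
n        <- nobs(lm.big)
edf      <- n - df.residual(lm.big)
aic.hand <- n * log(deviance(lm.big) / n) + 2 * edf
aic.step <- extractAIC(lm.big)[2]

## One round of backward elimination: AIC after removing each single term.
## The "<none>" row is the current model.
tab <- drop1(lm.big)
print(tab)

## Row with the smallest AIC; "<none>" means no removal improves AIC
best <- rownames(tab)[which.min(tab$AIC)]
print(best)
```

step() with direction = "backward" repeats this round until the `<none>` row has the smallest AIC.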
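
The update() idiom from the "Updating models" slide can be checked by looking at the terms of the refitted model. The sketch below again assumes `MASS::birthwt` as a stand-in for `lbw`, with a small hypothetical starting model.

```r
## Stand-in data (assumption): MASS::birthwt instead of the slides' recoded lbw
dat <- MASS::birthwt
lm.start <- lm(bwt ~ age + lwt + smoke, data = dat)
lm.empty <- lm(bwt ~ 1, data = dat)

## "." stands for everything already in the model formula
lm.age.less <- update(lm.start, ~ . - age)  # drop age, keep the rest
lm.ui.only  <- update(lm.empty, ~ . + ui)   # add ui to the intercept-only model

## Terms that remain in each refitted model
terms.age.less <- attr(terms(lm.age.less), "term.labels")
terms.ui.only  <- attr(terms(lm.ui.only), "term.labels")
print(terms.age.less)
print(terms.ui.only)
```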
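
The rule on the Mallows' Cp slide compares Cp with the number of regressors + 1. A common textbook form of the statistic is Cp = RSS_p / s^2 + 2p - n, where s^2 is the residual mean square of the full model and p counts the coefficients including the intercept. The sketch below computes that form with base R only. It is not taken from the slides: `MASS::birthwt` stands in for `lbw`, `mallows.cp` is a hypothetical helper, and whether the leaps package uses exactly this form is an assumption here.

```r
## Stand-in data (assumption): MASS::birthwt instead of the slides' recoded lbw
dat <- MASS::birthwt
dat$race.cat <- factor(dat$race, labels = c("white", "black", "other"))
lm.big <- lm(bwt ~ age + lwt + smoke + ht + ui + race.cat, data = dat)

## Residual mean square of the full model, used as the variance estimate
sigma2.big <- deviance(lm.big) / df.residual(lm.big)

## Hypothetical helper: Cp = RSS_p / s^2 + 2p - n
mallows.cp <- function(fit) {
  p <- length(coef(fit))  # coefficients including the intercept
  deviance(fit) / sigma2.big + 2 * p - nobs(fit)
}

## A candidate submodel without age
lm.sub <- update(lm.big, . ~ . - age)

cp.big <- mallows.cp(lm.big)
cp.sub <- mallows.cp(lm.sub)
print(c(full = cp.big, without.age = cp.sub))
```

For the full model, Cp equals its own number of coefficients by construction, so the comparison is only informative for submodels.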