Relentless Regression
By Nicholas Brooks
3/15/17
Using R's built-in mtcars dataset, I will use descriptive and inferential
statistical methods to find out whether any significant relationships exist
between miles per gallon (mpg) and the other variables in the dataset.
> datasets::mtcars
> mc<-mtcars
> head(mc,5)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
> str(mc)
'data.frame': 32 obs. of 11 variables:
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
$ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
$ disp: num 160 160 108 258 360 ...
$ hp : num 110 110 93 110 175 105 245 62 95 123 ...
$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ wt : num 2.62 2.88 2.32 3.21 3.44 ...
$ qsec: num 16.5 17 18.6 19.4 17 ...
$ vs : num 0 0 1 1 0 1 0 1 1 1 ...
$ am : num 1 1 1 0 0 0 0 0 0 0 ...
$ gear: num 4 4 4 3 3 3 3 4 4 4 ...
$ carb: num 4 4 1 1 2 1 4 2 2 4 ...
> summary(mc$mpg)
Min. 1st Qu. Median Mean 3rd Qu. Max.
10.40 15.42 19.20 20.09 22.80 33.90
plot(mc$wt,mc$mpg,xlab="weight",ylab="mpg",main="weight and mpg comparison",col="blue")
The plot above suggests a possible strong negative relationship between
weight and mpg.
cor.test(mc$wt, mc$mpg)
Pearson's product-moment correlation
data: mc$wt and mc$mpg
t = -9.559, df = 30, p-value = 1.294e-10
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.9338264 -0.7440872
sample estimates:
cor
-0.8676594
The correlation test above supports a strong negative relationship between
weight and mpg (r = -0.87, p < 0.001): as the weight of the car increases,
mpg decreases.
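The t statistic reported by cor.test can be recovered from the sample correlation itself. The following is a minimal sketch, assuming the standard Pearson test with n - 2 degrees of freedom; the names r, n, t_stat, p_value, and ct are illustrative and do not appear in the original slides.

```r
# Recompute the Pearson correlation test for wt vs. mpg by hand.
r <- cor(mtcars$wt, mtcars$mpg)              # sample correlation
n <- nrow(mtcars)                            # number of cars
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)    # test statistic on n - 2 df
p_value <- 2 * pt(-abs(t_stat), df = n - 2)  # two-sided p-value

# Reference result from cor.test() for the same pair of variables.
ct <- cor.test(mtcars$wt, mtcars$mpg)
c(r = r, t = t_stat, p = p_value)
```

The hand-computed values should agree with the statistic and p.value components of ct.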
[Figure: scatter plot "weight and mpg comparison"; x-axis: weight, y-axis: mpg]
plot(mc$hp, mc$mpg,xlab="horse power",ylab="mpg",main="horse power and mpg comparison",col="green")
cor.test(mc$hp, mc$mpg)
Pearson's product-moment correlation
data: mc$hp and mc$mpg
t = -6.7424, df = 30, p-value = 1.788e-07
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.8852686 -0.5860994
sample estimates:
cor
-0.7761684
The plot above shows a possible negative relationship between horsepower
and mpg. The correlation test between these two variables provides
sufficient evidence of a strong negative relationship (r = -0.78): as
horsepower increases, mpg decreases.
[Figure: scatter plot "horse power and mpg comparison"; x-axis: horse power, y-axis: mpg]
plot(mc$disp, mc$mpg,xlab="displacement",ylab="mpg",main="displacement and mpg comparison",col="red")
cor.test(mc$disp, mc$mpg)
Pearson's product-moment correlation
data: mc$disp and mc$mpg
t = -8.7472, df = 30, p-value = 9.38e-10
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.9233594 -0.7081376
sample estimates:
cor
-0.8475514
The plot and the correlation test between displacement (disp) and mpg both
indicate a strong negative relationship (r = -0.85): as displacement
increases, mpg decreases.
[Figure: scatter plot "displacement and mpg comparison"; x-axis: displacement, y-axis: mpg]
plot(mc$drat, mc$mpg,xlab="drat",ylab="mpg",main="drat and mpg comparison",col="black")
cor.test(mc$drat, mc$mpg)
Pearson's product-moment correlation
data: mc$drat and mc$mpg
t = 5.096, df = 30, p-value = 1.776e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.4360484 0.8322010
sample estimates:
cor
0.6811719
The plot and correlation test above indicate a moderately strong positive
relationship between drat (rear axle ratio) and mpg (r = 0.68): as drat
increases, mpg increases.
[Figure: scatter plot "drat and mpg comparison"; x-axis: drat, y-axis: mpg]
plot(mc$qsec, mc$mpg,xlab="qsec",ylab="mpg",main="qsec and mpg comparison",col="black")
cor.test(mc$qsec, mc$mpg)
Pearson's product-moment correlation
data: mc$qsec and mc$mpg
t = 2.5252, df = 30, p-value = 0.01708
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.08195487 0.66961864
sample estimates:
cor
0.418684
The plot and correlation test between qsec (quarter-mile time) and mpg
indicate that a weak-to-moderate positive relationship may exist
(r = 0.42, p = 0.017).
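The five pairwise tests above can be summarized in one pass. This is a small sketch rather than part of the original analysis; the names predictors, r_with_mpg, and r_sorted are illustrative.

```r
# Pearson correlation of mpg with each continuous predictor examined above.
predictors <- c("wt", "hp", "disp", "drat", "qsec")
r_with_mpg <- sapply(predictors, function(v) cor(mtcars[[v]], mtcars$mpg))

# Sort by absolute strength, strongest first.
r_sorted <- r_with_mpg[order(abs(r_with_mpg), decreasing = TRUE)]
round(r_sorted, 3)
```

Sorting by absolute value puts wt first and qsec last, which matches the ordering of the individual cor.test results.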
[Figure: scatter plot "qsec and mpg comparison"; x-axis: qsec, y-axis: mpg]
boxplot(mc$mpg~factor.cyl,xlab="cylinder",ylab="mpg",main="mpg and cylinder comparison",col=c(3,5,7))
This box plot suggests that mpg decreases as the number of cylinders
increases.
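The objects factor.cyl, factor.vs, factor.am, factor.gear, and factor.carb are used in the box plot and in the models that follow, but the slides never show how they were created. A plausible definition, assuming each is simply the corresponding mtcars column converted with factor(), would be:

```r
mc <- mtcars
# Assumed definitions: treat the discrete columns as categorical variables.
factor.cyl  <- factor(mc$cyl)   # 4, 6, or 8 cylinders
factor.vs   <- factor(mc$vs)    # engine shape: 0 = V-shaped, 1 = straight
factor.am   <- factor(mc$am)    # transmission: 0 = automatic, 1 = manual
factor.gear <- factor(mc$gear)  # 3, 4, or 5 forward gears
factor.carb <- factor(mc$carb)  # number of carburetors

# Number of levels per factor; each contributes (levels - 1) degrees of freedom.
level_counts <- sapply(list(cyl = factor.cyl, vs = factor.vs, am = factor.am,
                            gear = factor.gear, carb = factor.carb), nlevels)
level_counts
```

Under this assumption the level counts are consistent with the Df column of the add1 and drop1 tables: 2 for cyl and gear, 1 for vs and am, and 5 for carb.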
[Figure: box plot "mpg and cylinder comparison"; x-axis: cylinder (4, 6, 8), y-axis: mpg]
The data visualization has indicated that relationships may exist between
mpg and the other variables, and that mpg varies substantially. I will now
construct a regression model to determine which independent variables are
statistically significant predictors of the dependent variable mpg. The
model should also help explain how much of the variation in mpg is
predictable from the independent variables. I will use forward selection,
backward elimination, and stepwise regression to construct a regression
model with each method and then compare their results to determine which
model fits the data best.
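As a sketch of the first forward-selection step, the same candidate comparison can be written with a data argument instead of mc$ prefixes. The data frame name cars and the in-place factor conversion are assumptions made for this illustration; the slides build their formulas from mc$ columns and separate factor.* objects.

```r
# Assumed setup: convert the discrete columns to factors in a copy of mtcars.
cars <- mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) cars[[v]] <- factor(cars[[v]])

# Start from the intercept-only model and test each candidate term once.
null_model <- lm(mpg ~ 1, data = cars)
candidates <- ~ disp + hp + drat + wt + qsec + cyl + vs + am + gear + carb
first_step <- add1(null_model, scope = candidates, test = "F")

# The candidate with the smallest AIC is the natural first addition.
best_first <- rownames(first_step)[which.min(first_step$AIC)]
best_first
```

Using data = cars keeps the term names short (wt rather than mc$wt) and lets functions such as update() and predict() work on the fitted model without extra steps.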
add1(lm(mc$mpg~1),scope=(~.+mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term additions
Model:
mc$mpg ~ 1
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 1126.05 115.943
mc$disp 1 808.89 317.16 77.397 76.5127 9.380e-10 ***
mc$hp 1 678.37 447.67 88.427 45.4598 1.788e-07 ***
mc$drat 1 522.48 603.57 97.988 25.9696 1.776e-05 ***
mc$wt 1 847.73 278.32 73.217 91.3753 1.294e-10 ***
mc$qsec 1 197.39 928.66 111.776 6.3767 0.0170820 *
factor.cyl 2 824.78 301.26 77.752 39.6975 4.979e-09 ***
factor.vs 1 496.53 629.52 99.335 23.6622 3.416e-05 ***
factor.am 1 405.15 720.90 103.672 16.8603 0.0002850 ***
factor.gear 2 483.24 642.80 102.003 10.9007 0.0002948 ***
factor.carb 5 500.56 625.49 107.129 4.1614 0.0065462 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
add1(lm(mc$mpg~1+mc$wt),scope=(~.+mc$disp+mc$hp+mc$drat+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term additions
Model:
mc$mpg ~ 1 + mc$wt
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 278.32 73.217
mc$disp 1 31.639 246.68 71.356 3.7195 0.063620 .
mc$hp 1 83.274 195.05 63.840 12.3813 0.001451 **
mc$drat 1 9.081 269.24 74.156 0.9781 0.330854
mc$qsec 1 82.858 195.46 63.908 12.2933 0.001500 **
factor.cyl 2 95.263 183.06 63.810 7.2856 0.002835 **
factor.vs 1 54.228 224.09 68.283 7.0177 0.012926 *
factor.am 1 0.002 278.32 75.217 0.0002 0.987915
factor.gear 2 40.372 237.95 72.202 2.3753 0.111467
factor.carb 5 47.458 230.86 77.235 1.0278 0.422802
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
add1(lm(mc$mpg~1+mc$wt+mc$hp),scope=(~.+mc$disp+mc$drat+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term additions
Model:
mc$mpg ~ 1 + mc$wt + mc$hp
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 195.05 63.840
mc$disp 1 0.057 194.99 65.831 0.0082 0.92851
mc$drat 1 11.366 183.68 63.919 1.7326 0.19876
mc$qsec 1 8.988 186.06 64.331 1.3527 0.25463
factor.cyl 2 34.270 160.78 61.657 2.8776 0.07364 .
factor.vs 1 6.868 188.18 64.693 1.0219 0.32072
factor.am 1 14.757 180.29 63.323 2.2918 0.14127
factor.gear 2 9.903 185.15 66.173 0.7221 0.49489
factor.carb 5 11.448 183.60 71.905 0.2993 0.90842
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model=lm(mc$mpg~mc$wt+mc$hp)
summary(model)
Call:
lm(formula = mc$mpg ~ mc$wt + mc$hp)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
mc$wt -3.87783 0.63273 -6.129 1.12e-06 ***
mc$hp -0.03177 0.00903 -3.519 0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
AIC(model)
[1] 156.6523
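The value returned by AIC(model) (156.65) is not on the same scale as the AIC column of the add1 table for the same model (63.84). A short check, assuming the tables use the extractAIC convention for linear models, n * log(RSS / n) + 2 * p, suggests that the two differ only by a constant that depends on the sample size, so model rankings are unaffected. The names fit, aic_loglik, aic_table, and gap are illustrative.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)

aic_loglik <- AIC(fit)            # based on the full Gaussian log-likelihood
aic_table  <- extractAIC(fit)[2]  # scale reported by add1() and drop1()

# For fixed n the gap is constant: n * (1 + log(2 * pi)) for the likelihood
# terms that extractAIC() omits, plus 2 because AIC() also counts the
# error variance as a parameter.
gap <- n * (1 + log(2 * pi)) + 2
c(AIC = aic_loglik, extractAIC = aic_table, gap = gap)
```

Because every model here is fitted to the same 32 observations, comparing models on either scale leads to the same ordering.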
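The model above was constructed using the forward selection method: wt was
added first, then hp, after which none of the remaining candidate terms was
significant at the 5% level.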
drop1(lm(mc$mpg~mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term deletions
Model:
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl +
factor.vs + factor.am + factor.gear + factor.carb
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 120.40 76.403
mc$disp 1 9.9672 130.37 76.948 1.2417 0.28267
mc$hp 1 25.6715 146.07 80.588 3.1982 0.09393 .
mc$drat 1 1.8208 122.22 74.884 0.2268 0.64074
mc$wt 1 25.5541 145.96 80.562 3.1836 0.09462 .
mc$qsec 1 1.2413 121.64 74.732 0.1546 0.69967
factor.cyl 2 10.9314 131.33 75.184 0.6809 0.52112
factor.vs 1 3.6299 124.03 75.354 0.4522 0.51151
factor.am 1 1.1420 121.55 74.705 0.1423 0.71132
factor.gear 2 3.9729 124.38 73.442 0.2475 0.78390
factor.carb 5 13.5989 134.00 69.828 0.3388 0.88144
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(lm(mc$mpg~mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear),test="F")
Single term deletions
Model:
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl +
factor.vs + factor.am + factor.gear
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 134.00 69.828
mc$disp 1 0.9934 135.00 68.064 0.1483 0.70427
mc$hp 1 22.7935 156.79 72.855 3.4020 0.07998 .
mc$drat 1 1.1854 135.19 68.110 0.1769 0.67852
mc$wt 1 19.7963 153.80 72.237 2.9546 0.10107
mc$qsec 1 5.2634 139.26 69.061 0.7856 0.38598
factor.cyl 2 12.5642 146.57 68.696 0.9376 0.40811
factor.vs 1 3.6763 137.68 68.694 0.5487 0.46746
factor.am 1 11.9255 145.93 70.556 1.7799 0.19715
factor.gear 2 5.0215 139.02 67.005 0.3747 0.69220
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(lm(mc$mpg~mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs +
factor.am + factor.gear
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 135.00 68.064
mc$hp 1 23.8685 158.86 71.274 3.7130 0.06763 .
mc$drat 1 1.5589 136.55 66.431 0.2425 0.62751
mc$wt 1 27.6318 162.63 72.023 4.2984 0.05064 .
mc$qsec 1 4.6789 139.67 67.154 0.7279 0.40320
factor.cyl 2 18.6303 153.62 68.201 1.4491 0.25732
factor.vs 1 4.6788 139.67 67.154 0.7278 0.40321
factor.am 1 13.5206 148.52 69.119 2.1033 0.16176
factor.gear 2 5.5765 140.57 65.359 0.4337 0.65375
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(lm(mc$mpg~mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs +
factor.am
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 140.57 65.359
mc$hp 1 18.566 159.14 67.329 3.0378 0.09470 .
mc$drat 1 0.666 141.24 63.511 0.1090 0.74426
mc$wt 1 38.996 179.57 71.194 6.3804 0.01888 *
mc$qsec 1 2.778 143.35 63.986 0.4545 0.50692
factor.cyl 2 17.987 158.56 65.212 1.4715 0.25040
factor.vs 1 2.644 143.22 63.956 0.4326 0.51726
factor.am 1 16.244 156.81 66.859 2.6578 0.11666
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 141.24 63.511
mc$hp 1 18.184 159.42 65.386 3.0899 0.09153 .
mc$wt 1 39.645 180.88 69.428 6.7367 0.01586 *
mc$qsec 1 2.442 143.68 62.059 0.4150 0.52557
factor.cyl 2 18.580 159.82 63.466 1.5786 0.22693
factor.vs 1 2.744 143.98 62.126 0.4663 0.50124
factor.am 1 18.885 160.12 65.527 3.2090 0.08585 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+factor.cyl+factor.vs+factor.am),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.vs + factor.am
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 143.68 62.059
mc$hp 1 36.344 180.02 67.275 6.3238 0.01871 *
mc$wt 1 41.088 184.77 68.108 7.1493 0.01302 *
factor.cyl 2 25.284 168.96 63.246 2.1997 0.13183
factor.vs 1 7.346 151.03 61.655 1.2782 0.26897
factor.am 1 16.443 160.12 63.527 2.8611 0.10317
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+factor.cyl+factor.am),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 151.03 61.655
mc$hp 1 31.943 182.97 65.794 5.4991 0.026935 *
mc$wt 1 46.173 197.20 68.191 7.9490 0.009081 **
factor.cyl 2 29.265 180.29 63.323 2.5191 0.099998 .
factor.am 1 9.752 160.78 61.657 1.6789 0.206460
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+factor.cyl),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$wt + factor.cyl
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 160.78 61.657
mc$hp 1 22.281 183.06 63.810 3.7417 0.0636127 .
mc$wt 1 116.390 277.17 77.084 19.5458 0.0001442 ***
factor.cyl 2 34.270 195.05 63.840 2.8776 0.0736450 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> modelII<-lm(mc$mpg~mc$hp+mc$wt+factor.cyl)
> summary(modelII)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl)
Residuals:
Min 1Q Median 3Q Max
-4.2612 -1.0320 -0.3210 0.9281 5.3947
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 35.84600 2.04102 17.563 2.67e-16 ***
mc$hp -0.02312 0.01195 -1.934 0.063613 .
mc$wt -3.18140 0.71960 -4.421 0.000144 ***
factor.cyl6 -3.35902 1.40167 -2.396 0.023747 *
factor.cyl8 -3.18588 2.17048 -1.468 0.153705
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.44 on 27 degrees of freedom
Multiple R-squared: 0.8572, Adjusted R-squared: 0.8361
F-statistic: 40.53 on 4 and 27 DF, p-value: 4.869e-11
> AIC(modelII)
[1] 154.4692
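The model above was constructed using the backward elimination method,
removing at each step the term with the largest p-value (carb, disp, gear,
drat, qsec, vs, then am).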
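The manual sequence of drop1 calls can be sketched as a loop. The 0.10 cutoff below is an assumption, chosen because it reproduces the stopping point in the slides (hp and cyl remain with p-values between 0.05 and 0.10); the original does not state its stopping rule. The cars data frame and the other variable names are also illustrative.

```r
# Assumed setup: convert the discrete columns to factors in a copy of mtcars.
cars <- mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) cars[[v]] <- factor(cars[[v]])

fit <- lm(mpg ~ disp + hp + drat + wt + qsec + cyl + vs + am + gear + carb,
          data = cars)
alpha <- 0.10             # assumed cutoff, not stated in the original
dropped <- character(0)   # records the order in which terms are removed

repeat {
  tab <- drop1(fit, test = "F")
  p_values <- tab[["Pr(>F)"]][-1]     # skip the <none> row
  term_names <- rownames(tab)[-1]
  # Stop when nothing is left or every remaining term passes the cutoff.
  if (length(term_names) == 0 || max(p_values) < alpha) break
  worst <- term_names[which.max(p_values)]   # least significant term
  dropped <- c(dropped, worst)
  fit <- update(fit, as.formula(paste(". ~ . -", worst)))
}

dropped
formula(fit)
```

A stricter cutoff such as 0.05 would continue past the model shown in the slides, so the choice of threshold matters for the final result.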
> library(MASS)
> modelIII<-lm(mc$mpg~mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb)
> step<-stepAIC(modelIII,direction="both")
Start: AIC=76.4
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl +
factor.vs + factor.am + factor.gear + factor.carb
Df Sum of Sq RSS AIC
- factor.carb 5 13.5989 134.00 69.828
- factor.gear 2 3.9729 124.38 73.442
- factor.am 1 1.1420 121.55 74.705
- mc$qsec 1 1.2413 121.64 74.732
- mc$drat 1 1.8208 122.22 74.884
- factor.cyl 2 10.9314 131.33 75.184
- factor.vs 1 3.6299 124.03 75.354
<none> 120.40 76.403
- mc$disp 1 9.9672 130.37 76.948
- mc$wt 1 25.5541 145.96 80.562
- mc$hp 1 25.6715 146.07 80.588
Step: AIC=69.83
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl +
factor.vs + factor.am + factor.gear
Df Sum of Sq RSS AIC
- factor.gear 2 5.0215 139.02 67.005
- mc$disp 1 0.9934 135.00 68.064
- mc$drat 1 1.1854 135.19 68.110
- factor.vs 1 3.6763 137.68 68.694
- factor.cyl 2 12.5642 146.57 68.696
- mc$qsec 1 5.2634 139.26 69.061
<none> 134.00 69.828
- factor.am 1 11.9255 145.93 70.556
- mc$wt 1 19.7963 153.80 72.237
- mc$hp 1 22.7935 156.79 72.855
+ factor.carb 5 13.5989 120.40 76.403
Step: AIC=67
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl +
factor.vs + factor.am
Df Sum of Sq RSS AIC
- mc$drat 1 0.9672 139.99 65.227
- factor.cyl 2 10.4247 149.45 65.319
- mc$disp 1 1.5483 140.57 65.359
- factor.vs 1 2.1829 141.21 65.503
- mc$qsec 1 3.6324 142.66 65.830
<none> 139.02 67.005
- factor.am 1 16.5665 155.59 68.608
- mc$hp 1 18.1768 157.20 68.937
+ factor.gear 2 5.0215 134.00 69.828
- mc$wt 1 31.1896 170.21 71.482
+ factor.carb 5 14.6475 124.38 73.442
Step: AIC=65.23
mc$mpg ~ mc$disp + mc$hp + mc$wt + mc$qsec + factor.cyl + factor.vs +
factor.am
Df Sum of Sq RSS AIC
- mc$disp 1 1.2474 141.24 63.511
- factor.vs 1 2.3403 142.33 63.757
- factor.cyl 2 12.3267 152.32 63.927
- mc$qsec 1 3.1000 143.09 63.928
<none> 139.99 65.227
+ mc$drat 1 0.9672 139.02 67.005
- mc$hp 1 17.7382 157.73 67.044
- factor.am 1 19.4660 159.46 67.393
+ factor.gear 2 4.8033 135.19 68.110
- mc$wt 1 30.7151 170.71 69.574
+ factor.carb 5 13.0509 126.94 72.095
Step: AIC=63.51
mc$mpg ~ mc$hp + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am
Df Sum of Sq RSS AIC
- mc$qsec 1 2.442 143.68 62.059
- factor.vs 1 2.744 143.98 62.126
- factor.cyl 2 18.580 159.82 63.466
<none> 141.24 63.511
+ mc$disp 1 1.247 139.99 65.227
+ mc$drat 1 0.666 140.57 65.359
- mc$hp 1 18.184 159.42 65.386
- factor.am 1 18.885 160.12 65.527
+ factor.gear 2 4.684 136.55 66.431
- mc$wt 1 39.645 180.88 69.428
+ factor.carb 5 2.331 138.91 72.978
Step: AIC=62.06
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.vs + factor.am
Df Sum of Sq RSS AIC
- factor.vs 1 7.346 151.03 61.655
<none> 143.68 62.059
- factor.cyl 2 25.284 168.96 63.246
+ mc$qsec 1 2.442 141.24 63.511
- factor.am 1 16.443 160.12 63.527
+ mc$disp 1 0.589 143.09 63.928
+ mc$drat 1 0.330 143.35 63.986
+ factor.gear 2 3.437 140.24 65.284
- mc$hp 1 36.344 180.02 67.275
- mc$wt 1 41.088 184.77 68.108
+ factor.carb 5 3.480 140.20 71.275
Step: AIC=61.65
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am
Df Sum of Sq RSS AIC
<none> 151.03 61.655
- factor.am 1 9.752 160.78 61.657
+ factor.vs 1 7.346 143.68 62.059
+ mc$qsec 1 7.044 143.98 62.126
- factor.cyl 2 29.265 180.29 63.323
+ mc$disp 1 0.617 150.41 63.524
+ mc$drat 1 0.220 150.81 63.608
+ factor.gear 2 1.361 149.66 65.365
- mc$hp 1 31.943 182.97 65.794
- mc$wt 1 46.173 197.20 68.191
+ factor.carb 5 5.633 145.39 70.438
> model3<-lm(mc$mpg~mc$hp+mc$wt+factor.cyl+factor.am)
> summary(model3)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am)
Residuals:
Min 1Q Median 3Q Max
-3.9387 -1.2560 -0.4013 1.1253 5.0513
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.70832 2.60489 12.940 7.73e-13 ***
mc$hp -0.03211 0.01369 -2.345 0.02693 *
mc$wt -2.49683 0.88559 -2.819 0.00908 **
factor.cyl6 -3.03134 1.40728 -2.154 0.04068 *
factor.cyl8 -2.16368 2.28425 -0.947 0.35225
factor.am1 1.80921 1.39630 1.296 0.20646
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.41 on 26 degrees of freedom
Multiple R-squared: 0.8659, Adjusted R-squared: 0.8401
F-statistic: 33.57 on 5 and 26 DF, p-value: 1.506e-10
> AIC(model3)
[1] 154.4669
The model above was constructed using the stepwise regression method.
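For readers without the MASS package loaded, the same search can be sketched with step() from the stats package, used here as a stand-in for stepAIC(). The cars data frame and the names full and selected are assumptions for this illustration.

```r
# Assumed setup: convert the discrete columns to factors in a copy of mtcars.
cars <- mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) cars[[v]] <- factor(cars[[v]])

full <- lm(mpg ~ disp + hp + drat + wt + qsec + cyl + vs + am + gear + carb,
           data = cars)

# step() is a stand-in for MASS::stepAIC(); trace = 0 suppresses the
# per-step tables that the original output prints.
selected <- step(full, direction = "both", trace = 0)

formula(selected)
selected$anova   # terms removed at each step and the AIC after each removal
```

The search stops when neither dropping nor adding a single term lowers the AIC any further.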
Now that I have 3 regression models built with 3 different methods, I can
choose the best-fitting one. Below I reprint each model's summary so the
results are easy to compare.
> summary(model)
Call:
lm(formula = mc$mpg ~ mc$wt + mc$hp)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
mc$wt -3.87783 0.63273 -6.129 1.12e-06 ***
mc$hp -0.03177 0.00903 -3.519 0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
> summary(modelII)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl)
Residuals:
Min 1Q Median 3Q Max
-4.2612 -1.0320 -0.3210 0.9281 5.3947
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 35.84600 2.04102 17.563 2.67e-16 ***
mc$hp -0.02312 0.01195 -1.934 0.063613 .
mc$wt -3.18140 0.71960 -4.421 0.000144 ***
factor.cyl6 -3.35902 1.40167 -2.396 0.023747 *
factor.cyl8 -3.18588 2.17048 -1.468 0.153705
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.44 on 27 degrees of freedom
Multiple R-squared: 0.8572, Adjusted R-squared: 0.8361
F-statistic: 40.53 on 4 and 27 DF, p-value: 4.869e-11
> summary(model3)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am)
Residuals:
Min 1Q Median 3Q Max
-3.9387 -1.2560 -0.4013 1.1253 5.0513
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.70832 2.60489 12.940 7.73e-13 ***
mc$hp -0.03211 0.01369 -2.345 0.02693 *
mc$wt -2.49683 0.88559 -2.819 0.00908 **
factor.cyl6 -3.03134 1.40728 -2.154 0.04068 *
factor.cyl8 -2.16368 2.28425 -0.947 0.35225
factor.am1 1.80921 1.39630 1.296 0.20646
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.41 on 26 degrees of freedom
Multiple R-squared: 0.8659, Adjusted R-squared: 0.8401
F-statistic: 33.57 on 5 and 26 DF, p-value: 1.506e-10
> AIC(model)
[1] 156.6523
> AIC(modelII)
[1] 154.4692
> AIC(model3)
[1] 154.4669
After examining and comparing the regression models, weight appears to be
the variable that explains the most variability in mpg: it is the first
term chosen by forward selection and has the smallest p-value among the
predictors in all 3 models. Horsepower would be the second variable to
explain the most variability. model3 has the lowest AIC value; AIC measures
goodness of fit while penalizing model complexity (it is not a check for
multicollinearity), and lower is better. The margin over modelII is very
small, though (154.4669 versus 154.4692), and the extra term in model3,
factor.am, is not statistically significant (p = 0.206). model3 also has
the highest adjusted R-squared, explaining approximately 84% of the
variation in mpg.
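The comparison can be collected into one small table. This is a sketch that refits the three models with a data argument; the cars data frame and the labels forward, backward, and stepwise are illustrative names, not objects from the slides.

```r
# Assumed setup: convert the discrete columns to factors in a copy of mtcars.
cars <- mtcars
for (v in c("cyl", "vs", "am", "gear", "carb")) cars[[v]] <- factor(cars[[v]])

models <- list(
  forward  = lm(mpg ~ wt + hp, data = cars),             # model
  backward = lm(mpg ~ hp + wt + cyl, data = cars),       # modelII
  stepwise = lm(mpg ~ hp + wt + cyl + am, data = cars)   # model3
)

# One row per model: adjusted R-squared and AIC.
comparison <- data.frame(
  adj_r2 = sapply(models, function(m) summary(m)$adj.r.squared),
  AIC    = sapply(models, AIC)
)
round(comparison, 4)
```

Placing the values side by side makes it easier to see how close the backward and stepwise models are on both criteria.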

Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/managementakshesh doshi
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Servicejennyeacort
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 

Recently uploaded (20)

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Spark3's new memory model/management
Spark3's new memory model/managementSpark3's new memory model/management
Spark3's new memory model/management
 
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts ServiceCall Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
Call Girls In Noida City Center Metro 24/7✡️9711147426✡️ Escorts Service
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 

Relentless Regression

  • 1.
Relentless Regression
By Nicholas Brooks
3/15/17
Using the dataset in R called mtcars, I will use descriptive and inferential statistical methods to find out whether any significant relationships exist between miles per gallon (mpg) and the other variables in the dataset.
> datasets::mtcars
> mc<-mtcars
> head(mc,5)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
> str(mc)
'data.frame': 32 obs. of 11 variables:
 $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num 160 160 108 258 360 ...
 $ hp  : num 110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num 2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num 16.5 17 18.6 19.4 17 ...
 $ vs  : num 0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num 1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
> summary(mc$mpg)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  10.40   15.42   19.20   20.09   22.80   33.90
plot(mc$wt,mc$mpg,xlab="weight",ylab="mpg",main="weight and mpg comparison",col="blue")
  • 2.
[Figure: scatter plot "weight and mpg comparison"; x-axis: weight, y-axis: mpg]
The plot above displays a possible strong negative relationship between weight and mpg.
cor.test(mc$wt, mc$mpg)
Pearson's product-moment correlation
data:  mc$wt and mc$mpg
t = -9.559, df = 30, p-value = 1.294e-10
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.9338264 -0.7440872
sample estimates:
       cor
-0.8676594
The correlation test above supports a strong negative relationship between weight and mpg: as the weight of the car increases, mpg decreases.
plot(mc$hp, mc$mpg,xlab="horse power",ylab="mpg",main="horse power and mpg comparison",col="green")
  • 3.
[Figure: scatter plot "horse power and mpg comparison"; x-axis: horse power, y-axis: mpg]
cor.test(mc$hp, mc$mpg)
Pearson's product-moment correlation
data:  mc$hp and mc$mpg
t = -6.7424, df = 30, p-value = 1.788e-07
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.8852686 -0.5860994
sample estimates:
       cor
-0.7761684
The plot above displays a possible negative relationship between horsepower and mpg. The correlation test provides evidence of a strong negative relationship: as horsepower increases, mpg decreases.
plot(mc$disp, mc$mpg,xlab="displacement",ylab="mpg",main="displacement and mpg comparison",col="red")
  • 4.
[Figure: scatter plot "displacement and mpg comparison"; x-axis: displacement, y-axis: mpg]
cor.test(mc$disp, mc$mpg)
Pearson's product-moment correlation
data:  mc$disp and mc$mpg
t = -8.7472, df = 30, p-value = 9.38e-10
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.9233594 -0.7081376
sample estimates:
       cor
-0.8475514
Both the plot and the correlation test between displacement (disp) and mpg indicate a strong negative relationship: as displacement increases, mpg decreases.
  • 5.
plot(mc$drat, mc$mpg,xlab="drat",ylab="mpg",main="drat and mpg comparison",col="black")
[Figure: scatter plot "drat and mpg comparison"; x-axis: drat, y-axis: mpg]
cor.test(mc$drat, mc$mpg)
Pearson's product-moment correlation
data:  mc$drat and mc$mpg
t = 5.096, df = 30, p-value = 1.776e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.4360484 0.8322010
sample estimates:
      cor
0.6811719
The plot and correlation test above indicate a moderately strong positive relationship between drat and mpg: as drat increases, mpg increases.
  • 6.
plot(mc$qsec, mc$mpg,xlab="qsec",ylab="mpg",main="qsec and mpg comparison",col="black")
[Figure: scatter plot "qsec and mpg comparison"; x-axis: qsec, y-axis: mpg]
cor.test(mc$qsec, mc$mpg)
Pearson's product-moment correlation
data:  mc$qsec and mc$mpg
t = 2.5252, df = 30, p-value = 0.01708
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.08195487 0.66961864
sample estimates:
     cor
0.418684
The plot and correlation test indicate that a weak positive relationship may exist between qsec and mpg.
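The five correlation tests on the preceding slides can also be run in one pass. The following is a minimal sketch, assuming the same `mc` copy of mtcars; the names `num_vars` and `cor_table` are illustrative and do not appear in the original slides.

```r
mc <- mtcars

# Numeric predictors examined on the slides (illustrative vector name)
num_vars <- c("wt", "hp", "disp", "drat", "qsec")

# One row per predictor: Pearson r, t statistic, and p-value against mpg
cor_table <- t(sapply(num_vars, function(v) {
  ct <- cor.test(mc[[v]], mc$mpg)
  c(r = unname(ct$estimate),
    t = unname(ct$statistic),
    p = ct$p.value)
}))

# Sort from the most negative to the most positive correlation
cor_table <- cor_table[order(cor_table[, "r"]), ]
print(round(cor_table, 4))
```

Collecting the results in one table makes it easier to see that wt and disp have the strongest (negative) correlations with mpg, while qsec has the weakest.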
  • 7.
boxplot(mc$mpg~factor.cyl,xlab="cylinder",ylab="mpg",main="mpg and cylinder comparison",col=c(3,5,7))
[Figure: box plot "mpg and cylinder comparison"; x-axis: cylinder (4, 6, 8), y-axis: mpg]
This box plot suggests that as the number of cylinders increases, mpg decreases.
The data visualizations indicate that relationships may exist between mpg and the other variables, and that mpg varies substantially across them. I will now construct a regression model to determine which independent variables are statistically significant predictors of the dependent variable mpg. The model should also help explain the variation in mpg that is predictable from the independent variables. I will use forward selection,
  • 8.
backward elimination, and stepwise regression to construct a regression model with each method, and then compare the results to determine which model best fits the data.
add1(lm(mc$mpg~1),scope=(~.+mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term additions
Model:
mc$mpg ~ 1
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                   1126.05 115.943
mc$disp      1    808.89  317.16  77.397 76.5127 9.380e-10 ***
mc$hp        1    678.37  447.67  88.427 45.4598 1.788e-07 ***
mc$drat      1    522.48  603.57  97.988 25.9696 1.776e-05 ***
mc$wt        1    847.73  278.32  73.217 91.3753 1.294e-10 ***
mc$qsec      1    197.39  928.66 111.776  6.3767 0.0170820 *
factor.cyl   2    824.78  301.26  77.752 39.6975 4.979e-09 ***
factor.vs    1    496.53  629.52  99.335 23.6622 3.416e-05 ***
factor.am    1    405.15  720.90 103.672 16.8603 0.0002850 ***
factor.gear  2    483.24  642.80 102.003 10.9007 0.0002948 ***
factor.carb  5    500.56  625.49 107.129  4.1614 0.0065462 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
add1(lm(mc$mpg~1+mc$wt),scope=(~.+mc$disp+mc$hp+mc$drat+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term additions
  • 9.
Model:
mc$mpg ~ 1 + mc$wt
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    278.32  73.217
mc$disp      1    31.639  246.68  71.356  3.7195  0.063620 .
mc$hp        1    83.274  195.05  63.840 12.3813  0.001451 **
mc$drat      1     9.081  269.24  74.156  0.9781  0.330854
mc$qsec      1    82.858  195.46  63.908 12.2933  0.001500 **
factor.cyl   2    95.263  183.06  63.810  7.2856  0.002835 **
factor.vs    1    54.228  224.09  68.283  7.0177  0.012926 *
factor.am    1     0.002  278.32  75.217  0.0002  0.987915
factor.gear  2    40.372  237.95  72.202  2.3753  0.111467
factor.carb  5    47.458  230.86  77.235  1.0278  0.422802
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
add1(lm(mc$mpg~1+mc$wt+mc$hp),scope=(~.+mc$disp+mc$drat+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term additions
Model:
mc$mpg ~ 1 + mc$wt + mc$hp
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    195.05  63.840
mc$disp      1     0.057  194.99  65.831  0.0082   0.92851
mc$drat      1    11.366  183.68  63.919  1.7326   0.19876
mc$qsec      1     8.988  186.06  64.331  1.3527   0.25463
factor.cyl   2    34.270  160.78  61.657  2.8776   0.07364 .
factor.vs    1     6.868  188.18  64.693  1.0219   0.32072
  • 10.
factor.am    1    14.757  180.29  63.323  2.2918   0.14127
factor.gear  2     9.903  185.15  66.173  0.7221   0.49489
factor.carb  5    11.448  183.60  71.905  0.2993   0.90842
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model=lm(mc$mpg~mc$wt+mc$hp)
summary(model)
Call:
lm(formula = mc$mpg ~ mc$wt + mc$hp)
Residuals:
   Min     1Q Median     3Q    Max
-3.941 -1.600 -0.182  1.050  5.854
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
mc$wt       -3.87783    0.63273  -6.129 1.12e-06 ***
mc$hp       -0.03177    0.00903  -3.519  0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
AIC(model)
  • 11.
[1] 156.6523
The model above was constructed using the forward selection method.
drop1(lm(mc$mpg~mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb),test="F")
Single term deletions
Model:
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am + factor.gear + factor.carb
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    120.40  76.403
mc$disp      1    9.9672  130.37  76.948  1.2417   0.28267
mc$hp        1   25.6715  146.07  80.588  3.1982   0.09393 .
mc$drat      1    1.8208  122.22  74.884  0.2268   0.64074
mc$wt        1   25.5541  145.96  80.562  3.1836   0.09462 .
mc$qsec      1    1.2413  121.64  74.732  0.1546   0.69967
factor.cyl   2   10.9314  131.33  75.184  0.6809   0.52112
factor.vs    1    3.6299  124.03  75.354  0.4522   0.51151
factor.am    1    1.1420  121.55  74.705  0.1423   0.71132
factor.gear  2    3.9729  124.38  73.442  0.2475   0.78390
factor.carb  5   13.5989  134.00  69.828  0.3388   0.88144
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(lm(mc$mpg~mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear),test="F")
Single term deletions
Model:
  • 12.
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am + factor.gear
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    134.00  69.828
mc$disp      1    0.9934  135.00  68.064  0.1483   0.70427
mc$hp        1   22.7935  156.79  72.855  3.4020   0.07998 .
mc$drat      1    1.1854  135.19  68.110  0.1769   0.67852
mc$wt        1   19.7963  153.80  72.237  2.9546   0.10107
mc$qsec      1    5.2634  139.26  69.061  0.7856   0.38598
factor.cyl   2   12.5642  146.57  68.696  0.9376   0.40811
factor.vs    1    3.6763  137.68  68.694  0.5487   0.46746
factor.am    1   11.9255  145.93  70.556  1.7799   0.19715
factor.gear  2    5.0215  139.02  67.005  0.3747   0.69220
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(lm(mc$mpg~mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am + factor.gear
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    135.00  68.064
mc$hp        1   23.8685  158.86  71.274  3.7130   0.06763 .
mc$drat      1    1.5589  136.55  66.431  0.2425   0.62751
mc$wt        1   27.6318  162.63  72.023  4.2984   0.05064 .
mc$qsec      1    4.6789  139.67  67.154  0.7279   0.40320
  • 13.
factor.cyl   2   18.6303  153.62  68.201  1.4491   0.25732
factor.vs    1    4.6788  139.67  67.154  0.7278   0.40321
factor.am    1   13.5206  148.52  69.119  2.1033   0.16176
factor.gear  2    5.5765  140.57  65.359  0.4337   0.65375
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(lm(mc$mpg~mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    140.57  65.359
mc$hp        1    18.566  159.14  67.329  3.0378   0.09470 .
mc$drat      1     0.666  141.24  63.511  0.1090   0.74426
mc$wt        1    38.996  179.57  71.194  6.3804   0.01888 *
mc$qsec      1     2.778  143.35  63.986  0.4545   0.50692
factor.cyl   2    17.987  158.56  65.212  1.4715   0.25040
factor.vs    1     2.644  143.22  63.956  0.4326   0.51726
factor.am    1    16.244  156.81  66.859  2.6578   0.11666
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am),test="F")
Single term deletions
Model:
  • 14.
mc$mpg ~ mc$hp + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    141.24  63.511
mc$hp        1    18.184  159.42  65.386  3.0899   0.09153 .
mc$wt        1    39.645  180.88  69.428  6.7367   0.01586 *
mc$qsec      1     2.442  143.68  62.059  0.4150   0.52557
factor.cyl   2    18.580  159.82  63.466  1.5786   0.22693
factor.vs    1     2.744  143.98  62.126  0.4663   0.50124
factor.am    1    18.885  160.12  65.527  3.2090   0.08585 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+factor.cyl+factor.vs+factor.am),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.vs + factor.am
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    143.68  62.059
mc$hp        1    36.344  180.02  67.275  6.3238   0.01871 *
mc$wt        1    41.088  184.77  68.108  7.1493   0.01302 *
factor.cyl   2    25.284  168.96  63.246  2.1997   0.13183
factor.vs    1     7.346  151.03  61.655  1.2782   0.26897
factor.am    1    16.443  160.12  63.527  2.8611   0.10317
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+factor.cyl+factor.am),test="F")
Single term deletions
  • 15.
Model:
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    151.03  61.655
mc$hp        1    31.943  182.97  65.794  5.4991  0.026935 *
mc$wt        1    46.173  197.20  68.191  7.9490  0.009081 **
factor.cyl   2    29.265  180.29  63.323  2.5191  0.099998 .
factor.am    1     9.752  160.78  61.657  1.6789  0.206460
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> drop1(lm(mc$mpg~mc$hp+mc$wt+factor.cyl),test="F")
Single term deletions
Model:
mc$mpg ~ mc$hp + mc$wt + factor.cyl
            Df Sum of Sq     RSS     AIC F value    Pr(>F)
<none>                    160.78  61.657
mc$hp        1    22.281  183.06  63.810  3.7417 0.0636127 .
mc$wt        1   116.390  277.17  77.084 19.5458 0.0001442 ***
factor.cyl   2    34.270  195.05  63.840  2.8776 0.0736450 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> modelII<-lm(mc$mpg~mc$hp+mc$wt+factor.cyl)
> summary(modelII)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl)
  • 16.
Residuals:
    Min      1Q  Median      3Q     Max
-4.2612 -1.0320 -0.3210  0.9281  5.3947
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 35.84600    2.04102  17.563 2.67e-16 ***
mc$hp       -0.02312    0.01195  -1.934 0.063613 .
mc$wt       -3.18140    0.71960  -4.421 0.000144 ***
factor.cyl6 -3.35902    1.40167  -2.396 0.023747 *
factor.cyl8 -3.18588    2.17048  -1.468 0.153705
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.44 on 27 degrees of freedom
Multiple R-squared: 0.8572, Adjusted R-squared: 0.8361
F-statistic: 40.53 on 4 and 27 DF, p-value: 4.869e-11
> AIC(modelII)
[1] 154.4692
The model above was constructed using the backward elimination method.
> modelIII<-(lm(mc$mpg~mc$disp+mc$hp+mc$drat+mc$wt+mc$qsec+factor.cyl+factor.vs+factor.am+factor.gear+factor.carb))
> step<-stepAIC(modelIII,direction="both")
Start: AIC=76.4
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am + factor.gear + factor.carb
              Df Sum of Sq    RSS    AIC
  • 17.
- factor.carb  5   13.5989 134.00 69.828
- factor.gear  2    3.9729 124.38 73.442
- factor.am    1    1.1420 121.55 74.705
- mc$qsec      1    1.2413 121.64 74.732
- mc$drat      1    1.8208 122.22 74.884
- factor.cyl   2   10.9314 131.33 75.184
- factor.vs    1    3.6299 124.03 75.354
<none>                     120.40 76.403
- mc$disp      1    9.9672 130.37 76.948
- mc$wt        1   25.5541 145.96 80.562
- mc$hp        1   25.6715 146.07 80.588
Step: AIC=69.83
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am + factor.gear
              Df Sum of Sq    RSS    AIC
- factor.gear  2    5.0215 139.02 67.005
- mc$disp      1    0.9934 135.00 68.064
- mc$drat      1    1.1854 135.19 68.110
- factor.vs    1    3.6763 137.68 68.694
- factor.cyl   2   12.5642 146.57 68.696
- mc$qsec      1    5.2634 139.26 69.061
<none>                     134.00 69.828
- factor.am    1   11.9255 145.93 70.556
- mc$wt        1   19.7963 153.80 72.237
- mc$hp        1   22.7935 156.79 72.855
+ factor.carb  5   13.5989 120.40 76.403
  • 18.
Step: AIC=67
mc$mpg ~ mc$disp + mc$hp + mc$drat + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am
              Df Sum of Sq    RSS    AIC
- mc$drat      1    0.9672 139.99 65.227
- factor.cyl   2   10.4247 149.45 65.319
- mc$disp      1    1.5483 140.57 65.359
- factor.vs    1    2.1829 141.21 65.503
- mc$qsec      1    3.6324 142.66 65.830
<none>                     139.02 67.005
- factor.am    1   16.5665 155.59 68.608
- mc$hp        1   18.1768 157.20 68.937
+ factor.gear  2    5.0215 134.00 69.828
- mc$wt        1   31.1896 170.21 71.482
+ factor.carb  5   14.6475 124.38 73.442
Step: AIC=65.23
mc$mpg ~ mc$disp + mc$hp + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am
              Df Sum of Sq    RSS    AIC
- mc$disp      1    1.2474 141.24 63.511
- factor.vs    1    2.3403 142.33 63.757
- factor.cyl   2   12.3267 152.32 63.927
- mc$qsec      1    3.1000 143.09 63.928
<none>                     139.99 65.227
  • 19.
+ mc$drat      1    0.9672 139.02 67.005
- mc$hp        1   17.7382 157.73 67.044
- factor.am    1   19.4660 159.46 67.393
+ factor.gear  2    4.8033 135.19 68.110
- mc$wt        1   30.7151 170.71 69.574
+ factor.carb  5   13.0509 126.94 72.095
Step: AIC=63.51
mc$mpg ~ mc$hp + mc$wt + mc$qsec + factor.cyl + factor.vs + factor.am
              Df Sum of Sq    RSS    AIC
- mc$qsec      1     2.442 143.68 62.059
- factor.vs    1     2.744 143.98 62.126
- factor.cyl   2    18.580 159.82 63.466
<none>                     141.24 63.511
+ mc$disp      1     1.247 139.99 65.227
+ mc$drat      1     0.666 140.57 65.359
- mc$hp        1    18.184 159.42 65.386
- factor.am    1    18.885 160.12 65.527
+ factor.gear  2     4.684 136.55 66.431
- mc$wt        1    39.645 180.88 69.428
+ factor.carb  5     2.331 138.91 72.978
Step: AIC=62.06
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.vs + factor.am
              Df Sum of Sq    RSS    AIC
- factor.vs    1     7.346 151.03 61.655
  • 20.
<none>                     143.68 62.059
- factor.cyl   2    25.284 168.96 63.246
+ mc$qsec      1     2.442 141.24 63.511
- factor.am    1    16.443 160.12 63.527
+ mc$disp      1     0.589 143.09 63.928
+ mc$drat      1     0.330 143.35 63.986
+ factor.gear  2     3.437 140.24 65.284
- mc$hp        1    36.344 180.02 67.275
- mc$wt        1    41.088 184.77 68.108
+ factor.carb  5     3.480 140.20 71.275
Step: AIC=61.65
mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am
              Df Sum of Sq    RSS    AIC
<none>                     151.03 61.655
- factor.am    1     9.752 160.78 61.657
+ factor.vs    1     7.346 143.68 62.059
+ mc$qsec      1     7.044 143.98 62.126
- factor.cyl   2    29.265 180.29 63.323
+ mc$disp      1     0.617 150.41 63.524
+ mc$drat      1     0.220 150.81 63.608
+ factor.gear  2     1.361 149.66 65.365
- mc$hp        1    31.943 182.97 65.794
- mc$wt        1    46.173 197.20 68.191
+ factor.carb  5     5.633 145.39 70.438
  • 21.
> model3<-lm(mc$mpg~mc$hp+mc$wt+factor.cyl+factor.am)
> summary(model3)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am)
Residuals:
    Min      1Q  Median      3Q     Max
-3.9387 -1.2560 -0.4013  1.1253  5.0513
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.70832    2.60489  12.940 7.73e-13 ***
mc$hp       -0.03211    0.01369  -2.345  0.02693 *
mc$wt       -2.49683    0.88559  -2.819  0.00908 **
factor.cyl6 -3.03134    1.40728  -2.154  0.04068 *
factor.cyl8 -2.16368    2.28425  -0.947  0.35225
factor.am1   1.80921    1.39630   1.296  0.20646
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.41 on 26 degrees of freedom
Multiple R-squared: 0.8659, Adjusted R-squared: 0.8401
F-statistic: 33.57 on 5 and 26 DF, p-value: 1.506e-10
> AIC(model3)
[1] 154.4669
The model above was constructed using the stepwise regression method.
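The slides never show how `factor.cyl`, `factor.vs` and the other `factor.*` variables were created, or which package `stepAIC` comes from. The sketch below is one plausible reconstruction, assuming the factors were built with `factor()` on the corresponding mtcars columns and that `stepAIC` is the function from the MASS package. It uses the `data =` interface instead of `mc$` terms, so the term names differ from the slides; `full_fit` and `step_fit` are illustrative names.

```r
library(MASS)  # provides stepAIC()

mc <- mtcars

# Assumption: the slides' factor.cyl, factor.vs, factor.am, factor.gear and
# factor.carb were created by converting these columns to factors
for (v in c("cyl", "vs", "am", "gear", "carb")) {
  mc[[v]] <- factor(mc[[v]])
}

# Full model with all ten predictors, then stepwise search in both directions
full_fit <- lm(mpg ~ ., data = mc)
step_fit <- stepAIC(full_fit, direction = "both", trace = 0)

print(formula(step_fit))
print(AIC(step_fit))
```

Setting `trace = 0` suppresses the step-by-step tables; leaving it at the default prints output in the same form as the tables on the slides.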
  • 22.
Now that I have three regression models built with three different methods, I can choose the best-fitting one. Below I reprint each model summary so the results are easy to compare.
> summary(model)
Call:
lm(formula = mc$mpg ~ mc$wt + mc$hp)
Residuals:
   Min     1Q Median     3Q    Max
-3.941 -1.600 -0.182  1.050  5.854
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727    1.59879  23.285  < 2e-16 ***
mc$wt       -3.87783    0.63273  -6.129 1.12e-06 ***
mc$hp       -0.03177    0.00903  -3.519  0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
  • 23.
> summary(modelII)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl)
Residuals:
    Min      1Q  Median      3Q     Max
-4.2612 -1.0320 -0.3210  0.9281  5.3947
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 35.84600    2.04102  17.563 2.67e-16 ***
mc$hp       -0.02312    0.01195  -1.934 0.063613 .
mc$wt       -3.18140    0.71960  -4.421 0.000144 ***
factor.cyl6 -3.35902    1.40167  -2.396 0.023747 *
factor.cyl8 -3.18588    2.17048  -1.468 0.153705
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.44 on 27 degrees of freedom
Multiple R-squared: 0.8572, Adjusted R-squared: 0.8361
F-statistic: 40.53 on 4 and 27 DF, p-value: 4.869e-11
  • 24.
> summary(model3)
Call:
lm(formula = mc$mpg ~ mc$hp + mc$wt + factor.cyl + factor.am)
Residuals:
    Min      1Q  Median      3Q     Max
-3.9387 -1.2560 -0.4013  1.1253  5.0513
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.70832    2.60489  12.940 7.73e-13 ***
mc$hp       -0.03211    0.01369  -2.345  0.02693 *
mc$wt       -2.49683    0.88559  -2.819  0.00908 **
factor.cyl6 -3.03134    1.40728  -2.154  0.04068 *
factor.cyl8 -2.16368    2.28425  -0.947  0.35225
factor.am1   1.80921    1.39630   1.296  0.20646
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.41 on 26 degrees of freedom
Multiple R-squared: 0.8659, Adjusted R-squared: 0.8401
F-statistic: 33.57 on 5 and 26 DF, p-value: 1.506e-10
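Because the three models are nested (each adds terms to the previous one), partial F tests are another way to compare them alongside the summaries above. This is a minimal sketch, assuming `cyl` and `am` were converted with `factor()`; the names `fit_fwd`, `fit_bwd`, `fit_step`, `cyl_test` and `am_test` are illustrative and not from the slides.

```r
mc <- mtcars
mc$cyl <- factor(mc$cyl)  # assumed equivalent of the slides' factor.cyl
mc$am  <- factor(mc$am)   # assumed equivalent of the slides' factor.am

fit_fwd  <- lm(mpg ~ wt + hp, data = mc)             # forward selection model
fit_bwd  <- lm(mpg ~ hp + wt + cyl, data = mc)       # backward elimination model
fit_step <- lm(mpg ~ hp + wt + cyl + am, data = mc)  # stepwise model

# Does adding cyl improve on wt + hp? Does adding am improve on that?
cyl_test <- anova(fit_fwd, fit_bwd)
am_test  <- anova(fit_bwd, fit_step)
print(cyl_test)
print(am_test)
```

The F values should match the `factor.cyl` and `factor.am` rows of the `drop1` tables on slide 15 (2.8776 and 1.6789). Neither addition is significant at the 0.05 level, which is worth weighing against the small AIC differences.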
  • 25.
> AIC(model)
[1] 156.6523
> AIC(modelII)
[1] 154.4692
> AIC(model3)
[1] 154.4669
After examining and comparing all the regression models, I conclude that weight is the variable that explains the most variability in mpg in all three models, with horsepower second. model3 has the lowest AIC value, although only marginally lower than modelII (154.4669 versus 154.4692). AIC measures relative model quality by balancing goodness of fit against the number of parameters; it is not a check for multicollinearity. model3 also has the highest adjusted R-squared, explaining approximately 84% of the variation in mpg.
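The comparison above can be collected into a single table. The sketch below refits the three models with the `data =` interface, assuming `cyl` and `am` were converted with `factor()`; the `fits` list and the `comparison` data frame are illustrative names.

```r
mc <- mtcars
mc$cyl <- factor(mc$cyl)  # assumed equivalent of the slides' factor.cyl
mc$am  <- factor(mc$am)   # assumed equivalent of the slides' factor.am

fits <- list(
  model   = lm(mpg ~ wt + hp, data = mc),
  modelII = lm(mpg ~ hp + wt + cyl, data = mc),
  model3  = lm(mpg ~ hp + wt + cyl + am, data = mc)
)

# One row per model: AIC, adjusted R-squared, and number of coefficients
comparison <- data.frame(
  AIC    = sapply(fits, AIC),
  adj_r2 = sapply(fits, function(f) summary(f)$adj.r.squared),
  n_coef = sapply(fits, function(f) length(coef(f)))
)
print(round(comparison, 4))
```

Listing the coefficient count next to AIC shows the trade-off directly: model3 gains its small AIC advantage at the cost of one more coefficient than modelII.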