A report comparing a logistic regression model and a classification tree for predicting whether a car is of US origin (0) or not (1) from observed variables. The aim of this report is to compare the accuracy of the two models.
Auto_MPG_Report
Quentin Adam
12/6/2019
Logistic Regression
This model estimates the probability of a car being of US origin from its number of cylinders, mpg, horsepower (hp), acceleration and displacement.
##
## Call:
## glm(formula = orogin ~ cylinders + mpg + horsepower + acceleration +
## displacement, family = "binomial", data = mydf)
##
## Deviance Residuals:
##      Min        1Q    Median        3Q       Max
## -2.82851  -0.54332   0.00894   0.10536   2.25479
##
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)   3.48435    3.02361   1.152  0.24916
## cylinders    -1.19654    0.38396  -3.116  0.00183 **
## mpg          -0.01945    0.03865  -0.503  0.61481
## horsepower   -0.07389    0.01910  -3.868  0.00011 ***
## acceleration -0.16920    0.08865  -1.909  0.05630 .
## displacement  0.08779    0.01176   7.466 8.24e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 518.67 on 391 degrees of freedom
## Residual deviance: 214.85 on 386 degrees of freedom
## (14 observations deleted due to missingness)
## AIC: 226.85
##
## Number of Fisher Scoring iterations: 8
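As a rough illustration of how the coefficients above turn into a probability, the sketch below plugs a hypothetical car into the fitted equation and applies the inverse logit. The coefficient values are copied from the summary; the car's values (`car`) are made up for illustration and are not taken from the data.

```r
# Coefficients copied from the glm summary above
coefs <- c(intercept = 3.48435, cylinders = -1.19654, mpg = -0.01945,
           horsepower = -0.07389, acceleration = -0.16920,
           displacement = 0.08779)

# Hypothetical car (illustrative values only, not a row of mydf)
car <- c(cylinders = 4, mpg = 30, horsepower = 70,
         acceleration = 16, displacement = 100)

# Linear predictor (log-odds) = intercept + sum of coefficient * value
log_odds <- coefs[["intercept"]] + sum(coefs[names(car)] * car)

# Inverse logit turns the log-odds into the probability that orogin = 1
prob <- plogis(log_odds)
print(prob)
```

Because the inverse logit is nonlinear, the same change in one variable shifts the probability by a different amount depending on the values of the other variables.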
Decision “Classification” Tree
This shows the decision tree used to assess whether a car is of US origin based on its number of cylinders, mpg, hp, acceleration and displacement.
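The report does not show how the tree was fitted, so the following is only a minimal sketch of one common approach, assuming the `rpart` package. Since `mydf` is not available here, it uses the built-in `mtcars` data as a stand-in, with transmission type (`am`) playing the role of the binary outcome and `qsec` standing in for acceleration.

```r
library(rpart)  # assumption: the original report may have used a different package

# Stand-in data: mtcars, with am (0/1) converted to a factor outcome
df <- transform(mtcars, am = factor(am))

# Fit a classification tree on predictors analogous to those in the report
fit <- rpart(am ~ cyl + mpg + hp + qsec + disp, data = df, method = "class")

# Predicted class for each car and the share of correct predictions
pred <- predict(fit, df, type = "class")
acc <- mean(pred == df$am)
print(acc)
```

This accuracy is measured on the same rows used for fitting, so it is optimistic; a held-out test set would give a fairer estimate.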
Comparing the two Models for Accuracy
This performance comparison tells us that the decision tree is more accurate than the logistic model at predicting the origin of the vehicle: the decision tree curve (in black) reaches a higher true positive rate, in other words it more often says that the vehicle is of US origin when it actually is of US origin.
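Assuming the comparison plot is a ROC-style curve (true positive rate against false positive rate), the sketch below shows in base R how such curves and the area under them can be computed for two sets of predicted scores. The labels and scores are toy values invented for illustration, and `roc_points` and `auc` are hypothetical helper names, not functions from the report.

```r
# TPR and FPR at every distinct threshold, for numeric scores and 0/1 labels
roc_points <- function(score, label) {
  thresholds <- sort(unique(score), decreasing = TRUE)
  tpr <- sapply(thresholds, function(t) mean(score[label == 1] >= t))
  fpr <- sapply(thresholds, function(t) mean(score[label == 0] >= t))
  data.frame(fpr = c(0, fpr), tpr = c(0, tpr))  # start the curve at (0, 0)
}

# Area under the curve by the trapezoid rule
auc <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Toy example: true labels and the scores from two hypothetical models
label   <- c(1, 1, 0, 1, 0, 0)
score_a <- c(0.9, 0.8, 0.7, 0.6, 0.3, 0.2)
score_b <- c(0.9, 0.4, 0.7, 0.6, 0.8, 0.2)

auc_a <- auc(roc_points(score_a, label))
auc_b <- auc(roc_points(score_b, label))
print(c(model_a = auc_a, model_b = auc_b))
```

The model whose curve lies closer to the top-left corner has the larger area, which gives a single number for comparing the two models.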
Probabilities
For each additional unit of displacement, the odds of the vehicle being of US origin increase by about 9% (exp(0.08779) - 1), holding the other variables constant.
## [1] 0.09175883
For each additional cylinder, the odds of the vehicle being of US origin decrease by about 70% (exp(-1.19654) - 1), holding the other variables constant.
## [1] -0.6977619
For each additional unit of horsepower, the odds of the vehicle being of US origin decrease by about 7% (exp(-0.07389) - 1), holding the other variables constant.
## [1] -0.07122615
For each additional unit of acceleration, the odds of the vehicle being of US origin decrease by about 16% (exp(-0.16920) - 1), holding the other variables constant.
## [1] -0.15566
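The four values printed above equal exp(coefficient) - 1 for the displacement, cylinders, horsepower and acceleration coefficients in the regression summary, which is the proportional change in the odds (not the probability) per one-unit increase. A minimal sketch that reproduces them from the reported coefficients, and shows that the resulting change in probability depends on the starting point (the baseline log-odds of -2 and 0 are arbitrary illustrative choices):

```r
# Coefficients copied from the glm summary
coefs <- c(displacement = 0.08779, cylinders = -1.19654,
           horsepower = -0.07389, acceleration = -0.16920)

# Proportional change in the odds for a one-unit increase in each variable
odds_change <- exp(coefs) - 1
print(round(odds_change, 4))

# Change in probability for +1 unit of displacement at two baseline log-odds
baseline <- c(-2, 0)
delta_p <- plogis(baseline + coefs[["displacement"]]) - plogis(baseline)
print(delta_p)
```

The change in odds is the same at every baseline, while the change in probability is larger near a probability of 0.5 than near 0 or 1.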