AUTO MPG REGRESSION ANALYSIS
ANIRUDH SRINATH
SHANKAR PRASAAD
MATHU BALAN
INTRODUCTION
• The objective of this project is to study the effect of Horsepower, Displacement, Cylinders, Acceleration and Weight on Miles Per Gallon (MPG). The dataset was obtained from the UCI website, and regression analysis was conducted on it.
• We chose this dataset because of its practical applications: miles per gallon (mpg) is a useful figure to know when purchasing a car.
METHODOLOGY
The model used for the regression analysis is a multiple regression: it has one dependent variable and more than one independent variable, so Multiple Regression Analysis is conducted. The variable being predicted is called the dependent variable. The variables used to predict it are called the independent variables.
Data Sourcing
The data is taken from the University of California, Irvine (UCI) Machine Learning Repository. It has been used extensively by students, educators and researchers all over the world and is a common source of datasets for regression analysis.
Link to the dataset: http://archive.ics.uci.edu/ml/datasets/Auto+MPG
VARIABLES
DEPENDENT VARIABLE:
Miles per Gallon(mpg) – Continuous
INDEPENDENT VARIABLES:
Cylinders - Multi-valued discrete - Number of cylinders in the car (3, 4, 5, 6, 8)
Displacement - Continuous - Engine displacement, i.e. the volume swept by the pistons
Horsepower - Continuous - Power of the car's engine
Weight - Continuous - Weight of the car in lbs
Acceleration - Continuous - Acceleration of the car
MODEL 1 - Multiple Regression Analysis
Miles Per Gallon (MPG) is regressed on the four independent variables; this is the first model of our regression analysis. The R-squared shows that the model explains 70.70% of the variation in the dependent variable (MPG).
MODEL 2 - DEPENDENT VARIABLE TRANSFORMATION
• After applying a log transformation to the dependent variable (mpg), we found that the R-squared improved from 70.70% to 78.98%. The log transformation also brought the data closer to a normal distribution, as can be seen from the histogram. The formula is given below.
• L_mpg = β0 + β1Displacement + β2Horsepower + β3Acceleration + β4Weight
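A minimal sketch of what the log-lin form implies, assuming L_mpg is the natural log of mpg. The function names are illustrative and not taken from the slides. In a log-lin model a coefficient β on a regressor corresponds to an exact change of 100·(exp(β) − 1) percent in mpg per unit of that regressor, which is close to 100·β percent when β is small.

```python
import math

def log_transform(mpg_values):
    """Return the natural log of each mpg value (the L_mpg column)."""
    return [math.log(v) for v in mpg_values]

def pct_change_per_unit(beta):
    """Exact percent change in mpg for a one-unit increase in a regressor
    whose coefficient in the log-lin model is `beta`."""
    return 100.0 * (math.exp(beta) - 1.0)

# Hypothetical coefficient, for illustration only (not from the slides).
example_beta = 0.01
print(pct_change_per_unit(example_beta))
```

For small coefficients the exact figure and the 100·β shortcut nearly coincide, which is why dummy coefficients in the later models are read directly as percentages.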
CORRELATION
ANALYSIS
• Here we found strong correlation between
• 1) Displacement and Horsepower
• 2) Weight and Horsepower
• 3) Weight and Displacement
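The pairwise correlations above can be computed with the Pearson coefficient. The sketch below uses only the standard library. The sample values are made up for illustration and are not rows from the Auto MPG data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical values, for illustration only.
displacement = [100.0, 150.0, 250.0, 350.0]
horsepower = [70.0, 95.0, 110.0, 165.0]
print(pearson_r(displacement, horsepower))
```

Values close to +1 or −1 for a pair of regressors are a warning sign for multicollinearity in the regression.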
HISTOGRAM & SCATTER PLOT FOR LIN-LIN MODEL
[Figures: scatter plot; histogram]
HISTOGRAM & SCATTER PLOT FOR LOG-LIN MODEL
[Figures: scatter plot; histogram]
As the graphs show, the log-lin model appears to be the better model because its distribution is closer to normal.
Hypothesis Testing - Paired sample t test
A hypothesis test is performed to check whether the coefficients of two variables are equal.
MODEL 3 - DUMMY VARIABLE ANALYSIS (STEP 1)
The first step in creating the dummy variables is to identify the number of categories in the variable. As seen from the table, Cylinders has 5 categories, with Eight having the highest frequency.
STEP 2 - DUMMY VARIABLE ANALYSIS
Multiple regression is performed using the dummy-encoded Cylinders variable, with cylinder category 5 as the base category. Category 5 is Three cylinders, which has a frequency of 3.
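Dummy encoding with a base category can be sketched as follows. The function and the column names (`cyl_4`, ...) are illustrative assumptions, not the names used in the slides. The base category gets no column of its own, so every dummy coefficient is read relative to it.

```python
def dummy_encode(values, base):
    """Return one 0/1 column per category, omitting `base` (the reference group)."""
    categories = sorted(set(values))
    if base not in categories:
        raise ValueError(f"base category {base!r} not present in the data")
    return {
        f"cyl_{c}": [1 if v == c else 0 for v in values]
        for c in categories
        if c != base
    }

# Hypothetical cylinder counts, for illustration only.
cylinders = [3, 4, 4, 6, 8, 5]
columns = dummy_encode(cylinders, base=3)
for name, col in columns.items():
    print(name, col)
```

With 5 categories this produces 4 dummy columns. Including all 5 alongside an intercept would make the regressors perfectly collinear.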
MODEL 4 - INTERACTION TERMS & REGRESSION ANALYSIS
Regression is done on the interaction term (Displacement × Horsepower) and the other independent variables. Displacement and Horsepower were chosen because of their high correlation.
MODEL 5 - Regression Analysis on Dummy Variables & Interaction Terms
Regression analysis is done on the dummy variables and the interaction term together to check whether the R-squared increases. The equation for the model is given below.
L_mpg = β0 + β1Displacement + β2Horsepower + β3Acceleration + β4Weight + β5CYLINDER_COUNT4
+ β6CYLINDER_COUNT2 + β7CYLINDER_COUNT3 + β8CYLINDER_COUNT5 + β9disp_horse
OBSERVATIONS FROM MODEL 5
• Here CYLINDER_COUNT1 is kept as the base category, and L_mpg is regressed on the remaining dummy variables and the other independent variables.
• We can see that CYLINDER_COUNT4 has 3.3% lower mpg than CYLINDER_COUNT1.
• CYLINDER_COUNT2 is predicted to have 11.3 - (-3.3) = 14.6% more mpg than CYLINDER_COUNT4.
• To check whether this difference is significant, we fitted another model with CYLINDER_COUNT4 as the base category.
Test For Significant Difference
Here CYLINDER_COUNT4 is kept as the base category. The regression shows that CYLINDER_COUNT2 has 14.6% more mpg than CYLINDER_COUNT4, which agrees with the difference computed from the previous model.
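The arithmetic behind changing the base category can be sketched as below. The coefficients 0.113 and -0.033 are the 11.3% and -3.3% effects quoted above, written as log points. The helper name `rebase` is an assumption for illustration. Changing the base shifts every dummy coefficient by the coefficient of the new base, so the point estimates are unchanged. What the re-fitted model adds is the standard error and t-statistic for the difference.

```python
def rebase(coefs, old_base, new_base):
    """Re-express dummy coefficients relative to `new_base`.

    `coefs` maps each non-base category to its coefficient relative to
    `old_base`, whose own coefficient is implicitly zero.
    """
    shift = coefs[new_base]
    rebased = {name: b - shift for name, b in coefs.items() if name != new_base}
    rebased[old_base] = -shift  # the old base now gets its own coefficient
    return rebased

# Values quoted on the slides, relative to CYLINDER_COUNT1.
relative_to_count1 = {"CYLINDER_COUNT2": 0.113, "CYLINDER_COUNT4": -0.033}
relative_to_count4 = rebase(relative_to_count1, "CYLINDER_COUNT1", "CYLINDER_COUNT4")
print(relative_to_count4)
```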
Testing Differences Between Groups (F-Test)
L_mpg = β0 + 𝛿0 CYLINDER_COUNT1 + β1 displacement + 𝛿1 c1_disp + β2 horsepower + 𝛿2 c1_horse + β3 weight + 𝛿3 c1_weight + β4 acceleration + 𝛿4 c1_acc
Null hypothesis: 𝛿0 = 𝛿1 = 𝛿2 = 𝛿3 = 𝛿4 = 0, i.e. there is no difference between the groups.
Alternate hypothesis: the null hypothesis is false, i.e. there is a difference between the groups.
The F-statistic computed from the restricted and unrestricted models is used to test for a difference between the groups.
UNRESTRICTED MODEL
The unrestricted model contains the independent variables, the dummy variable (CYLINDER_COUNT1) and the products of the dummy variable with each independent variable.
RESTRICTED MODEL
The restricted model is the regression on the base model (Model 2), without the dummy variable or its interaction terms.
F-Test to Determine Difference between Groups
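$$F = \frac{(R_u^2 - R_r^2)/q}{(1 - R_u^2)/(n - k - 1)} = \frac{(0.8154 - 0.7898)/5}{(1 - 0.8154)/382} = 10.59$$
Here q = 5 is the number of restrictions and n - k - 1 = 382 is the residual degrees of freedom of the unrestricted model.
Since 10.59 is greater than the F-table value for (5, 382) degrees of freedom, which is 2.2141, we reject the null hypothesis and conclude that there are differences between the groups.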
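The calculation above can be reproduced with a few lines of Python. This is a sketch of the R-squared form of the F-test. The function name is an assumption, and the inputs are the values quoted on the slide.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, q, df_resid):
    """F-statistic for q restrictions, from the R-squared of both models.

    df_resid is n - k - 1 for the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / df_resid
    return numerator / denominator

# Values from the slide: R2_u = 0.8154, R2_r = 0.7898, q = 5, n - k - 1 = 382.
f_groups = f_stat_from_r2(0.8154, 0.7898, q=5, df_resid=382)
critical_value = 2.2141  # F-table value quoted on the slide
print(f_groups, f_groups > critical_value)
```

This form is valid only when both models have the same dependent variable, which holds here because both use L_mpg.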
Test for Heteroskedasticity - Breusch Pagan Test
Multiple Regression is done using Log-Lin Model to
check for heteroskedasticity.
As seen from the table, the residuals (the estimated error term) are predicted and squared, and the squared residuals are then regressed on the regressors.
Hypothesis Testing for Heteroskedasticity
Null hypothesis: βdisplacement = βhorsepower = βweight = βacceleration = 0 (no heteroskedasticity)
Alternate hypothesis: there is heteroskedasticity
$$F = \frac{R_u^2/k}{(1 - R_u^2)/(n - k - 1)} = \frac{0.05/4}{(1 - 0.05)/387} = 5.092$$
Since 5.092 is greater than the F-table value for (4, 387) degrees of freedom, which is 2.3719, the null is rejected. So our model exhibits heteroskedasticity.
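The F form of the Breusch-Pagan test uses only the R-squared of the auxiliary regression. The sketch below reproduces the figure above. The function name is illustrative, and the inputs are the values quoted on the slide.

```python
def overall_f_stat(r2, k, df_resid):
    """F-statistic for the joint significance of all k slopes in a regression."""
    return (r2 / k) / ((1.0 - r2) / df_resid)

# Values from the slide: auxiliary R2 = 0.05, k = 4 regressors, n - k - 1 = 387.
bp_f = overall_f_stat(0.05, k=4, df_resid=387)
critical_value = 2.3719  # F-table value quoted on the slide
print(bp_f, bp_f > critical_value)
```

A small auxiliary R-squared can still be significant: with 387 residual degrees of freedom, 5% of explained variance in the squared residuals is enough to reject homoskedasticity.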
White Test for Heteroskedasticity
Multiple regression is done using the log-lin model.
Regression on the squares and cross products of the regressors
• gen disp2 = displacement^2
• gen horsepower2 = horsepower^2
• gen acceleration2 = acceleration^2
• gen weight2 = weight^2
• gen disp_acceleration = displacement * acceleration
• gen horse_acc = horsepower * acceleration
• gen weight_acc = weight * acceleration
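The generated variables above can be built programmatically. The sketch below produces every square and every pairwise cross product for one observation. Note the assumption: the textbook White test uses all pairwise products, whereas the slide lists only the three involving acceleration. The column naming is illustrative.

```python
from itertools import combinations

def white_terms(row):
    """Return squares and all pairwise cross products of the regressors in `row`."""
    names = list(row)
    terms = {f"{name}_sq": row[name] ** 2 for name in names}
    for a, b in combinations(names, 2):
        terms[f"{a}_x_{b}"] = row[a] * row[b]
    return terms

# Hypothetical observation, for illustration only.
observation = {"displacement": 2.0, "horsepower": 3.0, "weight": 4.0, "acceleration": 5.0}
terms = white_terms(observation)
print(len(terms), sorted(terms))
```

With 4 regressors this gives 4 squares and 6 cross products. Together with the 4 original regressors, the auxiliary regression then has 14 slope coefficients.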
Hypothesis Testing
Null hypothesis: βdisplacement = βhorsepower = βweight = βacceleration = 0
Alternate hypothesis: there is heteroskedasticity
The F statistic (90.44482) is greater than the F-table value (8.08); therefore we reject the null and confirm that there is heteroskedasticity.
Conclusion for Heteroskedasticity
As seen from the graph and the two tests, we can determine that there is
heteroskedasticity.
HETEROSKEDASTICITY ROBUST STANDARD ERRORS (HRSE)
Due to the presence of heteroskedasticity, the usual OLS variance and standard error estimates are not valid. Therefore we need to find heteroskedasticity-robust standard errors.
When a model exhibits heteroskedasticity, it is better to look at the robust standard errors than the OLS standard errors.
Regression on Log-Lin Model
Robust Standard Errors
Summary
Model No   R-Squared   Adjusted R-Squared
Model 1    0.7070      0.7040
Model 2    0.7898      0.7876
Model 3    0.8112      0.8073
Model 4    0.8134      0.8110
Model 5    0.8286      0.8184
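The adjusted R-squared column follows from the R-squared column through the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below checks this for Models 1 and 2, assuming n = 392 observations. That value is inferred from the degrees of freedom quoted in the Breusch-Pagan test (387 + 4 + 1) and is not stated on the slides.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k regressors fitted on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

N_OBS = 392  # assumption: inferred from n - k - 1 = 387 with k = 4

# Models 1 and 2 each have k = 4 regressors.
print(round(adjusted_r2(0.7070, N_OBS, 4), 4))
print(round(adjusted_r2(0.7898, N_OBS, 4), 4))
```

Because the adjustment penalises every added regressor, the adjusted value is the fairer basis for comparing Models 3 to 5, which add dummy and interaction terms.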
