Multiple regression analysis and stepwise regression
History:
The earliest form of regression was
the method of least squares, which was
published by Legendre in 1805, and by Gauss in
1809.
The term "regression" was coined by the British
biometrician Sir Francis Galton (1822-1911) to describe
a biological phenomenon. Galton's work on the inherited
characteristics of sweet peas led to the initial conception of
linear regression.
Introduction:
• Regression is a statistical technique for investigating
and modeling the relationship between variables.
• Applications of regression are numerous and occur
in almost every field, including engineering, the
physical and social sciences, and the biological
sciences.
• Usually, the investigator seeks to ascertain the causal
effect of one variable upon another: the effect of a
price increase upon demand, for example, or the effect
of changes in the money supply upon the inflation rate.
Definition:
Regression measures the average relationship between two
or more variables in terms of the original units of the data.
It is among the most widely used statistical techniques in
the social sciences, and it is also widely used in the
biological and physical sciences.
The regression equation is $y = a + bx$, where
Slope: $b = \dfrac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}$
Intercept: $a = \dfrac{\sum Y - b\sum X}{N}$
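The slope and intercept formulas can be checked numerically. The following is a minimal sketch, not a library routine: the function name `fit_line` and the four data points are made up for illustration, and the points were chosen to lie exactly on y = 1 + 2x so the result is easy to verify by hand.

```python
def fit_line(xs, ys):
    """Return (a, b) for y = a + b*x using the raw-sum formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a = (sum_y - b * sum_x) / n                # intercept
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # prints: 1.0 2.0
```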
Review of simple linear regression
A simple linear regression is carried out to estimate the
relationship between a dependent variable, Y, and a single
explanatory variable, X, given a set of data that includes
observations of both variables for a particular population.
• For example, a real estate agent wishes to examine the
relationship between the selling price of a home and its size
(measured in square feet).
• A random sample of 10 houses is selected.
Dependent variable (Y) = house price
Independent variable (X) = square feet
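Simple Linear Regression Model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where
• $Y_i$ is the dependent variable
• $\beta_0$ is the population Y intercept
• $\beta_1$ is the population slope coefficient
• $X_i$ is the independent variable
• $\varepsilon_i$ is the random error term
$\beta_0 + \beta_1 X_i$ is the linear component and $\varepsilon_i$ is the random error component.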
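The simple linear regression equation provides
an estimate of the population regression line:
$\hat{Y}_i = b_0 + b_1 X_i$
where
• $\hat{Y}_i$ is the estimated (or predicted) Y value for observation i
• $b_0$ is the estimate of the regression intercept
• $b_1$ is the estimate of the regression slope
• $X_i$ is the value of X for observation i
The individual random error terms $e_i$ have a mean of zero.
Prediction equation is given by:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ (here $\hat{\beta}_0 = b_0$ and $\hat{\beta}_1 = b_1$)
Estimation of coefficients:
$\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
where
$SS_{xx} = \sum_i (x_i - \bar{x})^2, \qquad SS_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y})$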
Measures of Variation
Total variation is made up of two parts:
$SST = SSR + SSE$
• Total Sum of Squares: $SST = \sum (Y_i - \bar{Y})^2$
• Regression Sum of Squares: $SSR = \sum (\hat{Y}_i - \bar{Y})^2$
• Error Sum of Squares: $SSE = \sum (Y_i - \hat{Y}_i)^2$
where:
$\bar{Y}$ = Average value of the dependent variable
$Y_i$ = Observed values of the dependent variable
$\hat{Y}_i$ = Predicted value of Y for the given $X_i$ value
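[Figure: Measures of Variation. For an observation $(X_i, Y_i)$ plotted on X and Y axes, the plot marks the deviations behind $SST = \sum (Y_i - \bar{Y})^2$, $SSE = \sum (Y_i - \hat{Y}_i)^2$ and $SSR = \sum (\hat{Y}_i - \bar{Y})^2$.]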
Coefficient of Determination, $r^2$
• The coefficient of determination is the
proportion of the total variation in the
dependent variable that is explained by
variation in the independent variable
• The coefficient of determination is also
called r-squared and is denoted as $r^2$
$r^2 = \dfrac{SSR}{SST} = \dfrac{\text{regression sum of squares}}{\text{total sum of squares}}$
note: $0 \le r^2 \le 1$
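The decomposition of total variation and the resulting r-squared can be worked through on a small sample. This is an illustrative sketch only: the function name `variation` and the five data points are hypothetical, and the line is fitted with the deviation-form estimates (SS_xy / SS_xx for the slope).

```python
def variation(xs, ys):
    """Fit y = b0 + b1*x by least squares and return (SST, SSR, SSE, r2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ss_xy / ss_xx            # estimated slope
    b0 = y_bar - b1 * x_bar       # estimated intercept
    y_hat = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)                 # total
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))    # unexplained
    if sst == 0:
        raise ValueError("y is constant; r-squared is undefined")
    return sst, ssr, sse, ssr / sst


# Hypothetical sample, chosen only to keep the arithmetic small.
sst, ssr, sse, r2 = variation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(sst, 6), round(ssr, 6), round(sse, 6), round(r2, 6))
```

For this sample the fitted line is y = 2.2 + 0.6x, and the three sums of squares come out as 6, 3.6 and 2.4, so 60% of the variation in y is explained by x.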
Multiple linear regression
Introduction:
The general purpose of multiple regression (the
term was first used by Pearson, 1908) is to learn more
about the relationship between several independent or
predictor variables and a dependent or criterion
variable.
Definition:
A regression model that relates a response variable to two
or more explanatory (regressor) variables by fitting a linear
equation to observed data is called a multiple regression
model. Every set of values of the independent variables is
associated with a value of the dependent variable y.
Suppose that the yield in pounds of conversion in a
chemical process depends on temperature and catalyst
concentration. A multiple regression model that might
describe the relationship is
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where y denotes the yield, $x_1$ denotes the temperature, and $x_2$
denotes the catalyst concentration. This is a multiple linear
regression model with two regressor variables.
The term linear is used because the equation is a linear function
of the unknown parameters $\beta_0$, $\beta_1$, and $\beta_2$; $\varepsilon$ is the error term.
The parameter $\beta_1$ indicates the expected
change in the response y per unit change in $x_1$ when $x_2$ is held
constant. Similarly, $\beta_2$ measures the expected change in y
per unit change in $x_2$ when $x_1$ is held constant.
In general, the response y may be related to k regressor (or
predictor) variables. The model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
is a multiple linear regression model with k regressors. The parameters
$\beta_j$, $j = 0, 1, \ldots, k$, are called regression coefficients.
The parameter $\beta_j$ represents the expected change in the response y
per unit change in $x_j$ when all of the remaining regressor variables $x_i$
($i \ne j$) are held constant. For this reason the parameters $\beta_j$, $j = 1, \ldots, k$, are
often called partial regression coefficients.
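A model with k regressors can be fitted by solving the least-squares normal equations. The sketch below is one possible way to do that in plain Python; the function name `fit_multiple`, the temperature and catalyst values, and the "true" coefficients 5, 0.5 and 2 are all invented for illustration. It assumes the regressors are not perfectly collinear.

```python
def fit_multiple(columns, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y on the given columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(aug[k][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for k in range(p):
            if k != c:
                f = aug[k][c] / aug[c][c]
                aug[k] = [a - f * b for a, b in zip(aug[k], aug[c])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


# Hypothetical, noise-free data for the yield example.
temperature = [80.0, 85.0, 90.0, 95.0, 100.0, 105.0]
catalyst = [1.0, 3.0, 2.0, 4.0, 1.5, 3.5]
yield_ = [5.0 + 0.5 * t + 2.0 * c for t, c in zip(temperature, catalyst)]

b0, b1, b2 = fit_multiple([temperature, catalyst], yield_)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the data contain no error term, the fit recovers the generating coefficients up to rounding. Here `b1` is the change in yield per degree with catalyst concentration held constant, which is the partial-coefficient interpretation given above.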
Assumptions of Regression
• For any given set of values of $x_1, x_2, \ldots, x_k$, the random
error $\varepsilon$ has a probability distribution with the following
properties:
1. Mean equal to 0
2. Variance equal to $\sigma^2$
3. Normal distribution
4. Random errors are independent
Regression Analysis: Model Building
• General Linear Model
• Determining When to Add or Delete Variables
• Analysis of a Larger Problem
• Multiple Regression Approach
to Analysis of Variance
General Linear Model
Models in which the parameters (β0, β1, . . . , βp)
all have exponents of one are called linear
models.
• First-Order Model with One Predictor
Variable
$y = \beta_0 + \beta_1 x_1 + \varepsilon$
Variable Selection Procedures
• Stepwise Regression
• Forward Selection
• Backward Elimination
These procedures are iterative: one independent variable at a time
is added or deleted, based on the F statistic.
Variable Selection Procedures
• F Test
• To test whether the addition of $x_2$ to a model
involving $x_1$ (or the deletion of $x_2$ from a model
involving $x_1$ and $x_2$) is statistically significant, use
$F = \dfrac{(SSE(\text{reduced}) - SSE(\text{full})) / \text{number of extra terms}}{MSE(\text{full})}$
The p-value corresponding to the F statistic is the
criterion used to determine if a variable should be added or
deleted.
• The overall significance of a regression with k regressors is tested with
$F_0 = MSR / MS_{Res}$, where $MSR = SSR / k$
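The partial F statistic for adding or deleting terms can be computed from two least-squares fits. The sketch below is illustrative: `sse`, `partial_f` and the eight-point data set are hypothetical names and values, and only the F statistic is returned (converting it to a p-value would need an F-distribution routine, which is outside this example).

```python
def sse(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(aug[k][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for k in range(p):
            if k != c:
                f = aug[k][c] / aug[c][c]
                aug[k] = [a - f * b for a, b in zip(aug[k], aug[c])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def partial_f(reduced_cols, extra_cols, y):
    """F = ((SSE(reduced) - SSE(full)) / extra terms) / MSE(full)."""
    full_cols = reduced_cols + extra_cols
    sse_reduced = sse(reduced_cols, y)
    sse_full = sse(full_cols, y)
    df_full = len(y) - (len(full_cols) + 1)   # residual degrees of freedom
    if df_full <= 0 or sse_full == 0:
        raise ValueError("full model leaves no residual variation to test against")
    return ((sse_reduced - sse_full) / len(extra_cols)) / (sse_full / df_full)


# Hypothetical data: y depends on both x1 and x2, plus small fixed "noise".
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
noise = [0.1, -0.2, 0.05, -0.1, 0.2, -0.05, 0.1, -0.1]
y = [1.0 + 2.0 * a + 3.0 * b + e for a, b, e in zip(x1, x2, noise)]

f_stat = partial_f([x1], [x2], y)
print(f_stat)
```

A large value of `f_stat` indicates that adding x2 to the model containing x1 reduces the error sum of squares by far more than the residual noise would explain.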
Forward Selection
• This procedure is similar to stepwise regression,
but does not permit a variable to be deleted
once it has entered.
• This forward-selection procedure starts with
no independent variables.
• It adds variables one at a time as long as a
significant reduction in the error sum of
squares (SSE) can be achieved.
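The forward-selection loop can be sketched as follows. This is a simplified illustration, not a reference implementation: the names `sse` and `forward_selection` and the data are hypothetical, and entry is decided by comparing the partial F statistic with a fixed threshold `f_in` (an assumed stand-in for the p-value criterion, which would need an F-distribution routine).

```python
def sse(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(p)]
    for c in range(p):                      # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(aug[k][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for k in range(p):
            if k != c:
                f = aug[k][c] / aug[c][c]
                aug[k] = [a - f * b for a, b in zip(aug[k], aug[c])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def forward_selection(candidates, y, f_in=4.0):
    """candidates maps name -> column. Returns names in order of entry."""
    n = len(y)
    selected = []
    remaining = list(candidates)
    current_sse = sse([], y)                # start with no independent variables
    while remaining:
        df = n - (len(selected) + 2)        # residual df after adding one term
        if df <= 0:
            break
        best_name, best_f, best_sse = None, 0.0, None
        for name in remaining:
            cols = [candidates[s] for s in selected] + [candidates[name]]
            new_sse = sse(cols, y)
            if new_sse > 0:
                f_stat = (current_sse - new_sse) / (new_sse / df)
            else:
                f_stat = float("inf")       # perfect fit
            if f_stat > best_f:
                best_name, best_f, best_sse = name, f_stat, new_sse
        if best_name is None or best_f < f_in:
            break                           # no significant reduction in SSE
        selected.append(best_name)          # once in, a variable is never deleted
        remaining.remove(best_name)
        current_sse = best_sse
    return selected


# Hypothetical data: y depends on x1 and x2; x3 is unrelated.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
noise = [0.1, -0.2, 0.05, -0.1, 0.2, -0.05, 0.1, -0.1]
y = [1.0 + 2.0 * a + 3.0 * b + e
     for a, b, e in zip(data["x1"], data["x2"], noise)]

order = forward_selection(data, y)
print(order)
```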
Backward Elimination
• This procedure begins with a model that
includes all the independent variables the
modeler wants considered.
• It then attempts to delete one variable at a
time by determining whether the least
significant variable currently in the model
can be removed because its p-value is greater
than the user-specified or default value.
• Once a variable has been removed from the
model it cannot re-enter at a subsequent step.
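Backward elimination can be sketched in the same style. The names `sse` and `backward_elimination` and the data are hypothetical. As a simplifying assumption, a variable is dropped when its partial F statistic falls below a fixed threshold `f_out`, which corresponds to its p-value exceeding the chosen cutoff.

```python
def sse(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(p)]
    for c in range(p):                      # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(aug[k][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for k in range(p):
            if k != c:
                f = aug[k][c] / aug[c][c]
                aug[k] = [a - f * b for a, b in zip(aug[k], aug[c])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def backward_elimination(candidates, y, f_out=4.0):
    """candidates maps name -> column. Returns the names that survive."""
    n = len(y)
    kept = list(candidates)                 # begin with every variable
    while kept:
        full_sse = sse([candidates[k] for k in kept], y)
        df = n - (len(kept) + 1)            # residual df of the current model
        if df <= 0 or full_sse == 0:
            raise ValueError("current model leaves no residual variation")
        mse_full = full_sse / df
        # Partial F for deleting each variable in turn.
        f_stats = {}
        for name in kept:
            cols = [candidates[k] for k in kept if k != name]
            f_stats[name] = (sse(cols, y) - full_sse) / mse_full
        weakest = min(f_stats, key=f_stats.get)   # least significant variable
        if f_stats[weakest] >= f_out:
            break                           # everything left is significant
        kept.remove(weakest)                # removed variables never re-enter
    return kept


# Hypothetical data: y depends on x1 and x2; x3 is unrelated.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
noise = [0.1, -0.2, 0.05, -0.1, 0.2, -0.05, 0.1, -0.1]
y = [1.0 + 2.0 * a + 3.0 * b + e
     for a, b, e in zip(data["x1"], data["x2"], noise)]

kept = backward_elimination(data, y)
print(kept)
```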
Stepwise regression:
A procedure that combines forward selection and backward
elimination is also available.
In a stepwise regression, predictor variables are
entered into the regression equation one at a time
based upon statistical criteria.
At each step in the analysis, the predictor variable that
contributes the most to the prediction equation, in
terms of increasing the multiple correlation, R, is
entered. This process continues only if
additional variables add a statistically significant
contribution to the regression equation.
The choice is made in the following manner:
delete $x_i$ if $\dfrac{\hat{\beta}_i^2}{\hat{\sigma}^2 \, [(Z'Z)^{-1}]_{ii}} < F_{out} = F_{1,\,n-r-1}^{p_{out}}$
enter $x_j$ if $\dfrac{(n-r-2)\, c_{jq}^2}{c_{jj} c_{qq} - c_{jq}^2} > F_{in} = F_{1,\,n-r-2}^{p_{in}}$
Here $p_{in}$ and $p_{out}$ are specified by the user. The stepwise
procedure terminates when either of the following happens:
1. No variable can be entered or deleted according to the above criteria
(this includes the case where all regressors have been entered and none can be deleted).
2. The procedure dictates that the same regressor be entered and deleted in
successive operations.
The stepwise selection procedure is an attempt to insert variables in turn
until the regression equation is satisfactory.
When additional predictor variables no longer add anything statistically
meaningful to the regression equation, the analysis stops. Thus, not all
predictor variables may enter the equation in stepwise regression.
There are a number of multiple regression variants. Stepwise is usually
a good choice, though one can enter all variables simultaneously as an
alternative. Similarly, one can enter all of the variables simultaneously
and gradually eliminate predictors one by one if elimination does little to
change the overall prediction.
Of the procedures described earlier, stepwise regression is the most
flexible, because a variable entered at one step can still be removed
at a later step.
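A stepwise procedure alternates an entry step with a deletion step until neither changes the model. The sketch below is one simplified reading of that idea, with hypothetical names (`sse`, `stepwise`) and data. Fixed F thresholds `f_in` and `f_out` are assumed in place of p-values, and a hard cap on the number of passes is an added guard against the enter-then-delete cycling mentioned above.

```python
def sse(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(p)]
    for c in range(p):                      # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(aug[k][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for k in range(p):
            if k != c:
                f = aug[k][c] / aug[c][c]
                aug[k] = [a - f * b for a, b in zip(aug[k], aug[c])]
    beta = [aug[i][p] / aug[i][i] for i in range(p)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def stepwise(candidates, y, f_in=4.0, f_out=3.9):
    """candidates maps name -> column. Assumes noisy data (nonzero residuals)."""
    n = len(y)
    model = []
    for _ in range(2 * len(candidates)):    # cap on passes, guards against cycling
        changed = False
        # Entry step: add the candidate with the largest partial F, if >= f_in.
        outside = [c for c in candidates if c not in model]
        df_new = n - (len(model) + 2)
        if outside and df_new > 0:
            base = sse([candidates[m] for m in model], y)
            scores = {}
            for name in outside:
                new = sse([candidates[m] for m in model] + [candidates[name]], y)
                scores[name] = (base - new) / (new / df_new)
            best = max(scores, key=scores.get)
            if scores[best] >= f_in:
                model.append(best)
                changed = True
        # Deletion step: drop the weakest variable if its partial F < f_out.
        if len(model) > 1:
            full = sse([candidates[m] for m in model], y)
            mse = full / (n - (len(model) + 1))
            drops = {}
            for name in model:
                rest = [candidates[m] for m in model if m != name]
                drops[name] = (sse(rest, y) - full) / mse
            worst = min(drops, key=drops.get)
            if drops[worst] < f_out:
                model.remove(worst)
                changed = True
        if not changed:
            break                           # nothing entered or deleted: stop
    return model


# Hypothetical data: y depends on x1 and x2; x3 is unrelated.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
noise = [0.1, -0.2, 0.05, -0.1, 0.2, -0.05, 0.1, -0.1]
y = [1.0 + 2.0 * a + 3.0 * b + e
     for a, b, e in zip(data["x1"], data["x2"], noise)]

model = stepwise(data, y)
print(model)
```

Choosing `f_out` no larger than `f_in` means a variable that has just entered cannot be deleted in the same pass, which is one common way to avoid the cycling case.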
Uses of Regression Analysis:
1. Regression analysis helps in establishing a functional
relationship between two or more variables.
2. Since most of the problems of economic analysis are based
on cause-and-effect relationships, regression analysis is a
highly valuable tool in economic and business research.
3. Regression analysis predicts the values of dependent
variables from the values of independent variables.
4. We can calculate the coefficient of correlation (r) and the
coefficient of determination ($R^2$) with the help of regression
coefficients.
ANOVA TABLE:
Source       Degrees of freedom   Sum of squares   Mean square   F        p-value
Regression   2                    5550.8166        2775.4083     261.24   4.7 × 10^-16
Residual     22                   233.7260         10.6239
Total        24                   5784.5426
$R^2 = 0.9596$, Adjusted $R^2 = 0.9559$
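The entries of the ANOVA table are tied together by simple arithmetic, which can be checked directly. The sketch below takes the sums of squares and degrees of freedom from the table and recomputes the mean squares, the F ratio, $R^2$ and adjusted $R^2$; the variable names are arbitrary.

```python
# Values taken from the ANOVA table.
ss_reg, df_reg = 5550.8166, 2
ss_res, df_res = 233.7260, 22

ss_tot = ss_reg + ss_res            # total sum of squares
df_tot = df_reg + df_res            # total degrees of freedom

ms_reg = ss_reg / df_reg            # MSR = SSR / k
ms_res = ss_res / df_res            # MSRes = SSE / (n - k - 1)
f0 = ms_reg / ms_res                # overall F statistic

r2 = ss_reg / ss_tot
adj_r2 = 1 - (ss_res / df_res) / (ss_tot / df_tot)

print(round(ss_tot, 4), round(ms_reg, 4), round(ms_res, 4))
print(round(f0, 2), round(r2, 4), round(adj_r2, 4))
```

The F ratio of the two mean squares is about 261, and the very small value of 4.7 × 10^-16 reported with the table is the p-value that corresponds to it.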
[Figure: scatter plot for cases and distance]
Regression

More Related Content

What's hot

Anova ppt
Anova pptAnova ppt
Anova ppt
Sravani Ganti
 
{ANOVA} PPT-1.pptx
{ANOVA} PPT-1.pptx{ANOVA} PPT-1.pptx
{ANOVA} PPT-1.pptx
SNEHA AGRAWAL GUPTA
 
Correlation Analysis
Correlation AnalysisCorrelation Analysis
Correlation Analysis
Birinder Singh Gulati
 
Multiple Regression Analysis (MRA)
Multiple Regression Analysis (MRA)Multiple Regression Analysis (MRA)
Multiple Regression Analysis (MRA)
Naveen Kumar Medapalli
 
Chi squared test
Chi squared testChi squared test
Chi squared test
Ramakanth Gadepalli
 
Standard error
Standard error Standard error
Standard error
Satyaki Mishra
 
Skewness & Kurtosis
Skewness & KurtosisSkewness & Kurtosis
Skewness & KurtosisNavin Bafna
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
MOHIT PANCHAL
 
Non parametric test
Non parametric testNon parametric test
Non parametric test
dineshmeena53
 
Chi -square test
Chi -square testChi -square test
Chi -square test
VIVEK KUMAR SINGH
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
Avjinder (Avi) Kaler
 
Type i and type ii errors
Type i and type ii errorsType i and type ii errors
Type i and type ii errorsp24ssp
 
Brm (one tailed and two tailed hypothesis)
Brm (one tailed and two tailed hypothesis)Brm (one tailed and two tailed hypothesis)
Brm (one tailed and two tailed hypothesis)Upama Dwivedi
 

What's hot (20)

Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Anova ppt
Anova pptAnova ppt
Anova ppt
 
Chi square test
Chi square testChi square test
Chi square test
 
{ANOVA} PPT-1.pptx
{ANOVA} PPT-1.pptx{ANOVA} PPT-1.pptx
{ANOVA} PPT-1.pptx
 
Analysis of variance anova
Analysis of variance anovaAnalysis of variance anova
Analysis of variance anova
 
Correlation Analysis
Correlation AnalysisCorrelation Analysis
Correlation Analysis
 
Multiple Regression Analysis (MRA)
Multiple Regression Analysis (MRA)Multiple Regression Analysis (MRA)
Multiple Regression Analysis (MRA)
 
Correlation analysis
Correlation analysisCorrelation analysis
Correlation analysis
 
F test
F testF test
F test
 
Chi squared test
Chi squared testChi squared test
Chi squared test
 
Standard error
Standard error Standard error
Standard error
 
Non-Parametric Tests
Non-Parametric TestsNon-Parametric Tests
Non-Parametric Tests
 
Skewness & Kurtosis
Skewness & KurtosisSkewness & Kurtosis
Skewness & Kurtosis
 
T test
T testT test
T test
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Non parametric test
Non parametric testNon parametric test
Non parametric test
 
Chi -square test
Chi -square testChi -square test
Chi -square test
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
Type i and type ii errors
Type i and type ii errorsType i and type ii errors
Type i and type ii errors
 
Brm (one tailed and two tailed hypothesis)
Brm (one tailed and two tailed hypothesis)Brm (one tailed and two tailed hypothesis)
Brm (one tailed and two tailed hypothesis)
 

Similar to Regression

regression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdfregression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdf
MuhammadAftab89
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis pptElkana Rorio
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JM
Japheth Muthama
 
Regression Analysis by Muthama JM
Regression Analysis by Muthama JM Regression Analysis by Muthama JM
Regression Analysis by Muthama JM
Japheth Muthama
 
Regression
RegressionRegression
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
RekhaChoudhary24
 
Regression.ppt basic introduction of regression with example
Regression.ppt basic introduction of regression with exampleRegression.ppt basic introduction of regression with example
Regression.ppt basic introduction of regression with example
shivshankarshiva98
 
Cost indexes
Cost indexesCost indexes
Regression analysis
Regression analysisRegression analysis
Regression analysissaba khan
 
Reg
RegReg
Regression
RegressionRegression
Regression
nandini patil
 
Regression Analysis - Thiyagu
Regression Analysis - ThiyaguRegression Analysis - Thiyagu
Regression Analysis - Thiyagu
Thiyagu K
 
604_multiplee.ppt
604_multiplee.ppt604_multiplee.ppt
604_multiplee.ppt
Rufesh
 
Ydb ji n8 itc ko6esvj8 kgerx k8tc ko4sx k
Ydb ji n8 itc ko6esvj8 kgerx k8tc ko4sx kYdb ji n8 itc ko6esvj8 kgerx k8tc ko4sx k
Ydb ji n8 itc ko6esvj8 kgerx k8tc ko4sx k
Adikesavaperumal
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
Sohag Babu
 
Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help
HelpWithAssignment.com
 
Regression
RegressionRegression
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
Muhammad Fazeel
 
HIGHER MATHEMATICS
HIGHER MATHEMATICSHIGHER MATHEMATICS
linear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annovalinear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annova
Mansi Rastogi
 

Similar to Regression (20)

regression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdfregression-130929093340-phpapp02 (1).pdf
regression-130929093340-phpapp02 (1).pdf
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JM
 
Regression Analysis by Muthama JM
Regression Analysis by Muthama JM Regression Analysis by Muthama JM
Regression Analysis by Muthama JM
 
Regression
RegressionRegression
Regression
 
Simple linear regression
Simple linear regressionSimple linear regression
Simple linear regression
 
Regression.ppt basic introduction of regression with example
Regression.ppt basic introduction of regression with exampleRegression.ppt basic introduction of regression with example
Regression.ppt basic introduction of regression with example
 
Cost indexes
Cost indexesCost indexes
Cost indexes
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Reg
RegReg
Reg
 
Regression
RegressionRegression
Regression
 
Regression Analysis - Thiyagu
Regression Analysis - ThiyaguRegression Analysis - Thiyagu
Regression Analysis - Thiyagu
 
604_multiplee.ppt
604_multiplee.ppt604_multiplee.ppt
604_multiplee.ppt
 
Ydb ji n8 itc ko6esvj8 kgerx k8tc ko4sx k
Ydb ji n8 itc ko6esvj8 kgerx k8tc ko4sx kYdb ji n8 itc ko6esvj8 kgerx k8tc ko4sx k
Ydb ji n8 itc ko6esvj8 kgerx k8tc ko4sx k
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help
 
Regression
RegressionRegression
Regression
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
HIGHER MATHEMATICS
HIGHER MATHEMATICSHIGHER MATHEMATICS
HIGHER MATHEMATICS
 
linear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annovalinear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annova
 

Recently uploaded

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 

Regression

  • 2. History: The earliest form of regression was the method of least squares, which was published by Legendre in 1805, and by Gauss in 1809. The term "regression" was used by British biometrician sir Francis Galton in the (1822- 1911), to describe a biological phenomenon. Sir Galton's work on inherited characteristics of sweet peas led to the initial conception of linear regression.
  • 3. Introduction:  Regression is a statistical technique for investigating and modeling the relationship between variables.  Applications of regression are numerous and occur in almost every field, including engineering, the physical and the social sciences, and the biological sciences.  Usually, the investigator seeks to ascertain the causal effect of one variable upon another—the effect of a price increase upon demand, for example, or the effect of changes in the money supply upon the inflation rate.
  • 4. Definition: Regression is the measure of the average relationship between two or more variables in terms of the original units of the data. It is unquestionably the most widely used statistical technique in social sciences. It is also widely used in biological and physical science. Regression equation is (y) =a + b x Slope (b) = (NΣXY-(ΣX)( ΣY)) / (NΣX2 – (ΣX)2) Intercept (a) = (ΣY-b(ΣX)) / N
  • 5. Review of Simple linear regression. A simple linear regression is carried out to estimate the relationship between a dependent variable, Y and a single explanatory variable, x given a set of data that includes observations for both of these variables for a particular population. •For ex: A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet) •A random sample of 10 houses is selected Dependent variable (Y) = house price Independent variable (X) = square feet
  • 6. Simple Linear Regression Model ii10i εXββY Linear component Population Y intercept Population Slope Coefficient Random Error term Dependent Variable Independen t Variable Random Error component
  • 7. i10i XbbYˆ The simple linear regression equation provides an estimate of the population regression line Estimate of the regression intercept Estimate of the regression slope Estimated (or predicted) Y value for observation i Value of X for observation i The individual random error terms ei have a mean of zero Prediction equation is given by:
  • 9. Measures of Variation Total variation is made up of two parts: SSESSRSST Total Sum of Squares Regression Sum of Squares Error Sum of Squares 2 i )YY(SST 2 ii )YˆY(SSE 2 i )YYˆ(SSR where: = Average value of the dependent variable Yi = Observed values of the dependent variable i = Predicted value of Y for the given Xi valueYˆ Y
  • 10. Measures of Variation Xi Y X Yi SST = (Yi - Y)2 SSE = (Yi - Yi )2 SSR = (Yi - Y)2 _ _ _ Y Y Y _ Y
  • 11. Coefficient of Determination, r2 • The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable • The coefficient of determination is also called r-squared and is denoted as r2 1r0 2note: squaresofsum squaresofregression2 total sum SST SSR r
  • 12. Multiple linear regression Introduction: The general purpose of multiple regression (the term was first used by Pearson, 1908) is to learn more about the relationship between several independent or predictor variables and a dependent or criterion variables.
  • 13. Definition: A regression model that involves the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data (more than one regressor variable) is called a multiple regression model. Every value of the independent variable x is associated with a value of the dependent variable y. Suppose that the yield in the pounds of conversation in a chemical process depends on temperature and the catalyst concentration. A multiple regression model that might describe the relationship is
  • 14. $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, where y denotes the yield, $x_1$ denotes the temperature, $x_2$ denotes the catalyst concentration, and $\varepsilon$ is the error term. This is a multiple linear regression model with two regressor variables. The term linear is used because the equation is a linear function of the unknown parameters $\beta_0$, $\beta_1$ and $\beta_2$. The parameter $\beta_1$ indicates the expected change in the response y per unit change in $x_1$ when $x_2$ is held constant. Similarly, $\beta_2$ measures the expected change in y per unit change in $x_2$ when $x_1$ is held constant. In general, the response y may be related to k regressor (or predictor) variables. The model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
  • 15. is a multiple linear regression model with k regressors. The parameters $\beta_j$, $j = 0, 1, \ldots, k$, are called regression coefficients. The parameter $\beta_j$ represents the expected change in the response y per unit change in $x_j$ when all of the remaining regressor variables $x_i$ ($i \ne j$) are held constant. For this reason the parameters $\beta_j$, $j = 1, \ldots, k$, are often called partial regression coefficients.
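To make the two-regressor model concrete, here is a small sketch that estimates the regression coefficients by solving the normal equations $(X'X)b = X'y$ in plain Python. The helper name `fit_ols` and the temperature, catalyst and yield values are hypothetical; a numerical library would normally be used for anything beyond a toy example.

```python
def fit_ols(columns, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y on the regressor columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]  # leading 1 = intercept
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        for r in range(p):
            if r != k:
                factor = a[r][k] / a[k][k]
                a[r] = [u - factor * v for u, v in zip(a[r], a[k])]
    return [a[i][p] / a[i][i] for i in range(p)]


# Hypothetical process data: temperature (x1), catalyst concentration (x2), yield (y).
temperature = [80, 85, 90, 95, 100, 105]
catalyst = [1.0, 1.5, 1.2, 2.0, 1.8, 2.5]
yield_lb = [50.2, 54.1, 55.0, 61.3, 61.9, 68.0]

b0, b1, b2 = fit_ols([temperature, catalyst], yield_lb)
print(f"y_hat = {b0:.3f} + {b1:.3f} * x1 + {b2:.3f} * x2")
```

Here `b1` estimates the change in yield per unit of temperature with catalyst concentration held fixed, which is the partial regression coefficient interpretation described above.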
  • 17. Assumptions of Regression • For any given set of values of $x_1, x_2, \ldots, x_k$, the random error $\varepsilon$ has a probability distribution with the following properties: • 1. Mean equal to 0 • 2. Variance equal to $\sigma^2$ • 3. Normal distribution • 4. Random errors are independent
  • 18. Regression Analysis: Model Building • General Linear Model • Determining When to Add or Delete Variables • Analysis of a Larger Problem • Multiple Regression Approach to Analysis of Variance
  • 19. General Linear Model. Models in which the parameters ($\beta_0, \beta_1, \ldots, \beta_p$) all have exponents of one are called linear models. • First-Order Model with One Predictor Variable: $y = \beta_0 + \beta_1 x_1 + \varepsilon$
  • 20. Variable Selection Procedures • Stepwise Regression • Forward Selection • Backward Elimination. These procedures are iterative: one independent variable at a time is added or deleted, based on the F statistic.
  • 21. Variable Selection Procedures • F Test • To test whether the addition of $x_2$ to a model involving $x_1$ (or the deletion of $x_2$ from a model involving $x_1$ and $x_2$) is statistically significant, use $F = \dfrac{(SSE(\text{reduced}) - SSE(\text{full})) / \text{number of extra terms}}{MSE(\text{full})}$. The overall regression statistic is $F_0 = MSR / MSRes$, where $MSR = SSR / k$. • The p-value corresponding to the F statistic is the criterion used to determine if a variable should be added or deleted.
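The partial F statistic is simple arithmetic once the two error sums of squares are known. The sketch below uses made-up sums of squares and a hypothetical function name, `partial_f`. Instead of computing a p-value, it compares the statistic with an assumed critical value of about 4.35, the upper 5% point of the F distribution with 1 and 20 degrees of freedom.

```python
def partial_f(sse_reduced, sse_full, extra_terms, n, params_full):
    """F statistic for the extra terms that separate the full model from the reduced one.

    params_full counts every coefficient in the full model, intercept included.
    """
    mse_full = sse_full / (n - params_full)
    return ((sse_reduced - sse_full) / extra_terms) / mse_full


# Made-up values: adding x2 to a model that already contains x1, with n = 23.
f_stat = partial_f(sse_reduced=100.0, sse_full=60.0, extra_terms=1, n=23, params_full=3)

F_CRITICAL = 4.35  # assumed cutoff: approximate upper 5% point of F(1, 20)
add_x2 = f_stat > F_CRITICAL
print(f"F = {f_stat:.3f}, add x2: {add_x2}")
```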
  • 22. Forward Selection • This procedure is similar to stepwise regression, but does not permit a variable to be deleted. • The forward-selection procedure starts with no independent variables. • It adds variables one at a time as long as a significant reduction in the error sum of squares (SSE) can be achieved.
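A minimal sketch of forward selection follows, under stated assumptions: the entry rule is a fixed F-to-enter threshold (`f_in = 4.0`) rather than a p-value, the names `sse` and `forward_selection` are hypothetical, and the data are synthetic, with y built from x1 and x2 plus small noise.

```python
def sse(columns, y):
    """Residual sum of squares from a least-squares fit of y on an intercept plus columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        for r in range(p):
            if r != k:
                factor = a[r][k] / a[k][k]
                a[r] = [u - factor * v for u, v in zip(a[r], a[k])]
    b = [a[i][p] / a[i][i] for i in range(p)]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2 for r, yi in zip(rows, y))


def forward_selection(data, y, f_in=4.0):
    """Add the candidate with the largest partial F while it exceeds f_in."""
    n, selected = len(y), []
    while len(selected) < len(data):
        current = sse([data[v] for v in selected], y)
        best_name, best_f = None, f_in
        for name in data:
            if name in selected:
                continue
            full = sse([data[v] for v in selected + [name]], y)
            df_full = n - (len(selected) + 2)  # intercept + selected + candidate
            f_stat = (current - full) / (max(full, 1e-12) / df_full)
            if f_stat > best_f:
                best_name, best_f = name, f_stat
        if best_name is None:  # no candidate gives a significant reduction in SSE
            break
        selected.append(best_name)
    return selected


# Synthetic data: y depends on x1 and x2; x3 is unrelated by construction.
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "x2": [5, 3, 8, 1, 9, 2, 7, 4, 10, 6],
    "x3": [3, 7, 1, 9, 4, 8, 2, 6, 5, 10],
}
noise = [0.1, -0.2, 0.15, -0.1, 0.05, -0.05, 0.2, -0.15, 0.1, -0.1]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(data["x1"], data["x2"], noise)]
print(forward_selection(data, y))
```

Once a variable enters it stays in the model, which is the feature that distinguishes forward selection from stepwise regression.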
  • 23. Backward Elimination • This procedure begins with a model that includes all the independent variables the modeler wants considered. • It then attempts to delete one variable at a time by determining whether the least significant variable currently in the model can be removed because its p-value is greater than the user-specified or default value. • Once a variable has been removed from the model, it cannot re-enter at a subsequent step.
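Backward elimination can be sketched in the same style. This is an illustration under assumptions, not a reference implementation: removal uses a fixed F-to-remove threshold (`f_out = 4.0`) as a stand-in for the p-value rule, the names `sse` and `backward_elimination` are hypothetical, and the data are synthetic.

```python
def sse(columns, y):
    """Residual sum of squares from a least-squares fit of y on an intercept plus columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        for r in range(p):
            if r != k:
                factor = a[r][k] / a[k][k]
                a[r] = [u - factor * v for u, v in zip(a[r], a[k])]
    b = [a[i][p] / a[i][i] for i in range(p)]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2 for r, yi in zip(rows, y))


def backward_elimination(data, y, f_out=4.0):
    """Start from all variables; drop the least significant one while its partial F < f_out."""
    n, selected = len(y), list(data)
    while selected:
        full = sse([data[v] for v in selected], y)
        mse_full = max(full, 1e-12) / (n - len(selected) - 1)
        # Partial F for dropping each variable from the current model.
        partial = {
            v: (sse([data[u] for u in selected if u != v], y) - full) / mse_full
            for v in selected
        }
        weakest = min(partial, key=partial.get)
        if partial[weakest] >= f_out:  # every remaining variable is significant
            break
        selected.remove(weakest)       # removed variables never re-enter
    return selected


# Synthetic data: y depends on x1 and x2; x3 is unrelated by construction.
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "x2": [5, 3, 8, 1, 9, 2, 7, 4, 10, 6],
    "x3": [3, 7, 1, 9, 4, 8, 2, 6, 5, 10],
}
noise = [0.1, -0.2, 0.15, -0.1, 0.05, -0.05, 0.2, -0.15, 0.1, -0.1]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(data["x1"], data["x2"], noise)]
print(backward_elimination(data, y))
```

A small partial F corresponds to a large p-value, so dropping the variable with the smallest partial F is the same decision as dropping the one with the largest p-value.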
  • 24. Stepwise regression: A procedure that combines forward selection and backward elimination is also available. In a stepwise regression, predictor variables are entered into the regression equation one at a time based upon statistical criteria. At each step in the analysis, the predictor variable that contributes the most to the prediction equation, in terms of increasing the multiple correlation R, is entered first. This process is continued only if additional variables add something statistically meaningful to the regression equation.
  • 25. The choice is made in the following manner: delete $x_i$ if $\dfrac{\hat{\beta}_i^2}{\hat{\sigma}^2 \left[(Z_l' Z_l)^{-1}\right]_{ii}} < F_{out} = F_{1,\,n-r-1}^{p_{out}}$; enter $x_j$ if $\dfrac{(n-r-2)\, c_{jq}^2}{c_{jj} c_{qq} - c_{jq}^2} > F_{in} = F_{1,\,n-r-2}^{p_{in}}$. Here $p_{in}$ and $p_{out}$ are specified by the user. The stepwise procedure terminates when either of the following two conditions occurs:
  • 26. 1. No variable can be entered or deleted according to the above criteria; this includes the case where all regressors have been entered and none can be deleted. 2. The procedure dictates that the same regressor be entered and deleted in successive operations. The stepwise selection procedure is an attempt to insert variables in turn until the regression equation is satisfactory. When additional predictor variables add nothing statistically meaningful to the regression equation, the analysis stops. Thus, not all predictor variables may enter the equation in stepwise regression. There are a number of multiple regression variants. Stepwise is usually a good choice, though one can enter all variables simultaneously as an alternative. Similarly, one can enter all of the variables simultaneously and gradually eliminate predictors one by one if elimination does little to change the overall prediction. Compared with the procedures seen earlier, the stepwise regression procedure is the best.
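The stepwise loop and its two stopping conditions can be sketched as follows. This is a simplified illustration, not the exact criteria from the slides: entry and removal use fixed partial-F thresholds (`f_in`, `f_out`) in place of percentage points of the F distribution, the names `sse` and `stepwise` are hypothetical, `max_steps` is an added safety guard, and the data are synthetic.

```python
def sse(columns, y):
    """Residual sum of squares from a least-squares fit of y on an intercept plus columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for k in range(p):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(k, p), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        for r in range(p):
            if r != k:
                factor = a[r][k] / a[k][k]
                a[r] = [u - factor * v for u, v in zip(a[r], a[k])]
    b = [a[i][p] / a[i][i] for i in range(p)]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2 for r, yi in zip(rows, y))


def stepwise(data, y, f_in=4.0, f_out=4.0, max_steps=100):
    """Alternate an entry step and a removal step until neither changes the model."""
    n, selected = len(y), []
    for _ in range(max_steps):
        entered = removed = None
        # Entry step: best candidate by partial F, if it exceeds f_in.
        current, best_f = sse([data[v] for v in selected], y), f_in
        for name in data:
            if name not in selected:
                full = sse([data[v] for v in selected + [name]], y)
                f_stat = (current - full) / (max(full, 1e-12) / (n - len(selected) - 2))
                if f_stat > best_f:
                    entered, best_f = name, f_stat
        if entered is not None:
            selected.append(entered)
        # Removal step: weakest variable in the model, if its partial F is below f_out.
        if selected:
            full = sse([data[v] for v in selected], y)
            mse_full = max(full, 1e-12) / (n - len(selected) - 1)
            partial = {v: (sse([data[u] for u in selected if u != v], y) - full) / mse_full
                       for v in selected}
            weakest = min(partial, key=partial.get)
            if partial[weakest] < f_out:
                selected.remove(weakest)
                removed = weakest
        if entered is None and removed is None:  # condition 1: nothing to enter or delete
            break
        if entered is not None and entered == removed:  # condition 2: same regressor cycles
            break
    return selected


# Synthetic data: y depends on x1 and x2; x3 is unrelated by construction.
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "x2": [5, 3, 8, 1, 9, 2, 7, 4, 10, 6],
    "x3": [3, 7, 1, 9, 4, 8, 2, 6, 5, 10],
}
noise = [0.1, -0.2, 0.15, -0.1, 0.05, -0.05, 0.2, -0.15, 0.1, -0.1]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(data["x1"], data["x2"], noise)]
print(stepwise(data, y))
```

Keeping `f_in` at least as large as `f_out` makes it unlikely that a variable is entered and then immediately removed, which is the cycling case the second stopping condition guards against.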
  • 27. Uses of Regression Analysis: 1. Regression analysis helps in establishing a functional relationship between two or more variables. 2. Since most of the problems of economic analysis are based on cause and effect relationships, regression analysis is a highly valuable tool in economic and business research. 3. Regression analysis predicts the values of dependent variables from the values of independent variables. 4. We can calculate the coefficient of correlation (r) and the coefficient of determination ($R^2$) with the help of regression coefficients.
  • 33. ANOVA TABLE:

      Source       Degrees of freedom   Sum of squares   Mean square   F (= MSR/MSRes)   p-value
      Regression    2                   5550.8166        2775.4083     261.24            4.7 × 10^-16
      Residual     22                    233.7260          10.6239
      Total        24                   5784.5426

      R² = 0.9596, Adjusted R² = 0.9559

      [Figure: Scatter plot for cases and distance]
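The entries of the ANOVA table can be cross-checked with a few lines of arithmetic. The sketch below takes the sums of squares and degrees of freedom from the table and recomputes the mean squares, the F ratio, $R^2$ and adjusted $R^2$. The adjusted $R^2$ formula used here, $1 - \dfrac{SSE/(n-k-1)}{SST/(n-1)}$, is the standard definition and is an addition, since the slides do not state it.

```python
# Values read from the ANOVA table; n and k follow from the degrees of freedom.
n, k = 25, 2                      # total df = n - 1 = 24, regression df = k = 2
ss_regression = 5550.8166
ss_residual = 233.7260
ss_total = 5784.5426

ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)
f0 = ms_regression / ms_residual  # F0 = MSR / MSRes
r2 = ss_regression / ss_total
adj_r2 = 1 - (ss_residual / (n - k - 1)) / (ss_total / (n - 1))

print(f"MSR = {ms_regression:.4f}, MSRes = {ms_residual:.4f}")
print(f"F0 = {f0:.2f}, R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```

The recomputed ratio MSR/MSRes is about 261. The value 4.7 × 10^-16 reported with the table is far smaller than this ratio, which is consistent with it being the p-value of the F test.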