Simple Linear Regression
Simple Linear Regression is one of the simplest machine learning techniques. In this
blog, I will explain in detail the mathematical formulation of Simple Linear Regression
(SLR) and how to:
• Estimate model parameters
• Test significance of parameters
• Test goodness of the model fit
Let me begin with the definition of SLR. A simple linear regression is a statistical
technique used to investigate the relationship between two variables in a non-
deterministic fashion. In general, it is used to estimate an unknown variable (aka
dependent variable) by determining its relationship with a known variable (aka
independent variable).
Model Formulation
An SLR model can be generalized as:
𝑌 = 𝛽0 + 𝛽1 𝑥 + 𝜀
where, Y – dependent variable
x – independent variable
ε – random error [we assume ε ~ N(0, σ²), with the same variance for every x and uncorrelated across observations]
β0 – intercept (value of Y, when x = 0)
β1 – slope (change in Y per unit change in x)
An SLR model has 2 components,
• Deterministic (β0 + β1x)
• Random / Non-deterministic (ε)
It is this random error (ε) that distinguishes a regression model from a purely deterministic linear relationship.
The regression model, Yi = β0 + β1xi + εi, implies that each response Yi comes from a
normal probability distribution whose mean is

E(Y|x) = \beta_0 + \beta_1 x

and whose variance is σ² (the same for all levels of x). Also, any two responses Yi and Yj are
uncorrelated.
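To make the deterministic and random components concrete, here is a minimal simulation sketch in Python (my own illustration, not part of the original derivation); the parameter values β0 = 2, β1 = 0.5, σ = 1 and the use of NumPy are assumptions chosen purely for demonstration.

```python
import numpy as np

# Illustrative sketch: simulate data from the SLR model Y = beta0 + beta1*x + eps.
# beta0 = 2.0, beta1 = 0.5 and sigma = 1.0 are assumed values for demonstration only.
rng = np.random.default_rng(42)

beta0_true, beta1_true, sigma = 2.0, 0.5, 1.0
x = np.linspace(0.0, 10.0, 30)                # independent variable
eps = rng.normal(0.0, sigma, size=x.size)     # random error, eps ~ N(0, sigma^2)
Y = beta0_true + beta1_true * x + eps         # deterministic part + random part

print(Y[:5])                                  # responses scatter around E(Y|x) = beta0 + beta1*x
```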
Estimating Model Parameters
In practice, the true values of β0 and β1 that determine Yi for each xi are not known. Instead,
we have some sample data available.
We have to estimate β0 and β1 of the true regression line from the available data, which gives
the fitted line

\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i

and the estimated errors (residuals),

\varepsilon_i = Y_i - \hat{Y}_i = Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)

where Ŷi is the estimated value of Yi.
The figure below shows a scatter plot of x vs Y with several candidate lines drawn through the
data. Which of these lines best fits the data and can be taken as an estimate of the true
regression line?
To find the best fit line, we use the “Principle of Least Squares”, which states that the
best fit line is the one having the smallest sum of squares of errors.
(Sum of squares of errors: SSE = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2)
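As a quick illustration of the least squares criterion (my own sketch, assuming NumPy and a small made-up sample), we can compute SSE for a few candidate lines; among them, the line with the smallest SSE is the best fit:

```python
import numpy as np

# Made-up sample for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])

def sse(beta0, beta1):
    """Sum of squared errors for the candidate line Y = beta0 + beta1 * x."""
    residuals = Y - (beta0 + beta1 * x)
    return np.sum(residuals ** 2)

# Three candidate lines; the smallest SSE identifies the best fit among them
for b0, b1 in [(1.0, 1.0), (1.5, 0.8), (1.3, 0.78)]:
    print(f"beta0={b0}, beta1={b1}, SSE={sse(b0, b1):.4f}")
```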
Thus, to obtain the estimates β̂0 and β̂1 of the best fit line

\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i

we minimize

f(\beta_0, \beta_1) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 x_i)^2 \quad \ldots (1)
To find the minimum of equation (1), partially differentiate f(β0, β1) with respect to β0
and β1 and equate each derivative to zero.

\frac{\partial f}{\partial \beta_0} = -2 \sum (Y_i - \beta_0 - \beta_1 x_i) = 0 \quad \ldots (2)

\Rightarrow \beta_0 n + \beta_1 \sum x_i = \sum Y_i \quad \ldots (3)

\frac{\partial f}{\partial \beta_1} = -2 \sum x_i (Y_i - \beta_0 - \beta_1 x_i) = 0 \quad \ldots (4)

\Rightarrow \beta_0 \sum x_i + \beta_1 \sum x_i^2 = \sum x_i Y_i \quad \ldots (5)
Solve equations (3) and (5) to obtain β0 and β1.
Equations (3) and (5) in matrix form:

\begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum x_i Y_i \end{bmatrix} \quad \ldots (6)

\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}^{-1} \begin{bmatrix} \sum Y_i \\ \sum x_i Y_i \end{bmatrix} \quad \ldots (7)

\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \frac{1}{n \sum x_i^2 - (\sum x_i)^2} \begin{bmatrix} \sum x_i^2 \sum Y_i - \sum x_i \sum x_i Y_i \\ n \sum x_i Y_i - \sum x_i \sum Y_i \end{bmatrix} \quad \ldots (8)
From equation (8),

\hat{\beta}_1 = \frac{\sum x_i Y_i - \frac{\sum x_i \sum Y_i}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} \quad \ldots (9)

\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(Y_i - \bar{Y})}{\sum (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}} \quad \ldots (10)

\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{x} \quad \ldots (11)

where

\bar{x} = \frac{\sum x_i}{n}, \quad \bar{Y} = \frac{\sum Y_i}{n}

S_{xy} = \sum (x_i - \bar{x})(Y_i - \bar{Y}), \quad S_{xx} = \sum (x_i - \bar{x})^2

We can thus predict the value of the dependent variable at any xi by substituting the values of
β̂0 and β̂1 obtained from equations (10) and (11) into the equation:

\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i
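A minimal sketch of this estimation in Python, assuming NumPy and a small made-up sample (the function name fit_slr and the data values are mine, not from the post):

```python
import numpy as np

def fit_slr(x, Y):
    """Estimate the intercept and slope via equations (10) and (11)."""
    x_bar, Y_bar = x.mean(), Y.mean()
    S_xy = np.sum((x - x_bar) * (Y - Y_bar))   # S_xy
    S_xx = np.sum((x - x_bar) ** 2)            # S_xx
    beta1_hat = S_xy / S_xx                    # slope, eq. (10)
    beta0_hat = Y_bar - beta1_hat * x_bar      # intercept, eq. (11)
    return beta0_hat, beta1_hat

# Small made-up sample for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])
beta0_hat, beta1_hat = fit_slr(x, Y)
Y_hat = beta0_hat + beta1_hat * x              # fitted (predicted) values of Y
```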
Testing Significance of Model Parameters - β0 and β1
Distribution of β̂1
σ² determines the amount of variability inherent in the regression model. Since the equation of
the true line is unknown, σ² is estimated from the extent to which the sample observations
deviate from the fitted line. Because the fitted line itself uses two estimated parameters, the
estimate has n − 2 degrees of freedom:

\hat{\sigma}^2 = \frac{SSE}{n - 2} = \frac{\sum_{i=1}^{n} \varepsilon_i^2}{n - 2}
Since each εi is normally distributed, each Yi is also normal. And since β̂1 is a linear
function of the independent, normally distributed Yi, we have:
• β̂1 is normally distributed
• E(β̂1) = β1
• Var(β̂1) = σ² / ∑(xi − x̄)² = σ² / Sxx
Hence,

\hat{\beta}_1 \sim N(\beta_1, \sigma^2 / S_{xx})

se(\hat{\beta}_1) = \sqrt{\sigma^2 / S_{xx}}
Under the assumptions of the SLR model,

\frac{\hat{\beta}_1 - \beta_1^0}{\sqrt{\sigma^2_{\hat{\beta}_1}}} \sim N(0, 1)

where β1⁰ denotes the hypothesized value of the slope. Replacing σ with its estimate σ̂, the
standardized variable

T = \frac{\hat{\beta}_1 - \beta_1^0}{\hat{\sigma} / \sqrt{S_{xx}}} = \frac{\hat{\beta}_1 - \beta_1^0}{se(\hat{\beta}_1)}

has a t-distribution with (n − 2) degrees of freedom.
Hypothesis test for the slope of the regression line:

H_0: \beta_1 = \beta_1^0 \qquad H_a: \beta_1 \neq \beta_1^0

Test statistic: T_0 = \frac{\hat{\beta}_1 - \beta_1^0}{se(\hat{\beta}_1)}

Reject H0 if |t_0| \geq t_{\alpha/2,\, n-2}

The most common test is H0: β1 = 0 versus Ha: β1 ≠ 0. In this case, rejecting H0 implies that
there is a significant linear relation between x and Y.
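A hedged sketch of this slope test in Python, assuming NumPy and SciPy are available and reusing the same made-up sample from the earlier sketches; the variable names are mine:

```python
import numpy as np
from scipy import stats

# Illustrative sample (same made-up data as before)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])

n, x_bar = x.size, x.mean()
S_xx = np.sum((x - x_bar) ** 2)
beta1_hat = np.sum((x - x_bar) * (Y - Y.mean())) / S_xx   # eq. (10)
beta0_hat = Y.mean() - beta1_hat * x_bar                  # eq. (11)

Y_hat = beta0_hat + beta1_hat * x
SSE = np.sum((Y - Y_hat) ** 2)
sigma2_hat = SSE / (n - 2)                                # estimate of sigma^2
se_beta1 = np.sqrt(sigma2_hat / S_xx)                     # se(beta1_hat)

# H0: beta1 = 0 versus Ha: beta1 != 0
T0 = (beta1_hat - 0.0) / se_beta1
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)             # t_{alpha/2, n-2}
p_value = 2 * stats.t.sf(abs(T0), df=n - 2)
print(f"T0 = {T0:.3f}, critical value = {t_crit:.3f}, p-value = {p_value:.4f}")
```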
Distribution of β̂0
Using a similar approach as for β̂1, we get

\hat{\beta}_0 \sim N(\beta_0, \sigma^2_{\hat{\beta}_0})

where

\sigma^2_{\hat{\beta}_0} = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2} \right] = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]

Also,

\frac{\hat{\beta}_0 - \beta_0^0}{\sqrt{\sigma^2_{\hat{\beta}_0}}} \sim N(0, 1)
Thus, the standardized variable

T = \frac{\hat{\beta}_0 - \beta_0^0}{se(\hat{\beta}_0)}

has a t-distribution with (n − 2) degrees of freedom.
Hypothesis test for the intercept of the regression line:

H_0: \beta_0 = \beta_0^0 \qquad H_a: \beta_0 \neq \beta_0^0

Test statistic: T_0 = \frac{\hat{\beta}_0 - \beta_0^0}{se(\hat{\beta}_0)}

Reject H0 if |t_0| \geq t_{\alpha/2,\, n-2}

We are generally more interested in the slope of the model than in the intercept. So, to avoid
biasing the fit, we usually leave β0 in the model even when it is not significant.
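For completeness, a similar sketch for the intercept test, under the same assumptions (NumPy, SciPy, and the illustrative sample used above):

```python
import numpy as np
from scipy import stats

# Illustrative sample (same made-up data as in the earlier sketches)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])

n, x_bar = x.size, x.mean()
S_xx = np.sum((x - x_bar) ** 2)
beta1_hat = np.sum((x - x_bar) * (Y - Y.mean())) / S_xx
beta0_hat = Y.mean() - beta1_hat * x_bar

SSE = np.sum((Y - (beta0_hat + beta1_hat * x)) ** 2)
sigma2_hat = SSE / (n - 2)
se_beta0 = np.sqrt(sigma2_hat * (1.0 / n + x_bar ** 2 / S_xx))  # se(beta0_hat)

# H0: beta0 = 0 versus Ha: beta0 != 0
T0 = beta0_hat / se_beta0
p_value = 2 * stats.t.sf(abs(T0), df=n - 2)
print(f"T0 = {T0:.3f}, p-value = {p_value:.4f}")
```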
Testing Goodness of Model Fit
Recall,
• Error sum of squares, SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2, is the sum of squared deviations about the least squares line.
• Total sum of squares, SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2, is the sum of squared deviations about the horizontal line Y = Ȳ.
Note that SSE ≤ SST.
SSE / SST represents the proportion of variation that cannot be explained by the Simple Linear Regression model.
The Coefficient of Determination, denoted by R², is given by

R^2 = 1 - \frac{SSE}{SST}

R² represents the proportion of variation explained by the Simple Linear Regression model.
• The higher the value of R², the better the model explains the variation in Y.
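A short sketch computing SSE, SST, and R² for the same illustrative sample (my own example, assuming NumPy):

```python
import numpy as np

# Same made-up sample as before (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 2.9, 3.6, 4.4, 5.2])

x_bar = x.mean()
beta1_hat = np.sum((x - x_bar) * (Y - Y.mean())) / np.sum((x - x_bar) ** 2)
beta0_hat = Y.mean() - beta1_hat * x_bar
Y_hat = beta0_hat + beta1_hat * x

SSE = np.sum((Y - Y_hat) ** 2)       # unexplained variation
SST = np.sum((Y - Y.mean()) ** 2)    # total variation about the mean
R2 = 1 - SSE / SST                   # coefficient of determination
print(f"R^2 = {R2:.3f}")
```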