Chapter 14 Part I

    ISDS 2001
     Matt Levy
Introduction
Regression is the term used to describe the technique of
modeling and analyzing the relationship between two or more variables.

The focus is on a dependent variable, and one or more
independent variables.

Simple Linear Regression means 1 independent variable.

Regression and other statistical modeling techniques give us
the power to infer, or predict, future outcomes.

An understanding of regression, and the techniques used to
validate your models will provide you with sound methodology
to do just that.
Simple Linear Regression
As previously mentioned, simple linear regression means we
have 1 dependent variable (y), and 1 independent variable (x).

In order to make a prediction about y using x, we need sample
data (from both x and y) in order to generate some additional
terms, namely the parameters (β0 and β1), and an error term
(ε).

The parameters, β0 and β1, can be thought of as capturing the
explained variability in y.

The error term (ε) accounts for unexplained variability.

Thus, the simple linear regression model is: y = β0 + β1x + ε
Estimating the Regression Equation
If we were fortunate enough to know the population parameters, we could
compute the mean of y for a given x directly: E(y) = β0 + β1x.

Unfortunately for us, we must use sample data to estimate these
parameters, and we therefore use different symbols to denote our
estimated parameters:
   ŷ = b0 + b1x

Note that we place a hat over y (pronounced y-hat) and use
English lettering to denote our estimated parameters.

We now have an equation that graphs a "regression line"
   ŷ is the point estimator of E(y), the mean.
   b0 is the y-intercept
   b1 is the slope
The Estimation Process for Simple Linear
Regression
So how do we estimate b0 and b1?

To do this we use a method known as least squares.

In simple linear regression, finding b0 and b1 is relatively straightforward.

Equations 14.6 and 14.7 in your book show the procedure for b0 and
b1, respectively.

Once b0 and b1 are obtained, the estimated simple linear regression equation
will resemble the following:
   ŷ = 60 + 5x

It is important to note that you will have a ŷi for every yi in the sample data-set.

It is up to you to determine if the difference between them is small enough to deem
the equation an accurate predictor.
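As a rough sketch of the least squares step, the standard closed-form formulas for simple linear regression are b1 = ∑(xi - x̄)(yi - y̅) / ∑(xi - x̄)² and b0 = y̅ - b1x̄. The ten-point sample below is an assumption made for illustration (it is not given on the slides); it was chosen so that the fit comes out to the example equation ŷ = 60 + 5x. The function name least_squares is likewise hypothetical.

```python
# Illustrative sample (assumed data, not taken from the slides).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]


def least_squares(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar  # the fitted line passes through (x_bar, y_bar)
    return b0, b1


b0, b1 = least_squares(x, y)
y_hat = [b0 + b1 * xi for xi in x]  # one fitted value per observation
print(f"y-hat = {b0:.1f} + {b1:.1f}x")
```

Each entry of y_hat pairs with the matching entry of y, which is what the later goodness-of-fit measures compare.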
Coefficient of Determination
The Coefficient of Determination (r²) provides us one measure to judge how well our
regression equation (for example: ŷ = 60 + 5x) fits the actual data.

Let's take some time to build r² and learn some important terms along the way:

◆ Remember that we have an estimated dependent variable (ŷi ) and an actual dependent
variable (yi) for each observation.

◆ (yi - ŷi ) is known as the ith residual.

◆ When we take (yi - ŷi), square it, and sum the squares, we get the Sum of Squares due to
Error (SSE), hence SSE = ∑(yi - ŷi)².

◆ When we take (yi - y̅), square it, and sum the squares, we get the Total Sum of Squares
(SST), hence SST = ∑(yi - y̅)².

◆ Lastly, when we take (ŷi - y̅), square it, and sum the squares, we get a measure of how
much the estimated values on the regression line deviate from the sample mean of y.

◆ This is known as the Sum of Squares due to Regression: SSR = ∑(ŷi - y̅)².
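The three sums of squares defined above can be sketched in a few lines. The sample data is an assumption made for illustration (not from the slides), chosen so that its least squares line is the example equation ŷ = 60 + 5x.

```python
# Illustrative sample (assumed data, not taken from the slides).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]
b0, b1 = 60.0, 5.0  # example estimated equation: y-hat = 60 + 5x

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

# i-th residual: actual value minus estimated value.
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

sse = sum(e ** 2 for e in residuals)          # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)      # total variation about the mean
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)  # variation explained by the line

print(f"SSE = {sse:.0f}, SST = {sst:.0f}, SSR = {ssr:.0f}")
```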
Coefficient of Determination (con't)
The relationship between SSR, SST, and SSE is one of the most important
facts to know in statistics.

SST = SSR + SSE

Now, if (yi - ŷi) = 0 for every observation i, then SSE = 0, SST = SSR, and we have a perfect
fit of the data. This almost never happens with real data.

On the flip side, if SSR = 0 (so that SST = SSE), we have the worst possible fit because
everything is in the error term, or the unexplained portion of the equation.

Hence, to measure goodness of fit we look at the ratio of SSR to SST.

r² = SSR/SST

This yields a value between 0 and 1.

r² can be interpreted as the percentage of the total sum of squares (SST) that can be
explained by using your estimated regression equation.
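A minimal sketch of the ratio, including the two extreme cases described above. The data is assumed for illustration, and the helper name r_squared is hypothetical; the ratio is only meaningful as a goodness-of-fit measure when the fitted values come from a least squares line (or, as in the two extremes, from a perfect fit or the mean alone).

```python
def r_squared(y, y_hat):
    """Ratio SSR/SST for actual values y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    return ssr / sst


# Illustrative sample (assumed data, not taken from the slides).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

fitted = [60 + 5 * xi for xi in x]      # example line: y-hat = 60 + 5x
r2_fit = r_squared(y, fitted)

r2_perfect = r_squared(y, y)            # every residual is zero, so SSR = SST
mean_line = [sum(y) / len(y)] * len(y)  # predicting the mean everywhere
r2_mean_only = r_squared(y, mean_line)  # SSR = 0, so nothing is explained

print(f"r^2 for the fitted line: {r2_fit:.4f}")
```

For this sample roughly 90% of the total sum of squares is explained by the estimated regression equation.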
Correlation Coefficient
The correlation coefficient, denoted rxy, is a measure of the strength of the linear
association between the independent (x) and dependent variable (y).

rxy = (sign of b1) √r²

rxy always yields a value between -1 and +1, inclusive.

A value of 1 indicates a perfect positive linear relationship.

A value of -1 indicates a perfect negative linear relationship.

A value of zero indicates no linear relationship.

In practice, this is used much less than r² because it is restricted to a
linear relationship between two variables.

r² can also be used to measure goodness-of-fit for nonlinear relationships and for
models with two or more independent variables.
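As a quick check of the sign-and-square-root formula, the sketch below compares it with the usual sample (Pearson) correlation computed directly from the data. The data is assumed for illustration, and pearson_r is a hypothetical helper name.

```python
import math

# Illustrative sample (assumed data, not taken from the slides).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]


def pearson_r(x, y):
    """Sample correlation computed directly from deviations about the means."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


b1 = 5.0                   # slope of the example line y-hat = 60 + 5x
r2 = 14200.0 / 15730.0     # SSR/SST for this sample

r_from_r2 = math.copysign(math.sqrt(r2), b1)  # (sign of b1) * sqrt(r^2)
r_direct = pearson_r(x, y)

print(f"from r^2: {r_from_r2:.4f}, direct: {r_direct:.4f}")
```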
Estimating the Regression Equation
In this model, y can be thought of as having a distribution for
each given value of x.

As we have learned in the past, a distribution has a mean or
expected value.

Thus the regression equation for the mean is as follows:
  E(y) = β0 + β1x

Notice that to obtain the mean, we simply drop the error term (ε),
which has an expected value of zero.
Model Assumptions
It is important to understand that r² is not enough to ensure we have an
appropriate regression equation.

There are numerous other tests and measures we must use.

All of these tests are based on assumptions about the error term (ε):
1. E(ε) = 0.
Implication: E(y) = β0 + β1x

2. The variance of ε, denoted by σ², is the same for all values of x.
Implication: The variance of y about the regression line equals σ² and is the same for all values of x.

3. The values of ε are independent (uncorrelated).
Implication: The value of y for a particular value of x is not related to the value of y for any other value of x.

4. ε is a normally distributed random variable.
Implication: Because y is a linear function of ε, y is also normally distributed.

Table 14.14 in the text provides a complete explanation.
Testing for Significance
In Simple Linear Regression, the mean or expected value of y is a linear
function of x (E(y) = β0 + β1x )

If the value of β1 = 0, then E(y) = β0 + 0x = β0.

Hence, in this case we can conclude x and y are not linearly related.

In the next couple of slides we offer a few tests: the t-test, an evaluation of the
confidence interval for β1, and the F-test.

Each of these tests is based on the following hypotheses:

  H0: β1 = 0
  Ha: β1 ≠ 0

This starts to tell us more about the appropriateness of our model.
Estimating σ²
As a precursor to running our tests, we need an estimate of σ².

Recall one of our key assumptions: the variance of ε also represents the
variance of y about the regression line.

Also recall the deviations of y about the regression line are called residuals.

Hence we can call upon the SSE to calculate the Mean Square Error (MSE) as
an estimate of σ², which we will denote as s².

s² = MSE = SSE/(n-2), where n is the sample size and (n-2) is the
error degrees of freedom (two are used up estimating b0 and b1).

Consequently, the standard error of the estimate is s = √MSE.
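A short sketch of this calculation, using an assumed ten-point sample (not from the slides) whose least squares line is the example equation ŷ = 60 + 5x:

```python
import math

# Illustrative sample (assumed data, not taken from the slides).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]
b0, b1 = 60.0, 5.0  # example estimated equation: y-hat = 60 + 5x

n = len(y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

mse = sse / (n - 2)  # s^2: estimate of sigma^2 with n - 2 degrees of freedom
s = math.sqrt(mse)   # standard error of the estimate

print(f"MSE = {mse:.2f}, s = {s:.3f}")
```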
t Test
Remember we are testing the following: H0: β1 = 0; Ha: β1 ≠ 0

To do this we need information about the distribution of b1 (see figure 14.17);
specifically, we need the estimated standard deviation of b1, denoted sb1 (see figure 14.18).

Once we have sb1 we can find the test statistic t: t = b1/sb1.

And using the t-table and our well known rejection rules:

Reject H0 if p-value ≤ α, or equivalently if t ≤ -tα/2 or t ≥ tα/2,

where tα/2 is based on a t-distribution with n-2 degrees of freedom.
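The test can be sketched as follows. Two things here are assumptions rather than content from the slides: the sample data, and the standard formula sb1 = s / √∑(xi - x̄)² for the estimated standard deviation of b1, which the slides only point to by figure number. The critical value 2.306 is the t-table entry for α = 0.05 (α/2 = 0.025) with 8 degrees of freedom.

```python
import math

# Illustrative sample (assumed data, not taken from the slides).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))  # standard error of the estimate
s_b1 = s / math.sqrt(sxx)     # estimated standard deviation of b1
t_stat = b1 / s_b1            # test statistic for H0: beta1 = 0

T_CRITICAL = 2.306            # t-table value: alpha/2 = 0.025, df = n - 2 = 8
reject_h0 = abs(t_stat) >= T_CRITICAL

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```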
Confidence Interval for β1
As an alternative to the t-test, we can check the confidence interval for β1.

We are essentially checking to see if the interval for β1 contains 0.

The form of the confidence interval is as follows:

b1 ± tα/2 · sb1

If this interval contains zero at the designated confidence level (1 - α), we cannot
reject the null hypothesis (H0).
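A minimal sketch of the interval check. The inputs (b1 = 5, sb1 ≈ 0.5803, and the table value t = 2.306 for a 95% interval with 8 degrees of freedom) are assumed values from an illustrative ten-point sample, not figures from the slides; slope_confidence_interval is a hypothetical helper name.

```python
def slope_confidence_interval(b1, s_b1, t_crit):
    """Return (lower, upper) limits of b1 +/- t_crit * s_b1."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin


# Assumed inputs for illustration (n = 10, so df = 8).
b1 = 5.0
s_b1 = 0.5803
T_CRITICAL = 2.306  # t-table value for a 95% interval with 8 degrees of freedom

lower, upper = slope_confidence_interval(b1, s_b1, T_CRITICAL)
contains_zero = lower <= 0.0 <= upper  # True would mean we cannot reject H0

print(f"95% CI for beta1: ({lower:.2f}, {upper:.2f}); contains 0: {contains_zero}")
```

Because the interval lies entirely above zero for these inputs, the conclusion matches a two-tailed t-test at α = 0.05.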
F-Test
Based on the F probability distribution (hence, using our F-table).

In simple linear regression this does the same thing as the t-test.

With more than one independent variable (multiple regression), ONLY the F-test
can be used to test for overall significance.

To arrive at the F-test statistic, we need the Mean Square due to Regression
(MSR).

MSR = SSR / (Number of Independent Variables)

F = MSR/MSE (Just like when we first learned ANOVA)

And using the F-table and our well known rejection rules:

Reject H0 if p-value ≤ α, or equivalently if F ≥ Fα,

where Fα is based on an F-distribution with 1 degree of freedom (for SLR) in
the numerator and (n-2) degrees of freedom in the denominator.
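A sketch of the F statistic under assumed inputs: SSR = 14200, SSE = 1530, and n = 10 come from an illustrative sample rather than the slides, and 5.32 is the F-table value for α = 0.05 with 1 and 8 degrees of freedom. The helper name f_statistic is hypothetical.

```python
def f_statistic(ssr, sse, n, num_independent=1):
    """Return F = MSR / MSE for a regression with an intercept."""
    msr = ssr / num_independent            # mean square due to regression
    mse = sse / (n - num_independent - 1)  # mean square error; n - 2 for SLR
    return msr / mse


# Assumed inputs for illustration.
ssr, sse, n = 14200.0, 1530.0, 10

f_stat = f_statistic(ssr, sse, n)
F_CRITICAL = 5.32  # F-table value: alpha = 0.05, df = (1, 8)
reject_h0 = f_stat >= F_CRITICAL

print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```

With one independent variable, F equals the square of the t statistic for the slope, which is why the two tests agree in simple linear regression.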
Caution about the Interpretation of
Significance Testing
Correlation is not causation!

Just because we reject H0 does not guarantee cause and effect;
a cause-and-effect conclusion also requires theoretical justification.

Furthermore, just because we can Reject H0 does not
mean the relationship between x and y is linear.

Chapter 14 Part I

  • 1. Chapter 14 Part I ISDS 2001 Matt Levy
  • 2. Introduction Regression is the term used to describe the technique of modeling and analyzing 1 or more variables. The focus is on a dependent variable, and one or more independent variables. Simple Linear Regression means 1 independent variable. Regression, and other statistical modeling techniques gives us the power to infer, or predict future outcomes. An understanding of regression, and the techniques used to validate your models will provide you with sound methodology to do just that.
  • 3. Simple Linear Regression As previously mentioned, simple linear regression means we have 1 dependent variable (y), and 1 independent variable (x). In order to make a prediction about y using x, we need sample data (from both x and y) in order to generate some additional terms, namely the parameters (β0 and β1), and an error term (ε). The parameters, β0 and β1, can be thought of as what is generated from explained variability The error term (ε) accounts for unexplained variability. Thus, the simple linear regression model is: y = β0 + β1x + ε
  • 4. Estimating the Regression Equation If we were so fortunate to know the population parameters, we could use the equation on the previous slide to compute the mean. Unfortunately, for us, we must use sample data to estimate these parameters, and subsequently, use different symbols to denote our estimated parameters: ŷ = b 0 + b 1x Note that we use place a hat over y (pronounced y-hat) and use english lettering to denote our estimated parameters. We now have an equation that graphs a "regression line" ŷ is the point estimator of E(y), the mean. b0 is the y-intercept b1 is the slope
  • 5. The Estimation Process for Simple Linear Regression
  • 6. The Estimation Process for Simple Linear Regression So how to we estimate b0 and b1? To do this we use a method known as least squares. In simple linear regression, finding b0 and b1 is relatively straightforward. Equations 14.6 and 14.7 in your book show the procedure for b0 and b1, respectively. Once b0 and b1 are obtained, the estimated simple linear regression equation will resemble the following: ŷ = 60 + 5x It is important to note that you will have a ŷi for every yi in the sample data-set. It is up to you to determine if the difference between them is small enough to de the equation an accurate predictor.
  • 7. Coefficient of Determination The Coefficient of Determination (r2) provides us one measure to judge how well our regression equation (for example: ŷ = 60 + 5x ) fits the actual data. Lets take some time to build r2 and learn some important terms along the way: ◆ Remember that we have an estimated dependent variable (ŷi ) and an actual dependent variable (yi) for each observation. ◆ (yi - ŷi ) is known as the ith residual. ◆ When we take (yi - ŷi ), square it, and sum the squares we get the Sum of Squares of the Error Terms (SSE) , hence SSE = ∑(yi - ŷi)2 . ◆ When we take (yi - y̅), square it, and sum the squares we get the Total Sum of Squares (SST), hence SST = ∑ (yi - y̅)2 ◆ Lastly, when we take (ŷi - y̅), square it, and sum the squares, we get a measure of how much the estimated values on the regression line deviate from the actual mean. ◆ This is known as the Sum of Squares of the Regression Line:  SSR = ∑ (ŷi - y̅)2
  • 8. Coefficient of Determination (con't) The relationship between SSR, SST, and SSE is one of the most important facts to know in statistics. SST = SSR + SSE Now, if (yi - ŷi) = 0 for every observation i, then SSE = 0, SST = SSR, and we have a perfect fit of the data. This is almost never the case in practice. On the flip side, if SSR = 0 (so that SST = SSE), we have the worst possible fit because everything is in the error term, or the unexplained portion of the equation. Hence, to measure goodness of fit we look at the ratio of SSR to SST. r² = SSR/SST This yields a value between 0 and 1. r² can be interpreted as the proportion (or, multiplied by 100, the percentage) of the total sum of squares (SST) that can be explained by using your estimated regression equation.
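The three sums of squares and r² can be computed directly from their definitions. The sketch below is illustrative only: the data and the helper name `fit_line` are assumptions (the data is chosen so the fitted line is ŷ = 60 + 5x, as in the slide's example).

```python
def fit_line(x, y):
    """Least squares estimates (b0, b1) for y-hat = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Illustrative sample data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

b0, b1 = fit_line(x, y)
y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation

r_squared = ssr / sst
print(f"SSE={sse:.1f}  SSR={ssr:.1f}  SST={sst:.1f}  r^2={r_squared:.4f}")
```

Checking that `ssr + sse` matches `sst` (up to floating-point rounding) is a quick sanity test on any fit computed this way.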
  • 9. Correlation Coefficient Denoted rxy, the correlation coefficient is a measure of the strength of the linear association between the independent variable (x) and the dependent variable (y). rxy = (sign of b1) √r² rxy always yields a value between -1 and +1, inclusive. A value of +1 indicates a perfect positive linear relationship. A value of -1 indicates a perfect negative linear relationship. A value of zero indicates no linear relationship. In practice, this is used much less, as it is restricted to measuring the linear relationship between two variables. r² can be used to measure goodness of fit in both linear and nonlinear relationships.
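As a rough check on the formula rxy = (sign of b1) √r², the sketch below computes rxy that way and compares it with the usual direct formula ∑(xi - x̅)(yi - y̅) / √(∑(xi - x̅)² ∑(yi - y̅)²). The data and variable names are illustrative assumptions.

```python
import math

# Illustrative sample data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # same quantity as SST

# Least squares fit and coefficient of determination
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
r_squared = ssr / syy

# Correlation via the slide's formula: (sign of b1) * sqrt(r^2)
sign = 1.0 if b1 > 0 else (-1.0 if b1 < 0 else 0.0)
r_from_fit = sign * math.sqrt(r_squared)

# Correlation via the direct formula, for comparison
r_direct = sxy / math.sqrt(sxx * syy)

print(f"r from fit = {r_from_fit:.4f}, r direct = {r_direct:.4f}")
```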
  • 10. Estimating the Regression Equation In this model, y can be thought of as having a distribution for each given value of x. As we have learned in the past, a distribution has a mean or expected value. Thus the regression equation for the mean is as follows: E(y) = β0 + β1x Notice that to obtain the mean, we simply drop the error term (ε), the part of the model that accounts for unexplained variability.
  • 11. Model Assumptions It is important to understand that r² alone is not enough to ensure we have an appropriate regression equation. There are numerous other tests and measures we must use. All of these tests are based on assumptions about the error term (ε): 1. E(ε) = 0. Implication: E(y) = β0 + β1x. 2. The variance of ε, denoted by σ², is the same for all values of x. Implication: The variance of y about the regression line equals σ² and is the same for all values of x. 3. The values of ε are independent (uncorrelated). Implication: The value of y for any x is not related to the value of y for any other x. 4. ε is a normally distributed random variable. Implication: Because y is a linear function of ε, y is also normally distributed. Table 14.14 in the text provides a complete explanation.
  • 12. Testing for Significance In Simple Linear Regression, the mean or expected value of y is a linear function of x (E(y) = β0 + β1x). If the value of β1 = 0, then E(y) = β0 + 0x = β0. Hence, in this case we can conclude x and y are not linearly related. In the next couple of slides we offer a few tests: the t-test, an evaluation of the confidence interval for β1, and the F-test. Each of these tests is based on the following hypotheses: H0: β1 = 0 Ha: β1 ≠ 0 This starts to tell us more about the appropriateness of our model.
  • 13. Estimating σ² As a precursor to running our tests, we need an estimate of σ². Recall one of our key assumptions: the variance of ε also represents the variance of y about the regression line. Also recall that the deviations of y about the regression line are called residuals. Hence we can call upon the SSE to calculate the Mean Square Error (MSE) as an estimate of σ², which we will denote as s². s² = MSE = SSE/(n-2), where n is the sample size and (n-2) is the error degrees of freedom. Consequently, the standard error of the estimate is s = √MSE.
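A minimal sketch of this calculation follows. It assumes illustrative data for which the least squares line is the slide's example ŷ = 60 + 5x; the data and the function name `standard_error_of_estimate` are not from the slides.

```python
import math


def standard_error_of_estimate(y, y_hat):
    """Return (MSE, s) for a simple linear regression with n observations."""
    n = len(y)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    mse = sse / (n - 2)          # n - 2 error degrees of freedom
    return mse, math.sqrt(mse)   # s is the square root of MSE


# Illustrative sample data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

# Estimated values from the example equation y-hat = 60 + 5x
y_hat = [60 + 5 * xi for xi in x]

mse, s = standard_error_of_estimate(y, y_hat)
print(f"MSE = {mse:.2f}, s = {s:.3f}")
```

The divisor is n-2 rather than n because two parameters (b0 and b1) were estimated from the same sample.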
  • 14. t-Test Remember we are testing the following: H0: β1 = 0; Ha: β1 ≠ 0 To do this we need information about the distribution of b1 (see figure 14.17). Specifically, we need the estimated standard deviation of b1, denoted sb1 (see figure 14.18). Once we have sb1 we can find the test statistic t: t = b1/sb1. We then use the t-table and our well-known rejection rules: Reject H0 if p-value ≤ α, or equivalently if t ≤ -tα/2 or t ≥ tα/2, where tα/2 is based on a t-distribution with n-2 degrees of freedom.
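The t statistic can be sketched as follows, using the standard estimate sb1 = s / √∑(xi - x̅)². The data, the function name `slope_t_statistic`, and the choice of α = 0.05 are illustrative assumptions; the critical value 2.306 is the t-table entry for α/2 = 0.025 with 8 degrees of freedom. The sketch uses the critical value form of the rejection rule because the Python standard library has no t-distribution for computing a p-value.

```python
import math


def slope_t_statistic(x, y):
    """Return (b1, s_b1, t) for testing H0: beta1 = 0 in simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))   # standard error of the estimate
    s_b1 = s / math.sqrt(sxx)      # estimated standard deviation of b1
    return b1, s_b1, b1 / s_b1


# Illustrative sample data (an assumption, not taken from the slides)
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

b1, s_b1, t = slope_t_statistic(x, y)

T_CRIT = 2.306   # t-table value for alpha/2 = 0.025 with n - 2 = 8 df
reject_h0 = abs(t) >= T_CRIT
print(f"b1 = {b1:.2f}, s_b1 = {s_b1:.4f}, t = {t:.2f}, reject H0: {reject_h0}")
```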
  • 15. Confidence Interval for β1 As an alternative to the t-test, we can check the confidence interval for β1. We are essentially checking to see if the confidence interval for β1 contains 0. The form of the confidence interval is as follows: b1 ± tα/2 · sb1 If this interval contains zero, we cannot reject the null hypothesis (H0) at the corresponding significance level α.
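The interval check is a one-line calculation once b1, sb1, and the table value tα/2 are in hand. In the sketch below, b1 = 5 comes from the slide's example equation, while sb1 = 0.58 and n = 10 (so 8 degrees of freedom and t0.025 = 2.306) are hypothetical values chosen for illustration.

```python
def slope_confidence_interval(b1, s_b1, t_crit):
    """Return (lower, upper) for the interval b1 +/- t_crit * s_b1."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin


b1 = 5.0          # slope from the example equation y-hat = 60 + 5x
s_b1 = 0.58       # hypothetical estimated standard deviation of b1
t_crit = 2.306    # t-table value for alpha/2 = 0.025 with 8 df (n = 10 assumed)

lower, upper = slope_confidence_interval(b1, s_b1, t_crit)
contains_zero = lower <= 0.0 <= upper

print(f"95% CI for beta1: ({lower:.3f}, {upper:.3f})")
print("Cannot reject H0" if contains_zero else "Reject H0: beta1 = 0")
```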
  • 16. F-Test Based on the F probability distribution (hence, using our F-table). In simple linear regression this does the same thing as the t-test. With more than one independent variable (multiple regression), ONLY the F-test can be used to test for overall significance. To arrive at the F-test statistic, we need the Mean Square due to Regression (MSR). MSR = SSR / (Number of Independent Variables) F = MSR/MSE (just like when we first learned ANOVA) We then use the F-table and our well-known rejection rules: Reject H0 if p-value ≤ α, or equivalently if F ≥ Fα, where Fα is based on an F-distribution with 1 degree of freedom (for SLR) in the numerator and (n-2) degrees of freedom in the denominator.
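The F statistic follows directly from the sums of squares. The sketch below uses hypothetical inputs (SSR = 14200, SSE = 1530, n = 10) and the function name `f_statistic` purely for illustration; 5.32 is the F-table value for α = 0.05 with 1 and 8 degrees of freedom.

```python
def f_statistic(ssr, sse, n, num_independent=1):
    """Return (MSR, MSE, F) for the overall significance test."""
    msr = ssr / num_independent            # mean square due to regression
    mse = sse / (n - num_independent - 1)  # n - 2 df in simple linear regression
    return msr, mse, msr / mse


# Hypothetical sums of squares for a sample of n = 10 observations
ssr, sse, n = 14200.0, 1530.0, 10

msr, mse, f = f_statistic(ssr, sse, n)

F_CRIT = 5.32   # F-table value for alpha = 0.05 with (1, 8) df
reject_h0 = f >= F_CRIT
print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f:.2f}, reject H0: {reject_h0}")
```

With a single independent variable, F equals the square of the t statistic for b1, which is why the two tests lead to the same conclusion in simple linear regression.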
  • 17. Caution about the Interpretation of Significance Testing Correlation is not causation! Just because we reject H0 does not guarantee cause-and-effect; a cause-and-effect conclusion must also be supported by theoretical justification. Furthermore, just because we can reject H0 does not mean the relationship between x and y is linear.