Chapter 15
Multiple Regression and Model Building
Copyright ©2018 McGraw-Hill Education. All rights reserved.
Chapter Outline
15.1 The Multiple Regression Model and the Least Squares
Point Estimate
15.2 R2 and Adjusted R2
15.3 Model Assumptions and the Standard Error
15.4 The Overall F Test
15.5 Testing the Significance of an Independent Variable
15.6 Confidence and Prediction Intervals
Chapter Outline Continued
15.7 The Sales Representative Case: Evaluating Employee
Performance
15.8 Using Dummy Variables to Model Qualitative Independent
Variables (Optional)
15.9 Using Squared and Interaction Variables (Optional)
15.10 Multicollinearity, Model Building and Model
Validation (Optional)
15.11 Residual Analysis and Outlier Detection in Multiple
Regression (Optional)
15.1 The Multiple Regression Model and the Least Squares
Point Estimate
Simple linear regression used one independent variable to
explain the dependent variable
Some relationships are too complex to be described using a
single independent variable
Multiple regression uses two or more independent variables to
describe the dependent variable
This allows multiple regression models to handle more complex
situations
There is no limit to the number of independent variables a
model can use
Multiple regression has only one dependent variable
LO15-1: Explain the multiple regression model and the related
least squares point estimates.
The Multiple Regression Model
The linear regression model relating y to x1, x2,…, xk is
y = β0 + β1x1 + β2x2 +…+ βkxk + ε
µy = β0 + β1x1 + β2x2 +…+ βkxk is the mean value of the
dependent variable y when the values of the independent
variables are x1, x2,…, xk
β0, β1, β2,… βk are the unknown regression parameters relating
the mean value of y to x1, x2,…, xk
ε is an error term that describes the effects on y of all factors
other than the independent variables x1, x2,…, xk
LO15-1
The Least Squares Estimates and Point Estimation and
Prediction
Estimation/prediction equation
ŷ = b0 + b1x1 + b2x2 + … + bkxk
is the point estimate of the mean value of the dependent
variable when the values of the independent variables are x1,
x2,…, xk
It is also the point prediction of an individual value of the
dependent variable when the values of the independent variables
are x1, x2,…, xk
b0, b1, b2,…, bk are the least squares point estimates of the
parameters β0, β1, β2,…, βk
x1, x2,…, xk are specified values of the independent predictor
variables x1, x2,…, xk
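The least squares point estimates b0, b1,…, bk can be sketched in code. The following is a minimal illustration, not the textbook's procedure: it solves the normal equations with plain Gauss-Jordan elimination, and the function names and the small data set are hypothetical.

```python
def fit_least_squares(X, y):
    """Return [b0, b1, ..., bk], the least squares point estimates."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(b, x):
    """Point estimate / point prediction: y_hat = b0 + b1*x1 + ... + bk*xk."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]
b = fit_least_squares(X, y)
y_hat = predict(b, (6, 2))
```

Because the hypothetical data contain no error term, the estimates reproduce the generating coefficients; with real data they would only approximate the unknown parameters.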
LO15-1
LO15-1
Example 15.1 The Tasty Sub Shop Case
Figure 15.4 (a)
15.2 R2 and Adjusted R2
Total variation is given by the formula $\sum (y_i - \bar{y})^2$
Explained variation is given by the formula $\sum (\hat{y}_i - \bar{y})^2$
Unexplained variation is given by the formula $\sum (y_i - \hat{y}_i)^2$
Total variation is the sum of explained and unexplained
variation
LO15-2: Calculate and interpret the multiple and adjusted
multiple coefficients of determination.
R2 and Adjusted R2 Continued
The multiple coefficient of determination is the ratio of
explained variation to total variation
R2 is the proportion of the total variation that is explained by
the overall regression model
Multiple correlation coefficient R is the square root of R2
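As a rough illustration of these definitions, the sketch below computes the three variations, R2, and R for a tiny hypothetical data set. The fitted values are assumed to come from a least squares fit, which is what makes total variation split exactly into its two parts.

```python
import math


def variation_summary(y, y_hat):
    """Return total, explained and unexplained variation plus R^2 and R."""
    y_bar = sum(y) / len(y)
    total = sum((yi - y_bar) ** 2 for yi in y)
    explained = sum((fi - y_bar) ** 2 for fi in y_hat)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    r2 = explained / total
    return total, explained, unexplained, r2, math.sqrt(r2)


# Hypothetical observations and their least squares fitted values
y = [2.0, 4.0, 5.0, 8.0]
y_hat = [1.9, 3.8, 5.7, 7.6]
total, explained, unexplained, r2, r = variation_summary(y, y_hat)
```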
LO15-2
Multiple Correlation Coefficient R
The multiple correlation coefficient R is just the square root of
R2
With simple linear regression, r would take on the sign of b1
There are multiple bi’s with multiple regression
For this reason, R is always positive
To interpret the direction of the relationship between the x’s
and y, you must look to the sign of the appropriate bi
coefficient
LO15-2
Adjusted R2
Adding an independent variable to multiple regression will raise
R2
R2 will rise slightly even if the new variable has no relationship
to y
Adjusted R2 corrects this tendency in R2
As a result, it gives a better estimate of the importance of the
independent variables
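A short sketch of the correction, using the standard form adjusted R2 = 1 - (1 - R2)(n - 1)/(n - (k + 1)). The numbers are hypothetical and only show that a tiny gain in R2 from an extra variable can still lower adjusted R2.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - (k + 1))


# Hypothetical: a fourth variable raises R^2 only from 0.900 to 0.901
before = adjusted_r2(0.900, n=20, k=3)
after = adjusted_r2(0.901, n=20, k=4)
```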
LO15-2
15.3 Model Assumptions and the Standard Error
Mean of Zero Assumption: The mean of the error terms is equal
to 0
Constant Variance Assumption: The variance of the error terms,
σ2, is the same for every combination of values of x1, x2,…, xk
Normality Assumption: The error terms follow a normal
distribution for every combination of values of
x1, x2,…, xk
Independence Assumption: The values of the error terms are
statistically independent of each other
LO15-3: Explain the assumptions behind multiple regression
and calculate the standard error.
The Mean Square Error and the Standard Error
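Sum of squared errors: $SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$
Mean square error: $s^2 = MSE = \frac{SSE}{n-(k+1)}$
Point estimate of the residual variance σ2
Standard error: $s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-(k+1)}}$
Point estimate of the residual standard deviation σ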
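A minimal sketch of the two point estimates, assuming the usual definitions MSE = SSE / (n - (k + 1)) and s = sqrt(MSE). The SSE, n and k values are hypothetical.

```python
import math


def standard_error(sse, n, k):
    """Return (MSE, s) for a model with k independent variables."""
    mse = sse / (n - (k + 1))      # point estimate of sigma^2
    return mse, math.sqrt(mse)     # s is the point estimate of sigma


mse, s = standard_error(sse=120.0, n=15, k=2)  # hypothetical inputs
```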
LO15-3
15.4 The Overall F Test
To test
H0: β1= β2 = …= βk = 0 versus
Ha: At least one of β1, β2,…, βk ≠ 0
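Test statistic
$F = \frac{(\text{Explained variation})/k}{(\text{Unexplained variation})/[n-(k+1)]}$
Reject H0 in favor of Ha if F > Fα or if p-value < α
Fα is based on k numerator and n - (k + 1) denominator
degrees of freedom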
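The overall F statistic is the explained variation per independent variable divided by the mean square error. A small sketch with hypothetical sums of squares follows; the critical value Fα would still have to come from an F table or statistical software.

```python
def overall_f(explained, unexplained, n, k):
    """F statistic with k numerator and n - (k + 1) denominator df."""
    return (explained / k) / (unexplained / (n - (k + 1)))


# Hypothetical: explained = 480, unexplained = 120, n = 15, k = 2
f_stat = overall_f(explained=480.0, unexplained=120.0, n=15, k=2)
```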
LO15-4: Test the significance of a multiple regression model by
using an F test.
15.5 Testing the Significance of an Independent Variable
A variable in a multiple regression model is not likely to be
useful unless there is a significant relationship between it and y
To test significance, we use the null hypothesis H0: βj = 0
Versus the alternative hypothesis
Ha: βj ≠ 0
LO15-5: Test the significance of a single independent variable.
Testing the Significance of the Independent Variable xj
LO15-5
Testing the Significance of an Independent Variable Continued
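Customary to test significance of every independent variable
If we can reject H0: βj = 0 at the 0.05 level of significance, we
have strong evidence the independent variable xj is
significantly related to y
If we can reject H0: βj = 0 at the 0.01 level of significance, we
have very strong evidence the independent variable xj is
significantly related to y
The smaller the significance level α at which H0 can be
rejected, the stronger the evidence that xj is significantly
related to y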
LO15-5
A Confidence Interval for the Regression Parameter βj
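If the regression assumptions hold, a 100(1 - α) percent
confidence interval for βj is $[b_j \pm t_{\alpha/2}\, s_{b_j}]$
$t_{\alpha/2}$ is based on n – (k + 1) degrees of freedom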
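The t statistic for H0: βj = 0 and the matching confidence interval can be sketched as below, assuming the usual forms t = bj / s_bj and bj ± t(α/2) s_bj. The estimate and its standard error are hypothetical; 2.179 is the tabled t point for a 95 percent interval with 12 degrees of freedom.

```python
def t_statistic(b_j, s_bj):
    """t statistic for testing H0: beta_j = 0."""
    return b_j / s_bj


def beta_confidence_interval(b_j, s_bj, t_crit):
    """Interval b_j +/- t_crit * s_bj, with t_crit read from a t table."""
    half_width = t_crit * s_bj
    return b_j - half_width, b_j + half_width


# Hypothetical estimate and standard error, n - (k + 1) = 12
t = t_statistic(2.5, 0.5)
lower, upper = beta_confidence_interval(2.5, 0.5, t_crit=2.179)
```

An interval that excludes zero agrees with rejecting H0 at the same α in a two-sided test.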
LO15-5
15.6 Confidence and Prediction Intervals
The point on the regression line corresponding to particular
values x1, x2,…, xk of the independent variables is
ŷ = b0 + b1x1 + b2x2 + … + bkxk
It is unlikely that this value will equal the mean value of y for
these x values
Therefore, we need to place bounds on how far away the
predicted value might be
We can do this by calculating a confidence interval for the mean
value of y and a prediction interval for an individual value of y
LO15-6: Find and interpret a confidence interval for a mean
value and a prediction interval for an individual value.
Distance Value
Both the confidence interval for the mean value of y and the
prediction interval for an individual value of y employ a
quantity called the distance value
With simple regression, we were able to calculate the distance
value fairly easily
However, for multiple regression, calculating the distance value
requires matrix algebra
LO15-6
A Confidence Interval and a Prediction Interval
Distance value
Assume the regression assumptions hold
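Confidence interval for the mean value of y: $[\hat{y} \pm t_{\alpha/2}\, s \sqrt{\text{distance value}}]$
Prediction interval for an individual value of y: $[\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \text{distance value}}]$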
These are based on n - (k + 1) degrees of freedom
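Given a distance value reported by software, both intervals follow directly. The sketch below assumes the usual forms ŷ ± t s sqrt(distance value) for the mean and ŷ ± t s sqrt(1 + distance value) for an individual value. All inputs are hypothetical.

```python
import math


def intervals(y_hat, s, distance_value, t_crit):
    """Return (confidence interval, prediction interval) as (low, high) pairs."""
    ci_half = t_crit * s * math.sqrt(distance_value)
    pi_half = t_crit * s * math.sqrt(1 + distance_value)
    return ((y_hat - ci_half, y_hat + ci_half),
            (y_hat - pi_half, y_hat + pi_half))


# Hypothetical: y_hat = 100, s = 4, distance value = 0.25, t point = 2.0
ci, pi = intervals(100.0, 4.0, 0.25, 2.0)
```

The extra 1 under the square root is why the prediction interval is always the wider of the two.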
LO15-6
15.7 The Sales Representative Case: Evaluating Employee
Performance
yi Yearly sales of the company’s product
x1 Number of months the representative has been employed
x2 Sales of products in the sales territory
x3 Dollar advertising expenditure in the territory
x4 Weighted average of the company’s market share in
territory for the previous four years
x5 Change in the company’s market share in the territory over
the previous four years
Partial Excel Output of a Regression Analysis of the Sales
Territory Performance Data
Figure 15.10a
Time = 85.42
MktPoten = 35,182.73
Adver = 7,281.65
MktShare = 9.64
Change = .28
Predicted Sales = 4,181.74
95% Prediction Interval = [3,233.59, 5,129.89]
15.8 Using Dummy Variables to Model Qualitative Independent
Variables (Optional)
So far, we have only looked at including quantitative data in a
regression model
However, we may wish to include descriptive qualitative data as
well
For example, we might want to include the gender of respondents
We can model the effects of different levels of a qualitative
variable by using what are called dummy variables
Also known as indicator variables
LO15-7: Use dummy variables to model qualitative independent
variables (Optional).
Constructing Dummy Variables
A dummy variable always has a value of either 0 or 1
For example, to model sales at two locations, we would code the
first location as 0 and the second as 1
Operationally, it does not matter which is coded 0 and which is
coded 1
LO15-7
What If We Have More Than Two Categories?
Consider having three categories, say A, B and C
Cannot code this using one dummy variable
A=0, B=1 and C=2 would be invalid
Assumes the difference between A and B is the same as B and C
We must use multiple dummy variables
Specifically, k categories require k - 1 dummy variables
LO15-7
What If We Have Three Categories?
For A, B, and C, would need two dummy variables
x1 is 1 for A, zero otherwise
x2 is 1 for B, zero otherwise
If x1 and x2 are zero, must be C
This is why the third dummy variable is not needed
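The coding scheme above can be sketched as a small helper. The function name is hypothetical; the baseline category (C here) is the one that receives all zeros.

```python
def dummy_code(values, baseline):
    """Code a qualitative variable with (number of categories - 1) dummies."""
    levels = sorted(set(values) - {baseline})  # one dummy per non-baseline level
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


levels, rows = dummy_code(["A", "B", "C", "A"], baseline="C")
# levels names the dummies (x1 for A, x2 for B); each row is [x1, x2]
```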
LO15-7
Interaction Models
So far, have only considered dummy variables as stand-alone
variables
The model so far is y = β0 + β1x + β2D + ε
Where D is a dummy variable
However, can also look at interaction between a dummy variable
and other variables
That model would take the form y = β0 + β1x + β2D + β3xD + ε
With an interaction term, both the intercept and slope are
shifted
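To see why both the intercept and the slope shift, assume the interaction model has the usual form y = β0 + β1x + β2D + β3xD + ε (the symbol numbering is an assumption) and write the mean of y separately for each value of the dummy variable:

```latex
\begin{aligned}
D = 0:&\quad \mu_y = \beta_0 + \beta_1 x \\
D = 1:&\quad \mu_y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,x
\end{aligned}
```

Under that assumption, β2 is the change in intercept and β3 is the change in slope when D goes from 0 to 1.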
LO15-7
15.9 Using Squared and Interaction Variables (Optional)
Quadratic regression model is:
y = β0 + β1x + β2x² + ε
where
β0 + β1x + β2x² is μy
β0, β1, and β2 are the regression parameters
ε is an error term
LO15-8: Use squared and interaction variables.
Using Interaction Variables
Regression models often contain interaction variables
Formed by multiplying two independent variables together
Consider a model where x3 and x4 interact
and x3 is used as a quadratic
y = β0 + β1x4 + β2x3 + β3x3² + β4x4x3 + ε
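A brief sketch of what the interaction term does in this model: the change in the mean of y per unit of x4 is β1 + β4x3, so it depends on x3. The parameter values are hypothetical and chosen only for illustration.

```python
def mean_y(beta, x3, x4):
    """Mean of y for the model with a squared term in x3 and an x4*x3 interaction."""
    b0, b1, b2, b3, b4 = beta
    return b0 + b1 * x4 + b2 * x3 + b3 * x3 ** 2 + b4 * x4 * x3


beta = (10.0, 2.0, 3.0, -0.5, 0.25)  # hypothetical parameter values

# Effect of raising x4 by one unit, evaluated at two different x3 values
effect_at_x3_2 = mean_y(beta, 2, 5) - mean_y(beta, 2, 4)
effect_at_x3_6 = mean_y(beta, 6, 5) - mean_y(beta, 6, 4)
```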
LO15-8
15.10 Multicollinearity, Model Building, and Model Validation
(Optional)
Multicollinearity: when “independent” variables are related to
one another
Considered severe when the simple correlation exceeds 0.9
Even moderate multicollinearity can be a problem
Another measurement is variance inflation factors
Multicollinearity considered
Severe when VIF > 10
Moderately strong for VIF > 5
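The variance inflation factor for xj is 1 / (1 - Rj2), where Rj2 comes from regressing xj on the other independent variables. The sketch below covers only the two-predictor case, where Rj2 is the squared simple correlation between the two predictors. The data and function names are hypothetical.

```python
import math


def correlation(u, v):
    """Simple correlation coefficient between two lists of values."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    return 1.0 / (1.0 - correlation(x1, x2) ** 2)


def describe(vif):
    """Apply the rule-of-thumb thresholds from the slide."""
    if vif > 10:
        return "severe"
    if vif > 5:
        return "moderately strong"
    return "not flagged"


vif = vif_two_predictors([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])  # hypothetical data
label = describe(vif)
```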
LO15-9: Describe multicollinearity and build and validate a
multiple regression model (Optional).
Effect of Adding Independent Variable
Adding any independent variable will increase R²
Even adding an unimportant independent variable
Thus, R² cannot tell us that adding an independent variable is
undesirable
LO15-9
A Better Criterion is the Standard Error
A better criterion is the size of the standard error s
If s increases when an independent variable is added, we should
not add that variable
However, decreasing s alone is not enough
An independent variable should only be included if it reduces s
enough to offset the higher t value (the t point grows as a degree
of freedom is lost) and so reduces the length of the desired
prediction interval for y
LO15-9
C Statistic
Another quantity for comparing regression models is called the
C (a.k.a. Cp) statistic
First, calculate mean square error for the model containing all p
potential independent variables (s2p)
Next, calculate SSE for a reduced model with k independent
variables
Then $C = \frac{SSE}{s_p^2} - [n - 2(k+1)]$
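A minimal sketch of the calculation, assuming the usual form C = SSE / s2p - [n - 2(k + 1)]. The inputs are hypothetical and chosen so that C lands exactly on k + 1.

```python
def c_statistic(sse_reduced, mse_full, n, k):
    """C statistic for a reduced model with k independent variables."""
    return sse_reduced / mse_full - (n - 2 * (k + 1))


# Hypothetical: SSE = 260 for the reduced model, full-model MSE = 10
c = c_statistic(sse_reduced=260.0, mse_full=10.0, n=30, k=3)
```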
LO15-9
C Statistic Continued
We want the value of C to be small
Adding unimportant independent variables will raise the value
of C
While we want C to be small, we also wish to find a model for
which C roughly equals k + 1
A model with C substantially greater than k + 1 has substantial
bias and is undesirable
If a model has a small value of C and C for this model is less
than k + 1, then it is not biased and the model should be
considered desirable
LO15-9
The Partial F Test: An F Test for a Portion of a Regression
Model
To test
H0: All of the βj coefficients corresponding to the independent
variables in the subset are zero
Ha: At least one of the βj coefficients is not equal to zero
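Partial F statistic: $F = \frac{(SSE_R - SSE_C)/(k-g)}{SSE_C/[n-(k+1)]}$
SSE_C is the unexplained variation for the complete model (k
independent variables) and SSE_R is the unexplained variation
for the reduced model (g independent variables)
Reject H0 in favor of Ha if: F > Fα or p-value < α
Fα is based on k - g numerator and n - (k + 1) denominator
degrees of freedom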
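The partial F statistic compares the unexplained variation of a reduced model with that of the complete model. The sketch assumes the complete model has k independent variables and the reduced model keeps g of them, so k - g coefficients are being tested. The sums of squares are hypothetical.

```python
def partial_f(sse_reduced, sse_complete, n, k, g):
    """Partial F with k - g numerator and n - (k + 1) denominator df."""
    numerator = (sse_reduced - sse_complete) / (k - g)
    denominator = sse_complete / (n - (k + 1))
    return numerator / denominator


# Hypothetical: dropping 2 of 4 variables raises SSE from 120 to 200
f_partial = partial_f(sse_reduced=200.0, sse_complete=120.0, n=25, k=4, g=2)
```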
LO15-9
15.11 Residual Analysis and Outlier Detection in Multiple
Regression (Optional)
For an observed value of yi, the residual is
ei = yi - ŷi = yi – (b0 + b1xi1 + … + bkxik)
If the assumptions hold, the residuals should look like a random
sample from a normal distribution with mean 0 and variance σ2
Residual plots
Residuals versus each independent variable
Residuals versus predicted y’s
Residuals in time order (if the response is a time series)
LO15-10: Use residual analysis and outlier detection to check
the assumptions of multiple regression (Optional).
Figure 15.35
LO15-10
Outliers
Figure 15.37 c, d and e
Chapter 14
Simple Linear Regression Analysis
Chapter Outline
14.1 The Simple Linear Regression Model and the Least Squares
Point Estimates
14.2 Simple Coefficients of Determination and Correlation
14.3 Model Assumptions and the Standard Error
14.4 Testing the Significance of the Slope and
y-Intercept
14.5 Confidence and Prediction Intervals
Chapter Outline Continued
14.6 Testing the Significance of the Population Correlation
Coefficient (Optional)
14.7 Residual Analysis
14.1 The Simple Linear Regression Model and the Least
Squares Point Estimates
The dependent (or response) variable is the variable we wish to
understand or predict
The independent (or predictor) variable is the variable we will
use to understand or predict the dependent variable
Regression analysis is a statistical technique that uses observed
data to relate the dependent variable to one or more independent
variables
The objective is to build a regression model that can describe,
predict and control the dependent variable based on the
independent variable
LO14-1: Explain the simple linear regression model.
Form of The Simple Linear
Regression Model
y = β0 + β1x + ε
µy = β0 + β1x is the mean value of the dependent variable y
when the value of the independent variable is x
β0 is the y-intercept; the mean of y when x is zero
β1 is the slope; the change in the mean of y per unit change in x
ε is an error term that describes the effect on y of all factors
other than x
LO14-1
Regression Terms
β0 and β1 are called regression parameters
β0 is the y-intercept
β1 is the slope
We do not know the true values of these parameters
So, we must use sample data to estimate them
b0 is the estimate of β0
b1 is the estimate of β1
LO14-1
LO14-1
The Simple Linear Regression Model Illustrated
Figure 14.3
The Least Squares Point Estimates
LO14-2: Find the least squares point estimates of the slope and
y-intercept.
Example 14.2 The Tasty Sub Shop Case: The Least Squares
Estimates
LO14-2
Example 14.2 The Tasty Sub Shop Case: The Least Squares
Estimates
From last slide,
Σyi = 8,603.1
Σxi = 434.1
Σx2i = 20,757.41
Σxiyi = 403,296.96
Once we have these values, we no longer need the raw data
Calculation of b0 and b1 uses these totals
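The point that the totals are enough can be checked directly. The sketch below uses the four sums listed above with n = 10 and the standard shortcut formulas SSxy = Σxiyi - (Σxi)(Σyi)/n and SSxx = Σxi² - (Σxi)²/n. The variable names are illustrative.

```python
n = 10
sum_y, sum_x = 8603.1, 434.1
sum_x2, sum_xy = 20757.41, 403296.96

ss_xy = sum_xy - sum_x * sum_y / n   # sum of cross products about the means
ss_xx = sum_x2 - sum_x ** 2 / n      # sum of squares of x about its mean

b1 = ss_xy / ss_xx                   # least squares slope
b0 = sum_y / n - b1 * sum_x / n      # least squares intercept: y_bar - b1 * x_bar

y_hat = b0 + b1 * 20.8               # point prediction at x = 20.8
residual = 527.1 - y_hat             # observed minus predicted
```

Carrying unrounded estimates through the prediction reproduces the 507.69 and 19.41 shown in the example.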
LO14-2
Example 14.2 The Tasty Sub Shop Case (Slope b1)
LO14-2
Example 14.2 The Tasty Sub Shop Case (y-Intercept b0)
Prediction (x = 20.8)
ŷ = b0 + b1x = 183.31 + (15.596)(20.8)
ŷ = 507.69
Residual is 527.1 – 507.69 = 19.41
LO14-2
Figure 14.5
14.2 Simple Coefficients of
Determination and Correlation
How useful is a particular regression model?
One measure of usefulness is the simple coefficient of
determination
It is represented by the symbol r2
LO14-3: Calculate and interpret the simple coefficients of
determination and correlation.
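The Simple Coefficient of Determination, r2
Total variation is $\sum (y_i - \bar{y})^2$
Explained variation is $\sum (\hat{y}_i - \bar{y})^2$
Unexplained variation is $\sum (y_i - \hat{y}_i)^2$
Total variation is the sum of explained and unexplained
variation
Simple coefficient of determination is $r^2 = \frac{\text{Explained variation}}{\text{Total variation}}$
r2 is the proportion of the total variation that is explained by
the regression model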
LO14-3
The Simple Correlation Coefficient, r
The simple correlation coefficient between y and x is denoted
by r
It is…
$r = +\sqrt{r^2}$ if b1 is positive
$r = -\sqrt{r^2}$ if b1 is negative
Where b1 is the slope of the least squares line
The simple correlation coefficient measures the strength of the
linear relationship between y and x
LO14-3
LO14-3
Different Values of the Correlation Coefficient
Figure 14.8
14.3 Model Assumptions and the Standard Error
Mean of Zero: At any given value of x, the population of
potential error term values has a mean equal to zero
Constant Variance Assumption: At any value of x, the
population of potential error term values has a variance that
does not depend on the value of x
Normality Assumption: At any given value of x, the population
of potential error term values has a normal distribution
Independence Assumption: Any one value of the error term ε is
statistically independent of any other value of ε
LO14-4: Describe the assumptions behind simple linear
regression and calculate the standard error.
Figure 14.9
LO14-4
The Mean Square Error and the Standard Error
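Sum of squared errors: $SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$
Mean square error: $s^2 = MSE = \frac{SSE}{n-2}$
Point estimate of the residual variance σ2
Standard error: $s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-2}}$
Point estimate of the residual standard deviation σ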
14.4 Testing the Significance of the Slope and y-Intercept
A regression model is not likely to be useful unless there is a
significant relationship between x and y
To test significance, we use the null hypothesis:
H0: β1 = 0
Versus the alternative hypothesis:
Ha: β1 ≠ 0
LO14-5: Test the significance of the slope and y-intercept.
Testing the Significance of the Slope and y-Intercept Continued
LO14-5
An F Test for the Significance of the Slope (Optional)
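To test H0: β1 = 0 versus Ha: β1 ≠ 0
Test statistic: $F = \frac{\text{Explained variation}}{(\text{Unexplained variation})/(n-2)}$
Reject H0 if F > Fα or if p-value < α
Fα is based on 1 numerator and n - 2 denominator degrees of
freedom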
LO14-6: Test the significance of a simple linear regression
model by using an F test (Optional).
14.5 Confidence and Prediction Intervals
The point on the regression line corresponding to a particular
value of x0 of the independent variable x is ŷ = b0 + b1x0
It is unlikely that this value will equal the mean value of y
when x equals x0
Therefore, we need to place bounds on how far the predicted
value might be from the actual value
We can do this by calculating a confidence interval for the mean
value of y and a prediction interval for an individual value of y
LO14-7: Calculate and interpret a confidence interval for a
mean value and a prediction interval for an individual value.
Distance Value
Both the confidence interval for the mean value of y and the
prediction interval for an individual value of y employ a
quantity called the distance value
The distance value is a measure of the distance between the
value x0 of x and x̄, the average of the previously observed
values of x
Notice that the further x0 is from x̄, the larger the distance value
LO14-7
A Confidence Interval and Prediction Interval
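Assume that the regression assumptions hold
The formula for a 100(1 - α) percent confidence interval for the mean
value of y is $[\hat{y} \pm t_{\alpha/2}\, s \sqrt{\text{distance value}}]$
The formula for a 100(1 - α) percent prediction interval for an
individual value of y is $[\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \text{distance value}}]$
These are based on n - 2 degrees of freedom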
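For simple linear regression the distance value has the closed form 1/n + (x0 - x̄)²/SSxx, so both intervals can be sketched without matrix algebra. The values n = 10, x̄ = 43.41, SSxx = 1,913.129 and ŷ = 507.69 at x0 = 20.8 come from the Tasty Sub Shop example. The standard error s = 60.0 is a hypothetical placeholder, and 2.306 is the tabled t point for a 95 percent interval with 8 degrees of freedom.

```python
import math


def distance_value(x0, n, x_bar, ss_xx):
    """Distance value for simple linear regression."""
    return 1 / n + (x0 - x_bar) ** 2 / ss_xx


def interval(y_hat, half_width):
    return y_hat - half_width, y_hat + half_width


dv = distance_value(x0=20.8, n=10, x_bar=43.41, ss_xx=1913.129)

s = 60.0        # hypothetical standard error, not taken from the slides
t_crit = 2.306  # t point for alpha/2 = .025 with n - 2 = 8 df
y_hat = 507.69

ci = interval(y_hat, t_crit * s * math.sqrt(dv))      # mean value of y
pi = interval(y_hat, t_crit * s * math.sqrt(1 + dv))  # individual value of y
```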
LO14-7
Which to Use?
The prediction interval is useful if it is important to predict an
individual value of the dependent variable
A confidence interval is useful if it is important to estimate the
mean value
The prediction interval will always be wider than the confidence
interval
LO14-7
14.6 Testing the Significance of the Population Correlation
Coefficient (Optional)
The simple correlation coefficient (r) measures the linear
relationship between the observed values of x and y from the
sample
The population correlation coefficient (ρ) measures the linear
relationship between all possible combinations of observed
values of x and y
r is an estimate of ρ
LO14-8: Test hypotheses about the population correlation
coefficient (Optional).
Testing ρ
We can test to see if the correlation is significant using the
hypotheses
H0: ρ = 0
Ha: ρ ≠ 0
The statistic is $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$
This test will give the same results as the test for significance
of the slope β1
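A small sketch of the test statistic for H0: ρ = 0, using the standard form t = r sqrt(n - 2) / sqrt(1 - r²) with n - 2 degrees of freedom. The sample correlation and sample size are hypothetical.

```python
import math


def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, based on n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t = correlation_t(r=0.6, n=18)  # hypothetical sample correlation and size
```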
LO14-8
14.7 Residual Analysis
Checks of regression assumptions are performed by analyzing
the regression residuals
Residuals (e) are defined as the difference between the observed
value of y and the predicted value of y, e = y - ŷ
Note that e is the point estimate of ε
If the regression assumptions are valid, the population of
potential error terms will be normally distributed with mean
zero and variance σ2
Different error terms will be statistically independent
LO14-9: Use residual analysis to check the assumptions of
simple linear regression.
Residual Analysis Continued
Residuals are randomly and independently selected from normal
populations with mean zero and variance σ2
With any real data, assumptions will not hold exactly
Mild departures do not affect our ability to make statistical
inferences
In checking assumptions, we are looking for pronounced
departures from the assumptions
So, only require residuals to approximately fit the description
above
LO14-9
LO14-9
Example 14.9 The QHIC Case: Constructing Residual Plots
Figure 14.18b
Quality Home Improvement Center (QHIC) operates five stores
Studies the relationship between home value and yearly
expenditure on home upkeep
Random sample of 40 homeowners
Intercept = –348.3921
Slope = 7.2583
Residual Plots
Residuals versus independent variable
Residuals versus predicted y’s
Residuals in time order (if the response is a time series)
LO14-9
Constant Variance Assumptions
To check the validity of the constant variance assumption,
examine residual plots against
The x values
The predicted y values
Time (when data is time series)
A pattern that fans out says the variance is increasing rather
than staying constant
A pattern that funnels in says the variance is decreasing rather
than staying constant
A pattern that is evenly spread within a band says the
assumption has been met
LO14-9
LO14-9
Constant Variance Visually
Figure 14.19
Assumption of Correct Functional Form
If the relationship between x and y is something other than a
linear one, the residual plot will often suggest a form more
appropriate for the model
For example, if there is a curved relationship between x and y, a
plot of residuals will often show a curved relationship
LO14-9
Normality Assumption
If the normality assumption holds, a histogram or stem-and-leaf
display of residuals should look bell-shaped and symmetric
Another way to check is a normal plot of residuals
Order residuals from smallest to largest
Plot e(i) on vertical axis against z(i)
z(i) is the point on the horizontal axis under the standard normal
curve so that the area under this curve to the left of z(i) is
(3i - 1)/(3n + 1)
If the normality assumption holds, the plot should have a
straight-line appearance
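The normal plot coordinates can be sketched with the standard library's `statistics.NormalDist`. The residuals are hypothetical and the function name is illustrative. Each ordered residual is paired with the standard normal point whose left-tail area is (3i - 1)/(3n + 1).

```python
from statistics import NormalDist


def normal_plot_points(residuals):
    """Return (z_(i), e_(i)) pairs for a normal plot of the residuals."""
    n = len(residuals)
    ordered = sorted(residuals)          # e_(1) <= e_(2) <= ... <= e_(n)
    std_normal = NormalDist()            # mean 0, standard deviation 1
    return [(std_normal.inv_cdf((3 * i - 1) / (3 * n + 1)), e)
            for i, e in enumerate(ordered, start=1)]


points = normal_plot_points([1.2, -0.4, 0.3, -1.5, 0.4])  # hypothetical residuals
```

Plotting the second coordinate against the first gives the normal plot; a roughly straight line is consistent with the normality assumption.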
LO14-9
Independence Assumption
Independence assumption most likely violated by time-series
data
If the data is not time series, it can be reordered without
affecting it
For time-series data, the time-ordered error terms can be
autocorrelated
Positive autocorrelation is when a positive error term in time
period i tends to be followed by another positive value in i + k
Negative autocorrelation is when a positive error term tends to
be followed by a negative value
Positive autocorrelation produces a cyclical pattern in the error
terms over time; negative autocorrelation produces an
alternating pattern
LO14-9
LO14-9
Independence Assumption Visually
Figure 14.26 a and b
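The Least Squares Point Estimates (formulas)
Estimation/prediction equation: $\hat{y} = b_0 + b_1 x$
Least squares point estimate of the slope β1: $b_1 = \frac{SS_{xy}}{SS_{xx}}$, where
$SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}$ and
$SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{(\sum x_i)^2}{n}$
Least squares point estimate of the y-intercept β0: $b_0 = \bar{y} - b_1 \bar{x}$, where
$\bar{y} = \frac{\sum y_i}{n}$ and $\bar{x} = \frac{\sum x_i}{n}$
Example 14.2 The Tasty Sub Shop Case (Slope b1)
$SS_{xy} = 403{,}296.96 - \frac{(434.1)(8{,}603.1)}{10} = 29{,}836.389$
$SS_{xx} = 20{,}757.41 - \frac{(434.1)^2}{10} = 1{,}913.129$
$b_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{29{,}836.389}{1{,}913.129} = 15.596$
Example 14.2 The Tasty Sub Shop Case (y-Intercept b0)
$\bar{y} = \frac{8{,}603.1}{10} = 860.31$ and $\bar{x} = \frac{434.1}{10} = 43.41$
$b_0 = \bar{y} - b_1 \bar{x} = 860.31 - (15.596)(43.41) = 183.31$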
Chapter 15Multiple Regression and Model BuildingCo

More Related Content

Similar to Chapter 15Multiple Regression and Model BuildingCo

Lecture - 8 MLR.pptx
Lecture - 8 MLR.pptxLecture - 8 MLR.pptx
Lecture - 8 MLR.pptxiris765749
 
8 correlation regression
8 correlation regression 8 correlation regression
8 correlation regression Penny Jiang
 
Detail Study of the concept of Regression model.pptx
Detail Study of the concept of  Regression model.pptxDetail Study of the concept of  Regression model.pptx
Detail Study of the concept of Regression model.pptxtruptikulkarni2066
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptxShivankAggatwal
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inferenceKemal İnciroğlu
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionmejikpg
 
Multiple regression in R
Multiple regression in RMultiple regression in R
Multiple regression in RAman Chauhan
 
15 ch ken black solution
15 ch ken black solution15 ch ken black solution
15 ch ken black solutionKrunal Shah
 
604_multiplee.ppt
604_multiplee.ppt604_multiplee.ppt
604_multiplee.pptRufesh
 
Cost Behavior
Cost BehaviorCost Behavior
Cost BehaviorAIS_USU
 
OR CHAPTER TWO II.PPT
OR CHAPTER  TWO II.PPTOR CHAPTER  TWO II.PPT
OR CHAPTER TWO II.PPTAynetuTerefe2
 
Logistic regression and analysis using statistical information
Logistic regression and analysis using statistical informationLogistic regression and analysis using statistical information
Logistic regression and analysis using statistical informationAsadJaved304231
 
Introduction to Limited Dependent variable
Introduction to Limited Dependent variableIntroduction to Limited Dependent variable
Introduction to Limited Dependent variableAshok Dsouza
 

Similar to Chapter 15Multiple Regression and Model BuildingCo (20)

Lecture - 8 MLR.pptx
Lecture - 8 MLR.pptxLecture - 8 MLR.pptx
Lecture - 8 MLR.pptx
 
8 correlation regression
8 correlation regression 8 correlation regression
8 correlation regression
 
Detail Study of the concept of Regression model.pptx
Detail Study of the concept of  Regression model.pptxDetail Study of the concept of  Regression model.pptx
Detail Study of the concept of Regression model.pptx
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptx
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Regression
RegressionRegression
Regression
 
Chapter 15
Chapter 15Chapter 15
Chapter 15
 
Multiple regression in R
Multiple regression in RMultiple regression in R
Multiple regression in R
 
Linear regression theory
Linear regression theoryLinear regression theory
Linear regression theory
 
15 ch ken black solution
15 ch ken black solution15 ch ken black solution
15 ch ken black solution
 
604_multiplee.ppt
604_multiplee.ppt604_multiplee.ppt
604_multiplee.ppt
 
Cost Behavior
Cost BehaviorCost Behavior
Cost Behavior
 
OR CHAPTER TWO II.PPT
OR CHAPTER  TWO II.PPTOR CHAPTER  TWO II.PPT
OR CHAPTER TWO II.PPT
 
Rsh qam11 ch04 ge
Rsh qam11 ch04 geRsh qam11 ch04 ge
Rsh qam11 ch04 ge
 
Multiple Linear Regression
Multiple Linear Regression Multiple Linear Regression
Multiple Linear Regression
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Logistic regression and analysis using statistical information
Logistic regression and analysis using statistical informationLogistic regression and analysis using statistical information
Logistic regression and analysis using statistical information
 
SPSS
SPSSSPSS
SPSS
 
Introduction to Limited Dependent variable
Introduction to Limited Dependent variableIntroduction to Limited Dependent variable
Introduction to Limited Dependent variable
 

More from EstelaJeffery653

Individual ProjectMedical TechnologyWed, 9617Num.docx
Individual ProjectMedical TechnologyWed, 9617Num.docxIndividual ProjectMedical TechnologyWed, 9617Num.docx
Individual ProjectMedical TechnologyWed, 9617Num.docxEstelaJeffery653
 
Individual ProjectThe Post-Watergate EraWed, 3817Numeric.docx
Individual ProjectThe Post-Watergate EraWed, 3817Numeric.docxIndividual ProjectThe Post-Watergate EraWed, 3817Numeric.docx
Individual ProjectThe Post-Watergate EraWed, 3817Numeric.docxEstelaJeffery653
 
Individual ProjectArticulating the Integrated PlanWed, 31.docx
Individual ProjectArticulating the Integrated PlanWed, 31.docxIndividual ProjectArticulating the Integrated PlanWed, 31.docx
Individual ProjectArticulating the Integrated PlanWed, 31.docxEstelaJeffery653
 
Individual Multilingualism Guidelines1)Where did the a.docx
Individual Multilingualism Guidelines1)Where did the a.docxIndividual Multilingualism Guidelines1)Where did the a.docx
Individual Multilingualism Guidelines1)Where did the a.docxEstelaJeffery653
 
Individual Implementation Strategiesno new messagesObjectives.docx
Individual Implementation Strategiesno new messagesObjectives.docxIndividual Implementation Strategiesno new messagesObjectives.docx
Individual Implementation Strategiesno new messagesObjectives.docxEstelaJeffery653
 
Individual Refine and Finalize WebsiteDueJul 02View m.docx
Individual Refine and Finalize WebsiteDueJul 02View m.docxIndividual Refine and Finalize WebsiteDueJul 02View m.docx
Individual Refine and Finalize WebsiteDueJul 02View m.docxEstelaJeffery653
 
Individual Cultural Communication Written Assignment  (Worth 20 of .docx
Individual Cultural Communication Written Assignment  (Worth 20 of .docxIndividual Cultural Communication Written Assignment  (Worth 20 of .docx
Individual Cultural Communication Written Assignment  (Worth 20 of .docxEstelaJeffery653
 
Individual ProjectThe Basic Marketing PlanWed, 3117N.docx
Individual ProjectThe Basic Marketing PlanWed, 3117N.docxIndividual ProjectThe Basic Marketing PlanWed, 3117N.docx
Individual ProjectThe Basic Marketing PlanWed, 3117N.docxEstelaJeffery653
 
Individual ProjectFinancial Procedures in a Health Care Organiza.docx
Chapter 15Multiple Regression and Model BuildingCo

  • 1. Chapter 15 Multiple Regression and Model Building Copyright ©2018 McGraw-Hill Education. All rights reserved. 1 Chapter Outline 15.1 The Multiple Regression Model and the Least Squares Point Estimate 15.2 R2 and Adjusted R2 15.3 Model Assumptions and the Standard Error 15.4 The Overall F Test 15.5 Testing the Significance of an Independent Variable 15.6 Confidence and Prediction Intervals 15-2 2
  • 2. Chapter Outline Continued 15.7 The Sales Representative Case: Evaluating Employee Performance 15.8 Using Dummy Variables to Model Qualitative Independent Variables (Optional) 15.9 Using Squared and Interaction Variables (Optional) 15.10 Multicollinearity, Model Building and Model Validation (Optional) 15.11 Residual Analysis and Outlier Detection in Multiple Regression (Optional) 15-3 3 15.1 The Multiple Regression Model and the Least Squares Point Estimate Simple linear regression used one independent variable to explain the dependent variable Some relationships are too complex to be described using a single independent variable Multiple regression uses two or more independent variables to describe the dependent variable This allows multiple regression models to handle more complex situations There is no limit to the number of independent variables a model can use Multiple regression has only one dependent variable LO15-1: Explain the multiple regression model and the related least squares point estimates.
  • 3. 15-4 4 The Multiple Regression Model The linear regression model relating y to x1, x2,…, xk is y = β0 µy = β0 + β1x1 + β2x2 +…+ βkxk is the mean value of the dependent variable y when the values of the independent variables are x1, x2,…, xk β0, β1, β2,… βk are the unknown regression parameters relating the mean value of y to x1, x2,…, xk other than the independent variables x1, x2,…, xk LO15-1 15-5 5 The Least Squares Estimates and Point Estimation and Prediction Estimation/prediction equation ŷ = b0 + b1x1 + b2x2 + … + bkxk is the point estimate of the mean value of the dependent
  • 4. variable when the values of the independent variables are x1, x2,…, xk It is also the point prediction of an individual value of the dependent variable when the values of the independent variables are x1, x2,…, xk b0, b1, b2,…, bk are the least squares point estimates of the parameters β0, β1, β2,…, βk x1, x2,…, xk are specified values of the independent predictor variables x1, x2,…, xk LO15-1 15-6 6 LO15-1 Example 15.1 The Tasty Sub Shop Case Figure 15.4 (a) 15-7 7 15.2 R2 and Adjusted R2 Total variation is given by the formula
  • 5. Explained variation is given by the formula Unexplained variation is given by the formula Total variation is the sum of explained and unexplained variation LO15-2: Calculate and interpret the multiple and adjusted multiple coefficients of determination. 15-8 8 R2 and Adjusted R2 Continued The multiple coefficient of determination is the ratio of explained variation to total variation R2 is the proportion of the total variation that is explained by the overall regression model Multiple correlation coefficient R is the square root of R2 LO15-2 15-9
  • 6. 9 Multiple Correlation Coefficient R The multiple correlation coefficient R is just the square root of R2 With simple linear regression, r would take on the sign of b1 There are multiple bi’s with multiple regression For this reason, R is always positive To interpret the direction of the relationship between the x’s and y, you must look to the sign of the appropriate bi coefficient LO15-2 15-10 10 Adjusted R2 Adding an independent variable to multiple regression will raise R2 R2 will rise slightly even if the new variable has no relationship to y corrects this tendency in R2 As a result, it gives a better estimate of the importance of the independent variables LO15-2 15-11
  • 7. 11 15.3 Model Assumptions and the Standard Error A Mean of Zero Assumption: The mean of the error terms is equal to 0 Constant Variance Assumption: The variance of the error terms σ2 is, the same for every combination values of x1, x2,…, xk Normality Assumption: The error terms follow a normal distribution for every combination values of x1, x2,…, xk Independence Assumption: The values of the error terms are statistically independent of each other LO15-3: Explain the assumptions behind multiple regression and calculate the standard error. 15-12 12 The Mean Square Error and the Standard Error
  • 8. Sum of squared errors Mean squared error Point estimate of the residual variance σ2 Standard error Point estimate of the residual standard deviation σ LO15-3 15-13 13 15.4 The Overall F Test To test H0: β1= β2 = …= βk = 0 versus Ha: At least one of β1, β2,…, βk ≠ 0 Test statistic p- - (k + 1) denominator degrees of freedom LO15-4: Test the significance of a multiple regression model by using an F test. 15-14
  • 9. 14 15.5 Testing the Significance of an Independent Variable A variable in a multiple regression model is not likely to be useful unless there is a significant relationship between it and y To test significance, we use the null hypothesis H0: βj = 0 Versus the alternative hypothesis Ha: βj ≠ 0 LO15-5: Test the significance of a single independent variable. 15-15 15 Testing the Significance of the Independent Variable xj LO15-5 15-16 16
  • 10. Testing the Significance of an Independent Variable Continued Customary to test significance of every independent variable the independent variable xj is significantly related to y evidence the independent variable xj is significantly related to y rejected, the stronger the evidence that xj is significantly related to y LO15-5 15-17 17 A Confidence Interval for the Regression Parameter βj If the regression assumptions hold, 100 (1 - – (k + 1) degrees of freedom LO15-5 15-18 18
  • 11. 15.6 Confidence and Prediction Intervals The point on the regression line corresponding to a particular value of x1, x2,…, xk, of the independent variables is It is unlikely that this value will equal the mean value of y for these x values Therefore, we need to place bounds on how far away the predicted value might be We can do this by calculating a confidence interval for the mean value of y and a prediction interval for an individual value of y LO15-6: Find and interpret a confidence interval for a mean value and a prediction interval for an individual value. 15-19 19 Distance Value Both the confidence interval for the mean value of y and the prediction interval for an individual value of y employ a quantity called the distance value With simple regression, we were able to calculate the distance value fairly easily However, for multiple regression, calculating the distance value requires matrix algebra LO15-6 15-20
  • 12. 20 A Confidence Interval and a Prediction Interval Distance value Assume the regression assumptions hold Confidence interval for the mean value of y Prediction interval for an individual value of y These are based on n - (k + 1) degrees of freedom LO15-6 15-21 21 15.7 The Sales Representative Case: Evaluating Employee Performance yi Yearly sales of the company’s product x1 Number of months the representative has been employed x2 Sales of products in the sales territory x3 Dollar advertising expenditure in the territory x4 Weighted average of the company’s market share in territory for the previous four years x5 Change in the company’s market share in the territory over the previous four years 15-22
  • 13. 22 Partial Excel Output of a Regression Analysis of the Sales Territory Performance Data Figure 15.10a 15-23 Time = 85.42 MktPoten = 35,182.73 Adver = 7,281.65 MktShare = 9.64 Change = .28 Sales Predicted 4,181.74 95% Prediction Interval [3,233.59 to 5,129.89] 23 15.8 Using Dummy Variables to Model Qualitative Independent Variables (Optional)
  • 14. So far, we have only looked at including quantitative data in a regression model However, we may wish to include descriptive qualitative data as well For example, might want to include the gender of respondents We can model the effects of different levels of a qualitative variable by using what are called dummy variables Also known as indicator variables LO15-7: Use dummy variables to model qualitative independent Variables (Optional). 15-24 24 Constructing Dummy Variables A dummy variable always has a value of either 0 or 1 For example, to model sales at two locations, would code the first location as a zero and the second as a 1 Operationally, it does not matter which is coded 0 and which is coded 1 LO15-7 15-25 25
  • 15. What If We Have More Than Two Categories? Consider having three categories, say A, B and C Cannot code this using one dummy variable A=0, B=1 and C=2 would be invalid Assumes the difference between A and B is the same as B and C We must use multiple dummy variables Specifically, k categories requires k - 1 dummy variables LO15-7 15-26 26 What If We Have Three Categories? For A, B, and C, would need two dummy variables x1 is 1 for A, zero otherwise x2 is 1 for B, zero otherwise If x1 and x2 are zero, must be C This is why the third dummy variable is not needed LO15-7 15-27 27
  • 16. Interaction Models So far, have only considered dummy variables as stand-alone variables Where D is dummy variable However, can also look at interaction between dummy variable and other variables That model would take the form With an interaction term, both the intercept and slope are shifted LO15-7 15-28 28 15.9 Using Squared and Interaction Variables (Optional) Quadratic regression model is: y = β0 + β1x + β2x2 ε where β0 + β1x + β2x2 is μy β, β1, and β2 are the regression parameters ε is an error term LO15-8: Use squared and interaction variables. 15-29
  • 17. 29 Using Interaction Variables Regression models often contain interaction variables Formed by multiplying two independent variables together Consider a model where x3 and x4 interact and x3 is used as a quadratic y = β0 + β1x4 + β2x3 + β3x32 + β4x4x3 + ε LO15-8 15-30 30 15.10 Multicollinearity, Model Building, and Model Validation (Optional) Multicollinearity: when “independent” variables are related to one another Considered severe when the simple correlation exceeds 0.9 Even moderate multicollinearity can be a problem Another measurement is variance inflation factors Multicollinearity considered Severe when VIF > 10 Moderately strong for VIF > 5 LO15-9: Describe multicollinearity and build and validate a multiple regression model (Optional).
  • 18. 15-31 31 Effect of Adding Independent Variable Adding any independent variable will increase R² Even adding an unimportant independent variable Thus, R² cannot tell us that adding an independent variable is undesirable LO15-9 15-32 32 A Better Criterion is the Standard Error A better criterion is the size of the standard error s If s increases when an independent variable is added, we should not add that variable However, decreasing s alone is not enough An independent variable should only be included if it reduces s enough to offset the higher t value and reduces the length of the desired prediction interval for y LO15-9
  • 19. 15-33 33 C Statistic Another quantity for comparing regression models is called the C (a.k.a. Cp) statistic, First, calculate mean square error for the model containing all p potential independent variables (s2p) Next, calculate SSE for a reduced model with k independent variables LO15-9 15-34 34 C Statistic Continued We want the value of C to be small Adding unimportant independent variables will raise the value of C While we want C to be small, we also wish to find a model for which C roughly equals k + 1 A model with C substantially greater than k + 1 has substantial bias and is undesirable
  • 20. If a model has a small value of C and C for this model is less than k + 1, then it is not biased and the model should be considered desirable LO15-9 15-35 35 The Partial F Test: An F Test for a Portion of a Regression Model To test H0: All of the βj coefficients corresponding to the independent variables in the subset are zero Ha: At least one of the βj coefficients is not equal to zero Reject H0 in favor of Ha if: p- - g numerator and n - (k + 1) denominator degrees of freedom LO15-9 15-36 36
  • 21. 15.11 Residual Analysis and Outlier Detection in Multiple Regression (Optional) For an observed value of yi, the residual is i = yi - ŷ = yi – (b0 + b1xi1 + … + bkxik) If the assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance σ2 Residual plots Residuals versus each independent variable Residuals versus predicted y’s Residuals in time order (if the response is a time series) LO15-10: Use residual analysis and outlier detection to check the assumptions of multiple regression (Optional). 15-37 Figure 15.35 37 LO15-10 Outliers Figure 15.37 c, d and e 15-38
  • 22. Chapter 14 Simple Linear Regression Analysis Copyright ©2018 McGraw-Hill Education. All rights reserved. 1 Chapter Outline 14.1 The Simple Linear Regression Model and the Least Square Point Estimates 14.2 Simple Coefficients of Determination and Correlation 14.3 Model Assumptions and the Standard Error 14.4 Testing the Significance of the Slope and y-Intercept 14.5 Confidence and Prediction Intervals 14-2 2
  • 23. Chapter Outline Continued 14.6 Testing the Significance of the Population Correlation Coefficient (Optional) 14.7 Residual Analysis 14-3 3 14.1 The Simple Linear Regression Model and the Least Squares Point Estimates The dependent (or response) variable is the variable we wish to understand or predict The independent (or predictor) variable is the variable we will use to understand or predict the dependent variable Regression analysis is a statistical technique that uses observed data to relate the dependent variable to one or more independent variables The objective is to build a regression model that can describe, predict and control the dependent variable based on the independent variable LO14-1: Explain the simple linear regression model. 14-4 4
  • 24. Form of The Simple Linear Regression Model y = β0 + β1x + ε when the value of the independent variable is x β0 is the y-intercept; the mean of y when x is zero β1 is the slope; the change in the mean of y per unit change in x ε is an error term that describes the effect on y of all factors other than x LO14-1 14-5 5 Regression Terms β0 and β1 are called regression parameters β0 is the y-intercept β1 is the slope We do not know the true values of these parameters So, we must use sample data to estimate them b0 is the estimate of β0 b1 is the estimate of β1 LO14-1 14-6
  • 25. 6 LO14-1 The Simple Linear Regression Model Illustrated Figure 14.3 14-7 7 The Least Squares Point Estimates LO14-2: Find the least squares point estimates of the slope and y-intercept. 14-8 8 Example 14.2 The Tasty Sub Shop Case: The Least Squares Estimates
  • 26. LO14-2 14-9 9 Example 14.2 The Tasty Sub Shop Case: The Least Squares Estimates From last slide, Σyi = 8,603.1 Σxi = 434.1 Σx2i = 20,757.41 Σxiyi = 403,296.96 Once we have these values, we no longer need the raw data Calculation of b0 and b1 uses these totals LO14-2 14-10 10 Example 14.2 The Tasty Sub Shop Case (Slope b1) LO14-2 14-11
  • 27. 11 Example 14.2 The Tasty Sub Shop Case (y-Intercept b0) Prediction (x = 20.8) ŷ = b0 + b1x = 183.31 + (15.59)(20.8) ŷ = 507.69 Residual is 527.1 – 507.69 = 19.41 LO14-2 14-12 Figure 14.5 12 14.2 Simple Coefficients of Determination and Correlation How useful is a particular regression model? One measure of usefulness is the simple coefficient of determination
  • 28. It is represented by the symbol r2 LO14-3: Calculate and interpret the simple coefficients of determination and correlation. 14-13 13 The Simple Coefficient of Determination, Total variation is yi-ȳ)2 Explained variation is ŷi-ȳ)2 Unexplained variation is yi-ŷ)2 Total variation is the sum of explained and unexplained variation Simple coefficient of determination is is the proportion of explained variation LO14-3 14-14 14 The Simple Correlation Coefficient, The simple correlation coefficient between y and x is denoted
  • 29. by r It is… if b1 is positive if b1 is negative Where b1 is the slope of the least squares line Simple correlation coefficient measures the strength of the linear relationship between y and x and is denoted by r LO14-3 14-15 15 LO14-3 Different Values of the Correlation Coefficient Figure 14.8 14-16 16 14.3 Model Assumptions and the Standard Error Mean of Zero: At any given value of x, the population of potential error term values has a mean equal to zero
  • 30. Constant Variance Assumption: At any given value of x, the population of potential error term values has a variance that does not depend on the value of x. Normality Assumption: At any given value of x, the population of potential error term values has a normal distribution. Independence Assumption: Any one value of the error term ε is statistically independent of any other value of ε. LO14-4: Describe the assumptions behind simple linear regression and calculate the standard error (Figure 14.9). The Mean Square Error and the Standard Error: The sum of squared errors is SSE = Σ(yi - ŷi)². The mean square error, s² = SSE/(n - 2), is the point estimate of the residual variance σ². The standard error, s = √(SSE/(n - 2)), is the point estimate of the residual standard deviation σ.
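The variation measures behind r² and the standard error defined above can be checked numerically. The sketch below uses a small made-up data set (the `x` and `y` values are hypothetical, not from the source) and the usual definitions: total, explained and unexplained variation, r², the signed r, SSE, s² and s.

```python
import math

# Hypothetical illustration data (NOT from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least squares fit.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# Variation decomposition: total = explained + unexplained.
total = sum((yi - y_bar) ** 2 for yi in y)
explained = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation

# Simple coefficient of determination, and r carrying the sign of b1.
r2 = explained / total
r = math.copysign(math.sqrt(r2), b1)

# Mean square error and standard error, based on n - 2 degrees of freedom.
mse = sse / (n - 2)
s = math.sqrt(mse)

print(f"r2 = {r2:.4f}, r = {r:.4f}, s = {s:.4f}")
```

Dividing SSE by n - 2 rather than n reflects the two parameters (b0 and b1) estimated from the data.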
  • 31. 14.4 Testing the Significance of the Slope and y-Intercept. A regression model is not likely to be useful unless there is a significant relationship between x and y. To test significance, we test the null hypothesis H0: β1 = 0 versus the alternative hypothesis Ha: β1 ≠ 0. LO14-5: Test the significance of the slope and y-intercept. Testing the Significance of the Slope and y-Intercept Continued.
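The test statistics for this slide did not survive extraction, so the following is only a sketch using the standard textbook forms, t = b1/s_b1 with s_b1 = s/√SSxx and t = b0/s_b0 with s_b0 = s√(1/n + x̄²/SSxx). The data are hypothetical, and the critical value 3.182 is the tabled t point for α = 0.05 (two-sided) with n - 2 = 3 degrees of freedom.

```python
import math

# Hypothetical illustration data (NOT from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least squares fit and standard error s.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Standard errors of the estimates and the corresponding t statistics.
s_b1 = s / math.sqrt(ss_xx)
s_b0 = s * math.sqrt(1 / n + x_bar ** 2 / ss_xx)
t_slope = b1 / s_b1
t_intercept = b0 / s_b0

# Two-sided test at alpha = 0.05: t_{0.025} with 3 degrees of freedom (t table).
T_CRIT = 3.182
reject_slope = abs(t_slope) > T_CRIT          # evidence that beta1 != 0
reject_intercept = abs(t_intercept) > T_CRIT  # evidence that beta0 != 0

print(f"t(slope) = {t_slope:.2f}, reject H0: {reject_slope}")
print(f"t(intercept) = {t_intercept:.2f}, reject H0: {reject_intercept}")
```

For these made-up data the slope is clearly significant while the intercept is not, which illustrates that the two tests answer separate questions.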
  • 32. An F Test for the Significance of the Slope (Optional): To test H0: β1 = 0 versus Ha: β1 ≠ 0, use the test statistic F = (explained variation)/[(unexplained variation)/(n - 2)] and reject H0 if F > Fα or, equivalently, if the p-value is less than α, where Fα is based on 1 numerator and n - 2 denominator degrees of freedom. LO14-6: Test the significance of a simple linear regression model by using an F test (Optional). 14.5 Confidence and Prediction Intervals. The point on the regression line corresponding to a particular value x0 of the independent variable x is ŷ = b0 + b1x0. It is unlikely that this value will exactly equal the mean value of y when x equals x0. Therefore, we need to place bounds on how far the predicted value might be from the actual value. We can do this by calculating a confidence interval for the mean value of y and a prediction interval for an individual value of y. LO14-7: Calculate and interpret a confidence interval for a mean value and a prediction interval for an individual value.
  • 33. Distance Value: Both the confidence interval for the mean value of y and the prediction interval for an individual value of y employ a quantity called the distance value. The distance value is a measure of the distance between the value x0 of x and x̄, the average of the observed x values: distance value = 1/n + (x0 - x̄)²/SSxx, where SSxx = Σ(xi - x̄)². Notice that the further x0 is from x̄, the larger the distance value. A Confidence Interval and Prediction Interval: Assume that the regression assumptions hold. The formula for a 100(1 - α) percent confidence interval for the mean value of y is ŷ ± tα/2 s √(distance value). The formula for a 100(1 - α) percent prediction interval for an individual value of y is ŷ ± tα/2 s √(1 + distance value).
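A minimal sketch of the two intervals, assuming the standard forms ŷ ± t·s·√(distance value) for the mean and ŷ ± t·s·√(1 + distance value) for an individual value, with distance value = 1/n + (x0 - x̄)²/SSxx. The data, the point `x0 = 3.5`, and the helper name `distance_value` are hypothetical; the critical value 3.182 is the tabled t point for 95 percent intervals with 3 degrees of freedom.

```python
import math

# Hypothetical illustration data (NOT from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Least squares fit and standard error s.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))


def distance_value(x0):
    """Grows as x0 moves away from the mean of the observed x values."""
    return 1 / n + (x0 - x_bar) ** 2 / ss_xx


T_CRIT = 3.182  # t_{0.025} with n - 2 = 3 degrees of freedom (t table)
x0 = 3.5        # hypothetical value of the independent variable
y_hat0 = b0 + b1 * x0

# 95% confidence interval for the mean value of y at x0.
ci_half = T_CRIT * s * math.sqrt(distance_value(x0))
ci = (y_hat0 - ci_half, y_hat0 + ci_half)

# 95% prediction interval for an individual value of y at x0.
pi_half = T_CRIT * s * math.sqrt(1 + distance_value(x0))
pi = (y_hat0 - pi_half, y_hat0 + pi_half)

print("CI for mean:", ci)
print("PI for individual value:", pi)
```

The extra 1 under the square root accounts for the variability of a single observation around its mean, which is why the prediction interval is the wider of the two.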
  • 34. Both intervals are based on n - 2 degrees of freedom. Which to Use? The prediction interval is useful if it is important to predict an individual value of the dependent variable. A confidence interval is useful if it is important to estimate the mean value. The prediction interval will always be wider than the confidence interval. 14.6 Testing the Significance of the Population Correlation Coefficient (Optional). The simple correlation coefficient (r) measures the linear relationship between the observed values of x and y from the
  • 35. sample. The population correlation coefficient (ρ) measures the linear relationship between all possible combinations of observed values of x and y. r is an estimate of ρ. LO14-8: Test hypotheses about the population correlation coefficient (Optional). Testing ρ: We can test whether the correlation is significant using the hypotheses H0: ρ = 0 versus Ha: ρ ≠ 0. The test statistic is t = r√(n - 2)/√(1 - r²). This test gives the same result as the test for the significance of the slope β1.
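The claim that the test on ρ matches the test on the slope can be illustrated numerically. The sketch below uses hypothetical data and assumes the standard statistic t = r√(n - 2)/√(1 - r²), comparing it with the slope statistic b1/s_b1, where s_b1 = s/√SSxx.

```python
import math

# Hypothetical illustration data (NOT from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# t statistic computed from the sample correlation coefficient r.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# t statistic computed from the slope estimate b1.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
t_from_slope = b1 / (s / math.sqrt(ss_xx))

print(f"t from r = {t_from_r:.4f}, t from slope = {t_from_slope:.4f}")
```

The two values agree up to floating-point rounding, so either statistic leads to the same decision about H0.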
  • 36. 14.7 Residual Analysis. Checks of the regression assumptions are performed by analyzing the regression residuals. Residuals (e) are defined as the difference between the observed value of y and the predicted value of y: e = y - ŷ. Note that e is the point estimate of ε. If the regression assumptions are valid, the population of potential error terms will be normally distributed with mean zero and variance σ², and different error terms will be statistically independent. LO14-9: Use residual analysis to check the assumptions of simple linear regression. Residual Analysis Continued: If the assumptions hold, the residuals should look as if they were randomly and independently selected from normal populations with mean zero and variance σ². With any real data, the assumptions will not hold exactly. Mild departures do not seriously affect our ability to make statistical inferences.
  • 37. In checking assumptions, we are looking for pronounced departures from the assumptions, so we only require the residuals to approximately fit the description above. Example 14.9 The QHIC Case: Constructing Residual Plots (Figure 14.18b). Quality Home Improvement Center (QHIC) operates five stores and studies the relationship between home value and yearly expenditure on home upkeep, using a random sample of 40 homeowners. Intercept = –348.3921; slope = 7.2583.
  • 38. Residual Plots: residuals versus the independent variable; residuals versus the predicted y’s; residuals in time order (if the response is a time series). Constant Variance Assumption: To check the validity of the constant variance assumption, examine residual plots against the x values, the predicted y values, and time (when the data are a time series). A pattern that fans out says the variance is increasing rather than staying constant. A pattern that funnels in says the variance is decreasing rather than staying constant. A pattern that is evenly spread within a band says the assumption has been met.
  • 39. Constant Variance Visually (Figure 14.19). Assumption of Correct Functional Form: If the relationship between x and y is something other than a linear one, the residual plot will often suggest a form more appropriate for the model. For example, if there is a curved relationship between x and y, a plot of residuals will often show a curved pattern. Normality Assumption: If the normality assumption holds, a histogram or stem-and-leaf
  • 40. display of residuals should look bell-shaped and symmetric. Another way to check is a normal plot of the residuals: Order the residuals from smallest to largest: e(1), e(2), …, e(n). Plot e(i) on the vertical axis against z(i), where z(i) is the point on the horizontal axis under the standard normal curve such that the area under this curve to the left of z(i) is (3i - 1)/(3n + 1). If the normality assumption holds, the plot should have a straight-line appearance. Independence Assumption: The independence assumption is most likely to be violated by time-series data. If the data are not a time series, they can be reordered without affecting the analysis. For time-series data, the time-ordered error terms can be autocorrelated. Positive autocorrelation is when a positive error term in time period i tends to be followed by another positive value in period i + k. Negative autocorrelation is when a positive error term tends to be followed by a negative value. Positive autocorrelation produces a cyclical pattern in the error terms over time, whereas negative autocorrelation produces an alternating pattern.
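The normal plot procedure described above can be sketched as follows. The residual values are hypothetical, and `NormalDist().inv_cdf` from the Python standard library is used here to find the standard normal point with a given left-tail area; the names `ordered` and `points` are illustrative only.

```python
from statistics import NormalDist

# Hypothetical residuals (NOT from the source).
residuals = [0.06, -0.13, 0.18, -0.21, 0.10]

# Step 1: order the residuals from smallest to largest: e(1), ..., e(n).
ordered = sorted(residuals)
n = len(ordered)

# Step 2: for each rank i, find z(i) such that the area to its left under
# the standard normal curve is (3i - 1) / (3n + 1).
std_normal = NormalDist()  # mean 0, standard deviation 1
points = []
for i, e_i in enumerate(ordered, start=1):
    area = (3 * i - 1) / (3 * n + 1)
    z_i = std_normal.inv_cdf(area)
    points.append((z_i, e_i))  # horizontal axis: z(i), vertical axis: e(i)

for z_i, e_i in points:
    print(f"z = {z_i:+.3f}, e = {e_i:+.2f}")
```

Plotting these pairs and looking for a roughly straight line gives the visual normality check; with only five points, as here, the plot is illustrative rather than conclusive.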
  • 41. Independence Assumption Visually (Figure 14.26 a and b).