SlideShare a Scribd company logo
1 of 62
Business Analytics
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Linear Regression
Chapter 7
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Introduction (Slide 1 of 2)
Managerial decisions are often based on the relationship
between two or more variables:
Example: After considering the relationship between advertising
expenditures and sales, a marketing manager might attempt to
predict sales for a given level of advertising expenditures.
Sometimes a manager will rely on intuition to judge how two
variables are related.
If data can be obtained, a statistical procedure called regression
analysis can be used to develop an equation showing how the
variables are related.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publ icly accessible
website, in whole or in part.
Introduction (Slide 2 of 2)
Dependent variable or response: Variable being predicted.
Independent variables or predictor variables: Variables being
used to predict the value of the dependent variable.
Simple linear regression: A regression analysis for which any
one unit change in the independent variable, x, is assumed to
result in the same change in the dependent variable, y.
Multiple linear regression: A regression analysis involving two
or more independent variables.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Simple Linear Regression Model
Regression Model
Estimated Regression Equation
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Simple Linear Regression Model (Slide 1 of 5)
Regression Model:
The equation that describes how y is related to x and an error
term.
The error term accounts for the variability in y that cannot be
explained by the linear relationship between x and y.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
In the Butler Trucking Company example, the population
consists of all the driving assignments that can be made by the
company.
For every driving assignment in the population, there is a value
of x (miles traveled) and a corresponding value of y (travel time
in hours).
6
Simple Linear Regression Model (Slide 2 of 5)
Estimated Regression Equation:
The parameter values are usually not known and must be
estimated using sample data.
the regression equation and dropping the error term, we obtain
the estimated regression for simple linear regression.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
To estimate the mean or expected value of travel time for a
driving assignment of 75 miles, Butler Trucking would
substitute the value of 75 for x in equation = b0 + b1x.
7
Simple Linear Regression Model (Slide 3 of 5)
In the estimated simple linear regression equation:
The graph of the estimated simple linear regression equation is
called the estimated regression line.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly access ible
website, in whole or in part.
To estimate the mean or expected value of travel time for a
driving assignment of 75 miles, Butler Trucking would
substitute the value of 75 for x in equation = b0 + b1x.
8
Meghan Cook (MC) - The slash in this equation should be a
vertical line. I have edited the alt text to show how this should
read as well.
Simple Linear Regression Model (Slide 4 of 5)
Figure 7.1: The Estimation Process in Simple Linear Regression
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
9
Simple Linear Regression Model (Slide 5 of 5)
Figure 7.2: Possible Regression Lines in Simple Linear
Regression
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The regression line in Panel A shows that the mean value of y is
related positively to x, with larger values of associated with
larger values of x.
In Panel B, the mean value of y is related negatively to x, with
smaller values of associated with larger values of x.
In Panel C, the mean value of y is not related to x; that is, is
the same for every value of x.
10
Least Squares Method
Least Squares Estimates of the Regression Parameters
Using Excel’s Chart Tools to Compute the Estimated Regression
Equation
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Least Squares Method (Slide 1 of 15)
Least squares method: A procedure for using sample data to find
the estimated regression equation.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
12
Least Squares Method (Slide 2 of 15)
Table 7.1: Miles Traveled and Travel Time for 10 Butler
Trucking Company Driving AssignmentsDriving Assignment ix
= Miles Traveledy = Travel Time (hours)1 1009.32 504.83
508.94 1006.55 504.26 806.27 757.48 656.09
907.610 906.1
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
To illustrate the least squares method, suppose data were
collected from a sample of 10 Butler Trucking Company driving
assignments.
For the ith observation or driving assignment in the sample, xi
is the miles traveled and yi is the travel time (in hours).
The values of xi and yi for the 10 driving assignments in the
sample are summarized in Table 7.1.
13
Least Squares Method (Slide 3 of 15)
Figure 7.3: Scatter Chart of Miles Traveled and Travel Time for
Sample of 10 Butler Trucking Company Driving Assignments
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Figure 7.3 is a scatter chart of the data in Table 7.1.
Miles traveled is shown on the horizontal axis, and travel time
(in hours) is shown on the vertical axis.
Longer travel times appear to coincide with more miles
traveled.
The relationship between the travel time and miles traveled
appears to be approximated by a straight line; indeed, a positive
linear relationship is indicated between x and y.
Therefore, the simple linear regression model is chosen to
represent this relationship.
14
Least Squares Method (Slide 4 of 15)
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
15
Least Squares Method (Slide 5 of 15)
Hence,
We are finding the regression that minimizes the sum of squared
errors.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
16
Least Squares Method (Slide 6 of 15)
Least Squares Estimates of the Regression Parameters:
For the Butler Trucking Company data in Table 7.1:
The estimated simple linear regression model:
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
17
Least Squares Method (Slide 7 of 15)
If the length of a driving assignment were 1 unit (1 mile)
longer, the mean travel time for that driving assignment would
be 0.0678 units (0.0678 hours, or approximately 4 minutes)
longer.
If the driving distance for a driving assignment was 0 units (0
miles), the mean travel time would be 1.2739 units (1.2739
hours, or approximately 76 minutes).
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
We interpret b1 and b0 as we would the y-intercept and slope of
any straight line.
18
Least Squares Method (Slide 8 of 15)
Experimental region: The range of values of the independent
variables in the data used to estimate the model.
The regression model is valid only over this region.
Extrapolation: Prediction of the value of the dependent variable
outside the experimental region.
It is risky.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Because we have no empirical evidence that the relationship we
have found holds true for values of x outside of the range of
values of x in the data used to estimate the relationship,
extrapolation is risky and should be avoided if possible.
For Butler Trucking, this means that any prediction outside the
travel time for a driving distance less than 50 miles or greater
than 100 miles is not a reliable estimate, and so for this model
the estimate of β0 is meaningless.
However, if the experimental region for a regression problem
includes zero, the y-intercept will have a meaningful
interpretation.
19
Least Squares Method (Slide 9 of 15)
Butler Trucking Company example: Use the estimated model
and the known values for miles traveled for a driving
assignment (x) to estimate mean travel time in hours.
For example, the first driving assignment in Table 7.1 has a
value for miles
The mean travel time in hours for this driving assignment is
estimated to be:
The resulting residual of the estimate is:
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The simple linear regression model underestimated travel time
for this driving assignment by 1.2461 hours (approximately 74
minutes).
20
Least Squares Method (Slide 10 of 15)
Table 7.2: Predicted Travel Time and Residuals for 10 Butler
Trucking Company Driving Assignments
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Note in Table 7.2 that:
The sum of predicted values is equal to the sum of the values
of the dependent variable y.
The sum of the residuals ei is 0.
The sum of the squared residuals has been minimized.
21
Least Squares Method (Slide 11 of 15)
Figure 7.4: Scatter Chart of Miles Traveled and Travel Time for
Butler Trucking Company Driving Assignments with Regression
Line Superimposed
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Figure 7.4 shows the simple linear regression line = 1.2739 +
0.0678xi superimposed on the scatter chart for the Butler
Trucking Company data in Table 7.1.
This figure, which also highlights the residuals for driving
assignment 3 (e3) and driving assignment 5 (e5), shows that the
regression model underpredicts travel time for some driving
assignments (such as driving assignment 3) and overpredicts
travel time for others (such as driving assignment 5), but in
general appears to fit the data relatively well.
22
Least Squares Method (Slide 12 of 15)
Figure 7.5: A Geometric Interpretation of the
Least Squares Method
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
In Figure 7.5, a vertical line is drawn from each point in the
scatter chart to the linear regression line.
Each of these lines represents the difference between the actual
driving time and the driving time to be predicted using linear
regression for one of the assignments in the data.
The length of each line is equal to the absolute value of the
residual for one of the driving assignments.
When a residual is squared, the resulting value is equal to the
square that is formed using the vertical dashed line representing
the residual in Figure 7.4 as one side of a square.
Thus, when we find the linear regression model that minimizes
the sum of squared errors for the Butler Trucking example, we
are positioning the regression line in the manner that minimizes
the sum of the areas of the ten squares in Figure 7.5.
23
Least Squares Method (Slide 13 of 15)
Using Excel’s Chart Tools to Compute the Estimated Regression
Equation:
After constructing a scatter chart with Excel’s chart tools:
Right-click on any data point and select Add Trendline.
When the Format Trendline task pane appears:
Select Linear in the Trendline Options area.
Select Display Equation on chart in the Trendline Options area.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Least Squares Method (Slide 14 of 15)
Figure 7.6: Scatter Chart and Estimated Regression Line for
Butler Trucking Company
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
We can use Excel’s chart tools to compute the estimated
regression equation on a scatter chart of the Butler Trucking
Company data in Table 7.1.
The worksheet displayed in Figure 7.6 shows the original data,
scatter chart, estimated regression line, and estimated regression
equation.
Note that Excel uses y instead of to denote the predicted value
of the dependent variable and puts the regression equation into
slope-intercept form whereas we use the intercept-slope form
that is standard in statistics.
25
Least Squares Method (Slide 15 of 15)
Slope Equation
y-Intercept Equation
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicl y accessible
website, in whole or in part.
26
Assessing the Fit of the Simple Linear Regression Model
The Sums of Squares
The Coefficient of Determination
Using Excel’s Chart Tools to Compute the Coefficient of
Determination
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Assessing the Fit of the Simple Linear Regression Model (Slide
1 of 10)
The Sums of Squares:
Sum of squares due to error: The value of SSE is a measure of
the error in using the estimated regression equation to predict
the values of the dependent variable in the sample.
From Table 7.2,
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Assessing the Fit of the Simple Linear Regression Model (Slide
2 of 10)
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publ icly accessible
website, in whole or in part.
Without knowledge of any related variables, we would use the
sample mean as a predictor of travel time for any given driving
assignment.
To find , we divide the sum of the actual driving times yi from
Table 7.2 (67) by the number of observations n in the data (10);
this yields = 6.7.
Figure 7.7 provides insight on how well we would predict the
values of yi in the Butler Trucking company example using =
6.7.
From this figure, which again highlights the residuals for
driving assignments 3 and 5, we can see that tends to
overpredict travel times for driving assignments that have
relatively small values for miles traveled (such as driving
assignment 5) and tends to underpredict travel times for driving
assignments that have relatively large values for miles traveled
(such as driving assignment 3).
29
Assessing the Fit of the Simple Linear Regression Model (Slide
3 of 10)
Table 7.3 shows the sum of squared deviations obtained by
using
for each driving assignment in the sample.
Butler Trucking Example: For the ith driving assignment in the
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
SST is a measure of how well the observations cluster about the
line.
30
Assessing the Fit of the Simple Linear Regression Model (Slide
4 of 10)
The corresponding sum of squares is called the total sum of
squares (SST).
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
SST is a measure of how well the observations cluster about the
line.
31
Assessing the Fit of the Simple Linear Regression Model (Sli de
5 of 10)
Table 7.3: Calculations for the Sum of Squares Total for the
Butler Trucking Simple Linear Regression
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The sum at the bottom of the last column in Table 7.3 is the
total sum of squares for Butler Trucking Company: SST = 23.9.
32
Assessing the Fit of the Simple Linear Regression Model (Slide
6 of 10)
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
In Figure 7.8, we show the estimated regression line = 1.2739 +
0.0678xi and the line corresponding to = 6.7.
Note that the points cluster more closely around the estimated
regression line = 1.2739 + 0.0678xi than they do about the
horizontal line = 6.7.
33
Assessing the Fit of the Simple Linear Regression Model (Slide
7 of 10)
Relation between SST, SSR, and SSE:
where
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
In the Butler Trucking Company example, SSE = 8.0288 and
SST = 23.9:
SSR = SST SSE = 23.9 8.0288 = 15.8712
34
Assessing the Fit of the Simple Linear Regression Model (Slide
8 of 10)
The Coefficient of Determination:
The ratio SSR/SST used to evaluate the goodness of fit for the
estimated regression equation; this ratio is called the coefficient
of determination and is denoted by
Take values between zero and one.
Interpreted as the percentage of the total sum of squares that
can be explained by using the estimated regression equation.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
For the Butler Trucking Company example, the coefficient of
determination,
r2 = = = 0.6641.
It can be concluded that 66.41% of the total sum of squares can
be explained by using the estimated regression equation =
1.2739 + 0.0678xi to predict quarterly sales.
35
Assessing the Fit of the Simple Linear Regression Model (Slide
9 of 10)
Using Excel’s Chart Tools to Compute the Coefficient of
Determination:
To compute the coefficient of determination using the scatter
chart in Figure 7.3:
Right-click on any data point in the scatter chart and select Add
Trendline…
When the Format Trendline task pane appears:
Select Display R-squared value on chart in the Trendline
Options area.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
36
Assessing the Fit of the Simple Linear Regression Model (Slide
10 of 10)
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Figure 7.9 displays the scatter chart, the estimated regression
equation, the graph of the estimated regression equation, and
the coefficient of determination for the Butler Trucking
Company data. We see that r2 = 0.6641.
37
The Multiple Regression Model
Regression Model
Estimated Multiple Regression Equation
Least Squares Method and Multiple Regression
Butler Trucking Company and Multiple Regression
Using Excel’s Regression Tool to Develop the Estimated
Multiple Regression Equation
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The Multiple Regression Model (Slide 1 of 11)
Regression Model:
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
39
The Multiple Regression Model (Slide 2 of 11)
Regression Model (cont.):
Represents the change in the mean value of the dependent
variable y that corresponds to a one unit increase in the
independent variable
holding the values of all other independent variables in the
model
constant.
The multiple regression equation that describes how the mean
value of y is related to
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
40
The Multiple Regression Model (Slide 3 of 11)
Estimated Multiple Regression Equation:
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
41
The Multiple Regression Model (Slide 4 of 11)
Least Squares Method and Multiple Regression:
The least squares method is used to develop the estimated
multiple regression equation:
Finding
Uses sample data to provide the values of
that minimize the sum of squared residuals.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
42
The Multiple Regression Model (Slide 5 of 11)
Figure 7:10: The Estimation Process for Multiple Regression
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The estimation process for multiple regression is shown in
Figure 7.10.
The estimated values of the dependent variable y are computed
by substituting values of the independent variables x1, x2, . . . ,
xq into the estimated multiple regression equation.
Computer software packages can be used to obtain the estimated
regression equation and determine the regression coefficients
b0, b1, b2, . . . , bq.
43
The Multiple Regression Model (Slide 6 of 11)
Butler Trucking Company and Multiple Regression:
The estimated simple linear regression equation,
The linear effect of the number of miles traveled explains
66.41%
This implies, 33.59% of the variability in sample travel times
remains unexplained
The managers might want to consider adding one or more
independent variables, such as number of deliveries, to the
model to explain some of the remaining variability in the
dependent variable.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
44
The Multiple Regression Model (Slide 7 of 11)
Butler Trucking Company and Multiple Regression (cont.):
Estimated multiple linear regression with two independent
variables:
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
45
The Multiple Regression Model (Slide 8 of 11)
Figure 7.11: Data Analysis Tools Box
Using Excel’s Regression Tool to Develop the Estimated
Multiple Regression Equation:
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The following steps describe how to use Excel’s Regression tool
to compute the estimated regression equation using the data in
the worksheet.
Copy the data values from the file ButlerWithDeliveries into an
Excel worksheet from columns A through D and rows 1 through
301.
Step 1. Click the DATA tab in the Ribbon
Step 2. Click Data Analysis in the Analysis group
Step 3. Select Regression from the list of Analysis Tools
in the Data Analysis tools box (shown in Figure 7.11) and click
OK
46
The Multiple Regression Model (Slide 9 of 11)
Figure 7.12: Regression Dialog Box
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Step 4. When the Regression dialog box appears (as shown in
Figure 7.12):
1. Enter D1:D301 in the Input Y Range: box
2. Enter B1:C301 in the Input X Range: box
3. Select Labels
Selecting Labels tells Excel to use the names you have gi ven to
your variables in row 1 when displaying the regression model
output.
4. Select Confidence Level:
5. Enter 99 in the Confidence Level: box
6. Select New Worksheet Ply:
7. Click OK
47
The Multiple Regression Model (Slide 10 of 11)
Figure 7.13: Excel Regression Output for the Butler Trucking
Company with Miles and Deliveries as Independent Variables
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
In the Excel output shown in Figure 7.13, the label for the
independent variable x1 is Miles (see cell A18), and the label
for the independent variable x2 is Deliveries (see cell A19).
Estimated regression equation:
= 0.1273 + 0.0672x1 + 0.6900x2
Interpretation:
For a fixed number of deliveries, we estimate that the mean
travel time will increase by 0.0672 hours when the distance
traveled increases by 1 mile.
For a fixed distance traveled, we estimate that the mean travel
time will increase by 0.69 hours when the number of deliveries
increases by 1 delivery.
Multiple coefficient of determination, R2 = 0.8173.
Interpretation: 81.73% of the variability in sample values of the
dependent variable, travel time, is explained by the independent
variables, number of deliveries and distance traveled.
48
The Multiple Regression Model (Slide 11 of 11)
Figure 7.14: Graph of the Regression Equation for Multiple
Regression Analysis with Two Independent Variables
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
With two independent variables x1 and x2, we now generate a
predicted value of y for every combination of values of x1 and
x2.
Thus, instead of a regression line, we now create a regression
plane in three-dimensional space.
Figure 7.14 provides the graph of the estimated regression plane
for the Butler Trucking Company example and shows the
seventh driving assignment in the data.
As the plane slopes upward to larger values of total travel time
() as either the number of miles traveled (x1) or the number of
deliveries (x2) increases.
The residual for a driving assignment when x1 = 75 and x2 = 3
is the difference between the observed y value and the estimated
mean value of y when
x1 = 75 and x2 = 3.
The observed value lies above the regression plane, indicating
that the regression model underestimates the expected driving
time for the seventh driving assignment.
49
Inference and Regression
Conditions Necessary for Valid Inference in the Least Squares
Regression Model
Testing Individual Regression Parameters
Addressing Nonsignificant Independent Variables
Multicollinearity
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Inference and Regression (Slide 1 of 17)
Statistical inference: Process of making estimates and drawing
conclusions about one or more characteristics of a population
(the value of one or more parameters) through the analysis of
sample data drawn from the population.
In regression, inference is commonly used to estimate and draw
conclusions about:
The regression parameters
The mean value and/or the predicted value of the dependent
variable y for specific values of the independent variables
Consider both hypothesis testing and interval estimation.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
51
Inference and Regression (Slide 2 of 17)
Conditions Necessary for Valid Inference in the Least Squares
Regression Model:
For any given combination of values of the independent
variables
distributed with a mean of 0 and a constant variance.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Inference and Regression (Slide 3 of 17)
Figure 7.15: Illustration of the Conditions for Valid Inference in
Regression
…
Business Analytics
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Time Series Analysis and Forecasting
Chapter 8
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Introduction (Slide 1 of 2)
Forecasting methods can be classified as qualitative or
quantitative.
Qualitative methods generally involve the use of expert
judgment to develop forecasts.
Quantitative forecasting methods can be used when:
Past information about the variable being forecast is available.
The information can be quantified.
It is reasonable to assume that past is prologue.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Introduction (Slide 2 of 2)
The objective of time series analysis is to uncover a pattern in
the time series and then extrapolate the pattern into the future.
The forecast is based solely on past values of the variable
and/or on past forecast errors.
Modern data-collection technologies have enabled individuals,
businesses, and government agencies to collect vast amounts of
data that may be used for causal forecasting.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
4
Time Series Patterns
Horizontal Pattern
Trend Pattern
Seasonal Pattern
Trend and Seasonal Pattern
Cyclical Pattern
Identifying Time Series Patterns
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Time Series Patterns (Slide 1 of 20)
Time series: A sequence of observations on a variable measured
at successive points in time or over successive periods of time.
The measurements may be taken every hour, day, week, month,
year, or any other regular interval. The pattern of the data is
important in understanding the series’ past behavior.
If the behavior of the times series data of the past is expected to
continue in the future, it can be used as a guide in selecting an
appropriate forecasting method.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
To identify the underlying pattern in the data, a useful first step
is to construct a time series plot, which is a graphical
presentation of the relationship between time and the time series
variable; time is represented on the horizontal axis and values
of the time series variable are shown on the vertical axis.
6
Time Series Patterns (Slide 2 of 20)
Horizontal Pattern:
Exists when the data fluctuate randomly around a constant mean
over time.
Stationary time series: It denotes a time series whose statistical
properties are independent of time:
The process generating the data has a constant mean.
The variability of the time series is constant over time.
A time series plot for a stationary time series will always
exhibit a horizontal pattern with random fluctuations.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Time Series Patterns (Slide 3 of 20)
Table 8.1: Gasoline Sales Time SeriesWeekSales (1,000s of
gallons)117221319423518616720818922102011151222
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Data in Table 8.1 show the number of gallons of gasoline (in
1000s) sold by a gasoline distributor in Bennington, Vermont,
over the past 12 weeks.
The average value, or mean, for this time series is 19.25, or
19,250 gallons per week.
8
Time Series Patterns (Slide 4 of 20)
Figure 8.1: Gasoline Sales Time Series Plot
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Figure 8.1 shows a time series plot for the data in Table 8.1.
Note how the data fluctuate around the sample mean of 19,250
gallons.
Although random variability is present, we would say that these
data follow a horizontal pattern.
9
Time Series Patterns (Slide 5 of 20)
Table 8.2: Gasoline Sales Time Series after Obtaining the
Contract with the Vermont State PoliceWeekSales (1,000s of
gallons)WeekSales (1,000s of
gallons)1171222221133131914344231531518163361617287201
832818193092220291020213411152233
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Table 8.2 shows the number of gallons of gasoline sold for the
original time series and the 10 weeks after signing the new
contract with the Vermont State Police to provide gasoline for
state police cars located in southern Vermont.
10
Time Series Patterns (Slide 6 of 20)
Figure 8.2: Gasoline Sales Time Series Plot after Obtaining the
Contract with the Vermont State Police
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Figure 8.2 shows the corresponding time series plot. Note the
increased level of the time series beginning in week 13.
This change in the level of the time series makes it more
difficult to choose an appropriate forecasting method.
11
Time Series Patterns (Slide 7 of 20)
Trend Pattern:
A trend pattern shows gradual shifts or movements to relatively
higher or lower values over a longer period of time.
A trend is usually the result of long-term factors such as:
Population increases or decreases.
Shifting demographic characteristics of the population.
Improving technology.
Changes in the competitive landscape.
Changes in consumer preferences.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Time Series Patterns (Slide 8 of 20)
Table 8.3: Bicycle Sales Time SeriesYearSales
(1,000s)121.6222.9325.5421.9523.9627.5731.5829.7928.61031.
4
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Table 8.3 shows the time series of bicycle sales for a particular
manufacturer over the past 10 years.
13
Time Series Patterns (Slide 9 of 20)
Figure 8.3: Bicycle Sales Time Series Plot
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Figure 8.3 shows the time series of bicycle sales for a particular
manufacturer over the past 10 years.
Note that 21,600 bicycles were sold in year 1, 22,900 were sold
in year 2, and so on.
In year 10, the most recent year, 31,400 bicycles were sold.
Visual inspection of the time series plot shows some up-and-
down movement over the past 10 years.
14
Time Series Patterns (Slide 10 of 20)
Table 8.4: Cholesterol Drug Revenue TimesYearRevenue ($
millions)123.1221.3327.4434.6533.8643.2759.5864.4974.21099.
3
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The data in Table 8.4 and the corresponding time series plot in
Figure 8.4 show the sales revenue for a cholesterol drug since
the company won FDA approval for the drug 10 years ago.
The time series increases in a nonlinear fashion; that is, the rate
of change of revenue does not increase by a constant amount
from one year to the next.
In fact, the revenue appears to be growing in an exponential
fashion.
15
Time Series Patterns (Slide 11 of 20)
Figure 8.4: Cholesterol Drug Revenue Times Series Plot ($
millions)
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
The data in Table 8.4 and the corresponding time series plot in
Figure 8.4 show the sales revenue for a cholesterol drug since
the company won FDA approval for the drug 10 years ago.
The time series increases in a nonlinear fashion; that is, the rate
of change of revenue does not increase by a constant amount
from one year to the next.
In fact, the revenue appears to be growing in an exponential
fashion.
16
Time Series Patterns (Slide 12 of 20)
Seasonal Pattern:
Seasonal patterns are recurring patterns over successive periods
of time.
Example: A retailer that sells bathing suits expects low sales
activity in the fall and winter months, with peak sales in the
spring and summer months to occur every year.
The time series plot not only exhibits a seasonal pattern over a
one-year period but also for less than one year in duration.
Example: daily traffic volume shows within-the-day “seasonal”
behavior, with peak levels occurring during rush hour, moderate
flow during the rest of the day, and light flow from midnight to
early morning.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Time Series Patterns (Slide 13 of 20)
Table 8.5: Umbrella Sales Time SeriesYearQuarterSales11
1252 1533 1064 8821 1182 1613 1334 10231 1382 1443
1134 80
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
18
Time Series Patterns (Slide 14 of 20)
Table 8.5: Umbrella Sales Time Series
(cont.)YearQuarterSales41 1092 1373 1254 10951 1302
1653 1284 96
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
19
Time Series Patterns (Slide 15 of 20)
Figure 8.5: Umbrella Sales Time Series Plot
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
20
Time Series Patterns (Slide 16 of 20)
Trend and Seasonal Pattern:
Some time series include both a trend and a seasonal pattern.
Table 8.6: Quarterly Smartphone Sales Time Series
YearQuarterSales
($1,000s)114.824.136.046.5215.825.236.847.4
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Table 8.6 and Figure 8.6 show quarterly smartphone sales for a
particular manufacturer over the past four years.
Clearly an increasing trend is present.
However, Figure 8.6 also indicates that sales are lowest in the
second quarter of each year and highest in quarters 3 and 4.
Thus, we conclude that a seasonal pattern also exists for
smartphone sales.
21
Time Series Patterns (Slide 17 of 20)
Table 8.6: Quarterly Smartphone Sales Time Series
(cont.)YearQuarterSales
($1,000s)316.025.637.547.8416.325.938.048.4
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Figure 8.6 shows quarterly smartphone sales for a particular
manufacturer over the past four years.
Clearly an increasing trend is present.
However, Figure 8.6 also indicates that sales are lowest in the
second quarter of each year and highest in quarters 3 and 4.
Thus, we conclude that a seasonal pattern also exists for
smartphone sales.
22
Time Series Patterns (Slide 18 of 20)
Figure 8.6: Quarterly Smartphone Sales Time Series Plot
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly access ible
website, in whole or in part.
Table 8.6 and Figure 8.6 show quarterly smartphone sales for a
particular manufacturer over the past four years.
Clearly an increasing trend is present.
However, Figure 8.6 also indicates that sales are lowest in the
second quarter of each year and highest in quarters 3 and 4.
Thus, we conclude that a seasonal pattern also exists for
smartphone sales.
23
Time Series Patterns (Slide 19 of 20)
Cyclical Pattern:
A cyclical pattern exists if the time series plot shows an
alternating sequence of points below and above the trendline
that lasts for more than one year.
Example: Periods of moderate inflation followed by periods of
rapid inflation can lead to a time series that alternates below
and above a generally increasing trendline.
Cyclical effects are often combined with long-term trend effects
and referred to as trend-cycle effects.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
24
Time Series Patterns (Slide 20 of 20)
Identifying Time Series Patterns:
The underlying pattern in the time series is an important factor
in selecting a forecasting method.
A time series plot should be one of the first analytic tools.
We need to use a forecasting method that is capable of handling
the pattern exhibited by the time series effectively.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
25
Forecast Accuracy
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Forecast Accuracy (Slide 1 of 10)
Table 8.7: Computing Forecasts and Measures of Forecast
Accuracy Using the Most Recent Value as the Forecast for the
Next PeriodWeekTime Series ValueForecastForecast
ErrorAbsolute Value of Forecast ErrorSquared Forecast
ErrorPercentage ErrorAbsolute Value of Percentage
Error11722117 4 4 16 19.05 19.0531921 −2 2
4 −10.53 10.5342319 4 4 16 17.39
17.3951823 −5 5 25 −27.78 27.7861618 −2
2 4 −12.50 12.50
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessib le
website, in whole or in part.
Table 8.7 shows the forecasts for the gasoline time series shown
in Table 8.1 using the simplest of all the forecasting methods.
We use the most recent week’s sales volume as the forecast for
the next week.
For instance, the distributor sold 17 thousand gallons of
gasoline in week 1; this value is used as the forecast for week 2.
Next, we use 21, the actual value of sales in week 2, as the
forecast for week 3, and so on.
This method is often referred to as a naïve forecasti ng method
because of its simplicity.
27
Forecast Accuracy (Slide 2 of 10)
Table 8.7: Computing Forecasts and Measures of Forecast
Accuracy Using the Most Recent Value as the Forecast for the
Next Period (cont.)WeekTime Series ValueForecastForecast
ErrorAbsolute Value of Forecast ErrorSquared Forecast
ErrorPercentage ErrorAbsolute Value of Percentage Error72016
4 4 16 20.00 20.0081820 −2 2 4
−11.11 11.1192218 4 4 16 18.18
18.18102022 −2 2 4 −10.00 10.00111520 −5
5 25 −33.33 33.33122215 7 7 49 31.82
31.82Totals 5 41 179 1.19 211.69
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Table 8.7 shows the forecasts for the gasoline time series shown
in Table 8.1 using the simplest of all the forecasting methods.
We use the most recent week’s sales volume as the forecast for
the next week.
For instance, the distributor sold 17 thousand gallons of
gasoline in week 1; this value is used as the forecast for week 2.
Next, we use 21, the actual value of sales in week 2, as the
forecast for week 3, and so on.
This method is often referred to as a naïve forecasting method
because of its simplicity.
28
Forecast Accuracy (Slide 3 of 10)
Naïve forecasting method: Using the most recent data to predict
future data.
The key concept associated with measuring forecast accuracy is
forecast error.
Measures to determine how well a particular forecasting method
is able to reproduce the time series data that are already
available.
Forecast error.
Mean forecast error (MFE).
Mean absolute error (MAE).
Mean squared error (MSE).
Mean absolute percentage error (MAPE).
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicl y accessible
website, in whole or in part.
Forecast Accuracy (Slide 4 of 10)
Forecast Error: Difference between the actual and the forecasted
values for period t.
Mean Forecast Error: Mean or average of the forecast errors.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Forecast error:
=
= actual value
= forecasted value
Example:
Consider Table 8.7.
The distributor actually sold 21 thousa nd gallons of gasoline in
week 2, and the forecast, using the sales volume in week 1, was
17 thousand gallons.
The forecast error in week 2 is = = 21 17 = 4.
Mean Forecast Error (MFE):
MFE =
n = Number of periods in time series
k = Number of periods at the beginning of the time series for
which we cannot produce a naïve forecast
Example:
Table 8.7 shows that the sum of the forecast errors for the
gasoline sales time series is 5.
Thus, the mean or average error is 5/11 = 0.45.
30
Forecast Accuracy (Slide 5 of 10)
Mean Absolute Error (MAE): Measure of forecast accuracy that
avoids the problem of positive and negative forecast errors
offsetting one another.
Mean Squared Error (MSE): Measure that avoids the problem of
positive and negative errors offsetting each other is obtained by
computing the average of the squared forecast errors.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Mean Absolute Error (MAE)
It is also referred to as the mean absolute deviation (MAD).
Example:
Table 8.7 shows that the sum of the absolute values of the
forecast errors is 41.
Thus MAE = average of the absolute value of the forecast errors
= 41/11 = 3.73.
MEAN SQUARED ERROR (MSE)
Example:
From Table 8.7, the sum of the squared errors is 179.
Hence, MSE = average of the square of the forecast errors =
179/11 = 16.27.
31
Forecast Accuracy (Slide 6 of 10)
Mean Absolute Percentage Error (MAPE): Average of the
absolute value of percentage forecast errors.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Mean Absolute Percentage Error (MAPE):
The size of MAE or MSE depends upon the scale of the data.
As a result, it is difficult to make comparisons for different time
intervals (such as comparing a method of forecasting monthly
gasoline sales to a method of forecasting weekly sales) or to
make comparisons across different time series (such as monthly
sales of gasoline and monthly sales of oil filters).
To make comparisons such as these, we need to work with
relative or percentage error measures.
The mean absolute percentage error (MAPE) is such a measure.
Example:
Table 8.7 shows that the sum of the absolute values of the
percentage errors is
Thus the MAPE, which is the average of the absolute value of
percentage forecast errors, is 211.69/11 = 19.24%
32
Forecast Accuracy (Slide 7 of 10)
Table 8.8: Computing Forecasts and Measures of Forecast
Accuracy Using the Average of All the Historical Data as the
Forecast for the Next PeriodWeekTime Series
ValueForecastForecast ErrorAbsolute Value of Forecast
ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of
Percentage Error 1 17 2 21 17.00 4.00 4.00
16.00 19.05 19.05 3 19 19.00 0.00
0.00 0.00 0.00 0.00 4 23 19.00 4.00 4.00 16.00
17.39 17.39 5 18 20.00 −2.00 2.00
4.00 −11.11 11.11
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
We begin by developing a forecast for week 2.
Because there is only one historical value available prior to
week 2, the forecast for week 2 is just the time series value in
week 1; thus, the forecast for week 2 is 17 thousand gallons of
gasoline.
To compute the forecast for week 3, we take the average of the
sales values in weeks 1 and 2. Thus,
= (17 + 21)/2 = 19
Similarly, the forecast for week 4 is = (17 + 21 + 19)/3 = 19.
Using the results shown in Table 8.8, we obtain the following
values of MAE, MSE, and MAPE:
MAE = 26.81/11 = 2.44
MSE = 89.07/11 = 8.10
MAPE = 141.34/11 = 12.85%
33
Forecast Accuracy (Slide 8 of 10)
Table 8.8: Computing Forecasts and Measures of Forecast
Accuracy Using the Average of All the Historical Data as the
Forecast for the Next Period (cont.)WeekTime Series
ValueForecastForecast ErrorAbsolute Value of Forecast
ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of
Percentage Error 6 16 19.60 −3.60 3.60 12.96
−22.50 22.50 7 20 19.00 1.00 1.00 1.00
5.00 5.00 8 18 19.14 −1.14 1.14 1.31 −6.35
6.35 9 22 19.00 3.00 3.00 9.00 13.64 13.64
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
We begin by developing a forecast for week 2.
Because there is only one historical value available prior to
week 2, the forecast for week 2 is just the time series value in
week 1; thus, the forecast for week 2 is 17 thousand gallons of
gasoline.
To compute the forecast for week 3, we take the average of the
sales values in weeks 1 and 2. Thus,
= (17 + 21)/2 = 19
Similarly, the forecast for week 4 is = (17 + 21 + 19)/3 = 19.
Using the results shown in Table 8.8, we obtain the following
values of MAE, MSE, and MAPE:
MAE = 26.81/11 = 2.44
MSE = 89.07/11 = 8.10
MAPE = 141.34/11 = 12.85%
34
Forecast Accuracy (Slide 9 of 10)
Table 8.8: Computing Forecasts and Measures of Forecast
Accuracy Using the Average of All the Historical Data as the
Forecast for the Next Period (cont.)WeekTime Series
ValueForecastForecast ErrorAbsolute Value of Forecast
ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of
Percentage Error 10 20 19.33 0.67 0.67 0.44 3.33
3.33 11 15 19.40 −4.40 4.40 19.36 −29.33
29.33 12 22 19.00 3.00 3.00 9.00 13.64
13.64Totals 4.52 26.81 89.07 2.75 141.34
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
We begin by developing a forecast for week 2.
Because there is only one historical value available prior to
week 2, the forecast for week 2 is just the time series value in
week 1; thus, the forecast for week 2 is 17 thousand gall ons of
gasoline.
To compute the forecast for week 3, we take the average of the
sales values in weeks 1 and 2. Thus,
= (17 + 21)/2 = 19
Similarly, the forecast for week 4 is = (17 + 21 + 19)/3 = 19.
Using the results shown in Table 8.8, we obtain the following
values of MAE, MSE, and MAPE:
MAE = 26.81/11 = 2.44
MSE = 89.07/11 = 8.10
MAPE = 141.34/11 = 12.85%
35
Forecast Accuracy (Slide 10 of 10)
Compare the accuracy of the two forecasting methods by
comparing the values of MAE, MSE, and MAPE for each
method.Naïve MethodAverage of Past ValuesMAE 3.73
2.44MSE 16.27 8.10MAPE 19.24% 12.85%
The average of past values provides more accurate forecasts for
the next period than using the most recent observation.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
36
Moving Averages and Exponential Smoothing
Moving Averages
Exponential Smoothing
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
Moving Averages and Exponential Smoothing (Slide 1 of 16)
Moving Averages:
Moving averages method: Uses the average of the most recent k
data values in the time series as the forecast for the next period.
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
In the formula for moving average forecast,
= forecast of the time series for period t + 1
= actual value of the time series in period t
k = number of periods of time series data used to generate the
forecast
To use moving averages to forecast a time series, we must first
select the order k.
If only the most recent values of the time series are considered
relevant, a small value of k is preferred.
If a greater number of past values are considered relevant, then
we generally opt for a larger value of k.
A moving average will adapt to the new level of the series and
continue to provide good forecasts in k periods.
Thus, a smaller value of k will track shifts in a time series more
quickly.
On the other hand, larger values of k will be more effective in
smoothing out random fluctuations.
38
Moving Averages and Exponential Smoothing (Slide 2 of 16)
Table 8.9: Summary of Three-Week Moving Average
CalculationsWeekTime Series ValueForecastForecast
ErrorAbsolute Value of Forecast ErrorSquared Forecast
ErrorPercentage ErrorAbsolute Value of Percentage Error 1
17 2 21 3 19 4 23 19 4 4 16
17.39 17.39 5 18 21 −3 3 9 −16.67
16.67 6 16 20 −4 4 16 −25.00 25.00
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
To illustrate the moving averages method, let us return to the
original 12 weeks of gasoline sales data in Table 8.1 and Figure
8.1.
We will use a three-week moving average (k = 3).
We begin by computing the forecast of sales in week 4 using the
average of the time series values in weeks 1 to 3.
= average for weeks 1 to 3 = (17 + 21 + 19)/3 = 19
The moving average forecast of sales in week 4 is 19, or 19,000
gallons of gasoline.
Because the actual value observed in week 4 is 23, the forecast
error in week 4 is e4 = 23 - 19 = 4.
We next compute the forecast of sales in week 5 by averaging
the time series values in weeks 2 to 4.
= average for weeks 2 to 4 = (21 + 19 + 23)/3 = 21
Hence, the forecast of sales in week 5 is 21 and the error
associated with this forecast is e5 = 18 - 21 = -3.
A complete summary of the three-week moving average
forecasts for the gasoline sales time series is provided in Table
8.9.
39
Moving Averages and Exponential Smoothing (Slide 3 of 16)
Table 8.9: Summary of Three-Week Moving Average
Calculations (cont.)WeekTime Series ValueForecastForecast
ErrorAbsolute Value of Forecast ErrorSquared Forecast
ErrorPercentage ErrorAbsolute Value of Percentage Error 7
20 19 1 1 1 5.00 5.00 8 18 18 0 0
0 0.00 0.00 9 22 18 4 4 16 18.18
18.18 10 20 20 0 0 0 0.00 0.00 11 15
20 −5 5 25 −33.33 33.33 12 22 19 3
3 9 13.64 13.64Totals 0 24 92 −20.79
129.21
© 2021 Cengage Learning. All Rights Reserved. May not be
scanned, copied or duplicated, or posted to a publicly accessible
website, in whole or in part.
To illustrate the moving averages method, let us return to the
original 12 weeks of gasoline sales data in Table 8.1 and Figure
8.1.
We will use a three-week moving average (k = 3).
We begin by computing the forecast of sales in week 4 using the
average of the time series values in weeks 1 to 3.
= average for weeks 1 to 3 = (17 + 21 + 19)/3 = 19
The moving average forecast of sales in week 4 is 19 or 19,000
gallons of gasoline.
Because the actual value observed in week 4 is 23, the forecast
error in week 4 is e4 = 23 - 19 = 4.
We next compute the forecast of sales in week 5 by averaging
the time series values in weeks 2 to 4.
= average for weeks 2 to 4 = (21 + 19 + 23)/3 = 21
Hence, the forecast of sales in week 5 is 21 and the error
associated with this forecast is e5 = 18 - 21 = -3.
A complete summary of the three-week moving average
forecasts …

More Related Content

Similar to Business analytics© 2021 cengage learning. all rights

Towards Finding the Optimal Bucket Filling Strategy through Simulation
Towards Finding the Optimal Bucket Filling Strategy through SimulationTowards Finding the Optimal Bucket Filling Strategy through Simulation
Towards Finding the Optimal Bucket Filling Strategy through SimulationReno Filla
 
Ch_06_Wooldridge_5e_PPT.pdf
Ch_06_Wooldridge_5e_PPT.pdfCh_06_Wooldridge_5e_PPT.pdf
Ch_06_Wooldridge_5e_PPT.pdfFaizanGul6
 
2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation Tool
2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation Tool2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation Tool
2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation ToolMarwan Zogheib
 
Improving_programming_skills_of_Mechanical_Enginee.pdf
Improving_programming_skills_of_Mechanical_Enginee.pdfImproving_programming_skills_of_Mechanical_Enginee.pdf
Improving_programming_skills_of_Mechanical_Enginee.pdfssuserbe139c
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptxMoinPasha12
 
Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...
Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...
Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...IJERA Editor
 
Nonlinear vehicle modelling
Nonlinear vehicle modellingNonlinear vehicle modelling
Nonlinear vehicle modellingAdam Wittmann
 
Chapter 7 - The Repetition Structure
Chapter 7 - The Repetition StructureChapter 7 - The Repetition Structure
Chapter 7 - The Repetition Structuremshellman
 
CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...
CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...
CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...IOSR Journals
 
IRJET- Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...
IRJET-  	  Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...IRJET-  	  Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...
IRJET- Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...IRJET Journal
 
A machine learning model for average fuel consumption in heavy vehicles
A machine learning model for average fuel consumption in heavy vehiclesA machine learning model for average fuel consumption in heavy vehicles
A machine learning model for average fuel consumption in heavy vehiclesVenkat Projects
 
L6_Chapter15_-_Introduction_to_Simulation_Modeling.pptx
L6_Chapter15_-_Introduction_to_Simulation_Modeling.pptxL6_Chapter15_-_Introduction_to_Simulation_Modeling.pptx
L6_Chapter15_-_Introduction_to_Simulation_Modeling.pptxMohdSoufiYussof
 
Machine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paperMachine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paperJames by CrowdProcess
 
Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...
Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...
Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...IRJET Journal
 
Summary of “Efficient large-scale fleet management via multi-agent deep reinf...
Summary of “Efficient large-scale fleet management via multi-agent deep reinf...Summary of “Efficient large-scale fleet management via multi-agent deep reinf...
Summary of “Efficient large-scale fleet management via multi-agent deep reinf...MauroRubieri
 
Anti lock braking (ABS) Model based Design in MATLAB-Simulink
Anti lock braking (ABS) Model based Design in MATLAB-SimulinkAnti lock braking (ABS) Model based Design in MATLAB-Simulink
Anti lock braking (ABS) Model based Design in MATLAB-SimulinkOmkar Rane
 

Similar to Business analytics© 2021 cengage learning. all rights (20)

Towards Finding the Optimal Bucket Filling Strategy through Simulation
Towards Finding the Optimal Bucket Filling Strategy through SimulationTowards Finding the Optimal Bucket Filling Strategy through Simulation
Towards Finding the Optimal Bucket Filling Strategy through Simulation
 
Ch_06_Wooldridge_5e_PPT.pdf
Ch_06_Wooldridge_5e_PPT.pdfCh_06_Wooldridge_5e_PPT.pdf
Ch_06_Wooldridge_5e_PPT.pdf
 
2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation Tool
2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation Tool2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation Tool
2012 SAE-A Young Engineer of the Year Award - Leaf Spring Simulation Tool
 
Lesson 8 more on repitition structure
Lesson 8 more on repitition structureLesson 8 more on repitition structure
Lesson 8 more on repitition structure
 
Improving_programming_skills_of_Mechanical_Enginee.pdf
Improving_programming_skills_of_Mechanical_Enginee.pdfImproving_programming_skills_of_Mechanical_Enginee.pdf
Improving_programming_skills_of_Mechanical_Enginee.pdf
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptx
 
Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...
Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...
Aerodynamic Drag Reduction for A Generic Sport Utility Vehicle Using Rear Suc...
 
Employee mode of commuting
Employee mode of commutingEmployee mode of commuting
Employee mode of commuting
 
Nonlinear vehicle modelling
Nonlinear vehicle modellingNonlinear vehicle modelling
Nonlinear vehicle modelling
 
Chapter 7 - The Repetition Structure
Chapter 7 - The Repetition StructureChapter 7 - The Repetition Structure
Chapter 7 - The Repetition Structure
 
CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...
CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...
CFD Simulation for Flow over Passenger Car Using Tail Plates for Aerodynamic ...
 
IRJET- Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...
IRJET-  	  Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...IRJET-  	  Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...
IRJET- Influence of Tire Parameters of a Semi-Trailer Truck on Road Surfa...
 
A machine learning model for average fuel consumption in heavy vehicles
A machine learning model for average fuel consumption in heavy vehiclesA machine learning model for average fuel consumption in heavy vehicles
A machine learning model for average fuel consumption in heavy vehicles
 
L6_Chapter15_-_Introduction_to_Simulation_Modeling.pptx
L6_Chapter15_-_Introduction_to_Simulation_Modeling.pptxL6_Chapter15_-_Introduction_to_Simulation_Modeling.pptx
L6_Chapter15_-_Introduction_to_Simulation_Modeling.pptx
 
Machine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paperMachine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paper
 
ijca_publication
ijca_publicationijca_publication
ijca_publication
 
Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...
Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...
Comparative Analysis of Tuning Hyperparameters in Policy-Based DRL Algorithm ...
 
Summary of “Efficient large-scale fleet management via multi-agent deep reinf...
Summary of “Efficient large-scale fleet management via multi-agent deep reinf...Summary of “Efficient large-scale fleet management via multi-agent deep reinf...
Summary of “Efficient large-scale fleet management via multi-agent deep reinf...
 
Anti lock braking (ABS) Model based Design in MATLAB-Simulink
Anti lock braking (ABS) Model based Design in MATLAB-SimulinkAnti lock braking (ABS) Model based Design in MATLAB-Simulink
Anti lock braking (ABS) Model based Design in MATLAB-Simulink
 
Qt unit i
Qt unit   iQt unit   i
Qt unit i
 

More from RAJU852744

2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docx
2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docx2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docx
2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docxRAJU852744
 
2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docx
2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docx2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docx
2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docxRAJU852744
 
20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docx
20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docx20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docx
20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docxRAJU852744
 
223 Case 53 Problems in Pasta Land by Andres Sous.docx
223 Case 53 Problems in Pasta Land by  Andres Sous.docx223 Case 53 Problems in Pasta Land by  Andres Sous.docx
223 Case 53 Problems in Pasta Land by Andres Sous.docxRAJU852744
 
222111Organization N.docx
222111Organization N.docx222111Organization N.docx
222111Organization N.docxRAJU852744
 
22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docx
22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docx22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docx
22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docxRAJU852744
 
2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docx
2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docx2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docx
2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docxRAJU852744
 
216Author’s Note I would like to thank the Division of Wo.docx
216Author’s Note I would like to thank the Division of Wo.docx216Author’s Note I would like to thank the Division of Wo.docx
216Author’s Note I would like to thank the Division of Wo.docxRAJU852744
 
2019 International Conference on Machine Learning, Big Data, C.docx
2019 International Conference on Machine Learning, Big Data, C.docx2019 International Conference on Machine Learning, Big Data, C.docx
2019 International Conference on Machine Learning, Big Data, C.docxRAJU852744
 
2018 4th International Conference on Green Technology and Sust.docx
2018 4th International Conference on Green Technology and Sust.docx2018 4th International Conference on Green Technology and Sust.docx
2018 4th International Conference on Green Technology and Sust.docxRAJU852744
 
202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docx
202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docx202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docx
202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docxRAJU852744
 
200 wordsResearch Interest Lack of minorities in top level ma.docx
200 wordsResearch Interest Lack of minorities in top level ma.docx200 wordsResearch Interest Lack of minorities in top level ma.docx
200 wordsResearch Interest Lack of minorities in top level ma.docxRAJU852744
 
2019 14th Iberian Conference on Information Systems and Tech.docx
2019 14th Iberian Conference on Information Systems and Tech.docx2019 14th Iberian Conference on Information Systems and Tech.docx
2019 14th Iberian Conference on Information Systems and Tech.docxRAJU852744
 
200520201ORG30002 – Leadership Practice and Skills.docx
200520201ORG30002 – Leadership Practice and Skills.docx200520201ORG30002 – Leadership Practice and Skills.docx
200520201ORG30002 – Leadership Practice and Skills.docxRAJU852744
 
2182020 Sample Content Topichttpspurdueglobal.brights.docx
2182020 Sample Content Topichttpspurdueglobal.brights.docx2182020 Sample Content Topichttpspurdueglobal.brights.docx
2182020 Sample Content Topichttpspurdueglobal.brights.docxRAJU852744
 
21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docx
21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docx21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docx
21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docxRAJU852744
 
2192020 Originality Reporthttpsucumberlands.blackboar.docx
2192020 Originality Reporthttpsucumberlands.blackboar.docx2192020 Originality Reporthttpsucumberlands.blackboar.docx
2192020 Originality Reporthttpsucumberlands.blackboar.docxRAJU852744
 
20810chapter Information Systems Sourcing .docx
20810chapter       Information Systems Sourcing    .docx20810chapter       Information Systems Sourcing    .docx
20810chapter Information Systems Sourcing .docxRAJU852744
 
21720201Chapter 14Eating and WeightHealth Ps.docx
21720201Chapter 14Eating and WeightHealth Ps.docx21720201Chapter 14Eating and WeightHealth Ps.docx
21720201Chapter 14Eating and WeightHealth Ps.docxRAJU852744
 
2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docx
2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docx2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docx
2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docxRAJU852744
 

More from RAJU852744 (20)

2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docx
2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docx2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docx
2222020 Report Pagehttpsww3.capsim.comcgi-bindispla.docx
 
2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docx
2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docx2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docx
2212020 Soil Colloids (Chapter 8) Notes - AGRI1050R50 Intro.docx
 
20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docx
20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docx20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docx
20 Other Conditions That May Be a Focus of Clinical AttentionV-c.docx
 
223 Case 53 Problems in Pasta Land by Andres Sous.docx
223 Case 53 Problems in Pasta Land by  Andres Sous.docx223 Case 53 Problems in Pasta Land by  Andres Sous.docx
223 Case 53 Problems in Pasta Land by Andres Sous.docx
 
222111Organization N.docx
222111Organization N.docx222111Organization N.docx
222111Organization N.docx
 
22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docx
22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docx22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docx
22-6  Reporting the Plight of Depression FamiliesMARTHA GELLHOR.docx
 
2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docx
2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docx2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docx
2012 © Laureate Education, Inc. ASSIGNMENT AND FINAL P.docx
 
216Author’s Note I would like to thank the Division of Wo.docx
216Author’s Note I would like to thank the Division of Wo.docx216Author’s Note I would like to thank the Division of Wo.docx
216Author’s Note I would like to thank the Division of Wo.docx
 
2019 International Conference on Machine Learning, Big Data, C.docx
2019 International Conference on Machine Learning, Big Data, C.docx2019 International Conference on Machine Learning, Big Data, C.docx
2019 International Conference on Machine Learning, Big Data, C.docx
 
2018 4th International Conference on Green Technology and Sust.docx
2018 4th International Conference on Green Technology and Sust.docx2018 4th International Conference on Green Technology and Sust.docx
2018 4th International Conference on Green Technology and Sust.docx
 
202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docx
202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docx202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docx
202 S.W.3d 811Court of Appeals of Texas,San Antonio.PROG.docx
 
200 wordsResearch Interest Lack of minorities in top level ma.docx
200 wordsResearch Interest Lack of minorities in top level ma.docx200 wordsResearch Interest Lack of minorities in top level ma.docx
200 wordsResearch Interest Lack of minorities in top level ma.docx
 
2019 14th Iberian Conference on Information Systems and Tech.docx
2019 14th Iberian Conference on Information Systems and Tech.docx2019 14th Iberian Conference on Information Systems and Tech.docx
2019 14th Iberian Conference on Information Systems and Tech.docx
 
200520201ORG30002 – Leadership Practice and Skills.docx
200520201ORG30002 – Leadership Practice and Skills.docx200520201ORG30002 – Leadership Practice and Skills.docx
200520201ORG30002 – Leadership Practice and Skills.docx
 
2182020 Sample Content Topichttpspurdueglobal.brights.docx
2182020 Sample Content Topichttpspurdueglobal.brights.docx2182020 Sample Content Topichttpspurdueglobal.brights.docx
2182020 Sample Content Topichttpspurdueglobal.brights.docx
 
21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docx
21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docx21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docx
21 hours agoMercy Eke Week 2 Discussion Hamilton Depression.docx
 
2192020 Originality Reporthttpsucumberlands.blackboar.docx
2192020 Originality Reporthttpsucumberlands.blackboar.docx2192020 Originality Reporthttpsucumberlands.blackboar.docx
2192020 Originality Reporthttpsucumberlands.blackboar.docx
 
20810chapter Information Systems Sourcing .docx
20810chapter       Information Systems Sourcing    .docx20810chapter       Information Systems Sourcing    .docx
20810chapter Information Systems Sourcing .docx
 
21720201Chapter 14Eating and WeightHealth Ps.docx
21720201Chapter 14Eating and WeightHealth Ps.docx21720201Chapter 14Eating and WeightHealth Ps.docx
21720201Chapter 14Eating and WeightHealth Ps.docx
 
2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docx
2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docx2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docx
2020221 Critical Review #2 - WebCOM™ 2.0httpssmc.grte.docx
 

Recently uploaded

Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docxPoojaSen20
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIShubhangi Sonawane
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfChris Hunter
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 

Recently uploaded (20)

Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 

Business analytics© 2021 cengage learning. all rights

  • 1. Business Analytics © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Linear Regression Chapter 7 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Introduction (Slide 1 of 2) Managerial decisions are often based on the relationship between two or more variables: Example: After considering the relationship between advertising expenditures and sales, a marketing manager might attempt to predict sales for a given level of advertising expenditures. Sometimes a manager will rely on intuition to judge how two variables are related. If data can be obtained, a statistical procedure called regression
  • 2. analysis can be used to develop an equation showing how the variables are related. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publ icly accessible website, in whole or in part. Introduction (Slide 2 of 2) Dependent variable or response: Variable being predicted. Independent variables or predictor variables: Variables being used to predict the value of the dependent variable. Simple linear regression: A regression analysis for which any one unit change in the independent variable, x, is assumed to result in the same change in the dependent variable, y. Multiple linear regression: A regression analysis involving two or more independent variables. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Simple Linear Regression Model
  • 3. Regression Model Estimated Regression Equation © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Simple Linear Regression Model (Slide 1 of 5) Regression Model: The equation that describes how y is related to x and an error term. The error term accounts for the variability in y that cannot be explained by the linear relationship between x and y. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. In the Butler Trucking Company example, the population consists of all the driving assignments that can be made by the company. For every driving assignment in the population, there is a value
  • 4. of x (miles traveled) and a corresponding value of y (travel time in hours). 6 Simple Linear Regression Model (Slide 2 of 5) Estimated Regression Equation: The parameter values are usually not known and must be estimated using sample data. the regression equation and dropping the error term, we obtain the estimated regression for simple linear regression. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. To estimate the mean or expected value of travel time for a driving assignment of 75 miles, Butler Trucking would substitute the value of 75 for x in equation = b0 + b1x. 7 Simple Linear Regression Model (Slide 3 of 5) In the estimated simple linear regression equation:
  • 5. The graph of the estimated simple linear regression equation is called the estimated regression line. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly access ible website, in whole or in part. To estimate the mean or expected value of travel time for a driving assignment of 75 miles, Butler Trucking would substitute the value of 75 for x in equation = b0 + b1x. 8 Meghan Cook (MC) - The slash in this equation should be a vertical line. I have edited the alt text to show how this should read as well. Simple Linear Regression Model (Slide 4 of 5) Figure 7.1: The Estimation Process in Simple Linear Regression © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 6. 9 Simple Linear Regression Model (Slide 5 of 5) Figure 7.2: Possible Regression Lines in Simple Linear Regression © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The regression line in Panel A shows that the mean value of y is related positively to x, with larger values of associated with larger values of x. In Panel B, the mean value of y is related negatively to x, with smaller values of associated with larger values of x. In Panel C, the mean value of y is not related to x; that is, is the same for every value of x. 10 Least Squares Method Least Squares Estimates of the Regression Parameters Using Excel’s Chart Tools to Compute the Estimated Regression Equation © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible
  • 7. website, in whole or in part. Least Squares Method (Slide 1 of 15) Least squares method: A procedure for using sample data to find the estimated regression equation. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 12 Least Squares Method (Slide 2 of 15) Table 7.1: Miles Traveled and Travel Time for 10 Butler Trucking Company Driving AssignmentsDriving Assignment ix = Miles Traveledy = Travel Time (hours)1 1009.32 504.83 508.94 1006.55 504.26 806.27 757.48 656.09 907.610 906.1 © 2021 Cengage Learning. All Rights Reserved. May not be
  • 8. scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. To illustrate the least squares method, suppose data were collected from a sample of 10 Butler Trucking Company driving assignments. For the ith observation or driving assignment in the sample, xi is the miles traveled and yi is the travel time (in hours). The values of xi and yi for the 10 driving assignments in the sample are summarized in Table 7.1. 13 Least Squares Method (Slide 3 of 15) Figure 7.3: Scatter Chart of Miles Traveled and Travel Time for Sample of 10 Butler Trucking Company Driving Assignments © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure 7.3 is a scatter chart of the data in Table 7.1. Miles traveled is shown on the horizontal axis, and travel time (in hours) is shown on the vertical axis. Longer travel times appear to coincide with more miles traveled. The relationship between the travel time and miles traveled appears to be approximated by a straight line; indeed, a positive linear relationship is indicated between x and y. Therefore, the simple linear regression model is chosen to represent this relationship.
  • 9. 14 Least Squares Method (Slide 4 of 15) © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 15 Least Squares Method (Slide 5 of 15) Hence, We are finding the regression that minimizes the sum of squared errors. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 10. 16 Least Squares Method (Slide 6 of 15) Least Squares Estimates of the Regression Parameters: For the Butler Trucking Company data in Table 7.1: The estimated simple linear regression model: © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 17 Least Squares Method (Slide 7 of 15) If the length of a driving assignment were 1 unit (1 mile) longer, the mean travel time for that driving assignment would be 0.0678 units (0.0678 hours, or approximately 4 minutes) longer. If the driving distance for a driving assignment was 0 units (0 miles), the mean travel time would be 1.2739 units (1.2739
  • 11. hours, or approximately 76 minutes). © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. We interpret b1 and b0 as we would the y-intercept and slope of any straight line. 18 Least Squares Method (Slide 8 of 15) Experimental region: The range of values of the independent variables in the data used to estimate the model. The regression model is valid only over this region. Extrapolation: Prediction of the value of the dependent variable outside the experimental region. It is risky. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 12. Because we have no empirical evidence that the relationship we have found holds true for values of x outside of the range of values of x in the data used to estimate the relationship, extrapolation is risky and should be avoided if possible. For Butler Trucking, this means that any prediction outside the travel time for a driving distance less than 50 miles or greater than 100 miles is not a reliable estimate, and so for this model the estimate of β0 is meaningless. However, if the experimental region for a regression problem includes zero, the y-intercept will have a meaningful interpretation. 19 Least Squares Method (Slide 9 of 15) Butler Trucking Company example: Use the estimated model and the known values for miles traveled for a driving assignment (x) to estimate mean travel time in hours. For example, the first driving assignment in Table 7.1 has a value for miles The mean travel time in hours for this driving assignment is estimated to be: The resulting residual of the estimate is: © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible
  • 13. website, in whole or in part. The simple linear regression model underestimated travel time for this driving assignment by 1.2461 hours (approximately 74 minutes). 20 Least Squares Method (Slide 10 of 15) Table 7.2: Predicted Travel Time and Residuals for 10 Butler Trucking Company Driving Assignments © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Note in Table 7.2 that: The sum of predicted values is equal to the sum of the values of the dependent variable y. The sum of the residuals ei is 0. The sum of the squared residuals has been minimized. 21 Least Squares Method (Slide 11 of 15) Figure 7.4: Scatter Chart of Miles Traveled and Travel Time for Butler Trucking Company Driving Assignments with Regression Line Superimposed
  • 14. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure 7.4 shows the simple linear regression line = 1.2739 + 0.0678xi superimposed on the scatter chart for the Butler Trucking Company data in Table 7.1. This figure, which also highlights the residuals for driving assignment 3 (e3) and driving assignment 5 (e5), shows that the regression model underpredicts travel time for some driving assignments (such as driving assignment 3) and overpredicts travel time for others (such as driving assignment 5), but in general appears to fit the data relatively well. 22 Least Squares Method (Slide 12 of 15) Figure 7.5: A Geometric Interpretation of the Least Squares Method © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. In Figure 7.5, a vertical line is drawn from each point in the scatter chart to the linear regression line. Each of these lines represents the difference between the actual driving time and the driving time to be predicted using linear regression for one of the assignments in the data.
  • 15. The length of each line is equal to the absolute value of the residual for one of the driving assignments. When a residual is squared, the resulting value is equal to the square that is formed using the vertical dashed line representing the residual in Figure 7.4 as one side of a square. Thus, when we find the linear regression model that minimizes the sum of squared errors for the Butler Trucking example, we are positioning the regression line in the manner that minimizes the sum of the areas of the ten squares in Figure 7.5. 23 Least Squares Method (Slide 13 of 15) Using Excel’s Chart Tools to Compute the Estimated Regression Equation: After constructing a scatter chart with Excel’s chart tools: Right-click on any data point and select Add Trendline. When the Format Trendline task pane appears: Select Linear in the Trendline Options area. Select Display Equation on chart in the Trendline Options area. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Least Squares Method (Slide 14 of 15) Figure 7.6: Scatter Chart and Estimated Regression Line for Butler Trucking Company
  • 16. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. We can use Excel’s chart tools to compute the estimated regression equation on a scatter chart of the Butler Trucking Company data in Table 7.1. The worksheet displayed in Figure 7.6 shows the original data, scatter chart, estimated regression line, and estimated regression equation. Note that Excel uses y instead of to denote the predicted value of the dependent variable and puts the regression equation into slope-intercept form whereas we use the intercept-slope form that is standard in statistics. 25 Least Squares Method (Slide 15 of 15) Slope Equation y-Intercept Equation © 2021 Cengage Learning. All Rights Reserved. May not be
  • 17. scanned, copied or duplicated, or posted to a publicl y accessible website, in whole or in part. 26 Assessing the Fit of the Simple Linear Regression Model The Sums of Squares The Coefficient of Determination Using Excel’s Chart Tools to Compute the Coefficient of Determination © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Assessing the Fit of the Simple Linear Regression Model (Slide 1 of 10) The Sums of Squares: Sum of squares due to error: The value of SSE is a measure of the error in using the estimated regression equation to predict the values of the dependent variable in the sample. From Table 7.2, © 2021 Cengage Learning. All Rights Reserved. May not be
  • 18. scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Assessing the Fit of the Simple Linear Regression Model (Slide 2 of 10) © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publ icly accessible website, in whole or in part. Without knowledge of any related variables, we would use the sample mean as a predictor of travel time for any given driving assignment. To find , we divide the sum of the actual driving times yi from Table 7.2 (67) by the number of observations n in the data (10); this yields = 6.7. Figure 7.7 provides insight on how well we would predict the values of yi in the Butler Trucking company example using = 6.7. From this figure, which again highlights the residuals for driving assignments 3 and 5, we can see that tends to overpredict travel times for driving assignments that have relatively small values for miles traveled (such as driving assignment 5) and tends to underpredict travel times for driving assignments that have relatively large values for miles traveled (such as driving assignment 3). 29
  • 19. Assessing the Fit of the Simple Linear Regression Model (Slide 3 of 10) Table 7.3 shows the sum of squared deviations obtained by using for each driving assignment in the sample. Butler Trucking Example: For the ith driving assignment in the © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. SST is a measure of how well the observations cluster about the line. 30 Assessing the Fit of the Simple Linear Regression Model (Slide 4 of 10) The corresponding sum of squares is called the total sum of squares (SST). © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible
  • 20. website, in whole or in part. SST is a measure of how well the observations cluster about the line. 31 Assessing the Fit of the Simple Linear Regression Model (Sli de 5 of 10) Table 7.3: Calculations for the Sum of Squares Total for the Butler Trucking Simple Linear Regression © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The sum at the bottom of the last column in Table 7.3 is the total sum of squares for Butler Trucking Company: SST = 23.9. 32 Assessing the Fit of the Simple Linear Regression Model (Slide 6 of 10)
  • 21. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. In Figure 7.8, we show the estimated regression line = 1.2739 + 0.0678xi and the line corresponding to = 6.7. Note that the points cluster more closely around the estimated regression line = 1.2739 + 0.0678xi than they do about the horizontal line = 6.7. 33 Assessing the Fit of the Simple Linear Regression Model (Slide 7 of 10) Relation between SST, SSR, and SSE: where SST = total sum of squares SSR = sum of squares due to regression SSE = sum of squares due to error. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. In the Butler Trucking Company example, SSE = 8.0288 and SST = 23.9: SSR = SST SSE = 23.9 8.0288 = 15.8712
  • 22. 34 Assessing the Fit of the Simple Linear Regression Model (Slide 8 of 10) The Coefficient of Determination: The ratio SSR/SST used to evaluate the goodness of fit for the estimated regression equation; this ratio is called the coefficient of determination and is denoted by Take values between zero and one. Interpreted as the percentage of the total sum of squares that can be explained by using the estimated regression equation. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. For the Butler Trucking Company example, the coefficient of determination, r2 = = = 0.6641. It can be concluded that 66.41% of the total sum of squares can be explained by using the estimated regression equation = 1.2739 + 0.0678xi to predict quarterly sales. 35 Assessing the Fit of the Simple Linear Regression Model (Slide
  • 23. 9 of 10) Using Excel’s Chart Tools to Compute the Coefficient of Determination: To compute the coefficient of determination using the scatter chart in Figure 7.3: Right-click on any data point in the scatter chart and select Add Trendline… When the Format Trendline task pane appears: Select Display R-squared value on chart in the Trendline Options area. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 36 Assessing the Fit of the Simple Linear Regression Model (Slide 10 of 10) © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 24. Figure 7.9 displays the scatter chart, the estimated regression equation, the graph of the estimated regression equation, and the coefficient of determination for the Butler Trucking Company data. We see that r2 = 0.6641. 37 The Multiple Regression Model Regression Model Estimated Multiple Regression Equation Least Squares Method and Multiple Regression Butler Trucking Company and Multiple Regression Using Excel’s Regression Tool to Develop the Estimated Multiple Regression Equation © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The Multiple Regression Model (Slide 1 of 11) Regression Model: © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 25. 39 The Multiple Regression Model (Slide 2 of 11) Regression Model (cont.): Represents the change in the mean value of the dependent variable y that corresponds to a one unit increase in the independent variable holding the values of all other independent variables in the model constant. The multiple regression equation that describes how the mean value of y is related to © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 40 The Multiple Regression Model (Slide 3 of 11)
  • 26. Estimated Multiple Regression Equation: © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 41 The Multiple Regression Model (Slide 4 of 11) Least Squares Method and Multiple Regression: The least squares method is used to develop the estimated multiple regression equation: Finding Uses sample data to provide the values of that minimize the sum of squared residuals. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 27. 42 The Multiple Regression Model (Slide 5 of 11) Figure 7:10: The Estimation Process for Multiple Regression © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The estimation process for multiple regression is shown in Figure 7.10. The estimated values of the dependent variable y are computed by substituting values of the independent variables x1, x2, . . . , xq into the estimated multiple regression equation. Computer software packages can be used to obtain the estimated regression equation and determine the regression coefficients b0, b1, b2, . . . , bq. 43 The Multiple Regression Model (Slide 6 of 11) Butler Trucking Company and Multiple Regression: The estimated simple linear regression equation, The linear effect of the number of miles traveled explains 66.41%
  • 28. This implies, 33.59% of the variability in sample travel times remains unexplained The managers might want to consider adding one or more independent variables, such as number of deliveries, to the model to explain some of the remaining variability in the dependent variable. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 44 The Multiple Regression Model (Slide 7 of 11) Butler Trucking Company and Multiple Regression (cont.): Estimated multiple linear regression with two independent variables: © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible
  • 29. website, in whole or in part. 45 The Multiple Regression Model (Slide 8 of 11) Figure 7.11: Data Analysis Tools Box Using Excel’s Regression Tool to Develop the Estimated Multiple Regression Equation: © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The following steps describe how to use Excel’s Regression tool to compute the estimated regression equation using the data in the worksheet. Copy the data values from the file ButlerWithDeliveries into an Excel worksheet from columns A through D and rows 1 through 301. Step 1. Click the DATA tab in the Ribbon Step 2. Click Data Analysis in the Analysis group Step 3. Select Regression from the list of Analysis Tools in the Data Analysis tools box (shown in Figure 7.11) and click OK 46
  • 30. The Multiple Regression Model (Slide 9 of 11) Figure 7.12: Regression Dialog Box © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Step 4. When the Regression dialog box appears (as shown in Figure 7.12): 1. Enter D1:D301 in the Input Y Range: box 2. Enter B1:C301 in the Input X Range: box 3. Select Labels Selecting Labels tells Excel to use the names you have gi ven to your variables in row 1 when displaying the regression model output. 4. Select Confidence Level: 5. Enter 99 in the Confidence Level: box 6. Select New Worksheet Ply: 7. Click OK 47 The Multiple Regression Model (Slide 10 of 11) Figure 7.13: Excel Regression Output for the Butler Trucking Company with Miles and Deliveries as Independent Variables
  • 31. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. In the Excel output shown in Figure 7.13, the label for the independent variable x1 is Miles (see cell A18), and the label for the independent variable x2 is Deliveries (see cell A19). Estimated regression equation: = 0.1273 + 0.0672x1 + 0.6900x2 Interpretation: For a fixed number of deliveries, we estimate that the mean travel time will increase by 0.0672 hours when the distance traveled increases by 1 mile. For a fixed distance traveled, we estimate that the mean travel time will increase by 0.69 hours when the number of deliveries increases by 1 delivery. Multiple coefficient of determination, R2 = 0.8173. Interpretation: 81.73% of the variability in sample values of the dependent variable, travel time, is explained by the independent variables, number of deliveries and distance traveled. 48 The Multiple Regression Model (Slide 11 of 11) Figure 7.14: Graph of the Regression Equation for Multiple Regression Analysis with Two Independent Variables © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible
  • 32. website, in whole or in part. With two independent variables x1 and x2, we now generate a predicted value of y for every combination of values of x1 and x2. Thus, instead of a regression line, we now create a regression plane in three-dimensional space. Figure 7.14 provides the graph of the estimated regression plane for the Butler Trucking Company example and shows the seventh driving assignment in the data. As the plane slopes upward to larger values of total travel time () as either the number of miles traveled (x1) or the number of deliveries (x2) increases. The residual for a driving assignment when x1 = 75 and x2 = 3 is the difference between the observed y value and the estimated mean value of y when x1 = 75 and x2 = 3. The observed value lies above the regression plane, indicating that the regression model underestimates the expected driving time for the seventh driving assignment. 49 Inference and Regression Conditions Necessary for Valid Inference in the Least Squares Regression Model Testing Individual Regression Parameters Addressing Nonsignificant Independent Variables Multicollinearity
  • 33. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Inference and Regression (Slide 1 of 17) Statistical inference: Process of making estimates and drawing conclusions about one or more characteristics of a population (the value of one or more parameters) through the analysis of sample data drawn from the population. In regression, inference is commonly used to estimate and draw conclusions about: The regression parameters The mean value and/or the predicted value of the dependent variable y for specific values of the independent variables Consider both hypothesis testing and interval estimation. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 51 Inference and Regression (Slide 2 of 17) Conditions Necessary for Valid Inference in the Least Squares Regression Model: For any given combination of values of the independent variables
  • 34. distributed with a mean of 0 and a constant variance. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Inference and Regression (Slide 3 of 17) Figure 7.15: Illustration of the Conditions for Valid Inference in Regression … Business Analytics © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 35. Time Series Analysis and Forecasting Chapter 8 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Introduction (Slide 1 of 2) Forecasting methods can be classified as qualitative or quantitative. Qualitative methods generally involve the use of expert judgment to develop forecasts. Quantitative forecasting methods can be used when: Past information about the variable being forecast is available. The information can be quantified. It is reasonable to assume that past is prologue. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Introduction (Slide 2 of 2) The objective of time series analysis is to uncover a pattern in
  • 36. the time series and then extrapolate the pattern into the future. The forecast is based solely on past values of the variable and/or on past forecast errors. Modern data-collection technologies have enabled individuals, businesses, and government agencies to collect vast amounts of data that may be used for causal forecasting. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 4 Time Series Patterns Horizontal Pattern Trend Pattern Seasonal Pattern Trend and Seasonal Pattern Cyclical Pattern Identifying Time Series Patterns
  • 37. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Time Series Patterns (Slide 1 of 20) Time series: A sequence of observations on a variable measured at successive points in time or over successive periods of time. The measurements may be taken every hour, day, week, month, year, or any other regular interval. The pattern of the data is important in understanding the series’ past behavior. If the behavior of the times series data of the past is expected to continue in the future, it can be used as a guide in selecting an appropriate forecasting method. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. To identify the underlying pattern in the data, a useful first step is to construct a time series plot, which is a graphical presentation of the relationship between time and the time series variable; time is represented on the horizontal axis and values of the time series variable are shown on the vertical axis. 6 Time Series Patterns (Slide 2 of 20) Horizontal Pattern: Exists when the data fluctuate randomly around a constant mean over time.
  • 38. Stationary time series: It denotes a time series whose statistical properties are independent of time: The process generating the data has a constant mean. The variability of the time series is constant over time. A time series plot for a stationary time series will always exhibit a horizontal pattern with random fluctuations. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Time Series Patterns (Slide 3 of 20) Table 8.1: Gasoline Sales Time SeriesWeekSales (1,000s of gallons)117221319423518616720818922102011151222 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Data in Table 8.1 show the number of gallons of gasoline (in 1000s) sold by a gasoline distributor in Bennington, Vermont, over the past 12 weeks. The average value, or mean, for this time series is 19.25, or 19,250 gallons per week. 8
  • 39. Time Series Patterns (Slide 4 of 20) Figure 8.1: Gasoline Sales Time Series Plot © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure 8.1 shows a time series plot for the data in Table 8.1. Note how the data fluctuate around the sample mean of 19,250 gallons. Although random variability is present, we would say that these data follow a horizontal pattern. 9 Time Series Patterns (Slide 5 of 20) Table 8.2: Gasoline Sales Time Series after Obtaining the Contract with the Vermont State PoliceWeekSales (1,000s of gallons)WeekSales (1,000s of gallons)1171222221133131914344231531518163361617287201 832818193092220291020213411152233 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Table 8.2 shows the number of gallons of gasoline sold for the original time series and the 10 weeks after signing the new
  • 40. contract with the Vermont State Police to provide gasoline for state police cars located in southern Vermont. 10 Time Series Patterns (Slide 6 of 20) Figure 8.2: Gasoline Sales Time Series Plot after Obtaining the Contract with the Vermont State Police © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure 8.2 shows the corresponding time series plot. Note the increased level of the time series beginning in week 13. This change in the level of the time series makes it more difficult to choose an appropriate forecasting method. 11 Time Series Patterns (Slide 7 of 20) Trend Pattern: A trend pattern shows gradual shifts or movements to relatively higher or lower values over a longer period of time. A trend is usually the result of long-term factors such as: Population increases or decreases. Shifting demographic characteristics of the population. Improving technology. Changes in the competitive landscape. Changes in consumer preferences.
  • 41. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Time Series Patterns (Slide 8 of 20) Table 8.3: Bicycle Sales Time SeriesYearSales (1,000s)121.6222.9325.5421.9523.9627.5731.5829.7928.61031. 4 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Table 8.3 shows the time series of bicycle sales for a particular manufacturer over the past 10 years. 13 Time Series Patterns (Slide 9 of 20) Figure 8.3: Bicycle Sales Time Series Plot © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible
  • 42. website, in whole or in part. Figure 8.3 shows the time series of bicycle sales for a particular manufacturer over the past 10 years. Note that 21,600 bicycles were sold in year 1, 22,900 were sold in year 2, and so on. In year 10, the most recent year, 31,400 bicycles were sold. Visual inspection of the time series plot shows some up-and- down movement over the past 10 years. 14 Time Series Patterns (Slide 10 of 20) Table 8.4: Cholesterol Drug Revenue TimesYearRevenue ($ millions)123.1221.3327.4434.6533.8643.2759.5864.4974.21099. 3 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The data in Table 8.4 and the corresponding time series plot in Figure 8.4 show the sales revenue for a cholesterol drug since the company won FDA approval for the drug 10 years ago. The time series increases in a nonlinear fashion; that is, the rate of change of revenue does not increase by a constant amount from one year to the next. In fact, the revenue appears to be growing in an exponential fashion. 15 Time Series Patterns (Slide 11 of 20)
  • 43. Figure 8.4: Cholesterol Drug Revenue Times Series Plot ($ millions) © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The data in Table 8.4 and the corresponding time series plot in Figure 8.4 show the sales revenue for a cholesterol drug since the company won FDA approval for the drug 10 years ago. The time series increases in a nonlinear fashion; that is, the rate of change of revenue does not increase by a constant amount from one year to the next. In fact, the revenue appears to be growing in an exponential fashion. 16 Time Series Patterns (Slide 12 of 20) Seasonal Pattern: Seasonal patterns are recurring patterns over successive periods of time. Example: A retailer that sells bathing suits expects low sales activity in the fall and winter months, with peak sales in the spring and summer months to occur every year. The time series plot not only exhibits a seasonal pattern over a one-year period but also for less than one year in duration. Example: daily traffic volume shows within-the-day “seasonal” behavior, with peak levels occurring during rush hour, moderate flow during the rest of the day, and light flow from midnight to early morning.
  • 44. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Time Series Patterns (Slide 13 of 20) Table 8.5: Umbrella Sales Time SeriesYearQuarterSales11 1252 1533 1064 8821 1182 1613 1334 10231 1382 1443 1134 80 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 18 Time Series Patterns (Slide 14 of 20) Table 8.5: Umbrella Sales Time Series (cont.)YearQuarterSales41 1092 1373 1254 10951 1302 1653 1284 96
  • 45. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 19 Time Series Patterns (Slide 15 of 20) Figure 8.5: Umbrella Sales Time Series Plot © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 20 Time Series Patterns (Slide 16 of 20) Trend and Seasonal Pattern: Some time series include both a trend and a seasonal pattern. Table 8.6: Quarterly Smartphone Sales Time Series YearQuarterSales ($1,000s)114.824.136.046.5215.825.236.847.4
  • 46. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Table 8.6 and Figure 8.6 show quarterly smartphone sales for a particular manufacturer over the past four years. Clearly an increasing trend is present. However, Figure 8.6 also indicates that sales are lowest in the second quarter of each year and highest in quarters 3 and 4. Thus, we conclude that a seasonal pattern also exists for smartphone sales. 21 Time Series Patterns (Slide 17 of 20) Table 8.6: Quarterly Smartphone Sales Time Series (cont.)YearQuarterSales ($1,000s)316.025.637.547.8416.325.938.048.4 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure 8.6 shows quarterly smartphone sales for a particular manufacturer over the past four years. Clearly an increasing trend is present. However, Figure 8.6 also indicates that sales are lowest in the second quarter of each year and highest in quarters 3 and 4. Thus, we conclude that a seasonal pattern also exists for smartphone sales.
  • 47. 22 Time Series Patterns (Slide 18 of 20) Figure 8.6: Quarterly Smartphone Sales Time Series Plot © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly access ible website, in whole or in part. Table 8.6 and Figure 8.6 show quarterly smartphone sales for a particular manufacturer over the past four years. Clearly an increasing trend is present. However, Figure 8.6 also indicates that sales are lowest in the second quarter of each year and highest in quarters 3 and 4. Thus, we conclude that a seasonal pattern also exists for smartphone sales. 23 Time Series Patterns (Slide 19 of 20) Cyclical Pattern: A cyclical pattern exists if the time series plot shows an alternating sequence of points below and above the trendline that lasts for more than one year. Example: Periods of moderate inflation followed by periods of rapid inflation can lead to a time series that alternates below and above a generally increasing trendline. Cyclical effects are often combined with long-term trend effects and referred to as trend-cycle effects.
  • 48. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 24 Time Series Patterns (Slide 20 of 20) Identifying Time Series Patterns: The underlying pattern in the time series is an important factor in selecting a forecasting method. A time series plot should be one of the first analytic tools. We need to use a forecasting method that is capable of handling the pattern exhibited by the time series effectively. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 25 Forecast Accuracy
  • 49. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Forecast Accuracy (Slide 1 of 10) Table 8.7: Computing Forecasts and Measures of Forecast Accuracy Using the Most Recent Value as the Forecast for the Next PeriodWeekTime Series ValueForecastForecast ErrorAbsolute Value of Forecast ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of Percentage Error11722117 4 4 16 19.05 19.0531921 −2 2 4 −10.53 10.5342319 4 4 16 17.39 17.3951823 −5 5 25 −27.78 27.7861618 −2 2 4 −12.50 12.50 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessib le website, in whole or in part. Table 8.7 shows the forecasts for the gasoline time series shown in Table 8.1 using the simplest of all the forecasting methods. We use the most recent week’s sales volume as the forecast for the next week. For instance, the distributor sold 17 thousand gallons of gasoline in week 1; this value is used as the forecast for week 2. Next, we use 21, the actual value of sales in week 2, as the forecast for week 3, and so on. This method is often referred to as a naïve forecasti ng method because of its simplicity.
  • 50. 27 Forecast Accuracy (Slide 2 of 10) Table 8.7: Computing Forecasts and Measures of Forecast Accuracy Using the Most Recent Value as the Forecast for the Next Period (cont.)WeekTime Series ValueForecastForecast ErrorAbsolute Value of Forecast ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of Percentage Error72016 4 4 16 20.00 20.0081820 −2 2 4 −11.11 11.1192218 4 4 16 18.18 18.18102022 −2 2 4 −10.00 10.00111520 −5 5 25 −33.33 33.33122215 7 7 49 31.82 31.82Totals 5 41 179 1.19 211.69 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Table 8.7 shows the forecasts for the gasoline time series shown in Table 8.1 using the simplest of all the forecasting methods. We use the most recent week’s sales volume as the forecast for the next week. For instance, the distributor sold 17 thousand gallons of gasoline in week 1; this value is used as the forecast for week 2. Next, we use 21, the actual value of sales in week 2, as the forecast for week 3, and so on. This method is often referred to as a naïve forecasting method because of its simplicity. 28
  • 51. Forecast Accuracy (Slide 3 of 10) Naïve forecasting method: Using the most recent data to predict future data. The key concept associated with measuring forecast accuracy is forecast error. Measures to determine how well a particular forecasting method is able to reproduce the time series data that are already available. Forecast error. Mean forecast error (MFE). Mean absolute error (MAE). Mean squared error (MSE). Mean absolute percentage error (MAPE). © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicl y accessible website, in whole or in part. Forecast Accuracy (Slide 4 of 10) Forecast Error: Difference between the actual and the forecasted values for period t. Mean Forecast Error: Mean or average of the forecast errors. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible
  • 52. website, in whole or in part. Forecast error: = = actual value = forecasted value Example: Consider Table 8.7. The distributor actually sold 21 thousa nd gallons of gasoline in week 2, and the forecast, using the sales volume in week 1, was 17 thousand gallons. The forecast error in week 2 is = = 21 17 = 4. Mean Forecast Error (MFE): MFE = n = Number of periods in time series k = Number of periods at the beginning of the time series for which we cannot produce a naïve forecast Example: Table 8.7 shows that the sum of the forecast errors for the gasoline sales time series is 5. Thus, the mean or average error is 5/11 = 0.45. 30 Forecast Accuracy (Slide 5 of 10) Mean Absolute Error (MAE): Measure of forecast accuracy that avoids the problem of positive and negative forecast errors offsetting one another. Mean Squared Error (MSE): Measure that avoids the problem of positive and negative errors offsetting each other is obtained by computing the average of the squared forecast errors.
  • 53. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Mean Absolute Error (MAE) It is also referred to as the mean absolute deviation (MAD). Example: Table 8.7 shows that the sum of the absolute values of the forecast errors is 41. Thus MAE = average of the absolute value of the forecast errors = 41/11 = 3.73. MEAN SQUARED ERROR (MSE) Example: From Table 8.7, the sum of the squared errors is 179. Hence, MSE = average of the square of the forecast errors = 179/11 = 16.27. 31 Forecast Accuracy (Slide 6 of 10) Mean Absolute Percentage Error (MAPE): Average of the absolute value of percentage forecast errors. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 54. Mean Absolute Percentage Error (MAPE): The size of MAE or MSE depends upon the scale of the data. As a result, it is difficult to make comparisons for different time intervals (such as comparing a method of forecasting monthly gasoline sales to a method of forecasting weekly sales) or to make comparisons across different time series (such as monthly sales of gasoline and monthly sales of oil filters). To make comparisons such as these, we need to work with relative or percentage error measures. The mean absolute percentage error (MAPE) is such a measure. Example: Table 8.7 shows that the sum of the absolute values of the percentage errors is Thus the MAPE, which is the average of the absolute value of percentage forecast errors, is 211.69/11 = 19.24% 32 Forecast Accuracy (Slide 7 of 10) Table 8.8: Computing Forecasts and Measures of Forecast Accuracy Using the Average of All the Historical Data as the Forecast for the Next PeriodWeekTime Series ValueForecastForecast ErrorAbsolute Value of Forecast ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of Percentage Error 1 17 2 21 17.00 4.00 4.00 16.00 19.05 19.05 3 19 19.00 0.00 0.00 0.00 0.00 0.00 4 23 19.00 4.00 4.00 16.00 17.39 17.39 5 18 20.00 −2.00 2.00 4.00 −11.11 11.11 © 2021 Cengage Learning. All Rights Reserved. May not be
  • 55. scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. We begin by developing a forecast for week 2. Because there is only one historical value available prior to week 2, the forecast for week 2 is just the time series value in week 1; thus, the forecast for week 2 is 17 thousand gallons of gasoline. To compute the forecast for week 3, we take the average of the sales values in weeks 1 and 2. Thus, = (17 + 21)/2 = 19 Similarly, the forecast for week 4 is = (17 + 21 + 19)/3 = 19. Using the results shown in Table 8.8, we obtain the following values of MAE, MSE, and MAPE: MAE = 26.81/11 = 2.44 MSE = 89.07/11 = 8.10 MAPE = 141.34/11 = 12.85% 33 Forecast Accuracy (Slide 8 of 10) Table 8.8: Computing Forecasts and Measures of Forecast Accuracy Using the Average of All the Historical Data as the Forecast for the Next Period (cont.)WeekTime Series ValueForecastForecast ErrorAbsolute Value of Forecast ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of Percentage Error 6 16 19.60 −3.60 3.60 12.96 −22.50 22.50 7 20 19.00 1.00 1.00 1.00 5.00 5.00 8 18 19.14 −1.14 1.14 1.31 −6.35 6.35 9 22 19.00 3.00 3.00 9.00 13.64 13.64
  • 56. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. We begin by developing a forecast for week 2. Because there is only one historical value available prior to week 2, the forecast for week 2 is just the time series value in week 1; thus, the forecast for week 2 is 17 thousand gallons of gasoline. To compute the forecast for week 3, we take the average of the sales values in weeks 1 and 2. Thus, = (17 + 21)/2 = 19 Similarly, the forecast for week 4 is = (17 + 21 + 19)/3 = 19. Using the results shown in Table 8.8, we obtain the following values of MAE, MSE, and MAPE: MAE = 26.81/11 = 2.44 MSE = 89.07/11 = 8.10 MAPE = 141.34/11 = 12.85% 34 Forecast Accuracy (Slide 9 of 10) Table 8.8: Computing Forecasts and Measures of Forecast Accuracy Using the Average of All the Historical Data as the Forecast for the Next Period (cont.)WeekTime Series ValueForecastForecast ErrorAbsolute Value of Forecast ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of Percentage Error 10 20 19.33 0.67 0.67 0.44 3.33 3.33 11 15 19.40 −4.40 4.40 19.36 −29.33 29.33 12 22 19.00 3.00 3.00 9.00 13.64 13.64Totals 4.52 26.81 89.07 2.75 141.34
  • 57. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. We begin by developing a forecast for week 2. Because there is only one historical value available prior to week 2, the forecast for week 2 is just the time series value in week 1; thus, the forecast for week 2 is 17 thousand gall ons of gasoline. To compute the forecast for week 3, we take the average of the sales values in weeks 1 and 2. Thus, = (17 + 21)/2 = 19 Similarly, the forecast for week 4 is = (17 + 21 + 19)/3 = 19. Using the results shown in Table 8.8, we obtain the following values of MAE, MSE, and MAPE: MAE = 26.81/11 = 2.44 MSE = 89.07/11 = 8.10 MAPE = 141.34/11 = 12.85% 35 Forecast Accuracy (Slide 10 of 10) Compare the accuracy of the two forecasting methods by comparing the values of MAE, MSE, and MAPE for each method.Naïve MethodAverage of Past ValuesMAE 3.73 2.44MSE 16.27 8.10MAPE 19.24% 12.85% The average of past values provides more accurate forecasts for the next period than using the most recent observation.
  • 58. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 36 Moving Averages and Exponential Smoothing Moving Averages Exponential Smoothing © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Moving Averages and Exponential Smoothing (Slide 1 of 16) Moving Averages: Moving averages method: Uses the average of the most recent k data values in the time series as the forecast for the next period. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
  • 59. In the formula for moving average forecast, = forecast of the time series for period t + 1 = actual value of the time series in period t k = number of periods of time series data used to generate the forecast To use moving averages to forecast a time series, we must first select the order k. If only the most recent values of the time series are considered relevant, a small value of k is preferred. If a greater number of past values are considered relevant, then we generally opt for a larger value of k. A moving average will adapt to the new level of the series and continue to provide good forecasts in k periods. Thus, a smaller value of k will track shifts in a time series more quickly. On the other hand, larger values of k will be more effective in smoothing out random fluctuations. 38 Moving Averages and Exponential Smoothing (Slide 2 of 16) Table 8.9: Summary of Three-Week Moving Average CalculationsWeekTime Series ValueForecastForecast ErrorAbsolute Value of Forecast ErrorSquared Forecast ErrorPercentage ErrorAbsolute Value of Percentage Error 1 17 2 21 3 19 4 23 19 4 4 16 17.39 17.39 5 18 21 −3 3 9 −16.67 16.67 6 16 20 −4 4 16 −25.00 25.00
  • 60. © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. To illustrate the moving averages method, let us return to the original 12 weeks of gasoline sales data in Table 8.1 and Figure 8.1. We will use a three-week moving average (k = 3). We begin by computing the forecast of sales in week 4 using the average of the time series values in weeks 1 to 3. = average for weeks 1 to 3 = (17 + 21 + 19)/3 = 19 The moving average forecast of sales in week 4 is 19, or 19,000 gallons of gasoline. Because the actual value observed in week 4 is 23, the forecast error in week 4 is e4 = 23 - 19 = 4. We next compute the forecast of sales in week 5 by averaging the time series values in weeks 2 to 4. = average for weeks 2 to 4 = (21 + 19 + 23)/3 = 21 Hence, the forecast of sales in week 5 is 21 and the error associated with this forecast is e5 = 18 - 21 = -3. A complete summary of the three-week moving average forecasts for the gasoline sales time series is provided in Table 8.9. 39 Moving Averages and Exponential Smoothing (Slide 3 of 16) Table 8.9: Summary of Three-Week Moving Average Calculations (cont.)WeekTime Series ValueForecastForecast ErrorAbsolute Value of Forecast ErrorSquared Forecast
  • 61. ErrorPercentage ErrorAbsolute Value of Percentage Error 7 20 19 1 1 1 5.00 5.00 8 18 18 0 0 0 0.00 0.00 9 22 18 4 4 16 18.18 18.18 10 20 20 0 0 0 0.00 0.00 11 15 20 −5 5 25 −33.33 33.33 12 22 19 3 3 9 13.64 13.64Totals 0 24 92 −20.79 129.21 © 2021 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. To illustrate the moving averages method, let us return to the original 12 weeks of gasoline sales data in Table 8.1 and Figure 8.1. We will use a three-week moving average (k = 3). We begin by computing the forecast of sales in week 4 using the average of the time series values in weeks 1 to 3. = average for weeks 1 to 3 = (17 + 21 + 19)/3 = 19 The moving average forecast of sales in week 4 is 19 or 19,000 gallons of gasoline. Because the actual value observed in week 4 is 23, the forecast error in week 4 is e4 = 23 - 19 = 4. We next compute the forecast of sales in week 5 by averaging the time series values in weeks 2 to 4. = average for weeks 2 to 4 = (21 + 19 + 23)/3 = 21 Hence, the forecast of sales in week 5 is 21 and the error associated with this forecast is e5 = 18 - 21 = -3.
  • 62. A complete summary of the three-week moving average forecasts …