PART B:
Quantitative Forecasting Techniques
Ruggero Golini golini@mip.polimi.it
Agenda Part B
• Forecasting models: Direct vs Cross-validation procedure
• Moving Average Forecast with application
• Exponential Smoothing
– Simple Exponential Smoothing with application
– Linear Exponential smoothing
– JMP Download and Installation
– Excel Applications (simple and linear exponential smoothing)
– Winters exponential smoothing
– JMP Tutorial
– 2 JMP Group Exercises
• Launch of assignment 2
• Linear regression with applications
• Autoregressive models with applications
Teaching methodology
• Using MS Excel is not part of the program
• Exercises and applications are usually solved by the teacher; make
sure you can follow along
• Individual exercises are provided (Individual exercises.xlsx) to test
your understanding after the class
• A basic understanding of how to use MS Excel is required, including the
following features:
– Basic operations (copy, paste, formatting,…)
– Basic formulas (sum, dragging formulas,...)
– Matrix formulas
• There is a nice tutorial provided by Microsoft on how to use MS
Excel, please take a look if you need it!
Forecasting models
• Forecast = f (D, a, b, c,…)
– f: function, depends on the type of model
– D: previous demand data
– a, b, c: parameters used to adapt the model
• Building a forecasting model means to:
– Identify a model (f)
– Select the data to be used (D)
– Find the best value of the parameter (a, b, c,...)
How do we select a model? How do we evaluate it?
Direct vs cross-validation
procedure
• Two procedures to make forecasts, applicable to every forecasting
model/method:
1. The direct procedure is simpler and faster
2. The cross-validation procedure is more complex but allows a
better evaluation of the model
Direct procedure (step 1)
Data Set
• Build a model based on the data available (Data Set)
• Change the parameters until you find a good fit (MAD,
BIAS, RMSE, etc.)
Time now
Model
Direct procedure (step 1)
• Quantitative models can have many parameters to “play” with
• DO NOT OVER FIT: your target is to create a model able to interpret reality
and produce good forecasts. A model that perfectly fits history is usually not
good at forecasting.
[Two panels: a model with a good fit and its forecast vs an over-fitted model]
An example of overfitting
Image source: pingax.com
Direct procedure (step 2)
Data Set
• Produce the forecast based on the parameters identified
Time now
Forecast
Model
Forecasting using the moving
average
• It is possible to make forecasts using the right-aligned (trailing) moving
average
• The forecast is the average of the last k periods:
Ft+1 = (1/k) · (Yt + Yt-1 + … + Yt-k+1)
Period Demand MM(3) MM(5) MM(6)
1 105
2 125
3 100
4 105
5 104
6 117
7 104
8 127
9 103
10 101
11 109
12 121
13 129
Forecasting using the moving
average: model
Period Demand MM(3) MM(5) MM(6)
1 105
2 125
3 100
4 105 110.0
5 104 110.0
6 117 103.0 107.8
7 104 108.7 110.2 109.3
8 127 108.3 106.0 109.2
9 103 116.0 111.4 109.5
10 101 111.3 111.0 110.0
11 109 110.3 110.4 109.3
12 121 104.3 108.8 110.2
13 129 110.3 112.2 110.8
Forecasting using the moving
average: forecast
Period Demand MM(3) MM(5) MM(6)
1 105
2 125
3 100
4 105 110.0
5 104 110.0
6 117 103.0 107.8
7 104 108.7 110.2 109.3
8 127 108.3 106.0 109.2
9 103 116.0 111.4 109.5
10 101 111.3 111.0 110.0
11 109 110.3 110.4 109.3
12 121 104.3 108.8 110.2
13 129 110.3 112.2 110.8
14 119.7 112.6 115.0
Forecasting using the moving
average: forecast
• The forecasts for the periods after the first one are kept equal to the
initial forecast (no new demand data are available yet)
Period Demand MM(3) MM(5) MM(6)
1 105
2 125
3 100
4 105 110.0
5 104 110.0
6 117 103.0 107.8
7 104 108.7 110.2 109.3
8 127 108.3 106.0 109.2
9 103 116.0 111.4 109.5
10 101 111.3 111.0 110.0
11 109 110.3 110.4 109.3
12 121 104.3 108.8 110.2
13 129 110.3 112.2 110.8
14 119.7 112.6 115.0
15 119.7 112.6 115.0
16 119.7 112.6 115.0
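The computation above is easy to reproduce in code. Below is a minimal Python sketch (numpy assumed available) of the k-period moving-average forecast, using the demand vector from the table:

```python
import numpy as np

def moving_average_forecast(y, k):
    # One-step-ahead forecast: F[t+1] = average of the last k observations
    y = np.asarray(y, dtype=float)
    return np.array([y[t - k:t].mean() for t in range(k, len(y) + 1)])

demand = [105, 125, 100, 105, 104, 117, 104, 127, 103, 101, 109, 121, 129]
f3 = moving_average_forecast(demand, 3)
print(f3[0])    # 110.0 = forecast for period 4, as in the MM(3) column
print(f3[-1])   # ~119.7 = forecast for period 14, repeated for 15, 16, ...
```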
Forecasting using the moving
average
• How to select the parameter k?
• Same considerations as for the centered moving average:
– Low values (you look only at recent values): more reactive, less
smoothing
– High values (you look at older values too): more smoothing, less
reactive
• If there is seasonality: k = frequency; if there is no seasonality,
you have to find the “best” value of k by testing it on your time
series
• Let’s try…
Exercise B01 Part 1
• Download the file excel: “01B_MovingAverage.xlsx”
• Open the tab: “Direct Procedure”
• Which one of the proposed models (MM4, MM8, MM12) fits best?
Direct procedure (step 3)
Data Set
• Once you have the actual data, you can measure the actual
errors of your model
• If necessary you can adjust the parameters for future
forecasts
Forecast
Time now
Actual
data
Direct procedure (step 4)
Data Set
Re-run the procedure with new data to get a new forecast
Forecast
Time now
Direct Procedure: Limitations
• The limitation of the direct procedure is that it does not test the
validity of the model developed until we get actual data
• The cross-validation procedure allows this testing through a
slightly more complex process
Cross-validation procedure
Data set
Let’s start from the data available at time now: the data set
Time now
Cross-validation procedure
(step 1)
Training
Set
Test Set
Divide the sample in two:
•Training set: to build the model
•Test set: to test the model
Rule of thumb: the test set should cover one full period (e.g., one year)
and usually no more than one third of the entire dataset.
Time now
Cross-validation procedure
(step 2)
Training
Set
• Build your model on the training set.
• Change the parameters until you find a good fit (MAD, BIAS,
RMSE, etc.)
• Remember to not over fit on the training set (as in the direct
procedure)
Time now
Model
Cross-validation procedure
(step 3)
• Make a forecast and test it on the test set
• Check MAD, BIAS, RMSE, etc.
• Now you have two sets of errors
1. Errors on the training set → use them to tune the model
2. Errors on the test set → use them to validate the model
Cross-validation procedure
(step 3)
• If you are not satisfied, change the parameters looking at the
training set
• Do not change the parameters to fit the TEST SET, otherwise you will
get test set over fitting
[Two panels: tuning the model on the training set vs test set over fitting]
Cross-validation procedure
(step 4)
Data Set
Go back to the data set and run the model with all the available
data keeping the parameters you have identified before.
From here on the procedure is identical to the direct
procedure.
Time now
Forecast
Cross-validation procedure
(step 5)
Data Set
Once you have the actual data, you can measure the real errors
of your model.
If necessary you can adjust the parameters.
Forecast
Time now
Actual
data
Cross-validation procedure
(step 6)
Data Set
Re-run the procedure to get a new forecast
Forecast
Time now
Exercise 01B Part 2
• Open the file “01B MovingAverage.xlsx”
• Open tab “Cross Validation”
• Compare the MAD and BIAS between the training and the test set and
check whether MM8 provides a reliable forecast
• Make the final forecast
Conclusions
• Direct procedure: Dataset → tune and run the model → final forecast
• Cross-validation procedure: Dataset → tune the model on the training set → validate the forecast on the test set → run the model on the whole dataset with the same parameters → final forecast
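As a minimal illustration of the two procedures, the Python sketch below tunes the moving-average parameter k on the training set and validates it on the test set. MAD and BIAS are assumed here to be the mean absolute error and the mean signed error, as defined in Part A:

```python
import numpy as np

def mad(e):  return np.mean(np.abs(e))   # mean absolute deviation
def bias(e): return np.mean(e)           # mean signed error

def ma_errors(y, k):
    # one-step-ahead errors of the k-period moving-average forecast
    y = np.asarray(y, dtype=float)
    f = np.array([y[t - k:t].mean() for t in range(k, len(y))])
    return y[k:] - f

def cross_validate(y, k, n_test):
    e_train = ma_errors(y[:-n_test], k)    # tune k on these errors
    e_test = ma_errors(y, k)[-n_test:]     # validate here -- never tune on this!
    return mad(e_train), bias(e_train), mad(e_test), bias(e_test)
```

Comparing training and test errors for several values of k mirrors what the “Cross Validation” tab of the exercise asks for.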
Exponential smoothing
• Limitations of the moving average:
– It gives the same weight (1/k) to all the observations
– The number of observations considered is constant and finite
• What we would like to have:
– Higher weights for newer observations → higher reactivity to
new trends
– At the same time, the ability to consider the entire history (less
recent observations) if it can provide useful information
• Exponential smoothing overcomes these issues:
– It is a weighted moving average where the weights decrease
exponentially
– → All the observations are considered, but newer observations
have higher weights
Simple exponential smoothing
• The basic idea:
– Higher weights to more recent observations
– Exponentially decreasing weight for the other observations
α = weight given to the most recent observation
Simple exponential smoothing
• Building the model is very simple: just choose a weight from 0 to 1
for the most recent observation and the model computes all the other
weights automatically
[Three bar charts of the weights given to observations t-1 … t-6 for α = 0.2, α = 0.5 and α = 0.8]
The higher α, the higher the importance of the last observation
compared to the others
→ the model has little memory for data in the past
Simple exponential smoothing
Ft+1 = α · Yt + (1 − α) · Ft
• The implementation of the model is very simple and based on a
recursive formula
• The forecast is based on the last demand data, weighted by alpha, and
the forecast made for the previous period, weighted by (1 − alpha)
• Remember: alpha is a weight bounded within a 0 to 1 range
– Close to “1”: means to give importance only to the last demand
data (model very reactive on current demand)
– Close to “0”: means to give importance only to the previous
demand data (model little reactive, very history-based)
Yt = last demand data; Ft = previous forecast = summary of the previous demand history
Simple exponential smoothing
• Before producing the final forecast, forecasts for all the previous
periods have to be generated recursively using the same formula
• There is always a first forecast to calculate without a previous
forecast
• Usually the first forecast is set equal to the first observation (naive
method)
– F1 = Y1
– After some periods, the first observation will lose its significance
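A minimal Python sketch of the recursion with the naive initialization F1 = Y1; it reproduces the α = 0.5 column of the example that follows:

```python
import numpy as np

def simple_exp_smoothing(y, alpha):
    # F[t+1] = alpha*Y[t] + (1 - alpha)*F[t], initialized with F[1] = Y[1]
    forecasts = [y[0]]
    for obs in y:
        forecasts.append(alpha * obs + (1 - alpha) * forecasts[-1])
    return forecasts[1:]              # forecasts for periods 2 .. n+1

demand = [105, 125, 100, 105, 104, 117, 104, 127, 103, 101, 109, 121, 129]
f = simple_exp_smoothing(demand, alpha=0.5)
print(f[0], f[1])                     # 105.0, 115.0 as in the table
errors = np.array(demand[1:]) - np.array(f[:-1])
print("MAD:", round(np.mean(np.abs(errors)), 2))
```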
Example
• We have a time series
• For simplicity’s sake, we use
the direct procedure
Period Demand
1 105
2 125
3 100
4 105
5 104
6 117
7 104
8 127
9 103
10 101
11 109
12 121
13 129
14
15
16
Example
1. Define alpha = 0.5
Period Demand
Simple Exp.
Smoothing
1 105
2 125
3 100
4 105
5 104
6 117
7 104
8 127
9 103
10 101
11 109
12 121
13 129
14
15
16
Example
2. Initialize the model
Period Demand
Simple Exp.
Smoothing
1 105 105
2 125
3 100
4 105
5 104
6 117
7 104
8 127
9 103
10 101
11 109
12 121
13 129
14
15
16
Example
3. Calculate the model
for the second period:
F2 = α · Y1 + (1 − α) · F1 = 0.5 · 105 + (1 − 0.5) · 105 = 105
Period Demand
Simple Exp.
Smoothing
1 105 105
2 125 105.00
3 100
4 105
5 104
6 117
7 104
8 127
9 103
10 101
11 109
12 121
13 129
14
15
16
Example
4. Drag down the
formula to calculate the
other values
Period Demand
Simple Exp.
Smoothing
1 105 105
2 125 105.00
3 100 115.00
4 105 107.50
5 104 106.25
6 117 105.13
7 104 111.06
8 127 107.53
9 103 117.27
10 101 110.13
11 109 105.57
12 121 107.28
13 129 114.14
14
15
16
Example
5. Check the fit and
errors Period Demand
Simple Exp.
Smoothing
1 105 105
2 125 105.00
3 100 115.00
4 105 107.50
5 104 106.25
6 117 105.13
7 104 111.06
8 127 107.53
9 103 117.27
10 101 110.13
11 109 105.57
12 121 107.28
13 129 114.14
14
15
16
Example
6. Change alpha if needed (but do not overfit). Always use both
visual analysis of the data and error calculation
Alpha = 0.2 Alpha = 0.9
Which one looks better?
Example
Alpha = 0.2 Alpha = 0.9
MAD = 9.82 MAD = 12.37
6. Change alpha if needed (but do not overfit). Always use both
visual analysis of the data and error calculation
Example
7. We select alpha = 0.2
and to make the forecast
just drag down the
formula one more time
Period Demand
Simple Exp.
Smoothing
1 105 105
2 125 105.00
3 100 109.00
4 105 107.20
5 104 106.76
6 117 106.21
7 104 108.37
8 127 107.49
9 103 111.39
10 101 109.72
11 109 107.97
12 121 108.18
13 129 110.74
14 114.39
15
16
Use of the model
• The simple exponential smoothing
is used to forecast only one
period ahead (if you need more
periods you can use the same
value)
Period Demand
Simple Exp.
Smoothing
1 105 105
2 125 105.00
3 100 109.00
4 105 107.20
5 104 106.76
6 117 106.21
7 104 108.37
8 127 107.49
9 103 111.39
10 101 109.72
11 109 107.97
12 121 108.18
13 129 110.74
14 114.39
15 114.39
16 114.39
Use of the model
• Stationary demand
• Purpose: identify the level (that is, local mean)
• Use of alpha:
– If high variability → lower alpha (0.1-0.3)
– If there are “jumps” in the series → higher alpha (0.6-0.8)
The simple exponential smoothing is not
able to recognize trends and seasonality
Low alpha High alpha
Reference table of weights
given to previous demand data
(previous demand = 100%)
Lag/Alpha 0.90 0.80 0.70 0.60 0.50 0.40 0.30 0.20 0.10 0.05
Long term
24 0% 0% 0% 0% 0% 0% 0% 1% 9% 31%
23 0% 0% 0% 0% 0% 0% 0% 1% 10% 32%
22 0% 0% 0% 0% 0% 0% 0% 1% 11% 34%
21 0% 0% 0% 0% 0% 0% 0% 1% 12% 36%
20 0% 0% 0% 0% 0% 0% 0% 1% 14% 38%
19 0% 0% 0% 0% 0% 0% 0% 2% 15% 40%
18 0% 0% 0% 0% 0% 0% 0% 2% 17% 42%
17 0% 0% 0% 0% 0% 0% 0% 3% 19% 44%
16 0% 0% 0% 0% 0% 0% 0% 4% 21% 46%
15 0% 0% 0% 0% 0% 0% 1% 4% 23% 49%
14 0% 0% 0% 0% 0% 0% 1% 5% 25% 51%
13 0% 0% 0% 0% 0% 0% 1% 7% 28% 54%
Medium
Term
12 0% 0% 0% 0% 0% 0% 2% 9% 31% 57%
11 0% 0% 0% 0% 0% 1% 3% 11% 35% 60%
10 0% 0% 0% 0% 0% 1% 4% 13% 39% 63%
9 0% 0% 0% 0% 0% 2% 6% 17% 43% 66%
8 0% 0% 0% 0% 1% 3% 8% 21% 48% 70%
7 0% 0% 0% 0% 2% 5% 12% 26% 53% 74%
6 0% 0% 0% 1% 3% 8% 17% 33% 59% 77%
Short Term
5 0% 0% 1% 3% 6% 13% 24% 41% 66% 81%
4 0% 1% 3% 6% 13% 22% 34% 51% 73% 86%
3 1% 4% 9% 16% 25% 36% 49% 64% 81% 90%
2 10% 20% 30% 40% 50% 60% 70% 80% 90% 95%
1 100% 100% 100% 100% 100% 100% 100% 100% 100% 100%
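The table can be regenerated with one line of arithmetic: the SES weight on the observation j periods back is α(1−α)^(j−1), so its weight relative to the most recent observation (lag 1 = 100%) is (1−α)^(j−1). A quick sketch:

```python
# Relative weight (in %) of the demand observed j periods ago, lag 1 = 100%
for alpha in (0.9, 0.5, 0.2, 0.05):
    row = {j: round(100 * (1 - alpha) ** (j - 1)) for j in (1, 2, 5, 12, 24)}
    print(alpha, row)   # e.g. alpha = 0.2 -> lag-12 weight ~9%, as in the table
```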
Exercise B02
• Forecast the demand for a double-knit
fabric for the next 12 months
• Use MS Excel
• Use a simple exponential smoothing model
• Use:
1. Direct procedure (first Excel sheet)
2. Cross validation procedure (test set
= 12 months) (second Excel sheet)
Simple Exponential Smoothing
Game
• Divide in groups of 3: 1 computer per group
• Open the file “Exp Smoothing Game Class.xlsx”
• For each period and for each of the 3 products:
1. Write the actual demand provided by the teacher
2. Check the error and the charts (Deck sheet)
3. Select the most appropriate value of alpha
4. Make your forecast for the next period (using direct procedure)
• Two rules
– You cannot change past decisions
– You can change the value of alpha by at most +/- 0.1 per period
(otherwise the cell turns red)
Adaptive Exponential
Smoothing
• Used to dynamically change the value of alpha
• If MAD is increasing → reduce alpha
• If BIAS is increasing → increase alpha
MAD increasing (variability) → decrease alpha
BIAS increasing (trend or jumps) → increase alpha
Adaptive exponential
smoothing
• Improvement of the simple exponential smoothing
• The model becomes more or less reactive according to the evolution
of the demand
• The alpha parameter is dynamically changed according to the error
• A beta parameter sets how fast the alpha
parameter will vary
• Initialization:
– β = 0.2
– F2 = Y1
– A1 = M1 = 0
• Robust algorithm, useful in automated systems
• Alpha is variable, but beta is fixed and it can affect the performance
of the overall model
Ft+1 = αt · Yt + (1 − αt) · Ft
αt = | At / Mt |
At = β · Et + (1 − β) · At-1
Mt = β · |Et| + (1 − β) · Mt-1
Et = Yt − Ft
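A compact Python sketch of this adaptive scheme, assuming the Trigg–Leach-style form reconstructed above and the slide's initialization (F2 = Y1, A1 = M1 = 0):

```python
import numpy as np

def adaptive_exp_smoothing(y, beta=0.2):
    # alpha[t] = |A[t]/M[t]|: smoothed signed error over smoothed absolute error
    y = np.asarray(y, dtype=float)
    f, A, M = [y[0]], 0.0, 0.0               # F[2] = Y[1], A[1] = M[1] = 0
    alphas = []
    for t in range(1, len(y)):
        e = y[t] - f[-1]                      # E[t] = Y[t] - F[t]
        A = beta * e + (1 - beta) * A         # tracks the BIAS
        M = beta * abs(e) + (1 - beta) * M    # tracks the MAD
        alpha = abs(A / M) if M > 0 else beta
        alphas.append(alpha)
        f.append(alpha * y[t] + (1 - alpha) * f[-1])
    return np.array(f), np.array(alphas)
```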
Adaptive exponential
smoothing
[Chart: monthly demand, Jan 1990 – Sep 2002 (range 0–5,000), the adaptive-smoothing forecast, and the evolution of alpha between 0 and 1]
Linear exponential smoothing
(Holt)
• Evolution of the simple exponential smoothing to consider the trend
(Holt, 1957)
– Lt is the level of the series
– bt is the slope of the series
– Alpha: sets the reactivity of the level (as in the simple
smoothing)
– Beta: sets the reactivity of the underlying trend
Lt = α · Yt + (1 − α) · (Lt-1 + bt-1)
bt = β · (Lt − Lt-1) + (1 − β) · bt-1
Ft+m = Lt + bt · m
Linear exponential smoothing
(Holt)
• The model can adapt to changes in the trend according to beta: the
higher the beta, the higher the reactivity to the recent trend
• The factor m is used to project the trend
– It is always 1 while there are demand data, then it increases by 1
(2, 3, 4, …) for the forecast periods
• Limitations: the trend is hypothesized to be linear
Building the model
• Initial level value
– L1 = Y1 (level equal to the demand)
• Initial trend value (different approaches)
– b1 = Y2-Y1 (OK if the values are not too different)
– b1 = (Y4 – Y1)/3 (data every four months)
– b1 = (Y13 – Y1)/12 (data per month)
• Alpha and beta have to be set through the visual analysis of the
series and iteration (remember to not overfit) to get the optimal
model
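A minimal Python sketch of the Holt recursion with the initialization above (L1 = Y1, b1 = Y2 − Y1); it reproduces the worked example on the next slide:

```python
import numpy as np

def holt(y, alpha, beta, horizon=3):
    # Linear exponential smoothing: level L, trend b, forecast F[t+m] = L + b*m
    y = np.asarray(y, dtype=float)
    level, trend = y[0], y[1] - y[0]          # L1 = Y1, b1 = Y2 - Y1
    fitted = []
    for obs in y[1:]:
        fitted.append(level + trend)           # one-step-ahead forecast (m = 1)
        new_level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return fitted, [level + trend * m for m in range(1, horizon + 1)]

y = [105, 107, 110, 111, 112, 114, 113, 114, 116, 118, 121, 124]
fitted, forecast = holt(y, alpha=0.1, beta=0.1)
print([round(f, 2) for f in forecast])   # ~126.26, 128.07, 129.87 as in the example
```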
Building the model
• The procedure is similar to the one followed for the simple
exponential smoothing (see example next slide)
Example (alpha = beta = 0.1)
Yt Ft Lt bt m
105 - 105.00 2.00 -
107 107.00 107.00 2.00 1
110 109.00 109.10 2.01 1
111 111.11 111.10 2.01 1
112 113.11 113.00 2.00 1
114 114.99 114.90 1.99 1
113 116.88 116.49 1.95 1
114 118.44 118.00 1.90 1
116 119.90 119.51 1.87 1
118 121.38 121.04 1.83 1
121 122.87 122.69 1.81 1
124 124.50 124.45 1.81 1
126.26 1
128.07 2
129.87 3
Forecast
= 124.45 + 1.81 * 1
= 124.45 + 1.81 * 2
= 124.45 + 1.81 * 3
1. Initialization
5. Calculate the forecast
using the last data
available for L and b
3. Drag down
2. Create the first row
using formulas (in
order: L, b, F)
4. Check errors
and set alpha and
beta
Lt = α · Yt + (1 − α) · (Lt-1 + bt-1)
bt = β · (Lt − Lt-1) + (1 − β) · bt-1
Ft+m = Lt + bt · m
Example (alpha = beta = 0.1)
Use of the model
• Use it for series with no seasonality
• The use of alpha is the same as in the simple exponential smoothing
(variability vs jumps)
• Use of beta:
– If long-term trend: low beta (0.1-0.3)
– If short-term trend: high beta (0.6-0.8) → be careful: with a high beta
the model could mistake random oscillations for trends!
Low beta
High beta
Exercise B03
• Provide a forecast for the
following 12 months for the flight
traffic
• Use MS Excel
• Use a linear exponential
smoothing
1. Direct procedure (first Excel
sheet)
2. Cross validation procedure
(test set = 12 months) (second
Excel sheet)
Group exercise
Download and install
SAS JMP 10
• http://www.jmp.com/landing/jmp_trial.shtml?ref=hp_visual
• Register on the website and download the 30-day trial
Winters exponential
smoothing
• Evolution of the Holt model (Winters 1960) that considers also seasonality
– Lt is the level of the series
– bt is the slope of the series
– St is the seasonality of the series
– Alpha: sets the reactivity of the level (as in the simple smoothing)
– Beta: sets the reactivity of the underlying trend (as in the linear
exponential smoothing)
– Gamma: sets reactivity of the seasonality
Lt = α · (Yt / St-s) + (1 − α) · (Lt-1 + bt-1)
bt = β · (Lt − Lt-1) + (1 − β) · bt-1
St = γ · (Yt / Lt) + (1 − γ) · St-s
Ft+m = (Lt + bt · m) · St-s+m
Characteristics of the model
• “s” is the seasonality period and it is hypothesized to be known and
constant (e.g. 12 months)
• Limited time horizon (trend and seasonality are supposed to be
constant in the future)
• Trend is hypothesized linear but the model can follow changes in the
trend (as in the linear exponential smoothing)
• Seasonality is multiplicative
– One coefficient for each period
– The coefficient can change over time
Building the model
• Initialization: three initial values (L1, b1, S1)
• Two full seasons are needed to calculate coefficients
• The optimal model has to be found changing alpha, beta and gamma
Ls = (Y1 + Y2 + … + Ys) / s
bs = (1/s) · [ (Ys+1 − Y1)/s + (Ys+2 − Y2)/s + … + (Ys+s − Ys)/s ]
S1 = Y1 / Ls ; S2 = Y2 / Ls ; … ; Ss = Ys / Ls
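Putting the update equations and this initialization together, a sketch of the multiplicative Winters model might look like the following (the function name is illustrative; simplified initialization from the first two seasons, forecast horizon up to one season):

```python
import numpy as np

def winters(y, s, alpha, beta, gamma, horizon):
    # Multiplicative Holt-Winters: level L, trend b, seasonal indices S
    y = np.asarray(y, dtype=float)
    level = y[:s].mean()                          # Ls = mean of the first season
    trend = (y[s:2 * s].mean() - y[:s].mean()) / s
    season = list(y[:s] / level)                  # Si = Yi / Ls
    for t in range(s, len(y)):
        s_old = season[t - s]
        new_level = alpha * y[t] / s_old + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season.append(gamma * y[t] / new_level + (1 - gamma) * s_old)
        level = new_level
    n = len(y)
    # F[t+m] = (L[t] + b[t]*m) * S[t-s+m], for m = 1 .. horizon (horizon <= s)
    return [(level + trend * m) * season[n - 1 - s + m] for m in range(1, horizon + 1)]
```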
Use of the model
• The use of alpha is the same as in the simple exponential smoothing
(variability vs jumps)
• The use of beta is the same as in the linear exponential smoothing (long-term
trend vs short-term trend)
• Use of gamma
– If constant long term seasonality: low gamma (0.1-0.3)
– If varying seasonality from year to year: high gamma (0.6-0.8)
Low gamma
High gamma
In conclusion
Data characteristics         | Simple | Linear | Winters
Stationary (no seasonality)  |   ✓    |        |
Trend                        |        |   ✓    |
Trend + Seasonality          |        |        |   ✓
Do not use a more complicated model if it is not the
case, otherwise you will be likely to over fit
Do not use a simple model if it is not the case,
otherwise you will have a poor fit
An example
Quarterly data
Which model would you use? How would you
set the parameters?
An example
Quarterly data
Alpha: low to high
Beta: low
Gamma: medium to high
Quantitative forecasting techniques
Ruggero Golini ruggero.golini@polimi.it
Applications
SAS JMP
• Repeat Exercise B02 using SAS JMP
• Using the direct procedure, perform simple, linear and Winters
exponential smoothing and make a forecast for the next year
• Finally, use the cross-validation procedure for the Winters model
Guided exercise
Exercise B04
• Monthly Australian sales of red wine: thousands of liters Jan 1980 -
Jul 1995
• Forecast: 12 months
• Make the best forecast you can!
• Use SAS JMP
Group exercise
Exercise B05
• World crude oil production
• Make the best forecast you can!
• Use SAS JMP
Group exercise
Assignment 2
• File: Assignment 2.xls
• 3 real time series
– Number of Google searches for a keyword (normalized to 100)
– Global production of an agricultural product
– Retail sales of a specific goods category in a specific country
• Objective: produce the best forecasts you can for the required
periods (see excel file > ‘Forecast’ sheet)
• Your forecast will be compared to real data to find who got the best
forecast in terms of MAD and BIAS
Assignment 2
• Try different models (moving average or one of the exponential
smoothing) and decide which is best
• Use the cross-validation technique to set the parameters and
evaluate your forecast
• Remember to avoid over-fit
• You can use Excel or JMP
• Send an email to golini@mip.polimi.it and to
sciacovelli@mip.polimi.it with your assignment:
– 3 slides (1 slide per series) for the presentation of your solution. For
each series report the model that you used, the value of the parameters
and the motivation
– Excel file with main calculations and forecasts (fill the provided excel
file > ‘Forecast’ sheet)
– Report has to be submitted by email to golini@mip.polimi.it and to
sciacovelli@mip.polimi.it by May 18th
REGRESSION ANALYSIS
Introduction
• Time-based models (e.g. demand decomposition, moving average,
exponential smoothing) are based on the following hypothesis:
t
Past
demand
Future
demand
explains
What are the assumptions?
Assumptions of time-based
models
• There are past data → might not be the case for new products
• There are regular patterns (trend, seasonality) → not always the case (e.g.,
stock prices)
• The demand is quite disconnected from the environment → might not be the
case for many products subject to shocks (e.g., events, promotions, …)
The Great Gatsby
Google Search
Explanatory models
• When the assumptions are not met, we can try to introduce external
variables (drivers) to explain future demand (explanatory models)
• Examples:
Driver Predicted outcome
Price Sales
Early sales Total Sales
Gross Domestic Product of a country Total demand
Promotion Temporary increase of the sales
… …
Issue: we need information about the
drivers!
Regression analysis
• The regression analysis is a type of explanatory model
• The idea is to determine coefficients (a1, a2, a3, …) associated with
drivers (X1, X2, X3, …) able to explain the future demand (Y)
• We will hypothesize linear relationships between the drivers and
demand  linear regression
Drivers (X) Demand (Y)
X1
X2
X3
Y
a1
a3
a2
Simple vs Multiple Linear
Regression
Drivers (X) Demand (Y)
X1
X2
X3
Y
a1
a3
a2
Drivers (X) Demand (Y)
X1 Y
a1
Simple Linear
Regression:
1 Driver
Multiple Linear
Regression:
>1 Drivers
Example
Year
CM of
rain
Umbrella
sales
1990 58 59044
1991 58 62060
1992 80 85600
1993 70 65450
1994 83 77107
1995 91 90909
1996 69 66033
1997 63 65268
1998 63 64638
1999 56 55216
2000 86 85140
2001 86 83334
2002 92 84548
2003 58 52490
2004 91 85813
2005 78 78624
2006 93 83700
2007 61 55876
Drivers (X) Demand (Y)
Cm of
rain
Umbrella
sale
a1
only 1 driver → SIMPLE LINEAR REGRESSION
Example
Year
CM of
rain
Umbrella
sales
1990 58 59044
1991 58 62060
1992 80 85600
1993 70 65450
1994 83 77107
1995 91 90909
1996 69 66033
1997 63 65268
1998 63 64638
1999 56 55216
2000 86 85140
2001 86 83334
2002 92 84548
2003 58 52490
2004 91 85813
2005 78 78624
2006 93 83700
2007 61 55876
[Dispersion diagram: cm of rain (x axis, 50–100) vs umbrella sales (y axis, 4,000–104,000)]
Key tool: dispersion
diagram
Examples
Good Good
Bad
Bad
An example
Year
CM of
rain
Umbrella
sales
1990 58 59044
1991 58 62060
1992 80 85600
1993 70 65450
1994 83 77107
1995 91 90909
1996 69 66033
1997 63 65268
1998 63 64638
1999 56 55216
2000 86 85140
2001 86 83334
2002 92 84548
2003 58 52490
2004 91 85813
2005 78 78624
2006 93 83700
2007 61 55876
y = 883.89x + 6665.5
[Dispersion diagram with the fitted regression line]
Simple Linear Regression
Y = a1 * X1 + b
Interpretation: if there were 0 cm of rain, we would still sell 6,665 umbrellas;
for each additional cm of rain we sell 883 more umbrellas
Simple Linear Regression
• A linear relationship between Y (dependent variable) and only one
driver (independent variable)
• The output is the equation of a straight line:
Y = a1 * X1 + b
– a1 is the coefficient associated with X1 → for a unit increase of X1
the demand increases by a1 (example: for each additional cm of rain we sell
883 more umbrellas)
– b is the intercept with the Y axis → the baseline demand when
X1 is 0 (example: even if it does not rain, we sell 6,665 umbrellas)
Regression is the attempt to explain the variation in a dependent
variable using the variation in independent variables.
Regression is thus an explanation of causation.
If the independent variable(s) sufficiently explain the variation in the
dependent variable, the model can be used for prediction.
[Diagram: drivers (x) → dependent variable (y)]
Simple Linear Regression
[Scatter plot: dependent variable vs independent variable (x), with the fitted line]
The function will make a prediction for each observed data point.
The observation is denoted by y and the prediction is denoted by ŷ
The difference between y and ŷ is the prediction error
Simple Linear Regression
[Figure: the prediction error is the vertical distance between the observation y and the prediction ŷ]
A least squares regression selects the line with the lowest total sum of
squared prediction errors.
This value is called the Sum of Squares of Error, or SSE.
Simple Linear Regression
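A sketch of the least-squares fit on the umbrella data, using numpy (np.polyfit minimizes exactly this SSE); it should reproduce the line and R2 shown on the surrounding slides:

```python
import numpy as np

rain  = np.array([58, 58, 80, 70, 83, 91, 69, 63, 63, 56, 86, 86, 92, 58, 91, 78, 93, 61])
sales = np.array([59044, 62060, 85600, 65450, 77107, 90909, 66033, 65268, 64638,
                  55216, 85140, 83334, 84548, 52490, 85813, 78624, 83700, 55876])

a1, b = np.polyfit(rain, sales, 1)        # least-squares line (minimizes SSE)
pred = a1 * rain + b
sse = np.sum((sales - pred) ** 2)         # Sum of Squares of Error
r2 = 1 - sse / np.sum((sales - sales.mean()) ** 2)
print(f"y = {a1:.2f}x + {b:.1f},  R2 = {r2:.2f}")
print("Forecast for 110 cm of rain:", round(a1 * 110 + b, 1))
```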
Simple linear regression
The coefficient a1 can also be negative
Week Sales Price
1 10 € 1.30
2 6 € 2.00
3 5 € 1.70
4 12 € 1.50
5 10 € 1.60
6 15 € 1.20
7 5 € 1.60
8 12 € 1.40
9 17 € 1.00
10 20 € 1.10
Average 11.2 € 1.44
y = -14.539x + 32.136
[Dispersion diagram: price (€0.00–€2.50) vs sales (0–25) with the fitted line]
Interpretation: if the price were 0, we would sell 32 products;
for every additional euro we sell 14.5 fewer products
Question:
Why should the coefficient b not be negative?
R2 indicator
• A regression line can have a good or a bad fit
• To evaluate this fit the R2 indicator is used
– R2 is the percentage of variability explained by the model
– R2 near 100%: good fit
– R2 near 0%: poor fit (the data are not related to each other)
R2 = 45%
POOR FIT
R2 = 99%
GOOD FIT!
Use the linear regression
to make forecasts
Year
CM of
rain
Umbrella
sales
1990 58 59044
1991 58 62060
1992 80 85600
1993 70 65450
1994 83 77107
1995 91 90909
1996 69 66033
1997 63 65268
1998 63 64638
1999 56 55216
2000 86 85140
2001 86 83334
2002 92 84548
2003 58 52490
2004 91 85813
2005 78 78624
2006 93 83700
2007 61 55876
y = 883.89x + 6665.5
[Dispersion diagram with the fitted regression line]
Back to the umbrella example. We want to
open a new branch in India where there
are 110 cm of rain per year. What is the
predicted demand?
We just apply the formula:
Y = a1*X1 + b where X1 is the cm of rain in India
Use the linear regression
to make forecasts
y = 883.89x + 6665.5
[Dispersion diagram extended to 110–130 cm of rain]
Solution:
883.89 · 110 + 6665.5 = 103,893.4
Use the linear regression
to make forecasts
Week Sales Price
1 10 € 1.30
2 6 € 2.00
3 5 € 1.70
4 12 € 1.50
5 10 € 1.60
6 15 € 1.20
7 5 € 1.60
8 12 € 1.40
9 17 € 1.00
10 20 € 1.10
Average 11.2 € 1.44
y = -14.539x + 32.136
[Dispersion diagram with the fitted line]
How much do we expect to sell with a price of 1.30?
Use the linear regression
to make forecasts
Week Sales Price
1 10 € 1.30
2 6 € 2.00
3 5 € 1.70
4 12 € 1.50
5 10 € 1.60
6 15 € 1.20
7 5 € 1.60
8 12 € 1.40
9 17 € 1.00
10 20 € 1.10
Average 11.2 € 1.44
y = -14.539x + 32.136
[Dispersion diagram with the fitted line]
How much do we expect to sell with a price of 1.30?
Solution: -14.539 * 1.30 + 32.136 = 13.23
Use the simple regression as a
forecasting tool
1. Set-up the model
– Decide which is your dependent Y variable (sales, demand,
etc.)
– Decide which is your X1 driver (price, market drivers, etc.)
2. Calculate the coefficients (a1 and b) using software and check R2 (if
it is not good, change the driver)
3. Identify the future value of the selected driver (may require a
forecast!)
4. Perform the forecast
Simple Regression in Excel
• Display data in a scatter-diagram
• Right-click on the series → add trend line
Recommendations for simple
linear regression
• Simple linear regression can be used even if you do not have deep
statistical competences
• The trade-off is that you can choose only one driver
• It is always best to use the driver that is most theoretically correlated
with the demand (cm of rain → umbrellas)
• But the driver should also be independent from the demand (do not
use cm of rain to predict days of rain)
• If you are undecided among different drivers, you can run several
models and pick the one with the highest R2
• Be careful: regression is sensitive to outliers, always inspect your
data and remove outliers!
Exercise B06
• Open the file: “06B Regression Exercise 1.xlsx”
• How many cranes do we expect to sell in 2001, when a GDP growth of
7.5% is expected?
• What other model could you have used? What would have been the
outcome?
Exercise B07 (early sales)
• Fashion companies cannot rely on historical data because their
products change every season
• At the beginning of the season there is high uncertainty and the risk
of overproducing if the product is a flop, or underproducing if the
product is a success
• One option is to look at the early sales (sales at the beginning of the
season) to forecast the total sales
Exercise B07 (early sales)
• Every six months (season) a fast-fashion company launches ten types
of new t-shirts while the old ones are removed from the market
• For each t-shirt two data are registered:
– Total sales at the end of the season
– Early sales after 2 months
• The company is trying to understand if it is possible to forecast the
total sales given a certain amount of early sales
Season Product Total Sales Early Sales
1 T-shirt - 39662 21 7
1 T-shirt - 73131 21 7
1 T-shirt - 97299 49 49
1 T-shirt - 81181 57 57
1 T-shirt - 63623 77 25
1 T-shirt - 14358 84 84
…… … …
Exercise B07 (early sales)
• Consider the file “07B Exercise Early Sales.xls”
• Using the software perform a simple regression to evaluate the
relationship between early sales and total sales
• What will the total sales be for a product that had 1,230 early sales?
• Is there any difference between Fashionable and Non Fashionable
products?
Exercise B08
• Regression analysis can also be used to predict the effect of a
promotion
• Simply indicate with “1” when a promotion took place or “0” if in
that period there was no promotion
• Perform a simple regression between the total sales and the new 0/1
variable
• The a1 coefficient will be the average increase of sales due to the
promotion
• The b coefficient (the intercept) will be the average level of sales
excluding the promotions
• Reversing the formula (subtracting a1 from the sales whenever
a promotion takes place) you can then clean your data from the effect of
the promotion
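A sketch of the procedure with a hypothetical series (the sales and promotion flag below are illustrative, not the exercise data). With a 0/1 driver, the slope is simply the average sales difference between promotion and non-promotion periods:

```python
import numpy as np

sales = np.array([100, 104,  98, 150, 102, 155,  99, 101, 148, 103], dtype=float)
promo = np.array([  0,   0,   0,   1,   0,   1,   0,   0,   1,   0], dtype=float)

a1, b = np.polyfit(promo, sales, 1)   # a1 = average promotion uplift, b = baseline
print(f"baseline = {b:.1f}, promotion effect = {a1:.1f}")

cleaned = sales - a1 * promo          # remove the promotion effect before smoothing
```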
Exercise B08
• Open File “Regression Exercise 3.xlsx”
• Set up the 0/1 promotion variable
• Calculate the average effect of a promotion
• Clean your data
• Perform a forecast of 3 periods ahead using simple exponential
smoothing
Multiple linear regression
• Simple: a relationship between Y (dependent variable) and X
(independent variable)
Y = a1 * X1 + b
• Multiple: a relationship between Y (dependent variable) and several
Xi (independent variables)
Y = a1 * X1 + a2 * X2 + … + b
Multiple linear regression
Sales of cars (units) | GDP per capita (€) | Marketing actions (€)
1293 16201 15348
1891 27287 22579
1958 22811 26120
1989 25559 28841
1507 17436 18084
1306 17866 19172
1307 14834 18376
1917 27662 24499
1866 19332 20937
1926 25885 26868
1139 16834 13247
1833 21318 18697
1254 17105 17368
1791 21008 19665
1919 19823 20610
1128 12070 13965
Multiple linear regression
Both drivers seem to have a relation with sales.
The results of the linear regression are:
Y = 0.034 * GDP + 0.033 * Marketing + 276.72
[Two scatter plots: sales of cars vs GDP per capita, and sales of cars vs marketing actions]
Attention! The coefficients are not the result of two separate regressions:
they are estimated simultaneously
Use the linear regression
to make forecasts
• Back to the car sales example based on GDP and Marketing effort
• The regression equation was:
Y = 0.034 * GDP + 0.033 * Marketing + 276.72
• How many cars would we sell if GDP next year is €30,000 per capita and
we plan to invest €25,000 in marketing?
• Solution:
0.034 * 30000 + 0.033 * 25000 + 276.72 = 2121.72
Recommendations for
multiple linear regression
• Multiple linear regression allows you to use several drivers
simultaneously
• However, there are some statistical traps, so it is suggested to ask for
the help of an expert
• Some suggestions:
• Start with few, theoretically and statistically uncorrelated
drivers (for instance: do not use days of rain AND days of sun to
predict umbrella sales, but days of rain AND people's income)
• Check if R2 is significant
• You can add other drivers, one at a time, but always check that
they are not correlated to the previous drivers
• Check if R2 has significantly increased, if not, drop the new
driver (it is better to have fewer drivers in the model)
Multiple Regression in Excel
• Select a range
• Function LINEST
• Inputs:
– Y as a vector;
– X as a matrix;
– True (otherwise b = 0);
– True (to have statistics)
• Ctrl+Shift+Enter to enter the function in each cell of the result array
• Outputs in matrix form
an        …  a1         b
st.err an …  st.err a1  st.err b
R2           st.err y
F            df
SSreg        SSresid
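Outside Excel, the same simultaneous estimation can be sketched with numpy's least-squares solver on the car-sales data; the coefficients should come out close to the slide's 0.034, 0.033 and 276.72:

```python
import numpy as np

gdp = np.array([16201, 27287, 22811, 25559, 17436, 17866, 14834, 27662,
                19332, 25885, 16834, 21318, 17105, 21008, 19823, 12070], dtype=float)
mkt = np.array([15348, 22579, 26120, 28841, 18084, 19172, 18376, 24499,
                20937, 26868, 13247, 18697, 17368, 19665, 20610, 13965], dtype=float)
cars = np.array([1293, 1891, 1958, 1989, 1507, 1306, 1307, 1917,
                 1866, 1926, 1139, 1833, 1254, 1791, 1919, 1128], dtype=float)

X = np.column_stack([gdp, mkt, np.ones_like(gdp)])   # intercept column, like LINEST
coef, *_ = np.linalg.lstsq(X, cars, rcond=None)      # simultaneous estimation
a_gdp, a_mkt, b = coef
print(round(a_gdp, 3), round(a_mkt, 3), round(b, 1))
print("Forecast:", round(a_gdp * 30000 + a_mkt * 25000 + b, 1))
```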
Exercise B09
• Open the file “09B Exercise Multiple Regression.xlsx”
• The dataset represents number of cold organic fruit juices sold
• The product is still quite new for the market, so the sales have high
fluctuations
• Also, the price set by the company can change significantly
according to the cost of the raw materials (fruits)
• Customers can order in larger or smaller batches
• Provide a forecast for the number of orders and products sold
considering that for the next week these values are expected
Price 9
Promotions (1 =Yes) 1
Holidays during the week(1=Yes) 0
Full working week (1=Yes) 1
External Temperature (Celsius degrees) 10
Economic outlook (10=very good; 1=very bad) 7
Autoregressive (AR) models
Notation
• Demand for period t: Yt
• Forecast horizon: m
• Forecast estimated at period t for period t+m: Ft+m
[Timeline: at time t, the forecasts Ft+1, Ft+2, …, Ft+m cover the next m periods]
The idea behind AR Models
• Simple regression model
Yt = b0 + b1X1 + et
• Multivariate regression model
Yt = b0 + b1X1 + b2X2 + … + bkXk + et
• Multivariate regression based on previous observations
Yt = b0 + b1Yt-1 + b2Yt-2 + … + bkYt-k + et
The idea behind AR
• We will focus on its application to stationary, non-seasonal series,
but this limitation can be easily removed
• Context of use: mean reverting time series
• Example: drunk man trying to walk on a straight line
Signal
(Mean)
Demand: signal +
disturbance
Which one is a mean reverting
time series?
What is the difference in
these two series?
Mean reverting
(AR model)
White noise
(random number in
a range +200 ; +100)
Mean reverting time series
• It can be difficult to assess graphically
• It’s more likely that the next period will be closer to the mean than
the period before
• But the path that brings back to the mean can be complex
• We can assess this through the autocorrelogram (see Part A slides)
and partial-autocorrelogram
Autocorrelation function
(ACF)
• Provides a synthetic
view of autocorrelations
• Useful tool to identify non-stationarity factors
[Autocorrelogram (ACF): lags 0–14, autocorrelations declining from 1.00 at lag 0 to about −0.40 at lag 13]
Partial Autocorrelation (PACF)
• If we have a regression relationship between Y and X1
and X2, it may be interesting to evaluate how much X1
explains of what X2 cannot explain
Yt = b0 + b1X1 + b2X2 + et
• Partial autocorrelation evaluates the correlation between Yt and Yt-k
once the effect of lags 1, 2, 3, …, k-1 has been eliminated: how much
does Yt-k contribute to explaining Yt?
Partial Autocorrelation (PACF)
• Evaluated by means of regression
Yt = b0 + b1Yt-1 + b2Yt-2 + … + bkYt-k
[Partial autocorrelogram (PACF): lags 0–13]
Time series 1 (Mean
reverting)
Time series 2 (White noise)
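For reference, both functions can be computed directly. A small numpy sketch follows, with the PACF at lag k estimated by regression exactly as described above; as a rule of thumb, bars beyond roughly ±2/√n are conventionally considered significant (the blue lines in JMP's plots):

```python
import numpy as np

def acf(y, max_lag):
    # sample autocorrelations r(1) .. r(max_lag)
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.sum(y * y)
    return [np.sum(y[k:] * y[:-k]) / denom for k in range(1, max_lag + 1)]

def pacf_at_lag(y, k):
    # coefficient of Y[t-k] in a regression of Y[t] on Y[t-1] .. Y[t-k]
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[k - j:len(y) - j] for j in range(1, k + 1)])
    X = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return coef[k - 1]
```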
Revert-to-mean models
• A first order AR(1) model can be described as:
• Ŷt = μ + ϕ1Yt-1
• With |ϕ1|< 1
• If ϕ1 is positive the series tends to stay for several periods above and
then below the average
• If ϕ1 is negative the series alternates up and down around the average →
demand will tend to be below the mean next period if it is above the mean
this period
Time series 1 (Mean
reverting)
• We can therefore apply an AR(1) model, for instance in JMP
• We obtain a forecast (green line) that converges to the mean
[Chart: the series (values roughly 250–390 over about 100 periods) with the AR(1) forecast converging to the mean]
Ŷt = 332.506 + 0.646 * Yt-1
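Note that JMP typically parameterizes the intercept as the series mean; assuming that mean-centered form here (μ ≈ 332.5, ϕ1 = 0.646 from the slide's fit), the m-step forecast decays geometrically back to the mean:

```python
def ar1_forecast(last_y, mu, phi, horizon):
    # Mean-centered AR(1): F[t+m] - mu = phi**m * (Y[t] - mu)
    return [mu + phi ** m * (last_y - mu) for m in range(1, horizon + 1)]

print([round(f, 1) for f in ar1_forecast(380, 332.5, 0.646, 5)])
# [363.2, 352.3, 345.3, 340.8, 337.8] -> converging to 332.5
```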
AR(1) Models with negative ϕ1
AR(1) Models with negative ϕ1
• We can still apply an AR(1) model
Ŷt = 58.76 - 0.69 * Yt-1
AR(2) Model
Revert-to-mean models
• A second-order AR(2) model can be described as:
• Ŷt = μ + ϕ1Yt-1 + ϕ2Yt-2
AR(2) Model
Ŷt = 200.85 + 0.281 * Yt-1 + 0.316 * Yt-2
How to detect AR(x) models
• 2 conditions:
1. The ACF decreases exponentially (possibly oscillating)
2. The PACF is significant up to lag “x”
In the example we have an AR(3) model, as the PACF is significant up to
lag 3 (values above the blue line)
Exercise 10B
• Use JMP
• Open the file “10B Autoregressive.xlsx”, identify the model and
perform the forecast
Differentiation
• First- and second-order differentiation remove the trend component
• First and second seasonal differentiation remove the seasonality
Gaining stationarity
• A possible way is to differentiate
• The 1st difference series evaluates the change between
two subsequent observations in the original series
Y't = Yt − Yt-1
Now the
series is
stationary
Before differentiation
After differentiation
Differentiation
• Allows us to get rid of non-stationary elements
• We can have 2nd-order differentiation
• We can also have seasonal differentiation
Y''t = Y't − Y't-1 = (Yt − Yt-1) − (Yt-1 − Yt-2) = Yt − 2·Yt-1 + Yt-2
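These operations are one-liners in code; a sketch on a synthetic trending, seasonal series (the series itself is illustrative):

```python
import numpy as np

t = np.arange(120)
y = 100 + 2 * t + 10 * np.sin(2 * np.pi * t / 12)   # trend + 12-period seasonality

d1 = np.diff(y)           # Y'[t] = Y[t] - Y[t-1]: removes the (linear) trend
d2 = np.diff(y, n=2)      # Y''[t] = Y[t] - 2Y[t-1] + Y[t-2]
ds = y[12:] - y[:-12]     # seasonal difference Y[t] - Y[t-s]: removes the seasonality
ds1 = np.diff(ds)         # seasonal + first difference, as in the electricity example
```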
Seasonal Differentiation
• Defined as:
Y't = Yt − Yt-s
'
[Chart: monthly electricity production in Australia, 1980–1995, rising from about 7,000 to 15,000]
[ACF and PACF of the original series, lags 0–39: slow decay with seasonal peaks every 12 lags]
Seasonal Differentiation
Electricity production in Australia
Y't = Yt − Yt-s
[Chart: the seasonally differenced series, 1981–1991, oscillating around 0 between about −1,000 and +1,500]
Seasonal Differentiation
Electricity production in Australia
[ACF and PACF after seasonal differentiation, lags 0–39: the seasonal pattern is gone, but significant autocorrelations remain at low lags]
Electricity production in Australia — seasonal + 1st-order differentiation
Y't = Yt − Yt-1
[Chart: the series after both differentiations, oscillating around 0 between about −1,500 and +1,500]
[ACF and PACF of the fully differentiated series, lags 0–39: only a few short-lag and lag-12 autocorrelations remain significant]
The differentiated series
Conclusive remarks
• AR models should be used only when the conditions shown by ACF
and PACF are verified
• AR models can be applied to series with trend by applying a first or
second order differentiation
– For instance: AR(1) with first order differentiation: ARI(1,1)
• AR models can be applied to series with seasonality by applying a
seasonal differentiation
– For instance: AR(1,0)(0,1)12 (in case of seasonality every 12
periods)
• Both differentiations can be combined:
– ARI(1,1)(0,1)12 (in case of seasonality every 12 periods)
An example – number of users
of an internet service
[Chart: number of users of the internet service over about 100 periods]
An example – number of users
of an internet service
• Non-stationarity
• 1st autocorrelation
is dominant
• No seasonality
• Let’s differentiate
[Series plot (mean 137.08, std 39.80, N = 100) and its ACF, lags 0–19: autocorrelations decay very slowly from 0.96, a sign of non-stationarity]
[PACF of the original series, lags 0–19: the lag-1 partial autocorrelation (0.96) dominates]
[Chart: the first-differenced series, oscillating between about −15 and +15]
An example – number of users
of an internet service
An example – number of users
of an internet service
• ACF: exponential, somewhat sinusoidal decrease
• At least 3 partial
autocorrelations are
significant
• AR(3)
[Differenced series plot (mean 1.28, std 5.64, N = 100) and its ACF, lags 0–19: exponential, oscillating decay]
[PACF, lags 0–19: significant partial autocorrelations up to lag 3 (0.80, −0.30, 0.34)]
Parameter Estimates
Term       Lag   Estimate    Std Error   t Ratio   Prob>|t|
AR1         1     1.14600     0.09537     12.02     <.0001
AR2         2    -0.65929     0.13510     -4.88     <.0001
AR3         3     0.33460     0.09471      3.53     0.0006
Intercept   0     0.97992     1.64810      0.59     0.5535
Constant Estimate: 0.17510
Residuals: [residual plot and residual ACF/PACF, lags 0–14: no significant autocorrelations remain]
ARIMA(3,1,0) model
An example – number of users
of an internet service
Exercise 11B
• Use JMP
• Identify if differentiation is needed (first order or seasonal)
• Detect and apply the correct AR model
Moving Average Models
Regression and “Moving
Average”
• Similarly we can consider:
Yt = b0 + b1·et-1 + b2·et-2 + … + bp·et-p + et
• Pay attention: this is the moving average of the error, not of the
original series
Moving average models
• They do not produce regular patterns like AR models → the ACF is usually
significant only at lag 1
• The series theoretically always restarts from the constant b0 (and not
from the previous values) and varies according to the previous
errors → that is why we need to rely on the PACF
Moving average models
• A first order MA(1) model can be described as:
– Ŷt = b0 + b1et-1
– b0 and b1 > 0
• The demand is expected to be b0, but if the previous period was above the
forecast (et-1 > 0), then this period is also expected to be higher
– If b1 is positive the series tends to move in couples (high → high;
low → low)
– If b1 is negative the series moves in opposite couples (high → low;
low → high)
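A quick simulation makes this "moving in couples" behaviour visible; the sketch below generates an MA(1) series and checks that the lag-1 autocorrelation matches the theoretical value b1/(1 + b1²):

```python
import numpy as np

rng = np.random.default_rng(1)
e = rng.normal(0, 1, 5000)
b0, b1 = 5.0, 0.8
y = b0 + e[1:] + b1 * e[:-1]      # MA(1): Y[t] = b0 + e[t] + b1*e[t-1]

y0 = y - y.mean()
r1 = np.sum(y0[1:] * y0[:-1]) / np.sum(y0 * y0)
print(round(r1, 2), round(b1 / (1 + b1 ** 2), 2))   # both ~0.49; higher lags ~0
```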
MA(1,1) = Simple exponential
smoothing
• Ŷt = b0 + b1·et-1 with et-1 = Yt-1 − Ŷt-1
• So we can re-write it as:
• Ŷt = b0 + b1 · (Yt-1 − Ŷt-1)
• In words: the forecast (Ŷt) depends on the previous demand (Yt-1)
and the previous forecast (Ŷt-1)
• It is the same principle as the simple exponential smoothing
• In particular, an MA(1,1) (an MA(1) with first-order differentiation)
with no constant (b0) is equivalent to the simple exponential smoothing
• For the demonstration:
https://onlinecourses.science.psu.edu/stat510/?q=node/70
• In conclusion: MA models are a broader class of smoothing
models!
Example 1
[Example 1: series plot with its ACF and PACF for lags 1–15]
Example 2
[Example 2: series plot with its ACF and PACF for lags 1–15]
ARIMA Models
AutoRegression and Moving
Average
• By joining the two models
• The model is good for stationary series
• It can be adapted for non-stationary series by
adding differentiation
AR + MA = ARMA
AR + I + MA = ARIMA
Notation
• AR: p = order of the autoregressive component
• I: d = order of the differentiation
• MA: q = order of the moving average component
• The values of p, d, q can also be higher than one in case of
additional autoregressive or moving-average components; however, it is
better to be conservative and not set more than one parameter
higher than 2
• Examples:
– Autoregressive: ARIMA (1,0,0)
– Moving average: ARIMA (0,0,1)
– Autoregressive with trend: ARIMA (1,1,0)
Model identification
ARIMA model identification is highly complex, however as a general
guideline:
1. Check if the series is stationary → if not, differentiate until it is
stationary
2. Check if the series has seasonality → if so, apply a seasonal differentiation
(lag = period of the seasonality) until it is removed
3. Check if the PACF is significant up to some lag p while the ACF decays →
if yes, add an AR(p) component
4. Check if the ACF is significant up to some lag q while the PACF decays →
if yes, add an MA(q) component
5. Repeat steps 3 and 4 until the ACF and PACF do not show significant
patterns
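In practice steps 1–5 are iterated with software. Outside the JMP workflow used in class, a hedged sketch with statsmodels (assuming the package is available) fits a few candidate orders on an illustrative series and compares them by AIC, which is introduced below:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(0.5, 5.0, 150))     # an illustrative non-stationary series

for order in [(1, 1, 0), (0, 1, 1), (1, 1, 1)]:
    fit = ARIMA(y, order=order).fit()
    print(order, "AIC:", round(fit.aic, 1))  # lower is better; penalizes complexity

best = ARIMA(y, order=(0, 1, 1)).fit()
print(best.forecast(steps=12))               # 12-period forecast
```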
Quality evaluation
• Few suggestions for model design
– For the sake of simplicity, start with an AR or an MA
model, then apply an ARMA model and analyse the
differences
– Test different methods, but pay attention:
• Usually estimation is done by minimizing the MSE
• You can easily reduce the MSE by complicating the model (i.e.,
overfitting)
• It is useful to consider measures that account for the complexity
of the model
– Residual analysis
Quality measures
• Mean Square Error
• Error variance
• R2
• Likelihood
– Akaike’s Information Criterion (AIC)
• m = p + q + P + Q
• L: Likelihood function
• AIC = -2logL + 2m
• The lower the better
– Schwarz Bayesian Information Criterion (BIC)
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docxPoojaSen20
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 

Recently uploaded (20)

How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docx
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 

Quantitative Forecasting Techniques in SCM

  • 14. Forecasting using the moving average • How to select the parameter k? • Same considerations as for the centered moving average: – Low values (you look only at recent values): more reactive, less smoothing – High values (you also look at older values): more smoothing, less reactive • If there is seasonality: k = frequency; if there is no seasonality, you have to define the “best” value of k by testing it on your time series (a minimal sketch follows) • Let’s try…
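As a concrete reference, a minimal sketch of the MM(k) forecast just described (illustrative Python, not part of the course files; the demand series is the one from the earlier tables):

```python
# Minimal sketch of a k-period moving-average forecast (illustrative).
def moving_average_forecast(demand, k):
    """Forecast for the next period = mean of the last k observations."""
    if len(demand) < k:
        raise ValueError("need at least k observations")
    return sum(demand[-k:]) / k

demand = [105, 125, 100, 105, 104, 117, 104, 127, 103, 101, 109, 121, 129]
print(round(moving_average_forecast(demand, 3), 1))  # 119.7, the MM(3) value shown earlier
```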
  • 15. Exercise B01 Part 1 • Download the Excel file: “01B_MovingAverage.xlsx” • Open the tab: “Direct Procedure” • Which one of the proposed models (MM4, MM8, MM12) fits best?
  • 16. Direct procedure (step 3) Data Set • Once you have the actual data, you can measure the actual errors of your model • If necessary you can adjust the parameters for future forecasts Forecast Time now Actual data
  • 17. Direct procedure (step 4) Data Set Re-run the procedure with new data to get a new forecast Forecast Time now
  • 18. Direct Procedure: Limitations • The limitation of the direct procedure is that it does not test the validity of the model until actual data become available • The cross-validation procedure makes this testing possible through a slightly more complex process
  • 19. Cross-validation procedure Data set Let’s start from the data available at time now: the data set Time now
  • 20. Cross-validation procedure (step 1) Training Set Test Set Divide the sample in two: • Training set: to build the model • Test set: to test the model Rule of thumb: the test set should span a full period (e.g., one year) and usually no more than one third of the entire dataset. Time now
  • 21. Cross-validation procedure (step 2) Training Set • Build your model on the training set • Change the parameters until you find a good fit (MAD, BIAS, RMSEA, etc.) • Remember not to overfit the training set (as in the direct procedure) Time now Model
  • 22. Cross-validation procedure (step 3) • Make a forecast and test it on the test set • Check MAD, BIAS, RMSEA, etc. • Now you have two sets of errors Training Set Time now Model Forecast 1. Errors on the training set Use them to tune the model 2. Errors on the test set Use them to validate the model Test Set
  • 23. Cross-validation procedure (step 3) • If you are not satisfied, change the parameters looking at the training set • Do not change the parameters to fit the TEST SET, otherwise you will get test-set overfitting [two charts: a correct fit on the training set vs. a forecast bent to fit the test set]
  • 24. Cross-validation procedure (step 4) Data Set Go back to the data set and run the model with all the available data keeping the parameters you have identified before. From here on the procedure is identical to the direct procedure. Time now Forecast
  • 25. Cross-validation procedure (step 5) Data Set Once you have the actual data, you can measure the real errors of your model. If necessary you can adjust the parameters. Forecast Time now Actual data
  • 26. Cross-validation procedure (step 6) Data Set Re-run the procedure to get a new forecast Forecast Time now
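To make the steps concrete before the exercise, here is a compact, illustrative plain-Python sketch of the cross-validation procedure applied to the moving-average model and the toy demand series from the earlier slides (the code itself is an assumption, not part of the course files):

```python
# Hedged sketch of the cross-validation procedure with an MM(k) model:
# tune k on the training set, validate on the test set, then re-run on
# the full dataset with the same parameter to get the final forecast.
def ma_fit_errors(series, k):
    """One-step-ahead errors of an MM(k) model over a series."""
    return [series[t] - sum(series[t - k:t]) / k for t in range(k, len(series))]

def mad(errors):
    return sum(abs(e) for e in errors) / len(errors)

data = [105, 125, 100, 105, 104, 117, 104, 127, 103, 101, 109, 121, 129]
train_end = 9                      # test set ~ one third of the data
train = data[:train_end]

for k in (3, 5, 6):                # tune on the training set only
    print("k =", k, "training MAD =", round(mad(ma_fit_errors(train, k)), 2))

k = 3                              # suppose k = 3 fits the training set best
test_errors = [data[t] - sum(data[t - k:t]) / k
               for t in range(train_end, len(data))]
print("test MAD:", round(mad(test_errors), 2))   # validate, do not tune, on this

# final forecast: re-run on the full dataset with the same parameter
print("forecast:", round(sum(data[-k:]) / k, 1))
```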
  • 27. Exercise 01B Part 2 • Open the file “01B MovingAverage.xlsx” • Open tab “Cross Validation” • Compare the MAD and BIAS between the training and the test set and check whether MM8 provides a reliable forecast • Make the final forecast
  • 29. Conclusions [diagram of the cross-validation procedure: tune the model on the training set, test it on the test set, then run the model on the full dataset with the same parameters to produce the final forecast]
  • 30. Exponential smoothing • Limitations of the moving average: – The moving average gives the same weight to all the observations (1/k) – The number of observations considered by the weighted moving average is constant and finite • What we would like to have: – Higher weights for newer observations → higher reactivity to new trends – At the same time, the ability to consider the entire history (less recent observations) if it can provide useful information • Exponential smoothing overcomes these issues: – It is a weighted moving average where weights decrease exponentially – All the observations are considered, but newer observations have higher weights
  • 31. Simple exponential smoothing • The basic idea: – Higher weight to the most recent observation – Exponentially decreasing weights for the other observations • α = weight given to the most recent observation
  • 32. Simple exponential smoothing • Building the model is very simple: just set the weight of the most recent observation (from 0 to 1) and the model computes all the other weights automatically [three charts of the weights given to lags t-1…t-6 for α = 0.2, α = 0.5 and α = 0.8] • The higher α, the higher the importance of the last observation compared to the others → the model has little memory for data in the past
  • 33. Simple exponential smoothing • F(t+1) = α·Yt + (1−α)·Ft • The implementation of the model is very simple and based on a recursive formula • The forecast is based on the last demand data weighted by alpha and on the forecast made for the previous period weighted by (1−alpha) • Remember: alpha is a weight bounded within the 0-to-1 range – Close to “1”: give importance only to the last demand data (model very reactive to current demand) – Close to “0”: give importance only to the previous demand data (model not very reactive, very history-based) • (Yt = last demand data; Ft = previous forecast = summary of previous demand data = history)
  • 34. Simple exponential smoothing • Before producing the final forecast, forecasts for all the previous periods have to be generated recursively using the same formula • There is always a first forecast to calculate without a previous forecast • Usually the first forecast is set equal to the first observation (naive method): F1 = Y1 • After some periods, the first observation loses its significance
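A minimal sketch of the recursion just described, with the naive initialization F1 = Y1 (illustrative Python; the series and the α = 0.2 result match the worked example that follows):

```python
# Minimal sketch of simple exponential smoothing (naive initialization F1 = Y1).
def simple_exp_smoothing(demand, alpha):
    """Return the one-step-ahead forecasts F1 .. F(n+1)."""
    forecasts = [demand[0]]                            # F1 = Y1
    for y in demand:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

demand = [105, 125, 100, 105, 104, 117, 104, 127, 103, 101, 109, 121, 129]
f = simple_exp_smoothing(demand, 0.2)
print(round(f[-1], 2))   # 114.39: forecast for period 14, as in the example
```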
  • 35. Example • We have a time series • For simplicity’s sake, we use the direct procedure Period Demand 1 105 2 125 3 100 4 105 5 104 6 117 7 104 8 127 9 103 10 101 11 109 12 121 13 129 14 15 16
  • 36. Example 1. Define alpha = 0.5 Period Demand Simple Exp. Smoothing 1 105 2 125 3 100 4 105 5 104 6 117 7 104 8 127 9 103 10 101 11 109 12 121 13 129 14 15 16
  • 37. Example 2. Initialize the model Period Demand Simple Exp. Smoothing 1 105 105 2 125 3 100 4 105 5 104 6 117 7 104 8 127 9 103 10 101 11 109 12 121 13 129 14 15 16
  • 38. Example 3. Calculate the model for the second period: F2 = α·Y1 + (1−α)·F1 = 0.5 · 105 + (1 − 0.5) · 105 = 105 Period Demand Simple Exp. Smoothing 1 105 105 2 125 105.00 3 100 4 105 5 104 6 117 7 104 8 127 9 103 10 101 11 109 12 121 13 129 14 15 16
  • 39. Example 4. Drag down the formula to calculate the other values Period Demand Simple Exp. Smoothing 1 105 105 2 125 105.00 3 100 115.00 4 105 107.50 5 104 106.25 6 117 105.13 7 104 111.06 8 127 107.53 9 103 117.27 10 101 110.13 11 109 105.57 12 121 107.28 13 129 114.14 14 15 16
  • 40. Example 5. Check the fit and errors Period Demand Simple Exp. Smoothing 1 105 105 2 125 105.00 3 100 115.00 4 105 107.50 5 104 106.25 6 117 105.13 7 104 111.06 8 127 107.53 9 103 117.27 10 101 110.13 11 109 105.57 12 121 107.28 13 129 114.14 14 15 16
  • 41. Example 6. Change alpha if needed (but do not overfit). Always use both visual analysis of the data and error calculation Alpha = 0.2 Alpha = 0.9 Which one looks better?
  • 42. Example Alpha = 0.2 Alpha = 0.9 MAD = 9.82 MAD = 12.37 6. Change alpha if needed (but do not overfit). Always use both visual analysis of the data and error calculation
  • 43. Example 7. We select alpha = 0.2 and to make the forecast just drag down the formula one more time Period Demand Simple Exp. Smoothing 1 105 105 2 125 105.00 3 100 109.00 4 105 107.20 5 104 106.76 6 117 106.21 7 104 108.37 8 127 107.49 9 103 111.39 10 101 109.72 11 109 107.97 12 121 108.18 13 129 110.74 14 114.39 15 16
  • 44. Use of the model • The simple exponential smoothing is used to forecast only one period ahead (if you need more periods you can use the same value) Period Demand Simple Exp. Smoothing 1 105 105 2 125 105.00 3 100 109.00 4 105 107.20 5 104 106.76 6 117 106.21 7 104 108.37 8 127 107.49 9 103 111.39 10 101 109.72 11 109 107.97 12 121 108.18 13 129 110.74 14 114.39 15 114.39 16 114.39
  • 45. Use of the model • Stationary demand • Purpose: identify the level (that is, the local mean) • Use of alpha: – If high variability → lower alpha (0.1-0.3) – If there are “jumps” in the series → higher alpha (0.6-0.8) • The simple exponential smoothing is not able to recognize trends and seasonality [two charts: low alpha vs high alpha]
  • 46. Reference table of weights given to previous demand data (most recent demand = 100%)

Lag \ Alpha   0.90  0.80  0.70  0.60  0.50  0.40  0.30  0.20  0.10  0.05
Long term
24             0%    0%    0%    0%    0%    0%    0%    1%    9%   31%
23             0%    0%    0%    0%    0%    0%    0%    1%   10%   32%
22             0%    0%    0%    0%    0%    0%    0%    1%   11%   34%
21             0%    0%    0%    0%    0%    0%    0%    1%   12%   36%
20             0%    0%    0%    0%    0%    0%    0%    1%   14%   38%
19             0%    0%    0%    0%    0%    0%    0%    2%   15%   40%
18             0%    0%    0%    0%    0%    0%    0%    2%   17%   42%
17             0%    0%    0%    0%    0%    0%    0%    3%   19%   44%
16             0%    0%    0%    0%    0%    0%    0%    4%   21%   46%
15             0%    0%    0%    0%    0%    0%    1%    4%   23%   49%
14             0%    0%    0%    0%    0%    0%    1%    5%   25%   51%
13             0%    0%    0%    0%    0%    0%    1%    7%   28%   54%
Medium term
12             0%    0%    0%    0%    0%    0%    2%    9%   31%   57%
11             0%    0%    0%    0%    0%    1%    3%   11%   35%   60%
10             0%    0%    0%    0%    0%    1%    4%   13%   39%   63%
9              0%    0%    0%    0%    0%    2%    6%   17%   43%   66%
8              0%    0%    0%    0%    1%    3%    8%   21%   48%   70%
7              0%    0%    0%    0%    2%    5%   12%   26%   53%   74%
6              0%    0%    0%    1%    3%    8%   17%   33%   59%   77%
Short term
5              0%    0%    1%    3%    6%   13%   24%   41%   66%   81%
4              0%    1%    3%    6%   13%   22%   34%   51%   73%   86%
3              1%    4%    9%   16%   25%   36%   49%   64%   81%   90%
2             10%   20%   30%   40%   50%   60%   70%   80%   90%   95%
1            100%  100%  100%  100%  100%  100%  100%  100%  100%  100%
  • 47. Exercise B02 • Forecast the demand for a double-knit fabric for the next 12 months • Use MS Excel • Use a simple exponential smoothing model • Use: 1. Direct procedure (first Excel sheet) 2. Cross validation procedure (test set = 12 months) (second Excel sheet)
  • 48. Simple Exponential Smoothing Game • Divide into groups of 3: 1 computer per group • Open the file “Exp Smoothing Game Class.xlsx” • For each period and for each of the 3 products: 1. Write the actual demand provided by the teacher 2. Check the error and the charts (Deck sheet) 3. Select the most appropriate value of alpha 4. Make your forecast for the next period (using the direct procedure) • Two rules: – You cannot change past decisions – You can vary the value of alpha by a maximum of +/- 0.1 (otherwise the cell becomes red)
  • 49. Adaptive Exponential Smoothing • Used to dynamically change the value of alpha • If MAD is increasing (variability) → decrease alpha • If BIAS is increasing (trend or jumps) → increase alpha
  • 50. Adaptive exponential smoothing • Improvement of the simple exponential smoothing • The model becomes more or less reactive according to the evolution of the demand • The alpha parameter is dynamically changed according to the error • A beta parameter sets how fast the alpha parameter varies • Initialization: – β = 0.2 – F2 = Y1 – A1 = M1 = 0 • Robust algorithm, useful in automated systems • Alpha is variable, but beta is fixed and it can affect the performance of the overall model • Model equations: F(t+1) = αt·Yt + (1−αt)·Ft, with αt = |At / Mt|, At = β·Et + (1−β)·A(t−1), Mt = β·|Et| + (1−β)·M(t−1), Et = Yt − Ft
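A short sketch of this adaptive scheme as reconstructed above (a Trigg-and-Leach-style formulation; the function name and structure are illustrative assumptions, not the course's Excel implementation):

```python
# Hedged sketch of adaptive exponential smoothing: alpha tracks the ratio
# between the smoothed signed error and the smoothed absolute error.
def adaptive_exp_smoothing(y, beta=0.2):
    """Return forecasts for periods 2 .. n+1, with F2 = Y1 and A1 = M1 = 0."""
    f = [y[0]]                                   # F2 = Y1
    A = M = 0.0
    for t in range(1, len(y)):
        e = y[t] - f[-1]                         # error E_t
        A = beta * e + (1 - beta) * A            # smoothed (signed) error
        M = beta * abs(e) + (1 - beta) * M       # smoothed absolute error
        alpha = abs(A / M) if M else 0.0         # adaptive alpha in [0, 1]
        f.append(alpha * y[t] + (1 - alpha) * f[-1])
    return f
```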
  • 52. Linear exponential smoothing (Holt) • Evolution of the simple exponential smoothing to consider the trend (Holt, 1957) – Lt is the level of the series – bt is the slope of the series – Alpha: sets the reactiveness of the level (as in the simple smoothing) – Beta: sets the reactiveness of the underlying trend • Model equations: Lt = α·Yt + (1−α)·(L(t−1) + b(t−1)), bt = β·(Lt − L(t−1)) + (1−β)·b(t−1), F(t+m) = Lt + bt·m
  • 53. Linear exponential smoothing (Holt) • The model can adapt to changes in the trend according to beta: the higher beta, the higher the reactivity to the recent trend • The factor m is used to project the trend – It is always 1 while there are demand data, then it increases by 1 (2, 3, 4,…) for the forecasting periods • Limitation: the trend is hypothesized to be linear
  • 54. Building the model • Initial level value – L1 = Y1 (level equal to the demand) • Initial trend value (different approaches) – b1 = Y2 − Y1 (ok if the values are not too different) – b1 = (Y4 − Y1)/3 (data every four months) – b1 = (Y13 − Y1)/12 (monthly data) • Alpha and beta have to be set through visual analysis of the series and iteration (remember not to overfit) to get the optimal model
  • 55. Building the model • The procedure is similar to the one followed for the simple exponential smoothing (see the example on the next slide and the sketch below)
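A minimal sketch of the Holt recursion with the initialization above (illustrative Python; it reproduces the worked example on the next slide up to rounding):

```python
# Sketch of linear (Holt) exponential smoothing with L1 = Y1, b1 = Y2 - Y1.
def holt(y, alpha, beta, horizon=3):
    L, b = y[0], y[1] - y[0]                 # initialization
    for t in range(1, len(y)):
        L_prev = L
        L = alpha * y[t] + (1 - alpha) * (L + b)
        b = beta * (L - L_prev) + (1 - beta) * b
    return [L + b * m for m in range(1, horizon + 1)]  # project the trend

y = [105, 107, 110, 111, 112, 114, 113, 114, 116, 118, 121, 124]
print([round(f, 2) for f in holt(y, 0.1, 0.1)])
# close to 126.26, 128.07, 129.87 from the worked example (rounding differences)
```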
  • 56. Example (alpha = beta = 0.1) Yt Ft Lt bt m | 105 - 105.00 2.00 - | 107 107.00 107.00 2.00 1 | 110 109.00 109.10 2.01 1 | 111 111.11 111.10 2.01 1 | 112 113.11 113.00 2.00 1 | 114 114.99 114.90 1.99 1 | 113 116.88 116.49 1.95 1 | 114 118.44 118.00 1.90 1 | 116 119.90 119.51 1.87 1 | 118 121.38 121.04 1.83 1 | 121 122.87 122.69 1.81 1 | 124 124.50 124.45 1.81 1 | – 126.26 – – 1 | – 128.07 – – 2 | – 129.87 – – 3 • Forecast: F(12+m) = 124.45 + 1.81·m for m = 1, 2, 3 • Steps: 1. Initialize L and b 2. Create the first row using the formulas (in order: L, b, F) 3. Drag down 4. Check errors and set alpha and beta 5. Calculate the forecast using the last available values of L and b
  • 57. Example (alpha = beta = 0.1)
  • 58. Use of the model • Use it for series with a trend but no seasonality • The use of alpha is the same as in the simple exponential smoothing (variability vs jumps) • Use of beta: – If long-term trend: low beta (0.1-0.3) – If short-term trend: high beta (0.6-0.8) → be careful: with high beta the model could mistake random oscillations for trends! [two charts: low beta vs high beta]
  • 59. Exercise B03 • Provide a forecast of flight traffic for the following 12 months • Use MS Excel • Use a linear exponential smoothing model 1. Direct procedure (first Excel sheet) 2. Cross-validation procedure (test set = 12 months) (second Excel sheet) Group exercise
  • 60. Download and install SAS JMP 10 • http://www.jmp.com/landing/jmp_trial.shtml?ref=hp_visual • Register on the website and download the 30-day trial
  • 61. Winters exponential smoothing • Evolution of the Holt model (Winters, 1960) that also considers seasonality – Lt is the level of the series – bt is the slope of the series – St is the seasonality of the series – Alpha: sets the reactivity of the level (as in the simple smoothing) – Beta: sets the reactivity of the underlying trend (as in the linear exponential smoothing) – Gamma: sets the reactivity of the seasonality • Model equations: Lt = α·(Yt / S(t−s)) + (1−α)·(L(t−1) + b(t−1)), bt = β·(Lt − L(t−1)) + (1−β)·b(t−1), St = γ·(Yt / Lt) + (1−γ)·S(t−s), F(t+m) = (Lt + bt·m)·S(t−s+m)
  • 62. Characteristics of the model • “s” is the seasonality period and it is hypothesized to be known and constant (e.g. 12 months) • Limited time horizon (trend and seasonality are supposed to be constant in the future) • Trend is hypothesized linear but the model can follow changes in the trend (as in the linear exponential smoothing) • Seasonality is multiplicative – One coefficient for each period – The coefficient can change over time
  • 63. Building the model • Initialization: three sets of initial values (Ls, bs, S1…Ss) • Two full seasons of data are needed to calculate them: Ls = (Y1 + Y2 + … + Ys)/s; bs = [(Y(s+1) − Y1)/s + (Y(s+2) − Y2)/s + … + (Y(2s) − Ys)/s] / s; Sk = Yk / Ls for k = 1…s • The optimal model has to be found by changing alpha, beta and gamma
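A hedged sketch of the Winters recursion with this initialization (illustrative Python; the function name and the restriction to a one-season forecast horizon are assumptions):

```python
# Sketch of Winters (triple) exponential smoothing, multiplicative seasonality,
# initialized on the first two seasons as in the formulas above.
def winters(y, s, alpha, beta, gamma, horizon):
    assert len(y) >= 2 * s, "two full seasons are needed for initialization"
    assert horizon <= s, "this sketch projects at most one season ahead"
    L = sum(y[:s]) / s                                     # initial level Ls
    b = sum((y[s + k] - y[k]) / s for k in range(s)) / s   # initial trend bs
    S = [y[k] / L for k in range(s)]                       # indices S1 .. Ss
    for t in range(s, len(y)):
        L_prev = L
        L = alpha * y[t] / S[t - s] + (1 - alpha) * (L + b)
        b = beta * (L - L_prev) + (1 - beta) * b
        S.append(gamma * y[t] / L + (1 - gamma) * S[t - s])
    n = len(y)
    # each future period reuses the seasonal index from one season earlier
    return [(L + b * m) * S[n - s + m - 1] for m in range(1, horizon + 1)]
```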
  • 64. Use of the model • The use of alpha is the same as in the simple exponential smoothing (variability vs jumps) • The use of beta is the same as in the linear exponential smoothing (long-term trend vs short-term trend) • Use of gamma: – If constant long-term seasonality: low gamma (0.1-0.3) – If seasonality varying from year to year: high gamma (0.6-0.8) [two charts: low gamma vs high gamma]
  • 65. In conclusion • Which model for which data: – Stationary (no seasonality) → Simple – Trend → Linear – Trend + Seasonality → Winters • Do not use a more complicated model than the data require, otherwise you are likely to overfit • Do not use a simpler model than the data require, otherwise you will have a poor fit
  • 66. An example Quarterly data Which model would you use? How would you set the parameters?
  • 67. An example Quarterly data Alpha: low to high Beta: low Gamma: medium to high
  • 68. Quantitative forecasting techniques Ruggero Golini ruggero.golini@polimi.it Applications
  • 69. SAS JMP • Repeat Exercise B02 using SAS JMP • Using the direct procedure, perform simple, linear and Winters exponential smoothing and make a forecast for the next year • Finally, use the cross-validation procedure for the Winters model Guided exercise
  • 70. Exercise B04 • Monthly Australian sales of red wine: thousands of liters Jan 1980 - Jul 1995 • Forecast: 12 months • Make the best forecast you can! • Use SAS JMP Group exercise
  • 71. Exercise B05 • World crude oil production • Make the best forecast you can! • Use SAS JMP Group exercise
  • 72. Assignment 2 • File: Assignment 2.xls • 3 real time series – Number of Google searches for a keyword (normalized to 100) – Global production of an agricultural product – Retail sales of a specific good category in a specific country • Objective: produce the best forecasts you can for the required periods (see Excel file > ‘Forecast’ sheet) • Your forecast will be compared to real data to find who got the best forecast in terms of MAD and BIAS
  • 73. Assignment 2 • Try different models (moving average or one of the exponential smoothing models) and decide which is best • Use the cross-validation technique to set the parameters and evaluate your forecast • Remember to avoid overfitting • You can use Excel or JMP • Submit your assignment by email to golini@mip.polimi.it and to sciacovelli@mip.polimi.it: – 3 slides (1 slide per series) presenting your solution; for each series report the model you used, the value of the parameters and the motivation – Excel file with the main calculations and forecasts (fill in the provided Excel file > ‘Forecast’ sheet) – The report has to be submitted by email to golini@mip.polimi.it and to sciacovelli@mip.polimi.it by May 18th
  • 75. Introduction • Time-based models (e.g. demand decomposition, moving average, exponential smoothing) are based on the following hypothesis: past demand explains future demand • What are the assumptions?
  • 76. Assumptions of time-based models • There are past data → might not be the case for new products • There are regular patterns (trend, seasonality) → not always the case (e.g., stock prices) • The demand is quite disconnected from the environment → might not be the case for many products subject to shocks (e.g., events, promotions,…) [chart: Google searches for “The Great Gatsby”]
  • 77. Explanatory models • When the assumptions are not met, we can try to introduce external variables (drivers) to explain future demand (explanatory models) • Examples (driver → predicted outcome): – Price → Sales – Early sales → Total sales – Gross Domestic Product of a country → Total demand – Promotion → Temporary increase of the sales • Issue: we need information about the drivers!
  • 78. Regression analysis • The regression analysis is a type of explanatory model • The idea is to determine coefficients (a1, a2, a3,…) associated to drivers (X1, X2, X3,…) able to explain the future demand (Y) • We will hypothesize linear relationships between the drivers and the demand → linear regression [diagram: drivers X1, X2, X3 with coefficients a1, a2, a3 pointing to demand Y]
  • 79. Simple vs Multiple Linear Regression Drivers (X) Demand (Y) X1 X2 X3 Y a1 a3 a2 Drivers (X) Demand (Y) X1 Y a1 Simple Linear Regression: 1 Driver Multiple Linear Regression: >1 Drivers
  • 80. Example: umbrella sales vs cm of rain (only 1 driver → SIMPLE LINEAR REGRESSION)
Year  CM of rain  Umbrella sales
1990  58  59044
1991  58  62060
1992  80  85600
1993  70  65450
1994  83  77107
1995  91  90909
1996  69  66033
1997  63  65268
1998  63  64638
1999  56  55216
2000  86  85140
2001  86  83334
2002  92  84548
2003  58  52490
2004  91  85813
2005  78  78624
2006  93  83700
2007  61  55876
  • 81. Example • Key tool: the dispersion (scatter) diagram [scatter plot of cm of rain (x-axis, 50 to 100) vs umbrella sales (y-axis, 4,000 to 104,000) for the data above]
  • 83. An example • Fitting a line to the data above gives: y = 883.89x + 6665.5 • Simple Linear Regression: Y = a1 · X1 + b • Interpretation: if the cm of rain were 0 we would sell 6,665 umbrellas, and for each additional cm of rain we sell 883 more umbrellas
  • 84. Simple Linear Regression • A linear relationship between Y (dependent variable) and only one driver (independent variable) • The output is the equation of a straight line: Y = a1 · X1 + b – a1 is the coefficient associated to X1 → for a unit increase of X1 the demand increases by a1 (example: for each cm of rain we sell 883 more umbrellas) – b is the intercept with the Y axis → the baseline demand in case X1 is 0 (example: even if it does not rain, we sell 6,665 umbrellas)
  • 85. Regression is the attempt to explain the variation in a dependent variable using the variation in independent variables. Regression is thus an explanation of causation. If the independent variable(s) sufficiently explain the variation in the dependent variable, the model can be used for prediction. Drivers (x) Dependent variable (y) Simple Linear Regression
  • 86. Simple Linear Regression • The function makes a prediction for each observed data point • The observation is denoted by y and the prediction by ŷ • The difference between y and ŷ is the prediction error [chart: observations y vs predictions ŷ on the fitted line, with the error highlighted]
  • 87. Simple Linear Regression • A least-squares regression selects the line with the lowest total sum of squared prediction errors • This value is called the Sum of Squares of Error, or SSE [chart: fitted line minimizing the squared errors]
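A minimal sketch of the least-squares fit for one driver (illustrative plain Python; on the umbrella data from the earlier slides it should reproduce the fitted line y = 883.89x + 6665.5 up to rounding):

```python
# Closed-form least-squares fit of Y = a1*X1 + b (minimizes the SSE).
def simple_regression(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    a1 = sxy / sxx                 # slope
    b = my - a1 * mx               # intercept
    return a1, b

rain = [58, 58, 80, 70, 83, 91, 69, 63, 63, 56, 86, 86, 92, 58, 91, 78, 93, 61]
sales = [59044, 62060, 85600, 65450, 77107, 90909, 66033, 65268, 64638, 55216,
         85140, 83334, 84548, 52490, 85813, 78624, 83700, 55876]
a1, b = simple_regression(rain, sales)
print(round(a1, 2), round(b, 1))   # expected close to 883.89 and 6665.5
```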
  • 88. Simple linear regression • The coefficient a1 can also be negative • Example (sales vs price): Week Sales Price | 1 10 €1.30 | 2 6 €2.00 | 3 5 €1.70 | 4 12 €1.50 | 5 10 €1.60 | 6 15 €1.20 | 7 5 €1.60 | 8 12 €1.40 | 9 17 €1.00 | 10 20 €1.10 | Average 11.2 €1.44 • Fitted line: y = −14.539x + 32.136 • Interpretation: if the price were 0 we would sell 32 products, and for every euro more we sell 14.5 fewer products
  • 89. Question: why should the coefficient b not be negative?
  • 90. R2 indicator • A regression line can have a good or a bad fit • To evaluate this fit the R2 indicator is used – R2 is the percentage of variability predicted by the model – R2 near 100%: good fit – R2 near 0%: poor fit (the data are not related to each other) • Example: R2 = 45% POOR FIT vs R2 = 99% GOOD FIT!
  • 91. Use the linear regression to make forecasts • Back to the umbrella example (y = 883.89x + 6665.5) • We want to open a new branch in India where there are 110 cm of rain per year • What is the predicted demand?
  • 92. Use the linear regression to make forecasts • We just apply the formula Y = a1·X1 + b, where X1 is the cm of rain in India • Solution: 883.89 · 110 + 6665.5 = 103,893.4
  • 93. Use the linear regression to make forecasts • Back to the sales-vs-price example: y = −14.539x + 32.136 • How much do we expect to sell at a price of €1.30?
  • 94. Use the linear regression to make forecasts • Solution: −14.539 · 1.30 + 32.136 = 13.23
  • 95. Use the simple regression as a forecasting tool 1. Set-up the model – Decide which is your dependent Y variable (sales, demand, etc.) – Decide which is your X1 driver (price, market drivers, etc.) 2. Calculate coefficients (a1 and b) using a software and check R2 (if not good change the driver) 3. Identify the future value of the selected driver (may require a forecast!) 4. Perform the forecast
  • 96. Simple Regression in Excel • Display data in a scatter-diagram • Right-click on the series  add trend line
  • 97. Recommendations for simple linear regression • Simple linear regression can be used even if you do not have deep statistical competences • The trade-off is that you need to choose one driver • It is always best to use the driver that is most theoretically correlated to the demand (cm of rain → umbrellas) • But the driver should also be independent from the demand (do not use cm of rain to predict days of rain) • If you are undecided among different drivers, you can run several models and pick the one with the highest R2 • Be careful: regression is sensitive to outliers, so always inspect your data and remove outliers!
  • 98. Exercise B06 • Open the file: “06B Regression Exercise 1.xlsx” • How many cranes do we expect to sell in 2001, when a GDP growth of 7.5% is expected? • What other model could you have used? What would the outcome have been?
  • 99. Exercise B07 (early sales) • Fashion companies cannot rely on historical data because their products change every season • At the beginning of the season there is high uncertainty and the risk of overproducing if the product is a flop, or underproducing if the product is a success • One option is to look at the early sales (sales at the beginning of the season) to forecast the total sales
  • 100. Exercise B07 (early sales) • Every six months (season) a fast-fashion company launches ten types of new t-shirts while the old ones are removed from the market • For each t-shirt two data points are recorded: – Total sales at the end of the season – Early sales after 2 months • The company is trying to understand whether it is possible to forecast the total sales given a certain amount of early sales • Sample data (Season | Product | Total Sales | Early Sales): 1 | T-shirt - 39662 | 21 | 7; 1 | T-shirt - 73131 | 21 | 7; 1 | T-shirt - 97299 | 49 | 49; 1 | T-shirt - 81181 | 57 | 57; 1 | T-shirt - 63623 | 77 | 25; 1 | T-shirt - 14358 | 84 | 84; …
  • 101. Exercise B07 (early sales) • Consider the file “07B Exercise Early Sales.xls” • Using the software, perform a simple regression to evaluate the relationship between early sales and total sales • What will the total sales be for a product that had 1,230 early sales? • Is there any difference between Fashionable and Non-Fashionable products?
  • 102. Exercise B08 • Regression analysis can also be used to predict the effect of a promotion • Simply indicate with “1” the periods when a promotion took place and with “0” the periods with no promotion • Perform a simple regression between the total sales and the new 0/1 variable • The “a1” coefficient will be the average increase of sales due to the promotion • The “b” coefficient will be the average level of sales excluding the promotions • Reversing the formula (subtracting the value of “a1” from the sales when a promotion takes place) you can then clean your data from the effect of the promotion
  • 103. Exercise B08 • Open File “Regression Exercise 3.xlsx” • Set up the 0/1 promotion variable • Calculate the average effect of a promotion • Clean your data • Perform a forecast of 3 periods ahead using simple exponential smoothing
  • 104. Multiple linear regression • Simple: a relationship between Y (dependent variable) and one X (independent variable): Y = a1 · X1 + b • Multiple: a relationship between Y (dependent variable) and several Xi (independent variables): Y = a1 · X1 + a2 · X2 + … + b
  • 105. Multiple linear regression
Sales of cars (units)  GDP per capita (€)  Marketing actions (€)
1293  16201  15348
1891  27287  22579
1958  22811  26120
1989  25559  28841
1507  17436  18084
1306  17866  19172
1307  14834  18376
1917  27662  24499
1866  19332  20937
1926  25885  26868
1139  16834  13247
1833  21318  18697
1254  17105  17368
1791  21008  19665
1919  19823  20610
1128  12070  13965
  • 106. Multiple linear regression • Both drivers seem to have a relation with the sales [two scatter plots: sales of cars vs GDP per capita, and sales of cars vs marketing actions] • The results of the linear regression are: Y = 0.034 · GDP + 0.033 · Marketing + 276.72 • Attention! The coefficients are not the result of two separate regressions: they are estimated simultaneously
  • 107. Use the linear regression to make forecasts • Back to the car sales example based on GDP and marketing effort • The regression equation was: Y = 0.034 · GDP + 0.033 · Marketing + 276.72 • How many cars would we sell if GDP per capita next year is €30,000 and we plan to invest €25,000 in marketing? • Solution: 0.034 · 30000 + 0.033 · 25000 + 276.72 = 2121.72
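A hedged sketch of the same multiple regression in Python with NumPy (illustrative; the course uses Excel's LINEST and JMP, but the least-squares estimates should agree up to rounding):

```python
# Multiple linear regression Y = a1*X1 + a2*X2 + b via least squares.
import numpy as np

gdp = np.array([16201, 27287, 22811, 25559, 17436, 17866, 14834, 27662,
                19332, 25885, 16834, 21318, 17105, 21008, 19823, 12070])
marketing = np.array([15348, 22579, 26120, 28841, 18084, 19172, 18376, 24499,
                      20937, 26868, 13247, 18697, 17368, 19665, 20610, 13965])
sales = np.array([1293, 1891, 1958, 1989, 1507, 1306, 1307, 1917,
                  1866, 1926, 1139, 1833, 1254, 1791, 1919, 1128])

X = np.column_stack([gdp, marketing, np.ones(len(sales))])  # intercept column
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
a1, a2, b = coef                        # expected close to 0.034, 0.033, 276.7
print(a1, a2, b)
print(a1 * 30000 + a2 * 25000 + b)      # forecast for GDP 30,000 / marketing 25,000
```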
  • 108. Recommendations for multiple linear regression • Multiple linear regression allows you to use several drivers simultaneously • However, there are some statistical traps, so it is suggested to ask the help of an expert • Some suggestions: – Start with few, theoretically and statistically uncorrelated drivers (for instance: do not use days of rain AND days of sun to predict umbrella sales, but days of rain AND people’s income) – Check if R2 is significant – You can add other drivers, one at a time, but always check that they are not correlated to the previous drivers – Check if R2 has significantly increased; if not, drop the new driver (it is better to have fewer drivers in the model)
  • 109. Multiple Regression in Excel • Select a range • Function LINEST • Inputs: – Y as a vector – X as a matrix – True (otherwise b = 0) – True (to have statistics) • Ctrl+Shift+Enter to enter the function in each cell of the result array • Output in matrix form: row 1: an … a1, b; row 2: st.err(n) … st.err(1), st.err(b); row 3: R2, st.err(y); row 4: F, df; row 5: SSreg, SSresid
  • 110. Exercise B09 • Open the file “09B Exercise Multiple Regression.xlsx” • The dataset represents the number of cold organic fruit juices sold • The product is still quite new for the market, so the sales fluctuate strongly • Also the price set by the company can change significantly according to the cost of the raw materials (fruits) • Customers can order in larger or smaller batches • Provide a forecast for the number of orders and products sold, considering that for the next week these values are expected: – Price: 9 – Promotions (1 = Yes): 1 – Holidays during the week (1 = Yes): 0 – Full working week (1 = Yes): 1 – External temperature (Celsius degrees): 10 – Economic outlook (10 = very good; 1 = very bad): 7
  • 112. Notation • DEMAND for period t: Yt • FORECAST HORIZON: m • FORECAST estimated at period t for period t+m: Ft+m [timeline diagram: from time t, the forecasts Ft+1, Ft+2, …, Ft+m cover the next m periods]
  • 113. The idea behind AR Models • Simple regression model Yt = b0 + b1X1 + et • Multivariate regression model Yt = b0 + b1X1 + b2X2 + … + bkXk + et • Multivariate regression based on previous observations Yt = b0 + b1Yt-1 + b2Yt-2 + … + bkYt-k + et
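A minimal sketch of fitting an AR(p) model exactly as this regression on lagged values (illustrative Python with NumPy; the function names are assumptions, not JMP's interface):

```python
# Fit Yt = b0 + b1*Y(t-1) + ... + bp*Y(t-p) by least squares on lagged values.
import numpy as np

def fit_ar(y, p):
    """Return (intercept b0, coefficients b1..bp for lags 1..p)."""
    y = np.asarray(y, dtype=float)
    rows = [np.r_[1.0, y[t - p:t][::-1]] for t in range(p, len(y))]  # [1, Yt-1..Yt-p]
    coef, *_ = np.linalg.lstsq(np.array(rows), y[p:], rcond=None)
    return coef[0], coef[1:]

def ar_forecast(y, b0, phi, steps):
    """Iterate the fitted equation forward; reverts to the mean when |phi| < 1."""
    history = list(y)
    for _ in range(steps):
        history.append(b0 + sum(p * v for p, v in zip(phi, history[::-1])))
    return history[len(y):]
```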
  • 114. The idea behind AR • We will focus on its application to stationary, non-seasonal series, but this limitation can be easily removed • Context of use: mean-reverting time series • Example: a drunk man trying to walk on a straight line [chart: signal (mean) and demand = signal + disturbance]
  • 115. Which one is a mean reverting time series?
  • 116. What is the difference in these two series? • Mean reverting (AR model) vs white noise (random number in a range between +100 and +200)
  • 117. Mean reverting time series • It can be difficult to assess graphically • It is more likely that the next period will be closer to the mean than the period before • But the path back to the mean can be complex • We can assess this through the autocorrelogram (see Part A slides) and the partial autocorrelogram
  • 118. Autocorrelation function (ACF) • Provides a synthetic view of the autocorrelations • Useful tool to identify non-stationarity factors [ACF plot: autocorrelations decreasing from 1.00 at lag 0 through 0.82, 0.72, 0.64, …, turning negative around lag 8 and reaching about −0.40 at lag 13]
  • 119. Partial Autocorrelation (PACF) • If we have a regression relationship between Y and X1 and X2, it may be interesting to evaluate how much X1 explains of what X2 cannot explain: Yt = b0 + b1X1 + b2X2 + et • Partial autocorrelation evaluates the correlation between Yt and Yt-k when the effect of lags 1, 2, 3, …, k-1 has been eliminated: how much does Yt-k contribute to explaining Yt?
  • 120. Partial Autocorrelation (PACF) • Evaluated by means of the regression Yt = b0 + b1Yt-1 + b2Yt-2 + … + bkYt-k [PACF plot: partial autocorrelations by lag, mostly small and alternating in sign]
  • 121. Time series 1 (Mean reverting)
  • 122. Time series 2 (White noise)
  • 123. Revert-to-mean models • A first-order AR(1) model can be described as: Ŷt = μ + ϕ1·Yt-1, with |ϕ1| < 1 • If ϕ1 is positive the series tends to have periods above and periods below the average • If ϕ1 is negative the series alternates up and down around the average → demand will be below the mean next period if it is above the mean this period
  • 124. Time series 1 (Mean reverting) • We can therefore apply an AR(1) model, for instance in JMP • We obtain a forecast (green line) that converges to the mean: Ŷt = 332.506 + 0.646 · Yt-1 [chart of the series with the converging forecast]
  • 125. AR(1) Models with negative ϕ1
  • 126. AR(1) Models with negative ϕ1 • We can still apply an AR(1) model Ŷt = 58.76 - 0.69 * Yt-1
  • 128. Revert-to-mean models • A second-order AR(2) model can be described as: Ŷt = μ + ϕ1·Yt-1 + ϕ2·Yt-2
  • 129. AR(2) Model Ŷt = 200.85 + 0.281 * Yt-1 + 0.316 * Yt-2
  • 130. How to detect AR(x) models • 2 conditions: 1. The ACF decreases exponentially (or with oscillations) 2. The PACF is significant up to lag “x” • In the example we have an AR(3) model, as the PACF is significant up to lag 3 (values above the blue line)
  • 131. Exercise 10B • Use JMP • Open the file “10B Autoregressive.xlsx”, identify the model and perform the forecast
  • 132. Differentiation • First- and second-order differentiation removes the trend component • First and second seasonal differentiation removes the seasonality
  • 133. Gaining stationarity • A possible way is to differentiate • The 1st-difference series evaluates the change between two subsequent observations in the original series: Y′t = Yt − Yt−1
  • 136. Differentiation • Allows us to get rid of non-stationary elements • We can have 2nd-level differentiation: Y″t = Y′t − Y′t−1 = (Yt − Yt−1) − (Yt−1 − Yt−2) = Yt − 2Yt−1 + Yt−2 • We can also have seasonal differentiation
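The three differentiations in code (illustrative NumPy one-liners on a toy series; not part of the course files):

```python
# First, second, and seasonal differences as defined above.
import numpy as np

y = np.array([10., 12., 15., 19., 24., 30.])
d1 = np.diff(y)          # Y't  = Yt - Y(t-1)              -> removes a linear trend
d2 = np.diff(y, n=2)     # Y''t = Yt - 2*Y(t-1) + Y(t-2)
s = 3
ds = y[s:] - y[:-s]      # seasonal difference Yt - Y(t-s) -> removes seasonality
print(d1, d2, ds)
```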
  • 137. Seasonal Differentiation • Defined as: Y′t = Yt − Yt−s [chart: monthly electricity production in Australia, 1980-1995, roughly 7,000 to 16,000]
  • 138. Seasonal Differentiation: electricity production in Australia [ACF and PACF of the original series: autocorrelations stay high at all lags, with peaks every 12 lags revealing the seasonality]
  • 139. Seasonal Differentiation: electricity production in Australia • Y′t = Yt − Yt−s [plot of the seasonally differenced series, 1981-1991]
  • 140. Seasonal Differentiation: electricity production in Australia [ACF and PACF of the seasonally differenced series: the seasonal peaks are gone and the autocorrelations decay over the first lags]
  • 141. Electricity production in Australia: 1st-order differentiation • Y′t = Yt − Yt−1 [plot of the first-differenced series, 1981-1991]
  • 142. The differentiated series [ACF and PACF after differentiation: the first autocorrelation is negative (about −0.44) and spikes remain around lags 11-12]
  • 143. Conclusive remarks • AR models should be used only when the conditions shown by ACF and PACF are verified • AR models can be applied to series with trend by applying a first or second order differentiation – For instance: AR(1) with first order differentiation: ARI(1,1) • AR models can be applied to series with seasonality by applying a seasonal differentiation – For instance: AR(1,0)(0,1)12 (in case of seasonality every 12 periods) • Both differentiations can be combined: – ARI(1,1)(0,1)12 (in case of seasonality every 12 periods)
  • 144. An example: number of users of an internet service [plot of the series, roughly 80 to 230 users over 100 periods]
  • 145. An example: number of users of an internet service • Non-stationarity • The 1st autocorrelation is dominant • No seasonality • Let’s differentiate [series statistics: mean 137.08, std 39.80, N = 100; ACF decays slowly from 0.96; PACF has a dominant spike at lag 1]
  • 146. An example: number of users of an internet service [plot of the differenced series, roughly between −15 and +15]
  • 147. An example: number of users of an internet service • ACF: exponential decrease and sinusoidal • At least 3 partial autocorrelations are significant • → AR(3) [differenced series statistics: mean 1.28, std 5.64, N = 100; ACF decays from 0.79; PACF significant at lags 1-3]
  • 148. An example: number of users of an internet service: ARIMA(3,1,0) model
Term       Lag  Estimate     Std Error  t Ratio  Prob>|t|
AR1        1    1.14599656   0.0953656  12.02    <.0001
AR2        2    -0.6592887   0.1351023  -4.88    <.0001
AR3        3    0.33460154   0.0947145  3.53     0.0006
Intercept  0    0.97991731   1.6480961  0.59     0.5535
Constant Estimate: 0.17510197 [residual plot and residual ACF/PACF show no significant remaining patterns]
  • 149. Exercise 11B • Use JMP • Identify if differentiation is needed (first order or seasonal) • Detect and apply the correct AR model
  • 151. Regression and “Moving Average” (MA) • Similarly we can consider: Yt = b0 + b1·et−1 + b2·et−2 + … + bp·et−p + et • Pay attention: this is the moving average of the errors, not of the original series
  • 152. Moving average models • Do not produce regular patterns as AR models do → the ACF is usually significant only at lag 1 • The series theoretically always starts from the constant b0 (and not from the previous values) and varies according to the previous variations → that is why we need to rely on the PACF
  • 153. Moving average models • A first-order MA(1) model can be described as: Ŷt = b0 + b1·et−1, with b0 and b1 > 0 • The demand is supposed to be b0, but if it was higher in the previous period (et−1 > 0) then it is going to be higher in this period too – If b1 is positive the series tends to move in couples (high → high; low → low) – If b1 is negative the series moves in opposite couples (high → low; low → high)
  • 154. MA(1,1) = Simple exponential smoothing • Ŷt = b0 + b1·et−1 with et−1 = Yt−1 − Ŷt−1 • So we can re-write it as: Ŷt = b0 + b1·(Yt−1 − Ŷt−1) • In words: the forecast (Ŷt) depends on the previous demand (Yt−1) and the previous forecast (Ŷt−1) • It is the same principle as the simple exponential smoothing • In particular, an MA(1,1) with no constant (b0) is equivalent to the simple exponential smoothing • For the demonstration: https://onlinecourses.science.psu.edu/stat510/?q=node/70 • In conclusion: MA models are a broader class of smoothing models!
  • 155. Example 1 [series plot over ~100 periods, with its ACF and PACF]
  • 156. Example 2 [a second series plot over ~100 periods, with its ACF and PACF]
  • 158. AutoRegression and Moving Average • By joining the two models: AR + MA = ARMA; AR + I + MA = ARIMA • The ARMA model is good for stationary series • It can be adapted for non-stationary series by adding differentiation (the I component)
  • 159. Notation • AR: p = order of the autoregressive component • I: d = order of the differentiation • MA: q = order of the moving average component • The values of p, d, q can also be higher than one in case of additional autoregressive or moving average components; however, it is better to be conservative and not set more than one parameter higher than 2 • Examples: – Autoregressive: ARIMA (1,0,0) – Moving average: ARIMA (0,0,1) – Autoregressive with trend: ARIMA (1,1,0)
  • 160. Model identification • ARIMA model identification is highly complex; however, as a general guideline: 1. Check if the series is stationary → if not, differentiate until it is stationary 2. Check if the series has seasonality → if yes, differentiate (lag = period of the seasonality) until it is stationary 3. Check if the ACF has significant patterns → if yes, add an AR component 4. Check if the PACF has significant patterns → if yes, add an MA component 5. Repeat steps 3 and 4 until the ACF and PACF show no significant patterns • A sketch of this workflow in software follows
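A hedged sketch of fitting the identified model in software (an assumption: the course uses JMP, but statsmodels in Python exposes the same ARIMA(p,d,q) workflow; the series below is synthetic, for illustration only):

```python
# Fitting an ARIMA(p,d,q) once the orders have been identified from ACF/PACF.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(1.0, 5.0, 120))   # toy non-stationary (trended) series

model = ARIMA(y, order=(1, 1, 0)).fit()    # AR(1) with first-order differentiation
print(model.summary())                     # estimates, plus AIC/BIC for comparison
print(model.forecast(steps=12))            # 12-period forecast
```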
  • 161. Quality evaluation • A few suggestions for model design: – For the sake of simplicity start with an AR or an MA model, then apply an ARMA model and analyse the differences – Test different methods, but pay attention: • Usually estimation is done by min(MSE) • You can easily reduce the MSE by complicating the model (i.e., overfitting) • It is useful to consider measures that account for the complexity of the model – Residual analysis
  • 162. Quality measures • Mean Square Error • Error variance • R2 • Likelihood – Akaike’s Information Criterion (AIC): AIC = −2·log L + 2m, with m = p + q + P + Q and L the likelihood function; the lower the better – Schwarz Bayesian Information Criterion (BIC)