2. Outline
1 A brief history of forecasting
2 Types of data
3 Forecasting models
4 Some case studies
5 The statistical forecasting perspective
6 Introduction to R
1. Getting started A brief history of forecasting 2
3. Standard business practice today
“What-if scenarios” based on assumed and
fixed future conditions.
Highly subjective.
Not replicable or testable.
No possible way of quantifying probabilistic
uncertainty.
Lack of uncertainty statements leads to false
sense of accuracy.
Largely guesswork.
Is this any better than a sheep’s liver or
hallucinogens?
4. The rise of stochastic models
1959 exponential smoothing (Brown)
1970 ARIMA models (Box, Jenkins)
1980 VAR models (Sims, Granger)
1980 non-linear models (Granger, Tong, Hamilton,
Teräsvirta, ...)
1982 ARCH/GARCH (Engle, Bollerslev)
1986 neural networks (Rumelhart)
1989 state space models (Harvey, West, Harrison)
1994 nonparametric forecasting (Tjøstheim,
Härdle, Tsay, ...)
2002 exponential smoothing state space models
(Snyder, Hyndman, Koehler, Ord)
5. Advantages of stochastic models
Based on empirical data
Computable
Replicable
Testable
Objective measure of uncertainty
Able to compute prediction intervals
7. Types of data
Most forecasting problems use either
1 Time series data (collected at regular intervals
over time)
2 Cross-sectional data (collected at a single point in
time)
Time series examples
Daily IBM stock prices
Monthly rainfall
Annual Google profits
Quarterly Australian beer production
Forecasting is estimating how the sequence
of observations will continue into the future.
8. Australian beer production
[Figure: time plot of Australian beer production. Vertical axis: megalitres (about 400 to 500). Horizontal axis: Year (1995 to 2010).]
9. Types of data
Cross-sectional examples
House prices for all houses sold in 2009 in
Clayton. We are interested in predicting the
price of a house not in our data set using house
characteristics: position, number of bedrooms, age,
etc.
Fuel economy data for a range of 2009 model
cars. We are interested in predicting the carbon
footprint of a vehicle not in our data set using
information such as the size of the engine and
the fuel efficiency of the car.
10. Vehicle carbon footprints
Model                    Cyl.  Litres  City MPG  Highway MPG  Carbon footprint
Chevrolet Aveo           4     1.6     25.0      34           6.6
Chrysler PT Cruiser      4     2.4     19.0      24           8.7
Dodge Avenger            4     2.4     21.0      30           7.7
Ford Escape FWD          4     2.5     20.0      28           8.0
Ford Ranger Pickup 2WD   4     2.3     19.0      24           8.7
GMC Canyon 2WD           4     2.9     18.0      24           9.2
Honda Accord             4     2.4     21.0      30           7.7
Honda Civic              4     1.8     25.0      36           6.3
...
All vehicles with automatic transmission and using
regular fuel. How to predict carbon footprint (tons
of CO2 per year) for other vehicles?
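One possible answer, as a minimal sketch: fit a straight line relating carbon footprint to city MPG by least squares, using only the eight rows listed in the table. The choice of city MPG as the single predictor and the helper names `fit_line` and `predict_footprint` are assumptions made for illustration; a real analysis would use the full data set and probably more predictors.

```python
# City MPG and carbon footprint columns, copied from the eight table rows.
city_mpg = [25.0, 19.0, 21.0, 20.0, 19.0, 18.0, 21.0, 25.0]
footprint = [6.6, 8.7, 7.7, 8.0, 8.7, 9.2, 7.7, 6.3]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_line(city_mpg, footprint)


def predict_footprint(mpg):
    """Predicted carbon footprint (tons of CO2 per year) for a given city MPG."""
    return intercept + slope * mpg


# Prediction for a hypothetical vehicle that is not in the table.
print(f"slope = {slope:.3f}, prediction at 22 MPG = {predict_footprint(22.0):.2f}")
```

The fitted slope is negative, as expected: vehicles that travel further per gallon have a smaller footprint.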
12. Time series models
Time series models use only information on the
variable to be forecast
$\mathrm{ED}_{t+1} = f(\mathrm{ED}_t, \mathrm{ED}_{t-1}, \mathrm{ED}_{t-2}, \mathrm{ED}_{t-3}, \dots, \text{error})$,
where t is time and ED is electricity demand.
e.g., ARIMA models and exponential smoothing.
Useful when predictor variables not known or measured.
Useful if prediction of predictor variables difficult.
Doesn’t lead to much understanding of system
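As a minimal sketch of a model of this form, simple exponential smoothing forecasts the next value as a weighted average of past observations, with weights that decay geometrically into the past. The demand figures and the smoothing parameter `alpha = 0.3` are made up for illustration, and initialising the level at the first observation is only one common choice; in practice both would be estimated from the data.

```python
def ses_forecast(history, alpha):
    """One-step-ahead forecast by simple exponential smoothing.

    The level starts at the first observation and is updated as
    level = alpha * y + (1 - alpha) * level for each later observation.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical electricity demand observations, oldest first.
demand = [102.0, 98.0, 105.0, 110.0, 107.0]
forecast = ses_forecast(demand, alpha=0.3)
print(f"forecast for the next period: {forecast:.2f}")
```

Only past values of the series enter the forecast, which is why such a model needs no predictor variables but also says little about what drives demand.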
13. Cross-sectional models
Cross-sectional models assume that the variable to
be forecast is affected by one or more other
predictor variables.
$\mathrm{ED} = f(\text{current temperature}, \text{GDP}, \text{population}, \text{time of day}, \text{day of week}, \text{error})$.
e.g., regression models.
14. Mixed models
Mixed model
$\mathrm{ED}_{t+1} = f(\mathrm{ED}_t, \text{current temperature}, \text{time of day}, \text{day of week}, \text{error})$.
e.g., dynamic regression models, panel data
models, longitudinal models, transfer function
models
16. CASE STUDY 1: Paperware company
Client: large company manufacturing disposable tableware.
Problem: They want forecasts of each of hundreds of items.
Series can be stationary, trended or seasonal. They currently
have a large forecasting program written in-house but it
doesn’t seem to produce sensible forecasts. They want me to
tell them what is wrong and fix it.
Additional information
The program is written in COBOL, which limits the
numerical calculations. It is not possible to do any
optimisation.
Their programmer has little experience in numerical
computing.
They employ no statisticians and want the program to
produce forecasts automatically.
17. CASE STUDY 1: Paperware company
Methods currently used
A 12 month average
C 6 month average
E straight line regression over last 12 months
G straight line regression over last 6 months
H average slope between last year’s and this year’s
values.
(Equivalent to differencing at lag 12 and taking
mean.)
I Same as H except over 6 months.
K I couldn’t understand the explanation.
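To make the listed methods concrete, here is a rough sketch of A/C, E/G and H in Python. This is not the client's program: the function names, the one-month-ahead extrapolation in `line_forecast`, and the 24 months of sales figures are assumptions made for illustration.

```python
def mean_last(series, n):
    """Methods A and C: average of the last n monthly values."""
    return sum(series[-n:]) / n


def line_forecast(series, n):
    """Methods E and G: least squares line through the last n values,
    extrapolated one month ahead (the horizon is an assumption)."""
    y = series[-n:]
    x_bar = (n - 1) / 2
    y_bar = sum(y) / n
    sxx = sum((x - x_bar) ** 2 for x in range(n))
    sxy = sum((x - x_bar) * (yi - y_bar) for x, yi in enumerate(y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def mean_lag_difference(series, n=12, lag=12):
    """Method H as described: mean of y[t] - y[t - lag] over the last n months.
    Requires at least n + lag observations."""
    start = len(series) - n
    diffs = [series[t] - series[t - lag] for t in range(start, len(series))]
    return sum(diffs) / n


# Hypothetical sales: 24 months with a steady upward trend of 2 units per month.
sales = [100.0 + 2.0 * t for t in range(24)]
print(mean_last(sales, 12), line_forecast(sales, 12), mean_lag_difference(sales))
```

On a trended series like this one, the 12 month average lags well behind the latest values while the fitted line continues the trend, which illustrates why no single method suits stationary, trended and seasonal series alike.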
18. CASE STUDY 2: PBS
Client: Federal government
Problem: Develop methodology to forecast annual
budget for Pharmaceutical Benefits Scheme (around
$7 billion).
Additional information
At the time, they used Excel to fit a trend line
through three observations from about 10
years earlier.
All calculations must be done in Excel.
They have under-estimated expenditure by
nearly $1 billion in the last two years.
19. CASE STUDY 3: Car fleet company
Client: One of Australia’s largest car fleet
companies
Problem: how to forecast resale value of vehicles?
How should this affect leasing and sales policies?
Additional information
They can provide a large amount of data on
previous vehicles and their eventual resale
values.
The resale values are currently estimated by a
group of specialists. They see me as a threat
and do not cooperate.
20. CASE STUDY 4: Airline
Client: Ansett.
Problem: how to forecast passenger traffic on
major routes.
Additional information
They can provide a large amount of data on
previous routes.
Traffic is affected by school holidays, special
events such as the Grand Prix, advertising
campaigns, competition behaviour, etc.
They have a highly capable team of people who
are able to do most of the computing.
22. Statistical forecasting
Thing to be forecast: a random variable, $y_i$.
Forecast distribution: If $I$ is all observations,
then $y_i \mid I$ means “the random variable $y_i$ given
what we know in $I$”.
The “point forecast” is the mean (or median) of
$y_i \mid I$.
The “forecast variance” is $\mathrm{var}[y_i \mid I]$.
A prediction interval or “interval forecast” is a
range of values of $y_i$ with high probability.
With time series, $y_{t|t-1} = y_t \mid \{y_1, y_2, \dots, y_{t-1}\}$.
$\hat{y}_{T+h|T} = \mathrm{E}[y_{T+h} \mid y_1, \dots, y_T]$ (an $h$-step forecast
taking account of all observations up to time $T$).
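The three summaries above can be read off a forecast distribution directly. The sketch below assumes, purely for illustration, that the forecast distribution is normal with a mean of 450 and a standard deviation of 20; the slides do not state a distribution, and the helper `prediction_interval` is a made-up name.

```python
from statistics import NormalDist


def prediction_interval(mean, sd, level=0.95):
    """Central prediction interval for a normal forecast distribution."""
    dist = NormalDist(mu=mean, sigma=sd)
    tail = (1 - level) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Assumed forecast distribution (hypothetical numbers).
point_forecast = 450.0             # mean of the forecast distribution
forecast_sd = 20.0                 # its standard deviation
forecast_variance = forecast_sd ** 2

lower, upper = prediction_interval(point_forecast, forecast_sd)
print(f"point forecast {point_forecast}, variance {forecast_variance}")
print(f"95% prediction interval: [{lower:.1f}, {upper:.1f}]")
```

Under the normal assumption the mean and median coincide, so either definition of the point forecast gives the same value; for a skewed forecast distribution they would differ.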