This PowerPoint presentation was prepared as part of the course STAT 591, Master's Seminar, during the third semester of the MSc in Agricultural Statistics at Agricultural College, Bapatla, under ANGRAU, Andhra Pradesh.
2. “The future belongs to those who believe in
the beauty of their dreams.”
- Eleanor Roosevelt
3. Did you know that in 2020
• World population will reach 7,675 million (11% more than in 2010)
• Life expectancy will exceed 70 years
• 52% of the world population will be middle class (54% of this in Asia)
• There will be 2,000 million obese people
• There will be 6,000 million mobile phone users (5,000 million in 2010)
[source: www.tid.es]
6. INTRODUCTION
• “Forecasting is the process of making predictions of the future based on
past and present data and most commonly by analysis of trends”
- www.wikipedia.com
• “A planning tool that helps management in its attempts to cope with the
uncertainty of the future, relying mainly on data from the past and
present and analysis of trends.”
- www.businessdictionary.com
7. STAGES OF FORECASTING
1. Define the problem
2. State assumptions regarding the situation
3. Determine variables
4. Collect data
5. Analyse data
6. Determine forecast
7. Validation
8. Why forecasting?
Two basic reasons for the need for a forecast:
• Purpose – Any action devised in the present addresses a contingency arising
from a situation or set of conditions expected in the future. These future
conditions offer a purpose or target: to take advantage of them or, if they
are adverse in nature, to minimize their impact.
• Time – Preparing a plan, organizing resources for its implementation, and
implementing and completing the plan all need time as a resource. Some
situations need very little time; others need several years. Therefore, if a
forecast is available in advance, appropriate actions can be planned and
implemented in time.
10. Reviews
Green and Armstrong (2007) proposed a structured judgmental procedure. When predicting
decisions made in eight conflict situations, 46% of structured-analogies forecasts were
accurate. Among experts who were able to think of two or more analogies and who had
direct experience with their closest analogy, 60% of forecasts were accurate.
Enrique de Alba and Manuel Mendoza extended Foresight’s coverage of approaches to
forecasting seasonal data from short historical series (less than 2-3 years of data). They
described and illustrated a Bayesian method for modeling seasonal data and showed that
it could outperform traditional time series methods for short time series.
Nury et al. used ARIMA for short-term predictions of monthly maximum and minimum
temperatures in the Sylhet and Moulvibazar districts in north-east Bangladesh. For the
maximum and minimum temperatures at the Sylhet station, ARIMA (1,1,1) (1,1,1) and
ARIMA (1,1,1) (0,1,1), respectively, were obtained, whereas the respective models for the
Moulvibazar station were ARIMA (1,1,0) (1,1,1) and ARIMA (0,1,1) (1,1,1). Using these
ARIMA models, one-month-ahead forecasts of the temperatures at the two stations were
made for the years 2010 and 2011.
11. Wang and Wang (2016) studied agricultural product prices across 214 large-scale wholesale
markets in China. An ARIMA (3,1,2) model of cucumber prices gave highly accurate
short-term predictions for the Shandong Shouguang wholesale market.
Amin et al. (2014) developed time series models and identified the best one for forecasting
the wheat production of Pakistan using data from 1902-2005. They found that the best
model is ARIMA (1,2,2) and that wheat production of Pakistan would reach 26623.5
thousand tons in 2020 and would double by 2060 compared with 2010.
Vamsikrishna (2015) considered weather data for Visakhapatnam city with attributes such as
wind pressure, humidity, minimum and maximum temperature, forecast and type, over a
period of 97 days. ARIMA (1,1,0) was found to be appropriate for evaluating the
weather condition for the next 15 days.
Agarwal et al. (2012) studied the application of an ANN simulated in MATLAB to predict
maximum and minimum temperature using 60 years of data (1901-1960) and concluded that
a multilayered neural network can be an effective tool in weather prediction.
12. FORECASTING MODELS
• Forecasting models are broadly classified into
  Qualitative methods
  Quantitative methods
• Qualitative forecasting models are useful in developing forecasts with a
limited scope.
• These models are highly reliant on expert opinions and are most beneficial
in the short term.
13. Forecasting models
Qualitative
  1. Delphi method
  2. Jury of executive opinion
  3. Interactive approach
Quantitative
  Causal
    1. Simple Linear Regression
    2. Multiple Linear Regression
  Time series
    1. Naive
    2. Simple Average
    3. Simple Moving Average
    4. Weighted MA
    5. Exponential Smoothing
    6. Trend Adjusted ES
    7. Adaptive Response Rate ES
    8. Curve fitting
    9. ARIMA
14. Delphi Method
• A systematic method that involves structured interaction among a group of
experts on a subject.
• Includes at least two rounds of experts answering questions and giving
justification for their answers, providing the opportunity between rounds
for changes and revisions.
• The multiple rounds, which are stopped after a pre-defined criterion is
reached, enable the group of experts to arrive at a consensus forecast on the
subject being discussed.
15. Jury of Executive Opinion
• A group of managers or executives meets and collectively develops a forecast.
17. Simple Linear Regression
• Linear regression model with a single explanatory variable
• Fitted to two-dimensional sample points with one independent variable and one
dependent variable
• Parameters are estimated by ordinary least squares (OLS)
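As a minimal sketch of OLS estimation for a single explanatory variable, the snippet below fits a straight line to the Kharif rice yields from the deck's data set slide. Using the time index t = 1..10 as the explanatory variable is an assumption made only for illustration; the names `ols_fit`, `periods` and `forecast_2015` are hypothetical.

```python
# Kharif rice yield (t/ha), Guntur district, 2005-2014 (from the data set slide).
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]
periods = list(range(1, len(yields) + 1))  # assumed regressor: t = 1..10


def ols_fit(x, y):
    """Return (intercept, slope) that minimise the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = ols_fit(periods, yields)
forecast_2015 = intercept + slope * 11  # extrapolate the fitted line to t = 11
print(round(intercept, 3), round(slope, 4), round(forecast_2015, 2))
```

The closed-form estimates (slope = Sxy/Sxx, intercept = mean of y minus slope times mean of x) are the standard OLS solution for one regressor.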
18. Multiple Linear Regression
• Models the relationship between two or more explanatory variables and
a response variable
• Fits a linear equation to the observed data
• Every value of the independent variables x is associated with a value of the
dependent variable y
19. Time Series Models
Method – Description
• Naive – Uses the last period’s actual value as the forecast.
• Simple Average – Uses the average of all past data as the forecast.
• Simple MA – Uses the average of a specified number of the most recent
observations, with each observation receiving the same emphasis or weight.
• Weighted MA – Uses the average of a specified number of the most recent
observations, with each observation receiving a different emphasis or weight.
• Exponential Smoothing – A weighted average procedure with weights declining
exponentially as data become older.
• Trend Adjusted ES – An exponential smoothing model with a mechanism for making
adjustments when strong trend patterns are inherent in the data.
• Adaptive Response Rate ES – Similar to the simple exponential smoothing model.
However, when the errors vary over time, the constant “alpha” is adjusted
according to whether the errors are high or low. The smoothing parameter
therefore adapts to the data.
• Curve Fitting – Techniques that attempt to explain variation using statistical tools.
• ARIMA – An Auto-Regressive (AR) model coupled with a Moving Average (MA) model
and extended to non-stationary data.
20. Data set
The data below show the yield (t/ha) of Kharif rice in Guntur district
of Andhra Pradesh from 2005 to 2014.
Year          2005  2006  2007  2008  2009  2010  2011  2012  2013  2014
Yield (t/ha)  3.09  2.71  3.45  3.41  3.59  2.39  3.84  3.48  3.26  3.91
[Source: Directorate of Economics and Statistics, Andhra Pradesh]
21. Naive Method
• The forecast for time t+1, F(t+1), is the current yield Yt
Year  Time (t)  Current yield (Yt)  Forecast (Ft)
2005  1         3.09                -
2006  2         2.71                3.09
2007  3         3.45                2.71
2008  4         3.41                3.45
2009  5         3.59                3.41
2010  6         2.39                3.59
2011  7         3.84                2.39
2012  8         3.48                3.84
2013  9         3.26                3.48
2014  10        3.91                3.26
2015  11        -                   3.91
22. Method of Simple Average
• Forecast = average of all past yields
Year  Time (t)  Current yield (Yt)  Forecast (Ft)
2005  1         3.09                -
2006  2         2.71                3.09
2007  3         3.45                (3.09+2.71)/2=2.90
2008  4         3.41                (3.09+2.71+3.45)/3=3.08
2009  5         3.59                3.16
2010  6         2.39                3.25
2011  7         3.84                3.11
2012  8         3.48                3.21
2013  9         3.26                3.24
2014  10        3.91                3.25
2015  11        -                   3.31
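The simple-average rule can be sketched in a few lines. This is only an illustration using the yields from the data set slide; the helper name `simple_average_forecast` is hypothetical.

```python
years = list(range(2005, 2015))
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def simple_average_forecast(history):
    """Forecast the next period as the mean of every observation so far."""
    return sum(history) / len(history)


# Forecast for year+1 uses all yields observed up to and including `year`.
forecasts = {
    year + 1: simple_average_forecast(yields[: i + 1])
    for i, year in enumerate(years)
}

for year, value in forecasts.items():
    print(year, round(value, 3))
```

Every past observation carries the same weight, so the forecast reacts more and more slowly to new data as the series grows.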
23. Method of Simple Moving Average
• Forecast = average of a specified number of recent observations with equal
weights
• Consider a 2-year moving average
Year  Time (t)  Current yield (Yt)  Forecast (Ft)
2005  1         3.09                -
2006  2         2.71                -
2007  3         3.45                (3.09+2.71)/2=2.90
2008  4         3.41                (2.71+3.45)/2=3.08
2009  5         3.59                (3.45+3.41)/2=3.43
2010  6         2.39                3.50
2011  7         3.84                2.99
2012  8         3.48                3.11
2013  9         3.26                3.66
2014  10        3.91                3.37
2015  11        -                   3.58
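A minimal sketch of the 2-year moving average on the same yields follows. The function name `moving_average_forecast` and returning `None` when fewer than `window` observations are available are illustrative choices, not part of the original slide.

```python
years = list(range(2005, 2015))
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def moving_average_forecast(history, window=2):
    """Mean of the last `window` observations, or None if too few exist."""
    if len(history) < window:
        return None
    return sum(history[-window:]) / window


# Forecast for year+1 uses the yields observed up to and including `year`.
forecasts = {
    year + 1: moving_average_forecast(yields[: i + 1], window=2)
    for i, year in enumerate(years)
}

for year, value in forecasts.items():
    print(year, "-" if value is None else round(value, 3))
```

Changing `window` shows the trade-off directly: a longer window smooths more but lags further behind recent changes.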
24. Method of Weighted Moving Average
• Forecast = average of a specified number of recent observations with
unequal weights
• Consider a 3-year moving average with weights 0.5, 0.3 and 0.2 for the most
recent, second most recent and third most recent years respectively
Year  Time (t)  Current yield (Yt)  Forecast (Ft)
2005  1         3.09                -
2006  2         2.71                -
2007  3         3.45                -
2008  4         3.41                (3.45*0.5+2.71*0.3+3.09*0.2)/(0.5+0.3+0.2)=3.16
2009  5         3.59                (3.41*0.5+3.45*0.3+2.71*0.2)/(0.5+0.3+0.2)=3.28
2010  6         2.39                (3.59*0.5+3.41*0.3+3.45*0.2)/(0.5+0.3+0.2)=3.51
2011  7         3.84                2.95
2012  8         3.48                3.36
2013  9         3.26                3.37
2014  10        3.91                3.44
2015  11        -                   3.63
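The weighted moving average can be sketched as below, with the weights 0.5, 0.3 and 0.2 applied from the most recent year backwards, as in the worked rows of the table. The function name `weighted_ma_forecast` is hypothetical.

```python
years = list(range(2005, 2015))
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def weighted_ma_forecast(history, weights=(0.5, 0.3, 0.2)):
    """Weighted mean of the latest observations; weights[0] is the newest."""
    k = len(weights)
    if len(history) < k:
        return None
    newest_first = history[-k:][::-1]
    weighted_sum = sum(w * y for w, y in zip(weights, newest_first))
    return weighted_sum / sum(weights)


# Forecast for year+1 uses the yields observed up to and including `year`.
forecasts = {
    year + 1: weighted_ma_forecast(yields[: i + 1])
    for i, year in enumerate(years)
}

for year, value in forecasts.items():
    print(year, "-" if value is None else round(value, 3))
```

Dividing by the sum of the weights keeps the forecast on the scale of the data even if the chosen weights do not add up to one.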
25. Exponential Smoothing
• Forecast Ft+1 = α At + (1-α) Ft, where At is the actual yield and Ft the
forecast for period t
• α is the smoothing coefficient, ranging between 0 and 1
• Let α=0.2
Year  Time (t)  Current yield (Yt)  Forecast (Ft)
2005  1         3.09                -
2006  2         2.71                3.09
2007  3         3.45                (0.2*2.71)+(0.8*3.09)=3.01
2008  4         3.41                (0.2*3.45)+(0.8*3.01)=3.10
2009  5         3.59                (0.2*3.41)+(0.8*3.10)=3.16
2010  6         2.39                3.25
2011  7         3.84                3.08
2012  8         3.48                3.23
2013  9         3.26                3.28
2014  10        3.91                3.28
2015  11        -                   3.40
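The recursion Ft+1 = α At + (1-α) Ft can be sketched as follows, starting from F2 = A1 as in the table. The function name `exponential_smoothing` is hypothetical, and the code carries full precision between steps rather than rounding each forecast.

```python
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def exponential_smoothing(actuals, alpha=0.2):
    """Return the forecasts F_2 .. F_(n+1), taking F_2 = A_1."""
    forecasts = [actuals[0]]
    for actual in actuals[1:]:
        # New forecast = alpha * latest actual + (1 - alpha) * previous forecast.
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


forecasts = exponential_smoothing(yields, alpha=0.2)
for year, value in zip(range(2006, 2016), forecasts):
    print(year, round(value, 3))
```

Because each forecast reuses the previous one, the weight on an observation k periods back is α(1-α)^k, which is the exponential decline the method is named for.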
26. Principles to determine α
• For random and erratic time series data with no definite pattern, use a
smaller value of α.
• For random walk (randomly and smoothly walks up and down without
any repeating patterns) time series data, use a larger value of α.
• If a greater amount of smoothing is desired:
use a longer-period moving average, or use a smaller α value.
• If a smaller amount of smoothing is desired:
use a shorter-period moving average, or use a higher α value.
• Try different values of α in fitting the model and
choose the optimal α based on the minimum error.
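One way to act on the last principle is a small grid search over α. The sketch below assumes mean squared one-step-ahead error as the error criterion and a grid of 0.1 to 0.9; both are illustrative choices, and `one_step_mse` is a hypothetical helper.

```python
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def one_step_mse(actuals, alpha):
    """Mean squared one-step-ahead error of simple exponential smoothing."""
    forecast = actuals[0]  # F_2 = A_1
    squared_errors = []
    for actual in actuals[1:]:
        squared_errors.append((actual - forecast) ** 2)
        forecast = alpha * actual + (1 - alpha) * forecast
    return sum(squared_errors) / len(squared_errors)


# Assumed candidate grid: 0.1, 0.2, ..., 0.9.
candidates = [round(0.1 * k, 1) for k in range(1, 10)]
best_alpha = min(candidates, key=lambda a: one_step_mse(yields, a))

for a in candidates:
    print(a, round(one_step_mse(yields, a), 4))
print("alpha with minimum MSE:", best_alpha)
```

With only ten observations the error surface is flat and noisy, so the selected α should be read as indicative rather than definitive.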
27. Trend Adjusted Exponential Smoothing
• Uses measurable historical data and forecasts by calculating a weighted
average of the current actual value and the current forecast, with a trend
added to it.
• AF(t+1) = F(t+1) + T(t+1), the trend-adjusted forecast for the next period
• T(t+1) = β[F(t+1)-F(t)] + (1-β)*T(t), the trend factor for the next period
• T(t) = trend factor for the current period
• β = trend adjustment factor
• Let α=0.2, β=0.6 and the trend factor for period 2, T(2), be 0
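A sketch of these formulas on the yield data follows. It assumes that F(t+1) is the simple exponential smoothing forecast αA(t) + (1-α)F(t) with F(2) = A(1); the slide does not state this explicitly, so treat it as one possible reading. The function name `trend_adjusted_es` is hypothetical.

```python
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def trend_adjusted_es(actuals, alpha=0.2, beta=0.6):
    """Return the adjusted forecasts AF(2) .. AF(n+1)."""
    forecast = actuals[0]  # assumed start: F(2) = A(1)
    trend = 0.0            # T(2) = 0, as on the slide
    adjusted = [forecast + trend]
    for actual in actuals[1:]:
        next_forecast = alpha * actual + (1 - alpha) * forecast
        # T(t+1) = beta * [F(t+1) - F(t)] + (1 - beta) * T(t)
        trend = beta * (next_forecast - forecast) + (1 - beta) * trend
        forecast = next_forecast
        adjusted.append(forecast + trend)  # AF(t+1) = F(t+1) + T(t+1)
    return adjusted


adjusted = trend_adjusted_es(yields)
for year, value in zip(range(2006, 2016), adjusted):
    print(year, round(value, 3))
```

For 2007, for example, F(3) = 3.014 and T(3) = 0.6 × (3.014 - 3.09) = -0.0456, giving AF(3) = 2.9684.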
29. Adaptive Response Rate Exponential Smoothing
• Ft+1 = αt At + (1-αt) Ft
• αt+1 = | Et / Mt |
• Mt = β |et| + (1-β) Mt-1
• Et = β et + (1-β) Et-1
• et = At - Ft
• At is the actual yield in period t
• Ft is the forecasted yield for period t
• et is the error term in period t
• Mt is the smoothed absolute error
• Et is the smoothed error
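The recursions above can be sketched as below. The slide gives no starting values, so β = 0.2, an initial α of 0.2, F2 = A1 and E1 = M1 = 0 are all assumptions made for illustration, and the function name `arres` is hypothetical.

```python
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def arres(actuals, beta=0.2, initial_alpha=0.2):
    """Return (forecasts F_2..F_(n+1), alphas alpha_2..alpha_(n+1))."""
    forecast = actuals[0]    # assumed start: F_2 = A_1
    alpha = initial_alpha    # assumed alpha_2
    smoothed_error = 0.0     # assumed E_1 = 0
    smoothed_abs_error = 0.0  # assumed M_1 = 0
    forecasts, alphas = [forecast], [alpha]
    for actual in actuals[1:]:
        error = actual - forecast  # e_t = A_t - F_t
        smoothed_error = beta * error + (1 - beta) * smoothed_error
        smoothed_abs_error = beta * abs(error) + (1 - beta) * smoothed_abs_error
        # F_(t+1) uses alpha_t, the value set in the previous period.
        forecast = alpha * actual + (1 - alpha) * forecast
        if smoothed_abs_error > 0:
            alpha = abs(smoothed_error / smoothed_abs_error)  # alpha_(t+1)
        forecasts.append(forecast)
        alphas.append(alpha)
    return forecasts, alphas


forecasts, alphas = arres(yields)
for year, f, a in zip(range(2006, 2016), forecasts, alphas):
    print(year, round(f, 3), round(a, 3))
```

Since |Et| can never exceed Mt under these starting values, the adapted α always stays between 0 and 1.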
31. Curve fitting
• Fits a suitable curve to the data, explains the variation and forecasts
using statistical techniques.
Method – Remarks
• Linear – attempts to fit a straight line to the data
• Exponential – uses an increasing or decreasing curve rather than a straight
line; useful when it is known that there has been growth or decline in past
periods
• Power – produces a forecast curve that increases or decreases at a different rate
• Logarithmic – uses an alternate logarithmic model
• Gompertz – attempts to fit a 'Gompertz' or 'S' curve
• Logistic – attempts to fit a ‘Logistic’ curve
• Parabola or Quadratic – attempts to fit a 'Parabolic' (second order
polynomial) curve and forecast for a damped data series
• Compound – produces a forecast curve that increases or decreases at a compound rate
• Growth – produces a forecast curve at an estimated growth rate
• Cubic – attempts to fit a 'Cubic' curve
• Inverse – attempts to fit an 'Inverse' curve
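As a hedged illustration of comparing two of these curves, the sketch below fits a linear and an exponential trend to the yield data and reports the sum of squared errors of each. Fitting the exponential curve by regressing ln(Y) on t is an assumed (common) shortcut, and the helper names `ols_line` and `sse` are hypothetical.

```python
import math

yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]
periods = list(range(1, len(yields) + 1))


def ols_line(x, y):
    """Least-squares (intercept, slope) of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def sse(actual, fitted):
    """Sum of squared differences between actual and fitted values."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))


# Linear curve: Y = a + b*t
lin_a, lin_b = ols_line(periods, yields)
linear_fit = [lin_a + lin_b * t for t in periods]

# Exponential curve: Y = a * exp(b*t), fitted as ln(Y) = ln(a) + b*t
log_a, exp_b = ols_line(periods, [math.log(y) for y in yields])
exp_a = math.exp(log_a)
exponential_fit = [exp_a * math.exp(exp_b * t) for t in periods]

print("linear SSE:", round(sse(yields, linear_fit), 4))
print("exponential SSE:", round(sse(yields, exponential_fit), 4))
```

The same pattern extends to the other curves in the list: transform or expand the regressors, fit, and compare the error measure on the original scale.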
33. Auto Regressive Integrated Moving Average (ARIMA)
• Popularised by Box and Jenkins
• Applied to non-stationary data
Stationarity of a TS process
• Underlying generating process has a constant mean and constant variance
• Autocorrelation function (ACF) is constant over time
• Different subsets of the time series sample have equal mean, variance and ACF
• Tested by the Dickey-Fuller test
Autocorrelation
• Observations are related to each other
• rp is the simple correlation between Yt and Yt-p
• p is the lag period
• rp ranges from -1 to +1
• Maximum number of rp = n/4 (n = number of time periods)
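A small sketch of the lag-p autocorrelation follows, using the usual sample estimator that centres both Yt and Yt-p on the overall mean and divides by the total sum of squares. That estimator is an assumption about what the slide intends; the function name `autocorrelation` is hypothetical.

```python
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def autocorrelation(series, lag):
    """Sample autocorrelation r_p between Y_t and Y_(t-p)."""
    n = len(series)
    mean = sum(series) / n
    total = sum((value - mean) ** 2 for value in series)
    cross = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return cross / total


max_lag = len(yields) // 4  # rule of thumb from the slide: at most n/4 lags
acf = {lag: autocorrelation(yields, lag) for lag in range(0, max_lag + 1)}

for lag, r in acf.items():
    print(lag, round(r, 3))
```

With n = 10 the n/4 rule allows only two lags, which is one reason ARIMA modelling is usually reserved for longer series.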
34. Auto Regressive Integrated Moving Average (ARIMA)
Partial Autocorrelation
• Degree of association between Yt and Yt-p when the effects of Y at the other
time lags 1,2,3,...,p-1 are removed
[Figure: ACF/PACF plotted against lag number]
35. Auto Regressive Integrated Moving Average (ARIMA)
• Component models: AR, MA and ARIMA
Note: ϵt are independently and normally distributed with zero mean and
constant variance σ² for t=1,2,3,...,n
• Characterised by the notation ARIMA (p,d,q)
• p – order of autoregression – from the PACF
• d – order of integration (differencing)
• q – order of the moving average – from the ACF
• A first order AR process, denoted ARIMA(1,0,0), is simply AR(1)
• A first order MA process, denoted ARIMA(0,0,1), is simply MA(1)
• ARIMA (p,0,q) is simply ARMA (p,q)
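For reference, the AR, MA and ARIMA models named on this slide can be written in their standard textbook forms. The constant c, the mean μ, the backshift operator B (with B Yt = Yt-1) and the plus-sign convention for the θ terms are assumptions of this sketch rather than notation taken from the slide; ϵt is the error term described in the note.

```latex
\begin{align*}
\text{AR}(p):\quad & Y_t = c + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \epsilon_t \\
\text{MA}(q):\quad & Y_t = \mu + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q} \\
\text{ARIMA}(p,d,q):\quad & \Bigl(1 - \sum_{i=1}^{p} \phi_i B^i\Bigr)(1 - B)^d\, Y_t
   = c + \Bigl(1 + \sum_{j=1}^{q} \theta_j B^j\Bigr)\epsilon_t
\end{align*}
```

Setting d = 0 in the last line gives the ARMA(p,q) model, and setting q = 0 or p = 0 as well recovers the pure AR or MA model.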
37. Identification
• Check for stationarity, which is needed to obtain estimates
• Stationarity in mean and stationarity in variance
• Look at the graph of the data or at the ACF and PACF
• Fit an AR model: if |φ1| < 1, the series is stationary
• If non-stationary, apply differencing
• Log transformation gives stationarity in variance
• Obtain p from the PACF and q from the ACF
• Hence the model is ARIMA (p,d,q)
Model  ACF                        PACF
AR     Spikes decay towards zero  Spikes cut off to zero
MA     Spikes cut off to zero     Spikes decay towards zero
ARMA   Spikes decay towards zero  Spikes decay towards zero
38. Estimation
• Obtain precise estimates of the model parameters
• By the Ordinary Least Squares (OLS) method
• SAS and SPSS obtain the estimates through iteration
Diagnosis
• Model adequacy checking
• Analysis of residuals
• Overall model adequacy is checked by a Q statistic:
the Box-Pierce statistic or the Ljung-Box statistic
• Q follows a χ² distribution with (k-m1) df
Forecasting
• Done using the best model selected
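The two Q statistics can be sketched from residual autocorrelations using their standard definitions, Q(BP) = n Σ r_k² and Q(LB) = n(n+2) Σ r_k²/(n-k), summed over lags k = 1..K. The residuals below are naive one-step errors from the yield data, used only as a stand-in for the residuals of a fitted ARIMA model; the helper names are hypothetical.

```python
yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    total = sum((value - mean) ** 2 for value in series)
    cross = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return cross / total


def q_statistics(residuals, max_lag):
    """Return (Box-Pierce Q, Ljung-Box Q) over lags 1..max_lag."""
    n = len(residuals)
    r = [autocorrelation(residuals, k) for k in range(1, max_lag + 1)]
    box_pierce = n * sum(rk ** 2 for rk in r)
    ljung_box = n * (n + 2) * sum(
        rk ** 2 / (n - k) for k, rk in enumerate(r, start=1)
    )
    return box_pierce, ljung_box


# Stand-in residuals: errors of the naive forecast, e_t = Y_t - Y_(t-1).
residuals = [yields[t] - yields[t - 1] for t in range(1, len(yields))]
box_pierce, ljung_box = q_statistics(residuals, max_lag=2)
print(round(box_pierce, 3), round(ljung_box, 3))
```

The Ljung-Box form reweights each lag by (n+2)/(n-k), so it is never smaller than the Box-Pierce value; the difference matters most in short series like this one.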
40. Measurement of Forecast Accuracy
• Two forecasting models may be ranked differently depending on the accuracy
measure used.
• The naive model is used to compute the benchmark accuracy.
• With error $e_t = A_t - F_t$ over $n$ periods, the measures are:
• Mean Error (ME) = $\frac{1}{n}\sum_t e_t = \frac{1}{n}\sum_t (A_t - F_t)$
• Mean Absolute Error (MAE) = $\frac{1}{n}\sum_t |e_t|$
• Sum of Squared Error (SSE) = $\sum_t e_t^2 = \sum_t (A_t - F_t)^2$
• Mean of Squared Error (MSE) = $\frac{1}{n}\sum_t e_t^2 = \frac{1}{n}\sum_t (A_t - F_t)^2$
• Root of Mean Squared Error (RMSE) = $\sqrt{\frac{1}{n}\sum_t (A_t - F_t)^2}$
• Percent Error (PEt) = $\frac{A_t - F_t}{A_t} \times 100$
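These measures can be sketched for the naive benchmark on the yield data, where the forecast for each year from 2006 to 2014 is the previous year's yield. The function name `accuracy_measures` and the dictionary keys are hypothetical.

```python
import math

yields = [3.09, 2.71, 3.45, 3.41, 3.59, 2.39, 3.84, 3.48, 3.26, 3.91]

actuals = yields[1:]     # A_t for 2006..2014
forecasts = yields[:-1]  # naive F_t = previous year's yield


def accuracy_measures(actual, forecast):
    """Compute ME, MAE, SSE, MSE, RMSE and the per-period percent errors."""
    errors = [a - f for a, f in zip(actual, forecast)]  # e_t = A_t - F_t
    n = len(errors)
    sse = sum(e ** 2 for e in errors)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "SSE": sse,
        "MSE": sse / n,
        "RMSE": math.sqrt(sse / n),
        "PE": [100 * e / a for e, a in zip(errors, actual)],
    }


measures = accuracy_measures(actuals, forecasts)
for name in ("ME", "MAE", "SSE", "MSE", "RMSE"):
    print(name, round(measures[name], 4))
```

Running the same function on the forecasts of any other method gives figures that can be compared directly against this naive benchmark.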
42. Case studies
• Green, K. C. and Armstrong, J. S. 2007. Structured Analogies for
Forecasting. International Journal of Forecasting 23 (2007) 365–37.
• de Alba, E. and Mendoza, M. 2007. Bayesian Forecasting Method
for Short Time Series.
• Wang, X. and Wang, C. 2016. Empirical Study on Agricultural Products
Price Forecasting based on Internet-based Timely Price Information.
International Journal of Advanced Science and Technology Vol.87 (2016), pp.31-36.
• Amin, M., Amanullah, M. and Akbar, A. 2014. Time Series Modeling for
Forecasting Wheat Production of Pakistan. The Journal of Animal and Plant
Sciences 24(5): 1444-1451.
43. Case study 1: Structured Analogies for Forecasting
• The use of analogies is subject to biases.
• The authors hypothesized that forecasts derived from an expert's structured
analysis of analogies would be more accurate than forecasts by experts who
used their unaided judgment.
• The procedure involves five steps:
(1) Administrator describes the target situation
(2) Administrator selects experts
(3) Experts identify and describe analogies
(4) Experts rate similarity
(5) Administrator derives forecasts
• Eight conflict situations were used in this research, with between three
and six possible outcome options provided for each of them.
44. Conflict situations
• Artists protest: Members of a rich nation's artists' union occupied a major gallery and
demanded generous financial support from their government. What will be the final
resolution of the artists' sit-in? (6 options)
• Distribution channel: An appliance manufacturer proposed to a supermarket chain a
novel arrangement for retailing its wares. Will the management of the supermarket
chain agree to the plan? (3 options)
• 55% Pay plan: Professional sports players demanded a 55% share of gross revenues and
threatened to go on strike if the owners didn't concede. Will there be a strike, and if so,
how long will it last? (4 options)
• Nurses dispute: Angry nurses increased their pay demand and threatened more strike
action after specialist nurses and junior doctors received big increases. What will the
outcome of their negotiations be? (3 options)
• Personal grievance: An employee demanded a meeting with a mediator when her job
was downgraded after her new manager re-evaluated it. What will be the outcome of
the meeting? (4 options)
• Telco takeover: An acquisitive telecommunications provider, after rejecting a seller's
mobile business offer, made a hostile bid for the corporation. How will the standoff
between the companies be resolved? (4 options)
• Water dispute: Troops from neighbouring nations moved to their common border, and
the downstream nation threatened to bomb the upstream nation's new dam. Will the
upstream neighbour agree to release additional water, and if not, how will the
downstream nation's government respond? (3 options)
• Zenith investment: Under political pressure, a large manufacturer evaluated an
investment in expensive new technology. How many new manufacturing plants will it
decide to commission? (3 options)
45. • Four treatments:
1. Unaided judgment (no instructions on how to forecast) without
collaboration.
2. Unaided judgment with collaboration
3. Structured analogies without collaboration
4. Structured analogies with collaboration
46. Results & Conclusions:
• The forecasts from structured analogies were more accurate than those from
unaided judgment for seven of the eight conflicts.
• The structured analogies forecasts were 46% accurate, compared to 28%
for chance.
• Structured analogies reduced the average forecast error by 21% compared
to unaided judgment forecasts (where forecast error is the percentage of
forecasts that were wrong).
• Forecasts from solo experts were on average 44% accurate across conflicts
(75 forecasts), compared to 42% for forecasts by collaborating experts (22
forecasts).
• The analysis therefore did not need to distinguish between solo and
collaborative forecasts.
47. Case study 2: Bayesian Forecasting Method for Short Time Series
• Considered the number of foreign tourists visiting Mexico.
• Used a binomial model and non-informative priors to forecast the whole-year
total for 1991, given monthly data for 1989 and 1990.
• For a particular month, say January 1991, data from 1989 and 1990 were used
to estimate the proportion p of tourists that visit Mexico every January.
• p is the proportion of tourist volume in January to the total number of
tourists of the year.
• The inverse of this proportion (1/p) is used as an expansion factor:
Forecast of 1991 total = January 1991 count / p
49. Comparison of Bayesian and ARIMA procedures
• Monthly average residential electricity usage in Iowa City from 1971–1978
Traditional time series models such as ARIMA are better for long series,
while Bayesian forecasts outperform them in the case of short series.
Bayesian procedures can therefore be effective if only a small amount of past
data is available.
50. Case study 3: Empirical Study on Agricultural Products Price
Forecasting based on Internet-based Timely Price Information
• Using actual data collected by crawler technology, the study estimates the
price trend of a given product in a specific market with a time series model.
• Data: 955 daily cucumber prices between 4 Jan 2011 and 15 Aug 2013 from the
Xinxin Agricultural Products Service Platform.
• An ARIMA model of cucumber prices was established to forecast the short-term
price between 16 Aug 2013 and 3 Sep 2013.
52. • ARIMA(3,1,2) was found to be optimum
Conclusions:
(1) The daily cucumber price fluctuates due to randomness and seasonal
factors. The price is low in the middle of each year, which is probably
connected with the harvest season.
(2) The ARIMA model can provide high accuracy of cucumber price
prediction and can be extended to other agricultural products in
other markets.
53. Case study 4: Time Series Modelling for Forecasting Wheat
Production of Pakistan
• Data covering a long period (1902-2005) were used
55. Conclusion:
(1) The best time series model for forecasting wheat production of
Pakistan is ARIMA (1,2,2).
(2) This model has lower AIC and SBIC than the other fitted time
series models.
(3) Wheat production of Pakistan would reach 26623.5 thousand
tons in 2020 and would double by 2060 compared with 2010,
assuming that no irregular movement or variation occurs.
56. Conclusions
• No forecasting technique is appropriate for all situations.
• Choosing a technique depends on the pattern of the data, the accuracy
desired, the time permitted to develop the forecast, the complexity of
the situation to be explained, the time period to be projected, the
amount of money available to carry out the forecast and the experience of
the forecaster.
• A variable being forecast is generated through the interaction of a number
of factors too complex to describe accurately in a model. Therefore, all
forecasts contain some error. Selecting a model with minimum forecast error
is important for accurate results.
57. Literature cited
• Green, K. C. and Armstrong, J. S. 2007. Structured Analogies for Forecasting.
International Journal of Forecasting 23 (2007) 365–37.
• de Alba, E. and Mendoza, M. 2007. Bayesian Forecasting Method for Short
Time Series.
• Agarwal, A., Kumar, V., Pandey, A. and Khan, I. 2012. An
Application of Time Series Analysis for Weather Forecasting. International Journal of
Engineering Research and Applications (IJERA) Vol. 2, Issue 2, Mar-Apr 2012, pp.974-980.
• Nury, A. H., Koch, M. and Alam, M. J. B. Time Series Analysis and Forecasting of
Temperatures in the Sylhet Division of Bangladesh.
• Wang, X. and Wang, C. 2016. Empirical Study on Agricultural Products Price
Forecasting based on Internet-based Timely Price Information. International Journal of
Advanced Science and Technology Vol.87 (2016), pp.31-36.
• Amin, M., Amanullah, M. and Akbar, A. 2014. Time Series Modeling for Forecasting
Wheat Production of Pakistan. The Journal of Animal and Plant Sciences 24(5): 1444-1451.
• Vamsikrishna, G. 2015. An Integrated Approach for Weather Forecasting based on Data
Mining and Forecasting Analysis. International Journal of Computer Applications (0975
– 8887) Volume 120 – No.11, June 2015.