1. Module 3 - Time Series
CSC601.3: To understand data modeling in time series and its process.
3. Overview of Time Series Analysis
Time series analysis attempts to model the underlying structure of observations
taken over time.
A time series, denoted Y = {y_1, y_2, ..., y_n}, is an ordered sequence of equally spaced
values over time.
For example, the figure provides a plot of the monthly number of international airline
passengers over a 12-year period.
5. Overview of Time Series Analysis
Following are the goals of time series analysis:
• Identify and model the structure of the time series.
• Forecast future values in the time series.
Time series analysis has many applications in finance, economics, biology,
engineering, retail, and manufacturing.
6. USE CASES FOR TIME SERIES ANALYSIS
Retail sales:
For various product lines, a clothing retailer is looking to forecast future monthly sales.
These forecasts need to account for the seasonal aspects of the customer's
purchasing decisions.
For example, in the northern hemisphere, sweater sales are typically brisk in the fall
season, and swimsuit sales are the highest during the late spring and early summer.
An appropriate time series model needs to account for fluctuating demand over the
calendar year.
7. USE CASES FOR TIME SERIES ANALYSIS
Stock trading:
Some high-frequency stock traders utilize a technique called pairs trading.
In pairs trading, an identified strong positive correlation between the prices of two stocks is used to detect a market opportunity.
Suppose the stock prices of Company A and Company B consistently move together.
Time series analysis can be applied to the difference of these companies' stock prices over time.
A statistically larger than expected price difference indicates that it is a good time to buy the stock of Company A and sell the stock of
Company B, or vice versa.
This trading approach depends on the ability to execute the trade quickly and be able to detect when the correlation in the
stock prices is broken.
Pairs trading is one of many techniques that falls into a trading strategy called statistical arbitrage.
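The spread-monitoring idea behind pairs trading can be sketched with simulated data. Everything here is an illustrative assumption (hypothetical prices, an arbitrary z-score threshold of 2), not a trading recommendation:

```python
import numpy as np

rng = np.random.default_rng(9)
# Hypothetical prices: Company B's stock tracks Company A's plus noise
a = 100 + np.cumsum(rng.normal(size=250))
b = a + rng.normal(scale=2.0, size=250)

spread = a - b
z = (spread - spread.mean()) / spread.std()  # standardized price difference
signal = np.abs(z) > 2                       # unusually wide spread: candidate trade
print(int(signal.sum()))                     # number of flagged days
```

In practice the trader would also monitor whether the correlation itself holds up, closing out the strategy when the co-movement breaks down.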
8. Box-Jenkins Methodology
A time series consists of an ordered sequence of equally spaced values over time.
Examples of a time series are monthly unemployment rates, daily website visits, or stock prices
every second.
A time series can consist of the following components:
• Trend
• Seasonality
• Cyclic
• Random
9. Box-Jenkins Methodology
The trend refers to the long-term movement in a time series. It indicates whether
the observation values are increasing or decreasing over time. Examples of trends
are a steady increase in sales month over month or an annual decline of fatalities
due to car accidents.
The seasonality component describes the fixed, periodic fluctuation in the
observations over time.
As the name suggests, the seasonality component is often related to the calendar.
For example, monthly retail sales can fluctuate over the year due to the weather
and holidays.
10. Box-Jenkins Methodology
A cyclic component also refers to a periodic fluctuation, but one that is not as fixed as
in the case of a seasonality component.
For example, retail sales are influenced by the general state of the economy. Thus, a
retail sales time series can often follow the lengthy boom-bust cycles of the economy.
After accounting for the other three components, the random component is what
remains.
Although noise is certainly part of this random component, there is often some
underlying structure to this random component that needs to be modeled to forecast
future values of a given time series.
16. Box-Jenkins Methodology
Developed by George Box and Gwilym Jenkins, the Box-Jenkins methodology for
time series analysis involves the following three main steps:
1. Condition data and select a model.
• Identify and account for any trends or seasonality in the time series.
• Examine the remaining time series and determine a suitable model.
2. Estimate the model parameters.
3. Assess the model and return to Step 1, if necessary.
17. Time Series Object
A time series is a series of data points in which each data point is associated
with a timestamp. Time series play a major role in understanding how specific
factors behave with respect to time. Examples: the stock price at different points
of time on a given day, or the amount of sales in a region in different months of
the year.
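In Python, a time series of this kind is commonly represented as a pandas Series with a DatetimeIndex; the prices below are hypothetical:

```python
import pandas as pd

# Hypothetical closing prices keyed by a daily timestamp index
dates = pd.date_range("2024-01-01", periods=5, freq="D")
prices = pd.Series([101.2, 102.5, 101.8, 103.1, 104.0], index=dates)

print(prices.loc["2024-01-03"])  # 101.8
```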
18. Determining Stationarity
A time series is considered to be stationary if it has no trend or seasonal
effects and thus has consistent statistical properties over time. These
properties include the mean, variance, and autocovariance. Before applying
statistical modeling methods, the time series is required to be stationary. This
makes modeling easier and forecasting more reliable: if the series exhibits a
certain behavior over time, that behavior should remain the same in the future.
19. Making Time Series Stationary
There are two major factors that make a time series non-stationary.
They are trend (a non-constant mean) and seasonality (variation at specific
time frames). Hence, before forecasting, we need to make the series
stationary, which can be done by adjusting for the trend and seasonality. We can
then convert the forecasted values back into real values by reapplying the
trend and seasonality.
20. Adjusting Trend Using Smoothing
The first step is to reduce the trend using a transformation such as log, square
root, or cube root. Smoothing is the most common method to model the trend.
When the time series data have a significant irregular component, we want a
smoothed curve that reduces these fluctuations so that a pattern in the data can
be determined. In smoothing we usually use the past few instances (rolling
estimates). Smoothing is generally achieved using either the moving average
method or exponential smoothing.
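Both smoothing approaches are available directly on a pandas Series; a minimal sketch with made-up values:

```python
import pandas as pd

s = pd.Series([10.0, 12.0, 11.0, 15.0, 14.0, 16.0, 18.0, 17.0])

moving_avg = s.rolling(window=3).mean()  # average of the past 3 observations
exp_smooth = s.ewm(alpha=0.5).mean()     # exponentially decaying weights

print(moving_avg.iloc[2])  # (10 + 12 + 11) / 3 = 11.0
```

The rolling mean is undefined (NaN) until a full window of past values is available, which is why rolling estimates shorten the usable series slightly.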
21. Adjusting Seasonality and Trend
Most time series have trends along with seasonality. There are two common
methods to remove trend and seasonality: differencing and seasonal
decomposition.
22. Stationarity and differencing
A stationary time series is one whose properties do not depend on the
time at which the series is observed.
Thus, time series with trends, or with seasonality, are not stationary — the
trend and seasonality will affect the value of the time series at different
times.
On the other hand, a white noise series is stationary — it does not matter
when you observe it, it should look much the same at any point in time.
23. Stationarity
Some cases can be confusing — a time series with cyclic behaviour (but with no
trend or seasonality) is stationary. This is because the cycles are not of a fixed
length, so before we observe the series we cannot be sure where the peaks and
troughs of the cycles will be.
In general, a stationary time series will have no predictable patterns in the long-
term. Time plots will show the series to be roughly horizontal (although some
cyclic behaviour is possible), with constant variance.
25. Which of these series are stationary?
(a) Google stock price for 200 consecutive days;
(b) Daily change in the Google stock price for 200 consecutive days;
(c) Annual number of strikes in the US;
(d) Monthly sales of new one-family houses sold in the US;
(e) Annual price of a dozen eggs in the US (constant dollars);
(f) Monthly total of pigs slaughtered in Victoria, Australia;
(g) Annual total of lynx trapped in the McKenzie River district of north-west Canada;
(h) Monthly Australian beer production;
(i) Monthly Australian electricity production.
26. Which of these series are stationary?
Seasonality rules out series (d), (h) and (i).
Trends and changing levels rule out series (a), (c), (e), (f) and (i).
Increasing variance also rules out (i).
That leaves only (b) and (g) as stationary series.
27. Differencing
The figure shows that the Google stock price was non-stationary in
panel (a), but the daily changes were stationary in panel (b).
This shows one way to make a non-stationary time series
stationary — compute the differences between consecutive
observations. This is known as differencing.
28. Differencing
Transformations such as logarithms can help to stabilise the variance of a time
series.
Differencing can help stabilise the mean of a time series by removing changes in
the level of a time series, and therefore eliminating (or reducing) trend and
seasonality.
https://otexts.com/fpp2/stationarity.html
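Differencing is a one-liner on a pandas Series; the values below are made up:

```python
import pandas as pd

y = pd.Series([100.0, 103.0, 101.0, 106.0, 110.0])
first_diff = y.diff()  # y_t - y_{t-1}; the first value is NaN

print(first_diff.tolist()[1:])  # [3.0, -2.0, 5.0, 4.0]
```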
31. To distinguish seasonal differences from ordinary differences, we sometimes
refer to ordinary differences as “first differences”, meaning differences at lag
1.
Sometimes it is necessary to take both a seasonal difference and a first
difference to obtain stationary data.
Here, the data are first transformed using logarithms, then seasonal
differences are calculated.
If the data still seem somewhat non-stationary, a further round of first
differences is computed.
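The log-then-seasonal-difference-then-first-difference sequence can be sketched as follows, assuming a hypothetical monthly series with yearly (lag-12) seasonality:

```python
import numpy as np
import pandas as pd

# Hypothetical monthly series with a trend and yearly seasonality
t = np.arange(48)
y = pd.Series(100.0 + t + 5.0 * np.sin(2 * np.pi * t / 12))

logged = np.log(y)             # logarithms stabilise the variance
seasonal = logged.diff(12)     # seasonal difference at lag 12
stationary = seasonal.diff(1)  # a further first difference, if still needed

print(int(stationary.notna().sum()))  # 48 - 13 = 35 usable observations
```

Each differencing step consumes observations at the start of the series: 12 for the seasonal difference and one more for the first difference.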
32. When both seasonal and first differences are applied, it makes no difference which is
done first—the result will be the same.
But, if the data have a strong seasonal pattern, we recommend that seasonal differencing
be done first, because the resulting series will sometimes be stationary and there will be
no need for a further first difference.
If first differencing is done first, there will still be seasonality present.
It is important that if differencing is used, the differences are interpretable.
First differences are the change between one observation and the next.
Seasonal differences are the change between one year to the next. Other lags are unlikely
to make much interpretable sense and should be avoided.
33. Autoregressive models
In a multiple regression model, we forecast the variable
of interest using a linear combination of predictors. In an
autoregression model, we forecast the variable of
interest using a linear combination of past values of the
variable. The term autoregression indicates that it is a
regression of the variable against itself.
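A minimal illustration of this idea on simulated data: an AR(1) coefficient can be recovered by an ordinary regression of the series on its own lag. The coefficient 0.8 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulate an AR(1) process: y_t = 0.8 * y_{t-1} + e_t
n = 500
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.normal()

# Autoregression is literally a regression of the series against itself:
# estimate the coefficient by regressing y_t on y_{t-1}
phi_hat, intercept = np.polyfit(y[:-1], y[1:], 1)
print(phi_hat)  # typically close to 0.8
```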
38. ARIMA Modeling
Time series analysis can be used in a multitude of business applications for
forecasting a quantity into the future and explaining its historical patterns.
Exponential smoothing and ARIMA models are commonly used for time series
forecasting. Exponential smoothing models are based on a description of the
trend and seasonality in the data, while ARIMA models describe the
autocorrelations in the data.
41. ACF PLOTS
The ACF plot is a bar chart of the coefficients of correlation between a time series and its lagged values.
ACF explains how the present value of a given time series is correlated with the past (1-unit
past, 2-unit past, ..., n-unit past) values.
In the ACF plot, the y-axis expresses the correlation coefficient whereas the x-axis shows the
number of lags.
Assume that y(t), y(t-1), ..., y(t-n) are the values of a time series at times t, t-1, ..., t-n; then the
lag-1 value is the correlation coefficient between y(t) and y(t-1), lag-2 is the correlation
coefficient between y(t) and y(t-2), and so on.
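The lagged correlation coefficients behind an ACF plot can be computed directly; a small NumPy sketch on a simulated random walk:

```python
import numpy as np

def sample_acf(y, nlags):
    """Correlation between the series and its lag-k values, for k = 0..nlags."""
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.dot(y, y)
    return [np.dot(y[k:], y[:len(y) - k]) / denom for k in range(nlags + 1)]

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(size=500))  # random walk: strongly autocorrelated

r = sample_acf(y, 3)
print(r[0])  # 1.0 -- a series is perfectly correlated with itself at lag 0
```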
42. PACF
PACF is the partial autocorrelation function, which explains the partial correlation between the
series and its lags.
In simple terms, PACF can be explained using a linear regression where we predict y(t) from
y(t-1), y(t-2), and y(t-3) .
In PACF, we correlate the “parts” of y(t) and y(t-3) that are not predicted by y(t-1) and y(t-2).
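This regression view of the PACF can be made concrete: the PACF at lag 2 is the coefficient of y(t-2) in a regression of y(t) on y(t-1) and y(t-2). A sketch on a simulated AR(1) series, where that coefficient should be near zero:

```python
import numpy as np

rng = np.random.default_rng(7)
# AR(1) process: once y_{t-1} is accounted for, y_{t-2} adds nothing
n = 1000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.normal()

# PACF at lag 2 = coefficient on y_{t-2} when regressing y_t on y_{t-1}, y_{t-2}
X = np.column_stack([np.ones(n - 2), y[1:-1], y[:-2]])
beta, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
print(abs(beta[2]) < 0.15)  # True: near zero for an AR(1) process
```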
43. How to Interpret ACF and PACF plots for Identifying AR, MA,
ARMA, or ARIMA Models
In time series analysis, Autocorrelation Function (ACF) and the partial autocorrelation function
(PACF) plots are essential in providing the model’s orders such as p for AR and q for MA to
select the best model for forecasting.
47. AR MODEL
A tail-off is observed in the ACF plot; thus, it's an AR model. In the PACF, the cut-off happens
at lag 2. Thus, the order is 2, so it should be an AR(2) model.
49. MA Model
A tail-off at the PACF tells us that it's an MA model. The cut-off is at lag 1 in the ACF; thus,
it's an MA(1) model.
Note that there are some more spikes that go slightly above the threshold blue lines, such as
around lags 2 and 4. However, we always want a simplified model, so we usually take the lower
lag number with a significant spike, like the one at lag 1.
52. ARMA Model
In both the ACF and PACF plots, it's not clear whether they are tailing off or cutting off. That's where ARMA comes in.
With ARMA, the orders p and q for the AR and MA terms can each be more than one, so testing out a few p and q
combinations is advised to get a better AIC and BIC score.
To get the p value for the AR part, we look at the PACF plot. The spikes are at lags 1 and 3; thus the candidates are AR(1) and AR(3).
To get the q value, we look at the ACF plot. The spikes are at lags 1 and 3; thus the candidates are MA(1) and MA(3).
55. Identifying AR and MA orders by ACF and PACF plots:
Assume that the time series is stationary; if not, then we can perform a
transformation and/or differencing of the series to convert it into a
stationary process.
Once the series is stabilized, we can plot the ACF and PACF plots to
identify the orders of AR and MA terms in the ARMA model.
At times, only AR terms or only MA terms are sufficient to model the
process.
56. The ACF and PACF plots should be considered together to define the process.
For the AR process, we expect that the ACF plot will gradually decrease and simultaneously the
PACF should have a sharp drop after p significant lags.
To define a MA process, we expect the opposite from the ACF and PACF plots, meaning that:
the ACF should show a sharp drop after a certain q number of lags while PACF should show a
geometric or gradual decreasing trend.
On the other hand, if both ACF and PACF plots demonstrate a gradual decreasing pattern, then
the ARMA process should be considered for modeling.
58. Identify the model for the following plot
The ACF shows a gradually decreasing trend while the PACF cuts immediately after one lag.
59. Thus, the graphs suggest that an AR (1) model would be appropriate for the time series.
60. The figure shows the ACF and PACF for a stationary time series.
61. The ACF and PACF plots indicate that an MA (1) model would be appropriate for the time series
because the ACF cuts after 1 lag while the PACF shows a slowly decreasing trend.
63. The figure shows the ACF and PACF for another stationary time series.
Both ACF and PACF show slow decay (gradual decrease).
Hence, the ARMA (1,1) model would be appropriate for the series.
Again, observing the ACF plot: it sharply drops after two significant lags, which indicates that an
MA(2) would be a good candidate model for the process.
Therefore, we should experiment with both ARMA(1,1) and MA(2) for the process.
64. ACF and PACF plots:
After a time series has been stationarized by differencing, the next step in fitting an ARIMA model is to
determine whether AR or MA terms are needed to correct any autocorrelation that remains in the
differenced series.
By looking at the autocorrelation function (ACF) and partial autocorrelation (PACF) plots of the
differenced series, you can tentatively identify the numbers of AR and/or MA terms that are needed.
The ACF plot: it is merely a bar chart of the coefficients of correlation between a time series and lags of
itself.
The PACF plot is a plot of the partial correlation coefficients between the series and lags of itself.
In general, the "partial" correlation between two variables is the amount of correlation between them
which is not explained by their mutual correlations with a specified set of other variables.
For example, if we are regressing a variable Y on other variables X1, X2, and X3, the partial correlation
between Y and X3 is the amount of correlation between Y and X3 that is not explained by their common
correlations with X1 and X2. This partial correlation can be computed as the square root of the
reduction in variance that is achieved by adding X3 to the regression of Y on X1 and X2.
65. A partial autocorrelation is the amount of correlation between a variable and a lag of
itself that is not explained by correlations at all lower-order-lags.
The autocorrelation of a time series Y at lag 1 is the coefficient of correlation between Yt
and Yt-1, which is presumably also the correlation between Yt-1 and Yt-2. But if Yt is
correlated with Yt-1, and Yt-1 is equally correlated with Yt-2, then we should also expect
to find correlation between Yt and Yt-2.
In fact, the amount of correlation we should expect at lag 2 is precisely the square of the
lag-1 correlation. Thus, the correlation at lag 1 "propagates" to lag 2 and presumably to
higher-order lags.
The partial autocorrelation at lag 2 is therefore the difference between the actual
correlation at lag 2 and the expected correlation due to the propagation of correlation at
lag 1.
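This propagation rule can be verified numerically: for a simulated AR(1) series, the sample lag-2 correlation is close to the square of the lag-1 correlation, so the lag-2 partial autocorrelation, (r2 - r1^2)/(1 - r1^2), is near zero:

```python
import numpy as np

rng = np.random.default_rng(5)
# AR(1) with coefficient 0.6: lag-1 correlation "propagates" to lag 2
n = 5000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + rng.normal()

r1 = np.corrcoef(y[1:], y[:-1])[0, 1]   # lag-1 autocorrelation
r2 = np.corrcoef(y[2:], y[:-2])[0, 1]   # lag-2 autocorrelation
pacf2 = (r2 - r1**2) / (1 - r1**2)      # lag-2 correlation beyond propagation
print(abs(r2 - r1**2) < 0.05)  # True: the expected lag-2 value is r1 squared
```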
66. Reasons to Choose and Cautions
One advantage of ARIMA modeling is that the analysis can be based simply on historical time
series data for the variable of interest.
As observed in regression, various input variables need to be considered and evaluated for
inclusion in the regression model for the outcome variable.
Because ARIMA modeling, in general, ignores any additional input variables, the forecasting
process is simplified.
67. IMPORTANT QUESTION
1. Why use autocorrelation instead of autocovariance when examining
stationary time series?
2. When should an ARIMA(p,d,q) model in which d > 0 be considered instead
of an ARMA(p,q) model?
3. Describe the Box-Jenkins methodology for time series.