Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data Science | Simplilearn
The document provides an overview of implementing time series analysis using R, focusing on concepts like stationarity, ARIMA models, and forecasting methodologies. It discusses the components of time series, the significance of autocorrelation, and the process of model validation through the Ljung-Box test. Additionally, it illustrates practical examples of forecasting air-ticket sales data and the decomposition of time series into trend, seasonality, and irregularity components.
Before implementing Time Series, let’s quickly brush up on what we have discussed in Part 1 so far.
3.
What is a Time Series?
A Time Series is a sequence of data points recorded at specific time intervals
It is time-dependent
These data points (past values) are analyzed to forecast future values
4.
A Time Series is affected by four main components:
Trend, Seasonality, Cyclicity, Irregularity
5.
How do you differentiate between a Stationary and a Non-Stationary time series?
Stationarity of a Time Series depends on:
Mean
Variance
Co-Variance
6.
Stationarity of Time Series
The mean of the series should not be a function of time; rather, it should be constant
Stationary: here, the mean is constant with time
Non-Stationary: here, the mean is increasing with time
7.
What is that?
We will give a time code (t) variable to each row, indicating each time period; let’s discuss this variable later:
t    Year     Quarter   Sales (1000s)   MA(4)   CMA
1    Year 1   1         2.8             -       -
2    Year 1   2         2.1             -       -
3    Year 1   3         4               3.4     3.5
4    Year 1   4         4.5             3.6     3.7
5    Year 2   1         3.8             3.9     4.0
6    Year 2   2         3.2             4.1     4.2
7    Year 2   3         4.8             4.3     4.3
8    Year 2   4         5.4             4.4     4.4
9    Year 3   1         4               4.5     4.5
10   Year 3   2         3.6             4.6     4.7
11   Year 3   3         5.5             4.7     4.8
12   Year 3   4         5.8             4.8     4.8
13   Year 4   1         4.3             4.9     4.9
14   Year 4   2         3.9             5.0     5.1
15   Year 4   3         6               5.2     -
16   Year 4   4         6.4             -       -
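The MA(4) and CMA columns in the table can be reproduced with a few lines of R. This is a minimal sketch: the placement of each four-quarter average at the third quarter of its window is an assumption read off the table, and the variable names (sales, ma4, cma) are only illustrative.

```r
# Quarterly sales (in 1000s) copied from the table above
sales <- c(2.8, 2.1, 4.0, 4.5, 3.8, 3.2, 4.8, 5.4,
           4.0, 3.6, 5.5, 5.8, 4.3, 3.9, 6.0, 6.4)
n <- length(sales)

# MA(4): average of four consecutive quarters, stored at the third of them
# (placement assumed from the table, where the first value appears at t = 3)
ma4 <- rep(NA_real_, n)
for (i in 3:(n - 1)) {
  ma4[i] <- mean(sales[(i - 2):(i + 1)])
}

# CMA: average of two neighbouring MA(4) values, which centers the window on t
cma <- (ma4 + c(ma4[-1], NA)) / 2

print(data.frame(t = 1:n, sales, ma4 = round(ma4, 2), cma = round(cma, 2)))
```

The first and last rows stay NA because a full four-quarter window is not available there.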
8.
Forecast Time Series
Here we can see that the predicted values overlap with the actual values and continue till year 5, which shows the accuracy of our forecasted values
9.
Now, let’s move on with our implementation of Time Series using R
10.
What’s in it for you?
Introduction to ARIMA Model
Auto-Correlation & Partial Auto-Correlation
Use Case: Forecast the sales of air-tickets using ARIMA
Model Validation using the Ljung-Box Test
11.
We will use the ARIMA model to forecast the Time Series; let’s have a short introduction to the ARIMA model
12.
ARIMA stands for Auto Regressive Integrated Moving Average
It is specified by three order parameters: p,d,q
13.
ARIMA models are classified by three factors:
p = number of autoregressive terms (AR),
d = how many non-seasonal differences are needed to achieve stationarity (I),
q = number of lagged forecast errors in the prediction equation (MA)
14.
Auto-Regressive Parameter (AR), p
Degree of Differencing (I), d
Moving Average (MA), q
AR(p): number of autoregressive terms (AR)
Example: ARIMA(2,0,0) has a value of p as 2
In terms of a regression model, autoregressive components refer to prior values of the series that are used to predict the current value
17.
If x(t) is the current value, then the first AR component = x(t-1) * a
Where a = fitted coefficient
The second AR component would be x(t-2) and so on
These are often referred to as lagged terms. So the prior value is called the first lag, the one prior to that the second lag, and so on
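To make the lagged terms concrete, here is a small R sketch that builds the first and second lags of a short series and combines them into an AR(2) component. The series and the coefficients a1 and a2 are hypothetical values chosen only for illustration; they are not taken from any fitted model in this deck.

```r
x <- c(5, 4, 6, 7, 9, 12, 12)   # illustrative values only

# First and second lags: the series shifted back by one and two steps
lag1 <- c(NA, head(x, -1))
lag2 <- c(NA, NA, head(x, -2))

# Hypothetical fitted coefficients (assumed for this sketch)
a1 <- 0.6
a2 <- 0.3

# AR(2) component at each time step: a1 * x(t-1) + a2 * x(t-2)
ar_part <- a1 * lag1 + a2 * lag2

print(data.frame(x, lag1, lag2, ar_part))
```

The first two rows are NA because there is no second lag available before the third observation.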
18.
Degree of Differencing (I), d
It is equal to the number of non-seasonal differences needed to achieve stationarity
1 level of differencing means you take the current value and subtract the prior value from it
19.
Differencing: Subtracting prior values from the current values:
Values   1st Order Differencing   Result
5        NA                       NA
4        4-5                      -1
6        6-4                      2
7        7-6                      1
9        9-7                      2
12       12-9                     3
12       12-12                    0
If this series still shows a trend, then you can do another level of differencing on the first-level differenced series
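The differencing table can be checked in R with the base diff function. This is a minimal sketch using the same seven values; the second call shows what another level of differencing looks like.

```r
x <- c(5, 4, 6, 7, 9, 12, 12)    # the values from the table above

d1 <- diff(x)                    # first-order differencing: x(t) - x(t-1)
d2 <- diff(x, differences = 2)   # differencing the differenced series again

print(d1)   # values: -1 2 1 2 3 0
print(d2)   # values: 3 -1 1 1 -3
```

Note that diff drops the leading NA, so each level of differencing shortens the series by one value.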
20.
Moving Average (MA), q
It represents the error of the model as a combination of previous error terms e(t)
ACF/PACF
In order to test whether or not the series and its error terms are autocorrelated, we usually use:
Auto-correlation function (ACF)
Partial Auto-correlation function (PACF)
ACF
What is Auto-correlation?
Autocorrelation is the similarity between values of the same variable across observations
As the ACF crosses the blue dashed line, it shows that the values are significantly correlated, which suggests that the series is non-stationary.
28.
ACF
What is Auto-correlation?
• Auto-Correlation Function (ACF) tells you how correlated points are with each other, based on how many time steps they are separated by.
• It is used to determine how past and future data points are related in a time series. Its value can range from -1 to 1
29.
ACF
When we plot the ACF for our dataset, it crosses the blue dashed line, which indicates that the values are significantly correlated and suggests that the series is non-stationary.
R plots the 95% significance boundaries as blue dotted lines
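The ACF check can be sketched in R as follows. The built-in AirPassengers series (monthly, 1949-1960) is an assumption here, standing in for the air-ticket sales data used in this deck. The bound is the usual approximate 95% limit that acf draws when plotting.

```r
# Assumption: the built-in AirPassengers series stands in for the deck's data
x <- AirPassengers

# Autocorrelation at lags 0..24, computed without drawing the plot
r <- acf(x, lag.max = 24, plot = FALSE)

# Approximate 95% significance bound drawn as dashed lines on the ACF plot
bound <- qnorm(0.975) / sqrt(length(x))

# Lags (in months) whose autocorrelation lies outside the bound; lag 0 excluded
rho <- as.numeric(r$acf)[-1]
print(round(bound, 3))
print(which(abs(rho) > bound))
```

Calling acf(x) without plot = FALSE draws the familiar bar chart with the dashed boundaries.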
30.
PACF
What is Partial Auto-correlation?
Partial autocorrelation is the degree of association between two variables while adjusting for the effect of one or more additional variables
The PACF shows a definite pattern which does not repeat, so we can conclude that the data does not show any seasonality
31.
PACF
What is Partial Auto-correlation?
• PACF (Partial Auto-Correlation Function) gives the partial correlation of a time series with its own lagged values
• Its value can range from -1 to 1
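A short R sketch of the PACF next to the ACF, again assuming the built-in AirPassengers series as a stand-in dataset. At lag 1 there are no intermediate values to adjust for, so the two functions agree there; they differ from lag 2 onwards.

```r
x <- AirPassengers   # assumption: stand-in for the deck's dataset

r <- acf(x, lag.max = 24, plot = FALSE)
p <- pacf(x, lag.max = 24, plot = FALSE)

# acf() output starts at lag 0, pacf() output starts at lag 1
acf_lag1  <- as.numeric(r$acf)[2]
pacf_lag1 <- as.numeric(p$acf)[1]

print(c(acf_lag1 = acf_lag1, pacf_lag1 = pacf_lag1))
print(round(as.numeric(p$acf)[1:6], 3))   # first six partial autocorrelations
```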
32.
Use Case: Time Series Forecasting
Data Description: 12-year air-ticket sales data of the airline industry, from 1949 to 1960
Objective: To predict the airline ticket sales of 1961 using Time Series Analysis
35.
Use Case: Time Series Forecasting
Goal of Time Series
Time Series Behavior: Identification of the important parameters and characteristics which adequately describe the time series behavior
Identification of the Time Series components like trend and seasonality to describe the behavior
Time Series Forecasting: Forecasting the values of the Time Series, depending on its actual and past values
36.
Use Case: Exploratory Data Analysis
Load the data
Look at the data
It is a Time Series dataset
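Loading and inspecting the data might look like the sketch below. It assumes the built-in AirPassengers dataset, which matches the description (monthly airline passenger totals, 1949-1960); the deck's actual file or object name is not shown in the slides.

```r
# Assumption: built-in dataset used as a stand-in for the deck's data
data("AirPassengers")

print(class(AirPassengers))       # time series class
print(start(AirPassengers))       # first observation (year, month)
print(end(AirPassengers))         # last observation (year, month)
print(frequency(AirPassengers))   # observations per year

# Look at the first year of data
print(window(AirPassengers, end = c(1949, 12)))
```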
37.
Use Case: Exploratory Data Analysis
Let’s use the boxplot function to look for any seasonal effects:
46.
Use Case: Exploratory Data Analysis
Thus, we can make some initial inferences:
TREND: The passenger numbers increase over time, indicating an increasing linear trend
SEASONALITY: In the boxplot there are more passengers travelling in months 6 to 9, indicating seasonality with an apparent cycle of 12 months
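The seasonal boxplot and the "months 6 to 9" observation can be sketched in R like this, assuming the built-in AirPassengers series as the dataset. The monthly means are an extra numeric check and are not part of the original slides.

```r
x <- AirPassengers   # assumption: stand-in for the deck's dataset

# cycle() gives the position of each observation within the year (1..12)
boxplot(x ~ cycle(x), xlab = "Month", ylab = "Passengers (1000s)")

# Average passengers per calendar month across all years
monthly_mean <- tapply(as.numeric(x), as.integer(cycle(x)), mean)
print(round(monthly_mean, 1))

# The four months with the highest average, in calendar order
busiest <- sort(order(monthly_mean, decreasing = TRUE)[1:4])
print(busiest)
```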
47.
Use Case: Time Series Decomposition
We will decompose the Time Series
Decomposing means separating the original Time Series into its components (trend, seasonality, irregularity)
Using the decompose function in R
Applying the multiplicative model
48.
ddata = Decomposed data
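A minimal sketch of the decomposition step, assuming the built-in AirPassengers series as input. The name ddata follows the slide; the rebuild check at the end is an addition to show what "multiplicative" means.

```r
x <- AirPassengers   # assumption: stand-in for the deck's dataset

# Multiplicative decomposition: observed = trend * seasonal * random
ddata <- decompose(x, type = "multiplicative")

print(round(ddata$figure, 3))   # the 12 seasonal factors, one per month
plot(ddata)                     # observed, trend, seasonal and random panels

# Multiplying the components back together recovers the observed series
# (the trend is NA at both ends, so those points are skipped)
rebuilt <- ddata$trend * ddata$seasonal * ddata$random
ok <- !is.na(rebuilt)
print(all.equal(as.numeric(rebuilt)[ok], as.numeric(x)[ok]))
```

A seasonal factor above 1 marks a month that runs above the trend, and a factor below 1 marks a month that runs below it.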
50.
The data must have a constant variance and mean, right?
51.
Yes, you can easily model your data if it is Stationary
Use Case: Fit A Time Series Model
ARIMA Model
The ARIMA(2,1,1)(0,1,0)[12] model parameters are:
Autoregressive terms up to the second lag (p = 2),
Lag 1 differencing (d = 1),
A moving average term of order 1 (q = 1)
The seasonal part (0,1,0) has one level of seasonal differencing (D = 1) at a period of 12 units, in this case months
56.
Use Case: Fit A Time Series Model
The ARIMA fitted model is:
Ŷ(t) = 0.5960 Y(t-2) + 0.2143 Y(t-12) - 0.9819 e(t-1) + E
where E is error
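Fitting a model of this order can be sketched with the base R arima function. This assumes the built-in AirPassengers series as the data; the slides do not show which fitting function was used, so the call below is one possible way to fit ARIMA(2,1,1)(0,1,0)[12], not necessarily the original one.

```r
x <- AirPassengers   # assumption: stand-in for the deck's dataset

# Non-seasonal order (p, d, q) = (2, 1, 1);
# seasonal order (P, D, Q) = (0, 1, 0) with a period of 12 months
fit <- arima(x,
             order = c(2, 1, 1),
             seasonal = list(order = c(0, 1, 0), period = 12))

print(fit)         # coefficients, standard errors, log likelihood, AIC
print(coef(fit))   # ar1, ar2 and ma1 estimates
```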
Use Case: Diagnostics
Let’s plot the ACF for residuals:
The residuals are centered around 0 like noise, with no pattern. Hence, the ARIMA model is a fairly good fit
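The residual check can be sketched as below. It refits the same model on the built-in AirPassengers series (an assumption) so that the block runs on its own, then looks at which residual autocorrelations fall outside the approximate 95% bound.

```r
# Refit the model so this block is self-contained (data is an assumption)
fit <- arima(AirPassengers,
             order = c(2, 1, 1),
             seasonal = list(order = c(0, 1, 0), period = 12))
res <- residuals(fit)

# Residuals over time; they should scatter around the zero line
plot(res, ylab = "Residuals")
abline(h = 0, lty = 2)

# ACF of the residuals and the approximate 95% bound
r <- acf(res, lag.max = 24, plot = FALSE)
bound <- qnorm(0.975) / sqrt(length(res))
outside <- which(abs(as.numeric(r$acf)[-1]) > bound)
print(outside)   # lags (in months) outside the bound, if any
```

Few or no lags outside the bound is what a well-fitting model is expected to show.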
59.
Use Case: Calculate Forecast
You can plot a forecast of the Time Series using the forecast function, with a 95% confidence interval, where h is the forecast horizon in months
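The forecast function mentioned on the slide comes from an add-on package. As a sketch that runs with base R only, the same idea can be written with predict, building the 95% interval by hand. The dataset (AirPassengers) and the horizon h = 12 are assumptions for illustration.

```r
# Refit the model so this block is self-contained (data is an assumption)
fit <- arima(AirPassengers,
             order = c(2, 1, 1),
             seasonal = list(order = c(0, 1, 0), period = 12))

h <- 12                            # forecast horizon in months (assumed)
fc <- predict(fit, n.ahead = h)    # point forecasts and standard errors

# Approximate 95% interval: forecast +/- 1.96 standard errors
lower <- fc$pred - qnorm(0.975) * fc$se
upper <- fc$pred + qnorm(0.975) * fc$se
print(round(cbind(forecast = fc$pred, lower, upper), 1))

# Observed series followed by the forecast and its interval
ts.plot(AirPassengers, fc$pred, lower, upper,
        gpars = list(lty = c(1, 1, 2, 2), ylab = "Passengers (1000s)"))
```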
Use Case: Validation
Conclusion
We can conclude from the ARIMA output that our model using parameters (2, 1, 1) has been shown to adequately fit the data
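The Ljung-Box validation can be sketched with the base R Box.test function. The dataset (AirPassengers), the lag of 12 and the fitdf adjustment for the three estimated coefficients are assumptions of this sketch; the slides do not state which settings were used.

```r
# Refit the model so this block is self-contained (data is an assumption)
fit <- arima(AirPassengers,
             order = c(2, 1, 1),
             seasonal = list(order = c(0, 1, 0), period = 12))
res <- residuals(fit)

# Ljung-Box test on the residuals
# lag = 12 is an assumed choice; fitdf = 3 accounts for ar1, ar2 and ma1
lb <- Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 3)
print(lb)
```

The null hypothesis is that the residuals are not autocorrelated, so a large p-value means there is no evidence against the model's fit.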
67.
Summary
ARIMA model
ACF and PACF
Exploratory data analysis
Time series decomposition
Forecast and validation
Editor's Notes
#8 Now the moving average is centered, and is proper for 3rd quarter onwards
#48 to #50 Decomposing a time series means separating it into its constituent components, which are usually a trend component and an irregular component, and if it is a seasonal time series, a seasonal component