1. TIME SERIES
FORECASTING USING
TBATS MODEL
Team Members:
• D22017 – P Gowtham Kumar
• D22041 – R Sai Sasi Sekhar
• D22049 – B Vinay Kumar
Course: TF
2. Introduction
• TBATS is a forecasting method for
modeling time series data.
• TBATS is an acronym for the key
features of the model: Trigonometric
seasonality, Box-Cox transformation,
ARMA errors, Trend and Seasonal
components.
• Its main aim is to forecast time
series with complex seasonal
patterns using exponential
smoothing.
• Several types of seasonality can
be present at once (e.g., time of
day, daily, weekly, monthly, yearly).
3. ADVANTAGES OF TBATS
• Many time series exhibit complex and multiple seasonal patterns (e.g., hourly data that
contains a daily pattern, weekly pattern and an annual pattern).
• The most popular models (e.g., ARIMA and exponential smoothing) can only account for
one seasonal period.
• The TBATS model can handle complex seasonalities (e.g., non-integer
seasonality, non-nested seasonality and large-period seasonality) with no seasonality
constraints, making it possible to create detailed, long-term forecasts.
4. How to Forecast Time Series With Multiple Seasonalities
• We often encounter seasonality in time
series. Seasonality is a periodic
variation in the series: a cycle that
repeats over a fixed period.
• In the air passenger data, we can
clearly see a seasonal cycle: every year,
the number of air passengers peaks
around the month of July and then
falls again.
• To forecast this series, we can simply use
a SARIMA model, since there is only one
seasonal period, with a length of one
year (12 monthly observations).
Monthly total number of air passengers for an airline, from
January 1949 to December 1960. We notice a clear
seasonal pattern in the series, with more people travelling
during the months of June, July, and August.
5.
• Things get more complicated when we are working with high-frequency data.
• For example, an hourly time series can exhibit daily, weekly, monthly and yearly
seasonality, meaning that we now have multiple seasonal periods.
• Take a look at the hourly traffic volume on the Interstate 94 shown below.
Hourly traffic volume, westbound, on Interstate 94 in
Minneapolis, Minnesota. Here we can see both a daily
seasonality (more cars are on the road during the day than
during the night) and a weekly seasonality (more cars are on
the road Monday to Friday than during the weekend).
6.
• Looking at the data above, we can see that we have two seasonal periods!
• First, we have a daily seasonality, as we see that more cars travel on the road during the
day than during the night.
• Second, we have a weekly seasonality, as traffic volume is higher during weekdays than
during the weekend.
• In this case, a SARIMA model cannot be used, because it lets us specify only one seasonal
period, whereas we clearly have two seasonal periods in our data: a daily seasonality
and a weekly seasonality.
• Using BATS and TBATS models, we can fit and forecast time series that have more than
one seasonal period.
7. BATS Model
• Exponential smoothing is a family of forecasting methods. The general idea behind these
methods is that future values are a weighted average of past values, with the weights decaying
exponentially as we go back in time. The family includes simple (SES), double (DES) and triple (TES)
exponential smoothing.
• State-space modelling is a framework in which a time series is seen as a set of observed data that is
influenced by a set of unobserved factors. The state-space model then expresses the relationship
between the two sets. It is a general framework rather than a single model: for example, an ARMA model
can be expressed as a state-space model.
• The Box-Cox transformation is a power transformation that stabilizes the variance of the series over
time, so that non-linear data can be handled by a linear model.
• ARMA errors means that we apply an ARMA model to the residuals of the time series in order
to capture any unexplained relationship. Usually, the residuals of a model should be completely random,
unless some information was not captured by the model. Here, we use an ARMA model to capture any
remaining information in the residuals.
• Trend is a component of a time series that explains the long-term change in the mean value of the series.
When we have a positive trend, then our series is increasing over time. With a negative trend, the series
decreases over time.
• The seasonal component is what explains the periodical variation in the series.
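Two of the building blocks above can be sketched in a few lines of plain Python. This is a minimal illustration, not the BATS implementation: the function names, the toy series and the choice of initialising the level with the first observation are assumptions made for the example.

```python
import math


def simple_exponential_smoothing(values, alpha):
    """Return the smoothed level after each observation (SES)."""
    level = values[0]  # assumption: initialise the level with the first value
    levels = [level]
    for y in values[1:]:
        # New level = weighted average of the new observation and the old level,
        # so older observations receive exponentially decaying weights.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


def box_cox(y, lam):
    """Box-Cox power transform of a strictly positive value."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1) ** (1 / lam)


toy_series = [10.0, 12.0, 11.0, 15.0]
levels = simple_exponential_smoothing(toy_series, alpha=0.5)
transformed = box_cox(9.0, 0.5)
```

With `alpha=0.5`, each new level sits halfway between the latest observation and the previous level; a smaller `alpha` gives more weight to the distant past.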
8.
• To summarize, BATS is an extension of exponential smoothing methods that adds
a Box-Cox transformation to handle non-linear data and an ARMA model to
capture autocorrelation in the residuals.
• The advantage of using BATS is that it can handle non-linear data, address
autocorrelation in the residuals through its ARMA model, and take into
account multiple seasonal periods.
• However, the seasonal periods must be integers, otherwise BATS cannot be
applied. For example, suppose that you have weekly data with a yearly seasonality:
the period is then 365.25/7, which is approximately 52.2. In that case, BATS is ruled
out.
• Furthermore, BATS can take a long time to fit if the seasonal period is very large,
meaning that it is not suitable if you have hourly data with a monthly seasonality (the
period would be about 730).
• Thus, the TBATS model was developed to address that situation.
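The period arithmetic behind these two limitations can be checked directly. The sketch below assumes an average year of 365.25 days and twelve equal months; the variable and function names are illustrative only.

```python
HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years
MONTHS_PER_YEAR = 12

# Weekly observations with a yearly cycle: number of weeks in a year.
weekly_data_yearly_period = DAYS_PER_YEAR / DAYS_PER_WEEK

# Hourly observations with a monthly cycle: number of hours in an average month.
hourly_data_monthly_period = DAYS_PER_YEAR * HOURS_PER_DAY / MONTHS_PER_YEAR

# Hourly observations with daily and weekly cycles.
hourly_data_daily_period = HOURS_PER_DAY
hourly_data_weekly_period = HOURS_PER_DAY * DAYS_PER_WEEK


def is_integer_period(period):
    """BATS needs integer periods; TBATS does not."""
    return float(period).is_integer()


for name, period in [
    ("weekly data, yearly cycle", weekly_data_yearly_period),
    ("hourly data, monthly cycle", hourly_data_monthly_period),
    ("hourly data, daily cycle", hourly_data_daily_period),
    ("hourly data, weekly cycle", hourly_data_weekly_period),
]:
    print(f"{name}: period={period:.2f}, integer={is_integer_period(period)}")
```

Under these assumptions the first two periods are non-integer (and the second is also large), while the daily and weekly periods of hourly data are integers.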
9. TBATS
• It uses the same components as the BATS model; however, it models each seasonal
component with a trigonometric representation based on Fourier series. This allows the model to
fit large seasonal periods and non-integer seasonal periods.
• It is thus a better choice when dealing with high-frequency data and it usually fits faster
than BATS.
• We now apply both models to forecast the next seven days of hourly traffic volume.
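To see why a trigonometric representation copes with non-integer and large periods, here is a minimal sketch of deterministic Fourier terms. It is an illustration only: in TBATS the trigonometric seasonal states evolve over time, and the function name and the number of harmonics used here are assumptions.

```python
import math


def fourier_terms(t, period, n_harmonics):
    """Return [sin, cos] pairs for harmonics 1..n_harmonics at time step t."""
    terms = []
    for j in range(1, n_harmonics + 1):
        angle = 2 * math.pi * j * t / period
        terms.append(math.sin(angle))
        terms.append(math.cos(angle))
    return terms


# A non-integer period (weekly data, yearly cycle) poses no difficulty.
yearly_terms = fourier_terms(t=10, period=365.25 / 7, n_harmonics=2)

# A large period (hourly data, weekly cycle) needs only 2 * n_harmonics values
# per time step instead of one seasonal state per hour of the week.
weekly_terms = fourier_terms(t=10, period=168, n_harmonics=3)
```

The number of values per seasonal component depends on the number of harmonics, not on the length of the period, which is one reason the trigonometric form scales better to long periods.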
10. First, the double-seasonal Holt–Winters (DSHW) exponential smoothing model with additive
trend and additive seasonality is shown below. This model (Eqs. 7 to 11), developed by Taylor
(2003), is an extension of Holt–Winters exponential smoothing.
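The equations themselves did not survive in this text. As a reference point, a commonly used innovations form of the additive double-seasonal model is sketched below; the notation (level \ell_t, trend b_t, seasonal components s^{(1)}_t and s^{(2)}_t with periods m_1 and m_2, error term d_t, smoothing parameters \alpha, \beta, \gamma_1, \gamma_2) is an assumption and may differ from the slide's Eqs. 7 to 11.

```latex
\begin{aligned}
y_t &= \ell_{t-1} + b_{t-1} + s^{(1)}_{t-m_1} + s^{(2)}_{t-m_2} + d_t \\
\ell_t &= \ell_{t-1} + b_{t-1} + \alpha d_t \\
b_t &= b_{t-1} + \beta d_t \\
s^{(1)}_t &= s^{(1)}_{t-m_1} + \gamma_1 d_t \\
s^{(2)}_t &= s^{(2)}_{t-m_2} + \gamma_2 d_t
\end{aligned}
```

For hourly data with daily and weekly cycles, this would correspond to m_1 = 24 and m_2 = 168.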
11. BATS
The following equations extend double-seasonal Holt–Winters (DSHW) into the BATS model
(Box–Cox transformation, ARMA errors, trend, and multiple seasonal patterns).
They are expressed by Eqs. 12 to 17 (De Livera 2010).
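The equation images are missing from this text. A sketch of the BATS formulation as it is usually written is given below, with assumed notation: \omega is the Box-Cox parameter, \phi the trend damping parameter, b the long-run trend, T the number of seasonal components with periods m_i, and d_t an ARMA(p, q) process driven by white noise \varepsilon_t. It may differ in detail from the slide's Eqs. 12 to 17.

```latex
\begin{aligned}
y_t^{(\omega)} &=
  \begin{cases}
    \dfrac{y_t^{\omega} - 1}{\omega}, & \omega \neq 0 \\
    \log y_t, & \omega = 0
  \end{cases} \\
y_t^{(\omega)} &= \ell_{t-1} + \phi\, b_{t-1} + \sum_{i=1}^{T} s^{(i)}_{t-m_i} + d_t \\
\ell_t &= \ell_{t-1} + \phi\, b_{t-1} + \alpha d_t \\
b_t &= (1 - \phi)\, b + \phi\, b_{t-1} + \beta d_t \\
s^{(i)}_t &= s^{(i)}_{t-m_i} + \gamma_i d_t \\
d_t &= \sum_{i=1}^{p} \varphi_i d_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_t
\end{aligned}
```

Because each seasonal state is indexed by t - m_i, the periods m_i must be integers, which is the restriction discussed earlier.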
12. TBATS
The TBATS model (Eqs. 18 to 21) extends the BATS model by adapting Eqs. (12) to (17) with
the following expressions (De Livera
et al. 2011).
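The expressions are not reproduced in this text. A sketch of the trigonometric seasonal formulation as it is usually written follows, with assumed notation: k_i is the number of harmonics for the i-th seasonal component, m_i its period (which may be non-integer), s^{*(i)}_{j,t} an auxiliary state, \gamma^{(i)}_1 and \gamma^{(i)}_2 smoothing parameters, and d_t the ARMA error term. It may differ in detail from the slide's Eqs. 18 to 21.

```latex
\begin{aligned}
s^{(i)}_t &= \sum_{j=1}^{k_i} s^{(i)}_{j,t} \\
s^{(i)}_{j,t} &= s^{(i)}_{j,t-1} \cos\lambda^{(i)}_j + s^{*(i)}_{j,t-1} \sin\lambda^{(i)}_j + \gamma^{(i)}_1 d_t \\
s^{*(i)}_{j,t} &= -s^{(i)}_{j,t-1} \sin\lambda^{(i)}_j + s^{*(i)}_{j,t-1} \cos\lambda^{(i)}_j + \gamma^{(i)}_2 d_t \\
\lambda^{(i)}_j &= \frac{2\pi j}{m_i}
\end{aligned}
```

In this form the seasonal term entering the measurement equation is s^{(i)}_{t-1} rather than s^{(i)}_{t-m_i}, so the period appears only inside \lambda^{(i)}_j and no longer has to be an integer.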
13.
• We recognize the traffic volume plot shown earlier and notice that the
traffic volume is indeed lower during the weekend than during the weekdays. We also
see a daily seasonality, with traffic being heavier during the day than at night.
• Therefore, we have two periods: the daily period has a length of 24 hours, and the weekly
period has a length of 168 hours. We keep that in mind as we move on to modeling.
14. Modeling
• To model the data, we use the sktime package, a framework that brings together many
statistical and machine learning methods for time series. It follows a syntax
convention similar to scikit-learn, making it easy to use.
• The first step is to define our target and define the forecast horizon.
• Here, the target is the traffic volume itself.
• For the forecast horizon, we wish to predict one week of data.
• Since we have hourly data, we must then predict 168 timesteps (7 * 24) into the future.
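The target and horizon set-up can be sketched without any library. The series below is a synthetic stand-in for the traffic volume, and the helper name is hypothetical; only the horizon of 168 steps comes from the slides.

```python
HOURS_PER_DAY = 24
HORIZON = 7 * HOURS_PER_DAY  # one week of hourly steps = 168


def split_last_horizon(values, horizon):
    """Hold out the last `horizon` observations as the test set."""
    if horizon <= 0 or horizon >= len(values):
        raise ValueError("horizon must be positive and shorter than the series")
    return values[:-horizon], values[-horizon:]


# Synthetic stand-in for the hourly traffic volume (four weeks of data).
hourly_volume = [float(i % HOURS_PER_DAY) for i in range(4 * HORIZON)]

train, test = split_last_horizon(hourly_volume, HORIZON)

# Forecast steps are counted from the end of the training set: 1, 2, ..., 168.
forecast_steps = list(range(1, HORIZON + 1))
```

Keeping the test set at the very end of the series preserves the time order, which matters because a forecasting model must never be trained on observations that come after the ones it predicts.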
15. Inference
• The baseline model outperformed both BATS and TBATS. That is a bit anticlimactic, so let us
understand why this happened.
• It is possible that our dataset is too small.
• It might be that the sample that we used for testing happens to favor the baseline model.
One way to verify this would be to forecast multiple 168-hour horizons, to see if the baseline
model still outperforms the rest.
• It may also be that we were too strict with the models' parameters. Here, we forced both
models to use a Box-Cox transformation and to remove the trend component. Had we left
those parameters unspecified, the models would have tried both
possibilities for each parameter and selected the one with the lowest AIC (Akaike's
Information Criterion). While this makes the training process longer, it might also result in
better performance from BATS and TBATS.
• Nevertheless, a key takeaway is that building a baseline model is very important for any
forecasting project.
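Evaluating over several 168-hour horizons can be sketched as a rolling-origin loop. The slides do not say which baseline was used, so the seasonal naive forecast below (repeat the last observed cycle) is an assumption, as are the synthetic series, the function names and the choice of mean absolute error.

```python
def seasonal_naive_forecast(history, horizon, period):
    """Repeat the last observed seasonal cycle over the forecast horizon."""
    if len(history) < period:
        raise ValueError("history must cover at least one full seasonal period")
    last_cycle = history[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rolling_origin_errors(values, horizon, n_windows, forecaster):
    """Score a forecaster on the last n_windows consecutive test windows."""
    errors = []
    for k in range(n_windows, 0, -1):
        end_train = len(values) - k * horizon
        train = values[:end_train]
        test = values[end_train:end_train + horizon]
        errors.append(mean_absolute_error(test, forecaster(train, horizon)))
    return errors


# Synthetic hourly series: a daily shape plus a shift on the last two days of each week.
values = [float((i % 24) + 10 * (i % 168 >= 120)) for i in range(6 * 168)]

weekly_errors = rolling_origin_errors(
    values, 168, 3, lambda tr, h: seasonal_naive_forecast(tr, h, 168)
)
daily_errors = rolling_origin_errors(
    values, 168, 3, lambda tr, h: seasonal_naive_forecast(tr, h, 24)
)
```

Any model that takes a training series and a horizon can be passed as `forecaster`, so the same loop could compare BATS, TBATS and the baseline on identical windows.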
16. Conclusion
• BATS and TBATS models were used to forecast a time series that has more than one
seasonal period, in which case a SARIMA model cannot be used.
• We applied both models to forecast the hourly traffic volume, but it turned out that our
baseline remained the best-performing model.
• Nevertheless, we saw that BATS and TBATS can indeed model time series with complex
seasonalities.
17. Potential improvements
• Forecast multiple 168-hour horizons and check whether the baseline is indeed the best-performing
model. The entire original dataset can be used, which contains much more data than
what we worked with.
• Do not specify the parameters use_box_cox, use_trend, and use_damped_trend, and allow the
model to make the best selection based on the AIC.
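A minimal sketch of that second improvement with sktime is given below. It assumes the `TBATS` forecaster from `sktime.forecasting.tbats` (which needs the `tbats` package installed) and that leaving `use_box_cox`, `use_trend` and `use_damped_trend` unset lets the model compare both options; the helper name and the choice of `n_jobs=1` are illustrative. The import is done lazily so the parameter set can be inspected without sktime installed.

```python
# Only the seasonal periods are fixed. use_box_cox, use_trend and
# use_damped_trend are deliberately left out so the model can choose them.
tbats_params = {
    "sp": [24, 168],          # daily and weekly periods for hourly data
    "use_arma_errors": True,  # model residual autocorrelation with ARMA
    "n_jobs": 1,              # assumption: single process, to keep runs simple
}


def build_tbats_forecaster(params):
    """Create an sktime TBATS forecaster (requires sktime and tbats)."""
    from sktime.forecasting.tbats import TBATS  # lazy import, optional dependency

    return TBATS(**params)


# Intended usage, assuming y_train is an hourly pandas Series:
#   forecaster = build_tbats_forecaster(tbats_params)
#   forecaster.fit(y_train)
#   y_pred = forecaster.predict(fh=list(range(1, 169)))
```

Fitting this way is expected to take longer, since more model variants are compared before one is selected.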
18. Key takeaways
• Always build a baseline model when forecasting.
• BATS and TBATS can be used for modeling time series with complex seasonality.
• BATS works well when the periods are short and integer numbers.
• TBATS trains faster than BATS and works with seasonal periods that are not integers.