2. Disclaimers
● Data come from Sant’Andrea Hospital of Rome.
● The analysis focuses on predicting emergency department (ED) patient volumes, not on optimizing the management of emergencies.
3. Outline
1. Introduction
2. Data
3. Time series theory
4. Prediction models
4.1. Basic forecasting models
4.2. ARIMA
4.3. Exponential Smoothing
4.4. Neural Networks
5. Conclusions
5. Introduction
● Examination of some ML techniques for time series
forecasting
● Case study: ED visit data from a Roman hospital, with the goal of predicting weekly ED visits
7. Data
● Dataset
○ 5 years of data: 2014 → 2018
● Year-by-year average number of weekly ER visits:

  Year   2014  2015  2016  2017  2018
  Mean    136   150   146   152   132
23. Prediction models: 2) ARIMA
● ARIMA methods rely on a linear forecasting model in which the independent variables (or regressors) are lagged values of the variable itself → autoregression (AR)
● As usual with linear models, collinearity (correlation among regressors) must be avoided
● Autocorrelation checks establish whether the time series is stationary
● If it is not: differencing
24. Prediction models: 2) ARIMA - Autocorrelation
● Autocorrelation measures the linear relationship between lagged values of a time series
● Trended time series tend to have positive autocorrelations that decrease as the lag increases (a small effect here)
25. Prediction models: 2) ARIMA - Autocorrelation
● Autocorrelations are larger at the seasonal lags (multiples of the seasonal frequency)
● Here, peaks appear roughly every 25 weeks (~6 months)
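The sample autocorrelation behind these plots can be computed directly; a minimal pure-Python sketch (function name is illustrative):

```python
def acf(x, max_lag):
    """Sample autocorrelation of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / var
        for k in range(1, max_lag + 1)
    ]

# A trended series shows positive autocorrelations that decay as the lag grows
r = acf(list(range(20)), 3)
print(r)  # positive and decreasing
```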
26. Prediction models: 2) ARIMA - Differencing
[Figure: differencing checks — panels show the data, its log transform, and the (seasonally) differenced series]
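First and seasonal differencing are simple to express; a sketch (the log transform would be applied before differencing, to stabilise the variance):

```python
import math

def difference(x, lag=1):
    """Lag differencing: y[t] = x[t] - x[t - lag]."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

series = [10, 12, 15, 14, 18, 21, 20, 25]
logged = [math.log(v) for v in series]   # variance-stabilising transform
print(difference(series))                # first differences
print(difference(series, lag=4))         # seasonal differences, period 4
```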
27. Prediction models: 2) ARIMA
Non-seasonal ARIMA: differencing combined with an autoregressive and a moving-average model.
80% and 90% confidence intervals are shown as shaded areas.
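The autoregressive piece can be illustrated with an AR(1) fitted by least squares on the (differenced) series — a toy stand-in for a full ARIMA fit, which in practice a library such as statsmodels would handle:

```python
def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1]."""
    xs, ys = x[:-1], x[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    phi = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
           / sum((a - mx) ** 2 for a in xs))
    return my - phi * mx, phi  # (intercept c, coefficient phi)

def forecast_ar1(last, c, phi, steps):
    """Iterate the fitted recursion forward to produce point forecasts."""
    out = []
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out

# Series generated by x[t] = 1 + 0.5 * x[t-1]; the fit recovers c and phi
x = [0.0]
for _ in range(9):
    x.append(1 + 0.5 * x[-1])
c, phi = fit_ar1(x)
print(c, phi)
```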
28. Prediction models: 2) SARIMA
Seasonal ARIMA: differencing combined with autoregression, a moving-average model, and a seasonal component; D = number of seasonal differences.
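The seasonal differencing (the D above) subtracts the observation one seasonal period back; on a series with a fixed seasonal pattern plus a linear trend, only the constant trend step survives. An illustrative sketch:

```python
def seasonal_difference(x, period, D=1):
    """Apply D rounds of lag-`period` differencing."""
    for _ in range(D):
        x = [x[t] - x[t - period] for t in range(period, len(x))]
    return x

# linear trend + period-4 seasonal pattern
pattern = [0, 5, 2, 7]
x = [t + pattern[t % 4] for t in range(12)]
print(seasonal_difference(x, period=4))  # only the trend step remains
```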
29. Outline
1. Introduction
2. Data
3. Time series theory
4. Prediction models
4.1. Basic forecasting models
4.2. ARIMA
4.3. Exponential Smoothing
4.4. Neural Networks
5. Conclusions
30. Prediction models: 3) Exponential smoothing
● Forecasts = weighted averages of past observations.
● Weights decay exponentially as the observations get older.
● The more recent the observation, the higher its weight, and so the more similar the forecast will be to it.
● If the smoothing parameter α is close to 1, most of the weight goes to the most recent observations.
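A minimal sketch of simple exponential smoothing — the flat one-step forecast is the recursively updated level:

```python
def ses_forecast(x, alpha):
    """Simple exponential smoothing: the one-step forecast is the final level."""
    level = x[0]
    for v in x[1:]:
        level = alpha * v + (1 - alpha) * level
    return level

print(ses_forecast([10, 20, 30], 1.0))  # alpha = 1 -> naive forecast: 30
print(ses_forecast([10, 20, 30], 0.5))  # recent values weighted more: 22.5
```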
31. Prediction models: 3) Holt method
● Holt's method extends exponential smoothing with a trend component
● Damped variant: the trend is dampened so that it approaches a constant some time in the future
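Holt's recursions, with an optional damping parameter φ (φ = 1 recovers the undamped method); a minimal sketch:

```python
def holt_forecast(x, alpha, beta, phi=1.0, steps=1):
    """Holt's linear method with damping; phi < 1 flattens the trend."""
    level, trend = x[0], x[1] - x[0]
    for v in x[1:]:
        prev = level
        level = alpha * v + (1 - alpha) * (prev + phi * trend)
        trend = beta * (level - prev) + (1 - beta) * phi * trend
    # h-step forecast adds (phi + phi^2 + ... + phi^h) times the trend
    return [level + sum(phi ** i for i in range(1, h + 1)) * trend
            for h in range(1, steps + 1)]

linear = [1, 2, 3, 4, 5]
print(holt_forecast(linear, 0.5, 0.5, steps=2))            # continues the line
print(holt_forecast(linear, 0.5, 0.5, phi=0.8, steps=2))   # damped: flatter
```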
32. Prediction models: 3) Holt-Winters
● Holt-Winters = Holt's method with a seasonal component
● Better results thanks to the seasonal term
● Though it underestimates the seasonality
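The additive Holt-Winters recursions add a per-position seasonal term to level and trend; a sketch with simple initialisation (a library implementation would estimate the smoothing parameters by optimisation):

```python
def holt_winters_forecast(x, m, alpha, beta, gamma, steps):
    """Additive Holt-Winters: level + trend + per-position seasonal terms."""
    level = sum(x[:m]) / m
    trend = (sum(x[m:2 * m]) - sum(x[:m])) / (m * m)
    seas = [x[i] - level for i in range(m)]
    for t in range(m, len(x)):
        prev = level
        s = seas[t % m]
        level = alpha * (x[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
        seas[t % m] = gamma * (x[t] - level) + (1 - gamma) * s
    n = len(x)
    return [level + h * trend + seas[(n + h - 1) % m]
            for h in range(1, steps + 1)]

# three repeats of a period-4 pattern are forecast back exactly
print(holt_winters_forecast([0, 5, 2, 7] * 3, 4, 0.5, 0.1, 0.3, steps=4))
```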
34. Prediction models: 4) Neural Networks
A neural network can be thought of as a network of neurons organized in layers.
● The first layer is the input layer
● The last one is the output layer
● Intermediate layers are called hidden layers
With time series data, lagged values of the time series can be used as inputs: neural network autoregression, or NNAR.
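The NNAR(p) training data is just the series embedded as lagged inputs with the next value as target; a sketch of that step (a neural-network library would then fit the input→output mapping):

```python
def lagged_matrix(x, p):
    """Rows of p lagged values as inputs, next value as target: NNAR(p) data."""
    X = [x[t - p:t] for t in range(p, len(x))]
    y = [x[t] for t in range(p, len(x))]
    return X, y

X, y = lagged_matrix([1, 2, 3, 4, 5, 6], p=3)
print(X)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(y)  # [4, 5, 6]
```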
36. Conclusions
● Limitations of the analysis:
○ Weekly data are difficult to work with because the seasonal period (the number of weeks in a year) is both large and non-integer: the average number of weeks in a year is 52.18, while most of the methods considered require the seasonal period to be a not-too-large integer.
○ Cyclic effects (e.g. the spike in mid-2018) are unpredictable.
● Future developments:
○ SARIMAX, where X stands for exogenous variables.
40. Time series decomposition: STL
● Seasonal and Trend decomposition using Loess
● The seasonal component is allowed to change over time.
[Figure: STL decomposition panels — data, seasonal, trend, remainder]
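The decomposition idea can be sketched with the simpler classical additive method — trend from a centred moving average, seasonal terms from per-position averages. STL proper replaces both with Loess smoothers and lets the seasonal component evolve; this fixed-seasonal version is only an illustration:

```python
def decompose_additive(x, m):
    """Classical additive decomposition with odd period m."""
    half, n = m // 2, len(x)
    # trend: centred moving average of length m
    trend = [sum(x[t - half:t + half + 1]) / m for t in range(half, n - half)]
    detrended = [x[t] - trend[t - half] for t in range(half, n - half)]
    # seasonal: average detrended value at each position in the season
    totals, counts = [0.0] * m, [0] * m
    for t, v in enumerate(detrended, start=half):
        totals[t % m] += v
        counts[t % m] += 1
    seasonal = [tot / c for tot, c in zip(totals, counts)]
    mean_s = sum(seasonal) / m
    seasonal = [s - mean_s for s in seasonal]      # centre the seasonal terms
    remainder = [detrended[i] - seasonal[(i + half) % m]
                 for i in range(len(detrended))]
    return trend, seasonal, remainder

pat = [-2, 1, 0, 3, -2]                 # zero-mean seasonal pattern, period 5
x = [2 * t + pat[t % 5] for t in range(20)]
trend, seasonal, remainder = decompose_additive(x, 5)
print(seasonal)   # recovers the pattern; remainder is ~0 on this clean series
```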