SlideShare a Scribd company logo
1 of 19
Download to read offline
ELECTRICITY CONSUMPTION
FORECASTING USING ARIMA,
UCM AND MACHINE LEARNING
MODELS
University of Milano-Bicocca
Master's Degree in Data Science
Streaming Data Management and Time Series Analysis
Academic Year 2022-2023
Author:
Giorgio CARBONE matr. n° 811974
Introduction
❑ Univariate, regular, high-frequency time series
❑ 48096 electricity consumption measured every 10
minutes
❑ from 01/01/2017 00:00:00 to 30/11/2017 23:50:00
❑ Comparing different approaches to modelling
❑ Statistical approaches → ARIMA, UCM
❑ Machine Learning → Random Forest, k-NN
❑ Deep Learning → GRU Recurrent Neural Networks
❑ Goal → predict the 4320 observations of December
2017
❑ from 01/12/2017 00:00:00 to 30/12/2017 23:50:00
Data Exploration
❑ Two variables → Power, Date (date-hour labels)
❑ no duplicates or missing values
❑ energy consumption increases in summer
❑ Yearly seasonality
❑ caused by the cycle of the seasons
❑ not observable (only one year of data)
❑ Rare anomalies
Data Exploration
❑ Daily seasonality
❑ day-night cycle
❑ 24-hour period → 144 observations
❑ Weekly seasonality
❑ weekend consumption reduction
❑ 7-days period → 1008 observations
Hold-out models evaluation method
❑ Training Set
❑ From 01/01/2017 00:00:00 to 31/10/2017 23:50:00
❑ 43776 observations
❑ Validation Set
❑ From 01/11/2017 00:00:00 to 30/11/2017 23:50:00
❑ 4320 observations → consistent with Dec
forecast horizon
❑ Choosing the models with the lowest MAE on
validation set → for each type
❑ highest generalization capabilities
❑ re-estimated on training+validation
❑ used to predict from 01/12/2017 00:00:00 to
30/12/2017 23:50:00
Training
Set
Validation
Set
ARIMA MODELS
ARIMA: stationarity
❑ Non-stationarity in variance
❑ Box-Cox plot → Slight linear correlation
❑ Models estimated on Box-Cox transformed
time series → performed worse → no
transformation applied
❑ Non-stationarity in mean
❑ trend and multiple seasonalities
❑ Augmented Dickey-Fuller and KPSS (5%)
❑ rejected hypothesis of need for simple
difference
❑ need for a seasonal difference with
lag=144
ARIMA: no regressors
❑ First model → ARIMA 0,0,0 0,1,1 144 (1.1)
❑ SMA(1) and seasonal difference (p=144)
❑ MAEvalidation = 1415.85
❑ Residuals ACF → linear memory in early lags
and multiples of 1008 (weekly seasonality)
❑ Residuals PACF → linear memory in first 3
lags
❑ Tried to capture more linear memory
❑ ARIMA 3,1,0 0,1,1 144 (1.7)
❑ Lowest residuals ACF and PACF values →
good adaptation to training data
❑ High MAEvalidation → poor predictive
capabilities
[1.1] Residuals ACF and PACF and validation set predictions
[1.7] Residuals ACF and PACF and validation set predictions
ARIMAX
❑ model weekly seasonality using 6 dummy regressors
❑ 𝐴𝑅𝐼𝑀𝐴 2,1,0 0,1,1 144 + 6 dummies [1.10]
❑ captures a lot of linear memory
❑ MAEvalidation = 4116.94
❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 6 dummies [1.11]
❑ MAEvalidation = 1065.45
❑ model weekly seasonality using sinusoids
❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 8 sinusoids (p = 1008) [1.12]
❑ 𝐌𝐀𝑬𝒗𝒂𝒍𝒊𝒅𝒂𝒕𝒊𝒐𝒏 = 𝟏𝟎𝟏𝟎. 𝟎𝟖 → best ARIMA model
❑ model daily seasonality using sinusoids + weekly seasonality using
dummies → no improvements in prediction accuracy
❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 8 sinusoids (p = 144) [1.13]
❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 8 sinusoids (p = 144) + 6 dummies [1.14]
[1.12] Residuals ACF and PACF and validation
set predictions
ARIMA: time series aggregated by hour
❑ Time series aggregated by hour
❑ 24 time series → 24 hours of the day
❑ averaging the 6 values within each hour
❑ Evaluation
1. Fitted the same model independently on each of the 24 time
series
2. Predicted the hourly observations for November 2017
3. Restored 10-minute granularity by cubic interpolation using spline
functions
4. Computed MAEvalidation
❑ 𝐴𝑅𝐼𝑀𝐴 0,1,1 0,0,1 7 [1.15]
❑ captures a lot of linear memory
❑ MAEvalidation = 1142.87
[1.15] Residuals ACF and PACF and validation
set predictions
UCM MODELS
UCM: time series aggregated by hour
❑ Models are trained in time series aggregated by hour (averaging the 6 values within each hour)
❑ predicting the 720 hourly observations for November 2017
❑ restored 10-minute granularity by cubic interpolation
❑ ො
𝑦 = 𝑇𝑅𝐸𝑁𝐷 + 𝑆𝐸𝐴𝑆𝑑𝑎𝑖𝑙𝑦 24ℎ + 𝑆𝐸𝐴𝑆𝑤𝑒𝑒𝑘𝑙𝑦(168ℎ)
❑ Different models and combinations tested
1. 𝐿𝐿𝑇 + 10 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝑠𝑖𝑛𝑢𝑠𝑜𝑖𝑑𝑠𝑤𝑒𝑒𝑘𝑙𝑦 𝑝 = 168 + (𝟐𝟒 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝒅𝒖𝒎𝒎𝒊𝒆𝒔𝒅𝒂𝒊𝒍𝒚 𝑽𝑺 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒅𝒂𝒊𝒍𝒚 𝑝 = 24 )
❑ 𝟐𝟒 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝒅𝒖𝒎𝒎𝒊𝒆𝒔𝒅𝒂𝒊𝒍𝒚 → better accuracy
2. 𝐿𝐿𝑇 + 10 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 𝑉𝑆 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒄𝒚𝒄𝒍𝒆𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 + 24 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝑑𝑢𝑚𝑚𝑖𝑒𝑠𝑑𝑎𝑖𝑙𝑦
❑ 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒄𝒚𝒄𝒍𝒆𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 → very high MAEvalidation
3. 𝐿𝐿𝑇 + (1 − 16) 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 + 24 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝑑𝑢𝑚𝑚𝑖𝑒𝑠𝑑𝑎𝑖𝑙𝑦 [2.1]
❑ Testing the optimal number of weekly sinusoids for the final model
❑ 2 sinusoids → 𝑀𝐴𝐸𝑣𝑎𝑙𝑖𝑑𝑎𝑡𝑖𝑜𝑛 = 1210.08 and 6 sinusoids → 𝑀𝐴𝐸𝑣𝑎𝑙𝑖𝑑𝑎𝑡𝑖𝑜𝑛 = 1299.41
UCM: original time series
❑ Original 10-minutes frequency time series
❑ Training set: September and October 2017 observations
❑ Applied logarithm → improved performance
❑ Final and best UCM model [2.2] → 𝑴𝑨𝑬𝒗𝒂𝒍𝒊𝒅𝒂𝒕𝒊𝒐𝒏 = 𝟏𝟏𝟖𝟑. 𝟒𝟎
❑ 𝑳𝒐𝒄𝒂𝒍 𝑳𝒊𝒏𝒆𝒂𝒓 𝑻𝒓𝒆𝒏𝒅 𝑳𝑳𝑻 +
❑ 𝟏𝟎 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒘𝒆𝒆𝒌𝒍𝒚 𝒑 = 𝟏𝟎𝟎𝟖 +
❑ 𝟏𝟎 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒅𝒂𝒊𝒍𝒚 𝒑 = 𝟏𝟒𝟒
❑ Initial parameters (vy = time series variance)
- 𝑙𝑜𝑔𝑉𝑎𝑟𝐸𝑡𝑎:
𝑣𝑦
10
| 𝑙𝑜𝑔𝑉𝑎𝑟𝑍𝑒𝑡𝑎:
𝑣𝑦
10000
- 𝑙𝑜𝑔𝑉𝑎𝑟𝑂𝑚𝑒𝑔𝑎144:
𝑣𝑦
1000
| 𝑙𝑜𝑔𝑉𝑎𝑟𝑂𝑚𝑒𝑔𝑎1008:
𝑣𝑦
10000
- 𝑙𝑜𝑔𝑉𝑎𝑟𝐸𝑝𝑠𝑖𝑙𝑜𝑛:
𝑣𝑦
10
- 𝑎1 = 𝑚𝑒𝑎𝑛 𝑡𝑖𝑚𝑒 𝑠𝑒𝑟𝑖𝑒𝑠 | 𝑃1 = 𝑣𝑦 ∗ 10
[2.2] Trend estimate and validation set predictions
MACHINE LEARNING MODELS
Deep Learning: GRU Recurrent Neural Networks
❑ tested different hyperparameters:
• number of GRU layers and number of relative neuronal activities
• presence or absence of dropout layer after GRU layers
• batch size, epochs
• learning rate and optimizer
• number of lag and lead
• activation function
• dense layers size
❑ Trained on original 10-minute frequency time series
❑ Standardized (𝜇 = 0, 𝜎 = 1)
❑ Differentiated (144 observations)
❑ Using lags as regressors
Deep Learning: GRU Recurrent Neural Networks
❑ Final Network Hyperparameters:
• Batch Size: 144
• Epochs = 40
• Lag = 4320, Lead = 4320 (multi-step prediction)
• LR = 0.001, Optimizer = Adam, Loss = MAE
❑ Network Architecture:
❑ Good forecasting accuracy → MAEvalidation = 1184.07
❑ best ML model
Input Layer
GRU Layer (16 units, no dropout)
Linear Activation
Hyperbolic tangent Activation
GRU Layer (16 units, no dropout)
Hyperbolic tangent Activation
Dense Layer (FC, 30 neurons)
Machine Leaning: Random Forest and k-NN
❑ Trained on original 10-minute frequency time series
❑ Using lags as regressors
❑ Random Forest
• tested various hyperparameters configurations:
• number of lags, number of trees, using external regressors (hour of the day and day of the week dummies)
• recursive forecasting (using estimates as new regressors)
• Best Model: lags = 144*7, 250 trees, day of the week dummies → MAEvalidation = 2350.14
❑ k-NN, best model → MAEvalidation = 1694.75
• 144*7 = 1008 lags
• Multi-Step Ahead Strategy (MIMO) forecasting → lead = 4320
• median function to aggregate values
FINAL RESULTS
The most accurate models on the validation set
Model Class Best Model for each class MAEvalidation
ARIMA ARIMA(0,0,0) (1,0,0)144+ 8 weekly sinusoids
(p=1008)
1010.08
UCM LLT
+ 10 stochastic sinusoidsweekly p = 1008
+ 10 stochastic sinusoidsdaily p = 144
1183.40
ML Gated Recurrent Units (GRU) Network 1184.07
❑ Best models for each class
❑ are re-trained on training+validation
❑ used to predict the 4320 December 2017 observations

More Related Content

Similar to Electricity Consumption Forecasting Using Arima, UCM, Machine Learning and Deep Learning Models

Evaluation in Information Retrieval
Evaluation in Information RetrievalEvaluation in Information Retrieval
Evaluation in Information Retrieval
Dishant Ailawadi
 
Small Scale Concentrated Solar Power
Small Scale Concentrated Solar PowerSmall Scale Concentrated Solar Power
Small Scale Concentrated Solar Power
Matthew Mobley
 

Similar to Electricity Consumption Forecasting Using Arima, UCM, Machine Learning and Deep Learning Models (20)

qHNMR for purity determination
qHNMR for purity determinationqHNMR for purity determination
qHNMR for purity determination
 
Short term power forecasting Awea 2014
Short term power forecasting Awea 2014Short term power forecasting Awea 2014
Short term power forecasting Awea 2014
 
Improving Physical Parametrizations in Climate Models using Machine Learning
Improving Physical Parametrizations in Climate Models using Machine LearningImproving Physical Parametrizations in Climate Models using Machine Learning
Improving Physical Parametrizations in Climate Models using Machine Learning
 
Climate model
Climate modelClimate model
Climate model
 
KCC2017 28APR2017
KCC2017 28APR2017KCC2017 28APR2017
KCC2017 28APR2017
 
CLIM: Transition Workshop - Optimization Methods in Remote Sensing - Jessica...
CLIM: Transition Workshop - Optimization Methods in Remote Sensing  - Jessica...CLIM: Transition Workshop - Optimization Methods in Remote Sensing  - Jessica...
CLIM: Transition Workshop - Optimization Methods in Remote Sensing - Jessica...
 
Presentation
PresentationPresentation
Presentation
 
Evaluation in Information Retrieval
Evaluation in Information RetrievalEvaluation in Information Retrieval
Evaluation in Information Retrieval
 
Sigma Xi Research Showcase 2018 - Oleksii Volkovskyi
Sigma Xi Research Showcase 2018 - Oleksii VolkovskyiSigma Xi Research Showcase 2018 - Oleksii Volkovskyi
Sigma Xi Research Showcase 2018 - Oleksii Volkovskyi
 
Geothermal Reserves Assessment
Geothermal Reserves AssessmentGeothermal Reserves Assessment
Geothermal Reserves Assessment
 
Damian Peckett - Artificially Intelligent Crop Irrigation
Damian Peckett - Artificially Intelligent Crop Irrigation Damian Peckett - Artificially Intelligent Crop Irrigation
Damian Peckett - Artificially Intelligent Crop Irrigation
 
Classification of Grasp Patterns using sEMG
Classification of Grasp Patterns using sEMGClassification of Grasp Patterns using sEMG
Classification of Grasp Patterns using sEMG
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
 
Electricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksElectricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural Networks
 
Md2k 0219 shang
Md2k 0219 shangMd2k 0219 shang
Md2k 0219 shang
 
Lower back pain Regression models
Lower back pain Regression modelsLower back pain Regression models
Lower back pain Regression models
 
Small Scale Concentrated Solar Power
Small Scale Concentrated Solar PowerSmall Scale Concentrated Solar Power
Small Scale Concentrated Solar Power
 
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
Parameter selection in a combined cycle power plant
Parameter selection in a combined cycle power plantParameter selection in a combined cycle power plant
Parameter selection in a combined cycle power plant
 

More from Giorgio Carbone

More from Giorgio Carbone (8)

Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
Identification Of Alzheimer's Disease Using A Deep Learning Method Based O...
Identification Of  Alzheimer's Disease Using A  Deep Learning Method Based  O...Identification Of  Alzheimer's Disease Using A  Deep Learning Method Based  O...
Identification Of Alzheimer's Disease Using A Deep Learning Method Based O...
 
Milano Air Quality: Interactive Data Visualization
Milano Air Quality: Interactive Data VisualizationMilano Air Quality: Interactive Data Visualization
Milano Air Quality: Interactive Data Visualization
 
Competitive Pokémon Graph Database
Competitive Pokémon Graph DatabaseCompetitive Pokémon Graph Database
Competitive Pokémon Graph Database
 
Video Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 datasetVideo Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 dataset
 
Word Embedding (Word2Vec and CADE): the evolution of tópoi in the Italian lit...
Word Embedding (Word2Vec and CADE): the evolution of tópoi in the Italian lit...Word Embedding (Word2Vec and CADE): the evolution of tópoi in the Italian lit...
Word Embedding (Word2Vec and CADE): the evolution of tópoi in the Italian lit...
 
Extreme Extractive Text Summarization and Topic Modeling (using LSA and LDA t...
Extreme Extractive Text Summarization and Topic Modeling (using LSA and LDA t...Extreme Extractive Text Summarization and Topic Modeling (using LSA and LDA t...
Extreme Extractive Text Summarization and Topic Modeling (using LSA and LDA t...
 
CXR-ACGAN: Auxiliary Classifier GAN for Conditional Generation of Chest X-Ray...
CXR-ACGAN: Auxiliary Classifier GAN for Conditional Generation of Chest X-Ray...CXR-ACGAN: Auxiliary Classifier GAN for Conditional Generation of Chest X-Ray...
CXR-ACGAN: Auxiliary Classifier GAN for Conditional Generation of Chest X-Ray...
 

Recently uploaded

Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 

Recently uploaded (20)

VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 

Electricity Consumption Forecasting Using Arima, UCM, Machine Learning and Deep Learning Models

  • 1. ELECTRICITY CONSUMPTION FORECASTING USING ARIMA, UCM AND MACHINE LEARNING MODELS University of Milano-Bicocca Master's Degree in Data Science Streaming Data Management and Time Series Analysis Academic Year 2022-2023 Author: Giorgio CARBONE matr. n° 811974
  • 2. Introduction ❑ Univariate, regular, high-frequency time series ❑ 48096 electricity consumption measured every 10 minutes ❑ from 01/01/2017 00:00:00 to 30/11/2017 23:50:00 ❑ Comparing different approaches to modelling ❑ Statistical approaches → ARIMA, UCM ❑ Machine Learning → Random Forest, k-NN ❑ Deep Learning → GRU Recurrent Neural Networks ❑ Goal → predict the 4320 observations of December 2017 ❑ from 01/12/2017 00:00:00 to 30/12/2017 23:50:00
  • 3. Data Exploration ❑ Two variables → Power, Date (date-hour labels) ❑ no duplicates or missing values ❑ energy consumption increases in summer ❑ Yearly seasonality ❑ caused by the cycle of the seasons ❑ not observable (only one year of data) ❑ Rare anomalies
  • 4. Data Exploration ❑ Daily seasonality ❑ day-night cycle ❑ 24-hour period → 144 observations ❑ Weekly seasonality ❑ weekend consumption reduction ❑ 7-days period → 1008 observations
  • 5. Hold-out models evaluation method ❑ Training Set ❑ From 01/01/2017 00:00:00 to 31/10/2017 23:50:00 ❑ 43776 observations ❑ Validation Set ❑ From 01/11/2017 00:00:00 to 30/11/2017 23:50:00 ❑ 4320 observations → consistent with Dec forecast horizon ❑ Choosing the models with the lowest MAE on validation set → for each type ❑ highest generalization capabilities ❑ re-estimated on training+validation ❑ used to predict from 01/12/2017 00:00:00 to 30/12/2017 23:50:00 Training Set Validation Set
  • 7. ARIMA: stationarity ❑ Non-stationarity in variance ❑ Box-Cox plot → Slight linear correlation ❑ Models estimated on Box-Cox transformed time series → performed worse → no transformation applied ❑ Non-stationarity in mean ❑ trend and multiple seasonalities ❑ Augmented Dickey-Fuller and KPSS (5%) ❑ rejected hypothesis of need for simple difference ❑ need for a seasonal difference with lag=144
  • 8. ARIMA: no regressors ❑ First model → ARIMA 0,0,0 0,1,1 144 (1.1) ❑ SMA(1) and seasonal difference (p=144) ❑ MAEvalidation = 1415.85 ❑ Residuals ACF → linear memory in early lags and multiples of 1008 (weekly seasonality) ❑ Residuals PACF → linear memory in first 3 lags ❑ Tried to capture more linear memory ❑ ARIMA 3,1,0 0,1,1 144 (1.7) ❑ Lowest residuals ACF and PACF values → good adaptation to training data ❑ High MAEvalidation → poor predictive capabilities [1.1] Residuals ACF and PACF and validation set predictions [1.7] Residuals ACF and PACF and validation set predictions
  • 9. ARIMAX ❑ model weekly seasonality using 6 dummy regressors ❑ 𝐴𝑅𝐼𝑀𝐴 2,1,0 0,1,1 144 + 6 dummies [1.10] ❑ captures a lot of linear memory ❑ MAEvalidation = 4116.94 ❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 6 dummies [1.11] ❑ MAEvalidation = 1065.45 ❑ model weekly seasonality using sinusoids ❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 8 sinusoids (p = 1008) [1.12] ❑ 𝐌𝐀𝑬𝒗𝒂𝒍𝒊𝒅𝒂𝒕𝒊𝒐𝒏 = 𝟏𝟎𝟏𝟎. 𝟎𝟖 → best ARIMA model ❑ model daily seasonality using sinusoids + weekly seasonality using dummies → no improvements in prediction accuracy ❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 8 sinusoids (p = 144) [1.13] ❑ 𝐴𝑅𝐼𝑀𝐴 0,0,0 1,0,0 144 + 8 sinusoids (p = 144) + 6 dummies [1.14] [1.12] Residuals ACF and PACF and validation set predictions
  • 10. ARIMA: time series aggregated by hour ❑ Time series aggregated by hour ❑ 24 time series → 24 hours of the day ❑ averaging the 6 values within each hour ❑ Evaluation 1. Fitted the same model independently on each of the 24 time series 2. Predicted the hourly observations for November 2017 3. Restored 10-minute granularity by cubic interpolation using spline functions 4. Computed MAEvalidation ❑ 𝐴𝑅𝐼𝑀𝐴 0,1,1 0,0,1 7 [1.15] ❑ captures a lot of linear memory ❑ MAEvalidation = 1142.87 [1.15] Residuals ACF and PACF and validation set predictions
  • 12. UCM: time series aggregated by hour ❑ Models are trained in time series aggregated by hour (averaging the 6 values within each hour) ❑ predicting the 720 hourly observations for November 2017 ❑ restored 10-minute granularity by cubic interpolation ❑ ො 𝑦 = 𝑇𝑅𝐸𝑁𝐷 + 𝑆𝐸𝐴𝑆𝑑𝑎𝑖𝑙𝑦 24ℎ + 𝑆𝐸𝐴𝑆𝑤𝑒𝑒𝑘𝑙𝑦(168ℎ) ❑ Different models and combinations tested 1. 𝐿𝐿𝑇 + 10 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝑠𝑖𝑛𝑢𝑠𝑜𝑖𝑑𝑠𝑤𝑒𝑒𝑘𝑙𝑦 𝑝 = 168 + (𝟐𝟒 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝒅𝒖𝒎𝒎𝒊𝒆𝒔𝒅𝒂𝒊𝒍𝒚 𝑽𝑺 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒅𝒂𝒊𝒍𝒚 𝑝 = 24 ) ❑ 𝟐𝟒 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝒅𝒖𝒎𝒎𝒊𝒆𝒔𝒅𝒂𝒊𝒍𝒚 → better accuracy 2. 𝐿𝐿𝑇 + 10 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 𝑉𝑆 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒄𝒚𝒄𝒍𝒆𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 + 24 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝑑𝑢𝑚𝑚𝑖𝑒𝑠𝑑𝑎𝑖𝑙𝑦 ❑ 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒄𝒚𝒄𝒍𝒆𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 → very high MAEvalidation 3. 𝐿𝐿𝑇 + (1 − 16) 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒘𝒆𝒆𝒌𝒍𝒚 𝑝 = 168 + 24 𝑠𝑡𝑜𝑐ℎ𝑎𝑠𝑡𝑖𝑐 𝑑𝑢𝑚𝑚𝑖𝑒𝑠𝑑𝑎𝑖𝑙𝑦 [2.1] ❑ Testing the optimal number of weekly sinusoids for the final model ❑ 2 sinusoids → 𝑀𝐴𝐸𝑣𝑎𝑙𝑖𝑑𝑎𝑡𝑖𝑜𝑛 = 1210.08 and 6 sinusoids → 𝑀𝐴𝐸𝑣𝑎𝑙𝑖𝑑𝑎𝑡𝑖𝑜𝑛 = 1299.41
  • 13. UCM: original time series ❑ Original 10-minutes frequency time series ❑ Training set: September and October 2017 observations ❑ Applied logarithm → improved performance ❑ Final and best UCM model [2.2] → 𝑴𝑨𝑬𝒗𝒂𝒍𝒊𝒅𝒂𝒕𝒊𝒐𝒏 = 𝟏𝟏𝟖𝟑. 𝟒𝟎 ❑ 𝑳𝒐𝒄𝒂𝒍 𝑳𝒊𝒏𝒆𝒂𝒓 𝑻𝒓𝒆𝒏𝒅 𝑳𝑳𝑻 + ❑ 𝟏𝟎 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒘𝒆𝒆𝒌𝒍𝒚 𝒑 = 𝟏𝟎𝟎𝟖 + ❑ 𝟏𝟎 𝒔𝒕𝒐𝒄𝒉𝒂𝒔𝒕𝒊𝒄 𝒔𝒊𝒏𝒖𝒔𝒐𝒊𝒅𝒔𝒅𝒂𝒊𝒍𝒚 𝒑 = 𝟏𝟒𝟒 ❑ Initial parameters (vy = time series variance) - 𝑙𝑜𝑔𝑉𝑎𝑟𝐸𝑡𝑎: 𝑣𝑦 10 | 𝑙𝑜𝑔𝑉𝑎𝑟𝑍𝑒𝑡𝑎: 𝑣𝑦 10000 - 𝑙𝑜𝑔𝑉𝑎𝑟𝑂𝑚𝑒𝑔𝑎144: 𝑣𝑦 1000 | 𝑙𝑜𝑔𝑉𝑎𝑟𝑂𝑚𝑒𝑔𝑎1008: 𝑣𝑦 10000 - 𝑙𝑜𝑔𝑉𝑎𝑟𝐸𝑝𝑠𝑖𝑙𝑜𝑛: 𝑣𝑦 10 - 𝑎1 = 𝑚𝑒𝑎𝑛 𝑡𝑖𝑚𝑒 𝑠𝑒𝑟𝑖𝑒𝑠 | 𝑃1 = 𝑣𝑦 ∗ 10 [2.2] Trend estimate and validation set predictions
  • 15. Deep Learning: GRU Recurrent Neural Networks ❑ tested different hyperparameters: • number of GRU layers and number of relative neuronal activities • presence or absence of dropout layer after GRU layers • batch size, epochs • learning rate and optimizer • number of lag and lead • activation function • dense layers size ❑ Trained on original 10-minute frequency time series ❑ Standardized (𝜇 = 0, 𝜎 = 1) ❑ Differentiated (144 observations) ❑ Using lags as regressors
  • 16. Deep Learning: GRU Recurrent Neural Networks ❑ Final Network Hyperparameters: • Batch Size: 144 • Epochs = 40 • Lag = 4320, Lead = 4320 (multi-step prediction) • LR = 0.001, Optimizer = Adam, Loss = MAE ❑ Network Architecture: ❑ Good forecasting accuracy → MAEvalidation = 1184.07 ❑ best ML model Input Layer GRU Layer (16 units, no dropout) Linear Activation Hyperbolic tangent Activation GRU Layer (16 units, no dropout) Hyperbolic tangent Activation Dense Layer (FC, 30 neurons)
  • 17. Machine Leaning: Random Forest and k-NN ❑ Trained on original 10-minute frequency time series ❑ Using lags as regressors ❑ Random Forest • tested various hyperparameters configurations: • number of lags, number of trees, using external regressors (hour of the day and day of the week dummies) • recursive forecasting (using estimates as new regressors) • Best Model: lags = 144*7, 250 trees, day of the week dummies → MAEvalidation = 2350.14 ❑ k-NN, best model → MAEvalidation = 1694.75 • 144*7 = 1008 lags • Multi-Step Ahead Strategy (MIMO) forecasting → lead = 4320 • median function to aggregate values
  • 19. The most accurate models on the validation set Model Class Best Model for each class MAEvalidation ARIMA ARIMA(0,0,0) (1,0,0)144+ 8 weekly sinusoids (p=1008) 1010.08 UCM LLT + 10 stochastic sinusoidsweekly p = 1008 + 10 stochastic sinusoidsdaily p = 144 1183.40 ML Gated Recurrent Units (GRU) Network 1184.07 ❑ Best models for each class ❑ are re-trained on training+validation ❑ used to predict the 4320 December 2017 observations