Data-X Fall 2018
Anomalous Electricity Consumption Detection
Students: Haiwei Hu, Weiling Kang, Qinyan Shen, Lan Wei, Yang Yu
Mentor: Shikhar Verma
UC Berkeley Engineering
Over-consumption of electrical energy is a growing
concern worldwide. It incurs not only economic costs
but also severe environmental harm, such as light
pollution and increased carbon emissions. The objective
of this project is therefore to predict energy consumption
and identify anomalous consumption by training Auto
ARIMA, SARIMA (seasonal autoregressive integrated
moving average) and RNN (recurrent neural network)
models on an energy usage dataset.
Keywords: Energy consumption; Anomalous energy
consumption detection; ARIMA; SARIMA; RNN;
Time-series forecasting
● Imputation Method:
Impute each missing value with the average electricity
consumption at the same time of day on the same day
of the week within the same month.
● Correlation Analysis:
Explore the correlation between features.
● Prediction Method (Comparison):
- Rolling ARIMA
- SARIMA
- RNN (Keras): LSTM
● Anomaly Detection:
Flag time points where energy consumption is more
than 2 standard deviations above (or below) the
moving median over an 8-hour window.
● Project Management:
- Agile (Asana)
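
The anomaly rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation. It assumes hourly readings (so an 8-hour window is 8 samples), a trailing window that includes the current point, and a standard deviation computed over that same window; the poster does not state these details. The function name `detect_anomalies` and the sample values are hypothetical.

```python
from statistics import median, pstdev


def detect_anomalies(values, window=8, k=2.0):
    """Return indices whose value lies more than k standard deviations
    above or below the moving median of the trailing window."""
    anomalies = []
    for i in range(window - 1, len(values)):
        recent = values[i - window + 1 : i + 1]  # last `window` samples, current one included
        centre = median(recent)
        spread = pstdev(recent)  # assumption: deviation measured over the same window
        if spread > 0 and abs(values[i] - centre) > k * spread:
            anomalies.append(i)
    return anomalies


# Made-up hourly consumption values with one obvious spike at index 9.
hourly_kwh = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 0.9, 1.1, 1.0, 6.0, 1.0, 1.1]
print(detect_anomalies(hourly_kwh))
```

Using the median rather than the mean as the centre keeps a single spike from dragging the baseline toward itself, which is presumably why the rule is stated in terms of a moving median.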
• Further improve missing-value imputation.
• Use regularization techniques to prevent
overfitting.
• Generalize the model, which requires more features
(e.g. income, house type and size, build year).
• Run outlier detection in parallel across different
households to help determine the reason for
anomalous behavior.
• Develop a front end, such as a mobile app, to give
timely warnings of anomalous consumption
behavior.
• Provide a seasonal electricity consumption report
and compare it with the neighborhood’s usage. Suggest
action steps accordingly.
• Help users understand unusual changes in their
electricity bills. Possible reasons could be
appliance malfunction or misuse.
• Create a private account for every customer,
making it easier for them to keep track of their
usage.
● Between appliances:
1. Obtain a high-frequency electricity consumption
dataset from the ETH Zurich website.
2. Pre-process the data through data cleaning and
missing-data imputation.
3. Down-sample the data: reduce the frequency
from per-second to hourly and daily.
4. Visualize the data: scatter plot, pair plot and
heatmap.
5. Apply rolling ARIMA to a single appliance.
6. Forecast 2 months of consumption using the
SARIMA and RNN models.
7. Compare training accuracy between the two
forecasting models.
8. Develop an anomalous energy consumption
detection function that incorporates the forecasted
energy consumption.
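
Steps 2 and 3 above can be sketched with the Python standard library. This is an illustrative sketch only: the input format (a list of `(timestamp, watts)` pairs), the function names and the sample dates are assumptions, and the grouping key (year, month, weekday, hour) is one reading of the imputation rule described on this poster.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def downsample_hourly(readings):
    """Average (timestamp, watts) readings into one value per hour."""
    buckets = defaultdict(list)
    for ts, watts in readings:
        if watts is not None:  # skip missing samples
            buckets[ts.replace(minute=0, second=0, microsecond=0)].append(watts)
    return {hour: mean(vals) for hour, vals in buckets.items()}


def impute_missing_hours(hourly, start, end):
    """Fill absent hours in [start, end] with the mean of observed hours
    that share the same month, day of the week and hour of day."""
    groups = defaultdict(list)
    for hour, value in hourly.items():
        groups[(hour.year, hour.month, hour.weekday(), hour.hour)].append(value)
    filled = dict(hourly)
    t = start
    while t <= end:
        key = (t.year, t.month, t.weekday(), t.hour)
        if t not in filled and key in groups:
            filled[t] = mean(groups[key])
        # hours with no comparable observation are left missing
        t += timedelta(hours=1)
    return filled


# Made-up readings: two Mondays at 08:00 are observed, the third is missing.
readings = [
    (datetime(2012, 7, 2, 8, 0, 0), 100.0),
    (datetime(2012, 7, 2, 8, 30, 0), 200.0),
    (datetime(2012, 7, 9, 8, 15, 0), 250.0),
]
hourly = downsample_hourly(readings)
gap = datetime(2012, 7, 16, 8)
filled = impute_missing_hours(hourly, start=gap, end=gap)
print(filled[gap])
```

Down-sampling before imputing keeps the lookup table small: at hourly resolution there are at most 7 x 24 groups per month.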
Smart Meter Data: 245 days of aggregate electricity
consumption data, with each file containing 86,400
rows (one per second). The prediction is based on this data.
Plug Data: consumption data for 7 appliances on a
daily basis.
Supplementary Data: temperature and holiday
information for the corresponding days.
● Case 2: Total electricity usage prediction
➔ Method 1: SARIMA
➔ Method 2: RNN / LSTM
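
SARIMA extends ARIMA with seasonal terms, one of which is seasonal differencing: subtracting the value observed one season earlier. The sketch below shows only that differencing step and its inverse, not the model fitting itself. The function names, the period of 4 and the sample values are hypothetical; the poster does not give the seasonal period or model orders the project used.

```python
def seasonal_difference(values, period):
    """Return values[t] - values[t - period] for every t >= period."""
    if period <= 0 or len(values) <= period:
        raise ValueError("need a positive period shorter than the series")
    return [values[i] - values[i - period] for i in range(period, len(values))]


def invert_seasonal_difference(first_period, diffs):
    """Rebuild the original series from its first season and the differences."""
    period = len(first_period)
    restored = list(first_period)
    for d in diffs:
        # the value one season back is `period` positions from the end
        restored.append(restored[-period] + d)
    return restored


# Made-up series: a repeating 4-step pattern that rises by 1 each cycle.
usage = [2, 1, 3, 5, 3, 2, 4, 6, 4, 3, 5, 7]
diffs = seasonal_difference(usage, period=4)
print(diffs)
print(invert_seasonal_difference(usage[:4], diffs) == usage)
```

After differencing, the repeating pattern is gone and only the cycle-to-cycle change remains, which is the part the non-seasonal ARIMA terms then model.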
➢ Strong correlation observed between kettle &
coffee machine, freezer & fridge, washing
machine & dryer.
➢ Strong correlation observed between fridge,
freezer, coffee machine, kettle & temperature
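
A correlation study of this kind can be sketched as follows. The poster does not say which coefficient was used, so the Pearson coefficient here is an assumption, as are the function names and the made-up daily values for the two appliances.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_x * norm_y)


def correlation_matrix(columns):
    """Pairwise correlations for a dict of name -> series."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Hypothetical daily consumption (kWh) for two appliances.
daily = {
    "kettle": [0.10, 0.30, 0.20, 0.40],
    "coffee_machine": [0.05, 0.15, 0.10, 0.20],
}
for pair, r in correlation_matrix(daily).items():
    print(pair, round(r, 3))
```

The resulting dictionary holds the same numbers a heatmap would display, one entry per pair of columns.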
Our project used rolling ARIMA, SARIMA and
RNN models to predict the household’s electricity
consumption. In addition, the anomaly detection
analysis found no correlation between temperature
or holidays and anomalous appliance usage. The
anomalies might therefore result from the family’s
misuse of appliances, which needs further investigation.
We strongly recommend that users be mindful of their
energy consumption behavior.
● With other factors: temperature/holiday
➢ 1. Correlation Study
● Case 1: Single appliance (fridge) usage
prediction with rolling ARIMA
➢ 2. Components Study
● Observed = Trend + Seasonal + Residual
➢ 3. Future Forecast
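
The additive model named in the components study (Observed = Trend + Seasonal + Residual) can be illustrated with a classical moving-average decomposition. This is a sketch under assumptions: the poster does not say which decomposition routine was used, and the function names, the period of 4 and the synthetic series are hypothetical.

```python
from statistics import mean


def centred_moving_average(values, period):
    """Trend estimate; None where the window does not fit at the edges."""
    n, half = len(values), period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half : i + half + 1]
        if period % 2:  # odd period: plain average
            trend[i] = mean(window)
        else:  # even period: end points get half weight
            trend[i] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend


def decompose_additive(values, period):
    """Split a series into trend, seasonal and residual components."""
    if len(values) < 2 * period:
        raise ValueError("need at least two full periods")
    trend = centred_moving_average(values, period)
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    # average the detrended values at each position within the period
    phase_means = [
        mean(d for i, d in enumerate(detrended) if d is not None and i % period == p)
        for p in range(period)
    ]
    offset = mean(phase_means)  # centre the seasonal component on zero
    seasonal = [phase_means[i % period] - offset for i in range(len(values))]
    residual = [
        v - t - s if t is not None else None
        for v, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a repeating 4-step pattern.
pattern = [1, -1, 2, -2]
observed = [i + pattern[i % 4] for i in range(12)]
trend, seasonal, residual = decompose_additive(observed, period=4)
print(seasonal[:4])
```

Wherever the trend is defined, adding the three components back together reproduces the observed value, which is what the additive identity states.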

Anomalous Detection for High-Frequency Electricity Consumption Data
