This document analyzes monthly electricity net generation data from the U.S. Energy Information Administration (EIA), January 1973 to July 2015, using time series analysis and ARIMA modeling. It finds that electricity net generation has an increasing trend over time with seasonal peaks in the summer months. An ARIMA(1,1,1)(0,1,1)[12] model is shown to fit the data well: the back-test MAPE is 1.63%, and the validation forecast for August 2015 is within 0.41% of the value reported by EIA. The model is used to forecast generation for the next 17 months, through December 2016. The analysis notes that further time series regression could give more insight into the impact of each generation source over time.
1. Electricity Net Generation in the U.S.
Time series analysis and forecasting
Shen (Carol) Yan, Shih-Wen (Elsa) Huang
2. Motivation
We are curious whether the time series conforms to our original assumption:
winter has the highest net electricity generation.
The dataset from EIA has 511 observations and 2 variables: month and
total electricity net generation
* EIA: Energy Information Administration
3. Background
With economic growth and industrial development in the
U.S., the demand for electricity is increasing year by year. This
leads to higher electricity generation, which is
reflected in the dataset from January 1973 to July 2015:
Increasing trend: total electricity net generation increases each
year.
Seasonal behavior
4. Electricity sources
2014 Electricity generation sources
coal: 39%
petroleum liquids: 1%
petroleum coke: 0%
natural gas: 27%
other gas: 0%
nuclear: 19%
hydroelectric conventional: 7%
renewable source: 7%
pumped storage: 0%
other: 0%
2015 Electricity generation sources (through August)
coal: 35%
petroleum liquids: 1%
petroleum coke: 0%
natural gas: 32%
other gas: 0%
nuclear: 19%
hydroelectric conventional: 6%
renewable source: 7%
pumped storage: 0%
other: 0%
5. Objectives
1. Describe the behavior of this dataset
2. Fit a model to forecast electricity generation for the
next 17 months, through December
2016.
6. Time plot of electricity generation
Trend: increasing
Seasonality: present
Spike: electricity net generation decreased in 2009, linked to a drop in natural gas prices
9. Deseasonalization
ACF & PACF:
Dickey-Fuller test:
p-value (0.01) < 0.05, so the null hypothesis of non-stationarity is rejected.
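The slides do not show how the series was differenced, so the following is only a minimal sketch of the operation implied by the model orders: a seasonal difference at lag 12 followed by a first difference, then a sample ACF to check that the result decays quickly. The helper names `difference` and `sample_acf` and the synthetic series `toy` are assumptions made for illustration; they are not the authors' code or the EIA data.

```python
import math
import random

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    c0 = sum(v * v for v in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

# Synthetic monthly series (NOT the EIA data): trend + 12-month cycle + noise.
rng = random.Random(0)
toy = [
    100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 1)
    for t in range(120)
]

# Apply (1 - B)(1 - B^12): seasonal difference first, then first difference.
stationary = difference(difference(toy, lag=12), lag=1)
acf = sample_acf(stationary, max_lag=24)

print(len(toy), len(stationary))  # prints 120 107
```

Each differencing step shortens the series by its lag, so 13 observations are lost in total. A formal stationarity test such as Dickey-Fuller would then be run on the differenced series.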
10. Build the model-SARIMA
Model: ARIMA(1,1,1)(0, 1, 1)[12]
Test of coefficients: All parameters are significant.
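Expression: $(1 - 0.45B)(1 - B)(1 - B^{12})X_t = (1 - 0.90B)(1 - 0.73B^{12})\varepsilon_t$, where $B$ is the backshift operator and $\varepsilon_t$ is white noise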
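To make the model expression easier to read, the operator form can be multiplied out into an explicit recursion. This is a worked expansion rather than something shown on the slide; it assumes $B$ is the backshift operator ($B^k X_t = X_{t-k}$) and writes the white-noise term as $\varepsilon_t$, a symbol the slide does not name.

```latex
\begin{align*}
(1 - 0.45B)(1 - B)(1 - B^{12})
  &= (1 - 1.45B + 0.45B^{2})(1 - B^{12}) \\
  &= 1 - 1.45B + 0.45B^{2} - B^{12} + 1.45B^{13} - 0.45B^{14}, \\
(1 - 0.90B)(1 - 0.73B^{12})
  &= 1 - 0.90B - 0.73B^{12} + 0.657B^{13}, \\[4pt]
X_t &= 1.45X_{t-1} - 0.45X_{t-2} + X_{t-12} - 1.45X_{t-13} + 0.45X_{t-14} \\
    &\quad + \varepsilon_t - 0.90\,\varepsilon_{t-1} - 0.73\,\varepsilon_{t-12}
      + 0.657\,\varepsilon_{t-13}.
\end{align*}
```

Read this way, each month depends on the previous two months, on the same months one year earlier, and on the shocks from last month and last year.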
11. Diagnosis
ACF plot of residuals: generally stationary
Ljung-Box tests: p-value > 0.05, so we cannot reject the null hypothesis
that the residuals are white noise
Normal quantile plot: residuals are close to normal except for some extreme values
Brief conclusion:
The model SARIMA(1,1,1)(0,1,1)[12] is statistically acceptable
and can be used for explanation and forecasting.
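The Ljung-Box check in the diagnosis rests on the statistic Q = n(n+2) Σ r_k² / (n − k), summed over lags k = 1..h, where r_k is the lag-k sample autocorrelation of the residuals. The sketch below computes Q in plain Python; the function name `ljung_box_q` and the sample inputs are assumptions for illustration, not the authors' residuals. Turning Q into a p-value additionally needs a chi-square distribution, which is left out here.

```python
import random

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    c0 = sum(v * v for v in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Lag-k sample autocorrelation of the residuals.
        r_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

# Illustrative inputs only (not the model's residuals):
# independent draws should give a small Q, a strictly alternating
# sequence is strongly autocorrelated and gives a large Q.
rng = random.Random(42)
noise_like = [rng.gauss(0, 1) for _ in range(200)]
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]

q_noise = ljung_box_q(noise_like, max_lag=12)
q_alternating = ljung_box_q(alternating, max_lag=1)
```

A small Q relative to the chi-square reference corresponds to the large p-value reported on the slide, that is, no evidence against white-noise residuals.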
13. Validation of model
MAPE from back-test: 1.63%
Comparing with the latest data announced by EIA
and calculating a new MAPE: 0.41%
              Released by EIA    Our forecast
August 2015   392298             393923
Fit well!
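The 0.41% figure can be reproduced from the two August 2015 numbers on the slide. The sketch below assumes the usual definition of MAPE (mean of |actual − forecast| / |actual|, in percent); the function name `mape` is illustrative, and with a single month the MAPE reduces to one absolute percentage error.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

august_2015_actual = [392298]    # value released by EIA (from the slide)
august_2015_forecast = [393923]  # the model's forecast (from the slide)

error_pct = mape(august_2015_actual, august_2015_forecast)
print(f"{error_pct:.2f}%")  # prints 0.41%
```

The absolute gap is 1625 on a base of 392298, which is where the 0.41% comes from.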
14. Conclusion
The series is non-stationary with an increasing trend.
The series has seasonal behavior: the peak period is during summer.
The forecasts for the following 17 months are consistent with previous
patterns.
Our model is reliable: the forecast for August 2015 differs only slightly from
the number announced on the Energy Information Administration's official
website.
Limitation:
Further research using time series regression is needed to identify the impact of
each source, such as petroleum, coal, nuclear, and natural gas, on
electricity net generation in the U.S.
Life experience: the electricity bill is always high during winter
In the following discussion
we can observe that the time series has an increasing trend and seasonal behavior
Create the time plot with ts
Spikes, trend, seasonality
The price of natural gas dropped in 2009, so industries tended to use natural gas rather than coal to generate electricity
So total electricity generation temporarily decreased in 2009; this is the spike
First difference
Very significant seasonality in the ACF
After first differencing and deseasonalization, the ACF decays to 0, indicating stationarity; the Dickey-Fuller test shows the same
Although there are still spikes at certain lags, which reflect the very strong seasonality, the ACF values lie between -0.1 and 0.1, which is very small. We consider this acceptable, so we go to the next step and build the model
We use auto ARIMA with the BIC criterion to obtain this seasonal model: the ARIMA order is (1,1,1), the seasonal order is (0,1,1), and the period is 12
In the next step we fit the model; the coefficient test shows that all parameters are significant
The normal quantile plot captures most of the values except some extreme ones, so the residuals are close to a normal distribution
Lowest electricity generation is during spring
Fit well!
According to the ACF and PACF plots, the time series has a very large shock, which indicates that it will take a long time to converge to the mean. So