Enhancing the Prediction Accuracy of Solar Power Generation using a Generative Adversarial Network
Kundjanasith Thonglek¹, Kohei Ichikawa¹, Keichi Takahashi¹, Chawanat Nakasan², Kazufumi Yuasa³, Tadatoshi Babasaki³, Hajimu Iida¹
¹ Nara Institute of Science and Technology, Nara, Japan
² Kanazawa University, Ishikawa, Japan
³ NTT FACILITIES, INC., Tokyo, Japan
Produce stable electricity with solar power generation
➢ A commonly known problem in solar power generation is controlling the amount of stored electricity in batteries to produce stable electricity
[Figure: solar power generation shown with Cloud and Wind; the labels contrast "Unpredictable, No pollution" with "Predictable, Pollution"]
Prediction of solar power generation
➢ There are several existing approaches to controlling the amount of
stored electricity in batteries to produce stable electricity
○ Designing a new energy storage system [1]
○ Improving the capacity of the batteries [2]
○ Building a prediction model for solar power generation [3]
➢ Deploying a new energy storage system or improving the existing batteries requires
a larger investment than embedding prediction models
○ Therefore, the development of highly accurate prediction models,
such as deep learning models, has received much attention
References
[1] C. Thombre, S. Shah, M. Mahajan, and T. Haldankar, “Design of a battery-less solar energy storage system based on re-generation of energy”, in Proceedings of IEEE International Conference on Computing, Communication and Networking Technologies (ICCCNT), 2017, pp. 1-5.
[2] Z. Hradilek, P. Moldrik, and R. Chvalek, “Solar energy storage using hydrogen technology”, in Proceedings of IEEE International Conference on Environment and Electrical Engineering (EEEIC), 2010, pp. 110-113.
[3] V. Prema and U. Rao, “Development of statistical time series models for solar power prediction”, Renewable Energy, vol. 83, 2015, pp. 100-109.
Deep learning approaches
➢ Deep learning is a machine learning approach that applies neural networks
with many layers
○ It has proven highly effective in a wide range of prediction tasks
➢ Long short-term memory (LSTM) has been shown to outperform other methods for predicting solar power generation [1]
○ Solar power generation is a kind of time-series data
○ LSTM
■ It is a type of recurrent neural network
■ It is able to retain information from long-term historical data
■ It is able to learn contexts from time-series data and forecast future trends
References
[1] J. Zhang, Y. Chi, and L. Xiao, “Solar power generation forecast based on LSTM”, in Proceedings of IEEE International Conference on Software Engineering and Service Science (ICSESS), 2018, pp. 869-872.
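For readers who want to see how an LSTM retains long-term information, the standard textbook cell update is reproduced below. It is a general formulation, not necessarily the exact variant used in the cited work; the symbols ($x_t$ for the input, $h_t$ for the hidden state, $c_t$ for the cell state, and $W$, $U$, $b$ for the learned weights and biases) are notation introduced here for illustration.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The forget gate $f_t$ controls how much of the previous cell state $c_{t-1}$ is carried forward, which is what allows information from distant time steps to persist.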
Frequency distribution of solar power generation
➢ The maximum value of solar power generation is 8.0 kWh
➢ The minimum value of solar power generation is 0.0 kWh
➢ The distribution of solar power generation is strongly skewed towards 0.0 kWh
○ This imbalanced dataset makes it difficult to build an accurate prediction model
Average daily solar power generation in each month
➢ Solar power generation varies across the seasons of the year
➢ The gap between the highest and lowest solar power generation is very large
○ This large seasonal variation in the dataset also makes it difficult to build an accurate prediction model
Data augmentation
➢ The size and variability of training datasets are important factors that affect
the prediction accuracy of deep learning models.
➢ In general, limited size and variability of the training dataset lead to
underfitting or overfitting problems.
○ Data augmentation alleviates those problems by increasing the size and
variability of training datasets with artificially generated samples
➢ Data augmentation for time-series
○ Basic approaches: time-domain and frequency-domain methods
○ Advanced approaches: statistical-based and learning-based methods
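The slides do not say which basic time-domain methods are meant. As a rough illustration, the sketch below shows two commonly used ones, jittering (adding small random noise) and scaling (multiplying by a random factor). The function names, parameter values, and the sample day are all hypothetical.

```python
import random

def jitter(series, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise to each point, clipping at 0 (power cannot be negative)."""
    rng = rng or random.Random()
    return [max(0.0, x + rng.gauss(0.0, sigma)) for x in series]

def scale(series, low=0.9, high=1.1, rng=None):
    """Multiply the whole series by one random factor drawn from [low, high]."""
    rng = rng or random.Random()
    factor = rng.uniform(low, high)
    return [x * factor for x in series]

# Hypothetical day of 13 hourly generation values (kWh), 6:00 AM to 6:00 PM
day = [0.0, 0.4, 1.5, 3.2, 5.1, 6.8, 7.5, 6.9, 5.0, 3.1, 1.4, 0.3, 0.0]

rng = random.Random(0)  # fixed seed so the example is reproducible
augmented = [jitter(day, rng=rng), scale(day, rng=rng)]
```

Both transforms keep the sequence length unchanged, so the augmented samples can be mixed directly with the original ones.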
Generative Adversarial Network (GAN)
➢ GAN is one of the most popular learning-based
methods for time-series data augmentation
➢ Through adversarial learning, GAN can avoid some limitations that
conventional generative models face in practical applications
➢ Two models are trained simultaneously by an
adversarial process.
○ A generator ("the artist") learns to create
images that look real
○ A discriminator ("the art critic") learns to
tell real images apart from fakes.
References
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Proceedings of Advances in Neural Information Processing Systems (NeurIPS), Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Q. Weinberger, Eds., vol. 27. Curran Associates, Inc., 2014.
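The adversarial process between the two models is formalized in the paper cited above as a two-player minimax game. Here $G$ is the generator, $D$ is the discriminator, $p_{\text{data}}$ is the distribution of real samples, and $p_z$ is the noise distribution fed to the generator:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to maximize $V$ by assigning high probability to real samples and low probability to generated ones, while the generator tries to minimize it by producing samples the discriminator accepts as real.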
Dataset
➢ Since the solar power system does not generate electricity at night, we use only the hourly data
from 6:00 AM to 6:00 PM
○ There are 13 records per day
➢ Period: Dec 6, 2017, 6:00 AM to Jun 17, 2020, 6:00 PM
➢ Past solar power generation data: solar power system installed at the Shinohashi building of NTT Facilities (NTT-F)
➢ Weather forecast data [1]: Meso-Scale Model of the Japan Meteorological Agency (JMA)
References
[1] K. Saito, T. Fujita, Y. Yamada, J.-I. Ishida, Y. Kumagai, K. Aranami, S. Ohmori, R. Nagasawa, S. Kumagai, C. Muroi, T. Kato, H. Eito, and Y. Yamazaki, “The operational JMA nonhydrostatic mesoscale model,” Monthly Weather Review, vol. 134, no. 4, pp. 1266–1298, 2006.
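The daytime filter can be sketched as follows. This is a minimal illustration, assuming the raw data is a list of hourly `(timestamp, value)` pairs; the function name `daytime_records` and the placeholder values are hypothetical, not taken from the slides.

```python
from datetime import datetime, timedelta

def daytime_records(records, first_hour=6, last_hour=18):
    """Keep on-the-hour records between 6:00 AM and 6:00 PM inclusive."""
    return [
        (ts, value)
        for ts, value in records
        if first_hour <= ts.hour <= last_hour and ts.minute == 0
    ]

# One hypothetical day of 24 hourly records with placeholder values
start = datetime(2017, 12, 6, 0, 0)
one_day = [(start + timedelta(hours=h), 0.0) for h in range(24)]

kept = daytime_records(one_day)
print(len(kept))  # prints 13
```

Hours 6 through 18 inclusive give the 13 records per day stated on the slide.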
Correlation of the features in our dataset
➢ Correlation coefficient: a measure of the strength of the relationship between two variables
➢ Highest correlation with solar power generation: solar radiation, with a correlation coefficient of 0.79
➢ Lowest correlation with solar power generation: middle cloudage, with a correlation coefficient of -0.34
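The slides do not state which correlation coefficient was used. Assuming the common Pearson coefficient, it can be computed as in the sketch below; the function name and the sample values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical feature and target values, for illustration only
radiation = [0.1, 0.8, 1.9, 2.7, 3.1]
generation = [0.2, 1.5, 4.0, 6.1, 7.2]
r = pearson(radiation, generation)
```

A value near 1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near 0 means little linear relationship.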
Our approach
[Figure: workflow diagram connecting Dataset, Training Dataset, Testing Dataset, GAN-based Model, Augmented Dataset, LSTM-based Prediction Model, and Evaluation]
1. Preparation of training and testing datasets
2. Data augmentation using GAN-based model
3. Prediction using LSTM-based model
Input and output data range for the prediction model
➢ We propose a model that predicts solar power generation for the next 24 hours (i.e. 13 hourly values,
excluding the night time) from past information
➢ The length of the input sequence is 13
➢ The length of the output sequence is 13
○ The model predicts solar power generation for the next 13 time steps, excluding the night time
[Figure: timeline marked t - 13, t (current time step), and t + 13, with the labels Solar power generation, Weather forecast information, and Solar power generation]
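One way to build such input and output sequences is a sliding window over the daytime records. The sketch below assumes that the input covers the 13 steps before t and the output covers the 13 steps from t onward, and that windows advance one day (13 steps) at a time. The stride, the function name `make_windows`, and the placeholder data are assumptions, not details given in the slides.

```python
SEQ_LEN = 13  # daytime records per day, 6:00 AM to 6:00 PM

def make_windows(features, target, in_len=SEQ_LEN, out_len=SEQ_LEN, stride=SEQ_LEN):
    """Pair the in_len feature rows before step t with the out_len target values from t on."""
    samples = []
    for t in range(in_len, len(target) - out_len + 1, stride):
        x = features[t - in_len:t]   # shape: in_len x number of features
        y = target[t:t + out_len]    # shape: out_len
        samples.append((x, y))
    return samples

# Three hypothetical days with 12 features per time step (placeholder values)
n_steps = 3 * SEQ_LEN
features = [[float(i)] * 12 for i in range(n_steps)]
target = [float(i) for i in range(n_steps)]

samples = make_windows(features, target)
print(len(samples))  # prints 2
```

With three days of data, the first day can only serve as input and the last only as output, so two samples result.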
Input data for the proposed GAN-based model
➢ The input data of the proposed GAN-based model is composed of two parts
○ The first part is a 13 x 12 matrix that matches the input data of our proposed LSTM-based model
○ The second part is a 13 x 1 matrix that matches the output data of our proposed LSTM-based model
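The slides do not state how the two parts are combined. One plausible layout, assumed here, is to append the 13 x 1 part as an extra column of the 13 x 12 part, giving a single 13 x 13 sample that can later be split back into an input and output pair. The helper names `join_parts` and `split_parts` are hypothetical.

```python
def join_parts(x, y):
    """Append each output value as an extra column of the matching input row."""
    if len(x) != len(y):
        raise ValueError("both parts must have the same number of time steps")
    return [list(row) + [value] for row, value in zip(x, y)]

def split_parts(sample):
    """Split a joined sample back into the input matrix and the output vector."""
    x = [row[:-1] for row in sample]
    y = [row[-1] for row in sample]
    return x, y

# Placeholder data: 13 time steps, 12 input features, 1 output value per step
x = [[0.0] * 12 for _ in range(13)]
y = [1.0] * 13

sample = join_parts(x, y)
print(len(sample), len(sample[0]))  # prints 13 13
```

Splitting a generated sample with `split_parts` would yield a new training pair in the shape the prediction model expects.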
Architecture of proposed GAN-based model
[Figure: generator model and discriminator model. Layer labels shown: input layer (twice), convolutional layer, first and second deconvolutional layers, first and second convolutional layers, fully connected layer]
Architecture of LSTM-based prediction model
[Figure: LSTM network annotated with the length of the input sequence, the number of selected features, the number of hidden units, and the length of the output sequence]
➢ The output layer is a fully connected layer with a ReLU activation function
Example of generated and original data
➢ The proposed GAN-based model was able to generate realistic data: after training,
the distribution of the generated data is similar to that of the original data
[Figure: original data compared with generated data at the 1st epoch and at the 100th epoch]
Prediction accuracy
➢ Since neural networks are initialized with random weights, we trained the model 100 times
with stratified k-fold cross-validation and averaged the measured RMSE
[Figure: RMSE of the prediction model using the original dataset and of the prediction model using the augmented dataset. Values shown: 0.1497, 0.1898, 0.0606, 0.0802]
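For reference, the RMSE metric and the averaging over repeated runs can be sketched as below. The per-run predictions are hypothetical placeholders. The last lines compute the relative error reduction from the two RMSE values that the conclusion attributes to the original dataset (0.1898) and the augmented dataset (0.0802).

```python
import math
from statistics import mean

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical results of two training runs, averaged as described on the slide
run_scores = [
    rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]),
    rmse([1.0, 2.0, 3.0], [0.8, 2.1, 3.0]),
]
average_rmse = mean(run_scores)

# Relative reduction between the two RMSE values reported in the slides
baseline, augmented = 0.1898, 0.0802
reduction = (baseline - augmented) / baseline
print(f"RMSE reduction: {reduction:.1%}")  # prints "RMSE reduction: 57.7%"
```

Averaging over many runs reduces the influence of any single random weight initialization on the reported score.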
Prediction results
➢ The prediction results indicate that the prediction error around noon is greater than
in the early morning and late evening
Conclusion
➢ We studied how to improve the control of stored electricity in batteries to
produce stable electricity by predicting future solar power generation using
the past solar power generation and weather information
➢ The proposed LSTM-based prediction model achieved an RMSE of 0.1898
with stratified k-fold cross-validation
➢ To further enhance the accuracy of our prediction model, we designed a
neural network model based on GAN to augment the training dataset.
○ The proposed generative model is able to increase the size and
variability of the dataset. With the augmented dataset, our prediction
model achieved an RMSE of 0.0802
Future work
➢ Other data augmentation methods will be investigated to further improve the
prediction accuracy
➢ Data on solar power generation and weather information from other sources
should be used to validate the generality of the proposed prediction method
➢ Significant behaviors and features that affect the control of the stored
electricity in batteries will be investigated
Q&A
Thank you for your attention
Email: thonglek.kundjanasith.ti7@is.naist.jp