M.Q. Raza et al. / Sustainable Cities and Society 31 (2017) 264–275 265
Fig. 1. The hourly load demand of an ISO New England grid in 2008.
ing techniques can be classified into different categories, namely
persistence methods, physical techniques, statistical techniques,
hybrid models and new ensemble networks (Sisworahardjo, El-
Keib, Choi, Valenzuela, & Brooks, 2006). It can be found in the
literature that hybrid Kalman filters (Zheng, Girgis, & Makram,
2000), autoregressive models (Nowicka-Zagrajek & Weron, 2002),
autoregressive moving average with exogenous variable models
(Huang, Huang, & Wang, 2005; Wang, Neng-ling, Hai-qing, Ye, & Jia-
dong, 2008), and statistical techniques (Chakhchoukh, Panciatici, &
Mili, 2011) are examples of the above load forecasting techniques.
These models can provide accurate forecasts when load demand is relatively certain. However, their forecast errors increase significantly as a consequence of exogenous variables (i.e., human impact, and cultural and sociological events) and meteorological variables (i.e., temperature, relative humidity, dew point, dry bulb temperature and wind speed). In addition, high penetration of solar and wind generation into power systems can add more uncertainty on the load demand side. As a result, the existing forecasting models reviewed above cannot address the uncertainty of load demand caused by meteorological and sociological events.
Artificial intelligence (AI)-based techniques for load forecasting have attracted remarkable attention from researchers since the mid-1990s due to their suitability for nonlinear inputs and prediction problems. AI techniques also have the ability to solve complex problems with high accuracy under uncertain conditions. A
wide variety of AI techniques applied to STLF has been reported in the literature. Typical examples of such techniques are support vector regression (Wang, Li, Niu, & Tan, 2012), artificial immune systems (Abdul Hamid & Abdul Rahman, 2010), radial basis functions (Ranaweera, Hubele, & Papalexopoulos, 1995), and Bayesian neural networks learned by a hybrid Monte Carlo algorithm (Niu, Shi, & Wu, 2012a). In (Cecati, Kolbusz, Różycki, Siano, & Wilamowski, 2015), the authors also presented a technique to enhance the training performance of a radial basis function network.
However, most of the existing techniques may not provide accurate predictions due to the limitations of single models. Some of them perform well over a certain forecast horizon but leave a higher prediction error over the rest, leading to a higher overall prediction error. Recently, the authors in (Yu & Xu, 2014) developed a combinational load forecasting model, which is based on
the genetic algorithm and backpropagation (BP) neural network.
This technique can offer better forecast results in most of the cases under study. However, BP learning does not ensure a globally optimal solution to the neural network (NN) training. In particular, the BP technique can become stuck in a local optimum and fail to optimize the NN weight and bias values during the training process. This may affect the NN performance and lead
to a higher forecast error. To overcome the above shortcomings, this
paper develops a novel ANN-based load forecasting model. In this
model, a GPSO algorithm is proposed to optimize the NN weight and bias values during the training process. In addition, meteorological variables as defined previously, which have a significant impact on load demand, along with historical lagged load values are formulated in the proposed model to enhance the training performance and the prediction accuracy.
The remainder of this paper is organized as follows: Section 2 examines the impact of the architecture and inputs of an artificial neural network (ANN) on load demand forecasting. It also
shows the effect of correlated loads and weather inputs on load
demand. Section 3 provides a brief description of the particle swarm
optimization (PSO) algorithm, followed by a pseudo code of GPSO,
GPSO-based neural network training, fitness function and encoding
technique. Section 4 presents the results of seasonal load forecasts,
including the proposed model accuracy, convergence and regres-
sion analysis. These results are compared with a NN forecast model,
which is based on back propagation (BP) and Levenberg-Marquardt
(LM), for validation. Section 5 analyses the performance of the training techniques, and the major conclusions and contributions of the paper are highlighted in Section 6.
2. Neural network forecast model inputs and topology
2.1. Data observation
In this study, the historical load data of a power grid operated
by ISO New England, Regional Transmission Organization are used
to predict future load demand. Meteorological variables such as dry bulb temperature and dew point are also employed to train the load forecasting
model under study. The hourly load and weather data from 1st January to 31st December 2008 are used as inputs in the forecasting model. Fig. 1 depicts the hourly load pattern of year 2008.

Fig. 2. Correlated inputs for the proposed hourly load forecasting model.

The seasonality of the load can be observed in the data series.
This seasonality is formed by the repeated cycle of load patterns driven by changes in seasonal temperature. It is also found that the load demand in summer was significantly higher than in autumn and spring. This trend is caused by changes in individual energy consumption, such as air conditioning use.
2.2. Proposed ANN forecasting model and its inputs
The load and weather data of an ISO New England power grid at 15-min intervals over two years (2008 and 2009) are employed to train and test the NN-based load forecasting model, as reported in (Shamsollahi, Cheung, Chen, & Germain, 2001). The input data of the NN forecasting model are divided into training, testing and validation data sets. The training data are used to train the model, and the seasonal data of 2009 are used to test and measure the accuracy of the model. The type and number of inputs in the forecasting model are particularly important to enhance its performance. There are no specifically defined rules for selecting inputs for the forecasting model. However, a suitable selection can be carried out based on field experience and expertise (Drezga & Rahman, 1998). In this
research, the inputs of the ANN-based forecasting model are selected based on the correlation of specific input variables with load demand. Fig. 2
shows the inputs for the hourly load forecasting model proposed
in this paper.
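The correlation-based screening described above can be sketched as follows. The candidate series names and the use of the absolute Pearson coefficient are illustrative assumptions; the paper does not spell out its exact correlation procedure.

```python
import numpy as np

def rank_inputs_by_correlation(candidates, load):
    """Rank candidate input series by the absolute Pearson correlation
    of each series with the load series (highest first)."""
    scores = {name: abs(float(np.corrcoef(series, load)[0, 1]))
              for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Inputs whose series rank near the top of this list are the natural candidates to keep in the model.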
As shown in Fig. 2, the output for the proposed model com-
prises Ld (w, d, h), which represents the predicted load demand
over a particular hour of the same day and week. In order to enhance
the training performance, the inputs (i.e., the historical lagged val-
ues of load demand, meteorological and exogenous variables) are
integrated in the proposed model and further described as follows:
• Ld (w, d, h-1): the load demand for the previous hour of the same day and week
• Ld (w, d-1, h): the load demand for the same hour of the previous day of the week
• Ld (w-1, d, h): the load demand for the same hour and day of the previous week
• Day of week: which day of the week (first, second, and so on)
• Working day or off day (type of day)
• Hour of day
• Dew point temperature
• Dry bulb temperature.
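The eight inputs listed above can be assembled into one feature vector per forecast hour. The dictionary-based containers and the (week, day, hour) indexing below are illustrative assumptions, not the paper's actual data format:

```python
def build_input_vector(load, weather, w, d, h):
    """Assemble the eight model inputs for hour h of day d in week w.
    `load` maps (week, day, hour) to demand in MW; `weather` maps
    (week, day, hour) to a (dew_point, dry_bulb) pair."""
    dew_point, dry_bulb = weather[(w, d, h)]
    return [
        load[(w, d, h - 1)],   # Ld(w, d, h-1): previous hour, same day
        load[(w, d - 1, h)],   # Ld(w, d-1, h): same hour, previous day
        load[(w - 1, d, h)],   # Ld(w-1, d, h): same hour/day, previous week
        d,                     # day of week (1..7)
        1 if d <= 5 else 0,    # type of day: 1 = working day, 0 = off day
        h,                     # hour of day (0..23)
        dew_point,             # dew point temperature
        dry_bulb,              # dry bulb temperature
    ]
```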
2.3. Working day or off-day
In practice, the load demand in working days is normally differ-
ent from off-days due to variations in human activities. Therefore,
in order to increase the forecast accuracy, type of day is used as an
input in the proposed ANN forecasting model.
2.4. Day pointer D (k) and hour pointer H (k)
The load demand varies with the time of day and the day of the week. As an illustration of this trend, Fig. 3 shows the load profile of an ISO New England grid over a week. It can be observed from the figure that the load demand varied from Monday to Sunday and from morning to night.
A cyclic pattern can also be observed in the one-month load plot across the hours of a day and days of a month. As an example, Fig. 4 presents the load curve of an ISO New England grid over a month. It can be seen from the figure that the load demand over weekends was relatively higher than on weekdays.
Overall, it can be concluded from the one-week and one-month load profiles investigated above that load demand on weekends is fairly different from that on weekdays. In order to develop an accurate load forecasting model, the type and hour of day can be used as inputs in the proposed model.
2.5. Weather input variables
The impact of meteorological conditions on load demand has been reported in (Ching-Lai, Watson, & Majithia, 2005; Drezga & Rahman, 1998; Lai Hor, Watson, & Majithia, 2005; Ruth & Lin, 2006). These studies showed that load demand varies with changes in temperature, season and other meteorological variables. In this study, an ISO New England grid web database, which provides dew point and dry bulb temperature along with historical load data, is employed to investigate the effect of weather on load demand. Based on these data, Fig. 5 plots the relationship between dry bulb temperature and load demand. The graph shows that as the dry bulb temperature increases, the power system demand also rises, and vice versa. Studies of human comfort indicate that dry bulb temperatures in the range of 45 °F to 65 °F are comfortable for humans, and the load demand is low within this temperature range.
3. Proposed ANN architecture
In this study, a three-layer feed-forward neural network, which has one input, one hidden and one output layer, is used to forecast the load demand (see Fig. 6). A large number of configurations were experimented with, using different settings for various parameters of the neural network architecture (e.g., the number of hidden layers, the number of neurons per hidden layer, and different types of network transfer function). Moreover, the NN inputs with the lowest forecast error are selected, and the model performance is also analysed with various network parameters. As shown in Fig. 6, the proposed neural network architecture is 8-20-1, i.e., 8, 20 and 1 neurons in the input, hidden and output layers, respectively.
3.1. Particle swarm optimization
Particle swarm optimization (PSO) is a population-based heuristic search approach that seeks the global optimal solution. Eberhart, an electrical engineer, and Kennedy, a social psychologist, developed this evolutionary computation technique in 1995 (Kennedy & Eberhart, 1995). The literature shows that PSO has proven itself a powerful optimization tool applied to a wide variety of real-world problems (AlRashidi & El-Hawary, 2009).

Fig. 3. The one-week load profile of an ISO New England grid (January 1–7, 2008).

Fig. 4. The one-month load profile of an ISO New England grid (January 1–31, 2008).

PSO is inspired by the sociological behaviour of birds or fish moving in search of food: the birds or fish search for food using both their own best experience and the social experience of the group (Fig. 7).
In the PSO technique, each candidate in the population is called a particle, which tends to find the best solution based on its own and its neighbours' experience in a multidimensional search space. A group of particles is called a swarm, which tends to find the optimal solution to a certain optimization problem. A major advantage of the PSO technique is that only two parameters, the velocity and position of the particles, need to be adjusted. Each particle updates its position and velocity based on its own and social experience, as expressed by Eqs. (1) and (2) (Eberhart & Kennedy, 1995).
v_i^(k+1) = w·v_i^k + c1·r1·(Pbest_i^k − x_i^k) + c2·r2·(gbest^k − x_i^k)   (1)

x_i^(k+1) = x_i^k + v_i^(k+1)   (2)
where:
c1 and c2 are positive constants, which control the personal and global components of the algorithm;
r1 and r2 are randomly generated numbers within the range [0, 1];
w is the inertia weight;
Pbest_i^k is the personal best position achieved by the particle, based on its own experience;
gbest^k is the global best position achieved by all particles, based on the overall swarm's experience;
k is the iteration index;
v_i^k is the current velocity of the particle;
Fig. 5. Relationship between dry blub and power load demand.
Fig. 6. The proposed ANN architecture of the load forecasting model.
v_i^(k+1) is the new velocity of the particle;
x_i^k is the current position of the particle;
x_i^(k+1) is the new position of the particle.
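The velocity and position updates of Eqs. (1) and (2) amount to a few vectorized lines. The coefficient values below (w = 0.7, c1 = c2 = 1.5) are common defaults, not values taken from the paper:

```python
import numpy as np

def pso_update(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One step of Eqs. (1)-(2) for a whole swarm.
    x, v, pbest have shape (n_particles, n_dims); gbest has shape (n_dims,)."""
    rng = rng if rng is not None else np.random.default_rng()
    r1 = rng.random(x.shape)   # fresh random numbers in [0, 1)
    r2 = rng.random(x.shape)
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v_new, v_new
```

When a particle sits exactly at both its personal best and the global best, the two attraction terms vanish and only the inertia term w·v remains, which is what gradually damps the swarm's motion.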
3.2. Global best particle swarm optimization (GPSO)
The Global best or Gbest version of PSO can achieve the global
optimum solution and produce a better result than the local best
version of PSO. In the Gbest PSO, the particle position is influenced
by his own previously visited and neighbourhood particles best
position. Consequently, the Gbest PSO can potentially find the opti-
mal solution based on the experience of all particles in a swarm. It
will most likely to achieve the global optimum point using Eqs.
(3) and (4). At this position, updating the process reflects the social
influence of all particles (Razana, Shamsuddin, & Ab. Aziz, 2009). For
the Gbest PSO, the velocity and position of particles are calculated
as follows:
v_i^(k+1) = w·v_i^k + c1·r1·(y_i^k − x_i^k) + c2·r2·(ŷ^k − x_i^k)   (3)

x_i^(k+1) = x_i^k + v_i^(k+1)   (4)
where:
c1 and c2 are acceleration constants, which define the contribution of the cognitive and social components in the velocity update process;
r1 and r2 are randomly generated numbers in the range [0, 1];
v_i^k is the velocity of the ith particle at iteration k;
x_i^k is the position of the ith particle at iteration k;
y_i^k is the personal best position of the ith particle, and ŷ^k is the global best position of the swarm.
Fig. 7. Flow diagram of the Gbest PSO algorithm.
Fig. 8. The Pseudo code of the Gbest PSO algorithm.
In the Gbest algorithm, the velocity is updated with the best value found by the entire swarm. In Eq. (3), the terms weighted by c1 and c2 represent the cognitive and social components, respectively. The pseudo code of the Gbest PSO algorithm is shown in Fig. 8.
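The pseudo code of Fig. 8 can be sketched as a complete Gbest PSO loop. The swarm size, iteration count, bounds and coefficient values below are illustrative defaults rather than settings reported in the paper:

```python
import numpy as np

def gbest_pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5,
              bounds=(-5.0, 5.0), seed=0):
    """Minimise f over `dim` dimensions with the Gbest PSO of Eqs. (3)-(4)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))      # initial positions
    v = np.zeros((n_particles, dim))                 # initial velocities
    pbest = x.copy()                                 # personal bests y_i
    pbest_val = np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()             # global best
    g_val = float(pbest_val.min())
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # Eq. (3)
        x = x + v                                               # Eq. (4)
        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        if pbest_val.min() < g_val:
            g = pbest[pbest_val.argmin()].copy()
            g_val = float(pbest_val.min())
    return g, g_val
```

For example, `gbest_pso(lambda p: float((p ** 2).sum()), dim=3)` drives the sphere function towards its minimum at the origin.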
3.3. Artificial neural network (ANN) training using PSO
The learning process of a neural network with PSO is carried out based on the particles' positions, where each particle represents a potential solution. Each particle position encodes a set of network weights, and the number of weights in the network determines the dimensionality of the search space. Each particle moves through the search space and, in each training epoch, tries to approach the global minimum. In each epoch, the particle position and velocity are updated according to Eqs. (3) and (4). This process is repeated until a certain learning error threshold is achieved.
At the global minimum point, the weight values (current parti-
cle positions) generate the minimum network learning error. Fig. 9
represents the learning steps of neural network with PSO (den
Bergh & Engelbrecht, 2000; Van den Bergh, 2002). The details of
the steps are explained as follows:
Step 1: The learning of the network is initialized with generation
of PSO population. The initial particle position is assigned as neural
network training parameters (i.e., weight and bias).
Step 2: The network is trained with the particle position.
Step 3: The network generates an error using the provided net-
work weight values in the training process.
Step 4: The particle positions (weight and bias) are updated to
minimize the learning error of the network. The “Pbest” and “Gbest”
values are used to calculate the new velocity using Eq. (3) to obtain
the new position of the particle for the targeted learning error.
Step 5: To calculate a new set of particle positions (weight, bias), the velocity is added to the previous position using Eq. (4).
The difference between the desired and network outputs produces an error, which is called the mean squared error (MSE) of the network. The ANN training algorithm is utilized to reduce the MSE until the threshold criteria are met. The MSE is regarded as the objective function in the training process of the neural network. This error is used to update the weight values of the network to achieve the targeted output. This process continues until either the stopping condition of the minimum target learning error is achieved or the maximum number of epochs is reached, as shown in Fig. 9.
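Steps 1–5 can be sketched end to end: each particle is a flat vector holding every weight and bias of the network, and the fitness is the training MSE. The layer sizes, swarm settings and toy target below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def forward(p, X, n_in, n_hid):
    """Decode one particle into the network's weights and run a forward
    pass: sigmoid hidden layer, linear output (as in the proposed model)."""
    i = n_in * n_hid
    W1 = p[:i].reshape(n_in, n_hid)          # input -> hidden weights
    b1 = p[i:i + n_hid]                      # hidden biases
    W2 = p[i + n_hid:i + 2 * n_hid]          # hidden -> output weights
    b2 = p[-1]                               # output bias
    h = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return h @ W2 + b2

def train_with_pso(X, y, n_in, n_hid=4, n_particles=30, iters=200,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    """Steps 1-5: initialise a swarm of candidate weight vectors, score each
    by its MSE, and update velocities/positions until the epoch budget ends."""
    rng = np.random.default_rng(seed)
    dim = n_in * n_hid + 2 * n_hid + 1       # total weights + biases
    mse = lambda p: float(np.mean((forward(p, X, n_in, n_hid) - y) ** 2))
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))              # Step 1
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([mse(p) for p in x])  # Steps 2-3
    g = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):                   # Steps 4-5, one epoch each
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        vals = np.array([mse(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[pbest_val.argmin()].copy()
    return g, float(pbest_val.min())
```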
3.4. PSO tuning parameters
There are four basic tuning parameters of the PSO algorithm, listed below, which are adjusted to achieve the optimal result:
• The number of particles
• The time interval (Δt)
• c1, the acceleration constant of the cognitive (Pbest) component
• c2, the acceleration constant of the social (Gbest) component.
7. 270 M.Q. Raza et al. / Sustainable Cities and Society 31 (2017) 264–275
Fig. 9. The ANN learning process using the PSO algorithm.
The PSO parameters can be adjusted based on the nature of the optimization problem to achieve higher-quality results. The number of particles in a swarm should be neither very large nor very small. A large swarm increases the complexity of the search and the computational time, while a small swarm does not explore the search space well enough to reach the global optimum. Accordingly, an optimal selection of the swarm size can provide better network performance by reaching the global optimum rather than local optima. Δt is the time interval taken by a particle to move through the search space; a gradual increase in Δt provides a coarser-grained movement within the search space. c1 and c2 are constant values used to weight the Pbest and Gbest terms in the velocity update process. If c2 is larger than c1, the particle is inclined towards the global best solution; otherwise, the particle tends to move towards its personal best solution. Some other parameters may also affect the performance of the PSO algorithm, such as the number of iterations, the stopping criteria and the dimension of the particles.
3.5. Fitness function
The fitness function of the neural network training is calculated as in (Mirjalili, Mohd Hashim, & Sardroudi, 2012). Fig. 10 shows a three-layer neural network with one input, one hidden and one output layer. In the proposed network, n is the number of input nodes, h is the number of hidden-layer nodes, and y is the number of nodes in the output layer. It is assumed that the transfer functions of the hidden layer and output layer are sigmoid and linear, respectively.
The output of the hidden layer can be calculated as follows:

f(S_j) = 1 / (1 + exp(−S_j))   (5)

where j = 1, 2, 3, ..., h; S_j = ∑_{i=1}^{n} W_ij·x_i − θ_j; W_ij is the connection weight from the ith node of the input layer to the jth node of the hidden layer; x_i is the ith input; θ_j is the threshold of the jth hidden-layer unit;
Fig. 10. 1-3-1 artificial neural network (ANN) structure.
n is the number of input nodes. The output of the kth output node can be expressed as follows:

O_k = ∑_{j=1}^{h} w_kj·f(S_j) − θ_k   (6)

where w_kj is the connection weight from the jth hidden node to the kth output node; θ_k is the threshold of the kth output node; and k varies from 1, 2, 3, ..., y.

Finally, the fitness function defined as the learning error (E) of the network can be estimated as follows:

E = ∑_{k=1}^{q} E_k / (q·o)   (7)

where

E_k = ∑_{i=1}^{o} (y_i^k − d_i^k)^2   (8)
Here, q is the number of training samples; o is the number of output nodes; y_i^k is the actual output of the ith output unit when the kth training sample is used; d_i^k is the desired output of the ith output unit when the kth training sample is employed.
The fitness function of the ith training sample can be defined as follows:

Fitness(x_i) = E(x_i)   (9)
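Equations (5)–(9) can be checked numerically with a few small helper functions; the array shapes below are illustrative assumptions:

```python
import numpy as np

def hidden_output(x, W, theta):
    """Eq. (5): sigmoid of the weighted inputs minus the thresholds.
    W has shape (n, h); x has shape (n,); theta has shape (h,)."""
    return 1.0 / (1.0 + np.exp(-(W.T @ x - theta)))

def output_layer(f_S, w_out, theta_out):
    """Eq. (6): linear combination of hidden activations minus the threshold."""
    return w_out @ f_S - theta_out

def learning_error(Y, D, o):
    """Eqs. (7)-(8): sum of squared sample errors normalised by q*o,
    where q is the number of training samples and o the output count."""
    q = len(Y)
    E_k = [float(((y - d) ** 2).sum()) for y, d in zip(Y, D)]
    return sum(E_k) / (q * o)
```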
3.6. Encoding technique
As reported in (Zhang, Zhang, Lock, & Lyu, 2007), three encoding
strategies (i.e., vector encoding, matrix encoding and binary encod-
ing) are used to represent the weight values of neural network. In
this study, the matrix encoding strategy is employed to represent
the weight values of the network as it is suitable for the proposed
feed-forward artificial neural network (ANN) (Zhang et al., 2007).
A 1–3-1 ANN architecture is shown in Fig. 10 and the weight bias
value for each particle as an example can be written as follows:
Particle(:, :, i) = [W1,B1,W ,B2] (10)
where i = 1, 2, 3. . ., the total number of particles.
W1 = [W12; W13; W14],  B1 = [θ1; θ2; θ3],  W2 = [W25; W35; W45],  B2 = θ4   (11)
where W1 is the weight matrix of the hidden layer; B1 is the bias vector of the hidden layer; W2 is the weight matrix of the output layer; B2 is the bias value of the output layer; and W2′ is the transpose of the weight matrix of the output layer. Moreover, each particle is decoded into the weight matrices for calculation of the ANN outputs.
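For the 1-3-1 network of Fig. 10, one particle therefore carries ten values. A flat-vector layout is assumed below as a simple stand-in for the matrix encoding:

```python
import numpy as np

def decode_particle(p):
    """Split a flat length-10 particle into [W1, B1, W2, B2] per Eq. (11)."""
    assert p.shape == (10,)
    W1 = p[0:3]    # hidden-layer weights W12, W13, W14
    B1 = p[3:6]    # hidden-layer thresholds theta1..theta3
    W2 = p[6:9]    # output-layer weights W25, W35, W45
    B2 = p[9]      # output-layer threshold theta4
    return W1, B1, W2, B2
```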
4. Results and discussion
To assess the prediction quality of the proposed load forecasting model, the mean absolute percentage error (MAPE) is measured as in Eq. (12):

MAPE = (1/M) ∑_{i=1}^{M} |L_actual(i) − L_predicted(i)| / L_actual(i) × 100%   (12)

where L_actual is the actual load; L_predicted is the forecasted load; M is the number of data points.
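Eq. (12) translates directly into code (a sketch; it assumes all actual load values are non-zero):

```python
import numpy as np

def mape(actual, predicted):
    """Eq. (12): mean absolute percentage error over M data points."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted) / actual) * 100.0)
```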
Each season is normally represented by two months that capture the weather behaviour of that season. For instance, December and January are considered winter months, and March and April spring months. Similarly, July and August are treated as summer months, and September and October as autumn months. In this study, a 168-h (one-week) ahead forecast represents each season, in order to analyse the performance of the proposed forecasting model under seasonal variations over a year. The first week of January, March, July and September of 2009 represents winter, spring, summer and autumn, respectively (Niu, Shi, & Dash Wu, 2012b).
Figs. 11–14 show the forecasted results of 168-h ahead load pat-
terns for winter, spring, summer and autumn. These results are
generated using different ANN load forecasting models, which are
based on the BP, LM and GPSO (or PSO shown in the figures) training
techniques. The actual load profile measured for each season is also
plotted in the corresponding figures. Meanwhile, Fig. 15 presents
the comparison of the MAPE values achieved using the above tech-
niques. The MAPE obtained using the proposed GPSO-based model
is compared with the BP and LM-based techniques. As described
in the previous sections, the MAPE of the load forecasting model is affected by the uncertainty of meteorological conditions and changes in sociological activities.
Fig. 11. 168-h ahead load forecasts using different techniques for winter.
Fig. 12. 168-h ahead load forecasts using different techniques for spring.
Fig. 13. 168-h ahead load forecasts using different techniques for summer.
Fig. 14. 168-h ahead load forecasts using different techniques for autumn.
It is observed from Figs. 11–14 that the proposed GPSO-based
model produces better forecast results over all the seasons than
the BP and LM-based techniques. As shown in Fig. 15, in summer,
the proposed technique produces a MAPE of 1.40%, which is lower
than the results obtained using the BP and LM-based techniques, at
4.66% and 3.82% respectively. The forecast accuracy of the models
is enhanced by improving the training quality of the network.
Another observation from Figs. 11–15 is that the load demand in summer is higher than in winter and autumn due to variations in seasonal temperature. Meanwhile, the MAPE achieved using the BP, LM and GPSO-based models in winter is higher than in the other seasons due to sociological activities, as shown in Fig. 15.
This means that, owing to the efficiency of its ANN training, the GPSO-based model yields the highest accuracy among the models considered. It also shows that the GPSO-trained ANN model has better generalization capability over large training data sets.
It is also observed from Figs. 11–14 that the load demand varies from 9 GW in the low-demand hours to 20 GW in the peak-demand hours. In the week-ahead (168-h) forecast, the load demand reached a peak of 17 GW, with large fluctuations in the first two days of the week. The BP-trained ANN model produces less accurate results, and its demand curve deviates from the actual load demand, indicating a high forecast error. In addition, Figs. 11–14 show the highest errors on the fifth day of the autumn week and the sixth day of the summer week. Despite fluctuations in the load demand, the proposed GPSO-based model is able to track the actual load better than the BP and LM-trained models.
Table 1 summarizes and compares the results of the 168-h (first week) ahead load demand forecasts over the various seasons in 2009, as reported in detail above. The results show that the GPSO-trained ANN forecasting model produces MAPE values over all the seasons significantly lower than the BP and LM-trained methods. This outcome implies that the proposed GPSO-trained ANN technique provides substantially higher forecast accuracy than the BP and LM-trained ANN methods.
5. Analysis of training technique performance
Generally, a large range of input data is used to train the ANN-
based load forecasting model. However, correlated and normalized
input data can play a significant role in the training process of NN.
A set of input data that contains outliers can affect the performance of load forecasting models. This may lead to poor network training and thus increase the computational cost and convergence time of the network (Gudise & Venayagamoorthy, 2003).
These issues can be resolved by training the network using a set
of highly correlated and preprocessed data. Fig. 16 depicts the rela-
tionship between the mean square error and the number of epochs
of the proposed GPSO-trained ANN forecasting model. The conver-
gence characteristics indicate that the proposed ANN model with
the GPSO training algorithm converges at the optimal level in 111
epochs. In contrast, the other comparative techniques using the BP
and LM-based ANN training take 719 and 368 epochs to converge
optimally for the same stopping criteria. These obtained results
are summarized in Table 2. Overall, the BP and LM-based training
Fig. 15. Comparison of the MAPE values of the seasonal load forecasting using various techniques.
Table 1
Comparison of the MAPE values of the seasonal load forecasting using different techniques.
Season Forecast date Forecast horizon (hours) BP MAPE (%) LM MAPE (%) GPSO MAPE (%)
Winter Jan 1–7, 2009 168 4.303 2.726 1.869
Spring Mar 1–7, 2009 168 4.088 3.163 1.760
Summer Jul 1–7, 2009 168 4.665 3.823 1.408
Autumn Sep 1–7, 2009 168 4.500 2.919 1.577
Fig. 16. The number of epochs vs. the mean squared error for the GPSO-trained forecasting model.
Table 2
Comparison of training epochs using different techniques.
Training techniques Number of epochs
Backpropagation (BP) 719
Levenberg-Marquardt (LM) 368
Proposed technique 111
Table 3
Comparison of the regression analysis of different training techniques for ANN
model.
Training technique Training Testing Validation All
BP-trained ANN model 0.9046 0.9051 0.9022 0.9043
LM-trained ANN model 0.9562 0.9538 0.9667 0.9528
PSO-trained ANN model 0.9924 0.9917 0.9929 0.9923
methods take a larger number of epochs than the proposed GPSO technique, which demands less computational burden and time to converge. In other words, the GPSO training technique achieves the required training, testing and validation targets with the lowest number of iterations (Table 3).
Fig. 17 depicts the regression analysis of the GPSO training technique for the proposed ANN load forecasting model with the given inputs. The regression analysis of the network was performed to evaluate the confidence interval of the training, testing, validation and overall performance of the proposed model. The proposed model produces the highest R2 value compared with the benchmark models, indicating that the forecasted load demand is close to the actual load values. Moreover, the confidence interval of the proposed forecasting model is 99%, meaning that only 1% of the estimated data is not statistically significant for the neural network. A confidence interval of 99% reflects how well the network has adjusted to its error during training.
Table 3 shows a regression comparison of the BP, LM and GPSO training techniques in terms of training, testing, validation and overall performance. The analysis of convergence and regression shows that the proposed GPSO training technique outperforms the BP and LM methods in terms of generalization capability and network learning. Other studies have used a similar technique
(Ince, Kiranyaz, Pulkkinen, & Gabbouj, 2010). Particularly, as shown
in the table, the proposed GPSO training technique generates an
overall R2 value of 0.9923, which is higher than the values pro-
duced by the BP and LM training methods at 0.9043 and 0.9528,
respectively. The forecast accuracy of the GPSO-based model is also
higher than the BP and LM-based methods. Consequently, it can
be concluded that the proposed GPSO-based model has a better
performance than the BP and LM techniques in terms of forecast
accuracy, convergence time and computational complexity. The
proposed forecasting model can be utilized for other applications
such as residential price and wind speed forecasting. The proposed
technique can also be applied for solar PV generation forecasting
for purposes of better planning and operation of power networks
(Raza, Nadarajah, & Ekanayake, 2016).
6. Conclusions
This paper proposes a novel GPSO-trained feed-forward ANN
model to forecast load demand. The proposed forecasting model is
employed to predict a 168-h (one week) ahead load demand over
various seasons of a year. Influential meteorological and exogenous variables are also selected as inputs in the proposed model. In order to enhance the ANN training performance, the GPSO algorithm is applied to update the weights and biases of the network.

Fig. 17. Regression analysis for the GPSO-trained ANN forecasting model.
The simulation results show that the proposed GPSO-based model outperforms the BP and LM-trained ANN forecasting techniques in terms of forecast accuracy. The results also show that the proposed model achieves its minimum MAPE of 1.408% in summer, due to lower sociological and meteorological uncertainty. It converges optimally in 111 epochs, which is significantly fewer than the comparative techniques. In addition, the proposed GPSO training technique produces an R2 value of 0.99 in testing, training, validation and overall performance. The regression analysis of the network shows that the overall confidence interval is 99% for training, testing and validation of the network.
It can be concluded that the GPSO-trained ANN forecasting model offers better forecast accuracy, better generalization capability and a shorter convergence time than the BP and LM-based techniques.
The proposed model can also be utilized for other forecasting appli-
cations such as prices, wind, solar output power and economic
growth forecasts. Future research will investigate inclusion of other
correlated inputs (e.g., humidity information, wind speed, cloud
cover, rainfall, and human body indices) in the proposed forecasting
model to enhance the prediction performance.
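The two accuracy figures reported in this conclusion, MAPE and the coefficient of determination R2, can be computed as in the short sketch below; the load and forecast arrays are made-up illustrations, not the paper's results.

```python
# MAPE and R^2 as used to report forecast accuracy; data are illustrative.
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

def r_squared(actual, forecast):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    ss_res = np.sum((actual - forecast) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

load = np.array([15200.0, 14800.0, 16100.0, 17500.0])  # MW, illustrative
pred = np.array([15000.0, 14900.0, 16400.0, 17300.0])
print(f"MAPE = {mape(load, pred):.3f}%  R^2 = {r_squared(load, pred):.3f}")
```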
References
Abdul Hamid, M. B., & Abdul Rahman, T. K. (2010). Short term load forecasting
using an artificial neural network trained by artificial immune system learning
algorithm. In 12th International Conference on Computer Modelling and Simulation
(UKSim) (pp. 408–413).
AlRashidi, M. R., & El-Hawary, M. E. (2009). A survey of particle swarm
optimization applications in electric power systems. IEEE Transactions on
Evolutionary Computation, 13, 913–918.
Van den Bergh, F., & Engelbrecht, A. P. (2000). Cooperative learning in neural networks
using particle swarm optimizers. South African Computer Journal, 26, 84–90.
Cecati, C., Kolbusz, J., Różycki, P., Siano, P., & Wilamowski, B. M. (2015). A novel RBF
training algorithm for short-term electric load forecasting and comparative
studies. IEEE Transactions on Industrial Electronics, 62(10), 6519–6529.
Chakhchoukh, Y., Panciatici, P., & Mili, L. (2011). Electric load forecasting based on
statistical robust methods. IEEE Transactions on Power Systems, 26,
982–991.
Hor, C.-L., Watson, S. J., & Majithia, S. (2005). Analyzing the impact of weather
variables on monthly electricity demand. IEEE Transactions on Power Systems,
20, 2078–2085.
Dong, B., Cao, C., & Eang Lee, S. (2005). Applying support vector machines to
predict building energy consumption in tropical region. Energy and Buildings,
37, 545–553.
Drezga, I., & Rahman, S. (1998). Input variable selection for ANN-based short-term
load forecasting. IEEE Transactions on Power Systems, 13(4), 1238–1244.
Eberhart, R. C., & Kennedy, J. (1995). A new optimizer using particle swarm theory.
In Proceedings of the Sixth International Symposium on Micro Machine and Human
Science, Nagoya, Japan (pp. 39–43).
De Felice, M., & Yao, X. (2011). Short-term load forecasting with neural network
ensembles: A comparative study [Application notes]. IEEE Computational
Intelligence Magazine, 6(3).
Gudise, V. G., & Venayagamoorthy, G. K. (2003). Comparison of particle swarm
optimization and backpropagation as training algorithms for neural networks.
In Proceedings of the 2003 IEEE Swarm Intelligence Symposium (SIS '03)
(pp. 110–117).
Huang, C.-M., Huang, C.-J., & Wang, M.-L. (2005). A particle swarm optimization to
identifying the ARMAX model for short-term load forecasting. IEEE Transactions
on Power Systems, 20, 1126–1133.
Ince, T., Kiranyaz, S., Pulkkinen, J., & Gabbouj, M. (2010). Evaluation of global and
local training techniques over feed-forward neural network architecture
spaces for computer-aided medical diagnosis. Expert Systems with Applications, 37,
8450–8461.
Kennedy, J., & Eberhart, R. C. (1995). Particle swarm optimization. In Proceedings of
the IEEE International Conference on Neural Networks, Vol. 4, Perth, Australia
(pp. 1942–1948).
Khosravi, A., et al. (2010). Construction of optimal prediction intervals for load
forecasting problems. IEEE Transactions on Power Systems, 25, 1496–1503.
Mirjalili, S. A., Mohd Hashim, S. Z., & Sardroudi, H. M. (2012). Training feedforward
neural networks using hybrid particle swarm optimization and gravitational
search algorithm. Applied Mathematics and Computation, 218, 982–991.
Niu, D.-X., Shi, H.-F., & Wu, D. D. (2012). Short-term load forecasting using
Bayesian neural networks learned by Hybrid Monte Carlo algorithm. Applied
Soft Computing, 12, 1822–1827.
Nowicka-Zagrajek, J., & Weron, R. (2002). Modeling electricity loads in California:
ARMA models with hyperbolic noise. Signal Processing, 82, 1903–1915.
Ranaweera, D. K., Hubele, N. F., & Papalexopoulos, A. D. (1995). Application of
radial basis function neural network model for short-term load forecasting.
IEE Proceedings - Generation, Transmission and Distribution, 142, 45–50.
Raza, M. Q., Nadarajah, M., & Ekanayake, C. (2016). On recent advances in PV
output power forecast. Solar Energy, 136, 125–144.
Razana, A., Shamsuddin, S. M., & Ab. Aziz, F. (2009). The impact of social network
structure in particle swarm optimization for classification problems.
International Journal of Soft Computing, 4, 151–156.
Ruth, M., & Lin, A.-C. (2006). Regional energy demand and adaptations to climate
change: methodology and application to the state of Maryland, USA. Energy
Policy, 34, 2820–2833.
Ruzic, S., Vuckovic, A., & Nikolic, N. (2003). Weather sensitive method for short
term load forecasting in Electric Power Utility of Serbia. IEEE Transactions on
Power Systems, 18, 1581–1586.
Saksornchai, T., Lee, W.-J., Methaprayoon, K., Liao, V., & Ross, R. J. (2005). Improve
the unit commitment scheduling by using the neural-network-based
short-term load forecasting. IEEE Transactions on Industry Applications, 41,
169–179.
Shamsollahi, P., Cheung, K. W., Chen, Q., & Germain, E. H. (2001). A neural network
based very short term load forecaster for the interim ISO New England
electricity market system. In 22nd IEEE Power Engineering Society International
Conference on Power Industry Computer Applications (pp. 217–222).
Sisworahardjo, N. S., El-Keib, A. A., Choi, J., Valenzuela, J., & Brooks, R. (2006). A
stochastic load model for an electricity market. Electric Power Systems Research,
76, 500–508.
Van den Bergh, F. (2002). An analysis of particle swarm optimizers (Ph.D. thesis).
Pretoria, South Africa: Department of Computer Science, University of Pretoria.
Wang, B., Tai, N.-L., Zhai, H.-Q., Ye, J., Zhu, J.-D., & Qi, L.-B. (2008). A new
ARMAX model based on evolutionary algorithm and particle swarm
optimization for short-term load forecasting. Electric Power Systems Research,
78, 1679–1685.
Wang, J., Li, L., Niu, D., & Tan, Z. (2012). An annual load forecasting model based on
support vector regression with differential evolution algorithm. Applied Energy,
94, 65–70.
Yu, F., & Xu, X. (2014). A short-term load forecasting model of natural gas based on
optimized genetic algorithm and improved BP neural network. Applied Energy,
134(December), 102–113 [ISSN 0306-2619].
Zhang, J.-R., Zhang, J., Lok, T.-M., & Lyu, M. R. (2007). A hybrid particle swarm
optimization–back-propagation algorithm for feedforward neural network
training. Applied Mathematics and Computation, 185, 1026–1037.
Zheng, T., Girgis, A. A., & Makram, E. B. (2000). A hybrid Wavelet-Kalman filter
method for load forecasting. Electric Power Systems Research, 54, 11–17.