Artificial Neural Network and Monte Carlo Optimization
for Reservoir Operation
by
Stephen James Klein
A Thesis
Presented to
The Faculty of Humboldt State University
In Partial Fulfillment
of the Requirements for the Degree
Master of Science
In Environmental Systems: Engineering
May 10, 1999
Artificial Neural Network and Monte Carlo Optimization
for Reservoir Operation
by
Stephen James Klein
Approved by the Master's Thesis Committee:
Brad A. Finney, Major Professor Date
Elizabeth A. Eschenbach, Committee Member Date
Robert Willis, Committee Member Date
Charles M. Biles, Graduate Coordinator Date
Ronald A. Fritzsche Date
Dean for Research and Graduate Studies
ABSTRACT
Determining the optimal operation of a reservoir system is frequently hampered
by uncertainty associated with future inflows. Generalized operating policies are a
potentially fast and easy-to-use means of real time operation. They use readily
available predictors (i.e., current reservoir storage and short term predicted inflow)
calibrated against the optimal response of the system. Artificial neural networks
represent the most recent attempt to improve generalized reservoir release models.
This study builds on prior research in water resources planning by investigating the
incorporation of time-lagged inputs of inflow and demand to improve the perfor-
mance of a generalized neural network reservoir release model. This study differs
from much of the previous work on generalized operating rules in that Monte Carlo
optimization is used. Previous research has relied primarily on deterministic data,
but here the Monte Carlo element generates reservoir inflow synthetically. The
nonlinear objective function is to minimize the sum of squared deficits and maximize
hydropower production. Dynamic Programming has been the primary optimization
tool in generalized operating rule research. In this study however, a nonlinear pro-
gramming (NLP) model is developed and applied to a series of hypothetical water
resource systems. The reduced gradient algorithm is used for the solution of the
NLP. The Monte Carlo optimization methodology allows a virtually unlimited pool
of calibration and validation data to be rapidly derived for a variety of reservoir con-
figurations. The performance of the neural network model relative to the nonlinear
programming solution is compared over a range of reservoir storage, demand-deficit,
and streamflow correlation structures in order to investigate its applicability. The
results show that a significant improvement in the performance of a generalized reservoir release neural network can be achieved using time-lagged inputs of inflow and
demand. Furthermore, the ability of a generalized neural network reservoir release
model to incorporate time series information is substantiated.
ACKNOWLEDGEMENTS
I received special assistance from three members of the Department of Envi-
ronmental Resources Engineering at Humboldt State University: Professors Brad
Finney, Robert Willis, and Elizabeth Eschenbach.
Professor Finney tirelessly encouraged me and guided me through this project,
from beginning to end. Professor Willis gave me the training that allowed me to
conduct the research. Professor Eschenbach edited and critiqued the final versions
of the thesis.
Thank you all for your invaluable assistance.
Finally, I dedicate this thesis to my wife, Tirian, who gave me constant love
and support throughout my graduate studies.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
NOTATION
1.0 INTRODUCTION
2.0 LITERATURE REVIEW
2.1 Artificial Neural Network Modeling
2.1.1 Strengths and Weaknesses
2.1.2 Topology
2.1.3 Preprocessing
2.1.4 Training
2.1.5 Time Series Modeling in Hydrology
2.2 Generalized Reservoir Operating Rules
2.3 Artificial Neural Networks In Water Resources Planning
2.3.1 Single Reservoir Management
2.3.2 Multiple Reservoir Management
2.3.3 Large Scale Water Projects
2.4 Optimization and ANN
2.5 Study Contributions
3.0 METHODOLOGY
3.1 Development of the Reservoir Release Model
3.1.1 Reservoir Release Mathematical Program
3.1.2 Overview of Monte Carlo Optimization Procedure
3.1.3 Generating Synthetic Reservoir Inflow
3.1.4 Determining Incremental Flow and Reservoir Demand
3.1.5 Optimization
3.2 Development of ANN Generalized Reservoir Release Model
3.2.1 Artificial Neural Network Training
3.2.2 Neural Network Reservoir Release Simulation Model
3.2.3 Development of a Standard Operating Procedure
3.2.4 Feature Selection and Internal Network Architecture
3.2.5 Stopping Criteria
3.2.6 Reservoir Simulation Scenarios
3.2.7 Experimental Design Approach
4.0 RESULTS AND DISCUSSION
4.1 Neural Network Configuration
4.2 Feature Selection
4.3 Reservoir Simulations
4.3.1 Simulation Results: Sum of Squared Deficits
4.3.2 Simulation Results: Total Deficit, Hydropower, and Spill
4.3.3 Frequency Characterization
4.4 Effect of Annual Lag-1 Autocorrelation Coefficient
4.5 Application to Real Time Reservoir Operation
5.0 CONCLUSIONS
6.0 SUGGESTIONS FOR FURTHER RESEARCH
REFERENCES
APPENDIX A - MPS Matrix Generator - (mg.f)
APPENDIX B - MINOS Subroutine FUNOBJ - (opt.f)
APPENDIX C - Neural Network Reservoir Release Program - (nn_resv_sim.cpp)
LIST OF TABLES
1. Annual statistics for synthetic reservoir inflows compared with the calibration inflow series (acre-feet x10^3)
2. Coefficients and constants for Equation (22)
3. Cross-validation test using a P6-200MHz PC
4. Configurations of four different reservoirs (acre-feet x10^3)
5. Target values of the ODI index
6. Sum of squared deficits as a percentage of the NLP solution for the results of the generalized reservoir release neural network model
7. Sum of squared deficits as a percentage of the NLP solution for the results of the standard operating procedure
8. Total deficit as a percentage of the NLP solution for the results of the generalized reservoir release neural network
9. Total deficit as a percentage of the NLP solution for the results of the standard operating procedure
10. Hydropower production as a percentage of the NLP solution for the results of the generalized reservoir release neural network
11. Hydropower production as a percentage of the NLP solution for the results of the standard operating procedure
12. Summary of the computer run time requirements for calibrating the generalized reservoir release neural network
LIST OF FIGURES
1. Illustration of a typical feedforward neural network
2. Flow diagrams: (i) and (ii), of the two primary neural network learning modes
3. Functional plane analogy to the optimization process used by neural networks
4. Overview of the Monte Carlo optimization procedure
5. Trade-off curve representing the nondominated solutions of the objective function
6. Flow diagram of the generalized reservoir release simulation program, nn_resv_sim
7. Flow diagram of the standard operating procedure
8. Experimental design approach, comparing neural network i) training procedure and ii) testing procedure
9. Sum of squared demand deficits with respect to paired time-lagged neural network inputs of It and DEMt
10. Total demand deficit with respect to paired time-lagged neural network inputs of It and DEMt
11. MSI=0.23, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI
12. MSI=0.84, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI
13. MSI=1.7, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI
14. MSI=3.4, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI
15. MSI=1.7, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI for two test scenarios
16. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.04
17. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.04
18. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.09
19. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.09
20. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.17
21. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.17
22. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.31
23. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.31
24. The performance of five- and twelve-lag generalized neural network reservoir release models for increasing values of the lag-1 autocorrelation coefficient
NOTATION
W = Weight matrix
wij = Individual neural network weight (dimensionless)
d = Generic desired response of the system
o = Generic output of the system
= Generic distance generator
= Indicates an adjustable weight matrix
η = Step size
δ = Mathematical representation of local error
α = Momentum factor
= Number of monthly time periods in operational time horizon
t = Time increment (month)
= Generic integer increment
= Seasonal index (month of year)
= Hydraulic head (feet)
x = Generic input variable
y = Generic output variable
= Flow (acre-feet x10^3/month)
= Hydroelectric power production
GWh = Gigawatt-hours
S = Generic reservoir storage term
St = Reservoir storage (acre-feet x10^3)
= Maximum reservoir storage (acre-feet x10^3)
= Minimum reservoir storage (acre-feet x10^3)
RDt = Reservoir release for demand (acre-feet x10^3)
RHt = Reservoir release for hydropower (acre-feet x10^3)
DEMt = Demand (acre-feet x10^3)
= Reservoir spill (acre-feet x10^3)
= Volumetric monthly incremental flow (acre-feet x10^3)
= Maximum combined RDt, RHt release (acre-feet x10^3)
= Minimum RDt + RHt release (acre-feet x10^3)
= Generic constant
b0, b1, b2 = Quadratic regression parameters
= b0
= b1/2
= b2/4
= Hydroelectric power plant efficiency (dimensionless)
= Deficit penalty weighting (dimensionless)
= Hydropower benefit weighting (dimensionless)
= Monthly volumetric flow requirement for fish (acre-feet x10^3)
= Seasonal weighting factor on demand (dimensionless)
1.0 INTRODUCTION
The purpose of research into generalized reservoir release models in water re-
sources planning has been to create useful real time operational models. The accuracy of the release decisions made by generalized models, relative to the optimal decisions, depends on the models' ability to generalize from the conflicting and competing information contained in optimal release decisions and the associated stochastic reservoir inflow and demand information. Furthermore, generalized reservoir re-
lease models are an attempt to take advantage of the computational efficiency of
input/output models. Young [1967] first demonstrated the applicability of gener-
alized models by using linear regression in deriving generalized reservoir operating
policies for single reservoir systems using the results of dynamic programming.
More recently, several studies have demonstrated the ability of neural net-
works to perform as generalized reservoir release models. Neural networks are in-
put/output or black box models in that inputs are simply mapped to outputs.
Neural networks have their origins in the field of artificial intelligence, which over
the last several decades has developed a series of mathematical models attempting
to mimic the processing capabilities of the human brain. One of these capabilities
is the ability of the brain to use parallel processing in order to rapidly interpret
large amounts of information through pattern recognition. Much of the value of
neural networks to scientists and engineers can be found in the fact that they are
extremely computationally efficient. Once trained (i.e., calibrated), neural networks
can produce a result in a matter of seconds. Recent studies include generalized
neural network models trained with the results of dynamic and linear programming
on several operational scales: single reservoir [Sakakima et al., 1992; Raman and
Chandramouli, 1996], multiple reservoirs [Saad et al., 1994], and large-scale water
projects [Liu and Yao, 1995].
It has also been shown in the hydrology literature that neural networks perform comparably to, and in some instances outperform, statistical time series models. Raman and Sunilkumar [1995] used a neural network to synthesize reservoir inflows
and compared the neural network with an AR(2) model. Hsu et al. [1995] devel-
oped an artificial neural network model to predict the rainfall-runoff relationship for a medium-size basin and compared the neural network with an ARMAX (au-
toregressive moving average with exogenous inputs) time series model. Vaziri [1997]
compared an artificial neural network with an ARIMA model to predict Caspian
Sea water surface level.
The purpose of this study is to investigate whether a performance increase in
a generalized reservoir release neural network can be achieved by adding lagged
inputs of inflow and demand to the three input neural network used by Raman and
Chandramouli [1996]. The purpose is also to show the applicability of the release
model over a range of reservoir storage, demand-deficit, and streamflow correlation
structures. Neural network training (i.e., calibration) data is developed from Monte
Carlo optimization of a nonlinear programming (NLP) problem.
Previous generalized neural network studies have relied on deterministic data,
with the exception of Saad et al. [1994] who used synthetic data, and on dynamic
programming, with the exception of Liu and Yao [1995] who used linear program-
ming. The present study uses nonlinear Monte Carlo optimization. The Monte
Carlo element allows for the rapid generation of reservoir inflow and demand in-
formation. This information, once optimized, is used as training and testing data
by the neural network. The use of nonlinear programming (NLP) allows for simple
parameterization of the reservoir's configuration (i.e., size).
The nonlinear objective is to minimize the sum of squared deficits of release
for residential and industrial demand while maximizing hydropower production.
The objective is optimized using the MINOS software package, a general-purpose
optimizer utilizing a reduced gradient algorithm. The noninferior solutions of this
multiobjective programming problem depend on the relative penalty and benefit weights given to the squared demand deficit and hydropower production, respectively.
The noninferior solution investigated in this study is the point that maximizes the
production of hydroelectric power with little or no tradeoff in terms of the sum of
squared deficits.
Lastly, in order to study the behavior of the generalized neural network model,
two indices were developed. The first, the maximum storage to average annual inflow ratio (MSI), provides a means of describing different size classes of reservoirs. The second, the optimal deficit to average annual inflow ratio (ODI), provides a means of uniformly investigating a relative deficit over a range of MSI reservoir configurations.
These indices, along with a systematic variation in the lag-1 autocorrelation parameter in the streamflow generator, allow for the examination of the performance of
the generalized neural network release model over a range of storage configurations,
demand-deficits, and streamflow correlation structures.
2.0 LITERATURE REVIEW
The literature review is divided into five principal sections: Artificial Neural
Network Modeling, Generalized Reservoir Operating Rules, Artificial Neural Net-
works in Water Resources Planning, Optimization and Artificial Neural Networks,
and Study Contributions. The first section is an introduction to artificial neural
network computing with specific discussions of strengths and weaknesses, topology,
preprocessing, training, and time series modeling in hydrology. The second section
is a review of the literature on generalized reservoir operating policies. The third
section is a survey of neural networks in water resources planning with specific con-
sideration of single reservoir management, multiple reservoir management, and large
scale water project management. The fourth section examines neural networks as
an optimization tool in water resources outside the context of generalized operat-
ing rules. The final section discusses this study's contributions to water resources
planning and management.
2.1 Artificial Neural Network Modeling
Neural networks are an attempt to perform pattern recognition by mapping inputs to outputs using mathematical models inspired by the power and flexibility of the human brain. The literature uses the terms neural network (NN) and artificial neural network (ANN) interchangeably. When consulting
the literature, it is helpful to recognize that artificial neural networks may be referred
to as neurocomputing, network computation, connectionism, parallel distributed
processing, layered adaptive systems, self-organizing networks, or neuromorphic
systems or networks. Collectively, the variety of names synonymous with neural
networks exemplifies the diverse and varied nature of neural computing [Zurada,
1992].
Neural networks must be taught or trained (i.e., parameter estimation). Learn-
ing corresponds to changes in weights as the network is exposed to new patterns,
associations, and functional relationships. Zurada [1992] observes that neurocom-
puting lies somewhere between engineering and artificial intelligence. While the mathematical techniques used are engineering-like (i.e., reduced gradient optimization),
an experimental ad hoc approach must be taken because the field lacks a theory
for selecting an appropriate network architecture for a particular application. The
subsections that follow draw primarily upon Bishop [1995] and Zurada [1992], recommended by Salle [1999] in a review of the best books on neural networks, as well
as the NeuroSolutions online users manual [Lefebvre et al., 1997].
2.1.1 Strengths and Weaknesses
Artificial neural networks are of interest to scientists and engineers for their
ability to generalize and function as black box models. Generalization takes place
when a network generates reasonable outputs when presented with an unseen data
set. Neural networks function as black box models in that prior knowledge of the
system under investigation is not required; thus, they are useful in problems where the underlying cause-and-effect mechanisms are not well understood.
Neural network models are considered robust in that small input errors typi-
cally have minimal impact on output. They are easy to implement as a result of
recent advances in software engineering and personal computing that have made
off-the-shelf neurocomputing software readily available. Finally, the neural network
topology itself is extremely flexible in that it can readily incorporate any number of
inputs and outputs [Thirumalaiah and Deo, 1998; Medsker et al., 1998; Zhang and
Stanley, 1997].
Medsker et al. [1998] elaborated on the disadvantages of artificial neural net-
works by making the following points. Neural networks in general cannot guarantee
an optimal solution or be repeatable in terms of reproducing the same internal net-
work weights. Neural networks can give significant importance to input variables
that come into conflict with traditional theory. Finally, training times can be ex-
cessive and can be compounded by the fact that determining network architecture
frequently requires a trial-and-error approach.
2.1.2 Topology
Neural networks are composed of a large number of interconnecting processing
elements (PEs), also referred to as neurons, nodes or axons. The form of the in-
terconnections between PEs provides a means of neural network classification. In a
fully recurrent neural network, every PE is connected to every other PE, including
itself. For a neural network to be recurrent, there must be at least one feedback
connection allowing information from one input presentation to affect future out-
puts. The feedforward neural network is the specialized case where the network's
connections are restricted to the forward direction. Training (i.e., parameter esti-
mation) is straightforward and far more reliable in feedforward networks than with
recurrent networks, as output is dependent only on current input. The overwhelm-
ing majority of neural network studies in water resources are feedforward rather
than recurrent.
Feedforward networks have PEs arranged in layers with one or more hidden
layers sandwiched between input and output layers. A generic feedforward neural
network is sometimes referred to as a multilayer perceptron (MLP). The function
of a PE within a feedforward neural network depends on its position within the
network. Figure 1 illustrates three types of PEs whose functionality is typically
defined as follows: an input PE serves as a simple identity map or buffer between
input and output activity; a hidden PE performs as a summing junction at its input,
a nonlinear transformation, and lastly, a splitting node at its output; and an output
PE performs the same as a hidden PE with the exception that there is no splitting
node at its output.
A synapse is a linear mapping between layers of PEs. The synapse is composed of a weight matrix, with each matrix element representing a connection strength.

Figure 1. Illustration of a typical feedforward neural network, modified with permission (Medsker et al., 1996).

The summation function at the beginning of a PE finds the weighted average of all the inputs and may be represented by

x_i = \sum_j w_{ij} x_j \qquad (1)

where x_i is the resultant feeding the i-th processing element, x_j is the output of the previous processing element, and w_{ij} is a connection weight linking PE_j to PE_i. The nonlinear transformation following the summation function is also referred to as a transfer or activation function. Two frequently used transfer functions in neural network studies are the sigmoid function

f(l_i) = \frac{1}{1 + e^{-l_i}} \qquad (2)

and the hyperbolic tangent function

f(l_i) = \tanh(l_i) \qquad (3)

where

l_i = w_i x_i + \theta \qquad (4)

is an adaptable linear transformation within the activation function. The value of w_i is fit during the training process, while \theta is not adaptive but may be parameterized. Equations (1) through (4) follow from Lefebvre et al. [1997].
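As a concrete illustration, the following is a minimal C++ sketch of Equations (1) through (4) for a single sigmoid PE; the input, weight, and parameter values are illustrative and not taken from the thesis.

#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Sigmoid transfer function, Equation (2).
double sigmoid(double l) { return 1.0 / (1.0 + std::exp(-l)); }

// Forward pass through one processing element: weighted sum of the
// inputs (Equation 1), then a linear transformation (Equation 4)
// inside the sigmoid activation (Equation 2).
double forward_pe(const std::vector<double>& x,
                  const std::vector<double>& w,
                  double wi, double theta) {
    double sum = 0.0;
    for (std::size_t j = 0; j < x.size(); ++j)
        sum += w[j] * x[j];              // Equation (1)
    double l = wi * sum + theta;         // Equation (4)
    return sigmoid(l);                   // Equation (2)
}

int main() {
    std::vector<double> x = {0.2, -0.5, 0.8};   // illustrative inputs
    std::vector<double> w = {0.1, 0.4, -0.3};   // illustrative weights
    std::printf("PE output: %f\n", forward_pe(x, w, 1.0, 0.0));
    return 0;
}

The same routine models a hyperbolic tangent PE (Equation 3) by substituting std::tanh for the sigmoid.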
2.1.3 Preprocessing
Feature selection, feature extraction, and input normalization are preprocessing
methods for reducing input dimensionality to improve neural network performance.
Feature selection refers to picking the best subset of inputs from the domain of po-
tential inputs. Feature extraction involves extracting data from a set of inputs and
generating a new smaller transformed set of inputs that preserves the most relevant
information of the original inputs. Data normalization of the neural network inputs
and outputs improves training performance by conditioning the network's inputs
and outputs to the initial random values of the network's weights.
Feature selection involves reducing the dimensionality of a neural network by se-
lecting a subset of features or input variables and discarding the remainder. Bishop
[1995] suggested that any methodology for feature selection involves the establish-
ment of a selection criterion to compare one input subset with another. He further
recommended a systematic means of searching through candidate subsets of input
variables.
Prior knowledge is one approach to perform feature selection by using rele-
vant information in addition to that provided by the training data [Bishop, 1995].
Prior knowledge can be incorporated into the network topology in pre- and post-
processing, as well as during training. Physically based analytical and numerical
models are an abundant source of prior knowledge. Bishop's example involves appli-
cations of neural networks where data varies as a function of time. Prior knowledge in this case involves incorporating time-lagged information into the network topology. This is frequently done by simultaneously presenting successive values of the time-dependent variable, given by

x(t), x(t-1), \ldots, x(t-\tau) \qquad (5)

where \tau is the number of time lags.
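For illustration, a small helper routine of this kind (hypothetical; not taken from the thesis appendices) can assemble the lagged pattern x(t), x(t-1), ..., x(t-tau) for each usable time step of a series:

#include <cstddef>
#include <cstdio>
#include <vector>

// Build lagged input vectors x(t), x(t-1), ..., x(t-tau) from a single
// time series, producing one input pattern per usable time step.
std::vector<std::vector<double>> make_lagged_inputs(
        const std::vector<double>& series, std::size_t tau) {
    std::vector<std::vector<double>> patterns;
    for (std::size_t t = tau; t < series.size(); ++t) {
        std::vector<double> x;
        for (std::size_t k = 0; k <= tau; ++k)
            x.push_back(series[t - k]);  // x(t), x(t-1), ..., x(t-tau)
        patterns.push_back(x);
    }
    return patterns;
}

int main() {
    // Illustrative monthly inflow series.
    std::vector<double> inflow = {3.1, 2.7, 4.0, 3.6, 2.9, 3.3};
    auto patterns = make_lagged_inputs(inflow, 2);
    std::printf("%zu patterns of %zu inputs each\n",
                patterns.size(), patterns[0].size());
    return 0;
}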
Sensitivity analysis of network inputs relative to network output is a second means of feature selection. One way of performing a sensitivity analysis is to freeze
network weights and feed two sets of dithered input data through the network. The
differences between the two sets are then converted to percentages relative to all
other input variables. These percentages represent the output change caused by the
positive and negative dithers [Lefebvre et al., 1997]. Such sensitivity information
is useful in determining the relative importance of each input variable and the
subsequent reduction in dimensionality by eliminating insignificant inputs. The
sensitivity values allow one to see if the corresponding sensitivities follow from
an understanding of the problem, or when understanding is lacking, to determine
significant underlying relationships.
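A minimal sketch of this dithering procedure follows, assuming the trained network is available as a black-box function; the toy network and dither size are illustrative.

#include <cmath>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Estimate relative input sensitivities: feed positively and negatively
// dithered copies of a base input through a frozen, trained network and
// express each output change as a percentage of the total change.
std::vector<double> input_sensitivity(
        const std::function<double(const std::vector<double>&)>& net,
        std::vector<double> x0, double dither) {
    std::vector<double> s(x0.size());
    double total = 0.0;
    for (std::size_t i = 0; i < x0.size(); ++i) {
        double base = x0[i];
        x0[i] = base + dither; double up = net(x0);
        x0[i] = base - dither; double down = net(x0);
        x0[i] = base;
        s[i] = std::fabs(up - down);
        total += s[i];
    }
    for (double& v : s) v = 100.0 * v / total;  // percent of total change
    return s;
}

int main() {
    // Toy stand-in for a trained network; the output depends strongly
    // on the first input and weakly on the second.
    auto net = [](const std::vector<double>& x) {
        return std::tanh(2.0 * x[0] + 0.1 * x[1]);
    };
    std::vector<double> s = input_sensitivity(net, {0.3, 0.7}, 0.05);
    std::printf("sensitivities: %.1f%% %.1f%%\n", s[0], s[1]);
    return 0;
}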
Feature extraction involves extracting data from a set of inputs and generating
a new smaller transformed set of inputs that preserves the most relevant information
of the original inputs. There is the expectation that there will be a reduction in the
dimensionality of the input space with minimal loss of discriminating information.
Principal component analysis is a classical statistical technique for linear feature
extraction. The technique involves truncating lesser components to a specific order.
Neural networks themselves can be made to perform principal component analysis
and can be divided into two categories: those that originate from the Hebbian
learning rule and those that use least squares learning rules [Diamantaras and Kung,
1996]. Hebbian learning requires that the learning signal be equal to the
neuron's output [Hebb, 1949].
Linear rescaling, a common form of pre- and post-processing, is frequently
needed for proper network training as the magnitudes of input/output variables
do not necessarily reflect their relative importance. Normalization can transform
input/output variables to the order unity, while neural network weights can be
randomly initialized within the range [-1, 1]. Without input/output normalization, the solutions for the weights would differ markedly from one another in magnitude, creating conditions poorly conducive to training.
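A simple linear rescaling of this kind might look like the following sketch, which maps a variable onto [0, 1] so that it is of order unity; the storage values are illustrative.

#include <algorithm>
#include <cstdio>
#include <vector>

// Linearly rescale a variable to [0, 1] so that all network inputs
// and outputs are of order unity before training.
std::vector<double> rescale(const std::vector<double>& v) {
    double lo = *std::min_element(v.begin(), v.end());
    double hi = *std::max_element(v.begin(), v.end());
    std::vector<double> out;
    for (double x : v)
        out.push_back((x - lo) / (hi - lo));  // assumes hi > lo
    return out;
}

int main() {
    std::vector<double> storage = {120.0, 340.0, 80.0, 260.0};
    for (double s : rescale(storage)) std::printf("%.3f ", s);
    std::printf("\n");
    return 0;
}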
2.1.4 Training
A second means of classifying neural networks is to differentiate between learn-
ing modes. The two primary modes are supervised and unsupervised learning,
illustrated in Figure 2. The figure shows that for supervised learning when an input
is applied, the desired response d of the system must be explicitly provided by an
external teacher in order for the distance generator to calculate the distance mea-
sure or error between the actual output o and the desired response [Zurada,
1992]. The distance measure is error information which can be used by an updating algorithm to improve on the network's adjustable weight matrix W.
Figure 2. Flow diagrams: (i) and (ii), of the two primary neural network
learning modes. From Introduction to Artificial Neural Systems, 1st edition,
by Zurada [1992]. Redrawn with permission of Global Rights Group, a
division of International Thomson Publishing. Fax 800 730-2215.
With unsupervised learning there is no teacher to provide a desired response,
and thus, there is no explicit error information to send to an updating algorithm
in order to improve on the network's weights. Unsupervised learning must instead
rely on pre-specified internal rules of interaction for updating weights. These algo-
rithms search for redundant patterns, regularities and separating properties. This
process is also known as self-organization [Zurada, 1992]. Statistical principal com-
ponent analysis and neural network principal component analysis are two examples
of unsupervised learning [Bishop, 1995].
Learning can take place on either the epoch or exemplar level corresponding to
on-line or batch training. An epoch is one pass through the entire input series, while
exemplar refers to a single example from the input series. Error updating of the
weights takes place after each exemplar in the case of on-line training. Batch train-
ing performs weight updates at the end of an epoch using an average of accumulated
error information [Lefebvre et al., 1997].
Prior to implementing supervised learning, two things must be decided: an
optimality criterion to be minimized and a gradient search method to compute
optimal network weights. The criterion typically used in neural network studies is
the summation over a training interval of the quadratic cost function, or the mean squared error criterion (MSE),

\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} [d(n) - y(n)]^2 \qquad (6)

with d(n) the desired response, y(n) the network's output, and N the number of
input observations. MSE is the most frequently applied neural network optimality
criterion, although other optimality criteria are possible (e.g., the absolute value).
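A direct transcription of Equation (6) as a sketch, with illustrative desired-response and output series:

#include <cstddef>
#include <cstdio>
#include <vector>

// Mean squared error criterion, Equation (6): the average of the
// squared differences between desired responses d(n) and network
// outputs y(n) over N observations.
double mse(const std::vector<double>& d, const std::vector<double>& y) {
    double sum = 0.0;
    for (std::size_t n = 0; n < d.size(); ++n)
        sum += (d[n] - y[n]) * (d[n] - y[n]);
    return sum / d.size();
}

int main() {
    std::vector<double> d = {1.0, 0.5, 0.0};  // desired responses
    std::vector<double> y = {0.9, 0.6, 0.1};  // network outputs
    std::printf("MSE = %f\n", mse(d, y));
    return 0;
}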
The learning elements of a feedforward neural network can be thought of as a
juxtaposition of three independent planes (Figure 3): the forward activation plane,
the backpropagation plane, and the gradient search plane [Lefebvre et al., 1997].
The neural network designer specifies the topology (Section 2.1.2) of the forward
plane that dictates the form of the adjoining backpropagation plane. These two
planes are attached by the error criterion (i.e. MSE) that sends into the back-
propagation plane the composite error e2 determined by the difference between the
desired signal and the network output. The gradient search plane adjusts the activa-
tion plane weights by using the error information contained in the backpropagation
plane.
The backpropagation algorithm is an innovative means of propagating error
back through the network, from its output toward its input, prior to the evaluation of
derivatives. The fundamental concept behind the backpropagation algorithm is the
evaluation of the relative error contributed by each particular weight also referred to
as the credit assignment problem [Bishop, 1995]. Rumelhart, Hinton and Williams
popularized the backpropagation technique [Rumelhart et al., 1986], although the
mathematical framework necessary for training was published by Werbos [1974].
Figure 3. Functional plane analogy to the optimization process used
by neural networks, reproduced by permission of NeuroDimension Inc.
(Lefebvre et al., 1997).
There are several methods for searching the performance surface: step, momen-
tum, quickprop, and delta-bar-delta. The momentum equation (on-line learning), represented by

w_{ij}(n+1) = w_{ij}(n) + \eta \delta_i x_j + \alpha [w_{ij}(n) - w_{ij}(n-1)] \qquad (7)

is an extension of step learning with the addition of a momentum term, where \alpha is the momentum factor. Equation (7) shows that the updated value of the weight is equal to its current value, plus a step error-adjustment, plus momentum from the value of the previous weight. The constant \eta is referred to as the step size, while the local error is represented by \delta_i. The value of \delta is calculated using error function
derivatives. The evaluation of the gradients is made possible by the fact that the
error function of a neural network contains continuously differentiable functions
of the weights. A complete presentation of the backpropagation algorithm and
associated derivative evaluations can be found in Zurada [1992] and Bishop [1995].
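A sketch of the momentum update of Equation (7) for a single weight; the step size, momentum factor, local error, and input values below are illustrative.

#include <cstdio>

// One on-line momentum update for a single weight, Equation (7):
// new weight = current weight + eta * delta_i * x_j
//              + alpha * (current weight - previous weight).
double momentum_update(double w_now, double w_prev,
                       double delta_i, double x_j,
                       double eta, double alpha) {
    return w_now + eta * delta_i * x_j + alpha * (w_now - w_prev);
}

int main() {
    double w_prev = 0.40, w_now = 0.45;
    double delta_i = 0.12, x_j = 0.8;           // illustrative values
    double w_next = momentum_update(w_now, w_prev, delta_i, x_j,
                                    0.1, 0.9);  // eta, alpha
    std::printf("updated weight: %f\n", w_next);
    return 0;
}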
The typical error surface of a neural network presents a challenging optimization problem: it is nonconvex, of high dimensionality to varying degrees, and has numerous local minima and regions insensitive to variations in network weights.
Reduced gradient search algorithms may encounter difficulties similar to other op-
timization approaches in that they are sensitive to initial starting conditions, are
easily trapped by local optima, and are often ineffective when searching high di-
mension weight spaces [Hsu et al., 1995].
One approach to handling problems associated with local minima is to develop
a more powerful search algorithm. A hybrid approach was taken by Hsu et al. [1995],
using linear least squares and multistart simplex optimization to find a global or near
global solution. However, some authors argue that local minima have not been a
significant problem in training neural networks. Zurada [1992] stated that problems
with local minima were not significant in many of the training cases studied. He
attributed this to stochastic initial conditions of the backpropagation algorithm in
that the internal weights are random and thus the initial square error surfaces are
random. He added that the algorithm has been found to be equivalent to a form
of stochastic approximation [Zurada refers to White, 1989]. Zurada stressed the
importance of incorporating randomness within the learning process to enhance the
stochastic nature of network training. One form of that randomness can be found
in the initial values of a neural network's weights, typically randomized within the interval [-1, 1].
The form of the neuron's activation function, the number of hidden layers and
the number of nodes per hidden layer, play an important role in capturing the com-
plexity of the problem. A trial and error approach is the most common procedure
for determining the number of hidden layers and nodes per hidden layer. If the
internal dimensionality is too small, the network may not have sufficient degrees of
freedom to learn the process. If the internal dimensionality is too large, the network
may not converge during training or may overfit the data [Karunanithi et al., 1994].
Pruning can be used to reduce the internal dimensionality of the network once an
appropriate internal network architecture has been determined. The procedure in-
volves removing connections between various nodes and thus reducing the effective
size of the weight matrix.
One way around a trial and error approach for developing a neural network's
architecture is to use the cascade-correlation algorithm. Karunanithi et al. [1992]
applied a cascade-correlation neural network to estimate flows at an ungaged site on
the Huron River in Michigan and compared the neural network with an analytical
power model in terms of accuracy and convenience. The cascade-correlation neural
network differs from the feedforward neural network in that it can automatically
construct a model that adequately captures the complexity of the problem. The
idea behind Fahlman and Lebiere's [1990] algorithm is to allow the neural network
to incrementally construct its own topology based on training error. The algorithm
decreases training times and allows for more consistency in solving problems.
There are several strategies for stopping the training of a network: stopping
training after a predetermined number of epochs, stopping when the error is below
an acceptable threshold value, or using a cross validation procedure. The goal of
any stop criterion should be to maximize the network's generalization and avoid
overtraining. The latter occurs when the network memorizes the desired data and
thus does a poor job of generalizing. Stop criteria based on iteration number or level of acceptable error require personal judgement to determine what level of error is appropriate or by which iteration an acceptable solution is reached. Cross validation
provides a systematic means of stopping the training process by evaluating the
performance of an independent validation set while simultaneously training. Once
the error in the independent validation set begins to increase, training is stopped.
A third independent data set for measuring final network performance is called a
test set.
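A sketch of cross-validation stopping follows; the epoch-training and validation-error routines are toy stand-ins for a real training package.

#include <cstdio>
#include <limits>

// Toy stand-ins for the training package: each "epoch" of training
// lowers the validation error at first, then raises it (overtraining).
static int g_epoch = 0;
void train_one_epoch() { ++g_epoch; }
double validation_error() {
    double e = g_epoch - 6.0;          // minimum at epoch 6
    return 1.0 + e * e / 100.0;
}

// Cross-validation stopping: train until the error on an independent
// validation set begins to increase, signaling overtraining.
int train_with_cross_validation(int max_epochs) {
    double best = std::numeric_limits<double>::max();
    for (int epoch = 1; epoch <= max_epochs; ++epoch) {
        train_one_epoch();
        double err = validation_error();
        if (err > best) return epoch - 1;  // previous epoch was best
        best = err;
    }
    return max_epochs;
}

int main() {
    std::printf("stopped after epoch %d\n",
                train_with_cross_validation(100));
    return 0;
}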
Neural network learning is a time-consuming process, but once trained, neural
networks are quite fast. The iterative learning process is slow in part because
the backpropagation algorithm needs to update the network's weights based upon
gradient-descent error information. Training time requirements are compounded by
the fact that different input configurations must be evaluated in the feature selection
process. Trained neural networks are quite fast as the computations involved are
primarily matrix multiplication. Training times in this thesis were on the order of
hours, while neural network test runs were on the order of seconds.
2.1.5 Time Series Modeling in Hydrology
The ability of neural networks to model time varying hydrologic phenomena
has been well documented. Raman and Sunilkumar [1995] used a neural network
to synthesize reservoir inflows and compared the results with a lag-2 autoregressive
model. Ichiyanagi et al. [1996] used neural networks to predict reservoir inflows.
Other areas of neural network hydrologic investigation include predicting streamflow [Crespo and Mora, 1993; Karunanithi et al., 1994; Thirumalaiah and Deo,
1998a], flood flow [Thirumalaiah and Deo, 1998b], inland sea water level [Vaziri,
1997], rainfall [French et al., 1992], rainfall-runoff [Hsu et al., 1995; Smith and
Eli, 1995; Minns and Hall, 1996; Carriere et al., 1996; Dawson and Wilby, 1998;
Jayawardena and Fernando, 1998], and water quality transport [DeSilets et al.,
1992; Maier and Dandy, 1996; Gong et al., 1996; Whitehead et al., 1997; Zhang and
Stanley, 1997].
Three of the papers compared an artificial neural network with a time series
model. Vaziri [1997] compared an artificial neural network with an autoregressive
integrated moving average (ARIMA) model to predict the current water level of
the Caspian Sea. Both models produced reasonable results when compared with
recorded levels. The neural network model, on average, underestimated the sea level
by three centimeters, while the ARIMA model overestimated by three centimeters.
Hsu et al. [1995] developed an artificial neural network model to predict the rainfall
runoff relationship for the medium-size Leaf River basin near Collins, Mississippi
and compared it with an ARMAX (autoregressive moving average with exogenous
inputs) time series model. The artificial neural network model outperformed the
ARMAX model giving the authors reason to suggest that a neural network approach
may be a superior alternative to the ARMAX time series approach.
Raman and Sunilkumar [1995] compared an artificial neural network for reser-
voir inflows with a lag-2 autoregressive model AR(2) for Mangalam and Pothund
reservoirs located in the Bharathapuzha basin, Kerala state, India. Twenty-three
years of historic monthly data were divided into three categories corresponding
to training, cross-validation, and testing. Cross validation served as the stopping
criteria. Twelve independent monthly neural networks were used since they out-
performed a single continuous input network. The final neural network consisted
of four inputs and two outputs corresponding to the two previous reservoir inflows
and the predicted inflow for each reservoir. The mean, standard deviation, and
skewness of the predicted inflows were compared for both reservoirs with regard to
the artificial neural network and AR model. The authors concluded that neural
networks are a viable alternative for time series modeling in water resources.
A neural network time series approach was taken by Ichiyanagi et al. [1996]
to predict hourly inflows into the Hatanagi-Daiichi Dam on the Oi River in Japan.
Two separate artificial neural networks were trained based on two categories of
cumulative rainfall events: light (under 200 millimeters) and heavy (over 200 millimeters). Inputs included current and five
previous hourly values of rainfall, five previous river flow rates, the base flow rate,
predicted total rainfall amount, and predicted event duration. The light-rainfall
based neural system did a reasonable job of predicting inflow during light and heavy
rainfall. However, the light-rainfall based system did a poor job in predicting the
peak heavy-rainfall inflow. While the heavy-rainfall neural network did a good job
in predicting peak inflow during heavy rainfall, it did poorly in predicting general inflow. From these observations, the authors decided to use the light-rainfall neural
network in all instances when the accumulated rainfall was below 200 millimeters.
Once 200 millimeters is exceeded, the model switches to the heavy-rainfall neural
network.
2.2 Generalized Reservoir Operating Rules
Regression, probability distribution functions, and neural networks have all
been shown to be capable of performing as generalized reservoir operating rules.
While the first two approaches are reviewed in this section, the third, the neural
network approach, is covered in the next section. Generalized reservoir operating
rules are data driven approaches for incorporating the results of an optimization
into real-time reservoir operation. The goal of such rules is to incorporate, as best as possible, the optimal solution given the stochastic nature of the system.
Young [1967] used least squares regression as a means of identifying an annual
operating rule for a single reservoir calibrated from the results of a deterministic DP.
Current storage volume and projected Monte Carlo inflows served as the predictors
in a regression equation for the optimal reservoir release. Young's work showed
regression as a means of creating a general operating rule to simulate the year-by-
year operation of a reservoir system. Bhaskar and Whitlatch [1980] took a similar
approach to Young except that they performed the simulation step to verify the
performance of the regression equation rather than relying on the value of R2.
Furthermore, their reservoir release policy was based on a monthly rather than an
annual time step as with Young. They found that the best fit model, indicated by
a maximum R2, did not necessarily correspond with the best general operating rule
under simulation. They concluded that a linear release policy gave better results for
a two-sided quadratic loss function, while for a one-sided quadratic loss function, a
nonlinear regression policy gave superior results.
Karamouz and Houck [1982] used an iterative optimization regression simu-
lation approach to improve on Young's [1967] procedure by refining the general
operating rule to increase its correlation with the optimal deterministic DP opera-
tion. Their algorithm accomplished this in a finite number of steps by constraining
the DP to within a specified percentage of release defined by the previous general
regression-operating rule. However, convergence to the best general operating rule
is not guaranteed. Five years later, Karamouz and Houck [1987] compared their
iterative DP-regression-simulation (DPR) model with a stochastic dynamic pro-
gramming (SDP) model. The simulation phase of their study showed that DPR
was more effective in operating medium to large reservoirs, while SDP was more
effective in operating small reservoirs.
Willis et al. [1984] developed a general release policy using a probability dis-
tribution function (PDF) of optimal releases conditioned on observable hydrologic
conditions. The PDF policy gave nearly identical responses when compared to the
optimal release. Monte Carlo simulation was used to generate 101 realizations of
768 months (64 years) of reservoir inflow and river makeup data for the Mad River
Basin. Linear programming (LP) was used to determine the optimal release pattern
for each of the 768 monthly inflow sequences. A statistical analysis of the optimal
releases indicated that there were several good choices for conditioning the PDF:
hydrologic season, current and previous period reservoir storage (St and St-1), and
current and previous period inflow (It and It-1 ). Only current period inflow and
previous period final storage were selected because of computer memory limitations
[Finney, 1998]. The final PDF formulation involved sorting through 77,568 optimal
releases and placing them into 216 class intervals based on It and St-1.
2.3 Artificial Neural Networks In Water Resources Planning
Neural network models have been developed for water resource systems ranging from single reservoir [Sakakima et al., 1992; Raman and Chandramouli, 1996] and multiple reservoir [Saad et al., 1994; Taha and Ghosh, 1995; Sandu and Finch, 1996] to large-scale water projects [Liu and Yao, 1995]. Four of the six models in these
studies are generalized reservoir release models [Sakakima et al., 1992; Raman and
Chandramouli, 1996; Liu and Yao, 1995; Taha and Ghosh, 1995].
2.3.1 Single Reservoir Management
Two generalized reservoir release models for stand alone reservoirs are pre-
sented in this section. Sakakima et al. [1992] studied flood control operation (i.e.,
typhoon) using two different neural network models. Raman and Chandramouli
[1996] investigated reservoir operation in terms of minimizing the squared deficit
of the release for agricultural demand. While both papers use DP to generate the
optimal training response, the work by Raman and Chandramouli is far more com-
parable to Young's [1967] concept of a generalized reservoir release model than is
the work by Sakakima et al.
Sakakima et al. [1992] developed a generalized neural network reservoir release
model for real-time reservoir operation during extreme inflow events (i.e., typhoon).
Teacher's signals or the desired training responses were taken as the optimum re-
lease discharge calculated by dynamic programming (DP). The reservoir's primary
purpose was flood control with the initial storage volume set to zero. Model inputs
included typhoon course, current inflow volume, increasing rate of inflow, and cur-
rent storage volume. Two approaches were taken in simulating real time operation
of Syorenji Dam within Yodo basin in Japan: the knowledge-based approach for pa-
rameter extraction and the prediction approach for hydrograph. The first method
was essentially a parameter pre-processing approach where the current hydrograph
was compared with a database of historic typhoon hydrographs and associated neu-
ral network parameters to find the closest fit. The historic model's parameters were
adjusted by a similarity factor. This methodology worked well in normal flood in-
stances but did poorly in events not well represented in the parameter database (i.e.,
large floods). The second method attempted to address the problem of the large
unseen flood by multiplying a historic normal-flood hydrograph by gained similar-
ities to produce a predicted hydrograph. The prediction approach for hydrograph
outperformed the knowledge-based one for a large unseen flood.
Raman and Chandramouli [1996] also demonstrated that neural networks can
be used to create a generalized reservoir model trained from the results of dy-
namic programming. The neural network model here was compared with a gener-
alized multiple linear regression model calibrated with the same optimization data,
stochastic dynamic programming model, and a standard operating policy. The ob-
jective of the optimization was to minimize the square deficit of the release to meet
irrigation demand using twenty-three years of historic fortnightly data. Raman and
Chandramouli found that the two general operating rules derived from DP outper-
formed stochastic DP and a standard operating policy. Of the generalized models,
the neural network outperformed multiple linear regression for three years of unseen historic data with a 2.2 percent reduction in the sum of squared deficits. Neural
network inputs were limited to a single set consisting of initial storage, predicted
inflow, and demand. The final neural network model consisted of three input nodes,
four hidden sigmoid nodes, and one output node for a (3-4-1) configuration.
2.3.2 Multiple Reservoir Management
A time series neural network disaggregation technique was developed by Saad
et al. [1994] to determine the optimal storage levels in Hydro-Quebec's five-reservoir La Grande River installation (four in series, one in parallel) based on an optimal
aggregated storage volume. The neural network results were compared with a prin-
cipal component disaggregation technique and found to be similar. Five hundred
years of monthly streamflow data were synthetically generated using a first-order
Markovian model. Deterministic DP was then used to minimize the expected cost
of energy production, producing optimal storage levels in each of the five reservoirs.
The potential energy of each reservoir at its optimal storage level was then aggre-
gated into a single value. This aggregate potential energy value and the value for
the previous five months were presented to a feedforward neural network as input
with the optimal storage level as the desired output.
Saad et al. [1994] determined through experimentation that a two-hidden layer
neural network produced a lower root-mean-square error than a single hidden layer, thus giving the neural network a final (6-5-5-5) configuration. Separate neural networks
were then trained for each month of the year using 475 out of the 500 optimal po-
tential energy contents; 25 years were retained as the test data set. An intermediate
stage, termed by Saad et al. as limit adjustment, prevented the output of the neural
network from exceeding the physical limits of each reservoir. The limit adjustment
distributed excess water among the other reservoirs relative to capacity. A graphical
presentation of the test results showed the feasibility of the neural network disag-
gregation, as well as superior performance of neural network disaggregation over
principal component disaggregation for two of the system's reservoirs during June,
the month with the largest root-mean-square training error.
Taha and Ghosh [1995] proposed a Hybrid Intelligent Architecture (HIA) that
incorporated expert systems and connectionist architectures (i.e., neural networks)
and successfully applied the HIA to the control of water reservoirs on the Colorado
River near Austin, Texas. A novel feature of the HIA is its ability to change the characteri-
zation of continuous valued inputs during neural network training. Implementation
consisted of first generating the neural network's architecture using a Nodal Links
Algorithm (NLA). The algorithm used a set of principles to map continuously val-
ued inputs, in this case, a set of floodgate operating rules for the Lower Colorado
River Authority (LCRA). Gaussian distribution functions were then used to dis-
cretize continuously measured values into neural network input intervals. Finally,
the neural network was trained using an Augmented version of the Backpropagation
Algorithm (ABA). The algorithm is equivalent to standard backpropagation with the important exception that it can adjust the shape of the Gaussian distribution functions, and thus control the shape of the neural network input intervals. Testing the
HIA produced a 94.2 percent match to the desired decisions based on LCRA op-
erating rules, compared with a 76.3 percent match for a standard backpropagation
neural network.
2.3.3 Large Scale Water Projects
Liu and Yao [1995] argued that the large-scale operation problem is more ap-
propriately suited for a non-structured or semi-structured decision system rather
than a structured decision system. They also pointed out that non-structured sys-
tems have not worked very well in the past and that in order for them to perform
more satisfactorily, such non-structured systems need a means of adequately inte-
grating available information into a cohesive decision making system. To this end,
they proposed an intelligent operation decision support system (IODSS) in which a
series of generalized neural networks served as the primary decision-makers. Sandu
and Finch [1996] conducted a feasibility study to examine the accuracy of neural
networks in modeling river delta salinity. The authors expressed interest in neural
networks because of their excellent time performance (once trained) as a simulator
to investigate such phenomena as carriage water.
A generalized operating model for a large-scale water project trained from the
results of multiobjective linear optimization was used by Liu and Yao [1995] to sim-
ulate the operation of China's South to North Water Transfer Project (SNWTP).
The project consists of a 1200 km long canal that transfers water from the lower
reach of the Yangtze River to north China. The canal links four river basins (the
Yangtze, Huaihe, Yellow and Haihe) and eight natural lakes. The operation con-
sultant expert sub-systems (OCES) served as the primary decision support system
of what the authors referred to as an intelligent operation decision support system
(IODSS). The OCES system consisted of a sub-OCES or neural network for each
natural lake. Each neural network consisted of 24 model inputs including water
inflow, initial volume, and water demand for each lake and for corresponding water
supply areas for the coming period. Sub-OCES output was the terminal lake vol-
ume. Neural network simulation results were compared with linear programming
for Hong-ze Lake for the month of October for the 30 years of historic training data.
Test results of the simulated operation of SNWTP using unseen test data showed
that the system IODSS/SNWTP was capable of operating China's SNWTP.
Sandu and Finch [1996] conducted a feasibility study to determine the effective-
ness of using a feed-forward neural network to quantify the salinity levels at critical
locations in the Sacramento-San Joaquin Delta under various flow conditions. Their
motivation was to find a faster and more accurate means of predicting salinity levels
than the current Minimum Delta Outflow (MDO) routine of the California Depart-
ment of Water Resources Planning and Simulation Model (DWRSIM). The MDO
routine within DWRSIM simulates flow and salinity conditions using empirical and
heuristic methods. DWRSIM is an extremely large-scale monthly time step reservoir
simulation model of California's Central Valley. Two major Delta locations were
investigated. The inputs to the NN were the flow conditions and gate positions at
various locations within the Delta.
Sandu and Finch [1996] concluded that neural networks warrant further study as
a potential replacement for the current MDO routine and as a means of investigat-
ing carriage water, about which the current MDO routine can provide little insight.
Carriage water is the marginal export cost or the extra water needed to carry a
unit of water across the Delta to the pumping plants for exports while maintaining
constant Delta salinity. The California Department of Water Resources Delta Sim-
ulation Model (DWRDSM, California 1994), an unsteady one-dimensional finite
difference hydrodynamic and salt transport model, has been used to investigate
carriage water. Unfortunately, this model is not a serious candidate for MDO rou-
tine replacement because run times are significantly greater than the current MDO
routine (on the order of ten-thousand times). Sandu and Finch proposed using
DWRDSM to train an artificial neural network.
2.4 Optimization and ANN
Neural networks have been used within three separate types of optimization
methodologies in water resources. The first type is a neural network trained from the
results of a programming problem to produce a generalized optimization model, as
was discussed in the preceding section concerning operating rules in water resources.
Neural networks have also been trained from the results of numerically based phys-
ical models in order to serve as simulators in linked-simulation-optimization (LSO)
routines. The third methodology relates to the problem of noninferior solutions
in multiobjective optimization and the need for decision-makers to make choices
with regard to the objectives. Here the neural network inputs are the objective
values, while the outputs are the weights (decision-maker's preferences) given to
each objective.
The LSO approach with a neural network performing as the simulation link
has been used in several groundwater studies to reduce run times relative to numeri-
cally based physical models. Morshed and Kaluarachchi [1998] used LSO to solve
the inverse problem of estimating the groundwater parameters that
control light hydrocarbon free-product volume and flow predictions. The artifi-
cial neural network (ANN) simulated groundwater flow, while a genetic algorithm
(GA) performed the optimization. ANN-GA LSO has also been applied directly
to groundwater remediation (Rogers, 1992; Rogers and Dowla, 1994; Rogers et al.,
1995; Johnson and Rogers, 1995).
Wen and Lee's [1994] approach was somewhat different in that they used a
neural network as an optimal multiobjective decision-making tool for water qual-
ity management in the Tou-Chen River Basin, Taiwan. The inputs to the neural
network were the objective function values, while the outputs were the weights
(decision-maker's preferences) given to each objective. They used the following
three objectives to formulate their optimization problem: minimize the BOD5 con-
centration in the first reach, minimize the cost of wastewater treatment in the
entire basin, and maximize the river assimilative capacity of BOD5. Compromise
programming was used to minimize the deviation between the ideal and the real ob-
jective solutions within the feasible region. The deviation was weighted to indicate
the relative importance of the decision-maker's preference. The neural network was
trained by using a database generated from optimizations performed with random
sets of objective function values to produce weight values. Thus, if given a set of
objective values, the neural network could produce a set of weights to be used by
compromise programming to find the noninferior solutions to the multiobjective
programming problem.
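
For reference, compromise programming of this kind is typically posed as a
weighted distance minimization. A minimal sketch, with the exponent p and the
normalization treated as assumptions rather than details reported by Wen and
Lee, is

\min_x \; d_p(x) = \left[ \sum_{i=1}^{3} w_i^p \, \bigl| f_i^{*} - f_i(x) \bigr|^p \right]^{1/p}

where f_i^{*} is the ideal value of objective i when optimized alone, f_i(x) is its
value at a candidate solution, w_i is the decision-maker's weight, and p ≥ 1
controls how strongly large deviations are penalized.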
2.5 Study Contributions
One major purpose of this study is to look at water resources planning for
a single stand-alone reservoir from a wider operational perspective than previous
studies involving neural networks. This approach includes varying the size of the
reservoir relative to its inflow, subjecting the reservoir to increasing levels of demand
stress, and varying the correlation structure of reservoir inflows. Parameterization of
the model's inputs allows for greater exploration of the applicability of the technique
and prevents the present study from being relegated to one particular test case, as
had been the situation with previous studies of neural networks in water resources
reservoir planning.
Parameterization of the physical plausibility constraints and the storage-to-
head conversion coefficients in an NLP problem simply requires altering individual
constraint and coefficient values. The ease with which these changes can be made makes
NLP a useful means of generating new neural network training responses (i.e., the
optimal reservoir releases) for multiple reservoir configurations. Previous studies
have relied on dynamic programming to determine the inputs of generalized neural
network reservoir release models, with the exception of Liu and Yao [1995] who used
linear programming. The use of dynamic programming in the present study would
require the rediscretization of the state variable and the decision variables for each
new reservoir configuration. Furthermore, interpolation error would be difficult to
quantify for the different discretizations.
The use of deterministic data has been a limiting factor in previous investi-
gations involving neural networks in water resources planning, with the exception
of Saad et al. [1994] who used a first-order Markovian model. However, Saad et
al. made no investigation into altering the correlation structure of reservoir inflows.
The present study's use of a synthetic streamflow generator allows for an investiga-
tion into the effect of altering the correlation structure of reservoir inflows on the
generalized release model.
Raman and Chandramouli [1996] limited themselves to a three input gener-
alized reservoir release model in part because their objective was to compare the
generalized neural network model with a linear regression model. Linear regres-
sion models may not be a fair point of comparison for neural network models, as they are
not structured to incorporate the level of input dimensionality that a neural net-
work can. Therefore, one of the challenges with neural networks in water resources
planning is to quantify neural network performance relative to some standard of
performance.
As discussed earlier, neural networks are designed to parallel process informa-
tion and thus have the ability to incorporate a large number of inputs, relevant
or not. This characteristic in itself is an attractive feature because water resource
systems are large and complex. However, the opportunity for so many inputs can
at times pose a problem in determining the appropriate or best inputs to a partic-
ular model. Maier and Dandy [1996] concluded that in their paper on predicting
salinity in a river 14 days in advance that the most difficult part of the problem
was in determining the appropriate model inputs. One of the major challenges of
the present study as well is to determine a subset of appropriate inputs out of the
set of all possible inputs.
3.0 METHODOLOGY
There are two principal sections of the methodology chapter. The first section
begins with the development of the reservoir release mathematical program followed
by sub-sections on the Monte Carlo optimization procedure used to determine syn-
thetic reservoir inflows, reservoir demands, and optimal nonlinear programming
(NLP) reservoir release decisions. The second section covers the development of the
study's neural network reservoir release model using the reservoir inflows, reservoir
demands, and the NLP reservoir release decisions developed in the previous sec-
tion. The last part of this section covers the development of the study's reservoir
simulation scenarios and a discussion of the study's experimental design approach.
3.1 Development of the Reservoir Release Model
The purpose of this study is to explore the response of a generalized reser-
voir release model to time-lagged inputs, as well as to explore the applicability of
a generalized neural network release model over a range of reservoir storage config-
urations, demand-deficit conditions, and streamflow correlation structures. Monte
Carlo optimization was applied in order to provide training data to the generalized
reservoir release neural network. This technique required developing a nonlinear
programming (NLP) problem to represent the two objectives of the reservoir sys-
tem: reduction of the sum of squared demand deficits and hydropower generation.
A calibration set of hydrological data had to be found in order to calibrate the
synthetic streamflow generator needed to supply reservoir inflow data to the non-
linear programming problem. The Mad River Basin in California served as the study's
source of hydrologic data. The hydrologic characteristics of this river basin provided
the deterministic data needed to calibrate the synthetic streamflow generator. In
the investigation stage of the study, a series of hypothetical single reservoir instances
or case studies were then superimposed onto the location of the existing reservoir
for the purposes of performing numerical experiments.
3.1.1 Reservoir Release Mathematical Program
The management problem posed is to determine releases that minimize demand
deficits while maximizing power production.
Volumetric inflow to the reservoir is represented by I. Controlled releases are specif-
ically targeted to meet demand (RDt) and generate hydroelectric power (RHt). Un-
controlled releases are defined by spill (SPt). Reservoir demand (DEMt,Reservoir)
is equal to the demand at the pump station (DEMt,Pump Station) less the in-
cremental volumetric flow (Ft) between them.
Equation (8) is a function of the per month release for demand, the per month re-
lease for hydropower, as well as the current and previous period storage (St, St-1).
Thus, an additional benefit of releases intended for demand is the generation of
hydroelectric power. N is the number of monthly time periods in the operational
time horizon. The weighting term θ allows for a penalty on the deficit, while π is
the benefit term associated with hydropower production.
Equations (10) through (13) represent a linear or convex constraint set to the
nonlinear programming problem. Equation (10) maintains mass continuity within
the reservoir system; (11) ensures that the total controlled release falls between
the maximum and minimum; (12) prevents releases for demand from exceeding
that demand; and (13) constrains the model to storage between the maximum at
spillway height and the minimum below which the reservoir is incapable of making
releases. In reality, storage does exceed Smax when the water level is above
the spillway; however, this is typically a temporary condition as the spill rapidly
lowers the water level. No constraint is placed on SPt. Evaporative losses, as well as
storage gains and losses attributable to groundwater seepage into and leakage out of
the reservoir, are assumed to be negligible and are therefore neglected.
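
Written out from the descriptions above, the constraint set takes the following
form (a reconstruction: the algebra and the equation numbers are inferred from
the surrounding text rather than copied from the original):

S_t = S_{t-1} + I_t − RD_t − RH_t − SP_t, \quad t = 1, \dots, N    (10)
R_{min} ≤ RD_t + RH_t ≤ R_{max}    (11)
RD_t ≤ DEM_{t,Reservoir}    (12)
S_{min} ≤ S_t ≤ S_{max}    (13)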
The hydropower production function (Equation 14) can be derived as fol-
lows. The total gigawatt-hours in period t [Loucks et al., 1981] is proportional
to the product of flow, hydraulic head, and plant efficiency. The total flow q in
period t is in thousands of acre-feet, the hydraulic head h is in feet, and the power
plant efficiency e is dimensionless. Modeling the hydroelectric system requires
substituting the total release RDt + RHt for qt and expressing hydraulic head in
terms of reservoir storage. The nonlinear hydraulic head versus reservoir storage
relationship can be expressed using a quadratic regression equation, where b0, b1,
and b2 are the regression parameters. Averaging the beginning and ending storage
levels for a particular time period and making the appropriate substitutions into
Equation (14) yields the production function.
Lastly, the hydropower production function is substituted back into the objective,
giving Equation (15), where DEFt is the demand deficit at the reservoir. The
objective is nonlinear in
that the deficit function is quadratic, as is the hydraulic head in terms of reservoir
storage function within the hydropower production equation. A further nonlinearity
in the hydropower production equation is the result of the nonlinear product of the
state (St, St-1) and the decision variables (RDt, RHt).
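
The sequence of equations in this derivation can be reconstructed as follows;
the unit-conversion constant and the exact intermediate algebra are assumptions
consistent with the stated units, not a transcription of the original. With e the
plant efficiency and k ≈ 1.024 × 10^{-3} GWh per (acre-foot × 10^3)(foot of
head), since one acre-foot falling one foot yields about 1.024 kWh at full
efficiency:

E_t = k\,e\,q_t\,h_t    (14)

h = b_0 + b_1 S + b_2 S^2, \qquad \bar{S}_t = (S_{t-1} + S_t)/2

E_t = k\,e\,(RD_t + RH_t)\left[ b_0 + \tfrac{b_1}{2}(S_{t-1}+S_t) + \tfrac{b_2}{4}(S_{t-1}+S_t)^2 \right]

\min \sum_{t=1}^{N} \left[ \theta\,DEF_t^2 - \pi\,E_t \right], \qquad DEF_t = DEM_{t,Reservoir} - RD_t    (15)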
3.1.2 Overview of Monte Carlo Optimization Procedure
The use of Monte Carlo Optimization to generate synthetic streamflow, via
a random number generator, provided an efficient means of producing optimized
reservoir release data to train and test a neural network. The ability to easily
adjust such streamflow parameters as the annual autocorrelation coefficient in the
synthetic streamflow generator proved to be useful in studying the behavior of a
generalized neural network reservoir release model.
The Monte Carlo Optimization procedure followed in this study is a four step
process as illustrated in Figure 4. The first step involves the use of a statistical
model to generate synthetic reservoir inflows. Step two calculates the incremental
flow between the reservoir and the pump station. Step three calculates the reservoir
demand based on the downstream demand less the incremental flow between the
reservoir and the pump station. The fourth and final step is the actual optimiza-
tion of the reservoir release mathematical program, a linearly constrained nonlinear
optimization problem.
Figure 4. Overview of the Monte Carlo optimization procedure.
3.1.3 Generating Synthetic Reservoir Inflow
The LAST software package (Lane Applied Stochastic Techniques), a series of
precompiled FORTRAN programs designed and developed by the Bureau of Recla-
mation, was used as the study's synthetic streamflow generator. Lane's [1990] tem-
poral and spatial disaggregation technique is used by LAST to generate a seasonal
series while annual information is generated using a standard linear autoregressive
model (i.e., AR(1), AR(2)). While disaggregation modeling is capable of repro-
ducing statistics at more than one spatial or temporal level (i.e., key-station to
substation or annual to seasonal), this study's requirements were limited to tempo-
ral disaggregation in order to derive monthly reservoir inflows at a single location.
An unregulated streamflow record of 46 years was developed to calibrate LAST.
Sixteen years of unregulated streamflow at the USGS gage site on the Mad River
above Ruth Reservoir near Forest Glen, California were available at the beginning
of this study. To extend the period of record, the sixteen years of unregulated
streamflow was regressed against a surrogate watershed with a longer but overlap-
ping period of record. The USGS gage site on the Van Duzen River near Bridgeville,
California, proved to be sufficient. Overlapping historic data were separated into
wet months and dry months (i.e., July through October) and linearly regressed. The
regression of daily flows produced an R2 value of 0.65 in the dry months and a value
of 0.90 in the wet months.
The new data set of 46 years of monthly flow was then verified for normality.
A natural log transformation was applied to dry months, and a one-third power
transformation was applied to wet months, both transformations followed from
Korte [1983]. The AR(1) model was implemented because of the small size of
the annual autocorrelation coefficient (-0.05). The synthetic annual and monthly
flow series from LAST were then statistically verified with the calibration data
set. Table 1 compares 5,000 years of annual synthetic data with the 46 years of
calibration data.
Table 1. Annual statistics for synthetic reservoir inflows compared with
the calibration inflow series (acre-feet x 103).
Statistic        Calibration Series    Validation Series
Years            46                    5000
Mean             217.4                 218.7
SD               89.7                  89.3
Skew             0.11                  0.23
Lag-1 CC         -0.05                 -0.03
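
As an illustration of the lag-1 structure underlying the annual series, a minimal
AR(1) sketch is given below. This is illustrative only: LAST itself is a precompiled
FORTRAN package that also applies the normality transformations and disaggregates
annual flows into monthly values, all of which is omitted here, and the mean,
standard deviation, and lag-1 coefficient in the example are simply the calibration
statistics quoted above.

```cpp
#include <cmath>
#include <random>
#include <vector>

// AR(1) generator for standardized annual flows,
//   z_t = rho * z_{t-1} + sqrt(1 - rho^2) * e_t,
// rescaled to the calibration mean and standard deviation.
std::vector<double> ar1AnnualFlows(int nYears, double mean, double sd,
                                   double rho, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> eps(0.0, 1.0);
    std::vector<double> q(nYears);
    double z = eps(gen);                                      // initial standardized deviate
    q[0] = mean + sd * z;
    for (int t = 1; t < nYears; ++t) {
        z = rho * z + std::sqrt(1.0 - rho * rho) * eps(gen);  // lag-1 dependence only
        q[t] = mean + sd * z;                                 // back to flow units
    }
    return q;
}

int main() {
    // Mean 217.4 and SD 89.7 (acre-feet x 10^3) and rho = -0.05 follow the
    // calibration statistics reported in the text; the seed is arbitrary.
    std::vector<double> flows = ar1AnnualFlows(5000, 217.4, 89.7, -0.05, 42u);
    return flows.empty() ? 1 : 0;
}
```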
3.1.4 Determining Incremental Flow and Reservoir Demand
Three piecewise linear equations were used to calculate the incremental flow
between the reservoir and the pump station, a value needed to calculate the demand
at the reservoir. Incremental flow was calculated by subtracting the estimated flow
below the reservoir from a gage site just below the pump station, approximately 70
miles downstream. Because the USGS gage site was just below the pump station,
monthly values of historic pump station withdrawals had to be added back into the
gage record. A regression analysis was performed by graphing estimated monthly
inflow into the reservoir against the calculated value of the monthly incremental
flow for WY-1982 through WY-1996. A two-part linear approach captured both
high and low flows while keeping the transformations linear.
The demand at the reservoir is equal to the demand at the pump station less
the volumetric monthly incremental flow Ft between them, where T is a seasonal
index for each month. The constraint equation ensures that the demand at Ruth
never goes negative. The monthly demand at the pump station is calculated as the
sum of three components: the minimum monthly volumetric flow requirement for
fish (acre-feet x 103), the portion of annual residential demand scaled by a seasonal
weighting factor, and the annual industrial demand apportioned equally to each
month. The seasonal fish flow values (acre-feet x 103) and the seasonal coefficients
for each month are found in Table 2.
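
In symbols, with the component names inferred from the surrounding text rather
than taken from the original equations, these relationships are approximately

DEM_{t,Reservoir} = \max\bigl( DEM_{t,Pump\ Station} - F_t,\; 0 \bigr)

DEM_{T,Pump\ Station} = Fish_T + \omega_T \, DEM_{Residential} + DEM_{Industrial}/12    (22)

where Fish_T is the minimum monthly volumetric fish flow requirement and ω_T is
the seasonal weighting factor, both tabulated by month in Table 2.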
Table 2. Coefficients and constants for Equation (22).

T    Month   Fish flow (acre-feet x103)   Seasonal coefficient
1    Jan     4.6                          0.06
2    Feb     4.2                          0.06
3    Mar     4.6                          0.06
4    Apr     4.4                          0.07
5    May     4.6                          0.09
6    Jun     4.5                          0.10
7    Jul     3.1                          0.12
8    Aug     2.5                          0.12
9    Sep     1.8                          0.10
10   Oct     2.5                          0.10
11   Nov     4.5                          0.09
12   Dec     4.6                          0.06
3.1.5 Optimization
The final phase of the Monte Carlo Optimization Procedure involved the op-
timization of the reservoir release mathematical program, which is a linearly con-
strained nonlinear optimization problem. MINOS (Modular In-core Nonlinear
Optimization System), a FORTRAN-based general-purpose linear/nonlinear opti-
mizer, was used to optimize the release problem. The program evaluates the objec-
tive function using a generalized reduced gradient (GRG) algorithm [Murtagh and
Saunders, 1995].
MINOS requires a main-input file and a user-coded subroutine to solve a lin-
early constrained nonlinear optimization problem. The input file comprises two
sub-files: SPECS and MPS. The SPECS file consists of several lines of implemen-
tation specific instructions to MINOS, such as the size of the problem and the need
to verify the objective gradients numerically. The MPS file contains the explicit
representation of the linear portion of the reservoir release mathematical program.
MPS is a standardized format used by most commercial linear program-
ming codes. A matrix generator automated the construction of the MPS file
for 550 years of monthly data. The FORTRAN code for the matrix generator, mg.f,
can be found in Appendix A.
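
To make the file layout concrete, below is a minimal sketch of a generator
emitting the mass-continuity rows in MPS form. The variable and row names are
invented for illustration, and only the continuity block is shown; the study's actual
generator is the FORTRAN program mg.f in Appendix A.

```cpp
// Minimal sketch of a matrix generator emitting the mass-continuity rows
//   S_t - S_{t-1} + RD_t + RH_t + SP_t = I_t
// in MPS format. Names are illustrative; only the continuity block is shown.
#include <cstdio>

int main() {
    const int N = 12;              // one year of monthly periods
    double inflow[N] = {};         // synthetic I_t values would be filled in here
    std::printf("NAME          RESVNLP\nROWS\n N  OBJ\n");
    for (int t = 1; t <= N; ++t)
        std::printf(" E  CONT%03d\n", t);          // one equality row per month
    std::printf("COLUMNS\n");
    for (int t = 1; t <= N; ++t) {
        std::printf("    S%03d      CONT%03d   1.0\n", t, t);
        if (t < N)                                 // S_t reappears in next month's row
            std::printf("    S%03d      CONT%03d  -1.0\n", t, t + 1);
        std::printf("    RD%03d     CONT%03d   1.0\n", t, t);
        std::printf("    RH%03d     CONT%03d   1.0\n", t, t);
        std::printf("    SP%03d     CONT%03d   1.0\n", t, t);
    }
    std::printf("RHS\n");
    for (int t = 1; t <= N; ++t)   // the t = 1 entry would also carry the fixed S_0
        std::printf("    RHS       CONT%03d   %.3f\n", t, inflow[t - 1]);
    std::printf("ENDATA\n");
    return 0;
}
```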
FUNOBJ is a user-coded subroutine where the nonlinear objective function
is explicitly expressed. Ideally, the first derivatives of the objective function with
respect to the decision variables are known and can be coded by the user into the
subroutine. Having explicit derivative information avoids the numerical estimation
of the reduced gradient using forward and central differences by MINOS. Numerical
derivatives result in a loss in accuracy accompanied by a significant increase in CPU
time. In this study, it was possible to analytically compute the first derivative of
the objective function with respect to the decision variables (St, RDt, RHt and
SPt). The evaluated first derivative expressions were then programmed into the
subroutine along with the nonlinear objective function. The FORTRAN code for
the subroutine FUNOBJ, or opt.f, can be found in Appendix B.
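
As a sketch of the kind of computation such a subroutine performs (written here
in C++ rather than FORTRAN, and using the objective reconstructed in Section
3.1.1, so the sign conventions and the lumped constant ke are assumptions):

```cpp
#include <cstdio>

struct Grad { double dS, dSprev, dRD, dRH; };

// One period's objective term, theta*(DEM - RD)^2 - pi*ke*(RD + RH)*head(Sbar),
// with head(S) = b0 + b1*S + b2*S^2 and Sbar the average of S_{t-1} and S_t,
// together with its analytical first derivatives.
double objectiveTerm(double S, double Sprev, double RD, double RH, double DEM,
                     double theta, double pi, double ke,
                     double b0, double b1, double b2, Grad &g) {
    double Sbar  = 0.5 * (Sprev + S);
    double head  = b0 + b1 * Sbar + b2 * Sbar * Sbar;
    double q     = RD + RH;
    double dHead = 0.5 * (b1 + 2.0 * b2 * Sbar);   // d(head)/dS = d(head)/dSprev
    g.dS     = -pi * ke * q * dHead;
    g.dSprev = -pi * ke * q * dHead;
    g.dRD    = -2.0 * theta * (DEM - RD) - pi * ke * head;
    g.dRH    = -pi * ke * head;
    return theta * (DEM - RD) * (DEM - RD) - pi * ke * q * head;
}

int main() {
    Grad g;
    // Illustrative values only; ke = 7.2e-4 roughly corresponds to the 1.024e-3
    // GWh conversion constant times a 0.7 plant efficiency.
    double f = objectiveTerm(90.0, 100.0, 8.0, 5.0, 10.0,
                             1.0, 1.0, 7.2e-4, 30.0, 0.1, -1.0e-4, g);
    std::printf("f = %g, df/dRD = %g\n", f, g.dRD);
    return 0;
}
```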
The analytical expressions contained in FUNOBJ were verified in two one-
hundred-year runs: one using the analytical derivatives embedded in FUNOBJ
and the second using MINOS's forward and central differencing routines. Both
runs produced the same objective value to 4 significant figures; however, the former
required 6114 calls to FUNOBJ, while the latter required approximately 24 million
calls to the subroutine. The analytical run took just a few minutes, while the
numerical run took approximately 12 hours. Furthermore, the norm of the reduced
gradient was 7.3E-04 for the analytical solution, while the norm was 1.2E-01
for the numerical solution.
The multiobjective optimization problem in this study is characterized by a
vector of two objective functions that are conflicting and competing, one for hy-
dropower generation and the other for reducing the squared deficit as shown in
Equation (15). The solutions to this problem are termed the nondominated or non-
inferior solutions. These two terms imply that as one moves to improve the value
of one objective, the value of the other decreases. The relative magnitude of the
two objective values is controlled by θ and π in the objective function. The trade-
off curve given in Figure 5 is a visual illustration of the nondominated solutions
of the two objectives. The twenty points represented on the curve were generated
using 100-year optimizations of a single set of monthly data for a fixed value of θ
and variations in π. The optimization procedure is Monte Carlo in that the inputs
to optimization (i.e., inflow) are synthetically derived from a calibrated synthetic
streamflow model using a random number generator.
The NLP management solution was selected from the nondominated solutions
at the breakpoint to reduce the generalized release problem from a dual generalized
release problem involving RDt and RHt, to a single generalized release problem
involving RDt.

Figure 5. Trade-off curve representing the nondominated solutions of
the objective function.

This simplification made it easier to study the generalizing behavior
of a reservoir release neural network without worrying about an additional variable.
This simplification was possible as it was rather simple to construct a release policy
for hydropower releases that mimicked the NLP solution at the breakpoint. The
objective function value at the breakpoint maximized hydropower production with
negligible tradeoff in terms of increasing the sum of squared deficits, as illustrated
in Figure 5 at a power production level of around 3,750 GWh.
3.2 Development of ANN Generalized Reservoir Release Model
This section is divided into two subsections. The first discusses the process
of training a neural network with a commercially available neural network soft-
ware package using the results of the Monte Carlo optimization to determine a
generalized release policy for RDt. The second covers the development of the reser-
voir simulation model in the form of a C++ computer program nn_resv_sim.cpp
(Appendix C). The C++ program replicates the trained feedforward plane of the
neural network using training weights imported from the neural network software to
get a generalized release for demand. The program then implements an operating
procedure to determine RHt and SPt, while ensuring that the constraints of the
mathematical program are met (i.e., primarily that mass continuity is maintained
within the reservoir system).
3.2.1 Artificial Neural Network Training
NeuroSolutions, a commercially available neural network software package [Lefeb-
vre et al., 1997] with a graphical user interface, was used to train the reservoir
release neural network using the results of the Monte Carlo optimization proce-
dure. The use of a commercial software package eliminated the need for further
software development and provided the mechanics needed to implement neural net-
work training. The training elements of a feedforward neural network, as discussed
in the literature review (Section 2.1.4) and shown in Figure 3, can be thought of
as a juxtaposition of three independent planes: the forward activation plane, the
backpropagation plane, and the gradient search plane [Lefebvre et al., 1997].
The topology or forward activation plane used in this study is what Lefebvre
et al. [1997] referred to as a generalized feedforward network. This topology is
a variation on the standard feedforward neural network or multilayer perceptron
(MLP) in that connections (i.e., synapses) can jump over one or more layers. The
advantage of the generalized feedforward network is that it can solve problems much
more quickly than the standard feedforward network. In some instances, a standard
feedforward network requires hundreds of times more training epochs than an
equivalent generalized feedforward network [Lefebvre et al., 1997].
NeuroSolutions allows several choices of optimality criterion to be minimized
(i.e., absolute value, least squares, norm, and cost functions raised to a user-defined
power) and also several gradient search methods for computing the optimal network
weights (i.e., step, momentum, quickprop, and delta-bar-delta). However, for the
purposes of this study, the MSE optimality criterion and the momentum search
algorithm were found to be sufficient and were used in all training runs, as discussed
in Section 2.1.4. Specific elements for the generalized feedforward topology were
easy to implement using the software's automated neural network setup procedure
called the NeuralWizard. The NeuralWizard allows one to easily specify the number
of inputs, outputs, and hidden layers, as well as the types of transfer functions and
the choice of stopping criteria.
3.2.2 Neural Network Reservoir Release Simulation Model
A limitation of neural networks in water resources is that they are input/output
models producing generalized decisions. Thus neural networks have no means of
ensuring mass continuity within a water resource system. For instance, a generalized
neural network demand-release may exceed the ability of a particular reservoir to
meet that demand. This inequity between the generalized demand-release and what
can actually occur in the physical system must be reconciled. The most important
reconciliation step needed in water resources is the one that insures mass continuity.
Saad et al. [1994] used such a continuity adjustment step in their neural network
nonlinear disaggregation technique for the operation of a multi-reservoir system.
The generalized neural network reservoir simulation program used in this study,
nn_resv_sim.cpp (Appendix C), can be divided into two halves. The first half imple-
ments the feedforward plane of the artificial neural network to determine a generalized
value of RDt. The second half implements an operating procedure to determine RHt
and SPt, at the same time making sure that the constraints of the mathematical
program are met. This includes ensuring that RDt does not exceed the demand
and that both the mass continuity and the physical plausibility constraints of the
reservoir system are maintained.
The program nn_resv_sim.cpp required numerically reconstructing the feedfor-
ward plane of the NeuroSolutions generalized feedforward topology in the form of
a C++ program. Considerable programming time and effort were spent emulat-
ing the mathematics of the feedforward plane of the neural network (i.e., matrix
multiplication and the occasional use of an analytical transfer function). The sim-
ulation program requires three output files from NeuroSolutions to be read in, all
beginning with the file name of that specific run: (filename.inputFile.nsn), (file-
name.desiredFile.nsn), and (filename.nsw). The first two files are automatically
generated by the program and contain the parameters for normalizing network in-
puts and denormalizing network outputs. The third file, ending with .nsw, contains
the weights of the neural network and must be specifically requested from the pro-
gram.
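
A minimal sketch of that feedforward arithmetic is shown below. It assumes, for
brevity, a single hidden layer without the layer-skipping synapses of the generalized
feedforward topology, and simple [0,1] input scaling with a tanh output mapped
back to release units; NeuroSolutions' actual normalization conventions may differ.

```cpp
// Minimal sketch of the reconstructed feedforward pass: normalize the inputs,
// apply two weight layers with tanh transfer functions, denormalize the output.
#include <cmath>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;      // Mat[i][j]: weight from input j to node i

Vec layer(const Mat &W, const Vec &b, const Vec &x) {
    Vec y(W.size());
    for (size_t i = 0; i < W.size(); ++i) {
        double s = b[i];
        for (size_t j = 0; j < x.size(); ++j) s += W[i][j] * x[j];
        y[i] = std::tanh(s);       // transfer function on hidden and output nodes
    }
    return y;
}

double forwardRD(const Vec &raw, const Vec &lo, const Vec &hi,
                 const Mat &W1, const Vec &b1, const Mat &W2, const Vec &b2,
                 double rdLo, double rdHi) {
    Vec x(raw.size());
    for (size_t j = 0; j < raw.size(); ++j)            // normalize inputs to [0,1]
        x[j] = (raw[j] - lo[j]) / (hi[j] - lo[j]);
    Vec o = layer(W2, b2, layer(W1, b1, x));
    return rdLo + 0.5 * (o[0] + 1.0) * (rdHi - rdLo); // map tanh range to RD units
}

int main() {
    Mat W1{{0.5}}, W2{{1.0}};      // toy 1-1-1 network for demonstration
    Vec b1{0.0}, b2{0.0}, lo{0.0}, hi{1.0}, x{0.3};
    return forwardRD(x, lo, hi, W1, b1, W2, b2, 0.0, 100.0) > 0.0 ? 0 : 1;
}
```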
Figure 6 is a flowchart of the operating procedure used by the generalized
reservoir release simulation program, nn_resv_sim.exe. The generalized reservoir
simulation program begins by using the forward plane of the neural network to
determine a generalized value of RDt. It then goes through an operating pro-
cedure to determine RHt and SPt and to ensure that the operational constraints
of the mathematical program are met. These constraint checks and associated ad-
justments are needed because the neural network can produce nonsensical results
with regard to mass continuity and the other constraints of the NLP. The operating
procedure includes steps to make sure that RDt does not exceed DEMt, that
mass continuity is maintained, and that the physical plausibility constraints of the
reservoir system are met.
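
A compact sketch of these checks is given below; the ordering of the adjustments
and the rule for hydropower releases are assumptions based on the verbal
description of Figure 6, not a transcription of nn_resv_sim.cpp.

```cpp
#include <algorithm>

struct Decision { double RD, RH, SP, S; };

// Post-network reconciliation: clip the generalized demand-release to what is
// physically available, then assign hydropower release and spill so that mass
// continuity and the storage bounds hold.
Decision reconcile(double rdNN, double Sprev, double I, double DEM,
                   double Smin, double Smax, double Rmax) {
    Decision d;
    double avail = std::max(Sprev + I - Smin, 0.0);  // water above dead storage
    d.RD = std::min({rdNN, DEM, Rmax, avail});       // RD <= DEM and feasibility
    double S = Sprev + I - d.RD;                     // storage before RH and SP
    d.RH = 0.0;
    if (S > Smax) {                                  // excess targeted to hydropower
        d.RH = std::min(S - Smax, Rmax - d.RD);      // respects total release cap
        S -= d.RH;
    }
    d.SP = std::max(S - Smax, 0.0);                  // remaining excess spills freely
    d.S  = S - d.SP;                                 // S = Sprev + I - RD - RH - SP
    return d;
}

int main() {
    Decision d = reconcile(12.0, 180.0, 40.0, 10.0, 18.5, 185.0, 100.0);
    return (d.S < 18.5 || d.S > 185.0) ? 1 : 0;      // sanity check on storage bounds
}
```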
Figure 6. Flow diagram of the generalized reservoir release simulation
program, nn_resv_sim.
3.2.3 Development of a Standard Operating Procedure
A standard operating procedure (SOP) was developed to provide a rational
operating policy to compare against both the NLP and the generalized neural
network model. This SOP is used as a reasonable measure of the largest possible
sum of squared deficits. The SOP attempts to release the entire demand rather
than hedge the demand-release in anticipation of future drought. On the other
hand, solution of the NLP is used as the measure of the best possible operating
scenario, far better than would be possible in real-time. This best case scenario
is the result of the ability of the nonlinear programming problem to incorporate
information from the entire time horizon.
The NLP and the generalized neural network models hedge their releases (i.e.,
reduce them below demand despite available water) in anticipation of greater peri-
ods of deficit ahead, to reduce the sum of squared deficits. The SOP in Figure 7 first
requires that if there is a demand, the reservoir is to attempt to release that demand,
and secondly, that once the reservoir is full, any excess water will be targeted for
hydropower releases. Prior to each decision within the SOP algorithm, the volumes
of water available, RDt,avail., St,avail., and RHt,avail., are determined to ensure
that the reservoir's physical plausibility constraints are not violated. No constraint
is placed on spill. The NLP solution may be considered to provide the lowest sum
of squared deficits, while the SOP can be expected to provide a reasonable value of
the highest sum of squared deficits.
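
In terms of the hypothetical reconcile() sketch in Section 3.2.2, the SOP
corresponds roughly to calling that procedure with the network's release replaced
by the full demand, e.g. reconcile(DEM, Sprev, I, DEM, Smin, Smax, Rmax): the
entire demand is attempted each period, and no hedging term withholds water
against future deficits.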
Figure 7. Flow diagram of the standard operating procedure.
3.2.4 Feature Selection and Internal Network Architecture
A systematic simulation approach was taken to determine appropriate neural
network inputs, the results of which are presented in the next chapter. Reservoir
storage, demand, and inflow were all considered as inputs to the neural network to
predict the NLP releases for demand. Predicted current period inflow and demand
as well as previous period storage were considered as a baseline from which to start
the search. The enormous number of combinations of potential inputs forced some
simplifications of the search procedure. These simplifications included removing St
lags early on in the selection process with the exception of St-1, as well as keeping
the number of lags of It and DEMt equal for each run.
To streamline the study, only a cursory examination was made of increasing the
internal dimensionality of the neural network model, sufficient to avoid missing a
standout solution that might improve the performance of the generalized neural
network reservoir release model. When the number of hidden nodes was increased
and, separately, when the number of hidden layers was increased from one to two,
an improved solution was not found. Thus the default internal architecture
recommended by the NeuralWizard of one hidden layer and 23 hidden nodes (fixed
by the number of inputs) was sufficient in capturing the complexity of a generalized
reservoir release model containing 13 inputs and 1 output.
3.2.5 Stopping Criteria
A stopping criterion based on a maximum of 10,000 epochs was selected for the
entire study. The maximum epochs stopping criterion was simple to implement and
provided a degree of uniformity between training runs. A desired error threshold
stopping criterion was rejected as the average cost of the objective-criterion in Neu-
roSolutions is based on differences in the normalized output and normalized desired
output. A normalized value of the average error criterion is difficult to judge and
cannot be readily denormalized.
Cross-validation was also eliminated as a potential stopping criterion because
the procedure requires a validation data set in addition to the training data set,
which could have doubled the number of optimization runs needed in the
study. In addition, the cross-validation
procedure approximately doubles training time as the neural network software needs
to evaluate both a training and a validation data set in order to determine if the
error in the validation data set is increasing. Furthermore, the purpose of the cross-
validation procedure is to prevent overtraining. The potential for overtraining was
discounted after performing an extremely long cross-validation training run (i.e.,
100,000 epochs) discussed in the next paragraph.
A cross-validation training test was performed using 100,000 epochs with a
(15-22-1) neural network configuration, six time lagged pairs of It and DEMt, a
baseline of It, DEMt, and St-1, and 500 years of monthly data. The purpose
of the validation test was to determine if and when the generalized neural network
becomes overtrained. Overtraining occurs when the neural network over-fits the
training set and thus is unable to generalize well on a new, unseen data set. Train-
ing was periodically stopped and the neural network's weights were saved. The
generalized reservoir release model, nn_resv_sim.exe, was then run using the saved
weights and the cross-validation data set. Table 3 summarizes the results as the
percentage reduction in the sum of squared deficits with respect to the previous
stopping epoch. After examining the time tradeoffs with respect to the
percentage improvement in the solution, it was decided that a maximum of 10,000
training epochs would be a reasonable stopping criterion.
Table 3. Cross-validation test using a P6-200Mhz PC.
Training Epoch    Time            Percentage Reduction
1                 ≈ 1 second      N/A
1,000             ≈ 20 minutes    57%
10,000            ≈ 3 hours       3.2%
100,000           ≈ 30 hours      0.75%
3.2.6 Reservoir Simulation Scenarios
Sixteen different reservoir scenarios were constructed using two indices with
the purpose of investigating the applicability of the neural network model over a
range of storage and deficit conditions. The first index, maximum storage to average
annual inflow (MSI), is the ratio of the maximum reservoir storage to its average
annual inflow. The second index, optimal deficit to average annual inflow or optimal
deficit to inflow (ODI), is the ratio of the reservoir deficit of the NLP solution to its
average annual inflow.
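
In symbols (a reconstruction of the two index definitions, with Ī_a denoting the
average annual inflow):

MSI = S_{max} / Ī_a, \qquad ODI = DEF_{NLP} / Ī_a

where DEF_{NLP} is the demand deficit of the NLP solution, taken here on an
annual-average basis (an assumption).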
The use of the MSI index required the construction of four different reservoir
configurations. The MSI index and the physical parameters for each of the four
reservoir configurations are given in Table 4. A value of 220 acre-feet x103 was used
for the total average annual inflow.
Table 4. Configurations of four different reservoirs (acre-feet x103).

         Config-1   Config-2   Config-3   Config-4
MSI      0.23       0.84       1.7        3.4
Smax     50         185        370        740
Smin     5          18.5       37         74
Rmax     100        100        100        100
Rmin     0          0          0          0
A quadratic hydraulic head to storage relationship was developed for each reservoir
configuration. This was accomplished by multiplying Ruth Reservoir's storage data
by a capacity-resizing factor and Ruth's hydraulic stage data by a
height-resizing factor. The parameters b0, b1, and b2 in
each instance were then fit using quadratic regression. For each reservoir instance,
an average efficiency of 0.7 was selected to represent hydropower plant efficiency
(including head losses).
The use of the ODI index required that a range of target values be determined
prior to conducting numerical experiments. Target values of the ODI index were set
in order to provide a relative value upon which to compare the four reservoir con-
figurations. Numerous optimization runs were conducted varying DEMResidential
and DEMCommercial to meet the target values of the ODI index given in Table 5.
Table 5. Target values of the ODI index.
Name ODI Value
Run-a ≈ 0.04
Run-b ≈ 0.09
Run-c ≈ 0.17
Run-d ≈ 0.31
3.2.7 Experimental Design Approach
The purpose of this subsection is to give an overview of this study's experimental
design approach. This subsection shows how the generalized reservoir release model
was parameterized to generate the numerical results found in the results and discus-
sion chapter that follows. Figure 8 illustrates the training and testing procedures
used to examine the performance of the generalized neural network reservoir release
model for an array of parameterized model inputs, coefficients, constraints, and de-
mand values. The adjustment points for the model inputs, coefficients, constraints,
and demand values are depicted by double arrows at the step within which they
can be adjusted.
The purpose of the neural network training procedure in Figure 8 is to develop
the neural network training weights. The weights are to be used by the generalized
neural network reservoir release simulation model (i.e., nn_resv_sim.exe) found in
the neural network testing procedure. The purpose of steps a and b in the testing
procedure is to develop an independent test set for reservoir simulation. Steps c,
d, and e in the test procedure are to optimize the nonlinear programming problem
for the purpose of having a relevant NLP solution to compare against the results of
the generalized reservoir release neural network.
In all four hypothetical reservoir instances, the initial and final storage con-
ditions of the reservoir system are fixed at one-half of Smax. In addition, in an
attempt to remove the influence of these somewhat arbitrarily set boundary condi-
tions, 25 years of optimized monthly data are excluded after Sinitial and prior to
Sfinal for the study's typical 550-year optimization run. This is shown in step e in
both the testing and training procedures in Figure 8.

Figure 8. Experimental design approach, comparing neural network i)
training procedure and ii) testing procedure.
The first purpose of this study is to investigate whether a performance increase
in a generalized reservoir release neural network can be achieved by adding lagged
inputs of inflow and demand to the three input neural network used by Raman
and Chandramouli [1996]. This requires parameterizing on the time lagged inputs
of inflow and demand in step f of the testing procedure and then testing the final
weights using an independent test set in the neural network testing procedure.
The purpose of this study is also to show the applicability of the generalized
release model over a range of reservoir configurations, demand-deficits, and annual
streamflow autocorrelation structures. Each reservoir storage configuration,
represented by a different value of the MSI index, required new values of Smax and
Smin, as well as the coefficients (i.e., b0, b1, and b2) representing the quadratic hydraulic
head in terms of the reservoir storage relationship (i.e., step c in both the training
and testing procedures). While the Smax was parameterized, Rmax was set at 100
[(acre-feet) x103] to prevent a physical constraint on the demand-release. Rmax
was held constant across reservoir configurations to simplify the comparison
between the different reservoirs.
Meeting the target values of the ODI index for each reservoir configuration is
achieved by parameterizing on the residential and commercial demand and then
optimizing the NLP using MINOS (i.e., step b in both the training and testing
procedures). The resulting deficit from the solution to the NLP is divided by the
average annual inflow and compared with the target value of the ODI index. If
the value is within approximately ±1% of the target ODI value, the NLP solution
is kept for analysis. Otherwise, another optimization run is performed using new
adjusted residential and commercial demands in an attempt to get a NLP deficit
closer to the ODI target value.
The annual lag-1 autocorrelation coefficient (ACC) of reservoir inflow is pa-
rameterized in the synthetic streamflow generator in order to characterize the per-
formance of the neural network model over a range of reservoir inflow correlation
structures (i.e., step a in both the training and testing procedures). The annual
lag-1 ACC value was adjusted in the LAST parameter file by simply entering a
new value of the coefficient. While annual lag-1 ACC values were adjusted, the
correlation structure of the monthly synthetic streamflow was kept the same.
4.0 RESULTS AND DISCUSSION
This chapter is divided into five sections: neural network configuration, feature
selection, reservoir simulations, effect of annual lag-1 autocorrelation coefficient,
and application to real-time reservoir operation. The first section describes the
form of the generalized neural network used in the reservoir simulation study. The
second discusses the investigation on the effect of increasing the number of time
lagged inputs of It and DEMt on the sum of squared deficits and total deficit. The
third section presents the bulk of the study's results by examining the operation of
four reservoir configurations, in terms of MSI, for three methodologies: nonlinear
optimization (NLP), generalized neural network, and standard operating procedure.
The reservoir systems are then stressed for progressively larger values of ODI. The
fourth section examines the effect of the lag-1 autocorrelation coefficient on the
neural network model. Finally, the fifth section discusses the applicability of the
procedure to real-time reservoir operation.
4.1 Neural Network Configuration
The inputs to the neural network consisted of inflow, demand, and storage.
Neural network input and output values were normalized within the interval [0,1],
while the initial internal weights were randomized within the interval [0,1]. The
hyperbolic tangent function was used as the transfer function for both hidden and
output nodes. One hidden layer was used, while the neural network software deter-
mined the number of internal nodes.
66
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr
klein_stephen_thesis_ot_fr

More Related Content

Viewers also liked

Patología respiratoria2
Patología respiratoria2Patología respiratoria2
Patología respiratoria2
LUZENIR Amaral
 

Viewers also liked (19)

Hertek hart voor-uw-zaak @zaandelta
Hertek hart voor-uw-zaak @zaandeltaHertek hart voor-uw-zaak @zaandelta
Hertek hart voor-uw-zaak @zaandelta
 
Kisi kisi uts
Kisi kisi utsKisi kisi uts
Kisi kisi uts
 
Patología respiratoria2
Patología respiratoria2Patología respiratoria2
Patología respiratoria2
 
QUALITY CONTROLPROCEDURE
QUALITY CONTROLPROCEDUREQUALITY CONTROLPROCEDURE
QUALITY CONTROLPROCEDURE
 
бг 2016 мп строительство
бг 2016 мп строительствобг 2016 мп строительство
бг 2016 мп строительство
 
Claves de la semana del 18 al 24 de enero
Claves de la semana del 18 al 24 de eneroClaves de la semana del 18 al 24 de enero
Claves de la semana del 18 al 24 de enero
 
ABC Varstva Avtorskih Pravic
ABC Varstva Avtorskih PravicABC Varstva Avtorskih Pravic
ABC Varstva Avtorskih Pravic
 
The Two Laws of Viral-ity by David Guilfoyle
The Two Laws of Viral-ity by David GuilfoyleThe Two Laws of Viral-ity by David Guilfoyle
The Two Laws of Viral-ity by David Guilfoyle
 
Fornetti per taratura termocoppie, pt100, termoresistenze - GIGA TECH SRL
Fornetti per taratura termocoppie, pt100, termoresistenze - GIGA TECH SRLFornetti per taratura termocoppie, pt100, termoresistenze - GIGA TECH SRL
Fornetti per taratura termocoppie, pt100, termoresistenze - GIGA TECH SRL
 
Misura scariche parziali - Sensori MC Monitoring PDC24-1000
Misura scariche parziali - Sensori MC Monitoring PDC24-1000Misura scariche parziali - Sensori MC Monitoring PDC24-1000
Misura scariche parziali - Sensori MC Monitoring PDC24-1000
 
Dan Petruccelli
Dan PetruccelliDan Petruccelli
Dan Petruccelli
 
Acquire learning
Acquire learningAcquire learning
Acquire learning
 
Evans Data DevRel 2016
Evans Data DevRel 2016 Evans Data DevRel 2016
Evans Data DevRel 2016
 
Lasertech Floorplans Ppt 01 20 2010
Lasertech Floorplans  Ppt 01 20 2010Lasertech Floorplans  Ppt 01 20 2010
Lasertech Floorplans Ppt 01 20 2010
 
Bca 1 st year
Bca 1 st yearBca 1 st year
Bca 1 st year
 
FAULT DETECTION ON OVERHEAD TRANSMISSION LINE USING ARTIFICIAL NEURAL NET...
 FAULT DETECTION ON OVERHEAD TRANSMISSION LINE  USING ARTIFICIAL NEURAL NET... FAULT DETECTION ON OVERHEAD TRANSMISSION LINE  USING ARTIFICIAL NEURAL NET...
FAULT DETECTION ON OVERHEAD TRANSMISSION LINE USING ARTIFICIAL NEURAL NET...
 
Miasis o gusaneras
Miasis o gusanerasMiasis o gusaneras
Miasis o gusaneras
 
Funny question
Funny questionFunny question
Funny question
 
Stock market analysis using ga and neural network
Stock market analysis using ga and neural networkStock market analysis using ga and neural network
Stock market analysis using ga and neural network
 

Similar to klein_stephen_thesis_ot_fr

Ab experiments of fluid flow in polymer microchannel
Ab experiments of fluid flow in polymer microchannelAb experiments of fluid flow in polymer microchannel
Ab experiments of fluid flow in polymer microchannel
ShaelMalik
 
Download-manuals-surface water-waterlevel-41howtoanalysedischargedata
 Download-manuals-surface water-waterlevel-41howtoanalysedischargedata Download-manuals-surface water-waterlevel-41howtoanalysedischargedata
Download-manuals-surface water-waterlevel-41howtoanalysedischargedata
hydrologyproject001
 

Similar to klein_stephen_thesis_ot_fr (20)

Neural Network Model Development with Soft Computing Techniques for Membrane ...
Neural Network Model Development with Soft Computing Techniques for Membrane ...Neural Network Model Development with Soft Computing Techniques for Membrane ...
Neural Network Model Development with Soft Computing Techniques for Membrane ...
 
IRJET- Univariate Time Series Prediction of Reservoir Inflow using Artifi...
IRJET-  	  Univariate Time Series Prediction of Reservoir Inflow using Artifi...IRJET-  	  Univariate Time Series Prediction of Reservoir Inflow using Artifi...
IRJET- Univariate Time Series Prediction of Reservoir Inflow using Artifi...
 
Ab experiments of fluid flow in polymer microchannel
Ab experiments of fluid flow in polymer microchannelAb experiments of fluid flow in polymer microchannel
Ab experiments of fluid flow in polymer microchannel
 
Optimized routing and pin constrained
Optimized routing and pin constrainedOptimized routing and pin constrained
Optimized routing and pin constrained
 
Curses, tradeoffs, and scalable management: advancing evolutionary direct pol...
Curses, tradeoffs, and scalable management: advancing evolutionary direct pol...Curses, tradeoffs, and scalable management: advancing evolutionary direct pol...
Curses, tradeoffs, and scalable management: advancing evolutionary direct pol...
 
Biomicrofluidics-9-044103-2015
Biomicrofluidics-9-044103-2015Biomicrofluidics-9-044103-2015
Biomicrofluidics-9-044103-2015
 
Download-manuals-surface water-waterlevel-41howtoanalysedischargedata
 Download-manuals-surface water-waterlevel-41howtoanalysedischargedata Download-manuals-surface water-waterlevel-41howtoanalysedischargedata
Download-manuals-surface water-waterlevel-41howtoanalysedischargedata
 
Kirubanandan_Career Episode 3.pdf
Kirubanandan_Career Episode 3.pdfKirubanandan_Career Episode 3.pdf
Kirubanandan_Career Episode 3.pdf
 
710201911
710201911710201911
710201911
 
Big Data and Renewable Energy
Big Data and Renewable EnergyBig Data and Renewable Energy
Big Data and Renewable Energy
 
Chapter9
Chapter9Chapter9
Chapter9
 
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank SystemNeural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
Neural Network-Based Actuator Fault Diagnosis for a Non-Linear Multi-Tank System
 
710201911
710201911710201911
710201911
 
Performance comparison of SVM and ANN for aerobic granular sludge
Performance comparison of SVM and ANN for aerobic granular sludgePerformance comparison of SVM and ANN for aerobic granular sludge
Performance comparison of SVM and ANN for aerobic granular sludge
 
Geldenhuys model 20418
Geldenhuys model 20418Geldenhuys model 20418
Geldenhuys model 20418
 
Presentation: Wind Speed Prediction using Radial Basis Function Neural Network
Presentation: Wind Speed Prediction using Radial Basis Function Neural NetworkPresentation: Wind Speed Prediction using Radial Basis Function Neural Network
Presentation: Wind Speed Prediction using Radial Basis Function Neural Network
 
Parallel Left Ventricle Simulation Using the FEniCS Framework
Parallel Left Ventricle Simulation Using the FEniCS FrameworkParallel Left Ventricle Simulation Using the FEniCS Framework
Parallel Left Ventricle Simulation Using the FEniCS Framework
 
ADVENTURE_sFlowの最新動向
ADVENTURE_sFlowの最新動向ADVENTURE_sFlowの最新動向
ADVENTURE_sFlowの最新動向
 
Universal approximators for Direct Policy Search in multi-purpose water reser...
Universal approximators for Direct Policy Search in multi-purpose water reser...Universal approximators for Direct Policy Search in multi-purpose water reser...
Universal approximators for Direct Policy Search in multi-purpose water reser...
 
ConorSchlick_Thesis
ConorSchlick_ThesisConorSchlick_Thesis
ConorSchlick_Thesis
 

klein_stephen_thesis_ot_fr

  • 1. Artificial Neural Network and Monte Carlo Optimization for Reservoir Operation by Stephen James Klein A Thesis Presented to The Faculty of Humboldt State University In Partial Fulfillment of the Requirements for the Degree Master of Science In Environmental Systems: Engineering May 10, 1999
  • 2. Artificial Neural Network and Monte Carlo Optimization for Reservoir Operation by Stephen James Klein Approved by the Master's Thesis Committee: Brad A. Finney, Major Professor Date Elizabeth A. Eschenbach, Committee Member Date Robert Willis, Committee Member Date Charles M. Biles, Graduate Coordinator Date Ronald A. Fritzsche Date Dean for Research and Graduate Studies
  • 3. ABSTRACT Determining the optimal operation of a reservoir system is frequently hampered by uncertainty associated with future inflows. Generalized operating policies are a potentially fast and easy to use means of real time operation. They use readily available predictors (i.e., current reservoir storage and short term predicted inflow) calibrated against the optimal response of the system. Artificial neural networks represent the most recent attempt to improve generalized reservoir release models. This study builds on prior research in water resources planning by investigating the incorporation of time lagged inputs of inflow and demand to improve the perfor- mance of a generalized neural network reservoir release model. This study differs from much of the previous work on generalized operating rules in that Monte Carlo optimization is used. Previous research has relied primarily on deterministic data, but here the Monte Carlo element generates reservoir inflow synthetically. The nonlinear objective function is to minimize the sum of square deficits and maximize hydropower production. Dynamic Programming has been the primary optimization tool in generalized operating rule research. In this study however, a nonlinear pro- gramming (NLP) model is developed and applied to a series of hypothetical water resource systems. The reduced gradient algorithm is used for the solution of the NLP. The Monte Carlo optimization methodology allows a virtually unlimited pool of calibration and validation data to be rapidly derived for a variety of reservoir con- iii
  • 4. figurations. The performance of the neural network model relative to the nonlinear programming solution is compared over a range of reservoir storage, demand-deficit, and streamflow correlation structures in order to investigate its applicability. The results show that a significant improvement in the performance of generalized reser- voir release neural network can be achieved using time lagged inputs of inflow and demand. Furthermore, the ability of a generalized neural network reservoir release model to incorporate time series information is substantiated. iv
  • 5. ACKNOWLEDGEMENTS I received special assistance from three members of the Department of Envi- ronmental Resources Engineering at Humboldt State University: Professors Brad Finney, Robert Willis, and Elizabeth Eschenbach. Professor Finney tirelessly encouraged me and guided me through this project, from beginning to end. Professor Willis gave me the training that allowed me to conduct the research. Professor Eschenbach edited and critiqued the final versions of the thesis. Thank you all for your invaluable assistance. Finally, I dedicate this thesis to my wife, Tirian, who gave me constant love and support throughout my graduate studies. v
  • 6. TABLE OF CONTENTS ABSTRACT iii ACKNOWLEDGEMENTS LIST OF TABLES ix LIST OF FIGURES NOTATION xii 1.0 INTRODUCTION 1 2.0 LITERATURE REVIEW 5 2.1 Artificial Neural Network Modeling 5 2.1.1 Strengths and Weaknesses 6 2.1.2 Topology 7 2.1.3 Preprocessing 11 2.1.4 Training 13 2.1.5 Time Series Modeling in Hydrology 21 2.2 Generalized Reservoir Operating Rules 23 2.3 Artificial Neural Networks In Water Resources Planning 26 2.3.1 Single Reservoir Management 26 2.3.2 Multiple Reservoir Management 28 2.3.3 Large Scale Water Projects 30 vi
  • 7. 2.4 Optimization and ANN 33 2.5 Study Contributions 34 3.0 METHODOLOGY 37 3.1 Development of the Reservoir Release Model 37 3.1.1 Reservoir Release Mathematical Program 38 3.1.2 Overview of Monte Carlo Optimization Procedure 41 3.1.3 Generating Synthetic Reservoir Inflow 43 3.1.4 Determining Incremental Flow and Reservoir Demand 44 3.1.5 Optimization 46 3.2 Development of ANN Generalized Reservoir Release Model 50 3.2.1 Artificial Neural Network Training 50 3.2.2 Neural Network Reservoir Release Simulation Model 52 3.2.3 Development of a Standard Operating Procedure 55 3.2.4 Feature Selection and Internal Network Architecture 57 3.2.5.Stopping Criteria 58 3.2.6 Reservoir Simulation Scenarios 60 3.2.7 Experimental Design Approach 62 4.0 RESULTS AND DISCUSSION 66 4.1 Neural Network Configuration 66 4.2 Feature Selection 67 4.3 Reservoir Simulations 70 vii
  • 8. 4.3.1 Simulation Results: Sum of Squared Deficits 70 4.3.2 Simulation Results: Total Deficit, Hydropower, and Spill 79 4.3.3 Frequency Characterization 83 4.4 Effect of Annual Lag-1 Autocorrelation Coefficient 90 4.5 Application to Real Time Reservoir Operation 92 5.0 CONCLUSIONS 95 6.0 SUGGESTIONS FOR FURTHER RESEARCH 98 REFERENCES 100 APPENDIX A - MPS Matrix Generator - (mg.f) 105 APPENDIX B - MINOS Subroutine FUNOBJ - (opt.f) 110 APPENDIX C - Neural Network Reservoir Release Program - (nn_resv_sim.cpp) 118 viii
  • 9. LIST OF TABLES 1. Annual statistics for synthetic reservoir inflows compared with the calibration inflow series (acre-feet x103) 44 2. Coefficients and constants for Equation (22) 46 3. Cross-validation test using a P6-200Mhz PC 59 4. Configurations of four different reservoirs (acre-feet x103) 61 5. Target values of the ODI index 61 6. Sum of squared deficits as a percentage of the NLP solution for the results of the generalized reservoir release neural network model 76 7. Sum of squared deficits as a percentage of the NLP solution for the results of the standard operating procedure 78 8. Total deficit as a percentage of the NLP solution for the results of the generalized reservoir release neural network 79 9. Total deficit as a percentage of the NLP solution for the results of the standard operating procedure 80 10. Hydropower production as a percentage of the NLP solution for the results of the generalized reservoir release neural network 82 11. Hydropower production as a percentage of the NLP solution for the results of the standard operating procedure 83 12. Summary of the computer run time requirements for calibrating the generalized reservoir release neural network 93 ix
  • 10. LIST OF FIGURES
1. Illustration of a typical feedforward neural network 9
2. Flow diagrams: (i) and (ii), of the two primary neural network learning modes 14
3. Functional plane analogy to the optimization process used by neural networks 17
4. Overview of the Monte Carlo optimization procedure 42
5. Trade-off curve representing the nondominated solutions of the objective function 49
6. Flow diagram of the generalized reservoir release simulation program, nn_resv_sim 54
7. Flow diagram of the standard operating procedure 56
8. Experimental design approach, comparing the neural network i) training procedure and ii) testing procedure 63
9. Sum of squared demand deficits with respect to paired time-lagged neural network inputs of It and DEMt 68
10. Total demand deficit with respect to paired time-lagged neural network inputs of It and DEMt 68
11. MSI=0.23, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI 72
12. MSI=0.84, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI 73
13. MSI=1.7, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI 74
14. MSI=3.4, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI 75
  • 11. 15. MSI=1.7, comparing the sum of squared deficits of the reservoir release methodologies over the range of ODI for two test scenarios 77
16. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.04 84
17. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.04 84
18. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.09 85
19. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.09 85
20. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.17 86
21. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.17 86
22. Comparison of reservoir release methodologies using a storage frequency curve for MSI=1.7 and ODI≈0.31 87
23. Comparison of reservoir release methodologies using a release for demand frequency curve for MSI=1.7 and ODI≈0.31 87
24. The performance of five- and twelve-lag generalized neural network reservoir release models for increasing values of the lag-1 autocorrelation coefficient 91
  • 12. NOTATION
W = Weight matrix
wij = Individual neural network weight (dimensionless)
d = Generic desired response of the system
o = Generic output of the system
= Generic distance generator
= Indicates an adjustable weight matrix
η = Step size
δ = Mathematical representation of local error
α = Momentum factor
N = Number of monthly time periods in operational time horizon
t = Time increment (month)
n = Generic integer increment
= Seasonal index (month of year)
= Hydraulic head (feet)
x = Generic input variable
y = Generic output variable
I = Flow (acre-feet x 10^3/month)
f = Hydroelectric power production
GWh = Gigawatt-hours
S = Generic reservoir storage term
St = Reservoir storage (acre-feet x 10^3)
Smax = Maximum reservoir storage (acre-feet x 10^3)
Smin = Minimum reservoir storage (acre-feet x 10^3)
RDt = Reservoir release for demand (acre-feet x 10^3)
RHt = Reservoir release for hydropower (acre-feet x 10^3)
  • 13. DEMt = Demand (acre-feet x 10^3)
SPt = Reservoir spill (acre-feet x 10^3)
Ft = Volumetric monthly incremental flow (acre-feet x 10^3)
Rmax = Maximum combined RDt, RHt release (acre-feet x 10^3)
Rmin = Minimum RDt + RHt release (acre-feet x 10^3)
= Generic constant
b0, b1, b2 = Quadratic regression parameters
= b0
= b1/2
= b2/4
= Hydroelectric power plant efficiency (dimensionless)
θ = Deficit penalty weighting (dimensionless)
π = Hydropower benefit weighting (dimensionless)
= Monthly volumetric flow requirement for fish (acre-feet x 10^3)
= Seasonal weighting factor on demand (dimensionless)
  • 14. 1.0 INTRODUCTION
The purpose of research into generalized reservoir release models in water resources planning has been to create useful real-time operational models. The accuracy, relative to the optimal, of the release decisions made by generalized models depends on the ability of the models to generalize from the conflicting and competing information contained in optimal release decisions and the associated stochastic reservoir inflow and demand information. Furthermore, generalized reservoir release models are an attempt to take advantage of the computational efficiency of input/output models. Young [1967] first demonstrated the applicability of generalized models by using linear regression to derive generalized reservoir operating policies for single reservoir systems from the results of dynamic programming. More recently, several studies have demonstrated the ability of neural networks to perform as generalized reservoir release models. Neural networks are input/output, or black box, models in that inputs are simply mapped to outputs. Neural networks have their origins in the field of artificial intelligence, which over the last several decades has developed a series of mathematical models attempting to mimic the processing capabilities of the human brain. One of these capabilities is the ability of the brain to use parallel processing to rapidly interpret large amounts of information through pattern recognition. Much of the value of neural networks to scientists and engineers can be found in the fact that they are extremely computationally efficient. Once trained (i.e., calibrated), neural networks
  • 15. can produce a resultant in a matter of seconds. Recent studies include generalized neural network models trained with the results of dynamic and linear programming on several operational scales: single reservoir [Sakakima et al., 1992; Raman and Chandramouli, 1996], multiple reservoirs [Saad et al., 1994], and large-scale water projects [Liu and Yao, 1995]. It has also been shown in the hydrology literature that neural networks perform comparably to statistical time series models and in some instances outperform them. Raman and Sunilkumar [1995] used a neural network to synthesize reservoir inflows and compared the neural network with an AR(2) model. Hsu et al. [1995] developed an artificial neural network model to predict the rainfall-runoff relationship for a medium-size basin and compared the neural network with an ARMAX (autoregressive moving average with exogenous inputs) time series model. Vaziri [1997] compared an artificial neural network with an ARIMA model to predict the Caspian Sea water surface level. The purpose of this study is to investigate whether a performance increase in a generalized reservoir release neural network can be achieved by adding lagged inputs of inflow and demand to the three-input neural network used by Raman and Chandramouli [1996]. The purpose is also to show the applicability of the release model over a range of reservoir storage, demand-deficit, and streamflow correlation structures. Neural network training (i.e., calibration) data are developed from Monte Carlo optimization of a nonlinear programming (NLP) problem.
  • 16. Previous generalized neural network studies have relied on deterministic data, with the exception of Saad et al. [1994], who used synthetic data, and on dynamic programming, with the exception of Liu and Yao [1995], who used linear programming. The present study uses nonlinear Monte Carlo optimization. The Monte Carlo element allows for the rapid generation of reservoir inflow and demand information. This information, once optimized, is used as training and testing data by the neural network. The use of nonlinear programming (NLP) allows for simple parameterization of the reservoir's configuration (i.e., size). The nonlinear objective is to minimize the sum of square deficits of release for residential and industrial demand while maximizing hydropower production. The objective is optimized using the MINOS software package, a general-purpose optimizer utilizing a reduced gradient algorithm. The noninferior solutions of this multiobjective programming problem depend on the relative penalty and benefit weights given to the square demand deficit and hydropower production, respectively. The noninferior solution investigated in this study is the point that maximizes the production of hydroelectric power with little or no tradeoff in terms of the sum of squared deficits. Lastly, in order to study the behavior of the generalized neural network model, two indices were developed. The first, the maximum storage to average annual inflow index (MSI), provides a means of describing different size classes of reservoirs. The second, the optimal deficit to average annual inflow index (ODI), provides a means of uniformly investigating a relative deficit over a range of MSI reservoir configurations.
  • 17. These indices, along with a systematic variation of the lag-1 autocorrelation parameter in the streamflow generator, allow for the examination of the performance of the generalized neural network release model over a range of storage configurations, demand-deficits, and streamflow correlation structures.
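As a minimal sketch (the function and variable names are illustrative, not taken from the thesis code), both indices are simple ratios of a quantity of interest to the average annual inflow:

    // Illustrative computation of the two indices defined above; names are
    // hypothetical. Units follow the thesis (acre-feet x 10^3).
    double msi(double maxStorage, double avgAnnualInflow) {
        // maximum storage to average annual inflow
        return maxStorage / avgAnnualInflow;
    }
    double odi(double optimalDeficit, double avgAnnualInflow) {
        // optimal deficit (from the NLP solution) to average annual inflow
        return optimalDeficit / avgAnnualInflow;
    }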
  • 18. 2.0 LITERATURE REVIEW
The literature review is divided into five principal sections: Artificial Neural Network Modeling, Generalized Reservoir Operating Rules, Artificial Neural Networks in Water Resources Planning, Optimization and Artificial Neural Networks, and Study Contributions. The first section is an introduction to artificial neural network computing, with specific discussions of strengths and weaknesses, topology, preprocessing, training, and time series modeling in hydrology. The second section is a review of the literature on generalized reservoir operating policies. The third section is a survey of neural networks in water resources planning, with specific consideration of single reservoir management, multiple reservoir management, and large scale water project management. The fourth section examines neural networks as an optimization tool in water resources outside the context of generalized operating rules. The final section discusses this study's contributions to water resources planning and management.
2.1 Artificial Neural Network Modeling
Neural networks are an attempt to perform pattern recognition by mapping inputs to outputs based on mathematical models that attempt to mimic the power and flexibility of the human brain. The literature frequently uses the terms neural network (NN) and artificial neural network (ANN) interchangeably.
  • 19. When consulting the literature, it is helpful to recognize that artificial neural networks may also be referred to as neurocomputing, network computation, connectionism, parallel distributed processing, layered adaptive systems, self-organizing networks, or neuromorphic systems or networks. Collectively, the variety of names synonymous with neural networks exemplifies the diverse and varied nature of neural computing [Zurada, 1992]. Neural networks must be taught or trained (i.e., parameter estimation). Learning corresponds to changes in weights as the network is exposed to new patterns, associations, and functional relationships. Zurada [1992] observes that neurocomputing lies somewhere between engineering and artificial intelligence. While the mathematical techniques used are engineering-like (i.e., reduced gradient optimization), an experimental, ad hoc approach must be taken because the field lacks a theory for selecting an appropriate network architecture for a particular application. The subsections that follow draw primarily upon Bishop [1995] and Zurada [1992], recommended by Salle [1999] in a review of the best books on neural networks, as well as the NeuroSolutions online users manual [Lefebvre et al., 1997].
2.1.1 Strengths and Weaknesses
Artificial neural networks are of interest to scientists and engineers for their ability to generalize and to function as black box models. Generalization takes place when a network generates reasonable outputs when presented with an unseen data set. Neural networks function as black box models in that prior knowledge of the system under investigation is not required; thus, they are useful in problems where the underlying cause and effect mechanisms are not well understood.
  • 20. Neural network models are considered robust in that small input errors typically have minimal impact on output. They are easy to implement as a result of recent advances in software engineering and personal computing that have made off-the-shelf neurocomputing software readily available. Finally, the neural network topology itself is extremely flexible in that it can readily incorporate any number of inputs and outputs [Thirumalaiah and Deo, 1998; Medsker et al., 1998; Zhang and Stanley, 1997]. Medsker et al. [1998] elaborated on the disadvantages of artificial neural networks by making the following points. Neural networks in general cannot guarantee an optimal solution or be repeatable in terms of reproducing the same internal network weights. Neural networks can give significant importance to input variables that conflict with traditional theory. Finally, training times can be excessive, a problem compounded by the fact that determining network architecture frequently requires a trial-and-error approach.
2.1.2 Topology
Neural networks are composed of a large number of interconnecting processing elements (PEs), also referred to as neurons, nodes, or axons. The form of the interconnections between PEs provides a means of neural network classification. In a fully recurrent neural network, every PE is connected to every other PE, including itself. For a neural network to be recurrent, there must be at least one feedback connection allowing information from one input presentation to affect future outputs.
  • 21. The feedforward neural network is the specialized case where the network's connections are restricted to the forward direction. Training (i.e., parameter estimation) is straightforward and far more reliable in feedforward networks than in recurrent networks, as output depends only on current input. The overwhelming majority of neural network studies in water resources use feedforward rather than recurrent networks. Feedforward networks have PEs arranged in layers, with one or more hidden layers sandwiched between the input and output layers. A generic feedforward neural network is sometimes referred to as a multilayer perceptron (MLP). The function of a PE within a feedforward neural network depends on its position within the network. Figure 1 illustrates three types of PEs whose functionality is typically defined as follows: an input PE serves as a simple identity map or buffer between input and output activity; a hidden PE performs as a summing junction at its input, then a nonlinear transformation, and lastly a splitting node at its output; and an output PE performs the same as a hidden PE except that there is no splitting node at its output.
  • 22. Figure 1. Illustration of a typical feedforward neural network, modified with permission (Medsker et al., 1996)
  • 23. A synapse is a linear mapping between layers of PEs. The synapse is composed of a weight matrix, with each matrix element representing a connection strength. The summation function at the beginning of a PE finds the weighted average of all the inputs and may be represented by

x_i = \sum_j w_{ij} \, x_j, \quad (1)

where x_i is the resultant feeding the i-th processing element, x_j is the output of the previous processing element, and w_{ij} is a connection weight linking PE_j to PE_i. The nonlinear transformation following the summation function is also referred to as a transfer or activation function. Two frequently used transfer functions in neural network studies are the sigmoid function

f(u_i) = \frac{1}{1 + e^{-u_i}}, \quad (2)

and the hyperbolic tangent function

f(u_i) = \tanh(u_i), \quad (3)

with

u_i = \beta x_i + w_i, \quad (4)

where u_i is an adaptable linear transformation within the activation function. The value of w_i is fit during the training process; \beta is not adaptive, but it may be parameterized. Equations (1) through (4) follow from Lefebvre et al. [1997].
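A minimal sketch of Equations (1) through (4) for a single processing element, assuming the reconstructed forms above (the code and names are illustrative, not the thesis implementation):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Weighted sum feeding PE i (Equation 1).
    double summation(const std::vector<double>& w, const std::vector<double>& x) {
        double sum = 0.0;
        for (std::size_t j = 0; j < w.size(); ++j) sum += w[j] * x[j];
        return sum;
    }

    // Adaptable linear transformation inside the activation (Equation 4):
    // w_i is fit during training; beta is fixed but parameterizable.
    double linearTransform(double x_i, double w_i, double beta) {
        return beta * x_i + w_i;
    }

    // Sigmoid transfer function (Equation 2), output in (0, 1).
    double sigmoid(double u) { return 1.0 / (1.0 + std::exp(-u)); }

    // Hyperbolic tangent transfer function (Equation 3), output in (-1, 1).
    double tanhTransfer(double u) { return std::tanh(u); }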
  • 24. 2.1.3 Preprocessing
Feature selection, feature extraction, and input normalization are preprocessing methods for reducing input dimensionality to improve neural network performance. Feature selection refers to picking the best subset of inputs from the domain of potential inputs. Feature extraction involves extracting data from a set of inputs and generating a new, smaller, transformed set of inputs that preserves the most relevant information of the original inputs. Data normalization of the neural network inputs and outputs improves training performance by conditioning the network's inputs and outputs to the initial random values of the network's weights. Feature selection involves reducing the dimensionality of a neural network by selecting a subset of features or input variables and discarding the remainder. Bishop [1995] suggested that any methodology for feature selection involves the establishment of a selection criterion to compare one input subset with another. He further recommended a systematic means of searching through candidate subsets of input variables. Prior knowledge is one approach to performing feature selection, using relevant information in addition to that provided by the training data [Bishop, 1995]. Prior knowledge can be incorporated into the network topology in pre- and post-processing, as well as during training. Physically based analytical and numerical models are an abundant source of prior knowledge. Bishop's example involves applications of neural networks where data varies as a function of time. Prior knowledge in this case involves incorporating time-lagged information into the network topology. This is frequently done by simultaneously presenting successive values of the time-dependent variable, given by

\mathbf{x}(t) = \left[ x(t), x(t-1), \ldots, x(t-L) \right], \quad (5)

where L is the number of time lags.
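For example (a sketch; the names are not from the thesis code), the lagged input vector of Equation (5) can be assembled from a stored series:

    #include <cstddef>
    #include <vector>

    // Build the network input [x(t), x(t-1), ..., x(t-L)] of Equation (5).
    // Returns an empty vector while fewer than L+1 values are available.
    std::vector<double> laggedInputs(const std::vector<double>& series,
                                     std::size_t t, std::size_t L) {
        std::vector<double> input;
        if (t < L) return input;
        for (std::size_t lag = 0; lag <= L; ++lag)
            input.push_back(series[t - lag]);
        return input;
    }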
  • 25. Sensitivity analysis of network inputs relative to network output is a second means of feature selection. One way of performing a sensitivity analysis is to freeze the network weights and feed two sets of dithered input data through the network. The differences between the two sets are then converted to percentages relative to all other input variables. These percentages represent the output change caused by the positive and negative dithers [Lefebvre et al., 1997]. Such sensitivity information is useful in determining the relative importance of each input variable and in the subsequent reduction in dimensionality by eliminating insignificant inputs. The sensitivity values allow one to see whether the corresponding sensitivities follow from an understanding of the problem or, when understanding is lacking, to determine significant underlying relationships. Feature extraction involves extracting data from a set of inputs and generating a new, smaller, transformed set of inputs that preserves the most relevant information of the original inputs. The expectation is that there will be a reduction in the dimensionality of the input space with minimal loss of discriminating information. Principal component analysis is a classical statistical technique for linear feature extraction. The technique involves truncating lesser components to a specific order.
  • 26. Neural networks themselves can be made to perform principal component analysis and can be divided into two categories: those that originate from the Hebbian learning rule and those that use least squares learning rules [Diamantaras and Kung, 1996]. Hebbian learning requires that the learning signal be equal to the neuron's output [Hebb, 1949]. Linear rescaling, a common form of pre- and post-processing, is frequently needed for proper network training, as the magnitudes of input/output variables do not necessarily reflect their relative importance. Normalization can transform input/output variables to the order of unity, while neural network weights can be randomly initialized within the range [-1, 1]. Without input/output normalization, it would be necessary to find solutions for weights that differ markedly from one another, which creates conditions poorly conducive to training.
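A common linear rescaling of this kind (one of several possible choices; the code and names are illustrative) maps each variable onto [-1, 1]:

    #include <algorithm>
    #include <vector>

    // Rescale a variable to [-1, 1] so its magnitude is commensurate with
    // the randomly initialized weights. Assumes the variable is not constant.
    void rescale(std::vector<double>& v) {
        const double lo = *std::min_element(v.begin(), v.end());
        const double hi = *std::max_element(v.begin(), v.end());
        for (double& x : v) x = 2.0 * (x - lo) / (hi - lo) - 1.0;
    }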
  • 27. 2.1.4 Training
A second means of classifying neural networks is to differentiate between learning modes. The two primary modes are supervised and unsupervised learning, illustrated in Figure 2. The figure shows that in supervised learning, when an input is applied, the desired response d of the system must be explicitly provided by an external teacher so that the distance generator can calculate the distance measure, or error, between the actual output o and the desired response [Zurada, 1992]. The distance measure is error information that can be used by an updating algorithm to improve on the network's adjustable weight matrix W.
Figure 2. Flow diagrams: (i) and (ii), of the two primary neural network learning modes. From Introduction to Artificial Neural Systems, 1st edition, by Zurada [1992]. Redrawn with permission of Global Rights Group, a division of International Thomson Publishing. Fax 800 730-2215.
With unsupervised learning there is no teacher to provide a desired response, and thus there is no explicit error information to send to an updating algorithm
  • 28. in order to improve on the network's weights. Unsupervised learning must instead rely on pre-specified internal rules of interaction for updating weights. These algorithms search for redundant patterns, regularities, and separating properties. This process is also known as self-organization [Zurada, 1992]. Statistical principal component analysis and neural network principal component analysis are two examples of unsupervised learning [Bishop, 1995]. Learning can take place on either the epoch or the exemplar level, corresponding to batch or on-line training. An epoch is one pass through the entire input series, while an exemplar is a single example from the input series. Error updating of the weights takes place after each exemplar in the case of on-line training. Batch training performs weight updates at the end of an epoch using an average of the accumulated error information [Lefebvre et al., 1997]. Prior to implementing supervised learning, two things must be decided: an optimality criterion to be minimized and a gradient search method to compute optimal network weights. The criterion typically used in neural network studies is the summation over a training interval of the quadratic cost function, the mean squared error criterion (MSE),

\mathrm{MSE} = \frac{1}{N} \sum_{n=1}^{N} \left[ d(n) - y(n) \right]^2, \quad (6)

with d(n) the desired response, y(n) the network's output, and N the number of input observations. MSE is the most frequently applied neural network optimality criterion, although other optimality criteria are possible (e.g., the absolute value).
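Equation (6) translates directly into code (an illustrative sketch, not the thesis implementation):

    #include <cstddef>
    #include <vector>

    // Mean squared error over one epoch (Equation 6); d and y must have
    // the same, nonzero length.
    double mse(const std::vector<double>& d, const std::vector<double>& y) {
        double sum = 0.0;
        for (std::size_t n = 0; n < d.size(); ++n) {
            const double e = d[n] - y[n];
            sum += e * e;
        }
        return sum / static_cast<double>(d.size());
    }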
  • 29. The learning elements of a feedforward neural network can be thought of as a juxtaposition of three independent planes (Figure 3): the forward activation plane, the backpropagation plane, and the gradient search plane [Lefebvre et al., 1997]. The neural network designer specifies the topology (Section 2.1.2) of the forward plane, which dictates the form of the adjoining backpropagation plane. These two planes are attached by the error criterion (i.e., MSE) that sends into the backpropagation plane the composite error e^2 determined by the difference between the desired signal and the network output. The gradient search plane adjusts the activation plane weights by using the error information contained in the backpropagation plane. The backpropagation algorithm is an innovative means of propagating error back through the network, from its output toward its input, prior to the evaluation of derivatives. The fundamental concept behind the backpropagation algorithm is the evaluation of the relative error contributed by each particular weight, also referred to as the credit assignment problem [Bishop, 1995]. Rumelhart, Hinton, and Williams popularized the backpropagation technique [Rumelhart et al., 1986], although the mathematical framework necessary for training was published by Werbos [1974].
  • 30. Figure 3. Functional plane analogy to the optimization process used by neural networks, reproduced by permission of NeuroDimension Inc. (Lefebvre et al., 1997).
There are several methods for searching the performance surface: step, momentum, quickprop, and delta-bar-delta. The momentum equation (on-line learning), represented by

w_{ij}(n+1) = w_{ij}(n) + \eta \, \delta_i \, x_j + \alpha \left[ w_{ij}(n) - w_{ij}(n-1) \right], \quad (7)

is an extension of step learning with the addition of a momentum term, where \alpha is the momentum factor. Equation (7) shows that the updated value of the weight is equal to its current value, plus a step error-adjustment, plus momentum from the value of the previous weight. The constant \eta is referred to as the step size, while the local error is represented by \delta_i. The value of \delta is calculated using error function derivatives.
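In code, one on-line update of a single weight under Equation (7) looks like the following sketch (names illustrative; the local error delta_i is assumed to have been computed by backpropagation):

    // One on-line weight update with momentum (Equation 7). w and wPrev are
    // the current and previous values of w_ij; delta_i is the local error at
    // PE i; x_j is the output of PE j; eta is the step size; alpha is the
    // momentum factor.
    double updateWeight(double w, double wPrev, double delta_i, double x_j,
                        double eta, double alpha) {
        return w + eta * delta_i * x_j + alpha * (w - wPrev);
    }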
  • 31. The evaluation of the gradients is made possible by the fact that the error function of a neural network contains continuously differentiable functions of the weights. A complete presentation of the backpropagation algorithm and associated derivative evaluations can be found in Zurada [1992] and Bishop [1995]. The typical error surface of a neural network presents a challenging optimization problem, being nonconvex, often of high dimensionality, and marked by numerous local minima and regions insensitive to variations in network weights. Reduced gradient search algorithms may encounter difficulties similar to other optimization approaches in that they are sensitive to initial starting conditions, are easily trapped by local optima, and are often ineffective when searching high-dimension weight spaces [Hsu et al., 1995]. One approach to handling problems associated with local minima is to develop a more powerful search algorithm. A hybrid approach was taken by Hsu et al. [1995], using linear least squares and multistart simplex optimization to find a global or near-global solution. However, some authors argue that local minima have not been a significant problem in training neural networks. Zurada [1992] stated that problems with local minima were not significant in many of the training cases studied. He attributed this to the stochastic initial conditions of the backpropagation algorithm: the internal weights are random, and thus the initial square error surfaces are random. He added that the algorithm has been found to be equivalent to a form of stochastic approximation [Zurada refers to White, 1989]. Zurada stressed the importance of incorporating randomness within the learning process to enhance the
  • 32. stochastic nature of network training. One form of that randomness can be found in the initial values of a neural network's weights, typically randomized within the interval [-1, 1]. The form of the neuron's activation function, the number of hidden layers, and the number of nodes per hidden layer play an important role in capturing the complexity of the problem. A trial-and-error approach is the most common procedure for determining the number of hidden layers and nodes per hidden layer. If the internal dimensionality is too small, the network may not have sufficient degrees of freedom to learn the process. If the internal dimensionality is too large, the network may not converge during training or may overfit the data [Karunanithi et al., 1994]. Pruning can be used to reduce the internal dimensionality of the network once an appropriate internal network architecture has been determined. The procedure involves removing connections between various nodes, thus reducing the effective size of the weight matrix. One way around a trial-and-error approach for developing a neural network's architecture is to use the cascade-correlation algorithm. Karunanithi et al. [1992] applied a cascade-correlation neural network to estimate flows at an ungaged site on the Huron River in Michigan and compared the neural network with an analytical power model in terms of accuracy and convenience. The cascade-correlation neural network differs from the feedforward neural network in that it can automatically construct a model that adequately captures the complexity of the problem.
  • 33. The idea behind Fahlman and Lebiere's [1990] algorithm is to allow the neural network to incrementally construct its own topology based on training error. The algorithm decreases training times and allows for more consistency in solving problems. There are several strategies for stopping the training of a network: stopping after a predetermined number of epochs, stopping when the error falls below an acceptable threshold value, or using a cross-validation procedure. The goal of any stop criterion should be to maximize the network's generalization and avoid overtraining. The latter occurs when the network memorizes the desired data and thus does a poor job of generalizing. Stop criteria based on iteration number or on a level of acceptable error require personal judgement to determine what level of error is appropriate or by which iteration an acceptable solution is reached. Cross validation provides a systematic means of stopping the training process by evaluating the performance of an independent validation set while simultaneously training. Once the error in the independent validation set begins to increase, training is stopped. A third independent data set for measuring final network performance is called a test set.
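The cross-validation stopping rule can be sketched as follows (the two callbacks stand in for training and evaluation code that is not shown here):

    #include <functional>
    #include <limits>

    // Train epoch by epoch; stop when the validation-set error starts to
    // rise, the point at which generalization begins to degrade.
    int trainWithEarlyStopping(int maxEpochs,
                               const std::function<void()>& trainEpoch,
                               const std::function<double()>& validationError) {
        double best = std::numeric_limits<double>::max();
        int epoch = 0;
        for (; epoch < maxEpochs; ++epoch) {
            trainEpoch();
            const double err = validationError();
            if (err > best) break;  // validation error increasing: stop
            best = err;
        }
        return epoch;  // number of epochs actually trained
    }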
  • 34. Neural network learning is a time-consuming process, but once trained, neural networks are quite fast. The iterative learning process is slow in part because the backpropagation algorithm needs to update the network's weights based upon gradient-descent error information. Training time requirements are compounded by the fact that different input configurations must be evaluated in the feature selection process. Trained neural networks are quite fast, as the computations involved are primarily matrix multiplication. Training times in this thesis were on the order of hours, while neural network test runs were on the order of seconds.
2.1.5 Time Series Modeling in Hydrology
The ability of neural networks to model time-varying hydrologic phenomena has been well documented. Raman and Sunilkumar [1995] used a neural network to synthesize reservoir inflows and compared the results with a lag-2 autoregressive model. Ichiyanagi et al. [1996] used neural networks to predict reservoir inflows. Other areas of neural network hydrologic investigation include predicting streamflow [Crespo and Mora, 1993; Karunanithi et al., 1994; Thirumalaiah and Deo, 1998a], flood flow [Thirumalaiah and Deo, 1998b], inland sea water level [Vaziri, 1997], rainfall [French et al., 1992], rainfall-runoff [Hsu et al., 1995; Smith and Eli, 1995; Minns and Hall, 1996; Carriere et al., 1996; Dawson and Wilby, 1998; Jayawardena and Fernando, 1998], and water quality transport [DeSilets et al., 1992; Maier and Dandy, 1996; Gong et al., 1996; Whitehead et al., 1997; Zhang and Stanley, 1997]. Three of the papers compared an artificial neural network with a time series model. Vaziri [1997] compared an artificial neural network with an autoregressive integrated moving average (ARIMA) model to predict the current water level of the Caspian Sea. Both models produced reasonable results when compared with recorded levels. The neural network model, on average, underestimated the sea level
  • 35. by three centimeters, while the ARIMA model overestimated it by three centimeters. Hsu et al. [1995] developed an artificial neural network model to predict the rainfall-runoff relationship for the medium-size Leaf River basin near Collins, Mississippi, and compared it with an ARMAX (autoregressive moving average with exogenous inputs) time series model. The artificial neural network model outperformed the ARMAX model, giving the authors reason to suggest that a neural network approach may be a superior alternative to the ARMAX time series approach. Raman and Sunilkumar [1995] compared an artificial neural network for reservoir inflows with a lag-2 autoregressive model, AR(2), for the Mangalam and Pothund reservoirs located in the Bharathapuzha basin, Kerala state, India. Twenty-three years of historic monthly data were divided into three categories corresponding to training, cross-validation, and testing. Cross validation served as the stopping criterion. Twelve independent monthly neural networks were used, since they outperformed a single continuous input network. The final neural network consisted of four inputs and two outputs, corresponding to the two previous reservoir inflows and the predicted inflow for each reservoir. The mean, standard deviation, and skewness of the predicted inflows were compared for both reservoirs with regard to the artificial neural network and the AR model. The authors concluded that neural networks are a viable alternative for time series modeling in water resources. A neural network time series approach was taken by Ichiyanagi et al. [1996] to predict hourly inflows into the Hatanagi-Daiichi Dam on the Oi River in Japan.
  • 36. Two separate artificial neural networks were trained based on two categories of cumulative rainfall events: light events, under 200 millimeters, and heavy events, over 200 millimeters. Inputs included the current and five previous hourly values of rainfall, the five previous river flow rates, the base flow rate, the predicted total rainfall amount, and the predicted event duration. The light-rainfall based neural system did a reasonable job of predicting inflow during light and heavy rainfall. However, the light-rainfall based system did a poor job of predicting the peak heavy-rainfall inflow. While the heavy-rainfall neural network did a good job of predicting peak inflow during heavy rainfall, it did poorly in predicting general inflow. From these observations, the authors decided to use the light-rainfall neural network in all instances when the accumulated rainfall was below 200 millimeters. Once 200 millimeters is exceeded, the model switches to the heavy-rainfall neural network.
2.2 Generalized Reservoir Operating Rules
Regression, probability distribution functions, and neural networks have all been shown to be capable of performing as generalized reservoir operating rules. While the first two approaches are reviewed in this section, the third, the neural network approach, is covered in the next section. Generalized reservoir operating rules are data-driven approaches for incorporating the results of an optimization into real-time reservoir operation. The goal of such rules is to incorporate, as best possible, the optimal solution, given the stochastic nature of the system.
  • 37. Young [1967] used least squares regression as a means of identifying an annual operating rule for a single reservoir, calibrated from the results of a deterministic DP. Current storage volume and projected Monte Carlo inflows served as the predictors in a regression equation for the optimal reservoir release. Young's work established regression as a means of creating a general operating rule to simulate the year-by-year operation of a reservoir system. Bhaskar and Whitlatch [1980] took an approach similar to Young's, except that they performed the simulation step to verify the performance of the regression equation rather than relying on the value of R2. Furthermore, their reservoir release policy was based on a monthly rather than an annual time step. They found that the best-fit model, indicated by a maximum R2, did not necessarily correspond to the best general operating rule under simulation. They concluded that a linear release policy gave better results for a two-sided quadratic loss function, while for a one-sided quadratic loss function, a nonlinear regression policy gave superior results. Karamouz and Houck [1982] used an iterative optimization-regression-simulation approach to improve on Young's [1967] procedure by refining the general operating rule to increase its correlation with the optimal deterministic DP operation. Their algorithm accomplished this in a finite number of steps by constraining the DP to within a specified percentage of the release defined by the previous general regression operating rule. However, convergence to the best general operating rule is not guaranteed.
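A rule of this family has the flavor of the following sketch (the linear form and coefficient names are illustrative; each study fit its own functional form against optimal releases):

    #include <algorithm>

    // Illustrative regression-based release rule of the Young [1967] type:
    // release as a linear function of current storage and projected inflow,
    // clipped to the feasible release range. The coefficients b0, b1, b2
    // would be fit by least squares against the optimal DP releases.
    double releaseRule(double storage, double inflow,
                       double b0, double b1, double b2,
                       double rMin, double rMax) {
        const double r = b0 + b1 * storage + b2 * inflow;
        return std::clamp(r, rMin, rMax);
    }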
  • 38. Five years later, Karamouz and Houck [1987] compared their iterative DP-regression-simulation (DPR) model with a stochastic dynamic programming (SDP) model. The simulation phase of their study showed that DPR was more effective in operating medium to large reservoirs, while SDP was more effective in operating small reservoirs. Willis et al. [1984] developed a general release policy using a probability distribution function (PDF) of optimal releases conditioned on observable hydrologic conditions. The PDF policy gave nearly identical responses when compared with the optimal release. Monte Carlo simulation was used to generate 101 realizations of 768 months (64 years) of reservoir inflow and river makeup data for the Mad River Basin. Linear programming (LP) was used to determine the optimal release pattern for each of the 101 sequences of 768 monthly inflows. A statistical analysis of the optimal releases indicated that there were several good choices for conditioning the PDF: hydrologic season, current and previous period reservoir storage (St and St-1), and current and previous period inflow (It and It-1). Only current period inflow and previous period final storage were selected, because of computer memory limitations [Finney, 1998]. The final PDF formulation involved sorting through 77,568 optimal releases and placing them into 216 class intervals based on It and St-1.
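The conditioning step can be sketched as follows (an illustration reduced to conditional means for brevity; the 6x6 grid is hypothetical, not the 216-interval formulation itself):

    #include <array>

    // Condition optimal releases on (I_t, S_{t-1}) class intervals, in the
    // spirit of the PDF policy above: a 2-D grid of release statistics
    // indexed by discretized inflow and prior storage.
    struct ReleaseStats { double mean = 0.0; int count = 0; };
    constexpr int kClasses = 6;
    using PdfTable = std::array<std::array<ReleaseStats, kClasses>, kClasses>;

    // Map a value in [lo, hi] to one of kClasses equal-width intervals.
    int classOf(double x, double lo, double hi) {
        int k = static_cast<int>(kClasses * (x - lo) / (hi - lo));
        return k < 0 ? 0 : (k >= kClasses ? kClasses - 1 : k);
    }

    // Fold one optimal release into the running mean of its class.
    void addSample(PdfTable& table, double inflow, double priorStorage,
                   double release, double iLo, double iHi,
                   double sLo, double sHi) {
        ReleaseStats& cell =
            table[classOf(inflow, iLo, iHi)][classOf(priorStorage, sLo, sHi)];
        cell.count += 1;
        cell.mean += (release - cell.mean) / cell.count;
    }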
  • 39. 2.3 Artificial Neural Networks In Water Resources Planning
Neural network models have been developed for water resource systems ranging from single reservoirs [Sakakima et al., 1992; Raman and Chandramouli, 1996] to multiple reservoirs [Saad et al., 1994; Taha and Ghosh, 1995; Sandu and Finch, 1996] and large-scale water projects [Liu and Yao, 1995]. Four of the six models in these studies are generalized reservoir release models [Sakakima et al., 1992; Raman and Chandramouli, 1996; Liu and Yao, 1995; Taha and Ghosh, 1995].
2.3.1 Single Reservoir Management
Two generalized reservoir release models for stand-alone reservoirs are presented in this section. Sakakima et al. [1992] studied flood control operation (i.e., typhoon response) using two different neural network models. Raman and Chandramouli [1996] investigated reservoir operation in terms of minimizing the squared deficit of the release for agricultural demand. While both papers use DP to generate the optimal training response, the work by Raman and Chandramouli is far more comparable to Young's [1967] concept of a generalized reservoir release model than is the work by Sakakima et al. Sakakima et al. [1992] developed a generalized neural network reservoir release model for real-time reservoir operation during extreme inflow events (i.e., typhoons). Teacher's signals, or the desired training responses, were taken as the optimum release discharge calculated by dynamic programming (DP).
  • 40. The reservoir's primary purpose was flood control, with the initial storage volume set to zero. Model inputs included typhoon course, current inflow volume, rate of inflow increase, and current storage volume. Two approaches were taken in simulating real-time operation of Syorenji Dam within the Yodo basin in Japan: the knowledge-based approach for parameter extraction and the prediction approach for the hydrograph. The first method was essentially a parameter pre-processing approach in which the current hydrograph was compared with a database of historic typhoon hydrographs and associated neural network parameters to find the closest fit. The historic model's parameters were adjusted by a similarity factor. This methodology worked well in normal flood instances but did poorly in events not well represented in the parameter database (i.e., large floods). The second method attempted to address the problem of the large unseen flood by multiplying a historic normal-flood hydrograph by the gained similarities to produce a predicted hydrograph. The prediction approach for the hydrograph outperformed the knowledge-based one for a large unseen flood. Raman and Chandramouli [1996] also demonstrated that neural networks can be used to create a generalized reservoir model trained from the results of dynamic programming. The neural network model here was compared with a generalized multiple linear regression model calibrated with the same optimization data, a stochastic dynamic programming model, and a standard operating policy. The objective of the optimization was to minimize the square deficit of the release to meet irrigation demand using twenty-three years of historic fortnightly data.
  • 41. Raman and Chandramouli found that the two general operating rules derived from DP outperformed stochastic DP and a standard operating policy. Of the generalized models, the neural network outperformed multiple linear regression for three years of unseen historic data, with a 2.2 percent reduction in the sum of squared deficits. Neural network inputs were limited to a single set consisting of initial storage, predicted inflow, and demand. The final neural network model consisted of three input nodes, four hidden sigmoid nodes, and one output node, for a (3-4-1) configuration.
2.3.2 Multiple Reservoir Management
A time series neural network disaggregation technique was developed by Saad et al. [1994] to determine the optimal storage levels in Hydro-Quebec's five-reservoir La Grande River installation (four in series, one in parallel) based on an optimal aggregated storage volume. The neural network results were compared with a principal component disaggregation technique and found to be similar. Five hundred years of monthly streamflow data were synthetically generated using a first-order Markovian model. Deterministic DP was then used to minimize the expected cost of energy production, producing optimal storage levels in each of the five reservoirs. The potential energy of each reservoir at its optimal storage level was then aggregated into a single value. This aggregate potential energy value and the values for the previous five months were presented to a feedforward neural network as input, with the optimal storage level as the desired output.
  • 42. Saad et al. [1994] determined through experimentation that a two-hidden-layer neural network produced a lower root-mean-square error than a single layer, thus giving the neural network a final (6-5-5-5) configuration. Separate neural networks were then trained for each month of the year using 475 of the 500 optimal potential energy contents; 25 years were retained as the test data set. An intermediate stage, termed by Saad et al. limit adjustment, prevented the output of the neural network from exceeding the physical limits of each reservoir. The limit adjustment distributed excess water among the other reservoirs relative to capacity. A graphical presentation of the test results showed the feasibility of the neural network disaggregation, as well as superior performance of neural network disaggregation over principal component disaggregation for two of the system's reservoirs during June, the month with the largest root-mean-square training error. Taha and Ghosh [1995] proposed a Hybrid Intelligent Architecture (HIA) that incorporated expert systems and connectionist architectures (i.e., neural networks) and successfully applied the HIA to the control of water reservoirs on the Colorado River near Austin. A novel feature of the HIA is its ability to change the characterization of continuous-valued inputs during neural network training. Implementation consisted of first generating the neural network's architecture using a Nodal Links Algorithm (NLA). The algorithm used a set of principles to map continuously valued inputs, in this case a set of floodgate operating rules for the Lower Colorado River Authority (LCRA). Gaussian distribution functions were then used to
  • 43. discretize continuously measured values into neural network input intervals. Finally, the neural network was trained using an Augmented version of the Backpropagation Algorithm (ABA). The algorithm is equivalent to standard backpropagation with one important exception: it can adjust the shape of the Gaussian distribution functions and thus control the shape of the neural network input intervals. Testing the HIA produced a 94.2 percent match to the desired decisions based on LCRA operating rules, compared with a 76.3 percent match for a standard backpropagation neural network.
2.3.3 Large Scale Water Projects
Liu and Yao [1995] argued that the large-scale operation problem is more appropriately suited to a non-structured or semi-structured decision system than to a structured decision system. They also pointed out that non-structured systems have not worked very well in the past and that, in order for them to perform more satisfactorily, such systems need a means of adequately integrating available information into a cohesive decision-making system. To this end, they proposed an intelligent operation decision support system (IODSS) in which a series of generalized neural networks served as the primary decision-makers. Sandu and Finch [1996] conducted a feasibility study to examine the accuracy of neural networks in modeling river delta salinity. The authors expressed interest in neural networks because of their excellent time performance (once trained) as a simulator to investigate such phenomena as carriage water.
  • 45. A generalized operating model for a large-scale water project, trained from the results of multiobjective linear optimization, was used by Liu and Yao [1995] to simulate the operation of China's South to North Water Transfer Project (SNWTP). The project consists of a 1200 km long canal that transfers water from the lower reach of the Yangtze River to north China. The canal links four river basins (the Yangtze, Huaihe, Yellow, and Haihe) and eight natural lakes. The operation consultant expert sub-systems (OCES) served as the primary decision support system of what the authors referred to as an intelligent operation decision support system (IODSS). The OCES system consisted of a sub-OCES, or neural network, for each natural lake. Each neural network had 24 model inputs, including water inflow, initial volume, and water demand for each lake and for the corresponding water supply areas for the coming period. Sub-OCES output was the terminal lake volume. Neural network simulation results were compared with linear programming for Hong-ze Lake for the month of October over the 30 years of historic training data. Test results of the simulated operation of the SNWTP using unseen test data showed that the IODSS/SNWTP system was capable of operating China's SNWTP. Sandu and Finch [1996] conducted a feasibility study to determine the effectiveness of using a feedforward neural network to quantify the salinity levels at critical locations in the Sacramento-San Joaquin Delta under various flow conditions. Their motivation was to find a faster and more accurate means of predicting salinity levels than the current Minimum Delta Outflow (MDO) routine of the California
  • 46. Department of Water Resources Planning and Simulation Model (DWRSIM). The MDO routine within DWRSIM simulates flow and salinity conditions using empirical and heuristic methods. DWRSIM is an extremely large-scale, monthly time step reservoir simulation model of California's Central Valley. Two major Delta locations were investigated. The inputs to the neural network were the flow conditions and gate positions at various locations within the Delta. Sandu and Finch [1996] concluded that neural networks warrant further study as a potential replacement for the current MDO routine and as a means of investigating carriage water, about which the current MDO routine can provide little insight. Carriage water is the marginal export cost, or the extra water needed to carry a unit of water across the Delta to the pumping plants for export while maintaining constant Delta salinity. The California Department of Water Resources Delta Simulation Model (DWRDSM, California 1994), an unsteady one-dimensional finite difference hydrodynamic and salt transport model, has been used to investigate carriage water. Unfortunately, this model is not a serious candidate for MDO routine replacement because its run times are significantly greater than those of the current MDO routine (on the order of ten thousand times). Sandu and Finch proposed using DWRDSM to train an artificial neural network.
  • 47. 2.4 Optimization and ANN
Neural networks have been used within three separate types of optimization methodologies in water resources. The first type is a neural network trained from the results of a programming problem to produce a generalized optimization model, as was discussed in the preceding section concerning operating rules in water resources. Neural networks have also been trained from the results of numerically based physical models in order to serve as simulators in linked-simulation-optimization (LSO) routines. The third methodology relates to the problem of noninferior solutions in multiobjective optimization and the need for decision-makers to make choices with regard to the objectives. Here the neural network inputs are the objective values, while the outputs are the weights (decision-maker's preferences) given to each objective. The LSO approach, with a neural network performing as the simulation link, has been used in several groundwater studies to reduce run times relative to numerically based physical models. Morshed and Kaluarachchi [1998] used LSO to solve the inverse problem of estimating the groundwater parameters that control light hydrocarbon free-product volume predictions and flow. The artificial neural network (ANN) simulated groundwater flow, while a genetic algorithm (GA) performed the optimization. ANN-GA LSO has also been applied directly to groundwater remediation [Rogers, 1992; Rogers and Dowla, 1994; Rogers et al., 1995; Johnson and Rogers, 1995].
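The division of labor in an LSO routine can be sketched as follows (the callback stands in for the trained network; the candidate-generation step, e.g. a genetic algorithm, is not shown):

    #include <functional>
    #include <limits>
    #include <vector>

    // Evaluate a set of candidate parameter vectors with the trained ANN
    // standing in for the slow physical model, and keep the best candidate.
    std::vector<double> lsoSelect(
        const std::vector<std::vector<double>>& candidates,
        const std::function<double(const std::vector<double>&)>& annObjective) {
        std::vector<double> best;
        double bestValue = std::numeric_limits<double>::max();
        for (const auto& c : candidates) {
            const double v = annObjective(c);  // fast ANN evaluation
            if (v < bestValue) { bestValue = v; best = c; }
        }
        return best;
    }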
  • 48. Wen and Lee's [1994] approach was somewhat different in that they used a neural network as an optimal multiobjective decision-making tool for water quality management in the Tou-Chen River Basin, Taiwan. The inputs to the neural network were the objective function values, while the outputs were the weights (decision-maker's preferences) given to each objective. They used the following three objectives to formulate their optimization problem: minimize the BOD5 concentration in the first reach, minimize the cost of wastewater treatment in the entire basin, and maximize the river's assimilative capacity for BOD5. Compromise programming was used to minimize the deviation between the ideal and the real objective solutions within the feasible region. The deviation was weighted to indicate the relative importance of the decision-maker's preferences. The neural network was trained using a database generated from optimizations performed with random sets of objective function values to produce weight values. Thus, given a set of objective values, the neural network could produce a set of weights to be used by compromise programming to find the noninferior solutions to the multiobjective programming problem.
2.5 Study Contributions
One major purpose of this study is to look at water resources planning for a single stand-alone reservoir from a wider operational perspective than previous studies involving neural networks. This approach includes varying the size of the reservoir relative to its inflow, subjecting the reservoir to increasing levels of demand
  • 49. stress, and varying the correlation structure of reservoir inflows. Parameterization of the model's inputs allows for greater exploration of the applicability of the technique and prevents the present study from being relegated to one particular test case, as had been the situation with previous studies of neural networks in water resources reservoir planning. Parameterization of the physical plausibility constraints and the storage-to-head conversion coefficients in an NLP problem simply requires altering individual constraint and coefficient values. The ease with which these changes can be made makes NLP a useful means of generating new neural network training responses (i.e., the optimal reservoir releases) for multiple reservoir configurations. Previous studies have relied on dynamic programming to determine the inputs of generalized neural network reservoir release models, with the exception of Liu and Yao [1995], who used linear programming. The use of dynamic programming in the present study would require the rediscretization of the state variable and the decision variables for each new reservoir configuration. Furthermore, interpolation error would be difficult to quantify for the different discretizations. The use of deterministic data has been a limiting factor in previous investigations involving neural networks in water resources planning, with the exception of Saad et al. [1994], who used a first-order Markovian model. However, Saad et al. made no investigation into altering the correlation structure of reservoir inflows. The present study's use of a synthetic streamflow generator allows for an investigation
  • 50. 36 tion into the effect of altering the correlation structure of reservoir inflows on the generalized release model. Raman and Chandramouli [1996] limited themselves to a three input gener- alized reservoir release model in part because their objective was to compare the generalized neural network model with a linear regression model. Linear regres- sion models may not be a fair comparison to neural networks models as they are not structured to incorporate the level of input dimensionality that a neural net- works can. Therefore, one of the challenges with neural networks in water resources planning is to quantify neural network performance relative to some standard of performance. As discussed earlier, neural networks are designed to parallel process informa- tion and thus have the ability to incorporate a large number of inputs, relevant or not. This characteristic in itself is an attractive feature because water resource systems are large and complex. However, the opportunity for so many inputs can at times pose a problem in determining the appropriate or best inputs to a partic- ular model. Maier and Dandy [1996] concluded that in their paper on predicting salinity in a river 14 days in advance that the most difficult part of the problem was in determining the appropriate model inputs. One of the major challenges of the present study as well is to determine a subset of appropriate inputs out of the set of all possible inputs.
3.0 METHODOLOGY

There are two principal sections of the methodology chapter. The first section begins with the development of the reservoir release mathematical program, followed by sub-sections on the Monte Carlo optimization procedure used to determine synthetic reservoir inflows, reservoir demands, and optimal nonlinear programming (NLP) reservoir release decisions. The second section covers the development of the study's neural network reservoir release model using the reservoir inflows, reservoir demands, and the NLP reservoir release decisions developed in the previous section. The last part of this section covers the development of the study's reservoir simulation scenarios and a discussion of the study's experimental design approach.

3.1 Development of the Reservoir Release Model

The purpose of this study is to explore the performance of a generalized reservoir release model with time-lagged inputs, as well as to explore the applicability of a generalized neural network release model over a range of reservoir storage configurations, demand-deficit conditions, and streamflow correlation structures. Monte Carlo optimization was applied in order to provide training data to the generalized reservoir release neural network. This technique required developing a nonlinear programming (NLP) problem to represent the two objectives of the reservoir system: reduction of the sum of squared demand deficits and hydropower generation.
A calibration set of hydrological data had to be found in order to calibrate the synthetic streamflow generator needed to supply reservoir inflow data to the nonlinear programming problem. The Mad River Basin in California served as the study's source of hydrologic data. The hydrologic characteristics of this river basin provided the deterministic data needed to calibrate the synthetic streamflow generator. In the investigation stage of the study, a series of hypothetical single-reservoir instances or case studies were then superimposed onto the location of the existing reservoir for the purposes of performing numerical experiments.

3.1.1 Reservoir Release Mathematical Program

The management problem posed is to determine releases that minimize demand deficits while maximizing power production:
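The equations of the mathematical program were lost in extraction from the original page; the following is a plausible reconstruction inferred from the symbol definitions and constraint descriptions that follow. The constraint numbering (10) through (13) is taken from the text; the objective's own equation number is not recoverable.

\min_{RD_t,\,RH_t} \; \sum_{t=1}^{N} \left[ \theta\,DEF_t^{\,2} \;-\; \pi\, f(RD_t, RH_t, S_t, S_{t-1}) \right], \qquad DEF_t = DEM_{t,Reservoir} - RD_t

subject to, for t = 1, ..., N:

S_t = S_{t-1} + I_t - RD_t - RH_t - SP_t \quad (10)
R_{min} \le RD_t + RH_t \le R_{max} \quad (11)
RD_t \le DEM_{t,Reservoir} \quad (12)
S_{min} \le S_t \le S_{max} \quad (13)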
Volumetric inflow to the reservoir is represented by I. Controlled releases are specifically targeted to meet demand (RDt) and generate hydroelectric power (RHt). Uncontrolled releases are defined by spill (SPt). Reservoir demand (DEMt,Reservoir) is equal to the demand at the pump station (DEMt,Pump Station) less the incremental volumetric flow (Ft) between them. Hydroelectric power production (f) in Equation (8) is a function of the per-month release for demand, the per-month release for hydropower, as well as the current and previous period storage (St, St-1). Thus, an additional benefit of releases intended for demand is the generation of hydroelectric power. N is the number of monthly time periods in the operational time horizon. The weighting term θ allows for a penalty on the deficit, while π is the benefit term associated with hydropower production. Equations (10) through (13) represent a linear or convex constraint set for the nonlinear programming problem. Equation (10) maintains mass continuity within the reservoir system; (11) insures that the total controlled release falls between the maximum and minimum; (12) prevents releases for demand from exceeding that demand; and (13) constrains the model to storage between the maximum at spillway height and the minimum below which the reservoir is incapable of making releases. In reality the reservoir does exceed Smax when the reservoir height is above the spillway; however, this is typically a temporary condition as the spill rapidly lowers the water level. No constraint is placed on SPt. Evaporative losses as well as storage gains and losses attributable to groundwater seepage into and leakage out of
the reservoir are assumed to be negligible and therefore are neglected. The hydropower production function, Equation (14), can be derived as follows. The total gigawatt-hours in period t [Loucks et al., 1981] equals a unit-conversion constant times the product of flow, head, and efficiency. The total flow qt in period t is in thousands of acre-feet, the hydraulic head ht is in feet, and the power plant efficiency e is dimensionless. Modeling the hydroelectric system requires substituting for qt and expressing hydraulic head in terms of reservoir storage. The nonlinear hydraulic head in terms of reservoir storage relationship can be expressed using a quadratic regression equation, where b0, b1, and b2 are the regression parameters. The hydropower production function then follows by averaging beginning and ending storage levels for a particular time period and making the appropriate substitutions into Equation (14).
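The intermediate equations on this page were also lost in extraction; reconstructing them from the definitions above and the standard hydropower relationship in Loucks et al. [1981] gives the following. The numerical value of the unit-conversion constant k is an assumption based on standard conversions, not taken from the text.

GWh_t = k\, e\, q_t\, h_t \quad (14)

h = b_0 + b_1 S + b_2 S^2

f(RD_t, RH_t, S_t, S_{t-1}) = k\, e\, (RD_t + RH_t) \left[ b_0 + b_1 \bar{S}_t + b_2 \bar{S}_t^{\,2} \right], \qquad \bar{S}_t = \tfrac{1}{2}(S_t + S_{t-1})

Here q_t has been replaced by the total controlled release RD_t + RH_t, h is evaluated at the average of beginning and ending storage, and k ~ 1.024 x 10^-3 GWh per (10^3 acre-feet)(feet) at unit efficiency.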
Lastly, substituting the hydropower production function back into the objective yields Equation (15), where DEFt is the demand deficit at the reservoir. The objective is nonlinear in that the deficit function is quadratic, as is the hydraulic head in terms of reservoir storage function within the hydropower production equation. A further nonlinearity in the hydropower production equation is the result of the nonlinear product of the state (St, St-1) and the decision variables (RDt, RHt).

3.1.2 Overview of Monte Carlo Optimization Procedure

The use of Monte Carlo optimization to generate synthetic streamflow, via a random number generator, provided an efficient means of providing optimized reservoir release data to train and test a neural network. The ability to easily adjust such streamflow parameters as the annual autocorrelation coefficient in the synthetic streamflow generator proved to be useful in studying the behavior of a generalized neural network reservoir release model. The Monte Carlo optimization procedure followed in this study is a four-step process, as illustrated in Figure 4. The first step involves the use of a statistical model to generate synthetic reservoir inflows. Step two calculates the incremental flow between the reservoir and the pump station. Step three calculates the reservoir demand based on the downstream demand less the incremental flow between the
reservoir and the pump station. The fourth and final step is the actual optimization of the reservoir release mathematical program, a linearly constrained nonlinear optimization problem.

Figure 4. Overview of the Monte Carlo optimization procedure.
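For concreteness, the four steps can be expressed as a single driver loop. The sketch below is illustrative only: the inflow model, the incremental-flow factor, and the pump-station demand are placeholders standing in for LAST, the piecewise linear regression, and Equation (22), and the NLP step is left as a comment rather than an actual call into MINOS.

    #include <algorithm>
    #include <random>
    #include <vector>

    // Illustrative sketch of the four-step Monte Carlo optimization
    // procedure. All numbers and formulas are placeholders, not the
    // study's calibrated models.
    int main() {
        const int nMonths = 550 * 12;                       // study's typical run length
        std::mt19937 rng(42);
        std::lognormal_distribution<double> flow(2.0, 0.8); // placeholder inflow model

        std::vector<double> I(nMonths), F(nMonths), DEM(nMonths);
        for (int t = 0; t < nMonths; ++t) {
            I[t] = flow(rng);                    // step 1: synthetic inflow (LAST's role)
            F[t] = 0.1 * I[t];                   // step 2: incremental flow (placeholder
                                                 //         for the piecewise regression)
            double demPump = 5.0;                // placeholder pump-station demand
            DEM[t] = std::max(0.0, demPump - F[t]); // step 3: reservoir demand, floored at 0
        }
        // step 4: pass I and DEM to the linearly constrained NLP
        // (MINOS, via the MPS matrix generator and the FUNOBJ subroutine).
        return 0;
    }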
3.1.3 Generating Synthetic Reservoir Inflow

The LAST software package (Lane Applied Stochastic Techniques), a series of precompiled FORTRAN programs designed and developed by the Bureau of Reclamation, was used as the study's synthetic streamflow generator. Lane's [1990] temporal and spatial disaggregation technique is used by LAST to generate a seasonal series, while annual information is generated using a standard linear autoregressive model (i.e., AR(1), AR(2)). While disaggregation modeling is capable of reproducing statistics at more than one spatial or temporal level (i.e., key-station to substation or annual to seasonal), this study's requirements were limited to temporal disaggregation in order to derive monthly reservoir inflows at a single location. An unregulated streamflow record of 46 years was developed to calibrate LAST. Sixteen years of unregulated streamflow at the USGS gage site on the Mad River above Ruth Reservoir near Forest Glen, California, were available at the beginning of this study. To extend the period of record, the sixteen years of unregulated streamflow were regressed against a surrogate watershed with a longer but overlapping period of record. The USGS gage site on the Van Duzen River near Bridgeville, California, proved to be sufficient. Overlapping historic data were separated into wet months and dry months (i.e., July through October) and linearly regressed. The regression of daily flows produced an R2 value of 0.65 in the dry months and a value of 0.90 in the wet months.
The new data set of 46 years of monthly flow was then verified for normality. A natural log transformation was applied to dry months, and a one-third power transformation was applied to wet months; both transformations followed from Korte [1983]. The AR(1) model was implemented because of the small size of the annual autocorrelation coefficient (-0.05). The synthetic annual and monthly flow series from LAST were then statistically verified against the calibration data set. Table 1 compares 5,000 years of annual synthetic data with the 46 years of calibration data.

Table 1. Annual statistics for synthetic reservoir inflows compared with the calibration inflow series (acre-feet x 103).

Statistic    Calibration Series    Validation Series
Years        46                    5000
Mean         217.4                 218.7
SD           89.7                  89.3
Skew         0.11                  0.23
Lag-1 CC     -0.05                 -0.03

3.1.4 Determining Incremental Flow and Reservoir Demand

Three piecewise linear equations were used to calculate the incremental flow between the reservoir and the pump station, a value needed to calculate the demand at the reservoir. Incremental flow was calculated by subtracting the estimated flow below the reservoir from a gage site just below the pump station, approximately 70 miles downstream. Because the USGS gage site was just below the pump station,
monthly values of historic pump station withdrawals had to be added back into the gage record. A regression analysis was performed by graphing estimated monthly inflow into the reservoir against the calculated value of the monthly incremental flow for WY-1982 through WY-1996. A two-part linear approach captured both high and low flows while keeping the transformations linear. The demand at the reservoir is equal to the demand at the pump station less the volumetric monthly incremental flow Ft between them, where T is a seasonal index for each month. The constraint equation insures that the demand at Ruth never goes negative. The monthly demand at the pump station is calculated using Equation (22), where the first term is the minimum monthly volumetric flow requirement for fish in acre-feet x103, the second term is the portion of annual residential demand with a seasonal weighting factor, and the final term represents the annual industrial demand apportioned equally to each month. The seasonal fish-flow values (acre-feet x103) and seasonal coefficients used for each month are found in Table 2.
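Equation (22) and the reservoir-demand relation did not survive extraction; a plausible reconstruction from the verbal description is given below. The symbols Fish_T and alpha_T are stand-ins for the lost notation, and the additive form of Equation (22) is an assumption.

DEM_{t,Pump\ Station} = Fish_T + \alpha_T\, DEM_{Residential} + \tfrac{1}{12}\, DEM_{Industrial} \quad (22)

DEM_{t,Reservoir} = \max\left(0,\; DEM_{t,Pump\ Station} - F_t\right)

where the max(0, .) operator expresses the constraint that the demand at Ruth never goes negative.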
Table 2. Coefficients and constants for Equation (22).

T    Month    Fish flow (acre-feet x103)    Seasonal coefficient
1    Jan      4.6                           0.06
2    Feb      4.2                           0.06
3    Mar      4.6                           0.06
4    Apr      4.4                           0.07
5    May      4.6                           0.09
6    Jun      4.5                           0.10
7    Jul      3.1                           0.12
8    Aug      2.5                           0.12
9    Sep      1.8                           0.10
10   Oct      2.5                           0.10
11   Nov      4.5                           0.09
12   Dec      4.6                           0.06

3.1.5 Optimization

The final phase of the Monte Carlo optimization procedure involved the optimization of the reservoir release mathematical program, which is a linearly constrained nonlinear optimization problem. MINOS, or Modular In-core Nonlinear Optimization System, a FORTRAN-based general-purpose linear/nonlinear optimizer, was used to optimize the release problem. The program evaluates the objective function using a generalized reduced gradient (GRG) algorithm [Murtagh and Saunders, 1995]. MINOS requires a main input file and a user-coded subroutine to solve a linearly constrained nonlinear optimization problem. The input file is comprised of two sub-files: SPECS and MPS. The SPECS file consists of several lines of implementation-specific instructions to MINOS, such as the size of the problem and the need
to verify the objective gradients numerically. The MPS file contains the explicit representation of the linear portion of the reservoir release mathematical program. MPS is a standardized format used by most commercially based linear programming codes. A matrix generator automated the construction of the MPS file for 550 years of monthly data. The FORTRAN code for the matrix generator, mg.f, can be found in Appendix A. FUNOBJ is a user-coded subroutine where the nonlinear objective function is explicitly expressed. Ideally, the first derivatives of the objective function with respect to the decision variables are known and can be coded by the user into the subroutine. Having explicit derivative information avoids the numerical estimation of the reduced gradient using forward and central differences by MINOS. Numerical derivatives result in a loss of accuracy accompanied by a significant increase in CPU time. In this study, it was possible to analytically compute the first derivatives of the objective function with respect to the decision variables (St, RDt, RHt, and SPt). The evaluated first derivative expressions were then programmed into the subroutine along with the nonlinear objective function. The FORTRAN code for the subroutine FUNOBJ, or opt.f, can be found in Appendix B. The analytical expressions contained in FUNOBJ were verified in two one-hundred-year runs: one using the analytical derivatives embedded in FUNOBJ and the second using MINOS's forward and central differencing routines. Both runs produced the same objective value to 4 significant figures; however, the former
required 6114 calls to FUNOBJ, while the latter required approximately 24 million calls to the subroutine. The analytical run took just a few minutes, while the numerical run took approximately 12 hours. Furthermore, the norm of the reduced gradient was 7.3E-04 for the analytical solution, while the norm was 1.2E-01 for the numerical solution. The multiobjective optimization problem in this study is characterized by a vector of two objective functions that are conflicting and competing, one for hydropower generation and the other for reducing the squared deficit, as shown in Equation (15). The solutions to this problem are termed the nondominated or noninferior solutions. These two terms imply that as one moves to improve the value of one objective, the value of the other decreases. The relative magnitude of the two objective values is controlled by θ and π in the objective function. The trade-off curve given in Figure 5 is a visual illustration of the nondominated solutions of the two objectives. The twenty points represented on the curve were generated using 100-year optimizations of a single set of monthly data for a fixed value of θ and variations in π. The optimization procedure is Monte Carlo in that the inputs to the optimization (i.e., inflow) are synthetically derived from a calibrated synthetic streamflow model using a random number generator. The NLP management solution was selected from the nondominated solutions at the breakpoint in order to reduce the generalized release problem from a dual generalized release problem involving RDt and RHt to a single generalized release problem involving RDt.
Figure 5. Trade-off curve representing the nondominated solutions of the objective function.

This simplification made it easier to study the generalizing behavior of a reservoir release neural network without worrying about an additional variable. This simplification was possible as it was rather simple to construct a release policy for hydropower releases that mimicked the NLP solution at the breakpoint. The objective function value at the breakpoint maximized hydropower production with negligible tradeoff in terms of increasing the sum of squared deficits, as illustrated in Figure 5 at a power production level of around 3,750 GWh.
3.2 Development of ANN Generalized Reservoir Release Model

This section is divided into two subsections. The first discusses the process of training a neural network with a commercially available neural network software package, using the results of the Monte Carlo optimization to determine a generalized release policy for RDt. The second covers the development of the reservoir simulation model in the form of a C++ computer program, nn_resv_sim.cpp (Appendix C). The C++ program replicates the trained feedforward plane of the neural network, using training weights imported from the neural network software, to get a generalized release for demand. The program then implements an operating procedure to determine RHt and SPt, while insuring that the constraints of the mathematical program are met (i.e., primarily that mass continuity is maintained within the reservoir system).

3.2.1 Artificial Neural Network Training

NeuroSolutions, a commercially based neural network software package [Lefebvre et al., 1997] with a graphical user interface, was used to train the reservoir release neural network using the results of the Monte Carlo optimization procedure. The use of a commercial software package eliminated the need for further software development and provided the mechanics needed to implement neural network training. The training elements of a feedforward neural network, as discussed in the literature review (Section 2.1.4) and shown in Figure 3, can be thought of
as a juxtaposition of three independent planes: the forward activation plane, the backpropagation plane, and the gradient search plane [Lefebvre et al., 1997]. The topology or forward activation plane used in this study is what Lefebvre et al. [1997] referred to as a generalized feedforward network. This topology is a variation on the standard feedforward neural network or multilayer perceptron (MLP) in that connections (i.e., synapses) can jump over one or more layers. The advantage of the generalized feedforward network is that it can solve problems much more quickly than the standard feedforward network. In some instances, a standard feedforward network requires hundreds of times more training epochs than an equivalent generalized feedforward network [Lefebvre et al., 1997]. NeuroSolutions allows several choices of optimality criterion to be minimized (i.e., absolute value, least squares, norm, and cost functions to a user-defined power) and also several gradient search methods for computing the optimal network weights (i.e., step, momentum, quickprop, and delta-bar-delta). However, for the purposes of this study, the MSE optimality criterion and the momentum search algorithm were found to be sufficient and were used in all training runs, as discussed in Section 2.1.4. Specific elements of the generalized feedforward topology were easy to implement using the software's automated neural network setup procedure called the NeuralWizard. The NeuralWizard allows one to easily specify the number of inputs, outputs, and hidden layers, as well as the types of transfer functions and the choice of stopping criteria.
3.2.2 Neural Network Reservoir Release Simulation Model

A limitation of neural networks in water resources is that they are input/output models producing generalized decisions. Thus neural networks have no means of insuring mass continuity within a water resource system. For instance, a generalized neural network demand-release may exceed the ability of a particular reservoir to meet that demand. This inequity between the generalized demand-release and what can actually occur in the physical system must be reconciled. The most important reconciliation step needed in water resources is the one that insures mass continuity. Saad et al. [1994] used such a continuity adjustment step in their neural network nonlinear disaggregation technique for the operation of a multi-reservoir system. The generalized neural network reservoir simulation program used in this study, nn_resv_sim.cpp (Appendix C), can be divided into two halves. The first half implements the feedforward plane of the artificial neural network to determine a generalized value of RDt. The second half implements an operating procedure to determine RHt and SPt, at the same time making sure that the constraints of the mathematical program are met. This includes insuring that RDt does not exceed the demand and that both the mass continuity and the physical plausibility constraints of the reservoir system are maintained. The program nn_resv_sim.cpp required numerically reconstructing the feedforward plane of the NeuroSolutions generalized feedforward topology in the form of a C++ program. Considerable programming time and effort was spent emulating
the mathematics of the feedforward plane of the neural network (i.e., matrix multiplication and the occasional use of an analytical transfer function). The simulation program requires three output files from NeuroSolutions to be read in, all beginning with the file name of that specific run: filename.inputFile.nsn, filename.desiredFile.nsn, and filename.nsw. The first two files are automatically generated by the program and contain the parameters for normalizing network inputs and denormalizing network outputs. The third file, ending with .nsw, contains the weights of the neural network and must be specifically requested from the program. Figure 6 is a flowchart of the operating procedure used by the generalized reservoir release simulation program, nn_resv_sim.exe. The generalized reservoir simulation program begins by using the forward plane of the neural network to determine a generalized value of RDt. It then goes through an operating procedure to determine RHt and SPt and to insure that the operational constraints of the mathematical program are met. These constraint checks and associated adjustments are needed because the neural network can produce nonsensical results with regard to mass continuity and the other constraints of the NLP. The operating procedure includes steps to make sure that RDt does not exceed DEMt, that mass continuity is maintained, and that the physical plausibility constraints of the reservoir system are met; a simplified sketch of the feedforward pass and the continuity adjustment follows Figure 6.
Figure 6. Flow diagram of the generalized reservoir release simulation program, nn_resv_sim.
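To illustrate what reconstructing the feedforward plane and the continuity adjustment involve, the following is a minimal sketch assuming a single tanh hidden layer with biases folded into the weight rows. The skip connections of the generalized feedforward topology and the .nsn normalization parameters are omitted, so this is not the actual structure of nn_resv_sim.cpp.

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Minimal one-hidden-layer tanh feedforward pass. Inputs are assumed
    // already normalized to [0,1]; each weight row carries its bias as
    // the last element. The output would be denormalized afterwards.
    double feedforward(const std::vector<double>& x,
                       const std::vector<std::vector<double>>& wHidden,
                       const std::vector<double>& wOut) {
        std::vector<double> h(wHidden.size());
        for (size_t j = 0; j < wHidden.size(); ++j) {
            double sum = wHidden[j].back();                     // bias
            for (size_t i = 0; i < x.size(); ++i) sum += wHidden[j][i] * x[i];
            h[j] = std::tanh(sum);
        }
        double out = wOut.back();                               // output bias
        for (size_t j = 0; j < h.size(); ++j) out += wOut[j] * h[j];
        return std::tanh(out);
    }

    // Continuity/plausibility reconciliation: the raw network release is
    // clamped so it never exceeds the demand, the release capacity, or the
    // water physically available above dead storage this period.
    double reconcileRD(double rdNet, double dem, double sPrev, double inflow,
                       double sMin, double rMax) {
        double avail = sPrev + inflow - sMin;                   // usable water
        return std::max(0.0, std::min({rdNet, dem, avail, rMax}));
    }

The reconciliation function mirrors the constraint checks in Figure 6: whatever value the network proposes, the release actually simulated is feasible with respect to mass continuity and the NLP's bounds.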
3.2.3 Development of a Standard Operating Procedure

A standard operating procedure (SOP) was developed to provide a rational operating policy to compare against both the NLP and the generalized neural network model. This SOP is used as a reasonable measure of the largest possible sum of squared deficits. The SOP attempts to release the entire demand rather than hedge the demand-release in anticipation of future drought. On the other hand, the solution of the NLP is used as the measure of the best possible operating scenario, far better than would be possible in real time. This best-case scenario is the result of the ability of the nonlinear programming problem to incorporate information from the entire time horizon. The NLP and the generalized neural network models hedge their releases (i.e., reduce them below demand despite available water) in anticipation of greater periods of deficit ahead, to reduce the sum of squared deficits. The SOP in Figure 7 first requires that if there is a demand, the reservoir is to attempt to release that demand, and secondly, that once the reservoir is full, any excess water will be targeted for hydropower releases (a sketch of this logic follows Figure 7). Prior to each decision within the SOP algorithm, the volume of water available for each of RDt, St, and RHt (RDt,avail., St,avail., and RHt,avail.) is determined to insure that the reservoir's physical plausibility constraints are not violated. No constraint is placed on spill. The NLP solution may be considered to provide the lowest sum of squared deficits, while the SOP can be expected to provide a reasonable value of the highest sum of squared deficits.
Figure 7. Flow diagram of the standard operating procedure.
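A minimal sketch of one SOP time step follows, under the rules just described: attempt to release the full demand, send water above Smax to hydropower up to the remaining release capacity, and spill the rest. The variable names and exact clamping order are illustrative assumptions, not the study's code.

    #include <algorithm>

    // One time step of the standard operating procedure (illustrative).
    struct SopStep { double RD, RH, SP, S; };

    SopStep sopStep(double sPrev, double inflow, double dem,
                    double sMin, double sMax, double rMax) {
        SopStep r{0.0, 0.0, 0.0, 0.0};
        double water = sPrev + inflow;                  // total water this period
        // 1) Attempt to release the full demand (no hedging), limited by
        //    the water above dead storage and the release capacity.
        r.RD = std::max(0.0, std::min({dem, water - sMin, rMax}));
        double s = water - r.RD;
        // 2) Once the reservoir is full, target excess water for hydropower,
        //    up to the remaining controlled-release capacity.
        if (s > sMax) {
            r.RH = std::min(s - sMax, rMax - r.RD);
            s -= r.RH;
        }
        // 3) Any water still above Smax leaves as unconstrained spill.
        if (s > sMax) { r.SP = s - sMax; s = sMax; }
        r.S = s;                                        // mass continuity holds
        return r;
    }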
3.2.4 Feature Selection and Internal Network Architecture

A systematic simulation approach was taken to determine appropriate neural network inputs, the results of which are presented in the next chapter. Reservoir storage, demand, and inflow were all considered as inputs to the neural network to predict the NLP releases for demand. Predicted current period inflow and demand, as well as previous period storage, were considered as a baseline from which to start the search. The enormous number of combinations of potential inputs forced some simplifications of the search procedure. These simplifications included removing St lags early on in the selection process, with the exception of St-1, as well as keeping the number of lags of It and DEMt equal for each run (an example input vector is sketched at the end of this subsection). To streamline the study, only a cursory examination was made of increasing the internal dimensionality of the neural network model, its purpose being to avoid missing a standout solution that might improve the performance of the generalized neural network reservoir release model. When the number of hidden nodes was increased, and separately when the number of hidden layers was increased from one to two, an improved solution was not found. Thus it is felt that the default internal architecture recommended by the NeuralWizard of one hidden layer and 23 hidden nodes (fixed by the number of inputs) was sufficient to capture the complexity of a generalized reservoir release model containing 13 inputs and 1 output.
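For concreteness, a 13-element input vector consistent with this description pairs the current value and five lags each of inflow and demand with previous-period storage. The helper below is a sketch with an assumed lag depth; the lag depth was itself a parameter of the feature selection search.

    #include <vector>

    // Example 13-element input vector: I_t..I_{t-5}, DEM_t..DEM_{t-5}, S_{t-1}.
    // The values would be normalized to [0,1] before entering the network.
    std::vector<double> buildInputs(const std::vector<double>& I,
                                    const std::vector<double>& DEM,
                                    const std::vector<double>& S, int t) {
        std::vector<double> x;
        for (int k = 0; k <= 5; ++k) x.push_back(I[t - k]);
        for (int k = 0; k <= 5; ++k) x.push_back(DEM[t - k]);
        x.push_back(S[t - 1]);
        return x;
    }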
3.2.5 Stopping Criteria

A stopping criterion based on a maximum of 10,000 epochs was selected for the entire study. The maximum-epochs stopping criterion was simple to implement and provided a degree of uniformity between training runs. A desired-error-threshold stopping criterion was rejected, as the average cost of the objective criterion in NeuroSolutions is based on differences between the normalized output and the normalized desired output. A normalized value of the average error criterion is difficult to judge and cannot be readily denormalized. Cross-validation was also eliminated as a potential stopping criterion. Its elimination was a result of the fact that the cross-validation procedure requires the use of a validation data set. This data requirement is in addition to the training data set and therefore could have doubled the number of optimization runs needed in the study. In addition, the cross-validation procedure approximately doubles training time, as the neural network software needs to evaluate both a training and a validation data set in order to determine if the error in the validation data set is increasing. Furthermore, the purpose of the cross-validation procedure is to prevent overtraining. The potential for overtraining was discounted after performing an extremely long cross-validation training run (i.e., 100,000 epochs), discussed in the next paragraph.
A cross-validation training test was performed using 100,000 epochs with a (15-22-1) neural network configuration, six time-lagged pairs of It and DEMt, a baseline of It, DEMt, and St-1, and 500 years of monthly data. The purpose of the validation test was to determine if and when the generalized neural network becomes overtrained. Overtraining occurs when the neural network over-fits the training set and thus is unable to best generalize on a new unseen data set. Training was periodically stopped and the neural network's weights were saved. The generalized reservoir release model, nn_resv_sim.exe, was then run using the saved weights and the cross-validation data set. Table 3 summarizes the results as the percentage reduction of the sum of squared deficits with respect to the previous stopping epoch. After examining the time tradeoffs with respect to the percentage improvement in the solution, it was decided that a maximum of 10,000 training epochs would be a reasonable stopping criterion.

Table 3. Cross-validation test using a P6-200MHz PC.

Training Epoch    Time             Percentage Reduction
1                 ≈ 1 second       N/A
1,000             ≈ 20 minutes     57%
10,000            ≈ 3 hours        3.2%
100,000           ≈ 30 hours       0.75%
3.2.6 Reservoir Simulation Scenarios

Sixteen different reservoir scenarios were constructed using two indices, with the purpose of investigating the applicability of the neural network model over a range of storage and deficit conditions. The first index, maximum storage to average annual inflow (MSI), is the ratio of the maximum reservoir storage to its average annual inflow (MSI = Smax divided by the average annual inflow). The second index, optimal deficit to average annual inflow or optimal deficit to inflow (ODI), is the ratio of the reservoir deficit of the NLP solution to its average annual inflow (ODI = the NLP deficit divided by the average annual inflow). The use of the MSI index required the construction of four different reservoir configurations. The MSI index and the physical parameters for each of the four reservoir configurations are given in Table 4. A value of 220 acre-feet x103 was used as the total average annual inflow.
Table 4. Configurations of four different reservoirs (acre-feet x103).

         Config-1    Config-2    Config-3    Config-4
MSI      0.23        0.84        1.7         3.4
Smax     50          185         370         740
Smin     5           18.5        37          74
Rmax     100         100         100         100
Rmin     0           0           0           0

A quadratic hydraulic head to storage relationship was developed for each reservoir configuration. This new relationship was accomplished by multiplying Ruth Reservoir's storage data by a capacity-resizing factor and by multiplying Ruth's hydraulic stage data by a reservoir height-resizing factor. The parameters b0, b1, and b2 in each instance were then fit using quadratic regression. For each reservoir instance, an average efficiency of 0.7 was selected to represent hydropower plant efficiency (including head losses). The use of the ODI index required that a range of target values be determined prior to conducting numerical experiments. Target values of the ODI index were set in order to provide a relative value upon which to compare the four reservoir configurations. Numerous optimization runs were conducted varying DEMResidential and DEMCommercial to meet the target values of the ODI index given in Table 5.

Table 5. Target values of the ODI index.

Name     ODI Value
Run-a    ≈ 0.04
Run-b    ≈ 0.09
Run-c    ≈ 0.17
Run-d    ≈ 0.31
3.2.7 Experimental Design Approach

The purpose of this subsection is to give an overview of this study's experimental design approach. This subsection shows how the generalized reservoir release model was parameterized to generate the numerical results found in the results and discussion chapter that follows. Figure 8 illustrates the training and testing procedures used to examine the performance of the generalized neural network reservoir release model for an array of parameterized model inputs, coefficients, constraints, and demand values. The adjustment points for the model inputs, coefficients, constraints, and demand values are depicted by double arrows at the step within which they can be adjusted. The purpose of the neural network training procedure in Figure 8 is to develop the neural network training weights. The weights are then used by the generalized neural network reservoir release simulation model (i.e., nn_resv_sim.exe) found in the neural network testing procedure. The purpose of steps a and b in the testing procedure is to develop an independent test set for reservoir simulation. Steps c, d, and e in the test procedure optimize the nonlinear programming problem for the purpose of having a relevant NLP solution to compare against the results of the generalized reservoir release neural network. In all four hypothetical reservoir instances, the initial and final storage conditions of the reservoir system are fixed at one-half of Smax. In addition, in an attempt to remove the influence of these somewhat arbitrarily set boundary
Figure 8. Experimental design approach, comparing the neural network i) training procedure and ii) testing procedure.
conditions, 25 years of optimized monthly data are excluded after Sinitial and prior to Sfinal for the study's typical 550-year optimization run. This is shown in step e in both the testing and training procedures in Figure 8. The first purpose of this study is to investigate whether a performance increase in a generalized reservoir release neural network can be achieved by adding lagged inputs of inflow and demand to the three-input neural network used by Raman and Chandramouli [1996]. This requires parameterizing on the time-lagged inputs of inflow and demand in step f of the training procedure and then testing the final weights using an independent test set in the neural network testing procedure. The purpose of this study is also to show the applicability of the generalized release model over a range of reservoir configurations, demand-deficits, and annual streamflow autocorrelation structures. Each reservoir storage configuration, represented by a different value of the MSI index, required new values of Smax and Smin as well as the coefficients (i.e., b0, b1, and b2) representing the quadratic hydraulic head in terms of the reservoir storage relationship (i.e., step c in both the training and testing procedures). While Smax was parameterized, Rmax was set at 100 (acre-feet x103) to prevent a physical constraint on the demand-release. Rmax was held constant across reservoir configurations to simplify the comparison between the different reservoirs.
Meeting the target values of the ODI index for each reservoir configuration is achieved by parameterizing on the residential and commercial demand and then optimizing the NLP using MINOS (i.e., step b in both the training and testing procedures). The resulting deficit from the solution to the NLP is divided by the average annual inflow and compared with the target value of the ODI index. If the value is within approximately ±1% of the target ODI value, the NLP solution is kept for analysis. Otherwise, another optimization run is performed using newly adjusted residential and commercial demands in an attempt to get an NLP deficit closer to the ODI target value (this accept/retry loop is sketched at the end of the subsection). The annual lag-1 autocorrelation coefficient (ACC) of reservoir inflow is parameterized in the synthetic streamflow generator in order to characterize the performance of the neural network model over a range of reservoir inflow correlation structures (i.e., step a in both the training and testing procedures). The annual lag-1 ACC value was adjusted in the LAST parameter file by simply entering a new value of the coefficient. While annual lag-1 ACC values were adjusted, the correlation structure of the monthly synthetic streamflow was kept the same.
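The ODI-targeting step above amounts to an accept/retry loop. The sketch below is a hypothetical illustration: solveNLPDeficit is a toy stand-in for a full MINOS run, and the proportional demand adjustment is an assumed heuristic, since the text does not specify how the demands were adjusted between runs.

    #include <cmath>

    // Toy stand-in for a full MINOS optimization run; a real implementation
    // would regenerate the MPS file for the new demands and return the
    // NLP solution's total deficit.
    static double solveNLPDeficit(double demRes, double demCom) {
        return 0.1 * (demRes + demCom);
    }

    // Accept/retry targeting of the ODI index (illustrative).
    static double targetODI(double odiTarget, double avgAnnualInflow,
                            double demRes, double demCom) {
        double odi = 0.0;
        for (int iter = 0; iter < 50; ++iter) {
            double deficit = solveNLPDeficit(demRes, demCom);
            odi = deficit / avgAnnualInflow;
            if (std::fabs(odi - odiTarget) <= 0.01 * odiTarget)
                break;                          // within ~±1%: keep this run
            double scale = odiTarget / odi;     // assumed proportional adjustment
            demRes *= scale;
            demCom *= scale;
        }
        return odi;
    }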
4.0 RESULTS AND DISCUSSION

This chapter is divided into five sections: neural network configuration, feature selection, reservoir simulations, effect of annual lag-1 autocorrelation coefficient, and application to real-time reservoir operation. The first section describes the form of the generalized neural network used in the reservoir simulation study. The second discusses the investigation of the effect of increasing the number of time-lagged inputs of It and DEMt on the sum of squared deficits and total deficit. The third section presents the bulk of the study's results by examining the operation of four reservoir configurations, in terms of MSI, for three methodologies: nonlinear optimization (NLP), generalized neural network, and standard operating procedure. The reservoir systems are then stressed for progressively larger values of ODI. The fourth section examines the effect of the lag-1 autocorrelation coefficient on the neural network model. Finally, the fifth section discusses the applicability of the procedure to real-time reservoir operation.

4.1 Neural Network Configuration

The inputs to the neural network consisted of inflow, demand, and storage. Neural network input and output values were normalized within the interval [0,1], while the initial internal weights were randomized within the interval [0,1]. The hyperbolic tangent function was used as the transfer function for both hidden and output nodes. One hidden layer was used, while the neural network software determined the number of internal nodes.