The name counterpropagation derives from the initial presentation of this network as a five-layered network with data flowing inward from both sides, through the middle layer and out the opposite sides. There is literally a counterflow of data through the network. Although this is an accurate picture of the network, it is unnecessarily complex; we can simplify it considerably with no loss of accuracy. In the simpler view of the counterpropagation network, it is a three-layered network. The input layer is a simple fan-out layer presenting the input pattern to every neurode in the middle layer.
The middle layer is a straightforward Kohonen layer, using the competitive filter learning scheme discussed in chapter 7. Such a scheme ensures that the middle layer will categorize the input patterns presented to it and will model the statistical distribution of the input pattern vectors. The third, or output, layer of the counterpropagation network is a simple outstar array. The outstar, you may recall, can be used to associate a stimulus from a single neurode with an output pattern of arbitrary complexity. In operation, an input pattern is presented to the counterpropagation network and distributed by the input layer to the middle (Kohonen) layer.
Here the neurodes compete to determine that neurode with the strongest response (the closest weight vector) to the input pattern vector. That winning neurode generates a strong output signal (usually a +1) to the next layer; all other neurodes transmit nothing. At the output layer we have a collection of outstar grid neurodes. These are neurodes that have been trained by classical (Pavlovian) conditioning to generate specific output patterns in response to specific stimuli from the middle layer. The neurode from the middle layer that has fired is the hub neurode of the outstar, and it corresponds to some pattern of outputs.
Because the outstar-layer neurodes have been trained to do so, they obediently reproduce the appropriate pattern at the output layer of the network. In essence then, the counterpropagation network is exquisitely simple: the Kohonen layer categorizes each input pattern, and the outstar layer reproduces whatever output pattern is appropriate to that category. What do we really have here? The counterpropagation network boils down to a simple lookup table. An input pattern is presented to the net, which causes one particular winning neurode in the middle layer to fire.
The output layer has learned to reproduce some specific output pattern when it is stimulated by a signal from this winner. Presenting the input stimulus merely causes the network to determine that this stimulus is closest to stored pattern 17, for example, and the output layer obediently reproduces pattern 17. The counterpropagation network thus performs a direct mapping of the input to the output.
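The lookup-table behavior described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the original authors' code: the function names, the use of squared Euclidean distance as the measure of "closest", and the toy weight values are all assumptions made for the example.

```python
def winner_index(kohonen_weights, x):
    """Return the index of the Kohonen neurode whose weight vector is closest to x.

    Assumption: "closest" is measured by squared Euclidean distance.
    """
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(kohonen_weights)), key=lambda i: sq_dist(kohonen_weights[i]))


def recall(kohonen_weights, outstar_weights, x):
    """Winner-take-all recall.

    The winner emits +1 and every other neurode emits nothing, so the
    output is simply the outstar weight row attached to the winner.
    """
    i = winner_index(kohonen_weights, x)
    return i, list(outstar_weights[i])


# Hypothetical stored patterns: three categories of 2-D inputs,
# each associated with a 3-element output pattern.
kohonen_weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
outstar_weights = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]

winner, output = recall(kohonen_weights, outstar_weights, [0.9, 0.2])
```

Here the input is nearest to the second stored weight vector, so the network returns the output pattern stored for that category, exactly as a table lookup keyed by "nearest stored pattern" would.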
The counterpropagation network consists of an input layer, a hidden layer called the Kohonen layer, and an output layer called the Grossberg layer (Hecht-Nielsen, 1987). The Kohonen layer works in a winner-take-all fashion. The training process of this network consists of two steps: first, unsupervised learning is performed by the Kohonen layer; then, once the Kohonen layer is stable, supervised learning is performed by the Grossberg layer. In normal operation, when an input is presented to the network, it is classified by the Kohonen layer and the winner node activates the appropriate output nodes in the Grossberg layer.
The counterpropagation network (Figure 3), developed by Hecht-Nielsen, consists of two layers: a Kohonen layer and a Grossberg layer (Hecht-Nielsen, 1987). In our prototype, the Kohonen layer works in a "winner-take-all" fashion. The input for the Kohonen layer is a vector that represents the relevant word-stems present in the title and abstract of the document. This input is normalized and processed by the Kohonen layer. After the output of the Kohonen layer has stabilized, it is presented as the input to the Grossberg layer. The output of the Grossberg layer is the weighted sum of the Kohonen layer outputs. In training mode, we first train the Kohonen layer in an unsupervised mode. The training rule for the Kohonen layer is:
$$w^{\text{new}} = w^{\text{old}} + \alpha\,(x - w^{\text{old}})$$

where
$w^{\text{new}}$ is the new value of a weight connecting an input $x$ with the winner node of the Kohonen layer,
$w^{\text{old}}$ is the previous value of this weight, and
$\alpha$ is a learning rate coefficient that is decreased during the training process.
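A single application of the Kohonen training rule can be sketched as follows. This is a minimal illustration under stated assumptions: the learning rate coefficient is passed as a parameter named `alpha`, the input is normalized to unit Euclidean length (the text says only that the input is normalized, not how), and the function names are hypothetical.

```python
def normalize(x):
    """Scale x to unit Euclidean length (assumed form of normalization)."""
    norm = sum(v * v for v in x) ** 0.5
    return [v / norm for v in x] if norm > 0 else list(x)


def kohonen_update(w_old, x, alpha):
    """Move the winner's weight vector a fraction alpha of the way toward x."""
    return [w + alpha * (xi - w) for w, xi in zip(w_old, x)]


w = [0.0, 1.0]                        # current weights of the winner node
x = normalize([3.0, 4.0])             # approximately [0.6, 0.8]
w = kohonen_update(w, x, alpha=0.5)   # halfway between old weights and x
```

With `alpha = 1` the winner's weights would jump directly onto the input, and with `alpha = 0` they would not move at all; decreasing `alpha` over training therefore makes later adjustments progressively finer.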
Lateral inhibition is implemented by using a Mexican hat function. This means that only the winning neuron and its neighbors participate in learning for a given pattern. The neighborhood size is decreased during the training process until it reaches 0.
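The shrinking neighborhood can be illustrated with a small sketch. Several details here are assumptions, since the text does not specify them: the Kohonen nodes are arranged in a one-dimensional row, the radius decays linearly to 0 by the final epoch, and every node inside the neighborhood is treated as an equal participant rather than being weighted by the actual Mexican hat profile.

```python
def neighborhood(winner, radius, n_nodes):
    """Indices of the nodes that learn: the winner and its neighbors within radius.

    Assumption: nodes sit in a 1-D row, so neighbors are adjacent indices.
    """
    lo = max(0, winner - radius)
    hi = min(n_nodes - 1, winner + radius)
    return list(range(lo, hi + 1))


def radius_at(epoch, n_epochs, initial_radius):
    """Linearly shrink the radius so that it reaches 0 at the final epoch."""
    if n_epochs <= 1:
        return 0
    return initial_radius * (n_epochs - 1 - epoch) // (n_epochs - 1)


# Radius schedule for a hypothetical 5-epoch run starting at radius 2.
schedule = [radius_at(e, 5, 2) for e in range(5)]

# Early in training the winner's neighbors learn too; at radius 0 only the winner does.
early = neighborhood(winner=3, radius=schedule[0], n_nodes=5)
late = neighborhood(winner=3, radius=schedule[-1], n_nodes=5)
```

Once the radius has reached 0, training reduces to pure winner-take-all updates.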
After the Kohonen layer is trained, the training of the Grossberg layer starts. This is done in supervised mode. An input vector is applied, the Kohonen output is established, and the Grossberg outputs are calculated. If the difference between the desired outputs and the Grossberg layer outputs is greater than the acceptable error, then the weights are changed using the following training rule:
$$v_{ij}^{\text{new}} = v_{ij}^{\text{old}} + \beta\,(y_j - v_{ij}^{\text{old}})\,k_i$$

where
$v_{ij}^{\text{new}}$ is the new value of the weight that connects Kohonen neuron $i$ to neuron $j$ of the Grossberg layer,
$v_{ij}^{\text{old}}$ is the previous value of this weight,
$k_i$ is the output of Kohonen neuron $i$ (only one Kohonen neuron is nonzero),
$\beta$ is a training constant that is initialized to 0.1 and gradually reduced during training, and
$y_j$ is the desired value of output $j$ (MeSH terms).
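The supervised step for the Grossberg layer can be sketched as below. This is an illustration under assumptions rather than the implementation used in the prototype: the training constant is passed as `beta`, the "acceptable error" is taken to be a bound on the largest absolute difference between desired and actual outputs, and all function names are hypothetical.

```python
def grossberg_output(v, k):
    """Weighted sum of the Kohonen outputs: out[j] = sum_i k[i] * v[i][j]."""
    n_out = len(v[0])
    return [sum(k_i * row[j] for row, k_i in zip(v, k)) for j in range(n_out)]


def grossberg_update(v, k, y, beta):
    """Apply v_ij <- v_ij + beta * (y_j - v_ij) * k_i to every weight.

    Because only the winning Kohonen neuron has k_i != 0, only the
    winner's row of weights actually changes.
    """
    return [[v_ij + beta * (y_j - v_ij) * k_i for v_ij, y_j in zip(row, y)]
            for row, k_i in zip(v, k)]


def train_step(v, k, y, beta, tolerance):
    """Update the weights only if the output error exceeds the tolerance.

    Assumption: error is the maximum absolute difference over the outputs.
    """
    out = grossberg_output(v, k)
    error = max(abs(y_j - o_j) for y_j, o_j in zip(y, out))
    return grossberg_update(v, k, y, beta) if error > tolerance else v


v = [[0.0, 0.0], [0.5, 0.5]]   # v[i][j]: Kohonen neuron i -> Grossberg neuron j
k = [0, 1]                     # Kohonen neuron 1 is the winner
y = [1.0, 0.0]                 # desired outputs
v = train_step(v, k, y, beta=0.1, tolerance=0.05)
```

Repeating the step moves the winner's row of weights toward the desired output vector, while the rows belonging to losing Kohonen neurons are left untouched.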
Nie (1995) has shown the equivalence between counterpropagation networks and fuzzy models. This adds an interesting characteristic to this kind of network, because the knowledge contained in a trained network could be extracted and represented using fuzzy rules. The counterpropagation network was implemented in C++ using a library of objects included in Rao & Rao (1995). For this implementation we defined a class representing the counterpropagation network that contains two objects: the first is of type Kohonen-layer and the second is of type Grossberg-layer. The Kohonen-layer and Grossberg-layer classes encapsulate all the data structures and operations of their respective layer types.