Unit 4
Competitive Learning Neural Network
Dr Minakshi Pradeep Atre, Head, AI & DS dept, PVG
Contents
• Components of CL network
• Pattern clustering and feature mapping network
• 1. ART networks
• Features of ART models
• Character recognition using ART network
• 2. Self-Organization Maps (SOM):
• Two Basic Feature Mapping Models
• Self-Organization Map
• SOM Algorithm
• Properties of Feature Map
• Computer Simulations
• 3. Learning Vector Quantization
• Adaptive Pattern Classification
Course Objectives
1. To provide students with a basic understanding of the fundamentals
and applications of artificial neural networks
2. To identify the learning algorithms and to know the issues of
various feed forward and feedback neural networks.
3. To understand the basic concepts of Associative Learning and
pattern classification.
4. To solve real world problems using the concept of Artificial Neural
Networks.
Course Outcomes
• CO1: Understand the basic features of neural systems and be able to
build the neural model.
• CO2: Perform the training of neural networks using various learning
rules.
• CO3: Grasp the use of Associative Learning Neural Networks
• CO4: Describe the concept of Competitive Neural Networks
• CO5: Implement the concept of Convolutional Neural Networks and
its models
• CO6: Use a new tool /tools to solve a wide variety of real-world
problems
Properties of Competitive Learning
•The units in a given layer are broken into several sets of nonoverlapping clusters. Each unit within a
cluster inhibits every other unit within a cluster. Within each cluster, the unit receiving the largest input
achieves its maximum value while all other units in the cluster are pushed to their minimum value. We
have arbitrarily set the maximum value to 1 and the minimum value to 0.
•Every unit in every cluster receives inputs from all members of the same set of input units.
•A unit learns if and only if it wins the competition with other units in its cluster.
•A stimulus pattern Sj consists of a binary pattern in which each element of the pattern is
either active or inactive. An active element is assigned the value 1 and an inactive element is assigned
the value 0.
•Each unit has a fixed amount of weight (all weights are positive) that is distributed among its input
lines. The weight on the line connecting to unit i on the upper layer from unit j on the lower layer is
designated w_ij. The fixed total amount of weight for unit i is designated ∑_j w_ij = 1. A unit learns by shifting weight from its inactive
to its active input lines. If a unit does not respond to a particular pattern, no learning takes place in that
unit. If a unit wins the competition, then each of its input lines gives up some portion ϵ of its weight and
that weight is then distributed equally among the active input lines. Mathematically, this learning rule
can be stated as shown below.
https://web.stanford.edu/group/pdplab/pdphandbook/handbookch7.html
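The equation itself does not appear on the slide; a reconstruction from the referenced PDP handbook chapter, written in the slide's notation (w_ij is the weight to upper-layer unit i from lower-layer input line j), is:

Δw_ij = 0                              if unit i loses the competition on pattern S_k
Δw_ij = ϵ · c_jk / n_k − ϵ · w_ij      if unit i wins the competition on pattern S_k

where c_jk = 1 if input line j is active in pattern S_k (0 otherwise) and n_k is the number of active input lines in S_k. The winner thus gives up a fraction ϵ of the weight on every line and redistributes it equally over its active lines, so ∑_j w_ij stays equal to 1.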
Introduction to competitive learning
• Hebbian learning – multiple output units can fire simultaneously
• Competitive learning has all the output units compete among
themselves for activation
• As a result of such a competition, only one output unit is active at
any given time
• This is known as winner-take-all
• Competitive learning has been found to exist in biological neural
networks
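A minimal sketch of this winner-take-all step, assuming a NumPy weight matrix W with one row per output unit and an input vector x (the names are illustrative, not from the slides):

import numpy as np

def winner_take_all(W, x):
    # net input of every output unit
    net = W @ x
    # the unit with the largest net input wins the competition
    winner = int(np.argmax(net))
    # winner's activation is set to 1, all other activations to 0
    y = np.zeros(len(net))
    y[winner] = 1.0
    return winner, y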
ART (Adaptive Resonance Theory) – Introduction
• Gail Carpenter and Stephen Grossberg proposed the ART theory – ART-NN
• Widely used for clustering applications
• Competitive networks – unsupervised NNs – do not always form
stable clusters
• They are oscillatory when more input patterns are presented
• No guarantee that as more inputs are applied to an NN used for
clustering, the weights will eventually converge and be stable
• Learning instability occurs because of network’s adaptability (or
plasticity) – which causes prior learning to be eroded by more recent
learning
Why ART?
• The challenge – to develop a NN that is receptive to significant new
patterns and still remains stable
• ART solves the problem of learning instability by a modified type of
competitive learning
• Three types –
• ART 1 – 1986- can cluster only binary inputs
• ART 2- 1987 – can handle gray scale inputs
• ART 3 – 1989 – can handle analog inputs better by overcoming the limitations of
ART-2
• Key innovation – use of a 'degree of expectation' called the vigilance parameter
• Vigilance parameter – a user-specified value to decide the degree of
similarity essential for the input patterns to be assigned to a cluster unit
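As a concrete illustration of how the vigilance parameter is used in ART-1 (binary vectors; d_J denotes the top-down prototype of candidate cluster J in the notation of the architecture slides below, and ‖·‖ counts the number of 1s), the candidate is accepted only if

‖s ∧ d_J‖ / ‖s‖ ≥ ρ,

i.e., only if the fraction of the input's active elements that also appear in the prototype reaches the user-specified vigilance ρ ∈ (0, 1].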
Working Principle of ART
• As each input is presented to the n/w, it is compared with the
prototype vector for a match (based on vigilance parameter)
• If the match between prototype and input vectors is not adequate, a
new prototype or cluster unit is selected
• This way, previously learned memories (prototypes) are not eroded by
new learning
• Basic ART is unsupervised
• Resonance – the state of the n/w in which a prototype vector closely
matches the current input vector; this state permits learning
• During this resonant state, the weight updation takes place
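For ART-1 with fast learning, these updates during resonance are commonly written (in the notation of the architecture slides, with u for bottom-up and d for top-down weights; x_i = s_i ∧ d_Ji(old) is the interface-layer activation and L > 1 a constant, often 2):

d_Ji(new) = x_i
u_iJ(new) = L · x_i / (L − 1 + ‖x‖)

so the top-down prototype is intersected with the input, and the bottom-up weights become a renormalized copy of the same vector.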
Basic Architecture of ART
• Consists of 3 layers:
• Input processing layer (L1) for processing
given inputs
• Output layer (L2) with the cluster units
• And reset layer (R) which decides the
degree of similarity of patterns placed on
the same cluster by a reset mechanism
• Input processing layer is further
divided into
• input layer (L1s) and
• input interface layer (L1i)
• The bottom-up weights (uij) connect the input interface layer to the
output layer
• And the top-down weights (dji) are
connected between the output layer
and the interface layer
(Figure: ART architecture showing the bottom-up and top-down weight connections)
Roles of all the layers of ART architecture
• The output layer is a competitive layer or a recognition region where the
cluster units participate to check the closeness of the input patterns
• The interface layer is usually called the ‘comparison region’ where it gets
an input vector and transfers it to its best match in the recognition region
• The best match is the single neuron in the competitive layer whose set of
weights closely matches the input vector
• The reset layer compares the strength of the recognition match to the
vigilance parameter
• If the vigilance threshold is met, then the training or the updation of
weights takes place, else the firing of the recognition neuron is inhibited
until a new input vector is applied
Schematic representation
• Input layer – where an input pattern is presented
• Interface layer – comparison region
• Output layer – recognition region
• Reset layer
Operation of ART-1
• Present a binary input vector Sp to the input layer L1s and the information
is passed to its corresponding units in the input interface layer L1i
• Interface units transmit the information to the output layer L2 cluster units
through the bottom-up weights uij
• Output units compete to become a winner
• Operation is similar to MAXNET where the largest net input to the output
unit usually becomes the winner and the activation becomes 1
• All the other output units will have an activation of 0
• Let the winning cluster unit’s index be set to ‘J’ (capital ‘J’)
• If there is a tie, the unit with the smallest index j is considered to be
the winner
• Now the winner unit ‘J’ will be allowed to learn the input pattern
Next steps:
• Information about the winner is sent from the output layer L2 to the
interface layer L1i through the top-down weights, dji
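Putting the bottom-up pass, the winner-take-all choice, the vigilance test and the resonance update together, a compact ART-1 sketch (binary vectors, Fausett-style fast learning assumed; names such as rho, L, U, D are illustrative, not from the slides):

import numpy as np

def art1_present(s, U, D, rho=0.7, L=2.0):
    # s: binary input vector (at least one active element assumed)
    # U: bottom-up weights, shape (clusters, inputs)
    # D: top-down (prototype) weights, same shape, binary-valued
    candidates = list(range(U.shape[0]))
    while candidates:
        net = U[candidates] @ s                            # bottom-up net inputs
        J = candidates[int(np.argmax(net))]                # winner-take-all choice
        x = np.logical_and(s > 0, D[J] > 0).astype(float)  # interface activation s AND d_J
        if x.sum() / s.sum() >= rho:                       # vigilance test passed: resonance
            D[J] = x                                       # top-down fast-learning update
            U[J] = L * x / (L - 1.0 + x.sum())             # bottom-up update
            return J
        candidates.remove(J)                               # reset: inhibit J and search again
    return None                                            # no existing cluster passed vigilance

In a full implementation, a new cluster unit would be committed when None is returned.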
Reference: Research article:
https://www.linkedin.com/pulse/reading-notes-adaptive-resonance-
theory-jaganadh-gopinadhan/
ART: an introduction
• Can be referred to as "Deep Unsupervised Learning or Unsupervised
Learning for Deep Learning"
• The key motivation for ART is the stability-plasticity dilemma
• What is the stability-plasticity dilemma?
• It is a constraint for artificial and biological neural systems.
• Learning in a parallel and distributed system requires plasticity for
incorporating new knowledge, at the same time stability has to be
maintained to prevent forgetting previous knowledge.
• Too much plasticity or stability may impede the capability of a
learning system
Catastrophic forgetting
• In short, it leads to catastrophic forgetting
• Catastrophic Forgetting is one of the key drawbacks of NN-based
systems. It is defined as 'complete forgetting of previously learned
information by a Neural Network exposed to new information'. It is a
problem faced by many machine learning models and algorithms,
specifically Neural Networks. Many methods have been proposed to
examine and address Catastrophic Forgetting in NNs. Whenever we talk
about "Deep Learning", we also have to think about Catastrophic
Forgetting (CF)
Programming Language dilemma – MATLAB to Python
Algorithms
• https://blog.oureducation.in/art-adaptive-resonance-theory/
Alternative ART structure and algorithm
Reference: research paper by Carpenter and Grossberg
Reference:
• https://opg.optica.org/ao/fulltext.cfm?uri=ao-26-23-4919&id=30891
• https://doi.org/10.1364/AO.26.004919
Features of ART
• 1) self-scaling computational unit
• 2) self-adjusting memory search
• 3) pre-learned pattern access and
• 4) attention vigilance
Discussion of features
• 1) Self-Scaling Property: This refers to its ability to treat mismatches in input
patterns with few features as essential while suppressing the same mismatches
as noise in input patterns with many features. For image processing, this implies
that depending on the resolution of the image presented, the network will be
selective to certain features of it.
• 2) Self-Stabilizing Property: This refers to its ability to defend its fully committed
memory capacity from being washed away by an incessant flux of new input
patterns and to access a node in its memory without a search if a familiar input
were to be presented.
• 3) Plasticity: This refers to its ability to recognize and store new input patterns in a
nonstationary environment limited only by the total memory available.
• Properties 2) and 3) help not only in preserving previously learned images but
also in continuing to learn new images without erasing the memories of prior
images. All the above properties make the ART network ideal for use as a
pattern recognition machine for image processing. However, the network is not
invariant to translations, rotations, and scaling and needs to be modified in order
to achieve that
Character recognition and computer
simulation
• http://techlab.bu.edu/files/resources/articles_tt/AN%20INVARIANT%
20PATTERN-
RECOGNITION%20MACHINE%20USING%20A%20MODIFIED%20ART%
20ARCHITECTURE.pdf
Match the pairs
Scientists
• McCulloch-Pitts (1943)
• Hebb n/w (1949)
• Simple Perceptron n/w
• Versions of perceptron
• Backpropagation
• Adaline
• SOFM
• ART
Models/ algorithms
Match the pairs
Scientists
• McCulloch-Pitts (1943)
• Hebb n/w (1949)
• Simple Perceptron n/w
• Versions of perceptron
• Backpropagation
• Adaline
• SOFM
• ART
Models/ algorithms
• McCulloch-Pitts
• Donald Hebb
• Block (1962)
• Rosenblatt (1962), Minsky-Papert (1969, 1988)
• Bryson-Ho (1969), Werbos, LeCun (1985), Parker, Rumelhart
• Bernard Widrow and Marcian Hoff of Stanford (1959)
• Teuvo Kohonen (1980)
• Stephen Grossberg and Gail Carpenter (1987)
Computer simulation of Competitive Learning
After the activation values are determined for each of the
output units, the weights must be adjusted according to the
learning rule.
This involves increasing the weights from the active input lines
to the winner and decreasing the weights from the inactive
lines to the winner.
This routine assumes that each input pattern sums to 1.0
across units, keeping the total amount of weight equal to 1.0
for a given output unit.
If we do not want to make this assumption, the routine could
easily be modified by implementing Equation 6.1 instead.
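A minimal sketch of the routine described above, assuming a NumPy weight matrix W whose rows each sum to 1.0 and input patterns normalized to sum to 1.0 (names and the learning rate eps are illustrative):

import numpy as np

def update_winner(W, pattern, winner, eps=0.05):
    # the winner gives up a fraction eps of the weight on every input line
    # and that weight is redistributed over the (normalized) active lines,
    # so the row W[winner] keeps summing to 1.0
    W[winner] += eps * (pattern - W[winner])
    return W

Losing units are left unchanged, matching the rule that only the winner learns.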
Computer simulation for weight updation
KSOFM
Feature mapping
• Feature mapping is a process which converts the
patterns of arbitrary dimensionality into a response of
one- or two-dimensional arrays of neurons, i.e., it
converts a wide pattern space into a typical feature
space
• The network performing such a mapping is called
feature map
• Apart from its capability to reduce the higher
dimensionality, it has to preserve the neighborhood
relations of the input patterns, i.e., it has to obtain a
topology preserving map.
• For obtaining such feature maps, it is required to find a
self-organizing neural array which consists of neurons
arranged in a one-dimensional array or a two-
dimensional array.
• Typical network structure where each component of
the input vector x is connected to each of the nodes is
shown in Figure
• On the other hand, if the input vector is
two-dimensional, the inputs, say x(a, b),
can arrange themselves in a two
dimensional array defining the input
space (a, b)
• Here, the two layers are fully connected.
• The topological preserving property is
observed in the brain, but not found in
any other artificial neural network.
• Here, there are m output cluster units
arranged in a one- or two-dimensional
array and the input signals are n-tuples.
• The cluster (output) units’ weight vector
serves as an exemplar of the input
pattern that is associated with that
cluster
Principle of working
• At the time of self-organization, the weight vector of the cluster unit
which matches the input pattern very closely is chosen as the winner
unit.
• The closeness of weight vector of cluster unit to the input pattern
may be based on the square of the minimum Euclidean distance (L2
norm)
• The weights are updated for the winning unit and its neighboring
units.
• It should be noted that the weight vectors of the neighboring units
are not close to the input pattern, and the connection weights do not
multiply the signal sent from the input units to the cluster units unless
the dot product measure of similarity is being used.
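A minimal sketch of one training step for a one-dimensional map, assuming a NumPy weight matrix W with one row per cluster unit; the Gaussian neighborhood and the parameter names eta and sigma are illustrative choices, not prescribed by the slides:

import numpy as np

def som_step(W, x, eta=0.1, sigma=1.0):
    # winner = cluster unit whose weight vector is closest to x
    # (squared Euclidean distance, as described above)
    d = np.sum((W - x) ** 2, axis=1)
    j_star = int(np.argmin(d))
    # neighborhood strength decays with distance from the winner along the array
    idx = np.arange(W.shape[0])
    h = np.exp(-((idx - j_star) ** 2) / (2.0 * sigma ** 2))
    # move the winner and its neighbors toward the input pattern
    W += eta * h[:, None] * (x - W)
    return j_star

In practice, eta and sigma are gradually decreased over the training epochs so that the map first organizes globally and then fine-tunes locally.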
Architecture
LVQ: Learning Vector Quantization
Concept and understanding the theory
• Learning vector quantization (LVQ) is a process of classifying the patterns, wherein each output
unit represents a particular class.
• Here, for each class several units should be used.
• The output unit weight vector is called the reference vector or code book vector for the class
which the unit represents.
• This is a special case of competitive net, which uses supervised learning methodology.
• During training, the output units are found to be positioned to approximate the decision surfaces
of the existing Bayesian classifier.
• Here, the set of training patterns with known classifications is given to the network, along with an
initial distribution of the reference vectors.
• When the training process is complete, an LVQ net is found to classify an input vector by assigning
it to the same class as that of the output unit, which has its weight vector very close to the input
vector.
• Thus, LVQ is a classifier paradigm that adjusts the boundaries between categories to minimize
existing misclassification.
• LVQ is used for optical character recognition, converting speech into phonemes and other
applications as well.
• LVQ net may resemble KSOFM net. Unlike LVQ, KSOFM output nodes do not correspond to the
known classes but rather correspond to unknown clusters that the KSOFM finds in the data
autonomously
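A minimal sketch of the basic LVQ1 update, assuming a NumPy matrix W of reference (codebook) vectors with known class labels; if the winning reference vector has the correct class it is moved toward the input, otherwise away from it (names such as labels, target, alpha are illustrative):

import numpy as np

def lvq1_step(W, labels, x, target, alpha=0.1):
    # find the reference (codebook) vector closest to the input x
    d = np.sum((W - x) ** 2, axis=1)
    J = int(np.argmin(d))
    if labels[J] == target:          # correct class (T = J): reinforce
        W[J] += alpha * (x - W[J])   # move the reference vector toward x
    else:                            # wrong class: push the boundary away
        W[J] -= alpha * (x - W[J])
    return J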
Architecture
• The architecture of LVQ is almost the same as that of KSOFM, the
difference being that in the case of LVQ, the topological structure at
the output units is not considered.
• Here, each output unit has knowledge of the known class it represents.
Training Algorithm
Thus the first epoch of training has been completed. It is noted that if the correct class is obtained for the first and second
input patterns, further epochs can be performed until the winner unit matches the target class for every input, i.e., T = J for all patterns
Thank you!
