Department of Information Technology, Soft Computing (ITC4256)
Dr. C.V. Suresh Babu
Professor
Department of IT
Hindustan Institute of Science & Technology
Unsupervised Learning Networks
Action Plan
• Unsupervised Learning Networks
- Introduction to Kohonen Self-Organizing Feature Maps (KSOM)
- Rectangular grid computing
- Hexagonal grid computing
- KSOM architecture
- KSOM training algorithm
• Quiz at the end of session
Kohonen Self-Organizing Feature Maps (KSOM)
• Suppose we have patterns of arbitrary dimension, but we need to represent them in one or two dimensions.
• The process of feature mapping is then very useful for converting the wide pattern space into a typical feature space.
• Various topologies are possible; the following two are used most often:
- Rectangular Grid Topology
- Hexagonal Grid Topology
Rectangular Grid Topology
• The ring of nodes immediately around the winning unit has 8 nodes, the next ring has 16 nodes, and the ring after that has 24 nodes: each successive rectangular ring grows by 8 nodes.
• The winning unit is indicated by #.
Hexagonal Grid Topology
• The ring of nodes immediately around the winning unit has 6 nodes, the next ring has 12 nodes, and the ring after that has 18 nodes: each successive hexagonal ring grows by 6 nodes.
• The winning unit is indicated by #.
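The node counts on the two grid slides follow a simple pattern, sketched below as a small illustration (the function names are ours, not from the slides):

```python
def rectangular_ring(d):
    """Nodes in the d-th rectangular ring around the winning unit (#)."""
    return 8 * d if d > 0 else 1   # d = 0 is the winning unit itself

def hexagonal_ring(d):
    """Nodes in the d-th hexagonal ring around the winning unit (#)."""
    return 6 * d if d > 0 else 1

print([rectangular_ring(d) for d in (1, 2, 3)])  # [8, 16, 24]
print([hexagonal_ring(d) for d in (1, 2, 3)])    # [6, 12, 18]
```

Each successive rectangular ring adds 8 nodes and each hexagonal ring adds 6, which matches the counts given on the slides.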
KSOM - Architecture
• The architecture of KSOM is similar to that of a competitive network.
• With the help of the neighborhood schemes discussed earlier, training can take place over an extended region of the network.
KSOM – Training Algorithm
Step 1 − Initialize the weights, the learning rate α and the neighborhood
topological scheme.
Step 2 − Repeat steps 3-9 while the stopping condition is false.
Step 3 − Repeat steps 4-6 for every input vector x.
Step 4 − For j = 1 to m, calculate the squared Euclidean distance
D(j) = ∑i=1..n (xi − wij)²
Step 5 − Obtain the winning unit J for which D(J) is minimum.
KSOM – Training Algorithm (Cont…)
Step 6 − Update the weights of the winning unit J (and of the units in its topological neighborhood) by the following relation −
wij(new) = wij(old) + α[xi − wij(old)]
Step 7 − Update the learning rate α by the following relation −
α(t + 1) = 0.5 α(t)
Step 8 − Reduce the radius of the topological neighborhood scheme.
Step 9 − Check for the stopping condition for the network.
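Steps 1-9 above can be sketched in NumPy. This is a minimal illustration, assuming a 1-D map of m units, a shrinking integer neighborhood radius, and a fixed epoch count as the stopping condition; the function and variable names are ours:

```python
import numpy as np

def ksom_train(X, m, epochs=20, alpha=0.5, radius=1):
    """Steps 1-9 of the KSOM training algorithm for a 1-D map of m units."""
    n = X.shape[1]
    rng = np.random.default_rng(0)
    W = rng.random((n, m))                       # Step 1: initialize weights w_ij
    for _ in range(epochs):                      # Step 2: repeat until stopping condition
        for x in X:                              # Step 3: for every input vector x
            D = ((x[:, None] - W) ** 2).sum(axis=0)  # Step 4: D(j) = sum_i (x_i - w_ij)^2
            J = int(np.argmin(D))                # Step 5: winning unit J
            lo, hi = max(0, J - radius), min(m, J + radius + 1)
            W[:, lo:hi] += alpha * (x[:, None] - W[:, lo:hi])  # Step 6: update winner + neighbors
        alpha *= 0.5                             # Step 7: alpha(t+1) = 0.5 alpha(t)
        radius = max(0, radius - 1)              # Step 8: shrink the neighborhood radius
    return W                                     # Step 9: stop after the fixed epoch budget
```

After training, each column of W is the weight vector of one map unit.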
Quiz - Questions
1. What are the 2 topologies used in KSOM?
2. The winning unit is indicated by ---------.
a) * b) $ c) # d) !
3. What parameters have to be initialized for the training algorithm?
4. What is done in step 8?
5. The architecture of KSOM is similar to that of the ------------ network.
Quiz - Answers
1. What are the 2 topologies used in KSOM?
i. Rectangular Grid Topology ii. Hexagonal Grid Topology
2. The winning unit is indicated by ---------.
c) #
3. What parameters have to be initialized for the training algorithm?
Weights, learning rate α, and the neighbourhood topological scheme.
4. What is done in step 8?
Reduce the radius of topological scheme.
5. The architecture of KSOM is similar to that of the ------------ network.
Competitive
Action Plan
• Unsupervised Learning Networks (Cont…)
- Introduction to ART
- Operational principle of ART
- ART1 architecture
- ART1 training algorithm
• Quiz at the end of session
Adaptive Resonance Theory (ART)
• This network was developed by Stephen Grossberg and Gail Carpenter in 1987.
• It is based on competition and uses an unsupervised learning model.
• Basically, an ART network is a vector classifier: it accepts an input vector and classifies it into one of its categories depending upon which stored pattern it resembles most.
ART – Operational Principle
• The main operation of ART classification can be divided into the following
phases:
- Recognition phase
- Comparison phase
- Search phase
ART1 - Architecture
1. Computational Unit:
a. Input unit (F1 layer):
i. F1a layer (input portion)
ii. F1b layer (interface portion)
b. Cluster unit (F2 layer)
c. Reset mechanism
2. Supplement Unit:
• Two supplemental units, namely G1 and G2, are added along with the reset unit R.
• They are called gain control units.
ART1 – Architecture (Cont…)
Parameters Used:
• n − Number of components in the input vector
• m − Maximum number of clusters that can be formed
• bij − Weight from F1b to F2 layer, i.e. bottom-up weights
• tji − Weight from F2 to F1b layer, i.e. top-down weights
• ρ − Vigilance parameter
• ||x|| − Norm of vector x
ART1 – Training Algorithm
Step 1 − Initialize the learning rate, the vigilance parameter, and the weights as
follows −
α > 1 and 0 < ρ ≤ 1
0 < bij(0) < (α) / (α − 1 + n) and tji(0) = 1
Step 2 − Repeat steps 3-12 while the stopping condition is false.
Step 3 − Repeat steps 4-12 for every training input.
Step 4 − Set the activations of all F2 and F1a units as follows:
F2 = 0 and F1a = input vector x
Step 5 − Send the input signal from the F1a to the F1b layer:
si = xi
ART1 – Training Algorithm (Cont…)
Step 6 − For every F2 node that is not inhibited (yj ≠ −1), compute
yj = ∑i bij xi
Step 7 − Perform steps 8-10 while reset is true.
Step 8 − Find the unit J such that yJ ≥ yj for all nodes j.
Step 9 − Recalculate the activation on F1b as follows:
xi = si tJi
Step 10 − After calculating the norms of vector x and vector s, check the reset condition as follows −
• If ||x|| / ||s|| < vigilance parameter ρ, then inhibit node J and go to step 7.
• Else, if ||x|| / ||s|| ≥ vigilance parameter ρ, then proceed further.
ART1 – Training Algorithm (Cont…)
Step 11 − Update the weights for the winning node J as follows −
biJ(new) = (α xi) / (α − 1 + ||x||)
tJi(new) = xi
Step 12 − The stopping condition for algorithm must be checked.
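The ART1 steps above can be sketched as follows. This is a minimal illustration, assuming binary input rows that each contain at least one 1; the slides' α is written L here, and the function name art1_train is ours:

```python
import numpy as np

def art1_train(X, m, rho=0.7, L=2.0, epochs=1):
    """ART1 sketch: X holds binary rows (each with at least one 1),
    m = max clusters, rho = vigilance, L > 1 (alpha in the slides)."""
    n = X.shape[1]
    b = np.full((n, m), L / (L - 1 + n) / 2.0)  # Step 1: 0 < b_ij(0) < L / (L - 1 + n)
    t = np.ones((m, n))                         #         t_ji(0) = 1
    for _ in range(epochs):                     # Steps 2-3: loop over training inputs
        for s in X:                             # Steps 4-5: present input, s_i = x_i
            inhibited = np.zeros(m, dtype=bool)
            while True:                         # Step 7: search while reset is true
                y = s @ b                       # Step 6: y_j = sum_i b_ij x_i
                y[inhibited] = -1.0
                J = int(np.argmax(y))           # Step 8: winner with largest y_J
                x = s * t[J]                    # Step 9: F1b activation x_i = s_i t_Ji
                if x.sum() / s.sum() >= rho:    # Step 10: vigilance test ||x|| / ||s||
                    b[:, J] = L * x / (L - 1 + x.sum())  # Step 11: b_iJ(new)
                    t[J] = x                    #          t_Ji(new) = x_i
                    break
                inhibited[J] = True             # reset: inhibit J and search again
                if inhibited.all():             # no admissible cluster left
                    break
    return b, t
```

Presenting two dissimilar binary patterns leads them to claim different clusters, since the vigilance test rejects a cluster whose stored pattern overlaps the input too little.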
Quiz - Questions
1. ART network is a --------- classifier.
a) vector b) scalar c) linear d) non-linear
2. What are the 3 phases of the main operation of ART?
3. Name the 2 units of ART architecture.
4. What are the 3 components of computational unit?
5. What is the full form of ART?
Quiz - Answers
1. ART network is a --------- classifier.
a) vector
2. What are the 3 phases of the main operation of ART?
i. recognition ii. comparison iii. search
3. Name the 2 units of ART architecture.
i. computational unit ii. supplement unit
4. What are the 3 components of computational unit?
i. input unit ii. cluster unit iii. reset mechanism
5. What is the full form of ART?
Adaptive Resonance Theory
Action Plan
• Unsupervised Learning Networks (Cont…)
- Introduction to Radial Basis Function (RBF) network
- RBF architecture
- Hidden neural model
- Gaussian RBF
- RBF network parameters
- RBF learning algorithms
• Quiz at the end of session
Radial Basis Function (RBF) Network
• A function is a radial basis function (RBF) if its output depends on (is a non-increasing function of) the distance of the input from a given stored vector.
• In the accompanying figure, the output at the red vector is "interpolated" using the three green vectors: each vector contributes according to its weight and its distance from the red point.
• The weights satisfy w1 < w3 < w2.
RBF - Architecture
• One hidden layer with RBF activation functions.
• Output layer with linear activation function.
Hidden Neuron Model
• Hidden units use radial basis functions.
• The output depends on the distance of the input x from the center t.
• t is called the center; σ is called the spread.
• Center and spread are the parameters of the hidden neuron.
Hidden Neurons
• A hidden neuron is more sensitive to data points near its center.
• For a Gaussian RBF this sensitivity may be tuned by adjusting the spread σ, where a larger spread implies less sensitivity.
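The spread's effect on sensitivity can be illustrated with a small Gaussian RBF helper (the names below are ours):

```python
import numpy as np

def gaussian_rbf(x, center, spread):
    """Gaussian RBF: phi(x) = exp(-||x - t||^2 / (2 * sigma^2))."""
    return float(np.exp(-np.sum((x - center) ** 2) / (2.0 * spread ** 2)))

c = np.array([0.0, 0.0])
x = np.array([1.0, 1.0])
narrow = gaussian_rbf(x, c, spread=0.5)  # strongly attenuated: the point looks far away
wide = gaussian_rbf(x, c, spread=5.0)    # close to 1: a wide neuron barely notices distance
```

The same point produces a much weaker response from the narrow neuron than from the wide one, which is exactly the "larger spread implies less sensitivity" statement.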
Types of Radial Basis Function φ
• Multiquadrics: φ(r) = √(r² + c²), c > 0
• Inverse multiquadrics: φ(r) = 1 / √(r² + c²), c > 0
• Gaussian functions (most used): φ(r) = exp(−r² / (2σ²)), σ > 0
Here r = ||x − t|| is the distance of the input from the center.
RBF Network Parameters
• What do we have to learn for an RBF NN with a given architecture?
- The centers of the RBF activation functions.
- The spreads of the Gaussian RBF activation functions.
- The weights from the hidden to the output layer.
• Different learning algorithms may be used for learning the RBF network parameters.
RBF - Learning Algorithm 1
• Centers are selected at random from the training set.
• Spreads are chosen by normalization: σ = dmax / √(2·m1), where dmax is the maximum distance between the chosen centers and m1 is the number of centers.
• The activation function of hidden neuron i then becomes:
φi(x) = exp(−(m1 / dmax²) ||x − ti||²)
• Weights are computed by means of the pseudo-inverse method.
Learning Algorithm 1 - Summary
1. Choose the centers randomly from the training set.
2. Compute the spread for the RBF function using the normalization method.
3. Find the weights using the pseudo-inverse method.
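The three-step summary can be sketched as follows, assuming a Gaussian RBF and the normalized spread σ = dmax/√(2·m1); the function name is ours:

```python
import numpy as np

def rbf_train_alg1(X, y, m1, seed=0):
    """Algorithm 1: random centers, normalized spread, pseudo-inverse weights."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=m1, replace=False)]  # 1. centers from training set
    d_max = max(np.linalg.norm(a - b) for a in centers for b in centers)
    spread = d_max / np.sqrt(2.0 * m1)                       # 2. sigma = d_max / sqrt(2 m1)
    # Design matrix of hidden activations phi_i(x) = exp(-(m1/d_max^2) ||x - t_i||^2)
    Phi = np.exp(-((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) / (2.0 * spread ** 2))
    w = np.linalg.pinv(Phi) @ y                              # 3. weights by pseudo-inverse
    return centers, spread, w
```

With m1 equal to the number of training points the network interpolates the targets exactly (e.g. the XOR pattern), since the Gaussian design matrix is then square and invertible.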
RBF - Learning Algorithm 2
Clustering algorithm for finding the centers:
1. Initialization: tk(0) random, k = 1, …, m1.
2. Sampling: draw a sample x from the input space.
3. Similarity matching: find the index k(x) of the center closest to x.
4. Updating: adjust the winning center toward x.
5. Continuation: increment the iteration counter n by 1, go to step 2, and continue until no noticeable changes of the centers occur.
Learning Algorithm 2 - Summary
Hybrid Learning Process:
• Clustering for finding the centers.
• Spreads chosen by normalization.
• LMS algorithm for finding the weights.
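The clustering part of this hybrid process can be sketched as a simple competitive update (the LMS weight step is omitted; the names, learning rate, and fixed iteration budget are ours):

```python
import numpy as np

def find_centers(X, m1, iters=200, eta=0.1, seed=0):
    """Steps 1-5: competitive clustering that drags centers toward sampled inputs."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=m1, replace=False)].copy()  # 1. t_k(0)
    for _ in range(iters):                                    # 5. continue for a fixed budget
        x = X[rng.integers(len(X))]                           # 2. sample x from the input space
        k = int(np.argmin(((centers - x) ** 2).sum(axis=1)))  # 3. closest center k(x)
        centers[k] += eta * (x - centers[k])                  # 4. move the winner toward x
    return centers
```

Because each update moves a center toward a data point, the centers always stay inside the convex hull of the training data.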
RBF - Learning Algorithm 3
• Apply the gradient descent method for finding the centers, spreads and weights, by minimizing the (instantaneous) squared error E = ½ (d − y(x))².
• Centers, spreads and weights are each updated along the negative gradient of E, typically with their own learning rates.
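One gradient step on the instantaneous squared error E = ½(d − y)² for a Gaussian RBF net can be sketched as follows; the three updates are the negative partial derivatives of E with respect to weights, centers and spreads (the learning rate and names are ours):

```python
import numpy as np

def rbf_sgd_step(x, d, centers, spreads, w, eta=0.05):
    """One gradient-descent update of all three parameter sets on
    E = 0.5 * (d - y)^2 for a Gaussian RBF network (updates in place)."""
    diff = x - centers                           # shape (m1, n)
    dist2 = (diff ** 2).sum(axis=1)              # ||x - t_i||^2
    phi = np.exp(-dist2 / (2.0 * spreads ** 2))  # hidden activations
    e = d - w @ phi                              # output error d - y
    grad_w = e * phi                             # -dE/dw_i
    grad_c = (e * w * phi / spreads ** 2)[:, None] * diff  # -dE/dt_i
    grad_s = e * w * phi * dist2 / spreads ** 3  # -dE/dsigma_i
    w += eta * grad_w
    centers += eta * grad_c
    spreads += eta * grad_s
    return e
```

Repeating the step on a single training pair drives the error toward zero, which is a quick sanity check on the gradient signs.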
Quiz - Questions
1. ---------- units use radial basis functions.
2. A hidden neuron is more sensitive to data points near its ---------.
3. What are the types of radial basis function φ?
4. By what means are the weights found in RBF learning algorithm 2?
5. How are the centers chosen in RBF learning algorithm 1?
Quiz - Answers
1. -------- units use radial basis functions.
Hidden
2. A hidden neuron is more sensitive to data points near its ---------.
center
3. What are the types of radial basis function φ?
i. multiquadrics ii. inverse multiquadrics iii. Gaussian functions
4. By what means are the weights found in RBF learning algorithm 2?
LMS algorithm
5. How are the centers chosen in RBF learning algorithm 1?
Centers are randomly chosen from the training set.
Action Plan
• Unsupervised Learning Networks (Cont…)
- Introduction to Counter Propagation (CP) network
- CP architecture
- CP outstar and instar
- CP Operation
• Quiz at the end of session
Counter Propagation Network
• The CP network consists of an input, a hidden and an output layer.
• The hidden layer is called the Kohonen layer and the output layer is called the Grossberg layer.
• For each input, a winning neuron is found in the Kohonen layer; its activation is set to 1 and the activation of all other neurons in that layer is set to 0.
Counter Propagation Network (Cont…)
Purpose:
• Fast and coarse approximation of a vector mapping y = f(x).
• Input vectors x are divided into clusters/classes.
• Each cluster of x has an output y, which is (hopefully) the average of f(x) for all x in that class.
Counter Propagation Network (Cont…)
1. Invented by Robert Hecht-Nielsen, founder of HNC Inc.
2. Consists of two opposing networks, one for learning a function, the other
for learning its inverse.
3. Each network has two layers:
- A Kohonen first layer that clusters inputs.
- An ‘outstar’ second layer to provide the output values for each cluster.
CP - Outstar and Instar
• An instar responds to a single input.
• An outstar produces a single (multi-dimensional) output d when stimulated with a binary value x.
• Biologically, outstar weights would be synaptic, while instar weights would be dendritic.
• It is common to refer to weights simply as 'synaptic'.
CP - Outstar and Instar (Cont…)
• Variations are possible by adding weights.
CP Operation
• An outstar neuron is associated with each cluster representative.
• Given an input, the winner is found.
• An outstar is then stimulated to give the output.
• Since these networks operate by recognizing input patterns in the first
layer, one would generally use lots of neurons in this layer.
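The operation described above can be sketched as a forward pass, assuming the weight matrices have already been trained (the function name is ours):

```python
import numpy as np

def cp_forward(x, kohonen_w, grossberg_w):
    """Forward pass: winner-take-all in the Kohonen layer, then the
    Grossberg (outstar) layer emits the winner's stored output vector."""
    J = int(np.argmin(((kohonen_w - x) ** 2).sum(axis=1)))  # closest cluster representative
    hidden = np.zeros(len(kohonen_w))
    hidden[J] = 1.0                     # winner activation 1, all others 0
    return grossberg_w.T @ hidden       # equivalent to grossberg_w[J]
```

Because the hidden activation is a one-hot vector, the output is simply the outstar weight vector stored for the winning cluster.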
Quiz - Questions
1. In CP network the hidden layer is called the --------- layer & the output
layer is called the --------- layer.
2. The activation of this winner neuron is set to 1 & the activation of all other
neurons in this layer is set to 0.
a) true b) false
3. ---------- vectors x are divided into clusters/classes.
a) input b) output c) outstar d) instar
4. CP network consists of two opposing networks, one for learning a ----------,
the other for learning its ----------.
5. Biologically, outstar would be -------- weights, while instar would have
---------- ones.
Quiz - Answers
1. Kohonen & Grossberg
2. a) true
3. a) input
4. function & inverse
5. synaptic & dendritic