KOHONEN SELF-ORGANIZING MAPS
History of Kohonen SOMs
 Developed in 1982 by Teuvo Kohonen, a professor
emeritus of the Academy of Finland
 Professor Kohonen worked on auto-associative
memory during the 70s and 80s and in 1982 he
presented his self-organizing map algorithm
History of Kohonen SOMs
 His work only became widely known much later, in
1988, when he presented a paper on “The Neural
Phonetic Typewriter” in IEEE Computer
 Since then many excellent papers and books have been
published on SOMs
What are self-organizing maps?
•SOMs are aptly named “Self-Organizing” because no
supervision is required.
•SOMs learn on their own through unsupervised
competitive learning.
•They attempt to map their weights to conform to
the given input data.
What are self-organizing maps?
•Thus SOMs are neural networks that employ unsupervised
learning methods, mapping their weights to conform to the
given input data with the goal of representing
multidimensional data in a simpler form that is easier
for the human eye to interpret (the pragmatic value of
representing complex data).
What are self-organizing maps?
•Training a SOM requires no target vector. A SOM
learns to classify the training data without any
external supervision.
The Architecture
•Made up of input nodes and computational
nodes.
•Each computational node is connected to each input
node to form a lattice.
The Architecture
•There are no interconnections among the
computational nodes.
•The number of input nodes is determined by the
dimensions of the input vector.
Representing Data
•Weight vectors are of the same dimension as the
input vectors. If the training data consists of
vectors, V, of n dimensions:
V1,V2,V3...Vn
•Then each node will contain a corresponding
weight vector W, of n dimensions:
W1,W2,W3...Wn
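As a minimal sketch (not from the slides), the lattice of weight vectors can be held in an array whose last axis has the same dimension n as the input vectors; the 10 x 10 grid size and the random initialization below are assumed values for illustration.

import numpy as np

rows, cols, n = 10, 10, 3             # assumed lattice size and input dimension
rng = np.random.default_rng(seed=42)

# One weight vector W = (W1, W2, ..., Wn) per computational node.
weights = rng.random((rows, cols, n))
print(weights.shape)                  # (10, 10, 3)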
A sample SOM
Terms used in SOMs
•Vector quantization - This is a data compression
technique. SOMs provide a way of representing
multidimensional data in a much lower
dimensional space; typically one or two
dimensions
Terms used in SOMs…
•Neighbourhood
•Output space
•Input space
EXPLANATION: How Kohonen SOMs work
The SOM Algorithm
•The Self-Organizing Map algorithm can be broken up
into 6 steps
•1). Each node's weights are initialized.
•2). A vector is chosen at random from the set of
training data and presented to the network.
EXPLANATION: The SOM Algorithm…
3). Every node in the network is examined to determine
which node's weights are most similar to the input vector.
The winning node is commonly known as the Best Matching
Unit (BMU).
EXPLANATION: The SOM Algorithm…
•4). The radius of the neighbourhood of the BMU is
calculated. This value starts large; typically it is set to
the radius of the network, and it diminishes each time-
step.
EXPLANATION: The SOM Algorithm…
•5). Any nodes found within the radius of the BMU,
calculated in 4), are adjusted to make them more like
the input vector (Equation 3a, 3b). The closer a node is
to the BMU, the more its weights are altered.
•6). Repeat from step 2) for N iterations.
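The six steps above can be sketched as a short training loop. This is a minimal assumed implementation rather than the exact procedure behind Equation 3a, 3b: the Gaussian neighbourhood weighting, the linear decay schedules, and all sizes and rates are common choices made up for illustration.

import numpy as np

def train_som(data, rows=10, cols=10, n_iter=1000, lr0=0.5):
    rng = np.random.default_rng(0)
    n = data.shape[1]
    weights = rng.random((rows, cols, n))            # 1) initialize each node's weights
    radius0 = max(rows, cols) / 2                    # initial neighbourhood radius
    # Grid coordinates of every node, used for neighbourhood distances.
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    for t in range(n_iter):
        x = data[rng.integers(len(data))]            # 2) random training vector
        dists = np.linalg.norm(weights - x, axis=2)  # 3) compare x to every node
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        frac = 1 - t / n_iter                        # linear decay over time
        radius = radius0 * frac                      # 4) shrinking neighbourhood radius
        lr = lr0 * frac                              # shrinking learning rate
        # 5) nodes inside the radius move towards x; closer nodes move the most.
        grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
        influence = np.exp(-grid_d2 / (2 * max(radius, 1e-3) ** 2))
        influence[grid_d2 > radius ** 2] = 0.0
        weights += lr * influence[..., None] * (x - weights)
    return weights                                   # 6) after N iterations

For example, train_som(np.random.rand(100, 3)) would organise 100 random 3-dimensional vectors on an assumed 10 x 10 lattice.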
Computation of scores
•The score for inclusion of an input instance with an
output node is the Euclidean distance between the
input vector and the node's weight vector:
Sqrt( Σ𝑖 (𝑛𝑖 − 𝑤𝑖𝑗)² )
•Thus to calculate the score for inclusion with
output node i:
Computation of scores…
•To calculate the score for inclusion with output
node j:
Sqrt((0.4 − 0.3)² + (0.7 − 0.6)²) ≈ 0.141
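In code, the score is simply the Euclidean distance between the input vector and a node's weight vector; the values below are taken from the worked example (input 0.4, 0.7 against node j's weights 0.3, 0.6).

import math

def score(n_vec, w_vec):
    # Euclidean distance between the input vector and a node's weight vector.
    return math.sqrt(sum((n - w) ** 2 for n, w in zip(n_vec, w_vec)))

print(round(score((0.4, 0.7), (0.3, 0.6)), 3))   # 0.141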
The Winning Node
•Node j becomes the winning node since it has the
lowest score.
•This implies that its weight vector values are similar
to the input values of the presented instance.
•i.e. the value of node j is closest to the input vector.
•As a result, the weight vectors associated with the
winning node are adjusted so as to reward the node
for winning the instance.
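The slides do not state the adjustment formula itself, so the snippet below uses the standard Kohonen update as an assumed stand-in: each weight of the winning node is moved a fraction (the learning rate) of the way towards the corresponding input value.

def update_winner(w, x, lr=0.5):
    # w_new = w + lr * (x - w): move each weight towards the input value.
    return [wi + lr * (xi - wi) for wi, xi in zip(w, x)]

# Node j (weights 0.3, 0.6) is rewarded for winning the instance (0.4, 0.7).
print(update_winner([0.3, 0.6], [0.4, 0.7]))     # approximately [0.35, 0.65]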
Concluding the tests
• Both the neighbourhood size and the learning rate are decreased linearly over the
span of several iterations, and training terminates when instance classifications do
not vary from one iteration to the next
• Finally the clusters formed by the training or test data are analysed in order
to determine what has been discovered
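A brief sketch of that stopping behaviour, with assumed starting values: both quantities shrink linearly each iteration, and training stops once no instance changes its winning node between passes.

import numpy as np

def decayed(initial, t, n_iter):
    # Linear decay from the initial value towards zero over n_iter iterations.
    return initial * (1 - t / n_iter)

def converged(prev_assignments, assignments):
    # True when no instance changed its winning node since the last iteration.
    return np.array_equal(prev_assignments, assignments)

print(decayed(5.0, t=50, n_iter=100))   # a radius of 5 is halved after 50 of 100 iterations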
NEIGHBORHOOD ADJUSTMENTS
• After adjusting the weights of the winning node, the neighbourhood nodes
also have their weights adjusted using the same formula
• A neighbourhood is typified by a square grid with the centre of the grid
containing the winning node.
• The size of the neighbourhood as well as the learning rate r is specified when
training begins
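A sketch of that square neighbourhood, under the same assumptions as before: every node whose grid position falls inside the window of (integer) radius d around the winning node receives the same update as the winner.

import numpy as np

def update_neighbourhood(weights, bmu, x, d, lr):
    # weights: (rows, cols, n) grid of weight vectors; bmu: (row, col) of the winner;
    # d: integer neighbourhood radius in grid cells; lr: learning rate r.
    rows, cols, _ = weights.shape
    r0, r1 = max(bmu[0] - d, 0), min(bmu[0] + d + 1, rows)
    c0, c1 = max(bmu[1] - d, 0), min(bmu[1] + d + 1, cols)
    # Apply the same formula used for the winning node to the whole window.
    weights[r0:r1, c0:c1] += lr * (x - weights[r0:r1, c0:c1])
    return weights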
ILLUSTRATION: A Color Classifier
•Problem: Group and represent the primary colors and
their corresponding shades on a two dimensional plane.
A Color classifier: Sample Data
•The colors are represented in their RGB values to
form 3-dimensional vectors.
A Color classifier: Node Weighting
•Each node is characterized by:
•Data of the same dimensions as the sample
vectors
•An X,Y position
A Color Classifier: The algorithm
Initialize Map
Radius = d
Learning rate = r
For 1 to N iterations
    Randomly select a sample
    Get best matching unit
    Scale neighbors
    Adjust d, r appropriately
End for
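A runnable version of this pseudocode for the colour case. Everything concrete here is an assumption for illustration: the sample colours use the 0-6 RGB scale from the later slides, the map is 40 x 40 as in the sample output, the neighbourhood is a square window of radius d, and d and r decay linearly.

import numpy as np

# Sample colours on the assumed 0-6 RGB scale: red, green, blue, light green.
samples = np.array([[6, 0, 0], [0, 6, 0], [0, 0, 6], [3, 6, 3]], dtype=float)

rng = np.random.default_rng(1)
size, n_iter = 40, 2000
som = rng.uniform(0, 6, (size, size, 3))                   # Initialize Map
d, r = size // 2, 0.5                                      # Radius = d, Learning rate = r

for t in range(n_iter):                                    # For 1 to N iterations
    x = samples[rng.integers(len(samples))]                # Randomly select a sample
    bmu = np.unravel_index(                                # Get best matching unit
        np.argmin(np.linalg.norm(som - x, axis=2)), (size, size))
    frac = 1 - t / n_iter                                  # Adjust d, r appropriately
    radius, lr = max(int(d * frac), 1), r * frac
    r0, r1 = max(bmu[0] - radius, 0), min(bmu[0] + radius + 1, size)
    c0, c1 = max(bmu[1] - radius, 0), min(bmu[1] + radius + 1, size)
    som[r0:r1, c0:c1] += lr * (x - som[r0:r1, c0:c1])      # Scale neighbors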
A Color classifier: The Layout
A Color classifier: Getting a winner
•Go through all the weight vectors and calculate the
Euclidean distance from each weight to the chosen
sample vector
•Assume the RGB values are represented by the
values 0 – 6 depending on their intensity.
•i.e. Red = (6, 0, 0); Green = (0, 6, 0); Blue = (0, 0, 6)
A Color classifier: Getting a winner…
•If we have colour green as the sample input instance, a
node representing the colour light green (3, 6, 3) will be
closer to green than a node representing red.
• Light green = Sqrt((3-0)^2+(6-6)^2+(3-0)^2) = 4.24
Red = Sqrt((6-0)^2+(0-6)^2+(0-0)^2) = 8.49
A COLOR CLASSIFIER: DETERMINING THE
NEIGHBORHOOD
• Since a node has an X-Y position,
its neighbors can be easily
determined based on their radial
distance from the BMU's
coordinates.
A COLOR CLASSIFIER: DETERMINING THE
NEIGHBORHOOD…
• The area of the neighbourhood shrinks over time with
each iteration.
A Color classifier: Learning
•Every node within the BMU's neighbourhood (including
the BMU) has its weight vector adjusted according to a
pre-determined equation
•The learning rate is decayed over time.
A Color classifier: Learning
•The effect of learning should be proportional to the
distance a node is from the BMU.
•A Gaussian function can be used to achieve this, whereby
the closest neighbors are adjusted the most to be more
like the input vector.
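A sketch of that Gaussian weighting with an assumed radius: the resulting influence factor multiplies the learning-rate term, so the closest neighbours move almost as much as the BMU while distant ones barely move.

import math

def gaussian_influence(dist_to_bmu, radius):
    # Influence decays with grid distance from the BMU; the radius acts as sigma.
    return math.exp(-(dist_to_bmu ** 2) / (2 * radius ** 2))

# With an assumed neighbourhood radius of 3 grid cells:
for dist in (0, 1, 2, 3):
    print(dist, round(gaussian_influence(dist, radius=3), 3))
# prints 1.0, 0.946, 0.801, 0.607 for distances 0 through 3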
A Color classifier: Sample output
• Classification on a 40 x 40 SOM
Current Applications
• WEBSOM: Organization of a Massive Document
Collection
Current Applications…
•Classifying World Poverty
Current Applications…
•Phonetic Typewriter
