Neural Networks Learning: Supervised vs Unsupervised

  1. Supervised versus Unsupervised Learning
     Supervised
        Labelled data.
        Difficult to justify biologically.
        Doesn't fit all situations.
     Unsupervised
        Input environment only.
  2. Self Organising Neural Networks
     The basic design of an Unsupervised Network
     Unsupervised Learning
     Geometric Interpretation
     What they Learn
     Problems with Self Organising Neural Networks
     Statistical Views
     Origins: Rosenblatt's "spontaneous learning" in perceptrons.
     Important work by Fukushima, Grossberg, Kohonen, von der Malsburg, Willshaw.
  3. No Teachers
     Learn about regularities in the environment:
     Recognition — familiarity to previous inputs
     Classification — clustering
     Feature Mapping — topographic mappings
     Encoding — dimensionality reduction — data compression
     What determines what is learnt?
  4. Example
     von der Malsburg, C. (1973). Self-organisation of orientation sensitive cells in the striate cortex. Kybernetik, 14: 85-100.
     (Figure panels: Environment; Initially random; Orientation tuned units.)
  5. Basic Requirements for Unsupervised Networks
     Rumelhart, D. and Zipser, D. (1985). Feature discovery by competitive learning. Cognitive Science, 9: 75-112.
     1. Input units or input lines.
     2. Response units. Number of units. Units not all the same.
     3. Limit the strength of units: weight normalisation.
        (Figure: weight values, 0 to 1, plotted over 2000 input patterns.)
     4. Allow the units to compete: "winner take all".
     5. Learning.
  6. Learning in the Rumelhart and Zipser Network
     Winning unit learns.
     Weights become more like input patterns (classification).
     Normalisation by weight redistribution:

        \Delta w_{ij} = 0                                        if unit j loses on stimulus k
        \Delta w_{ij} = \alpha\,c_{ik}/n_k - \alpha\,w_{ij}      if unit j wins on stimulus k

     c_{ik} = 1 (0) if input line i is (in)active on pattern k.
     n_k = number of inputs active for pattern k (n_k = \sum_i c_{ik}).
     \alpha is the learning constant.
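As a concrete illustration, the update above can be written in a few lines of NumPy. This is a minimal sketch, assuming binary input patterns and row-normalised weights; the function name rz_step, the learning rate, and the dipole demo at the end are illustrative choices rather than anything specified in the slides.

```python
import numpy as np

def rz_step(W, x, alpha=0.05):
    """One competitive-learning step in the style of Rumelhart and Zipser.

    W : (n_units, n_inputs) weight matrix, each row summing to 1.
    x : (n_inputs,) binary pattern, 1 for an active input line.
    """
    winner = int(np.argmax(W @ x))      # unit receiving the most input wins
    n_k = x.sum()                       # number of active lines on this pattern
    # Losers are unchanged; the winner shifts weight onto the active lines
    # while its total weight stays at 1 (loss = gain, see the next slide).
    W[winner] += alpha * x / n_k - alpha * W[winner]
    return winner

# Tiny demo: 2 units, 16 lines, dipole inputs (two neighbouring lines active).
rng = np.random.default_rng(0)
W = rng.random((2, 16))
W /= W.sum(axis=1, keepdims=True)
for _ in range(2000):
    x = np.zeros(16)
    i = rng.integers(15)
    x[[i, i + 1]] = 1.0
    rz_step(W, x)
print(W.sum(axis=1))                    # each row still sums to ~1
```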
  7. Example of Weight Redistribution
     16 inputs; for each stimulus k assume n_k = 8 inputs are active.
     Assume that for each output unit j the weights are initially normalised: \sum_{i=1}^{16} w_{ij} = 1. Then

        \Delta w_{ij} = \frac{1}{8}\alpha - \alpha w_{ij}    if j wins and line i is ON
        \Delta w_{ij} = -\alpha w_{ij}                       if j wins and line i is OFF

     All weights of the winning unit are decremented by \alpha w_{ij}; the total weight deducted from all lines is \sum_i \alpha w_{ij}.
     Since \sum_i w_{ij} = 1, loss = total deducted from the winning unit's weights = \alpha.
     Each weight on an active line is incremented by \frac{1}{8}\alpha; gain = total weight added = 8 \times \frac{1}{8}\alpha = \alpha.
     Loss = gain, so there is no net change in total weight.
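The loss = gain bookkeeping can be verified numerically; the value of alpha and the particular stimulus below are arbitrary.

```python
import numpy as np

alpha = 0.1
w = np.random.default_rng(1).dirichlet(np.ones(16))   # weights of unit j, summing to 1
x = np.zeros(16); x[:8] = 1.0                          # stimulus with n_k = 8 active lines

dw = alpha * x / 8 - alpha * w                         # update for the winning unit
print(dw.sum())                                        # ~0: weight gained = weight lost
```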
  8. Network Summary
     (Figure.)
  9. Example
     2 classification units — binary classification.
     16 input lines.
     Dipole input (2 of 16 neighbouring inputs active).
     (Figure: weights learned by unit 1 and unit 2.)
     Also discovers horizontal and diagonal divisions; similar result in 3-d.
     The system discovers spatial structure that is not built into the architecture.
  10. Geometric representation of Learning
  11. Problems with "Competitive Learning"
      How many units?
      Is normalisation biologically plausible?
      The problem of dead units? Two remedies:
      1. Leaky learning (sketched below)
      2. Conscience mechanism
      Not a magic technique; cf. the horizontal/vertical line task (Rumelhart & Zipser, 1985).
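Leaky learning is a small change to the winner-take-all step: losing units also move toward the pattern, but at a much smaller rate, so no unit is starved of wins forever. A sketch under the same assumptions as the earlier rz_step (binary inputs, row-normalised weights); the two rate values are illustrative.

```python
import numpy as np

def leaky_step(W, x, alpha_win=0.05, alpha_lose=0.005):
    """Competitive learning with leaky losers: every unit moves toward the
    pattern, the winner quickly and the losers slowly, so none stays 'dead'."""
    winner = int(np.argmax(W @ x))
    rates = np.full(len(W), alpha_lose)   # losers learn slowly...
    rates[winner] = alpha_win             # ...the winner learns quickly
    W += rates[:, None] * (x / x.sum() - W)
    return winner
```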
  12. Competitive Learning
      (Figure.)
      Input space is divided up – units learn about a subset of the input patterns.
      Input space is broken into groups of maximum similarity: cluster analysis.
      Two sources of competition:
      1. Winner-take-all mechanism
      2. Resource limitation (normalisation)
  13. Statistical Views
      (Figure: a single output unit y with weights w_1, w_2, ..., w_i, ..., w_N from inputs x_1, x_2, ..., x_i, ..., x_N.)
      Simple Hebbian learning:
         \frac{dw_i}{dt} = \alpha x_i y
      Linear activation function:
         y = \sum_j w_j x_j = \mathbf{w} \cdot \mathbf{x}
      Then
         \frac{dw_i}{dt} = \alpha x_i \sum_j w_j x_j = \alpha \sum_j x_i x_j w_j
  14. Correlation matrix
      Taking the ensemble average over patterns (weights change slowly):
         \langle \frac{dw_i}{dt} \rangle = \alpha \langle \sum_j x_i x_j w_j \rangle = \alpha \sum_j \langle x_i x_j \rangle w_j = \alpha \sum_j C_{ij} w_j
      i.e.
         \frac{d\mathbf{w}}{dt} = \alpha C \mathbf{w}
      where C is the correlation matrix: C_{ij} = \langle x_i x_j \rangle.
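A quick numerical check of this step: averaging the Hebbian updates \alpha x_i y over a set of random patterns gives the same result as multiplying the weights by the correlation matrix. The pattern count and dimensionality below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))           # 1000 patterns x, each of dimension N = 5
w = rng.normal(size=5)
alpha = 0.01

C = (X[:, :, None] * X[:, None, :]).mean(axis=0)          # C_ij = <x_i x_j>
avg_update = alpha * (X * (X @ w)[:, None]).mean(axis=0)   # <alpha * x * y> with y = w.x
print(np.allclose(avg_update, alpha * C @ w))              # True: <dw/dt> = alpha C w
```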
  15. Eigenvectors & Eigenvalues
      A vector x viewed as a point in N-dimensional space (e.g. x = (1, 1, 1.5)).
      (Figure: the point x = 1, y = 1, z = 1.5.)
      A matrix as a linear transformation.
      Eigenvectors/values:
         A\mathbf{v} = \lambda\mathbf{v}
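A concrete instance of the eigenvector equation, using an arbitrary 2x2 symmetric matrix and NumPy's eigh:

```python
import numpy as np

C = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # a symmetric (correlation-like) matrix
eigvals, eigvecs = np.linalg.eigh(C)       # eigen-decomposition of a symmetric matrix
v = eigvecs[:, -1]                         # eigenvector with the largest eigenvalue
print(eigvals)                             # [1. 3.]
print(np.allclose(C @ v, eigvals[-1] * v)) # True: C v = lambda v
```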
  16. Unconstrained Hebbian Learning
         \frac{d\mathbf{w}}{dt} = \alpha C \mathbf{w}
      Over a large number of patterns the eigenvector with the largest eigenvalue will be the dominant influence on the weight change.
      Weights change fastest in the direction of the eigenvector with the largest eigenvalue, so the weights tend to the principal component of the data.
      Solutions to unbounded weights:
         Explicit normalisation.
         Oja-type rule (new terms).
         Simple weight decay.
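A short simulation of the unconstrained rule on synthetic correlated data: the direction of w converges to the maximal eigenvector of C while its length grows without bound. The data-generating matrix, learning rate, and pattern count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D inputs: most variance lies along the direction (1, 1)/sqrt(2).
X = rng.normal(size=(5000, 2)) @ np.array([[2.0, 1.5], [1.5, 2.0]])
C = X.T @ X / len(X)                      # correlation matrix <x x^T>
top = np.linalg.eigh(C)[1][:, -1]         # its maximal eigenvector

w = rng.normal(size=2) * 0.01
alpha = 0.001
for x in X:                               # plain Hebbian updates: dw = alpha * x * y
    w += alpha * x * (x @ w)

print(np.linalg.norm(w))                  # very large: the weights have blown up
print(abs(w / np.linalg.norm(w) @ top))   # ~1: direction = maximal eigenvector
```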
  17. Principal Components
      Find principal components: the principal component of the data = the maximal eigenvector of the covariance matrix of the data.
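The same statement as a numerical check on synthetic data: the top eigenvector of the covariance matrix gives a larger projected variance than random directions. The data and the number of random directions tried are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 3)) @ rng.normal(size=(3, 3))   # correlated 3-D data
X -= X.mean(axis=0)                                        # centre the data

cov = np.cov(X, rowvar=False)                              # 3x3 covariance matrix
pc1 = np.linalg.eigh(cov)[1][:, -1]                        # maximal eigenvector

var_pc1 = np.var(X @ pc1)                                  # variance along pc1
for _ in range(5):
    d = rng.normal(size=3); d /= np.linalg.norm(d)         # a random unit direction
    assert np.var(X @ d) <= var_pc1 + 1e-9                 # pc1 maximises projected variance
print(var_pc1)
```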
  18. Oja rule
      Simple Hebbian learning is unstable; weights grow without limit:
         \frac{dw_i}{dt} = \alpha x_i y
      The Oja rule adds a weight decay term:
         \frac{dw_i}{dt} = \alpha y (x_i - y w_i)
      Several properties (p. 202, Hertz et al., 1991):
      1. |\mathbf{w}| tends to 1.
      2. \mathbf{w} is the maximal eigenvector of C.
      3. The variance of the output, \langle y^2 \rangle, is maximised by \mathbf{w}.
      Decorrelate the output units (via lateral inhibitory connections) to get the other components (Sanger).
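A simulation sketch of the Oja rule on correlated 2-D data, illustrating properties 1 and 2: the weight norm settles near 1 and w lines up with the maximal eigenvector of C. The learning rate and pattern count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20000, 2)) @ np.array([[2.0, 1.5], [1.5, 2.0]])
C = X.T @ X / len(X)
top = np.linalg.eigh(C)[1][:, -1]          # maximal eigenvector of C

w = rng.normal(size=2) * 0.1
alpha = 0.002
for x in X:
    y = x @ w
    w += alpha * y * (x - y * w)           # Oja: Hebbian term minus y^2 weight decay

print(np.linalg.norm(w))                   # ~1   (property 1)
print(abs(w @ top))                        # ~1   (property 2: w aligns with the maximal eigenvector)
```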
  19. Correlation matrices and eigenvectors
      Given the simple rule (ignoring \alpha):
         \Delta \mathbf{w} = C \mathbf{w}
      \mathbf{w} can be rewritten in terms of the eigenvectors \mathbf{e}_i of C with eigenvalues \lambda_i:
         \mathbf{w} = a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + \dots + a_n \mathbf{e}_n,   where a_i = \mathbf{w} \cdot \mathbf{e}_i
      Then
         \Delta \mathbf{w} = C (a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + \dots + a_n \mathbf{e}_n)
      But since C \mathbf{e}_i = \lambda_i \mathbf{e}_i:
         \Delta \mathbf{w} = a_1 \lambda_1 \mathbf{e}_1 + a_2 \lambda_2 \mathbf{e}_2 + \dots + a_n \lambda_n \mathbf{e}_n
      So the weight derivative grows mostly in the direction of the eigenvector \mathbf{e}_m with the largest eigenvalue \lambda_m.
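This expansion can be checked directly: expand w in the eigenvectors of an arbitrary symmetric C, scale each coefficient by its eigenvalue, and compare with C w.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
C = A @ A.T                                 # an arbitrary symmetric 'correlation' matrix
w = rng.normal(size=4)

lam, E = np.linalg.eigh(C)                  # eigenvalues lam_i, eigenvectors as columns of E
a = E.T @ w                                 # coefficients a_i = w . e_i
dw = (a * lam) @ E.T                        # sum_i a_i lam_i e_i
print(np.allclose(dw, C @ w))               # True
```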
  20. Summary
      No external teacher is needed.
      Competition arises from "winner take all" and weight normalisation.
      Discovers principal features of the input environment.
      Output units have maximal variance.
