
Published in: Education



  • 1. Supervised versus Unsupervised Learning. Supervised: labelled data; difficult to justify biologically; doesn’t fit all situations. Unsupervised: input environment only.
  • 2. Self Organising Neural Networks: the basic design of an unsupervised network; unsupervised learning; geometric interpretation; what they learn; problems with self organising neural networks; statistical views. Origins: Rosenblatt’s “spontaneous learning” in perceptrons. Important work by Fukushima, Grossberg, Kohonen, von der Malsburg, Willshaw.
  • 3. No Teachers. Learn about regularities in the environment. Recognition: familiarity to previous inputs. Classification: clustering. Feature mapping: topographic mappings. Encoding: dimensionality reduction, data compression. What determines what is learnt?
  • 4. Example. von der Malsburg, C. (1973). Self-organisation of orientation sensitive cells in the striate cortex. Kybernetik, 14: 85–100. Environment. Initially random. Orientation tuned units.
  • 5. Basic Requirements for Unsupervised Networks. Rumelhart, D. and Zipser, D. (1985). Feature discovery by competitive learning. Cognitive Science, 9: 75–112. 1. Input units or input lines. 2. Response units: number of units; units not all the same. 3. Limit the strength of units: weight normalisation. [Figure: example weight values for an input pattern.] 4. Allow the units to compete: “winner take all”. 5. Learning.
  • 6. Learning in the Rumelhart and Zipser Network. Winning unit learns. Weights become more like input patterns (classification). Normalisation by weight redistribution: $\Delta w_{ij} = 0$ if unit $j$ loses on stimulus $k$; $\Delta w_{ij} = \alpha \frac{c_{ik}}{n_k} - \alpha w_{ij}$ if unit $j$ wins on stimulus $k$. $c_{ik}$ = 1 (0) if input $i$ is (in)active on pattern $k$. $n_k$ = number of inputs active for pattern $k$ ($n_k = \sum_i c_{ik}$). $\alpha$ is the learning constant.
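The update on this slide can be sketched as a single winner-take-all step. This is a minimal illustration rather than the original implementation: the function name `competitive_step`, the choice of the weighted sum over active lines as each unit's response, and the example weights are all assumptions made for the sketch.

```python
def competitive_step(weights, pattern, alpha=0.1):
    """Apply one winner-take-all update in place and return the winner's index.

    weights[j][i] is the weight from input line i to response unit j;
    pattern[i] is 1 if input line i is active and 0 otherwise.
    """
    n_active = sum(pattern)  # n_k: number of active input lines
    # Assumed response: weighted sum of the active input lines.
    responses = [sum(w * c for w, c in zip(unit, pattern)) for unit in weights]
    winner = max(range(len(weights)), key=lambda j: responses[j])
    # Only the winner learns: delta_w = alpha * c / n_k - alpha * w.
    weights[winner] = [w + alpha * c / n_active - alpha * w
                       for w, c in zip(weights[winner], pattern)]
    return winner


# Hypothetical starting weights, each row normalised to sum to 1.
weights = [[0.4, 0.3, 0.2, 0.1],
           [0.1, 0.2, 0.3, 0.4]]
winner = competitive_step(weights, [1, 1, 0, 0])
print(winner, [round(w, 2) for w in weights[winner]])
```

The winner's weights move towards the active lines while their sum stays at 1; the losing unit is left untouched.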
  • 7. Example of Weight Redistribution. 16 inputs; for each stimulus assume 8 inputs are active. Assume that for each output unit $j$, weights are initially normalised: $\sum_{i=1}^{16} w_{ij} = 1$. Then $\Delta w_{ij} = \frac{1}{8}\alpha - \alpha w_{ij}$ if $j$ wins and input $i$ is ON; $\Delta w_{ij} = -\alpha w_{ij}$ if $j$ wins and input $i$ is OFF. All weights for the winning unit are decremented by $\alpha w_{ij}$. Total weight from all lines decremented: $\sum_i \alpha w_{ij}$. Since $\sum_i w_{ij} = 1$, loss = total deducted from all weights on winning unit = $\alpha$. Each weight on an active line is incremented by $\frac{1}{8}\alpha$; gain = total amount of weight added = $8 \cdot \frac{1}{8}\alpha = \alpha$. loss = gain, so no net change in weight.
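The arithmetic on this slide can be checked exactly with rational numbers. The sketch below assumes an arbitrary learning constant of 1/10 and an arbitrary (non-uniform) set of normalised starting weights; neither value comes from the slides.

```python
from fractions import Fraction

alpha = Fraction(1, 10)            # learning constant (illustrative value)
n_inputs, n_active = 16, 8
# Arbitrary normalised weights: 1/136, 2/136, ..., 16/136 sum to exactly 1.
weights = [Fraction(i + 1, 136) for i in range(n_inputs)]
pattern = [1] * n_active + [0] * (n_inputs - n_active)

# Every line of the winning unit loses alpha * w.
loss = sum(alpha * w for w in weights)
# Every active line gains alpha / 8.
gain = sum(alpha * Fraction(c, n_active) for c in pattern)
new_weights = [w + alpha * Fraction(c, n_active) - alpha * w
               for w, c in zip(weights, pattern)]

print(loss, gain, sum(new_weights))  # prints: 1/10 1/10 1
```

Loss and gain both equal the learning constant, so the total weight on the winning unit is unchanged.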
  • 8. Network Summary
  • 9. Example. 2 classification units (binary classification). 16 input lines. Dipole input (2/16 neighbouring inputs active). [Figure: weights learned by unit 1 and unit 2.] Also discovers horizontal, diagonal divisions; similar result in 3-d. System discovers spatial structure, not in architecture.
  • 10. Geometric representation of Learning
  • 11. Problems with “Competitive Learning”. How many units? Normalisation: biological? Problem of dead units? 1. Leaky learning. 2. Conscience mechanism. Not a magic technique; cf. horizontal/vertical line task (Rumelhart & Zipser, 1985).
  • 12. Competitive Learning. Input space is divided up: units learn about a subset of the input patterns. Input space broken into groups of maximum similarity. Cluster analysis. Two sources of competition: 1. Winner-take-all mechanism. 2. Resource limitation (normalisation).
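Leaky learning, one of the two remedies for dead units named on this slide, can be sketched as follows: losing units also move towards the input, but with a much smaller learning constant. The function name `leaky_step`, the two rate values, and the example weights are assumptions for illustration only.

```python
def leaky_step(weights, pattern, alpha_win=0.1, alpha_lose=0.001):
    """One competitive update in which losers learn too, at a much smaller rate."""
    n_active = sum(pattern)
    # Assumed response: weighted sum of the active input lines.
    responses = [sum(w * c for w, c in zip(unit, pattern)) for unit in weights]
    winner = max(range(len(weights)), key=lambda j: responses[j])
    for j, unit in enumerate(weights):
        rate = alpha_win if j == winner else alpha_lose
        # Same redistribution rule for every unit, so each row stays normalised.
        weights[j] = [w + rate * (c / n_active - w) for w, c in zip(unit, pattern)]
    return winner


# Hypothetical weights: unit 1 does not respond at all to the pattern below.
weights = [[0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]
winner = leaky_step(weights, [1, 1, 0, 0])
print(winner, weights[1])
```

The losing unit drifts slowly towards the region of input space where patterns occur, so it can eventually start to win.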
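  • 13. Statistical Views. [Figure: a single unit with output $y$, inputs $x_1, \dots, x_N$ and weights $w_1, \dots, w_N$.] Simple Hebbian learning: $\frac{d\mathbf{w}}{dt} = \alpha \mathbf{x} y$. Linear activation function: $y = \mathbf{w}^T \mathbf{x} = \mathbf{w} \cdot \mathbf{x}$. Then $\frac{d\mathbf{w}}{dt} = \alpha \mathbf{x} (\mathbf{w}^T \mathbf{x}) = \alpha \mathbf{x}\mathbf{x}^T \mathbf{w}$.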
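  • 14. Correlation matrix. Ensemble average and slow changing weights: $\langle \dot{\mathbf{w}} \rangle = \alpha \langle \mathbf{x}\mathbf{x}^T \mathbf{w} \rangle = \alpha \langle \mathbf{x}\mathbf{x}^T \rangle \mathbf{w} = \alpha C \mathbf{w}$, where $C = \langle \mathbf{x}\mathbf{x}^T \rangle$ is the correlation matrix.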
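A small numerical check of this step: averaging the per-pattern Hebbian updates, with the weights held fixed, gives the same vector as multiplying the weights by the correlation matrix. The patterns, weights, and learning constant below are made-up values, and the symbol `C` for the correlation matrix is an assumption of this sketch.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def correlation_matrix(patterns):
    """Average of the outer products x x^T over all patterns."""
    n = len(patterns[0])
    return [[sum(x[i] * x[j] for x in patterns) / len(patterns)
             for j in range(n)] for i in range(n)]


patterns = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [2.0, 1.0]]  # illustrative data
w = [0.5, -0.2]
alpha = 0.1

# Average of the Hebbian updates alpha * x * y, with y = w . x and w held fixed.
avg_hebb = [sum(alpha * x[i] * dot(w, x) for x in patterns) / len(patterns)
            for i in range(len(w))]

C = correlation_matrix(patterns)
via_C = [alpha * dot(row, w) for row in C]   # alpha * C w

print(C)
print(avg_hebb, via_C)
```

The two vectors agree up to floating-point rounding, which is the content of the "slow changing weights" approximation.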
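  • 15. Eigenvectors & Eigenvalues. Vector $\mathbf{x}$ viewed as a point in N-dimensional space (e.g. $\mathbf{x} = (1, 1, 1.5)$). [Figure: the point x=1, y=1, z=1.5.] A matrix as a linear transformation. [Figure: a vector $\mathbf{v}$ mapped to $\mathbf{q}$.] Eigenvectors/values.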
  • 16. Unconstrained Hebbian Learning. $\dot{\mathbf{w}} = \alpha C \mathbf{w}$. Over a large number of patterns the eigenvector with the largest eigenvalue will be the dominant influence in weight change. Weights change fastest in the direction of the eigenvector with the largest eigenvalue. So weights tend to the principal component of the data. Solutions to unbounded weights: explicit normalisation; Oja type rule (new terms); simple weight decay.
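Both claims on this slide (the direction converges, the length does not) can be seen in a tiny simulation of the averaged rule. The 2x2 matrix, step size, and starting weights below are illustrative assumptions; the matrix was chosen so that its eigenvalues are 3 and 1 with principal eigenvector (1, 1)/sqrt(2).

```python
import math

C = [[2.0, 1.0],
     [1.0, 2.0]]   # assumed correlation matrix: eigenvalues 3 and 1
alpha = 0.1
w = [1.0, 0.0]

for _ in range(100):
    # Discrete version of the averaged Hebbian rule: w <- w + alpha * C w.
    Cw = [sum(c * wi for c, wi in zip(row, w)) for row in C]
    w = [wi + alpha * d for wi, d in zip(w, Cw)]

norm = math.hypot(w[0], w[1])
# Cosine of the angle between w and the principal eigenvector (1, 1)/sqrt(2).
cosine = (w[0] + w[1]) / (math.sqrt(2.0) * norm)
print(norm, cosine)
```

The cosine ends up indistinguishable from 1 while the norm grows without bound, which is why one of the listed remedies is needed.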
  • 17. Principal Components Find principal components: Principal component of data = maximal eigenvector of the covariance matrix of the data.
  • 18. Oja rule. Simple Hebbian learning is unstable, weights grow without limits: $\frac{d\mathbf{w}}{dt} = \alpha \mathbf{x} y$. Oja rule adds weight decay term: $\frac{d\mathbf{w}}{dt} = \alpha y (\mathbf{x} - y \mathbf{w})$. Several properties (p202, Hertz et al., 1991): 1. $|\mathbf{w}|$ tends to 1. 2. $\mathbf{w}$ is maximal eigenvector of $C$. 3. Variance of the output, $\langle y^2 \rangle$, is maximised by $\mathbf{w}$. Decorrelate output units (via lateral inhibitory connections) to get other components (Sanger).
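The first two properties can be illustrated with a short simulation of the Oja rule on synthetic data. Everything specific here is an assumption for the sketch: the zero-mean two-dimensional inputs (built so that most of the variance lies along (1, 1)), the learning constant, the number of steps, and the random seed.

```python
import math
import random

rng = random.Random(0)   # fixed seed so the run is repeatable
alpha = 0.005            # learning constant (illustrative value)
w = [1.0, 0.0]

for _ in range(20000):
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x = [a + 0.3 * b, a - 0.3 * b]      # most variance lies along (1, 1)
    y = w[0] * x[0] + w[1] * x[1]       # linear unit: y = w . x
    # Oja rule: Hebbian term alpha * y * x plus decay term -alpha * y^2 * w.
    w = [wi + alpha * y * (xi - y * wi) for wi, xi in zip(w, x)]

norm = math.hypot(w[0], w[1])
# Cosine of the angle between w and the principal direction (1, 1)/sqrt(2).
cosine = (w[0] + w[1]) / (math.sqrt(2.0) * norm)
print(round(norm, 2), round(abs(cosine), 2))
```

Unlike the plain Hebbian rule, the weight vector stays close to unit length while lining up with the direction of largest variance.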
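  • 19. Correlation matrices and eigenvectors. Given the simple rule $\dot{\mathbf{w}} = C\mathbf{w}$ (ignore $\alpha$), $\mathbf{w}$ can be rewritten in terms of the eigenvectors ($\mathbf{e}_i$) of $C$ with eigenvalues $\lambda_i$: $\mathbf{w} = a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + \dots + a_n \mathbf{e}_n$, where $a_i = \mathbf{w} \cdot \mathbf{e}_i$. Then $\dot{\mathbf{w}} = C(a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + \dots + a_n \mathbf{e}_n)$. But since $C\mathbf{e}_i = \lambda_i \mathbf{e}_i$: $\dot{\mathbf{w}} = \lambda_1 a_1 \mathbf{e}_1 + \lambda_2 a_2 \mathbf{e}_2 + \dots + \lambda_n a_n \mathbf{e}_n$. So weight derivative grows mostly in direction of eigenvector $\mathbf{e}_m$ with largest eigenvalue $\lambda_m$.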
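The last step can be made explicit by solving for the coefficients over time. This is a sketch under stated assumptions: the correlation matrix is written $C$, its eigenvectors $\mathbf{e}_i$ are taken to be orthonormal (possible because $C$ is symmetric), $a_i = \mathbf{w} \cdot \mathbf{e}_i$, the largest eigenvalue $\lambda_m$ is strictly larger than the others, and $a_m(0) \neq 0$. These symbol names are chosen for illustration.

```latex
\begin{align*}
\dot{\mathbf{w}} = C\mathbf{w}
  \;&\Rightarrow\; \dot{a}_i = \dot{\mathbf{w}} \cdot \mathbf{e}_i = \lambda_i a_i
  \;\Rightarrow\; a_i(t) = a_i(0)\, e^{\lambda_i t}, \\
\frac{a_i(t)}{a_m(t)}
  &= \frac{a_i(0)}{a_m(0)}\, e^{-(\lambda_m - \lambda_i)\, t}
  \;\longrightarrow\; 0 \quad \text{as } t \to \infty \text{ for } i \neq m .
\end{align*}
```

Every component grows (the eigenvalues of a correlation matrix are non-negative), but the component along $\mathbf{e}_m$ outgrows all the others exponentially, so the direction of $\mathbf{w}$ tends to $\pm\mathbf{e}_m$.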
  • 20. Summary No external teacher needed. Competition arises from “winner take all” and weight normalisation. Discovers principal features of input environment. Output units have maximal variance.