Self-organizing maps - Tutorial


Published in: Technology, Education

  1. 1. "Unsupervised learning": from theory to practice. Miguel Arturo Barreto Sánz
  2. 2. Outline ● Introduction: unsupervised learning ● The Self-Organizing Map: the biological inspiration, the algorithm, characteristics, examples ● Practical examples using MATLAB
  3. 3. Introduction. Unsupervised learning is a way to form “natural groupings” or clusters of patterns. Unsupervised learning seeks to determine how the data are organized. It is distinguished from supervised learning in that the learner is given only unlabeled examples. Among neural network models, the Self-Organizing Map (SOM) is a commonly used unsupervised learning algorithm. The SOM is a topographic organization in which nearby locations in the map represent inputs with similar properties.
  4. 4. The Self-Organizing Map: the biological inspiration. Sensory information is processed in the neocortex by highly ordered neuronal networks. • Tangential to the cortical surface, representations of the sensory periphery are organized into well-ordered maps. • Taste maps in gustatory cortex (Accolla et al., 2007). • Somatotopic maps in primary somatosensory cortex (Kaas, 1991). [Photo: W. Penfield]
  5. 5. The Self-Organizing Map: the biological inspiration. Another prominent cortical map is the tonotopic organization of auditory cortex (Kalatsky et al., 2005). The most intensely studied example is the primary visual cortex, which is arranged with superimposed maps of retinotopy, ocular dominance and orientation (Bonhoeffer and Grinvald, 1991).
  6. 6. The Self-Organizing Map: the biological inspiration. Homunculus
  7. 7. The Self-Organizing Map: the biological inspiration. "Somatosensory cortex dominated by the representation of teeth in the naked mole-rat brain" (Kenneth C. Catania and Michael S. Remple).
  8. 8. The Self-Organizing Map: the biological inspiration. A remarkably high degree of organization is obvious in the primary somatosensory cortex, in which a clear pattern of cytoarchitectonic units termed ‘barrels’ is observed in perfect match with the arrangement of the whiskers on the snout of the mouse (Woolsey and Van der Loos, 1970).
  9. 9. The Self-Organizing Map: the biological inspiration. Mapping functionally related sensory information onto nearby cortical regions is thought to minimize axonal wiring length and simplify the synaptic circuits underlying correlation-based associational plasticity.
  10. 10. The Self-Organizing Map. In a topology-preserving map, units located physically next to each other will respond to classes of input vectors that are likewise next to each other. Although it is easy to visualize units next to each other in a two-dimensional array, it is not so easy to determine which classes of vectors are next to each other in a high-dimensional space. High-dimensional input vectors are, in a sense, projected down onto the two-dimensional map in a way that maintains the natural order of the input vectors. This dimensional reduction can make it easy to visualize important relationships among the data that might otherwise go unnoticed. [Photo: Teuvo Kohonen]
  11. 11. The Self-Organizing Map. A SOM is formed of neurons located on a regular, usually 1- or 2-dimensional grid. The neurons are connected to adjacent neurons by a neighborhood relation dictating the structure of the map. In the 2-dimensional case the neurons of the map can be arranged either on a rectangular or a hexagonal lattice. [Figure: two lattices, each with an "Input" label and neighborhoods marked 0, 1 and 2]
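The neighborhood relation on a rectangular lattice can be sketched in a few lines of Python. This is only an illustration and not code from the tutorial: the function name `lattice_neighbors` is hypothetical, and using the Chebyshev distance to define the neighborhoods of size 0, 1 and 2 is an assumption. A hexagonal lattice would need a different distance.

```python
def lattice_neighbors(rows, cols, center, radius):
    """Return grid positions within `radius` steps of `center` on a
    rectangular lattice (Chebyshev distance; an illustrative choice)."""
    r0, c0 = center
    return [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if max(abs(r - r0), abs(c - c0)) <= radius
    ]

# Neighborhoods of growing radius around the middle unit of a 5x5 map.
sizes = [len(lattice_neighbors(5, 5, (2, 2), k)) for k in (0, 1, 2)]
print(sizes)  # [1, 9, 25]
```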
  12. 12. The algorithm. The weights of the neurons are initialized (t = 0).
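A minimal sketch of the initialization step at t = 0, assuming small uniform random weights. The slide does not say how the weights are chosen, so the random scheme, the function name `init_weights` and its parameters are all assumptions. Another common choice is to draw the initial weights from the training data.

```python
import random

def init_weights(rows, cols, dim, low=0.0, high=1.0, seed=None):
    """One weight vector of length `dim` per unit of a rows x cols map."""
    rng = random.Random(seed)  # seeded generator for reproducible maps
    return [[[rng.uniform(low, high) for _ in range(dim)]
             for _ in range(cols)]
            for _ in range(rows)]

weights = init_weights(4, 4, 3, seed=0)
print(len(weights), len(weights[0]), len(weights[0][0]))  # 4 4 3
```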
  13. 13. The algorithm. Example
  14. 14. The algorithm. The training utilizes competitive learning. The neuron with weight vector most similar to the input is called the best matching unit (BMU). The weights of the BMU and neurons close to it in the SOM lattice are adjusted towards the input vector. The magnitude of the change decreases with time and with distance from the BMU.
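One training step can be sketched as follows. The slide fixes only the qualitative behavior, so several details here are assumptions: squared Euclidean distance as the similarity measure, a Gaussian neighborhood on the lattice, linear decay of the learning rate and neighborhood width, and the names `find_bmu`, `update`, `lr0` and `sigma0`.

```python
import math

def find_bmu(weights, x):
    """Grid position of the unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def update(weights, x, t, n_steps, lr0=0.5, sigma0=2.0):
    """Move the BMU and its lattice neighbors towards x, in place."""
    frac = t / n_steps
    lr = lr0 * (1.0 - frac)                   # step size shrinks with time
    sigma = max(sigma0 * (1.0 - frac), 1e-3)  # neighborhood shrinks with time
    br, bc = find_bmu(weights, x)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # smaller far from BMU
            for k in range(len(w)):
                w[k] += lr * h * (x[k] - w[k])
    return (br, bc)

weights = [[[0.0, 0.0], [1.0, 1.0]]]   # 1x2 map, 2-D inputs
bmu = update(weights, [0.9, 0.8], t=0, n_steps=10)
print(bmu)  # (0, 1)
```

In this toy run the BMU moves halfway towards the input, while its neighbor moves by a smaller fraction because of the distance term.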
  15. 15. The algorithm. Next example
  16. 16. The algorithm
  17. 17. The algorithm
  18. 18. The algorithm
  19. 19. Characteristics. Inputs: state of health, nutrition, educational services, etc. [Figure: quality-of-life world map]
  20. 20. Characteristics. Input: 3 dimensions (x, y, z). Output: 2 dimensions (x, y).
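The reduction from three input dimensions to two map dimensions can be illustrated with a compact sketch: train a small map on 3-D points, then report the lattice position of the closest unit for each point. The data are made up, and the names (`train_som`, `project`), the decay schedules and the map size are assumptions rather than anything specified in the slides.

```python
import math
import random

def train_som(data, rows, cols, n_steps=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    w = {(r, c): [rng.random() for _ in range(dim)]
         for r in range(rows) for c in range(cols)}
    for t in range(n_steps):
        x = rng.choice(data)
        decay = 1.0 - t / n_steps
        lr, sigma = lr0 * decay, max(sigma0 * decay, 0.1)
        b = project(w, x)
        for (r, c), v in w.items():
            h = math.exp(-((r - b[0]) ** 2 + (c - b[1]) ** 2) / (2 * sigma ** 2))
            for k in range(dim):
                v[k] += lr * h * (x[k] - v[k])
    return w

def project(w, x):
    """2-D lattice coordinates of the unit closest to the input x."""
    return min(w, key=lambda pos: sum((a - b) ** 2 for a, b in zip(w[pos], x)))

# Hypothetical 3-D inputs (x, y, z): two loose groups of points.
rng = random.Random(1)
data = [[rng.gauss(m, 0.05) for _ in range(3)]
        for m in (0.2, 0.8) for _ in range(20)]
som = train_som(data, rows=3, cols=3)
coords = [project(som, x) for x in data]   # each 3-D point -> (row, col)
print(coords[0], coords[-1])
```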
  21. 21. Visualization
  22. 22.
  23. 23. Introduction
  24. 24. Visualization
  25. 25. Clusters of sites with similar characteristics. What crops or varieties are likely to perform well, where and when? Factors: soil, climate, genotype. Homologous places for Colombian coffee production: Brazil, Ecuador, East Africa, and New Guinea.
  26. 26. Clusters of sites with similar characteristics. For commercial (mass-production) crops (rice, corn), the “when” and “where” are known. For native crops (guanabana, lulo) or special types of crops (coffee varieties), this is not the case. When and what must I cultivate? Market demand. DAPA (Diversification Agriculture Project Alliance). The COCH project.
  27. 27. The challenges: 1. Large database (1 point = 1 km × 1 km; 1 336,025 points). 2. Multivariable problem.
  28. 28. The challenges: 1. Large datasets. 2. Multivariate problem: climate, management, variety, climate estimates, soil, etc. Example: BIOCLIM is a bioclimatic prediction system which uses surrogate terms (bioclimatic parameters) derived from mean monthly climate estimates to approximate energy and water balances at a given location.
      B1. Annual Mean Temperature
      B2. Mean Diurnal Range (Mean(period max-min))
      B3. Isothermality (P2/P7)
      B4. Temperature Seasonality (Coefficient of Variation)
      B5. Max Temperature of Warmest Period
      B6. Min Temperature of Coldest Period
      B7. Temperature Annual Range (P5-P6)
      B8. Mean Temperature of Wettest Quarter
      B9. Mean Temperature of Driest Quarter
      B10. Mean Temperature of Warmest Quarter
      B11. Mean Temperature of Coldest Quarter
      B12. Annual Precipitation
      B13. Precipitation of Wettest Period
      B14. Precipitation of Driest Period
      B15. Precipitation Seasonality (Coefficient of Variation)
      B16. Precipitation of Wettest Quarter
      B17. Precipitation of Driest Quarter
      B18. Precipitation of Warmest Quarter
      B19. Precipitation of Coldest Quarter
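A few of these parameters can be derived directly from monthly values, which shows why each site ends up as a multivariate vector. The sketch below is not the BIOCLIM software: the function name `bioclim_subset` and the sample data are hypothetical, a "period" is assumed to be a month, and the monthly mean temperature is assumed to be the midpoint of the monthly minimum and maximum.

```python
def bioclim_subset(tmin, tmax, precip):
    """Compute a few BIOCLIM-style parameters from 12 monthly values.

    tmin, tmax: monthly minimum / maximum temperatures
    precip: monthly precipitation totals
    """
    tmean = [(lo + hi) / 2.0 for lo, hi in zip(tmin, tmax)]  # assumed midpoint
    b1 = sum(tmean) / 12.0                                  # annual mean temperature
    b2 = sum(hi - lo for lo, hi in zip(tmin, tmax)) / 12.0  # mean diurnal range
    b5 = max(tmax)                                          # max temp of warmest period
    b6 = min(tmin)                                          # min temp of coldest period
    b7 = b5 - b6                                            # temperature annual range
    b3 = b2 / b7                                            # isothermality
    b12 = sum(precip)                                       # annual precipitation
    return {"B1": b1, "B2": b2, "B3": b3, "B5": b5,
            "B6": b6, "B7": b7, "B12": b12}

# Made-up monthly data for one site.
tmin = [8, 9, 10, 11, 12, 12, 12, 12, 11, 10, 9, 8]
tmax = [t + 10 for t in tmin]
precip = [100] * 12
params = bioclim_subset(tmin, tmax, precip)
print(params["B7"], params["B12"])  # 14 1200
```

Because temperatures and precipitation live on very different scales, such variables are usually rescaled before they are fed to a distance-based method like the SOM.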
  29. 29. Clusters of sites with similar characteristics. How to work? How to obtain prototypes, clustering and visualization at the same time? Approach: unsupervised learning with self-organizing maps. Two flavors of SOMs: ● Self-organizing map: static map, just one representation. ● Growing hierarchical map: different representations at different levels.
  30. 30. Clusters of sites with similar characteristics. Self-Organizing Map (SOM): represents clusters of a multidimensional space by mapping multidimensional data onto a two-dimensional lattice of cells. The clusters found in the feature space are in many cases not the same as those found in geographic space. Example: similarity of sugarcane growing environmental conditions (1999-2005) using self-organizing maps.
  31. 31. Approaches: GHSOM
  32. 32. GHSOM component planes
      P1. Annual Mean Temperature
      P2. Mean Diurnal Range (Mean(period max-min))
      P3. Isothermality (P2/P7)
      P4. Temperature Seasonality (Coefficient of Variation)
      P5. Max Temperature of Warmest Period
      P6. Min Temperature of Coldest Period
      P7. Temperature Annual Range (P5-P6)
      P8. Mean Temperature of Wettest Quarter
      P9. Mean Temperature of Driest Quarter
      P10. Mean Temperature of Warmest Quarter
      P11. Mean Temperature of Coldest Quarter
      P12. Annual Precipitation
      P13. Precipitation of Wettest Period
      P14. Precipitation of Driest Period
      P15. Precipitation Seasonality (Coefficient of Variation)
      P16. Precipitation of Wettest Quarter
      P17. Precipitation of Driest Quarter
      P18. Precipitation of Warmest Quarter
      P19. Precipitation of Coldest Quarter
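A component plane shows the value of a single input variable across all units of a map, so one plane can be drawn per parameter P1 to P19. The sketch below extracts such a plane from a small weight grid. The function name `component_plane` and the weight values are made up for illustration, and only three variables are used to keep it short.

```python
def component_plane(weights, k):
    """Value of input variable k at every unit of a rows x cols map."""
    return [[unit[k] for unit in row] for row in weights]

# Hypothetical 2x2 map trained on 3 variables (values are invented).
weights = [
    [[24.1, 9.8, 1800.0], [23.5, 10.1, 1650.0]],
    [[18.2, 11.4, 2300.0], [17.9, 11.0, 2450.0]],
]
for row in component_plane(weights, 0):   # plane of the first variable
    print(row)
```

Comparing planes side by side is a quick way to spot variables that vary together across the map.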
  33. 33. Thank you!