Bio-inspired algorithms for clustering and visualization of geospatial data
Presentation Transcript

  • Faculté des Hautes Etudes Commerciales (HEC), Institut des Systèmes d'information (ISI)
    Bio-inspired algorithms for clustering and visualization of geospatial data
    Miguel Arturo Barreto Sánz
  • Outline
    ● Bio-inspired algorithms?
    ● Challenges in clustering and visualization of geospatial data
    ● Bio-inspired algorithms used in clustering and visualization of geospatial data
    ● Conclusions
  • 1. Bio-inspired?
    Speedo's "Fastskin" suit, inspired by shark skin.
    Aerodynamic surfaces for vehicles: "Technologies Inspired by Sharks", by Tracy Staedter, Feb 2009, Discovery News.
  • 1. Bio-inspired?
    Inspired by human skin: a clear version of Touchco's multitouch sensor platform. By Nick Bilton, Dec 30 2009, The New York Times.
    Sensors capture the variation in pressure levels of a pencil drawing.
    Sensors pick up the pressure of a hand placed on a Touchco device.
  • 1. Bio-inspired?
    ● Nature has been innovating, inventing, testing, validating, improving, and diversifying living systems for hundreds of millions of years.
    ● The bio-inspired viewpoint is based on studying nature's "inventions" and "tricks" in order to draw inspiration and create solutions (this does not necessarily mean copying).
    ● Countless examples of "natural" engineering solutions are already used to develop new materials, artificial retinas, etc.
    Andres Perez-Uribe
  • 1. Bio-inspired? Sources of inspiration
    [Diagram: Evolution, Self-organization, Learning, and Emergence, arranged by time scale (long term to short term) and by level (individual to populations)]
  • 1. Bio-inspired? Self-organization: the rat whisker-barrel system
    It is also the rat's sensory system of choice for exploring the environment and collecting information about the location, shape, size, and texture of objects around it. The system is well suited to examining neural coding issues because of its functional efficiency and its elegant structural organization. The whisker area of somatosensory cortex (known as barrel cortex) is arranged as a topographic map of the whiskers. This means that sensory signals arising in one whisker are channelled through a restricted population of neurons and can be sampled by an electrode at different stages of the sensory system.
  • 1. Bio-inspired? Bio-inspired clustering
    Neural networks have solved a wide range of problems and have good learning capabilities. Their strengths include adaptation, ease of implementation, parallelization, speed, and flexibility.
    Bio-inspired clustering is closely related to the concept of competitive learning.
  • 1. Bio-inspired? Bio-inspired clustering: hard and soft competitive learning
    Hard …
    a) k initial "means" are chosen.
    b) k clusters are created by associating every observation with the nearest mean.
    c) The centroid of each of the k clusters becomes the new mean.
    d) Steps b and c are repeated until convergence has been reached.
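    The four steps of hard competitive learning listed on this slide correspond to the classic k-means procedure. A minimal sketch in plain Python follows; the function names, the random initialization from the observations, and the "assignments stopped changing" convergence test are assumptions made for illustration, not details given in the slides.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Hard competitive learning (k-means); hypothetical helper for illustration."""
    rng = random.Random(seed)
    # a) pick k initial "means" among the observations (assumed initialization)
    means = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # b) associate every observation with its nearest mean
        new_labels = [min(range(k), key=lambda j: sq_dist(x, means[j]))
                      for x in points]
        # d) convergence: the assignments no longer change
        if new_labels == labels:
            break
        labels = new_labels
        # c) the centroid of each cluster becomes the new mean
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return means, labels


# Usage: two well-separated groups of 2-D observations
observations = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
                (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
means, labels = kmeans(observations, k=2)
```

    Each observation ends up belonging to exactly one cluster, which is what makes this form of competitive learning "hard".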
  • 1. Bio-inspired? Bio-inspired clustering: hard and soft competitive learning
    Soft …
    m_i = m_i + α(t) h_ci(t) (x - m_i)
    The neighborhood function h_ci(t) is centered over the best matching neuron m_c, which is shown as a black cell. The neighboring neurons whose weights are recalculated because of this best match are shown in gray. Other neurons are not affected.
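    The soft update rule on this slide can be sketched as a single training step. The version below assumes a "bubble" neighborhood, where h_ci is 1 for neurons within a given lattice radius of the best matching neuron and 0 otherwise, so that distant neurons are left untouched as the slide describes. The function name, the `radius` parameter, and the lattice representation are hypothetical; other neighborhood shapes (for example Gaussian) are also common.

```python
def som_update(weights, grid, x, alpha, radius):
    """One soft competitive learning step: m_i = m_i + alpha * h_ci * (x - m_i).

    weights: list of weight vectors m_i (lists of floats)
    grid:    list of lattice coordinates, one tuple per neuron
    x:       input observation
    alpha:   learning rate alpha(t) at the current step
    radius:  neighborhood radius on the lattice (assumed bubble neighborhood)
    """
    # best matching neuron c: the one whose weights are closest to x
    c = min(range(len(weights)),
            key=lambda i: sum((xj - wj) ** 2 for xj, wj in zip(x, weights[i])))
    updated = []
    for i, m in enumerate(weights):
        # Chebyshev distance between neuron i and neuron c on the lattice
        lattice_dist = max(abs(a - b) for a, b in zip(grid[i], grid[c]))
        h = 1.0 if lattice_dist <= radius else 0.0  # h_ci: inside or outside the bubble
        updated.append([mj + alpha * h * (xj - mj) for mj, xj in zip(m, x)])
    return c, updated


# Usage: three neurons on a one-dimensional lattice
grid = [(0,), (1,), (2,)]
weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
bmu, weights = som_update(weights, grid, x=[0.0, 0.0], alpha=0.5, radius=1)
```

    Unlike the hard case, the winner's lattice neighbors also move toward the input, which is what preserves the topology of the map.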
  • 1. Bio-inspired? Bio-inspired clustering: hierarchical self-organizing structures
    ● Self-organizing Hierarchical Feature Maps
    ● Adaptive Hierarchical Incremental Grid Growing
    ● Growing Hierarchical SOM
  • 1. Bio-inspired? Bio-inspired clustering: hierarchical self-organizing structures
    Fuzzy Growing Hierarchical Self-organizing Networks (FGHSON)
  • 2. Challenges in clustering and visualization of geospatial data
    Information received from remote sensing systems and environmental monitoring devices is used in:
    ● Agro-ecology
    ● Environmental change
    ● Species distribution
    ● Disease propagation
    ● Urban dynamics
    ● Migration patterns
  • 2. Challenges in clustering and visualization of geospatial data
    The special nature of spatio-temporal data poses several challenges to clustering and visualization. For instance:
    1. Visualizing clusters in both geographic and feature space
    2. The fact that spatial and temporal relationships exist at various levels (scales)
    3. Handling fuzzy boundaries in geospatial clusters
    4. The temporal context in which some variables are involved
    5. The high dimensionality of geospatial data sets
    6. The large quantity of data
  • 2. Challenges in clustering and visualization of geospatial data: geographic space and feature space
    Geographic space is concerned with surface features such as the terrain we walk on.
    Feature space is concerned with the representation of similarities associated with geo-referenced sites in the geographic space.
    [Figure panels: Geographic space; Feature space]
  • 2. Challenges in clustering and visualization of geospatial data: geographic space and feature space
    The clusters found in feature space are in many cases not the same as those found in geographic space.
    To represent clusters of a multidimensional space, map the multidimensional data onto a two-dimensional lattice of cells.
    [Figure: Similarity of sugarcane growing environmental conditions (1999-2005) using self-organizing maps]
  • 2. Challenges in clustering and visualization of geospatial data: heterogeneity in scales
    Methodologies are needed to evaluate clusters at different scales in order to find "interesting" patterns between levels.
    The analysis of cluster structure at different scales can be improved by creating representations of the clusters that facilitate the selection of clusters at different scales.
    [Figure panels: Geographic space; Feature space]
  • 2. Challenges in clustering and visualization of geospatial data: boundaries in geospatial data
    [Figure panels: Crisp; Fuzzy]
    Algorithms for clustering spatio-temporal databases have to consider the neighbors of the geo-referenced data. For instance, part of the complexity of the problem lies in the fact that the boundaries of these neighbors are not hard but rather soft.
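    One common way to express a soft boundary is to give each observation a degree of membership in every cluster instead of a single label. The sketch below uses the standard fuzzy c-means membership formula as an illustration. The slides do not state which membership function FGHSON uses, so the formula, the function name, and the fuzzifier `m` are assumptions.

```python
def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships of observation x in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the distance
    from x to center i and m > 1 is the fuzzifier (assumed value 2.0).
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
    # an observation that coincides with a center belongs fully to it
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    # with squared distances the exponent 2 / (m - 1) becomes 1 / (m - 1)
    inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


# Usage: a site lying between two cluster prototypes
centers = [(0.0, 0.0), (4.0, 0.0)]
soft = fuzzy_memberships((1.0, 0.0), centers)          # graded membership
crisp = max(range(len(centers)), key=lambda i: soft[i])  # hard label, for comparison
```

    The memberships always sum to one, so a site near a boundary is shared between clusters instead of being forced into one of them.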
  • 2. Challenges in clustering and visualization of geospatial data: temporal relationships between spatial objects
    The relationship between spatial objects can change over time. These dynamic relationships can be observed, for instance, in how the clusters change over time.
    [Figure: Similarity of sugarcane growing environmental conditions (1999-2001) using self-organizing maps]
  • 3. Bio-inspired algorithms used in clustering and visualization of geospatial data
    Why use bio-inspired algorithms?
    1. Discovering natural clusters in unlabeled data sets.
    2. Reducing the information redundancy contained in the data.
    3. Maximizing the mutual information between the inputs and the outputs of a network in the presence of noise.
    4. Helping to discover nonlinear, local, or partial correlations between variables.
    5. Working with data whose distribution is unknown.
  • 3. Bio-inspired algorithms used in clustering and visualization of geospatial data
    A trivial case: finding zones with analogous precipitation and air temperature in South America by using FGHSON.
    Reminder: FGHSON stands for Fuzzy Growing Hierarchical Self-organizing Networks.
  • 3. Bio-inspired algorithms used in clustering and visualization of geospatial data
    A trivial case: finding zones with analogous precipitation and air temperature in South America by using FGHSON.
    [Figure: January air temperature and precipitation]
  • 3. Bio-inspired algorithms used in clustering and visualization of geospatial data: clusters of sites with similar characteristics in time and space
    For commercial (mass production) crops (rice, corn) the "when" and "where" are known.
    For native crops (e.g. guanabana, lulo) this is not the case.
    When and what should I cultivate? Market demand.
    The COCH project
  • 3. Bio-inspired algorithms used in clustering and visualization of geospatial data: clusters of sites with similar characteristics in time and space
    Soil, climate, genotype: which crops or varieties are likely to perform well, where, and when.
    [Figure: Homologous places for Colombian coffee production: Brazil, Ecuador, East Africa, and New Guinea. Source: Homologue]
  • 3. Bio-inspired algorithms used in clustering and visualization of geospatial data: clusters of sites with similar characteristics in time and space
    Harvests of the same crop at different times
  • 3. Bio-inspired algorithms used in clustering and visualization of geospatial data
    Using FGHSON to find analogous ecoregions through time
  • Conclusions (I)
    ● Discovering natural clusters in unlabeled data sets. The continuous updating, large quantity, and diverse uses of geospatial data make it difficult to label observations in order to define classes.
    ● Reduction of information redundancy contained in the data. Soft competitive learning algorithms create prototypes of the observations. Hence, large data sets can be reduced with little or no loss of information.
    ● The maximization of mutual information between the inputs and the outputs of a network in the presence of noise. Geospatial variables are usually measured by instruments in difficult, uncontrolled environmental conditions (e.g. satellites, meteorological stations).
    ● To help discover nonlinear, local, or partial correlations between variables. Several soft competitive learning algorithms allow a high-dimensional space to be projected onto a two-dimensional grid. This allows visual exploratory analysis of the data and makes it easier to discover nonlinear, local, or partial correlations.
    ● To work with data with unknown distribution. Many clustering algorithms have been developed to deal with certain data distributions (e.g. Gaussian distributions). Soft competitive learning algorithms are very useful when working with geospatial data because they do not need to assume any data distribution.
  • Conclusions (II): FGHSON
    Advantages
    1. FGHSON does not require the number of clusters to be set a priori. This aspect is critical when dealing with geospatial data, because it is usually not possible to estimate a priori the number of clusters that best represents a data set.
    2. The membership of the observations in the clusters is fuzzy.
    3. The final structure does not necessarily lead to a balanced hierarchy (i.e. a hierarchy with equal depth in each branch). Therefore, areas of the input space that require more units for appropriate data representation create deeper branches than others. This is important when dealing with geographic data, because in many cases there are regions that must be represented in more detail.
  • Conclusions (III): FGHSON
    Advantages
    4. The algorithm executes self-organizing processes that can be performed in parallel. Hence, when dealing with large data sets, the tasks can be divided to distribute the computational cost.
    5. Software that applies the FGHSON algorithm to geosciences is in development.
    6. The maps on individual layers cannot grow irregularly in shape, and they cannot remove connections between neighboring units. As a result, information about the input data is lost.
    Disadvantages
    1. FGHSON cannot project a high-dimensional space onto a two-dimensional space.
    2. FGHSON is a new algorithm.