Classification of similar productivity zones in the sugar cane culture using clustering of SOM component planes based on the SOM distance matrix - Presentation Transcript
Miguel BARRETO, Andrés Pérez-Uribe
MINISTERIO DE AGRICULTURA Y DESARROLLO RURAL, asocaña
Introduction
The agricultural productivity of a geographic area depends on many
agro-ecological variables like soil and terrain characteristics, climatic
constraints, human behavior and management.
[Figure: soil, management, climate and genotype as inputs to productivity.]
The problem
The world of agriculture is diverse and heterogeneous.
The traditional approach has been to develop technologies and agricultural management practices as if agriculture were homogeneous, using controlled experiments. However, this is expensive and takes a long time.
In agriculture there are very few possibilities of controlling or modifying the conditions in which crops grow.
A new approach
1. Every crop is an experiment: soil, management, climate and genotype act on the crop from sowing, through growing, to harvest.
For example, the same cultivated zone in 1999, 2000, 2001 and 2002 gives 4 experiments. The full data set contains 1358 experiments.
2. Each agroecological event is unique in time and space, but events share characteristics. Finding events that behave similarly makes it possible to discover why and how the agroecological variables affect crop development and, therefore, agricultural productivity.
Challenges
This approach presents the following challenges:
To deal with the quantity, quality and type of data. Quantity refers to the number of variables and the number of observations associated with each variable. Quality refers to data integrity. Type refers to the nature of the data: qualitative (e.g., genotype) or quantitative (e.g., temperature).
To optimize the visualization and analysis of the variables.
The idea
1. To construct a plane for each zone with its characteristics: soil type (A, B, etc.), variety type (A, B, etc.), management type (A, B, etc.) and weather condition (sunny, rainy, etc.).
2. To find natural groups of experiments with similar characteristics (without knowing the productivity), for example Conditions A and Conditions B.
3. To add labels and look for the most homogeneous groups. In the slide's example, every zone shows one of two profiles: Rainy, B, B, C or Sunny, A, B, A.
4. To analyze the conditions and extract new knowledge about the relationship between the agro-ecological variables and productivity. The two example profiles are (soil type B, variety type C, management type B, rainy weather) and (soil type A, variety type A, management type B, sunny weather); Conditions A is labeled high productivity and Conditions B low productivity.
The variables
Climate variables. Continuous data.
Average Temperature (TempAvg) / After Seeding (AS) / Before Harvest (BH)
Average Relative Humidity (RHAvg) / After Seeding (AS) / Before Harvest (BH)
Radiation (Rad) / After Seeding (AS) / Before Harvest (BH)
Precipitation (Prec) / After Seeding (AS) / Before Harvest (BH)
Soil variables.
Order (Ord) / 3 Orders (Ord1, Ord2, Ord3) / Nominal Data
Texture (Tex) / Ordinal Data
Depth (Dee) / Ordinal Data
Topographic variables.
Landscape (Ls) / 3 Landscapes (Ls1, Ls2, Ls3) / Nominal Data
Slope (Sl) / Ordinal Data
Other variables.
Water Balance (WB) / Ordinal Data
Variety (Var) / 3 Varieties (V1, V2, V3) / Nominal Data
Production
Total: 54 variables
[Figure: timeline of months after seeding (AS) and months before harvest (BH), each numbered 1, 2, 3, 4.]
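As a rough illustration of how the variable list above could add up to 54 columns, the sketch below builds one column name per variable. It assumes five monthly values per climate variable and period, which is only suggested by the labels Ra1BH to Ra5BH later in the slides. The column names and the `one_hot` helper are hypothetical and not taken from the presentation.

```python
# Sketch of a possible 54-column layout.
# ASSUMPTION: five monthly values per climate variable and period
# (suggested by the labels Ra1BH..Ra5BH); all names are illustrative.
CLIMATE = ["TempAvg", "RHAvg", "Rad", "Prec"]
PERIODS = ["AS", "BH"]          # after seeding, before harvest
N_MONTHS = 5                    # assumption, see above


def variable_names():
    """Return one column name per variable, nominal variables already expanded."""
    names = [f"{var}{month}{period}"
             for var in CLIMATE
             for period in PERIODS
             for month in range(1, N_MONTHS + 1)]
    names += ["Ord1", "Ord2", "Ord3", "Tex", "Dee"]   # soil
    names += ["Ls1", "Ls2", "Ls3", "Sl"]              # topography
    names += ["WB", "V1", "V2", "V3"]                 # water balance, variety
    names += ["Production"]
    return names


def one_hot(value, categories):
    """Encode a nominal value as one 0/1 column per category."""
    return [1.0 if value == c else 0.0 for c in categories]


names = variable_names()
print(len(names))                                  # 54
print(one_hot("Ord2", ["Ord1", "Ord2", "Ord3"]))   # [0.0, 1.0, 0.0]
```

Expanding each nominal variable into three 0/1 columns matches the "3 Orders", "3 Landscapes" and "3 Varieties" entries in the list, while ordinal variables keep a single column.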
SOM visualization of the variables
[Figure: SOM panels for soil type, variety type, management type and weather condition: relative humidity (RH), radiation (Ra), precipitation (P) and temperature (T), each before harvest (BH) and after seeding (AS), plus soil order 2 and sugarcane variety 1.]
Component planes
To improve the analysis of the relationships between variables and/or their influence on the outputs of the system, it is possible to slice the self-organizing map in order to visualize its so-called component planes.
[Figure: zones 1 to 1358 described by variables 1 to 54; the map is sliced into one plane per variable.]
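The slicing described above can be sketched in a few lines of NumPy. This is a minimal online SOM written for illustration, not the implementation used in the study. The rectangular grid, the linear decay schedules, the function names `train_som` and `component_planes`, and the synthetic data are all assumptions.

```python
import numpy as np


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a small online SOM; returns a codebook of shape (rows, cols, n_vars)."""
    rng = np.random.default_rng(seed)
    codebook = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every unit, used by the neighborhood function.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # neighborhood radius shrinks
        x = data[rng.integers(len(data))]
        # Best matching unit: the unit whose prototype is closest to x.
        dist = np.linalg.norm(codebook - x, axis=2)
        bmu = np.unravel_index(np.argmin(dist), dist.shape)
        # Gaussian neighborhood around the BMU, measured on the map grid.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        codebook += lr * h[..., None] * (x - codebook)
    return codebook


def component_planes(codebook):
    """Slice the codebook into one (rows, cols) plane per variable."""
    return [codebook[:, :, k] for k in range(codebook.shape[2])]


# Synthetic stand-in for the real data: 200 zones, 6 variables scaled to [0, 1].
rng = np.random.default_rng(1)
data = rng.random((200, 6))
codebook = train_som(data, rows=8, cols=10)
planes = component_planes(codebook)
print(len(planes), planes[0].shape)
```

Each plane holds the value of one variable across all map units, so plotting plane k as an image shows how variable k is distributed over the map.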
Correlation hunting
The task of organizing similar component planes in order to find correlated components is called correlation hunting. However, when the number of components is large, it is difficult to determine which planes are similar to each other.
A new SOM can be used to reorganize the component planes in order to perform the correlation hunting. The main idea is to place correlated components close to each other. An advantage of using a SOM for component plane projection is that the placements of the component planes can be shown on a regular grid. In addition, an ordered presentation of similar components is generated automatically. A disadvantage is that the choice of grouping variables is left to the user.
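To make the notion of similar planes concrete, the sketch below ranks pairs of component planes by absolute Pearson correlation. This is a simpler stand-in for the SOM-based reorganization described above, not that method itself. The planes are synthetic, and the variable names and function names are only illustrative.

```python
import numpy as np


def plane_correlations(planes):
    """Pearson correlation between every pair of flattened component planes."""
    flat = np.array([p.ravel() for p in planes])
    return np.corrcoef(flat)


def most_similar_pairs(corr, names, top=3):
    """Return the `top` pairs of planes with the largest absolute correlation."""
    n = len(names)
    pairs = [(abs(corr[i, j]), names[i], names[j])
             for i in range(n) for j in range(i + 1, n)]
    pairs.sort(reverse=True)
    return [(a, b, round(float(c), 3)) for c, a, b in pairs[:top]]


# Synthetic planes: a gradient, a noisy copy, its exact inverse, and noise.
rng = np.random.default_rng(0)
base = np.linspace(0.0, 1.0, 80).reshape(8, 10)
planes = [base,
          base + 0.05 * rng.standard_normal((8, 10)),
          1.0 - base,
          rng.random((8, 10))]
names = ["Ra1BH", "Ra2BH", "Prec1AS", "Tex"]   # hypothetical labels

corr = plane_correlations(planes)
for a, b, c in most_similar_pairs(corr, names):
    print(a, b, c)
```

Using the absolute value treats strongly anti-correlated planes as related too, which is a design choice of this sketch. With many components, a full correlation matrix becomes as hard to read as the planes themselves, which is the motivation for projecting the planes with a second SOM.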
Clustering of SOM component planes based on the SOM distance matrix
The U-matrix has been used as an effective cluster distance function. It visualizes the distances between each map unit and its neighbors, which makes the cluster structure of the SOM visible.
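A simplified U-matrix can be computed directly from a codebook, as sketched below. The sketch assumes a rectangular grid with 4-connected neighbors and stores one averaged value per unit. Other U-matrix variants use hexagonal grids or add cells between units, so this is only one possible form. The function name `u_matrix` and the toy codebook are illustrative.

```python
import numpy as np


def u_matrix(codebook):
    """Mean distance from each unit's prototype to its 4-connected grid neighbors."""
    rows, cols, _ = codebook.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(np.linalg.norm(codebook[r, c] - codebook[rr, cc]))
            u[r, c] = np.mean(dists)
    return u


# Toy codebook with two flat regions: prototypes are all 0 on the left half
# of the map and all 1 on the right half.
codebook = np.zeros((4, 6, 3))
codebook[:, 3:, :] = 1.0

u = u_matrix(codebook)
print(np.round(u, 2))
```

In the toy example, high values appear only along the boundary between the two regions and low values inside them. That ridge-and-valley pattern is what lets the U-matrix serve as a distance function for finding clusters.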
Clusters with similar productivity
[Figure: clusters labeled high, medium and low productivity; productivity scale from -10 to 10.]
Prototypes from clusters with similar productivity
[Figure: prototype planes for relative humidity (RH), radiation (Ra), precipitation (P) and temperature (T), each before harvest (BH) and after seeding (AS), plus soil order 2 and sugarcane variety 1.]
Best Matching Units from radiation before harvest (RaBH)
[Figure: best matching units of the planes Ra1BH, Ra2BH, Ra3BH, Ra4BH and Ra5BH.]
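For reference, a best matching unit is the map unit whose prototype lies closest to a given input vector. The sketch below assumes Euclidean distance and a codebook stored as a (rows, cols, n_vars) array. The function name and the toy codebook are illustrative and not taken from the slides.

```python
import numpy as np


def best_matching_unit(codebook, x):
    """Return the (row, col) of the unit whose prototype is closest to x."""
    dist = np.linalg.norm(codebook - np.asarray(x), axis=2)
    row, col = np.unravel_index(np.argmin(dist), dist.shape)
    return int(row), int(col)


# Toy 2x2 map with two variables per prototype.
codebook = np.array([[[0.0, 0.0], [0.0, 1.0]],
                     [[1.0, 0.0], [1.0, 1.0]]])

print(best_matching_unit(codebook, [0.9, 0.2]))   # (1, 0)
```

When the inputs to a map are themselves flattened component planes, planes that land on the same or neighboring best matching units are the ones the map considers similar.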
Analyzing the plots
[Figure: plots of radiation, relative humidity and temperature.]
The behavior of the radiation for the two component planes previously chosen as an example can be examined in a scatter plot.
The two zones present similar values of radiation in the months after seeding (RaAS).
During the months before harvest (RaBH), the radiation shows the same behavior in the high-medium and the low productivity regions, but with a shift.
This pattern indicates that high radiation in the months before harvest might affect the accumulation of saccharose in the plant.
Conclusions
Visualization of agroecological zones is very important but difficult due to the high dimensionality of the data. The SOM algorithm is a powerful technique able to deal with this problem, but …
In this study we have used the U-matrix and the component plane representation to illustrate the usefulness of the SOM for visualizing and analyzing similar zones.
By analyzing the obtained groups of agro-ecological variables and cultivated zones, it was possible, as an example of the application of the methodology, to find a relationship between the radiation after seeding and before harvest and high-medium productivity.
Based on this methodology, we look forward to developing data mining and visualization techniques that improve decision support in the sugar cane culture.