Presentation2 - GEOQUAL, ELCmapas & ECOGEO tools

CAPFITOGEN english presentations for workshops

Published in: Education

  1. 1. Tools. Mauricio Parra Quijano, FAO consultant, International Treaty on Plant Genetic Resources for Food and Agriculture, CAPFITOGEN Program Coordinator
  2. 2. GEOQUAL: evaluates the quality of the geo-referencing of each collecting site indicated in passport data
  3. 3. Geo-referencing and passport data 40° 20’ 33.4’’ N 03° 11’ 52.1’’ W
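Sexagesimal coordinates like the pair on this slide are usually converted to decimal degrees before any GIS processing. A minimal sketch in Python, assuming the usual sign convention (South and West negative); the function name `dms_to_decimal` is hypothetical and is not part of GEOQUAL.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West hemispheres are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# The coordinates shown on the slide: 40° 20' 33.4'' N, 03° 11' 52.1'' W
latitude = dms_to_decimal(40, 20, 33.4, "N")
longitude = dms_to_decimal(3, 11, 52.1, "W")
print(round(latitude, 6), round(longitude, 6))
```

The precision of the seconds field matters: dropping it here would shift the point by up to one arc-minute, which is roughly 1.8 km in latitude.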
  4. 4. Why should we evaluate the geo-referencing quality? [Figure: recorded coordinates versus the true site, x km apart; scale labels: 1030 75 100]
  5. 5. Potential effects of poor geo-referencing [Figure: temperature bands from 5 °C to 9 °C, with 10 km and 1 km distance labels]
  6. 6. Error describing the collecting sites. Description of the collecting site:
       Level     Value (first record)   Value (second record)
       ORIGCTY   CRI                    CRI
       ADM1      Punta Arenas           Punta Arenas
       ADM2      Buenos Aires           Pérez Zeledón
       ADM3      NA                     NA
       ADM4      NA                     NA
  7. 7. GEOQUAL features
       • GEOQUAL is a tool that assigns a quality value to the passport data of a germplasm collection that include coordinates.
       • The user enters the passport data in FAO-Bioversity 2012 format.
       • GEOQUAL calculates three parameters (COORQUAL, LOCALQUAL and SUITQUAL) along with other sub-parameters.
       • The parameters are combined to generate both TOTALQUAL (0-60 range) and TOTALQUAL100 (0-100 range).
  8. 8. COORQUAL: parameter that determines the intrinsic quality of the coordinates included in the passport data. Values from 0 to 20. Sub-parameters:
       • ERRORES: Values beyond the coordinate frame
       • PRECIS: Accuracy level. Measured in degrees, minutes or seconds (sexagesimal)
       • GEORBLE: Probability of correct coordinates from site description
       • INTERTEMP: Quality of coordinates by collection year
       • * GEOREFMETH: System by which coordinates are assigned
  9. 9. SUITQUAL: parameter that assigns a quality value to the coordinates according to the suitability of the collecting site for plant growth. Values from 0 to 20.
       • It distinguishes between cultivated and wild plants (SAMPSTAT).
       • It uses land-use information from the Global Land Cover map (1 km).
  10. 10. SUITQUAL: distance from the coastline [Figure: distance bands (> 30 km, 10-20 km, 5-10 km, 0-1 km) above ground level, with a 0 to 20 scale]
  11. 11. SUITQUAL [Figure: lower resolution versus higher resolution (zoom)]
  12. 12. LOCALQUAL: parameter obtained by comparing the site (locality) description with the administrative data derived from the coordinates, both taken from the user's passport data.
       • The geo-referenced administrative information is extracted from the GADM database.
       • The comparison is made between character strings and generates a (Levenshtein) distance. Insertions, deletions and substitutions are counted to decide whether one string can be treated as equal to another (function "agrep" in R).
       • According to the number of correct matches, a value ranging from 0 to 20 is assigned.
       Passport description   GADM from coordinates   GADM (second option)
       ORIGCTY                ISO
       ADM1                   NAME1                   VARNAME1
       ADM2                   NAME2                   VARNAME2
       ADM3                   NAME3                   VARNAME3
       ADM4                   NAME4                   VARNAME4
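The string comparison described for LOCALQUAL can be illustrated with a plain Levenshtein distance. This is only a sketch in Python, not GEOQUAL's code (the slides name R's "agrep" for this step); the helper `names_match` and its `max_distance` threshold are assumptions made for the example.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or exact match)
            ))
        previous = current
    return previous[-1]


def names_match(passport_name, gadm_name, max_distance=2):
    """Hypothetical rule: names are equal if within max_distance edits."""
    return levenshtein(passport_name.lower(), gadm_name.lower()) <= max_distance


print(names_match("Punta Arenas", "Puntarenas"))      # spelling variant of the same name
print(names_match("Buenos Aires", "Pérez Zeledón"))   # different administrative units
```

A tolerance of a few edits lets spelling variants in the passport data still count as matches, while clearly different names do not.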
  13. 13. LOCALQUAL
  14. 14. LOCALQUAL and TOTALQUAL100. TOTALQUAL = COORQUAL + SUITQUAL + LOCALQUAL (values from 0 to 60). TOTALQUAL100 = (TOTALQUAL*100) / 60 (values from 0 to 100). [Figure: scale from 0 to 98]
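As a worked example of the arithmetic, the sketch below sums the three 0-20 parameters and rescales the result so that the 0-60 range of TOTALQUAL maps onto the 0-100 range given for TOTALQUAL100 in the GEOQUAL features slide. The function name `total_quality` and the sample scores are hypothetical.

```python
def total_quality(coorqual, suitqual, localqual):
    """Return (TOTALQUAL, TOTALQUAL100) from three parameters scored 0-20."""
    scores = {"COORQUAL": coorqual, "SUITQUAL": suitqual, "LOCALQUAL": localqual}
    for name, value in scores.items():
        if not 0 <= value <= 20:
            raise ValueError(f"{name} must be between 0 and 20, got {value}")
    totalqual = coorqual + suitqual + localqual  # 0-60 range
    totalqual100 = totalqual * 100 / 60          # rescaled to 0-100
    return totalqual, totalqual100


# Hypothetical accession scores, for illustration only.
print(total_quality(18, 20, 15))
```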
  15. 15. Use of GEOQUAL. TOTALQUAL100: unified value of the geo-referencing quality. [Figure: 0 to 100 scale with marks at 80 and 90]
  16. 16. ELC maps: this tool lets the user create ecogeographical land characterization (ELC) maps, which reflect adaptive scenarios for a given species (or group of species) in a specific country or region
  17. 17. Characterization of a territory
  18. 18. How is an ELC map developed?
       1) Variable selection
       2) For each component, a cluster analysis and determination of the optimal number of groups:
            • Bioclimatic variables
            • Geophysical variables
            • Edaphic variables
       3) Combination (N bioclimatic * N geophysical * N edaphic) into categories
       4) MAP
       5) Description of categories using original variables
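The combination step can be sketched as follows: each map cell carries one cluster label per component, and every distinct triple of labels becomes one ELC category. This is an illustrative Python sketch, not the tool's implementation; the function name, the category numbering and the sample labels are assumptions.

```python
def combine_clusters(bioclimatic, geophysical, edaphic):
    """Map each cell's (bioclimatic, geophysical, edaphic) labels to a category id."""
    category_ids = {}
    cell_categories = []
    for combination in zip(bioclimatic, geophysical, edaphic):
        if combination not in category_ids:
            # Number categories in order of first appearance (an arbitrary choice).
            category_ids[combination] = len(category_ids) + 1
        cell_categories.append(category_ids[combination])
    return cell_categories, category_ids


# Hypothetical cluster labels for five cells.
bio = [1, 1, 2, 2, 1]
geo = [1, 2, 2, 2, 1]
eda = [1, 1, 1, 1, 1]
cells, legend = combine_clusters(bio, geo, eda)
print(cells)   # category of each cell
print(legend)  # which label triple each category stands for
```

The product N bioclimatic * N geophysical * N edaphic is an upper bound: in this example it is 2 * 2 * 1 = 4, but only three combinations actually occur.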
  19. 19. Variable selection I
       Bibliography search on major factors in the adaptation of target species.
       Expert opinion / knowledge:
       • Experts on target species are a valuable source of information.
       • Surveys are an efficient way to gather expert knowledge (internet/email, meetings, workshops, etc.).
       • Variable lists are drawn up by component, with details on the nature of the variables (explanation of codes, units, source, etc.). Each variable is then given a value according to its importance for the adaptation of the species.
  20. 20. Variable selection II. Screening:
       • Redundancy? Correlation? Collinearity?
       • Bivariate correlation analysis, PCA, variance inflation factor (VIF; compares linear relationships between variables, only in regression).
       • Significance, through a multiple regression analysis with a dependent variable that gives a measurement of adaptation.
       [Figure: plots labelled x1 and x2]
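The bivariate-correlation part of this screening can be sketched in a few lines of Python. The sketch covers only pairwise Pearson correlation (not PCA, VIF or the regression step); the 0.8 threshold, the function names and the sample values are assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def redundant_pairs(variables, threshold=0.8):
    """List variable pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, values_a), (name_b, values_b) in combinations(variables.items(), 2):
        r = pearson(values_a, values_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical values at five sites.
sites = {
    "elevation": [100, 500, 900, 1300, 1700],
    "mean_temp": [24, 22, 20, 18, 16],
    "soil_ph": [6.5, 5.8, 7.1, 6.0, 6.9],
}
print(redundant_pairs(sites))
```

In this made-up data, elevation and mean temperature carry almost the same information, so only one of them would normally be kept.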
  21. 21. What type of map do you need? Depending on the approach of the analysis, the ELC map can be:
       1. Generalist map. It defines the major environments for a large number of species (related or not). For most of these species, the ELC map should discriminate different adaptive scenarios in a given target area. A poorer fit is to be expected between the adaptive characteristics of a smaller group of species and the resulting map (see Parra-Quijano et al., 2012).
       2. Map by species / gene pool / group of related species (specific map). It defines in more detail the key environments for a particular species or a limited set of genetically related species. A good fit between the map and the adaptive characteristics of the target species is expected.
  22. 22. ELC mapas tool results • Maps (which can be opened with DIVA-GIS) and tables describing each category.
  23. 23. ECOGEO: performs the ecogeographical characterization of geo-referenced collecting sites
  24. 24. ECOGEO is a characterization NOT of the germplasm but of the collecting site. [Figure: examples of germplasm characterization: internode length = 5.56 cm on a 0 to 10 cm ruler; presence/absence scoring (present = 1, absent = 0)]
  25. 25. Process of ecogeographical characterization. Passport data (including X and Y coordinates) are overlaid in a GIS on layers such as elevation, average annual temperature, soil organic carbon, soil pH, …. The result is a characterization matrix. Rows: germplasm identifier. Columns: ecogeographical descriptors.
  26. 26. Point or radial extraction? [Figure: raster of ecogeographical variable X (cell values 1 to 4, with NA cells) overlaid with the passport data entries a, b and c. Panel labels: distribution of passport data entries; GIS overlap; extraction results; true location; GEOQUAL uncertainty (a = 68, b = 65, c = 50); radius; radial extraction. Tables (ACCENUMB: VARIABLE): a: NA, b: NA, c: 2; and a: NA (1), b: 1, c: 3.]
  27. 27. Results of radial extraction (GIS overlap)
       ACCENUMB   CAPTURED VALUES   AVERAGE
       a          NA,1,1            1
       b          NA,1,1            1
       c          3,2,1,3,2,3       2.333
       Comparison of extraction methods:
       ACCENUMB   Correct extraction   Point extraction   Radial extraction
       a          1                    NA                 1
       b          1                    NA                 1
       c          3                    2                  2.333
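The difference between the two extraction modes can be sketched on a small grid. This Python example is an illustration under stated assumptions, not ECOGEO's code: the grid values are hypothetical, `None` stands for NA, the radius is expressed in cell units, and the radial value is taken as the mean of the non-NA cells captured, as in the averages shown above.

```python
from math import hypot


def point_extract(grid, row, col):
    """Value of the single cell the coordinates fall in (None means NA)."""
    return grid[row][col]


def radial_extract(grid, row, col, radius):
    """Mean of the non-NA cells whose centre lies within radius of the target cell."""
    captured = [
        value
        for r, line in enumerate(grid)
        for c, value in enumerate(line)
        if hypot(r - row, c - col) <= radius
    ]
    valid = [value for value in captured if value is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical raster; None marks cells without data.
grid = [
    [None, None, 1, 3],
    [None, 1, 1, 3],
    [2, 1, 3, 2],
    [4, 3, 2, 3],
]
print(point_extract(grid, 0, 1))      # the accession falls on an NA cell
print(radial_extract(grid, 0, 1, 1))  # neighbouring cells supply a value
```

Point extraction returns NA whenever the recorded coordinates land on an empty cell, while radial extraction recovers a value from nearby cells at the cost of smoothing.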
  28. 28. Characterization matrix: data analysis. [Figure: cluster analysis of the ecogeographic characterization (hclust, "average" method, on ecogeodist; vertical axis: Height). Figure: ordination plot with eigenvalues (d = 1) for the variables DECLATITUDE, alt, northness, slope, bio_18, bio_1, t_clay, t_sand, t_oc, t_silt and t_ph_h2o.]