Big data ciat april_2014_dj_et_slideshare


Presenters: Daniel Jiménez (Leader of the Big Data expert group, DAPA) & Edgar Torres (Leader, Rice Program, AGROBIODIVERSITY)

Title: BIG DATA: BIG DATA ANALYSIS: is it a solution to understand Big Problems? The case of yield variation of rice in Colombia

Cukier and Mayer-Schönberger (2013) stated: “As the telescope enabled us to comprehend the universe and the microscope allowed us to understand germs, the new techniques for collecting and analyzing information will help us to make sense of our world in ways we are just starting to appreciate.” We subscribe to this view. In agriculture we now have the capacity to capture, analyze, store and share information in ways that were considered science fiction 10 years ago. The amount and variety of agricultural data generated by multiple individuals and organizations, using a huge range of techniques and technologies, is growing exponentially. We believe that the next agricultural (r)evolution will come from innovation systems that harness agricultural data from multiple sources to generate new knowledge that increases agricultural productivity, moving beyond blanket technological solutions towards a system of dynamic site-specific management that is sensitive and responsive to climate, soil and local socio-economic conditions.

In this seminar, CIAT's researchers will share how several databases, collected for different purposes and shared by FEDEARROZ (the country-wide association of rice growers in Colombia), have been used to obtain important insights that help FEDEARROZ manage rice more efficiently at the site-specific level.

Mayer-Schönberger, V., Cukier, K., 2013. Big Data: A Revolution That Will Transform How We Live, Work and Think.

Published in: Data & Analytics, Technology

  1. BIG DATA: BIG DATA ANALYSIS: is it a solution to understand big problems? Rice Program (Agrobiodiversity) & Big Data expert group (DAPA)
  2. Applying the principles of Big Data to research in agriculture
     Computational models are tailored to the analysis of the data rather than data to a particular methodology, as researchers have done for over a century
     • Big Data refers to things that one can do at a large scale that cannot be done at a smaller one to extract new insights
     • Sometimes to inform is better than to explain
       – Looking for patterns or associations
     • Approaching “N=All”
     • Adding value to secondary databases
     Big Data (Foreign Affairs magazine / McKinsey's High Tech)… Cukier and Mayer-Schönberger (2013)
  3. How?
     • Using ICTs to collect (Android apps), analyze (traditional and machine learning techniques) and share data (in a way that facilitates decision making at different levels and for different users)
     • Analytical approaches tailored to the analysis of the data rather than data to a particular methodology, as researchers have done for over a century
     • Development of tools as part of a close dialogue with end-users
  4. How? Maximizing productivity in agricultural systems. Working with secondary databases
     Climate + Soil + Crop management (including varieties) = Productivity/ha
     % ? + % ? + % ? = To explain (100 %)
     • To identify the combination of factors that lead to high and low productivity (empirical approaches, machine learning)
     • Within the framework of the “Convenio MADR-CIAT” climate change project: adaptation strategy
  5. The problem: in Colombia, yields at the farm level have fallen significantly since 2009
     [Figure: Trends in Rice Production, Harvested Area and Yield in Colombia, 1990-2012. Series: Area, Production, Yield. Axes: thousand tons or ha; t/ha. Source: USDA-PSD]
  6. And what are the causes of this yield reduction?
     • We can see similar problems in Central America, Ecuador, Peru and Venezuela
     • Yield reductions are causing heavy losses to rice farmers
     • No single factor is responsible: drought, high minimum temperatures, low light, high humidity, bacteria, mites, fungi, lack of adaptation, etc.
     Low yields are caused by Burkholderia glumae!
  7. Misdiagnosis, wrong treatments and excessive pesticide applications are causing other problems (Hoja Blanca). Not eco-efficient. And to make the problem worse, farmers want a “magical cure”
  8. How can we manage this problem?
     • Water harvest: reducing stress caused by lack of water
     • Better agronomy: key points, crop rotation and regulations
     • Improved cultivars: increasing yield potential, protecting yield, adding value
     Is there something missing here?
  9. AMTEC: Massive Adoption of Technology
     OBJECTIVES
     • To jointly transfer the technology available for crop management
     • To increase productivity and reduce production costs, with the least environmental impact, in a context of social responsibility
     • To aim for competitiveness and profitability of rice farmers in Colombia
     TECHNOLOGY TRANSFER
     • Field days
     • Planning and good management practices
     • Visits to research centers
     • Demonstration trials
     • Cost reduction
  10. AMTEC results from 2012 and 2013: agronomy helps a lot! (Source: Fedearroz)
      County        | AMTEC yield (t/ha) | AMTEC cost (US$/t) | Farmer yield (t/ha) | Farmer cost (US$/t) | AMTEC vs Farmer yield (t/ha) | AMTEC vs Farmer cost (US$/t)
      El Juncal     | 6.50 | 417   | 5.30 | 614   | 1.20 | -197
      Ibagué        | 7.96 | 338   | 6.90 | 456   | 1.06 | -118
      Norte Tolima  | 7.48 | 366   | 6.29 | 485   | 1.19 | -119
      Montería      | 6.38 | 323   | 4.68 | 470   | 1.70 | -147
      Zulia         | 6.56 | 328   | 5.79 | 370   | 0.77 | -42
      Pompeya       | 5.70 | 309   | 4.30 | 503   | 1.40 | -194
      María La Baja | 8.75 | 248   | 6.13 | 333   | 2.62 | -85
      Pompeya       | 4.30 | 475   | 3.36 | 600   | 0.94 | -125
      Ibagué        | 8.66 | 322   | 7.23 | 406   | 1.43 | -84
      Fundación     | 6.53 | 299   | 5.60 | 384   | 0.93 | -85
      Casanare      | 5.90 | 319   | 5.20 | 434   | 0.70 | -115
      Average       | 6.79 | 340.4 | 5.52 | 459.5 | 1.27 | -119.1
  11. Gene discovery
      • Emerging pathogen: Burkholderia glumae, which produces grain sterility
      • Sources of tolerance identified
      • Tolerant genotype showing 60% less damage than susceptible genotypes (field evaluation: susceptible vs tolerant)
      • Molecular markers are being developed to speed up the transfer of this trait into elite germplasm
  12. Breeding pipeline
      Stages: Trait Discovery; Gene Discovery & Marker Applications; Germplasm Enhancement; Elite Breeding
      • QTL mapping; QTL validation; functional marker identification
      • MABC; recurrent selection; genomic selection
      • inbred FLAR; CIRAD & hybrids-HIAAL; MET
      • trait value characterization; screening methods; donor identification; population development; sequencing; gene validation
  13. Our strategic partner for Rice Research in Colombia
      • TECHNOLOGY TRANSFER (25 agronomists)
      • RESEARCH: BREEDING AND AGRONOMY (45 researchers)
        – Breeding (Conventional 7)
        – Agronomy (Physiology 3, Phytopathology 1, Soils 2, Water 2, Crop Management 26, Biotech 3, Weeds 1)
      • ECONOMICS (7 officials): updated socio-economic studies
  14. Databases… plenty of information
      “Data is no longer regarded as static, whose usefulness is finished once the purpose for which it was collected is achieved”
      • National Survey
        – Purpose: keep the crop sector updated
        – N = 738 cropping events
      • Harvesting records
        – Purpose: technical research (crop management, soils, breeding, biotechnology, physiology)
        – N = 3193 cropping events
      • Information on: planting and harvesting date, productivity, grain humidity, variety, cropping system
      • Zones: Caribbean, Andean (Tolima), Plains (Llanos)
  15. Adding value to secondary databases. The case of information on cropping events of rice in Colombia
      • Planting date experiments (field trials)
        – Purpose: technical research on the best sowing date
        – N = 272 cropping events
      • Climate
        – About 27 weather stations
      Adding value to secondary databases… but first, merging databases: a challenging task!
  16. Letting the data speak
      “Before Big Data, our analysis was usually limited to testing a small number of hypotheses that we defined well before we even collected the data. When we let the data speak, we can make connections that we had never thought existed” Cukier and Mayer-Schönberger (2013)
  17. Hypothesis: yield variation is associated with climate
      • A cropping event in rice, from sowing to harvest (crop time) = 120 days
      • Climate series for all variables
  18. Letting the data speak
      Multivariate analysis for Saldaña (research station, Andean zone): cropping events (2007 to 2012)
      • FEDEARROZ 733: 27 % of productivity variation explained
      • Lagunas: 47 % of productivity variation explained
      [Figure labels: FEDEARROZ 733; N = 189; N = 63; Cimarrón Barinas]
  19. Letting the data speak
      Climate and analysis based on phenological stages in Saldaña (research station), Andean zone, 2007–2012 (N = about 800 cropping events, irrigated rice)
      Climate accounts for 30% to 40% of production variability in irrigated rice
      • The crop sector can suggest the best planting date to farmers
      • By assessing the same approach in other stations (environments): new insights for future breeding
      • Adaptation strategy for climate change
  20. Letting the data speak
      Climate and analysis based on phenological stages in the Colombian Plains, 2007–2012 (N = about 500 cropping events, upland rice)
      • Rainfall is a critical driving factor for upland rice during grain filling and panicle initiation
      • Machine learning (MLP)
      Again, climate accounts for 30% to 40% of production variability in upland rice
  21. Letting the data speak
      Climate and analysis based on phenological stages in the Colombian Plains, 2007–2012 (N = about 200 cropping events, upland rice, variety F174)
      • Temperature is a critical driving factor for variety F174 (upland rice) during grain filling
      • Machine learning (MLP)
      This time climate explained more than 40% of production variability in upland rice variety F174
  22. Case study: working with secondary databases. Seasonal forecast, niñ@s & Big Data. Rice in Colombia (Pompeya, Llanos)
      What is likely to happen in March-April-May 2014?
      We generated 24 clusters based on more than 500 cropping events
      • Seasonal forecast + (data) Best technologies + Big Data analysis = Better adaptive responses to CC and CV
      Cluster 7
      Rice variety | Productivity (kg/ha) | Cropping events
      F174         | 4,564                | 31
      FORTALEZA    | 3,543                | 17
      F2000        | 4,977                | 8
      LAGUNAS      | 5,052                | 6
      MOCARI       | 4,604                | 6
  23. What can we do with these results?
      FLAR and CIAT rice breeders
      • Better understanding of yield and its formation under changing, complex and extremely variable conditions
      • New breeding objectives such as low light tolerance, pattern of biomass accumulation, etc.
      • Better definition of environments
      FEDEARROZ
      • Reduce pesticide applications, since it has been demonstrated that other factors are behind the yield variation
      • Establish planting dates and new crop systems based on crop rotation
      • Establish a dynamic system for crop management based on short-term prediction to manage the risk associated with changing conditions
      CGIAR
      • Expand this experience to other crops and areas
      • Understand the importance of FARMERS' ORGANIZATIONS for achieving impact
      • Interesting concept for CCAFS, GRiSP, MAIZE and others
  24. Concluding remarks and perspectives
      • The analytical approach used demonstrated that variation in rice productivity can be associated with climate (30-45%)
      • Internal cooperation between research areas within CIAT, together with external cooperation with FEDEARROZ, is a powerful combination. Multidisciplinary work is also key!
      • As long as the information is available, the approach can be applied to any other region or crop
      • CCAFS is keen to integrate: CN selected CSMS (CIAT-FLAR-IRRI)
      • Start collaborations with the yield gap taskforce
      • Encourage other partners in LAC to collect information and be part of this idea (e.g. strategy of FLAR), and add value to information that has already been collected
  25. Concluding remarks and perspectives
      Modern information technology, Big Data, Site-specific Management/Agriculture, digital soil mapping, Terra I and Bio-informatics are already here… A new “Ageekulture” can be regarded as complementary to CIAT's traditional research in order to fulfill the center's mission
  26. THANK YOU!!!
      • Patricia Guzman (FEDEARROZ)
      • Nestor Gutierrez (FEDEARROZ)
      • Jose Levis (FEDEARROZ)
      • Gabriel Garces (FEDEARROZ)
      • Andy Jarvis (CC expert)
      • Edgar Torres (Rice Breeder)
      • Daniel Jiménez (Agronomist)
      • Camila Rebolledo (Plant Physiologist)
      • Sylvain Delerce (Agronomist / Math background)
      • Hugo Dorado (Statistician)
      • Armando Muñoz (Biologist)
      • Victor Patiño (Statistician)
      • Juan Felipe Rodriguez (The computer science component)
      MADR, FEDEARROZ, CCAFS, GRiSP
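
The workflow the slides describe (attach a climate series to each cropping event, then ask how much of the yield variation climate accounts for) can be illustrated with a small sketch. This is not the analysis used in the presentation, which relied on multivariate analysis and a multilayer perceptron (MLP). A simple one-variable linear fit stands in here, and every name and number below (`climate_window_mean`, `r_squared`, the temperature series, the yields) is made up for illustration. Only the 120-day crop time comes from the slides.

```python
from datetime import date, timedelta
from statistics import mean


def climate_window_mean(daily, sowing, harvest):
    """Mean of a daily climate variable from sowing to harvest (inclusive)."""
    values = [v for day, v in daily.items() if sowing <= day <= harvest]
    if not values:
        raise ValueError("no climate records inside the cropping window")
    return mean(values)


def r_squared(x, y):
    """Share of the variance in y explained by a simple linear fit on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Made-up daily minimum temperature series (deg C) for one weather station.
first_day = date(2012, 1, 1)
daily_tmin = {first_day + timedelta(days=i): 22.0 + 0.01 * i for i in range(400)}

# Made-up cropping events: (sowing date, yield in t/ha).
events = [
    (date(2012, 1, 10), 6.4),
    (date(2012, 3, 1), 6.0),
    (date(2012, 5, 15), 5.1),
    (date(2012, 7, 20), 5.6),
    (date(2012, 9, 5), 4.9),
]
CROP_DAYS = 120  # crop time from sowing to harvest, as stated in the slides

# One climate feature per cropping event: mean Tmin over the crop cycle.
x = [
    climate_window_mean(daily_tmin, sowing, sowing + timedelta(days=CROP_DAYS - 1))
    for sowing, _ in events
]
y = [yield_t_ha for _, yield_t_ha in events]

print(f"Share of yield variation explained by mean Tmin: {r_squared(x, y):.0%}")
```

In practice the window would be split by phenological stage (for example panicle initiation and grain filling) and several climate variables would be combined, which is where multivariate or machine learning methods become necessary.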