

Advances in gene-based crop modeling

  1. Gene-Based Crop Modeling J. W. Jones, M. J. Correll, K. J. Boote, S. Gezan, and C. E. Vallejos CIAT Aug 4, 2015 Source: Monica Ozores-Hampton
  2. Discussion Outline: Our Work in Modeling
     • Crop models can be considered as non-linear functions
     • Estimate GSPs (genetic coefficients), then fit a linear statistical model of GSPs vs. QTLs
     • Develop new statistical linear mixed effects models of G, E, and GxE for different processes
       - E.g., flowering date, node addition rate, leaf size, max number of MS nodes, …
     • Integrate new relationships into the existing DSSAT CROPGRO-Bean model
     • Develop component process modules using linear or nonlinear mixed effects models of traits vs. QTLs and environmental factors; combine them to demonstrate a modular approach
     • Future: compare genomic prediction for beans, similar to Technow et al. (PLoS ONE, 2015)
  3. Dynamic Crop Models
     • Dynamic: variables of interest change over time (state variables)
     • Environment also changes over time
     • A system of equations, not just a single variable to predict
     • Variables interact, typically in highly non-linear ways that vary over time
     • There is no single equation that calculates the response of interest (e.g., final yield of a crop)
     • Final yield (and other variables) may reach their final values in many different ways, depending on genetics and environment
  4. General Form of a Dynamic System Model: Discrete Time/Difference Equation
     Difference equation form, when the time step equals 1 (e.g., 1 day):
     $$\begin{aligned}
     U_{1,t+1} &= U_{1,t} + g_1[U_t, X_t, \theta] \\
     U_{2,t+1} &= U_{2,t} + g_2[U_t, X_t, \theta] \\
     &\;\;\vdots \\
     U_{S,t+1} &= U_{S,t} + g_S[U_t, X_t, \theta]
     \end{aligned}$$
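The difference-equation form can be sketched in a few lines of Python. This is only an illustration of the update rule, not DSSAT code: the two state variables, the rate functions `g_leaf` and `g_stem`, the single driver value per day, and every parameter value are assumptions invented for the example.

```python
from typing import Callable, Dict, List

State = Dict[str, float]
Params = Dict[str, float]
RateFn = Callable[[State, float, Params], float]


def step(u: State, x: float, theta: Params, rates: Dict[str, RateFn]) -> State:
    """One daily step: U[i, t+1] = U[i, t] + g_i(U_t, X_t, theta)."""
    # Every rate is evaluated from the state at time t before any update.
    return {name: u[name] + g(u, x, theta) for name, g in rates.items()}


def g_leaf(u: State, x: float, theta: Params) -> float:
    # Hypothetical rate: growth driven by x, slowing as leaf mass nears a cap.
    return theta["k_leaf"] * x * (1.0 - u["leaf"] / theta["leaf_max"])


def g_stem(u: State, x: float, theta: Params) -> float:
    # Hypothetical rate: stem growth proportional to current leaf mass.
    return theta["k_stem"] * u["leaf"]


def simulate(u0: State, xs: List[float], theta: Params,
             rates: Dict[str, RateFn]) -> List[State]:
    """Return the full trajectory, one state per time step (including t = 0)."""
    trajectory = [dict(u0)]
    for x in xs:
        trajectory.append(step(trajectory[-1], x, theta, rates))
    return trajectory


theta = {"k_leaf": 0.5, "leaf_max": 100.0, "k_stem": 0.1}  # made-up values
rates = {"leaf": g_leaf, "stem": g_stem}
trajectory = simulate({"leaf": 1.0, "stem": 0.0}, [20.0] * 3, theta, rates)
```

Because the stem rate depends on leaf mass, the two equations cannot be solved independently, which is the sense in which the model is a system rather than a single equation.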
  5. Dynamic System Model as a Response Model
     Example: final yield response to all variables during a season
     $$Y = f(X; \theta)$$
     where X represents all explanatory variables during a season, θ represents all parameters of the dynamic model, and f represents a function (typically an implicit function).
     • We could write this as Y = simulated final grain biomass at harvest time, T, as affected by explanatory variables (e.g., irrigation applied during a season) and by all parameters
  6. Example of Response Simulated by Crop Model
  7. How Simulation Computes Responses
     Figure 1.3. Computer program flow diagram showing how a simulation model is used as a function: any time a response is needed, the simulation is run to calculate the state variables for every time step, but it returns only the value of the selected state variable at the time of interest. In this case, we are interested in Y at time t = 140.
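The idea of using a simulation as a function can be sketched as follows. This is a toy under stated assumptions: the single state variable, the growth rule, and the parameter names `w0` and `rue` are hypothetical and stand in for a real crop model.

```python
from typing import Dict, List


def simulate_states(theta: Dict[str, float], weather: List[float],
                    n_days: int) -> List[float]:
    """Run the daily loop and return the state for every time step 0..n_days."""
    biomass = [theta["w0"]]
    for t in range(n_days):
        # Hypothetical growth rule: daily gain proportional to the driver.
        biomass.append(biomass[-1] + theta["rue"] * weather[t])
    return biomass


def response(theta: Dict[str, float], weather: List[float],
             t_interest: int = 140) -> float:
    """Y = f(X; theta): run the whole simulation, return one value only."""
    states = simulate_states(theta, weather, t_interest)
    return states[t_interest]


theta = {"w0": 1.0, "rue": 0.5}   # made-up parameter values
weather = [10.0] * 200            # made-up daily driver
y_140 = response(theta, weather)
```

Every call to `response` reruns all 140 daily steps, so the function is implicit: there is no closed-form expression for Y, only the result of the loop.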
  8. Genotype-Specific Parameters (GSPs)
     • Quantities in the model that represent variations in crop performance across cultivars or lines
     • GSPs are the same as the “cultivar coefficients” that have been used routinely in the models contained in DSSAT
     • Examples
       - Phenology, e.g., duration to first flower under optimal conditions
       - Size of leaves on the main stem
       - Maximum rate of node appearance on the main stem under optimal conditions
       - Number of seeds per pod (or per ear in maize)
     • Must be known for each cultivar to simulate its performance
  9. Example, DSSAT CROPGRO-Bean Model using GSPs
     [Figure: simulated and observed leaf, stem, and seed mass vs. days after sowing for Jatu-Rong and Porrillo S.]
     Figure 6. Time course of leaf, stem, and seed mass accumulation of Jatu-Rong (Andean) and Porrillo Sintetico (Meso-American) cultivars relative to time of first flower (Flw), first seed (Sd), and beginning maturity (R7), grown at Palmira, Colombia (data from Sexton et al., 1994, 1997).
  10. Application of Crop Models
      [Flow diagram with elements: Genotypes; GSPs; Environment, Management Data; Bean Crop Model; Sim Phenotypic Responses; Selection for Optimal Responses; Iterative Exploration (G, M)]
  11. GSP Example - TRIFL
      • TRIFL is a GSP in the existing bean model
      • TRIFL is the maximum rate of node appearance on the main stem, number per day
      • Temperature has a major effect on how rapidly new nodes appear on the main stem
      • The model in the DSSAT common bean model is:
      $$NAR(t) = TRIFL \cdot \frac{1}{24} \sum_{h=1}^{24} \frac{T_h^* - T_{base}}{T_{opt1} - T_{base}}$$
      where
      NAR(t) = rate of new node or leaf appearance on the main stem on day t, #/day,
      TRIFL = maximum node/main stem leaf addition rate, number per day,
      Tbase = base temperature, below which the rate is 0.0, °C,
      Topt1 = temperature above which the node addition rate remains at its maximum value, °C,
      Thour = hourly temperature in the field where the crop is growing, °C, and
      $$T_h^* = \begin{cases} T_{base} & \text{if } T_{hour} < T_{base} \\ T_{hour} & \text{if } T_{base} \le T_{hour} \le T_{opt1} \\ T_{opt1} & \text{if } T_{opt1} < T_{hour} \end{cases}$$
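A minimal Python sketch of this node addition rate calculation follows. It assumes the 1/24 factor on the slide averages the hourly temperature response over the 24 hours of day t. The function names and the numeric values for TRIFL, Tbase, and Topt1 are placeholders for illustration, not values from the bean model.

```python
from typing import Sequence


def clamp_temperature(t_hour: float, t_base: float, t_opt1: float) -> float:
    """Th*: hourly temperature limited to the range [Tbase, Topt1]."""
    return min(max(t_hour, t_base), t_opt1)


def node_addition_rate(hourly_temps: Sequence[float], trifl: float,
                       t_base: float, t_opt1: float) -> float:
    """NAR(t) in nodes per day from 24 hourly temperatures (degrees C)."""
    if len(hourly_temps) != 24:
        raise ValueError("expected 24 hourly temperatures")
    # Each hour contributes a fraction between 0 (at or below Tbase)
    # and 1 (at or above Topt1); the daily rate is TRIFL times the mean.
    total = sum(
        (clamp_temperature(t, t_base, t_opt1) - t_base) / (t_opt1 - t_base)
        for t in hourly_temps
    )
    return trifl * total / 24.0


# Placeholder inputs: a constant 20 C day, halfway between Tbase and Topt1.
nar_today = node_addition_rate([20.0] * 24, trifl=0.3, t_base=10.0, t_opt1=30.0)
```

With these placeholder inputs every hour contributes 0.5, so the rate is half of TRIFL. A day spent entirely above Topt1 returns TRIFL itself, and a day entirely below Tbase returns 0.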
  12. TRIFL Example (continued)
      • TRIFL is a GSP
      • Tbase and Topt1 are not GSPs, but are species-dependent parameters in the current bean model
      • Also, TRIFL has been held fixed across cultivars in the past due to lack of information
      • We now know that TRIFL varies significantly across lines/cultivars, based on our NSF study
      • What about Tbase and Topt1?
      • An example will be given later in the week on how this new information is affecting how we model beans
  13. Estimating GSPs
      • Data are needed for each cultivar or genotype
      • In our NSF study, we had over 180 genotypes, and for each of them, we had observations in the field at 5 locations
      • These data were used to estimate GSPs, as will be shown later in the workshop
      • The basic idea is that we use the multi-location experiment phenotypic data:
        - Set initial GSPs as input to the simulation
        - Compare simulated and observed phenotypic data
        - Compute a measure of how close the simulated phenotypic data are to the observed data
        - Vary the GSPs and search the range of feasible values until a criterion is met, such as minimizing the sum of squared differences (errors) (e.g., MSE basis) or maximizing a likelihood function
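The estimation loop described above can be sketched with a deliberately simple search. The sketch assumes a toy one-parameter simulator and a plain grid search over the feasible range; it is not the estimation method used in the study. The simulator, the site data, and the search bounds are all synthetic and invented for the example.

```python
from typing import List, Sequence, Tuple

Site = Tuple[Sequence[float], Sequence[float]]  # (daily temp factors, observed)


def simulate_nodes(rate: float, temp_factors: Sequence[float]) -> List[float]:
    """Toy simulator: cumulative node count for a maximum rate (the GSP)
    and daily temperature factors in [0, 1]."""
    nodes, out = 0.0, []
    for factor in temp_factors:
        nodes += rate * factor
        out.append(nodes)
    return out


def sum_squared_error(rate: float, sites: Sequence[Site]) -> float:
    """Closeness measure: squared simulated-minus-observed, summed over sites."""
    total = 0.0
    for temp_factors, observed in sites:
        simulated = simulate_nodes(rate, temp_factors)
        total += sum((s - o) ** 2 for s, o in zip(simulated, observed))
    return total


def estimate_gsp(sites: Sequence[Site], lo: float = 0.05, hi: float = 0.60,
                 n: int = 56) -> float:
    """Search the feasible range [lo, hi] and keep the best candidate."""
    candidates = [lo + i * (hi - lo) / (n - 1) for i in range(n)]
    return min(candidates, key=lambda rate: sum_squared_error(rate, sites))


# Synthetic "observations" generated with a known rate of 0.30 at two sites.
factors_a = [1.0, 0.8, 0.6, 1.0]
factors_b = [0.5, 0.5, 0.9, 0.7]
sites = [(factors_a, simulate_nodes(0.30, factors_a)),
         (factors_b, simulate_nodes(0.30, factors_b))]
best_rate = estimate_gsp(sites)
```

Pooling the error over sites is what lets a single genotype-level value be estimated from multi-location data. A likelihood-based criterion would replace `sum_squared_error` without changing the structure of the loop.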
  14. GSP Estimation: Various Approaches, including Bayesian MCMC for Model Development, Genomic Prediction, etc.
      [Flow diagram with elements: RILs; QTLs (~traits); GSPs; Multi-Location Experiments; Phenotypic Data; Environment, Management Data; Bean Crop Model; Sim Phenotypic Responses; Error/Likelihood; Iterative Estimation; GSP* & QTL effects]
  15. Adding Genetic Information for Application of Crop Models (Ideotype Design, Selection of G, M for E, Genomic Prediction)
      [Flow diagram with elements: Genotypes; QTLs; GSPs; Environment, Management Data; Bean Crop Model; Sim Phenotypic Responses; Selection for Optimal Responses; Iterative Exploration (G, M)]
  16. Need for a new gene-based model
      • Current approaches: develop relationships between GSPs and QTLs (e.g., White and Hoogenboom, 1996, 2003; Messina et al., 2006; etc.)
      • Why not continue this?
        - Current models do not include GSPs for all processes and traits that we now know are under genetic control (examples from this study)
        - May need to modify environmental effects and interactions in the model
        - Current crop models are not ideally structured to make all of the changes that are needed
        - Major changes are likely needed in many places, although some code may be reusable
        - Although some existing crop models are modular, new fine-grained modules are needed that are designed around what we are now learning about genetic control of processes, so that they can be easily modified as more is learned
  17. Example Results After Incorporating* Gene-Based Component in CROPGRO-Bean
      [Figure, two panels vs. days after planting. Main Stem Node Number: leaf number for Jamapa QTLs (-1) and Calima QTLs (+1), 0.3 m ro. Biomass and Pod Mass, kg/ha: grain weight and tops weight for Jamapa QTLs (-1) and Calima QTLs (+1), 0.3 m ro.]
      * Incorporated NAR to compute TRIFL only
  18. New Modular Approach
      • Need to account for G x E x M interactions on processes
      • Need to design for evolution as more knowledge about genetic effects on crop components is obtained
      • Example: gene-based model of bean leaf area
      • Design modules with QTL effects on CM processes
      • Still a work in progress
  19. Linear Mixed Effects Model for NAR(t)
      $$\begin{aligned}
      NAR(t) ={} & 0.252 + 0.021\,(\mathrm{TEMP} - 21.51) - 0.005\,(\mathrm{SRAD} - 17.38) - 0.004\,(\mathrm{DL} - 12.74) \\
      & - 0.010\,\mathrm{Bng072} - 0.032\,\mathrm{FIN} + 0.009\,\mathrm{Bng083} - 0.008\,\mathrm{Dim7\text{-}7} \\
      & - 0.004\,\mathrm{Bng072}\,(\mathrm{DL} - 12.74) - 0.003\,\mathrm{FIN}\,(\mathrm{TEMP} - 21.51)
      \end{aligned}$$
      where
      Bng072 = marker for QTL found to influence NAR, equal to +1 for Calima and -1 for Jamapa parental lines
      Bng083 = marker for QTL found to influence NAR, equal to +1 for Calima and -1 for Jamapa parental lines
      DL = average daylength during the time when nodes were being added in genotype g at site s, h
      DLmean = average daylength across sites in the experiment during node addition, h
      Dim7-7 = gene or QTL found to influence NAR, equal to +1 for Calima and -1 for Jamapa parental lines
      FIN = gene or QTL found to influence NAR, equal to +1 for Calima and -1 for Jamapa parental lines
      NAR(t) = node addition rate, nodes per day added to the main stem for genotype g grown at site s
      SRAD = average SRAD across sites in the experiments, MJ m-2 d-1
      TEMP = average of daily mean temperature during the time when nodes were added, °C
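The fitted equation can be evaluated directly, as in the sketch below. The coefficients are read from the slide. The grouping of each environmental variable with the number that follows it (for example TEMP - 21.51) is an assumption, since the parentheses did not survive in the transcript. The function and argument names are hypothetical.

```python
def nar_linear(temp: float, srad: float, dl: float,
               bng072: int, fin: int, bng083: int, dim7_7: int) -> float:
    """Node addition rate (nodes per day) from the linear mixed effects model.

    Marker arguments take +1 for the Calima allele and -1 for the Jamapa allele.
    Environmental variables are centred on 21.51, 17.38, and 12.74 (assumed
    to be the experiment means).
    """
    temp_c = temp - 21.51
    srad_c = srad - 17.38
    dl_c = dl - 12.74
    return (0.252
            + 0.021 * temp_c
            - 0.005 * srad_c
            - 0.004 * dl_c
            - 0.010 * bng072
            - 0.032 * fin
            + 0.009 * bng083
            - 0.008 * dim7_7
            - 0.004 * bng072 * dl_c     # marker x daylength interaction
            - 0.003 * fin * temp_c)     # marker x temperature interaction


# Both parents evaluated at the centring values, so only marker effects remain.
mean_env = dict(temp=21.51, srad=17.38, dl=12.74)
nar_calima = nar_linear(**mean_env, bng072=1, fin=1, bng083=1, dim7_7=1)
nar_jamapa = nar_linear(**mean_env, bng072=-1, fin=-1, bng083=-1, dim7_7=-1)
```

At the centring values the environmental and interaction terms vanish, giving about 0.211 nodes per day for the all-Calima marker set and about 0.293 for the all-Jamapa set under this reading of the equation.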
  20. NAR vs. Temperature, Parent Lines
      [Figure: node addition rate (#/d) vs. temperature (°C). Panel (a): Jamapa (-1) and Calima (+1). Panel (b): Jamapa (-1), Jamapa with Calima FIN, Calima (+1), Calima with Jamapa FIN.]
  21. Modular Approach Example of a module: model that computes node addition rate on day t (NAR(t))
  22. But, is Linear Model Adequate?
      • We know that temperature effects on most crop growth processes are nonlinear
      • Also, this linear model uses the mean temperature during the observation period, whereas we know that plants respond non-linearly to temperature, which should be considered hourly
      • So, modules need to be dynamic and include nonlinear effects
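  23. Example Nonlinear Model Formulation
      $$NAR(t) = a_0 + a_1 \cdot \frac{1}{24} \sum_{h=1}^{24} \frac{T_h^* - T_{base}}{T_{opt1} - T_{base}} + a_2 \cdot \left(DL(t) - \overline{DL}\right) + a_3 \cdot (SRAD(t) - \ldots$$
      $$T_h^* = \begin{cases} T_{base} & \text{if } T_{hour} < T_{base} \\ T_{hour} & \text{if } T_{base} \le T_{hour} \le T_{opt1} \\ T_{opt1} & \text{if } T_{opt1} < T_{hour} \end{cases}$$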
  24. What About GSPs?
      • What are the GSPs in the above equation?
      • Are they constant across environments?
      • Does this nonlinear formulation make sense relative to the physiological process and what we know?
      • Is it sufficiently robust? How can we determine this?
      • Will the GSPs in this equation remain fixed across genotypes? Environments? Management?
      • Will “calibration” be needed after fitting these equations to field data? If so, how will this differ from what we now do?
      • We should formulate nonlinear models based on mechanistic knowledge, then estimate parameters using data from a genetic family across diverse environments.
  25. Simulated Mainstem Nodes vs. Days After Planting
      [Figure: "Simulation: Main Stem Node Number". Node number vs. DAP, one panel per site (CT, ND, PA, PO, PR), for RILs CAL and JAM.]
  26. Area Expansion of Leaves on Main Stem
      [Figure, two sets of panels: "Leaf Area by node, Sim: CAL" and "Leaf Area by node, Sim: JAM". Leaf area (cm^2) vs. DAP, one panel per site (CT, ND, PA, PO), for leaves LA1 to LA5.]
  27. Prediction of Main Stem Leaf Area
      [Figure: "Simulation: Main Stem Leaf Area". Leaf area (cm^2) vs. DAP, one panel per site (CT, ND, PA, PO, PR), for RILs CAL and JAM.]
  28. CommonBean Model: Integrating modules
      [Flow diagram of the integrated modules in the CommonBean model:]
      • Initialize T storage matrix; initialize d = VEDAP; set initial state variables
      • Set hour h = 1
      • Read T for hour h; calculate and update mean T
      • Hour = 24? If no, set h = h + 1 and read the next hour
      • If yes, run the MSNOD.max module, the NAR module, and the LAMS module
      • Day (d) = end point (DAY)? If no, set d = d + 1 and continue; if yes, end the CommonBean model
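The control flow of the integrated model, an hourly temperature loop nested inside a daily loop that calls each module once per day, can be sketched as below. Only the loop structure follows the slide. The three module bodies are hypothetical stand-ins with made-up constants, and the reading of the module names (maximum main stem node number, node addition rate, main stem leaf area) is an assumption.

```python
from typing import Callable, Dict, List, Sequence

State = Dict[str, float]
Module = Callable[[State, List[float], float], State]


def run_common_bean(hourly_temps: Dict[int, Sequence[float]], vedap: int,
                    end_day: int, modules: Sequence[Module],
                    initial_state: State) -> State:
    """Daily loop from d = VEDAP to the end day; each day reads 24 hourly
    temperatures, updates the mean, then runs the modules in order."""
    state = dict(initial_state)
    d = vedap
    while True:
        t_store: List[float] = []          # T storage for the current day
        mean_t = 0.0
        for h in range(1, 25):             # h = 1 .. 24
            t_store.append(hourly_temps[d][h - 1])
            mean_t = sum(t_store) / len(t_store)
        for module in modules:             # MSNOD.max, then NAR, then LAMS
            state = module(state, t_store, mean_t)
        if d == end_day:
            return state
        d += 1


def msnod_max_module(state: State, temps: List[float], mean_t: float) -> State:
    return {**state, "msnod_max": 12.0}                  # made-up constant


def nar_module(state: State, temps: List[float], mean_t: float) -> State:
    nar = 0.01 * mean_t                                  # made-up response
    nodes = min(state["nodes"] + nar, state["msnod_max"])
    return {**state, "nar": nar, "nodes": nodes}


def lams_module(state: State, temps: List[float], mean_t: float) -> State:
    return {**state, "leaf_area": 20.0 * state["nodes"]}  # made-up cm^2 per node


temps = {day: [20.0] * 24 for day in range(5, 8)}        # synthetic weather
final_state = run_common_bean(temps, vedap=5, end_day=7,
                              modules=[msnod_max_module, nar_module, lams_module],
                              initial_state={"nodes": 0.0})
```

Passing the modules in as a list keeps the driver loop unchanged when a module is replaced or a new one is added, which is the point of the modular design.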
  29. Advances in Genomic Prediction
  30. 1550 Doubled Haploid Lines; Synthetic Data Set, Maize; Champaign, IL, 2012, 2013 (Technow et al., 2015)
  31. Model GSPs, which in turn are used in function to predict yield (highly nonlinear)
  32. Crop Model-Based Genomic Prediction outperforms GBLUP
  33. Crop Model-Based Genomic Prediction outperforms GBLUP
      • QTLs estimate yield via crop model function using GSPs
      • QTLs estimate yield via GBLUP
      • Yield = f(4 GSPs, Env)
  34. Discussion
      • Demonstrated benefits of merging crop modeling and genetics
      • Various methods are reasonable
      • Need new G, E nonlinear functions estimated using mixed effects models, physiologically based, with G and E components (management also)
      • Modularity is important, short and long term
      • Paper in Special Issue
      • Genomic prediction with crop models is likely to perform better than other methods (GBLUP)
  35. Discussion