Early generation selection in an intra population recurrent selection breeding program within a synthetic population

  1. Early generation selection in a recurrent selection breeding program within a synthetic population: using genome-wide markers to speed up the process. Seminar on genomic selection, 17/10/2014. Tuong-Vi Cao, UMR AGAP, CIRAD-BIOS
  2. Genomic selection, based on genome-wide genotype-phenotype relations, is a promising approach for breeding: 1. it gives access to more selection candidates (higher selection intensity), and 2. it shortens selection cycles (maximizing genetic gain per unit time). It is all the more attractive because molecular information is becoming more accessible while phenotypic information is becoming the limiting factor in resource allocation.
  3. The upland rice breeding program of CIAT initiated this approach. First results (based on cross-validation within the calibration population data) showed that such an approach is feasible, but that its accuracy is rather low overall. Some reasons have already been pointed out: evaluation in only one year*location, and only additive effects modelled.
  4. My present contribution is about: 1. how the phenotypic predictor may be defined and modelled to take dominance and epistatic interactions into account, and 2. how to integrate markers to further reduce the duration of selection cycles.
  5. What has been done and what is the question? • EELL 2008: four synthetic populations (PCT-4A, PCT-4B, PCT-4C, PCT-11, with MS and MF plants) segregating for the ms gene: ½ [ms ms] + ½ [ms Ms]. Extraction of 100 S0:1 progenies per population on MF plants. • EEP 2010: the S0:1 progenies segregate ¼ [Ms Ms] + ½ [Ms ms] + ¼ [ms ms]. Seed increase through SSD, giving 392 S1:2 progenies segregating for the ms gene. • EEP 2011 A: DNA extraction of 8 S1:2 plants and genotyping for the ms locus.
  7. What has been done and what is the question? • The 392 S1:2 progenies segregate for the ms gene: ¼ [Ms Ms] + ½ [Ms ms] + ¼ [ms ms]. • Choice of one [Ms Ms] plant per S1:2 progeny to constitute the calibration population. • DNA extraction of 15 S2:3 plants per progeny; GBS genotyping to infer the genotype of the S2 plants. • Bulk seed increase to obtain the S2:4 progenies. • Phenotyping of the S2:4 progenies to calibrate the model.
  8. What has been done and what is the question? • Using the S2 population as the base population structure for calibration is an option, because partially fixed material: – is more homogeneous and easier to phenotype (minimum within-progeny variation and maximum between-progeny variation); – minimizes the bias due to dominance effects. • However, it is time- and resource-consuming: – to produce the material needed to calibrate the prediction model (S2 population to be sampled, S2:3 bulks to be genotyped, S2:4 progenies to be phenotyped); – to advance the breeding material to the S2 generation before it can be predicted in each cycle. • Hence, is it possible to save time and resources through: – early phenotyping for calibrating the model? – early prediction of breeding candidates?
  9. Genetic model • For simplicity, let us suppose two biallelic loci M and N. • Let $M_iN_k / M_jN_l$ be a genotype in the S0 generation. • Its genotypic value is ${}^{S_0}G_{ijkl} = A_i + A_j + A_k + A_l + D_{ij} + D_{kl} + AA_{ik} + AA_{il} + AA_{jk} + AA_{jl} + AD_{ijk} + AD_{ijl} + AD_{ikl} + AD_{jkl} + DD_{ijkl}$, where: – $A$: additive effects associated with alleles i or j of the M locus and alleles k or l of the N locus; – $D$: dominance effects associated with the M and N loci respectively; – $AA$: additive*additive epistasis associated with one allele of the M locus and one allele of the N locus; – $AD$: additive*dominance epistasis associated with 2 alleles of one locus and 1 allele of the other locus; – $DD$: dominance*dominance epistasis associated with all alleles.
  10. Genetic model • At meiosis, the genotype produces four gametes with frequencies depending on the recombination rate r: $M_iN_k$: $\tfrac{1-r}{2}$; $M_iN_l$: $\tfrac{r}{2}$; $M_jN_k$: $\tfrac{r}{2}$; $M_jN_l$: $\tfrac{1-r}{2}$. • If selfed, the genotype produces ten genotypes in the S1 generation. Genotypic value for each pair of gametes (rows and columns both in the order $M_iN_k$, $M_iN_l$, $M_jN_k$, $M_jN_l$): – row $M_iN_k$: Giikk, Giikl, Gijkk, Gijkl; – row $M_iN_l$: Giikl, Giill, Gijkl, Gijll; – row $M_jN_k$: Gijkk, Gijkl, Gjjkk, Gjjkl; – row $M_jN_l$: Gijkl, Gijll, Gjjkl, Gjjll.
  11. Genetic model • The ten S1 genotypes have the following frequencies: – Non-recombinant double homozygotes: Giikk ¼ (1-r)², Gjjll ¼ (1-r)²; – Recombinant double homozygotes: Giill ¼ r², Gjjkk ¼ r²; – Non-recombinant double heterozygote: Gijkl ½ (1-r)²; – Recombinant double heterozygote: Gijkl ½ r²; – Partially recombinant genotypes, homozygous for one locus and heterozygous for the other: Giikl ½ r (1-r), Gijkk ½ r (1-r), Gijll ½ r (1-r), Gjjkl ½ r (1-r).
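The frequencies on this slide can be re-derived by pairing the four gametes at random. The following Python sketch is an illustration only, not the author's code; the function names `gamete_freqs` and `s1_freqs` and the "coupling"/"repulsion" labels for the two phases of the double heterozygote are assumptions introduced here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gamete_freqs(r):
    """Gametes produced by the S0 genotype MiNk/MjNl for recombination rate r."""
    half = Fraction(1, 2)
    return {
        ("i", "k"): half * (1 - r),  # non-recombinant gamete MiNk
        ("j", "l"): half * (1 - r),  # non-recombinant gamete MjNl
        ("i", "l"): half * r,        # recombinant gamete MiNl
        ("j", "k"): half * r,        # recombinant gamete MjNk
    }


def s1_freqs(r):
    """Frequencies of the ten S1 classes obtained by selfing MiNk/MjNl."""
    gametes = gamete_freqs(r)
    freqs = Counter()
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        m = "".join(sorted(g1[0] + g2[0]))  # genotype at locus M, e.g. "ij"
        n = "".join(sorted(g1[1] + g2[1]))  # genotype at locus N, e.g. "kl"
        phase = None
        if m == "ij" and n == "kl":
            # The double heterozygote exists in two phases that must be kept apart.
            parental = {("i", "k"), ("j", "l")}
            phase = "coupling" if {g1, g2} == parental else "repulsion"
        freqs[(m + n, phase)] += f1 * f2
    return dict(freqs)


if __name__ == "__main__":
    for genotype, freq in sorted(s1_freqs(Fraction(1, 10)).items(), key=str):
        print(genotype, freq)
```

Using `Fraction` keeps the arithmetic exact, so the ten frequencies can be compared directly with the closed forms on the slide and checked to sum to 1.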
  12. Genetic components of generation means • The frequencies form a vector, V1, associated with the S1 generation. With the genotypes in the order Giikk, Gjjll, Giill, Gjjkk, Gijkl (non-recombinant), Gijkl (recombinant), Giikl, Gijkk, Gijll, Gjjkl: $V_1 = \left(\tfrac14(1-r)^2,\ \tfrac14(1-r)^2,\ \tfrac14 r^2,\ \tfrac14 r^2,\ \tfrac12(1-r)^2,\ \tfrac12 r^2,\ \tfrac12 r(1-r),\ \tfrac12 r(1-r),\ \tfrac12 r(1-r),\ \tfrac12 r(1-r)\right)^{T}$ • If V2 is the vector of frequencies of the S2 generation, then one can find the relationship between V1 and V2 …
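  13. Genetic components of generation means • This relation is $V_2 = M V_1$. • It holds for any pair of successive generations ($V_{n+1} = M V_n$). • The M matrix is used to estimate genotypic values and genetic covariances between successive generations. With rows and columns in the order Giikk, Gjjll, Giill, Gjjkk, Gijkl (non-recombinant), Gijkl (recombinant), Giikl, Gijkk, Gijll, Gjjkl:

      $$M = \begin{pmatrix}
      1 & 0 & 0 & 0 & \tfrac14(1-r)^2 & \tfrac14 r^2 & \tfrac14 & \tfrac14 & 0 & 0 \\
      0 & 1 & 0 & 0 & \tfrac14(1-r)^2 & \tfrac14 r^2 & 0 & 0 & \tfrac14 & \tfrac14 \\
      0 & 0 & 1 & 0 & \tfrac14 r^2 & \tfrac14(1-r)^2 & \tfrac14 & 0 & \tfrac14 & 0 \\
      0 & 0 & 0 & 1 & \tfrac14 r^2 & \tfrac14(1-r)^2 & 0 & \tfrac14 & 0 & \tfrac14 \\
      0 & 0 & 0 & 0 & \tfrac12(1-r)^2 & \tfrac12 r^2 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & \tfrac12 r^2 & \tfrac12(1-r)^2 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & \tfrac12 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & 0 & \tfrac12 & 0 & 0 \\
      0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & 0 & 0 & \tfrac12 & 0 \\
      0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & 0 & 0 & 0 & \tfrac12
      \end{pmatrix}$$

      Ongoing questions: • Is it possible to relate the frequencies of any generation (including RILs) directly to those of the first generation (i.e. the S0 plant or F1 cross)? • If yes, is it also possible to relate any generation mean and genetic covariance to those of the unselfed S0 plant or F1 cross?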
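One way to explore the first ongoing question numerically is to build M and apply it repeatedly to the S0 frequency vector. The sketch below is an illustration under stated assumptions, not the author's procedure: the class labels `ijkl_c` and `ijkl_r` (non-recombinant and recombinant double heterozygote), the function names, and the choice of 60 selfing generations as a stand-in for complete fixation are all introduced here.

```python
# Genotype classes in the same order as the rows and columns of M.
CLASSES = ["iikk", "jjll", "iill", "jjkk", "ijkl_c", "ijkl_r",
           "iikl", "ijkk", "ijll", "jjkl"]


def selfing_matrix(r):
    """Transition matrix M (column = parent class, row = offspring class)."""
    a = (1 - r) ** 2 / 4   # double het -> double homozygote of the same phase
    b = r ** 2 / 4         # double het -> double homozygote of the other phase
    c = r * (1 - r) / 2    # double het -> each single-locus heterozygote
    idx = {name: n for n, name in enumerate(CLASSES)}
    M = [[0.0] * 10 for _ in range(10)]

    def put(child, parent, value):
        M[idx[child]][idx[parent]] = value

    for hom in ("iikk", "jjll", "iill", "jjkk"):
        put(hom, hom, 1.0)  # double homozygotes breed true
    singles = ("iikl", "ijkk", "ijll", "jjkl")
    for parent, same, other in (("ijkl_c", a, b), ("ijkl_r", b, a)):
        put("iikk", parent, same)
        put("jjll", parent, same)
        put("iill", parent, other)
        put("jjkk", parent, other)
        put("ijkl_c", parent, 2 * same)
        put("ijkl_r", parent, 2 * other)
        for het in singles:
            put(het, parent, c)
    # A single-locus heterozygote gives 1/4 : 1/2 : 1/4 at its segregating locus.
    fixed = {"iikl": ("iikk", "iill"), "ijkk": ("iikk", "jjkk"),
             "ijll": ("iill", "jjll"), "jjkl": ("jjkk", "jjll")}
    for het, (hom1, hom2) in fixed.items():
        put(het, het, 0.5)
        put(hom1, het, 0.25)
        put(hom2, het, 0.25)
    return M


def advance(M, v):
    """One generation of selfing: returns M * v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def ril_frequencies(r, generations=60):
    """Approximate RIL frequencies by iterating M from a single S0 plant."""
    M = selfing_matrix(r)
    v = [0.0] * 10
    v[CLASSES.index("ijkl_c")] = 1.0  # S0 plant MiNk/MjNl
    for _ in range(generations):
        v = advance(M, v)
    return dict(zip(CLASSES, v))
```

Applying `advance` once to the S0 vector returns V1, and after many generations the share of recombinant lines (Giill + Gjjkk) approaches the classical value 2r/(1+2r) for recombinant inbred lines obtained by selfing, which gives a convenient check on the matrix.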
  14. Genetic components of generation means • Thus, if r = ½ (for simplicity), the genotypic mean value of the S1 progeny of an S0 plant/cross $M_iN_k / M_jN_l$ is:

      $${}^{S_1}G_{ijkl} = A_i + A_j + A_k + A_l + \tfrac12\left(D_{ij} + D_{kl}\right) + \tfrac14\left(D_{ii} + D_{jj} + D_{kk} + D_{ll}\right) + AA_{ik} + AA_{jl} + AA_{il} + AA_{jk} + \tfrac12\left(AD_{ijk} + AD_{ijl} + AD_{ikl} + AD_{jkl}\right) + \tfrac14\left(AD_{iik} + AD_{ikk} + AD_{jjl} + AD_{jll} + AD_{iil} + AD_{ill} + AD_{jjk} + AD_{jkk}\right) + \tfrac14 DD_{ijkl} + \tfrac18\left(DD_{iikl} + DD_{ijkk} + DD_{ijll} + DD_{jjkl}\right) + \tfrac1{16}\left(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk}\right)$$

      • If successive generations are allowed to segregate and recombine until complete fixation (i.e. with neither selection nor drift), the expected mean value of the RILs will be:

      $${}^{S_\infty}G_{ijkl} = A_i + A_j + A_k + A_l + \tfrac12\left(D_{ii} + D_{jj} + D_{kk} + D_{ll}\right) + AA_{ik} + AA_{jl} + AA_{il} + AA_{jk} + \tfrac12\left(AD_{iik} + AD_{ikk} + AD_{jjl} + AD_{jll} + AD_{iil} + AD_{ill} + AD_{jjk} + AD_{jkk}\right) + \tfrac14\left(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk}\right)$$
  15. Line value concept: definition and prediction • Line value (LV) is the mean value of all RILs that a plant or a cross can produce through successive selfings (or haplo-diploidisation). • LV may be predicted from any pair of successive generations: $2\,{}^{S_{n+1}}G_{ijkl} - {}^{S_n}G_{ijkl}$. • If an F1 and its selfed F2 are both phenotyped, then [2*GF2-GF1] predicts the mean value of the RILs derivable from the cross. The genetic components may be written as follows:

      $$2\,G_{F_2} - G_{F_1} = A_i + A_j + A_k + A_l + \tfrac12\left(D_{ii} + D_{jj} + D_{kk} + D_{ll}\right) + AA_{ik} + AA_{jl} + AA_{il} + AA_{jk} + \tfrac12\left(AD_{iik} + AD_{ikk} + AD_{jjl} + AD_{jll} + AD_{iil} + AD_{ill} + AD_{jjk} + AD_{jkk}\right) + \tfrac18\left(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk}\right) + \tfrac14\left(DD_{iikl} + DD_{ijkk} + DD_{ijll} + DD_{jjkl}\right) - \tfrac12 DD_{ijkl}$$

      • This predictor equals the expected LV (S∞ Gijkl) except for the DD terms.
  16. Line value concept: definition and prediction • The difference in DD terms between the expected line value (S∞ Gijkl) and its prediction (2*GF2-GF1): – The prediction includes the quantity $DD = \tfrac14\left(DD_{iikl} + DD_{ijkk} + DD_{ijll} + DD_{jjkl}\right) - \tfrac12 DD_{ijkl}$, which is associated with heterozygote structures. – The line value, in contrast, includes the additional quantity $DD' = \tfrac18\left(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk}\right)$, associated with homozygote structures. This means that if DD=DD'=0, then the prediction of LV obtained from early generations will be exactly equal to the expected LV (S∞ Gijkl).
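The claim that 2*GF2-GF1 matches the expected line value except for the DD terms can be checked by bookkeeping the coefficient of every genetic effect. The Python sketch below is an illustration for r = ½ only, not the author's derivation. It assumes that an effect indexed by a repeated allele is counted once per allele combination (so Giikk carries 4 AA_ik and 2 AD_iik), which is the convention under which the slide's statement holds; all function names are introduced here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def genotypic_value(m1, m2, n1, n2):
    """Coefficients of the effects in the value of genotype m1 m2 / n1 n2."""
    def name(kind, m_alleles, n_alleles):
        return kind + "_" + "".join(sorted(m_alleles)) + "".join(sorted(n_alleles))

    value = Counter()
    for allele in (m1, m2, n1, n2):
        value["A_" + allele] += 1
    value[name("D", m1 + m2, "")] += 1
    value[name("D", "", n1 + n2)] += 1
    for m, n in product((m1, m2), (n1, n2)):
        value[name("AA", m, n)] += 1
    for n in (n1, n2):
        value[name("AD", m1 + m2, n)] += 1
    for m in (m1, m2):
        value[name("AD", m, n1 + n2)] += 1
    value[name("DD", m1 + m2, n1 + n2)] += 1
    return value


def mean_value(weighted_genotypes):
    """Mean effect coefficients over (frequency, genotype) pairs."""
    total = Counter()
    for freq, genotype in weighted_genotypes:
        for term, coef in genotypic_value(*genotype).items():
            total[term] += freq * coef
    return total


def combine(x, cx, y, cy):
    """Return cx*x + cy*y, dropping terms whose coefficient is zero."""
    out = {}
    for term in set(x) | set(y):
        coef = cx * x.get(term, 0) + cy * y.get(term, 0)
        if coef != 0:
            out[term] = coef
    return out


GAMETES = [("i", "k"), ("i", "l"), ("j", "k"), ("j", "l")]  # equally frequent at r = 1/2

f1 = genotypic_value("i", "j", "k", "l")
f2 = mean_value((Fraction(1, 16), (g1[0], g2[0], g1[1], g2[1]))
                for g1, g2 in product(GAMETES, repeat=2))
rils = mean_value((Fraction(1, 4), (m, m, n, n)) for m, n in GAMETES)

predictor = combine(f2, 2, f1, -1)       # 2*GF2 - GF1
gap = combine(rils, 1, predictor, -1)    # expected LV minus its prediction

if __name__ == "__main__":
    for term in sorted(gap):
        print(term, gap[term])
```

Only DD terms remain in `gap`, and their coefficients correspond to DD' - DD as defined on the slide.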
  17. Applying the LV concept to the RS breeding scheme: advantages & specific aspects • Efficient and early prediction of the potential of plants or crosses to produce high-performing inbred lines, even for traits with dominance and epistatic interactions. • In the context of the CIAT rice breeding scheme, single S0 plants cannot be phenotyped properly, so successive selfed generations can be used to construct the predictor of interest, which is [2 * S2Gijkl - S1Gijkl] or [2 * S3Gijkl - S2Gijkl], depending on the quantity of seed needed for phenotyping (i.e. single-location versus multi-location experimentation).
  18. Applying the LV concept to the RS breeding scheme: advantages & specific aspects • Advantages of the LV predictor compared with the S2:4 predictor: – Gain in the duration of the calibration process (1 or 2 generations) – Gain in the duration of a selection cycle (prediction of S0:2 progenies instead of S2:4 progenies) – No bias due to dominance (as there is in single-generation phenotyping) • Specific aspects to focus on: – Bulk multiplication of seeds is mandatory (to maintain the allelic frequencies on which the equations rely) – The ms locus controlling male sterility is difficult to manage if genotyping for the locus is not available to differentiate S0 plants – The number of progenies that can be phenotyped is halved for equal resources (as two generations need to be phenotyped)
  19. Further accelerating the process using genome-wide markers • Line value may be used as the phenotype in a genomic model instead of the value of a single selfed progeny. The procedure consists of: – GBS genotyping of S0 plants; – Phenotyping of S1 and S2 (or S2 and S3) progenies. • Gain at two levels compared with S2 genotyping and S2:4 phenotyping: – Calibration takes 2 generations (S1 and S2) or 3 generations (S2 and S3) instead of 4 generations – Prediction takes place on S0 plants directly, without multiplying until the S2 generation
  20. Further accelerating the process using genome-wide markers. Procedure when genotyping of the ms locus is available: – Genotyping of S0 plants for the ms locus – GBS genotyping of only those S0 plants that carry the [Ms Ms] genotype at the ms locus – Seed increase of [Ms Ms] S0 plants until the S2 or S3 generation – Phenotyping of S1 + S2 (or S2 + S3)
  21. Conclusion. This procedure optimises the GS scheme in several respects: • Calibration of the model based on very early generations • Early prediction of the breeding population (S0). This maximizes the genetic gain per unit time. • Line value predictors are less biased by complex effects, in particular dominance, even though these may be important in early generations
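The procedure on this slide can be sketched as: compute a line value phenotype per S0 plant from the means of its two successive progenies, then regress it on the S0 marker genotypes. The slides do not say which genomic model is used, so the ridge regression below is an assumption chosen for illustration, as are the function names, the marker coding (-1/0/1) and the penalty `lam`. It is written in plain Python to stay self-contained.

```python
def line_value(mean_sn, mean_sn1):
    """LV predictor from two successive selfed generations: 2*S(n+1) - S(n)."""
    return 2 * mean_sn1 - mean_sn


def ridge_fit(Z, y, lam):
    """Marker effects solving (Z'Z + lam*I) beta = Z'y by Gaussian elimination."""
    p = len(Z[0])
    A = [[sum(row[a] * row[b] for row in Z) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(row[a] * yi for row, yi in zip(Z, y)) for a in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for k in range(col + 1, p):
            factor = A[k][col] / A[col][col]
            for c in range(col, p):
                A[k][c] -= factor * A[col][c]
            rhs[k] -= factor * rhs[col]
    beta = [0.0] * p
    for k in reversed(range(p)):
        rest = sum(A[k][c] * beta[c] for c in range(k + 1, p))
        beta[k] = (rhs[k] - rest) / A[k][k]
    return beta


def predict(Z, beta, mean):
    """Predicted line values of candidates from their marker genotypes."""
    return [mean + sum(z * b for z, b in zip(row, beta)) for row in Z]


def calibrate(markers_s0, s1_means, s2_means, lam=1.0):
    """Fit marker effects on LV phenotypes built from S1 and S2 progeny means."""
    lv = [line_value(s1, s2) for s1, s2 in zip(s1_means, s2_means)]
    mean = sum(lv) / len(lv)
    beta = ridge_fit(markers_s0, [v - mean for v in lv], lam)
    return mean, beta
```

In use, `calibrate` would be called once on the genotyped and phenotyped calibration set, and `predict` on newly genotyped S0 plants. Choosing `lam`, and deciding whether a model with non-additive terms is needed, is left open here.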
  22. Thank you !

Editor's Notes

  1. My presentation offers reflections and questions based on the experience of genomic selection conducted on upland rice at CIAT. To begin with, I recall what has been done and try to identify some aspects that can be optimised.