Simulation Analysis of Doubled
Haploids in a Wheat
Breeding Program
November 1999
N.L. Kruger
Research Report #5
Narelle Lee Kruger
This report was submitted as a requirement of the subject AG421, for a Bachelor of
Agricultural Science (Plant Breeding) in the Faculty of Natural Resources, Agriculture and
Veterinary Science in the School of Land and Food, The University of Queensland,
November 1999.
Abstract
The Germplasm Enhancement Program (GEP) of the Australian Northern Wheat
Improvement Program (NWIP) is presently based on an S1 recurrent selection strategy.
The objective of the GEP is to ensure a continual supply of high yielding germplasm for the
pedigree programs in the NWIP. The GEP works on a four year breeding cycle. Years 1
and 2 are used for intermating, selection for the traits maturity and height, and seed
multiplication of the S1 families. Multi-environment Trials (METs) of the S1 families are
conducted in years 3 and 4 and selection is based on grain yield and grain protein
concentration data collected from these METs. There is interest in whether using doubled
haploid (DH) lines in the MET evaluation phase of the recurrent selection program, in
place of S1 families, could contribute to an increase in the rate of genetic improvement for
grain yield.
The objective of this project was to use computer simulation to investigate the applicability
of a DH strategy in the GEP for a range of genotype-environment system models which are
considered to be relevant to wheat improvement in the northern grains region of Australia.
This study considered the influence of models including additive and genotype-by-
environment (G×E) interaction effects. The computer simulation program QU-GENE,
developed at The University of Queensland, was used in this study. The advantages of the
computer simulation approach over alternative approaches based on either theoretical
analysis or experimental evaluation were: (1) that more complex genetic models could be
examined than were possible to examine using the theoretical approach, (2) larger
experiments with many more factors could be examined than would be feasible in an
experimental investigation, and (3) answers to researchable questions could be obtained in
a more timely manner than would be possible using either a theoretical or experimental
approach.
Five simulation experiments were conducted to compare the response to selection or genetic
gain when either S1 families or DH lines were used in the GEP. Experiment one involved
simulating the effective population size (Ne) for DH lines and S1 families, and comparing
these results to the theoretical predictions. The simulated results conformed well to the
results predicted from theory. For the same number of selected individuals, the S1 families
had a higher Ne than the DH lines and therefore the S1 family strategy was less likely to
lose favourable genes by random drift. The effect of linkage disequilibrium was assessed in
experiment two. Linkage disequilibrium was shown to have an influential role in the rates
of increase in the frequency of favourable genes in the GEP. The third experiment
compared the response to selection when identical numbers of S1 families and DH lines
were evaluated, for an additive genetic model without G×E interactions. The results
indicated that 250 DH lines had an advantage over 1000 S1 families in terms of rate of
response to selection. In the fourth experiment, intensity of selection was manipulated by
changing the number of families selected to proceed into the next cycle of selection.
Increasing the intensity of selection by selecting fewer families increased the rate of
response to selection in the short-term. However, selecting fewer families also decreased Ne
and consequently selecting too few families resulted in the loss of favourable genes from
the population due to the effects of random drift, resulting in a reduction in long-term
response to selection. As a trade-off, selecting 20 families greatly reduced the chance of
favourable genes being lost from the population due to drift, without slowing the
response to selection significantly. Experiment five assessed the influence of introducing
complexity into the additive model by incorporating genotype-by-environment (G×E)
interactions. The advantage observed for the DH lines over S1 families for the additive
model was retained, and was also present when a MET based on one year for the DH lines
was conducted in comparison to two years for the S1 families.
Computer simulation analyses of the expected short-term and long-term responses to
selection for a range of additive genetic models suggest there are advantages of the DH
strategy when it is feasible to generate 250 or more DH lines for evaluation in the MET
phase of the GEP. This advantage was also observed with the presence of G×E interaction
in the model. These outcomes suggest that the use of DH lines in place of S1 families in the
GEP may be feasible. As the production of DH lines becomes less expensive and less
labour intensive, more DH lines can be produced per year, and greater gains from
selection may therefore be realised.
Declaration of Originality
This report describes the original work of the author, except where otherwise stated. It has
not been submitted previously as part of degree requirements at any other University.
Narelle Lee Kruger
Publications relevant to this thesis
Kruger NL, Podlich DW, Cooper M (1999) Comparison of S1 and doubled haploid
recurrent selection strategies by computer simulation with applications for the Germplasm
Enhancement Program of the Northern Wheat Improvement Program. In ‘Proceedings of
the Ninth Assembly Wheat Breeding Society of Australia.’ (Eds P Williamson, P Banks, I
Haak, J Thompson, A Campbell) pp.216-219. (Wheat Breeding Society of Australia:
Toowoomba)
Acknowledgements
I would like to thank the Grains Research and Development Corporation for the
Undergraduate Honours Scholarship Award and their support of this research project.
I would especially like to thank my project supervisor Dr Mark Cooper for his valuable
time, expertise, guidance, encouragement and dedication throughout the course of the
project.
I would also like to thank Dean Podlich for his expertise, advice, time and support.
Thank you also to Mr Ian Phillips for helping with the use of ASREML, and to my
colleagues Dr Ian DeLacy, Ms Nicole Jensen, Mr Kevin Micallef and Mr Anura
Ratnasiri for their support.
Table of Contents
List of Tables
List of Figures
1. Introduction
2. Literature Review
2.1 Introduction
2.2 Germplasm Enhancement Program
2.3 Doubled Haploids
2.4 Genotype-by-environment Interaction
2.5 QU-GENE
2.6 Effective Population Size (Ne)
2.7 Linkage Disequilibrium
2.8 Study Focus
3. Materials and Methods
Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection
Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model
Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model
Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection
4. Results
Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection
Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model
Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model
Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection
5. Discussion
Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection
Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model
Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model
Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection
Overall
6. Conclusion
7. The Future
8. References
9. Appendices
9.1 Appendix 1
9.2 Appendix 2
List of Tables
Table 1: Selected intensity (%) and corresponding standardised selection differential (S), (within the brackets) from Falconer and Mackay (1996), changes depending on the number of families in the MET and the number of families selected (selected proportion) from the MET.
Table 2: Input file for the QUGENE engine. This file represents the G×E model 5 (Table 3). The genes 5-16 are removed from this presentation for conciseness. GN represents the gene number and E1 – E5 represent the five environment types within the target population of environments. The detail of the structure of this input file is explained by Podlich and Cooper (1997; 1998).
Table 3: Each model describes the number of genes interacting with the five environment types and the level of G×E interaction present as described by the ratio of the genotype-by-environment interaction variance to the genotypic variance ($\sigma^2_{GE}:\sigma^2_G$).
List of Figures
Figure 1a: Components and pathways of germplasm transfer for yield improvement in the Australian Northern Wheat Improvement Program: LRC-QDPI represents the Queensland Department of Primary Industries pedigree breeding programs located in Toowoomba at the Leslie Research Centre; PBI-US represents the University of Sydney pedigree breeding programs located in Narrabri; GEP represents the University of Queensland Germplasm Enhancement Program (Cooper et al., 1999).
Figure 1b: Outline of the Germplasm Enhancement Program (GEP) 4 year cycle. Pictures show examples of the activities and field experiments undertaken at each stage of the cycle (Cooper et al., 1999).
Figure 2: Schematic outline of the QU-GENE simulation software. The central ellipse shows the engine and the surrounding boxes show the application modules. The GEPRSS module was used in this study (Podlich and Cooper, 1997, 1998).
Figure 3: Outline of the activities involved in the S1 and DH breeding strategies over one cycle of the GEP. The S1 activities are adapted from Fabrizius et al. (1996).
Figure 4: S1 families effective population size (Ne) calculated theoretically (solid line) and the average of the simulation runs (broken lines) for a range of values for the number of S0 plants sampled (M) and the number of reserve seed used for intermating per S0 plant sampled (m).
Figure 5: Simulated S1 family effective population size (Ne) variation, about the average of the simulation runs (solid line) for a range of S0 plants sampled (M) and the two extreme values of reserve seed used for intermating per S0 plant sampled (m) (for intermediate levels of m refer to Appendix 2).
Figure 6: Comparison of the DH simulated average (closed circles) and DH theoretical (solid line) effective population size for a range of S0 plants sampled (M') and when only one DH plant was produced per S0 plant sampled (m' = 1).
Figure 7: Average of the simulated DH effective population size (Ne) for a range of S0 plants sampled (M') and DH plants produced per S0 plant sampled (m'). A regression line is fitted to each m'.
Figure 8: Simulated S1 family effective population size (Ne) variation, about the average of the simulation runs (solid line) for a range of S0 plants sampled (M') and two extreme numbers of DH plants produced per S0 plant sampled (m') (for intermediate levels of m' refer to Appendix).
Figure 9: The influence of linkage disequilibrium in the GEP with S1 families, 20 genes under both an additive model and G×E interaction ($\sigma^2_{GE}:\sigma^2_G$ = 2.89) model after five cycles of selection. (a) frequency of each gene in the model plotted for one generation of random mating per cycle, (b) gene frequency and value for each of the 20 genes for one generation of random mating per cycle, (c) frequency of each gene in the model plotted for 10 generations of random mating per cycle, and (d) gene frequency and value for each of the 20 genes for 10 generations of random mating per cycle.
Figure 10: Comparison of the response to selection for S1 families and DH lines with heritability 0.05, 20 genes and four family sizes over 10 cycles of selection.
Figure 11: Comparison of the response to selection for S1 families and DH lines with heritability 0.05, 500 families and gene numbers over 10 cycles of selection.
Figure 12: Comparison of the response to selection for S1 families and DH lines with 20 genes, two heritability levels and two family sizes over 10 cycles of selection.
Figure 13: Comparison of the response to selection for 1000 S1 families to 100, 250, 500 and 1000 DH lines, two heritability levels and two gene numbers over 10 cycles of selection.
Figure 14: Comparison of the response to selection for S1 families (a,c,e) and DH lines (b,d,f) with a heritability 0.95, 20 genes and three levels of families selected over 10 cycles of selection.
Figure 15: Comparison of response to selection of the G×E interaction ($\sigma^2_{GE}:\sigma^2_G$ = 1.1: Table 3) model and the additive model with constant heritability (0.95), genes (20) and two levels of families selected (S) for four different family sizes over 10 cycles of selection.
Figure 16: Comparison of response to selection of 1000 S1 families to two sizes of DH lines, 20 genes, two heritability levels, two levels of G×E interaction and both S1 families and DH lines having two years of METs.
Figure 17: Comparison of response to selection of 1000 S1 families to two sizes of DH lines, 20 genes, two heritability levels, two levels of G×E interaction and both S1 families and DH lines having two years of METs.
1. Introduction
The Germplasm Enhancement Program (GEP) is an S1 recurrent selection program. It
operates as a parent building component of the Northern Wheat Improvement Program
(NWIP) of Australia (Fabrizius et al., 1996). Recurrent selection programs are conducted to
achieve medium and long-term genetic improvement by increasing the frequency of
favourable genes and gene combinations. This is achieved by a combination of long-term
breeding strategies aimed at improvement of the genetic resource base, and short-term
breeding strategies aimed at exploiting the potential of the genetic resource base available
at a particular point in time. Hallauer (1981) argued that recurrent selection is the most
efficient breeding strategy for long-term genetic improvement and pedigree breeding is the
most efficient for short-term exploitation of genetic resources for the purpose of cultivar
development. This study focused on the issues relevant to the rate of genetic improvement
and long-term genetic improvement of a population, with applications to the GEP of the
NWIP. The response to selection can be evaluated in terms of the improvement in the mean
of a population subjected to selection and the rate of genetic improvement can be evaluated
by investigating response to selection over a series of cycles of selection.
Optimising the allocation of resources to activities within the GEP to achieve its role in the
NWIP is a complex problem. There is interest in whether a strategy using doubled haploid
(DH) lines (i.e. plants developed by a process where the haploid genome has been doubled)
can contribute to an increase in the rate of genetic improvement relative to that achieved by
the current S1 strategy. Some advantages of using DH lines in the GEP are: (1) the plants
are completely homozygous in one generation, whereas the S1 families are still segregating,
(2) the variation among DH lines is not influenced by dominance and they have twice as
much additive genetic variation partitioned among lines relative to S1 families, and (3)
selection of superior genotypes should be easier and more efficient. Some disadvantages of
using DH lines in the GEP are: (1) their production is more difficult and costly relative to
S1 families, and (2) with the current DH technology based on the wheat × maize crossing
system, they would add an extra year to the GEP cycle.
Breeding programs take many years to conduct and experimental comparisons of DH and
S1 family selection would be costly and time consuming. Further, it is questionable
whether experiments with sufficient power could be conducted to detect significant
differences between the breeding strategies within three cycles of the breeding program (i.e.
12-15 years). Simulation allows a rapid and low cost assessment of the potential value of
using doubled haploids in the GEP. The aim of this study was to use computer simulation
methodology to compare the expected selection response for S1 and DH recurrent selection
strategies for a range of genetic models where the variables heritability, number of cycles,
number of families evaluated in METs, number of families selected and number of genes
contributing to the trait were manipulated. The genetic models investigated in this study
consider the influences of additive and additive-by-environment (G×E) interaction effects.
Response to selection in a recurrent selection program is a balance between directed
changes in gene frequency due to the effects of artificial selection imposed by the breeder
and random changes in gene frequency due to the effects of random drift. It has been
emphasised by Hospital and Chevalet (1996) that the joint effects of selection, linkage and
drift must be considered in any evaluation of selection response. Both changes in gene
frequency due to the effects of selection and drift are influenced by the number of lines
selected and the resulting selection intensity applied by the breeder. Therefore, in this study
factors that influenced response to selection (i.e. heritability, selection intensity and number
of families/lines evaluated) and random drift (i.e. effective population size) were examined
for their influence on selection response within the GEP. The effective population size (Ne)
of DH populations is not well documented in the literature. Simulations were run to
determine the factors that influenced Ne of DH lines in the GEP and to determine whether
the simulated variation for Ne could be explained using the theoretical prediction equations
derived from the work of Comstock (1996).
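The balance between effective population size and random drift described above can be illustrated with a minimal sketch. It is a generic Wright-Fisher style resampling of a single neutral gene, not the QU-GENE or GEPRSS implementation, and the function name, starting frequency, number of generations and replicate count are all assumptions chosen for illustration.

```python
import random


def fraction_of_runs_losing_allele(ne, generations, p0=0.5,
                                   replicates=500, seed=1):
    """Fraction of replicate populations in which the favourable allele
    is lost by drift alone (no selection), for effective size `ne`."""
    rng = random.Random(seed)
    n_copies = 2 * ne  # diploid population: 2 * Ne gene copies per generation
    lost = 0
    for _rep in range(replicates):
        p = p0
        for _gen in range(generations):
            # Binomial sampling of gene copies for the next generation.
            count = sum(1 for _copy in range(n_copies) if rng.random() < p)
            p = count / n_copies
            if p == 0.0 or p == 1.0:
                break  # allele lost or fixed; no further change possible
        if p == 0.0:
            lost += 1
    return lost / replicates


if __name__ == "__main__":
    for ne in (5, 20, 100):
        print(ne, fraction_of_runs_losing_allele(ne, generations=20))
```

Under these assumptions the smaller populations lose the allele far more often, which is the qualitative behaviour that motivates tracking Ne alongside selection intensity.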
Linkage disequilibrium is an important consideration in the GEP as the base population was
generated from a small number of parents. Linkage disequilibrium can cause the genetic
variability of a population to either be inflated or depressed, depending on the linkage phase
relationships (coupling or repulsion) among the loci influencing the traits to be manipulated
by selection. This in turn affects the heritability of the trait being selected for and therefore
the population’s response to selection.
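The inflation or depression of genetic variance by linkage phase can be sketched for the simplest case of two additive loci. The example below is an illustration only: the function name, gene effects and disequilibrium values are assumptions, and it assumes random mating so that the genotypic value is the sum of two independently sampled gametes.

```python
def additive_variance_two_loci(p1, p2, a1, a2, d):
    """Additive genetic variance for two additive loci with favourable
    allele frequencies p1, p2, gene effects a1, a2 and gametic
    disequilibrium d (d > 0: coupling excess, d < 0: repulsion excess)."""
    # Haplotype frequencies keyed by (allele at locus 1, allele at locus 2),
    # where 1 marks the favourable allele.
    freqs = {
        (1, 1): p1 * p2 + d,
        (1, 0): p1 * (1 - p2) - d,
        (0, 1): (1 - p1) * p2 - d,
        (0, 0): (1 - p1) * (1 - p2) + d,
    }
    if any(f < -1e-12 for f in freqs.values()):
        raise ValueError("d is outside the range allowed by p1 and p2")
    values = {hap: hap[0] * a1 + hap[1] * a2 for hap in freqs}
    mean = sum(freqs[hap] * values[hap] for hap in freqs)
    gamete_var = sum(freqs[hap] * (values[hap] - mean) ** 2 for hap in freqs)
    # A diploid genotype is the sum of two independent gametes.
    return 2 * gamete_var


if __name__ == "__main__":
    for d in (-0.2, 0.0, 0.2):
        print(d, additive_variance_two_loci(0.5, 0.5, 1.0, 1.0, d))
```

With equal gene frequencies and effects, the coupling case gives a larger variance than linkage equilibrium and the repulsion case a smaller one, which is the effect on heritability referred to in the text.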
The computer simulation platform QU-GENE, developed at The University of Queensland
(Podlich and Cooper, 1998) was used to conduct the simulations. One of the application
modules available within QU-GENE is the GEPRSS (Germplasm Enhancement Program
Recurrent Selection Strategy; Podlich and Cooper, 1997). The GEPRSS module was
modified, and an option added so that the user could choose between the current S1 or DH
strategy. This gives the breeder the choice of whether or not they would like to incorporate
doubled haploids into the program in place of the S1 families.
This report has been structured into six sections:
(1) Literature review (section 2): Within this section the relevant literature has been
investigated and a background to the project presented,
(2) Materials and methods (section 3): This section outlines the components of the
different genetic models used in the computer simulations. The experiments
undertaken were: (i) comparison of simulated and theoretical
predictions of the effective population size (Ne) of S1 families and DH lines, (ii)
determining the effects of linkage disequilibrium on gene frequency and response
to selection, (iii) evaluating the response to selection of S1 families and DH lines
for an additive genetic model, (iv) evaluating the impact of selection proportion on
response to selection for an additive genetic model, and (v) evaluating the influence of
G×E interaction genetic models on response to selection,
(3) Results (section 4): This section presents the analysis of the results of the
simulation experiments and the graphical representations of the important results
of the computer simulations,
(4) Discussion (section 5): Presented in this section is an interpretation of the results of
the simulation experiments, and what these results mean for the GEP,
(5) Conclusion (section 6): This section draws together the important points that
resulted from conducting the simulation study,
(6) The future (section 7): Additional investigations to be run in the future are outlined
in this section.
2. Literature review
2.1 Introduction
One of the main objectives of a breeding program is to produce a range of cultivars
superior to those that already exist. This objective can be quantified by monitoring the
response to selection of a breeding population over cycles of selection. Any change in the
mean genetic value of a population due to the influence of selective forces is termed the
response to selection (R) or genetic gain. This can be quantified as the difference of the
mean phenotypic value between the offspring of the selected parents and the whole of the
parental generation before selection (Falconer and Mackay, 1996). Response to selection
can be predicted from knowledge of the heritability of the traits subjected to selection, and
the selection pressure applied using the following formula
$$R = h^2 S = i h^2 \sigma_p, \qquad (1)$$
where $h^2$ is heritability, which can be obtained from genetic experiments conducted for
generations prior to selection, S is the selection differential, i is the standardised selection
differential and $\sigma_p$ is the standard deviation of the phenotypic values of the individuals
(selection units). The selection differential is the mean phenotypic value of the individuals
selected as parents, expressed as a deviation from the population mean. The selection
differential is not known until the parents are selected. However, the expected value of the
standardised selection differential can be predicted assuming that the distribution of
phenotypic values of the individuals to be subjected to selection is normal.
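As a worked illustration of equation (1), the sketch below computes the expected standardised selection differential for truncation selection on a normally distributed trait, and the predicted response that follows from it. The function names and the numerical inputs are assumptions chosen for illustration, not values from this study, and the calculation uses the large-population approximation rather than tabulated values adjusted for finite numbers of families.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Expected standardised selection differential (i) when the top
    `proportion_selected` of a normal distribution is retained."""
    if not 0.0 < proportion_selected < 1.0:
        raise ValueError("proportion_selected must lie strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    truncation_point = std_normal.inv_cdf(1.0 - proportion_selected)
    # i = height of the normal curve at the truncation point / proportion kept
    return std_normal.pdf(truncation_point) / proportion_selected


def predicted_response(h2, proportion_selected, sigma_p):
    """Predicted response R = i * h^2 * sigma_p for one cycle of selection."""
    return selection_intensity(proportion_selected) * h2 * sigma_p


if __name__ == "__main__":
    # Hypothetical inputs: heritability 0.25, top 10% selected, sigma_p = 0.5
    print(round(selection_intensity(0.10), 3))
    print(round(predicted_response(0.25, 0.10, 0.5), 3))
```

Selecting a smaller proportion raises i and therefore the predicted single-cycle response, at the cost of fewer parents contributing to the next cycle.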
The prediction of response however is only valid for one generation of selection as the
response depends on the heritability of the character in the generation from which the
parents are selected. The heritability of the character is expected to change between
generations of selection for two reasons. First, any response to selection will cause the gene
frequencies to change, on which the heritability depends, and secondly, the selection of
parents reduces the variance and the heritability (Falconer and Mackay, 1996).
The response to selection equation is therefore important to plant breeders as it quantifies
the genetic gain achievable in any cycle of selection. The basic principle of any plant
breeding program is the continuous improvement of the target species. It is the plant
breeder’s role to control the intensity and speed of this improvement by changing the
genetic structure of a population (Williams, 1964). By understanding the underlying
concepts of the components of the response to selection equation, and the effect that
manipulating them has on the response to selection, breeders can increase the genetic gain
expected from a cycle of selection.
Equation (1) is a basic prediction equation which applies to the mass selection of
individuals in a random mating base population. Most breeding programs however apply
different forms of selection to a population. The general equation can be extended to
accommodate the features of the different selection methods. Of relevance to this study is
the prediction equation for S1 recurrent selection in the GEP, which is
$$G_c = \frac{k\,c\,\sigma^2_{A'}}{\sqrt{\dfrac{\sigma^2_e}{rt} + \dfrac{\sigma^2_{A'E} + \frac{1}{4}\sigma^2_{DE}}{t} + \sigma^2_{A'} + \frac{1}{4}\sigma^2_D}}, \qquad (2)$$
where $G_c$ is the expected gain per cycle, k is the standardised selection differential applied
to S1 families, c is the parental control, $\sigma^2_{A'}$ is the additive genetic variance plus a
component that is mainly a function of degree of dominance, $\sigma^2_e$ is the environmental
(error) component of variance, r is the number of replications per environment, t is the
number of environments, $\sigma^2_{A'E}$ and $\sigma^2_{DE}$ are the additive-by-environmental and
dominance-by-environmental interaction variances, and $\sigma^2_D$ is the dominance genetic
variance (Fehr, 1987). The parental control factor (c) is 1 for selfed families, as used in the
GEP. By changing different variables in the QU-GENE GEPRSS module, the components
of the prediction equation can be manipulated which allows simulation investigations of
different selection strategies for the GEP.
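The way the components of the S1 prediction equation combine can be sketched as below. This is an illustration only, assuming the denominator of equation (2) is the square root of the phenotypic variance among S1 family means built from the variance components defined above; the function name and all numerical inputs are hypothetical, and it is not the GEPRSS code.

```python
import math


def s1_gain_per_cycle(k, var_a, var_d, var_ae, var_de, var_e, r, t, c=1.0):
    """Expected gain per cycle for S1 family selection.

    k      standardised selection differential
    var_a  additive variance term (sigma^2_A')
    var_d  dominance variance (sigma^2_D)
    var_ae additive-by-environment variance (sigma^2_A'E)
    var_de dominance-by-environment variance (sigma^2_DE)
    var_e  error variance (sigma^2_e)
    r, t   replications per environment, number of environments
    c      parental control (1 for selfed families)
    """
    phenotypic_var = (
        var_e / (r * t)
        + (var_ae + 0.25 * var_de) / t
        + var_a
        + 0.25 * var_d
    )
    return k * c * var_a / math.sqrt(phenotypic_var)


if __name__ == "__main__":
    # Hypothetical variance components; compare testing in 2 versus 8 environments.
    for t in (2, 8):
        gain = s1_gain_per_cycle(k=1.755, var_a=1.0, var_d=0.2, var_ae=1.5,
                                 var_de=0.2, var_e=4.0, r=2, t=t)
        print(t, round(gain, 3))
```

In this form, adding replications or environments shrinks only the error and interaction terms, so the gain approaches but cannot exceed the value set by the genetic variance terms.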
If no artificial or natural selection pressures are placed on a population, there is no expected
response to selection, therefore any changes in gene frequency and population mean will be
a result of the effects of random drift, when the effects of mutation and migration are
neglected. A rapid response to selection is necessary in the short-term so that breeders can
find a cultivar that is better than what is currently being used. Long-term selection, from the
perspective of a breeding program, is more concerned about maintaining the genetic
diversity within a population for periods of at least 40 years, while maintaining genetic
advance from selection. However, the maintenance of genetic variation must be balanced
with reductions in genetic variation due to the positive effects of selection. If the response
to selection plateaus then it is possible all of the genetic diversity has been lost from the
population. If this occurs early in a breeding program, new germplasm may need to be
introduced into the program. Alternatively, countering forces from natural selection or
mutation may be balancing the effects of the artificial selection. If such countering forces
are present an alternative breeding strategy may have to be considered. This may also
involve introducing new sources of genetic variation and/or increasing the artificial
selection pressures applied. The overall aim of a breeding program is to maintain its long-
term response to selection while also achieving new cultivar development through short-
term response to selection.
The goal of recurrent selection is to maintain the variability of a population for one or more
quantitative characters, with minimal reduction of genetic diversity in the long-term to
allow for continued genetic gain (Hallauer, 1981; Strahwald and Geiger, 1988; Carver and
Bruns, 1993; De Koyer et al., 1999). Recurrent selection maintains heterozygosity of loci
and promotes crossing over within gene blocks, which has the potential to release large
amounts of genetic variance and contribute positively to maximising genetic gain.
Recurrent selection is most commonly associated with breeding of allogamous (cross-
pollinating) species (e.g. maize, Hallauer and Miranda (1988)). A recent review of genetic
gains (Carver and Bruns, 1993) for grain yield and quality for autogamous (self-pollinating)
species indicates that recurrent selection has been as effective as, if not more effective than,
traditional breeding methods, such as the pedigree strategy.
By testing homozygous lines (DH) rather than heterozygous families (S1), selection
efficiency can be increased in a recurrent selection breeding scheme (Griffing, 1975;
Baenzinger et al., 1984). Knowledge of expected genetic gain by selection has proved to be
useful for choosing the most efficient selection method. Before starting a long-term
recurrent selection program the breeder needs to know whether the progress is likely to be
high enough throughout the recurrent cycles of selection (Charmet et al., 1993).
Findings of Carver and Bruns (1993) and De Koyer et al. (1999) indicate that the genetic
gain is often highest in the first cycle of selection, when genetic variance ($\sigma^2_G$) is usually
greatest. Selection and genetic drift will ultimately cause a decrease in $\sigma^2_G$ in later cycles of
selection, when a larger proportion of the favourable alleles of genes are fixed or lost. This
results in a decrease in genetic diversity, which may or may not be large enough to affect
the long-term response to selection.
To improve selection efficiency, a breeder wants to be able to select in their population a
particular phenotype that accurately reflects the true-breeding genotype. This is where the
use of DH lines in breeding programs can greatly enhance selection efficiency. In general,
the probability of selecting a particular phenotype in a conventional F2 population is $(\tfrac{1}{4})^n$
for recessive, and $(\tfrac{3}{4})^n$ for dominant genes, where n is the number of loci segregating. This
compares to $(\tfrac{1}{2})^n$ for both recessive and dominant genes in a DH population. For example,
in phenotypic selection for three recessive genes, $(\tfrac{1}{2})^3 = 1/8$ of the DH plants would be
selected and expected to breed true in the DH population, compared to $(\tfrac{1}{4})^3 = 1/64$ of the F2
population. For dominant genes, 1/8 of the DH population would breed true for the desired
trait, whereas 27/64 of the F2 population will have to be selected to ensure the inclusion of
the desired 1/64 true-breeding lines (Baenziger et al., 1984).
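The selection fractions quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes independent segregation at n loci with complete dominance, as in the worked example.

```python
from fractions import Fraction


def dh_selected_fraction(n_loci):
    """Fraction of DH lines showing (and breeding true for) the desired
    phenotype at all n loci: (1/2)^n for recessive or dominant genes."""
    return Fraction(1, 2) ** n_loci


def f2_selected_fraction(n_loci, dominant):
    """Fraction of F2 plants showing the desired phenotype at all n loci:
    (3/4)^n for dominant genes, (1/4)^n for recessive genes."""
    per_locus = Fraction(3, 4) if dominant else Fraction(1, 4)
    return per_locus ** n_loci


n = 3
print(dh_selected_fraction(n))                   # DH population, either gene action
print(f2_selected_fraction(n, dominant=False))   # F2, recessive genes
print(f2_selected_fraction(n, dominant=True))    # F2, dominant genes
```

Exact fractions are used so that the results can be compared directly with the 1/8, 1/64 and 27/64 given in the text.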
2.2 Germplasm Enhancement Program
The main breeding objective of the Germplasm Enhancement Program (GEP) is to combine
high yielding germplasm, from selected sources around the world, with high quality
Australian wheats, and maintain a long-term population improvement strategy that will
provide a source of high yielding and high quality wheat germplasm to the pedigree
breeding programs run by the Leslie Research Centre (LRC) at Toowoomba and the Plant
Breeding Institute of the University of Sydney (PBI-US) at Narrabri, of the northern grains
region of Australia (Figure 1a) (Cooper et al., 1999).
The current strategy used in the GEP is a modified S1 recurrent selection strategy.
It works on a four-year cycle within the general recurrent selection framework (Figure 1b).
Years 1 and 2 are used for intermating, selection for the traits maturity and height, and seed
multiplication of the S1 families. Multi-environment Trials (METs) of the S1 families are
conducted in years 3 and 4 and selection is based on grain yield and grain protein
concentration. It is expected that this improvement strategy can provide a gradual increase
of favourable allelic frequencies and thus increase the mean of a population for the selected
traits (Fabrizius et al., 1996).
Figure 1a: Components and pathways of germplasm transfer for yield improvement in the
Australian Northern Wheat Improvement Program: LRC-QDPI represents the Queensland
Department of Primary Industries pedigree breeding programs located in Toowoomba at the Leslie
Research Centre; PBI-US represents the University of Sydney pedigree breeding programs located
in Narrabri; GEP represents the University of Queensland Germplasm Enhancement Program
(Cooper et al., 1999).
[Figure 1a diagram, node labels: Cultivar; LRC-QDPI, Toowoomba; PBI-US, Narrabri; Germplasm Enhancement Program GEP-UQ; Overseas Germplasm; Research Programs; Parents.]
[Figure 1b flow chart, years 1 to 4, activities in order: random intermating; 10,000 S0 plants, select 2,000 (height, maturity); 2,000 S1 families, select 1,000 (height, maturity, rust); MET (5 sites), 1,000 S1 families; MET (5 sites), 1,000 S1 families; select 20-30 (yield & protein). Linked programs: LRC-QDPI, PBI-US, CIMMYT.]
Figure 1b: Outline of the Germplasm Enhancement Program (GEP) 4 year cycle. Pictures show examples
of the activities and field experiments undertaken at each stage of the cycle (Cooper et al., 1999).
Genotype-by-environment (G×E) interactions have a large impact on response to selection
for grain yield of wheat in the Australian target production environments, particularly
G×S×Y (genotype-by-site-by-year) interactions (Basford and Cooper, 1998). The focus on
G×E interactions arises because the interactions introduce uncertainty into the process of
selection among genotypes, especially when selection is based on their phenotypic
performance in a relatively small sample of environments taken from the target population
of environments (Cooper and DeLacy, 1994), as occurs in the case of the GEP. The GEP
therefore utilises two years of METs to accommodate the G×S×Y interactions that are
encountered, in order to improve S1 family mean heritability and thus help make selection
more efficient. The traditional S1 selection strategy works on a three year cycle, using only
one year of METs. The first two years involve similar steps to those conducted for the GEP.
The traditional S1 selection strategy has been applied in maize breeding for target
populations of environments where G×S×Y interactions are not large enough to warrant
two years of METs (Hallauer and Miranda, 1988). However, for yield of wheat in the
northern grains region the large G×S×Y interactions require at least two years of METs
(Brennan et al., 1981; Cooper et al., 1996). Hence the modification of the S1 recurrent
selection strategy for the GEP involves an additional year of multi-environment testing of
the S1 families.
Following theoretical considerations relevant to the partitioning of additive genetic
variation for quantitative traits and the contributions of this variation to selection response
it has been argued that the inclusion of doubled haploids into the GEP strategy in place of
S1 families can increase the rate of genetic improvement achieved by the GEP. Therefore,
this project was developed to evaluate whether there is an increase in response to selection
in the program contributed by the use of DH lines in place of S1 families. Limitations on
the availability of resources, particularly labour costs and time, will influence the feasibility
of the production of DH lines for use in the GEP. At present approximately 300 DH lines
could be produced by one dedicated person with the available resources for use in the GEP.
As the ability to produce more DH lines improves (e.g. through increasing skilled labour
availability, decreasing time and cost to produce the DH lines) the relative merits of DH
line selection increase (Strahwald and Geiger, 1988).
2.3 Doubled Haploids
Doubled haploids are plants for which the haploid genome has been doubled. There are a
variety of methods available for producing DH lines in wheat:
(1) anther culture (Ouyang et al., 1973; Henry and de Buyser, 1981), and
(2) chromosome elimination
a) wheat × maize method (Laurie and Bennett, 1986, 1988)
b) Hordeum bulbosum method (Barclay, 1975; Sitch and Snape, 1986)
Jensen and Kammholz (1998) modified the wheat × maize method, which is the DH line
production method currently used at the Leslie Research Centre (LRC). In this method, selected
wheat plants are emasculated and crossed as females with maize pollen to produce a
haploid wheat embryo. The haploid embryo is grown into a haploid plant in tissue
culture. The young haploid plant undergoes a colchicine treatment that
causes the chromosomes to double, resulting in a doubled haploid plant (a diploid; for
wheat, which is an allohexaploid, this refers to an amphidiploid).
Plant breeders have long been interested in the use of DH lines in breeding programs, as
they offer several advantages:
(1) DH lines exhibit twice as much additive genetic variation among lines as the
S1 families used in an S1 recurrent selection program. DH lines do not express
dominance variation or segregation within lines, resulting in easier and more
efficient selection (Griffing, 1975; Baenziger et al., 1984; Wricke et al., 1986;
Snape, 1989), and
(2) the selection efficiency among completely inbred DH lines is increased as
homozygosity is reached in one generation (equivalent to F∞ selfing generations),
instead of being close to homozygosity after 5-6 generations of self-pollination,
(Baenziger et al., 1984; Wricke and Weber, 1986; Witherspoon and Wernsman,
1989).
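The factor of two in advantage (1) can be traced to a standard quantitative-genetic result. The sketch below assumes a purely additive model, no linkage disequilibrium, and families derived by selfing from parents with inbreeding coefficient F; the symbol $\sigma^2_{A(\text{among})}$ is introduced here only for illustration and is not notation from the sources cited above.

$$\sigma^2_{A(\text{among})} = (1 + F)\,\sigma^2_A$$

For S1 families derived from non-inbred S0 plants, F = 0 and the additive variance among families is $\sigma^2_A$. For DH lines, F = 1 and the additive variance among lines is $2\sigma^2_A$, twice that among S1 families.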
The adoption of this new technology, however, has been slow due to several disadvantages:
(1) DH lines are difficult and quite costly to produce, and
(2) if they were to be used in the GEP, and to keep the GEP as a four year cycle, the
first year of multi-environment trials would be lost due to the extra time taken to
produce the DH lines relative to the S1 families, therefore the DH lines would only
undergo one year of METs.
Because DH lines are completely homozygous after one generation, they allow only one
crossover opportunity. This is desirable if a line contains a superior combination of genes,
as those genes will be fixed permanently. With S1 families, however, recombination can
occur in every generation, which increases the chance of finding a good
combination of genes but also allows that combination to be lost. The
effects of recombination lessen with each progressive generation of selfing (Baenziger et
al., 1984; Knox et al., 1998).
2.4 Genotype-by-environment Interaction
Genotype-by-environment (G×E) interactions can result in changes in rank among
genotypes in different environmental conditions (Haldane, 1947; Comstock and Moll,
1963). When cultivars are compared in different environments, their performance relative
to each other may not be the same. One cultivar may have the highest yield in some
environments and a second cultivar may excel in others. G×E interaction is a major
problem in the study of quantitative traits as it complicates the interpretation of genetic
experiments and undermines the repeatability of experimental results, which consequently
makes predictions difficult and reduces the efficiency of selection (Kearsey and Pooni,
1996).
To emphasise the different influences of G×E interactions on the efficiency of selection, they
are sometimes categorised into interactions due to:
(1) heterogeneity of genetic variance among environments (Robertson, 1959), i.e. the
ranking of the genotypes does not differ between environments, only the
magnitude of the difference between the genotypes in each environment changes,
therefore the same genotypes are selected regardless of environment and prediction
of response to selection is not complicated by changes in rank of genotypes, or
(2) lack of genetic correlation among environments (Robertson, 1959), i.e. this source
of interaction can result in cross-over interactions, where reranking of the
genotypes occurs and a genotype that performs well in one environment does not
perform well in other environments. This form of G×E interaction complicates the
selection decisions in a breeding program.
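The distinction between the two categories can be illustrated with a small toy example. The yields below are invented purely for illustration (they are not data from the GEP), and the helper names are hypothetical: in the first table only the size of the differences between genotypes changes across environments, while in the second the genotypes are reranked.

```python
def ranks(values):
    """Rank genotypes from best (1) to worst by their value."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def has_rank_change(table):
    """True if the genotype ranking differs between the two environments."""
    return ranks(table["env1"]) != ranks(table["env2"])


# Hypothetical yields (t/ha) of three genotypes in two environments.
# Category (1): heterogeneity of variance, same ranking in both environments.
scale_only = {"env1": [3.0, 2.5, 2.0], "env2": [6.0, 4.5, 3.0]}
# Category (2): lack of correlation, ranking reversed (cross-over interaction).
crossover = {"env1": [3.0, 2.5, 2.0], "env2": [3.0, 4.5, 6.0]}

print(has_rank_change(scale_only))
print(has_rank_change(crossover))
```

In the first case the same genotype would be selected in either environment; in the second the selection decision depends on which environment is sampled.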
The analysis of variance (ANOVA) has been used to partition total phenotypic variation
into components due to genotype, G×E interaction and error (Brennan and Byth, 1979;
DeLacy et al., 1990). The relative sizes of the variance components are frequently used to
quantify the magnitude of G×E interactions. The influence of G×E interaction in a breeding
program is a problem when the ratio of the G×E interaction to genotypic variance
($\sigma^2_{GE}:\sigma^2_G$) is high (Cooper and DeLacy, 1994).
Genotype-by-environment interactions for the grain yield of wheat are large in the northern
grains region and these commonly change the rank of genotypes. These interactions have a
major influence on selection decisions, and therefore response to selection in the GEP.
With the GEP being a recurrent selection program, continual long-term improvement can
only occur if the breeders are able to efficiently select the superior genotypes, for the target
environmental conditions.
2.5 QU-GENE
QU-GENE (QUantitative-GENEtics) is a computer simulation platform developed for the
quantitative analysis of genetic models. The QU-GENE software platform was developed
with a modular structure (Figure 2) and consists of two major component levels:
(1) the genotype-environment system engine (QUGENE), which is used to define the
genetic models to be examined, and
(2) the application modules that examine properties of the genotype-environment
system by investigating, analysing or manipulating a population of genotypes for a
target population of environments created from the QUGENE engine (Podlich and
Cooper, 1997, 1998). For the purposes of this study the GEPRSS application
module is used in combination with the engine.
Figure 2: Schematic outline of the QU-GENE simulation software. The central ellipse shows the
engine and the surrounding boxes show the application modules. The GEPRSS module was used in
this study (Podlich and Cooper, 1997, 1998).
QU-GENE enables investigation of the impact of resource allocation decisions within the
breeding program, e.g. variables such as population size and selection decisions influence how
resources will be allocated (Fabrizius et al., 1996). QU-GENE has also been used to model
breeding programs in previous simulation experiments (Podlich et al., 1999; Podlich and
Cooper, 1999). Strahwald and Geiger (1988) have previously published work involving the
use of computer simulation to study the efficiency of DH in a barley recurrent selection
program.
2.6 Effective Population Size (Ne)
The effective population size (Ne) of a breeding strategy is an important part of any
recurrent selection breeding program. It needs to be determined to quantify the potential
influence of random drift, and balanced with intense selection so that the maximum
response to selection from the available resources can be realised. If the Ne is too small then
favourable genes may be lost from the population through random genetic drift. When drift
occurs the response to selection can never reach its full potential. Therefore, it is important
to understand the effects of drift, which can be determined and quantified in terms of the
effective population size.
Quantifying the effect of random drift in a population requires knowledge of the variability
in changes of gene frequency between repeated runs of the same breeding strategy. This is
defined theoretically in terms of the idealised population (Falconer and Mackay, 1996).
However, populations do not always conform to that of an idealised population (random
mating, monoecious population in which there is no selection, there are N individuals that
reach reproductive age and function as parents, and only one offspring is produced per
mating). One way to deal with deviations from the idealised breeding structure is to express
the situation of a breeding program in terms of the effective number of breeding
individuals, or the effective population size (Ne). This is the number of individuals that
would give rise to the observed sampling variance for gene frequencies, or rate of
inbreeding, if they bred in the manner of the idealised population (Comstock, 1996;
Falconer and Mackay, 1996). The effective population size is therefore a relative measure
of the number of parents used to form a breeding population. It does not represent the
number of individuals from a population that are tested in a recurrent selection program and
is dependent on the level of inbreeding of the parents that are mated and the number of
gametes contributed to the next generation (Hallauer and Miranda, 1988).
Genetic drift is a consequence of sampling in a finite population of small Ne. This is a
disadvantage to any breeding program as genetic diversity needs to be maintained within a
breeding population. Small values of Ne can result from applying intense selection pressure.
Following the theoretical development by Comstock (1996) the standard procedure for
calculating Ne is quantified by the equation
$$N_e = \frac{M}{\left[1 + f_{n-1} + \dfrac{\left(\frac{1}{2}\right)^n}{m}\right]}, \qquad (3)$$
where considered in terms of the GEP, M is the number of S0 plants sampled, $f_{n-1}$ is the
coefficient of inbreeding after n-1 generations of inbreeding, n is the number of successive
generations and m is the number of reserve seed used for intermating per S0 plant sampled.
In the above equation, whenever n and/or m are large enough to make $\left(\frac{1}{2}\right)^n / m$ small relative
to $(1 + f_{n-1})$ then:
$$N_e \approx \frac{M}{1 + f_{n-1}}. \qquad (4)$$
The theoretical Ne for the S1 recurrent selection strategy used in the GEP is derived from
equation (3) by noting that $f_{n-1} = 0$ and n = 1. These values are substituted into equation (3)
and following rearrangement this becomes
$$N_e = \frac{M}{\left[\dfrac{2m + 1}{2m}\right]}. \qquad (5)$$
This equation is a special case for S1 families, derived from the standard procedure for
calculating Ne when selecting among families produced by self-fertilisation.
When calculating the Ne of doubled haploids, equation (3) is also used. DH lines are
completely inbred in one generation, therefore $f_{n-1} = 1$. If we assume that this is equivalent
to n being large then following equation (4) this results in the theoretical Ne of DH being
$$N_e = \frac{M'}{2}, \qquad (6)$$
where $M'$ is the number of S0 plants sampled in the case of the GEP. This result is
expected because only one parent contributes gametes to the next generation, rather than the
two sets of gametes that contribute under cross- and self-pollination (depending on the level
of inbreeding in the base population). Therefore, to determine the Ne for DH,
the number of S0 plants sampled ($M'$) is divided by 2 (Equation 6).
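The theoretical predictions of equations (3), (5) and (6) are simple to evaluate numerically. The following is a minimal sketch, not part of QU-GENE: the function names are hypothetical, and equation (3) is read as M divided by 1 + f(n-1) + (1/2)^n / m, which is consistent with the special cases in equations (4) and (5).

```python
def ne_selfed_families(M, m, n, f_prev):
    """Equation (3): Ne = M / (1 + f_{n-1} + (1/2)^n / m)."""
    return M / (1.0 + f_prev + 0.5 ** n / m)


def ne_s1(M, m):
    """Equation (5): S1 families, i.e. f_{n-1} = 0 and n = 1."""
    return 2.0 * m * M / (2.0 * m + 1.0)


def ne_dh(M_prime):
    """Equation (6): DH lines, one DH line per S0 plant sampled."""
    return M_prime / 2.0


# Example: 20 S0 plants sampled, with 1 or 100 reserve seeds per S0 plant.
for m in (1, 100):
    print(m, round(ne_s1(20, m), 2))
print(ne_dh(20))
```

With M fixed, the S1 value approaches M as m increases, whereas the DH value stays at M'/2, so for the same number of sampled S0 plants the DH strategy has the smaller theoretical Ne.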
Of practical importance to plant breeders is the amount by which the probabilities of
fixation of favourable alleles are increased by selection. Kimura (1957) considered the case
where a gene has two alleles, the relative fitnesses of the single-locus genotypes are constant
through time, any level of dominance except overdominance applies, and the effective population
size is constant through time. He derived the following equation, which has been shown to be a
close approximation to the probability of fixation of a favourable allele:
$$P(\text{fixation}) = \frac{\int_0^{p_0} e^{-2N_e s x\left[2h + x(1 - 2h)\right]}\,dx}{\int_0^{1} e^{-2N_e s x\left[2h + x(1 - 2h)\right]}\,dx}, \qquad (7)$$
where $p_0$ is the initial frequency of the favourable allele, Ne is the effective population size
as defined above, s is the selection coefficient (a selection proportion), x is the continuous
variable being measured (gene frequency) ranging from 0 to 1, and h is the dominance
coefficient.
When there is no selection (s = 0) equation (7) reduces to P(fixation) = $p_0$. It also must
be noted that P(fixation) is a function of the product Nes, and not of Ne and s as separate
values. It is much easier to derive a numerical evaluation of P(fixation) when h = ½, as this
corresponds to an additive model. When h = ½, and x is defined in relation to the initial
gene frequency ($p_0$) rather than as a continuous function, equation (7) reduces to
$$P(\text{fixation}) = \frac{1 - e^{-2N_e s p_0}}{1 - e^{-2N_e s}}. \qquad (8)$$
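Equations (7) and (8) can be evaluated numerically to see how Ne and s jointly affect fixation. The sketch below is illustrative only: the function names are hypothetical, the exponent of equation (7) is taken to be -2 Ne s x [2h + x(1 - 2h)], and the integrals are approximated with a simple midpoint rule.

```python
import math


def p_fix_additive(p0, ne, s):
    """Equation (8): probability of fixation for the additive case (h = 1/2)."""
    if s == 0:
        return p0  # no selection: fixation probability equals initial frequency
    return (1.0 - math.exp(-2.0 * ne * s * p0)) / (1.0 - math.exp(-2.0 * ne * s))


def p_fix_general(p0, ne, s, h, steps=10000):
    """Equation (7), integrated numerically with the midpoint rule."""
    def integrand(x):
        return math.exp(-2.0 * ne * s * x * (2.0 * h + x * (1.0 - 2.0 * h)))

    def integral(upper):
        dx = upper / steps
        return sum(integrand((i + 0.5) * dx) for i in range(steps)) * dx

    return integral(p0) / integral(1.0)


# The additive case depends only on the product Ne * s.
print(p_fix_additive(0.2, 10, 0.1))
print(p_fix_additive(0.2, 100, 0.01))
# With h = 1/2 the general integral should agree with equation (8).
print(p_fix_general(0.2, 10, 0.1, 0.5))
```

The first two calls use different Ne and s but the same product Nes, so they return the same probability, which is the point made in the text about P(fixation) being a function of Nes.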
Equations (7) and (8) are important to breeders as they provide a quantifiable basis for
determining the influence that the population sizes and selection coefficient values used in
breeding programs have on the probability of fixing favourable genes. From equation (8) it
can be seen that with an additive model there is a relationship between Ne and the selection
coefficient that determines the chances of fixing or losing favourable alleles (Comstock,
1996).
2.7 Linkage Disequilibrium
Linkage disequilibrium is an important factor in the GEP as the starting population is
obtained from the random intermating of 10 initial parents (Fabrizius et al., 1996), a
relatively small number. In one generation of random mating recombination occurs;
however, there is a chance that either undesirable or desirable genes may be linked to
desirable genes. When genes are linked, selection for the desirable gene will also result in
indirect selection of the linked gene, increasing its frequency in the population. This form
of indirect selection is unwanted if an undesirable gene is linked to a desirable gene being
selected for as it will decrease the potential response to selection of the population.
Linkage disequilibrium between loci can originate through selection, migration, mutation
and random drift (Lynch and Walsh, 1998). Two alleles at two loci (A allele or a allele at
locus 1 and B allele or b allele at locus 2) can be linked in a coupling (AB/ab) or repulsion (Ab/aB)
phase. A population is in linkage disequilibrium when the frequency of gametes with genes
in coupling is not equal to the frequency of gametes with genes in repulsion (Fehr, 1987). It
is most common in populations derived from two inbred parents with contrasting
phenotypes (e.g. one parent is tall (AABB), while the other parent is short (aabb)). Linkage
disequilibrium can influence heritability estimates by causing an upward bias (increase) or
downward bias (decrease) in the estimates of additive ($\sigma^2_A$) and non-additive (dominance,
$\sigma^2_D$) genetic variation (Fehr, 1987; Hallauer and Miranda, 1988).
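The degree of linkage disequilibrium between two loci is commonly summarised by the coefficient D, the product of the coupling gamete frequencies minus the product of the repulsion gamete frequencies. The sketch below uses this standard definition; the function name and the example frequencies are assumptions made for illustration.

```python
def linkage_disequilibrium(f_AB, f_Ab, f_aB, f_ab):
    """D = f(AB) * f(ab) - f(Ab) * f(aB); D = 0 at linkage equilibrium."""
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("gamete frequencies must sum to 1")
    return f_AB * f_ab - f_Ab * f_aB


# Gametes from a cross of AABB x aabb before any recombination:
# only coupling gametes (AB and ab) are present.
print(linkage_disequilibrium(0.5, 0.0, 0.0, 0.5))
# Coupling and repulsion gametes equally frequent: linkage equilibrium.
print(linkage_disequilibrium(0.25, 0.25, 0.25, 0.25))
```

The first case corresponds to the two-inbred-parent example in the text, where disequilibrium is at its maximum; the second gives D = 0.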
Groups of genes that are linked, and tend to be transmitted intact from one generation to the
next, are referred to as linkage blocks. Linkage can influence estimates of genetic variance
for quantitative characters. For achievement of linkage equilibrium in a population, the
opportunity must be provided for genetic recombination within heterozygous individuals.
This requires repeated generations of intermating or selfing of heterozygous individuals.
Recombination is an event that occurs during meiosis, which causes new combinations of
genes to occur, and helps break up linkage blocks and reduce the linkage disequilibrium
effect. The length of linkage blocks that are retained in a breeding population is influenced
by the number of parents used to develop the population, the number of generations of
intermating before selfing is initiated and the number of selfing generations conducted after
intermating is completed (Fehr, 1987).
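The effect of repeated intermating on linkage disequilibrium can be sketched with the standard result for a large random-mating population, in which the disequilibrium D between two loci declines by a factor (1 - c) each generation, where c is the recombination fraction. The function name and the example values are assumptions for illustration; selfing generations are not covered by this formula and break up linkage blocks more slowly.

```python
def ld_after_random_mating(d0, c, generations):
    """Disequilibrium remaining after t generations of random mating:
    D_t = D_0 * (1 - c)^t, with recombination fraction c."""
    return d0 * (1.0 - c) ** generations


d0 = 0.25  # maximum disequilibrium, e.g. a cross between two inbred parents
for c in (0.5, 0.1):  # unlinked loci and tightly linked loci
    remaining = [round(ld_after_random_mating(d0, c, t), 4) for t in (1, 5, 10)]
    print(c, remaining)
```

Unlinked loci (c = 0.5) lose half of their disequilibrium each generation, whereas tightly linked loci retain much of it even after ten generations of intermating.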
In the GEP the number of parents that form the starting population is relatively small, there
is only one generation of random mating before selfing starts followed by one generation of
selfing for the intermating units used within the GEP modified S1 family strategy. All these
factors contribute to a relatively low frequency of recombination events and a high level of
linkage disequilibrium in the GEP. It is expected that the level of linkage disequilibrium in
the DH strategy will be greater than that of the S1 families (Powell et al., 1992), as the S1
families, unlike DH lines, have a further opportunity to recombine during selfing after the
intermating of the selected lines.
The reduction in additive genetic variance due to gametic linkage disequilibrium caused by
selection is known as the Bulmer effect (Bulmer, 1971, 1980; Falconer and Mackay,
1996). The changes of the additive genetic variance affect variances, covariances and
heritability, with these parameters requiring re-estimation at each cycle during recurrent
selection (Charmet et al., 1993). The change in these parameters means that the response to
selection will also be altered. To predict long-term response to selection the effects of
linkage disequilibrium and genetic drift on additive variance need to be considered
simultaneously (Wei et al., 1996).
2.8 Study focus
The literature outlined above covers the necessary background that needs to be considered
when evaluating the response to selection of a breeding program. The computer simulations
will be conducted using the computer program QU-GENE, to evaluate the response to
selection for both S1 families and DH lines by analysing the impact of effective population
sizes, selection intensity, linkage disequilibrium and genotype-by-environment interactions.
3. Materials and Methods
The QU-GENE simulation platform (Podlich and Cooper, 1998) was used to conduct the
simulation experiments. The application module, GEPRSS, representing the GEP had
already been developed prior to the commencement of this project (Podlich and Cooper,
1997). In the GEPRSS module both the S1 family and DH strategies were implemented as
options. An outline of the way in which the two breeding strategies were modelled in the
GEPRSS module is presented in Figure 3.
[Figure 3: chart with columns Year (1 to 5), Activity (S1) and Activity (DH). Labels shown: Random intermating; 10,000 S0 plants, sample 2,000; 2,000 S1 families, sample 1,000; MET (5 sites) S1 evaluation (shown twice); Generate doubled haploid plants; Production of DH lines; Seed increase; MET (5 sites) DH evaluation; MET (5 sites) DH evaluation, select.]
Figure 3: Outline of the activities involved in the S1 and DH breeding strategies over one cycle of
the GEP. The S1 activities are adapted from Fabrizius et al. (1996).
To quantify rate of response to selection and long-term selection response each breeding
strategy was run for 10 cycles, which is equivalent to 40 years of the S1 strategy and 50
years of the DH strategy. On a time scale of 40 years the two strategies can alternatively be
compared after 10 cycles of selection for S1 families and after 8 cycles of selection for DH
lines. Response to selection was calculated as the genotypic value of selected individuals
expressed as a percentage of the target genotype, where the target genotype was defined to
be the genotype containing all of the favourable alleles. The mean response to selection was
estimated as the average response obtained for 100 runs of the simulation experiment. This
methodology was used by Podlich et al. (1998) to normalise response to selection for
comparisons between genetic models.
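The normalisation described above can be sketched as follows. This is not the GEPRSS implementation: the function names and the example values are hypothetical, and the sketch assumes that genotypic values are measured on a scale where the target genotype has a positive value.

```python
def response_percent(genotypic_value, target_value):
    """Genotypic value expressed as a percentage of the target genotype."""
    return 100.0 * genotypic_value / target_value


def mean_response(values_by_run, target_value):
    """Average the percentage response over the runs of one experiment."""
    percentages = [response_percent(v, target_value) for v in values_by_run]
    return sum(percentages) / len(percentages)


# Hypothetical example: a 50-gene additive model in which the target genotype
# carries 100 favourable alleles, and three runs reach 70, 75 and 80 of them.
print(mean_response([70, 75, 80], target_value=100))
```

Expressing each run relative to the target genotype puts models with different numbers of genes on a common 0 to 100 scale before averaging.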
Analyses of variance were conducted on the results of the experiments using the ASREML
software (Gilmour et al., 1999).
Important points with regard to the simulations:
(1) Heritability in the GEPRSS module is calculated on a plot mean basis, however in
the QUGENE engine it is assigned on a single plant basis in the base population.
For the MET evaluation phase of the cycle the between plot experimental variance
was set to be two times that of the within plot variance (Podlich et al., 1998),
(2) All experiments were conducted with 20 families being selected from the METs to
go into the next cycle of selection, unless otherwise stated,
(3) When the term families is used in the experiments it refers to both S1 families
and DH lines, i.e. families and lines are used interchangeably throughout the report
for DH lines to simplify presentation of results.
Five simulation experiments were conducted to evaluate different aspects of response to
selection in the GEP. These were:
(1) Comparison of simulated and theoretical predictions of the effective population
size (Ne) of S1 families and DH lines (Experiment 1),
(2) Determining the effects of linkage disequilibrium on gene frequency and response
to selection (Experiment 2),
(3) Evaluating the response to selection of S1 families and DH lines for an additive
genetic model (Experiment 3),
(4) Evaluating the impact of selection proportion on response to selection for an
additive genetic model (Experiment 4),
(5) Evaluating the influence of G×E interaction genetic models on response to
selection (Experiment 5).
The treatments incorporated for each simulation experiment and their objectives are
explained below.
Experiment 1: Comparison of simulated and theoretical predictions of the effective
population size (Ne) of S1 families and DH lines
The objective of running the effective population size experiments was to determine
whether the S1 families or DH lines strategies in the GEP reached a critical point where
favourable genes were being lost due to the effects of genetic drift. A secondary objective
was to compare the simulation results of the Ne with the theoretical predictions, that were
given in the literature review section of this thesis. Doubled haploid effective population
equations only exist at a restricted level (e.g. for the case of one DH plant per S0 plant
selected), therefore it was of interest to see what the simulated Ne results were when more
DH plants were produced per S0 plant.
The effective population size was simulated using the S1 and DH strategy additive model
input files to see whether they conformed to the theoretical predictions. The following
parameters were considered:
(1) heritability (one level: 1.00)
(2) number of genes contributing to the trait (one level: 50)
(3) starting gene frequency (one level: 0.5)
(4) number of families used in the METs (five levels: 100, 250, 500, 750, 1000)
(5) number of families selected (no selection was imposed, therefore this value was
equal to the number of families being evaluated in the MET)
Refer to Appendix 1 Table A1.1 for QUGENE engine input file.
To test for the effective population size two more parameters were altered under each
option. The parameters changed in the theoretical equation (5) for the S1 families were:
(1) number of S0 plants sampled (equivalent to number of S1 families) ( M ): 5, 10,
15, 20, 25, 30, 40, 50, 100, 150, 200
(2) number of reserve seeds per S0 plant (m): 1, 2, 3, 4, 5, 10, 100.
M and m are both components of the theoretical equations, allowing the simulations and the
theoretical predictions to be compared. This enabled investigation of the effective
population size of the S1 strategy in the GEP at a range of numbers of families evaluated in
the MET and the number of reserve seed used in random mating to create the base
population for each cycle of selection.
The approach to determine the effective population size of the DH strategy was slightly
different to that used for the S1 families. The parameters changed in the input file to
determine the Ne of DH lines were:
(1) number of S0 plants sampled ($M'$): 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200
(2) number of DH produced per S0 plant ($m'$): 1, 2, 3, 4, 5, 10.
Only $M'$, however, is present in the theoretical equation (6) to determine the Ne of DH
lines. There is presently no theoretical equation derived to determine the Ne of DH when
more than one DH plant is produced per S0 plant (i.e. when $m' > 1$). Simulations were still
conducted with values of $m'$ greater than one so that the response of Ne due to changes in
$m'$ could be evaluated.
The effective population size was calculated for each of the 50 genes not under the
influence of selection in the QU-GENE simulations following the procedure outlined in
example 4.1 (p.70) in Falconer and Mackay (1996). The inbreeding coefficient (F) for each
gene was calculated from the variation among 100 runs using the following formula
$$F = \frac{\sigma^2_q}{\bar{p}\,\bar{q}}, \qquad (9)$$
where $\sigma^2_q$ is the variance of gene frequencies among runs, $\bar{p}$ is the mean gene frequency of a
particular allele at a locus among runs, and $\bar{q}$ is the mean gene frequency of all other alleles at
that locus among runs. Using this procedure each gene gave an independent estimate of F.
From this estimate, the rate of inbreeding (∆F) can be calculated from the following
rearranged equation
$$\Delta F = 1 - (1 - F_t)^{1/t}, \qquad (10)$$
where t is the generation number. The effective population size of each of the fifty genes
was then calculated from ∆F with the following equation
$$N_e = \frac{1}{2\Delta F}. \qquad (11)$$
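The three steps in equations (9) to (11) can be chained into a single calculation per gene. The sketch below is an illustration, not the code used in the study: the function name is hypothetical, the input is assumed to be the frequency of one allele at one locus recorded in each run after t generations, and the variance among runs is taken as the population variance.

```python
from statistics import mean, pvariance


def ne_from_runs(freqs_by_run, t):
    """Estimate Ne for one gene from its allele frequency in repeated runs."""
    p_bar = mean(freqs_by_run)                 # mean frequency among runs
    q_bar = 1.0 - p_bar
    f_t = pvariance(freqs_by_run) / (p_bar * q_bar)   # equation (9)
    delta_f = 1.0 - (1.0 - f_t) ** (1.0 / t)          # equation (10)
    return 1.0 / (2.0 * delta_f)                      # equation (11)


# Hypothetical frequencies of one neutral allele in four runs after 1 generation.
print(ne_from_runs([0.4, 0.6, 0.4, 0.6], t=1))
```

Applying the function to each of the 50 genes gives 50 independent estimates of Ne, whose spread indicates the variation of the estimate.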
An estimate of the variation of Ne was obtained by estimating the effective population size
for each gene, and the variation amongst these genes.
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and
response to selection
A study was conducted to determine the impact of linkage disequilibrium on the frequency
of genes in the GEP after 5 cycles of selection. This study was only conducted on S1
families. Both an additive and a complex G×E interaction model were assessed and compared.
Genetic models based on twenty genes were considered. The effects of the genes were
scaled to generate major or minor genes. The favourable alleles for all 20 genes
commenced at a gene frequency of 0.2. Therefore, any positive effects of selection were
expected to increase the gene frequencies above the starting value of 0.2. The following
parameters were considered in both the additive and G×E interaction models:
(1) heritability (one level: 0.95)
(2) number of genes contributing to the trait (one level: 20)
(3) starting gene frequency (one level: 0.2)
(4) number of families used in the METs (one level: 250)
(5) number of families selected (one level: 10)
Refer to Appendix 1 Table A1.2 for the QUGENE engine input file.
To reduce the effects of linkage disequilibrium in the recurrent selection strategy, ten
generations of random mating were incorporated into the S1 program. The random matings
were conducted from the S0 plants for each cycle of selection. The rate of change of the
favourable alleles for each gene was monitored for the cases with and without the extra
generations of random mating. Of particular interest was whether the presence of linkage
disequilibrium influenced the rate of change in gene frequency of the favourable alleles. In
the absence of any effects of linkage disequilibrium the rate of change in the frequencies of
the alleles was expected to be proportional to the size of the gene effects, and independent
of the number of generations of random mating. However, when linkage disequilibrium
was present, i.e. as was expected without the additional generations of random mating, the
rate of change in the frequencies of the alleles could be influenced by the size of the gene
effects and the degree of linkage disequilibrium.
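The reason extra generations of random mating reduce linkage disequilibrium can be illustrated with the standard two-locus result that, under random mating, disequilibrium D decays by a factor of (1 − r) each generation, where r is the recombination fraction between the loci. The sketch below is not taken from the report or from QU-GENE; the function name and the starting value of D are hypothetical, and it ignores selection and drift.

```python
def ld_after_random_mating(d0, r, generations):
    """Expected disequilibrium after the given number of random-mating generations.

    d0: starting disequilibrium (hypothetical value for illustration)
    r:  recombination fraction between the two loci (0.5 for unlinked loci)
    """
    return d0 * (1.0 - r) ** generations


# Compare one generation of random mating with ten, for unlinked and linked loci.
for r in (0.5, 0.1):
    one = ld_after_random_mating(0.1, r, 1)
    ten = ld_after_random_mating(0.1, r, 10)
    print(f"r = {r}: D after 1 generation = {one:.5f}, after 10 generations = {ten:.5f}")
```

For unlinked loci (r = 0.5) ten generations leave about one thousandth of the starting disequilibrium, whereas a single generation leaves half of it, which is the contrast the experiment was designed to exploit.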
Experiment 3: Evaluating the response to selection of S1 families and DH lines for an
additive genetic model
This experiment was conducted to determine whether using DH lines resulted in a faster
response to selection relative to the S1 families for a range of heritabilities, number of
genes contributing to the attribute, number of families evaluated in the MET and the
number of families selected from the MET to progress into the next cycle of selection.
Theoretical considerations suggest that a higher rate of response would be observed when
DH lines were used in the place of S1 families in the GEP.
Using a completely additive genetic model (i.e. no epistasis, no genotype-by-environment
interaction and no linkage) the following parameters were altered providing a range of
genetic model scenarios:
(1) heritability (five levels: 0.05, 0.25, 0.50, 0.75, 0.95)
(2) number of genes contributing to the trait (four levels: 5, 10, 20, 100)
(3) starting gene frequency (one level: 0.2)
(4) number of families used in the METs (five levels: 100, 250, 500, 750, 1000)
(5) number of selected families (one level: 20)
Refer to Appendix 1 Table A1.3 to Table A1.6 for the relevant QUGENE engine input file.
Experiment 4: Evaluating the impact of selection proportion on response to selection for
an additive genetic model
The number of families selected from one cycle of selection to be progressed through the
next cycle of selection is an important factor in the GEP. If too few families are selected
(high selection intensity) random drift may result and valuable genes may be lost from this
population. On the other hand if too many families are selected (low selection intensity)
then too many undesirable genes will be retained in the population and the response to
selection will be slowed down. The GEP currently selects 20 families based on the results
of the MET. This experiment was conducted to determine whether this figure provided a
suitable balance between selection intensity and effective population size.
Table 1: Selected proportion (%) and, in brackets, the corresponding standardised selection
differential (S) from Falconer and Mackay (1996), for each combination of the number of families
in the MET and the number of families selected from the MET.

Number of            Number of families in MET
families selected    250           500          750              1000
5                    2% (2.054)    1% (2.326)   0.67% (2.4705)   0.5% (2.576)
10                   4% (1.751)    2% (2.054)   1.3% (2.227)     1% (2.326)
15                   6% (1.555)    3% (1.881)   2% (2.054)       1.5% (2.1705)
20                   8% (1.405)    4% (1.751)   2.67% (1.945)    2% (2.054)
25                   10% (1.282)   5% (1.645)   3.33% (1.8255)   2.5% (1.960)
30                   12% (1.175)   6% (1.555)   4% (1.751)       3% (1.881)
Table 1 documents the number of families in a MET and the selected proportions applied
and shows that as the selected proportion increases, the standardised selection differential
and selection intensity decrease. To explore the effect that different selection proportions
can have on the response to selection, simulations were run for both S1 families and DH
lines where the following parameters were used:
(1) heritability (one level: 0.95)
(2) number of genes contributing to the trait (four levels: 5, 10, 20, 100)
(3) starting gene frequency (one level: 0.2)
(4) number of families used in the METs (five levels: 100, 250, 500, 750, 1000)
(5) number of selected families (six levels: 5, 10, 15, 20, 25, 30)
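The entries of Table 1 can be approximated with a short calculation. Where the selected proportion is a round percentage, the bracketed values in Table 1 coincide with the standard normal deviate at the truncation point, so the sketch below reproduces them on that basis. Whether this is exactly the quantity tabulated by Falconer and Mackay (1996) is an assumption, and the function names are hypothetical.

```python
from statistics import NormalDist


def selected_proportion(n_selected, n_families):
    """Fraction of the families in the MET that are carried forward."""
    return n_selected / n_families


def truncation_deviate(proportion):
    """Standard normal deviate above which the given proportion of families lies."""
    return NormalDist().inv_cdf(1.0 - proportion)


# Print the selected proportion and deviate for each cell of a Table 1 style grid.
for n_sel in (5, 10, 15, 20, 25, 30):
    cells = []
    for n_fam in (250, 500, 750, 1000):
        p = selected_proportion(n_sel, n_fam)
        cells.append(f"{100 * p:.2f}% ({truncation_deviate(p):.3f})")
    print(n_sel, *cells, sep="   ")
```

The grid makes the trade-off explicit: halving the number of families selected, or doubling the number tested, moves the truncation point further into the tail.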
Experiment 5: Evaluating the influence of G×E interaction genetic models on response
to selection
Genotype-by-environment interaction was included in the genetic model to determine
whether the responses to selection, and the advantages of the DH lines over S1 families, that
were observed for the additive model would be retained in the presence of G×E
interactions. It was also incorporated because G×E interaction has a major influence on the
selection of genotypes in Australian environments, and the simulations would be
incomplete if it were not considered as a factor in the genetic model. Two major experiments
were undertaken to fulfil two objectives. The first was to assess the response to selection when
two years of METs were conducted for both the DH lines and S1 families. The second was to
assess the response to selection when the DH lines were evaluated with only one year of METs.
It was expected that the DH line advantage would be retained when two years of METs
were conducted; however, it was uncertain whether it would be retained in the scenario
where one year of METs was used.
To introduce G×E interaction into the additive model, five environment types were added
into the QU-GENE engine input file. The inputs into the genotype-environment system can
be manipulated so that genes can have different effects in different environments, thus
generating G×E interactions. In the input (Table 2), a value of 0 means that a gene has no
effect in that environment, a value of 1 means that a gene has the effects defined by the
m,a,d genetic model, and –1 means that a gene has a cross-over genetic effect in that
environment. These different gene effects are outlined in bold (Table 2) for each of the five
environments (E1 – E5). Refer to Appendix 1 for all of the input files.
Five G×E interaction models were produced to create different levels of G×E interaction
(Table 3). There were 20 genes contributing to the attribute subjected to selection in each of
the simulations. The genes in this experiment were interacting with five environment types.
Table 2: Input file for the QUGENE engine. This file represents the G×E model 5 (Table 3). The
genes 5-16 are removed from this presentation for conciseness. GN represents the gene number and
E1 – E5 represent the five environment types within the target population of environments. The
detail of the structure of this input file is explained by Podlich and Cooper (1997; 1998).
S ! Pollination Type
N ! Random Seed
100 200 300
Y N N ! Linkage, Epistasis, Random GxE
1 ! Gene Sampling type (1=fixed, 2=random)
10 ! no. of runs to calc var comp
(BP,Progeny, Genes, Attributes, Environment Types, Sample Environments)
5000 10 22 2 5 1
0.4 0.3 0.15 0.1 0.05 ! Environment Frequency
1 1 1 1 1 ! GxE multipliers
0.95 1 ! Heritability for each Attribute
GN M A D AT L LN K E1 E2 E3 E4 E5 P
1 0.100 0.050 0.000 1 1 1 0 1 0 -1 1 1 0.2
2 0.100 0.050 0.000 1 1 0.5 0 1 1 -1 1 0 0.2
3 0.100 0.050 0.000 1 1 0.5 0 1 -1 1 -1 1 0.2
4 0.100 0.050 0.000 1 1 0.5 0 1 -1 0 1 1 0.2
" " " " " " " "
17 0.100 0.050 0.000 1 1 0.5 0 1 0 1 1 -1 0.2
18 0.100 0.050 0.000 1 1 0.5 0 1 -1 1 -1 1 0.2
19 0.100 0.050 0.000 1 1 0.5 0 1 0 -1 1 -1 0.2
20 0.100 0.050 0.000 1 1 0.5 0 1 1 -1 -1 1 0.2
21 0.50 -0.500 -0.490 2 1 2 0 1 1 1 1 1 0
22 0.50 -0.500 -0.490 2 1 0.5 0 1 1 1 1 1 1
N ! Mating Type
1 ! Selection type
********************************************************************
R ! Mating (Random Mating)
10 5 0.2 ! Generations, Generations before selection, Select Pressure
M ! Mating (Mixture)
0.8 0.2 10 5 0.2 ! Proportions (RM/S), Gen, Gen before Sel, Sel Pressure
N ! No Further Mating
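The way the E1 to E5 columns of Table 2 generate G×E interaction can be illustrated with a small sketch. It assumes that the multiplier for an environment simply scales the additive effect (A) of the gene in that environment, so that 1 keeps the effect, 0 removes it and −1 reverses it. This reading follows the description in the text, but the actual QU-GENE calculation is not shown in this report, so the function names and the arithmetic are hypothetical. The environment frequencies and the multipliers for gene 1 are taken from Table 2.

```python
# Environment frequencies from the "Environment Frequency" line of Table 2.
ENV_FREQ = [0.4, 0.3, 0.15, 0.1, 0.05]


def effect_by_environment(a, multipliers):
    """Additive effect of a gene in each environment type (assumed: multiplier * a)."""
    return [a * e for e in multipliers]


def weighted_mean_effect(a, multipliers, env_freq=ENV_FREQ):
    """Average effect across the target population of environments (assumed weighting)."""
    return sum(f * a * e for f, e in zip(env_freq, multipliers))


# Gene 1 of Table 2: A = 0.050, multipliers E1..E5 = 1, 0, -1, 1, 1.
gene1 = [1, 0, -1, 1, 1]
print(effect_by_environment(0.05, gene1))
print(round(weighted_mean_effect(0.05, gene1), 4))
```

Under this assumed reading, a gene with a cross-over effect in a frequent environment has a much smaller average effect across environments than its value of A suggests, which is one way selection on the MET mean can become less effective.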
To explore these models, simulations were run for both S1 families and DH lines where the
following parameters were used:
(1) heritability (three levels: 0.05, 0.25, 0.95)
(2) number of genes contributing to the trait (one level: 20)
(3) starting gene frequency (one level: 0.2)
(4) number of families used in the METs (four levels: 250, 500, 750, 1000)
(5) number of selected families (three levels: 10, 20, 30)
(6) level of G×E interaction (five levels: models 1, 2, 3, 4, 5); Table 3
Refer to Appendix 1 Table A1.7 to Table A1.11 for the relevant QUGENE engine input
files.
Table 3: Each model describes the number of genes interacting with the five environment types and
the level of G×E interaction present as described by the ratio of the genotype-by-environment
interaction variance to the genotypic variance ($\sigma^2_{GE}:\sigma^2_{G}$).

Model number   Number of genes interacting   $\sigma^2_{GE}:\sigma^2_{G}$
1              10                            0.4
2              10                            0.6
3              15                            0.8
4              15                            1.1
5              20                            2.89
4. Results
Experiment 1: Comparison of simulated and theoretical predictions of the effective
population size (Ne) of S1 families and DH lines
The simulated effective population size (Ne) of the S1 family strategy corresponded well
with the predictions based on theoretical equation (5) (Figure 4). As the number of S1
families selected (M) increases, Ne increases. Ne also increases as the number of reserve
seed used for intermating per S0 plant sampled (m) increases.
[Figure 4 plot. x-axis: Number of S0 plants sampled (M), 0 to 200; y-axis: Effective population size (Ne), 0 to 250; series: m = 1, 2, 3, 4, 5, 10, 100.]
Figure 4: S1 families effective population size (Ne) calculated theoretically (solid line)
and the average of the simulation runs (broken lines) for a range of values for the number
of S0 plants sampled (M) and the number of reserve seed used for intermating per S0 plant
sampled (m).
The variability of Ne for two levels of m (1 (Figure 5a), 100 (Figure 5b)) is indicated by the
scatter points about the mean (solid line) for each value of M. The variability of Ne about
the mean increases as the number of S0 plants sampled increases. This effect was observed
for all levels of m (Appendix 2).
[Figure 5 plots. Panels: (a) m = 1; (b) m = 100. x-axis: Number of S0 plants sampled (M), 0 to 200; y-axis: Effective population size (Ne), 0 to 300.]
Figure 5: Simulated S1 family effective population size (Ne) variation, about the average
of the simulation runs (solid line) for a range of S0 plants sampled (M) and the two
extreme values of reserve seed used for intermating per S0 plant sampled (m). (For
intermediate levels of m refer to Appendix 2.)
As with the S1 families, Ne for the DH lines increases as the number of S0 plants sampled (M')
increases (Figure 6). The simulated results and theoretical predictions show
good correspondence.
[Figure 6 plot. x-axis: Number of S0 plants sampled (M'), 0 to 250; y-axis: Effective population size (Ne), 0 to 120; series: m' = 1.]
Figure 6: Comparison of the DH simulated average (closed circles) and DH
theoretical (solid line) effective population size for a range of S0 plants
sampled (M') and when only one DH plant was produced per S0 plant
sampled (m' = 1).
Figure 7 indicates how the simulated Ne increases as the number of DH plants produced per
S0 plant (m') increased. Theoretical equations were derived only for the situation when
m' = 1; however, the simulated Ne for four levels of m' > 1 was also plotted. Like the S1
strategy, as the number of DH plants produced per S0 plant (m') increased the Ne also
increased. This increase was less than that observed for increasing m in the case of the S1
family strategy (Figure 4).
[Figure 7 plot. x-axis: Number of S0 plants sampled (M'), 0 to 200; y-axis: Effective population size (Ne), 0 to 200; series: m' = 1, 2, 3, 4, 5, 10.]
Figure 7: Average of the simulated DH effective population size (Ne) for
a range of S0 plants sampled (M') and DH plants produced per S0 plant
sampled (m'). A regression line is fitted to each m'.
The variability of the DH lines Ne for each M' for the simulated data is shown for two
levels of m' (1 (Figure 8a), 10 (Figure 8b)). The variation of the Ne is indicated by the
scatter points about the mean (solid line) for each value of M'. The variability about the
mean increases as the number of S0 plants sampled increases. This effect was observed for
all levels of m' (Appendix 2).
[Figure 8 plots. Panels: (a) m' = 1; (b) m' = 10. x-axis: Number of S0 plants sampled (M'), 0 to 200; y-axis: Effective population size (Ne), 0 to 240.]
Figure 8: Simulated DH line effective population size (Ne) variation, about the average
of the simulation runs (solid line) for a range of S0 plants sampled (M') and two
extreme numbers of DH plants produced per S0 plant sampled (m'). (For intermediate
levels of m' refer to Appendix 2.)
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and
response to selection
The results from the linkage disequilibrium experiment focus on cycle five of the GEP,
where the S1 family strategy was used. The genes were scaled to have a distribution of
effects ranging from 1.2% to 10% of the total trait value. Both an additive model and a G×E
interaction ($\sigma^2_{GE}:\sigma^2_{G}$ = 2.89) model were considered. All genes commenced with a gene
frequency of 0.2. Therefore, any increase in the frequency above 0.2 is a consequence of
selection. The smaller the increase in frequency towards a frequency of 1.0, the less
effective was the influence of selection on changing gene frequency. It can be seen from
Figure 9a,c that for most genes selection was effective in increasing the frequency of the
favourable allele, and after five cycles of selection the genes ended up with different gene
frequencies for both the additive and the G×E interaction models.
[Figure 9 plots. Panels: (a) Gene frequency of 20 genes, with one cycle of random mating; (b) Gene value and frequency for 20 genes, with one cycle of random mating; (c) Gene frequency of 20 genes, with 10 generations of random mating; (d) Gene value and frequency for 20 genes, with 10 generations of random mating per cycle. x-axis: Gene number, 0 to 25 (a, c) or Value of gene, 0.00 to 0.10 (b, d); y-axis: Gene frequency; series: Additive, GxE.]
Figure 9: The influence of linkage disequilibrium in the GEP with S1 families, 20 genes
under both an additive model and a G×E interaction ($\sigma^2_{GE}:\sigma^2_{G}$ = 2.89) model after five cycles of
selection. (a) frequency of each gene in the model plotted for one generation of random
mating per cycle, (b) gene frequency and value for each of the 20 genes for one generation of
random mating per cycle, (c) frequency of each gene in the model plotted for 10 generations of
random mating per cycle, and (d) gene frequency and value for each of the 20 genes for 10
generations of random mating per cycle.
After five cycles of selection, genes with a relatively low value could have either a low or
high frequency of occurrence in the population (Figure 9b). When the genes were influenced
by the effects of G×E interaction their frequency in the population was generally less than
when there was no G×E interaction effect. There is a lack of a consistent relationship
between the magnitude of the effect of a gene and its frequency following five cycles of
selection for the case where no generations of additional random intermating were
undertaken (Figure 9b). Therefore, genes with similar value, in terms of the way that they
contributed to the trait, can have dissimilar gene frequencies. It is hypothesised that this is
predominantly a consequence of linkage disequilibrium. To reduce the effects of linkage
disequilibrium, ten generations of random intermating following each cycle of selection
were added into the simulation (Figure 9d). With the inclusion of the additional generations
of random mating after each cycle of selection, the frequency of the genes was found to be
approximately proportional to the value of the gene after five cycles of selection. As
expected, a pattern was observed after selection whereby genes with low value had a lower
frequency in the population relative to genes with a higher value. The genes in the additive
model still had a higher frequency than was the case for the G×E interaction model genes.
This was expected due to the added complications of selection due to G×E interaction.
There also appears to be a point (approximately a gene value of 0.055) on Figures 9b,d
where the value of the gene is high enough that linkage disequilibrium had little or no effect
on the frequency of these genes after five cycles of selection, as genes with effects of this
magnitude or greater have comparable gene frequencies.
Experiment 3: Evaluating the response to selection of S1 families and DH lines for an
additive genetic model
The analysis of variance on the additive model simulation output data indicated significant
interactions between the two breeding strategies (DH lines and S1 families) and cycles,
heritability, number of families tested in the MET and number of genes. Greater levels of
selection response were associated with higher levels of heritability, larger numbers of
families, smaller numbers of genes and increasing numbers of cycles. On average,
including all runs and cycles, the DH strategy had a 13% mean improvement over the S1
family strategy. The following results represent a comparison between the DH and S1
strategies for the changes in response to selection when the number of families, number of
genes and heritability, in each of the strategies were changed.
As the number of families evaluated in the MET increased (with heritability 0.05, 20 genes,
and selecting 20 families), there was stronger selection pressure placed on the population
(Table 1) resulting in an increase in the rate of genetic progress (Figure 10a,b,c,d). The
simulation therefore indicates that the DH strategy provided a greater response to selection
relative to the S1 strategy over all family sizes and cycles considered.
[Figure 10 plots. Panels: (a) 250 families; (b) 500 families; (c) 750 families; (d) 1000 families. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: S1, DH.]
Figure 10: Comparison of the response to selection for S1 families and DH
lines with heritability 0.05, 20 genes and four family sizes over 10 cycles of
selection.
At a low heritability (0.05) and a medium family size (500), as the number of genes
increased it took longer to achieve a large response to selection (Figure 11a,b,c,d).
However, the DH strategy had a faster rate of progress relative to the S1 strategy over all of
the gene levels and cycles. With 20 genes contributing toward the attribute under selection
(a potentially realistic value for some traits targeted by the GEP) the DH strategy reached
100% of the target genotype after seven cycles of selection (Figure 11c), while the S1
strategy only reached approximately 90% after 10 cycles.
[Figure 11 plots. Panels: (a) 5 genes; (b) 10 genes; (c) 20 genes; (d) 100 genes. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: S1, DH.]
Figure 11: Comparison of the response to selection for S1 families and DH
lines with heritability 0.05, 500 families and four gene numbers over 10 cycles of
selection.
[Figure 12 plots. Panels: (a) 0.05 heritability, 250 families; (b) 0.05 heritability, 1000 families; (c) 0.95 heritability, 250 families; (d) 0.95 heritability, 1000 families. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: S1, DH.]
Figure 12: Comparison of the response to selection for S1 families and DH
lines with 20 genes, two heritability levels and two family sizes over 10 cycles
of selection.
Figure 12 shows the effects of two different heritability levels for low (250) and high
(1000) numbers of families when 20 genes are contributing towards the attribute under
selection. At the low heritability of 0.05 (Figure 12a,b) both family sizes have a slower
response to selection than when the heritability is high (0.95) (Figure 12c,d). When the
heritability is high, the 1000 families had a faster response to selection than when 250
families were used (Figure 12c,d). The DH strategy was again superior to the S1 strategy
across the levels of heritability examined. The change in the level of heritability however
did not have a great effect on response to selection when using an additive model.
The impact of the use of different numbers of DH lines in the GEP was assessed relative to
1000 S1 families by comparing the response to selection at two heritability levels (0.05 and
0.95) and two gene numbers (20 and 100). Over all the combinations examined the rate of
progress for 100 DH families was similar to the rate of progress observed for 1000 S1
families (Figure 13a,b,c,d). When the number of DH families was greater than or equal to
250, they gave a greater response to selection than that observed for 1000 S1 families. The
genetic models based on a larger number of genes resulted in the rate of progress being
slower (Figure 13b,d) than the models based on lower gene number (Figure 13a,c).
[Figure 13 plots. Panels: (a) heritability 0.25, 20 genes; (b) heritability 0.25, 100 genes; (c) heritability 0.95, 20 genes; (d) heritability 0.95, 100 genes. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: 1000 S1, 100 DH, 250 DH, 500 DH, 1000 DH.]
Figure 13: Comparison of the response to selection for 1000 S1 families to 100,
250, 500 and 1000 DH lines, at two heritability levels and two gene numbers, over
10 cycles of selection.
The Bulmer effect was observed in the additive model simulations and was visualised as a
rapid decrease in heritability in the early cycles of selection. This effect can be observed on
Figure 13 as a greater and more rapid increase in response to selection for the first two
cycles of selection compared to the subsequent cycles of selection.
Experiment 4: Evaluating the impact of selection proportion on response to selection for
an additive genetic model
The impact of changing the number of families selected was examined with a heritability of
0.95, 20 genes, three different family sizes (250, 500, 1000) and three different numbers of
selected families (5, 20, 30) for S1 families and DH lines separately.
When five families were selected in the S1 strategy (Figure 14a) the rate of response to
selection was faster than that observed when 20 (Figure 14c) or 30 (Figure 14e) families
were selected. However, when five families were selected the long-term selection response
plateaued before it reached 100% of the target genotype (Figure 14a). This plateau did not
occur at less than 100% of the target genotype for either 20 or 30 families selected (Figure
14c,e). The same overall response was also observed using the DH strategy (Figure
14b,d,f). The DH response to selection was much faster than that observed for the S1
strategy at all levels of families selected. 1000 families in both the S1 and DH strategy had
the fastest short-term response to selection.
The sub-optimal long-term responses to selection that were observed when five S1 and DH
families were selected (Figure 14a,b) are a consequence of the loss of favourable alleles for
some of the genes due to the effects of random drift. Thus, while the intense selection that
resulted when five families were selected gave a rapid short-term rate of genetic progress,
the small effective populations required to achieve the high selection intensity placed limits
on the long-term response to selection. The practice of selecting 20 S1 families, which is
currently used in the GEP, did not appear to place severe limits on the expected long-term
response to selection (Figure 14c).
[Figure 14 plots. Panels: (a) 5 S1 families selected; (b) 5 DH lines selected; (c) 20 S1 families selected; (d) 20 DH lines selected; (e) 30 S1 families selected; (f) 30 DH lines selected. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: 250, 500 and 1000 S1 families (a, c, e) or DH lines (b, d, f).]
Figure 14: Comparison of the response to selection for S1 families (a,c,e) and DH
lines (b,d,f) with a heritability of 0.95, 20 genes and three levels of families selected
over 10 cycles of selection.
The rate of response to selection was examined further for the S1 strategy for both an
additive and a G×E interaction model (Model 4, Table 3; $\sigma^2_{GE}:\sigma^2_{G}$ = 1.1) with a heritability
of 0.95 and 20 genes, with two levels of families selected (10, 30), and at four different family
sizes (250, 500, 750, 1000) (Figure 15). For each of the four family sizes the G×E
interaction model (circular symbols) had a faster response to selection than the additive
model (triangular symbols) in the short to medium-term (Figure 15a,b,c,d). Selecting 10
families also gave a greater response to selection than selecting 30 families for both the
additive and G×E interaction models. The rate of response to selection increased as the
number of families evaluated in the model rose from 250 (Figure 15a) to 1000
(Figure 15d). The effects of G×E interaction on response to
selection were examined further in simulation experiment five.
[Figure 15 plots. Panels: (a) 250 families; (b) 500 families; (c) 750 families; (d) 1000 families. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: GxE 10 S, GxE 30 S, Add 10 S, Add 30 S.]
Figure 15: Comparison of response to selection of the G×E interaction
($\sigma^2_{GE}:\sigma^2_{G}$ = 1.1: Table 3) model and the additive model with constant heritability
(0.95), genes (20) and two levels of families selected (S) for four different family
sizes over 10 cycles of selection.
Experiment 5: Evaluating the influence of G×E interaction genetic models on response
to selection
The analysis of variance of the G×E interaction simulation output data indicated significant
interactions between the S1 (2 years of MET), DH (1 year of MET) and DH (2 years of
MET) breeding strategies and level of G×E interaction, cycles, selected proportion and
number of families. On average, including all runs and cycles, the DH (2 MET) had a 12%
increase in mean performance compared to the S1 (2 MET) and a 2% increase in mean
performance over DH (1 MET). DH (1 MET) also had on average a 9% increase in mean
performance compared to the S1 (2 MET). The studies conducted in simulation experiment
three indicated that 100 families were not required in this model as that family size was too
small for the DH lines to have a greater response than 1000 S1 families.
With two years of MET testing for both strategies, 250 DH lines have an advantage over
1000 S1 families at both high and low levels of heritability and for all levels of G×E
interaction considered (Figure 16a,b,c,d). A faster response to selection was observed at the
higher level of heritability (Figure 16b,d) compared to the lower heritability (Figure 16a,c).
[Figure 16 plots. Panels: (a) heritability 0.05, $\sigma^2_{GE}:\sigma^2_{G}$ = 0.8; (b) heritability 0.95, $\sigma^2_{GE}:\sigma^2_{G}$ = 0.8; (c) heritability 0.05, $\sigma^2_{GE}:\sigma^2_{G}$ = 2.89; (d) heritability 0.95, $\sigma^2_{GE}:\sigma^2_{G}$ = 2.89. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: 1000 S1 (2 MET), 250 DH (2 MET), 1000 DH (2 MET).]
Figure 16: Comparison of response to selection of 1000 S1 families to two sizes
of DH lines, 20 genes, two heritability levels, two levels of G×E interaction and
both S1 families and DH lines having two years of METs.
To compare a four year DH strategy to the four year S1 cycle, simulations were also run
where the DH breeding strategy was conducted for one year of METs while S1 families
remained at two years of METs (Figure 17a,b,c,d). The advantage of 250 DH lines over
1000 S1 families was retained, but the magnitude of the advantage was reduced, when only one
year of METs was run. At a heritability of 0.95 the response to selection was faster than
when the heritability was 0.05; however, the DH advantage was lost after 8 cycles of
selection and 1000 S1 families had a slightly greater response to selection in the long-term
(Figure 17b,d).
[Figure 17 plots. Panels: (a) heritability 0.05, $\sigma^2_{GE}:\sigma^2_{G}$ = 0.8; (b) heritability 0.95, $\sigma^2_{GE}:\sigma^2_{G}$ = 0.8; (c) heritability 0.05, $\sigma^2_{GE}:\sigma^2_{G}$ = 2.89; (d) heritability 0.95, $\sigma^2_{GE}:\sigma^2_{G}$ = 2.89. x-axis: Cycle, 0 to 10; y-axis: Performance (% target genotype), 0 to 100; series: 1000 S1 (2 MET), 250 DH (1 MET), 1000 DH (1 MET).]
Figure 17: Comparison of response to selection of 1000 S1 families to two sizes
of DH lines, 20 genes, two heritability levels, two levels of G×E interaction, with
S1 families having two years of METs and DH lines having one year of METs.
5. Discussion
Experiment 1: Comparison of simulated and theoretical predictions of the effective
population size (Ne) of S1 families and DH lines
The effective population size simulations were conducted to ensure that the Ne of both the
DH lines and the S1 families was large enough that favourable alleles in the population did
not have a high probability of being lost through random drift. However, if the Ne is too
large, the response to selection is slowed by the reduced selection pressure and the greater
tendency to retain undesirable alleles in the population. In both the S1 and DH strategies,
the effective population size increased as the number of S0 plants sampled increased. This
is especially important in the DH strategy, where the Ne is smaller than when S1 families
are used. A breeder concerned about a small Ne with DH lines could increase it in this
recurrent selection strategy by sampling more than one DH line from the selected S0
plants. There are therefore opportunities to manipulate the effective population size with
DH lines within the GEP should it become an issue. Previous experiments, however, have
indicated that, as long as the Ne is maintained above a value of 10, it is not so low as to
have a major influence on the response to selection relative to the effects of selection,
even with relatively intense selection.
An effective population size that balances the risk of random drift against a slowed
response to selection can be achieved by selecting between 10 and 20 S1 or DH families per
cycle of selection. When fewer than 5 families were selected, there was strong evidence
that significant numbers of genes were lost due to random drift; when more than 20 were
selected, the response to selection was slowed considerably.
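The trade-off described above can be illustrated with a minimal sketch. It is not the QU-GENE model used in this report: it assumes truncation selection on a normally distributed criterion, 1000 families evaluated per cycle, and, as a deliberately rough simplification, an Ne equal to the number of families selected (the report derives separate Ne values for S1 families and DH lines). The function names are hypothetical.

```python
from statistics import NormalDist


def selection_intensity(n_selected: int, n_evaluated: int) -> float:
    """Standardised selection intensity i for truncation selection.

    Assumes a normally distributed selection criterion and a large
    number of evaluated families (infinite-population approximation).
    """
    p = n_selected / n_evaluated        # selected proportion
    z = NormalDist().inv_cdf(1.0 - p)   # truncation point on the normal scale
    return NormalDist().pdf(z) / p      # i = phi(z) / p


def heterozygosity_retained(ne: float, cycles: int) -> float:
    """Expected fraction of heterozygosity remaining after drift alone."""
    return (1.0 - 1.0 / (2.0 * ne)) ** cycles


if __name__ == "__main__":
    # Assumption for illustration only: Ne is set equal to the number selected.
    for k in (5, 10, 20, 50):
        i = selection_intensity(k, 1000)
        h = heterozygosity_retained(ne=k, cycles=10)
        print(f"select {k:>2} of 1000: intensity = {i:.2f}, "
              f"heterozygosity retained after 10 cycles = {h:.2f}")
```

Under these assumptions, selecting fewer families raises the selection intensity only modestly, while the expected loss of variation through drift rises sharply, which is consistent with the pattern reported for the simulations.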
The variability around the mean Ne increased as M (number of S1 families) or M′
(number of DH lines) increased, as a result of random fluctuations in gene frequency. This
variation was greater for those genes that were not under the influence of selection, which
indicates that the observed Ne can fluctuate dramatically as the selection
intensity decreases. For those genes under the influence of selection there is less scope for
undesirable loss of genes by chance. This was quantified in equation (8). As the product of
the effective population size and the selection coefficient (Nes) increases, the probability
of the favourable alleles being fixed in the population approaches one and the probability
of loss of the favourable alleles approaches zero.
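The behaviour described above can be checked numerically with a short sketch. It uses Kimura's diffusion approximation for the fixation probability of a favourable allele under one common parameterisation; this is an assumption made for illustration and may not match the exact form or scaling of equation (8) in this report. The function names and the example values of Ne, s and starting frequency are hypothetical.

```python
import math


def fixation_probability(ne: float, s: float, p0: float) -> float:
    """Approximate probability that a favourable allele is eventually fixed.

    Kimura's diffusion approximation:
        u(p0) = (1 - exp(-4*Ne*s*p0)) / (1 - exp(-4*Ne*s))
    With s = 0 the allele is neutral and u(p0) = p0.
    Intended for s >= 0 (favourable or neutral alleles).
    """
    if s == 0:
        return p0
    x = 4.0 * ne * s
    # expm1 keeps precision when x is small
    return math.expm1(-x * p0) / math.expm1(-x)


def loss_probability(ne: float, s: float, p0: float) -> float:
    """Probability that the favourable allele is eventually lost."""
    return 1.0 - fixation_probability(ne, s, p0)


if __name__ == "__main__":
    # Hypothetical values: starting frequency 0.5, selection coefficient 0.1
    for ne in (2, 5, 10, 20, 50):
        u = fixation_probability(ne, 0.1, 0.5)
        print(f"Ne = {ne:>2}, Ne*s = {ne * 0.1:.1f}: "
              f"P(fix) = {u:.3f}, P(loss) = {1.0 - u:.3f}")
```

As Nes grows, the printed fixation probability moves towards one and the loss probability towards zero, matching the qualitative statement in the text.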
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and
response to selection
The impact of linkage disequilibrium in the GEP was demonstrated in experiment 2 by
showing that low value genes could have either a low or a high frequency in the population
after five cycles of selection when there was only one generation of random mating between
cycles of selection. The need to consider linkage disequilibrium in this study was prompted
by the observation that some of the simulations that included G×E interaction effects in
the genetic model (experiments 4 and 5) produced a faster response to selection than the
additive model (Figure 15). This result arose because in the additive model all of the
genes contributing to the attribute had small and equal effects, i.e. there were no major
genes that could be selected for initially to increase the response to selection. With the
G×E interaction model, however, the genes had different effects in different environments.
As a consequence, there were major and minor genes within the target population of
environments. The major genes rapidly increased in frequency and were fixed quickly,
giving a rapid response to selection. The fate of the minor genes was a consequence of the
effects of selection and linkage disequilibrium.
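Why a single generation of random mating leaves much of the linkage disequilibrium in place can be seen from the standard two-locus result that, in the absence of selection, disequilibrium D decays by a factor of (1 - r) per generation of random mating, where r is the recombination fraction. The sketch below is illustrative only; the starting value of D and the recombination fractions are hypothetical and are not taken from the simulations in this report.

```python
def ld_after_random_mating(d0: float, r: float, generations: int = 1) -> float:
    """Linkage disequilibrium remaining after random mating without selection.

    Standard two-locus result: D_t = D_0 * (1 - r) ** t,
    where r is the recombination fraction (0 <= r <= 0.5).
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return d0 * (1.0 - r) ** generations


if __name__ == "__main__":
    d0 = 0.20  # hypothetical disequilibrium generated by one cycle of selection
    for r in (0.5, 0.2, 0.05):
        one = ld_after_random_mating(d0, r, 1)
        three = ld_after_random_mating(d0, r, 3)
        print(f"r = {r:.2f}: D after 1 generation = {one:.3f}, "
              f"after 3 generations = {three:.3f}")
```

Even for unlinked genes (r = 0.5) only half of the disequilibrium is removed by one generation of random mating, so a minor gene associated with a major gene can be carried up or down in frequency with it.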
When the effects of genes in the additive model were scaled to have the same relative
effects as in the G×E interaction model, it was possible to compare the effects of linkage
disequilibrium and selection for both the additive and G×E interaction models. By cycle
five, the favourable alleles of the genes in the G×E interaction model had reached a lower
frequency in the population than in the additive model. This was due to G×E interaction
adding a level of complexity into the selection procedure that
Narelle Kruger Honours Project

  • 1. Simulation Analysis of Doubled Haploids in a Wheat Breeding Program November 1999 N.L. Kruger Research Report #5
  • 2. Simulation Analysis of Doubled Haploids in a Wheat Breeding Program Narelle Lee Kruger This report was submitted as a requirement of the subject AG421, for a Bachelor of Agricultural Science (Plant Breeding) in the Faculty of Natural Resources, Agriculture and Veterinary Science in the School of Land and Food, The University of Queensland, November 1999.
  • 3. Abstract The Germplasm Enhancement Program (GEP) of the Australian Northern Wheat Improvement Program (NWIP) is presently based on an S1 recurrent selection strategy. The objective of the GEP is to ensure a continual supply of high yielding germplasm for the pedigree programs in the NWIP. The GEP works on a four year breeding cycle. Years 1 and 2 are used for intermating, selection for the traits maturity and height, and seed multiplication of the S1 families. Multi-environment Trials (METs) of the S1 families are conducted in years 3 and 4 and selection is based on grain yield and grain protein concentration data collected from these METs. There is interest in whether using doubled haploid (DH) lines in the MET evaluation phase of the recurrent selection program, in place of S1 families, could contribute to an increase in the rate of genetic improvement for grain yield. The objective of this project was to use computer simulation to investigate the applicability of a DH strategy in the GEP for a range of genotype-environment system models which are considered to be relevant to wheat improvement in the northern grains region of Australia. This study considered the influence of models including additive and genotype-by-environment (G×E) interaction effects. The computer simulation program QU-GENE, developed at The University of Queensland, was used in this study. The advantages of the computer simulation approach over alternative approaches based on either theoretical analysis or experimental evaluation were: (1) that more complex genetic models could be examined than were possible to examine using the theoretical approach, (2) larger experiments with many more factors could be examined than would be feasible in an experimental investigation, and (3) answers to researchable questions could be obtained in a more timely manner than would be possible using either a theoretical or experimental approach. 
Five simulation experiments were conducted to compare the response to selection or genetic gain when either S1 families or DH lines were used in the GEP. Experiment one involved simulating the effective population size (Ne) for DH lines and S1 families, and comparing
  • 4. these results to the theoretical predictions. The simulated results conformed well to the results predicted from theory. For the same number of selected individuals, the S1 families had a higher Ne than the DH lines and therefore the S1 family strategy was less likely to lose favourable genes by random drift. The effect of linkage disequilibrium was assessed in experiment two. Linkage disequilibrium was shown to have an influential role in the rates of increase in the frequency of favourable genes in the GEP. The third experiment compared the response to selection when identical numbers of S1 families and DH lines were evaluated, for an additive genetic model without G×E interactions. The results indicated that 250 DH lines had an advantage over 1000 S1 families in terms of rate of response to selection. In the fourth experiment, intensity of selection was manipulated by changing the number of families selected to proceed into the next cycle of selection. Increasing the intensity of selection by selecting fewer families increased the rate of response to selection in the short-term. However, selecting fewer families also decreased Ne and consequently selecting too few families resulted in the loss of favourable genes from the population due to the effects of random drift, resulting in a reduction in long-term response to selection. As a trade-off, selecting 20 families greatly reduced the chance of favourable genes being lost from the population due to drift, without slowing the response to selection significantly. Experiment five assessed the influence of introducing complexity into the additive model by incorporating genotype-by-environment (G×E) interactions. The advantage observed for the DH lines over S1 families for the additive model was retained, and was also present when a MET based on one year for the DH lines was conducted in comparison to two years for the S1 families. 
Computer simulation analyses of the expected short-term and long-term responses to selection for a range of additive genetic models suggest that the DH strategy is advantageous when it is feasible to generate 250 or more DH lines for evaluation in the MET phase of the GEP. This advantage was also observed when G×E interaction was present in the model. These outcomes suggest that the use of DH lines in place of S1 families in the GEP may be feasible. As the production of DH lines becomes less expensive and less labour intensive, more DH lines can be produced in a year, and greater gains from selection may therefore be achieved.
  • 5. Declaration of Originality This report describes the original work of the author, except where otherwise stated. It has not been submitted previously as part of degree requirements at any other University. Narelle Lee Kruger
  • 6. Publications relevant to this thesis Kruger NL, Podlich DW, Cooper M (1999) Comparison of S1 and doubled haploid recurrent selection strategies by computer simulation with applications for the Germplasm Enhancement Program of the Northern Wheat Improvement Program. In ‘Proceedings of the Ninth Assembly Wheat Breeding Society of Australia.’ (Eds P Williamson, P Banks, I Haak, J Thompson, A Campbell) pp.216-219. (Wheat Breeding Society of Australia: Toowoomba)
  • 7. Acknowledgements
I would like to thank the Grains Research and Development Corporation for the Undergraduate Honours Scholarship Award and their support of this research project. I would especially like to thank my project supervisor Dr Mark Cooper for his valuable time, expertise, guidance, encouragement and dedication throughout the course of the project. I would also like to thank Dean Podlich for his expertise, advice, time and support. Thank you also to Mr Ian Phillips for helping with the use of ASREML, and to my colleagues Dr Ian DeLacy, Ms Nicole Jensen, Mr Kevin Micallef and Mr Anura Ratnasiri for their support.
  • 8. Table of Contents
List of Tables
List of Figures
1. Introduction
2. Literature Review
2.1 Introduction
2.2 Germplasm Enhancement Program
2.3 Doubled Haploids
2.4 Genotype-by-environment Interaction
2.5 QU-GENE
2.6 Effective Population Size (Ne)
2.7 Linkage Disequilibrium
2.8 Study Focus
3. Materials and Methods
Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection
Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model
Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model
Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection
4. Results
Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection
  • 9. Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model
Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model
Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection
5. Discussion
Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines
Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection
Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model
Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model
Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection
Overall
6. Conclusion
7. The Future
8. References
9. Appendices
9.1 Appendix 1
9.2 Appendix 2
  • 10. List of Tables
Table 1: Selected intensity (%) and corresponding standardised selection differential (S, within the brackets) from Falconer and Mackay (1996), which change depending on the number of families in the MET and the number of families selected (selected proportion) from the MET.
Table 2: Input file for the QU-GENE engine. This file represents the G×E model 5 (Table 3). Genes 5-16 are removed from this presentation for conciseness. GN represents the gene number and E1 – E5 represent the five environment types within the target population of environments. The structure of this input file is explained in detail by Podlich and Cooper (1997; 1998).
Table 3: Each model describes the number of genes interacting with the five environment types and the level of G×E interaction present, as described by the ratio of the genotype-by-environment interaction variance to the genotypic variance ($\sigma^2_{GE}:\sigma^2_{G}$).
  • 11. List of Figures
Figure 1a: Components and pathways of germplasm transfer for yield improvement in the Australian Northern Wheat Improvement Program: LRC-QDPI represents the Queensland Department of Primary Industries pedigree breeding programs located in Toowoomba at the Leslie Research Centre; PBI-US represents the University of Sydney pedigree breeding programs located in Narrabri; GEP represents the University of Queensland Germplasm Enhancement Program (Cooper et al., 1999).
Figure 1b: Outline of the Germplasm Enhancement Program (GEP) 4 year cycle. Pictures show examples of the activities and field experiments undertaken at each stage of the cycle (Cooper et al., 1999).
Figure 2: Schematic outline of the QU-GENE simulation software. The central ellipse shows the engine and the surrounding boxes show the application modules. The GEPRSS module was used in this study (Podlich and Cooper, 1997, 1998).
Figure 3: Outline of the activities involved in the S1 and DH breeding strategies over one cycle of the GEP. The S1 activities are adapted from Fabrizius et al. (1996).
  • 12. Figure 4: S1 families effective population size (Ne) calculated theoretically (solid line) and the average of the simulation runs (broken lines) for a range of values for the number of S0 plants sampled (M) and the number of reserve seed used for intermating per S0 plant sampled (m).
Figure 5: Simulated S1 family effective population size (Ne) variation about the average of the simulation runs (solid line) for a range of S0 plants sampled (M) and the two extreme values of reserve seed used for intermating per S0 plant sampled (m) (for intermediate levels of m refer to Appendix 2).
Figure 6: Comparison of the DH simulated average (closed circles) and DH theoretical (solid line) effective population size for a range of S0 plants sampled (M′) and when only one DH plant was produced per S0 plant sampled (m′ = 1).
Figure 7: Average of the simulated DH effective population size (Ne) for a range of S0 plants sampled (M′) and DH plants produced per S0 plant sampled (m′). A regression line is fitted to each m′.
Figure 8: Simulated S1 family effective population size (Ne) variation about the average of the simulation runs (solid line) for a range of S0 plants sampled (M′) and two extreme numbers of DH plants produced per S0 plant sampled (m′) (for intermediate levels of m′ refer to Appendix
  • 13. Figure 9: The influence of linkage disequilibrium in the GEP with S1 families, 20 genes under both an additive model and G×E interaction ($\sigma^2_{GE}:\sigma^2_{G}$ = 2.89) model after five cycles of selection. (a) frequency of each gene in the model plotted for one generation of random mating per cycle, (b) gene frequency and value for each of the 20 genes for one generation of random mating per cycle, (c) frequency of each gene in the model plotted for 10 generations of random mating per cycle, and (d) gene frequency and value for each of the 20 genes for 10 generations of random mating per cycle.
Figure 10: Comparison of the response to selection for S1 families and DH lines with heritability 0.05, 20 genes and four family sizes over 10 cycles of selection.
Figure 11: Comparison of the response to selection for S1 families and DH lines with heritability 0.05, 500 families and gene numbers over 10 cycles of selection.
Figure 12: Comparison of the response to selection for S1 families and DH lines with 20 genes, two heritability levels and two family sizes over 10 cycles of selection.
Figure 13: Comparison of the response to selection for 1000 S1 families to 100, 250, 500 and 1000 DH lines, for two heritability levels and two gene numbers over 10 cycles of selection.
Figure 14: Comparison of the response to selection for S1 families (a,c,e) and DH lines (b,d,f) with a heritability of 0.95, 20 genes and three levels of families selected over 10 cycles of selection.
  • 14. Figure 15: Comparison of response to selection of the G×E interaction ($\sigma^2_{GE}:\sigma^2_{G}$ = 1.1: Table 3) model and the additive model with constant heritability (0.95), genes (20) and two levels of families selected (S) for four different family sizes over 10 cycles of selection.
Figure 16: Comparison of response to selection of 1000 S1 families to two sizes of DH lines, 20 genes, two heritability levels, two levels of G×E interaction and both S1 families and DH lines having two years of METs.
Figure 17: Comparison of response to selection of 1000 S1 families to two sizes of DH lines, 20 genes, two heritability levels, two levels of G×E interaction and both S1 families and DH lines having two years of METs.
  • 15. 1 1. Introduction The Germplasm Enhancement Program (GEP) is an S1 recurrent selection program. It operates as a parent building component of the Northern Wheat Improvement Program (NWIP) of Australia (Fabrizius et al., 1996). Recurrent selection programs are conducted to achieve medium and long-term genetic improvement by increasing the frequency of favourable genes and gene combinations. This is achieved by a combination of long-term breeding strategies aimed at improvement of the genetic resource base, and short-term breeding strategies aimed at exploiting the potential of the genetic resource base available at a particular point in time. Hallauer (1981) argued that recurrent selection is the most efficient breeding strategy for long-term genetic improvement and pedigree breeding is the most efficient for short-term exploitation of genetic resources for the purpose of cultivar development. This study focused on the issues relevant to the rate of genetic improvement and long-term genetic improvement of a population, with applications to the GEP of the NWIP. The response to selection can be evaluated in terms of the improvement in the mean of a population subjected to selection and the rate of genetic improvement can be evaluated by investigating response to selection over a series of cycles of selection. Optimising the allocation of resources to activities within the GEP to achieve its role in the NWIP is a complex problem. There is interest in whether a strategy using doubled haploid (DH) lines (i.e. plants developed by a process where the haploid genome has been doubled) can contribute to an increase in the rate of genetic improvement relative to that achieved by the current S1 strategy. 
Some advantages of using DH lines in the GEP are: (1) the plants are completely homozygous in one generation, whereas the S1 families are still segregating, (2) the variation among DH lines is not influenced by dominance and they have twice as much additive genetic variation partitioned among lines relative to S1 families, and (3) selection of superior genotypes should be easier and more efficient. Some disadvantages of using DH lines in the GEP are: (1) their production is more difficult and costly relative to S1 families, and (2) with the current DH technology based on the wheat × maize crossing system, they would add an extra year to the GEP cycle.
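The claim that DH lines partition twice as much additive genetic variation among lines as S1 families can be checked with a short calculation. The sketch below is illustrative only and is not part of QU-GENE: it assumes a single purely additive locus scored as the number of favourable alleles (0, 1 or 2), a random-mating S0 population in Hardy-Weinberg proportions, and no selection during line development. The function name `among_line_variance` is hypothetical.

```python
from fractions import Fraction


def among_line_variance(p):
    """Per-locus variance among S1 family means and among DH lines.

    Assumptions (illustrative, not taken from the report): one additive
    locus scored 0/1/2 favourable alleles, a Hardy-Weinberg S0 population
    with favourable-allele frequency p, and no selection.
    """
    p = Fraction(p)
    q = 1 - p
    # S0 genotype frequencies paired with the mean of the S1 family each
    # S0 plant produces. Selfing a heterozygote gives 1/4 : 1/2 : 1/4,
    # so its family mean stays at 1.
    s1_families = [(q * q, Fraction(0)), (2 * p * q, Fraction(1)), (p * p, Fraction(2))]
    # A DH line comes from one doubled gamete, so it is homozygous (0 or 2).
    dh_lines = [(q, Fraction(0)), (p, Fraction(2))]

    def variance(distribution):
        mean = sum(freq * value for freq, value in distribution)
        return sum(freq * (value - mean) ** 2 for freq, value in distribution)

    return variance(s1_families), variance(dh_lines)


var_s1, var_dh = among_line_variance(Fraction(1, 2))
print("S1 family means:", var_s1, "DH lines:", var_dh, "ratio:", var_dh / var_s1)
```

Under these assumptions the variance is 2pq among S1 family means and 4pq among DH lines, so the ratio is 2 at any allele frequency. Dominance, linkage and selection, which the simulations in this report address, are deliberately left out.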
  • 16. 2 Breeding programs take many years to conduct and experimental comparisons of DH and S1 family selection would be costly and time consuming. Further, it is questionable whether experiments with sufficient power could be conducted to detect significant differences between the breeding strategies within three cycles of the breeding program (i.e. 12-15 years). Simulation allows a rapid and low cost assessment of the potential value of using doubled haploids in the GEP. The aim of this study was to use computer simulation methodology to compare the expected selection response for S1 and DH recurrent selection strategies for a range of genetic models where the variables heritability, number of cycles, number of families evaluated in METs, number of families selected and number of genes contributing to the trait were manipulated. The genetic models investigated in this study consider the influences of additive and additive-by-environment (G×E) interaction effects. Response to selection in a recurrent selection program is a balance between directed changes in gene frequency due to the effects of artificial selection imposed by the breeder and random changes in gene frequency due to the effects of random drift. It has been emphasised by Hospital and Chevalet (1996) that the joint effects of selection, linkage and drift must be considered in any evaluation of selection response. Both changes in gene frequency due to the effects of selection and drift are influenced by the number of lines selected and the resulting selection intensity applied by the breeder. Therefore, in this study factors that influenced response to selection (i.e. heritability, selection intensity and number of families/lines evaluated) and random drift (i.e. effective population size) were examined for their influence on selection response within the GEP. The effective population size (Ne) of DH populations is not well documented in the literature. 
Simulations were run to determine the factors that influenced Ne of DH lines in the GEP and to determine whether the simulated variation for Ne could be explained using the theoretical prediction equations derived from the work of Comstock (1996). Linkage disequilibrium is an important consideration in the GEP as the base population was generated from a small number of parents. Linkage disequilibrium can cause the genetic variability of a population to either be inflated or depressed, depending on the linkage phase relationships (coupling or repulsion) among the loci influencing the traits to be manipulated
  • 17. by selection. This in turn affects the heritability of the trait being selected for and therefore the population’s response to selection. The computer simulation platform QU-GENE, developed at The University of Queensland (Podlich and Cooper, 1998), was used to conduct the simulations. One of the application modules available within QU-GENE is GEPRSS (Germplasm Enhancement Program Recurrent Selection Strategy; Podlich and Cooper, 1997). The GEPRSS module was modified, and an option added so that the user could choose between the current S1 strategy and the DH strategy. This gives the breeder the choice of whether or not to incorporate doubled haploids into the program in place of the S1 families.
The remainder of this report is structured into six sections:
(1) Literature review (section 2): reviews the relevant literature and presents the background to the project.
(2) Materials and methods (section 3): outlines the components of the different genetic models used in the computer simulations. The experiments undertaken were (i) comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines, (ii) determining the effects of linkage disequilibrium on gene frequency and response to selection, (iii) evaluating the response to selection of S1 families and DH lines for an additive genetic model, (iv) evaluating the impact of selection proportion on response to selection for an additive genetic model, and (v) evaluating the influence of G×E interaction genetic models on response to selection.
(3) Results (section 4): presents the analysis of the results of the simulation experiments and graphical representations of the important results of the computer simulations.
(4) Discussion (section 5): interprets the results of the simulation experiments and what they mean for the GEP.
(5) Conclusion (section 6): draws together the important points that resulted from conducting the simulation study.
(6) The future (section 7): outlines additional investigations to be run in the future.
  • 18. 2. Literature review
2.1 Introduction
One of the main objectives of a breeding program is to produce a range of cultivars superior to those that already exist. This objective can be quantified by monitoring the response to selection of a breeding population over cycles of selection. Any change in the mean genetic value of a population due to the influence of selective forces is termed the response to selection (R) or genetic gain. It can be quantified as the difference in mean phenotypic value between the offspring of the selected parents and the whole of the parental generation before selection (Falconer and Mackay, 1996). Response to selection can be predicted from knowledge of the heritability of the traits subjected to selection and the selection pressure applied, using the following formula:

$$R = h^2 S = i\,h^2 \sigma_p \qquad (1)$$

where $h^2$ is heritability, which can be obtained from genetic experiments conducted for generations prior to selection, $S$ is the selection differential, $i$ is the standardised selection differential and $\sigma_p$ is the standard deviation of the phenotypic values of the individuals (selection units). The selection differential is the mean phenotypic value of the individuals selected as parents, expressed as a deviation from the population mean. The selection differential is not known until the parents are selected. However, the expected value of the standardised selection differential can be predicted by assuming that the distribution of phenotypic values of the individuals to be subjected to selection is normal. The prediction of response is, however, only valid for one generation of selection, as the response depends on the heritability of the character in the generation from which the parents are selected. The heritability of the character is expected to change between generations of selection for two reasons. 
First, any response to selection will change the gene frequencies on which the heritability depends, and secondly, the selection of parents reduces the variance and the heritability (Falconer and Mackay, 1996).
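As a worked illustration of equation (1), the sketch below computes the expected standardised selection differential under truncation selection on a normally distributed trait, and from it the predicted one-generation response. It is a minimal sketch, not the QU-GENE implementation: the function names are hypothetical, the large-population approximation i = z/p is assumed (z being the normal density at the truncation point and p the selected proportion), and the heritability and selected-proportion values in the usage lines are illustrative.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Expected standardised selection differential i for truncation selection.

    Assumes normally distributed phenotypes and a large population, so that
    i = z / p, where z is the standard normal density at the truncation
    point and p is the proportion selected.
    """
    std_normal = NormalDist()
    truncation_point = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(truncation_point) / proportion_selected


def expected_response(h2, proportion_selected, sigma_p):
    """Equation (1): R = i * h^2 * sigma_p, valid for one generation only."""
    return selection_intensity(proportion_selected) * h2 * sigma_p


# Illustrative values: selecting 20 of 1000 families at low and high heritability.
for h2 in (0.05, 0.95):
    response = expected_response(h2, proportion_selected=20 / 1000, sigma_p=1.0)
    print(f"h2 = {h2}: expected response = {response:.3f} phenotypic SD")
```

Because the approximation ignores finite sample size, tabulated values for small numbers of families will be slightly lower than those computed here.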
  • 19. The response to selection equation is therefore important to plant breeders as it quantifies the genetic gain achievable in any cycle of selection. The basic principle of any plant breeding program is the continuous improvement of the target species. It is the plant breeder’s role to control the intensity and speed of this improvement by changing the genetic structure of a population (Williams, 1964). By understanding the underlying concepts of the components of the response to selection equation, and the effect that manipulating them has on the response to selection, breeders can increase the genetic gain expected from a cycle of selection. Equation (1) is a basic prediction equation which applies to the mass selection of individuals in a random mating base population. Most breeding programs however apply different forms of selection to a population. The general equation can be extended to accommodate the features of the different selection methods. Of relevance to this study is the prediction equation for S1 recurrent selection in the GEP, which is

$$G_c = \frac{k\,c\,\sigma^2_{A'}}{\sqrt{\dfrac{\sigma^2_e}{rt} + \dfrac{\sigma^2_{A'E} + \tfrac{1}{4}\sigma^2_{DE}}{t} + \sigma^2_{A'} + \tfrac{1}{4}\sigma^2_{D}}} \qquad (2)$$

where $G_c$ is the expected gain per cycle, $k$ is the standardised selection differential applied to S1 families, $c$ is the parental control, $\sigma^2_{A'}$ is the additive genetic variance plus a component that is mainly a function of degree of dominance, $\sigma^2_e$ is the environmental (error) component of variance, $r$ is the number of replications per environment, $t$ is the number of environments, $\sigma^2_{A'E}$ and $\sigma^2_{DE}$ are the additive-by-environment and dominance-by-environment interaction variances, and $\sigma^2_{D}$ is the dominance genetic variance (Fehr, 1987). The parental control factor ($c$) is 1 for selfed families, as used in the GEP. By changing different variables in the QU-GENE GEPRSS module, the components of the prediction equation can be manipulated, which allows simulation investigations of different selection strategies for the GEP.
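The prediction equation for S1 recurrent selection can be evaluated numerically to see how replication and the number of environments affect the expected gain. The sketch below is a hedged illustration rather than the GEPRSS code: the function name and argument names are hypothetical, the variance values in the usage lines are invented, and the denominator is assumed to be the phenotypic standard deviation of S1 family means, i.e. the square root of the summed variance terms in equation (2).

```python
import math


def s1_gain_per_cycle(k, var_a, var_e, r, t, var_ae=0.0, var_de=0.0, var_d=0.0, c=1.0):
    """Expected gain per cycle for S1 family selection, following equation (2).

    k      standardised selection differential
    var_a  additive variance (sigma^2_A' in the text)
    var_e  error variance; r replications in each of t environments
    var_ae, var_de  additive- and dominance-by-environment variances
    var_d  dominance variance; c parental control (1 for selfed families)
    Assumes the denominator is the phenotypic SD of S1 family means.
    """
    phenotypic_variance = (
        var_e / (r * t)
        + (var_ae + 0.25 * var_de) / t
        + var_a
        + 0.25 * var_d
    )
    return k * c * var_a / math.sqrt(phenotypic_variance)


# Illustrative comparison: 5 environments (one year) versus 10 (two years).
for environments in (5, 10):
    gain = s1_gain_per_cycle(k=2.42, var_a=1.0, var_e=4.0, r=2, t=environments, var_ae=1.0)
    print(f"t = {environments}: expected gain per cycle = {gain:.3f}")
```

Increasing t shrinks both the error and the interaction terms in the denominator, which is why additional testing environments raise the expected gain for a fixed selection differential.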
  • 20. If no artificial or natural selection pressures are placed on a population, there is no expected response to selection; therefore, when the effects of mutation and migration are neglected, any changes in gene frequency and population mean will be a result of random drift. A rapid response to selection is necessary in the short-term so that breeders can find a cultivar that is better than what is currently being used. Long-term selection, from the perspective of a breeding program, is more concerned with maintaining the genetic diversity within a population for periods of at least 40 years, while maintaining genetic advance from selection. However, the maintenance of genetic variation must be balanced against reductions in genetic variation due to the positive effects of selection. If the response to selection plateaus, it is possible that all of the genetic diversity has been lost from the population. If this occurs early in a breeding program, new germplasm may need to be introduced into the program. Alternatively, countering forces from natural selection or mutation may be balancing the effects of the artificial selection. If such countering forces are present, an alternative breeding strategy may have to be considered. This may also involve introducing new sources of genetic variation and/or increasing the artificial selection pressures applied. The overall aim of a breeding program is to maintain its long-term response to selection while also achieving new cultivar development through short-term response to selection. The goal of recurrent selection is to maintain the variability of a population for one or more quantitative characters, with minimal reduction of genetic diversity in the long-term, to allow for continued genetic gain (Hallauer, 1981; Strahwald and Geiger, 1988; Carver and Bruns, 1993; De Koyer et al., 1999). 
Recurrent selection maintains heterozygosity of loci and promotes crossing over within gene blocks, which has the potential to release large amounts of genetic variance and contribute positively to maximising genetic gain. Recurrent selection is most commonly associated with breeding of allogamous (cross-pollinating) species (e.g. maize; Hallauer and Miranda, 1988). A review of genetic gains for grain yield and quality in autogamous (self-pollinating) species (Carver and Bruns, 1993) indicates that recurrent selection has been as effective as, if not more effective than, traditional breeding methods such as the pedigree strategy.
  • 21. By testing homozygous lines (DH) rather than heterozygous families (S1), selection efficiency can be increased in a recurrent selection breeding scheme (Griffing, 1975; Baenziger et al., 1984). Knowledge of expected genetic gain by selection has proved to be useful for choosing the most efficient selection method. Before starting a long-term recurrent selection program the breeder needs to know whether the progress is likely to be high enough throughout the recurrent cycles of selection (Charmet et al., 1993). Findings of Carver and Bruns (1993) and De Koyer et al. (1999) indicate that the genetic gain is often highest in the first cycle of selection, when genetic variance ($\sigma^2_{G}$) is usually greatest. Selection and genetic drift will ultimately cause a decrease in $\sigma^2_{G}$ in later cycles of selection, when a larger proportion of the favourable alleles of genes are fixed or lost. This results in a decrease in genetic diversity, which may or may not be sufficient to affect the long-term response to selection. To improve selection efficiency, a breeder wants to be able to select a phenotype that accurately reflects the true-breeding genotype. This is where the use of DH lines in breeding programs can greatly enhance selection efficiency. In general, the probability of selecting a particular phenotype in a conventional F2 population is (¼)^n for recessive and (¾)^n for dominant genes, where n is the number of loci segregating. This compares to (½)^n for both recessive and dominant genes in a DH population. For example, in phenotypic selection for three recessive genes, (½)³ = 1/8 of the DH plants would be selected and expected to breed true in the DH population, compared to (¼)³ = 1/64 of the F2 population. 
For dominant genes, 1/8 of the DH population would breed true for the desired trait, whereas 27/64 of the F2 population would have to be selected to ensure the inclusion of the desired 1/64 true-breeding lines (Baenziger et al., 1984).
2.2 Germplasm Enhancement Program
The main breeding objective of the Germplasm Enhancement Program (GEP) is to combine high yielding germplasm, from selected sources around the world, with high quality Australian wheats, and maintain a long-term population improvement strategy that will
  • 22. provide a source of high yielding and high quality wheat germplasm to the pedigree breeding programs of the northern grains region of Australia, run by the Leslie Research Centre (LRC) at Toowoomba and the Plant Breeding Institute of the University of Sydney (PBI-US) at Narrabri (Figure 1a) (Cooper et al., 1999). The current strategy used in the GEP is a modified S1 recurrent selection strategy. It works on a four-year cycle within the general recurrent selection framework (Figure 1b). Years 1 and 2 are used for intermating, selection for the traits maturity and height, and seed multiplication of the S1 families. Multi-environment Trials (METs) of the S1 families are conducted in years 3 and 4, and selection is based on grain yield and grain protein concentration. It is expected that this improvement strategy can provide a gradual increase in the frequency of favourable alleles and thus increase the mean of a population for the selected traits (Fabrizius et al., 1996).
Figure 1a: Components and pathways of germplasm transfer for yield improvement in the Australian Northern Wheat Improvement Program: LRC-QDPI represents the Queensland Department of Primary Industries pedigree breeding programs located in Toowoomba at the Leslie Research Centre; PBI-US represents the University of Sydney pedigree breeding programs located in Narrabri; GEP represents the University of Queensland Germplasm Enhancement Program (Cooper et al., 1999).
[Figure 1a labels: Overseas Germplasm; Research Programs; Germplasm Enhancement Program (GEP-UQ); Parents; LRC-QDPI Toowoomba; PBI-US Narrabri; Cultivar]
  • 23. Figure 1b: Outline of the Germplasm Enhancement Program (GEP) 4 year cycle. Pictures show examples of the activities and field experiments undertaken at each stage of the cycle (Cooper et al., 1999).
[Figure 1b labels, in order over years 1 to 4: random intermating; 10,000 S0 plants, select 2,000 (height, maturity); 2,000 S1 families, select 1,000 (height, maturity, rust); MET (5 sites), 1,000 S1 families; MET (5 sites), 1,000 S1 families; select 20-30 (yield & protein). Other labels: LRC-QDPI; PBI-US; CIMMYT]
  • 24. Genotype-by-environment (G×E) interactions have a large impact on response to selection for grain yield of wheat in the Australian target production environments, particularly G×S×Y (genotype-by-site-by-year) interactions (Basford and Cooper, 1998). The focus on G×E interactions arises because the interactions introduce uncertainty into the process of selection among genotypes, especially when selection is based on their phenotypic performance in a relatively small sample of environments taken from the target population of environments (Cooper and DeLacy, 1994), as occurs in the case of the GEP. The GEP therefore utilises two years of METs to accommodate the G×S×Y interactions that are encountered, in order to improve S1 family mean heritability and thus make selection more efficient. The traditional S1 selection strategy works on a three year cycle, using only one year of METs. The first two years involve similar steps to those conducted for the GEP. The traditional S1 selection strategy has been applied in maize breeding for target populations of environments where G×S×Y interactions are not large enough to warrant two years of METs (Hallauer and Miranda, 1988). However, for yield of wheat in the northern grains region the large G×S×Y interactions require at least two years of METs (Brennan et al., 1981; Cooper et al., 1996). Hence the modification of the S1 recurrent selection strategy for the GEP involves an additional year of multi-environment testing of the S1 families. Following theoretical considerations relevant to the partitioning of additive genetic variation for quantitative traits and the contributions of this variation to selection response, it has been argued that the inclusion of doubled haploids in the GEP strategy in place of S1 families can increase the rate of genetic improvement achieved by the GEP. 
Therefore, this project was developed to evaluate whether the use of DH lines in place of S1 families increases the response to selection in the program. Limitations on the availability of resources, particularly labour costs and time, will influence the feasibility of producing DH lines for use in the GEP. At present approximately 300 DH lines could be produced by one dedicated person with the available resources for use in the GEP. As the ability to produce more DH lines improves (e.g. through increasing skilled labour availability, decreasing time and cost to produce the DH lines) the relative merits of DH line selection increase (Strahwald and Geiger, 1988).
  • 25. 11 2.3 Doubled Haploids
Doubled haploids are plants for which the haploid genome has been doubled. There are a variety of methods available for producing DH lines in wheat:
(1) anther culture (Ouyang et al., 1973; Henry and de Buyser, 1981), and
(2) chromosome elimination
    a) wheat × maize method (Laurie and Bennett, 1986, 1988)
    b) Hordeum bulbosum method (Barclay, 1975; Sitch and Snape, 1986)
Jensen and Kammholz (1998) modified the wheat × maize method, and this modified method is the DH line production method currently used at the Leslie Research Centre (LRC). Selected wheat plants are emasculated and crossed as females with maize pollen to produce a haploid wheat embryo. The haploid embryo is progressed to a haploid plant in tissue culture. The young haploid plant from this embryo undergoes a colchicine treatment that causes the chromosomes to double, resulting in a doubled haploid plant (a diploid in general; because wheat is an allohexaploid, the doubled haploid is an amphidiploid).
Plant breeders have long been interested in the use of DH lines in breeding programs, as they offer several advantages:
(1) DH lines exhibit twice as much additive genetic variation among lines as the S1 families used in an S1 recurrent selection program. DH lines do not express dominance variation or segregation within lines, resulting in easier and more efficient selection (Griffing, 1975; Baenziger et al., 1984; Wricke et al., 1986; Snape, 1989), and
(2) the selection efficiency among completely inbred DH lines is increased as homozygosity is reached in one generation (equivalent to F∞ selfing generations), instead of being close to homozygosity after 5-6 generations of self-pollination (Baenziger et al., 1984; Wricke and Weber, 1986; Witherspoon and Wernsman, 1989).
The adoption of this new technology, however, has been slow due to several disadvantages:
(1) production of DH lines is difficult and quite costly, and
  • 26. 12 (2) if they were to be used in the GEP, and to keep the GEP as a four year cycle, the first year of multi-environment trials would be lost due to the extra time taken to produce the DH lines relative to the S1 families, therefore the DH lines would only undergo one year of METs. Due to DH being completely homozygous in one generation, they allow for only one crossover opportunity, which is desirable if a line contains a superior combination of genes, as those genes will be fixed permanently. However with the S1 families, recombination can occur in every generation, therefore both increasing the chances of finding a good combination of genes, but also allowing a chance to lose that combination of genes. The effects of recombination lessens with each progressive generation of selfing (Baenziger et al., 1984; Knox et al., 1998). 2.4 Genotype-by-environment Interaction Genotype-by-environment (G×E) interactions can result in changes in rank among genotypes in different environmental conditions (Haldane, 1947; Comstock and Moll, 1963). When cultivars are compared in different environments, their performance relative to each other may not be the same. One cultivar may have the highest yield in some environments and a second cultivar may excel in others. G×E interaction is a major problem in the study of quantitative traits as it complicates the interpretation of genetic experiments and undermines the repeatability of experimental results, which consequently makes predictions difficult and reduces the efficiency of selection (Kearsey and Pooni, 1996). To emphasise the different influences of G×E interaction on the efficiency of selection they are sometimes categorised into interactions due to: (1) heterogeneity of genetic variance among environments (Robertson, 1959), i.e. the ranking of the genotypes does not differ between environments, only the magnitude of the difference between the genotypes in each environment changes,
  • 27. 13 therefore the same genotypes are selected regardless of environment and prediction of response to selection is not complicated by changes in rank of genotypes, or
(2) lack of genetic correlation among environments (Robertson, 1959), i.e. this source of interaction can result in cross-over interactions, where reranking of the genotypes occurs and a genotype that performs well in one environment does not perform well in other environments. This form of G×E interaction complicates the selection decisions in a breeding program.
The analysis of variance (ANOVA) has been used to partition total phenotypic variation into components due to genotype, G×E interaction and error (Brennan and Byth, 1979; DeLacy et al., 1990). The relative sizes of the variance components are frequently used to quantify the magnitude of G×E interactions. The influence of G×E interaction in a breeding program is a problem when the ratio of the G×E interaction variance to the genotypic variance ($\sigma^2_{GE}:\sigma^2_{G}$) is high (Cooper and DeLacy, 1994). Genotype-by-environment interactions for the grain yield of wheat are large in the northern grains region and these commonly change the rank of genotypes. These interactions have a major influence on selection decisions, and therefore on response to selection in the GEP. With the GEP being a recurrent selection program, continual long-term improvement can only occur if the breeders are able to efficiently select the superior genotypes for the target environmental conditions.
2.5 QU-GENE
QU-GENE (QUantitative-GENEtics) is a computer simulation platform developed for the quantitative analysis of genetic models. The QU-GENE software platform was developed with a modular structure (Figure 2) and consists of two major component levels:
(1) the genotype-environment system engine (QUGENE), which is used to define the genetic models to be examined, and
  • 28. 14 (2) the application modules that examine properties of the genotype-environment system by investigating, analysing or manipulating a population of genotypes for a target population of environments created from the QUGENE engine (Podlich and Cooper, 1997, 1998). For the purposes of this study the GEPRSS application module is used in combination with the engine. Figure 2: Schematic outline of the QU-GENE simulation software. The central ellipse shows the engine and the surrounding boxes show the application modules. The GEPRSS module was used in this study (Podlich and Cooper, 1997, 1998). QU-GENE enables investigation of the impact of resource allocation decisions within the breeding program, e.g. variables population size and selection decisions influence how the resources will be allocated (Fabrizius et al., 1996). QU-GENE has also been used to model breeding programs in previous simulation experiments (Podlich et al., 1999; Podlich and Cooper, 1999). Strahwald and Geiger (1988) have previously published work involving the use of computer simulation to study the efficiency of DH in a barley recurrent selection program.
  • 29. 15 2.6 Effective Population Size (Ne) The effective population size (Ne) of a breeding strategy is an important part of any recurrent selection breeding program. It needs to be determined to quantify the potential influence of random drift, and balanced with intense selection so that the maximum response to selection from the available resources can be realised. If the Ne is too small then favourable genes may be lost from the population through random genetic drift. When drift occurs the response to selection can never reach its full potential. Therefore, it is important to understand the effects of drift, which can be determined and quantified in terms of the effective population size. Quantifying the effect of random drift in a population requires knowledge of the variability in changes of gene frequency between repeated runs of the same breeding strategy. This is defined theoretically in terms of the idealised population (Falconer and Mackay, 1996). However, populations do not always conform to that of an idealised population (random mating, monoecious population in which there is no selection, there are N individuals that reach reproductive age and function as parents, and only one offspring is produced per mating). One way to deal with deviations from the idealised breeding structure is to express the situation of a breeding program in terms of the effective number of breeding individuals, or the effective population size (Ne). This is the number of individuals that would give rise to the observed sampling variance for gene frequencies, or rate of inbreeding, if they bred in the manner of the idealised population (Comstock, 1996; Falconer and Mackay, 1996). The effective population size is therefore a relative measure of the number of parents used to form a breeding population. 
It does not represent the number of individuals from a population that are tested in a recurrent selection program and is dependent on the level of inbreeding of the parents that are mated and the number of gametes contributed to the next generation (Hallauer and Miranda, 1988). Genetic drift is a consequence of sampling in a finite population of small Ne. This is a disadvantage to any breeding program as genetic diversity needs to be maintained within a breeding population. Small values of Ne can result from applying intense selection pressure.
  • 30. 16 Following the theoretical development by Comstock (1996) the standard procedure for calculating Ne is quantified by the equation

$$N_e = \frac{M}{\left(1 + f_{n-1}\right) + \left(\tfrac{1}{2}\right)^{n} \dfrac{1}{m}}, \qquad (3)$$

where, considered in terms of the GEP, $M$ is the number of S0 plants sampled, $f_{n-1}$ is the coefficient of inbreeding after $n-1$ generations of inbreeding, $n$ is the number of successive generations and $m$ is the number of reserve seed used for intermating per S0 plant sampled. In the above equation, whenever $n$ and/or $m$ are large enough to make $\left(\tfrac{1}{2}\right)^{n}/m$ small relative to $\left(1 + f_{n-1}\right)$ then:

$$N_e \approx \frac{M}{1 + f_{n-1}}. \qquad (4)$$

The theoretical Ne for the S1 recurrent selection strategy used in the GEP is derived from equation (3) by noting that $f_{n-1} = 0$ and $n = 1$. These values are substituted into equation (3) and following rearrangement this becomes

$$N_e = \frac{M}{\left(\dfrac{2m + 1}{2m}\right)}. \qquad (5)$$

This equation is a special case for S1 families, derived from the standard procedure for calculating Ne when selecting among families produced by self-fertilisation. When calculating the Ne of doubled haploids, equation (3) is also used. DH lines are completely inbred in one generation, therefore $f_{n-1} = 1$. If we assume that this is equivalent to $n$ being large then following equation (4) this results in the theoretical Ne of DH being

$$N_e = \frac{M'}{2}, \qquad (6)$$
  • 31. 17 where $M'$ is the number of S0 plants sampled in the case of the GEP. This result is expected because only one parent contributes gametes to the next generation, rather than two parents as occurs with cross- and self-pollination (depending on the level of inbreeding in the base population), where two sets contribute. Therefore to determine the Ne for DH the number of S0 plants sampled ($M'$) is divided by 2 (Equation 6).
Of practical importance to plant breeders is the amount by which the probabilities of fixation of favourable alleles are increased by selection. Kimura (1957) considered the case where a gene has two alleles, the relative fitnesses of the single-locus genotypes are constant through time, there is any level of dominance except overdominance, and the effective population size is constant through time. He derived the following equation, which has been shown to be a close approximation for the probability of the fixation of a favourable allele:

$$P(\text{fixation}) = \frac{\displaystyle\int_{0}^{p_0} e^{-2 N_e s x \left[2h + x(1 - 2h)\right]}\,dx}{\displaystyle\int_{0}^{1} e^{-2 N_e s x \left[2h + x(1 - 2h)\right]}\,dx}, \qquad (7)$$

where $p_0$ is the initial frequency of the favourable allele, $N_e$ is the effective population size as defined above, $s$ is the selection coefficient (a selection proportion), $x$ is the continuous variable being measured (gene frequency) ranging from 0 to 1, and $h$ is the level of dominance coefficient. When there is no selection ($s = 0$) equation (7) reduces to $P(\text{fixation}) = p_0$. It must also be noted that $P(\text{fixation})$ is a function of the product $N_e s$, and not of $N_e$ and $s$ as separate values. It is much easier to derive a numerical evaluation of $P(\text{fixation})$ when $h = \tfrac{1}{2}$, as this corresponds to an additive model. When $h = \tfrac{1}{2}$, and $x$ is defined in relation to the initial gene frequency ($p_0$) rather than as a continuous function, equation (7) reduces to

$$P(\text{fixation}) = \frac{1 - e^{-2 N_e s p_0}}{1 - e^{-2 N_e s}}. \qquad (8)$$
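Equations (7) and (8) can be checked numerically. The sketch below is not part of QU-GENE; it assumes the reading of equation (7) in which the exponent is -2·Ne·s·x·[2h + x(1 - 2h)], and the function names and the value s = 0.1 are illustrative choices only. It evaluates equation (8) directly and equation (7) by midpoint-rule integration, so that the two can be compared at h = 1/2.

```python
import math


def p_fix_additive(ne, s, p0):
    """Equation (8): fixation probability when h = 1/2 (additive model)."""
    if s == 0:
        return p0  # neutral limit: P(fixation) equals the starting frequency
    return (1 - math.exp(-2 * ne * s * p0)) / (1 - math.exp(-2 * ne * s))


def p_fix_general(ne, s, p0, h, steps=20000):
    """Equation (7), integrated numerically with the midpoint rule."""

    def integrand(x):
        return math.exp(-2 * ne * s * x * (2 * h + x * (1 - 2 * h)))

    def integral(upper):
        dx = upper / steps
        return sum(integrand((k + 0.5) * dx) for k in range(steps)) * dx

    return integral(p0) / integral(1.0)


if __name__ == "__main__":
    # p0 = 0.2 matches the starting gene frequency used in several of the
    # experiments; s = 0.1 is an arbitrary illustrative value.
    for ne in (5, 10, 50, 100):
        print(ne, round(p_fix_additive(ne, 0.1, 0.2), 3))
```

Because only the product Ne·s enters the exponent, doubling Ne while halving s leaves the result unchanged, which is the dependence on Nes noted in the text.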
  • 32. 18 Equations (7) and (8) are important to breeders as they provide a quantifiable basis for determining the influence that the population sizes and selection coefficient values used in breeding programs have on the probability of fixing favourable genes. From equation (8) it can be seen that with an additive model there is a relationship between Ne and the selection coefficient that determines the chances of fixing or losing favourable alleles (Comstock, 1996).
2.7 Linkage Disequilibrium
Linkage disequilibrium is an important factor in the GEP as the starting population is obtained from the random intermating of 10 initial parents (Fabrizius et al., 1996), a relatively small number. In one generation of random mating recombination occurs; however, there is a chance that either undesirable or desirable genes may be linked to desirable genes. When genes are linked, selection for the desirable gene will also result in indirect selection of the linked gene, increasing its frequency in the population. This form of indirect selection is unwanted if an undesirable gene is linked to a desirable gene being selected for, as it will decrease the potential response to selection of the population. Linkage disequilibrium between loci can originate through selection, migration, mutation and random drift (Lynch and Walsh, 1998). Two alleles at two loci (A or a at locus 1 and B or b at locus 2) can be linked in coupling phase (AB/ab) or repulsion phase (Ab/aB). A population is in linkage disequilibrium when the frequency of gametes with genes in coupling is not equal to the frequency of gametes with genes in repulsion (Fehr, 1987). It is most common in populations derived from two inbred parents with contrasting phenotypes (e.g. one parent is tall (AABB), while the other parent is short (aabb)).
Linkage disequilibrium can influence heritability estimates by causing an upward bias (increase) or downward bias (decrease) in the estimates of additive ($\sigma^2_A$) and non-additive (dominance, $\sigma^2_D$) genetic variation (Fehr, 1987; Hallauer and Miranda, 1988).
  • 33. 19 Groups of genes that are linked, and tend to be transmitted intact from one generation to the next, are referred to as linkage blocks. Linkage can influence estimates of genetic variance for quantitative characters. For achievement of linkage equilibrium in a population, the opportunity must be provided for genetic recombination within heterozygous individuals. This requires repeated generations of intermating or selfing of heterozygous individuals. Recombination is an event that occurs during meiosis, which causes new combinations of genes to occur, and helps break up linkage blocks and reduce the linkage disequilibrium effect. The length of linkage blocks that are retained in a breeding population is influenced by the number of parents used to develop the population, the number of generations of intermating before selfing is initiated and the number of selfing generations conducted after intermating is completed (Fehr, 1987). In the GEP the number of parents that form the starting population is relatively small, there is only one generation of random mating before selfing starts followed by one generation of selfing for the intermating units used within the GEP modified S1 family strategy. All these factors contribute to a relatively low frequency of recombination events and a high level of linkage disequilibrium in the GEP. It is expected that the level of linkage disequilibrium in the DH strategy will be greater than that of the S1 families (Powell et al., 1992), as the S1 families, unlike DH lines, have a further opportunity to recombine during selfing after the intermating of the selected lines. The reduction in additive genetic variance due to gametic linkage disequilibrium caused by selection, is known as the Bulmer effect (Bulmer, 1971, 1980; Falconer and Mackay, 1996). 
The changes of the additive genetic variance affect variances, covariances and heritability, with these parameters requiring re-estimation at each cycle during recurrent selection (Charmet et al., 1993). The change in these parameters means that the response to selection will also be altered. To predict long-term response to selection the effects of linkage disequilibrium and genetic drift on additive variance need to be considered simultaneously (Wei et al., 1996).
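The link between generations of random mating and linkage disequilibrium can be illustrated with the standard two-locus result for a large random-mating population without selection or drift, D_t = D_0(1 - c)^t, where D is the coupling-minus-repulsion product of gamete frequencies and c is the recombination fraction. This is a textbook simplification and not the QU-GENE implementation; the function names and example values are hypothetical.

```python
def linkage_disequilibrium(f_AB, f_Ab, f_aB, f_ab):
    """D = f(AB) * f(ab) - f(Ab) * f(aB), from the four gamete frequencies."""
    return f_AB * f_ab - f_Ab * f_aB


def ld_after_random_mating(d0, c, generations):
    """Expected D after t generations of random mating: D_t = D_0 * (1 - c)**t."""
    return d0 * (1 - c) ** generations


if __name__ == "__main__":
    # Population derived from AABB x aabb: only coupling gametes are present.
    d0 = linkage_disequilibrium(0.5, 0.0, 0.0, 0.5)
    # c = 0.5 is free recombination; c = 0.1 is an illustrative tight linkage.
    for c in (0.5, 0.1):
        for t in (1, 10):
            print(c, t, ld_after_random_mating(d0, c, t))
```

Under these assumptions a single generation of random mating removes at most half of the initial disequilibrium, and much less for tightly linked genes, which is consistent with the expectation of a high level of linkage disequilibrium in the GEP.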
  • 34. 20 2.8 Study focus The literature outlined above covers the necessary background that needs to be considered when evaluating the response to selection of a breeding program. The computer simulations will be conducted using the computer program QU-GENE, to evaluate the response to selection for both S1 families and DH lines by analysing the impact of effective population sizes, selection intensity, linkage disequilibrium and genotype-by-environment interactions.
  • 35. 21 3. Materials and Methods
The QU-GENE simulation platform (Podlich and Cooper, 1998) was used to conduct the simulation experiments. The application module, GEPRSS, representing the GEP had already been developed prior to the commencement of this project (Podlich and Cooper, 1997). In the GEPRSS module both the S1 family and DH strategies were implemented as options. An outline of the way in which the two breeding strategies were modelled in the GEPRSS module is presented in Figure 3.
[Figure 3 diagram; columns: Year, Activity (S1), Activity (DH); years 1 to 5. Recoverable labels: random intermating; 10,000 S0 plants; sample 2,000; 2,000 S1 families; sample 1,000; MET (5 sites) S1 evaluation (twice); generate doubled haploid plants; production of DH lines; seed increase; MET (5 sites) DH evaluation (twice); select.]
Figure 3: Outline of the activities involved in the S1 and DH breeding strategies over one cycle of the GEP. The S1 activities are adapted from Fabrizius et al. (1996).
To quantify rate of response to selection and long-term selection response each breeding strategy was run for 10 cycles, which is equivalent to 40 years of the S1 strategy and 50 years of the DH strategy. On a time scale of 40 years the two strategies can alternatively be compared after 10 cycles of selection for S1 families and after 8 cycles of selection for DH lines. Response to selection was calculated as the genotypic value of selected individuals expressed as a percentage of the target genotype, where the target genotype was defined to
  • 36. 22 be the genotype containing all of the favourable alleles. The mean response to selection was estimated as the average response obtained for 100 runs of the simulation experiment. This methodology was used by Podlich et al. (1998) to normalise response to selection for comparisons between genetic models. Analyses of variance were conducted on the results of the experiments using the ASREML software (Gilmour et al., 1999). Important points with regard to the simulations: (1) Heritability in the GEPRSS module is calculated on a plot mean basis, however in the QUGENE engine it is assigned on a single plant basis in the base population. For the MET evaluation phase of the cycle the between plot experimental variance was set to be two times that of the within plot variance (Podlich et al., 1998), (2) All experiments were conducted with 20 families being selected from the METs to go into the next cycle of selection, unless otherwise stated, (3) When the term families are used in the experiments it refers to both S1 families and DH lines i.e. families and lines throughout the report are used interchangeably for DH lines to simplify presentation of results. Five simulation experiments were conducted to evaluate different aspects of response to selection in the GEP. These were: (1) Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines (Experiment 1), (2) Determining the effects of linkage disequilibrium on gene frequency and response to selection (Experiment 2), (3) Evaluating the response to selection of S1 families and DH lines for an additive genetic model (Experiment 3), (4) Evaluating the impact of selection proportion on response to selection for an additive genetic model (Experiment 4), (5) Evaluating the influence of G×E interaction genetic models on response to selection (Experiment 5). The treatments incorporated for each simulation experiment and their objectives are explained below.
  • 37. 23 Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines The objective of running the effective population size experiments was to determine whether the S1 families or DH lines strategies in the GEP reached a critical point where favourable genes were being lost due to the effects of genetic drift. A secondary objective was to compare the simulation results of the Ne with the theoretical predictions, that were given in the literature review section of this thesis. Doubled haploid effective population equations only exist at a restricted level (e.g. for the case of one DH plant per S0 plant selected), therefore it was of interest to see what the simulated Ne results were when more DH plants were produced per S0 plant. The effective population size was simulated using the S1 and DH strategy additive model input files to see whether they conformed to the theoretical predictions. The following parameters were considered: (1) heritability (one level: 1.00) (2) number of genes contributing to the trait (one level: 50) (3) starting gene frequency (one level: 0.5) (4) number of families used in the METs (five levels: 100, 250, 500, 750, 1000) (5) number of families selected (no selection was imposed, therefore this value was equal to the number of families being evaluated in the MET) Refer to Appendix 1 Table A1.1 for QUGENE engine input file. To test for the effective population size two more parameters were altered under each option. The parameters changed in the theoretical equation (5) for the S1 families were: (1) number of S0 plants sampled (equivalent to number of S1 families) ( M ): 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200 (2) number of reserve seeds per S0 plant (m): 1, 2, 3, 4, 5, 10, 100. M and m are both components of the theoretical equations, allowing the simulations and the theoretical predictions to be compared. 
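The theoretical predictions used for comparison can be tabulated directly. The sketch below assumes the following reading of the theoretical equations: equation (3) as Ne = M / [(1 + f_{n-1}) + (1/2)^n / m], equation (5) as Ne = M / [(2m + 1) / (2m)] for S1 families, and equation (6) as Ne = M'/2 for DH lines. The function names are hypothetical and the code is independent of QU-GENE.

```python
def ne_general(M, m, n, f_prev):
    """Equation (3): M plants sampled, m reserve seeds, n generations,
    f_prev = inbreeding coefficient after n - 1 generations."""
    return M / ((1 + f_prev) + 0.5 ** n / m)


def ne_s1(M, m):
    """Equation (5): S1 families (f_prev = 0, n = 1)."""
    return M / ((2 * m + 1) / (2 * m))


def ne_dh(M_prime):
    """Equation (6): one DH line per S0 plant."""
    return M_prime / 2


if __name__ == "__main__":
    M_levels = (5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200)
    m_levels = (1, 2, 3, 4, 5, 10, 100)
    for M in M_levels:
        row = [round(ne_s1(M, m), 1) for m in m_levels]
        print(M, row, "DH:", ne_dh(M))
```

Under equation (5) the S1 value approaches M as m becomes large, whereas the DH value under equation (6) stays at half the number of S0 plants sampled.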
This enabled investigation of the effective population size of the S1 strategy in the GEP at a range of numbers of families evaluated in
  • 38. 24 the MET and the number of reserve seed used in random mating to create the base population for each cycle of selection.
The approach to determine the effective population size of the DH strategy was slightly different to that used for the S1 families. The parameters changed in the input file to determine the Ne of DH lines were:
(1) number of S0 plants sampled ($M'$): 5, 10, 15, 20, 25, 30, 40, 50, 100, 150, 200
(2) number of DH produced per S0 plant ($m'$): 1, 2, 3, 4, 5, 10.
Only $M'$, however, is present in the theoretical equation (6) to determine the Ne of DH lines. There is presently no theoretical equation derived to determine the Ne of DH when more than one DH plant is produced per S0 plant (i.e. when $m' > 1$). Simulations were still conducted with values of $m'$ greater than one so that the response of Ne due to changes in $m'$ could be evaluated.
The effective population size was calculated for each of the 50 genes not under the influence of selection in the QU-GENE simulations following the procedure outlined in example 4.1 (p.70) in Falconer and Mackay (1996). The inbreeding coefficient ($F$) for each gene was calculated from the variation among 100 runs using the following formula

$$F = \frac{\sigma^2_q}{pq}, \qquad (9)$$

where $\sigma^2_q$ is the variance of gene frequencies among runs, $p$ is the mean gene frequency of a particular allele at a locus among runs, and $q$ is the mean gene frequency of all other alleles at that locus among runs. Using this procedure each gene gave an independent estimate of $F$. From this estimate, the rate of inbreeding ($\Delta F$) can be calculated by rearranging $F_t = 1 - (1 - \Delta F)^t$ to give

$$\Delta F = 1 - \left(1 - F_t\right)^{1/t}, \qquad (10)$$

where $t$ is the generation number. The effective population size of each of the fifty genes was then calculated from $\Delta F$ with the following equation

$$N_e = \frac{1}{2 \Delta F}. \qquad (11)$$
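The estimation procedure can be sketched on simulated data. The example below applies F = σ²q/(pq), ΔF = 1 - (1 - F_t)^(1/t) and Ne = 1/(2ΔF) to gene frequencies from repeated runs. As a stand-in for QU-GENE output it uses a simple idealised-population drift model (binomial sampling of 2N gametes per generation), which is an assumption made purely for illustration; the function names, the 50 parents and the 5 generations are hypothetical, while the 50 genes, 100 runs and starting frequency of 0.5 follow the experiment description.

```python
import random
import statistics


def ne_from_frequencies(freqs, t):
    """Apply equations (9) to (11) to one gene's frequencies across runs."""
    p = statistics.mean(freqs)
    q = 1 - p
    f = statistics.variance(freqs) / (p * q)  # equation (9), sample variance
    delta_f = 1 - (1 - f) ** (1 / t)          # equation (10)
    return 1 / (2 * delta_f)                  # equation (11)


def drift_run(n_parents, t, p0, rng):
    """One neutral gene drifting for t generations in an idealised population."""
    p = p0
    for _ in range(t):
        copies = sum(rng.random() < p for _ in range(2 * n_parents))
        p = copies / (2 * n_parents)
    return p


def simulate_ne(n_parents=50, t=5, p0=0.5, n_genes=50, n_runs=100, seed=1):
    """Mean and standard deviation of the per-gene Ne estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_genes):
        freqs = [drift_run(n_parents, t, p0, rng) for _ in range(n_runs)]
        estimates.append(ne_from_frequencies(freqs, t))
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    mean_ne, sd_ne = simulate_ne()
    print("mean Ne over genes:", round(mean_ne, 1), "sd:", round(sd_ne, 1))
```

In this idealised setting the mean estimate should fall near the number of parents, and the spread among genes gives an indication of the sampling variation to expect from per-gene estimates.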
  • 39. 25 An estimate of the variation of Ne was obtained by estimating the effective population size for each gene, and the variation amongst these genes. Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection A study was conducted to determine the impact of linkage disequilibrium on the frequency of genes in the GEP after 5 cycles of selection. This study was only conducted on S1 families. Both additive and a complex G×E interaction model were assessed and compared. Genetic models based on twenty genes were considered. The effects of the genes were scaled to generate major or minor genes. The favourable alleles for all 20 genes commenced at a gene frequency of 0.2. Therefore, any positive effects of selection were expected to increase the gene frequencies above the starting value of 0.2. The following parameters were considered in both the additive and G×E interaction models: (1) heritability (one level: 0.95) (2) number of genes contributing to the trait (one level: 20) (3) starting gene frequency (one level: 0.2) (4) number of families used in the METs (one level: 250) (5) number of families selected (one level: 10) Refer to Appendix 1 Table A1.2 for QUGENE engine input file. To reduce the effects of linkage disequilibrium in the recurrent selection strategy, ten generations of random mating were incorporated into the S1 program. The random matings were conducted from the S0 plants for each cycle of selection. The rate of change of the favourable alleles for each gene was monitored for the cases with and without the extra generations of random mating, of particular interest was whether the presence of linkage disequilibrium influenced the rate of change in gene frequency of the favourable alleles. 
In the absence of any effects of linkage disequilibrium the rate of change in the frequencies of the alleles was expected to be proportional to the size of the gene effects, and independent of the number of generations of random mating. However, when linkage disequilibrium was present, i.e. as was expected without the additional generations of random mating, the
  • 40. 26 rate of change in the frequencies of the alleles could be influenced by the size of the gene effects and the degree of linkage disequilibrium. Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model This experiment was conducted to determine whether using DH lines resulted in a faster response to selection relative to the S1 families for a range of heritabilities, number of genes contributing to the attribute, number of families evaluated in the MET and the number of families selected from the MET to progress into the next cycle of selection. Theoretical considerations suggest that a higher rate of response would be observed when DH lines were used in the place of S1 families in the GEP. Using a completely additive genetic model (i.e. no epistasis, no genotype-by-environment interaction and no linkage) the following parameters were altered providing a range of genetic model scenarios: (1) heritability (five levels: 0.05, 0.25, 0.50, 0.75, 0.95) (2) number of genes contributing to the trait (four levels: 5, 10, 20, 100) (3) starting gene frequency (one level: 0.2) (4) number of families used in the METs (five levels: 100, 250, 500, 750, 1000) (5) number of selected families (one level: 20) Refer to Appendix 1 Table A1.3 to Table A1.6 for the relevant QUGENE engine input file. Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model The number of families selected from one cycle of selection to be progressed through the next cycle of selection is an important factor in the GEP. If too few families are selected (high selection intensity) random drift may result and valuable genes may be lost from this population. On the other hand if too many families are selected (low selection intensity)
  • 41. 27 then too many undesirable genes will be retained in the population and the response to selection will be slowed down. The GEP currently selects 20 families based on the results of the MET. This experiment was conducted to determine whether this figure provided a suitable balance between selection intensity and effective population size.
Table 1: Selected proportion (%) and the corresponding standardised selection differential (S, in brackets; from Falconer and Mackay, 1996) for each combination of the number of families in the MET and the number of families selected from the MET.
    Number of families    Number of families in MET
    selected              250            500            750               1000
    5                     2% (2.054)     1% (2.326)     0.67% (2.4705)    0.5% (2.576)
    10                    4% (1.751)     2% (2.054)     1.3% (2.227)      1% (2.326)
    15                    6% (1.555)     3% (1.881)     2% (2.054)        1.5% (2.1705)
    20                    8% (1.405)     4% (1.751)     2.67% (1.945)     2% (2.054)
    25                    10% (1.282)    5% (1.645)     3.33% (1.8255)    2.5% (1.960)
    30                    12% (1.175)    6% (1.555)     4% (1.751)        3% (1.881)
Table 1 documents the number of families in a MET and the selection proportions applied, and shows that as the selected proportion increases, the standardised selection differential and selection intensity decrease. To explore the effect that different selection proportions can have on the response to selection, simulations were run for both S1 families and DH lines where the following parameters were used:
(1) heritability (one level: 0.95)
(2) number of genes contributing to the trait (four levels: 5, 10, 20, 100)
(3) starting gene frequency (one level: 0.2)
(4) number of families used in the METs (five levels: 100, 250, 500, 750, 1000)
(5) number of selected families (six levels: 5, 10, 15, 20, 25, 30)
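The relationship between selected proportion and the standardised quantities in Table 1 can be reproduced from the normal distribution. The sketch below assumes truncation selection on a normally distributed trait in a large population and uses only the Python standard library; the function names are hypothetical. It computes two related quantities: the truncation point x (in standard deviation units) above which the selected proportion lies, and the mean of the selected group, i = z/p, where z is the height of the normal curve at x. The bracketed values in Table 1 appear to agree numerically with x (for example 2.054 at 2% and 1.405 at 8%), so both are shown rather than assuming which was intended.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def truncation_point(proportion):
    """Standardised value x with the given proportion of the population above it."""
    return STANDARD_NORMAL.inv_cdf(1 - proportion)


def selection_intensity(proportion):
    """Mean of the selected group in standard deviation units, i = z / p."""
    x = truncation_point(proportion)
    return STANDARD_NORMAL.pdf(x) / proportion


if __name__ == "__main__":
    for n_met in (250, 500, 750, 1000):
        for n_sel in (5, 10, 15, 20, 25, 30):
            p = n_sel / n_met
            print(n_met, n_sel, f"{100 * p:.2f}%",
                  round(truncation_point(p), 3),
                  round(selection_intensity(p), 3))
```

Both quantities decrease as the selected proportion increases, which is the trend described for Table 1.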
  • 42. 28 Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection
Genotype-by-environment interaction was included in the genetic model to determine whether the responses to selection and the advantages of the DH lines over S1 families that were observed for the additive model would be retained in the presence of G×E interactions. It was also incorporated because G×E interaction has a major influence on the selection of genotypes in Australian environments, and the simulations would be incomplete if it was not considered as a factor in the genetic model. Two major experiments were undertaken to fulfil two objectives. The first was to assess the response to selection when two years of METs were conducted for both the DH lines and S1 families. The second was to assess response to selection when the DH lines were evaluated in only one year of METs. It was expected that the DH line advantage would be retained when two years of METs were conducted; however, it was uncertain whether this advantage would be retained when only one year of METs was used.
To introduce G×E interaction into the additive model, five environment types were added into the QU-GENE engine input file. The inputs into the genotype-environment system can be manipulated so that genes can have different effects in different environments, thus generating G×E interactions. In the input (Table 2), a value of 0 means that a gene has no effect in that environment, a value of 1 means that a gene has the effects defined by the m, a, d genetic model, and a value of -1 means that a gene has a cross-over genetic effect in that environment. These different gene effects are outlined in bold (Table 2) for each of the five environments (E1 to E5). Refer to Appendix 1 for all of the input files. Five G×E interaction models were produced to create different levels of G×E interaction (Table 3). There were 20 genes contributing to the attribute subjected to selection in each of the simulations.
The genes in this experiment were interacting with five environment types.
  • 43. Table 2: Input file for the QU-GENE engine. This file represents G×E model 5 (Table 3). Genes 5–16 are omitted from this presentation for conciseness. GN represents the gene number and E1–E5 represent the five environment types within the target population of environments. The structure of this input file is explained in detail by Podlich and Cooper (1997; 1998).

    S                        ! Pollination Type
    N                        ! Random Seed
    100 200 300
    Y N N                    ! Linkage, Epistasis, Random GxE
    1                        ! Gene Sampling type (1=fixed, 2=random)
    10                       ! no. of runs to calc var comp
    (BP, Progeny, Genes, Attributes, Environment Types, Sample Environments)
    5000 10 22 2 5 1
    0.4 0.3 0.15 0.1 0.05    ! Environment Frequency
    1 1 1 1 1                ! GxE multipliers
    0.95 1                   ! Heritability for each Attribute

    GN  M      A       D       AT  L  LN   K  E1  E2  E3  E4  E5  P
    1   0.100  0.050   0.000   1   1  1    0   1   0  -1   1   1  0.2
    2   0.100  0.050   0.000   1   1  0.5  0   1   1  -1   1   0  0.2
    3   0.100  0.050   0.000   1   1  0.5  0   1  -1   1  -1   1  0.2
    4   0.100  0.050   0.000   1   1  0.5  0   1  -1   0   1   1  0.2
    "   "      "       "       "   "  "    "
    17  0.100  0.050   0.000   1   1  0.5  0   1   0   1   1  -1  0.2
    18  0.100  0.050   0.000   1   1  0.5  0   1  -1   1  -1   1  0.2
    19  0.100  0.050   0.000   1   1  0.5  0   1   0  -1   1  -1  0.2
    20  0.100  0.050   0.000   1   1  0.5  0   1   1  -1  -1   1  0.2
    21  0.50   -0.500  -0.490  2   1  2    0   1   1   1   1   1  0
    22  0.50   -0.500  -0.490  2   1  0.5  0   1   1   1   1   1  1

    N                        ! Mating Type
    1                        ! Selection type
    ********************************************************************
    R                        ! Mating (Random Mating)
    10 5 0.2                 ! Generations, Generations before selection, Select Pressure
    M                        ! Mating (Mixture)
    0.8 0.2 10 5 0.2         ! Proportions (RM/S), Gen, Gen before Sel, Sel Pressure
    N                        ! No Further Mating
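To illustrate how the E1–E5 entries of Table 2 generate genes of different importance across the target population of environments, the sketch below takes genes 1 to 4 from the table. It assumes, as a simplification, that each entry simply multiplies the additive effect A of the gene in that environment type, and it summarises each gene by its frequency-weighted mean effect. The function names and this summary statistic are illustrative assumptions; they do not describe how the QU-GENE engine processes the file.

```python
# Environment-type frequencies from the "Environment Frequency" line of Table 2.
ENV_FREQ = [0.4, 0.3, 0.15, 0.1, 0.05]

# E1..E5 entries for genes 1-4 as listed in Table 2.
MULTIPLIERS = {
    1: [1, 0, -1, 1, 1],
    2: [1, 1, -1, 1, 0],
    3: [1, -1, 1, -1, 1],
    4: [1, -1, 0, 1, 1],
}

ADDITIVE_EFFECT = 0.05  # the "A" column for genes 1-20 in Table 2


def effects_by_environment(multipliers, a=ADDITIVE_EFFECT):
    """Effect of a gene in each environment type (assumed: multiplier * a)."""
    return [m * a for m in multipliers]


def mean_effect(multipliers, freqs=ENV_FREQ, a=ADDITIVE_EFFECT):
    """Frequency-weighted mean effect of a gene over the environment types."""
    return sum(f * m * a for f, m in zip(freqs, multipliers))


if __name__ == "__main__":
    for gene, mult in MULTIPLIERS.items():
        print(f"gene {gene}: per-environment {effects_by_environment(mult)}, "
              f"weighted mean {mean_effect(mult):.4f}")
```

Although all four genes share the same A value, their weighted mean effects differ (gene 2 is the largest and gene 3 the smallest under these assumptions), which is one way of seeing how cross-over and null effects produce relatively major and minor genes.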
  • 44. To explore these models, simulations were run for both S1 families and DH lines using the following parameters:
(1) heritability (three levels: 0.05, 0.25, 0.95)
(2) number of genes contributing to the trait (one level: 20)
(3) starting gene frequency (one level: 0.2)
(4) number of families used in the METs (four levels: 250, 500, 750, 1000)
(5) number of selected families (three levels: 10, 20, 30)
(6) level of G×E interaction (five levels: models 1, 2, 3, 4, 5; Table 3)

Refer to Appendix 1, Tables A1.7 to A1.11, for the relevant QU-GENE engine input files.

Table 3: Each model describes the number of genes interacting with the five environment types and the level of G×E interaction present, expressed as the ratio of the genotype-by-environment interaction variance to the genotypic variance (σ²GE:σ²G).

    Model number   Number of genes interacting   σ²GE:σ²G
    1              10                            0.4
    2              10                            0.6
    3              15                            0.8
    4              15                            1.1
    5              20                            2.89
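The factorial structure of the parameter list above can be enumerated with a short sketch. The variable names and the dictionary layout are hypothetical conveniences for illustration only; the actual simulations were configured through QU-GENE input files. The two strategies are those named in the paragraph above (S1 families and DH lines), and the G×E ratios are taken from Table 3.

```python
from itertools import product

STRATEGIES = ("S1", "DH")
HERITABILITY = (0.05, 0.25, 0.95)
N_GENES = (20,)
START_FREQ = (0.2,)
N_FAMILIES = (250, 500, 750, 1000)
N_SELECTED = (10, 20, 30)
# Model number -> ratio of GxE variance to genotypic variance (Table 3).
GXE_RATIO = {1: 0.4, 2: 0.6, 3: 0.8, 4: 1.1, 5: 2.89}

KEYS = ("strategy", "heritability", "n_genes", "start_freq",
        "n_families", "n_selected", "gxe_model")


def scenarios():
    """Yield one dict per combination of the factor levels listed above."""
    for combo in product(STRATEGIES, HERITABILITY, N_GENES, START_FREQ,
                         N_FAMILIES, N_SELECTED, GXE_RATIO):
        scenario = dict(zip(KEYS, combo))
        scenario["gxe_ratio"] = GXE_RATIO[scenario["gxe_model"]]
        yield scenario


if __name__ == "__main__":
    all_scenarios = list(scenarios())
    print(len(all_scenarios), "factor combinations")
    print(all_scenarios[0])
```

Listing the combinations in this way makes it easy to check that every factor level appears with every other level before any runs are summarised.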
  • 45. 4. Results

Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines

The simulated effective population size (Ne) of the S1 family strategy corresponded well with the predictions based on theoretical equation (5) (Figure 4). As the number of S1 families selected (M) increases, Ne increases. Ne also increases as the number of reserve seeds used for intermating per S0 plant sampled (m) increases.

Figure 4: S1 family effective population size (Ne) calculated theoretically (solid line) and as the average of the simulation runs (broken lines) for a range of values of the number of S0 plants sampled (M) and the number of reserve seeds used for intermating per S0 plant sampled (m).
Axes: effective population size (Ne) against number of S0 plants sampled (M). Series: m = 1, 2, 3, 4, 5, 10 and 100.

The variability of Ne for two levels of m (1 (Figure 5a) and 100 (Figure 5b)) is indicated by the scatter of points about the mean (solid line) for each value of M. The variability of Ne about the mean increases as the number of S0 plants sampled increases. This effect was observed for all levels of m (Appendix 2).
  • 46. Figure 5: Variation of the simulated S1 family effective population size (Ne) about the average of the simulation runs (solid line) for a range of S0 plants sampled (M) and the two extreme values of reserve seeds used for intermating per S0 plant sampled (m). For intermediate levels of m, refer to Appendix 2.
Panels: (a) m = 1; (b) m = 100. Axes: effective population size (Ne) against number of S0 plants sampled (M).

As for the S1 families, Ne for the DH lines increases as the number of S0 plants sampled (M′) increases (Figure 6). The simulated results and theoretical predictions show good correspondence.

Figure 6: Comparison of the DH simulated average (closed circles) and DH theoretical (solid line) effective population size for a range of S0 plants sampled (M′) when only one DH plant was produced per S0 plant sampled (m′ = 1).
Axes: effective population size (Ne) against number of S0 plants sampled (M′).

Figure 7 indicates how the simulated Ne increased as the number of DH plants produced per S0 plant (m′) increased. Theoretical equations were derived only for the situation when
  • 47. m′ = 1; however, the simulated Ne for four levels of m′ > 1 were also plotted. As for the S1 strategy, Ne increased as the number of DH plants produced per S0 plant (m′) increased. This increase was less than that observed for increasing m in the case of the S1 family strategy (Figure 4).

Figure 7: Average of the simulated DH effective population size (Ne) for a range of S0 plants sampled (M′) and DH plants produced per S0 plant sampled (m′). A regression line is fitted for each m′.
Axes: effective population size (Ne) against number of S0 plants sampled (M′). Series: m′ = 1, 2, 3, 4, 5 and 10.

The variability of the DH line Ne for each M′ in the simulated data is shown for two levels of m′ (1 (Figure 8a) and 10 (Figure 8b)). The variation of Ne is indicated by the scatter of points about the mean (solid line) for each value of M′. The variability about the mean increases as the number of S0 plants sampled increases. This effect was observed for all levels of m′ (Appendix 2).
  • 48. Figure 8: Variation of the simulated DH line effective population size (Ne) about the average of the simulation runs (solid line) for a range of S0 plants sampled (M′) and two extreme numbers of DH plants produced per S0 plant sampled (m′). For intermediate levels of m′, refer to Appendix 2.
Panels: (a) m′ = 1; (b) m′ = 10. Axes: effective population size (Ne) against number of S0 plants sampled (M′).

Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection

The results from the linkage disequilibrium experiment focus on cycle five of the GEP, where the S1 family strategy was used. The genes were scaled to have a distribution of effects ranging from 1.2% to 10% of the total trait value. Both an additive model and a G×E interaction model (σ²GE:σ²G = 2.89) were considered. All genes commenced with a gene frequency of 0.2. Therefore, any increase in the frequency above 0.2 is a consequence of selection. The smaller the increase in frequency towards 1.0, the less effective selection was in changing gene frequency. It can be seen from Figure 9a,c that for most genes selection was effective in increasing the frequency of the favourable allele, and that after five cycles of selection the genes ended up with different gene frequencies for both the additive and the G×E interaction models.
  • 49. Figure 9: The influence of linkage disequilibrium in the GEP with S1 families and 20 genes, under both an additive model and a G×E interaction model (σ²GE:σ²G = 2.89), after five cycles of selection: (a) frequency of each gene in the model for one generation of random mating per cycle; (b) gene frequency and value for each of the 20 genes for one generation of random mating per cycle; (c) frequency of each gene in the model for 10 generations of random mating per cycle; and (d) gene frequency and value for each of the 20 genes for 10 generations of random mating per cycle.
Axes: gene frequency against gene number in (a) and (c), and against value of gene in (b) and (d). Series: Additive, G×E.

After five cycles of selection, genes with a relatively low value could have either a low or a high frequency of occurrence in the population (Figure 9b). When the genes were influenced by G×E interaction, their frequency in the population was generally less than when there was no G×E interaction effect. There is a lack of a consistent relationship between the magnitude of the effect of a gene and its frequency following five cycles of selection for the case where no additional generations of random intermating were undertaken (Figure 9b). Therefore, genes with similar value, in terms of the way that they
  • 50. contributed to the trait, can have dissimilar gene frequencies. It is hypothesised that this is predominantly a consequence of linkage disequilibrium.

To reduce the effects of linkage disequilibrium, ten generations of random intermating following each cycle of selection were added to the simulation (Figure 9d). With the inclusion of the additional generations of random mating after each cycle of selection, the frequency of the genes was found to be approximately proportional to the value of the gene after five cycles of selection. As expected, a pattern was observed after selection whereby genes with low value had a lower frequency in the population than genes with a higher value. The genes in the additive model still had a higher frequency than was the case for the genes in the G×E interaction model. This was expected because G×E interaction complicates selection. There also appears to be a point (a gene value of approximately 0.055) in Figures 9b,d beyond which the value of the gene is high enough that linkage disequilibrium had little or no effect on its frequency after five cycles of selection, as genes with effects of this magnitude or greater have comparable gene frequencies.

Experiment 3: Evaluating the response to selection of S1 families and DH lines for an additive genetic model

The analysis of variance of the additive model simulation output data indicated significant interactions between the two breeding strategies (DH lines and S1 families) and cycles, heritability, number of families tested in the MET and number of genes. Greater levels of selection response were associated with higher levels of heritability, larger numbers of families, smaller numbers of genes and increasing numbers of cycles. On average, including all runs and cycles, the DH strategy had a 13% mean improvement over the S1 family strategy.
The following results compare the response to selection of the DH and S1 strategies as the number of families, the number of genes and the heritability were varied. As the number of families evaluated in the MET increased (with heritability 0.05, 20 genes, and selecting 20 families), there was stronger selection pressure placed on the population
  • 51. (Table 1) resulting in an increase in the rate of genetic progress (Figure 10a,b,c,d). The simulations also indicate that the DH strategy provided a greater response to selection than the S1 strategy over all family sizes and cycles considered.

Figure 10: Comparison of the response to selection for S1 families and DH lines with heritability 0.05, 20 genes and four family sizes over 10 cycles of selection.
Panels: (a) 250 families; (b) 500 families; (c) 750 families; (d) 1000 families. Axes: performance (% target genotype) against cycle. Series: S1, DH.

At a low heritability (0.05) and a medium family size (500), as the number of genes increased it took longer to achieve a large response to selection (Figure 11a,b,c,d). However, the DH strategy had a faster rate of progress than the S1 strategy over all of the gene levels and cycles. With 20 genes contributing toward the attribute under selection (a potentially realistic value for some traits targeted by the GEP), the DH strategy reached 100% of the target genotype after seven cycles of selection (Figure 11c), while the S1 strategy reached only approximately 90% after 10 cycles.
  • 52. Figure 11: Comparison of the response to selection for S1 families and DH lines with heritability 0.05, 500 families and four gene numbers over 10 cycles of selection.
Panels: (a) 5 genes; (b) 10 genes; (c) 20 genes; (d) 100 genes. Axes: performance (% target genotype) against cycle. Series: S1, DH.

Figure 12: Comparison of the response to selection for S1 families and DH lines with 20 genes, two heritability levels and two family sizes over 10 cycles of selection.
Panels: (a) 0.05 heritability, 250 families; (b) 0.05 heritability, 1000 families; (c) 0.95 heritability, 250 families; (d) 0.95 heritability, 1000 families. Axes: performance (% target genotype) against cycle. Series: S1, DH.
  • 53. Figure 12 shows the effects of two different heritability levels for low (250) and high (1000) numbers of families when 20 genes are contributing towards the attribute under selection. At the low heritability of 0.05 (Figure 12a,b), both family sizes have a slower response to selection than when the heritability is high (0.95) (Figure 12c,d). When the heritability is high, 1000 families gave a faster response to selection than 250 families (Figure 12c,d). The DH strategy was again superior to the S1 strategy across the levels of heritability examined. The change in the level of heritability, however, did not have a great effect on response to selection when using an additive model.

The impact of using different numbers of DH lines in the GEP was assessed relative to 1000 S1 families by comparing the response to selection at two heritability levels (0.05 and 0.95) and two gene numbers (20 and 100). Over all the combinations examined, the rate of progress for 100 DH families was similar to that observed for 1000 S1 families (Figure 13a,b,c,d). When the number of DH families was greater than or equal to 250, they gave a greater response to selection than that observed for 1000 S1 families. The genetic models based on a larger number of genes resulted in a slower rate of progress (Figure 13b,d) than the models based on a lower gene number (Figure 13a,c).
Figure 13: Comparison of the response to selection for 1000 S1 families with that for 100, 250, 500 and 1000 DH lines at two heritability levels and two gene numbers over 10 cycles of selection.
Panels: (a) heritability 0.25, 20 genes; (b) heritability 0.25, 100 genes; (c) heritability 0.95, 20 genes; (d) heritability 0.95, 100 genes. Axes: performance (% target genotype) against cycle. Series: 1000 S1, 100 DH, 250 DH, 500 DH, 1000 DH.
  • 54. The Bulmer effect was observed in the additive model simulations and was visualised as a rapid decrease in heritability in the early cycles of selection. This effect can be observed in Figure 13 as a greater and more rapid increase in response to selection for the first two cycles of selection compared to the subsequent cycles.

Experiment 4: Evaluating the impact of selection proportion on response to selection for an additive genetic model

The impact of changing the number of families selected was examined with a heritability of 0.95, 20 genes, three different family sizes (250, 500, 1000) and three different numbers of selected families (5, 20, 30) for S1 families and DH lines separately. When five families were selected in the S1 strategy (Figure 14a), the rate of response to selection was faster than that observed when 20 (Figure 14c) or 30 (Figure 14e) families were selected. However, when five families were selected the long-term selection response plateaued before it reached 100% of the target genotype (Figure 14a). This plateau did not occur at less than 100% of the target genotype when either 20 or 30 families were selected (Figure 14c,e). The same overall response was also observed using the DH strategy (Figure 14b,d,f). The DH response to selection was much faster than that observed for the S1 strategy at all levels of families selected. In both the S1 and DH strategies, 1000 families gave the fastest short-term response to selection.

The sub-optimal long-term responses to selection that were observed when five S1 families or DH lines were selected (Figure 14a,b) are a consequence of the loss of favourable alleles for some of the genes due to the effects of random drift. Thus, while the intense selection that resulted when five families were selected gave a rapid short-term rate of genetic progress, the small effective population size that accompanied this high selection intensity placed limits on the long-term response to selection.
The practice of selecting 20 S1 families, which is currently used in the GEP, did not appear to place severe limits on the expected long-term response to selection (Figure 14c).
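The risk of losing a favourable allele when very few families are selected can be illustrated with a back-of-envelope calculation. This sketch is not the QU-GENE model: it ignores selection entirely and assumes that each selected family contributes a fixed number of gametes drawn independently at the population gene frequency (two per S1 family and one per DH line are used as hypothetical values). The function name is also an assumption made for illustration.

```python
def prob_allele_absent(freq: float, n_families: int,
                       gametes_per_family: int = 2) -> float:
    """Chance that none of the sampled gametes carries the favourable allele.

    Assumes neutral, independent sampling of gametes at frequency `freq`.
    """
    n_gametes = n_families * gametes_per_family
    return (1.0 - freq) ** n_gametes


if __name__ == "__main__":
    start_freq = 0.2  # starting gene frequency used in the simulations
    for n_sel in (5, 20, 30):
        two = prob_allele_absent(start_freq, n_sel, gametes_per_family=2)
        one = prob_allele_absent(start_freq, n_sel, gametes_per_family=1)
        print(f"{n_sel:>2} families: 2 gametes each -> {two:.2e}, "
              f"1 gamete each -> {one:.2e}")
```

Under these simplifying assumptions, a single round of sampling five families carries a chance of roughly one in ten of missing an allele at frequency 0.2, whereas with 20 or 30 families the chance is very small. Selection in favour of the allele would reduce these probabilities, so the figures should be read only as an indication of why very small selected groups are exposed to random drift.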
  • 55. Figure 14: Comparison of the response to selection for S1 families (a,c,e) and DH lines (b,d,f) with a heritability of 0.95, 20 genes and three levels of families selected over 10 cycles of selection.
Panels: (a) 5 S1 families selected; (b) 5 DH lines selected; (c) 20 S1 families selected; (d) 20 DH lines selected; (e) 30 S1 families selected; (f) 30 DH lines selected. Axes: performance (% target genotype) against cycle. Series: 250, 500 and 1000 S1 families or DH lines.

The rate of response to selection was examined further for the S1 strategy for both an additive and a G×E interaction model (Model 4, Table 3; σ²GE:σ²G = 1.1) with a heritability of 0.95 and 20 genes, with two levels of families selected (10, 30), and at four different family sizes (250, 500, 750, 1000) (Figure 15). For each of the four family sizes, the G×E interaction model (circular symbols) had a faster response to selection than the additive model (triangular symbols) in the short to medium term (Figure 15a,b,c,d). Selecting 10 families also gave a greater response to selection than selecting 30 families for both the additive and G×E interaction models. The rate of response to selection increased from that observed when 250 families were evaluated in the model (Figure 15a) to that observed when 1000
  • 56. families were evaluated (Figure 15d). The effects of G×E interaction on response to selection were examined further in simulation experiment five.

Figure 15: Comparison of response to selection of the G×E interaction model (σ²GE:σ²G = 1.1; Table 3) and the additive model with constant heritability (0.95) and number of genes (20), and two levels of families selected (S), for four different family sizes over 10 cycles of selection.
Panels: (a) 250 families; (b) 500 families; (c) 750 families; (d) 1000 families. Axes: performance (% target genotype) against cycle. Series: GxE 10 S, GxE 30 S, Add 10 S, Add 30 S.

Experiment 5: Evaluating the influence of G×E interaction genetic models on response to selection

The analysis of variance of the G×E interaction simulation output data indicated significant interactions between the S1 (2 years of MET), DH (1 year of MET) and DH (2 years of MET) breeding strategies and level of G×E interaction, cycles, selected proportion and number of families. On average, including all runs and cycles, the DH (2 MET) strategy had a 12% increase in mean performance compared to the S1 (2 MET) strategy and a 2% increase in mean performance over DH (1 MET). DH (1 MET) also had, on average, a 9% increase in mean performance compared to S1 (2 MET). The studies conducted in simulation experiment three indicated that 100 families were not required in this model, as that family size was too small for the DH lines to have a greater response than 1000 S1 families.
  • 57. With two years of MET testing for both strategies, 250 DH lines had an advantage over 1000 S1 families at both high and low levels of heritability and for all levels of G×E interaction considered (Figure 16a,b,c,d). A faster response to selection was observed at the higher level of heritability (Figure 16b,d) than at the lower heritability (Figure 16a,c).

Figure 16: Comparison of response to selection of 1000 S1 families to two sizes of DH lines, with 20 genes, two heritability levels, two levels of G×E interaction and both S1 families and DH lines having two years of METs.
Panels: (a) heritability 0.05, σ²GE:σ²G = 0.8; (b) heritability 0.95, σ²GE:σ²G = 0.8; (c) heritability 0.05, σ²GE:σ²G = 2.89; (d) heritability 0.95, σ²GE:σ²G = 2.89. Axes: performance (% target genotype) against cycle. Series: 1000 S1 (2 MET), 250 DH (2 MET), 1000 DH (2 MET).

To compare a four-year DH strategy to the four-year S1 cycle, simulations were also run where the DH breeding strategy was conducted with one year of METs while S1 families remained at two years of METs (Figure 17a,b,c,d). The advantage of 250 DH lines over 1000 S1 families was retained, but the magnitude of the advantage was reduced, when only one year of METs was run. At a heritability of 0.95 the response to selection was faster than when the heritability was 0.05; however, the DH advantage was lost after 8 cycles of selection and 1000 S1 families had a slightly greater response to selection in the long term (Figure 17b,d).
  • 58. Figure 17: Comparison of response to selection of 1000 S1 families to two sizes of DH lines, with 20 genes, two heritability levels and two levels of G×E interaction, with S1 families having two years of METs and DH lines having one year of METs.
Panels: (a) heritability 0.05, σ²GE:σ²G = 0.8; (b) heritability 0.95, σ²GE:σ²G = 0.8; (c) heritability 0.05, σ²GE:σ²G = 2.89; (d) heritability 0.95, σ²GE:σ²G = 2.89. Axes: performance (% target genotype) against cycle. Series: 1000 S1 (2 MET), 250 DH (1 MET), 1000 DH (1 MET).
  • 59. 5. Discussion

Experiment 1: Comparison of simulated and theoretical predictions of the effective population size (Ne) of S1 families and DH lines

The effective population size simulations were conducted to ensure that the Ne of both the DH lines and S1 families was large enough that favourable alleles in the population did not have a high probability of loss through random drift. However, if Ne is too large the response to selection will be slowed due to a reduction in selection pressure and a greater tendency to retain undesirable alleles in the population.

In both the S1 and DH strategies, the effective population size increased as the number of S0 plants sampled increased. This is especially important in the DH strategy, as its Ne is smaller than when S1 families are used. If a breeder were concerned about Ne being small with DH lines, it would be feasible to increase Ne in this recurrent selection strategy by sampling more than one DH line from each selected S0 plant. Therefore, there are opportunities to manipulate the effective population size with DH lines within the GEP if it became an issue. Previous experiments, however, have indicated that Ne is not so low as to have a major influence on the response to selection relative to the effects of selection, even with relatively intense selection, as long as Ne is maintained above a value of 10.

An effective population size that balances the risk of random drift against a slowed response to selection can be achieved by selecting between 10 and 20 S1 or DH families per cycle of selection. When fewer than 5 families were selected, there was strong evidence that significant numbers of genes were lost due to random drift; when more than 20 were selected, the response to selection was slowed considerably.

An increase in the variability around the mean Ne as M (number of S1 families) or M′ (number of DH lines) increased was a result of random fluctuations in gene frequency.
This variation was greater for those genes that were not under the influence of selection. This indicates that the observed Ne has the ability to fluctuate dramatically as the selection intensity decreases. For those genes under the influence of selection there is less scope for
  • 60. undesirable loss of genes by chance. This was quantified in equation (8). As the effective population size multiplied by the selection coefficient (Nes) increases, the probability of the favourable alleles being fixed in the population approaches one and the probability of loss of the favourable alleles approaches zero.

Experiment 2: Determining the effects of linkage disequilibrium on gene frequency and response to selection

The impact of linkage disequilibrium in the GEP was demonstrated in experiment 2 by showing that low value genes could have a low or high frequency in the population after five cycles of selection when there was only one generation of random mating between cycles of selection.

The need to consider the effects of linkage disequilibrium in this study was prompted by the observation that, in some of the simulations conducted with G×E interaction effects in the genetic model (experiments 4 and 5), a faster response to selection was produced than with the additive model (Figure 15). This result arose because in the additive model all of the genes contributing to the attribute had small and equal effects, i.e. there were no major genes that were selected for initially to increase the response to selection. However, with the G×E interaction model the genes had different effects in different environments. The consequence of this was that there were major and minor genes within the target population of environments. The major genes increased rapidly in frequency and were fixed quickly, giving a rapid response to selection. The fate of the minor genes was a consequence of the effects of selection and linkage disequilibrium.
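The contrast between one and ten generations of random mating between cycles can be illustrated with the standard two-locus result that, in the absence of selection and drift, linkage disequilibrium D is reduced by a factor (1 − r) in each generation of random mating, where r is the recombination fraction between the two loci. The sketch below is a minimal illustration under that assumption; the function name and the starting value of D are hypothetical and are not taken from the simulations.

```python
def ld_after_random_mating(d0: float, recomb: float, generations: int) -> float:
    """Linkage disequilibrium remaining after random mating.

    Assumes D_t = D_0 * (1 - r)^t, i.e. no selection and no drift.
    """
    return d0 * (1.0 - recomb) ** generations


if __name__ == "__main__":
    d0 = 0.1  # hypothetical disequilibrium built up during a cycle of selection
    for r in (0.5, 0.1):  # unlinked loci and moderately linked loci
        one = ld_after_random_mating(d0, r, 1)
        ten = ld_after_random_mating(d0, r, 10)
        print(f"r = {r}: D after 1 generation = {one:.4f}, "
              f"after 10 generations = {ten:.6f}")
```

For unlinked loci (r = 0.5), a single generation of random mating removes half of the disequilibrium, whereas ten generations leave less than 0.1% of it. This is consistent with the closer relationship between gene value and gene frequency that was observed when ten generations of random mating were added after each cycle.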
When the effects of genes in the additive model were scaled to be proportional in relative effects in the same way as for the G×E interaction model it was possible to compare the effects of linkage disequilibrium and selection for both the additive and G×E interaction model. By cycle five, the favourable alleles of the genes in the G×E interaction model had increased to a smaller gene frequency in the population than in the additive model. This was due to G×E interaction adding a level of complexity into the selection procedure that