UNIVERSITY OF MINNESOTA
This is to certify that I have examined this bound copy of a Doctorate thesis by
John Edward McLaughlin
and have found that it is complete and satisfactory in all respects,
and that any and all revisions required by the final
examining committee have been made.
Ronald L. Phillips, Friedrich Srienc
Name of Faculty Advisers
Signature of Faculty Advisers
Date
GRADUATE SCHOOL
Genetic Analysis of Variation in Endosperm Cell Number and
Endoreduplication in Maize (Zea mays L.)
A THESIS
SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL
OF THE UNIVERSITY OF MINNESOTA
BY
John Edward McLaughlin
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Ronald L. Phillips, Friedrich Srienc, Advisers
August 2006
© John Edward McLaughlin August 2006
Acknowledgments
I would like to thank the following individuals and organizations for their im-
portant contributions to this research. First, I wish to thank my co-advisors, Drs.
Ronald Phillips and Friedrich Srienc, for their guidance on this maize endosperm
research project. I was exposed to a great variety of scientific ideas and methods
under their direction. I greatly appreciate the time that I spent in Dr. Phillips’ lab-
oratory. His laboratory fostered an environment of learning. Funding for the project
was obtained through a grant to Drs. Phillips and Srienc from the NIH Biotechnology
Training Program. In addition, I wish to thank the members of my defense commit-
tee for providing suggestions and editorial comments on the dissertation: Drs. Burle
Gengenbach, Ruth Shaw, and Deon Stuthman. Thank you for providing your time
and expertise to this project.
Mark Millard, curator of the USDA Regional Plant Introduction Station at Iowa
State University, kindly provided the teosinte accessions and seed for the other ex-
otic germplasm sources. Robert Murzyn provided the mature Tripsacum dactyloides
plants which he made available in both the greenhouse and in the field. Larry Carlson
was an excellent source of information concerning the elite X exotic germplasm work
as he had introgressed Zea diploperennis into several elite inbreds by several gen-
erations of backcrossing. Larry previously identified the short day treatment (time
frame and duration) necessary to induce Zea diploperennis to flower in the Minnesota
environment. Also, Larry provided the 50 gallon steel trash barrels used for the
short day treatment of the exotic germplasm. Benjamin Burr from the Brookhaven
National Laboratory (BNL), provided seed for both recombinant inbred line popu-
lations. Previous seed increase and maintenance of both RIL populations was per-
formed by Ronald Phillips’ laboratory group. In addition, the molecular marker
information data set used in this study was developed at BNL (currently stored at
the Maize Genetics and Genomics Database). Georgia Yerk-Davis developed the
completely randomized experimental design for the immortalized Tx303 X CO159
F2 (IF2) experiment that was grown at the Missouri Agriculture Experiment Sta-
tion in Columbia, MO. Georgia collected the kernel samples from the IF2 population
and prepared the samples for storage in ethanol. Georgia provided an additional 58
lines for the Tx303 X CO159 mapping population that were not part of the original
54 IF2 population mapping panel. In addition, Georgia provided additional molec-
ular marker data for this extended population of lines that was collected from the
University of Missouri-Columbia RFLP laboratory. This molecular marker data was
not, and currently is not, available from the Maize Genetics and Genomics Database
website. Daily weather data for the St. Paul location was provided by Dave Ruschy
of the University of Minnesota Department of Soil, Water, and Climate. Randy
Miles, of the Missouri Agricultural Experimental Station, provided daily weather
data for the Sanborn Field (Columbia, MO) location. Jack Otis from the University
of Minnesota Poultry Lab supplied the chicken red blood cells. James Holland de-
signed and programmed the SAS code for the calculation of heritability and genetic
correlation estimates based on REML and GLM methods. Dianne Harris from the
Beckman Coulter Corporation taught me how to operate and maintain the Epics
XL Flow Cytometer. In addition, Dianne provided instruction for the use of the
Epics XL SYSTEM II software. The flow cytometry work was performed in the lab-
oratory of Robert Jones with the technical support of Jeff Roessler. I would also
like to express my thanks to Bruce Bagwell, Donald Herbert, Ben Hunsberger, and
Mark Munson of Verity Software House, Inc. for their help in developing a statis-
tical model which allows the fitting and estimation of multiple ploidy peaks in the
MODFIT LT 3.0 program. Jim Halgerson from the NIRS Forage Quality Lab (De-
partment of Agronomy and Plant Genetics) at the University of Minnesota provided
instruction on the wet chemistry and NIR analysis of total kernel protein and starch.
Richard W. Kaszeta wrote and provided me with the LaTeX template (thesis.tex)
and style/support files (thesis-me.cls, me-tools.sty, menet.bst) designed to meet the
University of Minnesota Graduate School thesis formatting requirements. He also
provided excellent help to get me started using the LaTeX language for both typesetting
and the inclusion of encapsulated postscript graphic files. The LaTeX template is
available at: http://www.menet.umn.edu/~kaszeta/phdthesis (verified October 20, 2003).
The program BibTeXMng was used to develop, organize, and format the bibliography.
BibTeXMng was written by Petr and Nikolay Vabishchevich and the program is
available at: http://www.imamod.ru/~vab/bibtexmng/ (verified October 20, 2003).
Dean Flanders from the Agronomy and Plant Genetics department provided great
PC and Unix computer support. I would like to thank Richard Kowles who showed
me the practical nature of the work involved with the endosperm collection, fixation,
and nuclei preparation. Also, Richard Kowles provided support for many aspects of
this dissertation including the development of the initial experimental design and the
statistical analysis of flow cytometry data. I would also like to extend special thanks
to the following people for their contribution to this work. Suzanne Livingston and
Jayanti Suresh provided excellent laboratory and field technical support. Mike Olsen
and Cristian Vlăduțu provided many insightful discussions regarding quantitative genetics
along with generous laboratory and field help. Finally, I would like to thank
my family for their love and support.
Abstract
(323 words)
The cytogenetics of two aspects of early endosperm growth in maize, the es-
tablishment of cell number and the extent of endoreduplication of the tissue as a
whole, was studied using flow cytometry and quantitative genetic methods. Cell cy-
cle parameters of endosperm nuclei from two recombinant inbred line populations
(T232 X CM37 and CO159 X Tx303) and one immortalized F2 population (Tx303 X
CO159) were measured. Natural genetic variability and transgressive segregation for
both endosperm cell number and extent of endoreduplication were observed. Multi-
year, broad-sense heritabilities for the cytological traits were measured for the T232
X CM37 population. The heritability at 18 DAP for endosperm cell number was
estimated as 0.23 ± 0.14 and for mean endosperm ploidy as 0.43 ± 0.12 (entry mean
basis). The phenotypic correlation between endosperm cell number and mean ploidy
for the three mapping populations ranged from −0.20 ± 0.13 to −0.57 ± 0.28. After
the transition from the mitotic cell cycle to endoreduplication occurs, the prolifera-
tive capacity of that cell terminates. The negative phenotypic correlation between
the mean ploidy and cell number traits suggests that this cell cycle transition may
have a dramatic effect on several developmental processes including growth rate and
yield. A composite trait, mean total C (mean ploidy × total endosperm cell number)
(MTC), provided the most consistent phenotypic correlations with the trait mature
100 kernel weight (g) in the T232 X CM37 mapping population (0.44 ± 0.26 in 1996
and 0.45 ± 0.27 in 1997). In total, eight endosperm cell number QTLs and ten mean
ploidy QTLs were identified by composite interval mapping. The identified QTL effects
(2a) for the endosperm cell number trait ranged from 138 × 10³ to 312 × 10³
cells. The identified QTL effects (2a) for the mean ploidy trait ranged from 0.96
to 2.38 mean ploidy units (C). A better understanding of the genetics that controls
early endosperm development has implications for the improvement of seed quality
and yield.
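
As an illustration of how the composite trait MTC is formed and then compared with kernel weight, the following Python sketch multiplies mean ploidy by total cell number for each line and computes a Pearson correlation. The line means in `lines` are hypothetical placeholders, not data from this study, and the function names are introduced here only for illustration.

```python
from math import sqrt


def mean_total_c(mean_ploidy, cell_number):
    """Composite trait MTC: mean ploidy (C) x total endosperm cell number."""
    return mean_ploidy * cell_number


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical line means (NOT measurements from this thesis).
lines = [
    {"mean_ploidy": 9.5, "cells": 410_000, "kw100": 21.0},
    {"mean_ploidy": 11.2, "cells": 350_000, "kw100": 22.4},
    {"mean_ploidy": 8.1, "cells": 520_000, "kw100": 23.1},
    {"mean_ploidy": 10.4, "cells": 300_000, "kw100": 19.8},
]

mtc = [mean_total_c(l["mean_ploidy"], l["cells"]) for l in lines]
kw100 = [l["kw100"] for l in lines]
r_mtc_kw = pearson_r(mtc, kw100)
print(f"r(MTC, 100 kernel weight) = {r_mtc_kw:.2f}")
```

The standard errors reported with the correlations in the text would require the variance-component machinery of the full analysis, which this sketch does not attempt.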
Contents
List of Tables
List of Figures
Chapter 1 Literature Review
  1.1 Introduction
Chapter 2 Materials and Methods
  2.1 Plant Material and Growth Conditions
  2.2 Endosperm Sampling and Storage
  2.3 Endosperm Nuclei Preparations
  2.4 Analysis of Phenotypic Data
    2.4.1 Heritability
    2.4.2 Genetic and Phenotypic Correlations Between Traits
    2.4.3 Weather Information
  2.5 Composite Interval Mapping
  2.6 Joint Time-Related Mapping
  2.7 Flow Cytometry Instrumentation, Settings, and Measurements
  2.8 Modeling of the Cell Cycle
  2.9 Data Calculation
  2.10 Statistical Analysis of the Flow Cytometry Data
  2.11 Kernel Protein and Starch Determinations
Chapter 3 Results
  3.1 Endosperm Cytological Trait Histograms
    3.1.1 T232 X CM37 RIL
    3.1.2 Tx303 X CO159 IF2 and CO159 X Tx303 RIL
  3.2 Multi-Environment ANOVA Results
  3.3 Single-Environment ANOVA Results
  3.4 Flow Cytometric Analysis
  3.5 Heritability
  3.6 Genetic and Phenotypic Correlations Between Traits
  3.7 Quantitative Trait Analysis
    3.7.1 Endosperm Cell Number QTLs
    3.7.2 Endosperm Mean Ploidy QTLs
  3.8 Additional Sources of Genetic Variability
    3.8.1 Zea diploperennis
    3.8.2 Tripsacum dactyloides
Chapter 4 Discussion
  4.1 Overview
  4.2 Quantitative Genetic Parameters
  4.3 QTL Mapping
  4.4 Genetic Determinants of Maize Endosperm Cell Number, Extent of
      Endoreduplication Control, and Yield
  4.5 Conclusions
References
Appendix A SAS Programs
  A.1 Single Environment PROC GLM and PROC MIXED SAS Code
    A.1.1 Single Environment: PROC GLM, Randomized Complete Block (Random Model)
    A.1.2 Single Environment: PROC GLM, Randomized Complete Block (MIXED Model)
    A.1.3 Single Environment: PROC MIXED, Randomized Complete Block (Random Model)
    A.1.4 Single Environment: PROC MIXED, Randomized Complete Block (MIXED Model)
    A.1.5 Single Environment Heritability Calculation: PROC MIXED
  A.2 Multiple Environment PROC GLM and PROC MIXED SAS Code
    A.2.1 Multiple Environment Heritability Calculation: PROC GLM
    A.2.2 Multiple Environment: PROC MIXED, Randomized Complete Block (MIXED Model)
    A.2.3 Multiple Environments Heritability Calculation: PROC MIXED
  A.3 Genetic Correlation Calculation
    A.3.1 Genetic Correlation: GLM MANOVA
    A.3.2 Genetic Correlation: PROC MIXED (REML MANOVA)
Appendix B MODFIT LT 3.0 Output Example
List of Tables
2.1 Exotic Accessions Grown in St. Paul
3.1 Multi-Environment Covariance Parameter Estimate (REML) and Fixed
    Effect Solution Table for T232 X CM37 RIL Trait Mean Endosperm
    Nuclear Ploidy (18 DAP)
3.2 Multi-Environment Covariance Parameter Estimate (REML) and Fixed
    Effect Solution Table for T232 X CM37 RIL Trait Mean 100 Kernel
    Weight (g)
3.3 Multi-year and Single Year Trait Heritability Estimates - T232 X CM37
    RIL Population
3.4 Heritability for total endosperm cell number, mean endosperm nuclear
    ploidy and mean total C on a bulked-sample-basis for the immortalized
    Tx303 X CO159 F2 Population Mapping Population (St. Paul, MN
    and Columbia, MO 1996)
3.5 Single Year Trait Heritability for Total Endosperm Cell Number, Mean
    Endosperm Nuclear Ploidy and Mean Total C for the CO159 X Tx303
    RIL Population
3.6 Genetic and Phenotypic Correlations for T232 X CM37 RIL Traits
    (REML) (St. Paul, MN 1996 and 1997)
3.7 Summary of the Identified QTLs for the Trait Mean Endosperm Cell
    Number
3.8 Summary of the Identified QTLs for the Trait Mean Endosperm Nuclear
    Ploidy
List of Figures
3.1 Histograms of Mean Endosperm Cell Number at 18 DAP for the 48
    T232 X CM37 RIL Families (St. Paul, MN 1996 and 1997)
3.2 Histograms of Mean Endosperm Ploidy (MEP) measured in C units
    at 18 DAP for the 48 T232 X CM37 RIL Families (St. Paul, MN
    1996 and 1997)
3.3 Box Plots Displaying the Distribution of the Trait Mean Endosperm
    Nuclear Ploidy, within the 48 T232 X CM37 RIL Families from 14
    to 24 DAP in 1996 and 1997 (St. Paul, MN)
3.4 Observed Endosperm Mean Nuclear Ploidy for the 48 T232 X CM37
    RIL Lines for a Period of 10 Days (14, 16, 18, 20, and 24 DAP)
    Measured in St. Paul, MN (1996)
3.5 Observed Endosperm Mean Nuclear Ploidy for the 48 T232 X CM37
    RIL Lines for a Period of 10 Days (14, 16, 18, 20, and 24 DAP)
    Measured in St. Paul, MN (1997)
3.6 Histograms of 100 Kernel Weight, within the 48 T232 X CM37 RIL
    Families (St. Paul, MN 1996 and 1997)
3.7 Histograms of Total Kernel Protein and Starch Percentages, within
    the 48 T232 X CM37 RIL Families (St. Paul, MN 1996 and 1997)
3.8 Histograms of Mean Endosperm Cell Number at 16 DAP for the
    112 Immortalized Tx303 X CO159 F2 Families (St. Paul, MN and
    Columbia, MO 1996)
3.9 Histograms of Mean Endosperm Nuclear Ploidy (C) at 16 DAP for
    the 112 Immortalized Tx303 X CO159 F2 Families (St. Paul, MN
    and Columbia, MO 1996)
3.10 Histogram (Fitted Distribution) of Parental and F1 Endosperm (18
    DAP) Nuclei Together with CRBCs from Flow Cytometry Measurement
    (St. Paul, MN 1997)
3.11 Flow Cytometry Histograms of Parental (CO159 and Tx303) and
    F1 (CO159 X Tx303) Endosperm Nuclei together with CRBCs (St.
    Paul, MN 1996) Showing Endoreduplication Distribution Differences
    at 16 DAP
3.12 Regression of Log Endosperm Cell Number on Mean Nuclear Ploidy
    from the Immortalized Tx303 X CO159 F2 Population Endosperm
    Samples Collected at 16 DAP in Columbia, MO 1996
3.13 Genetic Map of the Endosperm Cell Number and Mean Ploidy QTLs
    Identified from the Three Mapping Populations
3.14 Immortalized Tx303 X CO159 F2 Population CIM: QTL Likelihood
    Maps on Chromosome 7 for the Trait Mean Endosperm Cell Number
    at 16 DAP (Columbia, MO 1996)
3.15 Ear Diversity Range: Modern Inbred (B73) to Zea diploperennis (St.
    Paul, MN 1998)
3.16 Histogram (Raw Distribution) of Zea diploperennis Nuclei at 12 DAP
    together with CRBCs from Flow Cytometry Measurement (St. Paul,
    MN 1998)
3.17 Histogram (Raw Distribution) of Zea diploperennis Nuclei at 16 DAP
    together with CRBCs from Flow Cytometry Measurement (St. Paul,
    MN 1998)
3.18 Histogram (Raw Distribution) of Tripsacum dactyloides Nuclei at 14
    DAP together with CRBCs from Flow Cytometry Measurement (St.
    Paul, MN 1998)
4.1 Epi-fluorescence Image of a 16 DAP kernel (DE2 X H99) Longitudinal
    Cryosection Stained with DAPI - Aleurone to Inner Endosperm
    Perspective
B.1 MODFIT LT 3.0 Graphical Output Displaying the Fitted CRBC and
    Endosperm Nuclei Cytogram Peaks
Chapter 1
Literature Review
1.1 Introduction
The endosperm composes approximately 80-85% of the mature maize kernel
dry weight; thus, this component of seed tissue makes a large contribution to grain
quality, composition, and yield (Wolf et al., 1952; Kowles and Phillips, 1988). The
endosperm develops rapidly and attains a high metabolic activity within a relatively
short developmental time period (Ingle et al., 1965; Kowles et al., 1992b; Larkins
et al., 2001). The harvest index for maize, the ratio of grain yield to total plant
mass, is approximately 50% (Jurgens et al., 1978; Sinclair, 1998). The relatively high
harvest index for maize indicates that a large amount of photosynthate is efficiently
converted into harvestable grain.
The control of endosperm growth depends, in part, on the control of two dis-
tinct, but related, cell cycle programs that are present during development. After a
period of syncytial karyokinesis following double fertilization, the endosperm tissue
initially grows by increasing cell number through mitosis (Kowles and Phillips, 1988).
At 8-10 DAP, a period of development marked by a high mitotic index (≈ 10%), a
fraction of endosperm cells in the central region of the tissue begin to differentiate
and undergo nuclear polyploidization through a modified cell cycle called endoreduplication
(Kowles and Phillips, 1985). Endoreduplication is a truncated cell cycle
that alternates between gap (G) and DNA synthesis (S) phases without an intervening
mitotic phase or cytokinesis (Phillips et al., 1985; Kowles and Phillips, 1985,
1988; Schweizer et al., 1995; Grafi and Larkins, 1995). Measured in haploid ploidy
units (C), a triploid endosperm nucleus in the G1 phase of the cell cycle would be
characterized as having a 3C DNA content. Cells which begin endoreduplication be-
come terminally differentiated and grow by increases in both nuclear and cytoplasmic
volume (Kowles and Phillips, 1988).
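
The C-unit bookkeeping described above can be sketched in a few lines of Python. Starting from a 3C nucleus in G1, each completed endocycle doubles the DNA content, giving the series 3C, 6C, 12C, and so on. The weighted-mean calculation below is one plausible way to summarize a ploidy distribution and is an assumption for illustration, not necessarily the exact estimator used later in this thesis; the nuclei counts are hypothetical.

```python
def c_value_after_endocycles(n_endocycles, base_c=3):
    """DNA content (in C units) of a triploid endosperm nucleus that starts
    at 3C in G1 and completes n rounds of endoreduplication."""
    if n_endocycles < 0:
        raise ValueError("number of endocycles must be non-negative")
    return base_c * 2 ** n_endocycles


def mean_ploidy(class_counts):
    """Weighted mean C value from a {C value: number of nuclei} mapping."""
    total = sum(class_counts.values())
    return sum(c * n for c, n in class_counts.items()) / total


# Ploidy classes reached after 0..4 endocycles.
classes = [c_value_after_endocycles(n) for n in range(5)]

# Hypothetical nuclei counts per ploidy class (illustrative only).
# Note: DNA content alone cannot separate a 6C nucleus in mitotic G2
# from a 6C nucleus that has completed one endocycle.
counts = {3: 500, 6: 300, 12: 150, 24: 50}
print(classes, mean_ploidy(counts))
```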
Endoreduplication is a common mechanism of genome multiplication (Brodsky
and Uryvaeva, 1985). This alternative cell cycle is present in many tissues, and
the most notable are those which have secretory and storage functions (Nagl, 1976).
Endoreduplication plays a substantial role during maize endosperm development in
terms of the growth dynamics of the tissue (Kowles and Phillips, 1985; Kowles et al.,
1990; Dilkes et al., 2002). As the endoreduplication cell cycle begins in the maize
endosperm, mitotic activity falls sharply in endosperm cells at approximately 10-12
DAP. Peripheral endosperm cells continue to divide by mitosis, but in the central
region of the endosperm the mitotic index drops to near zero after 14 DAP (Kowles
and Phillips, 1985). As many as 90% (including the 6C ploidy class) of endosperm
cells undergo endoreduplication to some extent during tissue development (Larkins
et al., 2001). The DNA content of an individual cell nucleus is correlated with nuclear
volume (Kowles and Phillips, 1985, 1988). In addition to increasing nuclear volume
during endoreduplication, the endosperm cell as a whole undergoes a substantial in-
crease in both total volume and cell size. The period of nuclear and cell enlargement
(approximately 8 to 28+ days) is temporally correlated with endosperm cell differ-
entiation, starch deposition, total endosperm RNA content, total endosperm sugar
content, and storage protein accumulation (Ingle et al., 1965; Larkins et al., 2001). In
addition, developmental gradients within the endosperm tissue are formed such that
the central cells (starchy endosperm) contain the largest nuclei and the peripheral
cells (aleurone) contain the smallest (Kowles and Phillips, 1985). The resulting en-
dosperm tissue is highly heterogeneous in terms of both nuclear DNA content and cell
size (Kowles et al., 1990). However, the role that cell cycle control plays in making the
endosperm such an efficient and productive tissue is just beginning to be understood
(Grafi and Larkins, 1995; Becraft, 2001; Leiva-Neto et al., 2004).
The molecular mechanisms that control the transition from the mitotic cell cycle
to endoreduplication are not well understood. However, cell cycle research in plants
such as maize and Arabidopsis has identified a few key regulatory steps. Grafi and
Larkins (1995) identified two major cell cycle control mechanisms that are important
in controlling the transition from a mitotic cell cycle to the endoreduplication cell cycle
in the maize endosperm. In endoreduplicating endosperm tissue, mitosis is inhibited
by a decrease in mitosis-promoting factor (MPF), and DNA replication is induced by the
activation of S phase-related kinases. Recently, Leiva-Neto et al. (2004) reported that
reducing cyclin-dependent kinase A (CDKA) activity in the maize endosperm has a
dramatic effect on the extent of endoreduplication. The ectopic expression of a dominant
negative mutant gene for CDKA reduced mean ploidy by 50% compared to the
wildtype control. Boudolf et al. (2004) used transgenic modification of Arabidopsis
to show a link between cyclin-dependent kinase B1 (CDKB1) expression and the E2F
transcription factor pathway. Plants overexpressing CDKB1 showed an enhanced
endoreduplication phenotype in the leaf blade.
The physiological and functional significance of endoreduplication is not clear.
Several investigators have attempted to correlate genome multiplication with productivity
in both plants and animals, but in many cases the comparisons are too
complex to give a definitive answer (Pearson, 1974; Barlow, 1978; Larkins et al.,
2001; Brodsky and Uryvaeva, 1985). Specific examples have been noted of greatly
increased transcription rates in tissues containing endoreduplicated cells when compared
to tissues composed of predominantly diploid cells of the same organism (Clutter
et al., 1974; Calvi et al., 1998; Scharpé and Van Parijs, 1973; Nagl, 1976). A
role for endoreduplication has been suggested in enhancing the transcriptional po-
tential, protein-synthesizing capacity, and the functional activity of a wide range of
tissues (Nagl, 1976; Barlow, 1978; D’Amato, 1984; Melaragno et al., 1993; Goverse
et al., 2000; Foucher and Kondorosi, 2000; Zhao and Grafi, 2000; Larkins et al., 2001;
Leiva-Neto et al., 2004).
It has been proposed that the productivity of crop plants can be enhanced by
modulating the degree of endoreduplication in seed tissues (Brunori et al., 1993; Inzé
et al., 2002; Nadimpalli and Simmons, 2002; Inzé et al., 2004). Cavallini et al. (1995)
noted a strong positive correlation between maize kernel protein content and the
extent of endoreduplication in the Illinois high and low protein genotypes. However,
Leiva-Neto et al. (2004) reduced the extent of maize endosperm endoreduplication by
50% through ectopic expression of a dominant negative mutant of the cyclin-dependent
kinase A gene and found only slight reductions in starch and storage protein accumulation
compared to the wildtype control.
The proportion of endosperm cells in a particular ploidy class has been shown to
be influenced by both genotypic and environmental factors (Kowles and Phillips,
1985; Artlip et al., 1995; Cavallini et al., 1995; Kowles et al., 1992a; Engelen-Eigles
et al., 2001; Dilkes et al., 2002). The maternal genotype of the plant, through both
sporophytic and zygotic influences, substantially impacts the level of endoreduplica-
tion that develops in the endosperm tissue (Kowles et al., 1997; Dilkes et al., 2002).
The extent of endoreduplication in the maize endosperm has been determined to be
heritable, and quantitative genetic components based on parent-offspring regression
have been measured (Dilkes et al., 2002). Significant components of variance re-
lated to the maternal zygotic and maternal sporophytic effects were identified. The
estimates from that study represent aggregate quantitative genetic components.
The present study builds on the previous work by using molecular markers as a
further variance-component partitioning tool. Flow cytometric analysis of endosperm
samples from three different maize mapping populations was used to collect cell number
and ploidy data. Small to moderate heritabilities for the endosperm cell number
and mean ploidy traits were found. The cytological and morphological phenotypic
data from the three mapping populations represented natural genetic variation which
was correlated with molecular marker data. The molecular marker information permits
the dissection of the total genetic variance into defined genomic regions (QTL
positions), the estimation of the magnitude of individual effects, the calculation of gene
action, and the identification of the parental (allelic) contribution of these effects. In
addition, the traits 100 kernel weight and total kernel starch/protein show significant
genetic correlations with both the endosperm cell number and mean ploidy traits.
Inclusion of both endoreduplication and cell number data in this study permits a
fuller understanding of endosperm development from a cell cycle perspective.
The objective of this study was to identify and characterize regions of the maize
genome which control two cytological aspects of endosperm development: establishment
of endosperm cell number and extent of endoreduplication. To do so, we developed
a defined method to quantify the endoreduplication phenotype using the flow
cytometry program MODFIT LT 3.0.
Specific objectives include:
• Estimation of genetic components of variation, broad-sense heritabilities, and
genetic correlations for the traits endosperm cell number and extent of en-
doreduplication.
• The identification of quantitative trait loci (QTLs) on a genome-wide basis using
composite interval mapping (CIM) methods.
• Comparison of cytological data to agronomic traits such as final kernel weight
to test correlations at the population level.
• Testing endosperm samples from Zea diploperennis and Tripsacum dactyloides
for evidence of endoreduplication.
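
For reference, broad-sense heritability on an entry-mean basis is conventionally written as the ratio of genotypic variance to the variance of an entry mean. The expression below is the standard textbook form for lines tested in several environments with replication; the symbols are chosen here for illustration, and the estimators actually used in this study are those coded in the SAS programs of Appendix A, which may differ in detail.

```latex
% Assumed notation (not taken from the thesis):
%   \sigma^2_G        genotypic (line) variance
%   \sigma^2_{GE}     genotype-by-environment interaction variance
%   \sigma^2_\varepsilon  residual (plot error) variance
%   e                 number of environments (e.g. years)
%   r                 number of replicates per environment
H^2_{\text{entry mean}}
  = \frac{\sigma^2_G}
         {\sigma^2_G + \dfrac{\sigma^2_{GE}}{e} + \dfrac{\sigma^2_\varepsilon}{r\,e}}
```

With a single environment the interaction term cannot be separated from the genotypic variance, which is one reason single-year and multi-year estimates are usually reported separately.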
Chapter 2
Materials and Methods
2.1 Plant Material and Growth Conditions
Three mapping populations were used in these studies: two recombinant in-
bred line families, T232 X CM37 and CO159 X Tx303 (Burr et al., 1988) and one
immortalized F2 population from the cross Tx303 X CO159 (Gardiner et al., 1993).
The RIL sets were grown at the University of Minnesota Experiment Station,
St. Paul, MN during the growing seasons of 1996 and 1997 (planting dates: May
12th and May 11th, respectively). The T232 X CM37 RIL set consists of 48 lines
and the CO159 X Tx303 RIL set consists of 43 lines. Each RIL set was at least
at the 11th generation of inbreeding. Each RIL set was grown in a lattice design
with two replicates. Plots consisted of hand-planted single rows of 35 plants which
were later thinned to 30 plants. Plot rows were 6.7 meters long and 76 cm apart
(58,960 plants/ha). The IF2 mapping population, composed of 112 lines, was grown
at two locations in 1996. Two replicates (18-30 plants per line per rep) designed in
a randomized complete block were grown at the University of Minnesota Experiment
Station, St. Paul, MN. In addition, two replicates of the IF2 material were grown at
the University of Missouri-Columbia Experimental Station.
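
The plot dimensions given for the RIL rows translate into a plant density as in the short Python sketch below. The helper name is hypothetical. With 30 plants per row, 6.7 m rows, and 76 cm between rows, the calculation gives roughly 58,900 plants/ha, in line with the density stated above; small differences from the reported figure may reflect rounding of the row dimensions.

```python
def plants_per_hectare(plants_per_row, row_length_m, row_spacing_m):
    """Plant density from single-row plot dimensions.

    Each row is assumed to occupy a strip of row_length_m x row_spacing_m.
    """
    area_m2 = row_length_m * row_spacing_m
    return plants_per_row / area_m2 * 10_000  # 10,000 m^2 per hectare


# Values taken from the plot description: 30 plants, 6.7 m rows, 76 cm apart.
density = plants_per_hectare(30, 6.7, 0.76)
print(f"{density:,.0f} plants/ha")
```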
In addition to the above populations, several maize accessions from the USDA
Regional Plant Introduction Station (Ames, Iowa) were tested. The single exception was
seed of Tripsacum dactyloides, which Dr. Robert Murzyn had originally obtained
from Shepherd Farms, Inc. (Clifton Hill, MO). The accessions evaluated in St. Paul
in 1996-1998 are shown in Table 2.1.
All exotic accessions were grown in St. Paul, MN in 1996, 1997, and 1998. These
accessions require the imposition of a short day environment to induce the plants to
flower in time to set seed in the Minnesota environment. The plants were covered
with 50 gallon steel trash barrels at 7 PM and uncovered at 7 AM each day for
approximately three weeks (the first three weeks of June, until the plants
were too tall to fit inside the barrels).
2.2 Endosperm Sampling and Storage
Controlled pollinations were made using standard methods. For the T232 X
CM37 RIL population, endosperm samples were collected at 14, 16, 18, 20, and 24
DAP. For both the CO159 X Tx303 RIL and IF2 populations, endosperm samples
were collected at 16 DAP. Kernels from the mid-section of the cob were immediately
placed in ethanol:propionic acid (3:1, v/v). After 24 hours, kernels were placed in
70% ethanol for storage at −20 °C. Prior to preparation, kernels were equilibrated in
35% and then 0% ethanol.
Table 2.1: Exotic Accessions Grown in St. Paul

Subspecies or Variety            Plant Name         Accession Number   Source Country
Zea mays ssp. mays               Mexico 141         Ames 19561         Mexico, Mexico
                                 Cacao Amarillo     NSL 12             Colombia
                                 Jala               NSL 2834           Mexico
                                 Ancash 515         PI 571973          Ancash, Peru
                                 Small Seed         PI 411138          China
Zea mays ssp. mexicana           Chalco teosinte    PI 384060          Oaxaca, Mexico
                                 Ames 8083          PI 8083            Federal District, Mexico
                                 Doebley 625        PI 478399          Durango, Mexico
                                 Acece              PI 566684          Mexico, Mexico
                                 Cundaz             PI 566693          Michoacan, Mexico
Zea mays var. parviglumis        El Salado Balsas   PI 384061          Guerrero, Mexico
Zea diploperennis                2265A              PI 21884           Jalisco, Mexico
Zea luxurians                    G-42               PI 21879           Chiquimula, Guatemala
                                 30919              PI 21893           Chinandega, Nicaragua
Tripsacum dactyloides (2n=36)    Eastern gamagrass  PMK-24             USA
2.3 Endosperm Nuclei Preparations
Each sample preparation consisted of nuclei from the entire endosperm of a bulk
of kernels (3-6 for RILs to 40-60 for the IF2). The kernels were sequentially equi-
librated in 50, 25 and 0% ethanol (v/v) and the endosperms were excised. First,
the entire endosperm tissue from the collection of kernels was isolated under a dis-
section scope or by using a magnifying visor. Removal of the endosperm (including
the aleurone) from the kernel was accomplished using dental dissecting instruments
in a manner that excluded the embryo, pericarp, and pedicel tissue. For each RIL
endosperm preparation, six kernels per ear were prepared and combined (bulked) to
represent a treatment sample. For the IF2 endosperm preparations, 3-6 kernels per
ear from 12 to 20 ears per replicate were bulked for further processing.
A method developed by Reddy and Daynard (1983) and modified by Myers et al.
(1990b) was used to obtain solutions of endosperm nuclei. This method was shown to
give a quantitative release of nuclei (Myers et al., 1990b). These isolated endosperm
samples were placed into a pectinase digestion solution (1 part pectinase (ICN
Biochemicals, Cleveland, OH) to 3 parts citrate-phosphate buffer (8.8 g Na2HPO4 + 3.6
g citric acid L^-1 [pH 4.0] + 0.1% [w/v] NaN3)). Endosperms were dissected from the
kernels, placed in 1 mL of pectinase solution in a capped Falcon tube, and incubated
at 37 °C until soft. The enzyme digestion time was dependent on endosperm age.
Endosperms less than 10 DAP required less than 4 hours in the pectinase solution;
older endosperms (20-24 DAP) required 8 to 12 hours to soften. Over-digestion led
to extensive nuclei loss as evidenced by observation with both microscopy and flow
cytometry. Nuclei were dispersed by forcing the tissue through an 18-gauge needle
with a syringe. The syringe and needle were then washed with the pectinase buffer
(minus the pectinase). Added to this nuclei suspension were 75 µL RNase (10 mg/mL),
approximately 10,000 chicken red blood cells (CRBCs) from a measured stock solution,
and propidium iodide to a final concentration of 1.5 µg per estimated 1000 nuclei.
Nuclei were stained for 2 h at 37 °C and then stored at 4 °C until analysis (storage
time not to exceed 24 h).
CRBCs were used to measure the average DNA content per endosperm nucleus
and to calculate the number of nuclei present in the entire endosperm and in
various sub-populations. Fresh CRBCs were obtained from the Department of Animal
Science, University of Minnesota. Trial experiments showed that propidium
iodide-stained CRBC preparations (DNA content ∼2.33 pg) could be visualized
separately from the endosperm nuclei on cytograms of log forward angle light
scatter versus log DNA-fluorescence intensity. CRBCs from a male chicken were
collected, and stained cells were counted on a hemacytometer at 40× magnification. The
counts were multiplied by the appropriate dilution factors to determine cell numbers.
Three counts of at least 300 nuclei per observation were made just prior to the
addition of CRBCs to the endosperm nuclei mixture. The CRBC stock was prepared as
follows: fresh CRBCs were diluted with 1X PBS (pH 7.3) until they could be counted
with a hemacytometer to obtain the concentration. The CRBCs were stored for up to
one month in PBS buffer at 4 °C.
2.4 Analysis of Phenotypic Data
Phenotypic data for the three traits were analyzed with SAS® software (Unix,
version 8.2; SAS Institute, Inc., 2003). Means and standard deviations were determined
for each character for the two parents, the F1 hybrid, and the RIL or IF2 lines. The
normality of each trait distribution was assessed with the Shapiro-Wilk W statistic
(Shapiro and Wilk, 1965; PROC UNIVARIATE NORMAL). For each trait, the homogeneity of
variances among environments (locations or years) was checked using Hartley's Fmax
test (Hartley, 1950).
Components of variance estimates were obtained using the general linear model
(GLM) procedure of the SAS/STAT program (SAS Institute, Inc., 2003):
\[
Y = \text{Mean} + \text{Year} + \text{Rep}(\text{Year}) + \text{Year} \times \text{Geno} + \text{Geno} \times \text{Rep}(\text{Year}) \tag{2.1}
\]
2.4.1 Heritability
Broad-sense heritability ($H^2$) estimates were obtained by dividing the genotypic
variance by the phenotypic variance (Hallauer and Miranda, 1988).
Heritability on an entry-mean basis across years was estimated as
\[
H^2 = \frac{\hat{\sigma}^2_{G}}{\hat{\sigma}^2_{G} + \dfrac{\hat{\sigma}^2_{GE}}{e} + \dfrac{\hat{\sigma}^2_{\varepsilon}}{re}} \tag{2.2}
\]
Heritability on an entry-mean basis using a single year of data was estimated as
\[
H^2 = \frac{\hat{\sigma}^2_{G}}{\hat{\sigma}^2_{G} + \dfrac{\hat{\sigma}^2_{\varepsilon}}{r}} \tag{2.3}
\]
where $\hat{\sigma}^2_{G}$ is the genotypic variance, $\hat{\sigma}^2_{\varepsilon}/(re)$ is the error
variance divided by the number of replications multiplied by the number of years, $r$ is
the harmonic mean of the number of replicates per year, and $\hat{\sigma}^2_{GE}/e$ is the
genotype by year interaction variance divided by the number of years ($e$).
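As an illustration only, Equations 2.2 and 2.3 can be sketched in a few lines of Python. The function name, argument names, and the example variance components are hypothetical and are not taken from the SAS code in Appendix A; the variance components themselves are assumed to have been estimated elsewhere.

```python
from statistics import harmonic_mean


def entry_mean_heritability(var_g, var_error, reps_per_year, var_ge=0.0):
    """Broad-sense heritability on an entry-mean basis.

    var_g         -- estimated genotypic variance
    var_error     -- estimated error variance
    reps_per_year -- list with the number of replicates in each year
    var_ge        -- estimated genotype-by-year variance (multi-year case only)
    """
    e = len(reps_per_year)              # number of years
    r = harmonic_mean(reps_per_year)    # harmonic mean of replicates per year
    if e == 1:
        # Single year of data (Eq. 2.3): no genotype-by-year term.
        phenotypic = var_g + var_error / r
    else:
        # Across years (Eq. 2.2).
        phenotypic = var_g + var_ge / e + var_error / (r * e)
    return var_g / phenotypic


# Hypothetical variance components, for illustration only.
h2_single = entry_mean_heritability(4.0, 2.0, [2])
h2_multi = entry_mean_heritability(4.0, 4.0, [2, 2], var_ge=2.0)
```

With these made-up inputs the single-year estimate is 4 / (4 + 2/2) and the two-year estimate is 4 / (4 + 2/2 + 4/4).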
Type III sums of squares from the SAS program analyses were used to obtain
the mean square estimates used in the ANOVA-based heritability estimates. Unsym-
metrical, exact 95% confidence intervals for the ANOVA-based heritability estimates
were calculated according to the method of Knapp et al. (1985). F-values from the
F(df1, df2) distribution were obtained with MacAnova (Oehlert and Bingham, 2001).
The SAS code for both the PROC GLM and PROC MIXED models can be
found in Appendix A.1 (page 109) for the single environment design and Appendix
A.2 (page 115) for the multiple environment design.
Holland et al. (2003) presented the PROC MIXED SAS program code for the
estimation of heritability on an entry (family) mean basis including the code to cal-
culate the associated standard errors. Approximate standard errors for heritability
estimates calculated with PROC MIXED were estimated using the delta method
(Lynch and Walsh, 1998; Holland et al., 2003). Holland et al. (2001) also presented
methods to calculate heritability estimates based on a bulked-sample-basis. Heri-
tability on a bulked-sample-basis was estimated for cell cycle components from the
IF2 mapping populations, because these traits were measured on samples of kernels
bulked from two replicate plots per location rather than individual plots. This was
done to reconstitute the F2 phenotype from the bulked IF2 line. In this case, ex-
perimental error among bulked samples within a location was not estimated, and
genotype-by-environment interaction (confounded with experimental error) served as
the residual variance from the analysis over locations of bulked sample values.
Heritability on a bulked-sample-basis was calculated as:
\[
\hat{H}_{\text{bulked-sample-basis}} = \frac{\hat{\sigma}^2_{G}}{\hat{\sigma}^2_{G} + \hat{\sigma}^2_{GE}} \tag{2.4}
\]
The SAS code, with minor changes, is also presented in Appendix A, section
A.1.5 on page 113 for the single environment case and section A.2.3 on page 117 for
the multiple environment case.
2.4.2 Genetic and Phenotypic Correlations Between Traits
The genetic and phenotypic correlations among traits were estimated from mul-
tivariate analysis of variance (MANOVA) using PROC GLM of the SAS/STAT pro-
gram (SAS Institute, Inc., 2003). The genetic correlation between traits x and y is
estimated as
\[
\hat{r}_{G_{xy}} = \frac{\hat{\sigma}_{G_{xy}}}{\hat{\sigma}_{G_x}\,\hat{\sigma}_{G_y}} \tag{2.5}
\]
where $\hat{\sigma}_{G_{xy}}$ is the estimated genotypic covariance between traits x and y, and
$\hat{\sigma}_{G_x}$ and $\hat{\sigma}_{G_y}$ are the estimated genotypic standard deviations for
traits x and y, respectively. For the RIL populations, agronomic traits are represented
by line means per replicate and cytological traits are represented by pooled kernel
samples from one plant per replicate.
The phenotypic correlation between traits x and y is estimated as
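\[
\hat{r}_{P_{xy}} = \frac{\hat{\sigma}_{P_{xy}}}{\hat{\sigma}_{P_x}\,\hat{\sigma}_{P_y}} \tag{2.6}
\]
where $\hat{\sigma}_{P_{xy}}$ is the estimated phenotypic covariance between traits x and y, and
$\hat{\sigma}_{P_x}$ and $\hat{\sigma}_{P_y}$ are the estimated phenotypic standard deviations for
traits x and y, respectively.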
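Equations 2.5 and 2.6 share the same form: a covariance component divided by the product of two standard deviations. A minimal Python sketch, assuming the variance and covariance components have already been estimated (for example by MANOVA or REML), is given below. The function name and the numbers are hypothetical.

```python
import math


def component_correlation(cov_xy, var_x, var_y):
    """Correlation between traits x and y from (co)variance components.

    Works for either the genotypic (Eq. 2.5) or the phenotypic (Eq. 2.6)
    correlation, depending on which components are supplied.
    """
    if var_x <= 0 or var_y <= 0:
        raise ValueError("variance components must be positive")
    return cov_xy / math.sqrt(var_x * var_y)


# Hypothetical components, for illustration only.
r_g = component_correlation(cov_xy=3.0, var_x=4.0, var_y=9.0)
```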
Genotypic and phenotypic variance components and associated standard errors
for each trait were estimated separately using the restricted maximum likelihood
method in SAS PROC MIXED, considering all effects in the model except
the intercept to be random effects (Holland et al., 2003). Standard errors were
calculated based on the formula presented by Mode and Robinson (1959). Lynch and
Walsh (1998) and Holland et al. (2003) further elaborate on the calculation of standard
errors for genetic variance components. For the RIL populations, there is no genetic
variance among individuals within the group, thus the phenotypic and environmental
correlations are functionally equivalent (Lynch and Walsh, 1998).
The SAS program code for the estimation of genetic correlation (using MANOVA
and REML MANOVA) and the associated standard errors, developed by James Holland,
is available on his website at North Carolina State University
(URL: http://www4.ncsu.edu/%7Ejholland/correlation/correlation.html).
The SAS code for the phenotypic and genotypic correlations is also presented
in Appendix A, section A.3.1 (page 118) for the PROC GLM method and A.3.2
(page 123) for the PROC MIXED method.
2.4.3 Weather Information
Growing degree units (GDUs) and precipitation data were obtained from the
University of Minnesota Agricultural Experiment Station (St. Paul) and the Univer-
sity of Missouri Agricultural Experimental Station (Sanborn). Accumulated growing
degree day units (GDUs) were calculated according to the formula
\[
\mathrm{GDU} = \frac{T_{\max} + T_{\min}}{2} - 10\,^{\circ}\mathrm{C} \tag{2.7}
\]
where $T_{\max}$ and $T_{\min}$ are the daily maximum and minimum temperatures in °C;
10 °C was set as the minimum temperature and 30 °C was set as the maximum temperature
if the actual temperatures exceeded these limits. The GDU and precipitation values
were summed from planting to the day of pollination and from pollination to the
endosperm sampling day.
Precipitation data were organized into two categories: precipitation from polli-
nation to sampling (PTS) and total accumulated precipitation (TAP) from planting
to the sampling period.
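The GDU accumulation in Equation 2.7 can be sketched as follows. This is an illustrative Python version, not the procedure used by the experiment stations; it assumes that both the daily maximum and the daily minimum are clamped to the 10 to 30 °C range before averaging, which is one common reading of the limits stated above. The function names and temperature values are hypothetical.

```python
def daily_gdu(t_max_c, t_min_c, base_c=10.0, cap_c=30.0):
    """Growing degree units for one day (Eq. 2.7).

    Assumption: temperatures below base_c are raised to base_c and
    temperatures above cap_c are lowered to cap_c before averaging.
    """
    t_max = min(max(t_max_c, base_c), cap_c)
    t_min = min(max(t_min_c, base_c), cap_c)
    return (t_max + t_min) / 2.0 - base_c


def accumulated_gdu(daily_temps):
    """Sum daily GDUs over a list of (max, min) temperature pairs in deg C."""
    return sum(daily_gdu(t_max, t_min) for t_max, t_min in daily_temps)


# Hypothetical three-day record: a hot day, a mild day, and a cold day.
example_total = accumulated_gdu([(32.0, 8.0), (25.0, 15.0), (9.0, 5.0)])
```

The same accumulation would be run twice per sample, once from planting to pollination and once from pollination to the sampling day.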
2.5 Composite Interval Mapping
The computer program PLABQTL (Utz and Melchinger, 1996, 2000) was used
to identify QTLs based on the maize linkage maps and phenotypic data. PLABQTL
performs composite interval mapping (CIM) by combining an interval mapping ap-
proach (Lander and Botstein, 1989) and regression methods with the use of selected
markers as covariates. CIM in PLABQTL is based on multiple regression using marker
cofactors preselected by stepwise regression (Haley and Knott, 1992).
Cofactors were selected by a stepwise regression procedure in PLABQTL. Em-
pirical, genome-wide LOD threshold levels (α = 0.05) were established based on 1000
permutations of the final CIM model containing the preselected cofactors from step-
wise regression (Doerge and Churchill, 1996; Doerge and Rebai, 1996). The threshold
applies to a two-sided test in which alleles from either parental strain may increase
or decrease the mean trait value under analysis. Akaike’s information criterion (AIC)
was utilized during the mapping procedure as a stopping rule in selecting subsets of
regression variables and for the selection of the most probable model (Sakamoto et
al., 1986 in Jansen, 1993). A penalty of 3 for the AIC score, as recommended by
the authors, was used to select the final markers used as cofactors in the analysis
(Utz and Melchinger, 2000). Models that had AIC values larger than 3 were deemed
significantly different. Final selection was for the QTL model that minimized the
AIC. Jansen (1993) discussed the use of the AIC in both the cofactor selection and
QTL model selection process. The percentage of the genotypic variance explained by
the multi-locus QTL model was calculated as the coefficient of determination ($R^2$)
divided by the broad-sense heritability ($H^2$). The standard error of this statistic is
calculated under the assumption of known heritability (Utz and Melchinger, 2000).
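The idea behind the empirical, genome-wide permutation threshold described above can be illustrated with a simplified Python sketch. It is not PLABQTL's algorithm: it uses single-marker regression without cofactors, converts the squared correlation to a LOD score as -(n/2) log10(1 - r^2), and records the maximum LOD over all markers for each shuffled phenotype vector. All function names, the genotype coding, and the seed are assumptions made for illustration.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD score from a single-marker regression of phenotype on genotype code."""
    n = len(phenotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((p - mean_p) ** 2 for p in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0
    # Cap r^2 just below 1 so the logarithm stays finite.
    r2 = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)
    return -(n / 2.0) * math.log10(1.0 - r2)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical genome-wide LOD threshold from phenotype permutations.

    markers is a list of per-marker genotype lists (e.g. -1/1 for RILs).
    Each permutation shuffles phenotypes across lines, which breaks any
    marker-trait association, and stores the largest LOD over all markers.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(marker_lod(m, shuffled) for m in markers))
    maxima.sort()
    index = min(int(math.ceil((1.0 - alpha) * n_perm)) - 1, n_perm - 1)
    return maxima[index]
```

A marker would then be declared significant when its LOD in the unpermuted data exceeds the value returned by `permutation_threshold`.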
The additive effect (a) is reported as half the difference between the genotypic
values of the two homozygotes at the putative QTL locus (Utz and Melchinger, 2000).
The dominance effect reflects the genotypic value of the heterozygote relative to the
two homozygotes at that locus. The dominance effect calculated for the IF2 repre-
sents the difference between the mean of the heterozygous class and the mean of the
homozygous classes at a given QTL. The additive and dominance effects calculated
from multiple regression in PLABQTL were tested for significance by comparing the
partial sums of squares term from the regression (1 DOF) to the residual sum of
squares from the regression ANOVA for the entire model. The significance of the
genetic effect is tested by performing an F-test on this ratio.
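The definitions of the additive and dominance effects given above can be written out as a short Python sketch. It works directly from genotype-class means at a single locus rather than from the multiple regression used in PLABQTL, and it assumes that "the mean of the homozygous classes" is the average of the two homozygote means. The function name, the sign convention (first parent minus second parent), and the trait values are hypothetical.

```python
from statistics import mean


def additive_dominance(values_p1_homozygote, values_heterozygote, values_p2_homozygote):
    """Additive (a) and dominance (d) effects from genotype-class trait values."""
    m1 = mean(values_p1_homozygote)
    mh = mean(values_heterozygote)
    m2 = mean(values_p2_homozygote)
    a = (m1 - m2) / 2.0          # half the difference between the homozygotes
    d = mh - (m1 + m2) / 2.0     # heterozygote relative to the homozygote average
    return a, d


# Hypothetical trait values for the three genotype classes at one QTL.
a_effect, d_effect = additive_dominance([10.0, 12.0], [12.0, 14.0], [6.0, 8.0])
```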
2.6 Joint Time-Related Mapping
The computer program JZmapqtl in the QTL Cartographer (Basten and Zeng,
2002) suite of mapping programs was used for mapping the joint likelihood QTL pro-
file of a single trait measured across time. JZmapqtl is an extension of CIM (Zeng,
1993, 1994) and allows multiple traits to be analyzed simultaneously (Jiang and Zeng,
1995). Following Wu et al. (1999), JZmapqtl was used for time-related mapping
(TRM). Instead of using separate, correlated traits as input into the JZmapqtl algo-
rithm, a single trait measured across five time points was used (repeated measures
approach). Stepwise regression using the forward-backward search (F-to-enter and
exit was set at p = 0.10) approach was used to identify cofactors. The Perl script
Permute.pl (Basten, 2003) was used to obtain the joint and single trait genomic ex-
perimentwise thresholds.
2.7 Flow Cytometry Instrumentation, Settings, and Measurements
A Coulter Epics® MXL (Coulter Corp., Hialeah, FL) with an argon ion laser
operating at 488 nm was used for flow cytometric analysis of the maize endosperm
nuclei. Samples were vortexed immediately prior to analysis to prevent
sedimentation. Forward angle light scatter (FALS) and right angle light
scatter (RALS) data were collected in both linear (FS and SS, respectively) and
logarithmic (FSLog and SSlog, respectively) modes. The photomultiplier tube 3 detector
was set to collect nuclei fluorescence data in both area (integrated or total
fluorescence) and peak (peak fluorescence measured using the AUX designation) mode.
The area mode data were defined as FL3-Propidium Iodide (FL3 PI) because these data
represent the fluorescence intensity signal from the propidium iodide nuclear stain.
Because of the large variation in the particles' properties, signals were expressed
with a logarithmic transformation (FL3 Log-PI). Nuclei were measured at a flow rate
of approximately 25-200 nuclei per second. A minimum threshold (discriminator) to
trigger event data collection was set using CRBCs. The AUX (FL3 Peak) channel was set
to exclude events with fluorescence intensities that fell below the CRBC peak.
Exclusion of doublets was performed by plotting the ratio AUX / FL3 Log-PI versus
FL3 Log-PI and by excluding events with high integrated and low peak signals. In
addition, debris was gated based on Log Forward Light Scatter (FSLog) versus FL3
Log-PI cytograms. Events that fell outside of the main CRBC and nuclei clusters were
excluded from the analysis.
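The doublet exclusion described above relies on the fact that two nuclei passing the laser together give a high integrated (area) signal but a comparatively low peak signal. A minimal Python sketch of such a ratio gate is shown below. It is not the gating performed on the instrument or in MODFIT LT; the function name, the event representation, and the cutoff value are hypothetical.

```python
def gate_singlets(events, min_peak_to_area):
    """Keep events whose peak/area fluorescence ratio is at or above a cutoff.

    events is a list of (area, peak) fluorescence pairs. Events with a high
    area but low peak signal (low ratio) are treated as doublets and dropped.
    """
    return [
        (area, peak)
        for area, peak in events
        if area > 0 and peak / area >= min_peak_to_area
    ]


# Hypothetical events: the second one has twice the area but a similar peak.
example_events = [(100.0, 50.0), (200.0, 52.0), (100.0, 48.0)]
singlets = gate_singlets(example_events, min_peak_to_area=0.4)
```

In practice the cutoff would be chosen by inspecting the ratio cytogram rather than fixed in advance.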
For statistical analysis of the cytometric parameters (cell number and ploidy
peak areas), gated signals were displayed as one-parameter histograms in logarithmic
mode. The data were further gated using software (MODFIT LT 3.0) to eliminate
nuclear debris from the analysis. The DNA amount, proportional to the fluorescence
signal, is expressed as arbitrary C values in which the 1C value comprises the DNA
content of the unreplicated haploid chromosome complement. Signals obtained from
maize leaf tissues were used to adjust the gain settings so that signals from all intact
nuclei were registered within the channel range.
At least 10,000 nuclei (20,000 non-debris events set to terminate the run) were
analyzed for each sample and every determination was made in duplicate. MODFIT
LT 3.0 (Verity Software House, Inc.) flow cytometry software was used to analyze
the DNA ploidy histograms. Data obtained from these programs were subjected to
statistical analysis using the SAS program and further processed for QTL mapping
purposes. Nuclei number was determined by flow cytometry. A known concentration
of CRBCs was added to each of the nuclei preparation samples prior to the flow
cytometric analysis.
2.8 Modeling of the Cell Cycle
Single Gaussian distributions were fit to the CRBC and each DNA C peak using
the flow cytometry software MODFIT LT 3.0. The model is built upon a series of
Gaussian curves which are fit to each DNA ploidy peak¹. Non-linear regression
using the Marquardt Compromise method was used to minimize the mean square
error (MSE) for the entire distribution space (Bagwell, 1993). The majority of debris
was gated out from the list-mode data before entering the modeling software. Based
on PI fluorescence data alone it is difficult to delimit the S-phase from the main ploidy
peaks in flow cytograms derived from endosperm nuclei analysis. In this study, the
S-phase was not separately analyzed. Instead, the main ploidy peaks were fit using
normal curves which include the unknown fraction of the S-phase.
2.9 Data Calculation
For each experimental set, nuclei from leaf tissue or embryo samples were mea-
sured first to determine the location of 3C in terms of the channel number (between
2C and 4C nuclei from either leaf or embryo samples). The total nuclei number per
endosperm was calculated according to the ratio of CRBCs to total nuclei number
(Schweizer, 1992). From this total number, plus the proportion of each nuclei pop-
ulation related to DNA content (3C, 6C, 12C, 48C, and 96C), the actual 3C nuclei
number, 6C nuclei number, and up to 96C (the highest peak DNA content in nuclei
regularly observed for the four inbred parents used in this study) were calculated.
The equations for these calculations are the following:
\[
N_{\text{total}} = N_{\text{CRBC added}} \times \left(\frac{N_{\text{total}}}{N_{\text{CRBC}}}\right)_{\text{FCM}} \tag{2.8}
\]
\[
N_{3C} = N_{\text{total}} \times \left(\frac{N_{3C}}{N_{\text{total}}}\right)_{\text{FCM}} \tag{2.9}
\]
Where
• N_total = total nuclei number
• N_CRBC = the number of CRBCs
• N_3C = 3C nuclei number
• N_6C = 6C nuclei number
• ... N_384C = 384C nuclei number
Appendix B on page 129 includes an example of the graphical output from the
MODFIT LT 3.0 model.
¹ A model to fit the endoreduplication cell cycle ploidy pattern with the MODFIT LT 3.0
program was built in collaboration with Ben Hunsberger and Mark Munson of Verity
Software (Topsham, Maine).
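Equations 2.8 and 2.9 can be illustrated with a short Python sketch. It assumes that the FCM subscript denotes event counts recorded by the flow cytometer, so that the known number of CRBCs added to a preparation scales the counted events up to absolute nuclei numbers. The function name and the event counts are hypothetical.

```python
def nuclei_numbers(crbc_added, crbc_events, nuclei_events_by_class):
    """Absolute nuclei numbers from a CRBC-spiked preparation.

    crbc_added             -- number of CRBCs added to the preparation
    crbc_events            -- CRBC events counted by flow cytometry
    nuclei_events_by_class -- dict mapping ploidy class (e.g. "3C") to events
    """
    total_events = sum(nuclei_events_by_class.values())
    # Eq. 2.8: scale the nuclei-to-CRBC event ratio by the CRBCs added.
    n_total = crbc_added * (total_events / crbc_events)
    # Eq. 2.9 (and its analogues): apportion the total by class proportion.
    per_class = {
        ploidy: n_total * (events / total_events)
        for ploidy, events in nuclei_events_by_class.items()
    }
    return n_total, per_class


# Hypothetical counts for one preparation.
n_total, per_class = nuclei_numbers(
    crbc_added=10000,
    crbc_events=500,
    nuclei_events_by_class={"3C": 6000, "6C": 3000, "12C": 1000},
)
```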
2.10 Statistical Analysis of the Flow Cytometry Data
The analysis of variance of the cytological data from each mapping population
was calculated with SAS (SAS Institute, Inc., 2003). The variance was partitioned
to better estimate the portion of genetic variation by removing the variation due to
other factors, such as block effects, sampling time, and weather covariates. For the
estimation of heritability, all effects - including the genotype - were considered to be
random. By definition, the heritability equations call for the genotype effect to be set
as a random factor. However, for the estimation of trait means to be included in the
QTL mapping algorithms, a different ANOVA structure was used. The randomized
complete block analysis (mixed model) consisted of both fixed effects (RIL lines or
genotypes, weather covariate, and sampling time) and random effects (years, blocks
nested in year, year×genotype, and error). The RIL genotype effect was classified as a
fixed effect because each RIL line is highly inbred and can be readily and consistently
multiplied for repeated experimentation across locations and years. In addition, each
RIL genotype can be considered as potentially valuable genetic material for further
study based on the results of QTL analysis. For each ANOVA, residual plots were
made to determine if the fitted data were normally distributed. Endosperm cell
number data were log transformed for data analysis and then back-transformed to
report results. For each trait, an F-test was performed to determine the significance
of the year effect. When the F-tests were significant (p < 0.05), the two years of
data were analyzed separately. The effect of year on genotype rank was tested by
examining the genotype×year interaction term.
Analysis of covariance was used to estimate the effects of sampling time and
the environment (GDU and precipitation) on the maize endosperm growth charac-
teristics (cell number and mitotic/endoreduplication components). To minimize the
variance in cell number and mean ploidy attributed to sampling time, these traits were
standardized (when the particular covariate was significant) to the mean GDU and
precipitation values. If the environmental cofactor was significant, the least square
means were adjusted and subsequently used in the genetic mapping programs.
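The covariate standardization described above can be illustrated with a deliberately simplified Python sketch. It fits one pooled least-squares slope of the trait on a single covariate (for example GDU) and shifts every observation to the covariate mean. This is not the SAS analysis of covariance or its least square means, which also account for genotype, block, and year effects; the function name and the data are hypothetical.

```python
def adjust_to_covariate_mean(trait, covariate):
    """Shift trait values to what they would be at the mean covariate value.

    Uses one pooled ordinary least-squares slope of trait on covariate.
    """
    n = len(trait)
    mean_x = sum(covariate) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        return list(trait)  # covariate is constant, nothing to adjust
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(covariate, trait)
    ) / sxx
    return [y - slope * (x - mean_x) for x, y in zip(covariate, trait)]


# Hypothetical data in which the trait rises linearly with the covariate.
adjusted = adjust_to_covariate_mean([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

In this made-up example all of the trait variation is explained by the covariate, so the adjusted values coincide.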
The SAS code for the estimation of trait means for the genotype term for each
individual RIL in the single environment design is presented in Appendix A, section
A.1.4 on page 112. The SAS code for the estimation of trait means for the geno-
type term for each individual RIL in the multiple environment design is presented in
Appendix A, section A.2.2 on page 115.
2.11 Kernel Protein and Starch Determinations
The major carbon sinks in the maize kernel are storage products: starch
(endosperm), protein (endosperm), and oil (embryo). Mature kernel tissue contains
approximately 66% starch and 15% protein on a dry weight basis (Doehlert, 1990).
Kernel protein and starch content of mature kernels were measured to compare these
traits with the cytological measurements (cell number and extent of endoreduplication)
taken during an earlier phase of endosperm development. Approximately 23 g of
seed from each replicate line of the 1997 T232 X CM37 RIL population was ground to
pass through a 1-mm screen and dried at 60 °C for 24 h. The ground meal was placed
in a secure container (large pill box) and tumbled for 45 min to homogenize each
sample. Ground samples were analyzed using a Foss North America Model 6500 Near
Infrared Reflective Spectrophotometer (NIRS). NIRS was used to estimate crude
protein levels of the 1997 T232 X CM37 RIL population. The percentage total crude
protein and starch content were estimated using an NIRS contrived corn grain
equation (idcgrfe.equ, Infrasoft International). The commercial corn grain equation was
monitored by measuring micro-Kjeldahl crude protein levels of 27 experimental lines
from 1997. Samples were measured in duplicate and a nitrogen-to-protein conversion
factor of 5.70 was used to determine crude protein (5.70 was selected to correct
for non-protein nitrogen assuming a 17.5% nitrogen content of maize kernel protein,
1/0.175 ≈ 5.7). Micro-Kjeldahl-determined crude protein levels were regressed on
NIRS-determined crude protein levels in order to check the calibration of the
commercial corn grain equation. The percentage total starch content was determined
directly from the commercial equations.
Chapter 3
Results
3.1 Endosperm Cytological Trait Histograms
3.1.1 T232 X CM37 RIL
Trait means of the parental inbreds T232 and CM37 differed (p < 0.05) for
mean endosperm cell number (MECN) and mean endosperm ploidy (MEP) when
measured at 18 DAP in St. Paul, MN for both the 1996 and 1997 field seasons. The
trait distributions for MECN and MEP of the T232 X CM37 recombinant inbred lines
(RILs), including markers for the parental means, are shown in Figures 3.1 (a and b)
and 3.2 (a and b), respectively. The F1 values are included in the 1997 histogram. As
a measure of the variability of the measurements, we also report the standard error
of the mean.
Transgressive segregation was detected in the population of 48 RIL lines for
the two cytological characteristics measured in both years (Figures 3.1 a and b and
3.2 a and b). The phenotypic values from the 48 RIL lines from the T232 X CM37
mapping population measured for MECN at 18 DAP in St. Paul, MN in 1996 (Figure
3.1a) and 1997 (Figure 3.1b) approximate normal distributions, with Shapiro-Wilk
test statistics of 0.9808 (p = 0.175) and 0.9809 (p = 0.2766), respectively. The cell
number for the endosperm samples collected in 1996 ranged from 5.24 × 10^5 cells to
11.59 × 10^5 cells, with a mean endosperm cell number for the population at 18 DAP
of 7.86 × 10^5 cells. The cell number for the endosperm samples collected in 1997
ranged from 4.00 × 10^5 cells to 12.93 × 10^5 cells, with a mean endosperm cell number
for the population at 18 DAP of 9.44 × 10^5 cells. The phenotypic values from the
48 RIL lines from the T232 X CM37 mapping population measured for MEP at 18
DAP in St. Paul, MN in 1996 and 1997 deviate from normality, with Shapiro-Wilk
test statistics of 0.9248 (p < 0.0001) and 0.9673 (p = 0.039), respectively. Normal
probability plots of ordered data vs. rankits for these two data sets reveal fairly
straight lines, indicating that the deviation from normality is not extreme in either
case. The mean ploidy for the endosperm samples collected in 1996 ranged from 9.96
C to 19.22 C with a population mean of 13.45 C. The mean ploidy for the endosperm
samples collected in 1997 ranged from 5.17 C to 14.97 C with a population mean of
9.30 C. Figure 3.3 displays two box plots that represent the mean endosperm ploidy
data from the 48 T232 X CM37 RILs collected in 1996 (a) and 1997 (b) at the 14 to
24 DAP developmental stages. The observed endosperm mean nuclear ploidy trait
data for the 48 T232 X CM37 RIL lines for the 10-day period (14, 16, 18, 20, and
24 DAP) measured in St. Paul, MN are presented in Figure 3.4 (1996) and Figure 3.5
(1997). In addition, 100-kernel weight data were obtained in both years (Figures 3.6
a and b). The traits percentage total kernel protein and kernel starch content from
the 1997 harvest are shown in Figures 3.7 a and b.
Figure 3.1: Histograms displaying the T232 X CM37 RIL frequency distribution of the trait mean endosperm cell number (MECN)
at 18 DAP for the (A.) 1996 and (B.) 1997 field seasons (St. Paul, MN). The trait values represent the means of
two replications. A bulk preparation of 3-6 endosperm samples from a single ear represents a replicate. The absolute
frequencies are shown along the y-axis. The trait values are shown on the x-axis.
Panel annotations: (a) 1996: T232 7.03 × 10^5 ± 5.0 × 10^4; CM37 9.45 × 10^5 ± 8.6 × 10^4. (b) 1997: T232 7.8 × 10^5 ± 4.5 × 10^4;
CM37 8.9 × 10^5 ± 6.2 × 10^4; F1 12.2 × 10^5 ± 9.2 × 10^4.
Figure 3.2: Histograms displaying the T232 X CM37 RIL frequency distribution of the trait mean endosperm ploidy (MEP)
measured in C units at 18 DAP for the (A.) 1996 and (B.) 1997 field seasons (St. Paul, MN). The trait values
represent the means of two replications. A bulk preparation of 3-6 endosperm samples from a single ear represents a
replicate. The absolute frequencies are shown along the y-axis. The trait values are shown on the x-axis.
Panel annotations: (a) 1996: T232 12.8 ± 0.6; CM37 15.3 ± 1.0. (b) 1997: T232 12.76 ± 0.45; CM37 15.79 ± 0.64; F1 9.60 ± 0.33.
Figure 3.3: Box plots displaying the distribution of the trait mean endosperm nuclear ploidy, within the 48 T232 X CM37
RIL families from 14 to 24 DAP in 1996 (a) and 1997 (b) (St. Paul, MN).
Axes: Stage of Endosperm Development (DAP) on the x-axis; Endosperm Mean Nuclear Ploidy (C) on the y-axis.
Figure 3.4: Observed endosperm mean nuclear ploidy for the 48 T232 X CM37 RIL lines for a period of 10 days (14, 16,
18, 20, and 24 DAP) measured in St. Paul, MN (1996). The key below the X-axis, labeled RIL ID, refers to
individual RIL genotypes in the T232 X CM37 population.
Axes: Days After Pollination on the x-axis; Observed Mean Nuclear Ploidy (C) on the y-axis; panels keyed by RIL ID (1-48).
Figure 3.5: Observed endosperm mean nuclear ploidy for the 48 T232 X CM37 RIL lines during a period of 10 days (14,
16, 18, 20, and 24 DAP) measured in St. Paul, MN (1997). The key below the X-axis, labeled RIL ID, refers
to individual RIL genotypes in the T232 X CM37 population.
Axes: Days After Pollination on the x-axis; Observed Mean Nuclear Ploidy (C) on the y-axis; panels keyed by RIL ID (1-48).
Figure 3.6: Histograms displaying the T232 X CM37 RIL frequency distribution of the trait 100 Kernel Weight (g) for the (A.)
1996 and (B.) 1997 field seasons (St. Paul, MN). The trait values represent the means of two replications. The
absolute frequencies are shown along the y-axis. The trait values are shown on the x-axis.
Panel annotations: (a) 1996: T232 26.5 ± 2.3; CM37 25.8 ± 2.1. (b) 1997: T232 31.8 ± 2.0; CM37 28.5 ± 1.4; F1 28.6 ± 0.8.
Figure 3.7: Histograms displaying the T232 X CM37 RIL frequency distribution of the traits total kernel protein percentage (A.)
and total kernel starch percentage (B.) collected in the 1997 field season (St. Paul, MN). The trait values represent
the means of two replications. The absolute frequencies are shown along the y-axis. The trait values are shown on
the x-axis.
Panel annotations: (a) protein: T232 12.20 ± 0.74; CM37 14.61 ± 0.71. (b) starch: T232 63.0 ± 0.4; CM37 61.2 ± 0.7.
3.1.2 Tx303 X CO159 IF2 and CO159 X Tx303 RIL
Characteristic trait values of the parental inbreds CO159 and Tx303 differed
(p < 0.05) for MECN and MEP measured at 16 DAP in the St. Paul, MN and
Columbia, MO locations during the 1996 field season. The trait distributions for
MECN and MEP of the CO159 X Tx303 immortalized F2 (IF2), including markers
for the parental means, are shown in Figures 3.8 and 3.9, respectively. Phenotypic
distributions of the CO159 X Tx303 RIL populations grown in 1996 and 1997 are not
shown.
Transgressive segregation was detected in the population of 112 IF2 lines for
the two cytological characteristics measured in both locations: MECN (Figure 3.8)
and MEP (Figure 3.9). The phenotypic values from the 112 IF2 lines from the
CO159 X Tx303 mapping population measured for MECN at 16 DAP in St. Paul,
MN (Figure 3.8a) and Columbia, MO (Figure 3.8b) during the 1996 field season
display non-normal unimodal distributions. The IF2 line data for MECN deviated
from normality, in the direction of positive skewness, in both locations (Columbia
(p = 0.0034) and St. Paul (p = 0.007)), calculated on the basis of the Shapiro-
Wilk statistic (Shapiro and Wilk, 1965). However, plots of ordered data vs. rankits
(expected order of the data assuming the sample was from a normal population)
revealed straight lines indicating the deviation from normality was not extreme. A
log transformation of MECN from both locations normalized the data distribution
based on the Shapiro-Wilk test statistic in both cases (p > 0.05). It should be noted
here that quantitative trait values follow a mixture of distributions that approximate
a normal distribution as increasing numbers of loci contribute to the total effect.
For example, the non-normality of quantitative trait distributions is expected in the
presence of a small number of segregating QTLs of major effect (Doerge and Churchill,
1996). One possibility is that the log-normal distribution well describes the actual
distribution of this trait for this mapping population. Another possibility is that the
non-normal distribution of the raw cell number data reflects the limited sampling due
to the small population size.
The cell number for the endosperm samples collected from the IF2 population
grown at the St. Paul, MN location in 1996 ranged from 3.49 × 10^5 to 12.8 × 10^5 with
a population mean of 6.50 × 10^5 (Figure 3.8a). The cell number for the endosperm
samples collected from the Columbia, MO location in 1996 ranged from 5.43 × 10^5
to 13.63 × 10^5 with a mean endosperm cell number for the population at 16 DAP of
8.75 × 10^5 (Figure 3.8b). The IF2 population MEP at 16 DAP (St. Paul location)
ranged from 7.2 to 13.5 C with a population mean of 9.95 C (Figure 3.9a). The IF2
population mean endosperm ploidy at 16 DAP (Columbia, MO location) ranged from
9.43 to 15.60 C with a population mean of 12.03 C (Figure 3.9b).
The phenotypic values from the 41 RIL lines from the CO159 X Tx303 mapping
population measured for MECN at 16 DAP in St. Paul, MN 1996 deviate from
normality with a Shapiro-Wilk test statistic of 0.9174 (p < 0.001). The cell number
for the endosperm samples collected in 1996 ranged from 3.62 × 10^5 cells to 16.94 × 10^5
cells with a mean endosperm cell number for the population at 16 DAP of 8.72 × 10^5
cells. The phenotypic values from the 41 RIL lines from the CO159 X Tx303
mapping population measured for MEP at 16 DAP in St. Paul, MN 1996 deviate
from normality with a Shapiro-Wilk test statistic of 0.9520 (p = 0.0052). Plots of
ordered data vs. rankits reveal fairly straight lines indicating that the deviation from
normality is not extreme. The mean ploidy of the endosperm samples collected
in 1996 ranged from 7.34 C to 19.98 C with a population mean of 11.19 C. The
phenotypic values from the 41 RIL lines from the CO159 X Tx303 mapping population
measured for MEP at 16 DAP in St. Paul, MN 1997 deviate from normality with a
Shapiro-Wilk test statistic of 0.9237 (p < 0.0001). Plots of ordered data vs. rankits
reveal fairly straight lines indicating that the deviation from normality is not extreme.
The mean ploidy of the endosperm samples collected in 1997 ranged from 4.65
C to 11.91 C with a population mean of 7.42 C.

Figure 3.8: Histograms displaying the Tx303 X CO159 IF2 frequency distribution of the
trait mean endosperm cell number at 16 DAP from the (A.) St. Paul, MN and (B.)
Columbia, MO locations in 1996. The IF2 line trait values represent a single replication
from a bulked endosperm sample from each individual IF2 line. The absolute frequencies
are shown along the y-axis. The trait values are shown on the x-axis.
Values annotated on the panels (x-axis: Endosperm Cell Number at 16 DAP; y-axis: Frequency):
(a) St. Paul, MN 1996: F1 9.63 × 10^5 ± 3.5 × 10^4; CO159 7.00 × 10^5 ± 3.8 × 10^4; Tx303 9.75 × 10^5 ± 5.1 × 10^4
(b) Columbia, MO 1996: F1 7.32 × 10^5 ± 6.4 × 10^4; CO159 7.30 × 10^5 ± 4.8 × 10^4; Tx303 9.60 × 10^5 ± 6.3 × 10^4

Figure 3.9: Histograms displaying the Tx303 X CO159 IF2 frequency distribution of the
trait mean nuclear ploidy (C) at 16 DAP from the (A.) St. Paul, MN and (B.) Columbia,
MO locations in 1996. The IF2 line trait values represent a single replication from a
bulked endosperm sample from each individual IF2 line. The absolute frequencies are
shown along the y-axis. The trait values are shown on the x-axis.
Values annotated on the panels (x-axis: Mean Endosperm Nuclear Ploidy; y-axis: Frequency):
(a) St. Paul, MN 1996: CO159 11.69 ± 0.33; Tx303 9.70 ± 0.61; F1 12.4 ± 0.55
(b) Columbia, MO 1996: CO159 12.06 ± 0.29; Tx303 11.72 ± 1.17; F1 12.97 ± 0.09
3.2 Multi-Environment ANOVA Results
Three traits, the two cytological traits and one agronomic trait (100 kernel
weight) from the T232 X CM37 RIL population were analyzed across years (1996
and 1997). The randomized complete block analysis (mixed model) consisted of both
fixed effects (RIL lines or genotypes) and random effects (years, blocks nested in year,
year × genotype, and error). Analysis was performed using the SAS/STAT PROC
MIXED program (SAS Institute, Inc., 2003). The MECN genotype effect estimate
(p = 0.0540) was not different from zero at α = 0.05. The REML fixed effect estimate
for the genotype effect was greater than zero for MEP and 100 kernel weight (p < 0.0001).
Multi-environment REML ANOVA tables for MEP and kernel weight are presented
in Tables 3.1 and 3.2, respectively. The year and year × genotype terms were not detected
(α = 0.05) for any of the three traits, with the exception of the year × genotype term
for 100 kernel weight (p = 0.0021).
Table 3.1: Multi-environment covariance parameter estimate (REML) and fixed
solution table for T232 X CM37 RIL trait mean endosperm nuclear ploidy
(18 DAP).

Covariance parameter   Estimate   Standard Error   Z value   Pr > Z
Year                   7.99       11.55            0.69      p = 0.2446
Rep(Year)              0.23       0.34             0.67      p = 0.2501
Year × Genotype        0.56       0.70             0.79      p = 0.2143
Error                  4.24       0.66             6.41      p < 0.0001

Fixed Effect   Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept      9.28       2.32             128   4.00      p < 0.0001

Type III Test
Fixed Effect   Num. DF   Den. DF   F Value   Pr > F
Genotype       47        128       2.43      p < 0.0001
Table 3.2: Multi-environment covariance parameter estimate (REML) and fixed
effect solution table for T232 X CM37 RIL trait mean 100 kernel weight
(g).

Covariance parameter   Estimate   Standard Error   Z value   Pr > Z
Year                   0.03       0.39             0.07      p = 0.4702
Year × Genotype        6.78       2.36             2.87      p = 0.0021
Error                  7.61       1.16             6.56      p < 0.0001

Fixed Effect   Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept      36.29      2.3              129   15.75     p < 0.0001

Type III Test
Fixed Effect   Num. DF   Den. DF   F Value   Pr > F
Genotype       46        129       2.52      p < 0.0001

The Rep(Year) variance component estimate was negative and was removed from
the model.
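The Z values and Pr > Z entries for the covariance parameters in Tables 3.1 and 3.2 appear to be consistent with a one-sided Wald test, in which each estimate is divided by its standard error and compared with the standard normal distribution. The following minimal sketch recomputes two rows from the published (rounded) estimates. The function name is hypothetical, and the one-sided interpretation is an assumption inferred from the column heading rather than a statement of the exact SAS computation.

```python
from statistics import NormalDist

def wald_z(estimate, std_error):
    """Return the Wald Z ratio and its one-sided upper-tail probability."""
    z = estimate / std_error
    p_one_sided = 1.0 - NormalDist().cdf(z)
    return z, p_one_sided

# (estimate, standard error) pairs copied from Tables 3.1 and 3.2.
rows = {
    "Year, MEP (Table 3.1)": (7.99, 11.55),
    "Year x Genotype, 100 kernel weight (Table 3.2)": (6.78, 2.36),
}

for label, (estimate, std_error) in rows.items():
    z, p = wald_z(estimate, std_error)
    print(f"{label}: Z = {z:.2f}, Pr > Z = {p:.4f}")
```

Because the inputs are rounded to two decimals, the recomputed values agree with the tabulated ones only to within rounding.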
3.3 Single-Environment ANOVA Results
The three weather covariates (GDU; PTS, precipitation from pollination to
sampling; and TAP, total accumulated precipitation from planting to the sampling
period) were added as fixed factors (1 DF) to each cytological trait ANOVA model
for the three mapping populations. None of the covariates tested in the T232 X CM37
cytological trait models for the 1996 and 1997 data were detected at α = 0.05. Tests
for the genotype term in the T232 X CM37 cytological trait models using single-
year REML ANOVA models identified a genotype effect for MEP (p = 0.0008, 1996;
p = 0.0310, 1997) but not (α = 0.05) for MECN (p = 0.1370, 1996; p = 0.0937, 1997).
The REML estimate for the genotype effect for the MECN trait at 16 DAP
(1996) in the CO159 X Tx303 RIL mapping population was detected at p = 0.0004.
Neither the GDU nor the precipitation covariates were detected in ANCOVA models. The
genotype effect term for MEP (16 DAP) was detected at p < 0.0001 in 1996. For the
trait MEP, the PTS precipitation covariate was detected (p = 0.0050) in a REML
ANCOVA model (a positive relationship was found between PTS and MEP). The
interaction between the covariate and the genotype term was not detected
(p = 0.6330) in the ANCOVA model, suggesting that, in general, the slope of the
regression on PTS does not differ among genotypes. Due to the nature of the
IF2 experimental design (bulked kernels, combined replicates), it was not possible
to test for the genotype term (a calculation which would permit
the estimation of an environmental component from the total phenotypic variance)
for either cytological trait. Genetic mapping proceeded without these tests as if the
bulked IF2 phenotypic data represented, when pooled within lines, an F2 population.
Simple linear regression was used to test and measure the effect of the three co-
variates on the IF2 cytological traits for the IF2 populations grown in both Columbia,
MO and St. Paul, MN in 1996. An association between GDUs and log endosperm cell
number for the population grown in Columbia, MO was detected. The heat unit data
accounted for approximately 9.0% of the trait variation (R = +0.28, R^2 = 9.0%;
p < 0.0001). Precipitation data (PTS) also served as a significant cofactor for the
endosperm cell number trait measured from kernel samples collected from the Columbia,
MO location. However, in a multiple regression model containing both covariates,
only the GDU term was detected. There was no relationship between log endosperm
cell number and GDU for the population grown in St. Paul (p = 0.579). The regression
on all three covariates with both sets of MEP data indicated no relationships
between the trait and the cofactors (α = 0.05). The population segregates for days to
anthesis and the cytological characteristics are known to be influenced by phenology
factors such as accumulated GDUs. The covariate correction allows for an equitable
comparison of cytological characteristics of samples collected on different dates and
has the potential to increase precision.
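The covariate regression and correction described above can be sketched as follows. The GDU and log cell number values are hypothetical, the function name is illustrative, and the adjustment shown (subtracting the fitted covariate effect relative to the covariate mean) is one common form of covariate correction assumed here; it is not necessarily the exact procedure used in the SAS analysis.

```python
import math
from statistics import mean

def simple_linear_regression(x, y):
    """Least-squares fit of y on x; returns slope, intercept, R and R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r

# Hypothetical accumulated GDUs at sampling and log10 endosperm cell numbers.
gdu = [850.0, 870.0, 890.0, 910.0, 930.0, 950.0]
log_cells = [5.85, 5.93, 5.88, 5.97, 5.92, 6.01]

slope, intercept, r, r_squared = simple_linear_regression(gdu, log_cells)
print(f"slope = {slope:.5f}, R = {r:+.2f}, R^2 = {100 * r_squared:.1f}%")

# Covariate-corrected trait values: remove the fitted GDU effect, centred on
# the mean GDU so that the overall trait mean is unchanged.
gdu_mean = mean(gdu)
adjusted = [y - slope * (x - gdu_mean) for x, y in zip(gdu, log_cells)]
```

After the correction, the adjusted values no longer show a linear trend with the covariate, so lines sampled at different heat unit accumulations can be compared on a common footing.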
Covariate analysis is based on the assumption that there is a constant regression
relationship among the different treatments (in this case, genotypes). This assump-
tion can be tested by examining the heterogeneity of slopes and was checked by using
PROC GLM in SAS. The significance of the genotypes X GDUs term tests for this as-
pect of the ANCOVA analysis. However, it is not possible to test for the heterogeneity
of slopes using the bulked IF2 data so the adjustments were performed without this
check.
3.4 Flow Cytometric Analysis
Figure 3.10 shows a panel of endosperm nuclei histograms (fitted histogram
data) from the inbred parents T232 and CM37 and their F1, sampled at 18 DAP. ModFit
LT 3.0 was used to model the CRBC and endosperm nuclei ploidy peaks (Gaussian
distributions) using non-linear regression methods. Using multi-parameter analysis,
two gates were set to eliminate both debris and nuclei doublets. The peak areas were
used to calculate the number of nuclei (cells) in each ploidy class.
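As an illustration of how per-class nuclei counts can be turned into the summary values annotated on the flow cytometry histograms, the sketch below computes a count-weighted mean ploidy and the fraction of nuclei above 6C. The peak counts are hypothetical, and the definition of mean nuclear ploidy as a count-weighted average of the C values is an assumption made for this sketch; the formal trait definitions are given in the methods chapter.

```python
def summarize_ploidy(nuclei_per_class):
    """Summarize a ploidy distribution.

    nuclei_per_class maps a C value (3, 6, 12, ...) to the number of nuclei
    assigned to that peak. Returns the total number of nuclei, the
    count-weighted mean C value, and the fraction of nuclei above 6C.
    """
    total = sum(nuclei_per_class.values())
    mean_c = sum(c * n for c, n in nuclei_per_class.items()) / total
    above_6c = sum(n for c, n in nuclei_per_class.items() if c > 6) / total
    return total, mean_c, above_6c

# Hypothetical nuclei counts derived from fitted peak areas.
peaks = {3: 4000, 6: 2500, 12: 2000, 24: 1000, 48: 400, 96: 100}

total, mean_c, above_6c = summarize_ploidy(peaks)
print(f"nuclei: {total}, mean ploidy: {mean_c:.2f} C, fraction > 6C: {above_6c:.3f}")
```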
Figure 3.10: Histogram (fitted distribution) of parental and F1 endosperm (18 DAP)
nuclei together with CRBCs from flow cytometry measurement (St. Paul, MN 1997).
Ploidy peak areas calculated with ModFit LT 3.0.
Panels (x-axis: Channels (FL3 LOG PI); y-axis: Number; labeled peaks: CRBCs, 3C, 6C,
12C, 24C, 48C, 96C and, for the two parents, 192C):
(a) T232 endosperm nuclei (18 DAP): endosperm cell number 8.10 × 10^5; mean nuclear ploidy 13.58; fraction of nuclei > 6C 0.348
(b) CM37 endosperm nuclei (18 DAP): endosperm cell number 9.20 × 10^5; mean nuclear ploidy 16.71; fraction of nuclei > 6C 0.463
(c) T232 X CM37 F1 endosperm nuclei (18 DAP): endosperm cell number 11.6 × 10^5; mean nuclear ploidy 9.66; fraction of nuclei > 6C 0.308
Figure 3.11 shows a panel of endosperm nuclei histograms from the inbred parent
CO159 (A and B), the F1 (C and D), and inbred parent Tx303 (E and F) sampled
at 16 DAP. The left-hand side of the panel contains the raw histogram data (A,C,E)
and the right-hand side of the panel contains the fitted histogram data (B,D,F).
Figure 3.11: Flow cytometry histograms of parental (CO159 and Tx303) and F1
(CO159 X Tx303) endosperm nuclei together with CRBCs (St. Paul,
MN 1996) showing endoreduplication distribution differences at 16
DAP. The left panel (A,C,E) represents the raw histogram data for
CO159, the F1, and Tx303, respectively. The fitted data (colored
histograms) are presented in (B,D,F).
Panels (x-axis: Channels (FL3 LOG-PI); y-axis: Number; labeled peaks: CRBC, 3C, 6C,
12C, 24C, 48C, 96C).
3.5 Heritability
Low to moderate¹ heritability estimates were found for the cytological traits in
all three mapping populations. Broad-sense heritability ($H^2$) estimates were made
by dividing the genotypic variance ($\hat{\sigma}^2_G$) by the phenotypic variance
($\hat{\sigma}^2_P$). The phenotypic variance, also known as the total variance, is
composed of the total genotypic variance (additive, dominance, and epistatic variance)
and the environmental variance. For RILs, the component of total genetic variance
($\hat{\sigma}^2_G$) more directly estimates additive genetic variance ($\hat{\sigma}^2_A$)
since the influence from dominance effects in this highly inbred material is absent. If the
epistatic variance ($\hat{\sigma}^2_I$) and maternal variance ($\hat{\sigma}^2_M$)
components are small, then the broad-sense heritability estimate approaches
the narrow-sense heritability ($h^2$), which is defined as the ratio
$\hat{\sigma}^2_A / \hat{\sigma}^2_P$. Epistasis occurs
between two (or more) loci when the effects of alleles at one locus depend on which
alleles are present at the other loci. Generally, QTL analyses have detected epistasis
by the presence of a significant interaction term between two loci from a two-factor
ANOVA. Cheverud and Routman (1995) make a distinction between this "statistically"
defined epistasis and what they call "physiological" epistasis. Physiological
epistasis more closely estimates the effect of genetic interactions on the physiology
and development of the organism. To detect physiological epistasis, Cheverud and
Routman (1995) partition the genotypic/phenotypic data into three categories: raw
genotypic data, non-epistatic values (each independent of the alternate locus genotype),
and epistatic values (deviations of the two-locus genotypic values from the
non-epistatic values) for each of the nine possible two-locus genotypic classes (F2
example). Although statistics are obviously used to detect and define both types of
epistatic interactions, the distinction is important because physiological epistasis can
make substantial contributions to the additive, dominance, and interaction variance
components (Cheverud and Routman, 1995; Doebley et al., 1995; Eshed and Zamir,
1996; Lark et al., 1995; Li et al., 1997; Lefebvre and Palloix, 1996). The methods
that detect statistical epistasis neglect contributions to the additive and dominance
values and variance components. Unfortunately, epistasis was not estimated in this
study due to the small size of the populations used for genetic mapping.

¹ Low to moderate heritability here is defined as an approximate range from 0.0 to 0.50.
Heritability estimates for all traits measured for the T232 X CM37 RIL population
are presented on an entry mean basis (ANOVA and REML) in Table 3.3. Heritabilities
calculated across years on an entry mean basis (equation (2.2)) ranged from 0.23 for
MECN (18 DAP) to 0.79 for days to 50% pollen shed (REML method). Heritabilities
calculated from single year data on an entry mean basis (equation (2.3)) ranged from
a low of 0.16 for MTC at 18 DAP (1996) to a high of 0.85 for 100 kernel weight (1997)
(REML method).
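For orientation, the variance ratios discussed in this section can be written compactly. The first two lines restate the definitions given in the text, with $\hat{\sigma}^2_D$ and $\hat{\sigma}^2_E$ introduced here as symbols for the dominance and environmental variance. The third line is a commonly used textbook form of heritability on an entry mean basis across years and is included only as an assumption for illustration: $\hat{\sigma}^2_{GY}$ (genotype × year variance), $\hat{\sigma}^2_e$ (error variance), $y$ (number of years) and $r$ (replications per year) are symbols chosen here, and the expression may differ in detail from equations (2.2) and (2.3).

```latex
\begin{align}
\hat{\sigma}^2_P &= \hat{\sigma}^2_G + \hat{\sigma}^2_E, \qquad
\hat{\sigma}^2_G = \hat{\sigma}^2_A + \hat{\sigma}^2_D + \hat{\sigma}^2_I \\
H^2 &= \frac{\hat{\sigma}^2_G}{\hat{\sigma}^2_P}, \qquad
h^2 = \frac{\hat{\sigma}^2_A}{\hat{\sigma}^2_P} \\
H^2_{\text{entry mean}} &=
  \frac{\hat{\sigma}^2_G}
       {\hat{\sigma}^2_G + \hat{\sigma}^2_{GY}/y + \hat{\sigma}^2_e/(r\,y)}
\end{align}
```

Under this form, averaging over more years or replications shrinks the non-genetic terms in the denominator, which is why entry mean heritabilities are generally higher than plot basis estimates.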
Table 3.3: Multi-year and single year heritability estimates - T232 X CM37 RIL
population. NS indicates not significant - the confidence interval for the heritability
estimate included the value zero.

                                            ANOVA                    REML
Trait                                       Entry Mean Basis         Entry Mean Basis
                                            H^2 with Exact 95% CI    H^2 ± SE
Combined Years (1996 and 1997)
Mean Endosperm Cell Number (18 DAP)         NS                       0.23 ± 0.14
Mean Endosperm Nuclear Ploidy (18 DAP)      0.53 (0.23, 0.71)        0.43 ± 0.12
Mean Total C (18 DAP)                       0.41 (0.04, 0.64)        0.27 ± 0.10
Kernel Weight                               0.47 (0.13, 0.67)        0.47 ± 0.13
Days to 50% Pollen Shed                     0.74 (0.57, 0.84)        0.79 ± 0.05
1996
100 Kernel Weight                           0.68 (0.47, 0.80)        0.60 ± 0.11
Total Endosperm Cell Number (18 DAP)        NS                       0.20 ± 0.12
Mean Endosperm Nuclear Ploidy (18 DAP)      0.60 (0.34, 0.75)        0.51 ± 0.12
Mean Total C (18 DAP)                       NS                       0.16 ± 0.12
1997
100 Kernel Weight                           0.89 (0.83, 0.94)        0.85 ± 0.04
Total Kernel Protein Content                0.86 (0.77, 0.91)        0.82 ± 0.054
Total Kernel Starch Content                 0.87 (0.78, 0.92)        0.83 ± 0.05
Total Endosperm Cell Number (18 DAP)        NS                       0.26 ± 0.13
Mean Endosperm Nuclear Ploidy (18 DAP)      0.46 (0.11, 0.67)        0.40 ± 0.16
Mean Total C (18 DAP)                       NS                       0.18 ± 0.14
For the cell cycle components estimated using flow cytometry, heritability estimates
on a bulked-sample basis (Holland et al., 2001), calculated from the IF2 data from
St. Paul, MN and Columbia, MO using equation (2.4), are listed in Table
3.4. Because the bulked-sample analysis indicated a large (p < 0.001) location effect
(represented by the bulked replication term) for all three
traits, the data were analyzed separately for the two locations. Heritability estimates
for the CO159 X Tx303 RIL traits measured in 1996 are presented in Table 3.5.
Table 3.4: Heritability on a bulked-sample-basis for the immortalized Tx303 X CO159
F2 mapping population (St. Paul, MN and Columbia, MO 1996).

Trait                                       ANOVA                    REML
(bulked data from each location)            Entry Mean Basis         Plot Basis
                                            H^2 with Exact 95% CI    H^2 ± SE
1996
Total Endosperm Cell Number (16 DAP)        0.21 (−0.28, 0.51)       0.16 ± 0.07
Mean Endosperm Nuclear Ploidy (16 DAP)      0.17 (−0.36, 0.49)       0.14 ± 0.10
Mean Total C (16 DAP)                       0.19 (−0.39, 0.52)       0.12 ± 0.07
Table 3.5: Single year trait heritability for total endosperm cell number, mean
endosperm nuclear ploidy and mean total C for the CO159 X Tx303 RIL population.

                                            ANOVA                    REML
Trait                                       Entry Mean Basis         Entry Mean Basis
                                            H^2 with Exact 95% CI    H^2 ± SE
1996
Total Endosperm Cell Number (16 DAP)        0.50 (0.18, 0.69)        0.56 ± 0.08
Mean Endosperm Nuclear Ploidy (16 DAP)      0.60 (0.31, 0.77)        0.77 ± 0.07
Mean Total C (16 DAP)                       0.64 (0.41, 0.78)        0.53 ± 0.08
3.6 Genetic and Phenotypic Correlations Between Traits
Genetic (REML MANOVA, equation (2.5)) and phenotypic correlations (REML
MANOVA, equation (2.6)) for 1996 and 1997 T232 X CM37 RIL trait data are listed
in Table 3.6. In general, the GLM and REML estimates are similar so only the REML
results are presented. Genetic correlations from single year data ranged from a low of
−0.65 ± 0.53 for the traits kernel protein percentage and MECN at 18 DAP (1997) to
a high of 0.99 ± 0.52 for the traits 100 kernel weight and MECN at 18 DAP (1996). The
standard errors for the cytological trait correlations are high relative to the standard
errors for the genetic/phenotypic correlations estimated for the agronomic traits. In
addition, standard errors for the genetic/phenotypic correlations are much higher
than the standard errors of the heritability estimates. This is because
correlations are multivariate in nature and require substantially larger population
sizes to achieve comparable standard error estimates (Lynch and Walsh, 1998).
The traits endosperm cell number (log transformation) and mean ploidy (16
DAP) from the Tx303 X CO159 endosperm samples collected in Columbia, MO were
negatively correlated (R = −0.32, p < 0.0001), see Figure 3.12. The traits log
MECN and MEP from the St. Paul location data (data not shown) were not related
(R = −0.12, p = 0.2607).
The correlation between MEP and MECN (1996) was estimated from the CO159
X Tx303 RIL mapping population data (16 DAP). The PROC MIXED method for
the computation of the genetic correlation failed to converge. The failure of the
PROC MIXED genetic correlation to converge is likely due to the small sample size
of this mapping population. The genetic correlation between MEP and MECN was
estimated to be −0.22 ± 0.14 using the PROC GLM method developed by Holland
et al. (2001). The phenotypic correlation was estimated to be −0.20 ± 0.13 on an
entry-by-environment basis.
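A genetic correlation of the kind reported here is, in its usual textbook form, the genotypic covariance between two traits divided by the geometric mean of their genotypic variances. The sketch below uses hypothetical variance and covariance components and a hypothetical function name; it is not the PROC MIXED or PROC GLM procedure itself, and equations (2.5) and (2.6) may differ in detail.

```python
import math

def genetic_correlation(cov_g_xy, var_g_x, var_g_y):
    """Genotypic covariance scaled by the genotypic standard deviations.

    The ratio is undefined when either genotypic variance estimate is zero
    or negative, which can happen with small populations.
    """
    if var_g_x <= 0 or var_g_y <= 0:
        raise ValueError("genotypic variance estimates must be positive")
    return cov_g_xy / math.sqrt(var_g_x * var_g_y)

# Hypothetical genotypic (co)variance components for two traits.
r_g = genetic_correlation(cov_g_xy=-0.9, var_g_x=4.0, var_g_y=2.25)
print(f"genetic correlation = {r_g:.2f}")
```

Because the estimate is a ratio of three estimated components, small errors in the variance estimates translate into large errors in the correlation, which is in line with the large standard errors reported for the cytological traits.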
Figure 3.12: Regression of log endosperm cell number on mean nuclear ploidy from
the immortalized Tx303 X CO159 F2 population endosperm samples
collected at 16 DAP in Columbia, MO 1996.
Plot annotations (x-axis: Mean Endosperm Nuclear Ploidy; y-axis: Log Endosperm Cell
Number): R^2 = 0.10, R = −0.32, p < 0.0001.
Table 3.6: Genetic and phenotypic correlations for T232 X CM37 RIL traits (St.
Paul, MN 1996 and 1997). Multivariate mixed model analysis (REML).
The traits kernel starch total and kernel protein total were derived by
multiplying total kernel starch and protein percentage with the 100 kernel
weight trait. NS indicates not significant - the confidence interval for the
correlation estimate included the value of zero.

Trait Combination                                           Year   Genotypic          Phenotypic
                                                                   Correlation ± SE   Correlation ± SE
Endosperm Cell Number / Mean Ploidy (18 DAP)                1996   −0.63 ± 0.29       −0.57 ± 0.28
                                                            1997   NS                 NS
Kernel Weight / Mean Ploidy (18 DAP)                        1996   −0.47 ± 0.27       −0.36 ± 0.22
                                                            1997   NS                 NS
Kernel Weight / Endosperm Cell Number (18 DAP)              1996   0.99 ± 0.52        0.44 ± 0.26
                                                            1997   NS                 NS
Kernel Weight / Mean Total C (18 DAP)                       1996   0.65 ± 0.46        0.44 ± 0.26
                                                            1997   NS                 0.45 ± 0.27
Mean Ploidy (18 DAP) / Kernel Protein Percentage            1997   −0.39 ± 0.27       −0.32 ± 0.22
Endosperm Cell Number / Kernel Protein Percentage           1997   −0.65 ± 0.53       −0.43 ± 0.26
Mean Ploidy (18 DAP) / Kernel Starch Percentage             1997   0.53 ± 0.27        0.46 ± 0.22
Endosperm Cell Number (18 DAP) / Kernel Starch Percentage   1997   NS                 NS
Mean Total C (18 DAP) / Kernel Protein Total                1997   NS                 NS
Mean Total C (18 DAP) / Kernel Starch Total                 1997   NS                 0.48 ± 0.27
3.7 Quantitative Trait Analysis
Figure 3.13 presents a genome-wide summary of the MECN and MEP QTLs
identified from all three mapping populations. QTL analysis was performed using
the phenotypic data from the individual years (1996 and 1997). Although the year
and year × genotype terms from the multi-environment ANOVA analysis were not
different from 0 for either of the cytological traits in the T232 X CM37 mapping
population, the cytological phenotypic data were not pooled for QTL analysis across
years. By not pooling the data, differences in QTL location and effect per year may
be detected. The power to detect QTLs is increased by CIM due to the reduction in
the error variance when significant marker cofactors are present in the QTL model.
In addition, although the study focused on the 18 DAP MEP data for heritability
and genetic/phenotypic correlation estimations, the full range of MEP data (14 to
24 DAP) were used for QTL detection (Joint CIM and CIM from individual DAP
stages). QTL mapping for both the Tx303 X CO159 IF2 and CO159 X Tx303 RIL
populations is based on individual year (location) data. Initial single marker analysis
using linear regression identified the most likely major QTLs and additional potential
cofactors for composite interval mapping (CIM). Linear regression and conventional
interval mapping (IM) results are not reported because in many cases, QTLs were
not identified until additional marker cofactors were included in the mapping model.
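The initial single marker analysis by linear regression can be sketched as follows. This is not composite interval mapping: it only regresses the phenotype on one marker at a time and converts the fit into a LOD score. The genotype coding (−1 and +1 for the two homozygous classes of a RIL population), the data, and the function name are hypothetical choices for illustration; with this coding the regression slope estimates the additive effect.

```python
import math
from statistics import mean

def single_marker_lod(genotypes, phenotypes):
    """Regress phenotype on a marker coded -1/+1.

    Returns the slope (additive effect under this coding), the proportion of
    phenotypic variance explained, and the LOD score comparing the marker
    model with the mean-only model.
    """
    n = len(phenotypes)
    g_bar, p_bar = mean(genotypes), mean(phenotypes)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    sxy = sum((g - g_bar) * (p - p_bar) for g, p in zip(genotypes, phenotypes))
    rss_null = sum((p - p_bar) ** 2 for p in phenotypes)
    slope = sxy / sxx
    rss_marker = rss_null - slope * sxy
    r_squared = 1.0 - rss_marker / rss_null
    lod = (n / 2) * math.log10(rss_null / rss_marker)
    return slope, r_squared, lod

# Hypothetical marker genotypes and mean ploidy values for ten lines.
genotypes = [-1, -1, -1, -1, -1, 1, 1, 1, 1, 1]
ploidy = [9.0, 10.0, 9.5, 10.5, 9.0, 11.0, 12.0, 10.5, 11.5, 11.0]

additive, r_squared, lod = single_marker_lod(genotypes, ploidy)
print(f"additive effect = {additive:.2f}, R^2 = {r_squared:.2f}, LOD = {lod:.2f}")
```

Markers with the largest LOD scores in such a scan are natural candidates for cofactors in a subsequent composite interval mapping model.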
Figure 3.13: Genetic map of the endosperm cell number and mean ploidy QTLs identified
from the three mapping populations. Linkage distances were calculated with Mapmaker
QTL and are based on the T232 X CM37 mapping population. The box height adjacent to
the chromosome represents an approximate 3 LOD support interval for the identified
QTL. QTL regions are marked by white boxes (endosperm cell number) and black boxes
(endosperm mean ploidy). The percentage phenotypic variance explained, the additive
value (A), and the maximum LOD score for each QTL is listed within the box. The
population ID is listed below the box.
[Map of chromosomes 1 to 10 showing marker names, map positions (cM) and bin
assignments; legend: Endosperm Cell Number QTLs, Endosperm Mean Ploidy QTLs.]
3.7.1 Endosperm Cell Number QTLs
Table 3.7 includes all of the MECN QTLs identified by CIM from the three
mapping populations. The polygenic nature of the trait is evident from the putative
QTLs identified across the genome. No common endosperm cell number-related QTLs
were identified across the years in the T232 X CM37 RIL population. This may indicate
QTL × environment interaction, since the magnitude of a particular QTL effect can
change depending on environmental conditions. An alternative explanation is
that the sample size for this population is small and the power to identify
QTLs is low for all but the most major QTLs. Thus, only the QTLs of largest effect in
a given year are identified by these QTL analyses. Both major and minor
QTLs can be expected to be missed because of QTL × genotype interaction, the
small sample size, and sampling error.
Composite interval mapping identified one QTL that significantly influenced
MECN from kernel samples collected from the T232 X CM37 mapping population
in St. Paul in 1996. The QTL identified on chromosome 2 is marked by ici99 (bin
2.06) and has an average negative effect of allele substitution of 81 × 10^3 endosperm
cells (T232 direction). The QTL region identified on chromosome 2 met the p < 0.01
experimentwise threshold. The final QTL model, including this one QTL region,
accounted for 24.2 ± 10.8% of the phenotypic variance and 89.5 ± 39.9% of the
genotypic variance. Two MECN QTLs were identified using the 1997 data. Both QTLs
exceeded the α = 0.1 experimentwise threshold (LOD 3.26) but only the QTL on
chromosome 8 exceeded the α = 0.05 threshold (LOD 3.63). The QTL identified on
chromosome 5 is marked by npi288a (bin 5.08) and has an average negative effect
of allele substitution of 76 × 10^3 endosperm cells (T232 direction). The QTL region
identified on chromosome 8 is marked by csu96b (bin 8.08) and has an average
negative effect of allele substitution of 88 × 10^3 endosperm cells (T232 direction). The
final QTL model, including both QTL regions, accounted for 27.6 ± 11.6% of the
phenotypic variance and 78.7 ± 33.1% of the genotypic variance.
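Experimentwise LOD thresholds such as those quoted above are derived empirically by permutation analysis: the phenotypes are shuffled relative to the marker genotypes many times, the genome-wide maximum LOD is recorded for each shuffle, and the threshold is taken as an upper quantile of those maxima. The sketch below illustrates the idea with single-marker regression LOD scores on simulated data. The marker data, the phenotypes, the number of permutations and the function names are all hypothetical, and the mapping software used in the study applies the same principle to composite interval mapping rather than to the simple scan shown here.

```python
import math
import random
from statistics import mean

def lod_score(genotypes, phenotypes):
    """LOD from regressing the phenotype on a marker coded -1/+1."""
    n = len(phenotypes)
    g_bar, p_bar = mean(genotypes), mean(phenotypes)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    sxy = sum((g - g_bar) * (p - p_bar) for g, p in zip(genotypes, phenotypes))
    rss_null = sum((p - p_bar) ** 2 for p in phenotypes)
    rss_marker = rss_null - sxy ** 2 / sxx
    return (n / 2) * math.log10(rss_null / rss_marker)

def permutation_threshold(markers, phenotypes, alpha, n_perm=200, seed=1):
    """Empirical experimentwise LOD threshold at level alpha.

    Shuffling the phenotypes breaks any marker-trait association while
    keeping the marker correlation structure intact.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(lod_score(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]

# Simulated population: 40 lines, 25 unlinked markers, no true QTL.
rng = random.Random(0)
n_lines, n_markers = 40, 25
balanced = [-1] * (n_lines // 2) + [1] * (n_lines // 2)
markers = [rng.sample(balanced, n_lines) for _ in range(n_markers)]
phenotypes = [rng.gauss(10.0, 1.0) for _ in range(n_lines)]

lod_05 = permutation_threshold(markers, phenotypes, alpha=0.05)
lod_01 = permutation_threshold(markers, phenotypes, alpha=0.01)
print(f"5% threshold: LOD {lod_05:.2f}; 1% threshold: LOD {lod_01:.2f}")
```

In practice far more permutations than the 200 used here are run so that the upper quantiles are estimated stably.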
Composite interval mapping identified one QTL that significantly influenced
MECN from kernel samples collected from the CO159 X Tx303 RIL mapping population
in St. Paul in 1996. The QTL identified on chromosome 7 is marked by
npi435 (bin 7.04) and has an average negative effect of allele substitution of 159 × 10^3
endosperm cells (Tx303 direction) (Table 3.7). The QTL region identified on
chromosome 7 exceeded the α = 0.01 experimentwise threshold. The final QTL model,
including this single QTL region, accounted for 18.4 ± 11.2% of the phenotypic
variance and 32.9 ± 20.0% of the genotypic variance. Similar QTL mapping results were
found using the log transformation of the endosperm cell number data.
Two common QTL regions, on chromosomes 7 and 8, were identified in both
environments for MECN in the Tx303 X CO159 IF2 mapping population. Composite
interval mapping identified three QTLs (on chromosomes 6, 7, and 8) that significantly
influenced MECN from kernel samples collected from the IF2 mapping population in
Columbia, MO in 1996 (Table 3.7). The three QTLs displayed significant additive
gene action. The multiple QTL model, including all three regions simultaneously,
accounted for 35.0 ± 7.3% of the phenotypic variance. Unlike the RIL final QTL
models, total model genotypic variance is not reported for the IF2 mapping results.
This is due to the lack of genotype replication (in this case replication of the bulked
material that represents the IF2 line within each environment) necessary to separate
genotypic and environmental variance components (P = G + E). Four QTLs (on
chromosomes 3, 4, 7, and 8) were identified that significantly influenced MECN from
the IF2 mapping population grown in St. Paul, MN in 1996 (Table 3.7). All four
QTLs displayed additive gene action. The multiple QTL model, including all four
regions simultaneously, accounted for 45.2 ± 7.7% of the phenotypic variance. An
example of a QTL scan profile for the endosperm cell number trait (Columbia, MO,
1996) is given in Figure 3.14.
Table 3.7: Summary of identified QTLs for the trait mean endosperm cell number.

Chr.  cM     Locus     Additive Effect  Percent    LOD   Experimentwise    Parent  Population    Year  Dev.
                       (Cell Number)    Variance         Probability                                   Stage
                                        Explained
2     85.2   ici99     −81 × 10^3       24.2       3.81  0.005 < p < 0.01  T232    TCM RIL STP   1996  18 DAP
3     48.0   bnl5.37a  94 × 10^3        9.7        2.88  0.05 < p < 0.10   Tx303   TxCO IF2 STP  1996  16 DAP
4     32.0   umc31a    69 × 10^3        10.0       3.39  0.05 < p < 0.10   Tx303   TxCO IF2 STP  1996  16 DAP
5     270.2  npi288a   −76 × 10^3       13.4       3.42  0.05 < p < 0.10   T232    TCM RIL STP   1997  18 DAP
6     80.0   umc132a   78 × 10^3        10.7       4.39  0.01 < p < 0.05   Tx303   TxCO IF2 MO   1996  16 DAP
7     114.4  npi435    −156 × 10^3      18.4       4.22  p < 0.01          Tx303   COTx RIL STP  1996  16 DAP
7     70.0   csu8      −80 × 10^3       15.4       7.53  p < 0.01          CO159   TxCO IF2 MO   1996  16 DAP
7     71.5   umc254    −109 × 10^3      17.2       3.33  0.05 < p < 0.10   CO159   TxCO IF2 STP  1996  16 DAP
8     26.0   umc103a   118 × 10^3       20.6       5.02  p < 0.01          Tx303   TxCO IF2 MO   1996  16 DAP
8     35.9   stp1      129 × 10^3       21.5       4.98  p < 0.01          Tx303   TxCO IF2 STP  1996  16 DAP
8     150.3  csu96b    −88 × 10^3       17.7       3.69  0.05 < p < 0.10   T232    TCM RIL STP   1997  18 DAP
Figure 3.14: Immortalized Tx303 X CO159 F2 population composite interval mapping:
QTL likelihood maps on chromosome 7 for the trait mean endosperm
cell number at 16 DAP (Columbia, MO 1996). The empirically
derived threshold values from permutation analysis are indicated
by the horizontal solid line (5%) and dashed line (1%). The black
triangle marker indicates the location of the cofactor used in CIM.
[Plot of LOD score (y-axis) against map position in cM (x-axis); markers shown along
the chromosome: csu582, asg8, asg34a, asg49, csu296, umc254, csu8, umc245, bnl8.44a,
umc168, umc35a.]
3.7.2 Endosperm Mean Ploidy QTLs
Composite interval mapping identified one QTL that significantly influenced
MEP from kernel samples collected from the T232 X CM37 mapping population in
St. Paul in 1996. The QTL identified on chromosome 9 is marked by npi443 (bin
9.05) and has an average positive effect of allele substitution of 1.03 mean ploidy units
(CM37 direction) (Table 3.8). The QTL region identified on chromosome 9 exceeded
the α = 0.05 experimentwise threshold (LOD 3.39). The final QTL model, including
this QTL region, accounted for 15.1±9.5% of the phenotypic variance and 25.2±15.9%
of the genotypic variance. One QTL was identified that significantly influenced MEP
from kernel samples collected from the T232 X CM37 mapping population in St. Paul
in 1997. The QTL identified on chromosome 4 is marked by bet2 (glycinebetaine2)
(bin 4.05) and has an average negative effect of allele substitution of 1.10 mean ploidy
units (T232 direction).² The QTL region identified on chromosome 4 exceeded the
α = 0.05 experimentwise threshold (LOD 3.29). The final QTL model, including this
QTL region, accounted for 29.2 ± 11.7% of the phenotypic variance and 63.5 ± 25.4%
of the genotypic variance.
Composite interval mapping did not identify any QTL that significantly (α =
0.05 experimentwise threshold) influenced MEP in kernel samples collected from
the CO159 X Tx303 RIL mapping population in St. Paul in 1996. However, a putative
QTL on chromosome 5 was identified at the α = 0.10 experimentwise threshold
level (LOD 3.16). This putative QTL, which was also identified by linear regression,
is marked by php20566 (bin 5.06) and has an average positive effect of
allele substitution of 1.19 mean ploidy units (CO159 direction) (Table 3.8). The final
QTL model, including this QTL region, accounted for 22.8 ± 11.8% of the phenotypic
variance and 29.7 ± 15.3% of the genotypic variance. Using the 1997 CO159 X Tx303
mapping data, a putative QTL on chromosome 6 was identified at the α = 0.25
experimentwise threshold level (LOD 2.65). This putative QTL, which was also
identified by linear regression, is marked by tug8 (bin 6.04) and has an average negative
effect of allele substitution of 0.62 mean ploidy units (Tx303 direction). The final
QTL model, including this QTL region, accounted for 15.6 ± 11.8% of the phenotypic
variance.
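The phenotypic and genotypic percentages reported in this section can be related by a simple ratio. If the proportion of genotypic variance explained is taken to be the proportion of phenotypic variance explained divided by the heritability (an assumption made here for illustration; the text does not state how the genotypic values were obtained), then the ratio of the two percentages gives an implied heritability for each QTL model. The Python sketch below transcribes the reported values; the function and variable names are hypothetical.

```python
# (phenotypic %, SE, genotypic %, SE) for each final QTL model, as reported in the text.
qtl_models = {
    "T232 X CM37, St. Paul 1996 (chr 9)": (15.1, 9.5, 25.2, 15.9),
    "T232 X CM37, St. Paul 1997 (chr 4)": (29.2, 11.7, 63.5, 25.4),
    "CO159 X Tx303 RIL, St. Paul 1996 (chr 5)": (22.8, 11.8, 29.7, 15.3),
}


def implied_heritability(pct_phenotypic, pct_genotypic):
    """Ratio of phenotypic to genotypic variance explained.

    Assumption: genotypic % = phenotypic % / h2, so the ratio recovers h2.
    """
    return pct_phenotypic / pct_genotypic


implied = {}
se_ratio = {}
for label, (p, p_se, g, g_se) in qtl_models.items():
    implied[label] = implied_heritability(p, g)
    se_ratio[label] = p_se / g_se  # standard errors should scale the same way
    print(f"{label}: implied h2 = {implied[label]:.2f}, SE ratio = {se_ratio[label]:.2f}")
```

For all three models the standard errors scale by nearly the same ratio as the point estimates (about 0.60, 0.46, and 0.77), which is consistent with, though it does not prove, a rescaling by heritability.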
Composite interval mapping identified four QTLs that influenced the trait MEP
² This chromosome region (bin 4.05) was also identified using T232 X CM37 RIL data in both
1996 (using 14, 20, and 24 DAP stage data) and 1997 (using 24 DAP stage data) (Figure 3.13).

je-mclaughlin-dissertation06

  • 1. UNIVERSITY OF MINNESOTA This is to certify that I have examined this bound copy of a Doctorate thesis by John Edward McLaughlin and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made. Ronald L. Phillips, Friedrich Srienc Name of Faculty Advisers Signature of Faculty Advisers Date GRADUATE SCHOOL
  • 2. Genetic Analysis of Variation in Endosperm Cell Number and Endoreduplication in Maize (Zea mays L.) A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY John Edward McLaughlin IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Ronald L. Phillips, Friedrich Srienc, Advisers August 2006
  • 3. © John Edward McLaughlin August 2006
  • 4. Acknowledgments I would like to thank the following individuals and organizations for their im- portant contributions to this research. First, I wish to thank my co-advisors, Drs. Ronald Phillips and Friedrich Srienc, for their guidance on this maize endosperm research project. I was exposed to a great variety of scientific ideas and methods under their direction. I greatly appreciate the time that I spent in Dr. Phillips’ lab- oratory. His laboratory fostered an environment of learning. Funding for the project was obtained through a grant to Drs. Phillips and Srienc from the NIH Biotechnology Training Program. In addition, I wish to thank the members of my defense commit- tee for providing suggestions and editorial comments on the dissertation: Drs. Burle Gengenbach, Ruth Shaw, and Deon Stuthman. Thank you for providing your time and expertise to this project. Mark Millard, curator of the USDA Regional Plant Introduction Station at Iowa State University, kindly provided the teosinte accessions and seed for the other ex- otic germplasm sources. Robert Murzyn provided the mature Tripsacum dactyloides plants which he made available in both the greenhouse and in the field. Larry Carlson was an excellent source of information concerning the elite X exotic germplasm work as he had introgressed Zea diploperennis into several elite inbreds by several gen- erations of backcrossing. Larry previously identified the short day treatment (time frame and duration) necessary to induce Zea diploperennis to flower in the Minnesota environment. Also, Larry provided the 50 gallon steel trash barrels used for the short day treatment of the exotic germplasm. Benjamin Burr from the Brookhaven National Laboratory (BNL), provided seed for both recombinant inbred line popu- lations. Previous seed increase and maintenance of both RIL populations was per- formed by Ronald Phillips’ laboratory group. 
In addition, the molecular marker information data set used in this study was developed at BNL (currently stored at the Maize Genetics and Genomics Database). Georgia Yerk-Davis developed the completely randomized experimental design for the immortalized Tx303 X CO159 F2 (IF2) experiment that was grown at the Missouri Agriculture Experiment Sta- tion in Columbia, MO. Georgia collected the kernel samples from the IF2 population and prepared the samples for storage in ethanol. Georgia provided an additional 58 lines for the Tx303 X CO159 mapping population that were not part of the original 54 IF2 population mapping panel. In addition, Georgia provided additional molec- ular marker data for this extended population of lines that was collected from the i
  • 5. University of Missouri-Columbia RFLP laboratory. This molecular marker data was not, and currently is not, available from the Maize Genetics and Genomics Database website. Daily weather data for the St. Paul location was provided by Dave Ruschy of the University of Minnesota Department of Soil, Water, and Climate. Randy Miles, of the Missouri Agricultural Experimental Station, provided daily weather data for the Sanborn Field (Columbia, MO) location. Jack Otis from the University of Minnesota Poultry Lab supplied the chicken red blood cells. James Holland de- signed and programmed the SAS code for the calculation of heritability and genetic correlation estimates based on REML and GLM methods. Dianne Harris from the Beckmann Coulter Corporation taught me how to operate and maintain the Epics XL Flow Cytometer. In addition, Dianne provided instruction for the use of the Epics XL SYSTEM II software. The flow cytometry work was performed in the lab- oratory of Robert Jones with the technical support of Jeff Roessler. I would also like to express my thanks to Bruce Bagwell, Donald Herbert, Ben Hunsberger, and Mark Munson of Verity Software House, Inc. for their help in developing a statis- tical model which allows the fitting and estimation of multiple ploidy peaks in the MODFIT LT 3.0 program. Jim Halgerson from the NIRS Forage Quality Lab (De- partment of Agronomy and Plant Genetics) at the University of Minnesota provided instruction on the wet chemistry and NIR analysis of total kernel protein and starch. Richard W. Kaszeta wrote and provided me with the LATEX template (thesis.tex) and style/support files (thesis-me.cls, me-tools.sty, menet.bst) designed to meet the University of Minnesota Graduate School thesis formatting requirements. He also pro- vided excellent help to get me started using the LATEX language for both typesetting and the inclusion of encapsulated postscript graphic files. 
The LATEX template is avail- able at: http://www.menet.umn.edu/ kaszeta/phdthesis (verified October 20, 2003). The program BibTeXMng was used to develop, organize, and format the bibliogra- phy. BibTeXMng was written by Petr and Nikolay Vabishchevich and the program is available at: http://www.imamod.ru/ vab/bibtexmng/ (verified October 20, 2003). Dean Flanders from the Agronomy and Plant Genetics department provided great PC and Unix computer support. I would like to thank Richard Kowles who showed me the practical nature of the work involved with the endosperm collection, fixation, and nuclei preparation. Also, Richard Kowles provided support for many aspects of this dissertation including the development of the initial experimental design and the statistical analysis of flow cytometry data. I would also like to extend special thanks to the following people for their contribution to this work. Suzanne Livingston and ii
  • 6. Jayanti Suresh provided excellent laboratory and field technical support. Mike Olsen and Cristian Vl˘adut¸u provided many insightful discussions regarding quantitative ge- netics along with generous laboratory and field help. Finally, I would like to thank my family for their love and support. iii
  • 7. Abstract (323 words) The cytogenetics of two aspects of early endosperm growth in maize, the es- tablishment of cell number and the extent of endoreduplication of the tissue as a whole, was studied using flow cytometry and quantitative genetic methods. Cell cy- cle parameters of endosperm nuclei from two recombinant inbred line populations (T232 X CM37 and CO159 X Tx303) and one immortalized F2 population (Tx303 X CO159) were measured. Natural genetic variability and transgressive segregation for both endosperm cell number and extent of endoreduplication were observed. Multi- year, broad-sense heritabilities for the cytological traits were measured for the T232 X CM37 population. The heritability at 18 DAP for endosperm cell number was estimated as 0.23 ± 0.14 and for mean endosperm ploidy as 0.43 ± 0.12 (entry mean basis). The phenotypic correlation between endosperm cell number and mean ploidy for the three mapping populations ranged from −0.20 ± 0.13 to −0.57 ± 0.28. After the transition from the mitotic cell cycle to endoreduplication occurs, the prolifera- tive capacity of that cell terminates. The negative phenotypic correlation between the mean ploidy and cell number traits suggests that this cell cycle transition may have a dramatic effect on several developmental processes including growth rate and yield. A composite trait, mean total C (mean ploidy × total endosperm cell number) (MTC), provided the most consistent phenotypic correlations with the trait mature 100 kernel weight (g) in the T232 X CM37 mapping population (0.44 ± 0.26 in 1996 and 0.45 ± 0.27 in 1997). In total, eight endosperm cell number QTLs and ten mean ploidy QTLs were identified by composite interval mapping. The identified QTL ef- fects (2a) for the endosperm cell number trait ranged from 138 × 103 to 312 × 103 cells. The identified QTL effects (2a) for the mean ploidy trait ranged from 0.96 to 2.38 mean ploidy units (C). 
A better understanding of the genetics that controls early endosperm development has implications for the improvement of seed quality and yield. iv
  • 8. Contents Contents v List of Tables viii List of Figures ix Chapter 1 Literature Review 1 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Chapter 2 Materials and Methods 7 2.1 Plant Material and Growth Conditions . . . . . . . . . . . . . . . . . 7 2.2 Endosperm Sampling and Storage . . . . . . . . . . . . . . . . . . . . 8 2.3 Endosperm Nuclei Preparations . . . . . . . . . . . . . . . . . . . . . 10 2.4 Analysis of Phenotypic Data . . . . . . . . . . . . . . . . . . . . . . . 11 2.4.1 Heritability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.4.2 Genetic and Phenotypic Correlations Between Traits . . . . . 14 2.4.3 Weather Information . . . . . . . . . . . . . . . . . . . . . . . 15 2.5 Composite Interval Mapping . . . . . . . . . . . . . . . . . . . . . . . 16 2.6 Joint Time-Related Mapping . . . . . . . . . . . . . . . . . . . . . . . 17 2.7 Flow Cytometry Instrumentation, Settings, and Measurements . . . . 18 2.8 Modeling of the Cell Cycle . . . . . . . . . . . . . . . . . . . . . . . . 20 v
  • 9. 2.9 Data Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.10 Statistical Analysis of the Flow Cytometry Data . . . . . . . . . . . . 21 2.11 Kernel Protein and Starch Determinations . . . . . . . . . . . . . . . 23 Chapter 3 Results 24 3.1 Endosperm Cytological Trait Histograms . . . . . . . . . . . . . . . . 24 3.1.1 T232 X CM37 RIL . . . . . . . . . . . . . . . . . . . . . . . . 24 3.1.2 Tx303 X CO159 IF2 and CO159 X Tx303 RIL . . . . . . . . . 33 3.2 Multi-Environment ANOVA Results . . . . . . . . . . . . . . . . . . . 38 3.3 Single-Environment ANOVA Results . . . . . . . . . . . . . . . . . . 41 3.4 Flow Cytometric Analysis . . . . . . . . . . . . . . . . . . . . . . . . 43 3.5 Heritability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.6 Genetic and Phenotypic Correlations Between Traits . . . . . . . . . 53 3.7 Quantitative Trait Analysis . . . . . . . . . . . . . . . . . . . . . . . 57 3.7.1 Endosperm Cell Number QTLs . . . . . . . . . . . . . . . . . 59 3.7.2 Endosperm Mean Ploidy QTLs . . . . . . . . . . . . . . . . . 63 3.8 Additional Sources of Genetic Variability . . . . . . . . . . . . . . . . 68 3.8.1 Zea diploperennis . . . . . . . . . . . . . . . . . . . . . . . . . 69 3.8.2 Tripsacum dactyloides . . . . . . . . . . . . . . . . . . . . . . 71 Chapter 4 DISCUSSION 74 4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 4.2 Quantitative Genetic Parameters . . . . . . . . . . . . . . . . . . . . 75 4.3 QTL Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.4 Genetic Determinants of Maize Endosperm Cell Number, Extent of Endoreduplication Control, and Yield . . . . . . . . . . . . . . . . . . 83 4.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 vi
  • 10. References 94 Appendix A SAS Programs 109 A.1 Single Environment PROC GLM and PROC MIXED SAS Code . . . 109 A.1.1 Single Environment: PROC GLM, Randomized Complete Block (Random Model) . . . . . . . . . . . . . . . . . . . . . . . . . 109 A.1.2 Single Environment: PROC GLM, Randomized Complete Block (MIXED Model) . . . . . . . . . . . . . . . . . . . . . . . . . 110 A.1.3 Single Environment: PROC MIXED, Randomized Complete Block (Random Model) . . . . . . . . . . . . . . . . . . . . . . 111 A.1.4 Single Environment: PROC MIXED, Randomized Complete Block (MIXED Model) . . . . . . . . . . . . . . . . . . . . . . 112 A.1.5 Single Environment Heritability Calculation: PROC MIXED . 113 A.2 Multiple Environment PROC GLM and PROC MIXED SAS Code . . 115 A.2.1 Multiple Environment Heritability Calculation: PROC GLM . 115 A.2.2 Multiple Environment: PROC MIXED, Randomized Complete Block (MIXED Model) . . . . . . . . . . . . . . . . . . . . . . 115 A.2.3 Multiple Environments Heritability Calculation: PROC MIXED 117 A.3 Genetic Correlation Calculation . . . . . . . . . . . . . . . . . . . . . 118 A.3.1 Genetic Correlation: GLM MANOVA . . . . . . . . . . . . . . 118 A.3.2 Genetic Correlation: PROC MIXED (REML MANOVA) . . . 123 Appendix B MODFIT LT 3.0 Output Example 129 vii
  • 11. List of Tables 2.1 Exotic Accessions Grown in St. Paul . . . . . . . . . . . . . . . . . . 9 3.1 Multi-Environment Covariance Parameter Estimate (REML) and Fixed Effect Solution Table for T232 X CM37 RIL Trait Mean Endosperm Nuclear Ploidy (18 DAP) . . . . . . . . . . . . . . . . . . . . . . . . 39 3.2 Multi-Environment Covariance Parameter Estimate (REML) and Fixed Effect Solution Table for T232 X CM37 RIL Trait Mean 100 Kernel Weight (g) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3.3 Multi-year and Single Year Trait Heritability Estimates- T232 X CM37 RIL Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 3.4 Heritability for total endosperm cell number, mean endosperm nuclear ploidy and mean total C on a bulked-sample-basis for the immortalized Tx303 X CO159 F2 Population Mapping Population (St. Paul, MN and Columbia, MO 1996) . . . . . . . . . . . . . . . . . . . . . . . . . 51 3.5 Single Year Trait Heritability for Total Endosperm Cell Number, Mean Endosperm Nuclear Ploidy and Mean Total C for the CO159 X Tx303 RIL Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.6 Genetic and Phenotypic Correlations for T232 X CM37 RIL Traits (REML) (St. Paul, MN 1996 and 1997) . . . . . . . . . . . . . . . . . 56 3.7 Summary of the Identified QTLs for the Trait Mean Endosperm Cell Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 3.8 Summary of the Identified QTLs for the Trait Mean Endosperm Nu- clear Ploidy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 viii
  • 12. List of Figures
3.1 Histograms of Mean Endosperm Cell Number at 18 DAP for the 48 T232 X CM37 RIL Families (St. Paul, MN 1996 and 1997) ... 26
3.2 Histograms of Mean Endosperm Ploidy (MEP) measured in C units at 18 DAP for the 48 T232 X CM37 RIL Families (St. Paul, MN 1996 and 1997) ... 27
3.3 Box Plots Displaying the Distribution of the Trait Mean Endosperm Nuclear Ploidy, within the 48 T232 X CM37 RIL Families from 14 to 24 DAP in 1996 and 1997 (St. Paul, MN) ... 28
3.4 Observed Endosperm Mean Nuclear Ploidy for the 48 T232 X CM37 RIL Lines for a Period of 10 Days (14, 16, 18, 20, and 24 DAP) Measured in St. Paul, MN (1996) ... 29
3.5 Observed Endosperm Mean Nuclear Ploidy for the 48 T232 X CM37 RIL Lines for a Period of 10 Days (14, 16, 18, 20, and 24 DAP) Measured in St. Paul, MN (1997) ... 30
3.6 Histograms of 100 Kernel Weight, within the 48 T232 X CM37 RIL Families (St. Paul, MN 1996 and 1997) ... 31
3.7 Histograms of Total Kernel Protein and Starch Percentages, within the 48 T232 X CM37 RIL Families (St. Paul, MN 1996 and 1997) ... 32
3.8 Histograms of Mean Endosperm Cell Number at 16 DAP for the 112 Immortalized Tx303 X CO159 F2 Families (St. Paul, MN and Columbia, MO 1996) ... 35
3.9 Histograms of Mean Endosperm Nuclear Ploidy (C) at 16 DAP for the 112 Immortalized Tx303 X CO159 F2 Families (St. Paul, MN and Columbia, MO 1996) ... 36
  • 13. 3.10 Histogram (Fitted Distribution) of Parental and F1 Endosperm (18 DAP) Nuclei Together with CRBCs from Flow Cytometry Measurement (St. Paul, MN 1997) ... 44
3.11 Flow Cytometry Histograms of Parental (CO159 and Tx303) and F1 (CO159 X Tx303) Endosperm Nuclei together with CRBCs (St. Paul, MN 1996) Showing Endoreduplication Distribution Differences at 16 DAP ... 46
3.12 Regression of Log Endosperm Cell Number on Mean Nuclear Ploidy from the Immortalized Tx303 X CO159 F2 Population Endosperm Samples Collected at 16 DAP in Columbia, MO 1996 ... 55
3.13 Genetic Map of the Endosperm Cell Number and Mean Ploidy QTLs Identified from the Three Mapping Populations ... 58
3.14 Immortalized Tx303 X CO159 F2 Population CIM: QTL Likelihood Maps on Chromosome 7 for the Trait Mean Endosperm Cell Number at 16 DAP (Columbia, MO 1996) ... 63
3.15 Ear Diversity Range: Modern Inbred (B73) to Zea diploperennis (St. Paul, MN 1998) ... 70
3.16 Histogram (Raw Distribution) of Zea diploperennis Nuclei at 12 DAP together with CRBCs from Flow Cytometry Measurement (St. Paul, MN 1998) ... 71
3.17 Histogram (Raw Distribution) of Zea diploperennis Nuclei at 16 DAP together with CRBCs from Flow Cytometry Measurement (St. Paul, MN 1998) ... 72
3.18 Histogram (Raw Distribution) of Tripsacum dactyloides Nuclei at 14 DAP together with CRBCs from Flow Cytometry Measurement (St. Paul, MN 1998) ... 73
4.1 Epi-fluorescence Image of a 16 DAP kernel (DE2 X H99) Longitudinal Cryosection Stained with DAPI- Aleurone to Inner Endosperm Perspective ... 76
B.1 MODFIT LT 3.0 Graphical Output Displaying the Fitted CRBC and Endosperm Nuclei Cytogram Peaks ... 130
  • 14. Chapter 1
Literature Review
1.1 Introduction
The endosperm composes approximately 80-85% of the mature maize kernel dry weight; thus, this component of seed tissue makes a large contribution to grain quality, composition, and yield (Wolf et al., 1952; Kowles and Phillips, 1988). The endosperm develops rapidly and attains a high metabolic activity within a relatively short developmental time period (Ingle et al., 1965; Kowles et al., 1992b; Larkins et al., 2001). The harvest index for maize, the ratio of grain yield to total plant mass, is approximately 50% (Jurgens et al., 1978; Sinclair, 1998). This relatively high harvest index indicates that a large amount of photosynthate is efficiently converted into harvestable grain.
The control of endosperm growth depends, in part, on the control of two distinct, but related, cell cycle programs that are present during development. After a period of syncytial karyokinesis following double fertilization, the endosperm tissue initially grows by increasing cell number through mitosis (Kowles and Phillips, 1988).
  • 15. At 8-10 days after pollination (DAP), a period of development marked by a high mitotic index (≈ 10%), a fraction of endosperm cells in the central region of the tissue begins to differentiate and undergo nuclear polyploidization through a modified cell cycle called endoreduplication (Kowles and Phillips, 1985). Endoreduplication is a truncated cell cycle that alternates between gap (G) and DNA synthesis (S) phases without an intervening mitotic phase or cytokinesis (Phillips et al., 1985; Kowles and Phillips, 1985, 1988; Schweizer et al., 1995; Grafi and Larkins, 1995). Measured in haploid ploidy units (C), a triploid endosperm nucleus in the G1 phase of the cell cycle has a 3C DNA content. Cells which begin endoreduplication become terminally differentiated and grow by increases in both nuclear and cytoplasmic volume (Kowles and Phillips, 1988).
Endoreduplication is a common mechanism of genome multiplication (Brodsky and Uryvaeva, 1985). This alternative cell cycle is present in many tissues, most notably those with secretory and storage functions (Nagl, 1976). Endoreduplication plays a substantial role during maize endosperm development in terms of the growth dynamics of the tissue (Kowles and Phillips, 1985; Kowles et al., 1990; Dilkes et al., 2002). As the endoreduplication cell cycle begins in the maize endosperm, mitotic activity falls sharply at approximately 10-12 DAP. Peripheral endosperm cells continue to divide by mitosis, but in the central region of the endosperm the mitotic index drops to near zero after 14 DAP (Kowles and Phillips, 1985). As many as 90% of endosperm cells (including the 6C ploidy class) undergo endoreduplication to some extent during tissue development (Larkins et al., 2001).
The DNA content of an individual cell nucleus is correlated with nuclear volume (Kowles and Phillips, 1985, 1988).
In addition to increasing nuclear volume during endoreduplication, the endosperm cell as a whole undergoes a substantial increase in both total volume and cell size. The period of nuclear and cell enlargement
  • 16. (approximately 8 to 28+ days) is temporally correlated with endosperm cell differentiation, starch deposition, total endosperm RNA content, total endosperm sugar content, and storage protein accumulation (Ingle et al., 1965; Larkins et al., 2001). In addition, developmental gradients form within the endosperm tissue such that the central cells (starchy endosperm) contain the largest nuclei and the peripheral cells (aleurone) contain the smallest (Kowles and Phillips, 1985). The resulting endosperm tissue is highly heterogeneous in terms of both nuclear DNA content and cell size (Kowles et al., 1990). However, the role that cell cycle control plays in making the endosperm such an efficient and productive tissue is just beginning to be understood (Grafi and Larkins, 1995; Becraft, 2001; Leiva-Neto et al., 2004).
The molecular mechanisms that control the transition from the mitotic cell cycle to endoreduplication are not well understood. However, cell cycle research in plants such as maize and Arabidopsis has identified a few key regulatory steps. Grafi and Larkins (1995) identified two major cell cycle control mechanisms that are important in controlling the transition from a mitotic cell cycle to the endoreduplication cell cycle in the maize endosperm. In endoreduplicating endosperm tissue, mitosis is inhibited by a decrease in mitosis promoting factor (MPF), and DNA replication is induced by the activation of S phase-related kinases. Recently, Leiva-Neto et al. (2004) reported that reducing cyclin-dependent kinase A (CDKA) activity in the maize endosperm has a dramatic effect on the extent of endoreduplication. The ectopic expression of a dominant negative mutant gene for CDKA reduced mean ploidy by 50% compared to the wildtype control. Boudolf et al. (2004) used transgenic modification of Arabidopsis to show a link between cyclin-dependent kinase B1 (CDKB1) expression and the E2F transcription factor pathway.
Plants overexpressing CDKB1 showed an enhanced endoreduplication phenotype in the leaf blade.
  • 17. The physiological and functional significance of endoreduplication is not clear. Several investigators have attempted to correlate genome multiplication with productivity in both plants and animals, but in many cases the comparisons are too complex to give a definitive answer (Pearson, 1974; Barlow, 1978; Larkins et al., 2001; Brodsky and Uryvaeva, 1985). Specific examples have been noted of greatly increased transcription rates in tissues containing endoreduplicated cells when compared to tissues composed of predominantly diploid cells of the same organism (Clutter et al., 1974; Calvi et al., 1998; Scharpé and Van Parijs, 1973; Nagl, 1976). A role for endoreduplication has been suggested in enhancing the transcriptional potential, protein-synthesizing capacity, and the functional activity of a wide range of tissues (Nagl, 1976; Barlow, 1978; D’Amato, 1984; Melaragno et al., 1993; Goverse et al., 2000; Foucher and Kondorosi, 2000; Zhao and Grafi, 2000; Larkins et al., 2001; Leiva-Neto et al., 2004).
It has been proposed that the productivity of crop plants can be enhanced by modulating the degree of endoreduplication in seed tissues (Brunori et al., 1993; Inzé et al., 2002; Nadimpalli and Simmons, 2002; Inzé et al., 2004). Cavallini et al. (1995) noted a strong positive correlation between maize kernel protein content and the extent of endoreduplication in the Illinois high and low protein genotypes. However, Leiva-Neto et al. (2004) reduced the extent of maize endosperm endoreduplication by 50% through the ectopic expression of a dominant negative mutant of the cyclin-dependent kinase A gene and found only slight reductions in starch and storage protein accumulation compared to the wildtype control.
The proportion of endosperm cells in a particular ploidy class has been shown to be influenced by both genotypic and environmental factors (Kowles and Phillips, 1985; Artlip et al., 1995; Cavallini et al., 1995; Kowles et al., 1992a; Engelen-Eigles
  • 18. et al., 2001; Dilkes et al., 2002). The maternal genotype of the plant, through both sporophytic and zygotic influences, substantially impacts the level of endoreduplication that develops in the endosperm tissue (Kowles et al., 1997; Dilkes et al., 2002).
The extent of endoreduplication in the maize endosperm has been determined to be heritable, and quantitative genetic components based on parent-offspring regression have been measured (Dilkes et al., 2002). Significant components of variance related to the maternal zygotic and maternal sporophytic effects were identified. The estimates from that study represent aggregate quantitative genetic components. The present study builds on the previous work by using molecular markers as a further variance-component partitioning tool. Flow cytometric analysis of endosperm samples from three different maize mapping populations was used to collect cell number and ploidy data. Small to moderate heritabilities for the endosperm cell number and mean ploidy traits were found. The cytological and morphological phenotypic data from the three mapping populations represented natural genetic variation, which was correlated with molecular marker data. The molecular marker information permits the dissection of the total genetic variance into defined genomic regions (QTL positions), the estimation of the magnitude of individual effects, the calculation of gene action, and the identification of the parental (allelic) contribution of these effects. In addition, genetic correlations show that the traits 100 kernel weight and total kernel starch/protein have significant relationships with both the endosperm cell number and mean ploidy traits. Inclusion of both endoreduplication and cell number data in this study permits a fuller understanding of endosperm development from a cell cycle perspective.
The objective of this study was to identify and characterize regions of the maize genome which control two cytological aspects of endosperm development: establishment of endosperm cell number and extent of endoreduplication.
  • 19. To do so, we developed a defined method to quantify the endoreduplication phenotype using the flow cytometry program MODFIT LT 3.0. Specific objectives include:
• Estimation of genetic components of variation, broad-sense heritabilities, and genetic correlations for the traits endosperm cell number and extent of endoreduplication.
• Identification of quantitative trait loci (QTLs) on a genome-wide basis using composite interval mapping (CIM) methods.
• Comparison of cytological data to agronomic traits such as final kernel weight to test correlations at the population level.
• Testing endosperm samples from Zea diploperennis and Tripsacum dactyloides for evidence of endoreduplication.
  • 20. Chapter 2
Materials and Methods
2.1 Plant Material and Growth Conditions
Three mapping populations were used in these studies: two recombinant inbred line (RIL) families, T232 X CM37 and CO159 X Tx303 (Burr et al., 1988), and one immortalized F2 (IF2) population from the cross Tx303 X CO159 (Gardiner et al., 1993). The RIL sets were grown at the University of Minnesota Experiment Station, St. Paul, MN during the growing seasons of 1996 and 1997 (planting dates: May 12th and May 11th, respectively). The T232 X CM37 RIL set consists of 48 lines and the CO159 X Tx303 RIL set consists of 43 lines. Each RIL set was at least at the 11th generation of inbreeding. Each RIL set was grown in a lattice design with two replicates. Plots consisted of hand-planted single rows of 35 plants which were later thinned to 30 plants. Plot rows were 6.7 meters long and 76 cm apart (58,960 plants/ha). The IF2 mapping population, composed of 112 lines, was grown at two locations in 1996. Two replicates (18-30 plants per line per rep) arranged in a randomized complete block design were grown at the University of Minnesota Experiment
  • 21. Station, St. Paul, MN. In addition, two replicates of the IF2 material were grown at the University of Missouri-Columbia Experimental Station.
In addition to the above populations, several maize accessions from the USDA Regional Plant Introduction Station (Ames, Iowa) were tested. The single exception was seed of Tripsacum dactyloides obtained from Dr. Robert Murzyn and originally sourced from Shepherd Farms, Inc. (Clifton Hill, MO). The accessions evaluated in St. Paul from 1996 to 1998 are shown in Table 2.1. All exotic accessions were grown in St. Paul, MN in 1996, 1997, and 1998. These accessions require a short-day environment to induce flowering in time to set seed in the Minnesota environment. The plants were covered with 50 gallon steel trash barrels at 7 PM and uncovered at 7 AM each day for approximately three weeks of growth (the first three weeks of June, until the plants were too tall to fit inside the barrels).
2.2 Endosperm Sampling and Storage
Controlled pollinations were made using standard methods. For the T232 X CM37 RIL population, endosperm samples were collected at 14, 16, 18, 20, and 24 DAP. For both the CO159 X Tx303 RIL and IF2 populations, endosperm samples were collected at 16 DAP. Kernels from the mid-section of the cob were immediately placed in ethanol:propionic acid (3:1, v/v). After 24 hours, kernels were placed in 70% ethanol for storage at -20 °C. Prior to preparation, kernels were equilibrated in 35% and then 0% ethanol.
  • 23. 2.3 Endosperm Nuclei Preparations
Each sample preparation consisted of nuclei from the entire endosperm of a bulk of kernels (from 3-6 for the RILs to 40-60 for the IF2). The kernels were sequentially equilibrated in 50, 25 and 0% ethanol (v/v) and the endosperms were excised. First, the entire endosperm tissue from the collection of kernels was isolated under a dissection scope or by using a magnifying visor. Removal of the endosperm (including the aleurone) from the kernel was accomplished using dental dissecting instruments in a manner that excluded the embryo, pericarp, and pedicel tissue. For each RIL endosperm preparation, six kernels per ear were prepared and combined (bulked) to represent a treatment sample. For the IF2 endosperm preparations, 3-6 kernels per ear from 12 to 20 ears per replicate were bulked for further processing.
A method developed by Reddy and Daynard (1983) and modified by Myers et al. (1990b) was used to obtain solutions of endosperm nuclei. This method was shown to give a quantitative release of nuclei (Myers et al., 1990b). The isolated endosperm samples were placed into a pectinase digestion solution (1 part pectinase [ICN Biochemicals, Cleveland, OH] to 3 parts citrate-phosphate buffer [8.8 g Na2HPO4 + 3.6 g citric acid L-1, pH 4.0, + 0.1% (w/v) NaN3]). Endosperms were dissected from the kernels, placed in 1 mL of pectinase solution in a capped Falcon tube, and incubated at 37 °C until soft. The enzyme digestion time depended on endosperm age. Endosperms less than 10 DAP required less than 4 hours in the pectinase solution; older endosperms (20-24 DAP) required 8 to 12 hours to soften. Over-digestion led to extensive loss of nuclei, as evidenced by both microscopy and flow cytometry. Nuclei were dispersed by forcing the tissue through an 18-gauge needle with a syringe. The syringe and needle were then washed with the pectinase buffer (minus the pectinase).
Added to this nuclei suspension were 75 µL RNase (10 mg/mL),
  • 24. approximately 10,000 chicken red blood cells (CRBCs) from a measured stock solution, and a final concentration of 1.5 µg propidium iodide per estimated 1000 nuclei. Nuclei were stained for 2 hrs. at 37 °C and then stored at 4 °C until analysis (storage time not to exceed 24 hrs.).
CRBCs were used to measure the average DNA content per endosperm nucleus and to calculate the number of nuclei present in the entire endosperm and in various sub-populations. Fresh CRBCs were obtained from the Department of Animal Science, University of Minnesota. Trial experiments showed that propidium iodide-stained CRBC preparations (DNA content ∼ 2.33 pg) could be visualized separately from the endosperm nuclei on cytograms of the log forward angle light scatter versus log DNA-fluorescence intensity. CRBCs from a male chicken were collected and stained cells were counted on a hemacytometer at 40× magnification. The counts were multiplied by the appropriate dilution factors to determine cell numbers. Three counts of at least 300 nuclei per observation were made just prior to the addition of CRBCs to the endosperm nuclei mixture. The procedure for preparing the CRBCs was as follows: fresh CRBCs were diluted with 1X PBS (pH 7.3) until they could be counted with a hemacytometer to obtain the concentration. The CRBCs were stored up to one month in PBS buffer at 4 °C.
2.4 Analysis of Phenotypic Data
Phenotypic data for the three traits were analyzed with SAS® software (Unix, version 8.2; SAS Institute, Inc., 2003). Means and standard deviations were determined for each character for the two parents, the F1 hybrid, and the RIL or IF2 lines. The normality of each trait distribution was assessed by the Shapiro and Wilk (1965) W statistic (PROC UNIVARIATE NORMAL). For each trait, the homogeneity of
  • 25. variances among environments (locations or years) was checked using Hartley’s Fmax test (Hartley, 1950). Components of variance estimates were obtained using the general linear model (GLM) procedure of the SAS/STAT program (SAS Institute, Inc., 2003):

$$Y = \text{Mean} + \text{Year} + \text{Rep}(\text{Year}) + \text{Year} \times \text{Geno} + \text{Geno} \times \text{Rep}(\text{Year}) \qquad (2.1)$$

2.4.1 Heritability
Broad-sense heritability ($H^2$) was estimated by dividing the genotypic variance by the phenotypic variance (Hallauer and Miranda, 1988). Heritability on an entry mean basis across years was estimated as

$$H^2 = \frac{\hat{\sigma}^2_G}{\hat{\sigma}^2_G + \dfrac{\hat{\sigma}^2_{GE}}{e} + \dfrac{\hat{\sigma}^2_{\varepsilon}}{re}} \qquad (2.2)$$

Heritability on an entry mean basis using a single year of data was estimated as

$$H^2 = \frac{\hat{\sigma}^2_G}{\hat{\sigma}^2_G + \dfrac{\hat{\sigma}^2_{\varepsilon}}{r}} \qquad (2.3)$$

where $\hat{\sigma}^2_G$ is the genotypic variance, $\hat{\sigma}^2_{\varepsilon}/(re)$ is the error variance divided by the number of replications multiplied by the number of years, $r$ is the harmonic mean for the number of replicates per year, and $\hat{\sigma}^2_{GE}/e$ is the genotype by year interaction
  • 26. variance. Type III sums of squares from the SAS program analyses were used to obtain the mean square estimates used in the ANOVA-based heritability estimates. Asymmetric, exact 95% confidence intervals for the ANOVA-based heritability estimates were calculated according to the method of Knapp et al. (1985). F-values from the F(df1, df2) distribution were obtained with MacAnova (Oehlert and Bingham, 2001). The SAS code for both the PROC GLM and PROC MIXED models can be found in Appendix A.1 (page 109) for the single environment design and Appendix A.2 (page 115) for the multiple environment design.
Holland et al. (2003) presented the PROC MIXED SAS program code for the estimation of heritability on an entry (family) mean basis, including the code to calculate the associated standard errors. Approximate standard errors for heritability estimates calculated with PROC MIXED were estimated using the delta method (Lynch and Walsh, 1998; Holland et al., 2003). Holland et al. (2001) also presented methods to calculate heritability estimates on a bulked-sample basis. Heritability on a bulked-sample basis was estimated for cell cycle components from the IF2 mapping populations, because these traits were measured on samples of kernels bulked from two replicate plots per location rather than from individual plots. This was done to reconstitute the F2 phenotype from the bulked IF2 line. In this case, experimental error among bulked samples within a location was not estimated, and genotype-by-environment interaction (confounded with experimental error) served as the residual variance from the analysis over locations of bulked sample values. Heritability on a bulked-sample basis was calculated as:

$$\hat{H}_{\text{bulked-sample-basis}} = \frac{\hat{\sigma}^2_G}{\hat{\sigma}^2_G + \hat{\sigma}^2_{GE}} \qquad (2.4)$$
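The three heritability ratios above are simple functions of the estimated variance components. The following Python sketch shows the arithmetic only; it is not the SAS code from Appendix A, the function names are hypothetical, and the variance components in the usage note are made-up illustration values rather than estimates from this study.

```python
def entry_mean_heritability(var_g, var_ge, var_err, n_env, n_rep):
    """Entry-mean heritability across environments, following Eq. 2.2.

    var_g   : genotypic variance estimate
    var_ge  : genotype-by-environment (year) interaction variance estimate
    var_err : error variance estimate
    n_env   : number of environments or years (e in Eq. 2.2)
    n_rep   : replicates per environment (r in Eq. 2.2)
    """
    return var_g / (var_g + var_ge / n_env + var_err / (n_rep * n_env))


def single_env_heritability(var_g, var_err, n_rep):
    """Entry-mean heritability from a single environment, following Eq. 2.3."""
    return var_g / (var_g + var_err / n_rep)


def bulked_sample_heritability(var_g, var_ge):
    """Bulked-sample heritability, following Eq. 2.4.

    Here the GxE term is confounded with experimental error and acts
    as the residual variance.
    """
    return var_g / (var_g + var_ge)
```

With hypothetical components var_g = 4, var_ge = 2, var_err = 8 over two years and two replicates, `entry_mean_heritability(4, 2, 8, 2, 2)` evaluates 4 / (4 + 1 + 2), `single_env_heritability(4, 8, 2)` evaluates 4 / (4 + 4), and `bulked_sample_heritability(4, 2)` evaluates 4 / 6.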
  • 27. The SAS code, with minor changes, is also presented in Appendix A, section A.1.5 on page 113 for the single environment case and section A.2.3 on page 117 for the multiple environment case.
2.4.2 Genetic and Phenotypic Correlations Between Traits
The genetic and phenotypic correlations among traits were estimated from multivariate analysis of variance (MANOVA) using PROC GLM of the SAS/STAT program (SAS Institute, Inc., 2003). The genetic correlation between traits x and y is estimated as

$$\hat{r}_{G_{xy}} = \frac{\hat{\sigma}_{G_{xy}}}{\hat{\sigma}_{G_x}\,\hat{\sigma}_{G_y}} \qquad (2.5)$$

where $\hat{\sigma}_{G_{xy}}$ is the estimated genotypic covariance between traits x and y, and $\hat{\sigma}_{G_x}$ and $\hat{\sigma}_{G_y}$ are the estimated genotypic standard deviations for traits x and y, respectively. For the RIL populations, agronomic traits are represented by line means per replicate and cytological traits are represented by pooled kernel samples from one plant per replicate. The phenotypic correlation between traits x and y is estimated as

$$\hat{r}_{P_{xy}} = \frac{\hat{\sigma}_{P_{xy}}}{\hat{\sigma}_{P_x}\,\hat{\sigma}_{P_y}} \qquad (2.6)$$

where $\hat{\sigma}_{P_{xy}}$ is the estimated phenotypic covariance between traits x and y, and $\hat{\sigma}_{P_x}$ and $\hat{\sigma}_{P_y}$ are the estimated phenotypic standard deviations for traits x and y, respectively.
Genotypic and phenotypic variance components and associated standard errors for each trait were estimated separately using the restricted maximum likelihood
  • 28. method in the SAS PROC MIXED program, considering all effects in the model except the intercept to be random effects (Holland et al., 2003). Standard errors were calculated based on the formula presented by Mode and Robinson (1959). Lynch and Walsh (1998) and Holland et al. (2003) further elaborate on the calculation of standard errors for genetic variance components. For the RIL populations, there is no genetic variance among individuals within the group; thus, the phenotypic and environmental correlations are functionally equivalent (Lynch and Walsh, 1998).
The SAS program code for the estimation of genetic correlation (using MANOVA and REML MANOVA) and the associated standard errors, developed by James Holland, is available on his website at North Carolina State University (URL: http://www4.ncsu.edu/%7Ejholland/correlation/correlation.html). The SAS code for the phenotypic and genotypic correlations is also presented in Appendix A, section A.3.1 (page 118) for the PROC GLM method and A.3.2 (page 123) for the PROC MIXED method.
2.4.3 Weather Information
Growing degree units (GDUs) and precipitation data were obtained from the University of Minnesota Agricultural Experiment Station (St. Paul) and the University of Missouri Agricultural Experimental Station (Sanborn). Accumulated GDUs were calculated according to the formula

$$\frac{\text{maximum}\,^{\circ}\text{C} + \text{minimum}\,^{\circ}\text{C}}{2} - 10\,^{\circ}\text{C} \qquad (2.7)$$
  • 29. where 10 °C was set as the minimum temperature and 30 °C as the maximum temperature whenever the actual temperatures fell outside these limits. The GDU and precipitation values were summed from planting to the day of pollination and from pollination to the endosperm sampling day.
Precipitation data were organized into two categories: precipitation from pollination to sampling (PTS) and total accumulated precipitation (TAP) from planting to the sampling period.
2.5 Composite Interval Mapping
The computer program PLABQTL (Utz and Melchinger, 1996, 2000) was used to identify QTLs based on the maize linkage maps and phenotypic data. PLABQTL performs composite interval mapping (CIM) by combining an interval mapping approach (Lander and Botstein, 1989) and regression methods with the use of selected markers as covariates. CIM in PLABQTL is based on multiple regression using marker cofactors preselected by stepwise regression (Haley and Knott, 1992).
Cofactors were selected by a stepwise regression procedure in PLABQTL. Empirical, genome-wide LOD threshold levels (α = 0.05) were established based on 1000 permutations of the final CIM model containing the preselected cofactors from stepwise regression (Doerge and Churchill, 1996; Doerge and Rebai, 1996). The threshold applies to a two-sided test in which alleles from either parental strain may increase or decrease the mean trait value under analysis. Akaike’s information criterion (AIC) was used during the mapping procedure as a stopping rule in selecting subsets of regression variables and for the selection of the most probable model (Sakamoto et al., 1986, cited in Jansen, 1993). A penalty of 3 for the AIC score, as recommended by the authors, was used to select the final markers used as cofactors in the analysis
  • 30. (Utz and Melchinger, 2000). Models whose AIC values differed by more than 3 were deemed significantly different. Final selection was for the QTL model that minimized the AIC. Jansen (1993) discussed the use of the AIC in both the cofactor selection and QTL model selection process. The percentage of the genotypic variance explained by the multi-locus QTL model was calculated as the coefficient of determination ($R^2$) divided by the broad-sense heritability ($H^2$). The standard error of this statistic is calculated under the assumption of known heritability (Utz and Melchinger, 2000).
The additive effect (a) is reported as half the difference between the genotypic values of the two homozygotes at the putative QTL locus (Utz and Melchinger, 2000). The dominance effect reflects the genotypic value of the heterozygote relative to the two homozygotes at that locus. The dominance effect calculated for the IF2 represents the difference between the mean of the heterozygous class and the mean of the homozygous classes at a given QTL. The additive and dominance effects calculated from multiple regression in PLABQTL were tested for significance by comparing the partial sum of squares from the regression (1 df) to the residual sum of squares from the regression ANOVA for the entire model. The significance of the genetic effect is tested by performing an F-test on this ratio.
2.6 Joint Time-Related Mapping
The computer program JZmapqtl in the QTL Cartographer (Basten and Zeng, 2002) suite of mapping programs was used for mapping the joint likelihood QTL profile of a single trait measured across time. JZmapqtl is an extension of CIM (Zeng, 1993, 1994) and allows multiple traits to be analyzed simultaneously (Jiang and Zeng, 1995). Following Wu et al. (1999), JZmapqtl was used for time-related mapping
  • 31. (TRM). Instead of using separate, correlated traits as input into the JZmapqtl algorithm, a single trait measured across five time points was used (repeated measures approach). Stepwise regression with a forward-backward search (F-to-enter and F-to-exit set at p = 0.10) was used to identify cofactors. The Perl script Permute.pl (Basten, 2003) was used to obtain the joint and single trait genomic experimentwise thresholds.
2.7 Flow Cytometry Instrumentation, Settings, and Measurements
A Coulter Epics® MXL (Coulter Corp., Hialeah, FL) with an argon ion laser operating at 488 nm was used for flow cytometric analysis of the maize endosperm nuclei. Samples were vortexed immediately prior to analysis to prevent sedimentation. Forward angle light scatter (FALS) and right angle light scatter (RALS) data were collected in both linear (FS and SS, respectively) and logarithmic (FSLog and SSlog, respectively) modes. The photomultiplier tube 3 detector was set to collect nuclei fluorescence data in both area (integrated or total fluorescence) and peak (peak fluorescence measured using the AUX designation) mode. The area mode data were defined as FL3-Propidium Iodide (FL3 PI) because these data represent the fluorescence intensity signal from the propidium iodide nuclear stain. Because of the large variation in particle properties, signals were expressed with a logarithmic transformation (FL3 Log-PI). Nuclei were measured at a flow rate of approximately 25-200 nuclei per second. A minimum threshold (discriminator) to trigger event data collection was set using CRBCs. The AUX (FL3 Peak) channel was set to exclude events with fluorescence intensities that fell below the CRBC peak.
  • 32. Exclusion of doublets was performed by plotting AUX FL3 Log-PI versus FL3 Log-PI and excluding events with high integrated and low peak signals. In addition, debris was gated out based on cytograms of Log Forward Light Scatter (FSLog) versus FL3 Log-PI. Events that fell outside of the main CRBC and nuclei clusters were excluded from the analysis. For statistical analysis of the cytometric parameters (cell number and ploidy peak areas), gated signals were displayed as one-parameter histograms in logarithmic mode. The data were further gated using software (MODFIT LT 3.0) to eliminate nuclear debris from the analysis. The DNA amount, proportional to the fluorescence signal, is expressed as arbitrary C values in which the 1C value comprises the DNA content of the unreplicated haploid chromosome complement. Signals obtained from maize leaf tissues were used to adjust the gain settings so that signals from all intact nuclei were registered within the channel range.
At least 10,000 nuclei (20,000 non-debris events set to terminate the run) were analyzed for each sample and every determination was made in duplicate. MODFIT LT 3.0 (Verity Software House, Inc.) flow cytometry software was used to analyze the DNA ploidy histograms. Data obtained from this program were subjected to statistical analysis using the SAS program and further processed for QTL mapping purposes. Nuclei number was determined by flow cytometry. A known concentration of CRBCs was added to each of the nuclei preparation samples prior to the flow cytometric analysis.
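Because a known number of CRBCs is spiked into every preparation, the CRBCs act as an internal counting standard: the ratio of nuclei events to CRBC events seen by the cytometer scales the known CRBC count up to the total number of nuclei in the tube. The Python sketch below illustrates that arithmetic under stated assumptions. The function name, the optional division by the number of bulked kernels, and the event counts in the usage note are hypothetical and are not taken from the thesis data.

```python
def total_nuclei_from_crbc(crbc_added, nuclei_events, crbc_events, n_kernels=1):
    """Estimate nuclei number using spiked-in CRBCs as a counting standard.

    crbc_added    : known number of CRBCs added to the preparation
    nuclei_events : gated endosperm nuclei events counted by the cytometer
    crbc_events   : gated CRBC events counted in the same run
    n_kernels     : kernels bulked in the preparation; dividing by this to
                    get a per-endosperm value is an assumption of this sketch
    """
    if crbc_events <= 0:
        raise ValueError("at least one CRBC event is needed as a reference")
    if n_kernels <= 0:
        raise ValueError("n_kernels must be positive")
    # Nuclei in the tube = CRBCs added * (nuclei events / CRBC events)
    nuclei_in_tube = crbc_added * nuclei_events / crbc_events
    return nuclei_in_tube / n_kernels
```

For example, with 10,000 CRBCs added and a hypothetical run recording 12,000 nuclei events and 400 CRBC events, `total_nuclei_from_crbc(10_000, 12_000, 400)` gives 300,000 nuclei in the tube, and passing `n_kernels=6` gives 50,000 per endosperm.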
  • 33. 2.8 Modeling of the Cell Cycle
Single Gaussian distributions were fit to the CRBC peak and to each DNA C peak using the flow cytometry software MODFIT LT 3.0. The model is built upon a series of Gaussian curves, one fit to each DNA ploidy peak [1]. Non-linear regression using the Marquardt Compromise method was used to minimize the mean square error (MSE) for the entire distribution space (Bagwell, 1993). The majority of debris was gated out from the list-mode data before entering the modeling software. Based on PI fluorescence data alone, it is difficult to delimit the S-phase from the main ploidy peaks in flow cytograms derived from endosperm nuclei analysis. In this study, the S-phase was not separately analyzed. Instead, the main ploidy peaks were fit using normal curves which include the unknown fraction of the S-phase.
2.9 Data Calculation
For each experimental set, nuclei from leaf tissue or embryo samples were measured first to determine the location of 3C in terms of the channel number (between 2C and 4C nuclei from either leaf or embryo samples). The total nuclei number per endosperm was calculated according to the ratio of CRBCs to total nuclei number (Schweizer, 1992). From this total number, plus the proportion of each nuclei population related to DNA content (3C, 6C, 12C, 24C, 48C, and 96C), the actual 3C nuclei number, 6C nuclei number, and so on up to 96C (the highest peak DNA content in nuclei regularly observed for the four inbred parents used in this study) were calculated. The equations for these calculations are the following:
[1] A model to fit the endoreduplication cell cycle ploidy pattern with the MODFIT LT 3.0 program was built in collaboration with Ben Hunsberger and Mark Munson of Verity Software (Topsham, Maine).
  • 34.
\[ N_{\text{total}} = N_{\text{CRBC added}} \times \left( \frac{N_{\text{total}}}{N_{\text{CRBC}}} \right)_{\text{FCM}} \tag{2.8} \]
\[ N_{3C} = N_{\text{total}} \times \left( \frac{N_{3C}}{N_{\text{total}}} \right)_{\text{FCM}} \tag{2.9} \]
where
    • N_total = total nuclei number
    • N_CRBC = the number of CRBCs
    • N_3C = 3C nuclei number
    • N_6C = 6C nuclei number
    • ... N_384C = 384C nuclei number
and the subscript FCM marks a ratio measured by flow cytometry.
Appendix B on page 129 includes an example of the graphical output from the MODFIT LT 3.0 model.
2.10 Statistical Analysis of the Flow Cytometry Data
The analysis of variance of the cytological data from each mapping population was calculated with SAS (SAS Institute, Inc., 2003). The variance was partitioned to better estimate the portion of genetic variation by removing the variation due to other factors, such as block effects, sampling time, and weather covariates. For the estimation of heritability, all effects, including the genotype, were considered to be random. By definition, the heritability equations call for the genotype effect to be set as a random factor. However, for the estimation of trait means to be included in the QTL mapping algorithms, a different ANOVA structure was used. The randomized
  • 35. complete block analysis (mixed model) consisted of both fixed effects (RILs or genotypes, weather covariate, and sampling time) and random effects (years, blocks nested in year, year × genotype, and error). The RIL genotype effect was classified as a fixed effect because each RIL is highly inbred and can be readily and consistently multiplied for repeated experimentation across locations and years. In addition, each RIL genotype can be considered as potentially valuable genetic material for further study based on the results of QTL analysis. For each ANOVA, residual plots were made to determine if the residuals were normally distributed. Endosperm cell number data were log transformed for data analysis and then back-transformed to report results. For each trait, an F-test was performed to determine the significance of the year effect. When the F-tests were significant (p < 0.05), the two years of data were analyzed separately. The effect of year on genotype rank was tested by examining the genotype × year interaction term.
Analysis of covariance was used to estimate the effects of sampling time and the environment (GDU and precipitation) on the maize endosperm growth characteristics (cell number and mitotic/endoreduplication components). To minimize the variance in cell number and mean ploidy attributed to sampling time, these traits were standardized (when the particular covariate was significant) to the mean GDU and precipitation values. If the environmental cofactor was significant, the least squares means were adjusted and subsequently used in the genetic mapping programs.
The SAS code for the estimation of trait means for the genotype term for each individual RIL in the single environment design is presented in Appendix A, section A.1.4 on page 112. The SAS code for the estimation of trait means for the genotype term for each individual RIL in the multiple environment design is presented in Appendix A, section A.2.2 on page 115.
  • 36. 2.11 Kernel Protein and Starch Determinations
The major carbon sinks in the maize kernel are storage products: starch (endosperm), protein (endosperm), and oil (embryo). Mature kernel tissue contains approximately 66% starch and 15% protein on a dry weight basis (Doehlert, 1990). Kernel protein and starch content of mature kernels were measured to compare these traits with the cytological measurements (cell number and extent of endoreduplication) taken during an earlier phase of endosperm development. Approximately 23 g of seed from each replicate line of the 1997 T232 X CM37 RIL population was ground to pass through a 1-mm screen and dried at 60 °C for 24 h. The ground meal was placed in a secure container (large pill box) and tumbled for 45 min to homogenize each sample. Ground samples were analyzed using a Foss North America Model 6500 Near Infrared Reflective Spectrophotometer (NIRS). NIRS was used to estimate crude protein levels of the 1997 T232 X CM37 RIL population. The percentage total crude protein and starch content were estimated using a commercial NIRS corn grain equation (idcgrfe.equ, Infrasoft International). The commercial corn grain equation was monitored by measuring micro-Kjeldahl crude protein levels of 27 experimental lines from 1997. Samples were measured in duplicate and a nitrogen-to-protein conversion factor of 5.70 was used to determine crude protein (5.70 was selected to correct for non-protein nitrogen assuming a 17.5% nitrogen content of maize kernel protein, 1/0.175 ≈ 5.7). Micro-Kjeldahl determined crude protein levels were regressed on NIRS determined crude protein levels in order to check the calibration of the commercial corn grain equation. The percentage total starch content was determined directly from the commercial equations.
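The calibration check above has two arithmetic steps: converting Kjeldahl nitrogen to crude protein with the 5.70 factor, and regressing the Kjeldahl values on the NIRS values. A minimal Python sketch of both steps follows. The sample values are hypothetical (not thesis data), and the helper names are illustrative only. A slope near 1 and an intercept near 0 would suggest that the commercial equation is reasonably calibrated.

```python
N_TO_PROTEIN = 5.70  # nitrogen-to-protein conversion factor stated in the text


def crude_protein_pct(nitrogen_pct, factor=N_TO_PROTEIN):
    """Convert percent Kjeldahl nitrogen to percent crude protein."""
    return nitrogen_pct * factor


def least_squares(x, y):
    """Return (slope, intercept) for the simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical duplicate-averaged values for four lines (illustration only).
kjeldahl_nitrogen = [1.6, 1.8, 2.0, 2.2]  # percent N
nirs_protein = [9.0, 10.4, 11.3, 12.6]    # percent crude protein from NIRS

kjeldahl_protein = [crude_protein_pct(n) for n in kjeldahl_nitrogen]
slope, intercept = least_squares(nirs_protein, kjeldahl_protein)
print(round(slope, 2), round(intercept, 2))
```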
  • 37. Chapter 3
Results
3.1 Endosperm Cytological Trait Histograms
3.1.1 T232 X CM37 RIL
Trait means of the parental inbreds T232 and CM37 differed (p < 0.05) for mean endosperm cell number (MECN) and mean endosperm ploidy (MEP) when measured at 18 DAP in St. Paul, MN for both the 1996 and 1997 field seasons. The trait distributions for MECN and MEP of the T232 X CM37 recombinant inbred lines (RILs), including markers for the parental means, are shown in Figures 3.1a and b and 3.2a and b, respectively. The F1 values are included in the 1997 histogram. As a measure of the variability of the measurements, we also report the standard error of the mean. Transgressive segregation was detected in the population of 48 RILs for the two cytological characteristics measured in both years (Figures 3.1a and b and 3.2a and b). The phenotypic values from the 48 RILs from the T232 X CM37 mapping population measured for MECN at 18 DAP in St. Paul, MN 1996 (Figure
  • 38. 3.1a) and 1997 (Figure 3.1b) approximate normal distributions with Shapiro-Wilk test statistics of 0.9808 (p = 0.175) and 0.9809 (p = 0.2766), respectively. The cell number for the endosperm samples collected in 1996 ranged from 5.24 × 10^5 cells to 11.59 × 10^5 cells with a mean endosperm cell number for the population at 18 DAP of 7.86 × 10^5 cells. The cell number for the endosperm samples collected in 1997 ranged from 4.00 × 10^5 cells to 12.93 × 10^5 cells with a mean endosperm cell number for the population at 18 DAP of 9.44 × 10^5 cells. The phenotypic values from the 48 RILs from the T232 X CM37 mapping population measured for MEP at 18 DAP in St. Paul, MN in 1996 and 1997 deviate from normality with Shapiro-Wilk test statistics of 0.9248 (p < 0.0001) and 0.9673 (p = 0.039), respectively. Normal probability plots of ordered data vs. rankits for these two data sets reveal fairly straight lines, indicating that the deviation from normality is not extreme in either case. The mean ploidy for the endosperm samples collected in 1996 ranged from 9.96 C to 19.22 C with a population mean of 13.45 C. The mean ploidy for the endosperm samples collected in 1997 ranged from 5.17 C to 14.97 C with a population mean of 9.30 C. Figure 3.3 displays two box plots that represent the mean endosperm ploidy data from the 48 T232 X CM37 RILs collected in 1996 (a) and 1997 (b) at the 14 to 24 DAP developmental stages. The observed endosperm mean nuclear ploidy trait data for the 48 T232 X CM37 RILs over the 10-day period (14, 16, 18, 20, and 24 DAP) measured in St. Paul, MN (1996) are presented in Figure 3.4. The corresponding data measured in St. Paul, MN (1997) are presented in Figure 3.5. In addition, 100 kernel weight data were obtained in both years (Figures 3.6a and b).
The traits percentage total kernel protein and kernel starch content from the 1997 harvest are shown in Figures 3.7a and b.
  • 39. [Figure 3.1, panels (a) and (b). x-axis: Endosperm Cell Number at 18 DAP; y-axis: Frequency. Parental and F1 means ± SE marked on the histograms: (a) 1996: T232 7.03 × 10^5 ± 5.0 × 10^4, CM37 9.45 × 10^5 ± 8.6 × 10^4; (b) 1997: T232 7.8 × 10^5 ± 4.5 × 10^4, CM37 8.9 × 10^5 ± 6.2 × 10^4, F1 12.2 × 10^5 ± 9.2 × 10^4.]
Figure 3.1: Histograms displaying the T232 X CM37 RIL frequency distribution of the trait mean endosperm cell number (MECN) at 18 DAP for the (a) 1996 and (b) 1997 field seasons (St. Paul, MN). The trait values represent the means of two replications. A bulk preparation of 3-6 endosperm samples from a single ear represents a replicate. The absolute frequencies are shown along the y-axis. The trait values are shown on the x-axis.
  • 46. 3.1.2 Tx303 X CO159 IF2 and CO159 X Tx303 RIL
Characteristic trait values of the parental inbreds CO159 and Tx303 differed (p < 0.05) for MECN and MEP measured at 16 DAP in the St. Paul, MN and Columbia, MO locations during the 1996 field season. The trait distributions for MECN and MEP of the CO159 X Tx303 immortalized F2 (IF2), including markers for the parental means, are shown in Figures 3.8 and 3.9, respectively. Phenotypic distributions of the CO159 X Tx303 RIL populations grown in 1996 and 1997 are not shown. Transgressive segregation was detected in the population of 112 IF2 lines for the two cytological characteristics measured in both locations: MECN (Figure 3.8) and MEP (Figure 3.9). The phenotypic values from the 112 IF2 lines from the CO159 X Tx303 mapping population measured for MECN at 16 DAP in St. Paul, MN (Figure 3.8a) and Columbia, MO (Figure 3.8b) during the 1996 field season display non-normal unimodal distributions. The IF2 line data for MECN deviated from normality, in the direction of positive skewness, in both locations (Columbia, p = 0.0034; St. Paul, p = 0.007), calculated on the basis of the Shapiro-Wilk statistic (Shapiro and Wilk, 1965). However, plots of ordered data vs. rankits (expected order of the data assuming the sample was from a normal population) revealed straight lines, indicating that the deviation from normality was not extreme. A log transformation of MECN from both locations normalized the data distribution based on the Shapiro-Wilk test statistic in both cases (p > 0.05). It should be noted here that quantitative trait values follow a mixture of distributions that approximates a normal distribution as increasing numbers of loci contribute to the total effect. For example, the non-normality of quantitative trait distributions is expected in the presence of a small number of segregating QTLs of major effect (Doerge and Churchill,
  • 47. 1996). One possibility is that the log-normal distribution describes the actual distribution of this trait for this mapping population well. Another possibility is that the non-normal distribution of the raw cell number data reflects the limited sampling due to the small population size. The cell number for the endosperm samples collected from the IF2 population grown at the St. Paul, MN location in 1996 ranged from 3.49 × 10^5 to 12.8 × 10^5 with a population mean of 6.50 × 10^5 (Figure 3.8a). The cell number for the endosperm samples collected from the Columbia, MO location in 1996 ranged from 5.43 × 10^5 to 13.63 × 10^5 with a mean endosperm cell number for the population at 16 DAP of 8.75 × 10^5 (Figure 3.8b). The IF2 population MEP at 16 DAP (St. Paul location) ranged from 7.2 C to 13.5 C with a population mean of 9.95 C (Figure 3.9a). The IF2 population MEP at 16 DAP (Columbia, MO location) ranged from 9.43 C to 15.60 C with a population mean of 12.03 C (Figure 3.9b). The phenotypic values from the 41 RILs from the CO159 X Tx303 mapping population measured for MECN at 16 DAP in St. Paul, MN 1996 deviate from normality with a Shapiro-Wilk test statistic of 0.9174 (p < 0.001). The cell number for the endosperm samples collected in 1996 ranged from 3.62 × 10^5 cells to 16.94 × 10^5 cells with a mean endosperm cell number for the population at 16 DAP of 8.72 × 10^5 cells. The phenotypic values from the 41 RILs from the CO159 X Tx303 mapping population measured for MEP at 16 DAP in St. Paul, MN 1996 deviate from normality with a Shapiro-Wilk test statistic of 0.9520 (p = 0.0052). Plots of ordered data vs. rankits reveal fairly straight lines, indicating that the deviation from normality is not extreme. The mean ploidy for the endosperm samples collected in 1996 ranged from 7.34 C to 19.98 C with a population mean of 11.19 C. The phenotypic values from the 41 RILs from the CO159 X Tx303 mapping population
  • 50. measured for MEP at 16 DAP in St. Paul, MN 1997 deviate from normality with a Shapiro-Wilk test statistic of 0.9237 (p < 0.0001). Plots of ordered data vs. rankits reveal fairly straight lines, indicating that the deviation from normality is not extreme. The mean ploidy for the endosperm samples collected in 1997 ranged from 4.65 C to 11.91 C with a population mean of 7.42 C.
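The rankit plots referred to in this section pair each ordered observation with the value expected for that rank in a normal sample; a nearly straight line (a correlation close to 1) indicates only a mild departure from normality. The Python sketch below illustrates the idea. It approximates the rankits with Blom's formula, which is an assumption made here for simplicity and not necessarily what the statistical software used in the thesis computes. The sample values are hypothetical.

```python
from statistics import NormalDist


def rankits(n):
    """Approximate expected normal order statistics for a sample of size n.

    Uses Blom's plotting positions (i - 0.375) / (n + 0.25); this is an
    approximation chosen for the sketch, not an exact rankit table.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def probability_plot_r(values):
    """Correlation between the ordered data and their approximate rankits."""
    ordered = sorted(values)
    return pearson_r(rankits(len(ordered)), ordered)


# Hypothetical mean ploidy values in C units (illustration only).
ploidy = [9.3, 10.1, 11.2, 11.9, 12.8, 13.4, 14.6]
r = probability_plot_r(ploidy)
print(round(r, 3))
```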
  • 51. 3.2 Multi-Environment ANOVA Results
Three traits from the T232 X CM37 RIL population, the two cytological traits and one agronomic trait (100 kernel weight), were analyzed across years (1996 and 1997). The randomized complete block analysis (mixed model) consisted of both fixed effects (RILs or genotypes) and random effects (years, blocks nested in year, year × genotype, and error). Analysis was performed using the SAS/STAT PROC MIXED program (SAS Institute, Inc., 2003). The MECN genotype effect (p = 0.0540) was not significantly different from zero at α = 0.05. The REML fixed effect estimate for the genotype effect was significantly greater than zero for MEP and 100 kernel weight (p < 0.0001). Multi-environment REML ANOVA tables for MEP and kernel weight are presented in Tables 3.1 and 3.2, respectively. Neither the year nor the year × genotype term was detected (α = 0.05) for any trait, with the exception of the year × genotype term for 100 kernel weight (p = 0.0021).
  • 53. Table 3.2: Multi-environment covariance parameter estimate (REML) and fixed effect solution table for T232 X CM37 RIL trait mean 100 kernel weight (g).

Covariance parameter   Estimate   Standard Error   Z value   Pr > Z
Year                   0.03       0.39             0.07      p = 0.4702
Year × Genotype        6.78       2.36             2.87      p = 0.0021
Error                  7.61       1.16             6.56      p < 0.0001

Fixed Effect   Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept      36.29      2.3              129   15.75     p < 0.0001

Type III Test
Fixed Effect   Num. DF   Den. DF   F Value   Pr > F
Genotype       46        129       2.52      p < 0.0001

The Rep(Year) variance component estimate was negative and was removed from the model.
  • 54. 3.3 Single-Environment ANOVA Results
The three weather covariates (GDU; PTS, precipitation from Pollination To Sampling; and TAP, Total Accumulated Precipitation from planting to the sampling period) were added as fixed factors (1 DF) to each cytological trait ANOVA model for the three mapping populations. None of the covariates tested in the T232 X CM37 cytological trait models for the 1996 and 1997 data were detected at α = 0.05. Tests for the genotype term in the T232 X CM37 cytological trait models using single-year REML ANOVA models identified a genotype effect for MEP (p = 0.0008, 1996; p = 0.0310, 1997) but not (α = 0.05) for MECN (p = 0.1370, 1996; p = 0.0937, 1997). The genotype effect for the MECN trait at 16 DAP (1996) in the CO159 X Tx303 RIL mapping population was detected at p = 0.0004. Neither the GDU nor the precipitation covariates were detected in ANCOVA models. The genotype effect term for MEP (16 DAP) was highly significant (p < 0.0001) in 1996. For the trait MEP, the precipitation PTS covariate was significant (p = 0.0050) in a REML ANCOVA model (a positive covariate relationship was found between PTS and the trait MEP). The interaction between the covariate and the genotype term was not significant (p = 0.6330) in the ANCOVA model, suggesting that, in general, the slopes on the PTS covariate do not differ among genotypes. Due to the nature of the IF2 experimental design (bulked kernels, combined replicates), it was not possible to test the genotype term (a calculation which would permit the estimation of an environmental component from the total phenotypic variance) for either cytological trait. Genetic mapping proceeded without these tests as if the bulked IF2 phenotypic data represented, when pooled within lines, an F2 population. Simple linear regression was used to test and measure the effect of the three covariates on the IF2 cytological traits for the IF2 populations grown in both Columbia,
  • 55. MO and St. Paul, MN in 1996. An association between GDUs and log endosperm cell number for the population grown in Columbia, MO was detected. The heat unit data accounted for approximately 9.0% of the trait variation (R = +0.28, R^2 = 9.0%, p < 0.0001). Precipitation data (PTS) also served as a significant cofactor for the endosperm cell number trait measured from kernel samples collected from the Columbia, MO location. However, in a multiple regression model containing both covariates, only the GDU term was detected. There was no relationship between log endosperm cell number and GDU for the population grown in St. Paul (p = 0.579). The regression on all three covariates with both sets of MEP data indicated no relationships between the trait and the cofactors (α = 0.05). The population segregates for days to anthesis, and the cytological characteristics are known to be influenced by phenology factors such as accumulated GDUs. The covariate correction allows for an equitable comparison of cytological characteristics of samples collected on different dates and has the potential to increase precision. Covariate analysis is based on the assumption that there is a constant regression relationship among the different treatments (in this case, genotypes). This assumption can be tested by examining the heterogeneity of slopes and was checked by using PROC GLM in SAS. The significance of the genotype × GDU term tests this aspect of the ANCOVA analysis. However, it is not possible to test for the heterogeneity of slopes using the bulked IF2 data, so the adjustments were performed without this check.
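The covariate correction described above can be illustrated as a two-step calculation: regress the trait on the covariate, then shift every observation to the value it would be expected to take at the mean covariate level. The Python sketch below assumes a single common slope for all genotypes, which is the same homogeneity-of-slopes assumption the text notes could not be checked for the bulked IF2 data. The GDU and log cell number values are hypothetical, and the sketch does not reproduce the SAS analysis used in the thesis.

```python
def fit_line(x, y):
    """Return (slope, intercept) for the simple regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y):
    """Proportion of the variation in y explained by the regression on x."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def adjust_to_mean_covariate(trait, covariate):
    """Standardize each trait value to the mean covariate level (common slope)."""
    slope, _ = fit_line(covariate, trait)
    mean_cov = sum(covariate) / len(covariate)
    return [t - slope * (c - mean_cov) for t, c in zip(trait, covariate)]


# Hypothetical values for five lines (illustration only).
gdu = [310.0, 325.0, 340.0, 355.0, 370.0]  # accumulated growing degree units
log_cells = [5.80, 5.86, 5.85, 5.93, 5.96]  # log10 endosperm cell number

adjusted = adjust_to_mean_covariate(log_cells, gdu)
print(round(r_squared(gdu, log_cells), 2))
print([round(v, 3) for v in adjusted])
```

After the adjustment the regression of the adjusted trait on the covariate has a slope of zero while the overall trait mean is unchanged, which is what makes samples collected on different dates comparable.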
  • 56. 3.4 Flow Cytometric Analysis
Figure 3.10 shows a panel of endosperm nuclei histograms (fitted histogram data) from the inbred parents T232 and CM37 and from the F1, sampled at 18 DAP. ModFit LT 3.0 was used to model the CRBC and endosperm nuclei ploidy peaks (Gaussian distributions) using non-linear regression methods. Using multi-parameter analysis, two gates were set to eliminate both debris and nuclei doublets. The peak areas were used to calculate the number of nuclei (cells) in each ploidy class.
  • 58. Figure 3.11 shows a panel of endosperm nuclei histograms from the inbred parent CO159 (A and B), the F1 (C and D), and inbred parent Tx303 (E and F) sampled at 16 DAP. The left-hand side of the panel contains the raw histogram data (A,C,E) and the right-hand side of the panel contains the fitted histogram data (B,D,F).
  • 59. [Figure 3.11, panels A-F. x-axis: Channels (FL3 LOG-PI); y-axis: Number. Rows: CO159, F1, Tx303. Labeled peaks: CRBC, 3C, 6C, 12C, 24C, 48C, 96C.]
Figure 3.11: Flow cytometry histograms of parental (CO159 and Tx303) and F1 (CO159 X Tx303) endosperm nuclei together with CRBCs (St. Paul, MN 1996) showing endoreduplication distribution differences at 16 DAP. The left panel (A,C,E) represents the raw histogram data for CO159, the F1, and Tx303, respectively. The fitted data (colored histograms) are presented in (B,D,F).
  • 60. 3.5 Heritability
Low to moderate [1] heritability estimates were found for the cytological traits in all three mapping populations. Broad-sense heritability ($H^2$) estimates were made by dividing the genotypic variance ($\hat{\sigma}^2_G$) by the phenotypic variance ($\hat{\sigma}^2_P$). The phenotypic variance, also known as the total variance, is composed of the total genotypic variance (additive, dominance, and epistatic variance) and the environmental variance. For RILs, the component of total genetic variance ($\hat{\sigma}^2_G$) more directly estimates additive genetic variance ($\hat{\sigma}^2_A$) since the influence from dominance effects in this highly inbred material is absent. If the epistatic variance ($\hat{\sigma}^2_I$) and maternal variance ($\hat{\sigma}^2_M$) components are small, then the broad-sense heritability estimate approaches the narrow-sense heritability ($h^2$), which is defined as the ratio $\hat{\sigma}^2_A / \hat{\sigma}^2_P$.
Epistasis occurs between two (or more) loci when the effects of alleles at one locus depend on what alleles are present at the other loci. Generally, QTL analyses have detected epistasis by the presence of a significant interaction term between two loci from a two-factor ANOVA. Cheverud and Routman (1995) make a distinction between this "statistically" defined epistasis and what they call "physiological" epistasis. Physiological epistasis more closely estimates the effect of genetic interactions on the physiology and development of the organism. To detect physiological epistasis, Cheverud and Routman (1995) partition the genotypic/phenotypic data into three categories: raw genotypic data, non-epistatic values (each independent of the alternate locus genotype), and epistatic values (deviations of the two-locus genotypic values from the non-epistatic values) for each of the nine possible two-locus genotypic classes (F2 example).
Although statistics are obviously used to detect and define both types of epistatic interactions, the distinction is important because physiological epistasis can make substantial contributions to the additive, dominance, and interaction variance
[1] Low to moderate heritability here is defined as an approximate range from 0.0 to 0.50.
  • 61. components (Cheverud and Routman, 1995; Doebley et al., 1995; Eshed and Zamir, 1996; Lark et al., 1995; Li et al., 1997; Lefebvre and Palloix, 1996). The methods that detect statistical epistasis neglect contributions to the additive and dominance values and variance components. Unfortunately, epistasis was not estimated in this study due to the small size of the populations used for genetic mapping.
Heritability estimates for all traits measured for the T232 X CM37 RIL population are presented on an entry mean basis (ANOVA and REML). Heritabilities calculated across years on an entry mean basis ranged from 0.23 for MECN (18 DAP) to 0.79 for days to 50% pollen shed (REML method). Heritabilities calculated on an entry mean basis across years (equation (2.2)) are given in Table 3.3. Heritabilities from single year data on an entry mean basis ranged from a low of 0.16 for MTC at 18 DAP (1996) to a high of 0.85 for 100 kernel weight (1997). Heritabilities on a single year, entry mean basis (equation (2.3)) are listed in Table 3.3.
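As a worked illustration of the ratio $H^2 = \hat{\sigma}^2_G / \hat{\sigma}^2_P$ on an entry mean basis, the Python sketch below uses the common textbook form in which the genotype × year and error variances are divided by the number of years and by the number of plots per entry, respectively. This form is an assumption: equations (2.2) and (2.3) are not reproduced in this section and may differ in detail. The variance components shown are hypothetical and are not estimates from the thesis.

```python
def entry_mean_h2(var_g, var_gy, var_e, n_years, n_reps):
    """Broad-sense heritability on an entry mean basis (textbook form, assumed).

    var_g  : genotypic variance
    var_gy : genotype x year interaction variance (use 0.0 for a single year)
    var_e  : error variance
    """
    var_p = var_g + var_gy / n_years + var_e / (n_reps * n_years)
    return var_g / var_p


# Hypothetical variance components (illustration only).
h2_combined = entry_mean_h2(var_g=4.0, var_gy=2.0, var_e=8.0, n_years=2, n_reps=2)
h2_single = entry_mean_h2(var_g=4.0, var_gy=0.0, var_e=8.0, n_years=1, n_reps=2)

print(round(h2_combined, 2))  # 0.57
print(round(h2_single, 2))    # 0.5
```

In this form, adding years or replicates shrinks the non-genetic part of the phenotypic variance of an entry mean, so the same variance components give a higher heritability when entries are evaluated more thoroughly.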
  • 62. Table 3.3: Multi-year and single year heritability estimates - T232 X CM37 RIL population. NS indicates not significant - the confidence interval for the heritability estimate included the value zero.

                                           ANOVA                       REML
                                           Entry Mean Basis            Entry Mean Basis
Trait                                      H2 with Exact 95% CI        H2 ± SE

Combined Years (1996 and 1997)
Mean Endosperm Cell Number (18 DAP)        NS                          0.23 ± 0.14
Mean Endosperm Nuclear Ploidy (18 DAP)     0.53 (0.23, 0.71)           0.43 ± 0.12
Mean Total C (18 DAP)                      0.41 (0.04, 0.64)           0.27 ± 0.10
Kernel Weight                              0.47 (0.13, 0.67)           0.47 ± 0.13
Days to 50% Pollen Shed                    0.74 (0.57, 0.84)           0.79 ± 0.05

1996
100 Kernel Weight                          0.68 (0.47, 0.80)           0.60 ± 0.11
Total Endosperm Cell Number (18 DAP)       NS                          0.20 ± 0.12
Mean Endosperm Nuclear Ploidy (18 DAP)     0.60 (0.34, 0.75)           0.51 ± 0.12
Mean Total C (18 DAP)                      NS                          0.16 ± 0.12

1997
100 Kernel Weight                          0.89 (0.83, 0.94)           0.85 ± 0.04
Total Kernel Protein Content               0.86 (0.77, 0.91)           0.82 ± 0.054
Total Kernel Starch Content                0.87 (0.78, 0.92)           0.83 ± 0.05
Total Endosperm Cell Number (18 DAP)       NS                          0.26 ± 0.13
Mean Endosperm Nuclear Ploidy (18 DAP)     0.46 (0.11, 0.67)           0.40 ± 0.16
Mean Total C (18 DAP)                      NS                          0.18 ± 0.14
  • 63. For the cell cycle components estimated using flow cytometry, heritability estimates on a bulked-sample basis (Holland et al., 2001), calculated from the IF2 data from St. Paul, MN and Columbia, MO using equation (2.4), are listed in Table 3.4. Because the bulked sample heritability analysis indicated a large (p < 0.001) location effect (represented by the bulked replication term) for all three traits, the data were analyzed separately for the two locations. Heritability estimates for the CO159 X Tx303 RIL traits measured in 1996 are presented in Table 3.5.
  • 66. 3.6 Genetic and Phenotypic Correlations Between Traits
Genetic (REML MANOVA, equation (2.5)) and phenotypic correlations (REML MANOVA, equation (2.6)) for 1996 and 1997 T232 X CM37 RIL trait data are listed in Table 3.6. In general, the GLM and REML estimates are similar, so only the REML results are presented. Genetic correlations from single year data ranged from a low of −0.65 ± 0.53 for the traits kernel protein percentage and MECN at 18 DAP (1997) to a high of 0.99 ± 0.52 for the traits 100 kernel weight and MECN at 18 DAP (1996). The standard errors for the cytological trait correlations are high relative to the standard errors for the genetic/phenotypic correlations estimated for the agronomic traits. In addition, standard errors for the genetic/phenotypic correlations are also much higher than the standard errors of the heritability estimates. This is because correlations are multivariate in nature and require substantially larger population sizes to achieve comparable standard error estimates (Lynch and Walsh, 1998).
The traits endosperm cell number (log transformation) and mean ploidy (16 DAP) from the Tx303 X CO159 endosperm samples collected in Columbia, MO were negatively correlated (R = −0.32, p < 0.0001; see Figure 3.12). The traits log MECN and MEP from the St. Paul location data (data not shown) were not related (R = −0.12, p = 0.2607).
The correlation between MEP and MECN (1996) was estimated from the CO159 X Tx303 RIL mapping population data (16 DAP). The PROC MIXED method for the computation of the genetic correlation failed to converge, likely due to the small sample size of this mapping population. The genetic correlation between MEP and MECN was estimated to be −0.22 ± 0.14 using the PROC GLM method developed by Holland et al. (2001). The phenotypic correlation was estimated to be −0.20 ± 0.13 on an
  • 68. [Figure 3.12. x-axis: Mean Endosperm Nuclear Ploidy; y-axis: Log Endosperm Cell Number. Annotation: R^2 = 0.10, R = −0.32, p < 0.0001.]
Figure 3.12: Regression of log endosperm cell number on mean nuclear ploidy from the immortalized Tx303 X CO159 F2 population endosperm samples collected at 16 DAP in Columbia, MO 1996.
  • 69. Table 3.6: Genetic and phenotypic correlations for T232 X CM37 RIL traits (St. Paul, MN 1996 and 1997). Multivariate mixed model analysis (REML). The traits kernel starch total and kernel protein total were derived by multiplying total kernel starch and protein percentage with the 100 kernel weight trait. NS indicates not significant - the confidence interval for the correlation estimate included the value of zero.

Trait Combination                                            Year   Genotypic Correlation ± SE   Phenotypic Correlation ± SE
Endosperm Cell Number / Mean Ploidy (18 DAP)                 1996   −0.63 ± 0.29                 −0.57 ± 0.28
                                                             1997   NS                           NS
Kernel Weight / Mean Ploidy (18 DAP)                         1996   −0.47 ± 0.27                 −0.36 ± 0.22
                                                             1997   NS                           NS
Kernel Weight / Endosperm Cell Number (18 DAP)               1996   0.99 ± 0.52                  0.44 ± 0.26
                                                             1997   NS                           NS
Kernel Weight / Mean Total C (18 DAP)                        1996   0.65 ± 0.46                  0.44 ± 0.26
                                                             1997   NS                           0.45 ± 0.27
Mean Ploidy (18 DAP) / Kernel Protein Percentage             1997   −0.39 ± 0.27                 −0.32 ± 0.22
Endosperm Cell Number / Kernel Protein Percentage            1997   −0.65 ± 0.53                 −0.43 ± 0.26
Mean Ploidy (18 DAP) / Kernel Starch Percentage              1997   0.53 ± 0.27                  0.46 ± 0.22
Endosperm Cell Number (18 DAP) / Kernel Starch Percentage    1997   NS                           NS
Mean Total C (18 DAP) / Kernel Protein Total                 1997   NS                           NS
Mean Total C (18 DAP) / Kernel Starch Total                  1997   NS                           0.48 ± 0.27
  • 70. 3.7 Quantitative Trait Analysis
Figure 3.13 presents a genome-wide summary of the MECN and MEP QTLs identified from all three mapping populations. QTL analysis was performed using the phenotypic data from the individual years (1996 and 1997). Although the year and year × genotype terms from the multi-environment ANOVA were not different from 0 for either of the cytological traits in the T232 X CM37 mapping population, the cytological phenotypic data were not pooled across years for QTL analysis. By not pooling the data, differences in QTL location and effect per year may be detected. The power to detect QTLs is increased by CIM due to the reduction in the error variance when significant marker cofactors are present in the QTL model. In addition, although the study focused on the 18 DAP MEP data for heritability and genetic/phenotypic correlation estimations, the full range of MEP data (14 to 24 DAP) was used for QTL detection (Joint CIM and CIM from individual DAP stages). QTL mapping for both the Tx303 X CO159 IF2 and CO159 X Tx303 RIL populations is based on individual year (location) data. Initial single marker analysis using linear regression identified the most likely major QTLs and additional potential cofactors for composite interval mapping (CIM). Linear regression and conventional interval mapping (IM) results are not reported because, in many cases, QTLs were not identified until additional marker cofactors were included in the mapping model.
  • 71. [Figure 3.13. Linkage maps for chromosomes 1 to 10 showing marker names, map positions, bin numbers, and QTL boxes. Legend: Endosperm Cell Number QTLs; Endosperm Mean Ploidy QTLs.]
Figure 3.13: Genetic map of the endosperm cell number and mean ploidy QTLs identified from the three mapping populations. Linkage distances were calculated with Mapmaker QTL and are based on the T232 X CM37 mapping population. The box height adjacent to the chromosome represents an approximate 3 LOD support interval for the identified QTL. QTL regions are marked by white boxes (endosperm cell number) and black boxes (endosperm mean ploidy). The percentage phenotypic variance explained, the additive value (A), and the maximum LOD score for each QTL are listed within the box. The population ID is listed below the box.
  • 72. 3.7.1 Endosperm Cell Number QTLs

Table 3.7 includes all of the MECN QTLs identified by CIM from the three mapping populations. The polygenic nature of the trait is evident from the putative QTLs identified across the genome. No common endosperm cell number QTLs were identified across years in the T232 X CM37 RIL population. This may indicate QTL × environment interaction, since the magnitude of a particular QTL effect can change depending on environmental conditions. An alternative explanation is that the sample size for this population is small, so the power to identify all but the most major QTLs is low. Thus, only the most major QTLs for a given year are identified by these QTL analyses. Both major and minor QTLs can be expected to be missed because of QTL × genotype interaction, the small sample size, and sampling error.

Composite interval mapping identified one QTL that significantly influenced MECN in kernel samples collected from the T232 X CM37 mapping population in St. Paul in 1996. The QTL identified on chromosome 2 is marked by ici99 (bin 2.06) and has an average negative effect of allele substitution of 81 × 10³ endosperm cells (T232 direction). The QTL region identified on chromosome 2 met the p < 0.01 experimentwise threshold. The final QTL model, including this one QTL region, accounted for 24.2 ± 10.8% of the phenotypic variance and 89.5 ± 39.9% of the genotypic variance.

Two MECN QTLs were identified using the 1997 data. Both QTLs exceeded the α = 0.1 experimentwise threshold (LOD 3.26) but only the QTL on chromosome 8 exceeded the α = 0.05 threshold (LOD 3.63). The QTL identified on chromosome 5 is marked by npi288a (bin 5.08) and has an average negative effect of allele substitution of 76 × 10³ endosperm cells (T232 direction).
  • 73. The QTL region identified on chromosome 8 is marked by csu96b (bin 8.08) and has an average negative effect of allele substitution of 88 × 10³ endosperm cells (T232 direction). The final QTL model, including both QTL regions, accounted for 27.6 ± 11.6% of the phenotypic variance and 78.7 ± 33.1% of the genotypic variance.

Composite interval mapping identified one QTL that significantly influenced MECN in kernel samples collected from the CO159 X Tx303 RIL mapping population in St. Paul in 1996. The QTL identified on chromosome 7 is marked by npi435 (bin 7.04) and has an average negative effect of allele substitution of 159 × 10³ endosperm cells (Tx303 direction) (Table 3.7). The QTL region identified on chromosome 7 exceeded the α = 0.01 experimentwise threshold. The final QTL model, including this single QTL region, accounted for 18.4 ± 11.2% of the phenotypic variance and 32.9 ± 20.0% of the genotypic variance. Similar QTL mapping results were found using the log transformation of the endosperm cell number data.

Two common QTL regions, on chromosomes 7 and 8, were identified in both environments for MECN in the Tx303 X CO159 IF2 mapping population. Composite interval mapping identified three QTLs (on chromosomes 6, 7, and 8) that significantly influenced MECN in kernel samples collected from the IF2 mapping population in Columbia, MO in 1996 (Table 3.7). The three QTLs displayed significant additive gene action. The multiple QTL model, including all three regions simultaneously, accounted for 35.0 ± 7.3% of the phenotypic variance. Unlike the RIL final QTL models, total model genotypic variance is not reported for the IF2 mapping results. This is due to the lack of genotype replication (in this case, replication of the bulked material that represents the IF2 line within each environment) necessary to separate the genotypic and environmental variance components (P = G + E). Four QTLs (on chromosomes 3, 4, 7, and 8) were identified that significantly influenced MECN in the IF2 mapping population grown in St. Paul, MN in 1996 (Table 3.7).
  • 74. All four QTLs displayed additive gene action. The multiple QTL model, including all four regions simultaneously, accounted for 45.2 ± 7.7% of the phenotypic variance. An example of a QTL scan profile for the endosperm cell number trait (Columbia, MO, 1996) is given in Figure 3.14.
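The paired percentages reported for the RIL models in this section (phenotypic versus genotypic variance explained) can be related by simple arithmetic. The sketch below assumes, which the text does not state, that the genotypic percentage was obtained by dividing the phenotypic percentage by the trait heritability; under that assumption the ratio of the two figures gives an implied heritability. The function name and dictionary labels are illustrative only.

```python
# Reported (phenotypic %, genotypic %) variance explained by the final RIL
# QTL models for mean endosperm cell number, as quoted in Section 3.7.1.
ril_models = {
    "T232 X CM37, 1996": (24.2, 89.5),
    "T232 X CM37, 1997": (27.6, 78.7),
    "CO159 X Tx303, 1996": (18.4, 32.9),
}


def implied_heritability(pct_phenotypic, pct_genotypic):
    """Ratio of phenotypic to genotypic percent variance explained.

    Assumption (not stated in the source): genotypic % = phenotypic % / h2,
    so h2 = phenotypic % / genotypic %.
    """
    return pct_phenotypic / pct_genotypic


for label, (pct_p, pct_g) in ril_models.items():
    h2 = implied_heritability(pct_p, pct_g)
    print(f"{label}: implied h2 = {h2:.2f}")
```

Under this assumption the implied heritabilities are about 0.27, 0.35, and 0.56. The reported standard errors give nearly the same ratios (10.8/39.9, 11.6/33.1, and 11.2/20.0), which is at least consistent with the assumed relationship.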
  • 75. Table 3.7: Summary of identified QTLs for the trait mean endosperm cell number.

Chr.  cM     Locus     Additive effect  Percent variance  LOD   Experimentwise    Parent  Population    Year  Dev. stage
                       (cell number)    explained               probability
2     85.2   ici99     −81 × 10³        24.2              3.81  0.005 < p < 0.01  T232    TCM RIL STP   1996  18 DAP
3     48.0   bnl5.37a  94 × 10³         9.7               2.88  0.05 < p < 0.10   Tx303   TxCO IF2 STP  1996  16 DAP
4     32.0   umc31a    69 × 10³         10.0              3.39  0.05 < p < 0.10   Tx303   TxCO IF2 STP  1996  16 DAP
5     270.2  npi288a   −76 × 10³        13.4              3.42  0.05 < p < 0.10   T232    TCM RIL STP   1997  18 DAP
6     80.0   umc132a   78 × 10³         10.7              4.39  0.01 < p < 0.05   Tx303   TxCO IF2 MO   1996  16 DAP
7     114.4  npi435    −156 × 10³       18.4              4.22  p < 0.01          Tx303   COTx RIL STP  1996  16 DAP
7     70.0   csu8      −80 × 10³        15.4              7.53  p < 0.01          CO159   TxCO IF2 MO   1996  16 DAP
7     71.5   umc254    −109 × 10³       17.2              3.33  0.05 < p < 0.10   CO159   TxCO IF2 STP  1996  16 DAP
8     26.0   umc103a   118 × 10³        20.6              5.02  p < 0.01          Tx303   TxCO IF2 MO   1996  16 DAP
8     35.9   stp1      129 × 10³        21.5              4.98  p < 0.01          Tx303   TxCO IF2 STP  1996  16 DAP
8     150.3  csu96b    −88 × 10³        17.7              3.69  0.05 < p < 0.10   T232    TCM RIL STP   1997  18 DAP
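The statement that QTL regions on chromosomes 7 and 8 were shared between the two IF2 environments can be checked directly against the rows of Table 3.7. The following is a minimal sketch: the rows are transcribed from the table, while the 20 cM window and the function name `shared_regions` are arbitrary illustrative choices, not values or methods taken from the thesis.

```python
from itertools import combinations

# Tx303 X CO159 IF2 rows transcribed from Table 3.7:
# (chromosome, position in cM, locus, location).
if2_qtls = [
    (3, 48.0, "bnl5.37a", "STP"),
    (4, 32.0, "umc31a", "STP"),
    (6, 80.0, "umc132a", "MO"),
    (7, 70.0, "csu8", "MO"),
    (7, 71.5, "umc254", "STP"),
    (8, 26.0, "umc103a", "MO"),
    (8, 35.9, "stp1", "STP"),
]


def shared_regions(qtls, window_cm=20.0):
    """Pair QTLs on the same chromosome, detected at different locations,
    whose peak positions lie within window_cm of each other.

    window_cm is a hypothetical cutoff chosen for illustration only.
    """
    pairs = []
    for a, b in combinations(qtls, 2):
        same_chromosome = a[0] == b[0]
        different_location = a[3] != b[3]
        gap = abs(a[1] - b[1])
        if same_chromosome and different_location and gap <= window_cm:
            pairs.append((a[0], a[2], b[2], round(gap, 1)))
    return pairs


for chrom, locus_a, locus_b, gap in shared_regions(if2_qtls):
    print(f"chr {chrom}: {locus_a} / {locus_b} ({gap} cM apart)")
```

With these inputs only the chromosome 7 pair (csu8 and umc254) and the chromosome 8 pair (umc103a and stp1) are returned. Proximity of peak positions is a crude proxy; overlapping support intervals would be a more careful criterion.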
  • 76. Figure 3.14: Immortalized Tx303 X CO159 F2 population composite interval mapping: QTL likelihood maps on chromosome 7 for the trait mean endosperm cell number at 16 DAP (Columbia, MO 1996). The empirically derived threshold values from permutation analysis are indicated by the horizontal solid line (5%) and dashed line (1%). The black triangle marker indicates the location of the cofactor used in CIM. [Plot: LOD score (y-axis) against map position in cM (x-axis); markers shown: csu582, asg8, asg34a, asg49, csu296, umc254, csu8, umc245, bnl8.44a, umc168, umc35a.]

3.7.2 Endosperm Mean Ploidy QTLs

Composite interval mapping identified one QTL that significantly influenced MEP in kernel samples collected from the T232 X CM37 mapping population in St. Paul in 1996. The QTL identified on chromosome 9 is marked by npi443 (bin 9.05) and has an average positive effect of allele substitution of 1.03 mean ploidy units (CM37 direction) (Table 3.8). The QTL region identified on chromosome 9 exceeded the α = 0.05 experimentwise threshold (LOD 3.39). The final QTL model, including this QTL region, accounted for 15.1 ± 9.5% of the phenotypic variance and 25.2 ± 15.9%
  • 77. of the genotypic variance. One QTL was identified that significantly influenced MEP in kernel samples collected from the T232 X CM37 mapping population in St. Paul in 1997. The QTL identified on chromosome 4 is marked by bet2 (glycinebetaine2) (bin 4.05) and has an average negative effect of allele substitution of 1.10 mean ploidy units (T232 direction)². The QTL region identified on chromosome 4 exceeded the α = 0.05 experimentwise threshold (LOD 3.29). The final QTL model, including this QTL region, accounted for 29.2 ± 11.7% of the phenotypic variance and 63.5 ± 25.4% of the genotypic variance.

Composite interval mapping failed to identify any QTL that significantly (α = 0.05 experimentwise threshold) influenced MEP in kernel samples collected from the CO159 X Tx303 RIL mapping population in St. Paul in 1996. However, a putative QTL on chromosome 5 was identified at the α = 0.10 experimentwise threshold level (LOD 3.16). This putative QTL, also identified by linear regression, is marked by php20566 (bin 5.06) and has an average positive effect of allele substitution of 1.19 mean ploidy units (CO159 direction) (Table 3.8). The final QTL model, including this QTL region, accounted for 22.8 ± 11.8% of the phenotypic variance and 29.7 ± 15.3% of the genotypic variance. Using the 1997 CO159 X Tx303 mapping data, a putative QTL on chromosome 6 was identified at the α = 0.25 experimentwise threshold level (LOD 2.65). The putative QTL, also identified by linear regression, is marked by tug8 (bin 6.04) and has an average negative effect of allele substitution of 0.62 mean ploidy units (Tx303 direction). The final QTL model, including this QTL region, accounted for 15.6 ± 11.8% of the phenotypic variance.
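The experimentwise LOD thresholds quoted in this section are described as empirically derived from permutation analysis. As a rough illustration of that idea only, the sketch below estimates a genome-wide threshold by shuffling phenotypes against fixed marker genotypes and taking an upper quantile of the maximum LOD per shuffle. It substitutes a simple single-marker two-class LOD for the composite interval mapping used in the thesis, and the marker data, trait values, population size, and function names are all simulated or hypothetical.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """Single-marker LOD from a two-class (0/1 genotype) comparison of means."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_alt = 0.0
    for allele in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            group_mean = sum(group) / len(group)
            rss_alt += sum((y - group_mean) ** 2 for y in group)
    return (n / 2.0) * math.log10(rss_null / rss_alt)


def permutation_threshold(markers, phenotypes, alpha=0.05, n_perm=200, seed=1):
    """Upper (1 - alpha) quantile of the genome-wide maximum LOD obtained
    when phenotypes are randomly reassigned to lines."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated data: 60 lines, 40 unlinked markers, a trait with no real QTL.
data_rng = random.Random(0)
n_lines, n_markers = 60, 40
markers = [[data_rng.randint(0, 1) for _ in range(n_lines)]
           for _ in range(n_markers)]
trait = [data_rng.gauss(0.0, 1.0) for _ in range(n_lines)]

threshold_10 = permutation_threshold(markers, trait, alpha=0.10)
threshold_05 = permutation_threshold(markers, trait, alpha=0.05)
print(f"alpha = 0.10 threshold: LOD {threshold_10:.2f}")
print(f"alpha = 0.05 threshold: LOD {threshold_05:.2f}")
```

Because the maximum is taken over all markers in each shuffle, the resulting cutoff controls the experimentwise (genome-wide) error rate rather than the per-marker rate, and a stricter α gives a threshold at least as high. The numbers produced here come from simulated noise and are not comparable to the LOD thresholds reported in the text.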
² This chromosome region (bin 4.05) was also identified using T232 X CM37 RIL data in both 1996 (using 14, 20, and 24 DAP stage data) and 1997 (using 24 DAP stage data) (Figure 3.13).

Composite interval mapping identified four QTLs that influenced the trait MEP