1. ESTs are a Rich Source of Polymorphic SSRs for
Genomics and Molecular Breeding Applications in Peanut
Sameer Khanal¹, Shunxue Tang¹, Vadim Beilinson², Phillip San Miguel³, Baozhu Guo⁴, Niels Nielsen², Thomas Stalker², Marie-Michele Cordonnier-Pratt⁵, Lee H. Pratt⁵, Virgil Ed Johnson⁵, Christopher A. Taylor¹, and Steven J. Knapp¹
¹ Center for Applied Genetic Technologies, The University of Georgia, Athens, Georgia, 30602
² Crop Science Department, North Carolina State University, Raleigh, North Carolina, 27695
³ Genomics Center, Purdue University, West Lafayette, Indiana, 47907
⁴ USDA-ARS, Tifton, Georgia, 31793
⁵ Laboratory for Genomics and Bioinformatics, The University of Georgia, Athens, Georgia, 30602
Introduction
Narrow genetic diversity and a deficiency of polymorphic DNA markers have hindered genetic mapping and the application of genomics and molecular breeding approaches in cultivated peanut (Arachis hypogaea L.). Simple Sequence Repeat (SSR) markers have become the marker class of choice for molecular mapping and breeding in many plant species (Eujayl et al. 2003). However, a complete collection of 556 polymorphic SSR markers was screened and found to be inadequate for developing a saturated linkage map for Arachis species (unpublished data). Mining Expressed Sequence Tag (EST) databases has emerged as a fast, efficient, and low-cost way to develop a critical mass of SSR markers for many plant species, including peanut (Eujayl et al. 2003, Moretzsohn et al. 2005). Moreover, the rate of detecting polymorphism among peanut lines is higher with EST-derived SSR markers than with those derived from genomic sequences (Luo et al. 2005). We therefore developed a peanut EST database and mined it for SSRs, assessed the frequency of polymorphic SSRs in ESTs, and initiated the development of several hundred EST-SSR markers with the goal of breaking the DNA marker bottleneck in cultivated peanut. Our peanut EST database contains 84,229 ESTs assembled into 26,809 unigenes (unpublished).
The objectives of this study were to:
1. Assess the polymorphism offered by different SSR repeat motifs, repeat lengths, and repeat locations.
2. Assess the frequency of polymorphic EST-SSRs for developing a critical mass of DNA markers for genomics and molecular breeding applications in cultivated peanut.
References
Eujayl, I., M.K. Sledge, L. Wang, G.D. May, K. Chekhovskiy, J.C. Zwonitzer, and M.A.R. Mian. Medicago truncatula EST-SSRs reveal cross-species genetic markers for Medicago spp. Theor. Appl. Genet. 108(3):414-422.
Liu, K. and S.V. Muse. 2005. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21(9):2128-2129.
Luo, M., P. Dang, B.Z. Guo, G. He, C.C. Holbrook, M.G. Bausher, and R.D. Lee. 2005. Generation of expressed sequence tags (ESTs) for gene discovery and marker development in cultivated peanut. Crop Sci. 45:346-353.
Moretzsohn, M.C., L. Leoi, K. Proite, P.M. Guimarães, S.C.M. Leal-Bertioli, M.A. Gimenes, W.S. Martins, J.F.M. Valls, D. Grattapaglia, and D.J. Bertioli. 2005. A microsatellite-based, gene-rich linkage map for the AA genome of Arachis (Fabaceae). Theor. Appl. Genet. 111(6):1060-1071.
Rozen, S. and H.J. Skaletsky. 2000. Primer3 on the WWW for general users and for biologist programmers. In: Krawetz, S. and S. Misener (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp 365-386.
Temnykh, S., G. DeClerck, A. Lipovich, S. Cartinhour, and S. McCouch. 2001. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 11:1441-1452.
Materials and Methods
SSRIT (Temnykh et al. 2001) was used to mine SSRs from the peanut EST database (unpublished). Eighty EST-SSRs representing a broad spectrum of repeat motifs, repeat lengths, and repeat locations were selected, and primers were designed with Primer3 (Rozen and Skaletsky 2000). Primers were labeled with 6-FAM, HEX, or TAMRA fluorescent dyes and screened for polymorphisms against 28 tetraploid and 4 diploid germplasm accessions. Genotypes were determined using the ABI3730 DNA analyzer and GeneMapper Software Version 4 (Applied Biosystems, Foster City, CA).
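The mining step can be illustrated with a small regular-expression scan for perfect di- to hexanucleotide repeats. This is only a sketch of the general idea and not the SSRIT code itself. The function name `find_perfect_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices made for illustration; the poster does not state the thresholds that were used.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed for
# illustration; the poster does not report the thresholds used).
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count, length_bp) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT"),
            # so an AT run is reported once as a dinucleotide repeat.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            length_bp = len(match.group(0))
            hits.append((match.start(), motif, length_bp // motif_len, length_bp))
    return sorted(hits)


# Toy sequence: a 12 bp (AT)6 repeat and a 15 bp (AAG)5 repeat with short flanks.
example = "GGC" + "AT" * 6 + "CCT" + "AAG" * 5 + "T"
ssrs = find_perfect_ssrs(example)
```

Each hit records the 0-based start position, the motif, the number of repeats, and the SSR length in bp, which are the attributes the poster relates to polymorphism.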
Results and Conclusions
1. SSRs are abundant in the ESTs.
A total of 4,470 perfect SSRs were found in 3,716 unigenes; 13.86% of the 26,809 unigenes contained SSRs.
2. Dinucleotides are the most frequent repeat motifs (Figure 1).
7. ESTs are a rich source of polymorphic SSRs (Figure 5).
Of 58 markers, 55 (94.8%) were polymorphic, 32 (55.2%) were polymorphic in tetraploid
peanut (mean heterozygosity was 0.18), 27 (46.6%) were polymorphic in four cultivated peanut
mapping populations, and 48 (82.8%) were polymorphic in two diploid mapping populations.
The frequency of polymorphic EST-SSRs seems to be more than sufficient for developing a
critical mass of DNA markers for genomics and molecular breeding applications in cultivated
peanut.
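The percentages in result 7 follow directly from the marker counts. The short check below recomputes them from the counts reported above; the dictionary keys are labels chosen here for readability and are not terms from the poster.

```python
TOTAL_MARKERS = 58

# Counts of polymorphic markers as reported in result 7.
polymorphic_counts = {
    "polymorphic overall": 55,
    "tetraploid peanut": 32,
    "cultivated mapping populations": 27,
    "diploid mapping populations": 48,
}


def percent(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: percent(n, TOTAL_MARKERS) for name, n in polymorphic_counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```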
Figure 1. Frequencies of different repeat motifs out of a total of 4,470 SSRs mined from 26,809 unigenes. Bar values (number of SSRs): di- 2,580; tri- 1,731; tetra- 106; penta- 27; hexa- 26.
Figure 2. Scatter plot of SSR length (x-axis) against heterozygosity (y-axis). Correlation (r) = 0.45.
5. SSRs in exons and UTRs are equally polymorphic (Figure 3).
Figure 3. Heterozygosity (PIC) observations for SSRs in the exonic regions and in the untranslated regions (UTRs). Bar values: UTRs 0.348; exons 0.335.
Figure 5. Number of polymorphic markers in four tetraploid mapping populations and two diploid mapping populations.
Tifrunner x GTC20: 12
NemaTAM x WS 14: 12
Chico x SSD-6: 12
NC12C x A. monticola (GKBSPSc 30062): 14
A. duranensis (DUR25) x A. duranensis (DUR35): 36
A. batizocoi (BAT3) x A. batizocoi (BAT8): 41
Motif shares of the 4,470 SSRs in Figure 1: di- 57.7%, tri- 38.7%, tetra- 2.37%, penta- 0.6%, hexa- 0.58%.
3. Observations on Allele Frequencies and Heterozygosities
Approximately 5 alleles per marker were scored across the full panel and 3 alleles per marker in the tetraploid subset. Average heterozygosity for the 58 markers was 0.33 across the panel and 0.18 for the same markers in the tetraploid subset.
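The poster does not state how heterozygosity was calculated. As a minimal sketch, assuming the common gene diversity definition (one minus the sum of squared allele frequencies) and the polymorphism information content (PIC) of Botstein et al., both can be computed from a list of allele calls for one marker. The function names and the allele sizes in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Relative frequency of each distinct allele in a list of allele calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def expected_heterozygosity(alleles):
    """Gene diversity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in allele_frequencies(alleles))


def pic(alleles):
    """PIC: 1 - sum(p_i^2) - sum over allele pairs of 2 * p_i^2 * p_j^2."""
    freqs = allele_frequencies(alleles)
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical fragment sizes (bp) for one marker: two alleles at equal frequency.
calls = [180, 180, 184, 184]
h = expected_heterozygosity(calls)  # 0.5
p = pic(calls)                      # 0.375
```

Averaging such per-marker values over all markers gives a panel-wide figure comparable in kind to the 0.33 and 0.18 reported above, if the same definition was used.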
4. Longer SSRs are more polymorphic than shorter ones.
Although there was no strong correlation between SSR length and heterozygosity (Figure 2), SSRs longer than 26 bp were twofold more polymorphic than SSRs shorter than 26 bp.
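The two statements in result 4 correspond to two different summaries of the same data: a correlation coefficient over all markers, and a comparison of length classes split at 26 bp. The sketch below shows both on made-up values, since the per-marker data are not given in the poster. It assumes that "more polymorphic" is measured as mean heterozygosity per class and, as a further assumption, places SSRs of exactly 26 bp in the shorter class.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_het_by_length_class(lengths, hets, cutoff=26):
    """Mean heterozygosity for SSRs <= cutoff bp and > cutoff bp (assumed split)."""
    short = [h for length, h in zip(lengths, hets) if length <= cutoff]
    long_ = [h for length, h in zip(lengths, hets) if length > cutoff]
    return sum(short) / len(short), sum(long_) / len(long_)


# Made-up illustration data: SSR lengths in bp and per-marker heterozygosity.
lengths = [10, 20, 30, 40]
hets = [0.1, 0.2, 0.4, 0.4]

r = pearson_r(lengths, hets)
short_mean, long_mean = mean_het_by_length_class(lengths, hets)
fold_change = long_mean / short_mean
```

A modest r can coexist with a clear difference between class means, which is the pattern the poster reports.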
6. SSR markers can discriminate the botanical varieties of cultivated peanut (Figure 4).
Figure 4. An unrooted cladogram generated by PowerMarker (Liu and Muse 2005) with 33 polymorphic SSRs. Accessions from each of four botanical varieties, viz. Runner, Virginia, Valencia, and Spanish, cluster together.