This document discusses cotton genomics and provides details about cotton species. It summarizes that cotton has a large and complex genome, with Gossypium hirsutum and G. barbadense being the most important cultivated species. The genome of G. raimondii is considered closest to the D-genome donor of tetraploid cotton. Resources like CottonDB provide genomic data on markers, genes, and genetic maps. Sequencing efforts aim to understand cotton evolution and support molecular breeding.
2. Gossypium
Cotton: King of Fibres
Phylum: Magnoliophyta
Class: Magnoliopsida
Order: Malvales
Family: Malvaceae
Tribe: Gossypieae
Diploid: 2n = 26, tetraploid: 2n = 52
2.7 billion nucleotides.
S S Jena
3. Diversity
Annual, biennial or perennial
Herbaceous, short shrub or small tree
Primary axis, alternate
Leaves have varying texture, shape, hairiness
Showy cream, yellow, red or purple flowers
Flowers axillary, terminal or solitary, typically with 5 petals
4. Four Independently Domesticated Species!!
African-Asian diploids:
– G. herbaceum
– G. arboreum
New World tetraploids:
– G. barbadense
– G. hirsutum
5. Four Independently Domesticated Species!!
Gossypium hirsutum, also known as Upland Cotton, Long Staple Cotton, or Mexican Cotton, produces about 90% of the world’s cotton.
G. barbadense, also known as Sea Island Cotton, Extra Long Staple Cotton, American Pima, or Egyptian Cotton, contributes 8% of the world’s cotton.
G. herbaceum, also known as Levant Cotton, and G. arboreum, also known as Tree Cotton, together provide 2% of the world’s cotton.
6. Concerning diploid parentage
Cytogenetic studies indicate G. raimondii as the closest living relative of the D-genome parental donor.
Hutchinson et al., 1947
– Used 5 D-genome species in crosses with G. hirsutum or G. barbadense.
– Indicated G. raimondii as closer to the D-genome than the other species tested.
– Innovative approach involving comparative analysis of diverse synthetic allohexaploids.
Liu et al., 2001
– G. raimondii is the sister group to the clade of all 5 allopolyploid species.
7. A-genome perspectives
The A-genome of allopolyploid cotton is more similar to that of the A-genome diploids than the D-genome of the allopolyploid is to that of the D-genome diploids!
G. arboreum and G. herbaceum are better models of the progenitor A-genome diploid than G. raimondii is of the D-genome diploid.
G. herbaceum is more likely the A-genome donor than G. arboreum.
8. Biogeographical Theories
Theories based on cytogenetic data suggested that polyploidization occurred after a trans-Atlantic dispersal of a species similar to G. herbaceum.
Wendel and Albert, 1992:
– Suggest pre-Pleistocene A-genome radiation into Asia, followed by trans-Pacific dispersal to the Americas.
– Supported by the biogeography of D-genome species.
– Recent arrival of G. raimondii in Peru.
9. Allopolyploidization of Cotton Occurred Only Once
All New World tetraploid cottons contain Old World cytoplasm.
There must have been a single seed plant in the initial hybridization event.
Long-distance dispersals occurred by transoceanic voyages.
11. CottonDB
CottonDB is a product of the Agricultural Research Service of the US Department of Agriculture and is maintained as a public resource to serve the cotton research community.
CottonDB is a database that contains genomic, genetic and taxonomic information for cotton (Gossypium spp.).
It serves both as an archival database and as a dynamic database that incorporates new data and user resources.
CottonDB was initiated in 1995. It is the first and most extensively used database for cotton worldwide.
13. CottonDB
CottonDB.org contains:
– Genomic markers and nucleotide sequences
– Genes, alleles and gene products
– BAC clones and clone libraries
– TM-1 fingerprint contigs developed by USDA-ARS/Texas A&M University
– Taxonomy of the Gossypium genus
– Genetic maps
– Contact information and research interests of colleagues
– Relevant bibliographic citations
14. Linkage map
The first molecular linkage map of Gossypium was constructed from an interspecific G. hirsutum × G. barbadense F2 population using RFLPs.
The map comprises 2,584 loci at 1.74 cM intervals and covers all 13 homeologous chromosome pairs of the allotetraploid cottons, making it the most complete genetic map of Gossypium.
At least six BAC and BIBAC libraries have been developed and made available to the public.
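As a rough check on the scale of this map, the locus count and mean interval quoted above imply a total length of about 4,500 cM. The sketch below assumes evenly spaced loci, which real maps do not have, and the function name is purely illustrative.

```python
def approx_map_length_cm(n_loci: int, mean_interval_cm: float) -> float:
    """Approximate total map length as (number of loci) x (mean interval).

    Assumption: loci are evenly spaced. Each linkage group actually has one
    fewer interval than loci, so this slightly overestimates the length.
    """
    return n_loci * mean_interval_cm


# Figures quoted on the slide: 2,584 loci at 1.74 cM intervals.
map_length = approx_map_length_cm(2584, 1.74)
print(f"Approximate map length: {map_length:.0f} cM")
```

The result is only an order-of-magnitude figure; the published map length should be taken from the original paper.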
15. ESTs
281,233 ESTs are available for Gossypium species in GenBank. Of these:
– 178,177 are from the polyploid cultivated cottons: 177,154 (63.0%) from G. hirsutum and 1,023 (<1.0%) from G. barbadense.
– 103,056 are from the related diploid species: 39,232 (13.9%) from G. arboreum (A2), 63,577 (22.6%) from G. raimondii (D5), and 247 (<1.0%) from G. herbaceum (A1).
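The subtotals and percentages above can be re-derived from the per-species counts. The following is a minimal sketch using only the numbers on the slide; the variable names are illustrative.

```python
# EST counts per species, as quoted on the slide (GenBank, Gossypium).
est_counts = {
    "G. hirsutum": 177_154,
    "G. barbadense": 1_023,
    "G. arboreum": 39_232,
    "G. raimondii": 63_577,
    "G. herbaceum": 247,
}
# The two cultivated polyploid species; the rest are diploids.
polyploids = ("G. hirsutum", "G. barbadense")

total = sum(est_counts.values())
polyploid_total = sum(est_counts[s] for s in polyploids)
diploid_total = total - polyploid_total

# Share of each species in the whole EST collection, in percent.
shares = {s: 100 * n / total for s, n in est_counts.items()}

for species, n in est_counts.items():
    print(f"{species:14s} {n:>8,d}  {shares[species]:5.1f}%")
print(f"Total: {total:,d} (polyploid {polyploid_total:,d}, diploid {diploid_total:,d})")
```

The totals reproduce the 281,233, 178,177 and 103,056 figures given above, which suggests the slide's numbers are internally consistent.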
16. ESTs cont.
These ESTs were collectively generated from 32 cDNA libraries constructed from mRNA isolated from 18 genotypes of three species, G. hirsutum, G. arboreum, and G. raimondii, by one-pass sequencing of cDNA clones from one (3′ or 5′ end) or both ends.
They were generated from 12 different organs, including developing fibers, seedlings, buds, bolls, ovules, roots, hypocotyls, immature embryos, leaves, stems, and cotyledons.
Some of the ESTs were generated from plants growing under biotic or abiotic stress conditions such as drought, chilling, and pathogens.
A predominant feature of the cotton EST set is the strong bias of its tissue sources toward fiber or fiber-bearing ovules over other organs.
17. Physical mapping
To date the database is limited to information concerning our ongoing physical mapping effort in three species of cotton: the two cultivated 'AADD' tetraploid species Gossypium barbadense and G. hirsutum, and the wild DD genome species G. raimondii.
BAC libraries for all three species are currently being assayed using genomic and cDNA clones derived from linkage maps, as well as dispersed repetitive DNA clones.
The physical mapping database has been constructed using the BACMan data management application.
18. QTLs
QTLs for fiber quality properties in two Upland cottons were compared with those of ELS (extra long staple) cotton with regard to their locations and gene effects.
A total of thirteen QTLs have been identified: four for fiber strength, three for fiber length, and six for fiber fineness.
They are located on different chromosomes or linkage groups of our molecular maps, which comprise 355 DNA markers covering 4,766 cM (Haldane function) of the cotton genome in 50 linkage groups.
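To put these figures in perspective, the sketch below tallies the QTLs by trait and derives the average marker spacing and linkage-group length. It uses only the numbers quoted above; dividing map length by marker count is a simplifying assumption that ignores uneven marker distribution.

```python
# QTL counts per fiber trait, as quoted on the slide.
qtls_by_trait = {"fiber strength": 4, "fiber length": 3, "fiber fineness": 6}

n_markers = 355          # DNA markers on the molecular maps
map_length_cm = 4766.0   # total map length (Haldane function)
n_linkage_groups = 50

total_qtls = sum(qtls_by_trait.values())

# Rough averages; real marker spacing varies widely along the map.
mean_spacing_cm = map_length_cm / n_markers
mean_group_length_cm = map_length_cm / n_linkage_groups

print(f"QTLs identified: {total_qtls}")
print(f"Mean marker spacing: {mean_spacing_cm:.1f} cM")
print(f"Mean linkage group length: {mean_group_length_cm:.1f} cM")
```

An average spacing above 13 cM means these maps are far sparser than the 2,584-locus map described earlier, so each QTL position is likely resolved only to a fairly broad interval.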
19. Cotton vs Arabidopsis
Although the cotton genome is large and complex, the physical size of a cM in cotton is only 50% larger than in Arabidopsis.
A high level of homology between the Arabidopsis and Gossypium genomes, and abundant polymorphism among Gossypium germplasm, were detected using conserved cDNAs from the Arabidopsis genome.
The upland cotton genome consists of approximately 61% unique sequences and low copy number DNA.
20. Chloroplast genome
The chloroplast genome of cotton is 160,317 base pairs (bp) in length, and is composed of a large single copy (LSC) of 88,841 bp, a small single copy (SSC) of 20,294 bp, and two identical inverted repeat (IR) regions of 25,591 bp each.
The genome contains 114 unique genes, of which 17 genes are duplicated in the IRs. In addition, many open reading frames (ORFs) and hypothetical chloroplast reading frames (ycfs) with unknown functions were deduced.
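The region lengths above should add up to the reported genome length, since the chloroplast genome consists of the LSC, the SSC and two IR copies. A minimal sketch of that check follows; the class and variable names are illustrative, and the gene-copy count assumes each duplicated gene occurs exactly twice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlastomeRegions:
    """Region lengths (bp) of a chloroplast genome with two inverted repeats."""
    lsc: int  # large single copy
    ssc: int  # small single copy
    ir: int   # length of ONE inverted repeat copy

    @property
    def total(self) -> int:
        # The inverted repeat is present twice in the circular genome.
        return self.lsc + self.ssc + 2 * self.ir


# Figures quoted on the slide for the cotton chloroplast genome.
cotton = PlastomeRegions(lsc=88_841, ssc=20_294, ir=25_591)
reported_total = 160_317

unique_genes = 114
duplicated_in_ir = 17
# Assumption: each IR-duplicated gene has exactly two copies.
gene_copies = unique_genes + duplicated_in_ir

print(f"Sum of regions: {cotton.total:,d} bp (reported: {reported_total:,d} bp)")
print(f"Gene copies including IR duplicates: {gene_copies}")
```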
21. Chloroplast genome cont.
Compared to the chloroplast genomes from 8 other dicot plants, the cotton chloroplast genome showed a high degree of similarity in overall structure, gene organization, and gene content.
The cotton chloroplast genome was somewhat longer than the chloroplast genomes of most of the other dicot plants compared here.
However, this elongation of the cotton chloroplast genome was found to be due mainly to expansions of the intergenic regions and introns (non-coding DNA).
Moreover, these expansions occurred predominantly in the LSC and SSC regions.
22. Cotton sequencing will greatly help molecular breeding
Increases our understanding of cotton physiology and evolution.
Provides a model of polyploid and comparative genome evolution.
Maintains the competitive advantage of cotton fiber relative to other fibers.
Creates new value for cotton on the farm and beyond the farm gate.
23. Factors slowing down cotton sequencing progress
A large genome, with a physical size of 2,246 Mbp, and small chromosomes.
Allotetraploid AD genome: n = 2x = 26 (AD).
A large fraction of the genome is repetitive DNA sequence.