This presentation will help undergraduate and postgraduate students of biology and biotechnology understand the significance of Cot curves in determining gene and genome complexity among various organisms.
2. Introduction
• The more complex the organism, the more DNA is
needed to “run” it, i.e. the larger the total amount of DNA in the genome
• Therefore, we would expect a linear relationship
between genome size and organism complexity.
• At the lower range of complexity, this holds: bacteria
have smaller genomes than eukaryotes, and viruses have
smaller genomes than bacteria.
• In larger organisms, the relationship breaks down:
organisms have DNA apparently in excess of what is
needed, including repetitive sequences and “junk DNA”.
• This is the C-value paradox: in the most complex
organisms, there doesn’t appear to be the expected
relationship between complexity and genome size.
3. C-value and C-Value paradox
• The amount of cellular DNA in different organisms
does not correlate with their relative biological
complexity.
• The term C-value was introduced by Swift in 1950, and
the term C-value paradox by Thomas in 1971.
• Swift H. 1950. The constancy of desoxyribose nucleic acid in plant
nuclei. Proc Natl Acad Sci USA 36:643–654.
• Thomas CA Jr. 1971. The genetic organization of chromosomes.
Annu Rev Genet 5:237–256.
4. Four variables affecting estimations of
the C-value—
the ambiguity of the term,
polyploidy,
repetitive sequences and
experimental errors—
of which polyploidy may be the most significant
• Taft et al. 2007. The relationship between non-protein-coding DNA and eukaryotic
complexity. BioEssays 29:288–299.
5. Some genome sizes compared
Organism group         Genome size (nucleotide pairs)
Mycoplasma             1 x 10^6
Bacteria               1.5 x 10^6
Fungi                  5 x 10^7
Algae                  8 x 10^7
Molds                  8 x 10^7 to 1 x 10^8
Worms                  1 x 10^8
Molluscs               9 x 10^8 to 5 x 10^9
Insects                1 x 10^8 to 2 x 10^9
Echinoderms            1 x 10^9 to 2 x 10^9
Cartilaginous fishes   5 x 10^9 to 8 x 10^9
Bony fishes            5 x 10^8 to 1 x 10^10
Amphibians             5 x 10^8 to 9 x 10^10
Reptiles               2 x 10^9 to 5 x 10^9
Mammals                3 x 10^9 to 5 x 10^9
Birds                  9 x 10^8 to 1 x 10^9
Flowering plants       5 x 10^7 to 1 x 10^11
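The spread within each group can be made concrete by computing how many powers of ten separate its smallest and largest genome. The following is a minimal sketch using a few of the ranges listed above; the names GENOME_SIZE_RANGES and orders_of_magnitude are illustrative and not part of the original slides.

```python
import math

# (low, high) genome sizes copied from the values listed above.
GENOME_SIZE_RANGES = {
    "Insects": (1e8, 2e9),
    "Amphibians": (5e8, 9e10),
    "Mammals": (3e9, 5e9),
    "Flowering plants": (5e7, 1e11),
}


def orders_of_magnitude(low, high):
    """Return log10(high / low), the width of a range in powers of ten."""
    if low <= 0 or high < low:
        raise ValueError("expected 0 < low <= high")
    return math.log10(high / low)


for group, (low, high) in GENOME_SIZE_RANGES.items():
    spread = orders_of_magnitude(low, high)
    print(f"{group}: {spread:.2f} orders of magnitude")
```

Under these values, flowering plants span more than three orders of magnitude while mammals span well under one, which is the kind of mismatch between genome size and apparent complexity that the C-value paradox describes.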
8. What is the explanation?
• Early evidence of genome structure came from
DNA denaturation-renaturation studies
18. Cot curve - A Cot curve measures the degree of
reannealing of DNA strands
19. When dsDNA is denatured by heat and allowed to reanneal, the reaction can be
described by
• $\frac{dC}{dt} = -kC^2$
• $C$ is the concentration of ssDNA at time $t$
• $k$ is a rate constant
• Integrating from $C_0$ at $t = 0$ to $C$ after time $t$ has passed gives:
• $\frac{C}{C_0} = \frac{1}{1 + k C_0 t}$
• One useful point of reference is the point where half the DNA remains in ssDNA
form:
• $\frac{C}{C_0} = \frac{1}{2} = \frac{1}{1 + k C_0 t_{1/2}}$
• This reduces to $C_0 t_{1/2} = \frac{1}{k}$, the so-called Cot ½ value: the product of
concentration and time needed for the reaction to reach the halfway point
• Larger Cot ½ values reflect longer renaturation times.
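The second-order kinetics above can be checked numerically. This is a minimal sketch of the integrated rate law C/C0 = 1/(1 + k·C0·t) and of Cot ½ = 1/k; the function names and the values of k and c0 are hypothetical and chosen only for illustration.

```python
def fraction_single_stranded(k, c0, t):
    """Return C/C0 = 1 / (1 + k*C0*t) for ideal second-order reassociation."""
    return 1.0 / (1.0 + k * c0 * t)


def cot_half(k):
    """Return Cot 1/2 = 1/k, the C0*t product at which half the DNA is still single-stranded."""
    return 1.0 / k


k = 0.25   # hypothetical rate constant
c0 = 2.0   # hypothetical initial ssDNA concentration
t_half = cot_half(k) / c0  # time needed to reach the halfway point at this c0

print(cot_half(k))                              # 4.0
print(fraction_single_stranded(k, c0, t_half))  # 0.5
```

Halving k doubles Cot ½, so a slower reaction (a larger Cot ½) needs either more time or a higher starting concentration to reach the same degree of reannealing.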
20. Double-stranded DNA denatures at high temperature
UV light absorbance increases as DNA goes from double-stranded to single-stranded (the “hyperchromic shift”)
The melting temperature (Tm) is the temperature at which half of the DNA is denatured
What is the basis of the hyperchromic shift? Stacked bases absorb less UV light
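Because absorbance rises as the strands separate, Tm can be read off a melting curve as the temperature at which absorbance is halfway between its double-stranded and single-stranded plateaus. The sketch below does this by linear interpolation; the function estimate_tm and the data points are hypothetical, and real melting curves usually need smoothing and baseline correction first.

```python
def estimate_tm(temps, a260):
    """Estimate Tm as the temperature where A260 is midway between its minimum and maximum.

    Assumes temps is ascending and a260 rises through the midpoint.
    """
    if len(temps) != len(a260) or len(temps) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    midpoint = (min(a260) + max(a260)) / 2.0
    for i in range(1, len(temps)):
        a_prev, a_next = a260[i - 1], a260[i]
        if a_prev <= midpoint <= a_next and a_next != a_prev:
            # Linear interpolation between the two bracketing measurements.
            frac = (midpoint - a_prev) / (a_next - a_prev)
            return temps[i - 1] + frac * (temps[i] - temps[i - 1])
    raise ValueError("absorbance never crosses the midpoint")


# Hypothetical melting curve: temperature in degrees C and A260 readings.
temps = [60, 70, 80, 90, 100]
a260 = [1.00, 1.02, 1.10, 1.35, 1.40]
print(f"Estimated Tm: {estimate_tm(temps, a260):.1f} C")
```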
22. Single-stranded DNA (denatured) gives high A260 absorbance
Double-stranded DNA (reassociated) gives low A260 absorbance
Cot Curve Analysis of DNA Samples
Therefore, the genome with the more unique DNA sequences has the higher Cot value.
Note: Organisms with larger genome sizes (in nucleotide pairs) usually have higher Cot values
(as illustrated in the Genome Size portion of the above figure).
“Complexity” is a measure of the amount of “uniqueness” of the DNA sequences in a genome.
How to measure Cot?
By looking at the reassociation kinetics of DNA
Cot values are a measure of the “complexity” of an organism’s genome
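One common way to turn a measured Cot ½ into a complexity estimate is to compare it with a standard DNA of known complexity. The sketch below assumes that Cot ½ is directly proportional to complexity, which is only reasonable for DNA without repeated sequences measured under identical conditions. The function name and all numbers are hypothetical.

```python
def estimate_complexity(cot_half_sample, cot_half_standard, complexity_standard):
    """Scale the known complexity of a standard by the ratio of Cot 1/2 values.

    Assumes Cot 1/2 is proportional to complexity (no repeats, same conditions).
    """
    if cot_half_standard <= 0:
        raise ValueError("cot_half_standard must be positive")
    return complexity_standard * cot_half_sample / cot_half_standard


# Hypothetical values: a standard of 4.0e6 nucleotide pairs with Cot 1/2 = 5.0,
# and an unknown sample whose Cot 1/2 is 100 times larger.
estimate = estimate_complexity(
    cot_half_sample=500.0,
    cot_half_standard=5.0,
    complexity_standard=4.0e6,
)
print(f"Estimated complexity: {estimate:.1e} nucleotide pairs")
```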
24. What Repeat Classes Represent
• Unique DNA:
– up to 10 copies:
– about 60% of the genome
– highly conserved coding regions
– other highly conserved regions
– other non-conserved unique sequences
• Moderately repeated DNA
– average of 500 copies,
– a total of 30% of the genome
– transposon-based repeats
– large gene families
• Highly repeated DNA:
– average of 50,000 copies per genome
– about 10% of total DNA
– constitutive heterochromatin
– microsatellites
– a few highly repeated transposon families (Alu sequences)
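The three classes above produce a Cot curve with several steps, because sequences present in many copies find a partner sooner. The sketch below sums three ideal second-order components using the genome fractions and average copy numbers listed above. It assumes that each class reanneals independently, that Cot ½ for a class is inversely proportional to its copy number, and that unique DNA is present as a single copy. The value of unique_cot_half is hypothetical.

```python
# (name, fraction of genome, average copy number) from the list above;
# a copy number of 1 for unique DNA is a simplifying assumption.
COMPONENTS = [
    ("highly repeated", 0.10, 50_000),
    ("moderately repeated", 0.30, 500),
    ("unique", 0.60, 1),
]


def mixed_fraction_single_stranded(cot, unique_cot_half):
    """Return the fraction of DNA still single-stranded at a given Cot value."""
    total = 0.0
    for _name, fraction, copies in COMPONENTS:
        # More copies means a proportionally smaller Cot 1/2 for that class.
        component_cot_half = unique_cot_half / copies
        total += fraction / (1.0 + cot / component_cot_half)
    return total


unique_cot_half = 1000.0  # hypothetical Cot 1/2 of the single-copy class
for cot in (1e-3, 1e-1, 1e1, 1e3, 1e5):
    remaining = mixed_fraction_single_stranded(cot, unique_cot_half)
    print(f"Cot = {cot:g}: fraction single-stranded = {remaining:.3f}")
```

Plotting the result against log(Cot) would show the highly repeated fraction reannealing first and the unique fraction last.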
25. Melting temperature Tm increases with G:C content
Why? Because G-C base pairs have 3 hydrogen bonds and so require more energy to
disrupt than A-T base pairs, which have only 2 hydrogen bonds.
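The effect of base composition can be illustrated with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) in degrees C. This rule is not from the slides; it is a rough approximation intended for short oligonucleotides only, and is used here just to show the trend. The function names and sequences are illustrative.

```python
def gc_fraction(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate Tm in degrees C as 2*(A+T) + 4*(G+C); short oligos only."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two hypothetical 14-base oligonucleotides with opposite compositions.
for seq in ("ATATATATATATAT", "GCGCGCGCGCGCGC"):
    print(seq, f"GC = {gc_fraction(seq):.0%}", f"Tm ~ {wallace_tm(seq)} C")
```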
26. Visualizing DNA by Ethidium Bromide Staining
•DNA binding dyes such as ethidium bromide and acridine orange intercalate between the stacked bases of DNA.
•Agarose gels, such as the one shown at the top right, are used to separate different length DNA fragments.
Once the gel is stained with ethidium bromide, the stained DNA is visualized by shining UV light on the gel
(ethidium bromide gives off a bright orange fluorescence under UV light).
27. G-value paradox
The lack of a consistent relationship between organismal complexity
and the number of protein-coding genes
Why this paradox?
Part of this paradox can be explained by an increased
utilization of alternative splicing, which allows a
greater range of protein isoforms to be expressed,
which clearly occurs in the complex organisms, although this in turn
necessitates an increase in regulation.
30. The ratio of the total bases of non-protein-coding to the total bases of genomic DNA per
sequenced genome across phyla (i.e. the percent ncDNA). The four largest prokaryote
genomes and two well-known bacterial species are depicted in black. Single-celled organisms
are shown in gray, organisms known to be both single and multicellular depending on lifecycle
are light blue, basal multicellular organisms are blue, plants are green, nematodes are
purple, arthropods are orange, chordates are yellow, and vertebrates are red.
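The quantity plotted in this figure is a simple ratio, sketched below. The function name percent_ncdna and the example numbers are hypothetical and are not taken from any sequenced genome.

```python
def percent_ncdna(total_bases, protein_coding_bases):
    """Return non-protein-coding bases as a percentage of all genomic bases."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    if not 0 <= protein_coding_bases <= total_bases:
        raise ValueError("protein_coding_bases must lie between 0 and total_bases")
    return 100.0 * (total_bases - protein_coding_bases) / total_bases


# Hypothetical genome: 1,000,000 bases, of which 880,000 are protein-coding.
print(f"{percent_ncdna(1_000_000, 880_000):.1f}% ncDNA")
```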