2. Analysis of DNA Sequences in
Eukaryotic Genomes
The technique that is used to determine the sequence complexity of any genome involves
the denaturation and renaturation of DNA.
DNA is denatured by heating, which breaks the hydrogen bonds between the two strands and
renders the DNA single-stranded.
If the DNA is rapidly cooled, the DNA remains single-stranded.
But if the DNA is allowed to cool slowly, sequences that are complementary will find each
other and eventually base pair again.
The rate at which the DNA reanneals ("reanneal" is another term for "renature") is a function
of the species from which the DNA was isolated.
In a plot of the renaturation data, the Y-axis is the fraction of the DNA that remains
single-stranded.
This is expressed as the ratio of the concentration of single-stranded DNA (C) to the total
concentration of the starting DNA (Co).
The X-axis is a log scale of the initial concentration of DNA (in moles/liter) multiplied by
the length of time the reaction proceeded (in seconds).
This product is designated Cot and is called the "Cot" value. The curve itself is called a
"Cot" curve.
As can be seen, the curve is rather smooth, which indicates that reannealing occurs slowly
and gradually over a period of time.
One particularly useful value is Cot½, the Cot value at which half of the DNA has
reannealed.
4. Cot = Co x t x buffer factor
Co: concentration of DNA (moles/L)
t: time taken for renaturation (seconds)
buffer factor: a constant
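As a rough illustration, the Cot value and the expected fraction of single-stranded DNA can be computed as follows. This is a minimal Python sketch and not part of the original slides: the function names are hypothetical, the buffer factor defaults to 1.0 only as a placeholder, and the relation C/Co = 1 / (1 + Cot/Cot½) assumes ideal second-order reannealing of a single class of sequences.

```python
def cot_value(c0_mol_per_l, t_seconds, buffer_factor=1.0):
    """Cot = Co x t x buffer factor (buffer_factor=1.0 is a placeholder)."""
    return c0_mol_per_l * t_seconds * buffer_factor


def fraction_single_stranded(cot, cot_half):
    """C/Co under ideal second-order kinetics (an assumption, see lead-in)."""
    return 1.0 / (1.0 + cot / cot_half)


# Hypothetical run: 1e-4 moles/L of DNA renatured for one hour.
cot = cot_value(1e-4, 3600)
print("Cot =", cot)
# When Cot equals Cot1/2, half of the DNA is still single-stranded.
print("C/Co at Cot1/2 =", fraction_single_stranded(cot, cot_half=cot))
```

Under this assumption, C/Co starts at 1 when Cot is 0 and falls toward 0 as Cot grows, which is the shape described for the "Cot" curve.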
5. Steps Involved in DNA Denaturation
and Renaturation Experiments
1. Shear the DNA to a size of about 400 bp.
2. Denature the DNA by heating to 100 °C.
3. Slowly cool and take samples at different time intervals.
4. Determine the % single-stranded DNA at each time point.
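The four steps above produce a series of (Cot, fraction single-stranded) pairs. One possible way to read Cot½ from such data is to interpolate between the two samples that bracket the 50% point, working on the log-scaled X-axis. The sketch below assumes this simple interpolation approach; the function name and the sample numbers are hypothetical, not measurements from the slides.

```python
import math


def estimate_cot_half(samples):
    """Estimate Cot1/2 from (cot, fraction_single_stranded) pairs.

    `samples` must be sorted by increasing Cot. The estimate is a linear
    interpolation in log10(Cot) between the two points that bracket 0.5.
    Returns None if the data never cross the 50% point.
    """
    for (cot_a, frac_a), (cot_b, frac_b) in zip(samples, samples[1:]):
        if frac_a >= 0.5 >= frac_b and frac_a != frac_b:
            weight = (frac_a - 0.5) / (frac_a - frac_b)
            log_a, log_b = math.log10(cot_a), math.log10(cot_b)
            return 10 ** (log_a + weight * (log_b - log_a))
    return None


# Hypothetical time course: the fraction single-stranded falls as Cot rises.
samples = [(0.1, 0.9), (1.0, 0.6), (10.0, 0.4), (100.0, 0.1)]
print(estimate_cot_half(samples))
```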
The shape of a "Cot" curve for a given species is a function of two factors:
1. the size or complexity of the genome; and
2. the amount of repetitive DNA within the genome.
6. If we plot the "Cot" curves of the genomes of three species such as bacteriophage lambda,
E. coli, and yeast, we will see that they have the same shape, but the Cot½ of yeast will be
the largest, that of E. coli the next largest, and that of lambda the smallest.
Physically, the larger the genome size the longer it will take for any one sequence to
encounter its complementary sequence in the solution.
This is because two complementary sequences must encounter each other before they can
pair.
The more complex the genome, that is the more unique sequences that are available, the
longer it will take for any two complementary sequences to encounter each other and pair.
Given similar DNA concentrations in solution, the DNA of a species with a more complex genome
will therefore take longer to reach Cot½.
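To make the ordering concrete, the sketch below compares the three genomes mentioned above. It assumes that, in the absence of repeated DNA, Cot½ is directly proportional to genome size; the genome sizes are approximate figures supplied here for illustration and do not come from the slides, and the function name is hypothetical.

```python
# Approximate genome sizes in base pairs (illustrative values, not from the slides).
APPROX_GENOME_SIZE_BP = {
    "lambda": 48_502,
    "E. coli": 4_600_000,
    "yeast": 12_000_000,
}


def relative_cot_half(genome_sizes_bp, reference):
    """Cot1/2 of each genome relative to `reference`, assuming Cot1/2 is
    proportional to genome size (no repeated sequences)."""
    ref_size = genome_sizes_bp[reference]
    return {name: size / ref_size for name, size in genome_sizes_bp.items()}


for name, ratio in relative_cot_half(APPROX_GENOME_SIZE_BP, "lambda").items():
    print(f"{name}: Cot1/2 about {ratio:.0f} times that of lambda")
```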
8. Repeated DNA sequences, DNA sequences that are found more than once in the genome of the species, have
distinctive effects on "Cot" curves.
If a specific sequence is represented twice in the genome, it will have two complementary sequences to pair with
and, as such, will have a Cot½ value half as large as a sequence represented only once in the genome.
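The halving described here can be written as a one-line rule. The sketch below assumes that the same inverse relationship extends to higher copy numbers; the function name and the numeric values are hypothetical.

```python
def cot_half_for_copy_number(single_copy_cot_half, copies):
    """Cot1/2 of a sequence present `copies` times, assuming Cot1/2 is
    inversely proportional to copy number."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return single_copy_cot_half / copies


# Hypothetical single-copy Cot1/2 of 100: two copies halve it.
for copies in (1, 2, 1000):
    print(copies, cot_half_for_copy_number(100.0, copies))
```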
Eukaryotic genomes actually have a wide array of sequences that are represented at different levels of repetition.
Single copy sequences are found once or a few times in the genome.
Many of the sequences which encode functional genes fall into this class.
Middle repetitive DNA sequences are found from tens to about 1,000 times in the genome.
Examples include rRNA and tRNA genes and the genes for storage proteins in plants such as corn.
Middle repetitive DNA sequences can range in length from 100-300 bp up to 5,000 bp and can be dispersed throughout the genome.
The most abundant sequences are found in the highly repetitive DNA class.
These sequences are found from 100,000 to 1 million times in the genome and can range in size from a few to
several hundred bases in length.
These sequences are found in regions of the chromosome such as heterochromatin, centromeres and telomeres
and tend to be arranged as tandem repeats.
9. The following is an example of a tandemly repeated sequence:
ATTATA ATTATA ATTATA // ATTATA
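A tandem repeat of this kind can be recognized by counting how many back-to-back copies of the repeat unit occur. The sketch below is an illustration only: the function name is hypothetical and it handles only exact, uninterrupted copies starting at the beginning of the sequence.

```python
def count_tandem_copies(sequence, unit):
    """Count consecutive exact copies of `unit` at the start of `sequence`."""
    if not unit:
        raise ValueError("unit must not be empty")
    copies = 0
    position = 0
    while sequence.startswith(unit, position):
        copies += 1
        position += len(unit)
    return copies


# Four copies of the ATTATA unit from the example, written without spaces.
print(count_tandem_copies("ATTATA" * 4, "ATTATA"))
```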
Genomes that contain these different classes of sequences reanneal in a different
manner than genomes with only single copy sequences.
Instead of a single smooth transition, the "Cot" curve shows three distinct components, each
representing a different repetition class.
The first sequences to reanneal are the highly repetitive sequences because so many
copies of them exist in the genome, and because they have a low sequence
complexity.
The second portion of the genome to reanneal is the middle repetitive DNA, and
the final portion to reanneal is the single copy DNA.
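The three-step behaviour can be sketched by summing one reannealing term per repetition class. This example assumes ideal second-order kinetics for each class, C/Co = 1 / (1 + Cot/Cot½); the genome fractions and Cot½ values are hypothetical numbers chosen only to separate the three steps, and the names are not from the slides.

```python
# (class name, fraction of genome, Cot1/2): hypothetical illustrative values.
COMPONENTS = [
    ("highly repetitive", 0.25, 1e-3),
    ("middle repetitive", 0.30, 1e0),
    ("single copy", 0.45, 1e3),
]


def multi_class_fraction(cot, components):
    """Fraction of DNA still single-stranded at a given Cot, summed over
    classes that each follow ideal second-order kinetics (an assumption)."""
    return sum(frac / (1.0 + cot / cot_half) for _, frac, cot_half in components)


# Tabulate the curve on a log scale, as on the X-axis of a "Cot" plot.
for exponent in range(-5, 7):
    cot = 10.0 ** exponent
    print(f"Cot = 1e{exponent:+d}  C/Co = {multi_class_fraction(cot, COMPONENTS):.3f}")
```

In the printed table the fraction drops in three stages: first as the highly repetitive class reanneals, then the middle repetitive class, and finally the single copy class.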