2. Gene libraries play a central role in gene cloning experiments.
To isolate a specific gene from a source (plant, animal or
microorganism), the genome is fragmented into small pieces, which are then
cloned into a suitable vector.
The entire collection of clones is then screened to identify the clone
carrying the specific gene.
The rest of the clones could be thrown away, but if another piece of DNA is
required from the same source, the whole process would have to be repeated.
Instead, the entire collection of clones from a particular source can be
stored as a gene library.
3. A gene library is thus a collection of clones which together represent the
entire genome of an organism. More specifically, this type of library is
referred to as a genomic library.
Another type of gene library is the cDNA library.
This library is constructed from DNA copies of the mRNA present in the
cells.
These DNA copies of mRNA are called copy or complementary DNA
(cDNA), hence the name cDNA library.
4. Genomic libraries:
Step 1: Fragment the genomic DNA
The genomic DNA is fragmented into pieces of a suitable size. These pieces are then
cloned into an appropriate vector.
The simplest way to fragment the DNA is to digest it completely with a restriction
endonuclease, e.g. EcoRI.
If EcoRI sites are randomly distributed, digestion produces an average fragment size of
about 4 kb: EcoRI is a hexacutter (GAATTC), so a site is expected once every
4^6 = 4096 bp on average.
In practice, however, some fragments are very much bigger and some are very small.
Ligation works best with smaller fragments, so these smaller fragments would be
over-represented in the library.
On the other hand, some of the largest fragments may be too big to be cloned at all.
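The 4^6 estimate and the idea of a complete digest can be sketched in a few lines of Python. This is only an illustration: it assumes equal base frequencies, looks at one strand only, and places the cut after the G of GAATTC. The function names and the short test sequence are made up for the example.

```python
def expected_fragment_size(site_length):
    """Average spacing (bp) between sites of a given length,
    assuming all four bases are equally frequent and independent."""
    return 4 ** site_length


def complete_digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of the site.
    cut_offset=1 places the cut after the first base (G^AATTC)."""
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # piece after the last site
    return fragments


print(expected_fragment_size(6))                  # hexacutter spacing
print(complete_digest("AAGAATTCTTTTGAATTCGG"))    # toy sequence, two sites
```

On a real genome the fragment sizes scatter widely around the average, which is the point made above about very large and very small fragments.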
5. Moreover, if restriction sites are not randomly distributed, large regions of
the genome may have no site at all for that enzyme.
Further, for genome sequencing, it is desirable to isolate the adjacent
DNA as well.
This is required to put together all the pieces of DNA and build up a bigger
picture.
To provide this information, we need a library of overlapping fragments.
This type of library is produced by partial digestion.
Partial digestion is achieved by using short digestion times,
very small amounts of enzyme, incubation at a reduced
temperature, or a combination of these.
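Why partial digestion gives overlapping fragments can be seen in a small simulation. The sketch below is a simplification, not a model of real enzyme kinetics: each site on each DNA molecule is assumed to be cut independently with the same probability `cut_prob`, which stands in for short digestion time or low enzyme amount. The function names and the toy sequence are illustrative.

```python
import random


def find_sites(sequence, site="GAATTC"):
    """Return the start positions of every occurrence of the site."""
    positions = []
    pos = sequence.find(site)
    while pos != -1:
        positions.append(pos)
        pos = sequence.find(site, pos + 1)
    return positions


def partial_digest(sequence, cut_prob, rng, site="GAATTC", cut_offset=1):
    """Digest one molecule, cutting each site with probability cut_prob."""
    fragments = []
    start = 0
    for pos in find_sites(sequence, site):
        if rng.random() < cut_prob:
            cut = pos + cut_offset
            fragments.append(sequence[start:cut])
            start = cut
    fragments.append(sequence[start:])
    return fragments


genome = "AAGAATTCTTTTGAATTCGGCCGAATTCAA"   # toy sequence with three sites
rng = random.Random(0)                       # seeded for repeatability
for _ in range(5):                           # five copies of the same genome
    print(partial_digest(genome, 0.5, rng))
```

Because different copies are cut at different subsets of sites, a fragment from one copy can span the junction between two fragments from another copy, which is what makes the library overlapping.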
6. Even in a partial digest, some sites may be cut more efficiently than others,
so the library of partial-digest fragments will not be completely
overlapping.
One way of overcoming this is to use separate partial digests with different
enzymes.
Even with such a library, it is still possible that some regions of the genome
will be over-represented and other regions will occur less frequently.
The only way to avoid this is mechanical shearing of the genome.
7. Mechanical shearing is a very effective way of generating relatively large
fragments and requires no special equipment.
But as the average fragment size decreases, the vulnerability of the DNA to
shearing diminishes.
A more elegant way is to use ultrasonication.
Although mechanical shearing is more attractive than partial digestion,
restriction digestion is still used more frequently.
This is because the extent of fragmentation is easier to control.
Also, restriction fragments can be ligated directly to the vector.
In contrast, mechanical shearing produces either blunt or ragged ends (i.e.
variable lengths of single-stranded (SS) regions at the 5' or 3' ends).
8. Choice of vectors:
The choice of vector depends on the size of the insert that the vector can
accommodate.
The insert size also affects the size of the library, that is, the number of
clones needed.
The size of the library is also affected by the total size of the genome of
the target organism.
It is not possible to produce a library that is guaranteed to carry all of the
genetic information from the original genome.
Instead, we have to work with probabilities.
9. We can set the probability of the library containing the gene that we want
at, say, 90% (P = 0.9) or 99% (P = 0.99). The level at which this probability
is set affects the required size of the library.
The number of independent clones needed can be calculated from the
formula:
N = ln(1 - P) / ln(1 - F)
where
N = the required number of clones
P = the probability of the library containing the desired piece of DNA
F = the fraction of the genome represented by an average clone (average insert
size divided by the total genome size)
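As a rough worked example of the formula above, the following Python sketch computes N for a chosen P, insert size and genome size. The function name and the numbers used (a 20 kb average insert and a genome of about 4.6 Mb, roughly the size of the E. coli genome) are illustrative assumptions, not values from these notes. The result is rounded up because a fraction of a clone is not possible.

```python
import math


def clones_needed(p, insert_size_bp, genome_size_bp):
    """Number of independent clones N = ln(1 - P) / ln(1 - F),
    where F = average insert size / total genome size."""
    if not 0 < p < 1:
        raise ValueError("P must be between 0 and 1 (exclusive)")
    f = insert_size_bp / genome_size_bp
    if not 0 < f < 1:
        raise ValueError("insert must be smaller than the genome")
    return math.ceil(math.log(1 - p) / math.log(1 - f))


# Illustrative values only: 20 kb inserts, 4.6 Mb genome.
for p in (0.90, 0.99):
    print(p, clones_needed(p, 20_000, 4_600_000))
```

Raising P from 0.9 to 0.99 doubles the number of clones required, since ln(0.01) is twice ln(0.1). Larger inserts (larger F) reduce N, which is why vector capacity matters for large genomes.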
11. cDNA libraries:
The advantage of a cDNA library is that cDNA clones reflect only the
coding regions of a gene, so the gene can be expressed in a bacterial host,
which is unable to splice out introns.
A library based on mRNA, unlike a genomic library, reflects only those
genes that are actually expressed in a particular cell or tissue sample at a
particular time.
Thus a cDNA library from leaves will be different from a flower or seed
cDNA library from the same plant.
A cDNA library therefore reflects the nature of the cells from which the mRNA
was obtained. However, the mRNA cannot be cloned directly.
12. A complementary DNA (cDNA) has to be produced.
The synthesis of cDNA is carried out using an enzyme known as reverse
transcriptase.
Isolation of mRNA:
In eukaryotes, mRNA carries a poly(A) tail at its 3' end.
The polyadenylated mRNA can anneal to synthetic oligo(dT) sequences.
Other RNA species and non-RNA components will not anneal and can be
washed off.
When the RNA preparation is passed through a column of a polymer
coated with synthetic oligo(dT) fragments, the poly(A) tail anneals to
the oligo(dT) residues and the mRNA is retained on the column, while other
RNA species pass through.
13. This is in fact a hybridization process.
The hybrids can be made unstable by lowering the salt
concentration and raising the temperature.
Thus pure mRNA can be eluted from the column.
This will be a complex mixture of all the mRNA species present in
the cell at the time of extraction.
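The selection principle of the oligo(dT) column can be mimicked in code as a simple filter on sequences. This is a toy model only: real retention depends on salt concentration and temperature, as noted above, and the minimum tail length used here (`min_tail=10`) is an arbitrary assumption, as are the function name and the example sequences.

```python
def select_polyadenylated(rnas, min_tail=10):
    """Keep only RNAs whose 3' end carries at least min_tail A residues,
    mimicking retention on an oligo(dT) column."""
    selected = []
    for rna in rnas:
        tail_length = len(rna) - len(rna.rstrip("A"))
        if tail_length >= min_tail:
            selected.append(rna)
    return selected


total_rna = [
    "AUGGCUUAA" + "A" * 20,  # polyadenylated mRNA: retained
    "GGCUAGCUAGCUAGGC",      # no poly(A) tail: passes through
    "AUGCCCUGA" + "A" * 3,   # tail too short in this toy model
]
mrna_pool = select_polyadenylated(total_rna)
print(len(mrna_pool), "of", len(total_rna), "RNAs retained")
```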
14. cDNA synthesis:
The presence of the poly(A) tail is also useful in the reverse transcription
step.
An oligo(dT) primer anneals to the poly(A) tail, and reverse transcriptase
then extends this primer, using the mRNA as the template, to produce a
single-stranded (SS) cDNA copy.
The RNA in the resulting mRNA-cDNA heteroduplex can be degraded either by
alkali treatment or with RNase H.
Following alkali treatment, the cDNA strand is largely in SS form.
SS nucleic acid molecules tend to form secondary structures, looping back on
themselves, because of the hydrophobicity of the bases.
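In sequence terms, the first-strand cDNA made in the reverse transcription step is the reverse complement of the mRNA, written in DNA bases. A minimal Python sketch, with a made-up function name and a toy mRNA, and ignoring incomplete extension by the enzyme:

```python
# RNA template base -> DNA base incorporated opposite it
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the first-strand cDNA, written 5'->3',
    for an mRNA that is also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


mrna = "AUGGCCAAAAAA"            # toy mRNA with a short poly(A) tail
print(first_strand_cdna(mrna))   # TTTTTTGGCCAT
```

The run of T at the 5' end of the cDNA corresponds to the oligo(dT) primer annealed to the poly(A) tail.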
15. The SS cDNA thus forms a hairpin loop at the 3' end.
This hairpin loop is used by DNA polymerase I to prime the synthesis of the
second strand.
The product is a double-stranded (DS) DNA molecule with a hairpin loop at one end.
The loop is then removed by treatment with S1 nuclease.
However, this method has a major disadvantage: S1 nuclease causes loss of
sequences at the 5' end of the cDNA.
16. Alternatively, degradation of the RNA strand with RNase H leaves small
RNA fragments, which act as primers for second-strand synthesis.
This also avoids the need to cut the hairpin with S1 nuclease.
For cloning the cDNA, adaptors are added by blunt-end ligation.
This makes the cDNA molecules compatible with the chosen vector.
The vector chosen is usually a plasmid vector or a phage lambda insertion
vector such as λgt10 or λgt11.
17. For bacteria, cDNA libraries are rarely produced.
Because bacterial genomes are small and introns are absent, a genomic
library is adequate and easier to construct.
There are also technical difficulties in producing cDNA from bacteria:
The mRNA is not polyadenylated.
The mRNA is remarkably unstable. Many bacterial mRNA species have a half-life
of only a minute or two.
Bacterial genes are organized into polycistronic operons, so a bacterial
mRNA can be as much as 10-25 kb in length. It is difficult to isolate such an
mRNA intact, and it would be very difficult to produce a full-length cDNA
copy from it. As a result, bacterial cDNA libraries are rarely produced.