1. Polymerase Chain Reaction
and Primer Design
Workshop: “Cluster classification of Mycobacteriophages Isolated
from Tropical soils of Puerto Rico”
Angélica M. González
Pablo González
Carolina Montañez
Natalia A. Manzano
2. Polymerase Chain Reaction (PCR)
∗ Is an in vitro molecular replication technique
used to amplify any specific segment of
DNA based on sequence specificity.
∗ Used in a wide range of experimental and
diagnostic applications.
3. The Inventor
“We are the recipients of
scientific method. We
can each be a creative
and active part of it if
we so desire.”
Kary Mullis (1983)
http://www.420hook.com/?p=12229
4. Essential components of PCR
∗ Taq Polymerase
∗ DNA Primers
∗ Deoxynucleotide triphosphates (dNTPs)
∗ DNA template
∗ Thermocycler
References: http://tolweb.org/treehouses/?treehouse_id=472 waynesword.palomar.edu dynamicscience.com.au nature.com
5. Steps in PCR
Denaturation: 95°C
Annealing: between 50°C and 65°C
Extension: 72°C
http://www.genes.com/pcr/pcrinfo.html
6. DNA Primers
∗ A primer is a short synthetic oligonucleotide which is used in many molecular
techniques from PCR to DNA sequencing.
∗ They are designed to have a sequence which is the reverse complement of a
region of template or target DNA to which we wish the primer to anneal.
References: http://bioweb.uwlax.edu
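The reverse-complement relationship described on this slide can be sketched in a few lines of Python. This is only an illustration: the helper name `reverse_complement` and the example sequence are invented for this sketch and do not come from the workshop material.

```python
# Minimal sketch: derive a primer sequence as the reverse complement of a
# target region. The function name and the example sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Invented 10-base region of a template strand, written 5'->3'.
template_region = "ATGGCCTTAC"
primer = reverse_complement(template_region)
print(primer)
```

A primer built this way anneals to the given strand in antiparallel orientation, which is why the sequence is reversed as well as complemented.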
7. Primer Design
∗ Bioinformatics tools are useful for primer design,
- both for known and unknown DNA target sequences.
Example: Primer-BLAST
http://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC=BlastHome
∗ Multiple Sequence Alignments (MSAs): determine the most frequent base at
each position of an unknown target sequence.
∗ Complementarity-based design.
References: obiolabs.com filebowl.com dnasoftware.com
8. Primer Design
∗ Considerations:
- Primers should be 17-28 bases in length.
- Base composition should be 50-60% (G+C).
- Primers should end (3') in a G or C, or CG or GC: this
prevents "breathing" of ends and increases
efficiency of priming.
9. Primer Design (continued)
∗ Melting temperatures between 55 and 80°C are
preferred.
∗ 3'-ends of primers should not be complementary
(i.e., able to base pair with each other), as otherwise primer dimers will be
synthesised preferentially to any other product.
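The considerations listed on these two slides can be turned into a rough screening script. The sketch below is an assumption-laden illustration, not the workshop's method: the slides do not say how melting temperature is calculated, so the Wallace rule (2°C per A/T, 4°C per G/C) is used as a crude stand-in, and all function names and primer sequences are hypothetical.

```python
# Rough primer screen based on the slide's rules of thumb.
# Assumptions: Tm is estimated with the Wallace rule (a crude approximation
# best suited to short oligos); names and sequences are invented.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(primer: str) -> float:
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Approximate Tm in degrees C: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def check_primer(primer: str) -> dict:
    """Apply the length, G+C, 3'-end and Tm guidelines to one primer."""
    return {
        "length_ok": 17 <= len(primer) <= 28,
        "gc_ok": 50.0 <= gc_percent(primer) <= 60.0,
        "gc_clamp": primer.upper()[-1] in "GC",
        "tm_ok": 55 <= wallace_tm(primer) <= 80,
    }


def three_prime_complementary(p1: str, p2: str, n: int = 3) -> bool:
    """True if the last n bases of both primers can pair (primer-dimer risk)."""
    return p1.upper()[-n:] == reverse_complement(p2[-n:])


forward = "AGCTGACCTGAAGCTCATCG"  # invented example
reverse = "TTGACCGTAGCATCGGTCAC"  # invented example
print(check_primer(forward))
print(check_primer(reverse))
print("3' ends complementary:", three_prime_complementary(forward, reverse))
```

Dedicated tools such as Primer-BLAST use more refined thermodynamic models, so a script like this is only useful as a first-pass filter.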
DNA primers used in PCR amplification can be designed using bioinformatics tools. Primers must be complementary to relatively closely spaced regions that flank the area of interest targeted for amplification. Computer programs, such as Primer-BLAST from the National Library of Medicine, are available to aid in the design of matched primer pairs when the DNA target sequence is precisely known. Many other techniques are available to design PCR primers when the exact target sequence is not known or when only the encoded amino acid sequence is defined. A common strategy for designing primers to amplify unknown targets uses bioinformatics tools to generate multiple sequence alignments of related sequences and determine the most frequent base at each position. Designed DNA primers are synthesized on an automated instrument, then purified and diluted for use in PCR amplification reactions.
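The "most frequent base at each position" strategy can be sketched as a simple consensus calculation over an alignment. The alignment below is invented, the function name `consensus` is hypothetical, and ignoring gap characters is a simplifying assumption of this sketch.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Return the most frequent base at each column of an alignment.

    Assumes all sequences are already aligned to the same length and that
    gaps are written as '-'. Gaps are ignored when counting (an assumption).
    """
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned_seqs):
        counts = Counter(base for base in column if base != "-")
        result.append(counts.most_common(1)[0][0] if counts else "-")
    return "".join(result)


# Invented alignment of four related sequences.
alignment = [
    "ATGCCGTA",
    "ATGACGTA",
    "ATGCCGTT",
    "ACGCCGTA",
]
print(consensus(alignment))
```

In practice, positions where no base clearly dominates are often covered with degenerate bases rather than a single consensus letter; that refinement is left out here.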
1-4. The primers should be of approximately equal length and have similar annealing temperatures, which depend on the percentage of guanine and cytosine bases (these form complementary base pairs with three hydrogen bonds). 5. A dimer is a macromolecular complex formed by two, usually non-covalently bound, macromolecules such as proteins or nucleic acids. In proteins, it is a form of quaternary structure.
The article reports 3,357 open reading frames (ORFs) in the mycobacteriophage genomes [an open reading frame is a DNA sequence that does not contain a stop codon (such as "TGA", "TAA" or "TAG" for DNA) in a given reading frame]. To make sense of this genetic diversity, the open reading frames have been grouped into 1,536 gene phamilies (groups of related sequences that share amino acid sequence similarity). This figure shows the complex relationships among the members of Phamily 7. Pham 7 is one of the 3 phamilies that contain members in all 30 mycobacteriophage genomes, and it is among the most abundant. This phamily is notable because it contains the Lysin A genes, one of the 2 known lysins encoded by mycobacteriophages (lysins: phage-encoded enzymes that break down the bacterial cell wall so that newly formed phage particles can be released). This figure shows why it is said that no single sequence element within Pham 7 is present in all 30 genomes. For example, here we compare the mycobacteriophage Wildcat gp49 with other mycobacteriophage proteins. We can see that only 16 other mycobacteriophage proteins match some region of the Wildcat gp49 sequence, meaning that the members are related but also highly diverse. (Legend: colored bars represent the strength of the matches, with red being the strongest, followed by purple, blue and black.) This phamily, like the ones presented next, is very diverse because it encodes proteins responsible for phage functions that help phages interact with specific bacterial hosts.
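The ORF definition given in these notes can be illustrated with a small scan for stop-codon-free stretches. This is a simplified sketch, not the gene-calling method used in the article: it only looks at the three forward reading frames, and requiring an ATG start codon is an added assumption. The function name and the example sequence are hypothetical.

```python
# Simplified ORF scan over the three forward reading frames.
# Assumptions: an ORF starts at ATG and runs to the first in-frame stop
# codon; the reverse strand is not examined.
STOP_CODONS = {"TGA", "TAA", "TAG"}


def find_orfs(dna: str, min_codons: int = 2):
    """Return (frame, start, end) tuples; end is just past the stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


sequence = "CCATGAAATTTTAGGG"  # invented example
for frame, start, end in find_orfs(sequence):
    print(frame, start, end, sequence[start:end])
```

Real phage annotation also considers alternative start codons, both strands and coding potential, which this sketch does not attempt.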
Another of the phamilies that contain members in all 30 mycobacteriophage genomes is Phamily 23. The sequences in this phamily encode the tape-measure protein (Tmp), which plays a role in tail assembly and determines the length of non-contractile tails. This figure shows a phylogenetic analysis that accounts for the complexity of this phamily. [Phylogenetic analysis: phylogenetic methods can be used for many purposes, including the analysis of morphological data and several kinds of molecular data. For DNA and protein sequences they are useful for comparing more than two sequences, analyzing gene families (including functional predictions), and estimating evolutionary relationships among organisms.] The long branch lengths make the complexity evident. In this phylogenetic analysis, the amino acid sequences of the 30 constituent members of Pham 23 were aligned using the program ClustalW, and the unrooted phylogenetic relationships were represented using NJTree. Bootstrap values from the reiterations are shown (a bootstrap value indicates how often a branch is recovered when the tree is rebuilt from resampled versions of the alignment, so it measures the support for that branch).
The third phamily that has members in all 30 mycobacteriophage genomes is Phamily 28. This phamily also encodes a tail protein, specifically a minor tail protein, which is why it is also quite complicated. A total of 81 genes fall into this phamily, but no single sequence element in it is present in all 30 genomes. This figure shows the relationships between some members of this phamily. As we can see, there is a high level of amino acid sequence identity, especially among Llij gp18, Che8 gp18 and Che8 gp19, and a somewhat lower level with other members of Pham 28.
The comparison of mycobacteriophage genomes at the nucleotide level reveals not only considerable genetic diversity, but also small groups of phages that appear to be more closely related to each other than they are to other mycobacteriophages. The relationships among these phages were examined using the program SplitsTree, as shown in the picture, which accommodates alternative phylogenetic relationships and is based here on whether or not each genome contains a member of each gene phamily. This analysis reveals 6 clearly defined groups of genomes, designated Clusters A to F (shown here by colored circles), whose members are more closely related to each other than to other mycobacteriophages. The importance of this figure is that it helps to organize the information: comparing and contrasting the genomes allows each one to be placed in a cluster.
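The idea of comparing genomes by which gene phamilies they contain can be sketched with a presence/absence comparison. This is not the SplitsTree analysis itself: the sketch swaps in a plain Jaccard distance between phamily sets, and the phage names and phamily contents below are invented purely for illustration.

```python
from itertools import combinations

# Hypothetical data: which phamily numbers occur in each (invented) genome.
pham_content = {
    "PhageA": {1, 7, 23, 28},
    "PhageB": {1, 7, 23, 28, 40},
    "PhageC": {7, 23, 28, 90, 91},
}


def jaccard_distance(a: set, b: set) -> float:
    """1 - (shared phamilies / all phamilies found in either genome)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(content: dict) -> dict:
    """Distance for every pair of genomes, keyed by (name1, name2)."""
    return {
        (g1, g2): jaccard_distance(content[g1], content[g2])
        for g1, g2 in combinations(sorted(content), 2)
    }


for pair, dist in pairwise_distances(pham_content).items():
    print(pair, round(dist, 3))
```

Genomes with small pairwise distances share most of their phamilies and would be candidates for the same cluster; a network method such as the one in SplitsTree goes further by displaying conflicting signals instead of forcing a single tree.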