This study investigated unspecific products that can form during quantitative PCR (qPCR). Several primer pairs known to produce artifacts like primer dimers or multiple peaks were tested. Primer dimers were amplified from no-template controls and sequenced but their sequences could not be determined. Statistical analysis found primer pairs producing primer dimers had significantly higher melting temperatures and self-complementarity compared to a control group. Multiple peaks were found to result from amplification of two different length products from the same gene, likely due to additional binding sites forming on the amplified sequence. Understanding the nature and causes of artifacts can help improve qPCR reliability and accuracy.
Real-Time PCR
The Polymerase Chain Reaction (PCR) is a process for the
amplification of specific fragments of DNA.
Real-Time PCR is a specialized technique that allows a PCR reaction
to be visualized “in real time” as the reaction progresses.
Real-Time PCR allows us to measure minute amounts of DNA
sequences in a sample.
Uses of Real-Time PCR
Real-Time PCR has become a cornerstone of molecular biology:
Gene expression analysis
Cancer research
Drug research
Disease diagnosis and management
Viral quantification
Food testing
Testing of GMO food
Animal and plant breeding
Gene copy number
Polymerase Chain Reaction
History of PCR
Instrumentation of PCR
Principle of PCR
Components of PCR
Steps of PCR
Optimal PCR Factors
Applications of PCR
An introduction to real-time PCR is given: its basic principle, the operating procedure, a diagrammatic representation of the process, its advantages and disadvantages, and its applications.
Reverse Transcriptase PCR (RT-PCR) is a variation of the polymerase chain reaction that amplifies target RNA. Addition of reverse transcriptase (RT) enzyme prior to PCR makes it possible to amplify and detect RNA targets.
Reverse transcriptase enzyme transcribes the template RNA and forms complementary DNA (cDNA). Single-stranded cDNA is converted into double-stranded DNA using DNA polymerase. These DNA molecules can now be used as templates for a PCR reaction
PCR is a polymerase chain reaction in which target DNA gets amplified. There are various modifications to PCR reaction to increase sensitivity and specificity such as touchdown PCR, Real time PCR, Hot start PCR, RT-PCR, Colony PCR and asymmetric PCR.
RT-PCR (reverse transcription-polymerase chain reaction) is a variant of the polymerase chain reaction (PCR) that is now widely used. Traditionally RT-PCR involves two steps: the RT reaction and PCR amplification. RNA is first reverse transcribed into cDNA using a reverse transcriptase as described here, and the resulting cDNA is used as template for subsequent PCR amplification using primers specific for one or more genes. RT-PCR can be used to quantify mRNA levels from much smaller samples. In fact, this technique is sensitive enough to enable quantitation of RNA from a single cell.
1. ARTIFACTS IN QPCR
Investigating unspecific products
Master Degree Project in
Molecular Biology
One year Level 30 ECTS
Autumn term 2014
Lucia De Mojà
a13lucde@student.his.se
Supervisor: Robert Sjöback
Examiner: Mikael Ejdebäck
2. Abstract
Unspecific products are a common artifact in PCR and its variants. The artifacts include
primer-dimers (PDs) and double or multiple peaks in the melt curve. The aim of this
experiment was to analyze the sequences of the unspecific products to define their
composition. The results showed that with the method used in this experiment it was not
possible to obtain sequences for the primer-dimers. Statistical analyses were performed in
which the characteristics of the tested primer pairs were compared with a control group of
primer pairs that do not produce unspecific products. The unpaired t test revealed that the
differences between the group of primer pairs giving PDs and the control group were
statistically significant (95% confidence) regarding Tm and self 3’ complementarity, with
p-values of 0.0012 and 0.0477 respectively. This leads to the conclusion that the criteria for the
design of these two characteristics (below 3-5 °C Tm mismatch between Fw and Rv) might be
too permissive: even when a primer pair is designed within the range allowed to avoid PDs,
PDs can still form. To increase the reliability of these data, a further study with a
higher number of samples should be performed, selecting only primer pairs that are already
part of the experiment as test samples. As for the multiple peaks, the sequences of
the PCR products were aligned against the expected sequence and the Fw primer used,
showing that an additional binding site was generated on the target gene during
qPCR.
3. Summary
A gene is a segment of DNA that contains a specific sequence of double stranded nucleotides.
To be able to amplify a gene it is necessary to know its sequence so that a primer (short
sequence of nucleotides complementary to a certain segment of DNA) can be designed to
bind to the DNA. During amplification, an enzyme called polymerase starts to add
nucleotides to the short sequence until the whole target is duplicated. A fluorescent dye can be
used as a reporter that keeps track of the amount of DNA that is duplicated during this
process. Reagents involved in a qPCR reaction include the template DNA, the primers and the
fluorescent dye. The dye can bind to the double stranded DNA emitting a fluorescent signal (a
common dye used in qPCR is SYBR® Green), so the more double stranded DNA is present in
the reaction mix, the more intense the fluorescent signal will be. As the intensity of the signal
grows exponentially, the amplification curve grows exponentially too, until a plateau
is reached and the growth stops because the reagents in the reaction mix are depleted. A melt
curve is generated by reading the fluorescent signal as the temperature is slowly increased.
When the DNA strands separate (melt), the dye is released from the DNA and loses its
fluorescence. This can be translated into a graphical curve and, depending on the length of the
fragments of DNA that are duplicated, each peak will have a specific melting temperature
(Tm) that corresponds to the temperature at which the peak is reached in the melting curve. A
good qPCR melt curve representing the amplification of one single target is supposed to
present only one clear peak. Based on these concepts, unspecific products can be visible in the
melt curve as additional peaks that make the curve difficult to read and compromise the
quality of the results. Because a good qPCR result is necessary to give accurate
answers to questions such as the identification of a gene involved in cancer development,
eliminating artifacts of this nature is essential in many
different fields, including not only scientific research but also forensic science.
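The melt-peak idea described above can be sketched in a few lines. In melt curve analysis the peak is commonly located as the maximum of the negative first derivative of fluorescence with respect to temperature (-dF/dT). The sketch below is only an illustration under stated assumptions: the sigmoid-shaped signal, the Tm values (76 °C and 84 °C), the product amounts and the function names `melt_peaks` and `synthetic_melt` are all hypothetical and are not data or methods from this study.

```python
import math


def melt_peaks(temps, fluorescence, min_height=0.0):
    """Return (temperature, -dF/dT) for each local maximum of the negative derivative."""
    mids, neg_dfdt = [], []
    # Finite-difference derivative, evaluated at the midpoint of each interval.
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        mids.append((temps[i] + temps[i + 1]) / 2)
        neg_dfdt.append(-(fluorescence[i + 1] - fluorescence[i]) / dt)
    peaks = []
    for i in range(1, len(neg_dfdt) - 1):
        is_local_max = neg_dfdt[i] > neg_dfdt[i - 1] and neg_dfdt[i] >= neg_dfdt[i + 1]
        if is_local_max and neg_dfdt[i] > min_height:
            peaks.append((mids[i], neg_dfdt[i]))
    return peaks


def synthetic_melt(temps, products):
    """Hypothetical signal: each (tm, amount) product loses its fluorescence
    as a sigmoid centred on its Tm. This is a toy model, not measured data."""
    return [
        sum(amount / (1 + math.exp((t - tm) / 0.5)) for tm, amount in products)
        for t in temps
    ]


temps = [60 + 0.25 * i for i in range(141)]  # 60 °C to 95 °C in 0.25 °C steps
single = synthetic_melt(temps, [(84.0, 1.0)])               # one product
double = synthetic_melt(temps, [(76.0, 0.4), (84.0, 1.0)])  # target plus an artifact

for label, signal in (("single product", single), ("two products", double)):
    found = melt_peaks(temps, signal, min_height=0.05)
    print(label, [round(t, 2) for t, _ in found])
```

The `min_height` threshold is a design choice that suppresses numerical noise in the flat parts of the curve; with real instrument data some smoothing would usually be needed before differentiating.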
The present study was carried out in order to investigate the nature of the unspecific products
that are formed during Quantitative Real-Time Polymerase Chain Reaction (also known as
qPCR), a technology used in molecular biology to amplify DNA fragments. Unspecific
products are not supposed to be amplified during qPCR since this is considered to be a
specific reaction due to the specificity of the primer pair used. The amplification of any
product requires material to be used and all the reagents present in the reaction mix are meant
to be used for the amplification of the DNA target. In the presence of unspecific products,
some of the reagents present in the reaction mix are used to amplify the wrong DNA fragment
and the consequence is that there is a loss of reagent and less amplification of the desired
target possibly giving errors in quantification, and a very unclear result.
The data analysis shows that even though the primer pairs were designed considering Tm and
self 3’ complementarity below the maximum values (5°C of difference between Fw and Rv
primer and less than 4 as score for the complementarity), there is a significant difference
between the complementarity and Tm of the test group (giving PDs) and the control group (PD-free). In
order to reduce the occurrence of the artifacts, stricter values (less than 2 °C of difference
between Fw and Rv primer for the Tm and a complementarity score of no more than 1)
for these two characteristics should be used during the design process. Furthermore, in this
experiment, double peaks have been proven to be the result of the amplification of two
different products in the same reaction due to the presence of multiple binding sites on the
PCR product (not on the original sequence of the target gene), that results in the amplification
of two versions of the same gene with different length. The reason for the formation of the
additional binding site is yet to be established with further experiments.
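The two sets of design limits mentioned above (the usual ones and the stricter ones proposed here) can be written as a simple check. This is a minimal sketch: the thresholds come from the text, while the class name `PrimerPair`, the example pairs and their values are hypothetical. Applying the complementarity limit to the worse of the two primers is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class PrimerPair:
    name: str        # hypothetical label
    tm_fw: float     # Tm of the Fw primer (°C)
    tm_rv: float     # Tm of the Rv primer (°C)
    self3_fw: float  # self 3' complementarity score of the Fw primer
    self3_rv: float  # self 3' complementarity score of the Rv primer

    @property
    def tm_difference(self) -> float:
        return abs(self.tm_fw - self.tm_rv)

    @property
    def worst_self3(self) -> float:
        # Assumption: the limit is applied to the higher of the two scores.
        return max(self.self3_fw, self.self3_rv)


def passes_usual_limits(pair: PrimerPair) -> bool:
    """Tm difference below 5 °C and complementarity score below 4."""
    return pair.tm_difference < 5.0 and pair.worst_self3 < 4.0


def passes_strict_limits(pair: PrimerPair) -> bool:
    """Tm difference below 2 °C and complementarity score of no more than 1."""
    return pair.tm_difference < 2.0 and pair.worst_self3 <= 1.0


# Hypothetical example values, not primers from this study.
pairs = [
    PrimerPair("pair_A", tm_fw=60.1, tm_rv=59.4, self3_fw=0.0, self3_rv=1.0),
    PrimerPair("pair_B", tm_fw=61.5, tm_rv=58.2, self3_fw=2.0, self3_rv=3.0),
]
for pair in pairs:
    print(pair.name, passes_usual_limits(pair), passes_strict_limits(pair))
```

A pair such as the hypothetical `pair_B` is accepted by the usual limits but rejected by the stricter ones, which is the situation the data analysis points to.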
4. Table of contents
List of abbreviations
Introduction
qPCR
Artifacts
Primer-Dimers
Double Peaks
Aim
Materials and Methods
Criteria for the primer pairs
qPCR
Purification
Gel electrophoresis and extraction
Capillary gel electrophoresis
Sample preparation for sequencing
Statistical analysis
Results
Primer-dimers
Double peaks
Statistical results
Discussion
Primer-dimers
Double peaks
Conclusions
Future perspectives
Acknowledgements
References
Appendices
Appendix 1 – Kits and reagents
List of abbreviations
bp – base pair
dNTPs – deoxynucleotide triphosphates
ddNTPs – dideoxynucleotide triphosphates
dsDNA – double stranded DNA
Fw – forward
NTC – no template control
PDs – primer-dimers
PPi – pyrophosphate
qPCR – Quantitative Real-Time Polymerase Chain Reaction
Rv – reverse
SD – standard deviation
ssDNA – single stranded DNA
Tm – melting temperature
Introduction
qPCR
PCR is a widely used method for amplification of nucleic acids. The principle behind any
PCR reaction is based on the fact that nucleic acids can be amplified thanks to an enzyme
called polymerase, which recognizes the nucleotides’ sequence and creates a chain reaction in
which the free nucleotides present in the reaction mix are paired to their complementary ones
on the nucleic acid chain supposed to be amplified. For the reaction to be successful, it is
important for the reaction mix to have a balanced amount of primers, salts, deoxynucleotide
triphosphates (dNTP) and the targeted nucleic acid. The amplification of the fragments takes
place in a thermocycler, which is programmed to change the temperature according to several
steps in order to cause the denaturation of the genetic material that will provide a single
stranded nucleic acid, to which the primers can then anneal at the right melting temperature
and the polymerase can elongate the target. Each step is performed at a specific temperature
that will be maintained for several seconds and then automatically changed to proceed with
the reaction (Wilson and Walker, 2010). Completing all the steps constitutes one cycle,
and a normal PCR reaction is considered to run for around 40 cycles.
qPCR is a quantitative PCR method that measures in real time the amount of product
generated in each cycle. It is based on the detection of a fluorescent signal that can be
generated from fluorescent dyes like SYBR® Green, EvaGreen®, BOXTO (that bind the DNA
non-specifically within the double strand), or probes (that bind specifically along the
nucleotides’ sequence) (Reed, et al., 2013). These two approaches differ on the fact that, in
the case of the probe based method, the fluorescent signal is free from interferences since it is
the direct result of the binding between the probe and the target sequence, while the dye based
method gives a signal that is directly proportional to the quantity of double-stranded DNA
bound by the dye itself, therefore it cannot distinguish between different sequences
(Vandesompele, 2009, pp. 12-19). This implies that unspecific products, which
are created during the reaction, will also contribute to the fluorescent signal emitted. The detector
will wrongly assign this signal to the target, introducing a margin of imprecision within the method itself.
This kind of interference does not take place in the probe based method due to its specificity,
so it is not possible to have a signal for the unspecific products formed using this method but
this does not imply that they are not formed (Poritz and Ririe, 2014).
Artifacts
It is not unusual to encounter unexpected results during PCR and qPCR, especially when
using SYBR® Green technology. Depending on different conditions, the kind of artifact that
can be produced goes from multiple peaks in the same melt curve, to the amplification of
unspecific products like primer-dimers in no template controls (NTC). This is because
SYBR® Green binds to double stranded DNA regardless of whether that dsDNA is the target
of the experiment or not (Tajadini, et al., 2014).
Primer-Dimers
PDs are thought to be the principal source of the artifacts in PCR and its variants. They are
stated to be the result of the annealing between forward (Fw) and reverse (Rv) primers
(present in the reaction mix during PCR), due to a certain level of complementarity between
them. The theory behind the formation of primer dimers as the result of self-annealing
between the primers pair is a common concept within the scientific community (Reed, et al.,
2013). This concept should lead to the logical conclusion that primer-dimer products are
not supposed to exceed in length the sum of the nucleotide sequences of the two primers
involved (SantaLucia, 2007).
Primer-dimer artifacts have, however, been shown to be a few nucleotides longer than that (figure 1),
which calls their composition into question, in contrast to the common
conception (Brownie, et al., 1997; SantaLucia, 2007).
Figure 1: Image taken from a study conducted by Brownie et al. in 1997 (The elimination of primer-dimer
accumulation in PCR. Nucleic Acids Research, 25(16)), in which they demonstrated that the
sequence of the primer-dimers is longer than the sum of the primer pair because some
additional nucleotides are inserted between the two primers. A, B, C, D, E and F are the primers and
the Rv complements used to perform the alignment with the sequenced products. The red squares
show the additional sequence that is bound to the last nucleotide of each primer.
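The length argument above reduces to simple arithmetic: if a primer-dimer were only the two primers joined end to end, its length could not exceed the sum of the two primer lengths. The sketch below illustrates that check. The primer sequences and the observed fragment length are hypothetical placeholders, not sequences from this study or from Brownie et al.

```python
def simple_dimer_max_length(fw_primer: str, rv_primer: str) -> int:
    """Upper bound on product length if the dimer is just the two primers joined."""
    return len(fw_primer) + len(rv_primer)


def extra_nucleotides(fw_primer: str, rv_primer: str, observed_length_bp: int) -> int:
    """Nucleotides in the observed product beyond the simple-dimer upper bound
    (zero or negative means the product fits the simple model)."""
    return observed_length_bp - simple_dimer_max_length(fw_primer, rv_primer)


# Hypothetical 20-nt primers and a hypothetical 45 bp fragment.
fw = "AGCTTGACCTGATCGGATAC"
rv = "TCCAGGTTCAAGCTGATCCA"
observed_bp = 45

print("upper bound:", simple_dimer_max_length(fw, rv), "bp")
print("extra nucleotides:", extra_nucleotides(fw, rv, observed_bp))
```

A positive result would correspond to the case reported in figure 1, where additional nucleotides sit between the two primers.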
Brownie, et al. (1997) focused their aim on producing primer dimers using different levels of
complementarity between the primers, while Satterfield (2014) recently tried adding Rv and
standard phosphoramidites to synthesize oligonucleotides without 5’ end to reduce the
annealing of the primers between each other.
This experiment focused on the attempt to produce primer dimers from known primer
pairs that were supposed to have a low self-complementarity, in normal PCR conditions but
with an increased total number of cycles, since primer-dimers are a so-called “late product” in
PCR reactions. Once the primer-dimers were formed in NTCs, the product was subsequently
used as template in the following qPCR experiments, to increase the total amount of product
to be further analyzed.
Double Peaks
Double peaks are artifacts that are often displayed in the melt curve during experiments with PCR
and its variants that use SYBR® Green as reporter dye (Ririe, et al., 1996). Even though the
literature is not rich in experiments that research the nature of multiple peaks in the melt
curve, several companies such as Life Technologies and IDT® explain in their
troubleshooting guidelines that, due to the presence of artifacts such as multiple peaks, melt curve
analyses are not to be considered a diagnostic method but only indicators (Downey,
2014). The common conception behind the presence of double peaks is that they are the
result of different products amplified during the reaction (SantaLucia, 2007).
In this experiment, samples giving double peaks in the melt curve analysis were purified and
sequenced in order to establish whether two different products were actually present or not in
the same reaction.
Aim
The aim of this project was to produce and analyze unspecific products in qPCR to find out:
- Whether primer-dimers are the actual product of the dimerization of the primers or
something else participates in their formation.
- Whether multiple peaks in the melt curves prove the presence of different products
using SYBR® Green as reporter dye.
- Whether or not there is a pattern in the formation of unspecific products depending on
the characteristics of the primers involved.
To achieve this goal, qPCR was performed using several primer pairs as NTCs (to establish
the presence of PD formation). A couple of samples were tested using serial dilutions to try
and separate the multiple peaks obtained. Statistical tests, such as the unpaired t test, were performed
in order to compare a test group (primer pairs prone to form PDs) with a control group
(primer pairs that were negative as NTCs for PD formation).
Since artifacts during qPCR cause unclear results and loss of reagents, as well as less
amplified target, answering any of the questions mentioned above could help
to achieve better performance and more reliable results.
Materials and Methods
Different primer pairs were tested in qPCR for the formation of unspecific products, selecting
them based on their previous results, such as the number of peaks in the melt curves for one single
target or the presence/absence of a melt peak in NTCs. The primer pairs showing
unspecific amplification, such as multiple peaks in the melt curve, were further tested using
five- or seven-point standard curves with 10-fold serial dilutions, while the primer pairs
showing amplification in the NTCs were analyzed in qPCR without template to obtain a
sample to purify and some information about their length by capillary gel electrophoresis
using the Fragment Analyzer™ (AATI, Ames, USA). Sequencing was then performed to
analyze the sequence of the purified fragments.
The first set of experiments was performed running qPCR on seven primer pairs of which five
were NTCs and two were samples with human cDNA and the primers designed to target
IGFBP3 and CD44. The samples that showed amplification were then purified with the
MinElute™ kit (Qiagen, Hilden, Germany) and analyzed in the Fragment Analyzer™ from
Advanced Analytical by capillary gel electrophoresis. The purified products were sent for
cycle sequencing to Eurofins Genomics (Ebersberg, Germany).
The second set of experiments was performed on 20 primer pairs that have been previously
demonstrated to produce artifacts in qPCR, running NTCs. The products were purified with
the MinElute™ kit (Qiagen) and the Oligo clean-up kit (Norgen), and gel electrophoresis was
performed to extract the fragments using the QIAEX II® gel extraction kit (Qiagen). The
purified fragments were sent for Sanger sequencing at GATC Biotech (Konstanz, Germany).
Criteria for the primer pairs
The criteria used to collect the primer pairs to be tested in qPCR focused on two
characteristics: the ability of the primer pairs to self-anneal (this property was verified
looking at the melt curves generated by NTCs in previous experiments), and the number of
peaks in the melt curve for the samples that showed artifacts in previous experiments. All the
primer pairs producing melt curves attributable to unspecific products were analyzed in
NTCs in order to prove the presence of products even without any template in the reaction
mix.
Upon confirmation of the presence of the product in qPCR, capillary gel electrophoresis was
performed on the positive samples to get more information about the length of the fragments.
The NTCs giving a product and the samples with multiple peaks in qPCR were sent for
sequencing to a third party. For the artifacts that were thought to be PDs, the length was
chosen arbitrarily within a range that had the sum of the length of the two primers as a
minimum and the maximal length obtained as maximum.
As control, primer pairs that did not produce artifacts in qPCR in previous experiments were
used to perform statistical analysis to compare the characteristics of their sequences with the
tested ones.
All the primer pairs were provided by IDT® - Integrated DNA Technologies (Coralville,
USA) and the results of the previous experiments were provided by the primer library and the
laboratory books of TATAA Biocenter (Gothenburg, Sweden).
qPCR
Each primer pair giving unspecific products when analyzed without template in previous
experiments performed at TATAA Biocenter was analyzed in qPCR without template to
confirm the presence of the artifacts. The product was then diluted 1:108 and amplified again
to be sure that the fragments obtained were the product of the primer pair used in the NTC.
Each NTC was analyzed in quadruplicates, to increase the chance of observing
unspecific products. Once the product was obtained, each sample was amplified again in triplicates to
ensure repeatability. Primer pairs giving different Tm in different replicates were analyzed
again separately.
The primer pairs giving multiple peaks in the melt curves in previous experiments were tested
using human cDNA as template, to confirm the presence of the artifacts, and then serial
dilutions were performed to obtain a standard curve that made it possible to compare the
differences in relative amounts of the two products based on the concentration of the
template.
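As a small illustration of the dilution scheme, the relative template concentrations of a 10-fold serial dilution with five or seven points (the standard curve sizes given in Materials and Methods) can be listed as follows. The function name `dilution_series` and the relative starting concentration of 1.0 are assumptions made for the sketch; no absolute concentrations from the experiments are implied.

```python
import math


def dilution_series(points: int, fold: float = 10.0, start: float = 1.0) -> list:
    """Relative concentrations of a serial dilution, most concentrated first."""
    if points < 1:
        raise ValueError("points must be at least 1")
    if fold <= 1:
        raise ValueError("fold must be greater than 1")
    return [start / fold ** i for i in range(points)]


# Five-point and seven-point standard curves with 10-fold steps.
for points in (5, 7):
    series = dilution_series(points)
    log_values = [round(math.log10(c)) for c in series]
    print(f"{points} points, log10 of relative concentration: {log_values}")
```

Plotting the measured signal for each product against these log10 values is what allows the relative amounts of the two products to be compared across template concentrations.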
The master mix used for all the samples was TATAA SYBR® Grandmaster® Mix (TATAA
Biocenter). The amount of template used was 2 µl (1:108 diluted from the product of NTC) in
20 µl of total reaction volume. All the primers were taken from 100 µM stock solutions and
diluted to a concentration of 10 µM in a working solution with nuclease free water. The
characteristics of the primers can be found in appendix 4, table 1a (control group) and
1b (test group). Each qPCR was performed under the same conditions in all the tests, as reported
in table 2, after optimization of the number of cycles and the amount of time for each step.
Table 2: Temperature and cycling program used for all the experiments.*
Program        Temperature (°C)   Time (s)   Cycles
Initial Hold   95                 60         1
Denaturation   95                 5          55
Annealing      60                 30         55
Extension      72                 15         55
Melt curve     95 – 60                       1
*LightCycler® Nano from Roche Life Science and QuantStudio® 12K Flex System from Life
Technologies were used to perform the experiments. The number of cycles used was higher than
usual to ensure that enough product was amplified, since unspecific products such as primer-dimers
are shorter than normal qPCR fragments and only produce a detectable signal in the late phase of
PCR.
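To give a sense of what 55 cycles means in practice, the programmed hold times of table 2 can be added up. The sketch assumes that the 55 cycles apply to the three-step loop (denaturation, annealing, extension). It ignores temperature ramping and the melt curve, whose durations are not given in the table, so the result is a lower bound on the run time. The function name is hypothetical.

```python
INITIAL_HOLD_S = 60  # 95 °C, performed once

# (step, temperature in °C, hold time in s), taken from table 2
CYCLE_STEPS = [
    ("Denaturation", 95, 5),
    ("Annealing", 60, 30),
    ("Extension", 72, 15),
]


def programmed_hold_seconds(initial_hold_s: int, cycle_steps: list, cycles: int) -> int:
    """Sum of programmed hold times, excluding ramping and the melt curve."""
    per_cycle = sum(seconds for _step, _temp, seconds in cycle_steps)
    return initial_hold_s + cycles * per_cycle


for cycles in (40, 55):  # around 40 cycles is a normal run, 55 were used here
    total = programmed_hold_seconds(INITIAL_HOLD_S, CYCLE_STEPS, cycles)
    print(f"{cycles} cycles: {total} s ({total / 60:.1f} min)")
```

With the values of table 2, each cycle holds for 50 s, so 55 cycles plus the initial hold amount to 2810 s of programmed hold time, compared with 2060 s for a 40-cycle run.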
Purification
For the first set of the experiments, the products showing a clear peak in the melt curve were
collected and purified using the MinElute™ PCR Purification Kit (50) from QIAGEN following
the manufacturer’s instructions. One volume of PCR product was added to five volumes of PB
buffer and then centrifuged using a purification column that retains the genetic material in the
filter at the bottom, flushing away the buffer and the reaction mix used in qPCR. 750 µl of PE
buffer were then added to each column to wash away the residues in the filter that were not
genetic material, and the column was centrifuged again. The last step was to elute the DNA in EB buffer
(14 µl), centrifuging for the last time. All the centrifugation steps were conducted at 13000
rpm for 1 minute.
The second set of experiments was performed using the Oligo Clean-Up and
Concentration Kit (50) provided by Norgen Biotek Corporation, because the
Qiagen kit is only able to recover fragments from 70 bp up to 4 kb. The Norgen kit was chosen
since its range of purification includes fragments from 10 bp, allowing the recovery of smaller
fragments such as primer-dimers. The samples were diluted up to 50 µl using nuclease free
water, to which 150 µl of Binding solution (to bind the DNA) and 300 µl of
isopropanol were then added. This procedure allows the DNA to bind to the filter of the column during the
centrifugation, discarding the residues along with the aqueous phase. To wash residual debris away
from the column, 400 µl of washing solution were added and the sample was
centrifuged again. The washing step was repeated and, after discarding all of the residue,
another centrifugation step was performed to make sure that the filter of the column was dry.
To elute the DNA in the filter, 50 µl of Elution solution were added to the column and then
another centrifugation step was performed. All the centrifugation steps were conducted at
14000 rpm for 1 or 2 minutes.
The purity and concentration of the samples were measured using both the Nano Drop 1000
spectrophotometer (Thermo Scientific, Waltham, USA) and the Drop Sense 96
spectrophotometer (Trinean, Gentbrugge, Belgium).
Gel electrophoresis and extraction
As neither of the two kits mentioned above yielded a sufficient amount of pure
product, gel extraction was used to get the smaller fragments without compromising the
purity of the samples and to make sure that the residual primers were not collected together with
the target fragments. This procedure was performed only for the NTC samples. NuSieve™ 3:1
Agarose (LONZA, Basel, Switzerland) was used according to the instructions for casting a
6% agarose gel in TBE buffer. Concentrated qPCR products were added to the wells and
run for one hour at 60 V and then 30 minutes at 75 V. The gel was stained with
GelGreen™ (Biotium, Hayward, USA) in a water bath (for a final concentration of 3X in H2O)
for 30 minutes under constant agitation at room temperature and the bands were visualized and
excised using a UV lamp and a scalpel.
To extract the DNA from the gel, the QIAEX II® Gel Extraction Kit from QIAGEN was used
following the manufacturer’s instructions. For this procedure no columns are required, but in
order to extract the fragments from the gel, a water bath at 50 °C for 10 minutes was required.
The tube was filled with agarose gel (up to 250 mg) containing the DNA band, six volumes of
buffer QX1 to wash and solubilize the gel and 10 µl of QIAEX II solution to bind the DNA.
Once the gel had melted, the samples were centrifuged for 30 seconds at 13000 rpm and the
supernatant was discarded, leaving the DNA in the pellet. To remove residual agarose, salts
and contaminants from the pellet, two washing steps with 500 µl of QX1 each were
performed, resuspending the pellet in the buffer between each centrifugation. The pellet
containing the fragments was air dried at room temperature and then dissolved in 20 µl of
water, centrifuged again and the supernatant was collected in a clean tube.
The purified material was stored at -20 °C until further use for capillary gel electrophoresis.
Capillary gel electrophoresis
Capillary gel electrophoresis was performed using the Fragment Analyzer™ from Advanced
Analytical to do quality control tests on the purified material. The Fragment Analyzer™ was
used following the manufacturer’s instructions for DNA samples. This instrument uses
capillary technology to run the samples in a gel and then discards the waste automatically
showing a digital image of the bands.
The gel was prepared with an intercalating dye provided by AATI, and the sample plate was
prepared with PCR product diluted 1:11. The marker solution used had a lower marker of 35
bp and an upper marker of 1500 bp. The ladder was the same 100 bp ladder run in a previous
experiment.
Sample preparation for sequencing
The samples collected were prepared following the sample submission guideline provided by
Eurofins Genomics. To analyze Fw and Rv strands separately, the two primers were sent in
two separate tubes at a concentration of 10 µM and the concentration of the purified samples
was 2 ng/µl. Cycle sequencing was performed for both unspecific products obtained running
NTCs and the samples with multiple peaks in the melt curve.
The fragments produced during the second set of the experiments were sent to GATC Biotech
following the sample submission guideline. A minimum concentration of 10 ng/µl in a total
for each sample was sent together with Fw and Rv primers at 10 µM. Each qPCR product was
purified with the three different methods described above and all the purified samples were
sent for Sanger sequencing. Only PDs products obtained running NTCs were analyzed for the
sequences.
Statistical analysis
Two groups of primer pairs were compared to investigate why certain primer pairs are more
prone than others to giving artifacts in qPCR. The test group consisted of primer pairs that
had produced unspecific products known as primer-dimers in previous experiments, while the
control group consisted of primer pairs proven not to produce PDs in NTCs. The characteristics
of the primers in each pair were obtained using the Primer-BLAST tool from NCBI, keeping the
standard parameters except for the Tm box, which was left empty. Average and SD were
calculated for the following characteristics: Tm, GC% content, self 3’ complementarity and
self-complementarity score, both between the two groups and within each group between Fw and
Rv primers. An unpaired t test was performed, and the t and p-values were calculated using
the online tool QuickCalcs from GraphPad Software, to find out whether there was a
statistically significant difference between the two groups for each of the considered
parameters. The confidence level was set at 95%, giving a significance threshold of 0.05.
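The unpaired t test from summary statistics described above can be sketched in a few lines. This is a minimal illustration, not the QuickCalcs implementation: it assumes the equal-variance (pooled) form of the test, and it integrates the t density numerically to avoid external libraries. The input values are the Tm averages and SDs reported in table 3a, with group sizes of 18 and 19 taken as an assumption; the output is not claimed to reproduce the p-values in table 4.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability mass inside [-|t|, |t|]
    return max(0.0, 1.0 - central_area)

def unpaired_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) unpaired t test from mean, SD and n of two groups."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, two_sided_p(t, df)

# Tm summary values from table 3a; the group sizes (18 and 19) are an assumption.
t, df, p = unpaired_t_from_summary(58.4663888, 2.53090402, 18,
                                   59.8855263, 0.80530906, 19)
print(f"t = {t:.3f}, df = {df}, p = {p:.4f}")
```

The pooled form was chosen because it is the classic unpaired t test; a Welch correction would be a reasonable alternative given how different the two SDs are.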
Results
The aim of this project was to understand the nature of the unspecific products in PCR and the
conditions under which they appear.
Unspecific products are a common artifact in qPCR. They include deviations in the shape of the
melt curve, additional products other than the target of the experiment, and multiple peaks in
the melt curve.
A common artifact encountered during PCR and its variants is the formation of primer-dimers.
In this work, we tried to intentionally produce primer-dimers using primer pairs known to have
a tendency to produce such artifacts. Some of the primer pairs studied produced, in the
presence of DNA as template, multiple peaks in the melt curve. The PD set of primer pairs was
tested as NTCs and the double-peak set in a serial dilution, to see the trend of the peaks at
different template concentrations. The results showed that the primer-dimer products were
longer than the expected length of ~40 bp (50 to 99 bp according to the results in the
Fragment Analyzer™).
This section is divided into two parts, since the artifacts analyzed turned out to be very
different from each other. PDs were generated from 30 primer pair samples, while the longer
unspecific products giving double peaks were encountered in only two samples.
Primer-dimers
A total of 30 different primer pairs, run as NTCs, were selected to be part of the study as
generators of unspecific products such as primer-dimers. The primer pairs were selected based
on their results in previous experiments, allowing us to take only those that showed
amplification of unspecific products even without any template in the reaction mix.
Primer-dimers are unspecific products that are generated randomly under all kinds of
conditions during PCR. Because of this randomness, the results can be heterogeneous when the
primer pairs are run in quadruplicate. Indeed, primer pair one (shown in figure 2) presented
four different peaks in the melt curve plot, suggesting that four different products were
generated in the different replicates. The same pattern was visible in almost all of the
analyzed primer pairs, with some exceptions (appendix 2 - figure 4), suggesting that the
method per se is not repeatable even when following the same protocol under the same
conditions.
Even though the melt curves are different between the replicates, a quality control test
performed with the Fragment Analyzer™ using capillary gel electrophoresis confirmed that
the length of the fragments was exactly the same (see appendix 2 - figure 3).
Figure 2: Melt curves of the NTC quadruplicates (2a) and of the three replicates of the
second generation product (2b) for primer pair one. Figure 2a: The Tm of all the products
appears to lie between 75 and 78°C. The peaks are clear and defined for all the replicates
except one (indicated by the red circle), which showed almost no amplification although a peak
was still visible at ~77°C. A noisy pattern is noticeable in the baseline of all the
replicates. Figure 2b: The image shows the three of the four replicates of primer pair one
that were analyzed in qPCR and gave different melt curves (Figure 1). Each of the three
replicates was analyzed in triplicate, and two of the three replicates were negative (red
rectangles). Only one of them (primer pair one, replicate three, in the red circle) gave some
product, with a Tm of ~73°C, whereas the Tm of the 1st generation products was between 75 and
78°C.
Only 18 of the 30 primer pairs originally selected for this study presented artifacts
attributable to PD formation during qPCR, despite having been selected for their capability
to form primer-dimers. Once a reaction showed amplification of unspecific products, another
qPCR was performed using the 1st generation product as template, to confirm that
amplification still occurred and that the product was generated from the two primers used in
the first qPCR.
The replicates giving different Tm values were amplified a second time, separately and in
triplicate. Most of the time these 2nd generation products did not match expectations, giving
either no amplification at all or a different Tm compared to the 1st generation product
(figure 2b). In some cases, replicates that gave different Tm values in the 1st generation
gave the same Tm when amplified a second time using the 1st generation product as template
(appendix 2 - figure 5).
The samples that showed no amplification in the qPCR were then excluded from the statistical
analysis and the sequencing.
The concentration obtained with the different purification methods was always very low for
the samples giving PD formation, from a minimum of 2 ng/µl to a maximum of 13 ng/µl
(MinElute™), 40 ng/µl (Oligo Clean-up) and 13 ng/µl (QIAEX II®). Even though the
concentration was not optimal, the number of copies was really high (from a minimum of 3.3
× 10^10 to a maximum of 6.74 × 10^11) due to the short length of the fragments.
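The relation between a low mass concentration and a high copy number for short fragments can be illustrated with the usual conversion for double-stranded DNA. The sketch below assumes an average mass of 660 g/mol per base pair (a common approximation, not stated in the text) and the 55 bp fragment length reported by the Fragment Analyzer™; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
AVG_BP_MASS = 660.0        # g/mol per base pair of dsDNA (assumed approximation)

def copies_per_ul(conc_ng_per_ul, length_bp):
    """Estimate dsDNA copies per microlitre from concentration and fragment length."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    molar_mass = length_bp * AVG_BP_MASS      # g/mol for the whole fragment
    return grams_per_ul / molar_mass * AVOGADRO

# 2 ng/µl of a 55 bp fragment gives about 3.3e10 copies per µl,
# in line with the lower bound reported above.
print(f"{copies_per_ul(2, 55):.2e}")
```

Under the same assumptions, the same mass of a longer fragment yields proportionally fewer copies, which is why short products such as PDs reach high copy numbers at low concentrations.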
For the fragments generated by running NTCs with the primer pairs under study, no results
were obtained by sequencing the purified material with either Sanger sequencing or its
variant, cycle sequencing. As shown in appendix 3 - figure 11, the chromatograms showed a
noisy baseline pattern that made it impossible to identify a sequence, since no clear peak
was generated.
Even though for certain samples a sequence was identified, the degree of certainty about the
nature of the bases was so low that it was impossible to use the data for further analysis.
Double peaks
Two samples were considered for this part of the experiment. Human cDNA was used as
template to amplify part of the IGFBP3 and CD44 genes using two different primer pairs.
The 1st generation product is visible in the melt curves shown in figures 6 and 7 in
appendix 2. Both samples present two peaks, although the second one is not yet clearly
defined in the case of CD44 (appendix 2 - figure 7). Multiple peaks in qPCR suggest the
presence of two different products in the same reaction. In an attempt to separate the peaks
and obtain a final sample with only one clear peak, the qPCR product was diluted 1:10^8 and
used as template for a second reaction; the results are shown and described in appendix 2 -
figure 8. Because for both the IGFBP3 and CD44 samples the peak with the higher Tm
(attributable to the unspecific product) showed a higher level of fluorescence than the peak
with the lower Tm (attributable to the expected product), the 1:10^8 dilution was repeated,
leading to the amplification of the unspecific product as the major product of the reaction
(figure 9a and 9b).
Figure 9: CD44 melt curves for the old product with double peaks and the newly diluted and
amplified product with only one peak (9a), and the original product of the IGFBP3 sample
against the one obtained by diluting both the 1st and 2nd generation products of IGFBP3
1:10^8 (9b). Figure 9a: The original target had a Tm of ~82°C, and in the 1st generation
product a small peak of unspecific product can be seen starting to emerge (red circle in the
figure). The unspecific product isolated in the diluted sample in the 5th generation showed a
Tm of ~86°C (black arrow).
Figure 9b: The melt curve with only one peak, with a Tm of 83°C, represents the expected
IGFBP3 cDNA amplified with the same primer pair. The melt curve with the double peak is the
result of the dilution of the 2nd generation product and its subsequent amplification in
qPCR. Since the Tm of the original IGFBP3 overlaps with the residual hump of the dilutions,
they probably represent the same product.
After several cycles of dilution, qPCR and purification, the final products with only one
isolated peak had a concentration of 229.70 ng/µl for IGFBP3, with absorbance ratios of 1.85
(A260/280) and 2.16 (A260/230), while for CD44 the concentration was 217.40 ng/µl, with
absorbance ratios of 1.84 (A260/280) and 1.70 (A260/230). Nucleic acids absorb maximally at
260 nm, so these ratios are used as indicators of purity. An A260/280 ratio of about 1.8 and
an A260/230 ratio of about 2.0 or higher are generally taken to indicate high DNA purity.
Even though the concentration and the purity were high for both IGFBP3 and CD44, the
quality control with the Fragment Analyzer™ was disappointing, since no band stood out in
the gel and no peak was visible in the electropherogram, marking CD44 as a sample that
could not be sequenced (appendix 2 – figure 10).
Neither IGFBP3 nor CD44 passed the quality control in the Fragment Analyzer™, so CD44 was
not sent for sequencing, but IGFBP3 was still sent as a control to see if it could be
sequenced, since the amplified fragment was expected to be approximately 100 bp long. As a
result, both the 1st generation product with the expected target amplified (IGFBP3 old) and
the 5th generation isolated unspecific product (IGFBP3 new) shown in figure 9b were
successfully purified and sequenced using cycle sequencing (see appendix 3 - figure 12).
Despite being performed twice, the sequencing was unable to recover the whole sequence of
IGFBP3 new, but an alignment with IGFBP3 old and the IGFBP3 cDNA provided by GenBank was
still performed using Clustal Omega from EMBL-EBI. Because the cDNA was over 2000 bp long,
three extracts were taken from the alignment results, covering all the matching sequences or
fragments. Extract one in appendix 3 shows, in a green rectangle, the part of the sequences
that matched perfectly between IGFBP3 new, IGFBP3 old and the cDNA. Because the two
fragments analyzed differed greatly in length but matched almost perfectly in the alignment,
the longest sequence was aligned with the sequence of the Fw primer, which revealed two
matching sites on the same fragment, as shown in figure 13 - appendix 3. To understand
whether the primer had been designed in a way that allowed two different binding sites, the
same primer was aligned with the original sequence of IGFBP3 provided by GenBank. This
second alignment showed that the sequence obtained for IGFBP3 in this experiment did not
completely match the original one; some sort of severe deletion appears to have occurred,
since the sequence of the Fw primer was no longer present at two different sites (figure 14 -
appendix 3).
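The search for additional primer binding sites can be illustrated with a simple exact-match scan. This is only a sketch: the alignments in this study were made with Clustal Omega, whereas the code below just lists every position where a primer occurs exactly in a sequence. The primer and template strings are hypothetical and are not the IGFBP3 sequences.

```python
def find_sites(template, primer):
    """Return the 0-based start positions of every exact match of primer in template."""
    template, primer = template.upper(), primer.upper()
    sites = []
    start = template.find(primer)
    while start != -1:
        sites.append(start)
        start = template.find(primer, start + 1)  # step by 1 so overlapping sites are kept
    return sites

# Hypothetical sequences: the primer occurs twice in this made-up template.
primer = "GATTACA"
template = "CCGATTACATTTGGGATTACACC"
sites = find_sites(template, primer)
print(sites)  # two start positions indicate a double binding site
```

A primer with more than one entry in the returned list would be expected to yield products of different lengths from the same template, which is the situation described for the Fw primer above.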
Statistical results
The test group and the control group consisted of 18 and 19 primer pairs respectively.
Because two particular primer pairs from the test group either produced PDs or did not,
depending on which of several equally concentrated working solutions was used, they were
named primer pair α and primer pair β and placed in both groups to see whether their presence
in one or the other group could affect the final results. Table 3a shows the standard
deviations (SD) and averages of the groups of primer pairs; table 3b in appendix 4 shows the
same data with and without primer pair α and primer pair β in the calculations.
Table 3a: SDs and averages between Fw and Rv primers for all the considered characteristics
for each group. T – test, C – control.
Group            Length (b)   Tm (°C)     GC%          Self Comp    Self 3’ Comp
T group Average  20.02777778  58.4663888  52.1775      3.333333333  0.5555555556
T group SD       1.383290303  2.53090402  7.170243421  1.041976145  0.5555555556
C group Average  20.55263158  59.8855263  53.19289474  3.289473684  0.4131549501
C group SD       1.703690246  0.80530906  7.074501953  0.802290462  0.2105263158
Based on the calculated averages and SDs, an unpaired t test was performed; the results are
shown in table 4. From the SD, average and number of samples for each group, the p-value,
t value and statistical significance of the calculated differences were obtained.
The results show that the differences between the Tm values of the test group and the control
group were statistically significant for all the combinations (with the α and β primer pairs
included, excluded, or included only in one group or the other). The p-values are reported in
table 4; the highlighted ones are those considered statistically significant at the 0.05
level.
Table 4: P-values calculated between the different groups.*
Characteristic  T and C  T αβ and C  T and C αβ  T αβ and C αβ
Length          0.1406   0.2375      0.3821      0.1497
Tm              0.0012   0.0305      0.0180      0.0015
GC%             0.5310   0.8478      0.5110      0.5645
Self Comp       0.8351   0.8932      0.6889      0.8624
Self 3’ Comp    0.0477   0.1774      0.1574      0.0530
* Test group and Control group (T and C), Test group + α and β primer pairs and Control group
(T αβ and C), Test group and Control group + α and β primer pairs (T and C αβ) and Test group
+ α and β primer pairs and Control group + α and β primer pairs (T αβ and C αβ). The
confidence level was 95%, corresponding to a significance level of 0.05. The statistically
significant results are highlighted and reported for all the Tm values calculated and for the
self 3’ complementarity of the T and C groups and of the T αβ and C αβ groups. As reported in
the table, the statistical significance is highest when the groups compared both contain or
both exclude the α and β primer pairs.
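Since the highlighting of table 4 is not visible in plain text, the comparison against the 0.05 threshold can be made explicit with a short sketch. The p-values are copied from table 4; the variable and function names are illustrative only. Note that the 0.0530 value for self 3’ complementarity (T αβ and C αβ) sits marginally above the threshold, so a strict comparison flags only the T and C column for that characteristic.

```python
ALPHA = 0.05
COMPARISONS = ["T and C", "T αβ and C", "T and C αβ", "T αβ and C αβ"]

# p-values as reported in table 4, one list per characteristic.
P_VALUES = {
    "Length":       [0.1406, 0.2375, 0.3821, 0.1497],
    "Tm":           [0.0012, 0.0305, 0.0180, 0.0015],
    "GC%":          [0.5310, 0.8478, 0.5110, 0.5645],
    "Self Comp":    [0.8351, 0.8932, 0.6889, 0.8624],
    "Self 3' Comp": [0.0477, 0.1774, 0.1574, 0.0530],
}

def significant(p_values, alpha=ALPHA):
    """Map each characteristic to the comparisons whose p-value is below alpha."""
    return {
        name: [c for c, p in zip(COMPARISONS, ps) if p < alpha]
        for name, ps in p_values.items()
    }

flags = significant(P_VALUES)
for name, hits in flags.items():
    print(f"{name}: {', '.join(hits) if hits else 'none'}")
```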
Discussion
The main purposes of the present study were to find out the source of the additional sequence
between Fw and Rv primers in the formation of PDs, whether SYBR® Green is involved in the
appearance of artifacts as additional peaks in the melt curve, and whether the
characteristics of primer pairs giving artifacts in qPCR differ from those of primer pairs
that do not produce PDs. The methods used were designed on the assumption that, by
amplifying a product in qPCR, purifying the amplicons and sequencing them, it would be
possible to analyze the sequence of the products and compare it with other sequences
(Mardis, 2008).
Primer-dimers
Once the artifacts had been produced, the products were sent to third parties for sequence
analysis. Sequencing methods can differ depending on the characteristics of the fragments
(such as their length). Sequencing is a technique that allows one to read the sequence of the
fragments under study thanks to different signaling molecules that emit different
fluorescent signals (Gupta and Gupta, 2014). In the past, this technique used other kinds of
molecules, such as radioactive agents, as markers for the nucleotides (Maxam and Gilbert,
1977), but in recent years fluorescent dyes were introduced as an alternative (Reed, et al.,
2013).
For this experiment, Sanger sequencing and its variant cycle sequencing were used to analyze
the PDs and the unspecific products that gave multiple peaks in the melt curve. Even though
the new sequencing techniques are very precise and reliable, very short fragments such as
primer-dimers are difficult to sequence, since the first part of the sequence is often lost
during the process; in short fragments this means that a large portion of the information can
be lost, impairing the collection of the data required to analyze the sequence. To overcome
this problem, single-read sequencing can be performed in both directions of the fragment,
allowing the part lost at the beginning of the Fw strand to be recovered from the
complementary end of the Rv strand and vice versa (Wilson and Walker, 2010).
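The idea of recovering the lost start of one read from the end of the opposite read can be sketched as follows. This is a simplified illustration under stated assumptions: the reads are error-free, each read is missing only its first bases, and the two reads overlap by at least a few bases. The fragment and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def merge_reads(fw_read, rv_read, min_overlap=8):
    """Join the reverse-complemented Rv read (fragment start) to the Fw read (fragment end)."""
    left = reverse_complement(rv_read)   # covers the beginning of the fragment
    right = fw_read.upper()              # covers the end of the fragment
    # Try the longest possible overlap first, then shorter ones.
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None  # no sufficiently long overlap found

# Hypothetical 34-nt product; each read has lost its first 10 bases.
fragment = "ATGCGTACGTTAGCCTAGGATCCGATTACGGCTA"
fw_read = fragment[10:]
rv_read = reverse_complement(fragment)[10:]
merged = merge_reads(fw_read, rv_read)
print(merged == fragment)
```

With real chromatograms the overlap would have to tolerate miscalled bases, so an alignment tool would be a better fit than exact string matching.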
In the present study, both Sanger and cycle sequencing proved inadequate for the analysis of
the very short fragments purified with either column-based methods or gel extraction. For
this reason it was not possible to align the sequence of the primers with the sequence of
their qPCR products to find out what kind of additional sequence could have been inserted
between the primers. The failure to sequence the PDs is most likely due to the fact that the
column-based systems used to purify the products were optimized for molecules longer than
70 bp (MinElute™), leading to a large loss of product during the process and therefore to
too low a concentration of DNA in the samples. In addition, the column-based Oligo Clean-up
kit collects molecules from 10 to 70 bp, so that the purified samples also contained other
molecules such as the primers and possibly other unspecific products, impairing the
sequencing procedure and leading to the results shown in figure 11 – appendix 3.
Even though several purification methods were used, it is probable that the fluorescent dye
was still present in small amounts in the samples sent for sequencing, due to its high
affinity for dsDNA and AT-rich fragments, which makes it bind strongly to the nucleic acid
(Mao, et al., 2007) and could thereby interfere with the nucleotide signals. Giglio et al.
(2003), in their earlier study of the preferential binding of this dye to DNA with specific
characteristics, claim instead that SYBR® Green has a higher affinity for GC-rich fragments,
in contrast with the statement of Mao et al. (2007).
Other sequencing methods such as NGS (Next Generation Sequencing) or pyrosequencing could
be used to sequence short fragments (Weitschek et al., 2014); alternatively, the fragments
could be cloned into vectors in order to sequence longer products (Brownie et al., 1997).
Another possible reason for this lack of results is that SYBR® Green is considered able to
inhibit several reactions, including qPCR, if used at high enough concentrations (Nath,
et al., 2000). It would not be unreasonable to think that, like EDTA, SYBR® Green could also
inhibit other methods such as sequencing, especially when analyzing very small fragments
such as PDs (50 – 60 bp). This assumption is based on the fact that both PCR and sequencing
are enzymatic reactions that use the emission of a signal as detection method, so a
fluorescent dye could interfere with the signal, creating noise and unclear peaks (Leonard,
et al., 1998).
A question that arose while observing the melt curves concerned the fact that the same
product often shows a different Tm in different experiments. An example was primer pair one,
reported in figure 2a (1st generation product, different Tm for each replicate), figure 2b
(2nd generation product with no amplification at all even though the 1st generation product
was used as template) and figure 5 – appendix 2 (again a 2nd generation product, obtained
using each replicate of the 1st generation as template, which this time produced six
identical melt curves with the same Tm). The belief that different Tm values could belong to
the same product is supported by the fact that, when the samples were run in the Fragment
Analyzer™ for capillary gel electrophoresis, the results for the three different peaks were
exactly the same, showing a size of 55 bp each time (figure 3 – appendix 2). On the other
hand, several authors such as Ririe, et al. (1996), Wittwer, et al. (2003), Pryor, et al.
(2006) and others over the last decade have reported that melt curve analysis identifies PCR
products with high reliability, which leads one to consider the possibility that a different
fragment was actually generated every time. Unfortunately, since sequencing did not work, it
was not possible in this study to explain this anomaly.
According to the literature and our statistical analysis, a relevant role in PD formation is
played by the Tm of the primer pair and the self 3’ complementarity of each primer (Hsieh, et
al., 2006; Chou, et al., 1992; Hongoh, et al., 2006; Kimura, et al., 2011). All the primer
pairs were designed following the common rules for avoiding PD formation (Poritz and Ririe,
2014). The statistically significant difference found in the Tm and self 3’ complementarity
of the two groups of primer pairs suggests that the standard recommendation (a Tm mismatch
below 3-5 °C) might be replaced with stricter values.
Comparing the SD of the test group with the SD of the control group in table 3a, it can be
seen that even though the average Tm of the two groups is not very different (58.47 °C for
the test group and 59.88 °C for the control group), the difference within the same group
between Fw and Rv primers is notably higher in the test group (SD of 2.5309) than in the
control group (SD of 0.8053). As reported in the literature (Hyndman and Mitsuhashi, 2003;
Yuryev, 2007), the optimal Tm of the primers should not exceed 59 °C in order to impair the
formation of PDs. In this study, the group giving more PDs was the one with the lower Tm but
the higher SD, suggesting that what matters more than the Tm itself is that the difference
between the Tm of the two primers within the same primer pair should not be higher than
1 °C. Online tools for primer design such as Primer-BLAST from NCBI and Primer3 suggest a Tm
mismatch between the two primers of no more than 3-5 °C.
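The stricter criterion suggested here can be expressed as a simple check on a primer pair. The sketch below assumes that the Tm of each primer has already been obtained from a design tool such as Primer-BLAST; it does not compute Tm itself. The Tm values, the example sequence and the function names are hypothetical.

```python
def gc_percent(primer):
    """GC content of a primer as a percentage of its length."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)

def tm_mismatch(tm_fw, tm_rv):
    """Absolute Tm difference (°C) between the Fw and Rv primers of a pair."""
    return abs(tm_fw - tm_rv)

def within_limit(tm_fw, tm_rv, max_diff=1.0):
    """True if the Tm mismatch of the pair does not exceed max_diff (°C)."""
    return tm_mismatch(tm_fw, tm_rv) <= max_diff

# Hypothetical pairs: (Fw Tm, Rv Tm) as they would be reported by a design tool.
for tm_fw, tm_rv in [(59.0, 59.6), (57.0, 60.5)]:
    strict = within_limit(tm_fw, tm_rv, max_diff=1.0)   # limit suggested in this study
    usual = within_limit(tm_fw, tm_rv, max_diff=5.0)    # upper end of the usual 3-5 °C range
    print(f"mismatch {tm_mismatch(tm_fw, tm_rv):.1f} °C: strict={strict}, usual={usual}")

print(gc_percent("GGCCAATT"))  # hypothetical 8-mer with half of its bases G or C
```

The second pair passes the usual 3-5 °C rule but fails the 1 °C limit, which is the kind of pair the statistical results point to as more prone to PD formation.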
Concerning the self 3’ complementarity, also known as the 3’-anchored global alignment score,
common practice recommends keeping it as low as possible (close to 0) (Markel and León,
2003). Our statistical analysis shows that even a difference of 0.3 between the average
alignment scores of the control group and the test group is statistically significant. This
result is consistent with the literature (Markel and León, 2003; Poritz and Ririe, 2014).
None of the other characteristics tested in this experiment differed significantly between
the two groups, pointing to Tm and 3’ complementarity as the two most discriminating values
in primer design.
Double peaks
Because the sequencing did not yield a sequence of acceptable length for IGFBP3, just as for
the PDs, there was not enough material to draw valid conclusions from the alignment of the
fragments obtained from IGFBP3 new (the result of serial dilutions and amplification that led
to the isolation of a peak of unspecific product). The only thing that can be observed in the
extracts in appendix 3 is that the stars visible in the green square in extract one indicate
a perfect match for all the sequences, suggesting that the product separated in qPCR (figure
8 - appendix 2 and figure 9b) as IGFBP3 new actually has the same sequence as IGFBP3 old.
This statement would hold if the fragments were at least the same length and the match was
close to 100%, but since IGFBP3 new was actually much shorter than IGFBP3 old, one possible
conclusion is that, as for the PDs, unspecific products like the ones giving additional
peaks are much more difficult to sequence than fragments with a known sequence.
The further analysis of the sequences, searching for alternative binding sites, did not show
a match between the Fw primer and the original sequence of IGFBP3, but aligning the primer
with the sequence obtained for IGFBP3 old revealed that an alternative binding site had been
created during the PCR reactions, leading to the formation of a shorter version of IGFBP3
old, which we observed as IGFBP3 new. The results shown in figures 13 and 14 show that the Fw
primer completely matches two different sites on the IGFBP3 old sequence. A possible
explanation is that a deletion took place during the PCR reactions, leading to the formation
of the second binding site on IGFBP3, or that the human cDNA used as template was degraded
and the Fw primer annealed to the wrong region of the gene during PCR.
These results allow us to say that the double peak was in this case actually the result of
two different products, which arose from the double binding site on the 1st generation
product and resulted in an altered product.
Conclusions
The present work reports the following conclusions:
- The described method does not allow repeatability in the production of primer-dimers,
  since they behave in an unpredictable way even when the parameters are the same
  between different experiments.
- A column-based method is not sufficient to purify unspecific products such as
  primer-dimers, since the concentration obtained is too low to proceed with further analysis.
- It can be inferred from the statistical results that Tm and self 3’ complementarity
  should be kept within a stricter range of variability when designing primers.
- Double peaks can be the result of two different products amplified during the PCR.
  Even though the primer pair was designed to bind at only one specific site on the
  original gene, for some reason (such as cDNA degradation) the same Fw primer binds to
  two different sites, giving two different versions of the same product.
Future perspectives
Unspecific products are a difficult and still unpredictable subject in qPCR, and the short
amount of time available for the project did not allow us to fully observe and analyze all
the possible troubleshooting alternatives.
Since Sanger sequencing did not succeed in analyzing PD formation under the conditions
presented, a future improvement could include a different sequencing method more appropriate
for short fragments. Pyrosequencing could be a more reliable solution, as its stepwise
addition of each nucleotide is reported to be particularly suitable for sequencing short
fragments (Ronaghi, et al., 1998; Vandenbrouke, et al., 2011).
As for multiple peaks as an artifact in qPCR, running control tests with different templates
could help to understand whether the formation of an alternative binding site on the same
PCR product is due to degradation of the genetic material or to something else.
The unpaired t test conducted on the average, SD and number of samples in this study showed
that a difference higher than 1 °C between the Tm values of a primer pair and even small
differences in self 3’ complementarity play a relevant role in PD formation, but a larger
number of primer pairs would make these findings more reliable, if still confirmed by
statistical analysis.
Acknowledgements
The major contribution to the development of this project was provided by Anna Pfister, who
supervised me during all my laboratory work and gave me good advice on how to overcome
the enormous variety of problems that came up every day. Moral and financial support
that was fundamental for the healthy management of the project was given by my parents and
Giuseppe; I will never thank them enough for the constant help they give me.
Special thanks go to the staff at TATAA, who were always there with advice when I
needed it, as well as to my teachers in Skövde.
References
Brownie, J., Shawcross, S., Theaker, J., Whitcombe, D., Ferrie, R., Newton, C., Little, S.,
1997. The elimination of primer-dimer accumulation in PCR. Nucleic Acids Research, 25(16).
Chou, Q., Russel, M., Birch, D.E., Raymond, J., Bloch, W., 1992. Prevention of pre-PCR
mis-priming and primer dimerization improves low-copy-number amplifications. Nucleic
Acids Research, 20(7), pp. 1717-1723.
Downey, N., 2014. Interpreting melt curves: an indicator, not a diagnosis. IDT® Integrated
DNA Technologies. Core concepts, scientific fundamentals explained.
Dwight, Z., Palais, R., Wittwer, C.T., 2011. uMELT, prediction of high resolution melting
curves and dynamic melting profiles of PCR products in a rich web application.
Bioinformatics, February 7, 2011.
Giglio, S., Monis, T.P., Saint, C.P., 2003. Demonstration of preferential binding of SYBR
Green I to specific DNA fragments in real-time multiplex PCR. Nucleic Acids Research,
31(22): e136.
Gupta, A.K. and Gupta, U.D, 2014. Next Generation Sequencing and Its Applications. Models
in discovery and translation. Chapter 19, pp. 345-367.
Hongoh, Y., Yuzawa, H., Ohkuma, M., Kudo, T., 2006. Evaluation of primers and PCR
conditions for the analysis of 16S rRNA genes from a natural environment. FEMS
Microbiology Letters, 221(2), pp. 299-304.
Hsieh, M.H., Tsaih, R., Huang, C.Y., 2006. An intelligent primer design system for multiplex
reverse transcription polymerase chain reaction and complementary DNA microarray. Expert
Systems with Applications, 30(1), pp. 129-136.
Kimura, Y., de Hoon, M.J.L., Aoki, S., Ishizu, Y., Kawai, Y., Kogo, Y., Daub, C.O., Lezhava,
A., Arner, E., Hayashizaki, Y., 2011. Optimization of turn-back primers in isothermal
amplification. Nucleic Acids Research, 39(9): e59.
Kretz, K., Callen, W., Hedden, V., 2014. Cycle sequencing. Genome Research. Cold Spring
Harbor Laboratory Press 1054-9805/94.
Leonard, J.T., Grace, M.B., Buzard, G.S., Mullen, M.J., Barbagallo, C.B., 1998. Preparation
of qPCR product for DNA sequencing. BioTechniques, 24:314-317.
Mao, F., Leung, W.Y., Xin, X., 2007. Characterization of EvaGreen and the implication of its
physicochemical properties for qPCR applications. BMC Biotechnology, 2007; 7:76.
Mardis, E.R., 2008. Next-generation DNA sequencing methods. Annu Rev Genomics Hum
Genet., 2008; 9:387-402.
Markel, S., León, D., 2003. Sequence Analysis in a Nutshell. A guide to common tools and
databases. O’Reilly & Associates, 2003; pp. 129-131.
Maxam, A.M. and Gilbert, W., 1977. A new method for sequencing DNA. Proceedings of the
National Academy of Sciences, 74(2), pp. 560-564.
Nath, K., Sarosy, J.W., Hahn, J., Di Como, C.J., 2000. Effects of ethidium bromide and
SYBR® Green I on different polymerase chain reaction systems. Journal of Biochemical and
Biophysical Methods. 42(1-2), pp. 15-19.
Norgen Biotek Corporation, 2013. PCR Purification Kit. Product Insert.
Poritz, M.A. and Ririe K.M., 2014. Getting Things Backwards to Prevent Primer Dimers. The
Journal of Molecular Diagnostics, 16(2).
Pryor, M.J. and Wittwer, C.T., 2006. Real-time polymerase chain reaction and melt curve
analysis. Methods in Molecular Biology, 336:19-32.
Reed, R., Holmes, D., Weyers, J., Jones, A., 2013. Practical Skills in Biomolecular Sciences.
4th ed. Pearson Education.
Ririe, K.M., Rasmussen, R.P., Wittwer, C.T., 1996. Product Differentiation by Analysis of
DNA Melting Curves during the Polymerase Chain Reaction. Analytical Biochemistry,
245(2), pp. 154-160.
Ronaghi, M., Uhlén, M., Nyrén, P., 1998. A sequencing method based on real time
pyrophosphate. Science, 281 (5375): 363
Satterfield, B.C., 2014. Cooperative Primers: 2.5 Million–Fold Improvement in the Reduction
of Nonspecific Amplification. The Journal of Molecular Diagnostics, 16(2).
SantaLucia, J.J., 2007. Physical Principles and Visual-OMP Software for Optimal PCR
Design. Methods in Molecular Biology, 402: PCR Primer Design.
Tajadini, M., Panjehpour, M., Javanmard, S.H., 2014. Comparison of SYBR Green and
TaqMan methods in quantitative real-time polymerase chain reaction analysis of four
adenosine receptor subtypes. Advanced Biomedical Research, 3:85.
Vandesompele, J., 2009. qPCR guide. Eurogentec, pp.12-19.
Wilson, K., Walker, J.M., 2010. Principles and Techniques of Biochemistry and Molecular
Biology, Seventh Edition. Cambridge University Press.
Wittwer, C.T., Reed, G.H., Gundry, C.N., Vandersteen, J.G., and Pryor, R.J., 2003.
High-resolution genotyping by amplicon melting analysis using LCGreen. Clin. Chem., 49,
853–860.
Weitschek, E., Santoni, D., Fiscon, G., De Cola, M.C., Bertolazzi, P., Felici, G., 2014. Next
generation sequencing reads comparison with an alignment-free distance. BMC
Research Notes, 7:869.
Hyndman, D.L., Mitsuhashi, M., 2003. PCR Primer Design. Methods in Molecular
Biology™, 226:81-88.
Yuryev, A., 2007. PCR Primer Design Using Statistical Modeling. Methods in Molecular
Biology™, 402:93-103.
Vandenbroucke, I., Van Marck, H., Verhasselt, P., Thys, K., Mostmans, W., Dumont, S., Van
Eygen, V., Coen, K., Tuefferd, M., Aerssens, J., 2011. Minor Variant Detection in Amplicons
Using 454 Massive Parallel Pyrosequencing: Experiences and Considerations for Successful
Applications. BioTechniques, 51(3), pp. 167-177.
Appendices
Appendix 1 – Kits and reagents
NORGEN Biotek Corporation Oligo Clean-Up and Concentration Kit (50)
QIAEX II Gel extraction kit (150)
QIAGEN MinElute™ PCR Purification Kit (50)
TATAA SYBR® GrandMaster® Mix
Appendix 2 – Figures
Figure 3: Electropherograms obtained by running the three replicates of primer pair one on the Fragment
Analyzer™ for capillary gel electrophoresis. From top: replicate 1, replicate 2 and replicate 3. The gel
image to the right of each electropherogram shows two bands: the unspecific product, with a length of
55 bp, and the primers in the lowest band (~20 bp). A 35–1500 bp marker was run once and
imported for all the other analyses. All the samples look the same even though the melt curves were
different and each peak had a different Tm.
Figure 4: Melt curves of the NTC quadruplicates for primer pair two. The shared Tm of 76°C
suggests that all the replicates gave the same unspecific product, in roughly the same
amount in each replicate. As seen in figure 1, primer pairs that do not
have a homogeneous pattern within their replicates also have a melt curve that starts with a hump (red
circle) corresponding approximately to the Tm of the primers.
Figure 5: 2nd generation product of primer pair one, obtained by re-amplifying the 1st generation
product as template. The three replicates that gave different Tm values in the 1st generation
were each re-analyzed separately in triplicate, and all 9 resulting samples are shown in the
image with the same Tm (~71°C). A similar result was obtained with primer pair one replicate
three in figure 2b.
Figure 6: Melt curve for the 1st generation product of the IGFBP3 primer pair using
human cDNA as template. Two peaks are clearly visible, one at 83°C and another at 86°C. The
expected product corresponds to the taller peak, which has the lower Tm.
Figure 7: The image shows the melt curve for human cDNA amplified using a CD44 primer pair. The
red circle shows a hump that indicates the presence of a different kind of product that may be longer
than the target.
Figure 8: Melt curves of the 2nd generation products of both IGFBP3 and CD44. Compare with
the 1st generation products shown in figures 6 and 7, where the
peaks with the lower Tm represented the expected products and the peaks with the higher Tm the
unspecific products. The concentration of the unspecific products clearly differs between the first
generation and the second generation, although the only parameter that was changed was the nature of
the template.
Figure 10: Capillary gel electrophoresis for the CD44 sample purified with the MinElute™ kit. No peak is
visible in the electropherogram and no band is visible on the gel to the left, even though the purity and
concentration of the sample were quite high. Because of these results, the sample was not considered
adequate for further analysis.
IGFBP3
CD44
Appendix 3 - Sequences
Figure 11: Chromatograms from Sanger sequencing. Primer pair 10 was purified with three different
methods and all the purified samples were sequenced in both the Fw and Rv directions. From top: primer
pair 10, Rv and Fw fragments purified with Qiagen’s MinElute™ kit; primer pair 10, Rv and Fw
fragments purified with the Oligo Clean-Up and Concentration Kit from Norgen Biotek; primer pair 10, Rv
and Fw fragments purified with the QIAEX II® gel extraction kit from Qiagen. None of the chromatograms
shows a clear signal. In the 4th and 5th rows, the Fw fragment
purified with the Oligo Clean-Up kit and the Rv fragment purified with QIAEX II® give a series of peaks that
the software assembles into an approximate series of bases.
Figure 12: Chromatogram for the original IGFBP3 (old) sequence obtained by cycle sequencing
of the purified qPCR product. The sequence with the white background is the part retained after
clipping by the software, which judged it to be the only relevant part according to the reliability of each
nucleotide shown in the figure. The FASTA sequence is reported in Extract 1 (Appendix 3), together with
its alignment against the sequence obtained from the unspecific product generated in qPCR and
the IGFBP3 cDNA stored in GenBank.
Extract 1
IGFBP3_new_F ------------------------------------------------------------
IGFBP3_new2_F ------------------------------------------------------------
IGFBP3_cDNA AAAGGGCATGCTAAAGACAGCCAGCGCTACAAAGTTGACTACGAGTCTCAGAGCACAGAT
IGFBP3_old_F --------------------------ATAAG----------CAGTTGTCGCTTCCAAGGC
IGFBP3_new_F ------------------------------------------------------------
IGFBP3_new2_F ------------------------------------------------------------
IGFBP3_cDNA ACCCAGAACTTCTCCTCCGAGTCCAAGCGGGAGACAGAATATGGTCCCTGCCGTAGAGAA
IGFBP3_old_F AGGAAGCGGGGC------TTCTGCTGGTGTATGGATAATGTCATGCGTGCAGGTAGAGAA
IGFBP3_new_F ------------------------------------------------------------
IGFBP3_new2_F ------------------------------------------------------------
IGFBP3_cDNA ATGGAAGACACACTGAATCACCTGAAGTTCCTCAATGTGCTGAGTCCCAGGGGTGTACAC
IGFBP3_old_F ATGGAAGACACACTGAATCACCTGAAGTTCCTCAATGTGCTGAGTCCCAGGGGTGTACAC
IGFBP3_new_F ----------------------------------AGTAAAAAAATTGCGCCTTCCAAGGC
IGFBP3_new2_F -------------------------------------ATACACGTGTCGCCTTCCAAGGC
IGFBP3_cDNA ATTCCCAACTGTGACAAGAAGGGATTTTATAAGAAAAAGCAGTGTCGCCCTTCCAAAGGC
IGFBP3_old_F ATTCTCAACTGTGACAAGAAGGGATTTTATAAGAAAAAGCAGTGTCGCCCTTCCAAAGGC
* * * * * * *****
IGFBP3_new_F AGGAAGCGGGGCTTCTGCTGGTGTACGGATACATTCTGCTGTTCTACAGAGCTTCT ----
IGFBP3_new2_F AGGAAGCGGGGCTTCTGCTGGTGTACGGATCATTCTGCTGTTGTACACAAACTTCT ----
IGFBP3_cDNA AGGAAGCGGGGCTTCTGCTGGTGTGTGGATAAGTATGGGCAGCCTCTCCCAGGCTACACC
IGFBP3_old_F AGGAAGCGGGGCTTCTGCTGGTGTATGGATAATTATGGGCAGTATAGAAATGGAAGACAC
************************ **** *
IGFBP3_new_F ------------------------------------------------------------
IGFBP3_new2_F TCGCTCCCCCCCCCATCATAATCATAAAATAAATCAAACA----TCAACCTATCTTTATA
IGFBP3_cDNA TGTCTTGCAATGTATTTATAAATAGTAAATAAAGTTTTTACCATTAAAAAAATATCTTTC
IGFBP3_old_F ------------------------------------------------------------
IGFBP3_new_F ------------------------------------------------------------
IGFBP3_new2_F TAATTTTTTTTAACCCAC------------------------------------------
IGFBP3_cDNA CCTTTGTTATTGACCATCTCTGGGCTTTGTATCACTAATTATTTTATTTTATTATATAAT
IGFBP3_old_F ------------------------------------------------------------
IGFBP3_new_F ----------------------------------------------
IGFBP3_new2_F -------------------AAACCCAAAACGACACAAC--------
IGFBP3_cDNA AATTATTTTATTATAATAAAATCCTGAAAGGGGAAAATAAAAAAAA
IGFBP3_old_F ----------------------------------------------
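The rows of asterisks in Extract 1 mark alignment columns where all four sequences carry the same base. As a minimal sketch of how such a marker row can be derived, the Python below recomputes it for the first 30 columns of the alignment block that begins with AGGAAGCGGGGC. The function name `conservation_line` is an assumption made for illustration and is not part of the alignment software used in the study.

```python
def conservation_line(rows):
    """Return a marker string with '*' where every row has the same base."""
    width = min(len(row) for row in rows)
    marks = []
    for col in range(width):
        column = {row[col] for row in rows}
        # A column is conserved only if all rows agree and none is a gap.
        conserved = len(column) == 1 and "-" not in column
        marks.append("*" if conserved else " ")
    return "".join(marks)


# First 30 columns of the block beginning AGGAAGCGGGGC in Extract 1.
block = [
    "AGGAAGCGGGGCTTCTGCTGGTGTACGGAT",  # IGFBP3_new_F
    "AGGAAGCGGGGCTTCTGCTGGTGTACGGAT",  # IGFBP3_new2_F
    "AGGAAGCGGGGCTTCTGCTGGTGTGTGGAT",  # IGFBP3_cDNA
    "AGGAAGCGGGGCTTCTGCTGGTGTATGGAT",  # IGFBP3_old_F
]
marks = conservation_line(block)
print(marks)
```

For these columns the sketch yields 24 conserved positions, two variable positions and four more conserved positions, which agrees with the start of the marker row printed under that block.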
Figure 13: Sequence of the IGFBP3 fragment obtained with cycle sequencing,
aligned with the Fw primer to search for alternative binding sites. The same
sequence is present at two different sites on the analyzed fragment (yellow mark on top and orange
mark on bottom).
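The search for alternative binding sites can be sketched as a simple scan of the fragment for every position that matches a query sequence. The example below is only an illustration under stated assumptions: the Fw primer sequence is not given in this appendix, so the 21-base `motif` is a stand-in taken from the repeated tract in the IGFBP3_old_F rows of Extract 1, and those rows (with gaps removed) are assumed to represent the fragment shown in the figure. The helper `find_sites` is a hypothetical name.

```python
def find_sites(template, query, max_mismatches=0):
    """Return 0-based start positions where query matches template
    with at most max_mismatches substitutions (gaps are not allowed)."""
    sites = []
    for start in range(len(template) - len(query) + 1):
        window = template[start:start + len(query)]
        mismatches = sum(a != b for a, b in zip(window, query))
        if mismatches <= max_mismatches:
            sites.append(start)
    return sites


# IGFBP3_old_F rows copied from Extract 1; alignment gaps are removed below.
rows = [
    "--------------------------ATAAG----------CAGTTGTCGCTTCCAAGGC",
    "AGGAAGCGGGGC------TTCTGCTGGTGTATGGATAATGTCATGCGTGCAGGTAGAGAA",
    "ATGGAAGACACACTGAATCACCTGAAGTTCCTCAATGTGCTGAGTCCCAGGGGTGTACAC",
    "ATTCTCAACTGTGACAAGAAGGGATTTTATAAGAAAAAGCAGTGTCGCCCTTCCAAAGGC",
    "AGGAAGCGGGGCTTCTGCTGGTGTATGGATAATTATGGGCAGTATAGAAATGGAAGACAC",
]
fragment = "".join(rows).replace("-", "")

# Illustrative stand-in for the primer (assumption, not the real Fw primer).
motif = "AGGAAGCGGGGCTTCTGCTGG"
sites = find_sites(fragment, motif)
print(sites)
```

With exact matching the scan reports two sites in this fragment, which is the kind of duplication the figure highlights. Raising `max_mismatches` would also report partially matching sites, which may matter because primers can anneal despite a few mismatches.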
Figure 14: Alignment between the IGFBP3 sequence obtained in
our experiment, the Fw primer and the original sequence of IGFBP3 from GenBank. The orange and
yellow rectangles on top show the sequence that matches across the three. The red
rectangle on the bottom shows that IGFBP3first (1st generation product) contains a tract with severe
deletions compared to the original sequence. These deletions produce a sequence identical to
the Fw primer, creating an alternative binding site (see figure 13 for comparison).
Appendix 4 - Tables
Table 1a: Characteristics of the primers involved in the study as control group.
Forward (columns 2-6), Reverse (columns 7-11)
Name Length (b) Tm (°C) GC% Self Comp Self 3’ Comp Length (b) Tm (°C) GC% Self Comp Self 3’ Comp
1 24 60,36 46,23 4 0 23 60,37 43,48 3 0
2 20 60,46 60 4 0 21 60,12 52,38 3 0
3 20 59,09 60 2 0 20 59,03 50 4 0
4 16 60,57 75 3 0 19 60,08 63,12 4 0
5 22 59,57 50 2 0 21 59,08 52,38 3 0
6 20 59,53 55 4 0 20 61,35 55 4 0
7 19 59,47 53,03 2 0 20 59,39 55 4 0
8 19 61,17 58,29 2 0 21 59,24 48,02 2 0
9 24 58,23 37,05 4 0 19 59,07 53,03 4 0
10 20 59,02 55 2 1 23 59,06 39,13 3 0
11 20 60,08 55 3 0 20 61,08 60 2 0
12 20 61,02 55 3 1 20 60,15 55 2 1
13 23 60,05 48,23 4 0 20 60,37 50 4 0
14 21 60,48 52,38 4 0 20 61,05 55 3 1
15 24 60,12 42,07 3 0 21 60,19 52,38 4 0
16 23 59,22 43,48 4 0 20 60,07 50 4 0
17 18 58,41 61,11 4 0 21 59,08 52,38 3 1
18 19 61,15 63,16 4 1 20 59,45 55 4 1
19 20 59,06 55 3 1 20 60,36 55 4 0
The table reports the values of five characteristics for both the Fw and the Rv primer of each primer
pair. Each primer pair is indicated by a number (Name). The primer length is expressed in bases, Tm
in °C and GC content as a percentage; self complementarity and self 3’
complementarity are expressed as alignment scores.
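As a small illustration of how two of the tabulated characteristics are defined, the sketch below computes the length and GC content of a primer. The sequence used is hypothetical (the primer sequences are not listed in this appendix), and `gc_percent` is an illustrative helper name. Tm and the complementarity scores depend on the primer design tool used and are not reproduced here.

```python
def gc_percent(primer):
    """GC content of a primer as a percentage of its length."""
    primer = primer.upper()
    gc_count = sum(base in "GC" for base in primer)
    return 100.0 * gc_count / len(primer)


# Hypothetical 21-base primer with 11 G/C bases (not a primer from this study).
example = "ATGCGTACCGTTAGCCTGAGT"
length = len(example)
gc = round(gc_percent(example), 2)
print(length, gc)
```

A 21-base primer with 11 G or C bases gives 52.38 %, the value reported in the table for several 21-base primers (written there with a decimal comma).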
Table 1b: Characteristics of the primers involved in the study as test group.
Forward (columns 2-6), Reverse (columns 7-11)
Name Length (b) Tm (°C) GC% Self Comp Self 3’ Comp Length (b) Tm (°C) GC% Self Comp Self 3’ Comp
1 17 57,04 65,11 4 0 15 56,53 73,33 2 0
2 20 58,42 55 2 0 20 61,22 55 3 0
3 20 57,34 50 4 0 18 59,26 55,56 3 0
4 21 62,15 57,14 3 1 20 64,14 65 2 1
5 20 61,26 60 2 0 20 62,02 60 4 0
6 22 59,02 50 3 1 20 59,12 55 2 0
7 20 57,38 50 4 0 20 59,26 55 5 1
8 21 59,38 52,38 4 1 22 60,17 45,45 3 0
9 23 59,44 43,48 2 0 21 59,25 52,38 3 0
10 20 55,12 45 4 0 20 56,29 50 3 0
11 20 54,54 45 4 0 20 56,21 45 4 0
12 21 55,22 43,26 5 1 18 55,27 50 2 2
13 20 55,19 45 5 1 20 54,59 45 3 2
14 20 59,16 50 2 0 20 60,54 55 4 0
15 21 60,34 52,38 4 0 19 60,45 58,38 4 0
16 20 59,25 43,26 2 1 21 60,19 48,02 4 1
17 21 54,26 43,26 5 2 20 55,09 45 5 5
18 20 60,39 55 2 0 20 60,29 60 3 0
α 18 60,44 67,07 4 1 20 59,06 55 2 0
β 20 60,25 55 2 0 20 60,25 55 2 0
The table reports the values of five characteristics for both the Fw and the Rv primer of each primer
pair. Each primer pair is indicated by a number or letter (Name). The primer length is expressed in bases, Tm
in °C and GC content as a percentage; self complementarity and self 3’
complementarity are expressed as alignment scores. The primer pairs α and β showed behavior
attributable to both groups in different tests.
Table 3b: SDs and averages between Fw and Rv primers for all the considered characteristics
for the special groups α – β.*
Group Length (b) Tm (°C) GC% Self Comp Self 3’ Comp
T α – β Average 19,975 58,61975 52,7615 3,25 0,525
T α – β SD 1,349026240 2,44876157 7,217282517 1,056117709 0,9604352646
C α – β Average 20,45238095 59,8964285 53,65238095 3,214285714 0,2142857143
C α – β SD 1,670437082 0,78468518 7,062958963 0,842056550 0,4152997322
*The special groups α – β include two additional primer pairs (primer pair α and primer pair
β), which were used in both groups: they behaved like the Test primer pairs in the
first tests and gave results attributable to the Control primer pairs in the other tests, which used different
working solutions from the same stock. Length is expressed in bases.
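The averages and SDs in Table 3b can be checked against Table 1b. The sketch below does this for the Length column of the T α – β row, assuming that the Fw and Rv values of all 20 primer pairs (1 to 18 plus α and β) are pooled into one set of 40 values and that the SD is the sample standard deviation (n − 1 denominator). Both assumptions are inferred from the numbers rather than stated in the text.

```python
from statistics import mean, stdev

# Primer lengths (bases) from Table 1b: pairs 1-18, then α and β.
fw_lengths = [17, 20, 20, 21, 20, 22, 20, 21, 23, 20,
              20, 21, 20, 20, 21, 20, 21, 20, 18, 20]
rv_lengths = [15, 20, 18, 20, 20, 20, 20, 22, 21, 20,
              20, 18, 20, 20, 19, 21, 20, 20, 20, 20]

# Pool Fw and Rv primers into a single sample of 40 values.
pooled = fw_lengths + rv_lengths
avg = mean(pooled)
sd = stdev(pooled)  # sample SD, n - 1 in the denominator
print(round(avg, 3), round(sd, 6))
```

Under these assumptions the pooled values give an average of 19.975 and an SD of about 1.349026, matching the Length entries of the T α – β row (written there with decimal commas). The other columns can be checked the same way after converting the decimal commas to points.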