Biomarker Discovery from Paraffin Embedded Samples
Successful Principles for Addressing FFPE-associated Real-time PCR Challenges
Yexun Wang Ph.D., Emi Arikawa Ph.D., Shankar Sellappan Ph.D., and Li Shen Ph.D.
SABiosciences Corporation 6951 Executive Way, Frederick, MD 21703 USA
Phone: +1 (301) 682-9200 Fax: +1 (301) 682-7300 Web: www.SABiosciences.com
Abstract: Traditional histopathological analysis of formalin-fixed paraffin-embedded (FFPE) tissue is capable of
providing expression status for only a few targeted proteins or genes. Although fixed samples could serve as valuable sources
for biomarker identification studies, much of their value is hidden at the molecular level. Analysis of gene expression
profiles using genetic material contained within FFPE samples is further limited by the chemical modifications that
decrease the quality and availability of recovered RNA for molecular analysis. In this article, we present the successful
principles and results of the new RT2 FFPE PreAMP™ technology, which combines a novel RNA extraction protocol
that exploits the thermodynamic and structural characteristics of FFPE RNA with a powerful preamplification step for a
pathway-focused set of genes for real-time gene expression analysis. With this new technology, FFPE RNA can finally
serve as a powerful tool for retrospective and future gene profiling studies for biomarker identification.
Biomarkers, as defined by the NIH Working Group, are “a characteristic that is objectively measured and evaluated as an indicator of
normal biologic processes, pathogenic processes, or pharmacologic
responses to a therapeutic intervention”. Since all biological
processes are carried out by genes at the molecular level, gene
expression profiles, as measured by RNA quantitation through real-time
PCR analysis, have the potential to serve as predictors of diseased
conditions or response to therapeutic interventions. Biomarker
discovery through gene expression analysis has been widely used in
basic and clinical research, with freshly prepared tissue or cultured
cells most often used as starting material. However, when fresh
samples are not readily available, as is often the case with human tissue
samples, researchers rely on previously prepared, archived samples.
Most archived samples have been prepared in neutral-buffered
formalin solutions and embedded in paraffin. Paraffin blocks serve
as excellent substrates for morphological and immunological studies;
unfortunately, their use in molecular and genomic studies has proven
challenging. Technical obstacles hindering the exploitation of these
samples can be attributed to extensive chemical cross-linking among
proteins, DNA, and RNA, as well as to RNA degradation. Cross-linking
is primarily caused by formaldehyde fixation, which adds
mono-methylols (-CH2OH) to the amino groups of all four nucleic
acid bases and proteins, resulting in base modifications, methylene
bridging (N-CH2-N) between neighboring bases, and DNA-protein
and RNA-protein cross-linkages. In addition, RNA degradation
begins with tissue anoxia during fixation and continues throughout
the storage period [3, 4]. This damage compromises data consistency
and sensitivity in molecular studies.
In spite of these difficulties, using formalin-fixed paraffin-embedded
(FFPE) samples in gene expression analysis studies also presents
great advantages. First, FFPE samples are available in vast amounts
and are readily accessible. It is estimated that there are over 400
million FFPE samples stored worldwide, with the number of samples
expected to increase annually. Second, almost all FFPE samples
have associated pathological and clinical annotations. Although
FFPE samples face similar issues, such as institutional approval, sample quantity
acquisition, and sample heterogeneity, it is still much easier to
gather FFPE samples with different clinical outcomes than
to collect fresh samples. This makes association and classification
studies much easier. Third, applying biomarkers developed using
FFPE samples fits well with the traditional workflow of clinical
Experimental Protocol
Component 1: FFPE RNA Extraction
Component 2: Reverse Transcription
Component 3: qPCR Primer Design
Component 4: PreAMP Signal Amplification
FFPE PCR Array Performance
Application Examples
Real-time PCR is the simplest and most accurate method to study
gene expression and validate expression biomarkers. Before RNA
can be analyzed by real-time PCR, it has to be converted to cDNA
in a reverse transcription step. The common strategy for reverse
transcription uses the MMLV reverse transcriptase after RNA is
primed with random hexamers or oligo dT primers. These two
priming strategies are equally efficient for most RNA when fresh
samples are used. However, when RNA from FFPE samples is used,
neither strategy delivers reverse transcription efficiencies high
enough for later processes. Since RNA from FFPE samples is highly
fragmented, the intended amplicon region in PCR is most likely
disconnected from the poly(A) tail. Thus, oligo dT-primed
reverse transcription cannot reach the amplicon region, which leads
to lower PCR detection (Figure 2). For the same reason, most of
the random hexamers that prime upstream of the target region cannot
convert RNA amplicons into cDNA. Some random hexamers that prime
immediately upstream of the target region are expected to
convert RNA amplicons into cDNA that is detectable in PCR.
However, the very low concentration of those specific hexamers (the
concentration of any one particular hexamer is only 1/4096th, or about 0.02%,
of the total concentration) significantly limits priming efficiency,
because oligonucleotide annealing is highly concentration-dependent.
It follows that using a
single gene-specific primer located just upstream of the target region
should give the highest efficiency in converting an RNA amplicon into
cDNA from FFPE samples.
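The 1/4096 figure quoted above follows from counting the possible hexamer sequences. The short sketch below reproduces that arithmetic; it assumes a fully random hexamer pool with all four bases equally represented at every position, and the function name is illustrative only.

```python
# Fraction of a random-hexamer pool made up by any one specific sequence.
# Assumption: every position is an equal mix of the four bases A, C, G, T/U.
BASES = 4
HEXAMER_LENGTH = 6


def specific_primer_fraction(length: int = HEXAMER_LENGTH) -> float:
    """Return the share of a fully random pool of `length`-mers that
    matches one given sequence."""
    return 1 / BASES ** length


distinct_hexamers = BASES ** HEXAMER_LENGTH
fraction = specific_primer_fraction()
print(f"{distinct_hexamers} distinct hexamers; "
      f"one sequence is {fraction:.4%} of the pool")
```

By the same counting, a single gene-specific primer supplied on its own is present at its full nominal concentration, which is the basis of the argument for gene-specific priming.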
Figure 2: cDNA synthesis with fresh RNA (panel A) and with FFPE RNA (panel B). A) For intact RNA from fresh samples, any of the primers (1, 2, 3, or 4) can
generate cDNAs which contain the target amplicon (in gray). B) For fragmented
RNA from FFPE samples, only primer 4 can generate cDNA which contains the target amplicon.
Figure 1: Expression rank order of 84 genes (Cancer PathwayFinder PCR Array) in RNA
extracted from one 20 µm section of a 5-year-old normal human intestine FFPE block
using the RT2 FFPE RNA Extraction Kit. MicroRNA data is not shown.
RT2 FFPE RNA Extraction and PreAMPlification
To demonstrate that gene-specific primers will have a higher RT
efficiency than random hexamers for FFPE RNA, we designed
gene-specific RT primers for 89 genes present on the human Cancer
PathwayFinder™ PCR Array (PAHS-033). We pooled the 89 RT primers
together (0.225 µM each) and used them in the
RT reaction of RNA extracted from one 20 µm section of a five-year-old
normal human spleen FFPE block, followed by RT2 PCR Array
analysis. Across the genes tested, the Ct values decreased by an average
of 3.3 cycles, i.e. a ~10X increase in sensitivity (Figure 3). While
performing the analysis, we also noticed that some assays showed
non-specific amplification. This is likely due to non-specific primer
annealing events resulting from the low temperatures employed
during the RT step; other groups have reported similar problems.
While this problem with non-specific amplification can be solved
through individual assay optimization, it limits the genome-wide
application of this approach for gene expression analysis on FFPE
samples. Therefore we sought alternative strategies to improve assay
To test whether shorter amplicons improve sensitivity for FFPE RNA, we selected 22 genes which have existing RT2
primer designs with amplicon sizes longer than 130 bp. The average
size was 162 bp, with the longest at 191 bp. We then redesigned primers
for these genes so that the new amplicon sizes ranged from 53 to 81 bp
(average = 67 bp), and confirmed that the new designs work as efficiently
as the old ones. We then generated cDNA from a normal five-year-old
human spleen FFPE block and compared the real-time PCR results of
these two sets of primers. As expected, the shorter amplicon designs
generally yielded Ct values much lower than the longer designs for
the same amount of cDNA (Figure 4A). The average Ct gain was 2.8
cycles, with the largest gain exceeding 5 cycles for some genes. Thus, by
simply designing shorter amplicons, we can increase PCR sensitivity
by roughly 7-fold on average (2^2.8, assuming perfect doubling per cycle)
and by more than 30-fold for some genes. In contrast, shorter amplicon designs do not
improve Cts for RNAs of high quality (Figure 4B).
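The Ct shifts reported in this article (3.3 cycles for gene-specific RT primers, 2.8 cycles on average and more than 5 cycles at best for shorter amplicons) can be converted into approximate fold gains in sensitivity. The sketch below assumes ideal PCR in which the product doubles every cycle; the `efficiency` parameter and the function name are illustrative and are not part of the original protocol.

```python
def fold_change_from_ct_shift(delta_ct: float, efficiency: float = 1.0) -> float:
    """Convert a Ct decrease (in cycles) into a fold gain in detectable template.

    efficiency = 1.0 means perfect doubling each cycle (an assumption);
    lower values model less efficient amplification.
    """
    return (1.0 + efficiency) ** delta_ct


# Ct shifts quoted in the text above.
reported_shifts = [
    ("gene-specific RT primers (average)", 3.3),
    ("shorter amplicons (average)", 2.8),
    ("shorter amplicons (largest gains)", 5.0),
]

for label, shift in reported_shifts:
    fold = fold_change_from_ct_shift(shift)
    print(f"{label}: {shift} cycles -> about {fold:.1f}-fold")
```

Because real assays rarely reach exactly 100% efficiency, these values are best read as upper-bound estimates.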
Figure 3: Gene-Specific RT Primers Improve Reverse Transcription Efficiency. 500
ng RNA extracted from one 20 µm section of a five-year-old human spleen FFPE block
was reverse transcribed using either random hexamers or a pool of 89 gene-specific
primers located upstream of the PCR amplicons for the human Cancer PathwayFinder™
PCR Array. Equal volumes of cDNA were later used on the same PCR array. Shown here
are the box plots of raw Ct values for each condition.
The sizes of RNA fragments extracted from FFPE samples usually
range from 50 to 300 nt, with most around 100 nt. This average size
correlates well with the age of the block, with older blocks yielding
shorter RNA fragments. This indicates that RNA fragmentation
continues even after the sample is completely fixed and embedded.
These short RNA fragments necessitate specialized PCR primer
designs, with the important caveat that the PCR amplicon should
not be much longer than the average size of the RNA fragments.
The reason is that a sequence much longer than the average fragment
size is less likely to be present intact, and therefore less likely
to be amplified. For example, when RNA is randomly fragmented
to an average size of 100 nt, 300 nt
fragments are far less abundant than 100 nt fragments.
It can be inferred that
shorter amplicon designs should yield better sensitivity in real-time
PCR analysis for FFPE samples.
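The argument above can be made concrete with a deliberately simple fragmentation model. The sketch below is an illustration and not the authors' analysis: it assumes that every bond between neighbouring nucleotides breaks independently with probability 1 / (mean fragment length), and it ignores chemical modification, which also blocks templates. The function name and the 100 nt mean are illustrative choices based on the typical size quoted in the text.

```python
def prob_amplicon_intact(amplicon_len: int, mean_fragment_len: float) -> float:
    """Probability that a stretch of `amplicon_len` nucleotides has no break.

    Model assumption: each bond between adjacent nucleotides breaks
    independently with probability 1 / mean_fragment_len, which gives
    fragments with that mean length.
    """
    if amplicon_len < 1 or mean_fragment_len <= 1:
        raise ValueError("amplicon_len must be >= 1 and mean_fragment_len > 1")
    break_prob = 1.0 / mean_fragment_len
    internal_bonds = amplicon_len - 1
    return (1.0 - break_prob) ** internal_bonds


MEAN_FRAGMENT_NT = 100  # typical FFPE fragment size quoted in the text

# Average amplicon sizes of the redesigned (67 bp) and original (162 bp) assays.
for amplicon_bp in (67, 162):
    p_intact = prob_amplicon_intact(amplicon_bp, MEAN_FRAGMENT_NT)
    print(f"{amplicon_bp} bp amplicon: {p_intact:.1%} of target sites remain intact")
```

Under these assumptions the shorter design has a substantially larger pool of intact template, in line with the direction of the experimental result, although the model is too crude to predict the measured Ct gain.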
Tel 888-503-3187 (USA)
qPCR Primer Design
Figure 4: Primers Designed for Shorter PCR Amplicons Generally Yield Lower Cts for
Fragmented RNA. A) 500 ng RNA extracted from one 20 µm section of a five-year-old
human spleen block was reverse transcribed using random hexamers. Equal amounts
of cDNA were used on two customized RT2 PCR Arrays. For one array, primers were
designed to generate regular PCR amplicons (132 bp to 191 bp). For the second array,
primers were designed to specifically generate shorter amplicons (53 to 81 bp) for the
same genes. Shown here are the before and after raw Ct values for each gene. B) cDNA
transcribed from 1 µg high quality human universal RNA was used on 45 genes (from
PAHS-060) with two different primer designs. For one array, primers were designed to
generate regular PCR amplicons (130 bp to 195 bp). For the second array, primers were
designed to specifically generate shorter amplicons (50 to 89 bp) for the same genes.
Shown here are the before and after raw Ct values for each gene.
Fax 888-465-9859 (USA)
PreAMP Signal Amplification
Another discouraging factor for FFPE samples is that, while one can
extract RNA in the microgram range, damage to the RNA means that the amount of
RNA fragments which can serve as effective templates in RT-qPCR is
much lower. A pre-amplification approach has been used to enhance
detection of RNA extracted from small fresh samples (e.g. from LCM,
FNB, cell sorting), and it is expected that pre-amplification can also be
used to boost the overall qPCR assay sensitivity in FFPE samples.
FFPE PCR Array Performance
In order to demonstrate the value of pre-amplification when working
with FFPE samples yielding low effective RNA template, we have
developed the RT2 FFPE PreAMP technology. When used together
with the RT2 PCR array platform, multi-gene expression analysis on
FFPE samples can be successfully performed. After RNA extracted
from FFPE samples is converted to cDNA, a quarter of the cDNA
is used in a tightly controlled multiplex PCR reaction containing
a mixture of primers specific to the genes on a particular PCR
Array. After the PreAMP reaction is complete and excess primers are
removed by enzyme digestion, the pre-amplified cDNA is distributed
across a PCR array for individual gene detection.
Compared to the previously discussed strategies for RT-qPCR primer
design, the FFPE PreAMP technology yields the best improvement in
sensitivity. Depending on the number of cycles in the multiplex PCR,
qPCR assay sensitivity can be improved by 100-fold or
more for FFPE RNA. As an example, we extracted RNA from a five-year-old
normal human intestine FFPE block, converted it into cDNA,
and ran the cDNA on the RT2 Human Cancer PathwayFinder PCR
Array with or without the FFPE PreAMP technology. The results
from our regular PCR array protocol without FFPE PreAMPlification
showed 18 genes (20% of the array) as “absent” due to Ct values
greater than 35. In contrast, addition of the FFPE PreAMP
technology greatly improved the detection of those 18 genes (Figure
5), increasing the positive call rate to 100%.
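The link between preamplification cycle number and sensitivity gain can be estimated with a small calculation. The sketch below is a rough guide under stated assumptions: the article does not give the actual PreAMP cycle count, the per-cycle `efficiency` is hypothetical, and the function name is illustrative.

```python
def cycles_for_fold_gain(target_fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of preamplification cycles that yields at
    least `target_fold` amplification.

    efficiency = 1.0 means the product doubles every cycle (an assumption).
    """
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    cycles = 0
    amplification = 1.0
    # Multiply cycle by cycle to avoid floating-point rounding in logarithms.
    while amplification < target_fold:
        amplification *= 1.0 + efficiency
        cycles += 1
    return cycles


for eff in (1.0, 0.9):
    n = cycles_for_fold_gain(100, efficiency=eff)
    print(f"efficiency {eff:.0%}: {n} cycles for at least 100-fold")
```

In practice the cycle number is kept low enough that all targets in the multiplex remain in exponential amplification, which is what preserves their relative abundance.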
When scientists try to look at the expression profile of multiple genes
on the same sample by qPCR, the easiest approach is to run multiple
individual qPCR assays corresponding to those genes. This creates
the problem of sample division or dilution. For example, if one tries
to use qPCR to study 100 genes, including gene A, using 1 µg of RNA,
one would have to divide the 1 µg of RNA into 100 assays, one assay for each
gene. If gene A has only 100 copies in 1 µg of RNA, then its copy number
in the individual assay designated for gene A will drop to one,
which cannot be reliably detected. From another perspective, the
reverse transcription reaction usually has to be diluted before cDNA
can be used in qPCR due to the interference of RT chemistry with
qPCR chemistry. This also causes the number of original gene copies
to be diluted and fall off the qPCR detection range. This problem
can be avoided by starting with much more RNA in the experiment.
However, this is not always feasible, especially with FFPE samples.
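The dilution problem described above can be quantified with a short calculation. The sketch below assumes that template molecules distribute randomly among aliquots, so the number landing in any one well follows a Poisson distribution; this sampling model is an added assumption, and the function names are illustrative.

```python
import math


def copies_per_assay(total_copies: float, n_assays: int) -> float:
    """Average number of template copies each assay receives when the
    sample is split evenly across `n_assays` wells."""
    return total_copies / n_assays


def prob_no_template(mean_copies: float) -> float:
    """Poisson probability that a well receives zero copies, given the
    average number of copies per well."""
    return math.exp(-mean_copies)


# The example from the text: 100 copies of gene A split across 100 assays.
mean_copies = copies_per_assay(total_copies=100, n_assays=100)
print(f"average copies of gene A per assay: {mean_copies:.0f}")
print(f"chance the gene A well contains no template: "
      f"{prob_no_template(mean_copies):.1%}")
```

Under this model, amplifying the targets before the sample is divided raises the mean copies per well, which rapidly shrinks the chance of an empty well.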
Figure 5: PreAMP™ Process Makes More Genes Detectable in FFPE Samples. RNA
extracted from a five-year-old human intestine FFPE block was reverse transcribed with
or without the PreAMP process. Both cDNAs were run on Human Cancer PathwayFinder
arrays (PAHS-033). Shown here are 18 genes which are called “Absent” using standard
RT-PCR array procedures.
Any signal or sample amplification approach needs to have high
fidelity in order to maintain the original profile of the genes being
studied. We evaluated the consistency of the PreAMP technology and
the fidelity of gene expression profiles between the original sample
and samples undergoing PreAMPlification. The PreAMPlified
samples showed very high consistency between two independent
runs (Figure 6).
Figure 6: Inter-assay Consistency of the PreAMP Process. Two independent PreAMP
reactions were set up using the same amount of mouse cDNA. Mouse Toll-Like
Receptor Signaling Pathway PCR Array (PAMM-018) was used to evaluate the inter-run
consistency. ΔCt (Ct(GOI) – Ct(HKG)) of 89 genes from two separate runs shows R > 0.99.
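The consistency metric in Figure 6 compares ΔCt values, Ct(GOI) minus Ct(HKG), between two runs using a correlation coefficient. The sketch below shows one way such a comparison could be computed. The gene names and Ct values are made-up placeholders for illustration only, not data from this study, and a Pearson correlation is assumed because the article does not state which coefficient was used.

```python
import math


def delta_ct(ct_goi: float, ct_hkg: float) -> float:
    """Normalize a gene of interest (GOI) to a housekeeping gene (HKG)."""
    return ct_goi - ct_hkg


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical raw Ct values from two independent PreAMP runs (placeholders).
run1 = {"GENE_A": 24.1, "GENE_B": 27.8, "GENE_C": 30.2, "GENE_D": 22.5}
run2 = {"GENE_A": 24.3, "GENE_B": 27.6, "GENE_C": 30.5, "GENE_D": 22.4}
hkg_run1, hkg_run2 = 18.0, 18.1  # hypothetical housekeeping-gene Cts

genes = sorted(run1)
dct_run1 = [delta_ct(run1[g], hkg_run1) for g in genes]
dct_run2 = [delta_ct(run2[g], hkg_run2) for g in genes]
r = pearson_r(dct_run1, dct_run2)
print(f"inter-run correlation of delta-Ct values: R = {r:.3f}")
```

Normalizing to a housekeeping gene first removes run-to-run differences in total input, so the correlation reflects the reproducibility of the relative expression profile.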
To evaluate the biological significance of the fold changes, we used
RNAs from human spleen and intestine FFPE blocks and examined
the differential expression of 89 genes in the Human Cancer
PathwayFinder PCR Array. We compared the results obtained using
our PCR Array protocol with those incorporating the FFPE PreAMP
protocol. When looking at those genes which are reliably detected as
“present” (Ct < 31.5) with the PCR array, their raw Ct values without
PreAMP are much higher, but highly correlated with the raw Ct
values after the PreAMP process (Figure 7).
The fold changes, or the difference in gene expression between
spleen and intestine samples, also correlate well between the two
protocols (Figure 8). This supports two conclusions about the
biological relevance and accuracy of the data.
First, the original gene expression profile is faithfully maintained
during the PreAMP process. In a population of genes with both high
and low expressers, preamplification amplifies all genes approximately equally, so
that overall sensitivity of detection increases while the differential
pattern is maintained. Second, the PreAMP technology allows one
to compare multiple genes simultaneously, some of which cannot be
detected by traditional means.
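The reasoning above, that uniform preamplification preserves fold changes, can be illustrated with the widely used ΔΔCt calculation. The sketch below uses made-up Ct values and a hypothetical uniform 7-cycle PreAMP shift; it assumes perfect doubling per cycle and identical amplification of every target, which real reactions only approximate.

```python
def ddct_fold_change(ct_goi_test: float, ct_hkg_test: float,
                     ct_goi_ref: float, ct_hkg_ref: float) -> float:
    """Fold change of a gene of interest (test vs. reference sample) by the
    delta-delta-Ct method, assuming perfect doubling per cycle."""
    ddct = (ct_goi_test - ct_hkg_test) - (ct_goi_ref - ct_hkg_ref)
    return 2.0 ** (-ddct)


# Hypothetical raw Cts: spleen is the test sample, intestine the reference.
spleen = {"goi": 30.0, "hkg": 22.0}
intestine = {"goi": 28.0, "hkg": 22.0}

PREAMP_SHIFT = 7.0  # hypothetical uniform Ct decrease from preamplification

unamplified = ddct_fold_change(spleen["goi"], spleen["hkg"],
                               intestine["goi"], intestine["hkg"])
preamplified = ddct_fold_change(spleen["goi"] - PREAMP_SHIFT,
                                spleen["hkg"] - PREAMP_SHIFT,
                                intestine["goi"] - PREAMP_SHIFT,
                                intestine["hkg"] - PREAMP_SHIFT)

print(f"fold change without PreAMP: {unamplified}")
print(f"fold change with uniform PreAMP shift: {preamplified}")
```

Because the same shift is subtracted from every Ct, it cancels in the ΔΔCt term, so any deviation seen in real data reflects unequal amplification between targets rather than the preamplification step as such.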
(Plot annotations: R = 0.96 and R = 0.98; axes: FFPE Raw Ct and FFPE PreAMP Ct.)
Figure 8: Comparable Fold Change Results Between PreAMP and Unamplified FFPE
Samples. RNA extracted from human spleen and intestine FFPE blocks was reverse
transcribed with or without the PreAMP process. All four cDNAs were run on
PAHS-033. Shown here is the ΔΔCt comparison. All genes that were significantly changed in the
original FFPE samples were also detected as significantly changed in the same direction
after the PreAMP process (in white regions A and B). Although some genes showed
fold change values (as identified by the arrow) in opposite directions, the values
were not significant in either condition. Only genes which have Cts lower than 31.5 in
both unamplified spleen and intestine samples are shown here.
Figure 7: Highly comparable Ct values between PreAMP and non-PreAMP FFPE
Spleen RNA. RNA extracted from a human spleen FFPE block was reverse transcribed
with or without the PreAMP process. Both cDNAs were run on the Human Cancer
PathwayFinder Array. A) Shown here is the scatter plot of raw Cts. B) Every assay is
improved by approximately the same number of Cts after the PreAMP process. Each
pair of data points (PreAMP and no PreAMP) represents a different gene. Only genes
which have Cts lower than 31.5 in the reaction without PreAMP are shown here.
The RT2 FFPE PreAMP technology addresses a major challenge in real-time
PCR-based gene expression analysis using FFPE samples. It
greatly improves the detection limit of FFPE RNA. It also allows for
easy integration with the popular RT2 Profiler PCR Array platform
for analysis of a pathway-focused set of genes.
As millions of FFPE samples are currently stored in archives with
thorough pathological information and clinical history of the patient
available, using them as a source for biomarker discovery has gained
increasing importance. With the improvement in assay design
and the introduction of a novel workflow tailored to the nature of
these samples, we can now expect to have better utilization of these
valuable resources. We believe, and the data presented here demonstrate, that our complete
FFPE RNA extraction protocol, FFPE PreAMP technology, and PCR
Array system will be a valuable tool in this quest.
1. Biomarkers and surrogate endpoints: preferred definitions and
conceptual framework. Biomarkers Definitions Working Group. Clin
Pharmacol Ther. 2001 Mar;69(3):89-95.
2. Analysis of chemical modification of RNA from formalin-fixed
samples and optimization of molecular biology applications for such
samples. Masuda N, Ohnishi T, Kawamoto S, Monden M, Okubo K.
Nucleic Acids Res. 1999 Nov 15;27(22):4436-43.
3. Determinants of RNA quality from FFPE samples. von Ahlfen
S, Missel A, Bendrat K, Schlumpberger M. PLoS One. 2007 Dec
4. Measurement of gene expression in archival paraffin-embedded
tissues: development and performance of a 92-gene reverse
transcriptase-polymerase chain reaction assay. Cronin M, Pho M,
Dutta D, Stephans JC, Shak S, Kiefer MC, Esteban JM, Baker JB. Am J
Pathol. 2004 Jan;164(1):35-42.
5. Applying aCGH to molecular diagnostics. Kathy Liszewski. GEN
2008 May 15; 28(10).
6. Enhanced detection of RNA from paraffin-embedded tissue using
a panel of truncated gene-specific primers for reverse transcription.
Mikhitarian K, Reott S, Hoover L, Allen A, Cole DJ, Gillanders WE,
Mitas M. Biotechniques. 2004 Mar;36(3):474-8.