SERIAL ANALYSIS OF GENE EXPRESSION (SAGE)
-SAMSUDEEN. S
 Serial analysis of gene expression (SAGE) is a transcriptomic
technique used by molecular biologists to
produce a snapshot of the messenger
RNA population in a sample of interest in
the form of small tags that correspond to
fragments of those transcripts.
SAGE EXPERIMENTS PROCEED AS FOLLOWS:
 The mRNA of an input sample is isolated and 
a reverse transcriptase and biotinylated primers 
are used to synthesize cDNA from mRNA.
 The cDNA is bound to Streptavidin beads via 
interaction with the biotin attached to the 
primers, and is then cleaved using a restriction 
endonuclease called an anchoring enzyme (AE). 
The location of the cleavage site and thus the 
length of the remaining cDNA bound to the 
bead will vary for each individual cDNA 
(mRNA).
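The dependence of the bead-bound fragment length on each cDNA's sequence can be illustrated with a minimal sketch. It assumes the anchoring enzyme recognizes CATG (the site of NlaIII, a common choice that the slides do not name) and that each sequence is written so that its right-hand end is the biotinylated primer end held on the bead. The function name is hypothetical, and the site itself is kept in the returned fragment as a simplification.

```python
ANCHOR_SITE = "CATG"  # assumption: NlaIII-style recognition site, not named in the slides


def bead_bound_fragment(cdna, site=ANCHOR_SITE):
    """Return the part of `cdna` that stays on the bead after digestion.

    `cdna` is written so that its right-hand end is the bead-bound
    (biotinylated primer) end. The retained fragment runs from the
    anchoring site nearest that end to the end of the sequence.
    Returns None when the cDNA has no anchoring site at all.
    """
    position = cdna.rfind(site)  # site closest to the bead-bound end
    if position == -1:
        return None
    return cdna[position:]


# Two made-up cDNAs: the retained length differs because the site
# nearest the bead sits at a different position in each.
first = bead_bound_fragment("GGCATGTTACATGACCGTAAAAAA")
second = bead_bound_fragment("CATGCCCCCCCCAAAA")
print(len(first), len(second))
```

A cDNA without any anchoring site yields no fragment, so its transcript cannot be represented by a tag.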
PROCESS IN SAGE:
SAGE EXPERIMENTS PROCEED AS FOLLOWS:
 The cleaved cDNA downstream from the
cleavage site is then discarded, and the
remaining immobile cDNA fragments
upstream from cleavage sites are divided in
half and exposed to one of two adapter
oligonucleotides (A or B) containing several
components in the following order upstream
from the attachment site:
A) Sticky ends with the anchoring enzyme
cut site to allow for attachment to cleaved
cDNA.
B) A recognition site for a restriction
endonuclease known as the tagging enzyme
(TE), which cuts about 15 nucleotides
downstream of its recognition site.
C) A short primer sequence unique to
either adapter A or B, which will later be
used for further amplification via PCR.
 After adapter ligation, the cDNAs are cleaved
with the tagging enzyme (TE) to release them
from the beads, leaving only a short "tag" of
about 11 nucleotides of the original cDNA.
 The cleaved cDNA tags are then repaired
with DNA polymerase to produce blunt-ended
cDNA fragments.
 These cDNA tag fragments are ligated,
sandwiching the two tag sequences together
with adapters A and B flanking them at either end.
These new constructs, called ditags, are then
PCR amplified using primers specific to
adapters A and B.
 The ditags are then cleaved using the original
anchoring enzyme and allowed to link
together with other ditags. The linked ditags are
ligated to create a cDNA concatemer in which each
ditag is separated by the anchoring
enzyme recognition site.
 These concatemers are then transformed into
bacteria for amplification through bacterial
replication.
 The cDNA concatemers can then be isolated
and sequenced using modern high-
throughput DNA sequencers and these
sequences can be analysed with computer
programs which quantify the recurrence of
individual tags.
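The tag-counting step can be sketched as follows. This is a simplified illustration, not the logic of any particular SAGE program: it assumes the anchoring enzyme site is CATG (NlaIII, not named in the slides), a fixed tag length of 11 nucleotides, and that no tag contains the site internally. The function names are hypothetical. Because the two tags of a ditag face each other, the second tag is read as the reverse complement of the ditag's far end.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumption: NlaIII-style site separating ditags
TAG_LEN = 11          # "about 11 nucleotides" per the protocol description

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_ditags(concatemer, site=ANCHOR_SITE):
    """Split a sequenced concatemer into ditags at each anchoring site."""
    return [piece for piece in concatemer.split(site) if piece]


def tags_from_ditag(ditag, tag_len=TAG_LEN):
    """Return the two tags of a ditag, or [] if the ditag is too short."""
    if len(ditag) < 2 * tag_len:
        return []
    return [ditag[:tag_len], reverse_complement(ditag[-tag_len:])]


def count_tags(concatemer):
    """Count how often each tag occurs in one concatemer read."""
    counts = Counter()
    for ditag in split_ditags(concatemer):
        counts.update(tags_from_ditag(ditag))
    return counts


# Made-up example: two ditags built from two invented tags.
tag_a, tag_b = "AAACCCGGGTT", "TTTGGACCAGA"
read = (ANCHOR_SITE + tag_a + reverse_complement(tag_b)
        + ANCHOR_SITE + tag_a + reverse_complement(tag_a) + ANCHOR_SITE)
print(count_tags(read))
```

Splitting on the site is the simplest possible parser; a tag that happens to contain the site would be split incorrectly, which a real pipeline would have to handle.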
 In 1995, reducing the tag length from
100-800 bp to 10-22 bp
helped lower the cost of mRNA surveys. In the
same year, the original SAGE protocol was published
by Victor Velculescu at the Oncology Center of
Johns Hopkins University. Although SAGE was
originally conceived for use in cancer studies, it has
been used successfully to describe
the transcriptomes of other diseases and of a wide
variety of organisms.
VICTOR VELCULESCU
LONG SAGE
 LongSAGE, developed in 2002, was a more robust,
higher-throughput version of the original SAGE,
using 20 μg of mRNA to
generate a cDNA library of thousands of
tags. Robust LongSAGE (RL-SAGE) further
improved on the LongSAGE protocol: it can
generate a library with an insert size of 50
ng mRNA, much smaller than the previous LongSAGE
insert size of 2 μg mRNA, and uses a lower number
of ditag polymerase chain reactions (PCR) to
obtain a complete cDNA library.
SUPER SAGE
 SuperSAGE is a derivative of SAGE that uses
the type III endonuclease EcoP15I
of phage P1 to cut 26 bp sequence
tags from each transcript's cDNA, expanding
the tag size by at least 6 bp compared to
the predecessor techniques SAGE and
LongSAGE. The longer tag allows for a
more precise allocation of the tag to the
corresponding transcript, because each
additional base increases the precision of the
annotation considerably.
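The effect of tag length on annotation precision can be made concrete with a small calculation. Each extra base multiplies the number of distinct possible tags by four. The sketch below assumes, as a simplification, that all four bases are equally likely and independent; the 20 bp figure is only inferred from "at least 6 bp" shorter than 26 bp, and the function names are hypothetical.

```python
def possible_tags(tag_len):
    """Number of distinct DNA sequences of length `tag_len`."""
    return 4 ** tag_len


def chance_match_probability(tag_len):
    """Probability that one random position matches a given tag.

    Assumes uniform, independent bases, which real genomes are not.
    """
    return 1 / possible_tags(tag_len)


for length in (11, 20, 26):
    print(length, possible_tags(length), chance_match_probability(length))
```

Under these assumptions, going from 20 bp to 26 bp enlarges the space of possible tags 4096-fold, which is why a longer tag is far less likely to match an unrelated transcript by chance.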
 As in the original SAGE protocol, so-called
ditags are formed, using blunt-ended tags. By
direct sequencing with high-throughput
sequencing techniques, hundreds of thousands or
millions of tags can be analyzed simultaneously,
producing precise, quantitative gene
expression profiles. Tag-based gene
expression profiling, also called "digital gene
expression profiling" (DGE), can therefore
provide highly accurate transcription profiles that
overcome the limitations of microarrays.
MASSIVE ANALYSIS OF CDNA ENDS
 In the mid-2010s, several techniques combined with
next-generation sequencing were developed that
employ the "tag" principle for "digital gene
expression profiling" but without the use of the
tagging enzyme. The MACE approach (Massive
Analysis of cDNA Ends) generates tags
somewhere in the last 1500 bp of a transcript. The
technique no longer depends on restriction enzymes
and thereby circumvents bias that is
related to the absence or location of the restriction
site within the cDNA.
 The cDNA is randomly fragmented, and the
3' ends are sequenced from the 5' end of the
cDNA fragment that carries the poly-A tail.
The sequencing length of the tag can be
freely chosen. Because of this, the tags can be
assembled into contigs and the annotation of
the tags can be drastically improved.
MACE is therefore also used for the analysis
of non-model organisms. In addition, the
longer contigs can be screened for
polymorphisms.
 Because MACE requires only the 3' ends of
transcripts, even partly degraded RNA can be
analyzed with less degradation-dependent
bias. The MACE approach uses unique
molecular identifiers to allow for
identification of PCR bias.
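How unique molecular identifiers expose PCR bias can be sketched in a few lines. The sketch assumes each read has already been assigned to a gene and carries a short UMI sequence, and that reads sharing both gene and UMI are PCR copies of one original molecule. The function name and the input layout are hypothetical, not part of any MACE software.

```python
from collections import defaultdict


def umi_collapsed_counts(reads):
    """Count distinct UMIs per gene from (gene, umi) read pairs.

    Reads with the same gene and the same UMI are treated as PCR
    duplicates and counted once.
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Made-up reads: geneA has three reads, but two share a UMI.
reads = [
    ("geneA", "ACGT"),
    ("geneA", "ACGT"),  # PCR duplicate of the read above
    ("geneA", "TTGA"),
    ("geneB", "ACGT"),
]
print(umi_collapsed_counts(reads))
```

Comparing the raw read count per gene with the collapsed count shows how strongly amplification inflated each gene's apparent expression.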
MACE
RESULT ANALYSIS:
 The output of SAGE is a list of short sequence tags
and the number of times each is observed.
Using sequence databases, a researcher can usually
determine, with some confidence, from which
original mRNA (and therefore which gene) each tag
was extracted.
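The tag-to-gene assignment can be sketched as a simple lookup. The reference table here is an invented stand-in for a real sequence database, the gene names are placeholders, and the function name is hypothetical. Tags that are missing from the table are tallied separately rather than guessed.

```python
def annotate_tags(tag_counts, tag_to_gene):
    """Sum tag counts per gene using a tag -> gene reference table.

    Returns (gene_counts, unassigned), where `unassigned` is the total
    count of tags that are absent from the reference table.
    """
    gene_counts = {}
    unassigned = 0
    for tag, count in tag_counts.items():
        gene = tag_to_gene.get(tag)
        if gene is None:
            unassigned += count
        else:
            gene_counts[gene] = gene_counts.get(gene, 0) + count
    return gene_counts, unassigned


# Invented tags and a made-up reference table for illustration only.
tag_counts = {"AAACCCGGGTT": 12, "TTTGGACCAGA": 3, "GGGGGGGGGGG": 1}
reference = {"AAACCCGGGTT": "gene1", "TTTGGACCAGA": "gene2"}
print(annotate_tags(tag_counts, reference))
```

A real lookup also has to deal with tags that match more than one transcript, which is the source of the "some confidence" caveat.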
 Statistical methods can be applied to tag and count
lists from different samples in order to determine
which genes are more highly expressed. For
example, a normal tissue sample can be compared
against a corresponding tumor to determine
which genes tend to be more (or less) active.
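A comparison between two tag lists can be sketched as follows. Counts are first scaled to tags per million so that libraries of different sizes are comparable, and a two-proportion z-test then gives a rough p-value for each tag. This test is only one simple choice made for illustration, not the method prescribed by the slides, and the function names and counts are invented.

```python
import math


def tags_per_million(count, library_size):
    """Scale a raw tag count to tags per million sequenced tags."""
    return count * 1_000_000 / library_size


def two_proportion_p_value(count_a, total_a, count_b, total_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (count_a + count_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 1.0  # tag absent from both libraries or present in every read
    z = (count_a / total_a - count_b / total_b) / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def compare_libraries(normal, tumor):
    """Compare two {tag: count} libraries tag by tag."""
    total_normal = sum(normal.values())
    total_tumor = sum(tumor.values())
    results = {}
    for tag in set(normal) | set(tumor):
        n, t = normal.get(tag, 0), tumor.get(tag, 0)
        results[tag] = {
            "normal_tpm": tags_per_million(n, total_normal),
            "tumor_tpm": tags_per_million(t, total_tumor),
            "p_value": two_proportion_p_value(n, total_normal, t, total_tumor),
        }
    return results


# Invented counts for a normal and a tumor library.
normal = {"AAACCCGGGTT": 50, "TTTGGACCAGA": 950}
tumor = {"AAACCCGGGTT": 200, "TTTGGACCAGA": 800}
for tag, row in sorted(compare_libraries(normal, tumor).items()):
    print(tag, row)
```

With many tags tested at once, the p-values would also need a multiple-testing correction before any gene is called differentially expressed.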