2. SAGE TECHNOLOGY AND
ITS APPLICATIONS
PRESENTED BY
Dr. R.A. Siddique &
Dr. Anand Kumar
Animal Biochemistry Division
N.D.R.I., Karnal (Haryana), India, 132001
E-mail: riazndri@gmail.com
3. WHAT IS SAGE?
• Serial analysis of gene expression (SAGE) is a powerful tool that allows digital analysis of overall gene expression patterns.
• Produces a snapshot of the mRNA population in the sample of interest.
• SAGE provides quantitative and comprehensive expression profiling in a given cell population.
4. • SAGE was invented at Johns Hopkins University (Oncology Center), USA, by Dr. Victor Velculescu in 1995.
• Provides an overview of a cell's complete gene activity.
• Addresses specific issues such as determination of normal gene structure and identification of abnormal genome changes.
• Enables precise annotation of existing genes and discovery of new genes.
5. NEED FOR SAGE...
• Gene expression analysis is the study of how specific genes are transcribed at a given point in time in a given cell.
• It examines which transcripts are present in a cell.
• SAGE enables large-scale studies of gene expression; these can be used to create 'expression profiles'.
6. • Allows rapid, detailed analysis of thousands of transcripts in a cell.
• Comparing different types of cells generates profiles that help to understand healthy cells and what goes wrong during disease.
7. THREE PRINCIPLES UNDERLIE THE SAGE METHODOLOGY:
• A short sequence tag (10-14 bp) contains sufficient information to uniquely identify a transcript, provided that the tag is obtained from a unique position within each transcript.
• Sequence tags can be linked together to form long serial molecules that can be cloned and sequenced.
• Quantitation of the number of times a particular tag is observed provides the expression level of the corresponding transcript.
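As a rough check on the first principle, the number of distinct tags of a given length is 4 raised to that length. The sketch below only does this arithmetic; the function name is illustrative and not part of any SAGE software.

```python
def tag_space(length: int) -> int:
    """Number of distinct DNA sequences of the given length (4 bases per position)."""
    return 4 ** length


# Tag lengths quoted on this slide: 10 and 14 bp.
for n in (10, 14):
    print(f"{n} bp tag: {tag_space(n):,} possible sequences")
```

A 10 bp tag can therefore take just over a million different values, which is why a tag from a defined position can in principle distinguish transcripts.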
9. PREREQUISITES:
• Extensive sequencing techniques
• Deep bioinformatic knowledge
• Powerful computer software (to assemble and analyze results from SAGE experiments)
These requirements limit the use of this sensitive technique in academic research laboratories.
10. STEPS IN BRIEF...
1. Isolate the mRNA of an input sample (e.g. a tumour).
2. Extract a small chunk of sequence from a defined position of each mRNA molecule.
3. Link these small pieces of sequence together to form a long chain (or concatemer).
11. 4. Clone these chains into a vector which
can be taken up by bacteria.
5. Sequence these chains using modern high-
throughput DNA sequencers.
6. Process this data with a computer to count
the small sequence tags.
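Step 6 amounts to tallying identical tags. A minimal sketch, assuming the tags have already been extracted from the sequence reads as plain strings; the function name and the example tags are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tag_frequencies(tags: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Map each tag to (count, fraction of all tags observed)."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: (n, n / total) for tag, n in counts.items()}


# Hypothetical tags, not real data.
example = ["AAAAAAAAAA"] * 3 + ["CCCCCCCCCC"]
for tag, (count, fraction) in tag_frequencies(example).items():
    print(tag, count, f"{fraction:.2f}")
```

The fraction is what makes libraries of different sizes comparable.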
13. SAGE TECHNIQUE (in detail)
Trap RNAs with beads
• Messenger RNAs end with a long string of "As" (adenine).
• Adenine base-pairs specifically with another nucleotide, thymine (T).
• A molecule that consists of 20 or so Ts acts like a chemical bait to capture RNAs.
• Researchers coat microscopic magnetic beads with these chemical baits, i.e. with "TTTTT" tails hanging out.
• When the contents of cells are washed past the beads, the RNA molecules are trapped.
• A magnet is used to withdraw the beads and the RNAs out of the "soup".
15. cDNA SYNTHESIS
• Double-stranded cDNA is synthesized from the extracted mRNA using a biotinylated oligo(dT) primer.
• The synthesized cDNA is immobilised on streptavidin beads.
17. ENZYMATIC CLEAVAGE OF cDNA
• The cDNA molecule is cleaved with a restriction enzyme.
• A Type II restriction enzyme is used.
• It is also known as the anchoring enzyme (AE), e.g. NlaIII.
• Any enzyme with a 4-base recognition site can be used.
• The resulting cDNA fragments average 256 bp in length and carry 'sticky ends'.
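The 256 bp figure matches what a 4-base recognition site would give in random sequence: one site every 4^4 = 256 bases on average. The simulation below is only a sketch under that assumption (uniformly random bases, which real cDNA is not), using the CATG site of NlaIII; the function name and parameters are illustrative.

```python
import random


def mean_site_spacing(n_bases: int, site: str = "CATG", seed: int = 0) -> float:
    """Average distance between occurrences of `site` in a random sequence."""
    rng = random.Random(seed)
    sequence = "".join(rng.choices("ACGT", k=n_bases))
    # CATG cannot overlap with itself, so str.count() finds every occurrence.
    hits = sequence.count(site)
    if hits == 0:
        raise ValueError("site not found; use a longer sequence")
    return n_bases / hits


print("expected spacing:", 4 ** 4)
print("simulated spacing:", round(mean_site_spacing(200_000)))
```

The simulated value fluctuates around 256 from seed to seed.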
18. The biotinylated 3' cDNA fragments are affinity-purified using streptavidin-coated magnetic beads.
19. LIGATION OF LINKERS TO BOUND cDNA
• The captured cDNAs are divided into two halves, which are ligated at their ends to linkers A and B, respectively.
• Linkers are also known as 'docking modules'.
• They are oligonucleotide duplexes.
• Linkers contain:
  - NlaIII 4-nucleotide cohesive overhang
  - Type IIS recognition sequence
  - PCR primer sequence (primer A or B).
20. Type IIS restriction enzyme: 'tagging enzyme'.
Linker/docking module:
PRIMER - TE - AE - TAG
(TE = tagging enzyme site, AE = anchoring enzyme site)
21. CLEAVAGE WITH TAGGING ENZYME
• The tagging enzyme, usually BsmFI, cleaves the DNA 14-15 nucleotides from its recognition site, releasing the linker-adapted SAGE tag from each cDNA.
• The ends are repaired to make blunt-ended tags using DNA polymerase (Klenow) and dNTPs.
23. FORMATION OF DITAGS
• What is left is a collection of short tags, one taken from each cDNA molecule.
• The two pools of linker-bearing tags are ligated to each other to create 'ditags' with a linker on either end.
25. PCR AMPLIFICATION OF DITAGS
• The linker-ditag-linker constructs are amplified by PCR using primers specific to the linkers.
26. ISOLATION OF DITAGS
• The amplified product is digested again with the anchoring enzyme (AE).
• This breaks each linker off right where it was added in the beginning.
• This leaves a 'sticky' end with the sequence GTAC (or CATG on the other strand) at each end of the ditag.
27. CONCATEMERIZATION OF DITAGS
• Ditags are ligated into much longer molecules, called concatemers.
• Between each pair of ditags is the AE site, allowing the scientist and the computer to recognize where one ditag ends and the next begins.
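The role of the AE site as a punctuation mark can be sketched in a few lines. This is a simplified illustration, not the algorithm of any SAGE software: it assumes CATG as the anchoring site, a fixed 10-base tag, no CATG inside a ditag, and that the second tag of each ditag is stored as a reverse complement. All names are illustrative.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tags_from_concatemer(concatemer: str, site: str = "CATG",
                         tag_len: int = 10) -> List[str]:
    """Split a concatemer at the anchoring site and return two tags per ditag."""
    tags = []
    for ditag in concatemer.split(site):
        if len(ditag) < 2 * tag_len:
            continue  # skip empty pieces and fragments too short to hold two tags
        tags.append(ditag[:tag_len])                        # first tag, read as is
        tags.append(reverse_complement(ditag[-tag_len:]))   # second tag, other strand
    return tags


# Hypothetical tags used to build a one-ditag concatemer.
tag_a, tag_b = "AAACCCGGGT", "TTTGGACCAA"
concatemer = "CATG" + tag_a + reverse_complement(tag_b) + "CATG"
print(tags_from_concatemer(concatemer))
```

Real pipelines also handle variable ditag lengths, duplicate ditags, and sequencing errors, which this sketch ignores.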
28. CLONING CONCATEMERS AND SEQUENCING
• Lots of copies are required, so the concatemers are put into bacteria, which act like living "copy machines" to create millions of copies from the original.
• These copies are then sequenced, using machines that can read the nucleotides in DNA. The result is a long list of nucleotides that has to be analyzed by computer.
• The analysis does several things: it counts the tags, determines which ones come from the same RNA molecule, and figures out which ones come from known, well-studied genes and which ones are new.
30. How does SAGE work?
1. Isolate mRNA.
2.(a) Add biotin-labeled dT primer.
2.(b) Synthesize ds cDNA.
3.(a) Bind to streptavidin-coated beads.
3.(b) Cleave with 'anchoring enzyme'.
3.(c) Discard loose fragments.
4.(a) Divide into two pools and add linker sequences.
4.(b) Ligate.
5. Cleave with 'tagging enzyme'.
6. Combine pools and ligate.
7. Amplify ditags, then cleave with anchoring enzyme.
8. Ligate ditags.
9. Sequence and record the tags and frequencies.
31. Vast amounts of data are produced, which must be sifted and ordered before useful information becomes apparent.
SAGE reference databases:
• SAGEmap
• SAGE Genie
http://www.ncbi.nlm.nih.gov/cgap
33. FROM TAGS TO GENES...
• Collect sequence records from GenBank.
• Assign sequence orientation (by finding the poly-A tail or poly-A signal, or from annotations).
• Extract the 10 bases adjacent to the 3'-most CATG.
• Assign a UniGene identifier to each sequence with a SAGE tag.
• Record (for each tag-gene pair):
  - number of sequences with this tag
  - number of sequences in the gene cluster with this tag
Maps available at http://www.ncbi.nlm.nih.gov/SAGE
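The tag extraction step can be sketched as follows. This is an illustration only, assuming the input is the sense strand written 5' to 3' with the poly-A tail already trimmed; it is not the SAGEmap implementation, and the function name and example sequence are made up.

```python
from typing import Optional


def extract_sage_tag(transcript: str, site: str = "CATG",
                     tag_len: int = 10) -> Optional[str]:
    """Return the tag_len bases immediately 3' of the 3'-most anchoring site."""
    seq = transcript.upper()
    pos = seq.rfind(site)  # rfind gives the last (3'-most) occurrence
    if pos == -1:
        return None  # no anchoring site: this transcript yields no tag
    start = pos + len(site)
    tag = seq[start:start + tag_len]
    # A site too close to the 3' end cannot supply a full-length tag.
    return tag if len(tag) == tag_len else None


# Hypothetical sequence with two CATG sites; only the last one is used.
print(extract_sage_tag("GGCATGAAACATGTTTGGGCCCAAATT"))
```

Transcripts without a CATG site return no tag, which is one way mRNA species can be lost from a SAGE library.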
35. DIFFERENTIAL GENE EXPRESSION BY SAGE
• Identification of differentially expressed genes in samples from different physiological or pathological conditions.
• Application of many statistical methods:
  - Poisson approximation
  - Bayesian method
  - Chi-square test.
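The slides name these tests without giving details. As one illustration, a Pearson chi-square test on a 2x2 table (this tag versus all other tags, in library A versus library B) can be written with the standard library alone. This is a sketch without continuity correction or multiple-testing adjustment; the function name and the example counts are hypothetical.

```python
import math
from typing import Tuple


def chi_square_2x2(count_a: int, total_a: int,
                   count_b: int, total_b: int) -> Tuple[float, float]:
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    table = [[count_a, total_a - count_a],
             [count_b, total_b - count_b]]
    rows = [total_a, total_b]
    cols = [count_a + count_b, (total_a - count_a) + (total_b - count_b)]
    grand = total_a + total_b
    if cols[0] == 0 or cols[1] == 0:
        return 0.0, 1.0  # tag absent (or the only tag) in both libraries
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / grand
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: a tag seen 10 times in 10,000 tags vs 100 times in 10,000.
chi2, p = chi_square_2x2(10, 10_000, 100, 10_000)
print(f"chi2 = {chi2:.2f}, p = {p:.3g}")
```

In a real experiment thousands of tags are tested at once, so the p-values would need correction for multiple comparisons.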
36. • SAGE software searches GenBank for matches to each tag.
• This allows assignment of tags to 3 categories:
  - mRNAs derived from known genes
  - anonymous mRNAs, also known as expressed sequence tags (ESTs)
  - mRNAs derived from currently unidentified genes
37. SAGE VS MICROARRAY
• SAGE: an open system which detects both known and unknown transcripts and genes.
38. COMPARISON...
SAGE
• Detects the 3' region of the transcript. The restriction site is the determining factor.
• Collects sequence information and copy number.
• Subject to sequencing error and quantitation bias.
MICROARRAY
• Targets various regions of the transcript. Base composition determines the specificity of hybridization.
• Collects fluorescent signals and signal intensity.
• Subject to labeling bias and noise signals.
39. Contd...
Features | SAGE | Microarray
Detects unknown transcripts | Yes | No
Quantification | Absolute measure | Relative measure
Sensitivity | High | Moderate
Specificity | Moderate | High
Reproducibility | Good for higher abundance transcripts | Good for data from intra-platform comparison
Direct cost | 5-10X higher than arrays | 5-10X lower than SAGE
40. RECENT SAGE APPLICATIONS
• Analysis of yeast transcriptome
• Gene expression profiles in normal and cancer cells
• Insights into p53-mediated apoptosis
• Identification and classification of p53-regulated genes
• Analysis of human transcriptomes
• Serial microanalysis of renal transcriptomes
• Genes expressed in human tumor endothelium
• Analysis of colorectal metastases (PRL-3)
• Characterization of gene expression in colorectal adenomas and cancer
• Using the transcriptome to analyze the genome (Long SAGE)
41. LIMITATIONS
• Does not measure the actual expression level of a gene.
• The average size of a tag produced during SAGE analysis is ten bases, which makes it difficult to assign a tag to a specific transcript with accuracy.
• Two different genes could have the same tag, and the same gene, if alternatively spliced, could have different tags at the 3' ends.
• Assigning each tag to an mRNA transcript becomes even more difficult and ambiguous if sequencing errors are also introduced in the process.
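The shared-tag problem is easy to detect once each transcript's tag is known. A minimal sketch, assuming a precomputed mapping from transcript name to tag; the function name and the transcript names are placeholders, not real genes.

```python
from collections import defaultdict
from typing import Dict, List


def ambiguous_tags(tag_by_transcript: Dict[str, str]) -> Dict[str, List[str]]:
    """Return the tags that map to more than one transcript."""
    transcripts_by_tag = defaultdict(list)
    for transcript, tag in tag_by_transcript.items():
        transcripts_by_tag[tag].append(transcript)
    return {tag: sorted(names)
            for tag, names in transcripts_by_tag.items()
            if len(names) > 1}


# Placeholder data: two transcripts share one tag.
example = {
    "transcript_1": "TTTGGGCCCA",
    "transcript_2": "TTTGGGCCCA",
    "transcript_3": "AAACCCGGGT",
}
print(ambiguous_tags(example))
```

Counts for a tag flagged this way cannot be attributed to a single transcript without additional evidence.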
42. • Quantitation bias:
  - Contamination with large quantities of linker-dimer molecules.
  - Low efficiency of blunt-end ligation.
  - Amplification bias.
• Depending upon the anchoring enzyme and tagging enzyme used, some fraction of mRNA species will be lost.
43. Advances over SAGE
• Generation of longer 3' cDNA from SAGE tags for gene identification (GLGI)
• Long SAGE
• Cap Analysis of Gene Expression (CAGE)
• Gene Identification Signature (GIS)
• SuperSAGE
• Digital karyotyping
• Paired-end ditag
44. Long SAGE
• Increases the specificity of SAGE tags for transcript identification and SAGE tag mapping.
• Collects tags of 21 bp.
• Uses a different Type IIS restriction enzyme, MmeI.
• Adapts the SAGE principle to genomic DNA.
• Allows localisation of transcription initiation sites (TIS) and polyadenylation sites (PAS).
46. CAGE (Cap Analysis of Gene Expression)
• Aims to identify TIS and promoters.
• Collects 21 bp from the 5' ends of cap-purified cDNA.
• Used in mouse and human transcriptome studies.
• The method essentially uses full-length cDNAs, to the 5' ends of which linkers are attached.
• This is followed by cleavage of the first 20 base pairs by a class IIS restriction enzyme, PCR, concatemerization, and cloning of the CAGE tags.
48. Micro SAGE
• Requires 500-5000 fold less starting input RNA.
• Simplifies the protocol by incorporating a 'one tube' procedure for all steps.
• Allows characterization of expression profiles in tissue biopsies, tumor metastases, or other cases where tissue is scarce.
• Allows generation of region-specific expression profiles of complex heterogeneous tissues.
• A limited number of additional PCR cycles are performed to generate sufficient ditag.
49. • An expression profile can be obtained from as little as 1-5 ng of mRNA.
• Comparison between the two...
Feature | SAGE | MicroSAGE
Amount of input material | 2.5-5 µg RNA | 1-5 ng of mRNA
Capture of cDNA | Streptavidin-coated magnetic beads | Streptavidin-coated PCR tube
Multiple tube vs. single tube reaction | Subsequent reactions in multiple tubes; multiple PCI extraction and ethanol precipitation steps | Single tube reaction; easy change of buffers; no PCI extraction or ethanol precipitation step; fewer manipulations
PCR | 25-28 cycles | 28 cycles followed by re-PCR on excised ditag (8-15 cycles)
50. SuperSAGE
• Increases the specificity of SAGE tags and allows the use of tags as microarray probes.
• Uses the Type III restriction enzyme EcoP15I as the tag-releasing enzyme.
• Collects 26 bp tags.
• Has been used in plant SAGE studies.
• Allows the study of gene expression in organisms for which sequence information is not available.
52. Gene Identification Signature (GIS)
• Identifies gene boundaries.
• Collects 20 bp LongSAGE tags from the 3' and 5' ends of the transcript.
• Applied to human and mouse transcription studies.
53. DIGITAL KARYOTYPING
• Analyses gene structure.
• Identifies amplifications and deletions in several cancers.
PAIRED END DITAG
• Identifies protein binding sites in the genome.
• Applied to identify p53 binding sites in the human genome.
55. 1. SAGE: A LOOKING GLASS FOR CANCER
• Deciphering pathways involved in tumorigenesis and identifying novel diagnostic tools, prognostic markers, and potential therapeutic targets.
• SAGE is one of the techniques used in the National Cancer Institute-funded Cancer Genome Anatomy Project (CGAP).
• A database with archived SAGE tag counts and on-line query tools was created; it is the largest source of public SAGE data.
• More than 3 million tags from 88 different libraries have been deposited on the National Center for Biotechnology Information/CGAP SAGEmap web site (http://www.ncbi.nlm.nih.gov/SAGE/).
56. • Several interesting patterns have emerged:
  - Cancerous and normal cells derived from the same tissue type are very similar.
  - Tumors of the same tissue of origin but of different histological type or grade have distinct gene expression patterns.
  - Cancer cells usually increase the expression of genes associated with proliferation and survival and decrease the expression of genes involved in differentiation.
• SAGE studies have been performed in patients with colon, pancreatic, lung, bladder, ovarian, and breast cancers.
• SAGE experiments have been validated in multiple tumor and normal tissue pairs using a variety of approaches, including Northern blot analysis, real-time PCR, mRNA in situ hybridization, and immunohistochemistry.
• Identification of an ideal tumor marker, e.g. matrix metalloprotease 1 is overexpressed in ovarian cancer.
58. p53: TUMOR SUPPRESSOR GENE
• p53 is thought to play a role in the regulation of cell cycle checkpoints, apoptosis, genomic stability, and angiogenesis.
• Sequence-specific transactivation is essential for p53-mediated tumor suppression.
• The analysis of transcriptomes after p53 expression has determined that p53 exerts its diverse cellular functions by influencing the expression of a large group of genes.
• Previously unidentified p53-regulated genes have been identified by SAGE analysis.
• Variability exists with regard to the extent, timing, and p53 dependence of the expression of these genes.
59. 2. IMMUNOLOGICAL STUDIES
• Only a few SAGE analyses have been applied to the study of immunological phenomena.
• SAGE analyses were conducted for human monocytes and their differentiated descendants, macrophages and dendritic cells (DC).
• The DC cDNA library represented more than 17,000 different genes. The differentially expressed genes were those encoding proteins related to cell motility and structure.
• SAGE has been applied to B cell lymphomas to analyze genes involved in BCR-mediated apoptosis: polyamine regulation is involved in apoptosis during B cell clonal deletion.
60. Contd...
• LongSAGE has been used to identify genes in T cells from SLE patients that determine commitment to the disease.
• Findings indicate that immature CD4+ T lymphocytes may be responsible for the pathogenesis of SLE.
• SAGE has been used to analyze the expression profiles of Th-1 and Th-2 cells, and has identified numerous new genes whose expression is selective for either population.
• This contributes to understanding the molecular basis of Th1/Th2-dominated diseases and to their diagnosis.
61. 3. YEAST TRANSCRIPTOME
• Yeast is widely used to clarify the biochemical and physiologic parameters underlying eukaryotic cellular functions.
• Yeast was chosen as a model organism to evaluate the power of SAGE technology.
• The most extensive SAGE profile was made for yeast.
• Analysis of the yeast transcriptome affords a unique view of the RNA components defining cellular life.
62. 4. ANALYSIS OF TISSUE TRANSCRIPTOMES
• Used to analyze the transcriptomes of renal, cervical, and other tissues.
• Establishing a baseline of gene expression in normal tissue is key for identifying changes in cancer.
• Specific gene expression profiles were obtained, and known markers (e.g., uromodulin in the thick ascending limb of Henle's loop and aquaporin-2 in the collecting duct) were found.