Transcription. Chen Yonggang, Zhejiang Univ. School of Medicine
What you should already know: DNA, RNA, protein; transcription and translation; nucleotides and amino acids
DNA is the carrier of genetic information from one cell to all of its progeny. DNA is composed of nitrogen-containing bases (A, T, G, C) attached to deoxyribose sugars, which are linked by a phosphate backbone. DNA is transcribed into RNA by RNA polymerase on the basis of complementary base pairing. RNA is composed of nitrogen-containing bases (A, U, G, C) attached to ribose sugars, which are linked by a phosphate backbone. The “T” of DNA codes the same as the “U” of RNA; the only difference between the two bases is that T has a methyl group at the 5-position. The primary RNA transcript, which contains both coding regions (exons) and intervening, noncoding sequences (introns), is processed by splicing to yield a mature mRNA containing a 3’ polyA tail. The mature mRNA is then translated into a sequence of amino acids on the ribosome using three-letter codons, each of which directs the use of a specific tRNA charged with a specific amino acid. Four bases can generate a total of 64 unique 3-base codons. Only about 20 amino acids are needed for protein synthesis, so multiple codons can encode the same amino acid. Some codons provide “punctuation”, serving as either initiation or termination signals. Some proteins carry additional amino acid sequences (signals) that aid in the delivery of proteins to the correct cellular location. Proteins can serve as structural elements, provide physiological functions, and serve as enzymes in the catalysis of biochemical reactions.
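The base-pairing rule and the codon count in the paragraph above can be checked with a few lines of Python. This is only a minimal sketch: the function name `transcribe` and the example template sequence are made up for illustration, and the template strand is assumed to be written 3’ to 5’ so that the RNA comes out 5’ to 3’.

```python
from itertools import product

# Complementary pairing used during transcription (DNA template base -> RNA base).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


# All possible three-letter codons that can be built from the four RNA bases.
codons = ["".join(triplet) for triplet in product("ACGU", repeat=3)]

if __name__ == "__main__":
    print(transcribe("TACGGT"))  # hypothetical template fragment
    print(len(codons))           # 4 ** 3 possible codons
```

With 64 codons and only about 20 amino acids plus stop signals, the redundancy of the genetic code follows directly from the counting.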
Additional complexities (DNA, RNA, protein; transcription, translation, degradation): Transcription is controlled at multiple levels, including regulation of and by transcription factors and chromatin structure (next lectures, Dr. Neidigh). Additional levels of complexity have been discovered with respect to RNA processing and turnover. It was previously thought that, once formed, most proteins existed for the life of the cell. It is now known that degradation and protein turnover occur regularly and that human disease can result from defects in protein turnover.
Comparison of prokaryotic and Eukaryotic mRNA molecules: Mol. Biol. Of the Cell, Alberts et al., 2002.
Overview of mRNA synthesis (figure labels): initiation; capping and elongation; polyadenylation; splicing; nuclear export
Three types of Eukaryotic promoters:
RNA polymerase (pol) I synthesizes the large rRNA (18S and 28S)
RNA polymerase (pol) II synthesizes mRNA
Transcription factors direct tissue-specific gene expression by binding to their target sequences
RNA polymerase (pol) III synthesizes tRNA and 5S rRNA.
Figure (modified from Tamura T.-A. et al., 1996): promoter elements for the three RNA polymerases. Labels: upstream control element, core promoter element; core promoter, initiator, TATA-box, GC-box, CCAAT-box, repressor; Box A, Box C, Box A, Box B, octamer, proximal sequence element, TATA-box; products: 5S RNA, tRNA, U6/H1 RNA.
100+ deaths/year worldwide
No antidote has been developed
Toxin is α-amanitin (cyclic octapeptide)
Binds (K = 10 nM) to RNA pol II
Binds (K = 1 µM) to RNA pol III
Blocks synthesis of mRNA precursors by inhibiting the elongation phase of RNA synthesis.
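To see what the 100-fold difference between the two binding constants means in practice, the sketch below estimates the fraction of each polymerase occupied by toxin. It assumes that K on the slide is an equilibrium dissociation constant, that binding is simple 1:1, and that free toxin is in excess over enzyme; none of this is stated in the slides, and the function name `fraction_bound` is hypothetical.

```python
# Binding constants from the slide, assumed here to be dissociation constants.
K_POL_II_NM = 10.0      # 10 nM
K_POL_III_NM = 1000.0   # 1 uM expressed in nM


def fraction_bound(toxin_nm: float, k_nm: float) -> float:
    """Fraction of enzyme bound at a given free toxin concentration (1:1 binding)."""
    return toxin_nm / (toxin_nm + k_nm)


if __name__ == "__main__":
    for toxin_nm in (1.0, 10.0, 100.0, 1000.0):
        pol2 = fraction_bound(toxin_nm, K_POL_II_NM)
        pol3 = fraction_bound(toxin_nm, K_POL_III_NM)
        print(f"{toxin_nm:7.1f} nM toxin: pol II {pol2:.2f}, pol III {pol3:.2f}")
```

Under these assumptions, a toxin concentration that half-saturates pol II leaves pol III almost entirely free, which is consistent with mRNA synthesis being the primary target.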
CASE 1: "Poisoning by Amanita phalloides ("deathcap") mushrooms in the Australian Capital Territory" by Geoffrey M Trim. Medical Journal of Australia 1999; vol. 171
In 1995, a 46-year-old man ate eight mushrooms which he had picked in a north Canberra suburb. He presented with vomiting and diarrhea the next day, but, as he was confident that the mushrooms were not A. phalloides, he was discharged home after receiving intravenous rehydration. He presented again two days later with hepatic and renal failure. Initial investigations revealed the following serum levels: alanine aminotransferase (ALT) > 10 000 U/L (reference range [RR], 5-55 U/L); bilirubin, 114 µmol/L (RR, 3-20 µmol/L); creatinine, 535 µmol/L (RR, 40-90 µmol/L); and prothrombin time - international normalized ratio (PT-INR) > 10. A mycologist identified A. phalloides growing in the street where the patient had picked the mushrooms and also identified the mushroom stalks that he had discarded in a rubbish bin. The patient was transferred to a liver transplant unit but died from hepatic failure six days after mushroom ingestion. Postmortem examination of his liver revealed complete necrosis of parenchyma, with one residual island of intensely vacuolated hepatocytes with severe intra-canalicular biliary stasis. No viable hepatocytes were seen.
RNA factory concept: Mol. Biol. Of the Cell, Alberts et al., 2002.
First step in RNA processing: Capping
Formation of the 7-methylguanine 5’-5’ cap structure. The 5’ cap structure is essential for efficient pre-mRNA splicing, export, stability and translation initiation.
Three separate enzyme activities are required for cap formation.
The cap protects the RNA from 5’-exonucleolytic cleavage.
*Note the unusual 5’ – 5’ linkage *Note the 7-methylguanine Shafer B et al., Mol Cell, 25(7):2644-2649, 2005. Quiocho FA et al., Curr Opin Struct Biol, 10(1):78-86, 2000.
The number of “factors” involved in elongation is growing: you can’t keep up with it
Once it has initiated transcription, RNA polymerase does not proceed smoothly; rather, it moves jerkily*, pausing at some sequences and rapidly transcribing others.
Many elongation factors travel with the polymerase to decrease the likelihood that dissociation will occur. These factors associate shortly after initiation.
Eukaryotic RNA polymerases must contend with chromatin structure as they move along the DNA template.
Another barrier to elongation is DNA supercoiling
*Mol. Biol. Of the Cell, Alberts et al., 2002.
Exons (coding regions-expressed sequences)
Introns (noncoding regions-intervening sequences)
Exonic and intronic cis elements known as splicing enhancers and silencers are involved in both constitutive and alternative splicing.
RNA splicing: The average human gene contains 8.8 exons with a mean size of 145 nucleotides. The mean intron length is 3365 nucleotides, and the 5’ and 3’ untranslated regions (UTRs) are 770 and 300 nucleotides, respectively. More than 90% of the pre-mRNA is removed as introns. Though this seems wasteful, this removal enables eukaryotes to increase the coding potential of their genomes. The introns are removed through a process called splicing. The 5’ splice site in higher eukaryotes conforms to the consensus sequence AG/GURAGU, where “/” is the cut site, R = purine and Y = pyrimidine. The 3’ splice site is characterized by the sequence YAG/ and is preceded by a stretch of pyrimidine residues in most vertebrate introns. Another sequence element, the branch site, is usually located 18 to 40 nucleotides upstream of the 3’ splice site.
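The claim that more than 90% of the pre-mRNA is removed can be checked against the mean values quoted above. The calculation below is a rough sketch that assumes a gene with n exons has n - 1 introns and that the quoted exon size already covers any UTR sequence; the slides do not say whether the UTR lengths are counted separately.

```python
# Mean values for a human gene, as quoted on the slide.
MEAN_EXON_COUNT = 8.8
MEAN_EXON_NT = 145
MEAN_INTRON_NT = 3365

# Assumption: n exons are separated by n - 1 introns.
mean_intron_count = MEAN_EXON_COUNT - 1

exon_nt = MEAN_EXON_COUNT * MEAN_EXON_NT        # total exonic nucleotides
intron_nt = mean_intron_count * MEAN_INTRON_NT  # total intronic nucleotides
intron_fraction = intron_nt / (exon_nt + intron_nt)

if __name__ == "__main__":
    print(f"exonic nt:   {exon_nt:.0f}")
    print(f"intronic nt: {intron_nt:.0f}")
    print(f"fraction removed as introns: {intron_fraction:.1%}")
```

With these averages the intronic share comes out at roughly 95%, in line with the "more than 90%" figure.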
Multicomponent complex consisting of five small nuclear ribonucleoproteins (snRNPs), U1, U2, U4, U5, and U6, and more than 100 other proteins.
Via multiple RNA/RNA, RNA/protein, and protein/protein interactions, the spliceosome recognizes exon-intron boundaries and catalyzes two sequential trans-esterification reactions that remove introns and ligate exons.
Transesterification:

$$\mathrm{R{-}C({=}O){-}OR'} + \mathrm{R''O{-}H} \;\overset{\mathrm{H^{+}\ or\ OR''^{-}}}{\rightleftharpoons}\; \mathrm{R{-}C({=}O){-}OR''} + \mathrm{R'O{-}H}$$

In the esterification of an acid, an alcohol acts as a nucleophilic reagent; in hydrolysis of an ester, an alcohol is displaced by a nucleophilic reagent. Also, one alcohol is capable of displacing another from an ester. This alcoholysis (cleavage by an alcohol) of an ester is called transesterification. Morrison & Boyd, 1987
5’ splice site marks the exon/intron junction and includes a GU dinucleotide at the intron end encompassed within a larger, less conserved consensus sequence.
At the other end of the intron, the 3’ splice site region has three conserved sequence elements: the branch point, followed by a polypyrimidine tract, followed by a terminal AG at the extreme 3’ end of the intron.
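Figure: splicing reaction scheme. Labels: pre-mRNA with 5’ splice site (pGU), branch point (A), polypyrimidine tract ((Py)n) and 3’ splice site (AGp); products of the first and second transesterification steps.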
Spliceosome assembly (figure): Complex E (U1, SF1, U2AF65, U2AF35); Complex A (U1, U2, U2AF65, U2AF35); Complex B (U1, U2, U4, U6, U5); rearrangement to Complex C and catalysis: U1 snRNP is replaced with U6, and U1 and U4 are lost from the complex.
The mammalian consensus sequences at the 5’ splice site and the 3’ splice site in the pre-mRNA: The 5’ splice site is defined by the consensus sequence MAG/GURAGU (M = A or C; R = A or G; the / indicates the exon-intron junction). The 3’ splice site is defined by three sequence elements going 5’ to 3’: the branch site (YNYURAC, where A indicates the adenosine used to form the lariat intermediate structure during splicing; Y = U or C; N = A, G, U or C), the polypyrimidine tract, and the 3’ splice site consensus (YAG/G; Y = U or C). The branchpoint consensus sequence is usually located 18 to 38 nucleotides upstream of the 3’ splice site.
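The consensus sequences above use degenerate letters (M, R, Y, N), which map naturally onto regular-expression character classes. The sketch below scans an RNA string for exact matches to the 5’ splice site and branch site consensus. The function names and test sequences are hypothetical, and requiring an exact consensus match is a simplifying assumption: real splice sites often deviate from the consensus.

```python
import re

# Degenerate nucleotide codes used on the slide, as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "M": "[AC]", "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus such as 'MAGGURAGU' into a regex pattern."""
    return "".join(IUPAC[letter] for letter in consensus)


def find_five_prime_sites(rna: str) -> list[int]:
    """Return the index of the first intron base for each MAG/GURAGU match."""
    pattern = re.compile("(?=(" + consensus_to_regex("MAGGURAGU") + "))")
    return [m.start() + 3 for m in pattern.finditer(rna)]  # junction follows MAG


def find_branch_points(rna: str) -> list[int]:
    """Return the index of the branch adenosine for each YNYURAC match."""
    pattern = re.compile("(?=(" + consensus_to_regex("YNYURAC") + "))")
    return [m.start() + 5 for m in pattern.finditer(rna)]  # A is the sixth base


if __name__ == "__main__":
    print(find_five_prime_sites("CCAAGGUAAGUCC"))  # hypothetical sequence
    print(find_branch_points("GGUACUAACGG"))       # hypothetical sequence
```

The lookahead form `(?=(...))` is used so that overlapping candidate sites are all reported rather than only the first one.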
Alternative Splicing (AS):
Fundamental mechanism that allows the production of structurally and functionally distinct proteins from a single coding sequence.
The human genome contains only 30-40K genes with the proteome numbering several hundred thousand.
Genome-wide analyses of AS indicate that 35%-75% of human genes may have AS forms, suggesting that AS, together with posttranslational modifications, plays a major role in proteome complexity.
Estimated that 15% of point mutations that cause human genetic disease affect splicing.
The selection of alternative splice sites can be regulated in different manners: tissue specificity, developmental stage, physiological processes, sex determination, and in response to various stress factors.
A number of signals, including stimulation of receptors by growth factors, cytokines, or hormones; depolarization; rising intracellular calcium levels; and cellular stresses like heat shock and change in pH, have been shown to induce changes in selection of splice sites.
Stamm S, Human Mol Genetics 11:(20)2409-2416, 2002; Faustino NA et al., G & D 17:419-437, 2003.
Receptor stimulation, neuronal activity, cellular stress, nutritional status, etc.
Alternative splicing can occur in any region of the nascent messenger RNA, in the 3 ’ or 5 ’ untranslated regions (UTRs) or in the protein coding sequence.
Changes in Pre-mRNA Splicing and Cancer:
Pubmed search using keywords “splicing defects” or “aberrant splicing” and “cancer” resulted in ~100 genes whose pre-mRNA splicing is altered in various types of cancer.
Several recent bioinformatics studies have revealed a vast number of potentially cancer-specific or cancer-associated splice variants.
Wang Z et al., Cancer Research 63:655-657, 2003.
Identified 845 of 26,258 AS variants significantly associated with cancer. Screened 3.47 million ESTs.
Xu Q et al., Nucleic Acids Res 31:5635-5643, 2003.
Identified 316 human genes that have cancer-specific splice variants from a screen of 2 million ESTs.
Hui L et al., Oncogene 23:3013-3023, 2004.
Identified 383 potentially tumor-associated splice variants by aligning ESTs with the genomic sequence of 4,322 genes.
Mechanisms of splicing defects:
Cis effects: mutations that disrupt use of AS sites, inherited or somatic
Trans effects: mutations that affect the basal splicing machinery. These variations are in the composition, concentration, localization and activity of trans-acting regulatory factors
Complications! Errors in these processes are known to result in human disease:
Familial isolated growth hormone deficiency type II
The Wilms’ tumor suppressor gene (WT1) undergoes extensive alternative splicing
Frontotemporal dementia
Atypical cystic fibrosis
Spinal muscular atrophy
Myotonic dystrophy
Insertion or deletion of specific nucleotides
C to U or U to C changes, insertions or deletions of U residues and the insertion of multiple G or C residues
Apolipoprotein B (apoB)
RNA interference (RNAi)
RNA can be edited by base deamination
Humans express two forms of apoB:
ApoB-48: small intestine origin; functions in the chylomicrons to transport triacylglycerols from the intestine to the liver and peripheral tissues
ApoB-100: functions in VLDL, IDL, and LDL to transport cholesterol from the liver to the peripheral tissues; enormous 4536-residue protein
Both forms are expressed from the same gene
The two mRNAs differ by a single C to U change: the codon for Gln 2153 (CAA) in apoB-100 mRNA is, in apoB-48 mRNA, a UAA stop codon.
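The effect of this single-base edit can be illustrated with a short sketch. It deaminates the first C of the Gln 2153 codon and estimates how much of the full-length chain remains. The helper name `deaminate_c_to_u` and the two-entry codon table are illustrative only, and the residue count assumes that translation simply stops at the edited codon, so that residues 1 to 2152 are made.

```python
# Only the two codons relevant to the apoB editing site are listed here.
CODON_MEANING = {"CAA": "Gln", "UAA": "Stop"}

APOB100_RESIDUES = 4536   # full-length protein, from the slide
EDITED_CODON_NUMBER = 2153


def deaminate_c_to_u(codon: str, position: int) -> str:
    """Return the codon after C-to-U editing at the given 0-based position."""
    if codon[position] != "C":
        raise ValueError("only a C can be deaminated to U")
    return codon[:position] + "U" + codon[position + 1:]


edited_codon = deaminate_c_to_u("CAA", 0)

# Assumption: the truncated product ends just before the new stop codon.
truncated_residues = EDITED_CODON_NUMBER - 1
percent_of_full_length = 100 * truncated_residues / APOB100_RESIDUES

if __name__ == "__main__":
    print(edited_codon, CODON_MEANING[edited_codon])
    print(f"{truncated_residues} residues, {percent_of_full_length:.1f}% of apoB-100")
```

Under that assumption the edited mRNA yields a protein a little under half the length of apoB-100.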
What is RNA interference (RNAi)?
A process of post-transcriptional gene silencing that depends on long double-stranded RNA and is associated with virus resistance, developmental control and heterochromatin formation
Double-stranded RNA introduced into the cytoplasm is cleaved by the RNase III-like enzymes Dicer or Drosha into 19-26 nt RNAs, which serve as guides for targeted mRNA degradation
The resulting siRNAs are incorporated into a nuclease complex, the RNA-induced silencing complex (RISC), which targets and cleaves mRNA that is complementary to the siRNA
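Target recognition by base pairing can be sketched as a search for the reverse complement of the siRNA guide strand within an mRNA. The sequences and function names below are invented for illustration, and the sketch assumes perfect complementarity over the whole guide, which is a simplification of how RISC selects targets.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_target_sites(mrna: str, guide: str) -> list[int]:
    """Return start positions in the mRNA that pair perfectly with the guide."""
    site = reverse_complement(guide)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    target_site = "AUGGCUAGCUAGGAUCCGAUC"    # hypothetical 21 nt site
    guide = reverse_complement(target_site)  # guide strand that pairs with it
    mrna = "GGG" + target_site + "AAAAAA"    # hypothetical mRNA
    print(guide)
    print(find_target_sites(mrna, guide))
```

The 21 nt example falls within the 19-26 nt size range given for siRNAs above.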
Figure (modified from Wall NR and Shi Y, Lancet, 2003): RNAi pathway. Labels: double-stranded RNA (dsRNA); Dicer (ATP dependent); 19-26 nt siRNA with 5’-p and 3’-OH ends; siRNA/protein (RISC) complex formation (Gemin3, Gemin4, eIF2C); helicase unwinding (?); target recognition by base pairing with the capped (7mGpppG), polyadenylated mRNA; target cleavage.
Figure: pBS/U6/siRNA vector (3.4 kb) for expressing siRNA from a U6/H1 promoter. Labels: oligo 1a/1b and oligo 2a/2b, with ends for HindIII and EcoRI; transcribed siRNA drawn as a hairpin with a 3’ (U)n tail; restriction sites KpnI, BamHI, HindIII, EcoRI, BamHI.
The Power of RNAi
RNAi can be used to induce a loss-of-function phenotype for any gene, without recourse to labor-intensive, traditional genetic methods.
RNAi can be used to verify or complement existing genetic mutations.
RNAi pathway is widely conserved and has been demonstrated in such widely diverged eukaryotes as Neurospora, trypanosomes, and mammals
Though many researchers use RNAi as a tool, the actual mechanism by which it works remains incompletely understood.
Polyadenylation: A final step in the synthesis of the mRNA molecule is the addition of the poly-A tail. The mRNA sequence itself provides the signals that determine the site of polyadenylation. The AAUAAA element lies 20-30 nucleotides upstream of the cleavage site, where the poly-A tail is added. Addition of the poly-A tail (approximately 200 A nucleotides) is essential to protect the RNA from 3’ hydrolytic enzymes. Why would a poly-T column be used in an experiment to look at the effects of a drug on gene expression?
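The positional rule for the polyadenylation signal can be turned into a simple scan. The sketch below locates each AAUAAA element and reports the window 20-30 nucleotides downstream in which cleavage would be expected. Measuring the distance from the end of the AAUAAA element is an assumption made here, since the slide does not say where the distance is counted from; the function name and test sequence are hypothetical.

```python
POLYA_SIGNAL = "AAUAAA"
CLEAVAGE_WINDOW = (20, 30)  # nucleotides downstream of the signal, from the slide


def find_polya_signals(rna: str) -> list[tuple[int, int, int]]:
    """Return (signal_start, window_start, window_end) for each AAUAAA found.

    Window positions are 0-based indices counted from the end of the signal
    (an assumption) and are not clipped to the sequence length.
    """
    hits = []
    start = rna.find(POLYA_SIGNAL)
    while start != -1:
        signal_end = start + len(POLYA_SIGNAL)
        hits.append((start,
                     signal_end + CLEAVAGE_WINDOW[0],
                     signal_end + CLEAVAGE_WINDOW[1]))
        start = rna.find(POLYA_SIGNAL, start + 1)
    return hits


if __name__ == "__main__":
    transcript = "GGG" + POLYA_SIGNAL + "C" * 40  # hypothetical 3' end
    print(find_polya_signals(transcript))
```

A real analysis would also weigh downstream sequence elements and signal variants, which this sketch ignores.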
PolyA binding proteins (PABP): protecting the ends and directing traffic