Transcription, translation and genetic code (cell biology) by Welfredo Yu
Cells are governed by a cellular chain of
DNA → RNA → protein
Is the synthesis of RNA under the direction of
Produces messenger RNA (mRNA)
Is the actual synthesis of a polypeptide, which
occurs under the direction of mRNA
Occurs on ribosomes
The Genetic Code
a non-overlapping sequence, with each amino
acid plus polypeptide initiation and termination
specified by RNA codons composed of three
Genes can be expressed at
• Gene A is transcribed
much more efficiently than
• This allows the amount of
protein A in the cell to be
greater than protein B
• The lower expression of
gene B is a reason behind
RNA is the bridge between
genes and the proteins for
which they code.
what is RNA?
Monomers of proteins are amino
what are proteins made
• During transcription, one of the two DNA
strands, called the template strand, provides
a template for ordering the sequence of
complementary nucleotides in an RNA
– The template strand is always the same strand for
a given gene
– However, different genes may be on opposite
• The genetic code is nearly
universal, shared by the simplest
bacteria to the most complex
– Some species prefer certain
codons (codon bias)
• Genes can be transcribed and
translated after being
transplanted from one species
EVOLUTION OF THE CODE
History: linking genes
1900’s Archibald Garrod
Inborn errors of metabolism: inherited human
metabolic diseases
Genes are the inherited factors
Enzymes are the biological molecules that drive
Enzymes are proteins
How do the inherited factors, the genes, control the
structure and activity of enzymes (proteins)?
History: linking genes
Beadle and Tatum (1941) PNAS USA 27, 499–506.
If genes control structure and activity of metabolic enzymes,
then mutations in genes should disrupt production of
required nutrients, and that disruption should be heritable.
Isolated ~2,000 strains from single irradiated spores
(Neurospora) that grew on rich but not minimal medium.
Examples: defects in B1, B6 synthesis.
Genes govern the ability to synthesize amino acids, purines
History: linking genes
1950s: sickle-cell anemia
Glu to Val change in hemoglobin
Sequence of nucleotides in gene determines sequence
of amino acids in protein
Single amino acid change can alter the function of the
Tryptophan synthase gene in E. coli
Mutations resulted in single amino acid change
Order of mutations in gene same as order of affected
From gene to protein:
Gene sequence (DNA) recopied or transcribed to
Product of transcription is a messenger molecule that
delivers the genetic instructions to the protein
synthesis machinery: messenger RNA (mRNA)
Brenner, S., Jacob, F. and Meselson, M. (1961) Nature
Question: How do genes work?
Does each one encode a different type of ribosome
which in turn synthesizes a different protein, OR
Are all ribosomes alike, receiving the genetic
information to create each different protein via some
kind of messenger molecule?
E. coli cells switch from making bacterial proteins to
phage proteins when infected with bacteriophage T4.
Grow bacteria on medium containing “heavy”
nitrogen (15N) and carbon (13C).
Infect with phage T4.
Immediately transfer to “light” medium containing
If genes encode different ribosomes, the newly
synthesized phage ribosomes will be “light”.
If genes direct new RNA synthesis, the RNA will contain
Ribosomes from phage-infected cells were “heavy”,
banding at the same density on a CsCl gradient as the
Newly synthesized RNA was associated with the heavy
New RNA hybridized with viral ssDNA, not bacterial
Expression of phage DNA results in new phage-
specific RNA molecules (mRNA)
These mRNA molecules are temporarily associated
Ribosomes do not themselves contain the genetic
directions for assembling individual proteins
ribonucleoside 5´ triphosphates:
ATP, GTP, CTP and UTP
bases are adenine, guanine, cytosine and uracil
sugar is ribose (not deoxyribose)
DNA-dependent RNA polymerase
Template (antisense) DNA strand
Features of transcription:
RNA polymerase catalyzes sugar-phosphate bond
between 3´-OH of ribose and the 5´-PO4.
Order of bases in DNA template strand determines
order of bases in transcript.
Nucleotides are added to the 3´-OH of the growing
RNA synthesis does not require a primer.
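The base-ordering rule above can be sketched in a few lines of Python. This is only an illustration, assuming the template strand is written 3´ to 5´ so that the transcript comes out 5´ to 3´; the function name and the example sequence are hypothetical.

```python
# Base added to the RNA opposite each base of the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())

template = "TACAGTGGT"  # hypothetical template strand, written 3'->5'
print(transcribe(template))  # AUGUCACCA
```

The transcript has the same sequence as the non-template DNA strand, with U in place of T.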
In prokaryotes transcription and translation are
coupled. Proteins are synthesized directly from the
primary transcript as it is made.
In eukaryotes transcription and translation are
separated. Transcription occurs in the nucleus, and
translation occurs in the cytoplasm on ribosomes.
Figure comparing eukaryotic and prokaryotic
transcription and translation.
DNA template, ribonucleoside 5´ triphosphates, and Mg2+
Synthesizes RNA in 5´ to 3´ direction
E. coli RNA polymerase consists of 5 subunits
Eukaryotes have three RNA polymerases
RNA polymerase II is responsible for transcription of
protein-coding genes and some snRNA molecules
RNA polymerase II has 12 subunits
Requires accessory proteins (transcription factors)
Does not require a primer
Transcription factors bind to promoter sequences and
recruit RNA polymerase.
DNA is bound first in a closed complex. Then, RNA
polymerase denatures a 12–15 bp segment of the DNA
The site where the first base is incorporated into the
transcript is numbered “+1” and is called the
transcription start site.
Transcription factors that are required at every promoter
site for RNA polymerase interaction are called basal
Promoter recognition: promoter sequences
Promoter sequences vary considerably.
RNA polymerase binds to different promoters with
different strengths; binding strength relates to the
level of gene expression
There are some common consensus sequences for
Example: E. coli –35 sequence (found 35 bases 5´ to the
start of transcription)
Example: E. coli –10 sequence, or Pribnow box (found 10 bases 5´ to the
start of transcription)
Eukaryotic genes may also have enhancers.
Enhancers can be located at great distances from the
gene they regulate, either 5´ or 3´ of the transcription
start, in introns or even on the noncoding strand.
One of the most common ways to identify promoters
and enhancers is to use a reporter gene.
Many proteins can regulate gene expression by
modulating the strength of interaction between the
promoter and RNA polymerase.
Some proteins can activate transcription (upregulate
Some proteins can inhibit transcription by blocking
Some proteins can act both as repressors and
activators of transcription.
RNA polymerase locally denatures the DNA.
The first base of the new RNA strand is placed
complementary to the +1 site.
RNA polymerase does not require a primer.
The first 8 or 9 bases of the transcript are linked.
Transcription factors are released, and the
polymerase leaves the promoter region.
Figure of bacterial transcription initiation.
RNA polymerase moves along the transcribed or
template DNA strand.
The new RNA molecule (primary transcript) forms a
short RNA-DNA hybrid molecule with the DNA
Most is known about bacterial chain termination
Termination is signaled by a sequence that can form
a hairpin loop.
The polymerase and the new RNA molecule are
released upon formation of the loop.
Prokaryotes: mRNA transcribed directly from DNA
template and used immediately in protein synthesis
Eukaryotes: primary transcript must be processed to
produce the mRNA
Noncoding sequences (introns) are removed
Coding sequences (exons) spliced together
5´-methylguanosine cap added
3´-polyadenosine tail added
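The splicing step can be sketched as simple string slicing. This is a toy model only: the transcript and the exon coordinates are hypothetical, and capping and polyadenylation are left out.

```python
def splice(primary_transcript: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments (start, end) of a primary transcript, dropping introns."""
    return "".join(primary_transcript[start:end] for start, end in exons)

# Hypothetical primary transcript: exon 1 + intron (begins GU, ends AG) + exon 2.
exon1, intron, exon2 = "AUGUCA", "GUAAGUUUUCAG", "CCAUAA"
primary = exon1 + intron + exon2
exons = [(0, len(exon1)), (len(exon1) + len(intron), len(primary))]

mrna = splice(primary, exons)
print(mrna)  # AUGUCACCAUAA
```

Passing a different list of exon coordinates to the same function gives a different mRNA, which is the idea behind alternative splicing.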
Removal of introns and splicing of exons can occur
For introns within a nuclear transcript, a spliceosome is
Spliceosomes: protein and small nuclear RNA (snRNA)
Specificity of splicing comes from the snRNA, some of
which contain sequences complementary to the splice
junctions between introns and exons
Alternative splicing can produce different forms of a protein
from the same gene
Mutations at the splice sites can cause disease
Thalassemia • Breast cancer (BRCA 1)
RNA splicing inside the nucleus on particles called
Spliceosomes are composed of proteins and small
RNA molecules (100–200 nt; snRNA).
Both proteins and RNA are required, but some
suggesting that RNA can catalyze the splicing
Self-splicing in Tetrahymena: the RNA catalyzes its
Catalytic RNA: ribozymes
From gene to protein:
Information travels from DNA to RNA to Protein
Is there a one-to-one correspondence between DNA,
RNA and Protein?
DNA and RNA each have four nucleotides that can form
them; so yes, there is a one-to-one correspondence between
DNA and RNA.
Proteins can be composed of a potential 20 amino acids;
only four RNA nucleotides: no one-to-one correspondence.
How then does RNA direct the order and number of amino
acids in a protein?
From gene to protein:
How many bases are required for each amino acid?
(4 bases)^(2 bases/aa) = 16 amino acids: not enough
(4 bases)^(3 bases/aa) = 64 amino acid possibilities
Minimum of 3 bases/aa required
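The counting argument above can be checked with a short calculation. The function name is hypothetical; the numbers 4 and 20 come from the text.

```python
def min_codon_length(n_bases: int = 4, n_amino_acids: int = 20) -> int:
    """Smallest codon length whose combinations can cover every amino acid."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length

for length in (1, 2, 3):
    print(length, "bases/aa ->", 4 ** length, "combinations")
print("minimum codon length:", min_codon_length())  # 3
```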
What is the nature of the code?
Does it have punctuation? Is it overlapping?
Crick, F.H. et al. (1961) Nature 192, 1227–32.
3-base, nonoverlapping code that is read from a fixed
From gene to protein:
Nirenberg and Matthaei: in vitro protein translation
Found that adding rRNA prolonged cell-free protein
Adding artificial RNA synthesized by polynucleotide
phosphorylase (no template, UUUUUUUUU)
stimulated protein synthesis more
The protein that came out of this reaction was
polyphenylalanine (UUU = Phe)
Other artificial RNAs: AAA = Lys; CCC = Pro
From gene to protein:
Triplet binding assay: add triplet RNA, ribosomes, binding
factors, GTP, and radiolabeled charged tRNA (figure)
UUU trinucleotide binds to Phe-tRNA
UGU trinucleotide binds to Cys-tRNA
By fits and starts the triplet genetic code was worked out.
Each three-letter “word” (codon) specifies an amino acid or
directions to stop translation.
The code is redundant or degenerate: more than one way to
encode an amino acid
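Degeneracy can be made concrete by counting codons per amino acid. The sketch below builds the standard codon table from a compact 64-letter string (one-letter amino acid codes, "*" for stop, bases in the order U, C, A, G); that encoding is a common convention and is assumed here rather than taken from these slides.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Count how many codons specify each amino acid (stop codons excluded).
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(codons_per_aa))          # 20 amino acids
print(sum(codons_per_aa.values())) # 61 sense codons
print(codons_per_aa["R"])          # 6 codons for arginine
print(codons_per_aa["M"])          # 1 codon for methionine
```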
From gene to protein:
Components required for translation:
Aminoacyl tRNA synthetases
Initiation, elongation and termination factors
Ribosome small subunit binds to mRNA
Charged tRNA anticodon forms base pairs with the
Small subunit interacts with initiation factors and
special initiator tRNA that is charged with
mRNA-small subunit-tRNA complex recruits the
Eukaryotic and prokaryotic initiation differ slightly
The large subunit of the ribosome contains three
Aminoacyl (A site)
Peptidyl (P site)
Exit (E site)
The tRNAfMet occupies the P site
A second, charged tRNA complementary to the next
codon binds the A site.
Ribosome translocates by three bases after peptide
New charged tRNA aligns in the A site
Peptide bond between amino acids in A and P sites is
Ribosome translocates by three more bases
The uncharged tRNA in the P site is moved to the E site
EF-Tu recruits charged tRNA to A site. Requires
hydrolysis of GTP
Peptidyl transferase catalyzes peptide bond
formation (bond between aa and tRNA in the P
site converted to peptide bond between the two
Peptide bond formation requires RNA and may
be a ribozyme-catalyzed reaction
Elongation proceeds until STOP codon reached
UAA, UAG, UGA
No tRNA normally exists that can form base pairing
with a STOP codon; recognized by a release factor
tRNA charged with last amino acid will remain at P
Release factors cleave the completed polypeptide from the tRNA
Ribosome subunits dissociate from each other
Def. The genetic code is the nucleotide base sequence on DNA (and
subsequently on mRNA, by transcription) which will be translated into a
sequence of amino acids of the protein to be synthesized.
The code is composed of codons.
A codon is composed of 3 bases (e.g. ACG or UAG). Each codon is
translated into one amino acid, except the stop codons.
The 4 nucleotide bases (A, G, C and U) in mRNA are used to produce the
three-base codons. There are therefore 64 (4 × 4 × 4) codons, of which 61 code for the 20 amino
acids. Since each codon codes for only one amino acid, this means
that there is more than one codon for most amino acids.
How to translate a codon (see table):
This table or dictionary can be used to translate any codon sequence.
Each triplet is read in the 5′ → 3′ direction, so the first base is the 5′ base,
followed by the middle base, then the last base, which is the 3′ base.
Examples: 5′-AUG-3′ codes for methionine
5′-UCU-3′ codes for serine
5′-CCA-3′ codes for proline
Termination (stop or nonsense) codons:
Three of the 64 codons (UAA, UAG and UGA) do not code for any amino
acid. They are termination codons: when one of them appears in the
mRNA sequence, it signals the end of protein synthesis.
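Reading a message codon by codon until a termination codon can be sketched as follows. As a simplification, the function starts at the first base rather than searching for AUG, and it reports one-letter amino acid codes (M = methionine, S = serine, P = proline). The table encoding and the example message are assumptions made for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid codes, "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') from its first base until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: synthesis ends here
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("AUGUCUCCAUAA"))  # MSP (methionine, serine, proline)
```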
Characteristics of the genetic code:
1- Specificity: the genetic code is specific, that is, a specific codon
always codes for the same amino acid.
2- Universality: the genetic code is nearly universal, that is, the same codons are
used in almost all living organisms, prokaryotes and eukaryotes alike.
3- Degeneracy: the genetic code is degenerate, i.e. although each codon
corresponds to a single amino acid, one amino acid may have more than
one codon, e.g. arginine has 6 different codons (give more examples
from the table).
Gene mutation (altering the nucleotide sequence):
1- Point mutation: a change in a single nucleotide base, seen here in the mRNA,
can lead to any of the following 3 results:
i- Silent mutation: the codon containing the changed base may code
for the same amino acid. For example, in the serine codon UCA, if A is
changed to U, giving the codon UCU, it still codes for serine. See table.
ii- Missense mutation: the codon containing the changed base may code
for a different amino acid. For example, if the serine codon UCA is
changed to CCA (U is replaced by C), it will code for proline, not
serine, leading to insertion of an incorrect amino acid into the polypeptide chain.
iii- Nonsense mutation: the codon containing the changed base may
become a termination codon. For example, the serine codon UCA becomes
UAA if C is changed to A. UAA is a stop codon, leading to termination
of translation at that point.
Types of point mutation:
UCA (codon for serine) → UAA (termination codon): nonsense mutation
UCA (codon for serine) → UCU (codon for serine): silent mutation
UCA (codon for serine) → CCA (codon for proline): missense mutation
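The three outcomes of a point mutation can be told apart by looking up both codons in the table. This is a minimal sketch that assumes the original codon is a sense codon; the function name and the table encoding are illustrative choices.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid codes, "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_point_mutation(original: str, mutant: str) -> str:
    """Classify a single-base change within one codon."""
    if len(original) != 3 or len(mutant) != 3:
        raise ValueError("codons must be three bases long")
    if sum(a != b for a, b in zip(original, mutant)) != 1:
        raise ValueError("exactly one base must differ")
    before, after = CODON_TABLE[original], CODON_TABLE[mutant]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"

print(classify_point_mutation("UCA", "UCU"))  # silent   (serine -> serine)
print(classify_point_mutation("UCA", "CCA"))  # missense (serine -> proline)
print(classify_point_mutation("UCA", "UAA"))  # nonsense (serine -> stop)
```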
Give other examples on missense mutation which leads to some Hb
2- Frame-shift mutation:
deletion or addition of one or two bases in the
message sequence, leading to a change in the
reading frame (reading sequence); the
resulting amino acid sequence may become
completely different from this point on.
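The effect of a frame shift can be seen by deleting one base from a short message and translating both versions. The message and the deleted position are hypothetical, and amino acids are shown as one-letter codes.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acid codes, "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna: str) -> str:
    """Translate from the first base until a stop codon or the end of the message."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

normal = "AUGUCACCAGGUUAA"         # AUG UCA CCA GGU UAA
shifted = normal[:3] + normal[4:]  # delete the U at index 3: AUG CAC CAG GUU AA

print(translate(normal))   # MSPG
print(translate(shifted))  # MHQV
```

Every codon after the deletion is read differently, and the original stop codon is no longer in frame.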
Components required for protein synthesis:
1- Amino acids: all amino acids involved in the
finished protein must be present at the time of
protein synthesis.
2- Ribosomes: the site of protein synthesis. They are
large complexes of protein and rRNA. In humans,
they consist of two subunits, one large (60S) and one
small (40S).
3- tRNA: at least one specific type of tRNA is required to transfer
each amino acid. There are about 50 tRNAs in humans for the 20 amino
acids, which means some amino acids have more than one specific
tRNA. The role of tRNA in protein synthesis was discussed earlier
(amino acid attachment and anticodon loop).
4- Aminoacyl-tRNA synthetase: the enzyme that catalyzes the
attachment of an amino acid to its corresponding tRNA, forming
aminoacyl-tRNA.
5- mRNA: carries the code for the protein to be synthesized.
6- Protein factors: initiation, elongation and termination (or release)
factors are required for peptide synthesis.
7- ATP and GTP: are required as sources of energy.
The initiation (start) codon is usually AUG, which is the codon for
methionine, so the initiator tRNA is methionyl tRNA (Met. tRNA).
a- The initiation factors (IF-1, IF-2 and IF-3) bind the Met. tRNA to the
small ribosomal subunit and then to the mRNA containing the code of the
protein to be synthesized. IFs recognize the mRNA by its 5′ cap.
b- This complex binds to the large ribosomal subunit, forming the initiation complex, in which
Met. tRNA is present in the P site of the 60S ribosomal subunit.
NB:- tRNA binds to mRNA by base pairing between the codon on mRNA and the anticodon
on tRNA
- mRNA is read in the 5′ → 3′ direction
P-site: the peptidyl site of the ribosome, which the methionyl tRNA enters.
2- Elongation: elongation factors (EFs) stimulate the stepwise elongation of the
polypeptide chain as follows:
a- The next aminoacyl tRNA (the tRNA which carries the next amino acid, specified by
recognition of the next codon on the mRNA) enters the A site of the ribosome.
A site, or acceptor site, or aminoacyl tRNA site:
the site of the ribosome which each new incoming aminoacyl tRNA enters.
b- The ribosomal peptidyl transferase enzyme transfers methionine
from methionyl tRNA to the A site, forming a peptide bond between
methionine and the new incoming amino acid to give dipeptidyl
tRNA.
c- Elongation factor-2 (EF-2, also called translocase) moves the
mRNA and dipeptidyl tRNA from the A site to the P site, leaving the A site free to
allow entrance of another new aminoacyl tRNA.
The figure shows the repetitive cycle of chain elongation. Each
cycle consists of:
1) codon recognition and entrance of the new aminoacyl tRNA
(amino acid carried on tRNA) into the A site,
2) transfer of the growing chain from the P site to the A site, with peptide
bond formation with the new amino acid,
3) translocation of the growing chain to the P site, leaving the A site free for
entrance of a new amino acid, and so on, resulting in
elongation of the polypeptide chain.
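The three-step cycle can be traced with a toy model that only tracks which chain sits in the P site and which amino acid arrives at the A site. It is a bookkeeping sketch, not a model of the ribosome; the function name and the example amino acids are hypothetical.

```python
def elongation_cycles(amino_acids: list[str]) -> list[str]:
    """Return the growing chain after each elongation cycle.

    The first amino acid is assumed to be the initiator already in the P site.
    """
    p_site_chain = [amino_acids[0]]
    history = []
    for incoming in amino_acids[1:]:
        a_site = incoming                        # 1) new aminoacyl tRNA enters the A site
        a_site_chain = p_site_chain + [a_site]   # 2) chain moves onto the A-site amino acid
        p_site_chain = a_site_chain              # 3) translocation: chain back in P, A site free
        history.append("-".join(p_site_chain))
    return history

print(elongation_cycles(["Met", "Ser", "Pro"]))  # ['Met-Ser', 'Met-Ser-Pro']
```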
3- Termination: occurs when one of the three stop codons (UAA, UAG or
UGA) enters the A site of the ribosome. These codons are recognized by release
factors (RFs), which are RF-1, RF-2 and RF-3. RFs cause the newly synthesized
protein to be released from the ribosomal complex and the
ribosome to dissociate from the mRNA (i.e. they cause dissolution of the complex).