• The aim of these lectures is to investigate how cells organize their DNA within
the cell nucleus, and replicate it during cell division to produce two new copies
of the genome. Cellular processes to repair damaged DNA will also be covered.
• The mechanism of DNA replication will be discussed, covering the structure
of the replication fork, how cells select sites of replication initiation, and how
they control whether and when to replicate DNA.
• The reaction mechanism catalyzed by DNA polymerases causes difficulty in
replicating the ends of linear DNA molecules. Various methods have evolved to
solve this ‘end-replication problem’. The most common involves the use of an
unusual reverse transcriptase, called telomerase.
• We will discuss genome organization: introns, exons, satellites, repetitive DNA sequences.
• How is the huge amount of genomic DNA packaged to fit within the cell
nucleus while still keeping specific sequences accessible for transcription? We
will discuss the structure of the nucleosome and higher levels of chromatin
organization and packaging.
• DNA is often damaged under normal environmental conditions. How can
cells repair their genome and what are the consequences if they cannot?
Molecular biology is the branch of biology that deals with the molecular
basis of biological activity.
This field overlaps with other areas of biology and
chemistry, particularly genetics and biochemistry.
Molecular biology chiefly concerns itself with
understanding the interactions between the various
systems of a cell, including the interactions between
DNA, RNA and protein biosynthesis, as
well as learning how these interactions are regulated.
The field of molecular biology studies macromolecules and
the macromolecular mechanisms found in living things, such
as the molecular nature of the gene and the mechanisms of
gene replication, mutation, and expression.
The genome: the totality of genetic
information, encoded in DNA
(or in RNA for some viruses).
All living things are grouped into
Eukaryotic cell: Eukaryotic cells are generally more complex than
prokaryotic cells. A eukaryotic cell
has a nucleus, which is separated from the rest of the
cell by a membrane. The nucleus contains
chromosomes, which are the carriers of the genetic
material. The genetic material is distributed among multiple
chromosomes. Eukaryotic DNA is linear and complexed with
proteins called histones.
Prokaryotes are single-celled
Without nucleus, no nuclear
DNA is naked, without
Archaea are prokaryotes
(without nucleus) but some
aspects similar to
Deoxyribonucleic acid (DNA)
carries the genetic instructions used
in the development and
functioning of all known living
organisms and some viruses.
The main role of DNA
molecules is the long-term
storage of information.
DNA is often compared to a set of blueprints or a recipe,
or a code, since it contains the instructions needed to
construct other components of cells, such as proteins and
The chromosome The storage place of
The number of
from one species to
Genes: the DNA segments
that carry this genetic
information are called
genes.
• General structure of nucleic acids:
• DNA is a long polymer made from repeating units called
nucleotides.
• The DNA chain is 22 to 26 Å wide (2.2 to 2.6 nm), and one
nucleotide unit is 3.3 Å (0.33 nm) long. Although each
individual repeating unit is very small, DNA polymers can be
very large molecules containing millions of nucleotides.
• Human chromosome 1 is approximately 220 million
base pairs long.
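As a rough worked example of the figures above, the stretched-out length of a DNA molecule can be estimated by multiplying its number of base pairs by the length of one nucleotide unit. The sketch below assumes the 0.33 nm and 220 million base pair values quoted in these notes; the function name is illustrative only.

```python
# Length of one nucleotide unit along the chain, as quoted above (0.33 nm).
RISE_NM_PER_BP = 0.33

def contour_length_mm(base_pairs, rise_nm=RISE_NM_PER_BP):
    """Estimate the stretched-out length of a DNA molecule in millimetres."""
    length_nm = base_pairs * rise_nm
    return length_nm / 1e6  # 1 mm = 1,000,000 nm

chromosome_1_bp = 220_000_000  # figure quoted in the notes
print(f"{contour_length_mm(chromosome_1_bp):.1f} mm")
```

Under these assumptions a single copy of chromosome 1 would be several centimetres long if fully extended, which is far larger than a cell nucleus and illustrates why DNA packaging is needed.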
Building Blocks - Nucleotides
A nucleotide is composed of three parts: a sugar
(ribose in RNA and deoxyribose in DNA), a base and
a phosphate group. If all phosphate groups are
removed, a nucleotide becomes a nucleoside.
The four bases found in DNA are adenine (A), cytosine (C),
guanine (G) and thymine (T).
A fifth pyrimidine base, called uracil (U), usually takes the
place of thymine in RNA and differs from thymine by
lacking a methyl group on its ring.
These bases are classified
into two types: adenine and
guanine are fused five- and
six-membered ring
compounds called purines,
while cytosine and thymine
are six-membered rings
called pyrimidines.
• In living organisms, DNA does not usually exist as a single
molecule, but instead as a pair of molecules that are held tightly
together. These two long strands entwine like vines, in the shape of a
double helix.
• In a double helix the direction of the nucleotides in one strand is
opposite to their direction in the other strand: the strands are
antiparallel.
• The asymmetric ends of DNA strands are called the 5′ (five prime)
and 3′ (three prime) ends, with the 5' end having a terminal
phosphate group and the 3' end a terminal hydroxyl group.
Each type of base on one strand forms a bond with
just one type of base on the other strand. This is called
complementary base pairing. Here, purines form
hydrogen bonds to pyrimidines, with A bonding only
to T, and C bonding only to G.
This arrangement of two nucleotides binding together
across the double helix is called a base pair. As
hydrogen bonds are not covalent, they can be broken
and rejoined relatively easily.
• Due to the specific base pairing, DNA's two strands are
complementary to each other. Hence, the nucleotide sequence
of one strand determines the sequence of the other strand. For
example, the sequence of the two strands can be written as
• 5' -ACT- 3'
• 3' -TGA- 5'
• Note that they obey the (A:T) and (C:G) pairing rule. If we
know the sequence of one strand, we can deduce the sequence
of the other strand. For this reason, a DNA database needs to
store only the sequence of one strand. By convention, the
sequence in a DNA database is written in the 5' to
3' direction (left to right).
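The pairing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics library: the function names are made up for this example, and only the four DNA letters are handled.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq):
    """Return the base-by-base partner, written in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())

def reverse_complement(seq):
    """Return the partner strand written in its own 5' to 3' direction."""
    return complement(seq)[::-1]

top = "ACT"                      # 5'-ACT-3'
print(complement(top))           # partner read 3' to 5': TGA
print(reverse_complement(top))   # partner read 5' to 3': AGT
```

Because one strand fully determines the other, storing only `top` loses no information; the partner can always be recomputed.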
Twin helical strands form the DNA backbone. Another
double helix may be found by tracing the spaces, or
grooves, between the strands. As the strands are not
directly opposite each other, the grooves are unequally
sized. One groove, the major groove, is 22 Å wide and the
other, the minor groove, is 12 Å wide.
Sense and antisense
A DNA sequence is called "sense" if its sequence is the
same as that of a messenger RNA copy that is translated
into protein. The sequence on the opposite strand is called
the "antisense" sequence. Both sense and antisense
sequences can exist on different parts of the same strand
of DNA (i.e. both strands contain both sense and antisense
sequences). In both prokaryotes and eukaryotes, antisense
RNA sequences are produced, but the functions of these
RNAs are not entirely clear.
DNA can be twisted like a rope in a process called DNA
supercoiling. With DNA in its "relaxed" state, a strand
usually circles the axis of the double helix once every 10.4
base pairs.
If the DNA is twisted in the direction of the helix, this is
positive supercoiling, and the bases are held more tightly
together.
If it is twisted in the opposite direction, this is negative
supercoiling, and the bases come apart more easily.
In nature, most DNA has slight negative supercoiling that is
introduced by enzymes called topoisomerases.
Alternate DNA structures
DNA exists in many possible conformations that include
A-DNA, B-DNA, and Z-DNA forms, although, only B-
DNA and Z-DNA have been directly observed in
From left to right, the structures of A, B and Z DNA
The aim of this lecture is to investigate how cells organize
their DNA within the cell nucleus: how is the huge amount of
genomic DNA packaged to fit within the cell nucleus while
still keeping specific sequences accessible for transcription?
We will discuss genome organization, satellites and repetitive
DNA sequences.
• We will discuss the structure of the nucleosome and higher
levels of chromatin organization and packaging.
Interactions with proteins
All the functions of DNA depend on interactions with
proteins. These protein interactions can be non-specific, or
the protein can bind specifically to a single DNA
sequence. Enzymes can also bind to DNA, for example the
polymerases that copy the DNA sequence in transcription
and DNA replication.
• DNA-binding proteins
Within chromosomes, DNA is held in complexes with
structural proteins. These proteins organize the DNA into
a compact structure called chromatin. In eukaryotes this
structure involves DNA binding to a complex of small
basic proteins called histones, while in prokaryotes
multiple types of proteins are involved. The histones form
a disk-shaped complex called a nucleosome. These non-
specific interactions are formed through basic residues in
the histones making ionic bonds to the acidic sugar-
phosphate backbone of the DNA.
Chromatin is the complex combination of DNA and
protein that makes up chromosomes. It is found inside the
nuclei of eukaryotic cells. The major components of
chromatin are DNA and histone proteins. The functions of
chromatin are to package DNA into a smaller volume to
fit in the cell.
• Chromatin is the substance that becomes visible as
chromosomes during cell division. Its basic unit is the
nucleosome, composed of 146 bp of DNA and eight histone
proteins. The structure of chromatin is dynamically
changing, at least in part, depending on the need of
transcription. In the metaphase of cell division, the
chromatin is condensed into the visible chromosome. At
other times, the chromatin is less condensed, with some
regions in a "Beads-On-a-String" conformation.
• Histones are the proteins closely associated with DNA
molecules. They are responsible for the structure of
chromatin and play important roles in the regulation of
gene expression. Five types of histones have been
identified: H1 (or H5), H2A, H2B, H3 and H4. H1 and its
homologous protein H5 are involved in higher-order
structures of chromatin. The other four types of histones
associate with DNA to form nucleosomes.
Histones (H1, H2A, H2B, H3, H4, and H5) are organized into
two superclasses as follows: core histones (H2A, H2B,
H3 and H4) and linker histones (H1 and H5).
Histones contain a high proportion of basic amino acids
(arginine and lysine) that facilitate binding to the
negatively charged DNA molecule.
Two of each of the core histones (H2A, H2B, H3 and H4)
assemble to form one nucleosome core particle by
wrapping 146 base pairs of DNA around the protein spool
in 1.65 left-handed super-helical turns. The linker histone
H1 binds the nucleosome at the entry and exit sites of
the DNA, thus locking the DNA into place and allowing
the formation of higher-order structure.
• Each nucleosome is associated with an H1 (or H5) to
form a solenoid structure. H1 and H5 are called
linker histones.
A chromosome is an organized structure of DNA and
protein that is found in cells. It is a single piece of
coiled DNA containing many genes, regulatory
elements and other nucleotide sequences.
Chromosomes also contain DNA-bound proteins,
which serve to package the DNA and control its
• Chromosomes in prokaryotes
The prokaryotes – bacteria and archaea – typically have a
single circular chromosome, but many variations do exist.
Most bacteria have a single circular chromosome that can
range in size from only 160,000 base pairs in the
endosymbiotic bacterium Candidatus Carsonella ruddii, to
12,200,000 base pairs in the soil-dwelling bacterium
Sorangium cellulosum. Spirochaetes of the genus Borrelia
are a notable exception to this arrangement, with bacteria
such as Borrelia burgdorferi containing a single linear
chromosome.
Repetitive DNA Sequences
A stretch of DNA sequence often repeats several times in
the total DNA of a cell. For example, the following DNA
sequence is just a small part of telomere located at the
ends of each human chromosome:
An entire telomere, about 15 kb, is constituted by
thousands of the repeated sequence "GGGTTA".
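The "thousands of repeats" figure follows directly from the numbers above. A minimal sketch of the arithmetic, assuming the 15 kb telomere length and the 6-base repeat unit quoted in the text (the helper name is hypothetical):

```python
def repeat_copies(region_bp, unit):
    """Number of whole copies of `unit` that fit in a region of `region_bp` base pairs."""
    return region_bp // len(unit)

print(repeat_copies(15_000, "GGGTTA"))  # 2500
```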
DNA sequences are divided into three classes:
• Highly repetitive: About 10-15% of mammalian DNA
fragments reassociate very rapidly. This class includes
tandem repeats.
• Moderately repetitive: Roughly 25-40% of mammalian
DNA fragments reassociate at an intermediate rate. This
class includes interspersed repeats (also known as
mobile elements or transposable elements).
• Single copy (or very low copy number): This class
accounts for 50-60% of mammalian DNA.
Tandem repeats are an array of consecutive repeats. They
include three subclasses: satellites, minisatellites and
microsatellites. The name "satellites" comes from their
• The size of a satellite DNA ranges from 100 kb to over 1
Mb. In humans, a well known example is the alphoid
DNA located at the centromere of all chromosomes. Its
repeat unit is 171 bp and the repetitive region accounts for
3-5% of the DNA in each chromosome. Other satellites
have a shorter repeat unit. Most satellites in humans or in
other organisms are located at the centromere.
• The size of a minisatellite ranges from 1 kb to 20 kb. One
type of minisatellite is called variable number of
tandem repeats (VNTR). Its repeat unit ranges from 9
bp to 80 bp. VNTRs are located in non-coding regions. The
number of repeats for a given minisatellite may differ
between individuals. This feature is the basis of DNA
fingerprinting.
• Another type of minisatellite is the telomere. In a human
germ cell, the size of a telomere is about 15 kb. In an
aging somatic cell, the telomere is shorter. The telomere
contains the tandemly repeated sequence GGGTTA.
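Since the number of consecutive repeat copies is what varies between individuals and with cell age, counting those copies is the basic measurement. The sketch below shows one way to do this with a regular expression; the example sequence is invented for illustration and is not real genomic data.

```python
import re

def longest_tandem_run(seq, unit):
    """Return the largest number of back-to-back copies of `unit` found in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq)]
    return max(runs, default=0)

# Hypothetical sequence: 4 telomere-like repeats flanked by unrelated bases.
example = "ACGT" + "GGGTTA" * 4 + "CCATG"
print(longest_tandem_run(example, "GGGTTA"))  # 4
```

Two samples that give different counts for the same repeat unit at the same location would be distinguishable by this measure.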
• Microsatellites are also known as short tandem repeats
(STR), because a repeat unit consists of only 1 to 6 bp and
the whole repetitive region spans less than 150
bp. Similar to minisatellites, the number of repeats for a
given microsatellite may differ between
individuals. Therefore, microsatellites can also be used
for DNA fingerprinting. In addition, both microsatellite
and minisatellite patterns can provide information about
Interspersed Repeats
• Interspersed repeats are repeated DNA sequences located
at dispersed regions in a genome. They are also known as
mobile elements or transposable elements. A stretch of
DNA sequence may be copied to a different location
through DNA recombination. After many generations,
such a sequence (the repeat unit) could spread over various
regions. Mobile elements are found in all kinds of
organisms. In mammals, the most common mobile
elements are LINEs and SINEs.
To understand the DNA replication mechanism in
eukaryotes and prokaryotes.
Identifying the DNA polymerases.
DNA replication, the basis for biological
inheritance, is a process
occurring in all living
organisms to copy their DNA.
The process is "semiconservative": each strand of the original
molecule serves as a template
for the reproduction of the
complementary strand.
The replisome is a complex
molecular machine that carries out
replication of DNA. It is made up of a
number of subcomponents that each
provides a specific function during
the process of replication.
Major Components of Replisome
DNA pol. III
DNA pol. I
SSB (single-strand binding protein)
• DNA polymerases are a family of enzymes that carry out
all forms of DNA replication.
• DNA polymerase can only extend an existing DNA strand
paired with a template strand; it cannot begin the synthesis
of a new strand.
• To begin synthesis of a new strand, a short fragment of
DNA or RNA, called a primer, must be created and paired
with the template strand before DNA polymerase can
synthesize new DNA.
DNA pol III has two key limitations:
1- It can only add nucleotides to
the 3' end of a strand.
2- It cannot start a new strand;
it can only extend an existing
strand (because it can only add
to the 3' end of a strand).
Three types of DNA polymerase are
classified in prokaryotes:
Type I, used to fill the gaps between DNA
fragments of the lagging strand.
Type II, involved in the SOS response to
DNA damage.
Type III, which carries out most DNA
replication.
• There are five types of DNA polymerases in mammalian
cells: α, β, γ, δ, and ε. Polymerase γ is located in the
mitochondria and is responsible for the replication of
mtDNA. The other polymerases are located in the nucleus. Their
major roles are:
• α: primer synthesis (with its associated primase) at the start of each new strand.
• β: DNA repair.
• δ: synthesis of the lagging strand.
• ε: synthesis of the leading strand.
• The prokaryotic DNA polymerase III consists of several
subunits, with a total molecular weight exceeding
600 kDa. Among them, the α, ε, and θ subunits constitute the
core polymerase.
• The major role of the β subunits is to keep the enzyme from
falling off the template strand. Two β subunits can form a
donut-shaped structure to clamp a DNA molecule in its
center, and slide with the core polymerase along the DNA
molecule. This allows continuous polymerization of up to
5 x 10^5 nucleotides. In the absence of β subunits, the core
polymerase would fall off the template strand after
synthesizing 10-50 nucleotides.
DNA replication within the cell
Origins of replication
• For a cell to divide, it must first replicate its DNA. This
process is initiated at particular points within the DNA,
known as "origins", which are targeted by proteins
that separate the two strands and initiate DNA synthesis.
Origins contain DNA sequences recognized by replication
initiator proteins (e.g. DnaA in E. coli and the Origin
Recognition Complex in yeast). These initiator proteins
separate the two strands and initiate replication forks.
Origins tend to be "AT-rich" (rich in
adenine and thymine bases) to assist this
process, because A-T base pairs have two
hydrogen bonds (rather than the three
formed in a C-G pair). Strands rich in
these nucleotides are generally easier to
separate because a greater number of
hydrogen bonds requires more energy to
break.
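The hydrogen-bond argument can be made concrete by counting bonds for two short duplexes. This is only a sketch: the two sequences are invented, and the hydrogen-bond count is used as a crude stand-in for how hard a region is to open, ignoring other effects such as base stacking.

```python
# Hydrogen bonds per base pair, as stated above: A-T has two, C-G has three.
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}

def at_fraction(seq):
    """Fraction of positions that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)

def hydrogen_bonds(seq):
    """Total hydrogen bonds holding this stretch of duplex together."""
    return sum(H_BONDS[base] for base in seq.upper())

at_rich = "TTATTTAAAT"   # hypothetical 10-base sequences
gc_rich = "GCCGGGCGCC"
for name, seq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    print(name, at_fraction(seq), hydrogen_bonds(seq))
```

For stretches of equal length, the AT-rich one is held together by fewer hydrogen bonds, consistent with origins favouring such sequences.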
The replication fork
When replicating, the original DNA splits in two, forming
two "prongs" which resemble a fork (hence the name
"replication fork"). Because DNA polymerase can only
synthesize a new DNA strand in a 5' to 3' manner, the
process of replication goes differently for the two strands
comprising the DNA double helix.
Mechanism of DNA Replication
Once the strands are separated, RNA primers are created on
the template strands. DNA polymerase extends the leading
strand in one continuous motion and the lagging strand in
a discontinuous motion. RNase removes the RNA
fragments used to initiate replication by DNA polymerase,
and DNA polymerase I enters to fill the gaps. When this is
complete, a single nick on the leading strand and several
nicks on the lagging strand can be found. Ligase works to
seal these nicks, thus completing the newly replicated
DNA molecule.
The leading strand receives one
RNA primer per active origin of
replication while the lagging strand
receives several. The short stretches
of DNA synthesized from these primers on
the lagging strand are
called Okazaki fragments, named
after their discoverer.
The leading strand is the template strand of the DNA
double helix that is oriented so that the replication fork moves along
it in the 3' to 5' direction. This allows the newly
synthesized strand complementary to the original
strand to be synthesized 5' to 3' in the same direction
as the movement of the replication fork.
On the leading strand, a polymerase "reads" the DNA
and adds nucleotides to it continuously. This
polymerase is DNA polymerase III (DNA Pol III)
in prokaryotes and presumably Pol ε in eukaryotes.
The lagging strand is the strand of the
template DNA double helix that is
oriented so that the replication fork moves
along it in a 5' to 3' manner. Because of its
orientation, opposite to the working
orientation of DNA polymerase III, which
moves on a template in a 3' to 5' manner,
replication of the lagging strand is more
complicated than that of the leading
strand.
On the lagging strand, primase "reads" the DNA and
adds RNA to it in short, separated segments. DNA
polymerase III or Pol δ lengthens the primed
segments, forming Okazaki fragments. Primer
removal in eukaryotes is also performed by Pol δ. In
prokaryotes, DNA polymerase I "reads" the
fragments, removes the RNA by 5'-3' exonuclease
activity of polymerase I, and replaces the RNA
nucleotides with DNA nucleotides (this is necessary
because RNA and DNA use slightly different kinds of
nucleotides). DNA ligase joins the fragments together.
In bacteria, which have a single origin of
replication on their circular chromosome,
this process eventually creates a "theta
structure" (resembling the Greek letter
theta: θ). In contrast, eukaryotes have
longer linear chromosomes and initiate
replication at multiple origins within
Telomerase and Aging
• Synthesis of the lagging strand requires a short primer
which will be removed. At the extreme end of a
chromosome, there is no way to synthesize this region
when the last primer is removed. Therefore, the
lagging strand is always shorter than its template by at
least the length of the primer. This is the so-called
end-replication problem.
• Bacteria do not have the end-replication problem,
because their DNA is circular.
• In eukaryotes, the chromosome ends are called
telomeres which have at least two functions:
• to protect chromosomes from fusing with each other.
• to solve the end-replication problem.
• In a human chromosome, the telomere is about 10 to 15
kb in length, composed of the tandem repeat
sequence: TTAGGG. The telomerase contains an
essential RNA component which is complementary to
the telomere repeat sequence. Hence, the internal
RNA can serve as the template for synthesizing
DNA. Through telomerase translocation, a telomere
may be extended by many repeats.
• In the absence of telomerase, the telomere will
become shorter after each cell division. When it
reaches a certain length, the cell may cease to divide
and die. Therefore, telomerase plays a critical role in
the aging process.
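The shortening described above can be sketched as a simple countdown. Everything numeric here is an assumption supplied by the caller: the notes give a starting length of roughly 10 to 15 kb but do not state how much is lost per division or what the critical length is, so the values in the example call are purely illustrative.

```python
def divisions_until_arrest(telomere_bp, loss_per_division_bp, critical_bp):
    """Count cell divisions that keep the telomere at or above the critical length.

    All three arguments are caller-supplied assumptions, not measured values.
    """
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    divisions = 0
    while telomere_bp - loss_per_division_bp >= critical_bp:
        telomere_bp -= loss_per_division_bp
        divisions += 1
    return divisions

# 15 kb start is from the text; the loss and threshold values are hypothetical.
print(divisions_until_arrest(15_000, loss_per_division_bp=100, critical_bp=5_000))
```

The model is deliberately crude: it has no telomerase activity, so the count is finite for any positive loss per division. Adding telomerase would correspond to reducing the net loss, possibly to zero.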
Rolling circle replication
Another method of copying DNA, sometimes used in vivo
by bacteria and viruses, is the process of rolling circle
replication. In this form of replication, a single replication
fork progresses around a circular molecule to form
multiple linear copies of the DNA sequence. In cells, this
process can be used to rapidly synthesize multiple copies
of plasmids or viral genomes.
During replication, the unwinding of DNA may cause the
formation of tangling structures, such as supercoils or
catenanes. The major role of topoisomerases is to prevent
There are two types of topoisomerases:
• Type I produces transient single-strand breaks in DNA.
• Type II produces transient double-strand breaks.
As a result, the type I enzyme removes supercoils from
DNA one at a time, whereas the type II enzyme removes
supercoils two at a time.
• In eukaryotes, the topo I and topo II can remove both
positive and negative supercoils.
• In bacteria, the topo I can remove only negative
supercoils. The bacterial topo II is also called the gyrase,
which has two functions:
(a) to remove the positive supercoils during DNA replication;
(b) to introduce negative supercoils (one supercoil for 15-20
turns of the DNA helix) so that the DNA molecule can
be packed into the cell. During replication, these
negative supercoils are removed by topo I.
Without topoisomerases, the DNA cannot
replicate normally. Therefore, the
inhibitors of topoisomerases have been
used as anti-cancer drugs to stop the
proliferation of malignant
cells. However, these inhibitors may also
stop the division of normal cells. Some
cells (e.g., hair follicle cells) which need to
continuously divide will be most
affected. This explains a noticeable side
effect: hair loss.
One look around a room tells you that each person has
slight differences in their physical make up — and
therefore in their DNA. These subtle variations in
DNA are called polymorphisms (literally "many
forms"). Many of these gene polymorphisms account
for slight differences between people such as hair and
eye color. But some gene variations may result in
disease or an increased risk for disease. Although all
polymorphisms are the result of a mutation in the
gene, geneticists only refer to a change as a mutation
when it is not part of the normal variations between
To understand mutation, mutagen, mutants.
Classification of mutations.
Types of mutagens.
Mutagen & carcinogen.
In biology, a mutation is a randomly derived change
to the nucleotide sequence of the genetic material of
• Mutations can be caused by copying errors in the
genetic material during cell division (DNA
replication), or by exposure to mutagens (chemical,
physical or viral).
• In multi-cellular organisms with dedicated
reproductive cells, mutations can be subdivided into
germ line mutations, which can be passed on to
descendants, and somatic mutations, which are not
usually transmitted to descendants.
*By effect on structure
1- Small-scale mutations: those affecting a
gene in one or a few nucleotides, including:
A- Point mutations: exchange of a single nucleotide for
another. There are different types of point mutation:
*Transitions: exchange a purine for a purine (A ↔ G)
or a pyrimidine for a pyrimidine (C ↔ T). (Most
common.)
*Transversions: exchange a purine for a pyrimidine or
a pyrimidine for a purine (C/T ↔ A/G). (Less
common.)
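The transition/transversion distinction depends only on which chemical class each base belongs to, so it can be sketched as a small lookup. The function name is illustrative, and only the four DNA bases are accepted.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_point_mutation(original, mutated):
    """Label a single-base substitution as a transition or a transversion."""
    original, mutated = original.upper(), mutated.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutated not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if original == mutated:
        return "no change"
    pair = {original, mutated}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_point_mutation("A", "G"))  # transition
print(classify_point_mutation("C", "A"))  # transversion
```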
Classification of mutation
*Insertions add one or more extra nucleotides into the DNA.
They are usually caused by transposable elements, or errors
during replication of repeating elements (e.g. AT repeats).
*Deletions remove one or more nucleotides from the DNA.
original The fat cat ate the wee rat.
Point Mutation The fat hat ate the wee rat.
In a frame shift mutation, one or more bases are inserted
or deleted. This type of mutation disrupts the reading frame,
making the downstream message meaningless, and often results in a
shortened protein. Frame shift mutations can be classified into:
Original The fat cat ate the wee rat.
Deletion The fat ate the wee rat.
Original The fat cat ate the wee rat.
Insertion The fat cat xlw ate the wee rat.
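The sentence analogy works because every word has three letters, like a codon. The sketch below makes the reading frame explicit by removing the spaces and re-splitting into triplets. Note one deliberate difference from the examples above: it deletes or inserts a single letter, because a change whose length is not a multiple of three is what actually shifts the frame.

```python
def read_in_triplets(letters):
    """Split a string into consecutive three-letter 'codons'; a short tail is kept as-is."""
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))

original = "thefatcatatetheweerat"
deleted = original[:6] + original[7:]          # delete the "c" of "cat"
inserted = original[:6] + "x" + original[6:]   # insert one extra letter before "cat"

print(read_in_triplets(original))   # the fat cat ate the wee rat
print(read_in_triplets(deleted))    # the fat ata tet hew eer at
print(read_in_triplets(inserted))   # the fat xca tat eth ewe era t
```

Everything before the change reads normally; everything after it is scrambled, which is the sense in which the message becomes meaningless.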
In an inversion mutation, an entire section of DNA is reversed.
Original The fat cat ate the wee rat.
Inversion The fat tar eew eht eta tac.
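The inversion example above can be reproduced by reversing a slice of the sentence. This mirrors the text analogy only; in real double-stranded DNA an inverted segment is read as the reverse complement of the original, a detail this sketch leaves out. The function name and indices are illustrative.

```python
def invert_segment(seq, start, end):
    """Return seq with the characters in seq[start:end] reversed."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]

sentence = "The fat cat ate the wee rat"
print(invert_segment(sentence, 8, len(sentence)))  # The fat tar eew eht eta tac
```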
2- Large-scale mutations in chromosomal structure,
A- Deletion of large chromosomal regions, leading to
loss of the genes within those regions.
B- Translocation: interchange of genetic parts from
C-Inversion: reversing the orientation of a
D- Amplifications (or gene duplications) leading to multiple
copies of all chromosomal regions, increasing the dosage
of the genes located within them.
*By effect on function
Loss-of-function mutations are the result of the gene product
having less or no function.
Gain-of-function mutations change the gene product
such that it gains a new and abnormal function.
Lethal mutations are mutations that lead to the death of
the organisms which carry the mutations.
*By effect on fitness
In applied genetics it is usual to speak of mutations as either
harmful or beneficial.
A harmful mutation is a mutation that decreases the
fitness of the organism.
A beneficial mutation is a mutation that increases the fitness
of the organism, or which promotes traits that are
A conditional mutation is a mutation that has a wild-type (or
less severe) phenotype under certain "permissive"
environmental conditions and a mutant phenotype under
certain "restrictive" conditions. For example, a
temperature-sensitive mutation can cause cell death at
high temperature (restrictive condition), but might have no
deleterious consequences at a lower temperature
(permissive condition).
Causes of mutation
Mutations may occur spontaneously
(spontaneous mutations) or be induced
(induced mutations) by mutagens.
*Spontaneous mutations can arise as a result of:
1- DNA replication errors and polymerase
A- Base alterations
Tautomerism – A base is changed by the
repositioning of a hydrogen atom, altering the
hydrogen bonding pattern of that base resulting in
incorrect base pairing during replication.
Deamination - Hydrolysis changes a normal base to
an atypical base containing a keto group in place of the
original amine group. Examples include C → U, A
→ HX (hypoxanthine), and 5MeC (5-methylcytosine)
→ T (thymine).
B- Base damage
Depurination – Loss of a purine base (A or G) to form
an apurinic site (AP site). Alkylation can occur through
reaction of compounds such as S-adenosyl methionine
with DNA. Alkylated bases may be subject to spontaneous
breakdown or mispairing.
** Alkylation is the addition of alkyl (methyl, ethyl,
occasionally propyl) groups to the bases or backbone of
DNA.
2- Spontaneous genetic rearrangement mutations
Deletion, duplication, etc.
Induced mutations on the molecular level can be
caused by either Chemical or Physical mutagens.
1- Chemical mutagens
The first report of mutagenic action of a chemical was
in 1942 by Charlotte Auerbach, who showed that
nitrogen mustard (component of poisonous mustard
gas used in World Wars I and II) could cause
mutations in cells.
A- Base analogs
These chemicals structurally resemble purines and
pyrimidines and may be incorporated into DNA in
place of the normal bases during DNA replication:
*bromouracil (BU), resembles thymine (has Br atom
instead of methyl group) and will be incorporated into
DNA and pair with A like thymine.
*aminopurine --adenine analog which can pair with T
or with C; causes A:T to G:C or G:C to A:T transitions.
B- Chemicals which alter the structure and pairing
properties of bases (base modifiers). Example …
*nitrous acid-- formed by digestion of nitrites
(preservatives) in foods. It causes C to U, meC to T, and A
to hypoxanthine deaminations.
*nitrosoguanidine, *methyl methanesulfonate, *ethyl
methanesulfonate--chemical mutagens that react with
bases and add methyl or ethyl groups. Depending on the
affected atom, the alkylated base may then degrade to yield
a baseless site, or mispair to result in mutations upon DNA
replication.
C- Intercalating agents
acridine orange, proflavin, ethidium bromide (used in
labs as dyes and mutagens). All are flat, multiple-ring
molecules which interact with the bases of DNA and insert
between them.
This insertion causes a "stretching" of the DNA duplex, and
the DNA polymerase is "fooled" into inserting an extra base
opposite an intercalated molecule. The result is that
intercalating agents cause frameshifts.
D- Agents altering DNA structure
Includes a variety of different kinds of agents. These
Large molecules which bind to bases in DNA and
cause them to be noncoding "bulky" lesions (eg.
agents causing intra- and inter-strand crosslinks (eg.
psoralens--found in some vegetables and used in
treatments of some skin conditions).
chemicals causing DNA strand breaks (eg. peroxides)
Natural sources of radiation produce so-
called background radiation. These include
cosmic rays from the sun and outer space,
radioactive elements in soil and terrestrial
products (wood, stone) and in the
atmosphere (radon). One's exposure due to
background radiation varies with
Sources of radiation
Humans have created artificial sources of
radiation which contribute to our radiation
exposure. Among these are medical
testing (diagnostic X-rays and other
procedures), nuclear testing and various
other products (TV's, smoke detectors,
Types of radiation
Ionizing radiation
- Alpha, Beta, Neutron, X-ray and Gamma
Non-ionizing radiation (electromagnetic radiation)
- Visible light, Infrared, Microwave, Radio waves, Very
low frequency (VLF), Extremely low frequency (ELF),
Thermal radiation (heat) and Black body radiation.
Non Ionizing radiation
1. EM spectrum
Visible light and other forms of radiation are all types
of electromagnetic radiation (which consists of electric and
magnetic waves). The length of EM waves (wavelength)
varies widely and is inversely proportional to the energy
they contain: this is the basis of the so-called EM
spectrum.
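The inverse relation between wavelength and energy is the standard photon-energy formula E = hc/λ. The sketch below evaluates it for a few wavelengths; the physical constants are the exact SI values, while the example wavelengths are simply chosen for illustration (two ultraviolet, one visible).

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, J*s
LIGHT_M_PER_S = 299_792_458       # speed of light, m/s
JOULES_PER_EV = 1.602176634e-19   # joules per electronvolt

def photon_energy_ev(wavelength_nm):
    """Photon energy E = h*c/lambda, returned in electronvolts."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_M_PER_S / wavelength_m / JOULES_PER_EV

for nm in (254, 300, 550):   # illustrative wavelengths only
    print(nm, "nm ->", round(photon_energy_ev(nm), 2), "eV")
```

Halving the wavelength doubles the energy per photon, which is why shorter-wavelength radiation is the more damaging part of the spectrum.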
UV radiation is less energetic,
and therefore non-ionizing, but its
wavelengths are preferentially
absorbed by bases of DNA and by
aromatic amino acids of proteins, so
it has important biological and
UV is normally classified in terms of its wavelength:
UV-C (180-290 nm)--"germicidal"--most energetic and
lethal, it is not found in sunlight because it is absorbed by
the ozone layer.
UV-B (290-320 nm)--major lethal/mutagenic fraction of sunlight.
UV-A (320 nm--visible)--"near UV"--also has deleterious
effects (because it creates oxygen radicals) but it produces
very few pyrimidine dimers.
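The wavelength bands above can be written as a small lookup. Two details are assumptions made for this sketch rather than statements from the notes: a wavelength that falls exactly on a boundary (290 or 320 nm) is assigned to the longer-wavelength band, and the start of visible light is taken as a parameter defaulting to 400 nm.

```python
def classify_uv(wavelength_nm, visible_start_nm=400):
    """Map a wavelength to the UV bands listed above.

    visible_start_nm is an assumed cut-off for where visible light begins;
    the notes only say that UV-A runs from 320 nm up to the visible range.
    """
    if 180 <= wavelength_nm < 290:
        return "UV-C"
    if 290 <= wavelength_nm < 320:
        return "UV-B"
    if 320 <= wavelength_nm < visible_start_nm:
        return "UV-A"
    return "outside the UV bands listed"

print(classify_uv(254))  # UV-C
print(classify_uv(305))  # UV-B
print(classify_uv(365))  # UV-A
```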
The major lethal lesions are pyrimidine dimers in DNA
(produced by UV-B and UV-C)--these are the result of a
covalent attachment between adjacent pyrimidines in one
strand. These dimers, like bulky lesions from chemicals,
block transcription and DNA replication and are lethal if
unrepaired. They can stimulate mutation and chromosome
rearrangement as well.
Ionizing radiation
X- and gamma-rays are energetic enough that
they produce reactive ions (charged atoms or
molecules) when they react with biological molecules;
thus they are referred to as ionizing radiation.
Intense exposure (high dose rate) causes burns
and skin damage versus a long-term weak exposure
(low dose rate) which would only increase risk of
mutation and cancer.
• Biological effects of radiation
Ionizing radiation produces a range of damage to
cells and organisms primarily due to the production of
free radicals of water (the hydroxyl or OH radical).
Free radicals possess unpaired electrons and are
chemically very reactive and will interact with DNA,
proteins, lipids in cell membranes, etc. Thus X-rays
can cause DNA and protein damage which may result
in organelle failure, block cell division, or cause cell
death. The rapidly dividing cell types (blood cell-
forming areas of bone marrow, gastrointestinal tract
lining) are the most affected by ionizing radiation and
the severity of the effects depends upon the dose
Genetic effects of radiation
Ionizing radiation produces a range of effects on
DNA both through free radical effects and direct
Breaks in one or both strands (can lead to
rearrangements, deletions, chromosome loss, death if
unrepaired; this is from stimulation of
Damage to or loss of bases (mutations).
Cross-linking of DNA to itself or to proteins.
DNA repair refers to a collection of processes by which
a cell identifies and corrects damage to the DNA molecules that
encode its genome. In human cells, both normal metabolic activities
and environmental factors such as UV light and radiation can cause
DNA damage, resulting in as many as 1 million individual molecular
lesions per cell per day. Many of these lesions cause structural
damage to the DNA molecule and can alter or eliminate the cell's
ability to transcribe the gene that the affected DNA encodes. Other
lesions induce potentially harmful mutations in the cell's genome,
which affect the survival of its daughter cells after it
undergoes mitosis. As a consequence, the DNA repair process is
constantly active as it responds to damage in the DNA structure.
When normal repair processes fail, and when cellular apoptosis does
not occur, irreparable DNA damage may occur, including double-strand
breaks and DNA crosslinkages.
The rate of DNA repair is dependent on many factors,
including the cell type, the age of the cell, and the
extracellular environment. A cell that has accumulated
a large amount of DNA damage, or one that no longer
effectively repairs damage incurred to its DNA, can
enter one of three possible states:
1. an irreversible state of dormancy, known as senescence
2. cell suicide, also known as apoptosis or programmed cell death
3. unregulated cell division, which can lead to the
formation of a tumor that is cancerous
Since many mutations are deleterious, DNA repair
systems are vital to the survival of all organisms
– Living cells contain several DNA repair systems that
can fix different types of DNA alterations
DNA repair mechanisms are placed into different
categories on the basis of the way they operate
Direct correction or direct reversal- reversing the
Excise the damaged areas and then repair the gap by
new DNA synthesis
Basic mechanism of repairing DNA
In most cases, DNA repair is a multi-step process
1. An irregularity in DNA structure is detected and
the abnormal DNA is removed
2. Normal DNA is synthesized
Direct Reversal of DNA Damage
Mismatch Repair by DNA Polymerase Proofreading.
Repair of UV-induced pyrimidine dimers (photoreactivation:
exposure to near-UV light activates photolyase, which
splits the dimers, restoring the DNA to its original
condition; photolyase is not found in humans).
Repair of alkylation damage (by O6-methylguanine
methyltransferase, encoded by the ada gene; it transfers
the methyl or ethyl group from the base to a cysteine
side chain within the alkyltransferase protein).
Base Excision Repair and Repair
Involving Excision of Nucleotides
There are three major DNA repair mechanisms:
1- Base excision
2- Nucleotide excision
3- Mismatch repair
• Base excision
DNA bases may be modified by deamination or
alkylation. A DNA glycosylase recognizes a single modified
(damaged) base and cleaves the bond between it and the
sugar in the DNA. The resulting base-free position is
called the "abasic site" or "AP site".
In E. coli, the AP endonuclease then cuts the backbone
at the AP site, and the AP site and neighboring
nucleotides are removed. The gap is filled by DNA
polymerase I and sealed by DNA ligase.
In summary: one base is removed, several nucleotides around
it are excised, and they are replaced with new nucleotides by
Pol adding to the 3’ end; ligase then joins the 5’ end.
Depending on the species, this repair system can
eliminate abnormal bases such as
– Uracil; Thymine dimers
– 3-methyladenine; 7-methylguanine
• Nucleotide excision
In E. coli, proteins UvrA, UvrB, and UvrC
are involved in removing the damaged
nucleotides (e.g., the dimer induced by UV
light). The gap is then filled by DNA
polymerase I and DNA ligase.
In yeast, the proteins similar to the Uvr proteins
are named RADxx ("RAD" for "radiation"),
such as RAD3, RAD10, etc.
An important general process for DNA repair is nucleotide
excision repair (NER).
It nicks the DNA around the damaged base and removes the region,
then fills in with Pol on the 3’ end and ligates the 5’ end.
This type of system can repair many types of DNA damage:
– Thymine dimers and chemically modified bases
NER is found in all eukaryotes and prokaryotes
(However, its molecular mechanism is better
understood in prokaryotes).
Nucleotide excision repair (NER) of pyrimidine dimers
and other damage-induced distortions of DNA
Several human diseases have been shown to involve
inherited defects in genes involved in NER
– These include xeroderma pigmentosum (XP) and
Cockayne syndrome (CS)
– A common characteristic of both syndromes is an
increased sensitivity to sunlight
– Xeroderma pigmentosum can be caused by defects
in seven different NER genes
Skin lesions of xeroderma pigmentosum, caused by homozygosity
for a recessive mutation in a repair gene.
One example of a DNA-repair genetic
Mismatch Repair System
If proofreading fails, the methyl-directed mismatch
repair system comes to the rescue
--This repair system is found in all species
--In humans, mutations in the system are associated with
particular types of cancer.
Methyl-directed mismatch repair recognizes
mismatched base pairs, excises the incorrect
bases, and then carries out repair synthesis.
• Mismatch repair
To repair mismatched bases, the system has to know
which base is the correct one. In E. coli, this is achieved by
a special methylase called the "Dam methylase", which can
methylate all adenines that occur within (5')GATC
sequences. Immediately after DNA replication, the template
strand has been methylated, but the newly synthesized
strand is not methylated yet. Thus, the template strand and
the new strand can be distinguished.
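The strand-discrimination idea can be sketched in a few lines. This is a toy model, not the enzymatic mechanism: the function names and the dictionary layout are hypothetical, and methylation is represented simply as a set of GATC positions per strand.

```python
def find_gatc_sites(seq):
    """Return 0-based start positions of every GATC in a 5'->3' sequence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "GATC"]

def strand_to_repair(methylated_sites_by_strand):
    """Pick the strand with no methylated GATC sites (taken to be the new strand).

    methylated_sites_by_strand maps a strand label to the set of GATC
    positions carrying a methyl group. Returns None when the strands cannot
    be told apart (both methylated or both unmethylated).
    """
    unmethylated = [name for name, sites in methylated_sites_by_strand.items()
                    if not sites]
    return unmethylated[0] if len(unmethylated) == 1 else None

if __name__ == "__main__":
    sites = find_gatc_sites("AAGATCTTGATCC")
    print(sites)
    # Just after replication: template methylated, new strand not yet methylated
    print(strand_to_repair({"template": set(sites), "new": set()}))
```

Once the Dam methylase has also methylated the new strand, the function returns None, mirroring the point that discrimination is only possible immediately after replication.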
--The repair process begins with the protein MutS, which binds
to mismatched base pairs. Then, MutL activates MutH, which binds to
GATC sequences.
--Activated MutH cleaves the unmethylated strand at the GATC
site. Subsequently, the segment from the cleavage site to the mismatch is
removed by an exonuclease (with assistance from helicase II and SSB
proteins).
* If the cleavage occurs on the 3' side of the
mismatch, this step is carried out by
exonuclease I (which degrades a single
strand only in the 3' to 5' direction).
* If the cleavage occurs on the 5' side of the
mismatch, exonuclease VII or RecJ is used
to degrade the single-stranded DNA. The
gap is filled by DNA polymerase III and
DNA ligase.
Mechanism of mismatch repair. The mismatch correction enzyme recognizes
which strand the base mismatch is on by reading the methylation state of a
nearby GATC sequence. If the sequence is unmethylated, a segment of that DNA
strand containing the mismatch is excised and new DNA is synthesized.
Mismatch Repair in Eukaryotes
Eukaryotes also have mismatch repair, but
it is not clear how old and new DNA
strands are identified.
– Four genes are involved in humans,
hMSH2 and hMLH1, hPMS1, and
– All of these are mutator genes
In humans, mutations in any one of the four
human mismatch repair genes confer a
hereditary predisposition to a
form of colon cancer called hereditary
nonpolyposis colon cancer.
Proteins involved in DNA repairing
of E. coli
Overview of Gene Expression
An organism may contain many types of somatic cells,
each with distinct shape and function. However, they all
have the same genome. The genes in a genome do not
have any effect on cellular functions until they are
"expressed". Different types of cells express different
sets of genes, thereby exhibiting various shapes and
Essential steps involved in the expression of
Gene expression is the process by which information from a gene is
used in the synthesis of a functional gene product. These
products are often proteins, but in non-protein coding
genes such as ribosomal RNA (rRNA), transfer RNA
(tRNA) or small nuclear RNA (snRNA) genes, the
product is a functional RNA.
Steps of gene expression
Several steps in the gene expression
process may be modulated, including the
Post-translational modification of a
A DNA strand is used as a template to synthesize a
complementary RNA strand, which is called the primary
Schematic illustration of transcription. (a) DNA before transcription. (b)
During transcription, the DNA must unwind so that one of its strands can
be used as a template to synthesize a complementary RNA.
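As a minimal sketch of template-directed synthesis, the function below builds the RNA complementary to a template strand. The function name is hypothetical, and the template is assumed to be written 3'->5' so that the RNA comes out 5'->3'.

```python
# Base-pairing rules for copying a DNA template into RNA
COMPLEMENT_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Return the RNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(COMPLEMENT_RNA[base] for base in template_3to5.upper())

if __name__ == "__main__":
    print(transcribe("TACGGT"))
```

The RNA has the same sequence as the non-template (coding) DNA strand, with U in place of T.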
The function of RNA polymerases
Both RNA and DNA polymerases can add
nucleotides to an existing strand,
extending its length. However, there is a
major difference between the two classes
of enzymes: RNA polymerases can
initiate a new strand but DNA
polymerases cannot.
Prokaryotic RNA polymerase is composed of five subunits:
• two α subunits
• one each of β and β´
• one σ subunit.
Several different forms of σ subunit have been
identified, with molecular weights ranging from
28 kD to 70 kD.
The σ subunit is also known as the sigma factor.
The σ factor plays an important role in
recognizing the transcriptional
initiation site, and also possesses the
helicase activity to unwind the DNA
Two α subunits, β, and β´ carry out
Core RNA polymerase
RNA polymerase without the sigma factor (two α subunits, β, β´);
it carries out nucleotide synthesis.
Holoenzyme
A complete and fully functional RNA polymerase.
The holoenzyme includes the core polymerase and the σ
factor.
• There are three classes of eukaryotic RNA
polymerases: I, II and III, each comprising two
large subunits and 12-15 smaller subunits.
• The two large subunits are similar to the E. coli β and β' subunits.
• Two smaller subunits are similar to the E. coli α subunit.
• The eukaryotic RNA polymerase does not
contain any sigma factor.
• Therefore, in eukaryotes, transcriptional
initiation should be mediated by other proteins.
RNA polymerase II is involved in the transcription of all
protein-coding genes and most snRNA genes.
The other two classes transcribe only RNA genes. RNA
polymerase I is located in the nucleolus, transcribing
rRNA genes except 5S rRNA.
RNA polymerase III is located outside the nucleolus,
transcribing 5S rRNA, tRNA, U6 snRNA and some small
In prokaryotes, binding of the polymerase's σ factor to
the promoter can catalyze unwinding of the DNA double
helix. The most important σ factor is σ70 (sigma 70).
Promoter: a short nucleotide sequence that is recognized
by an RNA polymerase enzyme as a point at which to bind
to DNA in order to start transcription. Promoters occur
upstream of the gene.
Transcription Mechanisms in
1- Promoter recognition:
The σ factor directs RNA polymerase to specific sequences in
the DNA called promoters so that transcription initiates at
the proper place. Prokaryotic polymerases can recognize the
promoter and bind to it directly.
Promoters contain two distinct sequence motifs that reside
~10 bases and ~35 bases upstream of the transcriptional start
site or first base of the RNA.
• The transcriptional start site is known as the +1 site. All of the
bases following the +1 site are transcribed into RNA and are
numbered with positive numbers.
• The bases prior to the +1 site are numbered with negative
numbers.
• The promoter sequence consists of two motifs: the one ~10 bases
upstream of the +1 site is called the -10 box (Pribnow box), and the one
~35 bases upstream of the +1 site is called the -35 box.
• σ70 recognizes promoters with the consensus sequences
TATAAT at the -10 region and TTGACA at the -35 region.
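A rough way to look for a σ70-type promoter is to score every candidate pair of hexamers against the two consensus motifs. The sketch below assumes the standard consensus sequences TTGACA (-35) and TATAAT (-10) and a spacer of 16 to 18 bp between them; the spacer range and the function names are assumptions for illustration, not part of these notes.

```python
MINUS_35 = "TTGACA"   # -35 box consensus
MINUS_10 = "TATAAT"   # -10 box (Pribnow box) consensus

def matches(window, consensus):
    """Count positions where the window agrees with the consensus."""
    return sum(a == b for a, b in zip(window, consensus))

def best_promoter(seq, spacers=range(16, 19)):
    """Return (score, pos35, pos10) for the best-scoring -35/-10 pair.

    Positions are 0-based starts on the coding strand; the maximum score is
    12. Returns None if the sequence is too short to hold both boxes.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 5):
        for gap in spacers:
            j = i + 6 + gap              # start of the candidate -10 box
            if j + 6 > len(seq):
                continue
            score = matches(seq[i:i + 6], MINUS_35) + matches(seq[j:j + 6], MINUS_10)
            if best is None or score > best[0]:
                best = (score, i, j)
    return best

if __name__ == "__main__":
    demo = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGGGG"
    print(best_promoter(demo))
```

Real promoters rarely match the consensus perfectly, so in practice one would rank candidates by score rather than demand 12 out of 12.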
The following steps occur before
• RNA polymerase recognizes and
specifically binds to the promoter
region on DNA. At this stage, the
DNA is double-stranded ("closed").
This wound-DNA structure is referred
to as the closed complex.
• The DNA is unwound and becomes
single-stranded ("open") in the
vicinity of the initiation site (defined
as +1). This unwound-DNA structure
is called the open complex.
• RNA polymerase incorporates the
first few nucleotides at the +1 region.
• Sigma dissociates from the
Chain initiation: Unwinding (melting) of the DNA
double helix. The enzyme which can unwind the double
helix is called helicase. Prokaryotic RNA polymerases
have helicase activity.
Chain elongation: Synthesis of RNA based on the
sequence of the DNA template strand.
RNA polymerases use nucleoside triphosphates (NTPs) to
construct an RNA strand.
Chain termination: Prokaryotes and eukaryotes use
different signals to terminate transcription.
Transcription in eukaryotes is much more complicated
than in prokaryotes, partly because eukaryotic DNA is
associated with histones, which could hinder the access of
polymerases to the promoter.
Transcriptional Termination in
In prokaryotes, the transcription is terminated by two
The Rho-independent termination
signal is a stretch of 30-40 bp
sequence (terminator sequence),
consisting of many GC residues
followed by a series of T ( "U" in the
transcribed RNA). The resulting
RNA transcript will form a stem-loop
structure (hairpin) to terminate
The stem-loop structure of the RNA transcript as a termination signal for the
transcription of the trp operon.
• Rho-dependent mechanism
• Rho is a ~ 50 kD protein, involved in transcription
terminations. Six Rho proteins form a hexamer to
• The Rho protein binds to the RNA transcript at the
upstream site which is 70-80 nucleotides long and rich in
C residues. Upon binding, the Rho moves along the RNA
in the 3' direction. If movement of the polymerase is slow,
the Rho will catch up and terminate the transcription at the
downstream termination site. Rho has ATPase activity
which can induce release of the polymerase from DNA.
Scheme of reverse transcription
• Some viruses (such as HIV, the cause of AIDS) have the ability to
transcribe RNA into DNA so that it can be integrated into a cell's genome.
• The main enzyme responsible for this type of transcription is called
reverse transcriptase. In the case of HIV, reverse transcriptase is
responsible for synthesizing a complementary DNA strand (cDNA) to the
viral RNA genome.
• An associated enzyme, ribonuclease H, digests the RNA strand and
reverse transcriptase synthesises a complementary strand of DNA to form
a double helix DNA structure.
• This cDNA is integrated into the host cell's genome via another enzyme
(integrase) causing the host cell to generate viral proteins which
reassemble into new viral particles. Subsequently, the host cell undergoes
programmed cell death (apoptosis).
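The two synthesis steps of reverse transcription can be sketched as two complement operations. This is a sequence-level illustration only; the function names are hypothetical, and primers, RNase H digestion, and integration are not modeled.

```python
DNA_FOR_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}  # DNA base paired with each RNA base
DNA_FOR_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # DNA base paired with each DNA base

def first_strand_cdna(rna_5to3):
    """cDNA complementary to the RNA, written 5'->3' (hence the reversal)."""
    return "".join(DNA_FOR_RNA[b] for b in reversed(rna_5to3.upper()))

def second_strand(cdna_5to3):
    """Second DNA strand, complementary to the first-strand cDNA, written 5'->3'."""
    return "".join(DNA_FOR_DNA[b] for b in reversed(cdna_5to3.upper()))

if __name__ == "__main__":
    viral_rna = "AUGGCU"
    cdna = first_strand_cdna(viral_rna)
    print(cdna, second_strand(cdna))
```

The second strand carries the same sequence as the original RNA with T in place of U, so the double-stranded product preserves the viral information in DNA form.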
1. The mechanism of eukaryotic transcription is
similar to that in prokaryotes.
2. Many more proteins are associated with the
eukaryotic transcription machinery, which makes
transcription much more complicated.
3. Three eukaryotic polymerases transcribe different
sets of genes.
4. In addition, eukaryotic cells contain additional
RNA Pols in mitochondria and chloroplasts.
Main features of the three eukaryotic polymerases
Type          Location      Substrate
RNA Pol I     Nucleoli      Most rRNA genes
RNA Pol II    Nucleoplasm   All protein-coding genes and some snRNA genes
RNA Pol III   Nucleoplasm   tRNAs, 5S rRNA, U6 snRNA and other small RNAs
RNA polymerase subunits
Each eukaryotic polymerase contains 12 or
– the two largest subunits are similar to each
other and to the β′ and β subunits of E. coli RNA
– There is one other subunit in all three RNA Pols
homologous to the α subunit of E. coli RNA Pol.
– Five additional subunits are common to all
– Each RNA Pol contain additional four or seven
RNA polymerase activities
1. Transcription mechanism is similar to
that of E. coli polymerase (How?)
2. Unlike the bacterial polymerase,
they require accessory factors for
The CTD of RNA pol II
1. The C-terminus of RNA Pol II contains a
stretch of seven amino acids that is
repeated 52 times in mouse and 26
times in yeast RNA pol II.
2. The heptapeptide sequence ( Seven
amino acids) is: Tyr-Ser-Pro-Thr-Ser-
3. This repeated sequence is known as
carboxyl terminal domain (CTD)
4. The CTD sequence may be
phosphorylated at the serines and
5. The CTD is unphosphorylated at
transcription initiation, and
phosphorylation occurs during
transcription elongation as the RNA Pol II
leaves the promoter.
6. Because it transcribes all eukaryotic
protein-coding genes, RNA Pol II is the
most important RNA polymerase for the
study of differential gene expression. The
CTD is an important target for differential
activation of transcription elongation.
RNA Pol II
1. located in nucleoplasm
2. catalyzing the synthesis of the
mRNA precursors for all protein-
3. RNA Pol Ⅱ-transcribed pre-
mRNAs are processed through
cap addition, poly(A) tail addition
• Eukaryotic genes, like their prokaryotic
counterparts, require promoters for transcription
initiation. Each of the three types of polymerase has
•RNA polymerase I transcribes from a single type of
promoter, present only in rRNA genes, that
encompasses the initiation site. In some genes, RNA
polymerase III responds to promoters located in the
normal, upstream position; in other genes, it
responds to promoters embedded in the genes,
downstream of the initiation site.
Promoters for RNA polymerase II can be simple or
complex. As is the case for prokaryotes, promoters are
always on the same molecule of DNA as the gene they
Most promoters contain a sequence called the
TATA box around 25-35 bp upstream from the
start site of transcription. It has a 7 bp consensus
•TATA box acts in a similar way to an E.
coli promoter –10 sequence to position
the RNA Pol II for correct transcription
Some eukaryotic genes contain an
initiator element instead of a TATA
box. The initiator element is located
around the transcription start site.
Other genes have neither a TATA box
nor an initiator element, and usually
are transcribed at very low rates.
Sequence elements which can activate
transcription from thousands of base
pairs upstream or downstream.
• Exert strong activation of transcription of a
linked gene from the correct start site.
• Activate transcription when placed in either
orientation with respect to linked genes.
• Able to function over long distances of more than 1 kb,
whether from an upstream or downstream
position relative to the start site.
• Exert preferential stimulation of the closest of
two tandem promoters.
General characteristics of
The TATA-Box-Binding Protein Initiates the Assembly
of the Active Transcription Complex
Promoters constitute only part of eukaryotic
gene expression. Transcription factors that bind to
these elements also are required. For
example, RNA polymerase II is guided to the start
site by a set of transcription factors known
collectively as TFII (TF stands for transcription
factor, and II refers to RNA polymerase II).
Individual TFII factors are called TFIIA, TFIIB, and
so on. Initiation begins with the binding of TFIID
to the TATA box
[Figure labels, partly lost: TBP; a general transcription factor binding to DNA; a general factor for all 3 RNA pol.; TFIIB; binds to RNA Pol with TFIIF]
5. phosphorylation of the polymerase CTD
Formation of a processive RNA polymerase
complex and allows the RNA Pol to leave the
Initiation of RNA synthesis
For RNAP II (protein-coding genes), initiation
requires several transcription factors that assist
binding to promoter sites. Promoter sites
recognized by RNAP II (and associated protein
factors) consist of several conserved elements that are
located upstream from the transcription start point
(the +1 base).
Elongation of RNA via RNAP II
Elongation of the RNA chain is similar to that
in prokaryotes except that a 7-methylguanosine
(7-MG) cap is added to the 5’ end when the
growing RNA chain is fairly short (20-30 bases
long).
The 7-MG cap is “attached” by an unusual 5’-5’
triphosphate linkage and serves to protect the
growing RNA from degradation by nucleases.
This “capping” is part of RNA processing in
Termination of RNA synthesis
1- Transcription by RNAP II (for protein-coding genes) is
not really terminated, in the sense that transcription
continues for 1,000 - 2,000 bases after or downstream
from the site that ultimately will become the 3’ end of
the mature transcript.
2- Termination of transcription via RNAPI and RNAP III
is via response to discrete termination signals.
(a) The “functional” transcript actually results from
endonucleolytic cleavage of the primary transcript.
(b) Cleavage occurs 10-30 bases downstream from the
conserved sequence AAUAAA.
(c) After cleavage, an enzyme [poly(A) polymerase]
adds about 200 adenine (A) nucleotides to the 3’ ends.
This is called polyadenylation or the addition of
poly-A tails.
The function of poly-A tails is to increase stability
of the transcript and to assist in transport of the
mRNA from the nucleus to the cytoplasm. This is
another part of RNA processing in eukaryotes.
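The cleavage and polyadenylation steps described above can be mimicked on a sequence string. In this sketch the function name is hypothetical, the 20-base cleavage offset is an assumed value inside the 10-30 base range given in the notes, and the tail length of 200 is the approximate figure quoted.

```python
def cleave_and_polyadenylate(pre_mrna, offset=20, tail_length=200):
    """Cut the transcript downstream of the first AAUAAA and add a poly(A) tail.

    offset is the number of bases kept after the signal (assumed 20, within
    the 10-30 range). Returns None if no AAUAAA signal is present. If the
    transcript ends before the cut site, the whole transcript is kept.
    """
    signal = pre_mrna.find("AAUAAA")
    if signal == -1:
        return None
    cut = signal + len("AAUAAA") + offset
    return pre_mrna[:cut] + "A" * tail_length

if __name__ == "__main__":
    primary = "GGG" + "AAUAAA" + "C" * 50
    processed = cleave_and_polyadenylate(primary)
    print(len(primary), len(processed))
```

Everything downstream of the cut site is discarded, which matches the observation that RNAP II transcribes well past the eventual 3’ end of the mature transcript.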
In molecular biology and genetics, splicing is a modification of
the nascent pre-mRNA taking place after or concurrently with
its transcription, in which introns are removed and exons are
joined. This is needed for the typical eukaryotic messenger
RNA before it can be used to produce a correct protein
through translation. For many eukaryotic introns, splicing is
done in a series of reactions which are catalyzed by
the spliceosome, a complex of small nuclear ribonucleoproteins
(snRNPs), but there are also self-splicing introns.
The protein-coding genes of eukaryotes typically contain
regions of DNA that serve no coding function. Noncoding
regions, called introns, interrupt the coding regions, called
exons.
When a gene is transcribed into RNA, both the coding and
noncoding regions are copied. However, eukaryotic cells
have a mechanism for removing introns from RNA. In a
process called RNA splicing, a newly transcribed RNA
molecule is cut at the intron-exon boundaries, its introns are
discarded, and its exons are joined together. RNA splicing
occurs within the nucleus before the RNA migrates to the
cytoplasm. In the cytoplasm, ribosomes translate the RNA,
now containing uninterrupted coding information, into
protein.
mRNA processing and splicing
pre-mRNA –The nuclear transcript that is processed by
modification and splicing to give an mRNA.
RNA splicing – The process of excising introns from
RNA and connecting the exons into a continuous mRNA.
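At the sequence level, splicing amounts to keeping the exon segments and dropping what lies between them. The sketch below assumes the exon coordinates are already known (0-based, half-open intervals); the function names and the toy sequence are hypothetical.

```python
def splice(pre_mrna, exons):
    """Join exon segments, given as ordered (start, end) half-open intervals."""
    return "".join(pre_mrna[start:end] for start, end in exons)

def introns_from_exons(exons):
    """Return the (start, end) intervals lying between consecutive exons."""
    return [(exons[k][1], exons[k + 1][0]) for k in range(len(exons) - 1)]

if __name__ == "__main__":
    # Toy transcript: exon 1, one intron, exon 2
    pre = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
    exons = [(0, 6), (18, 24)]
    print(splice(pre, exons))
    print(introns_from_exons(exons))
```

In the cell, the boundaries are not given in advance: they are recognized from the splice-site and branch-site sequences by the spliceosome.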
Eukaryotic mRNA is modified, processed, and transported
The 5′ End of Eukaryotic mRNA Is Capped
A 5′ cap is formed by adding a G to the terminal base of
the transcript via a 5′–5′ link.
The capping process takes place during transcription,
which may be important for transcription reinitiation.
Eukaryotic mRNA has a
methylated 5’ cap
The 5′ cap of most mRNA is monomethylated, but some
small noncoding RNAs are trimethylated.
The cap structure is recognized by protein factors to
influence mRNA stability, splicing, export, and translation.
The 3′ Ends of mRNAs Are Generated by Cleavage and Polyadenylation
• The sequence AAUAAA is a signal for cleavage to generate
a 3′ end of mRNA that is polyadenylated.
• The reaction requires a protein complex that contains a
specificity factor, an endonuclease, and poly(A) polymerase.
• The specificity factor and endonuclease cleave RNA
downstream of AAUAAA.
The 3’ end of mRNA is generated
The specificity factor and
poly(A) polymerase add
~200 A residues processively
to the 3′ end.
The poly(A) tail controls
mRNA stability and
There is a single 3’ end-processing complex
Pre-mRNA Splicing Proceeds through a Lariat
Splicing requires the 5′ and 3′ splice sites and a branch
site just upstream of the 3′ splice site.
A lariat is formed when the intron is cleaved at the 5′
splice site, and the 5′ end is joined to a 2′ position at an A
at the branch site in the intron.
The intron is released as a lariat when it is cleaved at
the 3′ splice site, and the left and right exons are then
joined together.
Splicing proceeds through a lariat
snRNAs Are Required for Splicing
small cytoplasmic RNAs (scRNA) – RNAs that are
present in the cytoplasm (and sometimes are also found
in the nucleus).
small nuclear RNA (snRNA) – One of many small RNA
species confined to the nucleus; several of them are
involved in splicing or other RNA processing reactions.
small nucleolar RNA (snoRNA) – A small nuclear RNA
that is localized in the nucleolus.
snRNA Proteins Are Required for Splicing
The five snRNPs involved in splicing are U1, U2, U5, U4,
and U6.
Together with some additional proteins, the snRNPs form
the spliceosome.
tRNA Splicing Involves Cutting and Rejoining in
RNA polymerase III terminates transcription in a poly(U)4
sequence embedded in a GC-rich sequence.
tRNA splicing occurs by successive cleavage and ligation
tRNA splicing recognized a specific
An endonuclease cleaves the tRNA precursors at both ends of
Release of the intron generates two half-tRNAs with unusual
ends that contain 5′ hydroxyl and 2′–3′ cyclic phosphate.
tRNA splicing has
separate cleavage and
Production of rRNA Requires Cleavage Events
and Involves Small RNAs
RNA polymerase I terminates transcription at an 18-base
The large and small rRNAs are released by cleavage
from a common precursor rRNA; the 5S rRNA is
Generation of mature eukaryotic rRNAs
Protein synthesis is based on the sequence of
mRNA, which is made up of nucleotides while
proteins are made up of amino acids. There must
be a specific relationship between the nucleotide
sequence and amino acid sequence. This
relationship is the so-called genetic code, which
was deciphered by Marshall Nirenberg and his
colleagues in the early 1960s. It turns out that three
nucleotides (a codon) code for one amino acid, as
shown in the following figure.
The Genetic Code
The standard genetic code. Synthesis of a peptide always starts from methionine (Met),
coded by AUG. The stop codon (UAA, UAG or UGA) signals the end of a peptide. This
table applies to mRNA sequences. For DNA, U (uracil) should be replaced by T
(thymine). In a DNA molecule, the sequence from an initiating codon (ATG) to a stop
codon (TAA, TAG or TGA) is called an open reading frame (ORF), which is likely (but not
always) to encode a protein or polypeptide.
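The codon table and the ORF idea can be combined in a short sketch: build the standard table, find the first AUG, and read codons until a stop codon. The compact 64-letter string is one common way to encode the standard code (bases ordered U, C, A, G); the function name is hypothetical, and only the first AUG in the forward direction is considered.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for the 64 codons in UCAG order; '*' marks a stop codon
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_first_orf(seq):
    """Translate from the first AUG to the first in-frame stop codon.

    Accepts mRNA or DNA (T is converted to U). Returns one-letter amino
    acids, or an empty string if there is no AUG. Translation also stops if
    the sequence ends before a stop codon is reached.
    """
    mrna = seq.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

if __name__ == "__main__":
    print(translate_first_orf("GGAUGGCUUUUUAAGG"))
```

A fuller ORF search would also scan the other two reading frames and the complementary strand.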
Order in the Genetic Code
The genetic code is not randomly assigned. If an amino
acid is coded by several codons, they often share the same
sequence in the first two positions and differ in the third
position. This arrangement is accommodated by base pairing
at the wobble position.
• Ribosomes are the sites of protein synthesis in both
prokaryotic and eukaryotic cells.
• Bacterial ribosomes are 70S; eukaryotic ribosomes are 80S.
• Both prokaryotic and eukaryotic ribosomes are composed
of two distinct subunits, each containing characteristic
proteins and rRNAs.
This lecture is to describe in detail the process of
protein synthesis, whereby a messenger RNA is
translated by the ribosome in the cytoplasmic
compartment. The parallels and differences between
eukaryote and prokaryote translation will be
considered. As well as constituting a central
component of the machinery of the cell.
• Proteins are synthesized from mRNA templates by a
process that has been highly conserved throughout
• All mRNAs are read in the 5´ to 3´ direction, and
polypeptide chains are synthesized from the amino to the
carboxy terminus.
• Each amino acid is specified by three bases (a codon ) in
the mRNA, according to a nearly universal genetic code.
• The basic mechanics of protein synthesis are also the
same in all cells.
• Translation is carried out on ribosomes, with tRNAs
serving as adaptors between the mRNA template and the
amino acids being incorporated into protein.
• Protein synthesis thus involves interactions between three
types of RNA molecules (mRNA templates, tRNAs, and
rRNAs), as well as various proteins that are required for
• Consists of approximately 70 to 80 nucleotides.
• Cloverleaf structures
• All tRNAs have the sequence CCA at their 3´ terminus,
and amino acids are covalently attached to the ribose of
the terminal adenosine.
• The mRNA template is then recognized by the anticodon
loop, located at the other end of the folded tRNA, which
binds to the appropriate codon by complementary base
The incorporation of the correctly encoded amino
acids into proteins depends on the attachment of
each amino acid to an appropriate tRNA by the
action of enzymes called aminoacyl-tRNA synthetases.
The reaction proceeds in two steps. First, the amino
acid is activated by reaction with ATP to form
an aminoacyl-AMP intermediate.
The activated amino acid
is then joined to the 3´
terminus of the tRNA.
• After being attached to tRNA, an amino acid is aligned
on the mRNA template by complementary base pairing
between the mRNA codon and the anticodon of the
tRNA.
• Codon-anticodon base pairing is somewhat less stringent
than the standard A-U and G-C base pairing. This
nonstandard base pairing (wobble) occurs at the third
codon position and underlies the redundancy of the genetic
code: for example, inosine located in the tRNA anticodon
loop can base-pair with C, U, or A in the third
position on mRNA.
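The wobble idea can be checked with a small lookup table. The inosine entry (pairs with C, U, or A) comes from the notes; the remaining entries follow the commonly cited wobble rules and are included here as an assumption. The function name is hypothetical, and inosine is only expected at the first (5') anticodon position.

```python
# Anticodon 5' base -> codon 3' bases it can pair with (wobble position)
WOBBLE = {
    "C": {"G"},
    "A": {"U"},
    "U": {"A", "G"},
    "G": {"C", "U"},
    "I": {"C", "U", "A"},   # inosine, as stated in the notes
}
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_reads(anticodon, codon):
    """True if the anticodon can read the codon (both written 5'->3').

    Anticodon position 3 pairs with codon position 1 and anticodon position 2
    with codon position 2 by standard pairing; anticodon position 1 pairs
    with codon position 3 under the wobble rules.
    """
    a, c = anticodon.upper(), codon.upper()
    return (WATSON_CRICK[a[2]] == c[0]
            and WATSON_CRICK[a[1]] == c[1]
            and c[2] in WOBBLE[a[0]])

if __name__ == "__main__":
    for codon in ("GCU", "GCC", "GCA", "GCG"):
        print(codon, anticodon_reads("IGC", codon))
```

A single tRNA with anticodon IGC can therefore serve three of the four GCN codons, which is one reason fewer tRNAs than codons are needed.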
The Organization of mRNAs and
the Initiation of Translation
Both prokaryotic and eukaryotic
mRNAs contain untranslated regions
(UTRs) at their 5´ and 3´ ends.
Eukaryotic mRNAs also contain 5´
7-methylguanosine (m7G) caps and 3´
poly-A tails.
• Prokaryotic mRNAs are frequently polycistronic: They encode
multiple proteins, each of which is translated from an
independent start site. Eukaryotic mRNAs are usually
monocistronic, encoding only a single protein.
• In both prokaryotic and eukaryotic cells, translation always
initiates with the amino acid methionine, usually encoded by
AUG. Alternative initiation codons, such as GUG, are used
occasionally in bacteria (GUG normally encodes valine).
• In most bacteria, protein synthesis is initiated with a modified
methionine residue (N-formylmethionine), whereas
unmodified methionines initiate protein synthesis in
eukaryotes (except in mitochondria and chloroplasts, whose
ribosomes resemble those of bacteria).
The signals that identify initiation codons are different in
prokaryotic and eukaryotic cells
• Initiation codons in bacterial mRNAs are preceded by a
specific sequence called a Shine-Dalgarno sequence that
aligns the mRNA on the ribosome for translation by base-
pairing with a complementary sequence near the 3´ terminus
of 16S rRNA.
• Ribosomes recognize most eukaryotic mRNAs by binding to
the 7-methylguanosine cap at their 5´ terminus .
• The ribosomes then scan downstream of the 5´ cap until they
encounter an AUG initiation codon.
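For the bacterial case, the base-pairing idea can be sketched by deriving the mRNA motif from the rRNA sequence it pairs with. The anti-Shine-Dalgarno segment CCUCCU used here is an assumed illustrative value for the region near the 3´ terminus of 16S rRNA; the search window, the minimum match length, and the function names are also assumptions.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}
ANTI_SD = "CCUCCU"  # assumed 16S rRNA segment, written 5'->3'

def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence."""
    return "".join(PAIR[b] for b in reversed(seq))

SD = reverse_complement_rna(ANTI_SD)  # the mRNA motif that pairs with ANTI_SD

def find_shine_dalgarno(mrna, start_index, window=20, min_len=4):
    """Look upstream of the start codon for the longest piece of the SD motif.

    Returns (position, motif) for the first longest match at least min_len
    bases long within `window` bases upstream of start_index, else None.
    """
    offset = max(0, start_index - window)
    upstream = mrna[offset:start_index]
    for length in range(len(SD), min_len - 1, -1):
        for k in range(len(SD) - length + 1):
            motif = SD[k:k + length]
            pos = upstream.find(motif)
            if pos != -1:
                return offset + pos, motif
    return None

if __name__ == "__main__":
    mrna = "UUUAGGAGGUUUUUUAUGGCU"
    print(SD, find_shine_dalgarno(mrna, mrna.find("AUG")))
```

Eukaryotic mRNAs would not be searched this way, since their start codon is found by scanning from the 5´ cap.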
Translation is generally divided into three stages:
initiation, elongation, and termination.
• In both prokaryotes and eukaryotes, the first step of the
initiation stage is the binding of a specific initiator
methionyl tRNA and the mRNA to the small
ribosomal subunit.
• The large ribosomal subunit then joins the complex,
forming a functional ribosome on which elongation of
the polypeptide chain proceeds.
• A number of specific non-ribosomal proteins are also
required for the various stages of the translation
• The first translation step in bacteria is the binding of three
initiation factors (IF-1, IF-2, and IF-3) to the 30S ribosomal
subunit.
• The mRNA and initiator N-formylmethionyl tRNA then join
the complex.
• A 50S ribosomal subunit then associates with the complex.
• The result is the formation of a 70S initiation complex (with
mRNA and initiator tRNA bound to the ribosome) that is
ready to begin peptide bond formation during the elongation
stage of translation.
• Initiation in eukaryotes is more complicated and requires at least ten
proteins, which are designated eIFs (eukaryotic initiation factors).
The factors bind to the 40S ribosomal subunit and associate with
the initiator methionyl tRNA.
• The mRNA is recognized by initiation factors via its 5´ cap and the
poly-A tail at its 3´ end, and is then brought to the 40S ribosomal subunit.
• The 40S ribosomal subunit, in association with the bound methionyl
tRNA and eIFs, then scans the mRNA to identify the AUG initiation
codon. When the AUG codon is reached, a 60S subunit binds to the
40S subunit to form the 80S initiation complex of eukaryotic cells.
Elongation of the polypeptide chain.
The ribosome has three sites for tRNA binding,
designated the P (peptidyl), A (aminoacyl), and E (exit)
sites. The initiator methionyl tRNA is bound at the P
site. The first step in elongation is the binding of the
next aminoacyl tRNA to the A site by pairing with the
second codon of the mRNA. The aminoacyl tRNA is
escorted to the ribosome by an elongation factor.
Once the elongation factor has left the ribosome, a peptide bond can be formed
between the initiator methionyl tRNA at the P site and the second aminoacyl
tRNA at the A site. This reaction is catalyzed by the large ribosomal subunit,
with the rRNA playing a critical role. The result is the transfer of methionine
to the aminoacyl tRNA at the A site of the ribosome, forming a peptidyl
tRNA at this position and leaving the uncharged initiator tRNA at the P site.
The next step in elongation is translocation: the ribosome moves three
nucleotides along the mRNA, positioning the next codon in an empty A site.
This step translocates the peptidyl tRNA from the A site to the P site, and the
uncharged tRNA from the P site to the E site. The ribosome is then left with
a peptidyl tRNA bound at the P site, and an empty A site. The binding of a
new aminoacyl tRNA to the A site then induces the release of the uncharged
tRNA from the E site, leaving the ribosome ready for insertion of the next
amino acid in the growing polypeptide chain.
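The movement of tRNAs through the E, P and A sites can be traced with a small Python generator. It is a bookkeeping sketch only: each tRNA is represented by the codon it reads, the function name and event labels are invented for the example, and elongation factors and GTP hydrolysis are left out.

```python
def elongation_cycle(codons):
    """Yield (event, E, P, A) snapshots of the ribosome during elongation.

    codons: list of codons starting with the initiation codon.
    None marks an empty site.
    """
    e_site, p_site, a_site = None, codons[0], None  # initiator tRNA sits in P
    for codon in codons[1:]:
        a_site = codon   # next aminoacyl tRNA pairs with the codon in the A site
        e_site = None    # its arrival releases the uncharged tRNA from E
        yield ("bind", e_site, p_site, a_site)
        # Peptide bond: the chain moves onto the A-site tRNA; occupancy is unchanged.
        yield ("peptide bond", e_site, p_site, a_site)
        # Translocation: the ribosome advances one codon (three nucleotides).
        e_site, p_site, a_site = p_site, a_site, None
        yield ("translocate", e_site, p_site, a_site)
```

Calling `list(elongation_cycle(["AUG", "GCC", "UGG"]))` gives six snapshots, and after each translocation the A site is empty and ready for the next aminoacyl tRNA.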
• Elongation of the polypeptide chain continues until a stop
codon (UAA, UAG, or UGA) is translocated into the A site of
the ribosome. Cells do not contain tRNAs with anticodons
complementary to these termination signals; instead, they have
release factors that recognize the signals and terminate
protein synthesis. The release factors bind to a termination
codon at the A site and stimulate hydrolysis of the bond
between the tRNA and the polypeptide chain at the P site,
resulting in release of the completed polypeptide from the
ribosome. The tRNA is then released, and the ribosomal
subunits and the mRNA template dissociate.
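Termination at a stop codon can be illustrated with a minimal translation routine in Python. The sketch assumes the standard genetic code, starts at the first AUG it finds, and uses single-letter amino acid codes; the names `CODON_TABLE` and `translate` are chosen for this example only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon (UAA, UAG or UGA)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiation codon, so no polypeptide
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon in the A site: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)
```

For example, `translate("GGAUGGCCUGGUAAGC")` reads AUG, GCC, UGG and then stops at UAA, returning "MAW".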
[Figure labels: 1 Codon recognition; 2 Peptide bond]
• Messenger RNAs can be translated simultaneously by
several ribosomes in both prokaryotic and eukaryotic cells.
• Thus, mRNAs are usually translated by a series of
ribosomes, spaced at intervals of about 100 to 200 nucleotides.
• The group of ribosomes bound to an mRNA molecule is
called a polyribosome, or polysome.
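As a rough worked example of this spacing, the number of ribosomes on one mRNA can be estimated in Python. The function name is hypothetical, and the estimate assumes ribosomes are spread evenly over the whole length given, which is a simplification.

```python
def ribosome_count_range(mrna_length_nt, min_spacing=100, max_spacing=200):
    """Return (fewest, most) ribosomes expected on an mRNA of the given length,
    assuming one ribosome every 100 to 200 nucleotides along its full length."""
    return (mrna_length_nt // max_spacing, mrna_length_nt // min_spacing)
```

Under these assumptions, an mRNA of 1500 nucleotides would carry roughly 7 to 15 ribosomes at a time.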
Why regulate gene expression?
It takes a lot of energy to make RNA and protein.
Therefore, only some genes are active all the time, because their
products are in constant demand.
Others are turned off most of the time and are only switched
on when their products are needed.
Gene Control in Prokaryotes
One way in which prokaryotes control gene expression
is to group functionally related genes together so that
they can be regulated together.
This grouping is called an operon (The clustered genes
are transcribed together from one promoter giving a
polycistronic messenger RNA).
Prokaryotic genes are organized into operons.
An operon can be defined as a cluster of genes that
encode the proteins necessary to perform a coordinated
function. Genes of the same operon have related
functions within the cell and are turned on
(expressed) and off (repressed) together.
The first operon discovered was the lac operon, so
named because its products are involved in lactose
metabolism.
An operon consists of:
a promoter (binding site for RNA polymerase)
a repressor binding site called an operator that overlaps the promoter
Repressor proteins, encoded by repressor genes, are
synthesized to regulate gene expression. They bind to
the operator site to block transcription by RNA polymerase.
The promoter sequences are recognized by RNA
polymerase. When RNA polymerase binds to the
promoter, transcription occurs.
The activity of RNA polymerase is also regulated by
interaction with accessory proteins called activators.
The presence of the activator removes repression and
allows transcription to occur.
Two major modes of transcriptional regulation function in bacteria
(E. coli) to control the expression of operons: induction and repression.
Both mechanisms involve repressor proteins.
Induction happens in operons that produce gene products needed
for the utilization of energy.
Repression regulates operons that produce gene products
necessary for the synthesis of small biomolecules such as amino
acids.
Induction
Sometimes called positive control, although it still acts through a repressor protein.
The effector molecule interacts with the repressor protein such that it
cannot bind to the operator.
With inducible systems, the binding of the effector molecule to the
repressor greatly reduces the affinity of the repressor for the
operator; as a result, the repressor is released and transcription
proceeds.
A classic example of an inducible (catabolite-mediated) operon is
the lac operon, responsible for obtaining energy from galactosides
such as lactose.
Repression
Also called negative control.
The effector molecule interacts with the repressor protein such
that it can bind to the operator.
With repressible systems, the binding of the effector molecule to
the repressor greatly increases the affinity of the repressor for the
operator, so the repressor binds and stops transcription.
For the trp operon, the addition of tryptophan (the effector
molecule) to the E. coli environment shuts off the system because
the repressor binds at the operator.
In addition to negative control mediated by a repressor,
expression from a repressible operon can also be attenuated by sequences
within the transcribed RNA.
A classic example of a repressible (and attenuated) operon is the
trp operon, responsible for the biosynthesis of tryptophan.
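The difference between the two repressor-based modes comes down to what the effector does to the repressor, which can be written as a small Boolean model in Python. This is a deliberately simplified sketch: the function names are hypothetical, and binding is treated as all-or-nothing rather than as a change in affinity.

```python
def repressor_binds_operator(system, effector_present):
    """Return True if the repressor sits on the operator.

    system: "inducible" (e.g. lac) or "repressible" (e.g. trp).
    """
    if system == "inducible":
        # Effector (inducer) lowers the repressor's affinity for the operator.
        return not effector_present
    if system == "repressible":
        # Effector (co-repressor) raises the repressor's affinity for the operator.
        return effector_present
    raise ValueError("system must be 'inducible' or 'repressible'")


def operon_transcribed(system, effector_present):
    """The operon is transcribed only when the operator is free."""
    return not repressor_binds_operator(system, effector_present)
```

In this model an inducible operon is on only when its effector is present, and a repressible operon is on only when its effector is absent.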
Structure of the lac Operon
The lac operon has three structural genes: Z, Y and A.
The Z gene codes for β-galactosidase, responsible for the hydrolysis
of the disaccharide lactose into its monomeric units, galactose and
glucose.
The Y gene codes for permease, which increases permeability of the
cell to galactosides.
The A gene encodes a transacetylase.
In addition to the structural genes, the lac operon also has regulatory
sequences:
Promoter: Binding site for RNA polymerase
Operator: Binding site of repressor
The control of the lac operon occurs by both positive and negative
mechanisms.
Negative control of the lac operon
What happens to the lac operon when glucose is present and lactose is
absent?
During normal growth on a glucose-based medium (lacking lactose),
the lac repressor is bound to the operator region of the lac operon,
preventing transcription.
What happens when glucose is absent and lactose is
present?
The few molecules of lac operon enzymes present will
produce a few molecules of allolactose from lactose.
Allolactose is the inducer of the lac operon.
The inducer binds to the repressor causing a
conformational shift that causes the repressor to release the
operator.
With the repressor removed, the RNA polymerase can
now bind the promoter and transcribe the operon.
Positive Control of the lac operon
What happens when both glucose and lactose levels are
high?
Since the inducer is present, the lac operon will be transcribed,
but the rate of transcription is very slow (almost repressed)
because glucose levels are high and therefore cAMP levels are
low.
The repression of the lac operon under these conditions is
termed catabolite repression and is a result of the low levels of
cAMP that result from an adequate glucose supply.
This repression is maintained until the glucose supply is
exhausted.
What happens when glucose levels start dropping in the
presence of lactose?
As the level of glucose in the medium falls, the level of cAMP
increases.
Simultaneously, the inducer (allolactose) binds to the lac
repressor (since lactose is present).
The net result is an increase in transcription from the operon.
The ability of cAMP to activate (increase) expression from the lac
operon results from an interaction of cAMP with a protein termed
CRP (for cAMP receptor protein).
The protein is also called CAP (for catabolite activator protein).
The cAMP-CAP complex binds to a region of the lac operon
just upstream of the promoter. This binding stimulates RNA
polymerase activity 20-to-50-fold.
(Repression of the lac operon is relieved in the presence
of glucose if excess cAMP is added.)
cAMP is therefore an activator of the lac operon.
This type of regulation by an activator is positive in contrast
to the negative control exerted by repressors.
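The combined negative and positive control of the lac operon can be summarized as a truth table. The Python sketch below reduces sugar levels to present/absent and expression to three illustrative labels ("off", "low", "high"); the function name and labels are assumptions for the example, not measured values.

```python
def lac_operon_expression(glucose_present, lactose_present):
    """Qualitative lac operon output for a given combination of sugars."""
    # Negative control: without lactose there is no allolactose,
    # so the repressor stays on the operator.
    repressor_bound = not lactose_present
    # Positive control: cAMP is high only when glucose is scarce,
    # so the cAMP-CAP complex binds only in the absence of glucose.
    camp_cap_bound = not glucose_present

    if repressor_bound:
        return "off"
    return "high" if camp_cap_bound else "low"


if __name__ == "__main__":
    for glucose in (True, False):
        for lactose in (True, False):
            print(glucose, lactose, lac_operon_expression(glucose, lactose))
```

Only the combination of lactose present and glucose absent gives "high" expression; lactose with glucose gives "low" (catabolite repression), and any condition without lactose gives "off".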
The trp operon contains the genes encoding the enzymes for the synthesis of tryptophan.
As with all operons, the trp operon consists of the promoter, operator and the
structural genes.
It is also subject to negative control by a repressor.
In this system, unlike the lac operon, the gene for the repressor is not adjacent
to the promoter, but rather is located in another part of the E. coli genome.
Another difference is that the operator resides entirely within the promoter.
Unlike an inducible system, the repressible operon is usually turned on.
Structure of the trp operon
The operon consists of:
Five structural genes that code for the three enzymes required
to convert chorismic acid into tryptophan.
A gene (trpL) which functions in attenuation.
Gene      Function
P/O       Promoter; the operator sequence is found within the promoter
trpL      Leader sequence, containing the attenuator (A) sequence
trpE      Anthranilate synthetase subunit
trpD      Anthranilate synthetase subunit
trpC      Indole glycerol phosphate synthetase
trpB      Tryptophan synthetase subunit
trpA      Tryptophan synthetase subunit
Negative control of trp operon
The affinity of the trp repressor for the operator region is
enhanced when it binds tryptophan. The bound repressor blocks further
transcription of the operon and, as a result, synthesis of the three
enzymes declines. Tryptophan is therefore a co-repressor: when
tryptophan is absent, the trp operon is expressed.
The rate of expression of the trp operon is graded in response to the
level of tryptophan in the cell.
Attenuation of the trp operon
Expression of the trp operon is reduced by the addition of
tryptophan. Tryptophan synthesis is also controlled by two
further elements:
1. tRNA, specifically tryptophanyl-tRNA (tRNATrp
charged with tryptophan).
2. The trpL gene.
The trpL gene is found between the operator and the trpE gene; thus
the attenuator region is composed of sequences found
within the transcribed RNA of the operon.
It is involved in controlling transcription from the operon
after RNA polymerase has initiated synthesis of the
RNA.
The leader sequence (trpL) encodes a peptide of 14 amino acids,
including two tandem tryptophan codons.
How does the leader sequence affect transcription of the trp
operon?
It contains two consecutive trp codons and therefore serves to
measure the tryptophan supply in the cell.
If the supply is good, a large amount of tRNATrp will be charged
and the leader peptide will be translated without problem.
If the supply is inadequate, only a small amount of tRNATrp will be
charged, and translation will stall at the trp codons. How?
The trpL mRNA consists of four regions and can adopt a
number of different conformations. It contains several
self-complementary regions which can form a variety
of stem-loop structures.
Different stem-loops can form depending on the level of
tryptophan in the cell: the level of charged
tRNATrp determines the position of the ribosome on the leader
mRNA as well as the rate of translation.
In the case of the trpL mRNA, when the cellular levels of
tryptophan are high, the levels of charged tryptophan tRNA are
also high.
Translation begins immediately after transcription, and the ribosome
follows right behind RNA polymerase until it is halted by the stop codon
just before region 2. The ribosome covers region 2, which prevents
stem-loop formation between regions 2 and 3 and permits formation of the
terminator stem-loop (regions 3 and 4). This causes RNA polymerase to
dissociate when it reaches the U-rich stretch at the end of
region 4, terminating transcription.
How is the terminator stem-loop formed?
Because of the quick translation of domain 1, domain 2
becomes associated with the ribosome complex.
Then domain 3 binds with domain 4, and transcription is
attenuated because of this stem loop formation.
The stem loop formed by binding of domains 3 and 4 is found
near a region rich in uracil and acts as the transcriptional
terminator.
Consequently, RNA polymerase is dislodged from the
template.
trp Operon Transcription Under Low Levels of Tryptophan
Under low cellular levels of tryptophan, the translation of the short peptide
on domain 1 is slow.
As a result, domain 2 does not become associated with the ribosome.
Rather, domain 2 of the leader mRNA associates with domain 3 of the
leader mRNA.
This stem-loop structure is the anti-terminator. Its formation prevents
formation of the terminator.
This structure permits the continued transcription of the
operon. Then the trpE-A genes are translated, and the
biosynthesis of tryptophan occurs.
Domain 4 is called the attenuator because its presence is
required to reduce (attenuate) mRNA transcription in the
presence of high levels of tryptophan.
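The attenuation decision described above can be condensed into a short Python sketch. It is a qualitative model only: the function name, the dictionary keys and the yes/no treatment of charged tRNATrp levels are assumptions made for illustration.

```python
def trp_attenuation(charged_trna_trp_abundant):
    """Return which leader stem-loop forms and what happens to transcription."""
    if charged_trna_trp_abundant:
        # High tryptophan: the ribosome translates domain 1 quickly
        # and covers domain 2, leaving domain 3 free to pair with domain 4.
        return {
            "ribosome": "covers domain 2",
            "stem_loop": "3-4 (terminator)",
            "transcription": "terminated",
        }
    # Low tryptophan: the ribosome stalls at the tandem Trp codons in
    # domain 1, so domain 2 is free to pair with domain 3.
    return {
        "ribosome": "stalled in domain 1",
        "stem_loop": "2-3 (anti-terminator)",
        "transcription": "continues",
    }
```

For example, `trp_attenuation(False)["transcription"]` evaluates to "continues", matching the low-tryptophan case in which the structural genes are transcribed.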
• Review Articles:
• Regulation of RNA polymerase I transcription in the
nucleolus - Genes and Develop., 2003.
• Roles of the heat shock transcription factors in regulation
of the heat shock response and beyond - FASEB J., 2001.
• Translational Control of Viral Gene Expression in
Eukaryotes - Microbiology and Molecular Biology
• Molecular Biology of the Cell, Bruce Alberts,
Alexander Johnson, Julian Lewis, Martin Raff, Keith
Roberts, Peter Walter, Garland Science, 2007.
• Molecular cell biology, 1986, Darnell, Lodish, and