2. INTRODUCTION :
The bacterium E. coli was discovered by the German-Austrian pediatrician Dr.
Theodor Escherich (1857–1911) in 1885. He examined the meconium of neonates
and the feces of breast-fed infants to gain insight into the
development of the intestinal “flora.”
He observed “slender short rods” 1–5 μm long and 0.3–0.4 μm wide,
which he named Bacterium coli commune. In 1919, Castellani and Chalmers
renamed the bacterium after its discoverer, and it became Escherichia
coli.
E. coli belongs to the family Enterobacteriaceae. It is a Gram-negative,
rod-shaped, non-sporulating bacterium that is nonmotile or motile by peritrichous
flagella, chemo-organotrophic, facultatively anaerobic, producing acid from glucose,
catalase positive, oxidase negative, and mesophilic.
3. LIFE CYCLE :
E. coli reproduces by cell division (binary fission). It can also transfer genetic
material from one cell to another through a sex pilus (conjugation).
In most bacteria, conjugation depends on a fertility (F) factor (a plasmid) that is present
in the donor cell and absent in the recipient cell. Cells that contain F are referred to as
F+ and cells lacking F are F⁻.
Conjugation can take place only between a cell that possesses F (F+) and a cell
that lacks F (F⁻).
Figure 1
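The pairing rule for conjugation can be written as a small truth table. The Python sketch below is only an illustration: the function name can_conjugate and the boolean encoding (True for a cell carrying F, False for a cell lacking it) are assumptions made here and are not part of the original notes.

```python
def can_conjugate(donor_has_f: bool, recipient_has_f: bool) -> bool:
    """Return True only when the donor carries F and the recipient lacks it."""
    return donor_has_f and not recipient_has_f


# Enumerate all four donor/recipient pairings (True = F+, False = F-).
pairings = {
    (donor, recipient): can_conjugate(donor, recipient)
    for donor in (True, False)
    for recipient in (True, False)
}

for (donor, recipient), allowed in pairings.items():
    donor_label = "F+" if donor else "F-"
    recipient_label = "F+" if recipient else "F-"
    print(f"donor {donor_label} x recipient {recipient_label}: {allowed}")
```

Of the four combinations, only an F+ donor paired with an F⁻ recipient satisfies the rule stated above.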
4. ADVANTAGES AS A MODEL ORGANISM :
E. coli is a preferred host for gene cloning due to the high efficiency of introduction
of DNA molecules into cells.
E. coli is a preferred host for protein production due to its rapid growth and the
ability to express proteins at very high levels.
Bacterial conjugation can be used to transfer large DNA fragments from one
bacterium to another.
E. coli is a preferred host for the study of phage biology due to the detailed
knowledge of its nucleic acid and protein biosynthetic pathways.
5. GENOME PROJECT :
A team of scientists headed by Frederick R. Blattner of the E. coli Genome
Project in the Laboratory of Genetics at the University of Wisconsin-Madison
determined the complete genome sequence of E. coli (laboratory strain K-12,
derivative MG1655). The sequence was reported in the journal Science in
September 1997.
E. coli has a single circular chromosome; some strains also carry a circular
plasmid. The published genome has 4,639,221 base pairs. The E.
coli chromosome is represented by 4,401 genes encoding 116 RNAs and
4,616 protein modules.
Protein-coding genes account for 87.8% of the genome, 0.8% encodes stable
RNAs, and 0.7% consists of non-coding repeats.
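The three percentages above do not add up to 100%, so it can help to turn them into approximate base-pair counts and see what is left over. The sketch below uses only the figures quoted in the text; the variable names are illustrative, and the "remainder" category is simply whatever the text does not itemize.

```python
GENOME_BP = 4_639_221  # published length of the MG1655 genome

# Share of the genome in each category, in percent, as quoted in the text.
fractions_pct = {
    "protein-coding genes": 87.8,
    "stable RNA genes": 0.8,
    "non-coding repeats": 0.7,
}

accounted_pct = sum(fractions_pct.values())
remaining_pct = 100.0 - accounted_pct  # not itemized in the text

# Convert each percentage into an approximate number of base pairs.
approx_bp = {
    name: round(GENOME_BP * pct / 100) for name, pct in fractions_pct.items()
}
approx_bp["remainder"] = round(GENOME_BP * remaining_pct / 100)

for name, bp in approx_bp.items():
    print(f"{name:>22}: ~{bp:,} bp")
print(f"remainder share: {remaining_pct:.1f}%")
```

On these numbers roughly 4.07 Mb of the genome is protein-coding, and about 10.7% (close to 0.5 Mb) falls outside the three listed categories.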
6. SEQUENCING STRATEGY :
Sequencing was carried out in sections, with steadily improving technical
approaches. The M13 Janus shotgun strategy proved to be the most efficient
strategy for data collection and closure, followed by limited primer walking.
The first 1.92 Mb, positions 2,686,777 to 4,639,221 [in base pairs (bp)], was
sequenced from an overlapping set of 15- to 20-kb MG1655 lambda clones by
means of radioactive chemistry. Subsequently, the team switched to dye-terminator
fluorescence sequencing (Applied Biosystems).
For the next segment (positions 2,475,719 to 2,690,160), the team obtained
DNA for sequencing by the pop-out plasmid approach, in which
non-overlapping segments were excised directly from the chromosome in
circular form, gel-purified, and shotgunned for sequencing.
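The coordinates quoted for the two segments can be checked with simple interval arithmetic. The sketch below assumes that positions are 1-based and that both endpoints are included; that convention is an assumption made here, as are the names span_bp, first_segment and second_segment.

```python
def span_bp(start: int, end: int) -> int:
    """Length of a closed interval of 1-based genome positions."""
    return end - start + 1


first_segment = (2_686_777, 4_639_221)   # sequenced from lambda clones
second_segment = (2_475_719, 2_690_160)  # sequenced via pop-out plasmids

first_len = span_bp(*first_segment)
second_len = span_bp(*second_segment)

# Positions covered by both segments (zero if they do not overlap).
overlap_start = max(first_segment[0], second_segment[0])
overlap_end = min(first_segment[1], second_segment[1])
overlap_len = max(0, span_bp(overlap_start, overlap_end))

print(f"first segment:  {first_len:,} bp")
print(f"second segment: {second_len:,} bp")
print(f"shared region:  {overlap_len:,} bp")
```

Under this convention the second segment spans 214,442 bp and shares 3,384 bp with the first. The first span comes to 1,952,445 bp, slightly more than the 1.92 Mb quoted, so the quoted figure may have been counted on a different basis.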
7. GENOME ANNOTATION :
The annotation effort aimed to:
(i) identify genes, operons, regulatory sites, mobile genetic elements, and
repetitive sequences in the genome.
(ii) assign or suggest functions where possible.
(iii) relate the E. coli sequence to other organisms.
Currently, the annotation includes 4288 actual and proposed protein-coding
genes, and one-third of these genes are well characterized.
The average distance between E. coli genes is 118 bp. The 70 intergenic
regions larger than 600 bp were reevaluated for the presence of ORFs
(Geneplot, DNASTAR Inc.) and searched against the entire GenBank
database for DNA sequence (BLASTN) and protein-coding (BLASTX)
features.
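Picking out long intergenic regions from a list of gene coordinates can be sketched in a few lines. This is not the Geneplot or BLAST procedure used by the annotators, only an illustration of the filtering step: the gene coordinates are invented, the helper names are hypothetical, and the chromosome is treated as linear, so the gap that wraps around the origin of a circular genome is ignored.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive


def interval_length(interval: Interval) -> int:
    """Number of positions in a closed interval."""
    start, end = interval
    return end - start + 1


def intergenic_gaps(genes: List[Interval]) -> List[Interval]:
    """Return the stretches of sequence lying between consecutive genes."""
    gaps: List[Interval] = []
    ordered = sorted(genes)
    if not ordered:
        return gaps
    covered_to = ordered[0][1]  # last position covered by any gene so far
    for start, end in ordered[1:]:
        if start > covered_to + 1:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)  # handles overlapping genes
    return gaps


# Hypothetical coordinates, invented for illustration (not real MG1655 genes).
toy_genes = [(100, 1000), (1101, 2000), (2801, 3500), (3490, 4200), (4300, 5000)]

gaps = intergenic_gaps(toy_genes)
mean_gap = sum(interval_length(g) for g in gaps) / len(gaps)
long_gaps = [g for g in gaps if interval_length(g) > 600]

print("gaps:", gaps)
print("mean gap length:", mean_gap)
print("gaps longer than 600 bp:", long_gaps)
```

With real annotation coordinates, the same filter would return the regions worth re-examining for missed ORFs, and the mean gap length would correspond to the average intergenic distance.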
8. OVERALL STRUCTURE OF THE E. COLI GENOME :
The origin and terminus of replication are shown as green lines.
The distribution of genes is depicted on two outer rings: the orange boxes are
genes located on the presented strand, and the yellow boxes are genes on the
opposite strand.
The next circle illustrates the positions of REP sequences around the genome as
radial tick marks.
The central orange sunburst is a histogram of inverse CAI (1 – CAI), in which
long yellow rays represent clusters of low (<0.25) CAI.
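CAI here stands for the codon adaptation index. It is commonly computed as the geometric mean of the "relative adaptiveness" weights of the codons in a gene, where each weight is the usage of a codon relative to the most-used synonymous codon in a reference set of highly expressed genes. The sketch below assumes that standard definition applies to the figure. The weights are invented toy values, not measured E. coli codon usage, and the function name is hypothetical.

```python
import math
from typing import Dict, Sequence


def codon_adaptation_index(codons: Sequence[str], weights: Dict[str, float]) -> float:
    """Geometric mean of the relative adaptiveness weights of a gene's codons."""
    # Codons without a weight (for example, single-codon amino acids) are skipped.
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon with a known weight")
    return math.exp(sum(logs) / len(logs))


# Hypothetical weights (1.0 = most-used synonymous codon); not real E. coli data.
toy_weights = {
    "CTG": 1.0, "CTA": 0.04,  # leucine
    "AAA": 1.0, "AAG": 0.25,  # lysine
    "GAA": 1.0, "GAG": 0.5,   # glutamate
}

preferred_gene = ["CTG", "AAA", "GAA"]   # uses only the favored codons
rare_codon_gene = ["CTA", "AAG", "GAG"]  # uses only the disfavored codons

for name, gene in [("preferred", preferred_gene), ("rare", rare_codon_gene)]:
    cai = codon_adaptation_index(gene, toy_weights)
    print(f"{name}: CAI = {cai:.3f}, 1 - CAI = {1 - cai:.3f}, low = {cai < 0.25}")
```

A gene built from favored codons scores a CAI of 1, so its inverse CAI is 0 and it would contribute a short ray. The toy gene built from rare codons falls below the 0.25 threshold and would appear as a long ray.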
10. DIFFERENCE:
Genome size
E. coli and B. subtilis, two of the most intensively studied bacteria, have the
largest genomes and the largest numbers of genes among the first bacterial
genomes to be sequenced.
The genome of the yeast Saccharomyces cerevisiae is only 2.6 times as large as
that of E. coli. The genome of humans is almost 700 times larger than that of E.
coli.
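The size ratios above can be converted into approximate genome sizes. The sketch below uses only the E. coli genome length and the two ratios quoted in the text; because the human ratio is given as "almost 700", the human figure is an upper estimate.

```python
ECOLI_BP = 4_639_221  # published MG1655 genome length

YEAST_RATIO = 2.6   # yeast genome relative to E. coli, from the text
HUMAN_RATIO = 700   # "almost 700 times", so treat the result as an upper estimate

yeast_bp = YEAST_RATIO * ECOLI_BP
human_bp = HUMAN_RATIO * ECOLI_BP

print(f"E. coli: {ECOLI_BP / 1e6:.2f} Mb")
print(f"yeast:   about {yeast_bp / 1e6:.1f} Mb")
print(f"human:   up to about {human_bp / 1e9:.1f} Gb")
```

This puts the yeast genome at about 12 Mb and the human genome at a little over 3 Gb.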
Gene size and number
Very little DNA separates most bacterial genes; in E. coli there is an average of
only 118 bp between genes. Because gene size varies little, the number of
genes varies over as wide a range as the genome size, from 467 genes in
Mycoplasma genitalium to 4289 in E. coli.
Saccharomyces cerevisiae has one gene every 1900 bp on average, which could
reflect both an increase in gene size and a somewhat greater distance
between genes.
Both bacteria and yeast show a much denser packing of genes than is seen in
more complex genomes.
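The gene densities can be compared directly from the numbers given in these notes. In the sketch below, the implied average gene length is a rough estimate obtained by subtracting the mean intergenic distance from the mean spacing; it assumes genes do not overlap and is not a figure taken from the source.

```python
ECOLI_BP = 4_639_221     # published MG1655 genome length
ECOLI_GENES = 4289       # gene count quoted in the text
ECOLI_INTERGENIC = 118   # mean distance between genes, in bp
YEAST_BP_PER_GENE = 1900 # one yeast gene per 1900 bp, from the text

# Average stretch of genome per gene (gene plus its share of spacer DNA).
ecoli_bp_per_gene = ECOLI_BP / ECOLI_GENES

# Rough estimate only: assumes non-overlapping genes separated by the mean gap.
ecoli_mean_gene_len = ecoli_bp_per_gene - ECOLI_INTERGENIC

density_ratio = YEAST_BP_PER_GENE / ecoli_bp_per_gene

print(f"E. coli: one gene per ~{ecoli_bp_per_gene:.0f} bp")
print(f"implied mean E. coli gene length: ~{ecoli_mean_gene_len:.0f} bp")
print(f"yeast uses ~{density_ratio:.2f}x as much DNA per gene")
```

On these figures E. coli has about one gene per 1.1 kb, so yeast spends roughly 1.8 times as much DNA per gene.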