Major Features of Escherichia coli
(E. coli) genome
VANSHIKA VARSHNEY
18081564015
B.Sc. Microbiology (H)
General Introduction
General Introduction :
● Gram negative and have cell membrane of LPS.
● Short rods.
● Facultative anaerobes.
● Can ferment Lactose (lac+).
● Motile.
● Capsule layer is present.
● Cultured in MacConkey agar.
● E. coli (Escherichia coli), is a type of bacteria that
normally lives in your intestines. It’s also found in the
gut of some animals.
● Shiga toxin-producing E. coli (STEC) is a bacterium
that can cause severe foodborne disease.
● Most types of E. coli are harmless and even help keep
your digestive tract healthy. But some strains can
cause diarrhea if you eat contaminated food or drink
fouled water.
Genetic “Vital Statistics”
● Genome size : 4.6 mb
● 4285 Genes
● 122 structural RNA protein
● No intron.
● Operon : Polycistronic Transcription units
● Haploid circular genome
● Usually asexual reproduction
● Transcription and translation take place in the same compartment.
Life Cycle of E.coli :
● E.coli reproduces by two means: Cell division and the Transfer of genetic
material through a sex pilus(conjugation).
● In most bacteria, conjugation depends on a fertility (F) factor (a plasmid) that
is the donor cell and absent in the recipient cell. Cells that contain F are
referred to a F+, and cells are lacking F are F-.
● The F factor contains an origin of replication and a number of genes required
for conjugation.
● For example, some of these genes encode sex pilli, slender extensions of the
cell membrane.
● If the entire F Factor is transferred to the recipient F cell, that cell becomes an
F cell.
E.coli GENOME
● The first completed DNA sequence of an E.coli genome (laboratory strain K-
12 derivative MG1655) was published in September 1997 by Blattner.
● The next E. coli genomes to be sequenced were pathogenic EHEC O157:H7
strains Sakai and EDL933. When both genomes were sequenced and
compared EHEC O157:H7 is about 859,000 bp larger than E.coli K-12.
● The next E. coli genome to be sequenced was that of strain CFT073.
Unexpectedly, only 10% of the CFT073-specific genes relative to MG 1655
were also shared by the O157:H7 genome.
Cont…..
● When the genome was sequenced as apart of Human Genome Project
(HGP), it was found to be a circular DNA molecule 4.6 million base pairs in
length, containing 4288 annotated protein coding genes (organized into 2584
operons), seven ribosomal RNA (rRNA) operons, and 86 transfer RNA (tRNA)
genes.
● The coding density was found to be very high, with a mean distance between
genes of only 118 bp.
● Greater gene density is due to a combination of factors: 1) bacterial genes
have no introns, 2) neighboring genes are very close together throughout the
genome; i.e., there are hardly any big regions of non-coding DNA between
genes.
Cont….
● The genome was observed to contain a significant number of transposable
genetic elements, repeat elements, cryptic prophages, and bacteriophage
remnants.
Cont….
● There are online resources that centralize information about E.coli such as
EcoCyc, RegulonDB and EcoGene.
Phylogenetic
relationships of
E. coli strains.
Plasmid and the E.coli resolution……..
Relevant E.coli resources and databases
● Plasmids are the most important tools not only for the manipulation of E. coli but also the foundation
for the genetic engineering of many organisms, cloning and sequencing, generation of mutants, and
many applications in molecular biology.
● In nature, many bacteria contain self-replicating DNA molecules that can be harnessed for molecular
biology applications. E. coli plasmids were the first ones to be extensively modified for such purposes.
● Some exceptional features of plasmids are they can be used in systems where replication origins (check
compatibility first) and selection markers can coexist in the same cell, which can be extremely useful
for the coexpression of four different proteins in the same cell.
● Linear plasmids are common in bacteria (particularly in actinobacteria), but thus far, only N15 plasmid
prophage has been isolated from E. coli [46] and is an impediment for generating knockouts using
linear DNA.
● plasmids play an important role, from generating site-specific integration of chromosomal fusions to
large plasmids that can hold complex constructs for further modification.
● E. coli Genetic Stock Center is a good source of strains and plasmids for different applications.
Common elements in
Plasmid used in
molecular biology
applications……….
E.coli based biosensors…...
● E. coli-based biosensors using plasmid or chromosomal constructs are useful for the detection of environmental
traits or hazards or measuring cellular processes as any standard reporter system .
● The basic design considers the following: copy number, reporter proteins, detection methods, and control
elements.
● generation and detection of this kind of biosensors are cost-effective and easy to generate and reasonably
sensitive.
● E. coli-based biosensors have been successful for detecting different traits: oxidants, DNA damaging
compounds, membrane-damaging compounds, protein-damaging compounds, aromatic compounds,
xenobiotics, antibiotic panels using reporter strains without antibiotic selection, etc.
● Basic principles for biosensor design. First, the proper design and the experimental creation of reporter strains.
With current knowledge, either plasmid creation or whole genome engineering can lead to the creation of a
reporter strain. Second, incorporation of such elements into the host cell. Third, sensing module and response
measurement. With current reporter proteins and detection technology, it is relatively easy to generate
biosensors that can be used in different applications with high sensitivity and selectivity.
GENE MAPPING
Mapping of E.coli chromosome…...
● This technique was utilized to map all genes of E.coli.
● There are 3 approaches for E.coli gene mapping:
1) Long- versus Short-Range Mapping: The total length of the E. coli chromosome is defined
as 100 min (»4,655 kb), referring to the approximate length of time it takes for an Hfr donor
cell to transfer the entire linkage group to a recipient cell by conjugation at 37°C.
2) Conjugational Mapping: The frequency of transfer of any given marker from any particular
Hfr strain to recipient cells is strongly dependent on the distance of the marker from the Hfr
origin of transfer. The availability of various Hfr strains with different points of origin usually
enables one to find the general map region for a mutation within 3 to 10 min by this method.
3) Rapid Mapping: the vast difference in recombination frequency for early versus late Hfr
markers allows a rapid localization of new markers between Hfr points of origin based on
simple comparative crosses using various Hfrs. In cases in which an Hfr marker can be
directly selected after transfer and recombination into a recipient strain, the early-marker
versus late-marker approach can be greatly simplified by replica plating. This rapid mapping
is especially useful in mapping large numbers of mutations, including temperature-sensitive
Mud(lac) fusions.
E.coli K-12 MG1655
● The predominant facultative anaerobe resident in the human colon is the
Gram negative motile bacillus Escherichia coli. E.coli colonies the infant gut
within hours of birth.
● However, E.coli has a dichotomous existence; while the majority of E.coli
strains exist within the mammalian intestinal tract as harmless commensals,
paradoxically several evolutionary lineages have deviated from this harmless
lifestyle to become pathogens.
● Those strains causing intestinal infections can be divided into six separate
and major categories of pathogens viz enteroaggregative E.coli(EAEC),
enteroinvasive (EIEC), enterohemorrhagic E.coli (EHEC) and differs adhering
E.coli(DAEC).
Entrez records of E.coli K-12 MG1655
The
EcoCyc
database
includes a
cellular
overview
of E. coli
K-12
MG1655.
The Integrated Microbial Genomes
(IMG) website offers data on
bacterial genomes
such as E. coli K-12 MG1655. IMG
also offers extensive tools for
metagenomics analyses.
Nucleotide Composition:
● We analyze the GC content of an E. coli strain using an R package, seqinr.
We can divide our task into three parts: (1) obtaining a genome sequence in
the FASTA format; (2) determining the overall GC content; and (3) measuring
the GC content in windows across the genome.
● Now, search for a standard strain of E. coli by entering the search term
“escherichia coli K12” to the home page of NCBI. From there we find
accession NC_000913.3 corresponding to Escherichia coli K-12 MG1655.
● The corresponding NCBI Nucleotide entry includes the option Send >
Complete Record > Destination: File > Format: FASTA > Create File. This file
is 4.7 MB; save it (or copy it) to your R working directory.
Cont….
● Next open the R program. You can install seqinr using the “Packages” pull-
down menu from the RGui console, or RStudio offers a convenient option to
select packages for installation.
● We have created the object ecoli which includes the sequence. The
command str(ecoli) will show us its of length 4.6 million. Next we place the
sequence in a vector called ecoliseq. We can see this with the length
command, and can also obtain the GC content:
Cont….
● This strain of E. coli therefore has a GC content of about 50.8%. We may also
want to evaluate the GC content across windows of fixed size. We begin with
a size of 20,000 bp.
GC content of E. coli strain K-12. The sequence of an E. coli
strain was downloaded from NCBI, input to the R program
seqinr, a for loop was used to calculate GC content in
windows of 20,000 base pairs, and the data were plotted.
Obtaining the E.coli genome
sequence in FASTA from
from NCBI
Obtaining the E.coli genome sequence in
FASTA from NCBI
References ----
● Chapter 17 ‘Completed Genomes: Bacteria & Archaea’ Page no. - 793 by
Pevsner “Bioinformatics and Functional Genomics” 3rd edition.
● https://www.intechopen.com/books/-i-escherichia-coli-i-recent-advances-on-
physiology-pathogenesis-and-biotechnological-applications/-i-escherichia-
coli-i-as-a-model-organism-and-its-application-in-biotechnology
● http://www.asmscience.org/files/Chapter_137_Genetic_Mapping.pdf
● https://www.slideshare.net/mpattani/genetic-analysis-and-mapping-in-
bacteria-and-bacteriophages
Major features of e.coli

Major features of e.coli

  • 1.
    Major Features ofEscherichia coli (E. coli) genome VANSHIKA VARSHNEY 18081564015 B.Sc. Microbiology (H)
  • 2.
    General Introduction General Introduction: ● Gram negative and have cell membrane of LPS. ● Short rods. ● Facultative anaerobes. ● Can ferment Lactose (lac+). ● Motile. ● Capsule layer is present. ● Cultured in MacConkey agar.
  • 3.
    ● E. coli(Escherichia coli), is a type of bacteria that normally lives in your intestines. It’s also found in the gut of some animals. ● Shiga toxin-producing E. coli (STEC) is a bacterium that can cause severe foodborne disease. ● Most types of E. coli are harmless and even help keep your digestive tract healthy. But some strains can cause diarrhea if you eat contaminated food or drink fouled water.
  • 4.
    Genetic “Vital Statistics” ●Genome size : 4.6 mb ● 4285 Genes ● 122 structural RNA protein ● No intron. ● Operon : Polycistronic Transcription units ● Haploid circular genome ● Usually asexual reproduction ● Transcription and translation take place in the same compartment.
  • 6.
    Life Cycle ofE.coli : ● E.coli reproduces by two means: Cell division and the Transfer of genetic material through a sex pilus(conjugation). ● In most bacteria, conjugation depends on a fertility (F) factor (a plasmid) that is the donor cell and absent in the recipient cell. Cells that contain F are referred to a F+, and cells are lacking F are F-. ● The F factor contains an origin of replication and a number of genes required for conjugation. ● For example, some of these genes encode sex pilli, slender extensions of the cell membrane. ● If the entire F Factor is transferred to the recipient F cell, that cell becomes an F cell.
  • 8.
    E.coli GENOME ● Thefirst completed DNA sequence of an E.coli genome (laboratory strain K- 12 derivative MG1655) was published in September 1997 by Blattner. ● The next E. coli genomes to be sequenced were pathogenic EHEC O157:H7 strains Sakai and EDL933. When both genomes were sequenced and compared EHEC O157:H7 is about 859,000 bp larger than E.coli K-12. ● The next E. coli genome to be sequenced was that of strain CFT073. Unexpectedly, only 10% of the CFT073-specific genes relative to MG 1655 were also shared by the O157:H7 genome.
  • 9.
    Cont….. ● When thegenome was sequenced as apart of Human Genome Project (HGP), it was found to be a circular DNA molecule 4.6 million base pairs in length, containing 4288 annotated protein coding genes (organized into 2584 operons), seven ribosomal RNA (rRNA) operons, and 86 transfer RNA (tRNA) genes. ● The coding density was found to be very high, with a mean distance between genes of only 118 bp. ● Greater gene density is due to a combination of factors: 1) bacterial genes have no introns, 2) neighboring genes are very close together throughout the genome; i.e., there are hardly any big regions of non-coding DNA between genes.
  • 10.
    Cont…. ● The genomewas observed to contain a significant number of transposable genetic elements, repeat elements, cryptic prophages, and bacteriophage remnants.
  • 11.
    Cont…. ● There areonline resources that centralize information about E.coli such as EcoCyc, RegulonDB and EcoGene.
  • 12.
  • 13.
    Plasmid and theE.coli resolution……..
  • 14.
  • 15.
    ● Plasmids arethe most important tools not only for the manipulation of E. coli but also the foundation for the genetic engineering of many organisms, cloning and sequencing, generation of mutants, and many applications in molecular biology. ● In nature, many bacteria contain self-replicating DNA molecules that can be harnessed for molecular biology applications. E. coli plasmids were the first ones to be extensively modified for such purposes. ● Some exceptional features of plasmids are they can be used in systems where replication origins (check compatibility first) and selection markers can coexist in the same cell, which can be extremely useful for the coexpression of four different proteins in the same cell. ● Linear plasmids are common in bacteria (particularly in actinobacteria), but thus far, only N15 plasmid prophage has been isolated from E. coli [46] and is an impediment for generating knockouts using linear DNA. ● plasmids play an important role, from generating site-specific integration of chromosomal fusions to large plasmids that can hold complex constructs for further modification. ● E. coli Genetic Stock Center is a good source of strains and plasmids for different applications.
  • 16.
    Common elements in Plasmidused in molecular biology applications……….
  • 17.
    E.coli based biosensors…... ●E. coli-based biosensors using plasmid or chromosomal constructs are useful for the detection of environmental traits or hazards or measuring cellular processes as any standard reporter system . ● The basic design considers the following: copy number, reporter proteins, detection methods, and control elements. ● generation and detection of this kind of biosensors are cost-effective and easy to generate and reasonably sensitive. ● E. coli-based biosensors have been successful for detecting different traits: oxidants, DNA damaging compounds, membrane-damaging compounds, protein-damaging compounds, aromatic compounds, xenobiotics, antibiotic panels using reporter strains without antibiotic selection, etc. ● Basic principles for biosensor design. First, the proper design and the experimental creation of reporter strains. With current knowledge, either plasmid creation or whole genome engineering can lead to the creation of a reporter strain. Second, incorporation of such elements into the host cell. Third, sensing module and response measurement. With current reporter proteins and detection technology, it is relatively easy to generate biosensors that can be used in different applications with high sensitivity and selectivity.
  • 19.
  • 20.
    Mapping of E.colichromosome…... ● This technique was utilized to map all genes of E.coli. ● There are 3 approaches for E.coli gene mapping: 1) Long- versus Short-Range Mapping: The total length of the E. coli chromosome is defined as 100 min (»4,655 kb), referring to the approximate length of time it takes for an Hfr donor cell to transfer the entire linkage group to a recipient cell by conjugation at 37°C. 2) Conjugational Mapping: The frequency of transfer of any given marker from any particular Hfr strain to recipient cells is strongly dependent on the distance of the marker from the Hfr origin of transfer. The availability of various Hfr strains with different points of origin usually enables one to find the general map region for a mutation within 3 to 10 min by this method. 3) Rapid Mapping: the vast difference in recombination frequency for early versus late Hfr markers allows a rapid localization of new markers between Hfr points of origin based on simple comparative crosses using various Hfrs. In cases in which an Hfr marker can be directly selected after transfer and recombination into a recipient strain, the early-marker versus late-marker approach can be greatly simplified by replica plating. This rapid mapping is especially useful in mapping large numbers of mutations, including temperature-sensitive Mud(lac) fusions.
  • 22.
    E.coli K-12 MG1655 ●The predominant facultative anaerobe resident in the human colon is the Gram negative motile bacillus Escherichia coli. E.coli colonies the infant gut within hours of birth. ● However, E.coli has a dichotomous existence; while the majority of E.coli strains exist within the mammalian intestinal tract as harmless commensals, paradoxically several evolutionary lineages have deviated from this harmless lifestyle to become pathogens. ● Those strains causing intestinal infections can be divided into six separate and major categories of pathogens viz enteroaggregative E.coli(EAEC), enteroinvasive (EIEC), enterohemorrhagic E.coli (EHEC) and differs adhering E.coli(DAEC).
  • 23.
    Entrez records ofE.coli K-12 MG1655
  • 24.
  • 26.
    The Integrated MicrobialGenomes (IMG) website offers data on bacterial genomes such as E. coli K-12 MG1655. IMG also offers extensive tools for metagenomics analyses.
  • 27.
    Nucleotide Composition: ● Weanalyze the GC content of an E. coli strain using an R package, seqinr. We can divide our task into three parts: (1) obtaining a genome sequence in the FASTA format; (2) determining the overall GC content; and (3) measuring the GC content in windows across the genome. ● Now, search for a standard strain of E. coli by entering the search term “escherichia coli K12” to the home page of NCBI. From there we find accession NC_000913.3 corresponding to Escherichia coli K-12 MG1655. ● The corresponding NCBI Nucleotide entry includes the option Send > Complete Record > Destination: File > Format: FASTA > Create File. This file is 4.7 MB; save it (or copy it) to your R working directory.
  • 28.
    Cont…. ● Next openthe R program. You can install seqinr using the “Packages” pull- down menu from the RGui console, or RStudio offers a convenient option to select packages for installation.
  • 29.
    ● We havecreated the object ecoli which includes the sequence. The command str(ecoli) will show us its of length 4.6 million. Next we place the sequence in a vector called ecoliseq. We can see this with the length command, and can also obtain the GC content:
  • 30.
    Cont…. ● This strainof E. coli therefore has a GC content of about 50.8%. We may also want to evaluate the GC content across windows of fixed size. We begin with a size of 20,000 bp.
  • 31.
    GC content ofE. coli strain K-12. The sequence of an E. coli strain was downloaded from NCBI, input to the R program seqinr, a for loop was used to calculate GC content in windows of 20,000 base pairs, and the data were plotted.
  • 33.
    Obtaining the E.coligenome sequence in FASTA from from NCBI Obtaining the E.coli genome sequence in FASTA from NCBI
  • 35.
    References ---- ● Chapter17 ‘Completed Genomes: Bacteria & Archaea’ Page no. - 793 by Pevsner “Bioinformatics and Functional Genomics” 3rd edition. ● https://www.intechopen.com/books/-i-escherichia-coli-i-recent-advances-on- physiology-pathogenesis-and-biotechnological-applications/-i-escherichia- coli-i-as-a-model-organism-and-its-application-in-biotechnology ● http://www.asmscience.org/files/Chapter_137_Genetic_Mapping.pdf ● https://www.slideshare.net/mpattani/genetic-analysis-and-mapping-in- bacteria-and-bacteriophages