Bacterial Identification by 16S rRNA Sequencing
• Bacteria are the most ubiquitous life forms on planet Earth; a single gram of soil is said
to contain 40 million bacterial cells. Bacteria are estimated to form a biomass that exceeds
that of plants and animals.
• Because of their abundance, most of the bacterial species living on Earth have not
yet been identified.
• Several biotechnological techniques have been developed for identifying
bacterial species at the sequence level.
• DNA hybridization and 16S rRNA gene sequencing are the most common techniques
used for this purpose.
Overview of the technique
Isolation of the bacteria
Bacterial DNA extraction
Amplification of the 16S rRNA gene
Sequencing of a portion of the 16S rRNA gene
Comparison of the sequenced gene with GenBank to obtain a match
Why use 16S rRNA gene sequencing
Foremost is the fact that the 16S rRNA gene seems to behave as a molecular chronometer.
The 16S rRNA gene sequence shows a high degree of conservation among
species. This is assumed to result from the importance of the 16S rRNA as a
critical component of cell function.
Few other genes are as highly conserved as the 16S rRNA gene. Although
the absolute rate of change in the 16S rRNA gene sequence is not known, it
does mark the evolutionary distance and relatedness of organisms.
The 16S rRNA gene is present in almost all bacteria, often existing as a multigene
family, or operons.
The 16S rRNA gene (about 1,500 bp) is large enough for informatics purposes.
16S rRNA gene sequencing studies are less cumbersome than the
manipulations involved in DNA-DNA hybridization investigations.
The 16S rRNA genes of various bacteria have been extensively studied, and their
sequences are available in public databases.
Isolation of the Bacteria from a Natural Sample
• A loopful of the natural sample presumed to contain bacteria is
taken and inoculated in a saline solution.
• The inoculated saline solution is serially diluted to reduce the
concentration of microorganisms.
• After a sufficient dilution has been reached, a loopful of
inoculum is plated on a sterile nutrient agar plate.
• The plates are incubated at room temperature for 24 hours.
• The colony characteristics are observed to identify the colony
of the desired microorganism.
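The dilution-and-plating steps above are what make a colony count usable as an estimate of the original cell density. The sketch below shows the usual arithmetic; the function names, the tenfold dilution series, and the example numbers are illustrative assumptions, not values from the slides.

```python
def total_fold_dilution(steps: int, fold_per_step: int = 10) -> int:
    """Overall dilution after a series of equal steps (e.g. 4 tenfold steps = 10,000-fold)."""
    if steps < 0 or fold_per_step < 1:
        raise ValueError("steps must be >= 0 and fold_per_step >= 1")
    return fold_per_step ** steps


def cfu_per_ml(colonies: int, fold_dilution: int, volume_plated_ml: float) -> float:
    """Estimate colony-forming units per mL of the undiluted suspension."""
    if volume_plated_ml <= 0:
        raise ValueError("volume_plated_ml must be positive")
    return colonies * fold_dilution / volume_plated_ml


if __name__ == "__main__":
    # Hypothetical plate: 42 colonies after four tenfold dilutions, 0.1 mL plated.
    fold = total_fold_dilution(steps=4)
    print(f"Estimated density: {cfu_per_ml(42, fold, 0.1):.3g} CFU/mL")
```

Counting colonies only gives a reliable estimate when the plate is neither overcrowded nor nearly empty, which is why several dilutions are usually plated.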
• Different types of media can be used depending on the type of
microorganism we wish to isolate.
• Enrichment media: media containing substances that stimulate the
growth of the desired bacteria and suppress the growth of unwanted ones
(e.g., tetrathionate broth, which inhibits coliforms but favours enteric bacilli such as Salmonella).
• Selective media: media that allow only a particular type of bacteria
to grow (e.g., MacConkey agar for growing Gram-negative bacteria).
• Differential media: media that distinguish one microorganism type
from another growing on the same plate
(e.g., mannitol salt agar, differential for mannitol fermentation).
• Anaerobic media: media used for growing anaerobic organisms
(e.g., Robertson's cooked meat medium).
• After the colony of the desired
bacteria has been identified, the
colony is picked from the agar
plate and stained for microscopic
examination.
• The bacteria can be
morphologically identified by simple
staining.
• Gram staining is done to determine whether
the organism is Gram-positive or Gram-negative.
• Ziehl-Neelsen stain is used to
visualize acid-fast bacteria, including
Mycobacterium tuberculosis.
• Fluorescent antibody test is used to
identify bacteria by the antigens
present on their surface
Extraction of DNA from the Bacteria
• A variety of extraction methods can be
used to obtain DNA for
16S rRNA sequencing. The choice of
extraction method depends on the source of
the DNA sample and the degree of purity
required.
• For soil bacteria, the most common
methods used are bead beating,
sonication, enzymatic lysis, etc.
• For bacteria derived from other sources,
mainstream DNA isolation methods
such as the phenol-chloroform method and the CTAB
method are also used.
• Modern laboratories depend on readily
available kits to achieve quick, efficient
extraction of highly pure DNA.
Amplification of the 16S rRNA Gene
• The DNA extracted is used as the template
for PCR to amplify a segment of about 500 or
1,500 bp of the 16S rRNA gene sequence.
• Broad-range or universal primers
complementary to conserved regions are
used so that the region can be amplified from
any bacterium. The PCR products are purified
to remove excess primers and nucleotides.
• PCR amplification results in multiple
copies of the target DNA sequence being
produced.
• The resulting product is then used as the
template for the next step of the process,
known as cycle sequencing.
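The idea that a primer pair defines the amplified segment can be illustrated with a toy "in-silico PCR". This is a minimal sketch that assumes exact, non-degenerate primer matches on a single strand; the template and primer sequences below are made up for illustration and are not real universal primers.

```python
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, fwd_primer: str, rev_primer: str) -> Optional[str]:
    """Return the segment bounded by the two primers, or None if either does not bind.

    The forward primer matches the template strand directly; the reverse primer
    binds the opposite strand, so its reverse complement is searched for instead.
    """
    template = template.upper()
    start = template.find(fwd_primer.upper())
    if start == -1:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


if __name__ == "__main__":
    # Hypothetical 40-base template with conserved ends flanking a variable core.
    template = "TTTTAGAGTTTGATCCGGAATTCCGGAAGTCGTAACGGGG"
    amplicon = in_silico_pcr(template, "AGAGTTTGAT", "GTTACGACTT")
    print(amplicon, len(amplicon) if amplicon else 0)
```

Real universal primers usually contain degenerate positions and tolerate mismatches, so a practical tool would need IUPAC-aware matching rather than the exact string search used here.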
• The next step is a process called cycle sequencing. It is similar to PCR in that it uses DNA
(the purified products of the first PCR) as the template. Both the forward and reverse
strands are used as templates, in separate reactions in which only the forward or the reverse
primer is used.
• Cycle sequencing differs from PCR in that no new template is formed: the same template is
reused for as many cycles as programmed, usually 25, and the product is a mixture of
DNA fragments of various lengths. This is achieved by adding specially labelled bases called dye
terminators along with unlabelled bases; when a dye terminator is randomly incorporated during
this second reaction, it terminates the growing strand.
• Thus, fragments of every size are generated. Because each of the four labelled terminator
bases carries a different fluorescent dye, each of which emits at a different wavelength, the
terminal base of each fragment can be determined by a fluorometer.
• The products are purified to remove unincorporated dye terminators, and the length of each is
determined using capillary electrophoresis or gel electrophoresis.
• Since we then know the length and terminal base of each fragment, the
sequence of the bases can be determined.
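The reasoning in this step, that length plus terminal base is enough to read the sequence, can be shown in a few lines. This is a simplified sketch: it assumes one fragment per length, with lengths counted from the first base after the primer, and the fragment list is invented for illustration.

```python
from typing import Iterable, Tuple


def read_sequence(fragments: Iterable[Tuple[int, str]]) -> str:
    """Rebuild the base sequence from (length, terminal_base) pairs.

    Sorting by length mimics electrophoresis, where shorter fragments are
    detected first; the terminal base of the n-th fragment is base n.
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    lengths = [length for length, _ in ordered]
    if lengths != list(range(1, len(ordered) + 1)):
        raise ValueError("expected exactly one fragment for every length 1..n")
    return "".join(base for _, base in ordered)


if __name__ == "__main__":
    # Hypothetical detector output, in no particular order.
    detected = [(3, "G"), (1, "A"), (5, "A"), (4, "T"), (2, "C")]
    print(read_sequence(detected))
```

A real base caller works on continuous fluorescence traces rather than clean pairs, which is where ambiguous positions come from.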
• DNA sequencing can be done either by the conventional Sanger method
or by using modern DNA sequencers.
• The two strands of the DNA are sequenced separately, generating both
forward and reverse sequences.
• An electropherogram, which is a tracing of the detection of the
separated fragments as they elute from the column or are separated in
the gel, in which each base is represented by a different colour, can be
generated.
• It is possible to have the fragments of various lengths so well separated
that every base of a 500-bp sequence can be determined. When
ambiguities occur, most of them can be resolved by visual reediting of
the electropherogram.
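Because the two strands are sequenced separately, the reverse read has to be reverse-complemented before it can be checked against the forward read. The sketch below assumes both reads cover exactly the same region and are already the same length (no alignment or quality scores), which is a simplification; positions where the reads disagree are marked N as candidates for manual review.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA read."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def consensus(forward_read: str, reverse_read: str) -> str:
    """Combine a forward read and a reverse read into one sequence.

    Positions where the two reads disagree are reported as 'N'.
    """
    forward = forward_read.upper()
    reverse_as_forward = reverse_complement(reverse_read)
    if len(forward) != len(reverse_as_forward):
        raise ValueError("reads must cover the same region (equal length)")
    return "".join(f if f == r else "N" for f, r in zip(forward, reverse_as_forward))


if __name__ == "__main__":
    # Hypothetical reads; they disagree at one position.
    print(consensus("ACGTAC", "GTTCGT"))
```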
Generating Phylogenetic Trees
and Comparing Sequences
• Comparisons are commonly shown as
phylogenetic trees and linear alignments.
• A phylogenetic tree is a tree-structured graph
used in computational biology to visualize the
result of a hierarchical clustering calculation.
• The result of a clustering is presented either
as the distance (dissimilarity) or the similarity
between the clustered rows or columns,
depending on the selected distance measure.
• Methods commonly used for generating
phylogenetic trees are:
NJ method (neighbor-joining),
UPGMA method (unweighted pair group method with arithmetic mean),
WPGMA method (weighted pair group method with arithmetic mean).
• The methods are comparable, and the major groupings
are preserved if the isolates are closely related.
• However, when the taxa being compared are less
closely related, the phylogenetic tree relationships are
more strongly affected by the program used for
analysis.
• For accurate 16S rRNA gene sequence identification,
we are dependent on accurate
sequences in databases, appropriate names
associated with those sequences, and an accurate
sequence for the isolate to be identified.
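As a rough illustration of the distance-based tree building described earlier in this section, the sketch below implements UPGMA on a handful of pre-aligned sequences. The choice of the p-distance (fraction of differing positions) as the distance measure, the Newick-style string output, and the four toy sequences are all assumptions made for the example; branch lengths are not computed.

```python
from itertools import combinations
from typing import Dict


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs: Dict[str, str]) -> str:
    """Cluster aligned sequences by UPGMA and return the topology as a Newick-like string."""
    sizes = {name: 1 for name in seqs}  # cluster label -> number of leaves
    dist = {frozenset(pair): p_distance(seqs[pair[0]], seqs[pair[1]])
            for pair in combinations(seqs, 2)}
    while len(sizes) > 1:
        closest = min(dist, key=dist.get)  # pair of clusters with the smallest distance
        a, b = sorted(closest)
        merged = f"({a},{b})"
        for k in sizes:
            if k in (a, b):
                continue
            d_ak = dist.pop(frozenset((a, k)))
            d_bk = dist.pop(frozenset((b, k)))
            # Unweighted average: each original leaf contributes equally.
            dist[frozenset((merged, k))] = (
                (sizes[a] * d_ak + sizes[b] * d_bk) / (sizes[a] + sizes[b])
            )
        del dist[closest]
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


if __name__ == "__main__":
    # Hypothetical aligned fragments from four isolates.
    isolates = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TCGAACGA", "D": "TTGAACGG"}
    print(upgma(isolates))
```

WPGMA differs only in the update step, where the two merged clusters are averaged with equal weight regardless of their sizes; neighbor-joining uses a different pair-selection criterion and is not shown.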
• There are several reasons why sequence databases can vary
and may not accurately link a name with a sequence and,
further, with a correct relative placement of the sequence among
other bacterial sequences. It could be that the type strain or
strains in certified collections such as ATCC were incorrectly
named or classified by biochemical means or that the
descriptions are just wrong.
• A familiar example of an organism being placed in the wrong
genus is “Corynebacterium aquaticum.” Although “C. aquaticum”
has, superficially, the same morphology as the corynebacteria, it
is genetically distant from the genus Corynebacterium and is
more closely related to organisms in the genus Microbacterium.
• When examined by genotypic methods, the ATCC strains
previously deposited as Corynebacterium xerosis were found to
be diverse and incorrectly identified.
• There is also the misplacement of well-defined species
presumably within a single genus but actually found in
many taxonomic groups. Species of the genus
Enterobacter are found associated with five different
genera. Enterobacter is a good example of what is
called a polyphyletic genus, as is Citrobacter
• A third reason for error applies primarily to unverified
databases such as GenBank, which accept any linked
name and sequence that is sent to them.
• It was this sort of consideration that propelled the
development of the proprietary MicroSeq database of
verified type strains and the RIDOM database.
APPLICATIONS OF BACTERIAL
IDENTIFICATION USING 16S rRNA SEQUENCING
1. Identifying unidentified bacteria or isolates with ambiguous profiles
• One of the most attractive potential uses of 16S rRNA gene sequence
informatics is to provide genus and species identification for isolates
that do not fit any recognized biochemical profiles, for strains
generating only a “low likelihood” or “acceptable” identification
according to commercial systems, or for taxa that are rarely
associated with human infectious diseases.
• The cumulative results from a limited number of studies to date
suggest that 16S rRNA gene sequencing provides genus identification
in most cases (>90%) but less so with regard to species (65 to 83%),
with from 1 to 14% of the isolates remaining unidentified after testing.
• Difficulties encountered in obtaining a genus and species identification
include the recognition of novel taxa, too few sequences deposited in
nucleotide databases, species sharing similar and/or identical 16S rRNA
sequences, or nomenclature problems arising from multiple genomovars
assigned to single species or complexes.
• Criteria for species identification:
Minimum: >99% sequence similarity; ideal: >99.5% sequence similarity.
The sequence match is to the type strain or a reference strain of a species that has
undergone DNA-relatedness studies.
For matches with distance scores <0.5% to the next closest species,
other properties, including phenotype, should be considered in the final
species identification.
2. Identification of Mycobacteria
• Mycobacteria are in general slow-growing and difficult to
identify. Thus, they were an important group of organisms in
early studies establishing the usefulness of 16S rRNA
gene sequencing for clinical microbiology.
• More recently, there have been several additional studies
comparing the identification of mycobacteria by 16S rRNA gene
sequence and phenotypic methods. In all of the studies, the
accuracy of 16S rRNA gene sequencing in the identification to
the species level was judged to be superior overall to
that of phenotypic methods.
• Overall, by providing for the accurate identification of species in
the database and the taxonomic placement, if not complete
identification, of novel species, 16S rRNA gene sequence
analysis of mycobacteria seems to be the most accurate
method available.
3. Discovery and Description of Novel Pathogens
• In clinical microbiology practice, novel organisms are
generally first recognized by an aberrant phenotype or niche.
• If these observations are followed by 16S rRNA gene
sequencing, the sequence often indicates that the organism is
only an unusual phenotype of a known taxon. However, at this
time, perhaps 10 to 20% of the isolates might not match any
other described organism and thus might be novel organisms,
and an even higher percentage might be previously described
organisms but not strains usually encountered in clinical
laboratories.
• A medium to large clinical microbiology laboratory can be
expected to isolate a few novel organisms per month.
• The traditional identification of bacteria on the basis of phenotypic
characteristics is generally not as accurate as identification based on
genotypic methods.
• Comparison of the bacterial 16S rRNA gene sequence has emerged as
a preferred genetic technique. 16S rRNA gene sequence analysis can
better identify poorly described, rarely isolated, or phenotypically
aberrant strains, can be routinely used for identification of
mycobacteria, and can lead to the recognition of novel pathogens.
• Problems remain: the sequences in some databases are not
accurate; there is no consensus quantitative definition of genus or
species based on 16S rRNA gene sequence data;
the proliferation of species names based on minimal genetic and
phenotypic differences raises communication difficulties; and
microheterogeneity in 16S rRNA gene sequence within a species is
common.
• Despite its accuracy, 16S rRNA gene sequence analysis lacks
widespread use beyond the large and reference laboratories
because of technical and cost considerations. Thus, a future
challenge is to translate information from 16S rRNA gene
sequencing into convenient biochemical testing schemes, making
the accuracy of the genotypic identification available to the smaller
and routine clinical microbiology laboratories.