Shotgun metagenomics sequencing allows researchers to comprehensively sample all genes in organisms present in a complex sample without culturing. This provides insights into bacterial diversity, abundance, and uncultured microbes. Bioinformatics pipelines guide analysis including quality filtering, assembly, binning, gene finding, fingerprinting, and phylogeny/diversity modeling to understand communities. Metagenomics has applications in antibiotic/drug discovery, bioremediation, agriculture, human microbiome mapping, and more. Tools like QIIME, Mothur, MEGAN, and MG-RAST facilitate large-scale metagenomic analysis.
Metagenomics studies genetic material from environmental samples, differing from traditional microbiology.
Two approaches: sequence-driven analyzing DNA sequences vs. function-driven screening for specific functions.
Both metagenomic approaches face challenges in gene expression and reliance on existing knowledge.
Metagenomic data analysis begins with pre-filtering low-quality sequences to enhance data accuracy.
Comparison of metagenomes provides insights into microbial community functions influencing host health.
Shotgun sequencing breaks DNA into random fragments that are reassembled by overlap; it was first used for small genomes such as those of viruses and bacteria.
It is faster and more efficient than clone-by-clone methods, but assembly is far easier when a reference genome is available.
High computational needs and assembly errors due to lack of genetic mapping are key disadvantages.
Assembly of microbial communities poses challenges due to unknown strain quantities and abundances.
WGS provides comprehensive gene sampling in microbiology, aiding in diversity evaluation of microbes.
Continued growth in shotgun sequencing projects promises to uncover ecological and community insights.
Technological advancements include A5-miseq and Orione for user-friendly bioinformatics data analysis.
Metagenomics applications impact pharmaceuticals, environmental cleanup, and agriculture, showing diverse uses.
Focus on human microbiome mapping may yield tools for nutrition and understanding complex diseases.
QIIME, Mothur, and MEGAN are essential bioinformatics tools for analyzing microbial dynamics.
Future research will focus on exploring new antibiotics, studying gut microbiome effects and ancient DNA.
Metagenomics
Metagenomics is the study of genetic material recovered
directly from environmental samples.
While traditional microbiology and microbial genome
sequencing rely upon cultivated clonal cultures, early
environmental gene sequencing cloned specific genes
(often the 16S rRNA gene) to produce a profile of diversity
in a natural sample.
TWO APPROACHES FOR METAGENOMICS
In the first approach, known as ‘sequence-driven metagenomics’, DNA
from the environment of interest is sequenced and subjected to
computational analysis.
The metagenomic sequences are compared to sequences deposited
in publicly available databases such as GenBank.
The genes are then collected into groups of similar predicted
function, and the distribution of various functions and types of
proteins that conduct those functions can be assessed.
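
The grouping step described above can be sketched in a few lines of Python. This is only an illustration: the read identifiers and function labels in `hits` are hypothetical, and in practice the annotations would come from a similarity search against a database such as GenBank.

```python
from collections import Counter

# Hypothetical best-hit annotations: (read_id, predicted_function).
# Real annotations would come from a database search, not a hard-coded list.
hits = [
    ("read_001", "ABC transporter"),
    ("read_002", "DNA polymerase"),
    ("read_003", "ABC transporter"),
    ("read_004", "unknown"),
    ("read_005", "ABC transporter"),
]

def function_distribution(annotated_hits):
    """Return the fraction of reads assigned to each predicted function."""
    counts = Counter(function for _, function in annotated_hits)
    total = sum(counts.values())
    return {function: n / total for function, n in counts.items()}

distribution = function_distribution(hits)
for function, fraction in sorted(distribution.items(), key=lambda kv: -kv[1]):
    print(f"{function}: {fraction:.2f}")
```

Reads labelled "unknown" are kept in the tally on purpose, since the share of unassignable genes is itself informative.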
In the second approach, ‘function-driven metagenomics’, the DNA extracted
from the environment is also captured and stored in a surrogate host, but
instead of sequencing it, scientists screen the captured fragments of
DNA, or ‘clones’, for a certain function.
The function must be absent in the surrogate host so that acquisition
of the function can be attributed to the metagenomic DNA.
LIMITATIONS OF THE TWO APPROACHES
The sequence-driven approach:
Limited by existing knowledge: if a metagenomic gene does not look like
a gene of known function deposited in the databases, then little can be
learned about the gene or its product from sequence alone.
The function-driven approach:
Most genes from organisms in wild communities cannot be expressed
easily by a given surrogate host.
How it is used in bioinformatics:
Sequence pre-filtering
The first step of metagenomic data analysis requires the execution of
certain pre-filtering steps, including the removal of redundant and
low-quality sequences and of sequences of probable eukaryotic origin.
The methods available for the removal of contaminating eukaryotic
genomic DNA sequences include Eu-Detect and DeconSeq.
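
As a rough illustration of the first two pre-filtering steps, the following Python sketch drops low-quality reads and exact duplicates. It assumes reads are already parsed into `(name, sequence, quality)` tuples with Phred+33 quality strings; the threshold of 20 and the function names are illustrative choices, and screening for eukaryotic sequences (the job of tools such as Eu-Detect and DeconSeq) is not attempted here.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)

def prefilter(reads, min_mean_quality=20):
    """Drop low-quality reads and exact duplicate sequences, keeping input order."""
    seen = set()
    kept = []
    for name, sequence, quality in reads:
        if not sequence or mean_phred(quality) < min_mean_quality:
            continue  # low-quality (or empty) read
        if sequence in seen:
            continue  # redundant: an identical sequence was already kept
        seen.add(sequence)
        kept.append((name, sequence, quality))
    return kept

reads = [
    ("r1", "ACGTACGT", "IIIIIIII"),  # 'I' encodes Phred 40
    ("r2", "ACGTACGT", "IIIIIIII"),  # exact duplicate of r1
    ("r3", "TTGACCAA", "########"),  # '#' encodes Phred 2
    ("r4", "GGCATTCA", "55555555"),  # '5' encodes Phred 20
]
filtered = prefilter(reads)
print([name for name, _, _ in filtered])
```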
Comparative metagenomics
Comparative analyses between metagenomes can provide
additional insight into the function of complex microbial
communities and their role in host health.
Pairwise or multiple comparisons between metagenomes can be
made at the level of sequence composition (comparing GC-content
or genome size), taxonomic diversity, or functional complement.
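
A comparison at the level of sequence composition might look like the sketch below, which computes GC-content for two samples. The sample sequences are made up for illustration, and a real comparison would use all reads or contigs from each metagenome.

```python
def gc_content(sequences):
    """Fraction of G and C bases across all sequences (ambiguous bases ignored)."""
    gc = at = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0

# Hypothetical reads from two metagenomes.
sample_a = ["GGCCGCGA", "GCGCATGC"]
sample_b = ["ATATTAGC", "TTAAATGA"]

gc_a = gc_content(sample_a)
gc_b = gc_content(sample_b)
print(f"sample A GC: {gc_a:.3f}, sample B GC: {gc_b:.3f}, difference: {gc_a - gc_b:.3f}")
```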
Consequently, metadata on the environmental context of the
metagenomic sample is especially important in comparative
analyses, as it provides researchers with the ability to study the
effect of habitat upon community structure and function.
Shotgun Sequencing
Shotgun sequencing involves randomly breaking up DNA sequences into many
small pieces and then reassembling the sequence by looking for regions of
overlap.
Large mammalian genomes are complex and difficult to clone.
Clone-by-clone sequencing, although reliable and methodical, is time-consuming.
Used by Fred Sanger and his colleagues to sequence small genomes such as
those of viruses and bacteria.
Fragments are often of varying sizes, ranging from 2-20 kilobases to
200-300 kilobases.
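
The idea of reassembling a sequence from overlapping pieces can be illustrated with a toy greedy assembler. This is a minimal sketch under strong assumptions (error-free fragments, no repeats, exact suffix-prefix overlaps of at least `min_overlap` bases) and is not how production assemblers work; the function names and the example fragments are invented for illustration.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def greedy_assemble(fragments, min_overlap=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

fragments = ["ATGGCGTAC", "GTACCTTGA", "TTGAGGCAT"]
contigs = greedy_assemble(fragments)
print(contigs)
```

Repetitive sequence breaks this greedy strategy quickly, which is one reason real shotgun assembly needs far more sophisticated software.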
Advantages of shotgun sequencing:
By removing the mapping stages, it is a much faster process than
clone-by-clone sequencing.
Uses a fraction of the DNA that clone-by-clone sequencing needs.
Efficient if there is an existing reference sequence: it is easier
to assemble the genome sequence by aligning it to an existing
reference genome.
Faster and less expensive than methods requiring a genetic map.
Disadvantages of shotgun sequencing:
Vast amounts of computing power and sophisticated software are
required to assemble shotgun sequences together.
Errors in assembly are more likely to be made because a genetic
map is not used. However, these errors are generally easier to
resolve than in other methods and are minimized if a reference
genome can be used.
Best carried out if a reference genome is already available;
otherwise assembly is very difficult without an existing genome
to match it to.
Repetitive genomes and sequences can be more difficult to
assemble.
Assembling Communities
The assembly of communities has strong similarities to the assembly of highly
polymorphic diploid eukaryotes, such as Ciona savignyi and Candida albicans,
if we view prokaryotic strains as analogous to eukaryotic haplotypes.
The main difference is that in a microbial community, the number of strains is unknown
and potentially large, and their relative abundance is also unknown and potentially
skewed, while in most eukaryotes we know a priori the number of haplotypes and their
relative abundance.
This disadvantage is mitigated somewhat by the small size and relative lack of repetitive
sequence in prokaryotic and viral genomes, so that the issue of distinguishing alleles from
paralogs and polymorphism from repetitive sequence is less acute.
We performed similar calculations for the three whale fall communities.
In addition, we considered the problem of assembling all genomes in these communities.
Since the 16S survey indicated that three dominant species constitute approximately half
the total abundance and all other species have roughly equal abundance, the
Lander–Waterman model implies that the expected coverage should be distributed as the
mixture of two Poissons with equal weight.
The results of these calculations are summarized. Similar results were obtained by Venter
et al. and Breitbart et al., and bioinformaticians use a variety of software for such analyses.
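
To make the mixture-of-two-Poissons statement concrete, here is a minimal Python sketch. It assumes one reading of the model: a sequenced base comes from a dominant species with probability 0.5 and from a rare species otherwise, and read depth within each group is Poisson with mean equal to that group's coverage (reads × read length × abundance share / genome size). The helper names and the numeric coverages are illustrative and are not taken from the whale fall data.

```python
import math

def coverage(n_reads, read_length, genome_size, abundance_share):
    """Lander-Waterman style coverage for one genome given its share of the reads."""
    return n_reads * read_length * abundance_share / genome_size

def poisson_pmf(k, mean):
    """Probability of observing depth k when depth is Poisson(mean)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def mixture_depth_pmf(k, cov_dominant, cov_rare, weight_dominant=0.5):
    """Depth distribution for a base drawn from a two-component community."""
    return (weight_dominant * poisson_pmf(k, cov_dominant)
            + (1 - weight_dominant) * poisson_pmf(k, cov_rare))

def fraction_covered(cov_dominant, cov_rare, weight_dominant=0.5):
    """Expected fraction of bases hit by at least one read: 1 - P(depth = 0)."""
    return 1.0 - mixture_depth_pmf(0, cov_dominant, cov_rare, weight_dominant)

# Illustrative coverages only (assumed values, not measured ones).
cov_dominant, cov_rare = 4.0, 0.5
covered = fraction_covered(cov_dominant, cov_rare)
print(f"expected fraction of bases covered: {covered:.3f}")
```

With these assumed values the dominant genomes are almost fully covered while the rare ones are mostly missed, which is the skew that makes community assembly hard.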
Whole genome shotgun sequencing
guided by bioinformatics pipelines:
an optimized approach for an
established technique
Shotgun metagenomics sequencing allows researchers to comprehensively sample all genes in all organisms present in a given
complex sample. The method enables microbiologists to evaluate bacterial diversity and detect the abundance of microbes in
various environments. Shotgun metagenomics also provides a means to study unculturable microorganisms that are otherwise
difficult or impossible to analyze.
Phylogeny and Community Diversity
With regard to community diversity, one of the advantages of the WGS
approach is that it is less biased than PCR, which is known to suffer
from a host of problems.
Community modeling based on analysis of assembly data within
the Lander–Waterman model is beginning to show that species
abundance curves are not lognormal as previously thought.
New methods that take into account these naturally occurring
distributions are needed.
Conclusion
The number of new community shotgun sequencing projects continues to grow, promising
to provide vast quantities of sequence data for analysis.
Samples are being drawn from macroscopic environments such as the sea and air, as well
as from more contained communities such as the human mouth.
Exciting advances in our understanding of ecosystems, environments, and communities
will require creative solutions to numerous new bioinformatics problems.
We have briefly mentioned some of these: assembly (can co-assembly techniques be used
to assemble polymorphic genomes and complex communities?), binning (what is the best
way to combine diverse sources of information to bin scaffolds?), gene finding (how
should gene finding programs, which were designed for complete genes and genomes, be
adapted for low-coverage sequence?), fingerprinting (which clustering techniques are best
suited for discovering novel pathways and functional groups that allow communities to
adapt to their environments?), and MSA and phylogeny (how can we best construct trees
and alignments from fragmented data?).
Countless more challenges will likely emerge as WGS sequencing approaches are used to
tackle increasingly complex communities.
The reward for computational biologists who work on these problems will be the
satisfaction of contributing to the grand enterprise of understanding the total diversity of
life on our planet.
A5-miseq
Produces high-quality microbial genome assemblies on a laptop
computer without any parameter tuning. A5-miseq does this by
automating the process of adapter trimming, quality filtering
Orione
A Galaxy-based framework consisting of publicly available
research software and specifically designed pipelines to build
complex, reproducible workflows for next-generation sequencing
microbiology data analysis.
Enabling microbiology researchers to conduct their own custom
analysis and data manipulation without software installation or
programming, Orione provides new opportunities for data-intensive
computational analyses in microbiology and metagenomics.
• Transport proteins
• Ecology and Environment
• Energy
• Bioremediation
• Biotechnology
• Agriculture
• Biodefence
Applications
● Global Impacts.
The role of microbes is critical in
maintaining atmospheric balances, as
they are
● the main photosynthetic agents
● responsible for the generation and
consumption of greenhouse gases
● involved at all levels in ecosystems
and trophic chains
● Bioenergy
We are harnessing microbial power in
order to produce
● ethanol (from cellulose), hydrogen,
methane, butanol...
● Smart Farming. Microbes help our
crops by
● the “suppressive soil” phenomenon
(buffer effect against disease-
causing organisms)
● soil enrichment and regeneration
● The World Within.
Studying the human microbiome
may lead to valuable new tools
and guidelines in
● Human and animal nutrition
● Better understanding of complex
diseases (obesity, cancer,
asthma...)
● Drug discovery
● Preventative medicine
QIIME
QIIME is an open-source bioinformatics pipeline
for performing microbiome analysis from raw
DNA sequencing data.
QIIME is designed to take users from raw
sequencing data generated on the Illumina or
other platforms through publication quality
graphics and statistics.
QIIME has been applied to studies based on
billions of sequences from tens of thousands of
samples.
Mothur
Aims to develop a single piece of open-source, expandable software to fill the
bioinformatics needs of the microbial ecology community.
Screening, processing, aligning, and clustering of Sanger, 454, or Illumina (16S
rRNA) amplicons.
Generating a high-quality, effectively ‘normalized’ shared file (i.e. counts of
OTUs per sample).
Gaining general taxonomic information about the OTUs in your study system
(RDP Taxonomic Classifier).
MEGAN
In metagenomics, the aim is to understand the composition and
operation of complex microbial consortia in environmental
samples through sequencing and analysis of their DNA.
FUTURE OF METAGENOMICS
• To identify new enzymes & antibiotics
• To assess the effects of age, diet, and pathologic states
(e.g., inflammatory bowel diseases, obesity, and cancer)
on the distal gut microbiome of humans living in
different environments
• Study of more exotic habitats
• Study antibiotic resistance in soil microbes
• Improved bioinformatics will quicken analysis for library
profiling
• Investigating ancient DNA remnants
• Discoveries such as phylogenetic tags (rRNA genes, etc.) will give
momentum to the growing field
• Learning novel pathways will yield knowledge about currently
nonculturable bacteria, which can then be used to culture these systems