WHOLE GENOME SEQUENCING OF
BACTERIA & ANALYSIS
ELAMURUGAN. A
Ph.D Scholar,
Vet. Immunology
INTRODUCTION
• 1977: the first complete genome to be sequenced was bacteriophage φX174 (5386 bp)
• 1995: the first complete genome sequence from a free-living organism, Haemophilus influenzae (1.83 Mb), was obtained by the whole genome shotgun approach
• Sanger & Coulson (1977) used chain-terminating dideoxynucleotide analogues
• Maxam & Gilbert (1977) developed chemical degradation DNA sequencing: terminally labeled DNA fragments were chemically cleaved at specific bases and separated by gel electrophoresis
[Figure: sequencing status distribution of genome projects, Genomes OnLine Database (GOLD), http://www.genomesonline.org/cgi-bin/GOLD/sequencing_status_distribution.cgi]
ARCHON X PRIZE
• The X PRIZE Foundation in Santa Monica, CA, has introduced the Archon X PRIZE for Genomics and will award $10 million to the first team that can design a system capable of sequencing 100 human genomes in 10 days
SEQUENCING TECHNOLOGY
• First generation
  • Sanger's dideoxy chain-termination method
  • Maxam & Gilbert chemical degradation method
• Next generation sequencing (NGS), second generation HT-NGS: sequencing after amplification
  • 454/Roche: pyrosequencing
  • Illumina/Solexa: reversible dye terminators
  • SOLiD/ABI: sequential ligation of oligonucleotide probes
• Third generation HT-NGS: single-molecule sequencing
  • HeliScope
  • SMRT (Pacific Biosciences)
  • Single molecule real time (RNAP) sequencer
  • Nanopore DNA sequencer
  • Ion Torrent sequencing technology (PostLight)
  • VisiGen Biotechnologies: FRET
• Advantages of 3rd generation HT-NGS over 2nd
  • Higher throughput
  • Faster turnaround time
  • Longer read lengths
  • Higher consensus accuracy
  • Small amounts of starting material
  • Low cost
ADVANTAGES OF HT-NGS
• Massively parallel sequencing of hundreds of thousands or millions of templates
• Preliminary and tedious cloning work is eliminated and replaced by PCR amplification
• In the most recent technologies even PCR is eliminated, because single DNA molecules are sequenced directly
• Economical
• Reduced time
DISADVANTAGES OF HT-NGS
• Most NGS technologies produce short reads
• Construction of fragment libraries remains tricky and involves several steps of fragmentation, adaptor ligation and PCR amplification
• Homopolymer runs, even short ones, are error-prone with the 454 technology
• Modified nucleotides cause mis-incorporation or block further incorporation if the fluorescent moiety cannot be completely removed
• Short reads must be assembled into longer sequences
[Figure: Illumina/Solexa technology]
[Figure: zero-mode waveguides (ZMWs)]
[Figure: Selection of a technology for an experiment]
GENOME ASSEMBLY
• Assemblers join sequences together based on overlapping regions between the sequences
• An assembly is composed of contigs and scaffolds
  • Contigs: contiguous consensus sequences derived from collections of overlapping reads
  • Scaffolds: ordered and oriented sets of contigs that are linked to one another by mate pairs of sequencing reads
• N50: a basic statistic for describing the contiguity of a genome assembly. It is the contig length such that contigs of that length or longer contain at least half of the assembled bases. In general, the larger the N50, the more contiguous the assembly
• Alignment against a reference genome sequence
• De novo assembly: construction of longer sequences, such as contigs or genomes, from shorter sequences, such as sequence reads, without prior knowledge of the order of the reads or reference to a closely related sequence
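The N50 statistic described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular assembler; the function name `n50` and the contig lengths are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    lengths = sorted(contig_lengths, reverse=True)  # longest contig first
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp): two assemblies with the same total size
fragmented = [100, 200, 300, 400, 500]
contiguous = [700, 800]

print("fragmented N50:", n50(fragmented))
print("contiguous N50:", n50(contiguous))
```

Both toy assemblies contain 1500 bp in total, but the second one packs them into fewer, longer contigs and so reports the larger N50.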
GENE PREDICTION
• Ab initio gene prediction uses mathematical models rather than external evidence (such as EST and protein alignments) to identify genes and to determine their intron–exon structures
• Evidence-driven gene prediction uses external evidence such as ESTs, which can identify exon boundaries unambiguously. It has great potential to improve the quality of gene prediction in newly sequenced genomes. ESTs and proteins must first be aligned to the genome
• Commonly used tools for gene prediction in prokaryotes: Glimmer, GeneMark
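To make the idea of sequence-only gene finding concrete, here is a deliberately naive open reading frame (ORF) scan. It is not how Glimmer or GeneMark work (those tools use statistical models of coding sequence); it only looks for ATG ... stop stretches in all six reading frames. The function names, the `min_codons` threshold and the test sequence are assumptions made for this sketch, and alternative start codons are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Naive six-frame ORF scan.

    Returns tuples (strand, frame, start, end, orf_sequence). Coordinates
    are 0-based, end-exclusive, and relative to the strand that was
    scanned (for "-" that is the reverse complement). The stop codon is
    included in the ORF and counted in min_codons.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # close the ORF and keep scanning
    return orfs


# Hypothetical toy sequence containing one short ORF on the forward strand
for orf in find_orfs("CCATGAAATTTTAGGG"):
    print(orf)
```

A real prokaryotic gene finder would also score codon usage and ribosome binding sites; this sketch only shows the frame-by-frame scanning step.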
GENOME ANNOTATION
• Is the extraction of biological knowledge from raw nucleotide sequences
• Seeks to identify every potential protein-coding gene (ORFs)
• Predicted proteins are compared against available databases using tools such as BLASTP
• 'Structural' genome annotation is the process of identifying genes and their intron–exon structures
• 'Functional' genome annotation is the process of attaching meta-data such as gene ontology terms to structural annotations
APPLICATIONS
• The very large number of short reads helps to identify single nucleotide polymorphisms (SNPs) when the reads are compared with a reference genome
• Identification of rearrangements, deletions, insertions and inversions
• Used to generate expressed sequence tags (ESTs) from RNA sequencing
• Also used to detect small regulatory RNAs
• Illumina technology: ChIP-Seq to study protein-DNA interactions
• Metagenomics
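The first application above, finding SNPs by comparing many short reads with a reference, can be sketched as a simple pileup and majority vote. This is a toy illustration under strong assumptions: reads are already aligned without gaps, each read comes with its 0-based offset on the reference, and the thresholds `min_depth` and `min_fraction` are arbitrary. All names and data are hypothetical; real pipelines use dedicated aligners and variant callers.

```python
from collections import Counter, defaultdict


def call_snps(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Call SNPs from ungapped reads given as (offset, sequence) pairs.

    Returns tuples (position, reference_base, alternate_base, depth).
    """
    pileup = defaultdict(Counter)  # reference position -> base counts
    for offset, read in aligned_reads:
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1

    snps = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        base, support = counts.most_common(1)[0]  # most frequent base
        if (depth >= min_depth
                and base != reference[pos]
                and support / depth >= min_fraction):
            snps.append((pos, reference[pos], base, depth))
    return snps


# Hypothetical data: every read carries T instead of A at position 4
reference = "ACGTACGTAC"
reads = [(0, "ACGTTCGT"), (2, "GTTCGTAC"), (1, "CGTTCG"), (3, "TTCGTAC")]
print(call_snps(reference, reads))
```

Requiring a minimum depth and a minimum supporting fraction is a crude way to keep isolated sequencing errors from being reported as polymorphisms.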
LEADS TO DEVELOPMENT
• Functional genomics
• Comparative genomics
• Environmental genomics (Metagenomics)
FUNCTIONAL GENOMICS
• Reveals genome structure and its relation to function
• Orthologs are homologs derived from a common ancestral gene that diverged because the organisms diverged; they tend to have similar functions
• Paralogs are homologs produced by gene duplication: genes derived from a common ancestral gene that duplicated within an organism and then diverged; they tend to have different functions
• Xenologs are homologs resulting from the horizontal transfer of a gene between two organisms. The function of xenologs can vary, depending on how significant the change in context was for the horizontally moving gene. In general, though, the function tends to be similar
PHYLOGENETIC ANALYSIS
• Phylogenetic trees are used to classify the evolutionary relationships between homologous genes represented in the genomes of divergent species
[Figure: phylogenetic tree with terminal nodes A to E, labelling the ancestral node (root of the tree), the internal nodes (divergence points) and the branches (lineages)]
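The tree vocabulary above (root, internal nodes, terminal nodes) maps naturally onto a nested data structure. The sketch below is only an illustration: the topology joining A to E is invented, since the original figure is not available, and the function names are hypothetical.

```python
# Hypothetical rooted tree over taxa A-E. A tuple is an internal node
# (divergence point), a string is a terminal node; the outermost tuple
# is the root. The topology is made up for illustration.
tree = ((("A", "B"), "C"), ("D", "E"))


def terminal_nodes(node):
    """Return the terminal node labels in left-to-right order."""
    if isinstance(node, str):
        return [node]
    leaves = []
    for child in node:
        leaves.extend(terminal_nodes(child))
    return leaves


def count_internal_nodes(node):
    """Count divergence points, including the root."""
    if isinstance(node, str):
        return 0
    return 1 + sum(count_internal_nodes(child) for child in node)


print("terminal nodes:", terminal_nodes(tree))
print("internal nodes (incl. root):", count_internal_nodes(tree))
```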
COMPARATIVE GENOMICS
• Comparison of genome sequences reveals much information about genome structure and evolution, including the importance of lateral gene transfer
• A tool to discover how microbes adapted to a particular ecology, and an aid in the development of new therapeutic agents
METAGENOMICS
• Genomics-based study of genetic material recovered directly from environmental samples without laboratory culture and compared with all previously sequenced genes
• Helps to reveal how microbes adapt to extreme environments, which in turn helps to discover new metabolic pathways and protective mechanisms
IMPACT OF GENOME SEQUENCING
• Revealed genome reduction in intracellular (I/C) bacteria
• Genome plasticity (rearrangements, mobile elements)
• Gene duplication and diversification of protein function
• Lateral gene transfer & acquisition of new functions
• Adaptation to environments, virulence
• Industrial processes: fermentation technology
• Bioremediation
• Biotransformation
• Development of vaccines
• Bacterial diversity
• Synthetic biology
• Epigenetics
REVERSE VACCINOLOGY
• Use of genomic sequence information to identify novel and better-suited protein candidates for vaccines
• Serogroup B Neisseria meningitidis: based on genomic data, all proteins predicted to be surface-exposed, and therefore accessible to antibodies, were identified
• Suitable candidates selected after sequencing various strains
• Streptococcus agalactiae
• Pan-genome: composed of the core genome (the genes present in all sequenced strains) and the dispensable genome (genes present in only a subset of strains)
• Synthetic biology: from the sequence of an entire genome to synthesizing genes de novo
• Identification of the minimal genome, the smallest set of genes that enables life: Mycoplasma genitalium
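The pan-genome partition described above, core genes shared by every sequenced strain versus dispensable genes found in only some, reduces to set operations once each strain is represented by its set of gene families. The following is a minimal sketch; the strain names, gene names and the function `pan_genome` are hypothetical, and real analyses must first cluster genes into orthologous families.

```python
def pan_genome(strain_genes):
    """Split a {strain: set_of_genes} mapping into pan, core and
    dispensable gene sets."""
    if not strain_genes:
        raise ValueError("need at least one strain")
    gene_sets = [set(genes) for genes in strain_genes.values()]
    pan = set().union(*gene_sets)            # genes seen in any strain
    core = set.intersection(*gene_sets)      # genes seen in every strain
    dispensable = pan - core                 # genes missing from some strain
    return pan, core, dispensable


# Hypothetical gene content of three sequenced strains
strains = {
    "strain_1": {"geneA", "geneB", "geneC"},
    "strain_2": {"geneA", "geneB", "geneD"},
    "strain_3": {"geneA", "geneB", "geneC", "geneE"},
}

pan, core, dispensable = pan_genome(strains)
print("pan-genome size:", len(pan))
print("core genome:", sorted(core))
print("dispensable genome:", sorted(dispensable))
```

In a reverse vaccinology setting, candidates in the core set are attractive because they are present in every strain sequenced so far.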
DATABASES AND TOOLS RELATED TO BACTERIAL GENOMIC DATA
• NCBI Entrez Genome Project database:
  • http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genomeprj
  • A searchable collection of complete and incomplete (in-progress) large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms
• NCBI, Bacteria Genome Database:
  • http://www.ncbi.nlm.nih.gov/genomes/static/eub.html
  • The Genome database provides views for a variety of genomes, complete chromosomes, sequence maps with contigs, and integrated genetic and physical maps
• Bacterial Genomes at The Sanger Institute:
  • http://www.sanger.ac.uk/Projects/Microbes/
  • This website lists funded, ongoing, or completed projects on pathogens sequenced at this institute
• TIGR Comprehensive Microbial Resource (CMR):
  • http://cmr.tigr.org/tigr-scripts/CMR/CmrHomePage.cgi
  • A free website displaying information on all the publicly available, complete prokaryotic genomes
• GOLD: Genomes OnLine Database:
  • http://www.genomesonline.org/
  • A genome database containing information about which genomes have been sequenced or are in progress
• Microbial Genome Database for Comparative Analysis (MBGD):
  • http://mbgd.genome.ad.jp/
  • A database for comparative analysis of completely sequenced microbial genomes
• Virulence Factors of Bacterial Pathogens (VFDB):
  • http://zdsys.chgb.org.cn/VFs/main.htm
  • VFDB is an integrated and comprehensive database of virulence factors for bacterial pathogens
• Genome Information Broker:
  • http://gib.genes.nig.ac.jp/
  • A comprehensive data repository of complete microbial genomes in the public domain. Many microbial genomes can be explored graphically
• Islander, a Database of Genomic Islands:
  • http://www.indiana.edu/~islander
  • This database contains genomic islands discovered in completely sequenced bacterial genomes
• GenoList genome browser at Institut Pasteur:
  • http://genolist.pasteur.fr/
  • Provides access to diverse genome browsers of pathogenic bacteria
• IslandPath:
  • http://www.pathogenomics.sfu.ca/islandpath/update/IPindex.pl
  • An aid to the identification of genomic islands, including pathogenicity islands, of potentially horizontally transferred genes
• HGT-DB:
  • http://www.tinet.org/~debb/HGT/
  • A database containing predictions of horizontally transferred genes in several complete prokaryotic genomes
• E. coli genome project:
  • http://www.genome.wisc.edu
  • A site devoted to the E. coli genome project with an updated annotation of the genome