The utility of draft bacterial genomes for
gene function analysis and genomic island prediction
Julie A. Shay, Claire Bertelli, Bhavjinder K. Dhillon, and Fiona S.L. Brinkman
Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
This work was made possible by funding from Genome Canada, Genome British Columbia, and the GRDI. Funding for project personnel was also provided by Cystic
Fibrosis Canada, the Swiss National Science Foundation, the CIHR/MSFHR Bioinformatics Training Program, and the Michael Smith Foundation for Health Research.
Comparing draft vs. complete genomes:
The problem: growing gap between
draft and complete genomes
Genomic Island (GI) analysis
•Draft and complete genomes were run on
IslandViewer (ref. 5), a web-based GI prediction tool
that incorporates two methods:
Workflow (figure labels: Isolate, Draft Genome, Complete Genome):
•Contig alignment to reference genome with Mauve (ref. 3)
•Concatenate contigs based on
•Normal IslandViewer analysis
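The contig ordering and concatenation step can be illustrated with a minimal sketch. It assumes the contig order and orientation have already been obtained from an alignment to the reference; the `order` format, the optional `spacer`, and the function name are hypothetical and are not taken from the poster or from Mauve.

```python
def concatenate_contigs(contigs, order, spacer=""):
    """Join contigs into a single pseudochromosome.

    contigs: dict mapping contig name -> nucleotide sequence
    order:   list of (name, reverse) tuples in reference order
             (hypothetical format, e.g. parsed from a contig-ordering tool)
    spacer:  optional string placed between contigs (an assumption)
    Returns the joined sequence and a dict of 0-based, end-exclusive
    (start, end) coordinates for each contig in the joined sequence.
    """
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    parts, bounds, pos = [], {}, 0
    for i, (name, reverse) in enumerate(order):
        seq = contigs[name]
        if reverse:
            # Reverse-complement contigs that align to the opposite strand
            seq = seq.translate(complement)[::-1]
        if i > 0:
            parts.append(spacer)
            pos += len(spacer)
        parts.append(seq)
        bounds[name] = (pos, pos + len(seq))
        pos += len(seq)
    return "".join(parts), bounds
```

Keeping the contig coordinates alongside the joined sequence makes it possible to relate later predictions back to contig breaks.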
Gene function category analysis
•Open reading frames (ORFs) were assigned to
clusters of orthologous groups (COGs; ref. 12) using
•COG superfamily distributions were compared
between complete genomes and missing regions
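One way to compare category distributions is a per-category Fisher's exact test, sketched below. The poster does not state which statistical test was used, so the choice of test, the function names, and the input format (gene counts per COG category) are all assumptions. Whether the background should be the whole complete genome or only the regions retained in the draft is also left to the caller.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def compare_cog_distributions(missing, background):
    """Return a p-value per COG category.

    missing:    dict category -> gene count in regions missing from the draft
    background: dict category -> gene count in the comparison set (assumption)
    """
    total_m = sum(missing.values())
    total_b = sum(background.values())
    pvalues = {}
    for cat in sorted(set(missing) | set(background)):
        a = missing.get(cat, 0)
        c = background.get(cat, 0)
        pvalues[cat] = fisher_exact_two_sided(a, total_m - a, c, total_b - c)
    return pvalues
```

Testing many categories at once would normally call for a multiple-testing correction, which this sketch leaves out.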
Genes of interest
•Main data set: 36 Listeria monocytogenes
isolates (ref. 1), each with a draft Illumina genome and the
identical, subsequently completed genome
•Other data set: draft genomes from the
Pseudomonas aeruginosa reference panel (ref. 2)
and similar completed genomes (an identical
reference was available for 2 strains)
•Draft genome aligned to completed reference
with Mauve Contig Mover (ref. 3)
•codon usage bias
•presence of a mobility gene
•“Replication, recombination, and repair”
superfamily was significantly underrepresented
in draft genomes of both L. monocytogenes
and P. aeruginosa
•In particular, transposons tend to be missing
from draft genomes
Many GIs are present at
contig breaks, and these
GIs are more likely to be
missed by analysis of draft genomes
[Figure: GI predictions missed vs. correctly identified in draft genome analysis, by distance in base pairs from contig edge (bins: 0; 1 to 9; 10 to 99; 100 to ...)]
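The distance of a predicted GI from the nearest contig edge, and its assignment to a bin, could be computed along the following lines. This is a sketch only: the coordinate convention (0-based positions on the concatenated genome) and the function names are assumptions, and because the last bin label in the figure is cut off, everything from 100 bp upward is collapsed into a single "100+" bin here.

```python
def distance_to_contig_edge(start, end, contig_bounds):
    """Smallest distance in bp from a GI spanning start..end to any contig edge.

    contig_bounds: iterable of (contig_start, contig_end) positions on the
    concatenated genome (hypothetical input format).
    Returns 0 when the GI touches or spans a contig edge.
    """
    edges = {pos for bounds in contig_bounds for pos in bounds}
    if any(start <= edge <= end for edge in edges):
        return 0
    return min(min(abs(edge - start), abs(edge - end)) for edge in edges)


def bin_distance(distance):
    """Assign a distance to a bin; the open-ended last bin is an assumption."""
    if distance == 0:
        return "0"
    if distance < 10:
        return "1 to 9"
    if distance < 100:
        return "10 to 99"
    return "100+"
```

Counting missed and correctly identified predictions per bin then gives the kind of breakdown shown in the figure.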
[Figure: x-axis, year (2008 to 2015)]
[A] RNA processing and modification
[B] Chromatin structure and dynamics
[C] Energy production and conversion
[D] Cell cycle control, cell division, chromosome partitioning
[E] Amino acid transport and metabolism
[F] Nucleotide transport and metabolism
[G] Carbohydrate transport and metabolism
[H] Coenzyme transport and metabolism
[I] Lipid transport and metabolism
[J] Translation, ribosomal structure and biogenesis
[L] Replication, recombination and repair
[M] Cell wall / membrane / envelope biogenesis
[N] Cell motility
[O] Posttranslational modification, protein turnover, chaperones
[P] Inorganic ion transport and metabolism
[Q] Secondary metabolites biosynthesis, transport and catabolism
[R] General function prediction only
[S] Function unknown
[T] Signal transduction mechanisms
[U] Intracellular trafficking, secretion, and vesicular transport
[V] Defense mechanisms
[W] Extracellular structures
Identifier (ref. 4) using the
Predicted using a
from VFDB, PATRIC,
and Victor’s virulence
[Figure: distribution across COG categories A to Z; series label: Regions Missing from Draft Genome]
Note: This image only shows genomes submitted to NCBI, so it
underestimates the extent of the gap between draft and complete genomes.
1) Gilmour MW, et al. 2010 BMC Genomics 11:120.
2) De Soyza A, et al. 2013 MicrobiologyOpen 2(6):
3) Darling AE et al. 2010 PLoS One 5(6):e11147.
4) McArthur AG, et al. 2013 Antimicrob Agents Chemo
5) Dhillon BK, et al. 2015 NAR gkv401.
6) Chen L, et al. 2011 NAR gkr989.
7) Wattam AR, et al. 2014 NAR 42(D1):D581-9.
8) Lowe TM & Eddy SR 1997 NAR 25(5):955-64.
9) Laslett D & Canback B 2004 NAR 32(1):11-6.
10) Waack S, et al. 2006 BMC Bioinformatics 7:142.
11) Hsiao W, et al. 2003 Bioinformatics 19(3):418-20.
12) Tatusov RL, et al. 2003 BMC Bioinformatics 4:1.
13) Altschul SF, et al. 1997 NAR 25(17):3389-402.
•Draft genomes have limitations: certain gene
types, particularly those associated with mobile
elements, are disproportionately missing
•Draft genome analysis is still valuable for
VFs/AMR for the species examined, but more
species should be studied