Computational prediction and characterization of genomic islands: insights into bacterial pathogenicity

    Computational prediction and characterization of genomic islands: insights into bacterial pathogenicity - Presentation Transcript

    1. Computational prediction and characterization of genomic islands: insights into bacterial pathogenicity
      • Morgan G.I. Langille
      • Department of Molecular Biology & Biochemistry, Simon Fraser University
      • http://tinyurl.com/genomic-islands
    2. Genomic Island History
      • In the early 1990s, clusters of virulence genes were found in E. coli (Hacker, et al., 1990)
      • Pathogenicity Islands (PAIs)
        • Clusters of genes that are associated with bacterial virulence
      • Genomic Islands (GIs) (Hacker, et al., 2000)
        • Segments of a genome that are thought to have originated from a horizontal transfer event
    3. Genomic Island Interest
      • Pathogenicity Islands
        • Adhesins
          • Fimbriae, intimin, etc.
        • Secretion Systems
          • Type III and Type IV
        • Toxins
          • Hemolysins, Pertussis toxin
        • Invasins, Modulins, and Effectors
      • Antibiotic Resistance Islands
      • Metabolic Islands
    4. Genomic Island Interest
    5. Methods for Predicting GIs
      • Sequence based
        • Abnormal sequence composition
          • GC% bias, dinucleotide bias, codon bias, etc.
        • Genomic features associated with mobile genetic elements
          • Direct repeats, IS elements, presence of tRNA genes and mobility genes (integrases, transposases, etc.)
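As an illustration of the "abnormal sequence composition" idea on this slide, the following is a minimal sketch of a GC% sliding-window scan. It is not the algorithm of any tool named in this talk; the function names, the window and step sizes, and the z-score cutoff are all assumptions chosen for the example.

```python
from statistics import mean, pstdev

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def atypical_gc_windows(genome: str, window: int = 1000, step: int = 500,
                        z_cutoff: float = 2.0):
    """Return (start, end, gc) for windows whose GC fraction deviates from the
    mean over all windows by more than z_cutoff standard deviations.
    window, step and z_cutoff are illustrative values, not published defaults."""
    windows = [
        (start, start + window, gc_fraction(genome[start:start + window]))
        for start in range(0, len(genome) - window + 1, step)
    ]
    if not windows:
        return []
    values = [gc for _, _, gc in windows]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # perfectly uniform composition: nothing stands out
    return [w for w in windows if abs(w[2] - mu) / sigma > z_cutoff]

# Toy genome: GC-only background with one AT-only insert of 2,000 bases.
toy_genome = "GC" * 5000 + "AT" * 1000 + "GC" * 5000
flagged = atypical_gc_windows(toy_genome)
```

On the toy genome, only the windows lying entirely inside the AT-only insert are flagged. Real genomes are far noisier, which is one reason composition-based methods can produce the false predictions and misses discussed later in the talk.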
    6. Methods for Predicting GIs
      • Comparative genomics based
        • Identify genomic regions with anomalous phylogenetic patterns
        • Requires multiple genomes
    7. Previous state of GI identification
      • Sequence based methods
        • Numerous methods, with algorithm design constantly improving
        • Not very user-friendly, and the accuracy of the various methods is not well described
      • Comparative based methods
        • Used by many researchers, but with no established method (only in-house scripts)
        • Limited access to user friendly tools for this type of analysis
    8. Outline
      • IslandPick: A comparative genomics approach for genomic island identification
      • Evaluating sequence composition based genomic island prediction methods
      • IslandViewer: An integrated interface for computational identification and visualization of genomic islands
      • The role of genomic islands in the virulent Pseudomonas aeruginosa Liverpool Epidemic Strain
      • CRISPRs and their association with genomic islands
    10. Mauve: whole-genome aligner
      • Allows genome rearrangements and inversions
      • Fast: aligns two genomes in under 15 minutes
      • Command line accessible
      • http://gel.ahabs.wisc.edu/mauve/
      (Darling, et al., 2004)
    11. IslandPick: Outline
      • Input: Query Genome A and comparative Genomes B, C, and D
      • Run Mauve on each pair: Mauve (A & B), Mauve (A & C), Mauve (A & D)
      • Extract unique regions
      • Identify overlapping unique regions
      • BLAST
      • Output: Putative Genomic Islands
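The "identify overlapping unique regions" step of this flowchart can be sketched as an interval intersection. This is a minimal illustration, not IslandPick's actual code: it assumes the regions unique to the query genome have already been extracted from each pairwise Mauve alignment as (start, end) coordinates, it omits the BLAST step, and the `min_length` value and all function names are assumptions.

```python
def intersect_two(a, b):
    """Intersect two sorted lists of non-overlapping (start, end) intervals."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start, end = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def putative_islands(unique_per_comparison, min_length=8000):
    """Keep query regions that are unique in every pairwise comparison.
    min_length is an illustrative size filter (hypothetical value)."""
    if not unique_per_comparison:
        return []
    regions = sorted(unique_per_comparison[0])
    for other in unique_per_comparison[1:]:
        regions = intersect_two(regions, sorted(other))
    return [(s, e) for s, e in regions if e - s >= min_length]

# Hypothetical unique regions of Query Genome A versus Genomes B, C and D.
unique_vs_b = [(10_000, 30_000), (50_000, 62_000)]
unique_vs_c = [(12_000, 28_000), (55_000, 70_000)]
unique_vs_d = [(5_000, 29_000), (56_000, 60_000)]
islands = putative_islands([unique_vs_b, unique_vs_c, unique_vs_d])
```

In this toy input, one region survives in all three comparisons and passes the size filter; the second shared region is only 4,000 bases long and is dropped.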
    12. Selecting Comparative Genomes
      • Same flowchart as slide 11, with an added first step: Comparative Genome Selection (using CVTree distances) chooses Genomes B, C, and D for Query Genome A
    13. What genomes to use?
      • We want to compare the query genome to other comparative genomes within certain evolutionary distances
      • Need a phylogenetic tree or a distance matrix for all sequenced bacterial species
    14. CVTree
      • Uses matching K-strings between the proteomes of two organisms
      • Constructs phylogenetic trees without alignment
      • Avoids choosing genes for phylogenetic reconstruction
      • Web Server http://cvtree.cbi.pku.edu.cn
      • Downloadable command line executable
      (Qi, et al., 2004)
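To make the alignment-free idea concrete, here is a simplified sketch of a K-string composition distance between two proteomes. It is an assumption-laden toy, not the CVTree program: it uses raw K-string counts, whereas the published method also subtracts a background expectation from each count, and the function names and k=3 are chosen only for illustration.

```python
from collections import Counter
from math import sqrt

def kstring_vector(proteome, k=3):
    """Count overlapping K-strings across all protein sequences of a proteome."""
    counts = Counter()
    for protein in proteome:
        for i in range(len(protein) - k + 1):
            counts[protein[i:i + k]] += 1
    return counts

def composition_distance(a, b):
    """Distance (1 - C) / 2, where C is the cosine of the two count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    if norm == 0:
        raise ValueError("empty composition vector")
    return (1 - dot / norm) / 2

# Hypothetical toy "proteomes" (lists of protein sequences).
proteome_x = ["MKVLAAGIV", "MKTAYIAKQR"]
proteome_y = ["MKVLAAGIV"]
proteome_z = ["WWWWWW"]
vec_x, vec_y, vec_z = (kstring_vector(p) for p in (proteome_x, proteome_y, proteome_z))
d_xy = composition_distance(vec_x, vec_y)
d_xz = composition_distance(vec_x, vec_z)
```

With raw counts the distance ranges from 0 (identical composition) to 0.5 (no shared K-strings), so the values are not directly comparable to the CVTree distances quoted on the following slides.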
    15. Example: Pseudomonas Tree
      • Tree built using conserved genes, Omp85 & CarB, and maximum parsimony
      • CVTree distances from P. syringae B728a are shown
      [Figure: tree of P. fluorescens Pf-5, P. putida KT2440, P. fluorescens PfO-1, P. syringae tomato DC3000, P. syringae phaseolicola 1448A, P. syringae syringae B728a, P. aeruginosa PAO1, P. aeruginosa PA14, and Acinetobacter ADP1. Distance labels, in extraction order: 0.227, 0.256, 0.397, 0.393, 0.411, 0.428, 0.430, 0, 0.481]
    16. Determining Distance Cutoffs
      • Given the distances between any two species, how do we choose comparison genomes?
        • Maximum Distance Cutoff
          • Eliminates the use of genomes that have diverged too much (noise)
        • Minimum Distance Cutoff
          • Eliminates the use of genomes that have not diverged enough (very closely related strains)
        • Minimum Number of Genomes
          • Eliminates the use of too few comparative genomes
    17. Example: Pseudomonas Tree
      • Maximum Distance Cutoff = 0.42
      • Minimum Distance Cutoff = 0.10
      • Minimum Number of Genomes = 3
      [Figure: the Pseudomonas tree and CVTree distance labels from slide 15, with the cutoffs above marked]
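The three cutoffs can be combined into a simple selection rule. The sketch below is only one plausible reading of the slides: it treats both distance cutoffs as inclusive, which the slides do not state, and the genome names and distances are hypothetical placeholders rather than the values in the tree. The default cutoff values are the ones shown on slide 17.

```python
def select_comparison_genomes(distances, max_cutoff=0.42, min_cutoff=0.10,
                              min_genomes=3):
    """Pick comparison genomes whose distance to the query lies between the
    minimum and maximum cutoffs (inclusive here, by assumption).
    Returns None when fewer than min_genomes genomes qualify."""
    selected = sorted(
        name for name, distance in distances.items()
        if min_cutoff <= distance <= max_cutoff
    )
    return selected if len(selected) >= min_genomes else None

# Hypothetical distances from a query genome (placeholder names and values).
distances = {
    "genome_B": 0.05,  # too close: below the minimum distance cutoff
    "genome_C": 0.23,
    "genome_D": 0.26,
    "genome_E": 0.39,
    "genome_F": 0.48,  # too distant: above the maximum distance cutoff
}
chosen = select_comparison_genomes(distances)
too_strict = select_comparison_genomes(distances, min_genomes=4)
```

Returning None when too few genomes qualify mirrors the "minimum number of genomes" rule: the query is skipped rather than analysed with an underpowered comparison set.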
    18. Predicting Similar Aged GIs
      • Requirement: 1 genome < distance X from the query genome
      [Figure: GI insertion events relative to the query genome]
    20. Accuracy of GI methods
      • Sequence based GI prediction methods
        • Only require a single genome
        • Can easily make false predictions
          • Highly expressed genes
        • May miss predictions
          • Amelioration of DNA to host genome
          • Source genome has same composition as host genome
      • Accuracy is usually evaluated using simulated horizontal gene transfer events or small datasets of verified GIs
      • IslandPick is independent of sequence composition methods
        • Generated a “positive” dataset of islands
    21. Developing a Negative Dataset
      • To identify false positives we need a “negative” dataset that does not contain GIs
      • Identify regions that are conserved across several genomes using Mauve whole genome alignment
      • Use the same genomes as selected by IslandPick with one additional cutoff
    22. Negative Dataset
      • Additional cutoff: 1 genome > distance X from the query genome
      [Figure: GI insertion events relative to the query genome]
    23. IslandPick Cutoffs
      • 118 chromosomes
      • 771 GIs
      • ~100 genes/strain
      [Figure labels: 173 chromosomes, 736 chromosomes]
      (Langille, et al., 2008)
    24. GI Prediction Accuracy
      [Figure: overlap of the Positive Dataset, Negative Dataset, and Predicted Dataset within the Entire Genome, marking TP, FP, FN, and TN]
      • Precision = TP / (TP + FP)
      • Recall = TP / (TP + FN)
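The two formulas on this slide translate directly into code. In the sketch below, precision and recall follow the slide; the overall accuracy formula, (TP + TN) / (TP + FP + FN + TN), is the usual definition and is an assumption here, since the slide does not spell it out. The counts are hypothetical.

```python
def gi_prediction_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts,
    for example numbers of nucleotides in each category."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("no nucleotides were evaluated")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / total  # assumed definition of overall accuracy
    return precision, recall, accuracy

# Hypothetical counts: a conservative predictor that misses many islands.
precision, recall, accuracy = gi_prediction_metrics(tp=90, fp=10, fn=180, tn=720)
```

The toy counts give high precision with low recall, the same pattern that several of the composition-based tools show in the results table.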
    25. GI Prediction Accuracy (Langille, et al., 2008)
      Tool              | Average number of nucleotides in GIs per genome (kb) | Precision (%) | Recall (%) | Overall Accuracy (%)
      SIGI-HMM          | 233  | 92  | 33.0 | 86
      IslandPath/Dimob  | 171  | 86  | 36   | 86
      PAI IDA           | 163  | 68  | 32   | 84
      Centroid          | 171  | 61  | 28   | 82
      IslandPath/Dinuc  | 444  | 55  | 53   | 82
      Alien Hunter      | 1265 | 38  | 77   | 71
      Literature*       | 639  | 100 | 87   | 96
    27. IslandViewer (Langille, et al., 2009)
      • Website that integrates the most accurate GI prediction programs: SIGI-HMM, IslandPath-DIMOB, and IslandPick
      • Genomic island prediction pre-calculated for all genomes
        • Automatically updated monthly
      • User genome submission available
      • IslandPick can be run using manually selected comparison genomes
      • Download data for a genomic island, a chromosome, or entire dataset
      • http://www.pathogenomics.sfu.ca/islandviewer/
    28. IslandPick – Manual genome selection
    29. User Genome Submission
    31. Pseudomonas aeruginosa Liverpool Epidemic Strain (LES)
      • Highly successful at colonizing cystic fibrosis (CF) patients
      • Has replaced previously established strains
      • Caused infections of non-CF patients
      • Can cause greater morbidity in CF than other strains of P. aeruginosa
      (Salunkhe, et al., 2005)
    32. LES Analysis
      • Genome sequenced by Sanger Centre
      • I led annotation of the genome and analysis of GIs
      • 6 Prophages
      • 5 Genomic Islands
      (Winstanley, Langille, et al., 2008)
    33. Signature-tagged mutagenesis (STM)
      • STM is a method to identify genes associated with pathogenesis
      • LES used in a chronic rat lung infection model
      • 47 genes identified by STM
      • 5 of these genes are within GIs and prophage regions
      http://www.traill.uiuc.edu/uploads/porknet/papers/LitchtensteigerPaper.pdf
    34. LES Prophage (Winstanley, Langille, et al., 2008)
    35. LES Genomic Islands (Winstanley, Langille, et al., 2008)
    36. LES in-vivo competitive index
      • Mutants grown for 7 days in rat lung with the wild type LES
      • A CI of less than 1 indicates attenuation of virulence
      • 4 genes within prophage and GIs had strong impact on competitiveness
      (Winstanley, Langille, et al., 2008)
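The slide does not give the formula for the competitive index (CI). A commonly used definition, assumed here, is the mutant to wild-type ratio recovered from the animal divided by the same ratio in the inoculum. The sketch below uses hypothetical colony counts and a hypothetical function name.

```python
def competitive_index(mutant_in, wildtype_in, mutant_out, wildtype_out):
    """CI = (mutant / wild-type recovered) / (mutant / wild-type inoculated).
    This is an assumed, commonly used definition; inputs are counts such as CFU."""
    if mutant_in <= 0 or wildtype_in <= 0 or wildtype_out <= 0:
        raise ValueError("inoculum counts and recovered wild-type count must be positive")
    return (mutant_out / wildtype_out) / (mutant_in / wildtype_in)

# Hypothetical experiment: equal inoculum, mutant recovered at 2% of wild type.
ci = competitive_index(mutant_in=1e6, wildtype_in=1e6,
                       mutant_out=2e4, wildtype_out=1e6)
attenuated = ci < 1  # below 1 is read as attenuation, as stated on the slide
```

Dividing by the inoculum ratio corrects for any imbalance in the starting mixture, so a CI of 1 means the mutant and wild type competed equally.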
    38. Overview of CRISPRs
      • CRISPRs: Clustered regularly interspaced short palindromic repeats
      • Able to provide phage resistance and block conjugation
      • Thought to be similar to RNAi, except DNA (instead of RNA) is thought to be the target
    39. CRISPRs and HGT
      • Previous studies have shown some evidence of HGT of CRISPRs
        • Phylogenetic profiles of CAS genes (Haft, et al., 2005)
        • CRISPRs within 10 megaplasmids (Godde, et al., 2006)
        • CRISPRs within two prophage in Clostridium difficile (Sebaihia, et al., 2006)
      • Analysis of CRISPRs and GIs had not been conducted previously
    40. CRISPRs within GIs
      • CRISPR predictions were obtained from CRISPRdb, http://crispr.u-psud.fr/crispr/CRISPRHomePage.php
      • GI predictions were taken from the union of IslandPick, IslandPath-DIMOB, and SIGI-HMM
      • The numbers of CRISPRs inside and outside GIs were compared
      CRISPRs are over-represented in GIs
      Domain of Life     | Number of Genomes | Number of GIs | Proportion of Genome in GIs | Total Number of CRISPRs | Expected CRISPRs in GIs | Observed CRISPRs in GIs | Significance (Chi-square Test)*
      Archaea            | 49  | 298  | 3.7% | 206  | 7.7  | 14  | 0.020
      Bacteria           | 306 | 4874 | 6.4% | 837  | 53.3 | 114 | 8.1 x 10^-18
      Archaea & Bacteria | 355 | 5172 | 6.1% | 1043 | 64.0 | 128 | 1.6 x 10^-16
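The significance column can be approximated with a one-degree-of-freedom chi-square test that compares observed and expected CRISPR counts inside and outside GIs. The sketch below assumes that this is the form of the test used, which the slide does not state, and takes the expected counts directly from the table. The function name is hypothetical.

```python
from math import erfc, sqrt

def chi_square_in_vs_out(observed_in, expected_in, total):
    """Chi-square test (1 degree of freedom) on inside-GI versus outside-GI counts.
    Returns (chi2, p_value)."""
    observed_out = total - observed_in
    expected_out = total - expected_in
    chi2 = ((observed_in - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Values from the Archaea and Bacteria rows of the table.
chi2_archaea, p_archaea = chi_square_in_vs_out(observed_in=14, expected_in=7.7, total=206)
chi2_bacteria, p_bacteria = chi_square_in_vs_out(observed_in=114, expected_in=53.3, total=837)
```

Under these assumptions the Archaea row gives a p-value of about 0.02 and the Bacteria row a p-value on the order of 10^-18, in line with the table; small differences are expected because the expected counts in the table are rounded.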
    41. Phage genes within GIs
      • Many GIs are known to contain phage genes
      • What proportion of GI genes have links to phage?
      • Identified genes with “phage” in their annotation within GIs
      • 35% of all ‘phage genes’ are within GIs (6% expected)
      Phage genes are over-represented in GIs
      Genomic Regions | Observed number of ‘phage genes’ | Expected³ number of ‘phage genes’ | Total number of genes in region | Chi-Square Test
      Inside GIs¹     | 6990  | 1264.22  | 165784  | ~0
      Outside GIs¹    | 12868 | 18593.78 | 2438303 |
    42. Archaea and CRISPRs
      • Prevalence of CRISPRs in Archaea genomes could result in reduced phage genes
                                          | Archaea | Bacteria
      Genomes containing a CRISPR         | 90%     | 40%
      Proportion of phage genes           | 0.10%   | 0.79%
      Proportion of GIs with a phage gene | 5.1%    | 17.6%
    43. GIs with CRISPRs and phage genes
      • Is there evidence supporting that some CRISPRs are being transferred by phage?
      • GIs containing CRISPR(s) also contain an over-representation of phage genes, suggesting that some CRISPRs are transferred by phage
      Genomic Regions           | Observed number of ‘phage genes’ | Expected³ number of ‘phage genes’ | Total number of genes in region | Chi-Square Test
      GIs containing CRISPR(s)² | 13  | 4.5   | 1500   | 5.7 x 10^-5
      Outside GIs²              | 812 | 820.5 | 274073 |
    44. CRISPR conclusions
      • CRISPR over-representation in GIs suggests that they are being horizontally transferred
      • Some GIs that contain CRISPRs may have phage origins
      • CRISPRs in Archaea could be limiting HGT by increasing resistance to phage
    45. Conclusions
      • Several advances in GI computational prediction
        • IslandPick, a novel automated comparative genomics based GI prediction program
        • Analysis of the accuracy of several sequence-based GI prediction methods
        • IslandViewer: An integrated interface for computational identification and visualization of genomic islands
      • Insights into GI evolution and their pathogenicity
        • P. aeruginosa LES – evidence that genomic islands and prophage regions contain genes that provide a competitive advantage for infection in a chronic rat infection model.
        • CRISPRs and their association with genomic islands
    46. Acknowledgements
      • Supervisor: Dr. Fiona Brinkman
      • Supervisory Committee: Dr. Baillie, Dr. Pio
      • P. aeruginosa LES: Craig Winstanley, Roger Levesque, Bob Hancock, Nick Thomson