[13.07.07] albertsen mewe13 metagenomics

    1. Metagenomics - Potentials and pitfalls. Mads Albertsen, MEWE 2013. CENTER FOR MICROBIAL COMMUNITIES
    2. Agenda: Introduction; Pitfalls; Potentials; Recommendations. CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
    3. Introduction: Genome = parts list of a single genome
    4. Introduction: Metagenome = parts list of the community. Photo: D. Kunkel; color, E. Latypova
    5. Introduction: "...functional analysis of the collective genomes of soil microflora, which we term the metagenome of the soil." - J. Handelsman et al., 1998
    6. Introduction: "...functional analysis of the collective genomes of soil microflora, which we term the metagenome of the soil." - J. Handelsman et al., 1998. PubMed: metagenom*[Title/Abstract]
    7. Introduction: "...functional analysis of the collective genomes of soil microflora, which we term the metagenome of the soil." - J. Handelsman et al., 1998. PubMed: metagenom*[Title/Abstract]. Sequencing costs: http://www.genome.gov/sequencingcosts/
    8. Introduction: Metagenomics ≠ Amplicon sequencing
    9. Sequencing and assembly: ≈3,000,000 bp per genome; ≈1,000 bp+ contigs; 150 bp reads
    10. Assigning information: Contigs; Function; Taxonomy; Databases; Binning
    11. What has metagenomics been used for? Exploration. Rusch et al., 2007 PLoS Biology: • 6.3 Gbp of sequence (2 x human genomes, 2,000 x bacterial genomes) • Most sequences were novel compared to the databases. Qin et al., 2010 Nature: • 127 human gut metagenomes • 600 Gbp of sequence (200 x human genomes) • 3.3 million genes identified • Minimal gut metagenome defined
    12. What has metagenomics been used for? Comparative. Dinsdale et al., 2008 Nature: • A characteristic microbial fingerprint for each of the nine different ecosystem types. Specific functions. Hess et al., 2011 Science: • Identified 27,755 putative carbohydrate-active genes from a cow rumen metagenome • Expressed 90 candidates, of which 57% had enzymatic activity against cellulosic substrates
    13. What has metagenomics been used for? Extracting genomes. Garcia Martin et al., 2006 Nat. Biotechnol.: • Genome extraction from a low complexity metagenome • Candidatus Accumulibacter phosphatis • The first genome of a polyphosphate accumulating organism (PAO) with a major role in enhanced biological phosphorus removal. Albertsen et al., 2013 Nat. Biotechnol.: • Genome extraction of low abundant species (< 0.1%) from metagenomes • First complete TM7 genome • Access to genomes of the "uncultured majority"
    14. Pitfalls
    15. Metagenomics made easy: Great resources, but use with care
    16. MG-RAST example: Contigs
    17. Dataset overview
    18. Taxonomy and Function overview: Taxonomy; Function
    19. Compare with other samples: Samples; Functional categories
    20. Pitfalls: You always get billions of data!
    21. Pitfalls: Is your DNA extraction OK? ... and the samples you want to compare with? Did you sequence enough? Do you know the GC bias of your protocol? Did you normalize for sequencing depth? Did you use the same sequencing platform? Assembly = data not quantitative! Are you comparing assembled data with reads?
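The question "Did you normalize for sequencing depth?" can be illustrated with a minimal Python sketch. The function name and the read counts below are hypothetical and are not taken from the talk; the sketch only shows that raw counts from samples sequenced to different depths become comparable once each count is divided by the sample total.

```python
def relative_abundance(counts):
    """Convert raw read counts per category into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {name: n / total for name, n in counts.items()}


# Hypothetical read counts: sample_b was sequenced ten times deeper than sample_a.
sample_a = {"function_A": 200, "function_B": 800}
sample_b = {"function_A": 2000, "function_B": 8000}

rel_a = relative_abundance(sample_a)
rel_b = relative_abundance(sample_b)

# The raw counts differ tenfold, but the normalized profiles are the same.
for name in sample_a:
    print(name, rel_a[name], rel_b[name])
```

Dividing by the total is only one possible normalization; it does not correct for the other pitfalls on the slide, such as extraction or GC bias.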
    22. Databases: Contigs; Databases; Annotated metagenome. ...you only see what is in the database
    23. What is in the databases? Finished genomes in IMG vs. Greengenes 16S rRNA database. Genomes: Phyla 29; Class 46; Order 100; Species 1268. 16S: Phyla 90; Class 249; Order 405; Species 99322*. Note: only including 1 strain per species. *97% clustering
    24. MG-RAST example: Contigs. 650,000 EBPR proteins with taxonomy assigned. How similar are they to the genomes in the database?
    25. Sludge microbes vs. database genomes: 650,000 EBPR proteins. Note: not abundance weighted
    26. Sludge microbes vs. database genomes: 650,000 EBPR proteins; 1,260,000 Human gut (Qin et al., 2010 Nature; RAST ID: 4448044.3). Note: not abundance weighted
    27. Sludge microbes vs. database genomes: The 7 genera with most EBPR proteins assigned
    28. Effect of missing genomes: What is the effect of not having closely related genomes in the database? 1. Remove a genome from the database 2. Search the removed genome against the database
    29. Effect of missing genomes (Best hit): Accumulibacter phosphatis, 4326 proteins, blastp. Related genomes: Bacteria 1268; Proteobacteria 564; Betaproteobacteria 84; Rhodocyclales 5; Rhodocyclaceae 5
    30. Effect of missing genomes (Best hit): Accumulibacter phosphatis, 4326 proteins, blastp. Azoarcus. Related genomes: Bacteria 1268; Proteobacteria 564; Betaproteobacteria 84; Rhodocyclales 5; Rhodocyclaceae 5
    31. Effect of missing genomes (MEGAN LCA): Accumulibacter phosphatis, 4326 proteins, blastp. Lowest common ancestor (LCA) approach: Hit 1: Beta-proteobacteria 80% ID; Hit 2: Gamma-proteobacteria 79% ID; Hit 3: Actinobacteria 59% ID; Assigned to Proteobacteria. Related genomes: Bacteria 1268; Proteobacteria 564; Betaproteobacteria 84; Rhodocyclales 5; Rhodocyclaceae 5
    32. Effect of missing genomes (MEGAN LCA): Accumulibacter phosphatis, 4326 proteins, blastp. Genus. No hits 261; Bacteria 325; Proteobacteria 860; Beta- 853; Rhodocyclaceae 1149. 4326 proteins: • 27% correctly classified on genus level • 54% not assigned the correct class • 101 genera identified. Lowest common ancestor (LCA) approach: Hit 1: Beta-proteobacteria 80% ID; Hit 2: Gamma-proteobacteria 79% ID; Hit 3: Actinobacteria 59% ID; Assigned to Proteobacteria. Related genomes: Bacteria 1268; Proteobacteria 564; Betaproteobacteria 84; Rhodocyclales 5; Rhodocyclaceae 5
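The lowest common ancestor example on the slides (three hits, assigned to Proteobacteria) can be reproduced with a short Python sketch. This is not MEGAN's implementation: the function names, the lineage lists, and the `top_fraction` filter on percent identity are assumptions made for illustration. The slides do not state which hits MEGAN keeps; the filter is included because the 59% Actinobacteria hit has to be excluded for the result to be Proteobacteria rather than Bacteria.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxon shared by all lineages (root-to-tip lists)."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared[-1] if shared else "unassigned"


def assign_taxon(hits, top_fraction=0.9):
    """Assign a protein from a list of (percent_identity, lineage) hits.

    Assumption: only hits within top_fraction of the best identity are kept.
    """
    if not hits:
        return "No hits"
    best = max(identity for identity, _ in hits)
    kept = [lineage for identity, lineage in hits if identity >= top_fraction * best]
    return lowest_common_ancestor(kept)


# The three hits from the slide, with hypothetical lineage lists.
hits = [
    (80.0, ["Bacteria", "Proteobacteria", "Betaproteobacteria"]),
    (79.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria"]),
    (59.0, ["Bacteria", "Actinobacteria"]),
]
result = assign_taxon(hits)
print(result)
```

With a looser filter (for example `top_fraction=0.5`) all three hits are kept and the assignment falls back to Bacteria, which shows how the absence of close relatives in the database pushes assignments toward the root.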
    33. Effect of missing genomes (MEGAN LCA): Nitrospira defluvii, blastp. Related genomes: Bacteria 1268; Nitrospirae 3. Phylum. 4268 proteins: • 1% correctly classified on phylum level
    34. Effect of missing genomes (MEGAN LCA + KEGG): Nitrospira defluvii, blastp. Related genomes: Bacteria 1268; Nitrospirae 3. What about function?
    35. Effect of missing genomes (MEGAN LCA + KEGG): Nitrospira defluvii, blastp. Related genomes: Bacteria 1268; Nitrospirae 3
    36. Effect of missing genomes (MEGAN LCA + KEGG): Nitrospira defluvii, blastp. Related genomes: Bacteria 1268; Nitrospirae 3
    37. Implication of missing genomes: Function A; Function B; Function C; Function D
    38. Pitfalls: You always get billions of data!
    39. Potentials
    40. Potentials: 1. Hunting novel antibiotic resistance genes 2. Extracting genomes from metagenomes
    41. Hunting novel antibiotic resistance genes: What if you want to find something that is not in the database?
    42. Hunting novel antibiotic resistance genes: Functional metagenomics. M. Sommer, DTU, Denmark (in prep)
    43. Hunting novel antibiotic resistance genes: 89 different antibiotic resistance genes; 19 novel. M. Sommer, DTU, Denmark (in prep)
    44. Hunting novel antibiotic resistance genes: How abundant are the antibiotic genes in the environment?
    45. Hunting novel antibiotic resistance genes: The number of metagenome reads reflects the abundance of the bacteria. Bacteria; Reads
    46. Hunting novel antibiotic resistance genes: Bacteria; Reads
    47. Hunting novel antibiotic resistance genes: Bacteria; Reads
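The idea that read counts reflect abundance can be turned into a number with a small Python sketch. The slides do not say how the gene abundances were computed, so the formulas below are a common convention assumed for illustration, and the function names and example values are hypothetical. The 150 bp read length is the one quoted earlier in the talk.

```python
def mean_coverage(mapped_reads, read_length, gene_length):
    """Average number of times each base of the gene is covered by reads."""
    return mapped_reads * read_length / gene_length


def coverage_per_million_reads(mapped_reads, read_length, gene_length, total_reads):
    """Coverage scaled by sequencing depth so that metagenomes can be compared."""
    return mean_coverage(mapped_reads, read_length, gene_length) / (total_reads / 1_000_000)


# Hypothetical example: 40 reads of 150 bp map to a 1,000 bp resistance gene
# in a metagenome of 2,000,000 reads.
cov = mean_coverage(40, 150, 1000)
scaled = coverage_per_million_reads(40, 150, 1000, 2_000_000)
print(cov, scaled)
```

Dividing by gene length keeps long genes from looking more abundant than short ones, and dividing by the total read count keeps deeply sequenced metagenomes from dominating the comparison.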
    48. Hunting novel antibiotic resistance genes: Metagenomes; Antibiotic genes. 89 different antibiotic resistance genes. M. Sommer, DTU, Denmark (in prep)
    49. Extracting genomes
    50. Extracting genomes: ≈3,000,000 bp per genome; ≈1,000 bp+ contigs; 150 bp reads. Why not full genomes?
    51. Extracting genomes: ≈3,000,000 bp per genome; ≈1,000 bp+ contigs; 150 bp reads. Why not full genomes? 1. Micro-diversity 2. Separation of genomes (Binning)
    52. Extracting genomes: Not 1 strain; many closely related strains: AAAAAAAAAAAAAA, AAAAAAAAATAAAA, AAAAAAAAACAAAA. Assembly. What you get: AAAAAAAAA, TAAAA, CAAAA, AAAAA
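The micro-diversity example on the slide can be mimicked with a toy Python sketch. This is not an assembler: it only splits the three strain sequences from the slide at the first position where they disagree, which is roughly what the "What you get" fragments show. The function name is hypothetical.

```python
import os


def collapse_strains(strains):
    """Toy model: return the shared prefix and the distinct strain-specific tails."""
    prefix = os.path.commonprefix(strains)
    tails = sorted({s[len(prefix):] for s in strains})
    return prefix, tails


# The three strains from the slide: identical except at one position.
strains = [
    "A" * 14,                  # AAAAAAAAAAAAAA
    "A" * 9 + "T" + "A" * 4,   # AAAAAAAAATAAAA
    "A" * 9 + "C" + "A" * 4,   # AAAAAAAAACAAAA
]
prefix, tails = collapse_strains(strains)
print(prefix, tails)
```

One shared fragment plus three short strain-specific fragments comes out instead of one long sequence, which is why a single variable position is enough to break the assembly of a genome.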
    53. Extracting genomes: Metagenome assembly is not quantitative!
    54. Reduce micro-diversity: Low micro-diversity; High micro-diversity; Short term enrichment
    55. Extracting genomes: ≈3,000,000 bp per genome; ≈1,000 bp+ contigs; 150 bp reads. Why not full genomes? 1. Micro-diversity 2. Separation of genomes (Binning)
    56. Binning: Genomic signatures: - GC / codon usage - Tetranucleotide frequency + statistical method. Complex sample; PhD student; "Binning"
    57. Binning: Genomic signatures: - GC / codon usage - Tetranucleotide frequency + statistical method. Complex sample; PhD student; "Binning". Problems: - Short pieces of sequence (1-10 kbp) - Local sequence divergence
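The two genomic signatures named on the binning slides, GC content and tetranucleotide frequency, can be computed with a short Python sketch. The function names and the example contig are hypothetical. The "statistical method" the slides mention is not specified, so the sketch stops at producing the feature vector that such a method would take as input.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetranucleotide_frequencies(seq):
    """Frequency of each tetranucleotide, using overlapping windows.

    Windows containing characters other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    valid = {k: counts[k] for k in TETRAMERS}
    total = sum(valid.values())
    return [valid[k] / total if total else 0.0 for k in TETRAMERS]


contig = "ACGTACGTAC"  # hypothetical, far shorter than a real contig
gc = gc_content(contig)
freqs = tetranucleotide_frequencies(contig)
print(gc, freqs[TETRAMERS.index("ACGT")])
```

A 1 kbp contig has fewer than 1,000 windows spread over 256 possible tetranucleotides, so the frequencies are noisy. That is the "short pieces of sequence" problem listed on the slide. Real pipelines often also merge each tetranucleotide with its reverse complement, which this sketch does not do.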
    58. Binning: Sequence composition-independent binning. Sample 1: Abundance. Sample 2: Abundance
    59. Binning: Sequence composition-independent binning. Sample 1: Abundance. Sample 2: Abundance. Plot axes: Abundance Sample 1; Abundance Sample 2
    60. Binning: 1. Reduce micro-diversity 2. Use multiple related samples. Plot axes: Abundance Sample 1; Abundance Sample 2
    61. Binning: 1. Reduce micro-diversity 2. Use multiple related samples. Plot axes: Abundance Sample 1; Abundance Sample 2
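Composition-independent binning relies on contigs from the same genome having similar abundance in each sample, so they cluster together when abundance in sample 1 is plotted against abundance in sample 2. A minimal Python sketch, assuming a bin is picked by drawing a rectangular window in that plot; the function name, contig names, and coverage values are hypothetical.

```python
def select_bin(contigs, cov1_range, cov2_range):
    """Return the names of contigs whose coverage in both samples falls inside the window.

    contigs maps a contig name to (coverage in sample 1, coverage in sample 2).
    """
    lo1, hi1 = cov1_range
    lo2, hi2 = cov2_range
    return sorted(
        name
        for name, (c1, c2) in contigs.items()
        if lo1 <= c1 <= hi1 and lo2 <= c2 <= hi2
    )


# Hypothetical contigs from two genomes whose abundance shifts between samples.
contigs = {
    "contig_1": (100.0, 10.0),
    "contig_2": (95.0, 12.0),
    "contig_3": (20.0, 60.0),
    "contig_4": (22.0, 55.0),
}
bin_a = select_bin(contigs, (80, 120), (5, 20))
bin_b = select_bin(contigs, (10, 30), (50, 70))
print(bin_a, bin_b)
```

If both genomes had the same abundance in both samples, the two clusters would overlap and the window could not separate them, which is why the slides recommend using multiple related samples in which the abundances differ.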
    62. Simple reactors: • Nitrospira enrichment running for years • 3 dominant species • No micro-diversity. H. Daims & C. Dorninger, DOME, University of Vienna
    63. Short term enrichment: Full-scale EBPR plant; SBR reactor; Days. 1. Reduction of (micro)-diversity. Competibacter. Albertsen et al., 2013 Nat. Biotech.
    64. Short term enrichment: Full-scale EBPR plant; SBR reactor. 2. Two different DNA extraction methods. Albertsen et al., 2013 Nat. Biotech.
    65. Colored using a set of 100 phylogenetic marker genes. Albertsen et al., 2013 Nat. Biotech.
    66. Colored using a set of 100 phylogenetic marker genes: TM7-1 (1.6%); TM7-2 (0.7%); TM7-3 (0.2%); TM7-4 (0.06%). Albertsen et al., 2013 Nat. Biotech.
    67. Zoom on target: TM7-2 (0.7%). Colored using a set of 100 phylogenetic marker genes. Albertsen et al., 2013 Nat. Biotech.
    68. Zoom on target: TM7-2 (0.7%). PCA on genomic signatures (PC1, PC2): TM7-2. Colored using a set of 100 phylogenetic marker genes. Albertsen et al., 2013 Nat. Biotech.
    69. TM7-1 (1.6%): Candidate phylum TM7; Saccharibacteria; Candidatus Saccharimonas aalborgensis. Colored using a set of 100 phylogenetic marker genes. Albertsen et al., 2013 Nat. Biotech.
    70. Candidatus Competibacter denitrificans (10.6%). Poster by S. McIlroy. Albertsen et al., 2013 Nat. Biotech.
    71. Genome assembly validation: Essential single copy genes (Phyla; Genes (HMM model)); Assembly inspection. Albertsen et al., 2013 Nat. Biotech.
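Validation with essential single copy genes can be sketched in Python as a simple count: a complete, uncontaminated genome bin should contain each essential gene exactly once. The function name and the gene names below are hypothetical placeholders, and the slide does not list the HMM models that were used.

```python
from collections import Counter


def assess_bin(found_genes, essential_genes):
    """Return (completeness, duplicated genes) for one genome bin.

    completeness: fraction of the essential genes found at least once.
    duplicated:   essential genes found more than once, a hint of
                  contamination or of several strains in the bin.
    """
    counts = Counter(g for g in found_genes if g in essential_genes)
    completeness = len(counts) / len(essential_genes)
    duplicated = sorted(g for g, n in counts.items() if n > 1)
    return completeness, duplicated


# Hypothetical essential gene set and hypothetical hits in one bin.
essential = {"geneA", "geneB", "geneC", "geneD"}
found = ["geneA", "geneB", "geneB", "geneC", "other_gene"]

completeness, duplicated = assess_bin(found, essential)
print(completeness, duplicated)
```

Here three of four essential genes are present and one is duplicated, so the bin would be flagged as both incomplete and possibly contaminated.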
    72. Multi-metagenome: http://madsalbertsen.github.io/multi-metagenome/ (short: goo.gl/0ctA3) • Guides • Workflow scripts • Example data • All the code • Recommendations. Albertsen et al., 2013 Nat. Biotech.
    73. Multi-metagenome: Highly complex environments... ...add more samples! Talk by SM. Karst
    74. Potentials: Metabolites (Metabolomics); Proteins (Meta-proteomics); mRNA (Meta-transcriptomics); DNA (Meta-genomics). In situ methods; Community structure; Microbial functions; Extraction. P-Removal: N-Removal: -Removal: Foaming: Ethanol production: Microbial needs
    75. Recommendations: • Do you really need metagenomics? • Are the databases useful in your environment? • Unless human related, they are not... • Metagenomics is just the parts list ... of the DNA that could be extracted ... and the functions that could be annotated • Validation, validation, validation! • Bioinformatic • In situ • Genome extraction from simple reactors is possible • Enables comprehensive transcriptomics
    76. Metagenomics is pretty... ...but not always informative
