A phylogeny driven genomic encyclopedia of bacteria and archaea

Slides for talk given by Jonathan Eisen at Stanford University April 17, 2010


A phylogeny driven genomic encyclopedia of bacteria and archaea

  1. A phylogeny driven genomic encyclopedia of bacteria and archaea. Jonathan A. Eisen. Talk at Stanford University, April 17, 2010. Saturday, April 24, 2010
  2. Bacteria evolve
  3. Fleischmann et al. 1995
  4. Microbial genomes. From http://genomesonline.org
  5. [no text on slide]
  6. rRNA Tree of Life. Based on tree by Norm Pace
  7. The Tree is not Happy. Based on tree by Norm Pace
  8. As of 2002 (tree of bacterial phyla, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
     Phyla on the tree: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11
  9. As of 2002 (same tree, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
  10. As of 2002 (same tree, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Some other phyla are only sparsely sampled
  11. As of 2002 (same tree, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Some other phyla are only sparsely sampled
     • Same trend in Archaea
  12. As of 2002 (same tree, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Some other phyla are only sparsely sampled
     • Same trend in Eukaryotes
  13. Filling in the Genomic Phylogenetic Gaps
     • Common approach within some eukaryotic groups
     • Many small projects funded to fill in some bacterial or archaeal gaps
     • Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature
  14. [Tree of bacterial phyla, as on the preceding slides]
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Some other phyla are only sparsely sampled
     • Solution I: sequence more phyla
     • NSF-funded Tree of Life Project
     • A genome from each of eight phyla
     Eisen & Ward, PIs
  15. Organisms Selected

      Phylum                  Species selected
      Chrysiogenes            Chrysiogenes arsenatis (GCA)
      Coprothermobacter       Coprothermobacter proteolyticus (GCBP)
      Dictyoglomi             Dictyoglomus thermophilum (GDT)
      Thermodesulfobacteria   Thermodesulfobacterium commune (GTC)
      Nitrospirae             Thermodesulfovibrio yellowstonii (GTY)
      Thermomicrobia          Thermomicrobium roseum (GTR)
      Deferribacteres         Geovibrio thiophilus (GGT)
      Synergistes             Synergistes jonesii (GSJ)
  16. Bacterial aTOL Project Aims
     • Improve resolution of deep branches in the bacterial tree
     • Launch biological studies of these phyla
     • Leverage data for interpreting environmental surveys
  17. T. roseum genome
  18. Microbial genomes. From http://genomesonline.org
  19. The Tree of Life is Still Angry
  20. Major Lineages of Actinobacteria
     2.5 Actinobacteria
       2.5.1 Acidimicrobidae
         2.5.1.1 Unclassified
         2.5.1.2 "Microthrixineae"
         2.5.1.3 Acidimicrobineae (2.5.1.3.1 Unclassified; 2.5.1.3.2 Acidimicrobiaceae)
         2.5.1.4 BD2-10
         2.5.1.5 EB1017
       2.5.2 Actinobacteridae
         2.5.2.1 Unclassified
         2.5.2.2 Actinomyces
         2.5.2.3 Actinomycineae
         2.5.2.4 Actinosynnemataceae
         2.5.2.5 Bifidobacteriaceae
         2.5.2.6 Brevibacteriaceae
         2.5.2.7 Cellulomonadaceae
         2.5.2.8 Corynebacterineae (2.5.2.8.1 Unclassified; 2.5.2.8.2 Corynebacteriaceae; 2.5.2.8.3 Dietziaceae; 2.5.2.8.4 Gordoniaceae; 2.5.2.8.5 Mycobacteriaceae; 2.5.2.8.6 to 2.5.2.8.8 Rhodococcus)
         2.5.2.9 Dermabacteraceae (2.5.2.9.1 Unclassified; 2.5.2.9.2 Brachybacterium; 2.5.2.9.3 Dermabacter)
         2.5.2.10 Ellin306/WR160
         2.5.2.11 Ellin5012
         2.5.2.12 Ellin5034
         2.5.2.13 Frankineae (2.5.2.13.1 Unclassified; 2.5.2.13.2 Acidothermaceae; 2.5.2.13.3 Ellin6090; 2.5.2.13.4 Frankiaceae; 2.5.2.13.5 Geodermatophilaceae; 2.5.2.13.6 Microsphaeraceae; 2.5.2.13.7 Sporichthyaceae)
         2.5.2.14 Glycomyces
         2.5.2.15 Intrasporangiaceae (2.5.2.15.1 Unclassified; 2.5.2.15.2 Dermacoccus; 2.5.2.15.3 Intrasporangiaceae)
         2.5.2.16 Kineosporiaceae
         2.5.2.17 Microbacteriaceae (2.5.2.17.1 Unclassified; 2.5.2.17.2 Agrococcus; 2.5.2.17.3 Agromyces)
         2.5.2.18 Micrococcaceae
         2.5.2.19 Micromonosporaceae
         2.5.2.20 Propionibacterineae (2.5.2.20.1 Unclassified; 2.5.2.20.2 Kribbella; 2.5.2.20.3 Nocardioidaceae; 2.5.2.20.4 Propionibacteriaceae)
         2.5.2.21 Pseudonocardiaceae
         2.5.2.22 Streptomycineae (2.5.2.22.1 Unclassified; 2.5.2.22.2 Kitasatospora; 2.5.2.22.3 Streptacidiphilus)
         2.5.2.23 Streptosporangineae (2.5.2.23.1 Unclassified; 2.5.2.23.2 Ellin5129; 2.5.2.23.3 Nocardiopsaceae; 2.5.2.23.4 Streptosporangiaceae; 2.5.2.23.5 Thermomonosporaceae)
       2.5.3 Coriobacteridae
         2.5.3.1 Unclassified
         2.5.3.2 Atopobiales
         2.5.3.3 Coriobacteriales
         2.5.3.4 Eggerthellales
       2.5.4 OPB41
       2.5.5 PK1
       2.5.6 Rubrobacteridae
         2.5.6.1 Unclassified
         2.5.6.2 "Thermoleiphilaceae" (2.5.6.2.1 Unclassified; 2.5.6.2.2 Conexibacter; 2.5.6.2.3 XGE514)
         2.5.6.3 MC47
         2.5.6.4 Rubrobacteraceae
  21. [Tree of bacterial phyla, as on the preceding slides, with well sampled phyla marked]
     • At least 100 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Most phyla with cultured species are sparsely sampled
     • Lineages with no cultured taxa even more poorly sampled
     • Solution: use tree to really fill gaps
  22. http://www.jgi.doe.gov/programs/GEBA/pilot.html
  23. GEBA Pilot Project Overview
     • Identify major branches in rRNA tree for which no genomes are available
     • Identify a cultured representative for each group
     • Grow > 200 of these and prepare DNA
     • Sequence and finish 100
     • Annotate, analyze, release data
     • Assess benefits of tree-guided sequencing
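The first step in the overview above (finding branches of the tree that have no sequenced genome) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GEBA pipeline: the nested-tuple tree, the choice of lineage names as leaves, and the function names `leaves` and `unsampled_clades` are all hypothetical.

```python
# Hypothetical toy tree: nested tuples are internal nodes, strings are leaves.
# The topology is invented for illustration and is not the rRNA tree.
TOY_TREE = (
    ("Proteobacteria", "Firmicutes"),
    (("Dictyoglomi", "Thermomicrobia"), "Actinobacteria"),
)


def leaves(node):
    """Return the set of leaf names under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def unsampled_clades(node, sequenced):
    """Return the largest clades that contain no sequenced leaf."""
    tips = leaves(node)
    if not tips & sequenced:
        # Nothing under this node is sequenced: report the whole clade.
        return [sorted(tips)]
    if isinstance(node, str):
        # A sequenced leaf: nothing to report.
        return []
    found = []
    for child in node:
        found.extend(unsampled_clades(child, sequenced))
    return found


if __name__ == "__main__":
    sequenced = {"Proteobacteria", "Firmicutes", "Actinobacteria"}
    print(unsampled_clades(TOY_TREE, sequenced))
```

Reporting the largest unsampled clade, rather than each unsampled leaf, mirrors the idea of targeting major branches first; a real analysis would also need branch lengths and a list of cultured representatives.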
  24. GEBA Pilot Target List. [Figure: bar chart of # of genomes (axis 0 to 35) per phylum; bacterial groups are prefixed B: and archaeal groups A:]
  25. Why Increase Taxonomic Coverage?
     • Gene discovery
     • Annotation, functional prediction
     • Metagenomic analysis
     • Mechanisms of diversification
     • Species phylogeny and classification
  26. GEBA Pilot Project: Components
     • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)
     • Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
     • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
     • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
     • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.)
     • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)
     • Adopt a microbe education project (Cheryl Kerfeld)
     • Outreach (David Gilbert)
     • $$$ (DOE, Eddy Rubin, Jim Bristow)
  27. Assess Benefits of GEBA
     • All genomes have some value
     • But what, if any, is the benefit of tree-guided sequencing over other selection methods?
     • Lessons for other large scale microbial genome projects?
  28. GEBA Lesson 1: rRNA Tree is Useful for Identifying Phylogenetically Novel Genomes. rRNA Tree topology is not perfect; Genome-based trees better
  29. rRNA Tree of Life. Based on tree by Norm Pace
  30. [no text on slide]
  31. [no text on slide]
  32. Whole genome tree built using AMPHORA by Martin Wu and Dongying Wu
  33. PD of rRNA, Genome Trees Similar. From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html
  34. Proteobacteria
  35. GEBA Lesson 2: Phylogenetically-guided genome selection improves genome annotation
  36. Predicting Function
     • Key step in genome projects
     • More accurate predictions help guide experimental and computational analyses
     • Many diverse approaches
     • Comparative and evolutionary analysis greatly improves most predictions
  37. Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling
     • Better definition of protein family sequence “patterns”
     • Conversion of hypothetical into conserved hypotheticals
     • Greatly improves “comparative” and “evolutionary” based predictions
     • Linking distantly related members of protein families
     • Improved non-homology prediction
  38. From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html
  39. GEBA Lesson 3: Improves analysis of genome data from uncultured organisms
  40. Environmental Shotgun Sequencing. [Figure labels: shotgun, clone]
  41. [no text on slide]
  42. rRNA phylotyping from metagenomics. Venter et al., 2004
  43. Shotgun Sequencing Allows Use of Alternative Anchors (e.g., RecA). Venter et al., 2004
  44. Shotgun Sequencing Allows Use of Other Markers. Venter et al., 2004
     [Figure: Sargasso Phylotypes. Weighted % of clones (axis 0 to 0.5) by major phylogenetic group, one series per marker: EFG, EFTu, rRNA, RecA, RpoB, HSP70. Groups: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota]
  45. Shotgun Sequencing Allows Use of Other Markers. Venter et al., 2004
     [Same figure as slide 44]
     Cannot be done without good sampling of genomes
  46. Binning challenge. [Figure: two columns of labels, A to G and T to Z]
  47. Binning challenge. [Same figure] Best binning method: reference genomes
  48. Binning challenge. [Same figure] No reference genome? What do you do?
  49. Binning challenge. [Same figure] No reference genome? What do you do? Phylogeny ....
  50. Phylogenetic Binning Using AMPHORA
     AMPHORA: each read on its own tree
     [Figure: bar chart (axis 0 to 0.7) by phylogenetic group, one series per marker gene. Groups: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified Proteobacteria, Cyanobacteria, Chlamydiae, Acidobacteria, Bacteroidetes, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified Bacteria. Markers: dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf]
  51. Phylogenetic Binning Using AMPHORA
     AMPHORA: each read on its own tree
     [Same figure as slide 50]
     Cannot be done without good sampling of genomes
  52. GEBA Phylogenomic Lesson 5: We have still only scratched the surface of microbial diversity
  53. Protein Family Rarefaction Curves
     • Take data set of multiple complete genomes
     • Identify all protein families using MCL
     • Plot # of genomes vs. # of protein families
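The three steps above can be illustrated with a short Python sketch. It assumes the MCL clustering has already been run and that its output has been reduced to a mapping from genome name to the set of protein family identifiers found in that genome. The genome and family names, the function name `rarefaction_curve`, and the choice to average over random genome orderings are assumptions made for the example; the slides do not say how the curves were computed.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Average number of distinct protein families seen after adding
    1, 2, ..., N genomes, over random orderings of the genomes."""
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]  # add this genome's families
            totals[i] += len(seen)           # cumulative family count
    return [total / n_permutations for total in totals]


if __name__ == "__main__":
    # Hypothetical input: family IDs per genome, e.g. parsed from MCL clusters.
    toy = {
        "genome1": {"famA", "famB"},
        "genome2": {"famB", "famC"},
        "genome3": {"famD"},
    }
    for n, families in enumerate(rarefaction_curve(toy), start=1):
        print(n, families)
```

A curve that keeps climbing as genomes are added indicates that new protein families are still being discovered; a curve that flattens indicates saturation for the set of genomes sampled.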
  54. [no text on slide]
  55. [no text on slide]
  56. [no text on slide]
  57. [no text on slide]
  58. [no text on slide]
  59. Phylogenetic Distribution Novelty: Bacterial Actin Related Protein
     [Figure: tree and alignment of actin related proteins; labels not recoverable from the extraction]
     Haliangium ochraceum DSM 14365
     Patrik D’haeseleer, Adam Zemla, Victor Kunin
     From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html
  60. rRNA Tree of Life. Based on tree by Norm Pace
  61. Phylogenetic Diversity: Sequenced Bacteria & Archaea. From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html
  62. Phylogenetic Diversity with GEBA. From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html
  63. Phylogenetic Diversity: Isolates. From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html
  64. Phylogenetic Diversity: All. From Wu et al. 2009. http://www.nature.com/nature/journal/v462/n7276/full/nature08656.html
  65. [Tree of bacterial phyla, as on the preceding slides. Legend: Well sampled phyla; Poorly sampled; No cultured taxa]
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Most phyla with cultured species are sparsely sampled
     • Lineages with no cultured taxa even more poorly sampled
  66. Uncultured Lineages: Technical Approaches
     • Get into culture
     • Enrichment cultures
     • If abundant in low diversity ecosystems
     • Flow sorting
     • Microbeads
     • Microfluidic sorting
     • Single cell amplification
  67. GEBA Phylogenomic Lesson 6: Need Experiments from Across the Tree of Life too
  68. As of 2002 (tree of bacterial phyla, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
  69. As of 2002 (same tree, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
     • Experimental studies are mostly from three phyla
  70. As of 2002 (same tree, based on Hugenholtz, 2002)
     • At least 40 phyla of bacteria
     • Experimental studies are mostly from three phyla
     • Some studies in other phyla
