Eisen.All Hands

Talk summarizing our GEBA (Genomic Encyclopedia of Bacteria and Archaea) project for the "All Hands" meeting at the Joint Genome Institute.

  1. A phylogeny-driven genomic encyclopedia of bacteria and archaea (or what is GEBA anyway?)
     Jonathan A. Eisen, October 27, 2009
  2. From http://genomesonline.org
  3. rRNA Tree of Life
  4. The Tree is not Happy
  5. As of 2002
     • At least 40 phyla of bacteria
     Tree labels: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11 (based on Hugenholtz, 2002)
  6. As of 2002 (same tree as slide 5)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
  7. As of 2002 (same tree as slide 5)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Some other phyla are only sparsely sampled
  8. As of 2002 (same tree as slide 5)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Some other phyla are only sparsely sampled
     • Same trend in Archaea
  9. Need for Tree Guidance Well Established
     • Common approach within some eukaryotic groups
     • Many small projects funded to fill in some bacterial or archaeal gaps
     • Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature
  10. (Tree of bacterial phyla, same labels as on slides 5 to 8; legend: Well sampled phyla)
     • At least 100 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Most phyla with cultured species are sparsely sampled
     • Lineages with no cultured taxa even more poorly sampled
     • Solution: use tree to really fill gaps
  11. http://www.jgi.doe.gov/programs/GEBA/pilot.html
  12. GEBA Pilot Project Overview
     • Identify major branches in rRNA tree for which no genomes are available
     • Identify a cultured representative for each group
     • Grow >200 of these and prep DNA
     • Sequence and finish 100
     • Annotate, analyze, release data
     • Assess benefits of tree-guided sequencing
  13. GEBA Pilot Project: Components
     • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)
     • Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
     • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
     • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
     • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.)
     • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)
     • Adopt a microbe education project (Cheryl Kerfeld)
     • Outreach (David Gilbert)
     • $$$ (DOE, Eddy Rubin, Jim Bristow)
  14. Some Lessons From GEBA
  15. GEBA Lesson 1: rRNA Tree of Life is a Useful Guide and Genomes Improve Resolution
  16. GEBA Lesson 2: Phylogenetically Guided Selection Can Help Annotate Other Genomes
  17. Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling
     • Better definition of protein family sequence “patterns”
     • Greatly improves “comparative” and “evolutionary” based predictions
     • Conversion of hypotheticals into conserved hypotheticals
     • Linking distantly related members of protein families
     • Improved non-homology prediction
     (Kostas Mavrommatis, Natalia Ivanova, Thanos Lykidis, Nikos Kyrpides, Iain Anderson)
  18. GEBA Lesson 3: Phylogenetically Guided Selection Can Help Study Uncultured Organisms
  19. Environmental Shotgun Sequencing [diagram labels: shotgun, sequence]
  20. Binning challenge [diagram: items A to G in one column, T to Z in the other]
  21. Metagenomic Analysis Improves
     • Small but real improvement in metagenomic annotation and analysis
     (Sean Hooper, Amrita Pati)
  22. GEBA Lesson 4: We have still only scratched the surface of microbial diversity
  23. Protein Family Rarefaction Curves
     • Take data set of multiple complete genomes
     • Identify all protein families using MCL
     • Plot # of genomes vs. # of protein families
  24. Phylogenetic Distribution Novelty: 1st Bacterial Actin-Related Protein (Haliangium ochraceum DSM 14365)
     (Victor Kunin, Patrik D’haeseleer, Adam Zemla)
  25. Phylogenetic Diversity with GEBA
  26. Phylogenetic Diversity: Isolates
  27. Phylogenetic Diversity: All
  28. (Tree of bacterial phyla, same labels as on slides 5 to 8; legend: Well sampled phyla, Poorly sampled, No cultured taxa)
     • At least 40 phyla of bacteria
     • Genome sequences are mostly from three phyla
     • Most phyla with cultured species are sparsely sampled
     • Lineages with no cultured taxa even more poorly sampled
  29. Uncultured Lineages: Technical Approaches
     • Get into culture
     • Enrichment cultures
     • If abundant in low diversity ecosystems
     • Flow sorting
     • Microbeads
     • Microfluidic sorting
     • Single cell amplification
  30. GEBA Lesson 6: Need Experiments from Across the Tree of Life too
  31. Adopt a Microbe
  32. MICROBES
  33. A Happy Tree of Life
  34. Related Lesson 1: METADATA ROCKS
  35. SIGS
     • The Genomic Standards Consortium
     • The GSC is an open-membership working body which formed in September 2005.
     • The goal of this international community is to promote mechanisms that standardize the description of genomes and the exchange and integration of genomic data.
     • See http://gensc.org/gc_wiki/index.php/Main_Page
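
The rarefaction procedure on the "Protein Family Rarefaction Curves" slide (take complete genomes, group proteins into families, plot number of genomes against number of protein families) can be sketched roughly as below. This is only an illustration, not the GEBA analysis code: it assumes the MCL clustering has already been run and that its output has been turned into a mapping from genome name to a set of family IDs. The function name `rarefaction_curve`, the parameter names, and the toy genomes are all hypothetical.

```python
import random
from statistics import mean


def rarefaction_curve(families_by_genome, n_permutations=100, seed=0):
    """Mean number of distinct protein families seen after adding 1..N genomes.

    families_by_genome: dict mapping a genome name to the set of protein
    family IDs found in it (assumed to come from an upstream clustering
    step such as MCL; that step is not performed here).
    """
    rng = random.Random(seed)
    genomes = list(families_by_genome)
    counts_at_step = [[] for _ in genomes]

    for _ in range(n_permutations):
        # Add genomes in a random order so the curve does not depend
        # on the arbitrary order of the input.
        order = genomes[:]
        rng.shuffle(order)
        seen = set()
        for step, genome in enumerate(order):
            seen |= families_by_genome[genome]
            counts_at_step[step].append(len(seen))

    # Average over permutations: one y-value per number of genomes.
    return [mean(counts) for counts in counts_at_step]


# Hypothetical toy input, for illustration only.
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5"},
}
curve = rarefaction_curve(toy)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, n_families)
```

Plotting the returned values against 1..N gives the curve. A curve that is still climbing at the last genome suggests that new protein families are still being found, which is the point made on the "We have still only scratched the surface of microbial diversity" slide.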