Jonathan Eisen talk on "Genomic Encyclopedia" at Lake Arrowhead Small Genomes Meeting 2008
Talk by Jonathan Eisen on "A genomic encyclopedia of bacteria and archaea" at Lake Arrowhead Small Genomes meeting in 2008.

  • Gets better with more markers, but we do not have many sequences for these markers. We can get them from genomes. The more diverse the genomes, the better the marker set will be.
  • Gives useful comparison
  • Transcript

    • 1. A Genomic Encyclopedia of Bacteria and Archaea (GEBA) Jonathan A. Eisen U. C. Davis and J. G. I.
    • 2. Outline • Background – Why history matters – Gaps in available genomes • The GEBA pilot project • Future needs
    • 3. The Tree of Life
    • 4. Famous Arrowhead 2004 Quotes • Space-time continuum of genes and genomes • Gene sequences are the wormhole that allows one to tunnel into the past • The human mind can conceive of things with no basis in physical reality • Thoughts can go faster than the speed of light
    • 5. Famous Arrowhead Quotes 2006 • Publications, student degrees, etc. • Not trying to say anything bad about anyone • The human guts are a real milieu • Where’s your evening gown? • You better kiss everybody • This is how you do metagenomics on 50 dollars, and that’s Canadian dollars
    • 6. (image only)
    • 7. From http://genomesonline.org
    • 8. Major Microbial Sequencing Efforts • Coordinated, top-down efforts – Fungal Genome Initiative (Broad/Whitehead) – Gordon and Betty Moore Foundation Marine Microbial Genome Sequencing Project – Sanger Center Pathogen Sequencing Unit – NHGRI Human Gut Microbiome Project – NIH Human Microbiome Program • White paper or grant systems – NIAID Microbial Sequencing Centers – DOE/JGI Community Sequencing Program – DOE/JGI BER Sequencing Program – NSF/USDA Microbial Genome Sequencing • Covers lots of ground and biological diversity
    • 9. The Tree of Life
    • 10. The Tree is not Happy
    • 11. As of 2002 • At least 40 phyla of bacteria. Tree labels: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11. Based on Hugenholtz, 2002.
    • 12. As of 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla. Tree labels as on slide 11. Based on Hugenholtz, 2002.
    • 13. As of 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled. Tree labels as on slide 11. Based on Hugenholtz, 2002.
    • 14. As of 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Archaea, Eukaryotes. Tree labels as on slide 11. Based on Hugenholtz, 2002.
    • 15. Need for Tree Guidance Well Established • Common approach within some eukaryotic groups – NHGRI animal projects – FGI at Whitehead – Plant sequencing at JGI • Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature • Many small projects funded to fill in some gaps – DOE/TIGR Sequencing – Multiple CSP projects – Multiple NSF/USDA projects – Private projects (e.g., Integrated Genomics, Diversa) – TIGR (Eisen, Ward) Bacterial Tree of Life Project
    • 16. Why Increase Taxonomic Coverage? • Mechanisms of diversification • Gene discovery • Annotation, functional prediction • Metagenomic analysis • Species phylogeny and classification
    • 17. • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution I: sequence more phyla • Eisen-Ward NSF Tree of Life Project • A genome from each of eight phyla. Tree labels: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11. Based on Hugenholtz, 2002.
    • 18. The Tree of Life is Still Angry
    • 19. Within Phyla Diversity Immense • Each phylum represents billions of years of evolution • Some have hundreds of major lineages, most with no genomes • Need to sample within phyla too
    • 20. Major Lineages of Actinobacteria: 2.5.1 Acidimicrobidae; 2.5.1.1 Unclassified; 2.5.1.2 "Microthrixineae"; 2.5.1.3 Acidimicrobineae; 2.5.1.4 BD2-10; 2.5.1.5 EB1017; 2.5.2 Actinobacteridae; 2.5.2.1 Unclassified; 2.5.2.10 Ellin306/WR160; 2.5.2.11 Ellin5012; 2.5.2.12 Ellin5034; 2.5.2.13 Frankineae; 2.5.2.14 Glycomyces; 2.5.2.15 Intrasporangiaceae; 2.5.2.16 Kineosporiaceae; 2.5.2.17 Microbacteriaceae; 2.5.2.18 Micrococcaceae; 2.5.2.19 Micromonosporaceae; 2.5.2.2 Actinomyces; 2.5.2.20 Propionibacterineae; 2.5.2.21 Pseudonocardiaceae; 2.5.2.22 Streptomycineae; 2.5.2.23 Streptosporangineae; 2.5.2.3 Actinomycineae; 2.5.2.4 Actinosynnemataceae; 2.5.2.5 Bifidobacteriaceae; 2.5.2.6 Brevibacteriaceae; 2.5.2.7 Cellulomonadaceae; 2.5.2.8 Corynebacterineae; 2.5.2.9 Dermabacteraceae; 2.5.3 Coriobacteridae; 2.5.3.1 Unclassified; 2.5.3.2 Atopobiales; 2.5.3.3 Coriobacteriales; 2.5.3.4 Eggerthellales; 2.5.4 OPB41; 2.5.5 PK1; 2.5.6 Rubrobacteridae; 2.5.6.1 Unclassified; 2.5.6.2 "Thermoleiphilaceae"; 2.5.6.3 MC47; 2.5.6.4 Rubrobacteraceae
    • 21. Major Lineages of Actinobacteria II: 2.5 Actinobacteria; 2.5.1 Acidimicrobidae; 2.5.1.1 Unclassified; 2.5.1.2 "Microthrixineae"; 2.5.1.3 Acidimicrobineae; 2.5.1.3.1 Unclassified; 2.5.1.3.2 Acidimicrobiaceae; 2.5.1.4 BD2-10; 2.5.1.5 EB1017; 2.5.2 Actinobacteridae; 2.5.2.1 Unclassified; 2.5.2.10 Ellin306/WR160; 2.5.2.11 Ellin5012; 2.5.2.12 Ellin5034; 2.5.2.13 Frankineae; 2.5.2.13.1 Unclassified; 2.5.2.13.2 Acidothermaceae; 2.5.2.13.3 Ellin6090; 2.5.2.13.4 Frankiaceae; 2.5.2.13.5 Geodermatophilaceae; 2.5.2.13.6 Microsphaeraceae; 2.5.2.13.7 Sporichthyaceae; 2.5.2.14 Glycomyces; 2.5.2.15 Intrasporangiaceae; 2.5.2.15.1 Unclassified; 2.5.2.15.2 Dermacoccus; 2.5.2.15.3 Intrasporangiaceae; 2.5.2.16 Kineosporiaceae; 2.5.2.17 Microbacteriaceae; 2.5.2.17.1 Unclassified; 2.5.2.17.2 Agrococcus; 2.5.2.17.3 Agromyces; 2.5.2.18 Micrococcaceae; 2.5.2.19 Micromonosporaceae; 2.5.2.2 Actinomyces; 2.5.2.20 Propionibacterineae; 2.5.2.20.1 Unclassified; 2.5.2.20.2 Kribbella; 2.5.2.20.3 Nocardioidaceae; 2.5.2.20.4 Propionibacteriaceae; 2.5.2.21 Pseudonocardiaceae; 2.5.2.22 Streptomycineae; 2.5.2.22.1 Unclassified; 2.5.2.22.2 Kitasatospora; 2.5.2.22.3 Streptacidiphilus; 2.5.2.23 Streptosporangineae; 2.5.2.23.1 Unclassified; 2.5.2.23.2 Ellin5129; 2.5.2.23.3 Nocardiopsaceae; 2.5.2.23.4 Streptosporangiaceae; 2.5.2.23.5 Thermomonosporaceae; 2.5.2.3 Actinomycineae; 2.5.2.4 Actinosynnemataceae; 2.5.2.5 Bifidobacteriaceae; 2.5.2.6 Brevibacteriaceae; 2.5.2.7 Cellulomonadaceae; 2.5.2.8 Corynebacterineae; 2.5.2.8.1 Unclassified; 2.5.2.8.2 Corynebacteriaceae; 2.5.2.8.3 Dietziaceae; 2.5.2.8.4 Gordoniaceae; 2.5.2.8.5 Mycobacteriaceae; 2.5.2.8.6 Rhodococcus; 2.5.2.8.7 Rhodococcus; 2.5.2.8.8 Rhodococcus; 2.5.2.9 Dermabacteraceae; 2.5.2.9.1 Unclassified; 2.5.2.9.2 Brachybacterium; 2.5.2.9.3 Dermabacter; 2.5.3 Coriobacteridae; 2.5.3.1 Unclassified; 2.5.3.2 Atopobiales; 2.5.3.3 Coriobacteriales; 2.5.3.4 Eggerthellales; 2.5.4 OPB41; 2.5.5 PK1; 2.5.6 Rubrobacteridae; 2.5.6.1 Unclassified; 2.5.6.2 "Thermoleiphilaceae"; 2.5.6.2.1 Unclassified; 2.5.6.2.2 Conexibacter; 2.5.6.2.3 XGE514; 2.5.6.3 MC47; 2.5.6.4 Rubrobacteraceae
    • 22. • At least 100 phyla of bacteria • Genome sequences are mostly from three phyla • Most phyla with cultured species are sparsely sampled • Lineages with no cultured taxa even more poorly sampled • Solution: use tree to really fill gaps. Legend: Well sampled phyla. Tree labels: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11.
    • 23. (image only)
    • 24. GEBA Pilot Project: Components • Project management (David Bruce, Lynne Goodwin et al.) • Selection of strains (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen) • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk) • Libraries and DNA (Eileen Dalin et al.) • Sequencing and closure (Susan Lucas, Alla Lapidus et al.) • Annotation and database needs (Nikos Kyrpides) • Analysis (Dongying Wu, Martin Wu, Jenna Morgan, Victor Kunin, Marcel Huntemann, Neil Rawlings, Ian Paulsen, Gary Xie, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Mavrommatis Kostas) • Adopt a microbe education project (Cheryl Kerfeld) • Outreach (David Gilbert) • $$$ (DOE, Eddy Rubin, Jim Bristow)
    • 25. GEBA Pilot I: Identifying Lineages without Genomes
    • 26. (image only)
    • 27. (image only)
    • 28. (image only)
    • 29. GEBA Pilot II: Selecting Targets
    • 30. Key Criteria • Phylogenetic novelty – Working from top of tree down – Also selected one phylum, Actinobacteria, to fill in in more detail • Culturable – Type strain preferred if all else is equal • DOE mission relevance • Ready availability to us and community – Of strain – Of DNA
    • 31. GEBA Pilot III: Partnership with DSMZ
    • 32. GEBA Biggest Challenge: Getting DNA • Getting quality DNA is biggest bottleneck • Decided to test as part of the GEBA pilot the possibility of getting DNA directly from culture collections • DSMZ offered to do for free • ATCC is doing a small number for a fee • Working with other culture collections
    • 33. (image only)
    • 34. Quantification gel of the genomic DNA isolated from Conexibacter woesei (DSM 14684T). Lane 1: c(λ-Marker) = 15 ng; Lane 2: c(λ-Marker) = 30 ng; Lane 3: c(λ-Marker) = 50 ng; Lane 4: DNA Molecular Weight Marker II (Roche 236250); Lane 5: DSM 13279, Collinsella stercoris; Lane 6: DSM 43043, Intrasporangium calvum; Lane 7: DSM 18053, Dyadobacter fermentans; Lane 8: DSM 20476, Slackia heliotrinireducens; Lane 9: DSM 18081, Patulibacter minatonensis; Lane 10: DSM 14684, Conexibacter woesei; Lane 11: DSM 11002, Dethiosulfovibrio peptidovorans; Lane 12: DSM 11551, Halogeometricum borinquense; Lane 13: DNA Molecular Weight Marker II (Roche 236250); Lane 14: c(λ-Marker) = 125 ng; Lane 15: c(λ-Marker) = 250 ng; Lane 16: c(λ-Marker) = 500 ng. Conexibacter woesei (DSM 14684T) was taken from the German Collection of Microorganisms and Cell Cultures (DSMZ). The genomic DNA was isolated using the Qiagen Genomic 500 DNA Kit (Qiagen 10262). The genomic DNA was 10-250 kb in size as determined by Pulsed Field Gel Electrophoresis (PFGE). The bulk of DNA had a size of 50-250 kb (see attached PFGE image). The DNA concentration is 500 ng/µl as estimated from the gel. Spectrophotometric measurements yielded a DNA concentration of 450 µg/ml; 300 µl of genomic DNA are shipped (150 µg).
    • 35. GEBA Pilot IV: Sequencing Progress
    • 36. (image only)
    • 37. GEBA Pilot Target List 35 30 25 20 15 # of Genomes 10 5 0 B: Aquificae B: Chloroflexi Deinococci Firmicutes B: B: B: Bacteroidetes B: Fusobacteria B: Spirochaetes Halobacteria Thermococci A: A: Archaeoglobi Thermoprotei A: A: B: AminanaerobiaDeferribacteres B: B: Deferribacteres B: Planctomycetes A: Methanobacteria B: Haloanaerobiales Thermovenabulae B: Thermodesulfobia Methanomicrobia B: A: B: Gemmatimonadetes B: Delta Proteobacteria B: Epsilon B: Gamma Proteobacteria Proteobacteria B: Thermodesulfobacteria B: Actinobacteria (High GC) Phyla QuickTime™ and a QuickTime™ and aTIFF (Uncompressed) decompressor are needed to see this picture. TIFF (LZW) decompressor are needed to see this picture.
    • 38. GEBA Pilot Status 5-12-08 35 30 25 Closed 20 Post Draft Production 15 Library Awaiting Material # of Genomes 10 5 0 B: Aquificae B: Firmicutes B: Chloroflexi B: Deinococci B: Bacteroidetes A: HalobacteriaA: A: Thermococci B: Fusobacteria B: Spirochaetes A: ArchaeoglobiThermoprotei B: Aminanaerobia Deferribacteres B: Deferribacteres B: B: Planctomycetes B: Haloanaerobiales A: A: Methanomicrobia Methanobacteria B: Thermodesulfobia B: Thermovenabulae B: Gemmatimonadetes B: Delta Proteobacteria B: Epsilon Proteobacteria B: Thermodesulfobacteria B: Gamma ProteobacteriaB: Actinobacteria (High GC) QuickTime™ and a TIFF (Uncompressed) decompressor are needed to see this picture. Phyla QuickTime™ and a TIFF (LZW) decompressor are needed to see this picture.
    • 39. Non Active Projects 16 14 12 10 Abandoned 8 On Hold # of6Genomes 4 2 0 B: Aquificae B: ChloroflexiDeinococciFirmicutes B: B: B: Bacteroidetes B: Fusobacteria B: Spirochaetes Halobacteria A: Thermoprotei A: A: Archaeoglobi A: Thermococci B: Aminanaerobia Deferribacteres B: Deferribacteres B: B: Planctomycetes A: Methanobacteria B: HaloanaerobialesThermovenabulae B: ThermodesulfobiaMethanomicrobia B: A: B: Gemmatimonadetes B: Delta Proteobacteria B: EpsilonB: Gamma Proteobacteria Proteobacteria B: Thermodesulfobacteria B: Actinobacteria (High GC) Phyla QuickTime™ and a QuickTime™ and aTIFF (Uncompressed) decompressor are needed to see this picture. TIFF (LZW) decompressor are needed to see this picture.
    • 40. GEBA Pilot Data Release [Bar chart: # of Genomes (y-axis, 0 to 30) by Phyla (x-axis)]
    • 41. Progress Report: GEBA Status 5-12-08 • Closed 3% • Awaiting Material 26% • Post Draft 51% • Library 9% • Production 11%
    • 42. Progress
    • 43. Data
    • 44. Organism | Domain | Phylum | Status | IMG-GEBA | NCBI-PID | Culture-ID | GOLD-ID
      Acidimicrobium ferrooxidans DSM 10331 | Bacteria | Actinobacteria | draft | 2500645360 | 29525 | DSM 10331 | Gi02326
      Actinosynnema mirum 101, DSM 43827 | Bacteria | Actinobacteria | draft | 2500395345 | 19705 | DSM 43827 | Gi02064
      Alicyclobacillus acidocaldarius acidocaldarius 104-IA, DSM 446 | Bacteria | Firmicutes | draft | 2500575013 | 29405 | DSM 446 | Gi02324
      Anaerococcus prevotii PC 1, DSM 20548 | Bacteria | Firmicutes | draft | 2500645363 | 29533 | DSM 20548 | Gi02318
      Atopobium parvulum IPP 1246, DSM 20469 | Bacteria | Actinobacteria | draft | 2500575011 | 29401 | DSM 20469 | Gi02317
      Beutenbergia cavernosa HKI 0122, DSM 12333 | Bacteria | Actinobacteria | draft | 2500395322 | 20827 | DSM 12333 | Gi02225
      Brachybacterium faecium DSM 4810 | Bacteria | Actinobacteria | finished | 2500153401 | 17026 | DSM 4810 | Gi02066
      Brachyspira murdochii DSM 12563 | Bacteria | Spirochaetes | draft | 2500645365 | 29543 | DSM 12563 | Gi02313
      Capnocytophaga ochracea DSM 7271 | Bacteria | Bacteroidetes | draft | 2500575012 | 29403 | DSM 7271 | Gi02305
      Catenulispora acidiphila ID139908, DSM 44928 | Bacteria | Actinobacteria | draft | 2500395338 | 21085 | DSM 44928 | Gi02233
      Cellulomonas flavigena 134, DSM 20109 | Bacteria | Actinobacteria | draft | 2500395336 | 19707 | DSM 20109 | Gi02067
      Chitinophaga pinensis UQM 2034, DSM 2588 | Bacteria | Bacteroidetes | draft | 2500395347 | 27951 | DSM 2588 | Gi02244
      Conexibacter woesei ID131577, DSM 14684 | Bacteria | Actinobacteria | draft | 2500347307 | 20745 | DSM 14684 | Gi02154
      Cryptobacterium curtum DSM 15641 | Bacteria | Actinobacteria | finished | 2500332002 | 20739 | DSM 15641 | Gi02234
      Denitrovibrio acetiphilus N2460, DSM 12809 | Bacteria | Deferribacteres | draft | 2500575016 | 29431 | DSM 12809 | Gi02322
      Desulfohalobium retbaense DSM 5692 | Bacteria | Deltaproteobacteria | draft | 2500575018 | 29199 | DSM 5692 | Gi02246
      Desulfomicrobium baculatum DSM 04028 | Bacteria | Deltaproteobacteria | draft | 2500645356 | 29527 | DSM 4028 | Gi02302
      Desulfotomaculum acetoxidans 5575, DSM 771 | Bacteria | Firmicutes | draft | 2500395337 | 27947 | DSM 771 | Gi02239
      Dethiosulfovibrio peptidovorans SEBR 4207, DSM 11002 | Bacteria | Aminanaerobia | draft | 2500549401 | 20741 | DSM 11002 | Gi02152
      Dyadobacter fermentans NS 114, DSM 18053 | Bacteria | Bacteroidetes | draft | 2500395342 | 20829 | DSM 18053 | Gi02155
      Eggerthella lenta VPI 0255, DSM 2243 | Bacteria | Actinobacteria | draft | 2500549402 | 21093 | DSM 2243 | Gi02242
      Geodermatophilus obscurus DSM 43160 | Bacteria | Actinobacteria | draft | 2500645366 | 29547 | DSM 43160 | Gi02257
      Gordonia bronchialis DSM 43247 | Bacteria | Actinobacteria | draft | 2500645367 | 29549 | DSM 43247 | Gi02258
      Haliangium ochraceum SMP-2, DSM 14365 | Bacteria | Deltaproteobacteria | draft | 2500395339 | 28711 | DSM 14365 | Gi02251
      Halogeometricum borinquense DSM 11551 | Archaea | Halobacteria | finished | 2500153400 | 20743 | DSM 11551 | Gi02153
      Halomicrobium mukohataei arg-2, DSM 12286 | Archaea | Halobacteria | draft | 2500395343 | 27945 | DSM 12286 | Gi02248
      Halorhabdus utahensis AX-2, DSM 12940 | Archaea | Halobacteria | draft | 2500575004 | 29305 | DSM 12940 | Gi02250
      Jonesia denitrificans DSM 20603 | Bacteria | Actinobacteria | draft | 2500168153 | 20833 | DSM 20603 | Gi02227
      Kangiella koreensis SW-125, DSM 16069 | Bacteria | Gammaproteobacteria | draft | 2500645353 | 29443 | DSM 16069 | Gi02314
      Kribbella flavida DSM 17836 | Bacteria | Actinobacteria | draft | 2500395325 | 21089 | DSM 17836 | Gi02235
      Kytococcus sedentarius DSM 20547 | Bacteria | Actinobacteria | finished | 2500168150 | 21067 | DSM 20547 | Gi02226
      Leptotrichia buccalis C-1013-b, DSM 1135 | Bacteria | Fusobacteria | draft | 2500645352 | 29445 | DSM 1135 | Gi02240
      Meiothermus ruber DSM 1279 | Bacteria | Deinococci | draft | 2500395348 | 28827 | DSM 1279 | Gi02300
      Meiothermus silvanus DSM 9946 | Bacteria | Deinococci | draft | 2500645369 | 29551 | DSM 9946 | Gi02308
      Nakamurella multipartita DSM 44233 | Bacteria | Actinobacteria | draft | 2500645368 | 21081 | DSM 44233 | Gi02230
      Nocardiopsis dassonvillei dassonvillei DSM 43111 | Bacteria | Actinobacteria | draft | 2500395320 | 19709 | DSM 43111 | Gi02065
      Pedobacter heparinus HIM 762-3, DSM 2366 | Bacteria | Bacteroidetes | draft | 2500395321 | 27949 | DSM 2366 | Gi02243
      Planctomyces limnophilus DSM 3776 | Bacteria | Bacteroidetes | draft | 2500575009 | 29411 | DSM 3776 | Gi02301
      Rhodothermus marinus DSM 4252 | Bacteria | Bacteroidetes | draft | 2500575002 | 29281 | DSM 4252 | Gi02303
      Saccharomonospora viridis P101, DSM 43017 | Bacteria | Actinobacteria | finished | 2500347305 | 20835 | DSM 43017 | Gi02228
      Sanguibacter keddieii DSM 10542 | Bacteria | Actinobacteria | finished | 2500153403 | 19711 | DSM 10542 | Gi02151
      Sebaldella termitidis ATCC 33386 | Bacteria | Fusobacteria | draft | 2500645364 | 29539 | ATCC 33386 | Gi02490
      Slackia heliotrinireducens DSM 20476 | Bacteria | Actinobacteria | finished | 2500168151 | 20831 | DSM 20476 | Gi02157
      Sphaerobacter thermophilus 4ac11, DSM 20745 | Bacteria | Chloroflexi | draft | 2500347306 | 21087 | DSM 20745 | Gi02236
      Spirosoma linguale DSM 74 | Bacteria | Bacteroidetes | draft | 2500395346 | 28817 | DSM 74 | Gi02298
      Stackebrandtia nassauensis LLR-40K-21, DSM 44728 | Bacteria | Actinobacteria | draft | 2500549403 | 19713 | DSM 44728 | Gi02068
      Streptobacillus moniliformis DSM 12112 | Bacteria | Fusobacteria | draft | 2500575005 | 29309 | DSM 12112 | Gi02312
      Streptosporangium roseum NI 9100, DSM 43021 | Bacteria | Actinobacteria | draft | 2500395335 | 21083 | DSM 43021 | Gi02229
      Sulfurospirillum deleyianum DSM 6946 | Bacteria | Epsilonproteobacteria | draft | 2500645361 | 29529 | DSM 6946 | Gi02323
      Thermanaerovibrio acidaminovorans Su883 DSM 6589 | Bacteria | Aminanaerobia | draft | 2500645362 | 29531 | DSM 6589 | Gi02247
      Thermobaculum terrenum YNP1, ATCC BAA-798 | Bacteria | Chloroflexi | draft | 2500645355 | 29523 | ATCC BAA-798 | Gi02489
      Thermobispora bispora DSM 43833 | Bacteria | Actinobacteria | finished | 2500194801 | 20737 | DSM 43833 | Gi02237
      Thermomonospora curvata DSM 43183 | Bacteria | Actinobacteria | draft | 2500645351 | 20825 | DSM 43183 | Gi02238
      Tsukamurella paurometabola DSM 20162 | Bacteria | Actinobacteria | draft | 2500575010 | 29399 | DSM 20162 | Gi02254
      Veillonella parvula Te3, DSM 2008 | Bacteria | Firmicutes | draft | 2500347300 | 21091 | DSM 2008 | Gi02241
      Xylanimonas cellulosilytica DSM 15894 | Bacteria | Actinobacteria | draft | 2500153402 | 19715 | DSM 15894 | Gi02069
    • 45. GEBA Pilot V: Benefit?
    • 46. Why Increase Taxonomic Coverage? • Mechanisms of diversification • Gene discovery • Annotation, functional prediction • Metagenomic analysis • Species phylogeny and classification
    • 47. Value of 100 diverse genomes I: Gene discovery • Gene families – Will compare and contrast gene family diversity in these genomes versus random samples of previous genomes – Will assess rate of gene family discovery and whether / how much it is diminishing • Specific examples of novelty – Focusing on DOE mission areas – Do we find novel forms of hydrogenases, cellulases, C-fixation, etc.?
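The rate of gene family discovery mentioned on slide 47 can be pictured as an accumulation curve: add genomes one at a time and count how many families are new. The following is a minimal Python sketch under stated assumptions, not the analysis used in the talk. The genome names, family identifiers, and the helper names `discovery_curve` and `new_families_per_genome` are all hypothetical.

```python
def discovery_curve(genomes, order):
    """Cumulative number of distinct gene families as genomes are added in `order`.

    `genomes` maps a genome name to the set of gene family IDs it contains
    (toy data here; a real analysis would derive these from clustering).
    """
    seen = set()
    curve = []
    for name in order:
        seen |= genomes[name]
        curve.append(len(seen))
    return curve


def new_families_per_genome(curve):
    """Families first seen at each step; a falling trend means diminishing discovery."""
    if not curve:
        return []
    return [curve[0]] + [b - a for a, b in zip(curve, curve[1:])]


# Hypothetical toy input: four genomes and their gene families.
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam1", "fam4"},
    "genome_D": {"fam5"},
}
curve = discovery_curve(toy, ["genome_A", "genome_B", "genome_C", "genome_D"])
print(curve)
print(new_families_per_genome(curve))
```

Comparing such curves for phylogenetically selected genomes against random samples of previously sequenced genomes is one way the comparison on the slide could be framed.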
    • 48. Value of 100 diverse genomes II: Annotation • Ortholog identification – Filling in gaps will help identify orthologs between species – Diverse GC content and amino acid composition should also improve ortholog identification • Examination of the rate of hypothetical protein conversion to “known” proteins • Non-homology functional prediction should improve greatly – Phylogenetic profiling – Rosetta Stone domain sharing
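Phylogenetic profiling, named on slide 48, predicts function by looking for genes that are present and absent in the same set of genomes. The sketch below is a toy illustration in Python, assuming a simple Jaccard similarity between presence/absence profiles. The gene names, genome names, and the helpers `build_profile`, `jaccard` and `most_similar` are hypothetical and are not taken from the talk.

```python
def build_profile(present_in, genomes):
    """Presence/absence vector (1 or 0) of one gene across an ordered genome list."""
    return tuple(1 if g in present_in else 0 for g in genomes)


def jaccard(p, q):
    """Shared presences divided by presences in either profile."""
    shared = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return shared / either if either else 0.0


def most_similar(query, profiles):
    """Name of the gene whose profile is closest to the query gene's profile."""
    others = [name for name in profiles if name != query]
    return max(others, key=lambda name: jaccard(profiles[query], profiles[name]))


# Hypothetical toy data: which genomes each gene occurs in.
genomes = ["g1", "g2", "g3", "g4"]
presence = {
    "hypothetical_X": {"g1", "g3"},
    "known_Y": {"g1", "g3"},
    "known_Z": {"g2", "g3", "g4"},
}
profiles = {gene: build_profile(where, genomes) for gene, where in presence.items()}
partner = most_similar("hypothetical_X", profiles)
print(partner)
```

With more phylogenetically diverse genomes, the profiles become longer and less redundant, which is the reason the slide expects this kind of prediction to improve.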
    • 49. Based on Wu et al. 2005
    • 51. Value of 100 diverse genomes III: Metagenomics • More diverse genomes should improve anchoring and binning of all metagenomic data sets • Will test by running phylotyping software against genome data sets with and without GEBA genomes – MEGAN – AMPHORA • Should be a good complement to reference genome sequencing
    • 52. [Bar chart; y-axis 0 to 0.7; series (marker genes): dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf; x-axis groups: Aquificae, Chlorobi, Chlamydiae, Firmicutes, Chloroflexi, Acidobacteria, Bacteroidetes, Spirochaetes, Cyanobacteria, Actinobacteria, Planctomycetes, Betaproteobacteria, Deltaproteobacteria, Alphaproteobacteria, Epsilonproteobacteria, Unclassified Bacteria, Gammaproteobacteria, Unclassified Proteobacteria]
    • 53. Value of 100 diverse genomes IV: Mechanisms of Diversification • Lateral gene transfer – Lateral gene transfer is fundamentally important in microbial evolution – However, when we find “foreign” DNA in genomes we usually cannot pinpoint the origin of that DNA – Having more diverse genomes may help better pin down source groups for each piece of foreign DNA • Eukaryotic diversification – Of ~200 eukaryotic-specific gene families, how many now show up in bacteria and archaea? – Any patterns to where they are found?
    • 54. CRISPR - expanding the possible • 34 out of 56 genomes contain CRISPR • 1-13 arrays (loci) per genome • Haliangium ochraceum SMP-2, DSM 14365: 807 repeats in total; a single array contains 382 repeats • Verminephrobacter eiseniae: 249 repeats
    • 55. Value of 100 diverse genomes V: Phylogeny
    • 56. 16S Says Hyphomonas Is in Rhodobacterales (Badger et al. 2005)
    • 57. WGT Says It's Related to Caulobacterales (Badger et al. 2005)
    • 58. Tree of Life Example II
    • 60. GEBA - What’s Next • Repeat and/or scale up • Need to determine the value of finished versus unfinished genomes • Apply this method to other groups – Microbial eukaryotes – Viruses • Really fill in bacterial and archaeal tree
    • 61. The slopes of the linear regression lines represent the PD contribution of the genomes (each window contains 50 genomes)
    • 62. [Plot: y-axis Slope (50 genome windows); x-axis Window position]
    • 63. Greengenes ssrRNA [Plot: y-axis Slope (50 genome windows); x-axis Window position]
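Slides 61 to 63 describe fitting a regression line within each window of 50 genomes and reading the slope as the PD (phylogenetic diversity) contribution per genome. The slides do not say exactly what is regressed on what, so the Python sketch below assumes that genomes are ordered along the x-axis and that cumulative PD is the y-value. The function names `slope` and `window_slopes` and the toy PD increments are hypothetical.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def window_slopes(cumulative_pd, window=50, step=1):
    """Slope of cumulative PD against genome index in each window of `window` genomes."""
    slopes = []
    for start in range(0, len(cumulative_pd) - window + 1, step):
        xs = list(range(start, start + window))
        ys = cumulative_pd[start:start + window]
        slopes.append(slope(xs, ys))
    return slopes


# Toy data (assumed): the first 50 genomes each add 2.0 units of PD,
# the next 50 each add only 0.5, mimicking a well-sampled region of the tree.
cumulative_pd = []
total = 0.0
for i in range(100):
    total += 2.0 if i < 50 else 0.5
    cumulative_pd.append(total)

slopes = window_slopes(cumulative_pd, window=50, step=50)
print(slopes)
```

In this toy case the two non-overlapping windows give slopes close to 2.0 and 0.5, so a lower slope marks a window whose genomes add little new diversity.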
    • 64. GEBA: Long Run • Need active community input • Involvement of multiple funding agencies, labs, genome centers • Integration/communication among all large scale projects • Follow recommendations of NAS, ASM, AAM reports • Adopt a Microbe - Link to educational initiatives
    • 65. • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Most phyla with cultured species are sparsely sampled • Lineages with no cultured taxa even more poorly sampled [Tree of bacterial phyla; legend: Well sampled phyla, Poorly sampled, No cultured taxa; tip labels: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11]
    • 66. Uncultured Lineages: Technical Approaches • Get into culture • Enrichment cultures • If abundant in low diversity ecosystems • Flow sorting • Microbeads • Microfluidic sorting • Single cell amplification
    • 70. A Happy Tree of Life
    • 71. • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Most phyla with cultured species are sparsely sampled • Lineages with no cultured taxa even more poorly sampled [Tree of bacterial phyla; legend: Well sampled phyla, Poorly sampled, No cultured taxa]
    • 72. Taxonomic Bias in Cultures Too [Two pie charts: culture collection (ACM), 3760 bacterial cultures, and sequenced genomes, 71 bacterial genomes; slices in each: Proteobacteria, Firmicutes, Actinobacteria, Bacteroidetes, other phyla; percentage labels in extracted order: 3%, 6%, 20%, 1%, 14%, 45%, 54%, 24%, 18%, 7%] Slide by Hugenholtz
    • 73. GEBA Pilot Project at JGI • Select 200 organisms using tree/taxonomy as guide • Collaborate with culture collections to obtain DNA • Sequence to closure 100 for which DNA QC is good • Sequencing by Sanger-454 hybrid approach • Data, annotation released after shotgun and closure • Assess tree based sequencing by doing reconstructions with different selection criteria.
    • 75. Selecting Organisms Step 2: • Selecting representatives from each lineage without a genome • Preference given within groups for organisms of DOE mission relevance • Focused on type strains of cultured species • Selecting those for which we can get DNA
    • 76. GEBA Pilot Project • 4th month project status: 79/169 DNAs delivered [Bar chart, y-axis 0 to 40; legend: in JGI, pending; groups: Thermi, Firmicutes, Aquificae, Chloroflexi, Halobacteria, Bacteroidetes, Fusobacteria, Spirochaetes, Actinobacteria, Thermoprotei, Aminanaerobia, Proteobacteria, Archaeoglobi, Thermococci, Acidobacteria, Methanomicrobia, Planctomycetes, Haloanaerobiales, Deferribacteres, Methanobacteria, Thermodesulf..., Thermovenabulae, Thermodesulfobia]
    • 77. Other Markers Give Similar Phylotypes [Bar chart: Sargasso Phylotypes; y-axis Weighted % of Clones (0 to 0.5); x-axis Major Phylogenetic Group: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota; series: EFG, EFTu, HSP70, RecA, RpoB, rRNA] Venter et al., 2004
    • 78. • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution I: sequence more phyla • Eisen-Ward NSF Tree of Life Project • A genome from each of eight phyla [Tree of bacterial phyla, based on Hugenholtz, 2002]
    • 79. • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution II: Fill in Phyla • JGI - Genomic Encyclopedia of Bacteria and Archaea [Tree of bacterial phyla, based on Hugenholtz, 2002]
    • 80. • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution III: Sequence Uncultured [Tree of bacterial phyla]
    • 81. The Tree of Life is Still Angry
    • 82. Circular Maps
    • 83. DNA Repair Genes in D. radiodurans Complete Genome (Process: Genes in D. radiodurans) • Nucleotide Excision Repair: UvrABCD, UvrA2 • Base Excision Repair: AlkA, Ung, Ung2, GT, MutM, MutY-Nths, MPG • AP Endonuclease: Xth • Mismatch Excision Repair: MutS, MutL • Recombination Initiation: RecFJNRQ, SbcCD, RecD • Recombinase: RecA • Migration and resolution: RuvABC, RecG • Replication: PolA, PolC, PolX, phage Pol • Ligation: DnlJ • dNTP pools, cleanup: MutTs, RRase • Other: LexA, RadA, HepA, UVDE, MutS2
    • 84. Problem: The list of DNA repair gene homologs in the D. radiodurans genome is not significantly different from that of other bacterial genomes of similar size
    • 85. ~40 Phyla of Bacteria [Tree of bacterial phyla; scale bar 0.1] Tree based on Hugenholtz (2002) with some modifications.
    • 86. Most DNA metabolism studies in two Phyla [Tree of bacterial phyla; scale bar 0.1] Tree based on Hugenholtz (2002) with some modifications.
    • 87. Deinococcus is very distant from well studied groups [Tree of bacterial phyla; scale bar 0.1] Tree based on Hugenholtz (2002) with some modifications.
    • 88. Gain and Loss of Repair Genes [Figure: gains (+) and losses (-) of repair genes mapped onto a tree spanning BACTERIA, ARCHAEA and EUKARYOTES; taxa: Trepa, Helpy, Ecoli, Human, Mycpn, Mycge, Bacsu, Borbu, Synsp, Neigo, Arcfu, Strpy, Metth, Yeast, Haein, Metjn] Eisen and Hanawalt, 1999, Mut Res 435: 171-213
    • 89. Solution - Experiments
    • 90. Need experimental studies from across the tree too [Tree of bacterial phyla; scale bar 0.1] Tree based on Hugenholtz (2002) with some modifications.
    • 92. MICROBES
    • 93. A Happy Tree of Life
    • 94. TIGR $$$ N. Ward C. Fraser J. Heidelberg Frase Other people DOE E. Eisenstadt Moore NSF N. Moran S. Salzberg NIH F. Robb H. Ochman JCVI I. Paulsen J. Venter M. Wu J. Battista R. Myers D. Rusch E. Orias D. Wu A. Halpern D. Bryant M. Frazier S. Chatterji S. O’Neill M. Eisen H. Huse P. Hanawalt E. Rubin C. M. Cavanaugh T. Woyke A. Hartman J. Morgan JGI Eisen Group/ Mom and Dad Davis
    • 95. CRISPR - expanding the possible • 34 out of 56 genomes contain CRISPR • 1-13 arrays (loci) per genome • Haliangium ochraceum SMP-2, DSM 14365: 807 repeats in total; a single array contains 382 repeats • Verminephrobacter eiseniae: 249 repeats
    • 96. GEBA Annotation and data status summary GBP - BDMTC
    • 97. Annotation - Current status • 56 draft genomes in IMG-GEBA site (free access) • 61 in IMG-ER (password protected) • 19 complete genomes (none in IMG)
    • 100. • 24.6Mb of sequence • 230,596 genes • 227,562 proteins • 155,641 w. function (67.4%) • 16,435 fused genes
    • 102. • GC%: 74% (Actinosynnema mirum 101, DSM 43827); 26% (Streptobacillus moniliformis DSM 12112) • Size: 13.4Mb (Ktedonobacter racemifer SOSP1-21, DSM 44963); 1.5Mb (Streptobacillus moniliformis DSM 12112) • Scaffolds: 407 (Ktedonobacter racemifer SOSP1-21, DSM 44963); 1 (Atopobium parvulum) • Genes: 13,445 (Ktedonobacter racemifer SOSP1-21, DSM 44963); 1,433 (Cryptobacterium curtum DSM 15641) • w. Functions: 78.6% (Thermanaerovibrio acidaminovorans); 50.6% (Planctomyces limnophilus DSM 3776)
    • 103. • COG%: 79% (Thermanaerovibrio acidaminovorans); 49% (Planctomyces limnophilus) • SignalP: 38% (Ferrimonas balearica); 14% (Methanohalophilus mahii) • Transmembr: 31% (Eggerthella lenta); 17% (Haliangium ochraceum) • Fused genes: 9% (Desulfomicrobium baculatum); 3.7% (Planctomyces limnophilus) • 16S: 12 (Desulfotomaculum acetoxidans)
    • 105. Progress Report: GEBA Status 5-12-08 • Closed 3% • Awaiting Material 26% • Post Draft 51% • Library 9% • Production 11%
    • 106. GEBA Pilot Target List 5-12-08 [bar chart; y-axis: # of Genomes (0 to 35); x-axis: Phyla: B: Aquificae, B: Chloroflexi, B: Deinococci, B: Firmicutes, B: Bacteroidetes, B: Fusobacteria, B: Spirochaetes, A: Halobacteria, A: Thermococci, A: Archaeoglobi, A: Thermoprotei, B: Aminanaerobia, B: Deferribacteres, B: Planctomycetes, A: Methanobacteria, B: Haloanaerobiales, B: Thermovenabulae, B: Thermodesulfobia, A: Methanomicrobia, B: Gemmatimonadetes, B: Delta Proteobacteria, B: Epsilon Proteobacteria, B: Gamma Proteobacteria, B: Thermodesulfobacteria, B: Actinobacteria (High GC)]
    • 107. GEBA Pilot Status 5-12-08 [bar chart; y-axis: # of Genomes (0 to 35); x-axis: Phyla; series: Closed, Post Draft, Production, Library, Awaiting Material]
    • 108. Non Active Projects [bar chart; y-axis: # of Genomes (0 to 16); x-axis: Phyla; series: Abandoned, On Hold]
    • 109. GEBA Pilot Data Release [bar chart; y-axis: # of Genomes (0 to 30); x-axis: Phyla]
    • 111. GEBA Paper Plans • Methods for large scale microbial genome sequencing – Sequencing, closure methods – DNA sources (i.e., culture collections) – Outreach and educational issues • How valuable is phylogenetic gap based sequencing? – For annotation – For metagenomics – For gene discovery • How deep to go in phylogenetic gap filling? – Breadth between phyla versus filling in phyla
