Jonathan Eisen talk at ASM General Meeting 2010
 

    Presentation Transcript

    • A phylogeny driven genomic encyclopedia of bacteria and archaea Jonathan A. Eisen Talk at ASMGM May 25, 2010 Tuesday, May 25, 2010
    • Fleischmann et al. 1995
    • Microbial genomes From http://genomesonline.org
    • rRNA Tree of Life (Bacteria, Archaea, Eukaryotes). Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
    • [Figure: tree of bacterial phyla as of 2002, based on Hugenholtz, 2002. Phyla shown: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11] • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Archaea • Same trend in Eukaryotes
    • The Tree is not Happy (Bacteria, Archaea, Eukaryotes). Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
    • Why Increase Phylogenetic Coverage? • Common approach within some eukaryotic groups • Many small projects to fill in bacterial or archaeal gaps • Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature • Many potential benefits
    • [Figure: tree of bacterial phyla] • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution I: sequence more phyla • NSF-funded Tree of Life Project • A genome from each of eight phyla • Eisen & Ward, PIs
    • The Tree of Life is Still Angry (Bacteria, Archaea, Eukaryotes). Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
    • Major Lineages of Actinobacteria (2.5 Actinobacteria)
      2.5.1 Acidimicrobidae: 2.5.1.1 Unclassified; 2.5.1.2 "Microthrixineae"; 2.5.1.3 Acidimicrobineae (2.5.1.3.1 Unclassified, 2.5.1.3.2 Acidimicrobiaceae); 2.5.1.4 BD2-10; 2.5.1.5 EB1017
      2.5.2 Actinobacteridae: 2.5.2.1 Unclassified; 2.5.2.10 Ellin306/WR160; 2.5.2.11 Ellin5012; 2.5.2.12 Ellin5034; 2.5.2.13 Frankineae (2.5.2.13.1 Unclassified, 2.5.2.13.2 Acidothermaceae, 2.5.2.13.3 Ellin6090, 2.5.2.13.4 Frankiaceae, 2.5.2.13.5 Geodermatophilaceae, 2.5.2.13.6 Microsphaeraceae, 2.5.2.13.7 Sporichthyaceae); 2.5.2.14 Glycomyces; 2.5.2.15 Intrasporangiaceae (2.5.2.15.1 Unclassified, 2.5.2.15.2 Dermacoccus, 2.5.2.15.3 Intrasporangiaceae); 2.5.2.16 Kineosporiaceae; 2.5.2.17 Microbacteriaceae (2.5.2.17.1 Unclassified, 2.5.2.17.2 Agrococcus, 2.5.2.17.3 Agromyces); 2.5.2.18 Micrococcaceae; 2.5.2.19 Micromonosporaceae; 2.5.2.2 Actinomyces; 2.5.2.20 Propionibacterineae (2.5.2.20.1 Unclassified, 2.5.2.20.2 Kribbella, 2.5.2.20.3 Nocardioidaceae, 2.5.2.20.4 Propionibacteriaceae); 2.5.2.21 Pseudonocardiaceae; 2.5.2.22 Streptomycineae (2.5.2.22.1 Unclassified, 2.5.2.22.2 Kitasatospora, 2.5.2.22.3 Streptacidiphilus); 2.5.2.23 Streptosporangineae (2.5.2.23.1 Unclassified, 2.5.2.23.2 Ellin5129, 2.5.2.23.3 Nocardiopsaceae, 2.5.2.23.4 Streptosporangiaceae, 2.5.2.23.5 Thermomonosporaceae); 2.5.2.3 Actinomycineae; 2.5.2.4 Actinosynnemataceae; 2.5.2.5 Bifidobacteriaceae; 2.5.2.6 Brevibacteriaceae; 2.5.2.7 Cellulomonadaceae; 2.5.2.8 Corynebacterineae (2.5.2.8.1 Unclassified, 2.5.2.8.2 Corynebacteriaceae, 2.5.2.8.3 Dietziaceae, 2.5.2.8.4 Gordoniaceae, 2.5.2.8.5 Mycobacteriaceae, 2.5.2.8.6 Rhodococcus, 2.5.2.8.7 Rhodococcus, 2.5.2.8.8 Rhodococcus); 2.5.2.9 Dermabacteraceae (2.5.2.9.1 Unclassified, 2.5.2.9.2 Brachybacterium, 2.5.2.9.3 Dermabacter)
      2.5.3 Coriobacteridae: 2.5.3.1 Unclassified; 2.5.3.2 Atopobiales; 2.5.3.3 Coriobacteriales; 2.5.3.4 Eggerthellales
      2.5.4 OPB41
      2.5.5 PK1
      2.5.6 Rubrobacteridae: 2.5.6.1 Unclassified; 2.5.6.2 "Thermoleiphilaceae" (2.5.6.2.1 Unclassified, 2.5.6.2.2 Conexibacter, 2.5.6.2.3 XGE514); 2.5.6.3 MC47; 2.5.6.4 Rubrobacteraceae
    • [Figure: tree of bacterial phyla; legend: Well sampled phyla] • At least 100 phyla of bacteria • Genome sequences are mostly from three phyla • Most phyla with cultured species are sparsely sampled • Lineages with no cultured taxa even more poorly sampled • Solution: use tree to really fill gaps
    • http://www.jgi.doe.gov/programs/GEBA/pilot.html
    • A Genomic Encyclopedia of Bacteria and Archaea (GEBA)
    • GEBA Pilot Project Overview • Identify major branches in rRNA tree for which no genomes are available • Identify branches with a cultured representative in DSMZ • Grow > 200 of these and prep DNA • Sequence and finish 100 (covering breadth of bacterial/archaeal diversity) • Annotate, analyze, release data • Assess benefits of tree-guided sequencing
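
The first two pilot steps (find branches with no genome, then find a cultured representative on each) can be sketched roughly as below. This is only an illustration under stated assumptions: the toy tree `CHILDREN`, the taxon names, and the helpers `leaves_under`, `unsampled_clades` and `candidates` are all hypothetical and are not the GEBA project's actual pipeline.

```python
from typing import Dict, List, Set

# Hypothetical toy tree: each internal node maps to its children; leaves are taxa.
CHILDREN: Dict[str, List[str]] = {
    "root": ["cladeA", "cladeB"],
    "cladeA": ["taxon1", "taxon2"],
    "cladeB": ["cladeC", "taxon5"],
    "cladeC": ["taxon3", "taxon4"],
}


def leaves_under(node: str, children: Dict[str, List[str]]) -> Set[str]:
    """Return the set of leaf taxa below (or equal to) a node."""
    kids = children.get(node)
    if not kids:
        return {node}
    found: Set[str] = set()
    for kid in kids:
        found |= leaves_under(kid, children)
    return found


def unsampled_clades(node: str, children: Dict[str, List[str]],
                     sequenced: Set[str]) -> List[str]:
    """Return the largest clades in which no leaf has a sequenced genome."""
    if not (leaves_under(node, children) & sequenced):
        return [node]
    gaps: List[str] = []
    for kid in children.get(node, []):
        gaps.extend(unsampled_clades(kid, children, sequenced))
    return gaps


def candidates(clades: List[str], children: Dict[str, List[str]],
               cultured: Set[str]) -> Dict[str, List[str]]:
    """For each unsampled clade, list the leaves that exist in culture."""
    return {c: sorted(leaves_under(c, children) & cultured) for c in clades}
```

With `sequenced = {"taxon1", "taxon5"}` the gaps are `taxon2` and the whole of `cladeC`; the cultured set then decides which of those gaps can actually be grown and sequenced.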
    • GEBA and Openness • All data released as quickly as possible w/ no restrictions to IMG-GEBA, GenBank, etc. • Data also available in Biotorrents (http://biotorrents.net) • Individual genome reports published in OA “Standards in Genomic Sciences (SIGS)” • 1st GEBA paper in Nature freely available and published using Creative Commons License
    • GEBA Lesson 1: rRNA Tree is Useful for Identifying Phylogenetically Novel Genomes
    • rRNA Tree of Life (Bacteria, Archaea, Eukaryotes). Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
    • Network of Life (Bacteria, Archaea, Eukaryotes). Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
    • Whole Genome Tree w/ AMPHORA See Wu and Eisen, Genome Biology 2008 9: R151 http://bobcat.genomecenter.ucdavis.edu/AMPHORA/
    • Compare PD in Trees
    • PD of rRNA, Genome Trees Similar From Wu et al. 2009 Nature 462, 1056-1060
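
As a rough illustration of what "PD" measures, the sketch below computes phylogenetic diversity as the total branch length connecting a set of taxa to the root. The toy tree `PARENT`, its branch lengths, and the function names are hypothetical, and including the path to the root is an assumption (other PD conventions exist); this is not the calculation from Wu et al. 2009.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical toy tree: child -> (parent, branch length to parent).
PARENT: Dict[str, Tuple[str, float]] = {
    "cladeA": ("root", 1.0),
    "cladeB": ("root", 2.0),
    "t1": ("cladeA", 0.5),
    "t2": ("cladeA", 0.5),
    "t3": ("cladeB", 1.0),
    "t4": ("cladeB", 3.0),
}


def phylogenetic_diversity(taxa: Iterable[str],
                           parent: Dict[str, Tuple[str, float]] = PARENT) -> float:
    """Sum each branch once along the paths from the chosen taxa to the root."""
    counted = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        # Walk upward until the root or an already-counted branch is reached.
        while node in parent and node not in counted:
            counted.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


def pd_gain(candidate: str, already: Iterable[str],
            parent: Dict[str, Tuple[str, float]] = PARENT) -> float:
    """PD added by sequencing one more taxon on top of those already chosen."""
    chosen = list(already)
    return (phylogenetic_diversity(chosen + [candidate], parent)
            - phylogenetic_diversity(chosen, parent))
```

In this toy tree, adding `t3` to `{t1}` contributes far more branch length than adding its close relative `t2`, which is the intuition behind choosing genomes by tree position.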
    • GEBA Lesson 1B: rRNA Tree topology is not perfect; Genome-based trees better
    • 16S Says Hyphomonas is in Rhodobacterales (Badger et al. 2005)
    • WGT and individual gene trees: It's Related to Caulobacterales (Badger et al. 2005)
    • Concatenated alignment “whole genome tree” built using AMPHORA
    • Whole genome phylogeny? • Many approaches – Gene presence/absence – Concatenation of phylogenetic markers – Separate phylogeny of genes and then integration of results (e.g., networks) – Models that incorporate gain/loss as well as gene phylogeny • No new results from us – However ... see Eric Alm talk Ballroom A - “Microbes in a changing world” session tomorrow AM
    • GEBA Lesson 2: Phylogeny-driven genome selection helps discover new genetic diversity
    • Network of Life (Bacteria, Archaea, Eukaryotes). Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
    • Protein Family Rarefaction Curves • Take data set of multiple complete genomes • Identify all protein families using MCL • Plot # of genomes vs. # of protein families
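
The plotting step of the rarefaction recipe above might look like the sketch below. It assumes the clustering into families (done with MCL in the talk) has already happened, so each genome is simply a set of family identifiers; the `TOY` data and both function names are hypothetical, and averaging over random genome orders is one common way to smooth the curve rather than a detail confirmed by the slides.

```python
import random
from typing import Dict, List, Sequence, Set

# Hypothetical input: genome -> set of protein family ids (clustering already done).
TOY: Dict[str, Set[str]] = {
    "g1": {"f1", "f2", "f3"},
    "g2": {"f2", "f3", "f4"},
    "g3": {"f5"},
}


def accumulation_curve(order: Sequence[str],
                       families: Dict[str, Set[str]]) -> List[int]:
    """Cumulative number of distinct families as genomes are added in order."""
    seen: Set[str] = set()
    curve: List[int] = []
    for genome in order:
        seen |= families[genome]
        curve.append(len(seen))
    return curve


def rarefaction_curve(families: Dict[str, Set[str]],
                      n_perm: int = 100, seed: int = 0) -> List[float]:
    """Average the accumulation curve over random genome orderings."""
    rng = random.Random(seed)
    genomes = sorted(families)
    totals = [0] * len(genomes)
    for _ in range(n_perm):
        order = genomes[:]
        rng.shuffle(order)
        for i, count in enumerate(accumulation_curve(order, families)):
            totals[i] += count
    return [t / n_perm for t in totals]
```

Plotting the returned list against 1..N genomes gives the curve; a set of phylogenetically diverse genomes would be expected to keep climbing where a set of close relatives flattens out.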
    • Synapomorphies exist
    • GEBA Lesson 3: Phylogeny-driven genome selection improves genome annotation
    • Predicting Function • Key step in genome projects • More accurate predictions help guide experimental and computational analyses • Many diverse approaches • Comparative and evolutionary analysis greatly improves most predictions
    • Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling • Better definition of protein family sequence “patterns” (e.g., improved HMMs) • Conversion of hypothetical into conserved hypotheticals • Greatly improves “comparative” and “evolutionary” based predictions • Linking distantly related members of protein families • Improved non-homology prediction
    • From Wu et al. 2009.
    • GEBA Lesson 4: Phylogeny-driven genome selection improves analysis of genome data from uncultured organisms
    • Metagenomics Challenge 1. Who is out there? 2. What are they doing?
    • Who is out there? • Mimic rRNA PCR based studies • But can now do these with other genes
    • rRNA phylotyping from metagenomics (Venter et al., 2004)
    • Shotgun Sequencing Allows Use of Alternative Anchors (e.g., RecA) (Venter et al., 2004)
    • Shotgun Sequencing Allows Use of Other Markers (Venter et al., 2004) [Figure: bar chart “Sargasso Phylotypes”; y-axis: Weighted % of Clones (0 to 0.5); x-axis: Major Phylogenetic Group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota); series: EFG, EFTu, rRNA, RecA, RpoB, HSP70] • Should improve with better genomic sampling
    • Functional Inference from Metagenomics • Can work well for individual genes • Predicting “community” function is challenging because treating community as a bag of genes does not work well • Better to “compartmentalize” data ...
    • Binning challenge [Figure: items labeled A to G and T to Z] • Best binning method: reference genomes
    • Reference Genomes Coming from Select Environment
    • Binning challenge [Figure: items labeled A to G and T to Z] • No reference genome? What do you do? • Phylogeny ....
    • AMPHORA Guide tree
    • Phylogenetic Binning Using AMPHORA (AMPHORA - each read on its own tree) [Figure: bar chart, y-axis 0 to 0.7; x-axis groups: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified Proteobacteria, Cyanobacteria, Chlamydiae, Acidobacteria, Bacteroidetes, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified Bacteria; series (marker genes): dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf] • Should improve with better genomic sampling
    • Metagenomic Analysis Improves w/ Phylogenetic Sampling • Small but real improvements in – Gene identification / confirmation – Functional prediction – Binning – Phylogenetic classification • But not a lot ...
    • How to improve phylogenetic analysis of metagenomic data • Fragmented data • Which genes to use? • More automation
    • iSEEM Project
    • Phylogenetic challenge: A single tree with everything
    • Phylogenetic Binning Using AMPHORA (AMPHORA - each read on its own tree) [Figure: same bar chart of binning results by phylogenetic group and marker gene, y-axis 0 to 0.7] • Improves with better phylogenetic methods
    • Phylogenetic Binning Using AMPHORA (AMPHORA - each read on its own tree) [Figure: same bar chart of binning results by phylogenetic group and marker gene, y-axis 0 to 0.7] • Improves with more gene families
    • New “Marker Genes” • 100 representative genomes • MCL gene families • Identify gene families w/ – High universality – High uniformity of copy number – Phylogenetic tree similar to “whole genome tree”
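
The first two screening criteria can be illustrated with a small sketch. The slides do not define the measures, so the definitions here are assumptions: universality is taken as the fraction of genomes carrying the family, and uniformity as the fraction of those genomes with exactly one copy. The `COUNTS` data, the thresholds, and the function names are hypothetical; the third criterion (tree similarity) is not covered.

```python
from typing import Dict, List

# Hypothetical input: gene family -> {genome: copy number}.
COUNTS: Dict[str, Dict[str, int]] = {
    "famA": {"g1": 1, "g2": 1, "g3": 1, "g4": 1},
    "famB": {"g1": 1, "g2": 3, "g3": 1, "g4": 2},
    "famC": {"g1": 1},
}


def universality(copies: Dict[str, int], n_genomes: int) -> float:
    """Fraction of all genomes that contain at least one copy (assumed definition)."""
    return sum(1 for c in copies.values() if c > 0) / n_genomes


def uniformity(copies: Dict[str, int]) -> float:
    """Among genomes with the family, fraction with exactly one copy (assumed definition)."""
    present = [c for c in copies.values() if c > 0]
    if not present:
        return 0.0
    return sum(1 for c in present if c == 1) / len(present)


def screen(counts: Dict[str, Dict[str, int]], n_genomes: int,
           min_universality: float = 0.9,
           min_uniformity: float = 0.9) -> List[str]:
    """Return families passing both (hypothetical) thresholds, sorted by name."""
    return sorted(
        fam for fam, copies in counts.items()
        if universality(copies, n_genomes) >= min_universality
        and uniformity(copies) >= min_uniformity
    )
```

Here `famB` fails on copy-number uniformity and `famC` on universality, leaving `famA` as the only candidate to carry forward to a tree comparison.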
    • Distances between gene trees and the AMPHORA concatenated genome tree [Figure: two panels of per-gene distances, NODAL distance (0 to 6) and SPLIT distance (0 to 0.9), for candidate marker genes and 16S rRNA; legend: AMPHORA marker, Ribosomal protein, Transcription/translation related protein, DNA repair protein, Protein of other function; reference: distance between the genome tree and 100 random trees (average ± standard deviation)]
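
For readers unfamiliar with split-based tree distances, the sketch below shows one common definition: the fraction of non-trivial bipartitions that are not shared between two trees on the same leaves. Whether this matches the exact "SPLIT distance" used in the figure is an assumption; the nested-tuple tree encoding and the function names are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]  # a leaf name, or a tuple of subtrees


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """All leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Non-trivial bipartitions, each stored as the side lacking a reference leaf."""
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found: Set[FrozenSet[str]] = set()

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        side = all_leaves - below if ref in below else below
        # Skip trivial splits (empty side, single leaf, or all but one leaf).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    walk(tree)
    return found


def split_distance(a: Tree, b: Tree) -> float:
    """Fraction of splits present in only one of the two trees (0 = identical)."""
    sa, sb = splits(a), splits(b)
    total = len(sa) + len(sb)
    if total == 0:
        return 0.0
    return len(sa ^ sb) / total
```

For example, `(("A", "B"), ("C", ("D", "E")))` and `(("A", "B"), ("D", ("C", "E")))` share one of their two non-trivial splits, so half of all splits differ.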
    • Screen gene markers for any given taxonomic group

      Phylogenetic group      Genome Number   Gene Number   Marker Candidates
      Archaea                 62              145415        106
      Actinobacteria          63              267783        136
      Alphaproteobacteria     94              347287        121
      Betaproteobacteria      56              266362        311
      Gammaproteobacteria     126             483632        118
      Deltaproteobacteria     25              102115        206
      Epsilonproteobacteria   18              33416         455
      Bacteroidetes           25              71531         286
      Chlamydiae              13              13823         560
      Chloroflexi             10              33577         323
      Cyanobacteria           36              124080        590
      Firmicutes              106             312309        87
      Spirochaetes            18              38832         176
      Thermi                  5               14160         974
      Thermotogae             9               17037         684
    • Phylogenetic Binning Using AMPHORA (AMPHORA - each read on its own tree) [Figure: same bar chart of binning results by phylogenetic group and marker gene, y-axis 0 to 0.7] • Improves with better automation
    • Zorro • http://sourceforge.net/projects/probmask/ • ZORRO is a probabilistic masking program that assigns confidence scores to each column in a multiple sequence alignment. These scores can then be used to account for alignment accuracy in phylogenetic inference pipelines • Wu, Chatterji, Eisen submitted
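
One simple way per-column confidence scores could feed into a pipeline is hard masking: drop alignment columns whose score falls below a cutoff. The sketch below does not call ZORRO itself; the scores are assumed to be already available as a list with one number per column, and the score scale, the threshold of 5.0, and the function name `mask_alignment` are hypothetical.

```python
from typing import Dict, Sequence


def mask_alignment(alignment: Dict[str, str], scores: Sequence[float],
                   threshold: float = 5.0) -> Dict[str, str]:
    """Keep only alignment columns whose confidence score is >= threshold."""
    lengths = {len(seq) for seq in alignment.values()}
    if lengths and lengths != {len(scores)}:
        raise ValueError("every sequence must have one score per column")
    # Indices of the columns that pass the (assumed) confidence cutoff.
    keep = [i for i, score in enumerate(scores) if score >= threshold]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}
```

For instance, with sequences `ACG-T` and `AC-TT` and scores `[9.5, 8.0, 1.2, 0.4, 7.0]`, the two low-scoring gapped columns are removed before tree building. Weighting columns by score instead of discarding them is an alternative design that keeps more signal.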
    • GEBA Phylogenomic Lesson 5: We have still only scratched the surface of microbial diversity
    • rRNA Tree of Life (Bacteria, Archaea, Eukaryotes). Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
    • Phylogenetic Diversity: Sequenced Bacteria & Archaea From Wu et al. 2009
    • Phylogenetic Diversity with GEBA From Wu et al. 2009
    • Phylogenetic Diversity: Isolates From Wu et al. 2009
    • Phylogenetic Diversity: All From Wu et al. 2009
    • [Figure: tree of bacterial phyla; legend: Well sampled phyla, Poorly sampled, No cultured taxa] • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Most phyla with cultured species are sparsely sampled • Lineages with no cultured taxa even more poorly sampled
    • Uncultured Lineages: Technical Approaches • Get into culture • Enrichment cultures • If abundant in low diversity ecosystems • Flow sorting • Microbeads • Microfluidic sorting • Single cell amplification
    • GEBA Phylogenomic Lesson 6: Need Experiments from Across the Tree of Life too
    • As of 2002 [Figure: tree of bacterial phyla, based on Hugenholtz, 2002] • At least 40 phyla of bacteria • Experimental studies are mostly from three phyla • Some studies in other phyla
    • As of 2002 [Figure: tree of bacterial phyla, based on Hugenholtz, 2002] • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Eukaryotes • Same trend in Viruses
    • Need experimental studies from across the tree too [Figure: tree of bacterial phyla, scale bar 0.1. Tree based on Hugenholtz (2002) with some modifications.]
    • MICROBES
    • A Happy Tree of Life