The need for a phylogeny driven genomic encyclopedia of eukaryotes #SMBEEuks

Talk by Jonathan Eisen for #SMBEEuks

Usage Rights

CC Attribution License

The need for a phylogeny driven genomic encyclopedia of eukaryotes #SMBEEuks Presentation Transcript

  • 1. The Need for a Phylogeny-Driven Genomic Encyclopedia of Eukaryotes. Jonathan A. Eisen, @phylogenomics, University of California, Davis. Talk for SMBE-EUKS. Monday, April 29, 13
  • 2. I: The Problem
  • 3. Googling Sequenced Eukaryotic Genomes
  • 4. Wikipedia On Sequenced Euks
  • 5. More from Wikipedia
  • 6. Better Source: GOLD http://www.genomesonline.org/cgi-bin/GOLD/index.cgi
  • 7. GOLD by Taxonomy http://www.genomesonline.org/cgi-bin/GOLD/index.cgi
  • 8. GOLD: Euks by Phylum
    Phylum                   Count    Percent
    Korarchaeota             1        0
    Nanoarchaeota            2        0
    Thaumarchaeota           30       5
    Crenarchaeota            142      25
    Euryarchaeota            356      64
    Unclassified             28       5

    Phylum                   Count    Percent
    Caldiserica              1        0
    Nitrospinae              1        0
    Crenarchaeota            2        0
    Chrysiogenetes           2        0
    Dictyoglomi              2        0
    Fibrobacteres            2        0
    Armatimonadetes          3        0
    Elusimicrobia            3        0
    Lentisphaerae            3        0
    Poribacteria             4        0
    Gemmatimonadetes         6        0
    Thermodesulfobacteria    7        0
    Ignavibacteria           8        0
    Deferribacteres          10       0
    Chlorobi                 14       0
    Synergistetes            21       0
    Euryarchaeota            23       0
    Nitrospirae              24       0
    Aquificae                24       0
    Acidobacteria            30       0
    Verrucomicrobia          41       0
    Planctomycetes           42       0
    Thermotogae              50       0
    Chloroflexi              51       0
    Fusobacteria             80       0
    Deinococcus-Thermus      92       0
    Chlamydiae               207      1
    Cyanobacteria            245      1
    Tenericutes              251      1
    Spirochaetes             472      2
    Bacteroidetes            762      4
    Actinobacteria           2,065    10
    Firmicutes               5,342    26
    Proteobacteria           10,088   50
    Unclassified             17       0

    Eukaryotic Phylum Distribution
    Phylum                   Count    Percent
    Phaeophyceae             1        0
    Priapulida               1        0
    Rotifera                 1        0
    Hemichordata             1        0
    Pinguiophyceae           1        0
    Ctenophora               1        0
    Bolidophyceae            1        0
    Chaetognatha             1        0
    Porifera                 2        0
    Xanthophyceae            2        0
    Tardigrada               2        0
    Euglenida                2        0
    Chromerida               3        0
    Placozoa                 3        0
    Glomeromycota            3        0
    Cryptomycota             4        0
    Blastocladiomycota       5        0
    Echinodermata            6        0
    Entomophthoromycota      9        0
    Chytridiomycota          12       0
    Neocallimastigomycota    12       0
    Annelida                 13       0
    Eustigmatophyceae        13       0
    Cnidaria                 18       0
    Bacillariophyta          21       0
    Platyhelminthes          23       0
    Mollusca                 25       0
    Microsporidia            31       1
    Chlorophyta              77       1
    Nematoda                 110      2
    Apicomplexa              264      5
    Arthropoda               370      7
    Chordata                 626      12
    Streptophyta             796      15
    Basidiomycota            976      18
    Ascomycota               1,251    23
    Unclassified             704      13

    [Clipped fragments of the GOLD page: Back to GOLD; 18/18 Family: 30/29 Genus: 103/118 Species: 340/67324/118 Family: 280/298 Genus: 1368/2106 Species: 6352/114240/1037 Family: 689/6689 Genus: 1170/54319 Species: 1769/218222; "...jects over number of the classified subdivisions of this phylogenetic group."]
    http://www.genomesonline.org/cgi-bin/GOLD/index.cgi
  • 9. GOLD: Euks by Phylum
    Phylum                   Count    Percent
    Priapulida               1        0
    Phaeophyceae             1        0
    Rotifera                 1        0
    Hemichordata             1        0
    Pinguiophyceae           1        0
    Ctenophora               1        0
    Bolidophyceae            1        0
    Chaetognatha             1        0
    Porifera                 2        0
    Xanthophyceae            2        0
    Tardigrada               2        0
    Euglenida                2        0
    Chromerida               3        0
    Placozoa                 3        0
    Glomeromycota            3        0
    Cryptomycota             4        0
    Blastocladiomycota       5        0
    Echinodermata            6        0
    Entomophthoromycota      9        0
    Chytridiomycota          12       0
    Neocallimastigomycota    12       0
    Annelida                 13       0
    Eustigmatophyceae        13       0
    Cnidaria                 18       0
    Bacillariophyta          21       0
    Platyhelminthes          23       0
    Mollusca                 25       0
    Microsporidia            31       1
    Chlorophyta              77       1
    Nematoda                 110      2
    Apicomplexa              264      5
    Arthropoda               370      7
    Chordata                 626      12
    Streptophyta             796      15
    Basidiomycota            976      18
    Ascomycota               1,251    23
  • 10. Euks More Resolution
    [Figure: 18S rDNA tree of eukaryotes; tip labels and node support values are not reproduced. Group labels on the tree: SAR, Excavata, Diphyllatia, Amoebozoa, Opisthokonta, Haptophyta, Telonemia, Apusozoa, Centrohelida, Cryptophyta, Rhodophyta, Glaucophyta, Viridiplantae.]
    FIG. 1. 18S rDNA phylogeny of the Diphyllatia species Collodictyon triciliatum (highlighted by black box) and Diphylleia rotans. The topology was reconstructed by MrBayes v3.1.2 under the GTR + GAMMA + I + covarion model. Posterior probabilities (PP) and ML bootstrap supports (BP, inferred by RAxML v7.1.2 under GTR + GAMMA + I model) are shown at the nodes. Thick lines indicate PP > 0.90 and BP > 80%. Dashes "-" indicate PP < 0.5 or BP < 50%. A few long branches are shortened by 50% (/) or 75% (//).
    Zhao et al. · doi:10.1093/molbev/mss001, MBE, p. 1560
    Research article: Collodictyon—An Ancient Lineage in the Tree of Eukaryotes
    Sen Zhao,†,1 Fabien Burki,†,2 Jon Bråte,1 Patrick J. Keeling,2 Dag Klaveness,1 and Kamran Shalchian-Tabrizi*,1
    1 Microbial Evolution Research Group, Department of Biology, University of Oslo, Oslo, Norway
    2 Canadian Institute for Advanced Research, Botany Department, University of British Columbia, Vancouver, British Columbia, Canada
    † These authors contributed equally to this work.
    * Corresponding author: E-mail: kamran@bio.uio.no.
    Associate editor: Hervé Philippe
    Abstract
    The current consensus for the eukaryote tree of life consists of several large assemblages (supergroups) that are hypothesized to describe the existing diversity. Phylogenomic analyses have shed light on the evolutionary relationships within and between supergroups as well as placed newly sequenced enigmatic species close to known lineages. Yet, a few eukaryote species remain of unknown origin and could represent key evolutionary forms for inferring ancient genomic and cellular characteristics of eukaryotes. Here, we investigate the evolutionary origin of the poorly studied protist Collodictyon (subphylum Diphyllatia) by sequencing a cDNA library as well as the 18S and 28S ribosomal DNA (rDNA) genes. Phylogenomic trees inferred from 124 genes placed Collodictyon close to the bifurcation of the "unikont" and "bikont" groups, either alone or as sister to the potentially contentious excavate Malawimonas. Phylogenies based on rDNA genes confirmed that Collodictyon is closely related to another genus, Diphylleia, and revealed a very low diversity in environmental DNA samples. The early and distinct origin of Collodictyon suggests that it constitutes a new lineage in the global eukaryote phylogeny. Collodictyon shares cellular characteristics with Excavata and Amoebozoa, such as ventral feeding groove supported by microtubular structures and the ability to form thin and broad pseudopods. These may therefore be ancient morphological features among eukaryotes. Overall, this shows that Collodictyon is a key lineage to understand early eukaryote evolution.
    Key words: 18S and 28S rDNA, Collodictyon, Diphyllatia, tree of life, phylogenomics, cDNA, pyrosequencing.
    Introduction
    Over the last few years, molecular sequence data have addressed some of the most intriguing questions about the eukaryote tree of life. Phylogenomic analyses have confirmed the existence of several major eukaryote groups (supergroups) as well as shown various levels of evidences for the relationships among them (Burki et al. 2007; Parfrey et al. 2010). Recently, two new large assemblages, SAR (Stramenopila, Alveolata, and Rhizaria) and CCTH (Cryptophyta, Centrohelida, Telonemia, and Haptophyta), were proposed to encompass a large fraction of the eukaryote diversity, together with the other supergroups Opisthokonta, Amoebozoa, Archaeplastida, and Excavata (Patron et al. 2007; Burki et al. 2009). Solid phylogenomic evidence [...] and complex genome histories (Simpson and Roger 2004; Parfrey et al. 2006; Roger and Simpson 2009).
    Identification of sister lineages to these supergroups is crucial for resolving the eukaryote tree and understanding the early history of eukaryotes. If these key lineages exist, they may be found among the few species that harbor distinct morphological features but are of unknown evolutionary origin in single-gene phylogenies (Patterson 1999; Shalchian-Tabrizi et al. 2006; Kim et al. 2011). Indications that such enigmatic species can be placed in the eukaryote tree come from recent phylogenomic analyses. For instance, Ministeria (Opisthokonta), Breviata (Amoebozoa) and Telonemia, Centroheliozoa, and Picobiliphyta have been shown to constitute deep lineages within their re[...]
    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3351787/
  • 11. Euks More Resolution
    2010 PARFREY ET AL.—BROADLY SAMPLED TREE OF EUKARYOTIC LIFE, p. 523
    FIGURE 1. Most likely eukaryotic tree of life reconstructed using all 451 taxa and all 16 genes (SSU-rDNA plus 15 protein genes). Major nodes in this topology are robust to analyses of subsets of taxa and genes, which include varying levels of missing data (Table 1). Clades in bold are monophyletic in analyses with 2 or more members except in all:15 in which taxa represented by a single gene were sometimes misplaced. Numbers in boxes represent support at key nodes in analyses with increasing amounts of missing data (10:16, 6:16, 4:16, and all:16 analyses; see Table 1 for more details). Given uncertainties around the root of the eukaryotic tree of life (see text), we have chosen to draw the tree rooted with the well-supported clade Opisthokonta. Dashed line indicates alternate branching pattern seen for Amoebozoa in other analyses. Long branches, indicated by //, have been reduced by half. The 6 lineages labeled by * represent taxa that are misplaced, probably due to LBA, listed from top to bottom with expected clade in parentheses. These are Protoopalina japonica (Stramenopiles), Aggregata octopiana (Apicomplexa), Mikrocytos mackini (Haplosporidia), Centropyxis laevigata (Tubulinea), Marteilioides chungmuensis (unplaced), and Cochliopodium spiniferum (Amoebozoa).
    Syst. Biol. 59(5):518–533, 2010. © The Author(s) 2010. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org. DOI:10.1093/sysbio/syq037. Advance Access publication on July 23, 2010.
    Broadly Sampled Multigene Analyses Yield a Well-Resolved Eukaryotic Tree of Life
    LAURA WEGENER PARFREY1, JESSICA GRANT2, YONAS I. TEKLE2,6, ERICA LASEK-NESSELQUIST3,4, HILARY G. MORRISON3, MITCHELL L. SOGIN3, DAVID J. PATTERSON5, AND LAURA A. KATZ1,2,∗
    1 Program in Organismic and Evolutionary Biology, University of Massachusetts, 611 North Pleasant Street, Amherst, MA 01003, USA
    2 Department of Biological Sciences, Smith College, 44 College Lane, Northampton, MA 01063, USA
    3 Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA 02543, USA
    4 Department of Ecology and Evolutionary Biology, Brown University, 80 Waterman Street, Providence, RI 02912, USA
    5 Biodiversity Informatics Group, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA 02543, USA
    6 Present address: Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520, USA
    ∗ Correspondence to be sent to: Laura A. Katz, 44 College Lane, Northampton, MA 01003, USA; E-mail: lkatz@smith.edu.
    Laura Wegener Parfrey and Jessica Grant have contributed equally to this work.
    Received 30 September 2009; reviews returned 1 December 2009; accepted 25 May 2010
    Associate Editor: Cécile Ané
    Abstract.—An accurate reconstruction of the eukaryotic tree of life is essential to identify the innovations underlying the diversity of microbial and macroscopic (e.g., plants and animals) eukaryotes. Previous work has divided eukaryotic diversity into a small number of high-level "supergroups," many of which receive strong support in phylogenomic analyses. However, the abundance of data in phylogenomic analyses can lead to highly supported but incorrect relationships due to systematic phylogenetic error. Furthermore, the paucity of major eukaryotic lineages (19 or fewer) included in these genomic studies may exaggerate systematic error and reduce power to evaluate hypotheses. Here, we use a taxon-rich strategy to assess eukaryotic relationships. We show that analyses emphasizing broad taxonomic sampling (up to 451 taxa representing 72 major lineages) combined with a moderate number of genes yield a well-resolved eukaryotic tree of life. The consistency across analyses with varying numbers of taxa (88–451) and levels of missing data (17–69%) supports the accuracy of the resulting topologies. The resulting stable topology emerges without the removal of rapidly evolving genes or taxa, a practice common to phylogenomic analyses. Several major groups are stable and strongly supported in these analyses (e.g., SAR, Rhizaria, Excavata), whereas the proposed supergroup "Chromalveolata" is rejected. Furthermore, extensive instability among photosynthetic lineages suggests the presence of systematic biases including endosymbiotic gene transfer from symbiont (nucleus or plastid) to host. Our analyses demonstrate that stable topologies of ancient evolutionary relationships can be achieved with broad taxonomic sampling and a moderate number of genes. Finally, taxon-rich analyses such as presented here provide a method for testing the accuracy of relationships that receive high bootstrap support (BS) in phylogenomic analyses and enable placement of the multitude of lineages that lack genome scale data. [Excavata; microbial eukaryotes; Rhizaria; supergroups; systematic error; taxon sampling.]
    Perspectives on the structure of the eukaryotic tree of life have shifted in the past decade as molecular analyses provide hypotheses for relationships among [...] marks throughout to note groups where uncertainties remain. Moreover, it is difficult to evaluate the overall stability of major clades of eukaryotes because phyloge[...]
    http://sysbio.oxfordjournals.org/content/59/5/518.full
  • 12. Euks More Resolution but Simpler
    Syst. Biol. 59(5):518–533, 2010. DOI:10.1093/sysbio/syq037. Broadly Sampled Multigene Analyses Yield a Well-Resolved Eukaryotic Tree of Life. Laura Wegener Parfrey, Jessica Grant, Yonas I. Tekle, Erica Lasek-Nesselquist, Hilary G. Morrison, Mitchell L. Sogin, David J. Patterson, and Laura A. Katz. (Title page and abstract as on slide 11.)
    Perspectives on the structure of the eukaryotic tree of life have shifted in the past decade as molecular analyses provide hypotheses for relationships among the approximately 75 robust lineages of eukaryotes. These lineages are defined by ultrastructural identities (Patterson 1999)—patterns of cellular and subcellular organization revealed by electron microscopy—and are strongly supported in molecular analyses (Parfrey et al. 2006; Yoon et al. 2008). Most of these lineages now fall within a small number of higher level clades, the supergroups of eukaryotes (Simpson and Roger 2004; Adl et al. 2005; Keeling et al. 2005). Several of these clades—Opisthokonta, Rhizaria, and Amoebozoa— [...] marks throughout to note groups where uncertainties remain. Moreover, it is difficult to evaluate the overall stability of major clades of eukaryotes because phylogenomic analyses have 19 or fewer of the major lineages and hence do not sufficiently sample eukaryotic diversity (Rodríguez-Ezpeleta et al. 2007b; Burki et al. 2008; Hampl et al. 2009), whereas taxon-rich analyses with 4 or fewer genes yield topologies with poor support at deep nodes (Cavalier-Smith 2004; Parfrey et al. 2006; Yoon et al. 2008).
    Estimating the relationships of the major lineages of eukaryotes is difficult because of both the ancient age of eukaryotes (1.2–1.8 billion years; Knoll et al. [...]
    SYSTEMATIC BIOLOGY VOL. 59
    FIGURE 5. Summary of major findings—the evolutionary relationships among major lineages of eukaryotes. Clades have been collapsed into those that we view to be strongly supported. The many polytomies represent uncertainties that remain.
    FUNDING
    This work was made possible by the US National [...]
    http://sysbio.oxfordjournals.org/content/59/5/518.full
  • 13. Mapping GOLD to Tree
    (GOLD eukaryotic phylum counts, same table as slide 9, shown beside the following page.)
    530 SYSTEMATIC BIOLOGY VOL. 59
    [...] a 97-taxon data set of Rhizaria that included all lineages with previously published data plus additional multigene data for 12 taxa added for this study (Table S1). Three major clades are strongly supported, though the relationships among them are unresolved: i) Cercozoa, ii) Foraminifera plus Polycystinea and Acantharea (formerly classified with Phaeodarea as radiolarians), and (iii) the parasitic Haplosporidia and Plasmodiophorida with Gromia and vampyrellids (Fig. 3; Bass et al. 2009). We show that Theratromyxa, a nematode-eating soil amoeba, is related to vampyrellid amoebae (Fig. 3; 100% BS), and together they are sister to the plant parasites plasmodiophorids (100% BS). The SSU-rDNA sequence for Theratromyxa is identical to an amoeba isolated from Siberia where it was identified as Arachnula impatiens (EU567294; Bass et al. 2009).
    The topology within the Excavata is consistent with previous hypotheses and clades with ultrastructural identities (Simpson 2003; Fig. 4), when contaminant EST data originally mislabeled as Streblomastix strix are excluded (Slamovits and Keeling 2006). Excavata is often polyphyletic in other analyses because Malawimonas branches outside the other clades of Excavata (Rodríguez-Ezpeleta et al. 2007a; Hampl et al. 2009), whereas in analyses of fewer genes Excavata members fall into 2 or 3 clades (Parfrey et al. 2006; Simpson et al. 2006). Although Malawimonas nests robustly within Excavata in our analyses, it does not have a stable sister group and may represent an independent lineage (Fig. 4). Our analyses confirm that Stephanopogon (unplaced in Patterson 1999) branches within Heterolobosea (Cavalier-Smith and Nikolaev 2008; Yubuki and Leander 2008) and suggests that another enigmatic flagellate, ATCC 50646 (tentatively named Soginia anisocystis) is a basal member of Heterolobosea.
    FIGURE 5. Summary of major findings—the evolutionary relationships among major lineages of eukaryotes. Clades have been collapsed [...]
  • 14. Mapping GOLD to Tree (same GOLD table and Systematic Biology page 530 excerpt as slide 13)
  • 15. Mapping GOLD to Tree. Fungi 49% (same GOLD table and Systematic Biology page 530 excerpt as slide 13)
  • 16. Mapping GOLD to Tree
  • 17. Mapping GOLD to Tree
  • 18. Mapping GOLD to Tree: Animals 26%
  • 19. Mapping GOLD to Tree
  • 20. Mapping GOLD to Tree
  • 21. Mapping GOLD to Tree: Green algae 19%
  • 22. Mapping GOLD to Tree
  • 23. Mapping GOLD to Tree
  • 24. Mapping GOLD to Tree: Apicomplexa 5%
  • 25. A Very Biased Sampling
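One rough way to put a number on the "very biased sampling" point is to ask how much of the table sits in its few largest rows. The sketch below is an illustration only, not part of the talk: it copies the per-lineage counts shown on the slides (presumably entries recorded in GOLD) into a dictionary, and the names `counts` and `top_share` are hypothetical. The percentages printed on the slides appear to be relative to a larger total than the 36 rows visible here, so the shares computed below describe the visible rows only and will not match the slide percentages.

```python
# Counts per lineage, copied from the table on the slides (visible rows only).
counts = {
    "Priapulida": 1, "Phaeophyceae": 1, "Rotifera": 1, "Hemichordata": 1,
    "Pinguiophyceae": 1, "Ctenophora": 1, "Bolidophyceae": 1, "Chaetognatha": 1,
    "Porifera": 2, "Xanthophyceae": 2, "Tardigrada": 2, "Euglenida": 2,
    "Chromerida": 3, "Placozoa": 3, "Glomeromycota": 3, "Cryptomycota": 4,
    "Blastocladiomycota": 5, "Echinodermata": 6, "Entomophthoromycota": 9,
    "Chytridiomycota": 12, "Neocallimastigomycota": 12, "Annelida": 13,
    "Eustigmatophyceae": 13, "Cnidaria": 18, "Bacillariophyta": 21,
    "Platyhelminthes": 23, "Mollusca": 25, "Microsporidia": 31,
    "Chlorophyta": 77, "Nematoda": 110, "Apicomplexa": 264,
    "Arthropoda": 370, "Chordata": 626, "Streptophyta": 796,
    "Basidiomycota": 976, "Ascomycota": 1251,
}


def top_share(table, k):
    """Fraction of the summed counts that falls in the k largest rows."""
    total = sum(table.values())
    largest = sorted(table.values(), reverse=True)[:k]
    return sum(largest) / total


if __name__ == "__main__":
    total = sum(counts.values())
    sparse = sum(1 for c in counts.values() if c <= 5)
    print(f"visible rows: {len(counts)}, summed count: {total}")
    print(f"share in the 5 largest lineages: {top_share(counts, 5):.1%}")
    print(f"lineages with 5 or fewer entries: {sparse}")
```

Sorting by count and taking the top `k` keeps the calculation independent of any taxonomic grouping, which the slides do not spell out row by row.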
  • 26. Solution to Biased Sampling?
  • 27. Solution: Fill in the Tree
      Priapulida               1      0
      Phaeophyceae             1      0
      Rotifera                 1      0
      Hemichordata             1      0
      Pinguiophyceae           1      0
      Ctenophora               1      0
      Bolidophyceae            1      0
      Chaetognatha             1      0
      Porifera                 2      0
      Xanthophyceae            2      0
      Tardigrada               2      0
      Euglenida                2      0
      Chromerida               3      0
      Placozoa                 3      0
      Glomeromycota            3      0
      Cryptomycota             4      0
      Blastocladiomycota       5      0
      Echinodermata            6      0
      Entomophthoromycota      9      0
      Chytridiomycota         12      0
      Neocallimastigomycota   12      0
      Annelida                13      0
      Eustigmatophyceae       13      0
      Cnidaria                18      0
      Bacillariophyta         21      0
      Platyhelminthes         23      0
      Mollusca                25      0
      Microsporidia           31      1
      Chlorophyta             77      1
      Nematoda               110      2
      Apicomplexa            264      5
      Arthropoda             370      7
      Chordata               626     12
      Streptophyta           796     15
      Basidiomycota          976     18
      Ascomycota           1,251     23
      Background page: SYSTEMATIC BIOLOGY VOL. 59, p. 530
  • 28. II: Filling in the Tree Example
  • 29. Big Microbial Sequencing Projects
      • Coordinated, top-down efforts
          • Fungal Genome Initiative (Broad/Whitehead)
          • Gordon and Betty Moore Foundation Marine Microbial Genome Sequencing Project
          • Sanger Center Pathogen Sequencing Unit
          • NHGRI Human Gut Microbiome Project
          • NIH Human Microbiome Program
      • White paper or grant systems
          • NIAID Microbial Sequencing Centers
          • DOE/JGI Community Sequencing Program
          • DOE/JGI BER Sequencing Program
          • NSF/USDA Microbial Genome Sequencing
      • Covers lots of ground and biological diversity
  • 30. As of 2002
  • 31. As of 2002
      • At least 40 phyla of bacteria
      Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
      Tree labels: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter
  • 32. As of 2002
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 33. As of 2002
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 34. As of 2002
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 35. As of 2002
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 36. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      • Solution I: sequence more phyla
      • NSF-funded Tree of Life Project
      • A genome from each of eight phyla
      Eisen, Ward, Robb, Nelson, et al
  • 37. Organisms Selected
      Phylum                   Species selected
      Chrysiogenes             Chrysiogenes arsenatis (GCA)
      Coprothermobacter        Coprothermobacter proteolyticus (GCBP)
      Dictyoglomi              Dictyoglomus thermophilum (GDT)
      Thermodesulfobacteria    Thermodesulfobacterium commune (GTC)
      Nitrospirae              Thermodesulfovibrio yellowstonii (GTY)
      Thermomicrobia           Thermomicrobium roseum (GTR)
      Deferribacteres          Geovibrio thiophilus (GGT)
      Synergistes              Synergistes jonesii (GSJ)
  • 39. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      • Still highly biased in terms of the tree
      • NSF-funded Tree of Life Project
      • A genome from each of eight phyla
      Eisen & Ward, PIs
  • 40. Major Lineages of Actinobacteria
      2.5.1 Acidimicrobidae
      2.5.1.1 Unclassified
      2.5.1.2 "Microthrixineae"
      2.5.1.3 Acidimicrobineae
      2.5.1.4 BD2-10
      2.5.1.5 EB1017
      2.5.2 Actinobacteridae
      2.5.2.1 Unclassified
      2.5.2.10 Ellin306/WR160
      2.5.2.11 Ellin5012
      2.5.2.12 Ellin5034
      2.5.2.13 Frankineae
      2.5.2.14 Glycomyces
      2.5.2.15 Intrasporangiaceae
      2.5.2.16 Kineosporiaceae
      2.5.2.17 Microbacteriaceae
      2.5.2.18 Micrococcaceae
      2.5.2.19 Micromonosporaceae
      2.5.2.2 Actinomyces
      2.5.2.20 Propionibacterineae
      2.5.2.21 Pseudonocardiaceae
      2.5.2.22 Streptomycineae
      2.5.2.23 Streptosporangineae
      2.5.2.3 Actinomycineae
      2.5.2.4 Actinosynnemataceae
      2.5.2.5 Bifidobacteriaceae
      2.5.2.6 Brevibacteriaceae
      2.5.2.7 Cellulomonadaceae
      2.5.2.8 Corynebacterineae
      2.5.2.9 Dermabacteraceae
      2.5.3 Coriobacteridae
      2.5.3.1 Unclassified
      2.5.3.2 Atopobiales
      2.5.3.3 Coriobacteriales
      2.5.3.4 Eggerthellales
      2.5.4 OPB41
      2.5.5 PK1
      2.5.6 Rubrobacteridae
      2.5.6.1 Unclassified
      2.5.6.2 "Thermoleiphilaceae"
      2.5.6.3 MC47
      2.5.6.4 Rubrobacteraceae

      2.5 Actinobacteria
      2.5.1 Acidimicrobidae
      2.5.1.1 Unclassified
      2.5.1.2 "Microthrixineae"
      2.5.1.3 Acidimicrobineae
      2.5.1.3.1 Unclassified
      2.5.1.3.2 Acidimicrobiaceae
      2.5.1.4 BD2-10
      2.5.1.5 EB1017
      2.5.2 Actinobacteridae
      2.5.2.1 Unclassified
      2.5.2.10 Ellin306/WR160
      2.5.2.11 Ellin5012
      2.5.2.12 Ellin5034
      2.5.2.13 Frankineae
      2.5.2.13.1 Unclassified
      2.5.2.13.2 Acidothermaceae
      2.5.2.13.3 Ellin6090
      2.5.2.13.4 Frankiaceae
      2.5.2.13.5 Geodermatophilaceae
      2.5.2.13.6 Microsphaeraceae
      2.5.2.13.7 Sporichthyaceae
      2.5.2.14 Glycomyces
      2.5.2.15 Intrasporangiaceae
      2.5.2.15.1 Unclassified
      2.5.2.15.2 Dermacoccus
      2.5.2.15.3 Intrasporangiaceae
      2.5.2.16 Kineosporiaceae
      2.5.2.17 Microbacteriaceae
      2.5.2.17.1 Unclassified
      2.5.2.17.2 Agrococcus
      2.5.2.17.3 Agromyces
      2.5.2.18 Micrococcaceae
      2.5.2.19 Micromonosporaceae
      2.5.2.2 Actinomyces
      2.5.2.20 Propionibacterineae
      2.5.2.20.1 Unclassified
      2.5.2.20.2 Kribbella
      2.5.2.20.3 Nocardioidaceae
      2.5.2.20.4 Propionibacteriaceae
      2.5.2.21 Pseudonocardiaceae
      2.5.2.22 Streptomycineae
      2.5.2.22.1 Unclassified
      2.5.2.22.2 Kitasatospora
      2.5.2.22.3 Streptacidiphilus
      2.5.2.23 Streptosporangineae
      2.5.2.23.1 Unclassified
      2.5.2.23.2 Ellin5129
      2.5.2.23.3 Nocardiopsaceae
      2.5.2.23.4 Streptosporangiaceae
      2.5.2.23.5 Thermomonosporaceae
      2.5.2.3 Actinomycineae
      2.5.2.4 Actinosynnemataceae
      2.5.2.5 Bifidobacteriaceae
      2.5.2.6 Brevibacteriaceae
      2.5.2.7 Cellulomonadaceae
      2.5.2.8 Corynebacterineae
      2.5.2.8.1 Unclassified
      2.5.2.8.2 Corynebacteriaceae
      2.5.2.8.3 Dietziaceae
      2.5.2.8.4 Gordoniaceae
      2.5.2.8.5 Mycobacteriaceae
      2.5.2.8.6 Rhodococcus
      2.5.2.8.7 Rhodococcus
      2.5.2.8.8 Rhodococcus
      2.5.2.9 Dermabacteraceae
      2.5.2.9.1 Unclassified
      2.5.2.9.2 Brachybacterium
      2.5.2.9.3 Dermabacter
      2.5.3 Coriobacteridae
      2.5.3.1 Unclassified
      2.5.3.2 Atopobiales
      2.5.3.3 Coriobacteriales
      2.5.3.4 Eggerthellales
      2.5.4 OPB41
      2.5.5 PK1
      2.5.6 Rubrobacteridae
      2.5.6.1 Unclassified
      2.5.6.2 "Thermoleiphilaceae"
      2.5.6.2.1 Unclassified
      2.5.6.2.2 Conexibacter
      2.5.6.2.3 XGE514
      2.5.6.3 MC47
      2.5.6.4 Rubrobacteraceae
  • 41. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      • Same trend in Archaea
      • NSF-funded Tree of Life Project
      • A genome from each of eight phyla
      Eisen & Ward, PIs
  • 42. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      • Same trend in Eukaryotes
      • NSF-funded Tree of Life Project
      • A genome from each of eight phyla
      Eisen & Ward, PIs
  • 43. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      • Same trend in Viruses
      • NSF-funded Tree of Life Project
      • A genome from each of eight phyla
      Eisen & Ward, PIs
  • 44. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
      • At least 40 phyla of bacteria
      • Genome sequences are mostly from three phyla
      • Some other phyla are only sparsely sampled
      • Solution: Really Fill in the Trees
  • 45. Filling in the Tree
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 46. Filling in the Tree
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 47. Filling in the Tree
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 48. Filling in the Tree
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 49. Lots of Plants, Animals, Fungi
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 50. Exclude Plants, Animals, Fungi
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 51. A Genomic Encyclopedia of Microbes (GEM)
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 52. Just Say No to Eukaryotes
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 53. GEBA: A Genomic Encyclopedia of Bacteria and Archaea
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press, based on Baldauf et al. tree
  • 54. GEBA
  • 55. GEBA Pilot Project: Components
      • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)
      • Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
      • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
      • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
      • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al)
      • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)
      • Adopt a microbe education project (Cheryl Kerfeld)
      • Outreach (David Gilbert)
      • $$$ (DOE, Eddy Rubin, Jim Bristow)
  • 56. rRNA Tree of Life
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 57. rRNA Tree of BA
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 58. GreenGenes
  • 60. DSMZ
  • 62. GEBA Pilot Project Overview
      • Identify major branches in rRNA tree for which no genomes are available
      • Identify those with a cultured representative in DSMZ
      • DSMZ grew > 200 of these and prepped DNA
      • Sequence and finish 200+
      • Annotate, analyze, release data
      • Assess benefits of tree guided sequencing
      • 1st paper Wu et al in Nature Dec 2009
  • 63. GEBA Pilot Target List
      Bar chart “GEBA Initial Target List”: # of Genomes (y axis, 0 to 35) by Phyla (x axis).
      Categories: B: Actinobacteria (High GC), B: Aminanaerobia, B: Aquificae, B: Bacteroidetes, B: Chloroflexi, B: Deferribacteres, B: Deferribacteres, B: Deinococci, B: Delta Proteobacteria, B: Epsilon Proteobacteria, B: Firmicutes, B: Fusobacteria, B: Gamma Proteobacteria, B: Gemmatimonadetes, B: Haloanaerobiales, B: Planctomycetes, B: Spirochaetes, B: Thermodesulfobacteria, B: Thermodesulfobia, B: Thermovenabulae, A: Halobacteria, A: Archaeoglobi, A: Methanobacteria, A: Methanomicrobia, A: Thermococci, A: Thermoprotei
  • 64. Assess Benefits of GEBA
      • All genomes have some value
      • But what, if any, is the benefit of tree-guided sequencing over other selection methods
      • Lessons for other large scale microbial genome projects?
  • 65. Lessons from GEBA
  • 66. Lesson 1: rRNA PD IDs novel lineages
      From Wu et al. 2009 Nature 462, 1056-1060
  • 67. Concatenated Marker PD
      From Wu et al. 2009 Nature 462, 1056-1060
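
      PD in these two slides is phylogenetic diversity. One common way to measure it is as the total branch length of the tree spanned by a set of taxa, and a tree-guided project can then prefer the candidate genome that adds the most new branch length. The Python sketch below illustrates that idea only; the toy tree, its branch lengths, and the function names are hypothetical and are not taken from Wu et al. 2009.

```python
def branches_to_root(tree, leaf):
    """Yield (node, branch_length) for every branch between a leaf and the root."""
    node = leaf
    while node in tree:  # the root has no entry, so the walk stops there
        parent, length = tree[node]
        yield node, length
        node = parent


def phylogenetic_diversity(tree, taxa):
    """Total length of the branches connecting the given taxa to the root."""
    covered = {}
    for taxon in taxa:
        for node, length in branches_to_root(tree, taxon):
            covered[node] = length  # shared branches are counted once
    return sum(covered.values())


def pd_gain(tree, sequenced, candidate):
    """Extra branch length a candidate genome adds to the sequenced set."""
    return (phylogenetic_diversity(tree, sequenced | {candidate})
            - phylogenetic_diversity(tree, sequenced))


# Hypothetical toy tree: child -> (parent, branch length).
toy_tree = {
    "A": ("X", 0.5),
    "B": ("X", 0.5),
    "X": ("root", 1.0),
    "D": ("root", 3.0),
}
sequenced = {"A"}          # already has a genome
candidates = ["B", "D"]    # cultured isolates without a genome
best = max(candidates, key=lambda c: pd_gain(toy_tree, sequenced, c))
print(best, pd_gain(toy_tree, sequenced, best))
```

      In this toy case the close relative B adds little new branch length, while the deep-branching D adds much more, which is the intuition behind picking targets from unsampled branches of the rRNA tree.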
  • 68. Lesson 2: rRNA Tree is not perfect
      Badger et al. 2005 Int J System Evol Microbiol 55: 1021-1026.
      16s WGT, 23S
  • 69. How to Pick Novel Lineages for Euks?
      • Molecular
      • rRNA PD?
      • Conserved markers by PCR?
      • EST shotgun?
      • Other data for phylogeny
  • 70. Lesson 3: Improves annotation
      • Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes
      • Better definition of protein family sequence “patterns”
      • Greatly improves “comparative” and “evolutionary” based predictions
      • Conversion of hypothetical into conserved hypotheticals
      • Linking distantly related members of protein families
      • Improved non-homology prediction
  • 71. Annotation for Euks?
  • 72. Lesson 4: Metadata Important
  • 73. Lesson 5: Project management critical
      • Tracking samples and status
      • Getting permissions
      • Shipping samples
      • Contacting collaborators
      • Data archiving and submission
      • Communicating with core facilities
      • and more
  • 74. Lesson 6: Culture Collections Needed
  • 75. Lesson 7: Data Publications
  • 76. Lesson 8: Diversity Discovery
      • Phylogeny-driven genome selection helps discover new genetic diversity
  • 77. Protein Family Rarefaction
      • Take data set of multiple complete genomes
      • Identify all protein families using MCL
      • Plot # of genomes vs. # of protein families
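
      The three steps above can be sketched in Python as follows. This is a minimal illustration, not the analysis from Wu et al. 2009: it assumes the clustering step (MCL in the slide) has already assigned each protein to a family, and the genome names, family IDs, and permutation count are hypothetical. The curve is averaged over random genome orderings so that it does not depend on the order in which genomes are added.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families seen after adding 1..N genomes.

    genome_families maps a genome name to the set of family IDs it contains.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)          # random order of genome addition
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]
            totals[i] += len(seen)    # families discovered so far
    return [total / n_permutations for total in totals]


# Hypothetical toy input: genome -> family IDs, as a clustering step might assign.
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5", "fam6"},
}
curve = rarefaction_curve(toy)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, round(n_families, 2))
```

      Plotting the first column against the second gives the rarefaction curve; a curve that keeps rising steeply suggests that additional genomes are still uncovering new protein families.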
  • 78. Wu et al. 2009 Nature 462, 1056-1060
  • 79. Wu et al. 2009 Nature 462, 1056-1060
  • 80. Wu et al. 2009 Nature 462, 1056-1060
  • 81. Wu et al. 2009 Nature 462, 1056-1060
  • 82. Wu et al. 2009 Nature 462, 1056-1060
  • 83. Synapomorphies exist
      Wu et al. 2009 Nature 462, 1056-1060
  • 84. True for Euks?
  • 85. Lesson 9: Improves metagenomics
  • 86. Phylotyping
      GEBA Project improves metagenomic analysis
      Bar chart “Sargasso Phylotypes”: Weighted % of Clones (y axis, 0 to 0.500) by Major Phylogenetic Group (x axis); markers: EFG, EFTu, HSP70, RecA, RpoB, rRNA.
      Groups: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota
      Venter et al., Science 304: 66-74. 2004
  • 87. Eukaryotic Metagenomics?
  • 88. GEBA Zoom
  • 89. GEBA Now
      • 300+ genomes
      • Rich sampling of major groups of cultured organisms
      • Zoomed in sampling of haloarchaea, cyanobacteria and more
  • 90. GEBA Cyanobacteria
      www.pnas.org/cgi/doi/10.1073/pnas.1217107110
  • 91. Haloarchaeal GEBA-like
      Lynch EA, Langille MGI, Darling A, Wilbanks EG, Haltiner C, et al. (2012) Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux. PLoS ONE 7(7): e41389. doi:10.1371/journal.pone.0041389
  • 92. GEBA RNB
      Nikos Kyrpides
      Plan: Sequence multiple Root Nodule Bacteria (RNBs) across the planet. Pilot: 100 RNBs.
      Goal:
      • Understand biogeographical effects on species evolution and understand host-specificity.
      Rationale:
      • N2 fixation by legume pastures and crops provides 65% of the N currently utilized in agricultural production.
      • Contributes 25 to 90 million metric tonnes N p.a.
      • Symbioses save $US 6-10 billion annually on N fertilizer.
      • Grain and animal production enhanced by fixed nitrogen supplied by the symbiosis.
      Figure labels: Alpha RNB, Bradyrhizobium, Mesorhizobium, Rhizobium, Beta RNB, Sinorhizobium, Cupriavidus, Burkholderia, Balneimonas-like, Devosia, Ochrobactrum, Phyllobacterium, Azorhizobium, Allorhizobium
  • 93. But ...
  • 94. Phylotyping
      GEBA Project improves metagenomic analysis
      Bar chart “Sargasso Phylotypes”: Weighted % of Clones (y axis, 0 to 0.500) by Major Phylogenetic Group (x axis); markers: EFG, EFTu, HSP70, RecA, RpoB, rRNA.
      Venter et al., Science 304: 66-74. 2004
  • 95. Phylotyping
      But not a lot
      Bar chart “Sargasso Phylotypes”: Weighted % of Clones (y axis, 0 to 0.500) by Major Phylogenetic Group (x axis); markers: EFG, EFTu, HSP70, RecA, RpoB, rRNA.
      Venter et al., Science 304: 66-74. 2004
  • 96. Phylogenomics Future 1
      • Need to adapt genomic and metagenomic methods to make better use of data
  • 97. Improving Metagenomic Analysis
      • Methods
          • More automation
          • Better phylogenetic methods for short reads and large data sets
          • Improved tools for using distantly related genomes in metagenomic analysis
      • Data sets
          • Rebuild protein family models
          • New phylogenetic markers
          • Need better reference phylogenies, including HGT
          • More simulations
  • 98. Kembel Correction
      Kembel, Wu, Eisen, Green. In press. PLoS Computational Biology. Incorporating 16S gene copy number information improves estimates of microbial diversity and abundance
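
      The title of that paper points at a simple underlying idea: a taxon with many 16S gene copies per genome contributes more 16S reads per cell, so raw read counts overstate its abundance. The Python sketch below shows only that arithmetic, dividing read counts by copy number and renormalizing. It is not the published method, and the taxon names, read counts, and copy numbers are hypothetical inputs.

```python
def correct_for_copy_number(read_counts, copy_numbers):
    """Convert 16S read counts into copy-number-corrected relative abundances."""
    # Reads per gene copy approximate the number of cells for each taxon.
    cells = {taxon: read_counts[taxon] / copy_numbers[taxon]
             for taxon in read_counts}
    total = sum(cells.values())
    return {taxon: value / total for taxon, value in cells.items()}


# Hypothetical example: taxon_A carries 7 copies of the 16S gene, taxon_B carries 1.
reads = {"taxon_A": 70, "taxon_B": 30}
copies = {"taxon_A": 7, "taxon_B": 1}
corrected = correct_for_copy_number(reads, copies)
print(corrected)
```

      In this toy case taxon_A dominates the raw reads but is the minority after correction, which shows how copy number can distort abundance estimates.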
  • 99. PhylOTU
      Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders in workflow of PhylOTU. See Results section for details. doi:10.1371/journal.pcbi.1001061.g001
      Sharpton TJ, Riesenfeld SJ, Kembel SW, Ladau J, O'Dwyer JP, Green JL, Eisen JA, Pollard KS. (2011) PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity and Resolves Novel Taxa from Metagenomic Data. PLoS Comput Biol 7(1): e1001061. doi:10.1371/journal.pcbi.1001061
  • 100. Phylosift / pplacer
      Aaron Darling, Guillaume Jospin, Holly Bik, Erik Matsen, Eric Lowe, and others
  • 101. Kembel Combiner
      typically used as a qualitative measure because duplicate sequences are usually removed from the tree. However, the test may be used in a semiquantitative manner if all clones, even those with identical or near-identical sequences, are included in the tree (13). Here we describe a quantitative version of UniFrac that we call “weighted UniFrac.” We show that weighted UniFrac behaves similarly to the FST test in situations where both a
      FIG. 1. Calculation of the unweighted and the weighted UniFrac measures. Squares and circles represent sequences from two different environments. (a) In unweighted UniFrac, the distance between the circle and square communities is calculated as the fraction of the branch length that has descendants from either the square or the circle environment (black) but not both (gray). (b) In weighted UniFrac, branch lengths are weighted by the relative abundance of sequences in the square and circle communities; square sequences are weighted twice as much as circle sequences because there are twice as many total circle sequences in the data set. The width of branches is proportional to the degree to which each branch is weighted in the calculations, and gray branches have no weight. Branches 1 and 2 have heavy weights since the descendants are biased toward the square and circles, respectively. Branch 3 contributes no value since it has an equal contribution from circle and square sequences after normalization.
      Kembel SW, Eisen JA, Pollard KS, Green JL (2011) The Phylogenetic Diversity of Metagenomes. PLoS ONE 6(8): e23214. doi:10.1371/journal.pone.0023214
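
      The unweighted measure described in panel (a) of that caption can be sketched in a few lines of Python: the branch length leading to sequences from only one environment, divided by the total branch length that leads to any sequence. The toy tree, its branch lengths, and the function name below are hypothetical and serve only to make the definition concrete; the weighted variant in panel (b) is not implemented here.

```python
def unweighted_unifrac(tree, leaf_env):
    """Fraction of branch length whose descendants come from a single environment.

    tree maps child -> (parent, branch_length); leaf_env maps leaf -> environment.
    """
    # For every branch (keyed by its child node), collect the environments below it.
    envs_below = {node: set() for node in tree}
    for leaf, env in leaf_env.items():
        node = leaf
        while node in tree:  # walk up until the root, which has no entry
            envs_below[node].add(env)
            node = tree[node][0]
    unique = sum(tree[n][1] for n, envs in envs_below.items() if len(envs) == 1)
    total = sum(tree[n][1] for n, envs in envs_below.items() if envs)
    return unique / total


# Hypothetical toy tree with one square sequence and two circle sequences.
toy_tree = {
    "s1": ("X", 1.0),
    "c1": ("X", 1.0),
    "X": ("root", 2.0),   # shared by both environments
    "c2": ("root", 4.0),
}
toy_env = {"s1": "square", "c1": "circle", "c2": "circle"}
distance = unweighted_unifrac(toy_tree, toy_env)
print(distance)
```

      Here the branch above X is shared by both environments and so does not count toward the numerator, while the three tip branches are each unique to one environment.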
  • 102. NMF in Metagenomes
      Characterizing the niche-space distributions of components
      Figure 3: a) Niche-space distributions for our five components (H^T); b) the site-similarity matrix (Ĥ^T Ĥ); c) environmental variables for the sites. The matrices are aligned so that the same row corresponds to the same site in each matrix. Sites are ordered by applying spectral reordering to the similarity matrix (see Materials and Methods). Rows are aligned across the three matrices.
      Figure labels: Sites (rows); Component 1 to Component 5; environmental variables: Salinity, Sample Depth, Chlorophyll, Temperature, Insolation, Water Depth
      Functional biogeography of ocean microbes revealed through non-negative matrix factorization. Jiang et al. In press PLoS One. Comes out 9/18.
      w/ Weitz, Dushoff, Langille, Neches, Levin, etc
  • 103. More Markers
      Phylogenetic group       Genome Number   Gene Number   Marker Candidates
      Archaea                   62             145415        106
      Actinobacteria            63             267783        136
      Alphaproteobacteria       94             347287        121
      Betaproteobacteria        56             266362        311
      Gammaproteobacteria      126             483632        118
      Deltaproteobacteria       25             102115        206
      Epsilonproteobacteria     18              33416        455
      Bacteroides               25              71531        286
      Chlamydiae                13              13823        560
      Chloroflexi               10              33577        323
      Cyanobacteria             36             124080        590
      Firmicutes               106             312309         87
      Spirochaetes              18              38832        176
      Thermi                     5              14160        974
      Thermotogae                9              17037        684
  • 104. Better Reference Tree
      Morgan et al. submitted
  • 105. Sifting Families
      Workflow figure (Figure 1, panels A, B, C), box labels: Representative Genomes; Extract Protein Annotation; All v. All BLAST; Homology Clustering (MCL); SFams; Align & Build HMMs; HMMs; Screen for Homologs; New Genomes; Extract Protein Annotation
      Sharpton et al. submitted
  • 106. Zorro - Automated Masking
      Chart: Distance to True Tree (y axis, 0.0 to 9.0) vs. Sequence Length (x axis: 200, 400, 800, 1600, 3200); series: no masking, zorro, gblocks
      Wu M, Chatterji S, Eisen JA (2012) Accounting For Alignment Uncertainty in Phylogenomics. PLoS ONE 7(1): e30288. doi:10.1371/journal.pone.0030288
  • 107. Phylogenomics Future 2
      • We have still only scratched the surface of microbial diversity
  • 108. rRNA Tree of Life
      Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740
      Figure labels: Archaea, Eukaryotes, Bacteria
  • 109. PD: Genomes
      From Wu et al. 2009 Nature 462, 1056-1060
  • 110. PD: Genomes + GEBA
      From Wu et al. 2009 Nature 462, 1056-1060
  • 111. PD: Isolates
      From Wu et al. 2009 Nature 462, 1056-1060
  • 112. PD: All
      From Wu et al. 2009 Nature 462, 1056-1060
  • 113. Uncultured Lineages: Methods
      • Get into culture
      • Enrichment cultures
      • If abundant in low diversity ecosystems
      • Flow sorting
      • Microbeads
      • Microfluidic sorting
      • Single cell amplification
  • 114. GEBA Uncultured
      Phil Hugenholtz
      Number of SAGs from Candidate Phyla
      Site                                     OD1   OP11   OP3   SAR406
      Site A: Hydrothermal vent                 4     1     -     -
      Site B: Gold Mine                         6    13     2     -
      Site C: Tropical gyres (Mesopelagic)      -     -     -     2
      Site D: Tropical gyres (Photic zone)      1     -     -     -
      Sample collections at 4 additional sites are underway.
  • 115. Uncultured Eukaryotes?
  • 116. Phylogenomics Future 3 • Need Experiments from Across the Tree of Life too
  • 117. [Tree of bacterial phyla: Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi, Proteobacteria, Chlorobi, Firmicutes, Fusobacteria, Actinobacteria, Cyanobacteria, Chlamydia, Spirochaetes, Deinococcus-Thermus, Aquificae, Thermotogae, TM6, OS-K, Termite Group, OP8, Marine Group A, WS3, OP9, NKB19, OP3, OP10, TM7, OP1, OP11, Nitrospira, Synergistes, Deferribacteres, Thermodesulfobacteria, Chrysiogenetes, Thermomicrobia, Dictyoglomus, Coprothermobacter.] • At least 40 phyla of bacteria. As of 2002. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 118. [Tree of bacterial phyla, as on slide 117.] • At least 40 phyla of bacteria • Experimental studies are mostly from three phyla. As of 2002. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 119. [Tree of bacterial phyla, as on slide 117.] • At least 40 phyla of bacteria • Experimental studies are mostly from three phyla • Some studies in other phyla. As of 2002. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 120. [Tree of bacterial phyla, as on slide 117.] • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Eukaryotes. As of 2002. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 121. [Tree of bacterial phyla, as on slide 117.] • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Viruses. As of 2002. Tree based on Hugenholtz, 2002. http://genomebiology.com/2002/3/2/reviews/0003
  • 122. [Tree of bacterial phyla, as on slide 117, with scale bar 0.1.] Tree based on Hugenholtz (2002) with some modifications. • Need experimental studies from across the tree too. http://genomebiology.com/2002/3/2/reviews/0003
  • 123. [Tree of bacterial phyla, as on slide 117, with scale bar 0.1.] Tree based on Hugenholtz (2002) with some modifications. • Adopt a Microbe. http://genomebiology.com/2002/3/2/reviews/0003
  • 124. What Next?
  • 125. Acknowledgements
      • GEBA:
          • $$: DOE-JGI, DSMZ
          • Eddy Rubin, Phil Hugenholtz, Hans-Peter Klenk, Nikos Kyrpides, Tanya Woyke, Dongying Wu, Aaron Darling, Jenna Lang
      • GEBA Cyanobacteria
          • $$: DOE-JGI
          • Cheryl Kerfeld, Dongying Wu, Patrick Shih
      • Haloarchaea
          • $$$: NSF
          • Marc Facciotti, Aaron Darling, Erin Lynch
      • iSEEM:
          • $$: GBMF
          • Katie Pollard, Jessica Green, Martin Wu, Steven Kembel, Tom Sharpton, Morgan Langille, Guillaume Jospin, Dongying Wu
      • aTOL
          • $$: NSF
          • Naomi Ward, Jonathan Badger, Frank Robb, Martin Wu, Dongying Wu
      • Others (not mentioned in detail)
          • $$: NSF, NIH, DOE, GBMF, DARPA, Sloan
          • Frank Robb, Craig Venter, Doug Rusch, Shibu Yooseph, Nancy Moran, Colleen Cavanaugh, Josh Weitz
      • EisenLab: Srijak Bhatnagar, Russell Neches, Lizzy Wilbanks, Holly Bik