
Hans-Joachim Ruscheweyh: Pooling Metagenomes in MEGAN Based on Environmental Parameters

Hans-Joachim Ruscheweyh's talk from the 1st Earth Microbiome Project meeting in Shenzhen

  1. Introduction | MEGAN | Metadata | Pooling Datasets | Summary & Conclusion
     Pooling metagenomes in MEGAN based on environmental parameters
     Hans-Joachim Ruscheweyh, Center for Bioinformatics, Tuebingen University
     June 15, 2011
     1 / 27, Hans-Joachim Ruscheweyh, Pooling metagenomes
  2. Outline
     1 Introduction: Metagenomics; Unculturable Microbes; Typical Metagenomic Samples; Pipeline
     2 MEGAN: MEGAN Introduction; Taxonomic & Functional Analysis; Comparison Analysis; PostgreSQL
     3 Metadata: What is Metadata?; Using Metadata to pool Datasets
     4 Pooling Datasets: Basic Idea; Combined Datasets; MetaData Analyzer
     5 Summary & Conclusion
  4. Metagenomics
     • The study of DNA of uncultured organisms
     • > 99% of all microbes cannot be cultured
     • A genome is the entire genetic information of a single organism
     • A metagenome is the entire genetic information of an assemblage of organisms
  5. Typical Metagenomic Samples
     • Human microbiome
     • Soil samples
     • Sea water samples
     • Seabed samples
     • Air samples
     • Medical samples
     • Ancient bones
  6. Metagenomic Pipeline
     Source: A primer on metagenomics; Wooley et al. (2010)
  8. MEGAN Introduction
     • Interactive tool for metagenomic analysis
     • www-ab.informatik.uni-tuebingen.de/software/megan
  9. Taxonomic Analysis
     • The tree reflects the NCBI taxonomy
     • Reads are compared against a reference database, e.g. NR
     • Reads are mapped onto the tree using the comparison results, based on the LCA algorithm
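The mapping step on the Taxonomic Analysis slide can be sketched as follows. This is a minimal illustration of a plain lowest-common-ancestor (LCA) assignment, not MEGAN's implementation: the toy taxonomy (which skips intermediate ranks), the function names, and the absence of any score or threshold filtering are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical, heavily simplified parent map (child -> parent).
# The real NCBI taxonomy has many more ranks; the root has parent None.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Alphaproteobacteria": "Proteobacteria",
    "Escherichia coli": "Gammaproteobacteria",
    "Vibrio cholerae": "Gammaproteobacteria",
    "Pelagibacter ubique": "Alphaproteobacteria",
}


def path_to_root(taxon: str) -> List[str]:
    """Return the lineage from the taxon up to the root, inclusive."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def lowest_common_ancestor(taxa: Iterable[str]) -> str:
    """Return the deepest node that is an ancestor of (or equal to) every taxon."""
    lineages = [path_to_root(t) for t in taxa]
    if not lineages:
        raise ValueError("a read needs at least one matched taxon")
    shared = set(lineages[0])
    for lineage in lineages[1:]:
        shared &= set(lineage)
    # Walk the first lineage from leaf to root; the first shared node is the deepest.
    for node in lineages[0]:
        if node in shared:
            return node
    raise ValueError("taxa do not share a root")
```

A read whose matches hit both Escherichia coli and Vibrio cholerae would be placed on Gammaproteobacteria in this toy tree, and adding a match to Pelagibacter ubique would push it up to Proteobacteria. The more diverse the matches, the higher in the tree the read lands.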
  10. Functional Analysis - SEED
     • The tree contains the nodes of the SEED classification
     • Reads are mapped onto the SEED classification
     • www.theSEED.org
  11. Functional Analysis - KEGG
     • KEGG: Kanehisa et al., Nucleic Acids Res. 38, D355-D360 (2010)
     • http://www.genome.jp/kegg/
  12. Comparing Datasets
     • Based on the (normalized) number of reads assigned to each node
     • Each color represents a dataset
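One way to read "(normalized) number of reads" is that per-node counts are rescaled so that datasets of different sizes become comparable. The sketch below assumes scaling every dataset to the size of the smallest one; the talk does not state which normalization MEGAN applies, and the function names and toy counts are hypothetical.

```python
from typing import Dict


def normalize_counts(counts: Dict[str, int], target_total: int) -> Dict[str, float]:
    """Rescale per-node read counts so that they sum to target_total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("dataset has no assigned reads")
    return {node: n * target_total / total for node, n in counts.items()}


def normalize_all(datasets: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Normalize every dataset to the read total of the smallest dataset (an assumption)."""
    target = min(sum(counts.values()) for counts in datasets.values())
    return {name: normalize_counts(counts, target) for name, counts in datasets.items()}


# Toy counts per taxonomy node; not data from the talk.
example = {
    "small": {"Bacteria": 10, "Archaea": 30},
    "large": {"Bacteria": 50, "Archaea": 150},
}
normalized = normalize_all(example)
```

In this toy input both datasets have the same relative composition, so after normalization their per-node values coincide even though the raw counts differ by a factor of five.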
  13. DB Extension - PostgreSQL
     • MEGAN communicates with a PostgreSQL database
     • Many datasets are available in one database instance
     • Many users can operate on the same database instance
     • This avoids redundancy on often large datasets
     • http://www.postgresql.org/
  15. What is Metadata?
     Metadata are, for example, environmental parameters recorded together with the actual metagenomic sample, e.g. collection date, gender, health status, ...

                      Month     Salinity   Ammonia
     January_2PM      January   33.3       0.0
     January_10PM     January   34.2       0.0
     August_4AM       August    33.3       0.14
     August_10AM      August    32.1       0.06

     Datasets taken from: The taxonomic and functional diversity of microbes at a temperate coastal site: a 'multi-omic' study of the seasonal and diel temporal variation; Gilbert et al. (2010)
  16. Month ∈ {Dec, Jan, Feb}: January_2PM, January_10PM → Winter
     Month ∈ {Jun, Jul, Aug}: August_4AM, August_10AM → Summer
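The season rule above amounts to evaluating a boolean predicate on each dataset's metadata and grouping the datasets that satisfy it. A minimal sketch, using the four metadata rows listed above; the dictionary layout and the `pool` helper are assumptions made for illustration and do not reflect how MEGAN stores metadata.

```python
from typing import Callable, Dict, List, Union

Row = Dict[str, Union[str, float]]

# Metadata values as listed on the slide.
METADATA: Dict[str, Row] = {
    "January_2PM": {"Month": "January", "Salinity": 33.3, "Ammonia": 0.0},
    "January_10PM": {"Month": "January", "Salinity": 34.2, "Ammonia": 0.0},
    "August_4AM": {"Month": "August", "Salinity": 33.3, "Ammonia": 0.14},
    "August_10AM": {"Month": "August", "Salinity": 32.1, "Ammonia": 0.06},
}

WINTER = {"December", "January", "February"}
SUMMER = {"June", "July", "August"}


def pool(metadata: Dict[str, Row], predicate: Callable[[Row], bool]) -> List[str]:
    """Return the sorted names of all datasets whose metadata satisfy the predicate."""
    return sorted(name for name, row in metadata.items() if predicate(row))


pools = {
    "Winter": pool(METADATA, lambda row: row["Month"] in WINTER),
    "Summer": pool(METADATA, lambda row: row["Month"] in SUMMER),
}

# Predicates can combine several parameters with boolean operators.
summer_high_ammonia = pool(
    METADATA, lambda row: row["Month"] in SUMMER and row["Ammonia"] > 0.1
)
```

Here `pools["Winter"]` contains the two January datasets and `pools["Summer"]` the two August datasets, while the combined predicate selects only August_4AM.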
  18. Basic Idea
     • Create two new datasets (winter, summer) from the four BLAST files
     • Problems:
       • Doubles space consumption
       • Is time inefficient
     • Idea: Use database technology to avoid redundancy, save time and space
  19. Primary & Combined Datasets in the Database
     • A primary dataset is a dataset created from the original BLAST output and the reads file
     • A combined dataset is created from primary datasets
     • A combined dataset is created by using:
       • References to read and match data of the primary datasets
       • Optionally also the classification data of the primary datasets
     • Hence, a combined dataset can be created time and space efficiently
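The reference-based design can be illustrated with a small relational sketch. Everything here is hypothetical: the table and column names are invented for the example, the reads are toy data, and an in-memory SQLite database stands in for PostgreSQL so that the snippet runs without a server. The point is only that a combined dataset stores one membership row per primary dataset instead of copying reads.

```python
import sqlite3
from typing import Dict, Iterable


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with three toy primary datasets."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dataset (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE reads (
            id INTEGER PRIMARY KEY,
            dataset_id INTEGER NOT NULL REFERENCES dataset(id),
            taxon TEXT NOT NULL
        );
        CREATE TABLE combined_member (
            combined_name TEXT NOT NULL,
            dataset_id INTEGER NOT NULL REFERENCES dataset(id),
            PRIMARY KEY (combined_name, dataset_id)
        );
        """
    )
    toy_reads = {
        "January_2PM": ["Bacteria", "Bacteria", "Archaea"],
        "January_10PM": ["Bacteria", "Archaea"],
        "August_4AM": ["Bacteria"],
    }
    for name, taxa in toy_reads.items():
        cur = conn.execute("INSERT INTO dataset (name) VALUES (?)", (name,))
        conn.executemany(
            "INSERT INTO reads (dataset_id, taxon) VALUES (?, ?)",
            [(cur.lastrowid, taxon) for taxon in taxa],
        )
    return conn


def create_combined(conn: sqlite3.Connection, combined_name: str,
                    primary_names: Iterable[str]) -> None:
    """Register a combined dataset by reference only; no read rows are copied."""
    for name in primary_names:
        conn.execute(
            "INSERT INTO combined_member (combined_name, dataset_id) "
            "SELECT ?, id FROM dataset WHERE name = ?",
            (combined_name, name),
        )


def taxon_counts(conn: sqlite3.Connection, combined_name: str) -> Dict[str, int]:
    """Count reads per taxon across all primary datasets behind a combined dataset."""
    rows = conn.execute(
        "SELECT r.taxon, COUNT(*) FROM combined_member AS m "
        "JOIN reads AS r ON r.dataset_id = m.dataset_id "
        "WHERE m.combined_name = ? GROUP BY r.taxon",
        (combined_name,),
    ).fetchall()
    return dict(rows)
```

Calling `create_combined(conn, "Winter", ["January_2PM", "January_10PM"])` adds two small rows to `combined_member`, and `taxon_counts(conn, "Winter")` then resolves the reads through the join at query time. The `reads` table itself is never duplicated.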
  20. Creating Combined Datasets in MEGAN
  26. Analysis
     • Input: 8 primary datasets. Altogether ~100,000 reads, ~4 million matches, ~4.5 GB space
     • It takes ~50 minutes to load these datasets into the database
     • Three combined datasets (winter, spring, summer) are created
     • Their creation takes ~30 seconds and needs ~40 MB additional space
     • Alternatively, combined datasets can be created on-the-fly. This takes less than a second and needs no additional space
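The figures on the Analysis slide can be put in proportion with a few lines of arithmetic. The inputs below are the approximate values quoted on the slide; treating 1 GB as 1000 MB is an assumption, and the variable names are illustrative.

```python
# Approximate figures quoted on the Analysis slide.
primary_space_mb = 4.5 * 1000   # ~4.5 GB, assuming 1 GB = 1000 MB
combined_space_mb = 40          # additional space for three combined datasets
load_seconds = 50 * 60          # ~50 minutes to load the primary datasets
combine_seconds = 30            # ~30 seconds to create the combined datasets
reads = 100_000
matches = 4_000_000

space_overhead = combined_space_mb / primary_space_mb
time_ratio = combine_seconds / load_seconds
matches_per_read = matches / reads

print(f"extra space: {space_overhead:.1%} of the primary data")
print(f"creation time: {time_ratio:.0%} of the load time")
print(f"average matches per read: {matches_per_read:.0f}")
```

Under these assumptions the stored combined datasets add a little under 1% to the space used by the primary data and take about 1% of the original load time, with roughly 40 matches per read on average.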
  27. Comparing all Datasets
  28. Comparing by Season
  32. Summary & Conclusion
     • MEGAN communicates with a PostgreSQL database
       • This gives the user access to many datasets
       • Many users can work on the database simultaneously
     • Primary datasets can be pooled to create combined datasets
     • The MetaData Analyzer allows one to create combined datasets by applying boolean expressions to the assigned metadata
     • This technique is highly space and time efficient
  33. MEGAN v4 is freely available from www-ab.informatik.uni-tuebingen.de/software/megan
     • Integrative analysis of environmental sequences using MEGAN4; Daniel H. Huson, Suparna Mitra, Hans-Joachim Ruscheweyh, Nico Weber, Stephan C. Schuster; submitted 2011
     • Thanks go to Daniel Huson, Suparna Mitra, Nico Weber, Stephan Schuster
     • Thank you for your attention!