Reconstructing paleoenvironments using metagenomics


Lecture on metagenomics research at the Naturalis Biodiversity Center, based on the use case of sequencing the metagenome of mammoth stomach contents.

Published in: Education

  1. Reconstructing paleoenvironments using metagenomics. Rutger Vos
  2. Outline
     • About Naturalis Biodiversity Center
     • NBC's facilities and expertise
        • Ancient DNA lab
        • Barcode lab
        • Informatics focus group
     • Metagenomics and paleoecology
     • Use case: the mammoth's last meal
     • NGS@Naturalis
     • Pipeline development
     Metagenomics approaches and data analysis 6 February 2013
  3. Naturalis Biodiversity Center
     • With 37 million specimens, NBC holds one of the largest natural history collections in the world
     • More than just a museum, NBC is an expert center specializing in:
        • Species identification
        • Trait harvesting
        • Impact modeling
        • Ecological intensification
  4. Ancient DNA lab
     • The ancient-DNA facility is equipped for recovering DNA from plant and animal material from museum collections and fossils.
     • It permits research that would otherwise not be possible, such as the study of ancient populations and museum material.
     • The ancient-DNA lab provides an environment where the risk of contamination with contemporary DNA is minimal.
     • The facility, a collaboration of IBL, the faculty of archeology and NBC, is unique in the Netherlands.
  5. Barcode lab
  6. Informatics focus group
     • Exploitation of HPC resources
     • Dissemination of best practices
     • In-house development of research-supporting tools:
        • NGS data processing
        • Clustering, BLASTing
        • Custom pipelines
        • Visualization
        • Image analysis
        • Niche modeling
  7. HPC infrastructure
     • Dell T7500 and T7600 workstations
     • 2 × Intel® Xeon® processor (quad-core, 2.40 GHz)
     • 128 GB RAM
     • NVIDIA Tesla GPU
     • RedHat/Ubuntu Linux
     • Always looking for extra number-crunching power, e.g. from NBIC Galaxy, CIPRES, BioPortal, etc.
  8. Paleoenvironments
     • Reconstructing the paleoenvironment is useful for:
        • Understanding the dynamics of ecosystem change
        • Reconstructing pre-industrialization ecosystems
     • Many public policy decision-makers have pointed to the importance of using paleoecological studies as a basis for choices made in conservation ecology
  9. Metagenomics
     • Taxonomic identification is one of the main challenges surrounding metagenomics, and one of NBC’s core strengths
     • Conversely, a better understanding of the metagenome feeds back into our other research interests and areas of expertise
     • Consequently, there is a lot of research activity and ongoing capacity building
  10. Use case: the woolly mammoth's dietary metagenome
  11. The research programme
      • To test hypotheses about the structure of the ancient environment of the woolly mammoth, i.e. whether it was productive, continuous grassland steppe or sparsely covered herb tundra
      • Finding frozen mammoths with forensically identifiable food, parasites, and microorganisms in their gastrointestinal tracts or feces has the potential to add data to the extinction debate
      • To integrate the findings from ancient DNA with those obtained from macro- and micro-fossils
  12. Permafrost-preserved mammoth remains (figure labels: Lyuba, Cape Blossom, Yukagir)
  13. "Lyuba"
      • Discovered in May 2007
      • One-month-old mammoth calf
      • Age: 41,910 ± 550 YBP
      • Very well-nourished, milk-fed
  14. The Yukagir mammoth
      • Male woolly mammoth
      • Discovered in 2002
      • Very well preserved in the permafrost
      • Age: 18,560 ± 50 YBP
      • Remains include the head, front legs, and parts of the stomach and intestinal tract
      • Last meal still preserved
  15. The Cape Blossom mammoth dung
      • Produced during the cold season
      • Found among a partial skeleton
      • Exact site unknown
      • Age: 12,300 YBP
  16. DNA extraction and sequencing
      • In all studies, macro-fossils (stems, leaves, seeds), micro-fossils (pollen) and ancient DNA were compared
      • DNA was extracted in the ancient DNA facility using multiple extraction protocols
      • Several commonly used markers were amplified (trnL, rbcL, nrITS1)
      • Sanger sequencing was done on an ABI 3730xl
  17. Data analysis
      • Sequences were assembled using Sequencher
      • Taxa were assigned using a combination of GenBank BLAST searches and phylogenetic inference
      • BLAST hits were only accepted if they covered the full query sequence and differed by at most 1 nucleotide
      • Phylogenetic placement was determined on the basis of bootstrap support (1000 replicates using PAUP*)
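
The BLAST acceptance rule on slide 17 can be sketched roughly as follows. This is an illustration only, not the code used in the study. It assumes hits are available as BLAST+ tabular output with the custom columns `qseqid sseqid qlen qstart qend mismatch gaps` (a layout chosen for this sketch), and it reads "differed by at most 1 nucleotide" as mismatches plus gap positions, which is also an assumption. The function names are hypothetical.

```python
import csv
import io

# Assumed column layout for this sketch, e.g. produced with
#   -outfmt "6 qseqid sseqid qlen qstart qend mismatch gaps"
COLUMNS = ["qseqid", "sseqid", "qlen", "qstart", "qend", "mismatch", "gaps"]


def accept_hit(hit, max_diff=1):
    """Return True if the hit spans the whole query and differs by <= max_diff positions."""
    qlen, qstart, qend = int(hit["qlen"]), int(hit["qstart"]), int(hit["qend"])
    covers_query = qstart == 1 and qend == qlen
    # Assumption: a "difference" is a mismatch or a gap position.
    differences = int(hit["mismatch"]) + int(hit["gaps"])
    return covers_query and differences <= max_diff


def filter_hits(handle, max_diff=1):
    """Read tab-separated hits from a file-like object and keep the accepted ones."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t")
    return [row for row in reader if accept_hit(row, max_diff)]
```

A hit that starts at query position 5, or that carries two mismatches, would be rejected under this reading.
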
  18. Findings
      • Ancient DNA allowed 7 ("Lyuba"), 12 ("dung") and 8 ("Yukagir") plant families to be assigned, with several determinations down to genus level
      • Molecules complemented and confirmed fossils
      • The identified vegetation composition is generally supportive of a productive "mammoth steppe"
      • Micro-fossils of specific dung fungi indicate that mammoths, unlike elephants, appear to have been habitually coprophagous
  19. Next generation applications
      • The results of the mammoth research so far have been obtained using Sanger sequencing
      • Similar, as yet unpublished, research is being undertaken with the newly acquired IonTorrent "sequencing by synthesis" platform
      Image caption: Marcel Eurlings at Naturalis
  20. IonTorrent chip generations
  21. IonTorrent data pre-processing workflow
      • Filter out short reads
      • Splice out low phred scores
      • Split by primer sequence
      • Split by adapter sequence
      • FASTA for downstream analysis
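
The pre-processing steps on slide 21 might look roughly like the sketch below. It is not the Naturalis workflow itself. The thresholds, the function names, and the reading of "splice out low phred scores" as cutting the read at the first low-quality base are all assumptions made for illustration. Only the primer split is shown; reads that match no primer are discarded here, and an adapter split would follow the same pattern.

```python
from collections import defaultdict


def decode_phred(quality_string, offset=33):
    """Convert an ASCII-encoded FASTQ quality string into integer phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def trim_low_quality(seq, quals, min_phred=20):
    """Cut the read at the first base whose phred score is below min_phred."""
    for i, q in enumerate(quals):
        if q < min_phred:
            return seq[:i]
    return seq


def preprocess(reads, primers, min_length=50, min_phred=20):
    """Filter, trim and bin reads by primer.

    reads:   iterable of (read_id, sequence, quality_string) tuples
    primers: dict mapping a primer name to its sequence
    Returns a dict mapping primer name to a list of (read_id, trimmed_sequence).
    """
    bins = defaultdict(list)
    for read_id, seq, qual in reads:
        if len(seq) < min_length:          # step 1: drop short reads
            continue
        seq = trim_low_quality(seq, decode_phred(qual), min_phred)  # step 2
        for name, primer in primers.items():                        # step 3
            if seq.startswith(primer):
                bins[name].append((read_id, seq[len(primer):]))
                break
    return dict(bins)


def to_fasta(records):
    """Render (read_id, sequence) pairs as FASTA text for downstream analysis."""
    return "".join(">{}\n{}\n".format(rid, s) for rid, s in records)
```

Each primer bin can then be written to its own FASTA file with `to_fasta`.
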
  22. Taxonomic identification pipeline
      • Taxonomic identification of the contents of samples is a generic problem for which we have developed a re-usable pipeline
      • It replicates some of the functionality of QIIME but integrates more conveniently with our HPC configuration
      • Requirements:
         • Python 2.7 or 3.2
         • Biopython 1.58
         • NCBI-Blast-2.2.25+
         • Clustering programs, e.g. TGICL, Usearch, Octupus, cd-hit
  23. Pipeline steps
      • Optional: tag FASTA for provenance retracing across files
      • Cluster sequences into OCTUs of at least 10 reads
      • Pick exemplar sequence (random, consensus or hybrid)
      • BLAST exemplar sequences (local or remote)
      • Optional: retrace provenance
      • Report
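
Two of the steps on slide 23, the minimum cluster size and the exemplar choice, can be illustrated with a short sketch. It assumes the clustering itself has already been done by an external program and that each cluster arrives as a list of sequences. The function names are hypothetical, the consensus variant assumes the sequences are already aligned to equal length, and the "hybrid" option is not attempted because the slide does not define it.

```python
import random
from collections import Counter


def keep_large_clusters(clusters, min_reads=10):
    """Keep only clusters that contain at least min_reads sequences."""
    return {cid: members for cid, members in clusters.items()
            if len(members) >= min_reads}


def random_exemplar(members, rng=random):
    """Pick one member of the cluster at random as its exemplar."""
    return rng.choice(members)


def consensus_exemplar(members):
    """Build a majority-rule consensus, one column at a time.

    Assumes all sequences are aligned to the same length.
    """
    if len({len(s) for s in members}) != 1:
        raise ValueError("consensus requires aligned sequences of equal length")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*members))
```

The exemplar of each retained cluster would then be passed on to the BLAST step.
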
  24. Pipeline extensions
      • NBC frequently deals with samples that may contain materials from endangered species, for example:
         • Putative FSC wood
         • Traditional Chinese medicine
         • Incense
      • We are therefore extending the taxonomic identification pipeline to check automatically whether any taxa from the sample are listed in CITES appendices
      • This, however, poses additional challenges of taxonomic name reconciliation
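
The name reconciliation problem on slide 24 can be made concrete with a small sketch. It is not the pipeline extension itself. It assumes the CITES listings and a synonym table are supplied by the caller as plain dictionaries (hypothetical inputs), and it only handles species-level names by reducing them to a lower-case "genus species" key. Listings at genus level or above would need extra logic.

```python
def normalize(name):
    """Reduce a taxon name to a lower-case 'genus species' key, dropping authorship."""
    parts = name.replace("_", " ").split()
    return " ".join(parts[:2]).lower()


def flag_listed_taxa(sample_taxa, listed, synonyms=None):
    """Return {sample name: appendix} for sample taxa found in the listing.

    sample_taxa: iterable of names as reported by the identification step
    listed:      dict mapping an accepted species name to its appendix label
    synonyms:    optional dict mapping a synonym to its accepted name
    """
    synonym_keys = {normalize(k): normalize(v)
                    for k, v in (synonyms or {}).items()}
    listed_keys = {normalize(k): v for k, v in listed.items()}
    flagged = {}
    for name in sample_taxa:
        key = normalize(name)
        key = synonym_keys.get(key, key)   # map synonyms onto accepted names
        if key in listed_keys:
            flagged[name] = listed_keys[key]
    return flagged
```

Without the normalization step, a name with an author citation or an underscore would silently fail to match its listing.
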
  25. Other metagenomics work
      • Phylogenies from metagenomic sequence data can grow to immense sizes
      • For example, the GreenGenes 16S rRNA tree has ~400k tips
      • We are developing novel algorithms for pruning these trees using Google’s MapReduce programming model
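
To give a feel for how tree pruning can be phrased in map/reduce terms, here is a toy, single-process sketch. It is not the algorithm mentioned on slide 25, whose details the talk does not give. The tree is assumed to be stored as a child-to-parent dictionary, all names are hypothetical, and unary internal nodes left behind by pruning are not collapsed.

```python
from collections import defaultdict


def map_tip(tip, parent):
    """Map step: emit (node, tip) for the tip and every ancestor up to the root."""
    node = tip
    while node is not None:
        yield node, tip
        node = parent.get(node)


def reduce_nodes(pairs):
    """Reduce step: group the emitted pairs by node, giving the kept tips below it."""
    below = defaultdict(set)
    for node, tip in pairs:
        below[node].add(tip)
    return below


def prune(parent, keep_tips):
    """Return the child -> parent map of the subtree spanning keep_tips."""
    pairs = (pair for tip in keep_tips for pair in map_tip(tip, parent))
    retained = reduce_nodes(pairs)
    # The root has no parent entry and therefore does not appear as a key.
    return {node: parent[node] for node in retained
            if parent.get(node) is not None}
```

Because each tip is mapped independently, the map step could in principle be spread across workers, with the reduce step merging their output per node.
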
  26. Acknowledgements
      • I am grateful to:
         • Dr. Barbara Gravendeel for her input in developing this talk
         • Youri Lammers for his great work in developing a well-documented taxonomic identification pipeline
      • And to NBIC for giving me the opportunity to present this story
  27. Thank you for listening