This document provides an overview of analyzing 16S rRNA gene amplicon sequencing data using Qiime2. It begins with an introduction to metabarcoding and using 16S rRNA for microbial community profiling. It then demonstrates the bioinformatics analysis workflow in Qiime2, including quality control, clustering sequences into OTUs, assigning taxonomy, and generating an OTU table with metadata. The document is intended as a training source to familiarize users with common analysis steps and biases or assumptions in 16S amplicon analysis.
4. Metabarcoding
“Gene sequences, most commonly those encoding rRNAs, provide a basis for estimating microbial phylogenetic diversity and generating taxonomic inventories of [...] microbial populations.”
– Sogin et al. 2006
13. A training source: USEARCH
USEARCH is a monolithic package written by Robert Edgar, an independent bioinformatics researcher.
It can be used to see, step by step, how reads are manipulated to make an annotated OTU table.
Binary download: http://drive5.com/
14. A training source
Sample_1.fastq, Sample_2.fastq, Sample_3.fastq
→ merged.fq
→ filtered.fa
→ uniques.fa
→ OTUS.fa
→ OTUS_tax.fa
http://drive5.com/usearch/manual/pipe_examples.html
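To make the step from filtered.fa to uniques.fa concrete, here is a minimal Python sketch of dereplication, that is, collapsing identical reads and counting how often each occurs. It is an illustration only, not USEARCH's implementation: the function names `read_fasta` and `dereplicate`, the toy reads, and the `Uniq1;size=N` labels (modelled on USEARCH-style size annotations) are all assumptions made for this example.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (label, sequence) pairs from a FASTA-formatted text handle."""
    label, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        else:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def dereplicate(records):
    """Collapse identical sequences into (sequence, count), most abundant first."""
    counts = Counter(seq.upper() for _, seq in records)
    # Sort by decreasing count, then alphabetically so ties are deterministic
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy input standing in for filtered.fa (made-up reads, for illustration only)
toy = StringIO(">r1\nACGT\n>r2\nACGT\n>r3\nTTGA\n")
for i, (seq, n) in enumerate(dereplicate(read_fasta(toy)), start=1):
    print(f">Uniq{i};size={n}\n{seq}")
```

Keeping the abundance with each unique sequence matters because the clustering step that follows typically uses it to decide which sequences seed the OTUs.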
22. Qiime is…
… an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data.
• It’s a suite of wrappers built around third-party tools
• The wrapping is consistent and makes it easier to perform the whole analysis
• …but hiding technicalities can lead to misinterpretations
23. Metadata file (was Mapping File)
A major problem in bioinformatics workflows is access to metadata
24. Qiime2 is…
• a complete rewrite of Qiime 1
• addresses the most common problems users found (is this good?)
• introduces the new concept of artifacts (input/output packages with metadata)
• improves the workflow
• It’s the recommended version: http://4ngs.com/go/hV
25. Qiime2 artifacts
qza: Qiime2 archive
It’s the output format of all Qiime2 programs. It’s a ZIP file with both data and metadata.
qzv: Qiime2 visualization
It’s the output format for plots, charts and tables that the user may want to inspect. It’s an HTML document (web page) embedded in a “ZIP with metadata”, like qza.
http://view.qiime2.org
27. Qiime2 command: first example
imported_reads.qza
$ unzip -t demux-paired-end.qza
Archive: demux-paired-end.qza
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/VERSION OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/metadata.yaml OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/provenance/VERSION OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/provenance/metadata.yaml OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/provenance/citations.bib OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/provenance/action/action.yaml OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/data/64_S64_L001_R2_001.fastq.gz OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/data/34_S34_L001_R1_001.fastq.gz OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/data/60_S60_L001_R2_001.fastq.gz OK
testing: e2150a41-6d7c-4e13-99ee-36f57ab1f2fb/data/19_S19_L001_R1_001.fastq.gz OK
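Because an artifact is an ordinary ZIP file, the same inspection can be sketched with Python's standard `zipfile` module. The example below assumes only the layout visible in the listing above (a top-level identifier directory containing `metadata.yaml`, `provenance/` and `data/`). The helper names `top_level_metadata` and `data_files` are hypothetical, and the in-memory archive with its `toy-uuid` directory and placeholder contents is made up so the sketch can run without a real .qza file.

```python
import io
import zipfile


def top_level_metadata(qza):
    """Return the text of <uuid>/metadata.yaml from a .qza or .qzv archive."""
    with zipfile.ZipFile(qza) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Match only <uuid>/metadata.yaml, not provenance/metadata.yaml
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                return archive.read(name).decode("utf-8")
    raise ValueError("no <uuid>/metadata.yaml entry found")


def data_files(qza):
    """List the payload files stored under <uuid>/data/."""
    with zipfile.ZipFile(qza) as archive:
        found = []
        for name in archive.namelist():
            parts = name.split("/")
            # parts[-1] is empty for directory entries, which are skipped
            if len(parts) >= 3 and parts[1] == "data" and parts[-1]:
                found.append("/".join(parts[2:]))
        return sorted(found)


# Toy archive built in memory; names and contents are placeholders,
# not the contents of a real Qiime2 artifact.
toy = io.BytesIO()
with zipfile.ZipFile(toy, "w") as archive:
    archive.writestr("toy-uuid/VERSION", "placeholder\n")
    archive.writestr("toy-uuid/metadata.yaml", "uuid: toy-uuid\n")
    archive.writestr("toy-uuid/provenance/metadata.yaml", "uuid: toy-uuid\n")
    archive.writestr("toy-uuid/data/sample_R1.fastq.gz", b"")

print(top_level_metadata(toy))
print(data_files(toy))
```

With a real artifact, the functions would be called with a file path such as `demux-paired-end.qza` instead of the in-memory buffer.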
34. Beta diversity
Measures the change or difference in composition across samples.
A nice example of the importance of metadata for meaningful outputs!
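As a rough illustration of "difference in composition across samples", the sketch below computes the Bray-Curtis dissimilarity, one commonly used beta diversity metric (the slides do not say which metric is used, so this choice is an assumption). The OTU counts, sample names and the `gut`/`soil` labels are made up, and `bray_curtis` is a hypothetical helper rather than a Qiime2 function.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors over the same OTUs.

    0.0 means identical composition, 1.0 means no shared OTUs.
    """
    if len(a) != len(b):
        raise ValueError("both samples must cover the same OTUs")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Toy OTU table (rows: samples, columns: OTU counts) and toy metadata
otu_table = {"S1": [10, 5, 0], "S2": [8, 6, 1], "S3": [0, 2, 12]}
metadata = {"S1": "gut", "S2": "gut", "S3": "soil"}

for s, t in combinations(otu_table, 2):
    d = bray_curtis(otu_table[s], otu_table[t])
    # The metadata labels are what make the distances interpretable
    print(f"{s} ({metadata[s]}) vs {t} ({metadata[t]}): {d:.2f}")
```

Without the metadata labels the output is just a list of numbers between sample names; with them, it becomes possible to ask whether samples from the same group are more similar to each other than to the rest.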