Introduction to 16S Microbiome Analysis
BCBB Bioinformatics Seminars
February 25, 2015
Andrew Oler, PhD
High-throughput Sequencing Bioinformatics Specialist
BCBB/OCICB/NIAID/NIH
Bioinformatics & Computational Biology
Branch (BCBB)
Biocomputing Research Consulting and
Scientific Software Development
High
Throughput
Illustration
Animation
http://www.niaid.nih.gov/about/organization/odoffices/omo/ocicb/Pages/bcbb.aspx
ScienceApps@niaid.nih.gov
Outline
§ Background
• Microbiome
• 16S rRNA
§ Basic analysis workflow
§ Mothur MiSeq tutorial
The Microbiome
“The ecological community of
commensal, symbiotic, and
pathogenic microorganisms that
literally share our body space”
(Lederberg and McCray 2001)
Microbiomics: A growing field
Slide modified from J. Wan
Human Microbiome
§  The human body contains approximately 10x as many
microbes as human cells, including bacteria, archaea,
fungi, and viruses (about 10^14 vs 10^13).
§  “Metagenome” or our “other genome”
•  Includes all genes from bacteria, etc.
•  About 10,000 microbial species
§  First introduction occurs at birth
§  Microbes provide enzymes for digestion and other
compounds such as vitamins
Berg, Trends in Microbiology, 1996
http://www.nih.gov/news/health/jun2012/nhgri-13.htm
Huse et al., PLoS ONE, 2012
Human Microbiome Project
§  Funded by NIH Common Fund,
FY2007-2015
§  “Develop tools and datasets for the
research community for studying
the role of these microbes in human
health and disease.”
§  Phase I (2007-2012)
•  Composition and Diversity of
microbial communities
•  Sequencing 3000 reference
genomes
§  Phase II (2013-2015)
•  Integrated analysis of host and
microbiome in human health and
disease
§  Primary focus on bacterial
microbiome
§  “Your mouth is connected to your rectum.”
•  Pat Schloss
http://commonfund.nih.gov/hmp/index
Microbiome Analysis
§  Identifying microbial populations in various body tissues
and how changes in these populations correlate to various
disease states
§  Techniques
•  Whole genome shotgun (WGS) sequencing
–  Sampling all genes of all organisms in a sample
–  Goal is to determine functional groups of genes
•  16S rRNA metagenomic sequencing
–  Targeted amplicon sequencing of all 16S rRNA genes in a
population of microbes
–  Goal is to determine taxonomic distribution of microbial species
•  Microbial metatranscriptomics
–  RNA-seq of all organisms in a population
Carriage of microbial taxa varies while metabolic
pathways remain stable within a healthy population
C Huttenhower et al. Nature 486, 207-214 (2012) doi:10.1038/nature11234
WGS vs. 16S
16S rRNA Variable Regions
Slide modified from J. Wan
V1–V3
V3–V5
V6–V9
§  Part of the 30S subunit of the
prokaryotic ribosome
§  Widely conserved (bacteria, archaea)
§  9 hypervariable regions, flanked by
conserved sequences
Illumina: Advantages and Challenges
Metric            454            Illumina MiSeq        Illumina HiSeq
Reads / run       1 million      25 million            300 million – 2 billion
Max read length   400 – 700 bp   2 x 300 bp (paired)   2 x 150 bp (paired)
Error rate        moderate       low                   low
Cost / Mb         $7 – $22       ~$0.50                $0.04 – $0.074
•  Long reads offer greater taxonomic information (454, MiSeq)
•  Low error rates produce more accurate data (MiSeq, HiSeq)
•  Current tools aren’t designed to cluster and classify non-overlapping
paired ends (MiSeq, HiSeq)
•  Higher sequencing depth offers greater sensitivity for detection
http://www.illumina.com/systems/sequencing.ilmn
http://nextgenseek.com/2012/08/comparing-price-and-tech-specs-of-illumina-miseq-ion-torrent-pgm-454-gs-junior-and-pacbio-rs/
Slide modified from J. Wan
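Whether a read pair is "non-overlapping" follows from simple arithmetic on read length and amplicon length. The sketch below is only an illustration: the read lengths are the maximums from the table, while the amplicon lengths (250 bp and 550 bp) and the function name are assumptions chosen for the example, not values from the slides.

```python
def paired_end_overlap(read_length: int, amplicon_length: int) -> int:
    """Number of amplicon bases covered by both reads of a pair.

    A result <= 0 means the two reads do not overlap (the value is the
    size of the uncovered gap, as a negative number).
    """
    # A read cannot cover more bases than the amplicon contains.
    effective = min(read_length, amplicon_length)
    return 2 * effective - amplicon_length


# Amplicon lengths here are illustrative assumptions.
examples = [
    ("MiSeq 2 x 300, 250 bp amplicon", 300, 250),
    ("MiSeq 2 x 300, 550 bp amplicon", 300, 550),
    ("HiSeq 2 x 150, 550 bp amplicon", 150, 550),
]
for label, read_len, amplicon_len in examples:
    overlap = paired_end_overlap(read_len, amplicon_len)
    status = "overlapping" if overlap > 0 else "non-overlapping"
    print(f"{label}: {overlap} bp ({status})")
```

Under these assumed lengths, only the third case leaves a gap between the two reads, which is the situation the bullet above warns about.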
16S rRNA Sequencing
Sample → DNA Extraction → Genomic DNA → PCR Amplification → 16S Amplicons → Next-Gen Sequencing → Sequence Data
TGGGGAATATTGGACAATGGGGGG
AACCCTGATCCAGCCATGCCGCGT
GTGTGAAGAAGGCCTTATGGTTGT
AATGGGGAATATTGCACAATGGGC
GAAAGCCTGATGCAGCGACGCCGC
GTGAGGGATGGAGGCCTTCGGGTT
GTAAATAATGGGGAATATTGCACA
ATGGGCGAAAGCCTGATGCAGCGA
Slide modified from J. Wan
How are 16S sequence data analyzed?
§  Usually interested in taxa, not genotypes
§  Sequences can be grouped into taxa by:
•  Traditional taxonomic classification (phylotypes)
•  Phylogenetic tree
•  Operational taxonomic units (OTU)
§  Operational taxonomic units (OTUs) are used to
represent groups of related organisms
§  OTUs at 3% sequence difference are used as a
proxy for species-level diversity
Slide modified from J. Wan
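The 3% rule can be illustrated with a toy example. This is a minimal sketch under several assumptions: the sequences are already aligned to equal length, distance is the plain fraction of mismatched positions, and clustering is a simple greedy seed-based pass (not the linkage methods used by Mothur or QIIME). All function names and sequences are hypothetical.

```python
def fraction_different(a: str, b: str) -> float:
    """Fraction of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def greedy_otus(seqs, cutoff=0.03):
    """Put each sequence in the first OTU whose seed is within `cutoff`."""
    seeds, otus = [], []
    for seq in seqs:
        for i, seed in enumerate(seeds):
            if fraction_different(seq, seed) <= cutoff:
                otus[i].append(seq)
                break
        else:
            # No existing seed is close enough: start a new OTU.
            seeds.append(seq)
            otus.append([seq])
    return otus


def mutate(seq: str, positions) -> str:
    """Change the base at each listed position (toy helper for the demo)."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


base = "ACGT" * 25                        # 100 bp toy "16S" sequence
near = mutate(base, [10, 50])             # 2% different from base
far = mutate(base, [0, 20, 40, 60, 80])   # 5% different from base

otus = greedy_otus([base, near, far])
print([len(otu) for otu in otus])  # [2, 1]
```

The 2% variant joins the OTU seeded by `base`, while the 5% variant starts its own OTU, which is the sense in which a 3% cutoff acts as a proxy for species-level groups.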
Caution!
Contamination
Polymerase error
Primer mismatch
Amplification bias
Chimera formation
Sequencing error
Sample → DNA Extraction → Genomic DNA → PCR Amplification → 16S Amplicons → Next-Gen Sequencing → Sequence Data
Slide modified from J. Wan
Caution!
§  At such high read numbers, errors are inevitable
§  When not accounted for, errors greatly inflate OTU
counts and diversity estimates
•  Hundreds of “species-level” OTUs identified in
30,000 E. coli reads (Huse, Environ Microbiol. 2010)
§  Solutions:
•  Single-linkage pre-cluster step (SLP)
•  Alternatively, model sequencing errors and use
machine learning to remove noise (e.g., DADA)
Slide modified from J. Wan
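The idea behind a pre-cluster step can be sketched in a few lines: rare sequences that differ from a more abundant sequence by only a few bases are assumed to be errors and are merged into it. This is a simplified illustration, not Mothur's implementation; the function names, the `max_diffs` parameter, and the toy reads are all assumptions.

```python
from collections import Counter


def mismatches(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)


def pre_cluster(reads, max_diffs=1):
    """Merge rare sequences into abundant ones within `max_diffs` mismatches."""
    counts = Counter(reads)
    # Most abundant first; ties broken alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    merged = []  # each entry is [representative sequence, total count]
    for seq, n in ordered:
        for entry in merged:
            same_length = len(entry[0]) == len(seq)
            if same_length and mismatches(entry[0], seq) <= max_diffs:
                entry[1] += n
                break
        else:
            merged.append([seq, n])
    return [(seq, n) for seq, n in merged]


# Toy data: one abundant sequence, two single-base errors of it,
# and one clearly different sequence.
reads = (
    ["ACGTACGTAC"] * 8
    + ["ACGTACGTAA"]   # 1 mismatch from the abundant sequence
    + ["TCGTACGTAC"]   # 1 mismatch from the abundant sequence
    + ["GGGTTTCCCA"] * 3
)
print(len(set(reads)), "unique sequences before pre-clustering")
print(pre_cluster(reads))
```

In this toy case four unique sequences collapse to two, which shows how unhandled errors would otherwise be counted as extra OTUs.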
Software/Databases for Microbiome Analysis
§  Mothur (mothur.org) - full 16S analysis suite
§  QIIME (qiime.org) - full 16S analysis suite
§  MG-RAST server (metagenomics.anl.gov) - 16S and WGS
§  CloVR (clovr.org) - 16S and WGS
§  BioBakery (bitbucket.org/biobakery/biobakery)
§  BROAD Microbiome (microbiomeutil.sourceforge.net) - chimera detection,
OTU binning
§  Ribosomal Database Project (RDP; rdp.cme.msu.edu) - 16S and 28S
Fungal
•  RDP Classifier (rdp-classifier.sourceforge.net/)
§  greengenes (greengenes.lbl.gov) - Taxonomy, 16S
§  IMG (img.jgi.doe.gov/imgm_hmp) - DOE Joint Genome Institutes; genome
annotation
§  PATRIC (patricbrc.org) - Pathogens
§  SILVA (arb-silva.de) - 16S, 18S, 28S
§  More tools listed @ HMP DACC: http://www.hmpdacc.org/tools_protocols/tools_protocols.php
Nephele: Microbiome Analysis in the Cloud
Microbiome analysis + Cloud computing = no hassle for installation
and “on demand” analysis platform service
Example Nephele Workflow
QIIME 16S Workflow Diagram
§  PRE-PROCESSING
•  Input formats: SFF; Single-end FASTQ; FASTA, QUAL; Paired-end FASTQ
•  sffinfo (Mothur), validate_mapping_file, split_libraries, denoise_wrapper, inflate_denoiser_output
•  Cleaned FASTA
•  validate_mapping_file, split_libraries, convert_fastaqual_fastq
•  join_paired_ends, make mapping files, validate_mapping_files, split_libraries_fastq, merge_mapping_files
§  CLUSTERING AND CLASSIFICATION
•  1. Closed Reference: pick_closed_reference_otus
•  2. Open Reference: pick_open_reference_otus
•  3. De novo: pick_de_novo_otus
•  BIOM, TREE
§  DIVERSITY ANALYSIS AND PLOTS
•  1. Alpha Diversity, Beta Diversity, PCoA: biom summarize-table, calculate_subsample, core_diversity_analyses
•  2. Resampling PCoA plots: jackknifed_beta_diversity, make_bootstrapped_tree
•  3. Interactive heatmap: make_otu_heatmap
•  4. Differential OTU enrichment: metastats (Mothur), make.shared (Mothur), make.lefse (Mothur); SHARED, LEFSE; Table of OTUs with p-values
•  5. Differential Clade Enrichment: LEfSe (Huttenhower)
§  Functional Enrichment
•  normalize_by_copy_number (PICRUSt), predict_metagenomes (PICRUSt)
§  WGS
Focus Group for Usability Testing
Nephele is currently under development.
We need your feedback to improve features and usability from a user's perspective, i.e., YOU!
Analysis Engine
Data Explorer
Please sign up and gain early access to Nephele (for testing purposes)!
nephele@mail.nih.gov
Mothur
§  “This project seeks to develop a single piece of open-
source, expandable software to fill the bioinformatics
needs of the microbial ecology community.”
§  Documentation:
•  http://www.mothur.org/wiki/Mothur_manual
§  Support:
•  http://www.mothur.org/forum/
§  Tutorials / Protocols
•  http://www.mothur.org/wiki/Analysis_examples
•  http://www.mothur.org/wiki/454_SOP
•  http://www.mothur.org/wiki/MiSeq_SOP
Mothur GUI
Basic Workflow for 16S Analysis
§  1. Remove unwanted reads and sequencing and PCR error
•  quality filtering
•  pre.cluster/SLP
§  2. Identify and remove chimeric sequences
•  UCHIME
§  3. Cluster operational taxonomic units (OTUs)
•  average linkage (UPGMA), complete linkage
§  4. Classify OTUs
•  naïve Bayesian classification (Wang), BLAST
§  5. Diversity Analysis and plots
•  Alpha Diversity, Beta Diversity
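Step 5 can be made concrete with a small sketch. Alpha diversity describes diversity within one sample (here the number of observed OTUs and the Shannon index), and beta diversity compares two samples (here Bray-Curtis dissimilarity). These are common choices rather than the only ones, and the OTU count vectors below are made-up numbers for illustration.

```python
import math


def observed_otus(counts) -> int:
    """Alpha diversity: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts) -> float:
    """Alpha diversity: Shannon index (natural log) of OTU proportions."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b) -> float:
    """Beta diversity: Bray-Curtis dissimilarity between two count vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    return numerator / (sum(a) + sum(b))


# Hypothetical read counts for four OTUs in two samples.
early = [50, 30, 20, 0]
late = [10, 10, 40, 40]

print("Observed OTUs:", observed_otus(early), observed_otus(late))
print("Shannon:", round(shannon(early), 3), round(shannon(late), 3))
print("Bray-Curtis:", bray_curtis(early, late))
```

Both toy samples have the same total read count; with real data, samples are usually subsampled to a common depth before such comparisons.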
Set up Environment
§  Open Terminal
§  cd [drag MiSeq_SOP folder into terminal] [Enter]
§  ls -al
§  export PATH=$PATH:/path/to/Desktop/mothurGUI/mothur (drag folder into terminal)
§  which mothur
§  mothur
§  quit()
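The `export PATH` and `which mothur` steps can also be checked from a script. The sketch below mirrors them with Python's standard library; the directory is the placeholder path from the slide, and the helper name `find_executable` is an assumption for this example.

```python
import os
import shutil


def find_executable(name, extra_dir=None):
    """Look for `name` on PATH, optionally with one extra directory appended."""
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        # Same effect as: export PATH=$PATH:extra_dir (for this lookup only)
        search_path = search_path + os.pathsep + extra_dir
    return shutil.which(name, path=search_path)


# Placeholder directory from the slide; replace with the real folder.
location = find_executable("mothur", extra_dir="/path/to/Desktop/mothurGUI/mothur")
print(location if location else "mothur not found; check the PATH setting")
```

If the lookup returns nothing, the `export PATH` line most likely points at the wrong folder.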
Experimental Design
§  Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD.
(2013): Development of a dual-index sequencing strategy and
curation pipeline for analyzing amplicon sequence data on the
MiSeq Illumina sequencing platform. Applied and Environmental
Microbiology. 79(17):5112-20.
§  Total 362 samples
§  Test dataset: 21 samples
•  Female 3 days 0-9 and days 141-150 (post-weaning)
•  Mock
§  R1 vs. R2 (read1, read2 -- not replicates)
§  Timecourse, Early vs. Late
§  V4 region
§  Already demultiplexed by Illumina MiSeq software (one sample
per file)
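The Early vs. Late comparison amounts to binning samples by day. The sketch below assumes sample names of the form `F3D0` (female 3, day 0) plus `Mock`; that naming pattern is an assumption for this example, while the day ranges come from the bullets above.

```python
import re


def classify_day(day: int) -> str:
    """Map a sampling day to the time-course group used in the design."""
    if 0 <= day <= 9:
        return "Early"
    if 141 <= day <= 150:
        return "Late"
    return "Other"


def group_samples(names):
    """Group sample names into Early, Late, Mock, and Other."""
    groups = {"Early": [], "Late": [], "Mock": [], "Other": []}
    for name in names:
        match = re.fullmatch(r"F(\d+)D(\d+)", name)
        if name == "Mock":
            groups["Mock"].append(name)
        elif match:
            groups[classify_day(int(match.group(2)))].append(name)
        else:
            groups["Other"].append(name)
    return groups


# Assumed sample names, for illustration only.
names = ["F3D0", "F3D9", "F3D141", "F3D150", "Mock"]
for group, members in group_samples(names).items():
    print(group, members)
```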
Mothur-formatted 16S Sequence Databases
§  SILVA
•  Aligned Fasta, width 50,000 bases
•  Used for alignment to make sure reads are in the correct
region
§  Gold (BROAD)
•  Used for Chimera detection with chimera.slayer
§  RDP Classifier training
•  Unaligned Fasta
•  Use with accompanying .taxonomy file for classify.seqs
•  Has mitochondria, chloroplast so you can use for filtering out
junk
§  Greengenes
•  Unaligned Fasta
•  Use with accompanying .taxonomy file for classify.seqs
•  Use for actual classification of sequences and OTUs
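Filtering out "junk" with a classifier that knows mitochondria and chloroplasts can be sketched as a lineage check. This is an illustration only: it assumes semicolon-separated lineage strings with optional confidence values in parentheses, and the read IDs, lineages, and function names are hypothetical.

```python
def is_unwanted(lineage, unwanted=("Chloroplast", "Mitochondria")):
    """True if any rank in the lineage exactly matches an unwanted taxon."""
    ranks = {
        rank.split("(")[0].strip().lower()
        for rank in lineage.split(";")
        if rank.strip()
    }
    return any(taxon.lower() in ranks for taxon in unwanted)


def filter_classified(assignments):
    """Keep only reads whose lineage contains no unwanted taxon."""
    return {
        read_id: lineage
        for read_id, lineage in assignments.items()
        if not is_unwanted(lineage)
    }


# Hypothetical classifications for three reads.
assignments = {
    "read1": "Bacteria(100);Firmicutes(100);Clostridia(100);",
    "read2": "Bacteria(100);Cyanobacteria/Chloroplast(100);Chloroplast(100);",
    "read3": "Bacteria(100);Proteobacteria(99);Alphaproteobacteria(95);",
}
kept = filter_classified(assignments)
print(sorted(kept))  # ['read1', 'read3']
```

Matching on whole rank names, rather than substrings, avoids dropping a read merely because a higher rank label mentions the taxon.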
Tutorials, other tools for today
§  MiSeq initial steps:
•  http://www.mothur.org/wiki/MiSeq_SOP
§  Analysis
•  http://www.mothur.org/wiki/454_SOP
§  Plot phylogenetic tree
•  http://iubio.bio.indiana.edu/treeapp/treeprint-form.html
§  Examples of plots
•  http://qiime.org/tutorials/tutorial.html
Thank You
For questions or comments please contact:
andrew.oler@nih.gov
ScienceApps@niaid.nih.gov
Slides available here
(open in Safari or Internet Explorer):
http://collab.niaid.nih.gov/sites/research/SIG/Bioinformatics/
-> Next Gen Sequencing -> “16S Microbiome Analysis”

Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 

Introduction to 16S Microbiome Analysis

  • 1. BCBB Bioinformatics Seminars February 25, 2015 Andrew Oler, PhD High-throughput Sequencing Bioinformatics Specialist BCBB/OCICB/NIAID/NIH
  • 2. Bioinformatics & Computational Biology Branch (BCBB)
  • 3. Biocomputing Research Consulting and Scientific Software Development High Throughput Illustration Animation http://www.niaid.nih.gov/about/organization/odoffices/omo/ocicb/Pages/bcbb.aspx ScienceApps@niaid.nih.gov
  • 5. The Microbiome “The ecological community of commensal, symbiotic, and pathogenic microorganisms that literally share our body space” (Lederberg and McCray 2001)
  • 6. Microbiomics: A growing field Slide modified from J. Wan
  • 7. Microbiomics: A growing field Slide modified from J. Wan
  • 8. Microbiomics: A growing field Slide modified from J. Wan
  • 9. Human Microbiome § The human body contains approximately 10x as many microbes as human cells, including bacteria, archaea, fungi, and viruses (about 10^14 vs 10^13). § “Metagenome” or our “other genome” • Includes all genes from bacteria, etc. • About 10,000 microbial species § First introduction occurs at birth § Microbes provide enzymes for digestion and other compounds such as vitamins. Sources: Berg, Trends in Microbiology, 1996; http://www.nih.gov/news/health/jun2012/nhgri-13.htm; Huse et al., PLoS ONE, 2012
  • 10. Human Microbiome Project § Funded by NIH Common Fund, FY2007-2015 § “Develop tools and datasets for the research community for studying the role of these microbes in human health and disease.” § Phase I (2007-2012) • Composition and diversity of microbial communities • Sequencing 3000 reference genomes § Phase II (2013-2015) • Integrated analysis of host and microbiome in human health and disease § Primary focus on bacterial microbiome § “Your mouth is connected to your rectum.” :) • Pat Schloss http://commonfund.nih.gov/hmp/index
  • 11. Microbiome Analysis § Identifying microbial populations in various body tissues and how changes in these populations correlate to various disease states § Techniques • Whole genome shotgun (WGS) sequencing – Sampling all genes of all organisms in a sample – Goal is to determine functional groups of genes • 16S rRNA metagenomic sequencing – Targeted amplicon sequencing of all 16S rRNA genes in a population of microbes – Goal is to determine taxonomic distribution of microbial species • Microbial metatranscriptomics – RNA-seq of all organisms in a population
  • 12. WGS vs. 16S: Carriage of microbial taxa varies while metabolic pathways remain stable within a healthy population. C Huttenhower et al. Nature 486, 207-214 (2012) doi:10.1038/nature11234
  • 13. 16S rRNA Variable Regions § Part of the 30S subunit of the prokaryotic ribosome § Widely conserved (bacteria, archaea) § 9 hypervariable regions, flanked by conserved sequences (figure labels: V1-V3, V3-V5, V6-V9) Slide modified from J. Wan
  • 14. Illumina: Advantages and Challenges
                        454             Illumina MiSeq         Illumina HiSeq
      Reads / run       1 million       25 million             300 million – 2 billion
      Max read length   400 – 700 bp    2 x 300 bp (paired)    2 x 150 bp (paired)
      Error rate        moderate        low                    low
      Cost / Mb         $7 – $22        ~$0.50                 $0.04 – $0.074
    • Long reads offer greater taxonomic information (454, MiSeq) • Low error rates produce more accurate data (MiSeq, HiSeq) • Current tools aren’t designed to cluster and classify non-overlapping paired ends (MiSeq, HiSeq) • Higher sequencing depth offers greater sensitivity for detection
    Sources: http://www.illumina.com/systems/sequencing.ilmn http://nextgenseek.com/2012/08/comparing-price-and-tech-specs-of-illumina-miseq-ion-torrent-pgm-454-gs-junior-and-pacbio-rs/ Slide modified from J. Wan
  • 15. 16S rRNA Sequencing: Sample → DNA Extraction → Genomic DNA → PCR Amplification → 16S Amplicons → Next-Gen Sequencing → Sequence Data [figure: example sequence reads] Slide modified from J. Wan
  • 16. How are 16S sequence data analyzed? § Usually interested in taxa, not genotypes § Sequences can be grouped into taxa by: • Traditional taxonomic classification (phylotypes) • Phylogenetic tree • Operational taxonomic units (OTUs) § OTUs are used to represent groups of related organisms § OTUs at 3% sequence difference are used as a proxy for species-level diversity Slide modified from J. Wan
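The 3% OTU idea on this slide can be illustrated with a minimal Python sketch. It assumes the reads are already aligned to the same length and uses a plain mismatch fraction as the distance. For brevity it uses single-linkage grouping, which is a simplification and not the clustering method of mothur or QIIME. The function names `distance` and `cluster_otus` are hypothetical.

```python
from itertools import combinations


def distance(a: str, b: str) -> float:
    """Fraction of mismatched positions between two equal-length aligned reads."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def cluster_otus(seqs: dict[str, str], cutoff: float = 0.03) -> list[set[str]]:
    """Single-linkage grouping: reads within `cutoff` of each other share an OTU."""
    # Union-find structure: every read starts as its own group.
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    # Join the groups of every pair of reads that is close enough.
    for a, b in combinations(seqs, 2):
        if distance(seqs[a], seqs[b]) <= cutoff:
            parent[find(a)] = find(b)

    otus: dict[str, set[str]] = {}
    for name in seqs:
        otus.setdefault(find(name), set()).add(name)
    return list(otus.values())


# Toy reads (made up for illustration): r1 and r2 differ at 2 of 100 positions.
toy_reads = {"r1": "A" * 100, "r2": "A" * 98 + "CC", "r3": "G" * 100}
toy_otus = cluster_otus(toy_reads)
```

With these toy reads, `r1` and `r2` fall into one OTU and `r3` forms its own. Comparing all pairs scales quadratically, so this only suits small illustrations.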
  • 17. Caution! Sources of error along the workflow (Sample → DNA Extraction → Genomic DNA → PCR Amplification → 16S Amplicons → Next-Gen Sequencing → Sequence Data): contamination, polymerase error, primer mismatch, amplification bias, chimera formation, sequencing error [figure: example sequence reads] Slide modified from J. Wan
  • 18. Caution! § At such high read numbers, errors are inevitable § When not accounted for, errors greatly inflate OTU counts and diversity estimates • Hundreds of “species-level” OTUs identified in 30,000 E. coli reads (Huse, Environ Microbiol. 2010) § Solutions: • Single-linkage pre-cluster step (SLP) • Alternatively, model sequencing errors and use machine learning to remove noise (e.g., DADA) Slide modified from J. Wan
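To make the pre-cluster idea concrete, here is a simplified Python sketch in the spirit of the step named on this slide: rare reads that differ from a more abundant read by only a few bases are treated as likely errors and folded into it. This is an assumption-laden illustration, not the SLP or mothur implementation. The reads are assumed to be aligned to equal length, and `hamming`, `pre_cluster`, and `max_diffs` are hypothetical names.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length aligned reads."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def pre_cluster(reads: list[str], max_diffs: int = 2) -> Counter:
    """Fold rare reads into a more abundant read within `max_diffs` mismatches."""
    counts = Counter(reads)
    # Most abundant first; ties broken alphabetically so the result is deterministic.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    merged: Counter = Counter()
    absorbed: set[str] = set()
    for i, seq in enumerate(ordered):
        if seq in absorbed:
            continue
        merged[seq] += counts[seq]
        # Any rarer, not-yet-absorbed read that is close enough joins this one.
        for rare in ordered[i + 1:]:
            if rare not in absorbed and hamming(seq, rare) <= max_diffs:
                merged[seq] += counts[rare]
                absorbed.add(rare)
    return merged


# Toy data (made up): one singleton differs from the dominant read by one base.
toy = ["ACGTACGT"] * 10 + ["ACGTACGA"] + ["TTTTTTTT"] * 3
denoised = pre_cluster(toy)
```

In the toy data the singleton is absorbed into the dominant read, leaving two unique sequences instead of three. That is the kind of reduction that keeps spurious OTUs from inflating diversity estimates.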
  • 19. Software/Databases for Microbiome Analysis § Mothur (mothur.org) - full 16S analysis suite § QIIME (qiime.org) - full 16S analysis suite § MG-RAST server (metagenomics.anl.gov) - 16S and WGS § CloVR (clovr.org) - 16S and WGS § BioBakery (bitbucket.org/biobakery/biobakery) § BROAD Microbiome (microbiomeutil.sourceforge.net) - chimera detection, OTU binning § Ribosomal Database Project (RDP; rdp.cme.msu.edu) - 16S and 28S Fungal • RDP Classifier (rdp-classifier.sourceforge.net/) § Greengenes (greengenes.lbl.gov) - Taxonomy, 16S § IMG (img.jgi.doe.gov/imgm_hmp) - DOE Joint Genome Institute; genome annotation § PATRIC (patricbrc.org) - Pathogens § SILVA (arb-silva.de) - 16S, 18S, 28S § More tools listed @ HMP DACC: http://www.hmpdacc.org/tools_protocols/tools_protocols.php
  • 20. Nephele: Microbiome Analysis in the Cloud. Microbiome analysis + Cloud computing = no hassle for installation and “on demand” analysis platform service
  • 21. Example Nephele Workflow (QIIME 16S Workflow Diagram)
    Inputs: SFF; Single-end FASTQ; FASTA, QUAL; Paired-end FASTQ
    PRE-PROCESSING: sffinfo (Mothur); validate_mapping_file; split_libraries; denoise_wrapper; inflate_denoiser_output; Cleaned FASTA; validate_mapping_file; split_libraries; convert_fastaqual_fastq; join_paired_ends; make mapping files; validate_mapping_files; split_libraries_fastq; merge_mapping_files
    CLUSTERING AND CLASSIFICATION: 1. Closed Reference: pick_closed_reference_otus; 2. Open Reference: pick_open_reference_otus; 3. De novo: pick_de_novo_otus; BIOM; TREE
    DIVERSITY ANALYSIS AND PLOTS: 1. Alpha Diversity, Beta Diversity, PCoA: biom summarize-table, calculate_subsample, core_diversity_analyses; 2. Resampling PCoA plots: jackknifed_beta_diversity, make_bootstrapped_tree; 3. Interactive heatmap: make_otu_heatmap; 4. Differential OTU enrichment: metastats (Mothur), make.shared (Mothur), make.lefse (Mothur); SHARED; LEFSE; 5. Differential Clade Enrichment: LEfSe (Huttenhower); Table of OTUs with p-values
    Functional Enrichment: normalize_by_copy_number (PICRUSt), predict_metagenomes (PICRUSt)
    WGS
  • 22. Focus Group for Usability Testing. Nephele is currently under development. We need your feedback to improve features and usability from a user’s perspective, i.e., YOU! Analysis Engine. Data Explorer. Please sign up and gain early access to Nephele (for testing purposes)! nephele@mail.nih.gov
  • 23. Mothur § “This project seeks to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.” § Documentation: • http://www.mothur.org/wiki/Mothur_manual § Support: • http://www.mothur.org/forum/ § Tutorials / Protocols • http://www.mothur.org/wiki/Analysis_examples • http://www.mothur.org/wiki/454_SOP • http://www.mothur.org/wiki/MiSeq_SOP
  • 25. Basic Workflow for 16S Analysis § 1. Remove unwanted reads and sequencing and PCR errors: quality filtering, pre.cluster/SLP § 2. Identify and remove chimeric sequences: UCHIME § 3. Cluster operational taxonomic units (OTUs): average linkage (UPGMA), complete linkage § 4. Classify OTUs: naïve Bayesian classification (Wang), BLAST § 5. Diversity analysis and plots: alpha diversity, beta diversity
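For the diversity step at the end of this workflow, the sketch below computes one common alpha-diversity measure (the Shannon index, within a sample) and one common beta-diversity measure (Bray-Curtis dissimilarity, between samples) from OTU count vectors. The slides do not say which measures the tutorial uses, so both choices are assumptions, and the function names are hypothetical.

```python
import math


def shannon(counts: list[int]) -> float:
    """Shannon index H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a: list[int], b: list[int]) -> float:
    """Bray-Curtis dissimilarity: 0 for identical samples, 1 for no shared OTUs."""
    if len(a) != len(b):
        raise ValueError("samples must list the same OTUs in the same order")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / total


# Toy OTU tables (made-up counts): each position is one OTU.
early = [10, 10, 0]
late = [0, 10, 10]
alpha_early = shannon(early)
beta = bray_curtis(early, late)
```

Bray-Curtis is sensitive to sequencing depth, so counts are usually subsampled or normalized to a common depth before samples are compared.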
  • 26. Set up Environment § Open Terminal § cd [drag MiSeq_SOP folder into terminal] [Enter] § ls -al § export PATH=$PATH:/path/to/Desktop/mothurGUI/mothur (drag folder into terminal) § which mothur § mothur § quit()
  • 27. Experimental Design § Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. (2013): Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Applied and Environmental Microbiology. 79(17):5112-20. § Total 362 samples § Test dataset: 21 samples • Female 3: days 0-9 and days 141-150 (post-weaning) • Mock § R1 vs. R2 (read1, read2 -- not replicates) § Time course, Early vs. Late § V4 region § Already demultiplexed by Illumina MiSeq software (one sample per file)
  • 28. Mothur-formatted 16S Sequence Databases § SILVA • Aligned Fasta, width 50,000 bases • Used for alignment to make sure reads are in the correct region § Gold (BROAD) • Used for Chimera detection with chimera.slayer § RDP Classifier training • Unaligned Fasta • Use with accompanying .taxonomy file for classify.seqs • Has mitochondria, chloroplast so you can use for filtering out junk § Greengenes • Unaligned Fasta • Use with accompanying .taxonomy file for classify.seqs • Use for actual classification of sequences and OTUs
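The “filtering out junk” point can be sketched as follows: once reads have been classified, assignments whose lineage mentions chloroplast or mitochondria are dropped. The sketch assumes a two-column, tab-separated layout (read name, then a semicolon-separated lineage). That layout is an assumption about the taxonomy file, the example lines are made up, and `remove_lineages` is a hypothetical helper. mothur provides its own `remove.lineage` command for this purpose.

```python
def remove_lineages(
    taxonomy_lines: list[str],
    unwanted: tuple[str, ...] = ("Chloroplast", "Mitochondria"),
) -> list[tuple[str, str]]:
    """Keep only assignments whose lineage mentions none of the unwanted taxa."""
    kept = []
    for line in taxonomy_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed layout: read name, a tab, then the semicolon-separated lineage.
        name, lineage = line.split("\t", 1)
        if not any(taxon.lower() in lineage.lower() for taxon in unwanted):
            kept.append((name, lineage))
    return kept


# Made-up example assignments, for illustration only.
example = [
    "read1\tBacteria;Firmicutes;Clostridia;",
    "read2\tBacteria;Cyanobacteria;Chloroplast;",
    "read3\tBacteria;Proteobacteria;Alphaproteobacteria;Rickettsiales;Mitochondria;",
    "read4\tBacteria;Bacteroidetes;Bacteroidia;",
]
clean = remove_lineages(example)
```

Only `read1` and `read4` survive the filter. The same list of names could then be used to drop the matching reads from the sequence file.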
  • 29. Tutorials, other tools for today § MiSeq initial steps: • http://www.mothur.org/wiki/MiSeq_SOP § Analysis • http://www.mothur.org/wiki/454_SOP § Plot phylogenetic tree • http://iubio.bio.indiana.edu/treeapp/treeprint-form.html § Examples of plots • http://qiime.org/tutorials/tutorial.html
  • 30. Thank You. For questions or comments please contact: andrew.oler@nih.gov ScienceApps@niaid.nih.gov Slides available here (open in Safari or Internet Explorer): http://collab.niaid.nih.gov/sites/research/SIG/Bioinformatics/ -> Next Gen Sequencing -> “16S Microbiome Analysis”