Information for producing phylogenetic/taxonomic libraries of airborne bacteria and fungi. Includes fundamental background information, approaches for sequencing and data analysis, two case studies, and a review of sampling methods
2. General Outline:
- Overview of genetics
- The new world of DNA sequencing
- Molecular methods for identification
- Molecular methods for quantification
- Phylogenetics overview
- Aerosol sampling for molecular analysis
4. Genetics Definitions:
Genome: The complete set of genetic material (DNA) of an organism or a virus.
Gene: A segment of DNA specifying a particular protein or other functional molecule (tRNA or rRNA).
Transcriptome: The complement of mRNAs produced in an organism under a specific set of conditions.
Metagenome: The total genetic complement of all the cells present in a particular environment.
Proteome: The total set of proteins encoded by a genome.
5. Central Dogma of Biology: DNA → RNA → Protein
- Genomic DNA is the blueprint, the set of instructions.
- Messenger RNAs (mRNAs) are the specific, short-lived gene transcripts.
- Proteins perform structural and catalytic functions.
Transcription (DNA → RNA) is a.k.a. "gene expression".
Translation (RNA → protein) occurs in ribosomes: (1) mRNA attaches to the ribosome, (2) polypeptides are produced, (3) polypeptides are folded into proteins.
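The flow from DNA to RNA to protein can be illustrated with a small Python sketch. This is a toy under stated assumptions: the codon table is deliberately partial, the input is treated as the coding strand, and the example sequence is made up for illustration.

```python
# Toy illustration of transcription and translation.
# CODON_TABLE is intentionally partial (an assumption for brevity);
# codons missing from it are reported as "X".
CODON_TABLE = {
    "AUG": "M",                                      # start codon (methionine)
    "UUU": "F", "UUC": "F",                          # phenylalanine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "AAA": "K", "AAG": "K",                          # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",              # stop codons
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGTTTGGCAAATAA")  # made-up example gene
print(mrna)             # AUGUUUGGCAAAUAA
print(translate(mrna))  # MFGK
```

Protein folding, the final step on the slide, is not modeled here.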
9. Cost of DNA Sequencing:
[Figure: cost of DNA sequencing by year on a logarithmic scale, compared with Moore's law]
Traditional method is Sanger sequencing:
-advantage: longer reads (sequences up to 800 bp long)
-disadvantage: slow and costly
Next generation sequencing:
-advantage: low cost and rapid
-disadvantage: sequences are short (75 to 400 bp)
10. Next Generation Sequencing Example (454 Pyrosequencing):
(A) DNA is fragmented into pieces ~500 bp long and made single stranded;
(B) Adaptors are added to the single strands and one strand is attached to one microbead;
(C) PCR is performed and multiple copies of the strand are produced;
[Figure: panels A to C illustrating these steps]
11. Next Generation Sequencing Example (454 Pyrosequencing) Continued:
(D) Beads are placed into wells (1.5 x 10^6 wells per plate);
(E) The second strand is synthesized and the added bases are recorded.
[Figure: panels D and E illustrating these steps]
12. Some DNA Sequencing Options (as of 2012):
Illumina HiSeq technology
-one lane produces ~50 million reads
-reads are ~100 nucleotides long
-cost is ~$2,000 per lane
454 Pyrosequencing
-one gasket produces 150,000 reads
-reads are ~500 nucleotides long
-cost is ~$2,000 per gasket
Lab "personal" sequencers
-Ion Torrent: 60-80 million reads, 200 nt long
-MiSeq: 15 million reads, up to 250 nt long
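To compare the options above on a common footing, total yield and cost per megabase can be worked out from the quoted read counts, read lengths, and prices. The sketch below uses only the approximate 2012 figures from the slide. The function name is illustrative, and the personal sequencers are left out because no cost is quoted for them.

```python
# Approximate 2012 figures quoted on the slide.
platforms = {
    "Illumina HiSeq (one lane)": {
        "reads": 50_000_000, "read_length_nt": 100, "cost_usd": 2000,
    },
    "454 pyrosequencing (one gasket)": {
        "reads": 150_000, "read_length_nt": 500, "cost_usd": 2000,
    },
}


def cost_per_megabase(reads: int, read_length_nt: int, cost_usd: float) -> float:
    """Cost in US dollars per million sequenced bases."""
    total_bases = reads * read_length_nt
    return cost_usd / (total_bases / 1_000_000)


for name, p in platforms.items():
    megabases = p["reads"] * p["read_length_nt"] / 1_000_000
    print(f"{name}: {megabases:,.0f} Mb, ${cost_per_megabase(**p):.2f} per Mb")
```

With these inputs a HiSeq lane yields about 5,000 Mb and a 454 gasket about 75 Mb for the same price, which is the trade-off between read length and depth that the slide describes.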
14. Phylogenetics:
Phylogeny: The evolutionary history of organisms.
Phylogenetics: A framework for identification and quantification of microbial communities.
The great plate count anomaly (see Amann et al. (1995), Microbiol. Rev. v59, p143):
Habitat             Culturability (%)
Seawater            0.001-0.1
Freshwater          0.25
Mesotrophic lake    0.1-1
Estuarine waters    0.1-3
Activated sludge    1-15
Sediments           0.25
Soil                0.3
Air                 ~1
15. 16S rRNA is the Evolutionary Chronometer
- ~1500 nucleotides long
- a structural portion of the ribosome
- present in all bacteria and archaea (eukaryotes carry the homologous 18S rRNA)
- evolved slowly and includes conserved, variable, and hypervariable regions
19. For Identification:
1) Sequences derived from one or many microorganisms in an aerosol sample can be produced:
ACGTATAGGACGATACCATG……………
2) Using a search algorithm, each sequence is matched against a database of rDNA gene sequences from known organisms.
3) Identification is provided at the highest-resolution taxonomic level that can be confidently assigned.
e.g., assignment of E. coli to the genus level would yield:
domain: Bacteria
phylum: Proteobacteria
class: Gammaproteobacteria
order: Enterobacteriales
family: Enterobacteriaceae
genus: Escherichia
20. SSU rRNA Alignment Forms the Tree of Life and a Basis for Identification
rRNA-based Taxonomy: Domain > Phylum > Class > Order > Family > Genus > Species
Pace, 1997, Science v276, p734
22. Why Not Quantify by Culturability?
The great plate count anomaly:
Habitat             Culturability (%)
Seawater            0.001-0.1
Freshwater          0.25
Mesotrophic lake    0.1-1
Estuarine waters    0.1-3
Activated sludge    1-15
Sediments           0.25
Soil                0.3
Air                 ~1
23. Culturing Cannot Capture Fungal Diversity:
[Figure labels: viable spore; dead spore; spore that cannot grow on media; unidentifiable; other fungal fragments]
24. Methods for Quantification:
- Quantitative polymerase chain reaction
- Direct microscopy and staining
- Immuno-based methods and proteomics
25. First: Polymerase Chain Reaction (PCR)
1) Reagents: forward and reverse primers, dNTP mix (A, T, C, G), water and Mg2+, template, DNA polymerase.
2) Thermal cycler: runs a temperature program for denaturation (~95 °C), primer annealing (40-60 °C), and extension (72 °C). Typically 20 to 30 cycles are adequate; don't go above 45 cycles.
PCR performs two functions: (1) it selects a gene or segment of DNA from a background of total extracted DNA, and (2) it makes many copies of the selected DNA (amplicons).
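The "many copies" in point (2) follow from near-doubling of the target each cycle. A minimal sketch of that arithmetic is given below, assuming idealized exponential amplification. The `efficiency` parameter and the starting copy number are illustrative assumptions, and real reactions plateau as reagents are consumed.

```python
def amplicon_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized amplicon count after a number of PCR cycles.

    efficiency = 1.0 means perfect doubling each cycle; lower values model a
    less efficient reaction. Plateau effects are not modeled (an assumption).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Hypothetical starting amount of 100 template copies.
for n_cycles in (20, 30):
    print(n_cycles, f"{amplicon_copies(100, n_cycles):.3g}")
```

Even at perfect efficiency, going from 20 to 30 cycles multiplies the product roughly a thousandfold, which is why a modest number of cycles is usually adequate.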
26. PCR is Confirmed by Gel Electrophoresis:
[Figure: gel image with lanes labeled ladder (1000 bp, 500 bp, and 100 bp markers), - control, sample, and + control]
28. Quantitative PCR (qPCR), a.k.a. Real-Time PCR
(a) PCR reagents include a fluorescent dye whose emission increases as the amplicon number increases each cycle.
(b) Thermal cycler blocks are equipped with fluorometers to detect changes in emission, and thus track amplicon number as cycles progress.
[Figure: relative fluorescence curves; annotation: increase in sample concentration]
29. How is Amplicon Number Converted to Fluorescent Signal?
Method 1: TaqMan®
Method 2: SYBR green
SYBR green is a DNA intercalating agent that fluoresces only when bound to double-stranded DNA. As more amplicons are produced, more SYBR green binds and fluoresces.
30. qPCR Quantification Methods: Calibration
The CT (cycle threshold) value is set in the linear region.
Standards: replicate samples with a known concentration of cells or amplicon targets.
[Figure: amplification curves for standards at 10^1, 10^2, 10^3, 10^4, and 10^5]
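Calibration is commonly done by fitting a straight line of CT against log10 of the standard concentration, then reading unknown samples off that line. The sketch below shows one way to do this in plain Python. The CT values are made up for illustration, and the function names are hypothetical rather than taken from any qPCR software.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of CT = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(ct, slope, intercept):
    """Convert a measured CT back to a target concentration."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


# Standards spanning 10^1 to 10^5 targets; CT values are hypothetical.
standards = [1e1, 1e2, 1e3, 1e4, 1e5]
cts = [33.1, 29.8, 26.4, 23.1, 19.7]
slope, intercept = fit_standard_curve(standards, cts)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"efficiency={amplification_efficiency(slope):.2f}")
print(f"unknown at CT 25.0: {quantify(25.0, slope, intercept):.3g} targets")
```

A slope near -3.32 corresponds to doubling every cycle, so the fitted slope doubles as a quick check on reaction quality.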
35. Methods for Identification
Phylogenetic libraries: a library of all SSU rDNA sequences that exist in an environmental sample.
Microbial diversity methods and tools
36. Phylogenetic Libraries for Bacteria, Fungi, and Viruses:
§ For bacterial libraries: PCR primers typically target variable regions of the 16S rRNA-encoding gene;
§ For fungal libraries: PCR primers typically target the ITS (internal transcribed spacer) region of the ribosomal RNA genes.
37. Scheme for Creating Phylogenetic Libraries:
§ GS-FLX 454 sequencing platform;
§ Primers targeting 16S rDNA regions, creating ~500 base pair long amplicons;
§ Data analysis pipeline called QIIME (Quantitative Insights Into Microbial Ecology).
Workflow: Isolate DNA → Produce amplicons → DNA clean-up → Ampure clean-up → Pool DNA → Send to sequencer
39. Sequence Data Analysis Includes:
§ Sorting sequences into sample bins and trimming primers and adaptors;
§ Producing a phylogenetic placement or identification for each sequence;
§ Determining relative abundances of taxa for each sample (alpha diversity);
§ Using phylogenetics to compare one sample population with other populations (beta diversity).
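The last two analysis steps can be sketched with simple count tables. The example below computes relative abundances and the Shannon index within a sample, and a Bray-Curtis dissimilarity between samples. Note that Bray-Curtis is swapped in here as a simple non-phylogenetic stand-in for the phylogeny-aware comparisons (such as UniFrac) that the slides refer to, and the taxon counts are hypothetical.

```python
import math


def relative_abundance(counts):
    """Convert {taxon: sequence count} to {taxon: fraction of the sample}."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity (natural log) of one sample, an alpha diversity metric."""
    fractions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in fractions if p > 0)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity on relative abundances (0 = identical, 1 = disjoint)."""
    ra, rb = relative_abundance(counts_a), relative_abundance(counts_b)
    taxa = set(ra) | set(rb)
    return 0.5 * sum(abs(ra.get(t, 0.0) - rb.get(t, 0.0)) for t in taxa)


# Hypothetical genus-level counts for two samples.
occupied_room = {"Staphylococcus": 40, "Corynebacterium": 35, "Sphingomonas": 25}
outdoor_air = {"Sphingomonas": 60, "Methylobacterium": 40}

print(relative_abundance(occupied_room))
print(round(shannon_index(occupied_room), 3))
print(round(bray_curtis(occupied_room, outdoor_air), 3))
```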
40. Sorting/Trimming/Denoising:
1) Raw sequencer files are input into software that recognizes the barcodes and sorts sequences into their original sample bins.
2) Primers are recognized, and primers and adaptors are removed.
3) 454 sequencing is susceptible to mistakes due to homopolymers (e.g., AAAAAA). Denoising "fixes" these errors.
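Steps 1 and 2 amount to matching a known barcode and primer at the start of each read. A minimal sketch follows, assuming exact matches only. The barcodes, primer, sample names, and reads are all made up for illustration, and production pipelines typically tolerate a few mismatches and also handle the reverse primer and adaptors.

```python
def demultiplex(reads, barcode_to_sample, forward_primer):
    """Sort reads into sample bins by barcode and strip barcode plus primer.

    Returns (bins, unassigned). Only exact matches at the 5' end are
    recognized, which is a simplifying assumption.
    """
    bins = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        for barcode, sample in barcode_to_sample.items():
            prefix = barcode + forward_primer
            if read.startswith(prefix):
                bins[sample].append(read[len(prefix):])
                break
        else:
            # No barcode/primer combination matched this read.
            unassigned.append(read)
    return bins, unassigned


# Hypothetical barcodes, primer, and reads.
barcodes = {"ACGT": "indoor_air", "TGCA": "outdoor_air"}
primer = "GTGCCAGC"
reads = [
    "ACGTGTGCCAGCTTAGGC",
    "TGCAGTGCCAGCAACCGG",
    "GGGGGTGCCAGCAAAA",
]
bins, unassigned = demultiplex(reads, barcodes, primer)
print(bins)        # {'indoor_air': ['TTAGGC'], 'outdoor_air': ['AACCGG']}
print(unassigned)  # ['GGGGGTGCCAGCAAAA']
```

Denoising of homopolymer errors (step 3) is a separate statistical procedure and is not attempted here.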
41. Phylogenetic Placement or Identification:
1) Sequences derived from one or many microorganisms in an aerosol sample are first produced:
ACGTATAGGACGATACCATG……………
2) Using search algorithms, each sequence is matched against a database of rDNA gene sequences from known organisms.
3) Identification is provided at the highest-resolution taxonomic level that can be confidently assigned.
e.g., assignment of an E. coli sequence to the genus level would yield the result:
domain: Bacteria
phylum: Proteobacteria
class: Gammaproteobacteria
order: Enterobacteriales
family: Enterobacteriaceae
genus: Escherichia
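Step 3 can be pictured as walking down the lineage and stopping at the first rank whose assignment falls below a confidence cutoff. The sketch below assumes a classifier has already produced a (name, confidence) pair per rank. The confidence values, the 0.8 cutoff, and the function name are illustrative assumptions rather than the output or interface of any particular tool.

```python
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def confident_lineage(assignments, min_confidence=0.8):
    """Keep ranks, from domain downward, until confidence drops below the cutoff.

    `assignments` is a list of (name, confidence) pairs ordered from domain
    to species.
    """
    lineage = []
    for rank, (name, confidence) in zip(RANKS, assignments):
        if confidence < min_confidence:
            break
        lineage.append((rank, name))
    return lineage


# Hypothetical per-rank confidences for one sequence.
assignments = [
    ("Bacteria", 1.00),
    ("Proteobacteria", 1.00),
    ("Gammaproteobacteria", 0.99),
    ("Enterobacteriales", 0.98),
    ("Enterobacteriaceae", 0.97),
    ("Escherichia", 0.91),
    ("Escherichia coli", 0.45),  # too uncertain, so reporting stops at genus
]
for rank, name in confident_lineage(assignments):
    print(f"{rank}: {name}")
```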
42. Phylogenetic Placement or Identification:
For Bacteria: Sequences are placed into a MASTER phylogenetic tree (Greengenes tree). They are then identified based on their placement.
Sequences with 97% similarity are generally accepted as belonging to the same species (also called a phylotype or operational taxonomic unit (OTU)).
Pace, 1997, Science v276, p734
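The 97% rule can be illustrated with a toy clustering routine. This sketch is not the tree-placement approach named on the slide: it swaps in a simple greedy, seed-based grouping, and it assumes the sequences are already aligned to equal length. The sequences are synthetic.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def greedy_otus(sequences, threshold=97.0):
    """Group sequences into OTUs; the first member of each OTU acts as its seed.

    Each sequence joins the first OTU whose seed it matches at or above the
    threshold, otherwise it founds a new OTU.
    """
    otus = []
    for seq in sequences:
        for otu in otus:
            if percent_identity(seq, otu[0]) >= threshold:
                otu.append(seq)
                break
        else:
            otus.append([seq])
    return otus


# Synthetic 100-nt sequences: 98% and 90% identical to the seed.
seed = "A" * 100
near = "A" * 98 + "CC"
far = "A" * 90 + "C" * 10
otus = greedy_otus([seed, near, far])
print(len(otus))               # 2
print([len(o) for o in otus])  # [2, 1]
```

Greedy grouping depends on input order, which is one reason established pipelines use more careful procedures.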
43. Phylogenetic Placement or Identification:
For Fungi: Sequences are compared against a database of known fungal ITS sequences using BLAST (Basic Local Alignment Search Tool), and "best matches" are determined.
TGCGGAAGGATCATTACCGAGTGAGGGCCCTCTGGGTCCAACCTCCCACCCGTGTCTATCGTACCTTGTTGCTTCGGCGGGCCCGCCGTTTCGACGGCCGCCGGGGAGGCCTTGCGCCCCCGGGC
CCGCGCCCGCCGAAGACCCCAACATGAACGCTGTTCTGAAAGTATGCAGTCTGAGTTGATTATCGTAATCAGTTAAAACTTTCAACAACGGATCTCTTGGTTCCGGCATCGATGAAGAACGCAGCG
AAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAGTCTTTGAACGCACATTGCGCCCCCTGGTATTCCGGGGGGCATGCCTGTCCGAGCGTCATTGCTGCCCTCAAGCACGGCTT
GTGTGTTGGGCCCCCGTCCCCCTCTCCCGGGGGACGGGCCCGAAAGGCAGCGGCGGCACCGCGTCCGGTCCTCGAGCGTATGGGGCTTTGTCACCTGCTCTGTAGGCCCGGCCGGCGCCAGCCG
ACACCCAACTTTATTTTTCTAAGGTTGACCTCGGATCAGGTAGGGATACCCGCTGAACTTAAGCATATCAATAAGGCGGA
BLAST nucleotide search
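Determining "best matches" can be sketched as picking, for each query, the hit with the highest bit score. The example assumes BLAST results saved in the standard 12-column tabular layout (`-outfmt 6`: query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). The rows and identifiers are made up for illustration.

```python
def best_hits(tabular_text):
    """Return the highest bit-score hit per query from BLAST tabular output."""
    best = {}
    for line in tabular_text.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident, bitscore = float(fields[2]), float(fields[11])
        if query not in best or bitscore > best[query]["bitscore"]:
            best[query] = {"subject": subject, "pident": pident, "bitscore": bitscore}
    return best


# Hypothetical hits for two reads against an ITS reference database.
rows = [
    ["read1", "ITS_ref_A", "99.1", "450", "4", "0", "1", "450", "10", "459", "0.0", "810"],
    ["read1", "ITS_ref_B", "93.0", "440", "31", "2", "1", "440", "5", "444", "1e-150", "540"],
    ["read2", "ITS_ref_C", "97.5", "400", "10", "0", "1", "400", "1", "400", "0.0", "690"],
]
example = "\n".join("\t".join(row) for row in rows)

for query, hit in best_hits(example).items():
    print(query, hit["subject"], hit["pident"])
```

Tools such as FHiTINGS automate this selection and add taxonomy assignment on top of it.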
44. Case Study #1: occupied vs. vacant
What are the origins of the material that is associated with human occupancy?
[Figure labels: shedding; resuspension]
45. Case Study #1: Rarefaction Curves, the First Step in Alpha Diversity Analysis:
Hospodsky D, Qian J, Nazaroff WW, Yamamoto N, et al. (2012) Human Occupancy as a Source of Indoor Airborne Bacteria. PLoS ONE 7(4): e34867. doi:10.1371/journal.pone.0034867
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0034867
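A rarefaction curve plots the number of distinct taxa observed against the number of sequences subsampled, and flattens once sampling depth is sufficient. The sketch below shows the idea with repeated random subsampling. The OTU labels and depths are hypothetical, and it does not reproduce the analysis in the cited study.

```python
import random


def rarefaction_curve(otu_labels, depths, trials=100, seed=0):
    """Mean number of distinct OTUs observed at each subsampling depth.

    `otu_labels` holds one OTU label per sequence. Sampling is without
    replacement, repeated `trials` times per depth.
    """
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        if depth > len(otu_labels):
            raise ValueError("depth exceeds number of sequences")
        observed = [len(set(rng.sample(otu_labels, depth))) for _ in range(trials)]
        curve.append((depth, sum(observed) / trials))
    return curve


# Hypothetical sample: 100 sequences spread unevenly over five OTUs.
sample = ["otu1"] * 50 + ["otu2"] * 30 + ["otu3"] * 15 + ["otu4"] * 4 + ["otu5"] * 1
for depth, mean_otus in rarefaction_curve(sample, [1, 10, 25, 50, 100]):
    print(depth, round(mean_otus, 2))
```

Comparing samples at a common depth avoids mistaking deeper sequencing for greater diversity.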
46. Case Study #1: Relative Abundances of Bacterial Taxa:
Source: Hospodsky et al. (2012), PLoS ONE 7(4): e34867.
47. Case Study #1: Beta Diversity, Comparing Aerosol Populations with Potential Source Populations:
Source: Hospodsky et al. (2012), PLoS ONE 7(4): e34867.
48. Case Study #2: Microbial Ecology of Public Restroom Surfaces
49. Case Study #2: Taxonomic Composition of Public Restroom Surfaces:
Flores GE, Bates ST, Knights D, Lauber CL, et al. (2011) Microbial Biogeography of Public Restroom Surfaces. PLoS ONE 6(11): e28132. doi:10.1371/journal.pone.0028132
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0028132
50. Case Study #2: Beta Diversity, Comparison Among Different Surface Samples
Source: Flores et al. (2011), PLoS ONE 6(11): e28132.
51. Case Study #2: Beta Diversity, SourceTracker Program in QIIME
Source: Flores et al. (2011), PLoS ONE 6(11): e28132.
56. Sampler Characteristics: Impactors
Sampler type: Cascade impactors
Mechanism: The sampling air stream makes a sharp bend and particles are stripped based on their aerodynamic diameter.
Typical models:
-Andersen Cascade Impactor;
-MOUDI cascade impactor;
-BGI 900 L/min high volume cascade impactor.
Sampling rate: Typically 10 to 28 L/min. Some samplers allow for >500 L/min.
Size resolved sampling: Provides the best size distribution information. Different models offer between 1 and 12 stages for collecting aerosols with aerodynamic diameters from 10 nm to >18 µm.
Viability: Only at 28 L/min collection rates, and requires direct sampling onto agar plates.
Sample suitable for molecular methods: Stages can be covered with filters, membranes, or plates, and samples can then be extracted from these materials. The panel did not recommend use of foam as a sampling medium due to the low efficiencies associated with cell and DNA extraction.
Advantages:
-Best ability to define particle size distributions;
-Models available to perform culturing.
Disadvantages:
-High cost per sampler, especially for high volume samplers;
-Sampling inefficiencies due to particle bounce;
-Not sensitive, as total sampled mass is divided among multiple stages.
57. Common Impactors:
- Andersen multistage impactor
- Micro-Orifice Uniform-Deposit Impactor (MOUDI)
- BGI High Vol Impactor
58. Available Sampler Characteristics: Impingement
Sampler type: Liquid impingement
Mechanism: Sampled air is passed through a small opening and captured into a liquid medium.
Typical models:
-SKC swirl impingers;
-Omni 3000 high volume impinger.
Sampling rate: 14 L/min for glass impingers; new high volume models are capable of >100 liters per minute.
Size resolved sampling: Very limited information on the size ranges that are collected. Efficiency drops in low volume glass impingers below aerodynamic diameters of 1 µm. High volume samplers have not been characterized for sampling efficiencies as a function of particle size.
Viability: Impingers are flexible, since organisms are impinged into liquid media or buffer and can be used for culturing or molecular analysis.
Sample suitable for molecular methods: Samples are impinged into 10 to 20 ml of liquid, which may require concentration by filtration.
Advantages:
-Sample is collected into liquid and does not require extraction from a solid collection medium;
-Low cost of low flow glass impingers.
Disadvantages:
-Limited information on efficiencies and the particle sizes that are sampled;
-High volume impingers are high cost;
-Glass impingers suffer from low sampling rate and limited sampling times due to evaporation;
-High volume impingers have complex systems for collecting the sample and rewetting surfaces, and there is large concern about effectively decontaminating the equipment.
60. Aerosol Sampler Characteristics: Filtration
Sampler type: Filtration
Mechanism: Aerosols are captured on filters by impaction or diffusional forces.
Typical models:
-Andersen high volume PM samplers;
-SKC IMPACT samplers.
Sampling rate: Ranges from 4 L/min up to 1,000 L/min.
Size resolved sampling: Filtration samplers typically have size selective inlets that allow for sampling the 10 µm and below (PM10) and 2.5 µm and below (PM2.5) size fractions. Because of high diffusional forces, filters are efficient at sampling sizes down to the 20 nm range of viruses and microbial fragments.
Viability: Not recommended for viability due to high stresses from impaction and desiccation.
Sample suitable for molecular methods: Requires extraction from filter material, often Teflon or polycarbonate membranes, quartz fiber filters, or gelatin filters.
Advantages:
-High sampling rates available;
-Most common and robust form of high volume sampling;
-Very small particles can be sampled; most efficient way to sample viruses;
-Can be used as personal samplers;
-Low cost compared to impingers and impactors;
-Preferred method for sampling PM for regulatory compliance.
Disadvantages:
-No possibility for viable determination;
-High volume samplers are not suitable for sampling in most occupied environments;
-Limited ability to produce particle size distributions.
63. Tools for Sequence Analysis:
Some useful basic tools for getting started with bacterial and fungal phylogenetic analysis:
RDP Pyrosequencing pipeline: Easy-to-use pipeline for viewing histograms of raw sequences and sorting data based on barcodes. http://pyro.cme.msu.edu/
UniFrac: Beta diversity measurements, including PCoA plots of microbial populations. http://bmf2.colorado.edu/fastunifrac/
FHiTINGS: Automatically selects the best BLAST hit for fungal identification, assigns taxonomy, and parses data into tables. http://sourceforge.net/projects/fhitings/
All-in-one toolboxes that contain a variety of programs for complete sequence analysis:
QIIME: Quantitative Insights Into Microbial Ecology: http://qiime.sourceforge.net/
VAMPS: Visualization and Analysis for Microbial Population Structure: http://vamps.mbl.edu/index.php
MOTHUR: http://www.mothur.org/
64. To learn more:
Procedures for phylogenetic sequencing using Illumina-based DNA sequencing: Caporaso et al. (2012) "Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms." ISME J 6: 1621-1624.
Reviews on aerosol science and molecular biology: Peccia et al. (2011) "New Directions: A revolution in DNA sequencing …", Atm. Environ., 45: 1896-1897, and Peccia, J., Hernandez, M. (2006) "Incorporating Polymerase chain reaction-based identification …", Atm. Environ., 40: 3941-3961.
Good fungal aerosol next gen sequencing paper: Adams et al. (2013) Dispersal in microbes: fungi in indoor air are dominated by outdoor air and show dispersal limitation at short distances. ISME J. doi.org/10.1038/ismej.2013.28
Good viral aerosol/qPCR paper: Yang et al. (2011) "Concentrations and size distributions of airborne influenza A viruses measured indoors at a health centre…" Journal of the Royal Society Interface, 8, 1176-1184.
Brock Biology of Microorganisms (11th edition or higher): easy-to-understand textbook that covers microbial genetics and phylogenetics.