Annotating 18S rDNA sequences from environmental molecular surveys
1. Annotating 18S rDNA sequences from environmental molecular surveys
Ramon Massana
EukRef Workshop, Vancouver, Canada, 21 July 2015
2. Microorganisms in the marine plankton
1977 - Bacteria are abundant (Hobbie et al. 1977. Appl. Environ. Microbiol.)
1980 - Bacteria are very active (Fuhrman & Azam 1980. Appl. Environ. Microbiol.)
1982 - Bacteria are controlled by predation by small flagellates (Fenchel 1982. Mar. Ecol. Progr. Ser.)
The microbial loop
Phytoplankton
Half of Earth's primary production occurs in the sea (Field et al. 1998. Science)
Marine primary producers are essentially planktonic microorganisms
At the base of food webs
Classical food web Microbial loop
Azam et al. 1983. Mar. Ecol. Progr. Ser.
Chlorophyll concentration by SeaWiFS
September 1997 – August 2000
3. A simple question: Who are the smallest marine protists?
Morphology Cultures
Molecular
Picoplankton (0.2-2 µm) Nanoplankton (2-20 µm) Microplankton (20-200 µm)
1,000-10,000 cells ml⁻¹ in seawater
4. The 18S rDNA gene as a phylogenetic marker
Advantages
Universal in all living beings
Mosaic of conserved and variable regions
Highly expressed (DNA and RNA approaches)
Limitations
Variable rDNA operon copy number
Different evolutionary rate among lineages
Primers used affect the results
5. Novel lineages were very apparent in molecular surveys of marine picoeukaryotes
Mainly heterotrophs
Mainly phototrophs
Novel alveolates
Novel stramenopiles
Blanes Bay
Díez et al. 2001
Moon-van der Staay et al. 2001
Antarctica
Mediterranean Sea
North Atlantic
Equatorial Pacific
Massana et al. 2004
7. Identifying MAST cells
MAST-1B MAST-1C
10 µm
MAST-4
Massana et al. 2006
MAST-4 MAST-1C MAST-1B
Important bacterial grazers in the sea!
FISH
Abundant and widely distributed!
Small (2-5 µm) free-living unpigmented protists
8. Objectives
Evaluate the consistency of the groups described so far
Explore the existence of other groups
Describe substructure within the groups
9. MAST criteria
Belong to the “basal heterotrophic” stramenopiles
Do not belong to any described group in this region
13. Tips in making the final reference dataset
1 – Download as many sequences as possible (and then discard)
2 – Rely on Sanger sequencing
Very clear perception of quality
Multiple sequencing reactions from the same amplicon => complete 18S rDNA sequences
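The "download, then discard" step can be sketched as a first-pass filter on length and ambiguous bases. This is only an illustration: the thresholds (`min_length`, `max_ambiguous_fraction`) and the function names are hypothetical choices, not values given in the talk, and they would need tuning for each group.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {h: "".join(parts) for h, parts in records.items()}


def keep_sequence(seq, min_length=1200, max_ambiguous_fraction=0.01):
    """Return True if a sequence passes both (hypothetical) thresholds."""
    if len(seq) < min_length:
        return False
    # Anything other than A, C, G or T is counted as ambiguous.
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return ambiguous / len(seq) <= max_ambiguous_fraction


def filter_records(records, **thresholds):
    """Keep only the records whose sequence passes keep_sequence()."""
    return {h: s for h, s in records.items() if keep_sequence(s, **thresholds)}
```

Sequences that fail such a filter are candidates for discarding, not proof of poor quality; a nearly complete 18S rDNA sequence with a few ambiguities may still be worth a manual look.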
14. Tips in making the final reference dataset
1 – Download as many sequences as possible (and then discard)
2 – Rely on Sanger sequencing
[Figure: two plots with a y-axis from 0 to 50 and x-axes from 1 to 401 and from 1 to 451]
HTS
Faster, easier, cheaper and more
Automatic pipelines
But…
No chromatograms
Apply quality criteria AND check!!
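One possible quality criterion for HTS reads is a minimum mean Phred score combined with a minimum read length. The sketch below assumes four-line FASTQ records with Phred+33 encoding; the cutoffs and function names are hypothetical and are not the criteria used in the talk.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    if not quality_string:
        return 0.0
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)


def parse_fastq(text):
    """Yield (name, sequence, quality) from four-line FASTQ records."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:], lines[i + 1], lines[i + 3]


def filter_reads(text, min_mean_quality=25, min_length=200):
    """Keep reads that pass both (hypothetical) cutoffs."""
    kept = []
    for name, seq, qual in parse_fastq(text):
        if len(seq) >= min_length and mean_phred(qual) >= min_mean_quality:
            kept.append((name, seq))
    return kept
```

Passing an automatic filter is not the same as checking: reads that survive still deserve inspection in an alignment or a tree.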
15. Tips in making the final reference dataset
1 – Download as many sequences as possible (and then discard)
2 – Rely on Sanger sequencing
3 – Make phylogenies
Verify the annotation of sequences to given groups
Be suspicious of long “orphan” branches (check each one individually with BLAST)
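Long terminal branches can be listed automatically before they are checked by hand. The sketch below reads tip branch lengths from a Newick string with a regular expression and flags tips whose branch is several times longer than the median. The `fold` cutoff is a hypothetical value, and the parser assumes simple unquoted tip names.

```python
import re
from statistics import median

# A tip is a name:length pair that directly follows "(" or ",".
LEAF_PATTERN = re.compile(r"(?<=[(,])\s*([^(),:;\s]+):([0-9.eE+-]+)")


def terminal_branch_lengths(newick):
    """Return {tip name: terminal branch length} from a Newick string."""
    return {name: float(length) for name, length in LEAF_PATTERN.findall(newick)}


def long_branch_tips(newick, fold=5.0):
    """Return tips whose branch exceeds `fold` times the median tip branch."""
    lengths = terminal_branch_lengths(newick)
    if not lengths:
        return []
    cutoff = fold * median(lengths.values())
    return sorted(name for name, length in lengths.items() if length > cutoff)


tree = "((A:0.01,B:0.02):0.05,(C:0.015,D:0.40):0.03);"
print(long_branch_tips(tree))
```

A flagged tip is only a candidate: it may be a chimera, a low-quality sequence, or a genuinely divergent lineage, which is why each one still needs an individual BLAST check.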
16. Tips in making the final reference dataset
1 – Download as many sequences as possible (and then discard)
2 – Rely on Sanger sequencing
3 – Make phylogenies
4 – Be aware that even Sanger can have sequencing errors
Ends need to be correctly trimmed
In databases you do not see the chromatograms
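Without chromatograms, ambiguous base calls are one of the few visible signs of poor read ends. As a rough proxy (an assumption of this sketch, not a method from the talk), the ends can be trimmed until a window of fully unambiguous bases is reached. The `window` size is a hypothetical parameter.

```python
def trim_ambiguous_ends(seq, window=10):
    """Trim both ends until a window of `window` unambiguous bases is found."""
    seq = seq.upper()

    def first_clean_start(s):
        # Index of the first window made only of A, C, G, T, or len(s) if none.
        for i in range(len(s) - window + 1):
            if all(base in "ACGT" for base in s[i:i + window]):
                return i
        return len(s)

    start = first_clean_start(seq)
    if start == len(seq):
        return ""  # no clean window anywhere in the sequence
    end_offset = first_clean_start(seq[::-1])
    return seq[start:len(seq) - end_offset]


print(trim_ambiguous_ends("NNACNGTACGTACGTACGTNAN", window=4))
```

This only removes ends that are visibly bad. Miscalled bases that look like valid A, C, G or T pass through untouched, so the trimmed sequence should still be inspected in an alignment.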
17. Tips in making the final reference dataset
1 – Download as many sequences as possible (and then discard)
2 – Rely on Sanger sequencing
3 – Make phylogenies
4 – Be aware that even Sanger can have sequencing errors
5 – Chimeras do occur, can be very frequent, and can escape detection algorithms
18. Stramenopile diversity. The importance of quality and chimera checks
Chimeras appearing within MAST-4
Backbone tree
Seed to align 454 reads
19. Stramenopile diversity. The importance of quality and chimera checks
Chimeras appearing within MAST-4
chimera.slayer (Mothur): removes 81 chimeras
Manual checking: removes 6 chimeras
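The idea behind a manual chimera check can be illustrated by comparing the two halves of a sequence against a set of trusted references: if the 5' half and the 3' half match different references best, the sequence is suspect. The sketch below is a toy version based on shared k-mers. It is not the algorithm used by chimera.slayer, and the function names, the midpoint split and the value of `k` are all assumptions made for illustration.

```python
def kmers(seq, k=8):
    """Return the set of k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_reference(fragment, references, k=8):
    """Return (name, score) of the reference sharing most k-mers with fragment."""
    frag_kmers = kmers(fragment, k)
    best_name, best_score = None, -1.0
    for name, ref_seq in references.items():
        shared = len(frag_kmers & kmers(ref_seq, k))
        score = shared / len(frag_kmers) if frag_kmers else 0.0
        if score > best_score:  # ties keep the first reference seen
            best_name, best_score = name, score
    return best_name, best_score


def looks_chimeric(query, references, k=8):
    """Flag a query whose two halves have different best-matching references."""
    midpoint = len(query) // 2
    left_parent, _ = best_reference(query[:midpoint], references, k)
    right_parent, _ = best_reference(query[midpoint:], references, k)
    return left_parent != right_parent, left_parent, right_parent
```

A real breakpoint is rarely at the midpoint, and parents from closely related lineages may share most k-mers, so a check like this misses many chimeras. That is consistent with the point of the slides: detection algorithms can be escaped, and manual inspection still finds chimeras afterwards.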
20. Tips in making the final reference dataset
1 – Download as many sequences as possible (and then discard)
2 – Rely on Sanger sequencing
3 – Make phylogenies
4 – Be aware that even Sanger can have sequencing errors
5 – Chimeras do occur, can be very frequent, and can escape detection algorithms
Since the 18S rDNA is such a conserved gene,
be very conservative in accepting a new phylotype!!