454/Illumina Marker Gene Studies (rRNA)
Presentation about high-throughput marker gene studies at the UC Davis Bits & Bites lunchtime discussion group (4/19/2012)


Usage Rights

© All Rights Reserved

  • These primers are HIGHLY CONSERVED. We use 18S because there is no other option for broad taxonomic coverage – universal COI primers don’t work for meiofauna
  • 1) Looking at diversity and species assemblages (relative abundances)
    2) Cosmopolitan or regionally restricted taxa? Using phylogeographic patterns to infer global biogeography for microbial eukaryote taxa
    3) Work in the Gulf of Mexico following the Deepwater Horizon oil spill
  • High-throughput sequencing has revolutionized environmental studies, but BLAST has limits (no match/environmental)
  • These patterns were consistent regardless of CLUSTERING cutoff or 18S LOCUS
  • Read patterns were consistent and replicable across PCRs
  • We can see individual specimens, so we know how sequence data relates to an ... Concerted evolution is an incomplete process, and we have to deal with intragenomic variation across these copies. OTUs are arbitrary units, and one cutoff is not likely to be universally applicable across all taxa (vs. microbial protocols, where 97% = a species)
  • The second challenge is assigning accurate taxonomy to OTUs that we do define (assuming not everything is a species)
  • Chimeras typically show up as low-copy. A rare biosphere certainly does exist, and it can be ecologically important
  • Head-tail patterns may help us to delimit species and separate out rare taxa (which will have head-tail patterns) from errors (no apparent pattern)
  • Marker genes across all domains – bacteria, archaea, eukaryotes & viruses: rRNA genes, protein-coding orthologs, lineage-specific gene families
  • For a deeper discussion of some of the things I’ve touched on, I’ll refer you to our recent review in TREE
  • So with that I’d just like to thank my current and former lab members and collaborators. And I’ll take any questions.

454/Illumina Marker Gene Studies (rRNA) Presentation Transcript

  • High-throughput environmental marker gene studies
    Holly Bik, @Dr_Bik
    Photo Credit: J. Baldwin, M. Mundo-Ocampo
  • High-throughput biodiversity research
    • Oceanic sediments (covering >70% of the earth’s surface) harbor the vast majority of the world’s biodiversity
    • Microscopic eukaryotes (e.g. nematode worms, protists, fungi) are diverse and abundant in these environments
    • The taxonomy and functional role of these species (likely to be significant in marine ecosystems) is not understood
    • Informed mitigation and remediation REQUIRE prior knowledge of biodiversity!
  • -Omic Dictionary
    • Marker gene studies – amplification of a conserved homologous gene (18S, 16S rRNA) from environmental samples
    • Metagenomics – shotgun sequencing of random genomic fragments from environmental DNA
    • Metatranscriptomics – expressed mRNA transcripts from environmental samples
  • [Workflow diagram: Diverse marine community; Extract environmental DNA (EASY); Amplify rRNA (EASY); High-throughput sequencing (EASY); Community analysis (VERY difficult!!)]
  • Amplification of 18S rRNA
    Primer pairs: F04/R22 (Region 1, 456 bp); NF1/18Sr2b (Region 2, ~400 bp)
    Base conservation across Metazoa / Nematodes (% identity values listed in slide order)
    SSU_F04  5’-GCTTGTCTCAAAGATTAAGCCC-3’
             % identity: 99 98 98 98 98 98 98 98 99 99 99 100 100 99 99 99 99 99 99 99 99 98 100
    SSU_R22  5’-GCCTGCTGCCTTCCTTGGA-3’
             % identity: 100 100 100 100 100 100 100 100 100 100 98 100 100 100 90 100 90 100 100 100
    NF1      5’-GGTGGTGCATGGCCGTTCTTAGTT-3’
             % identity: 99 100 100 100 99 100 100 100 100 100 99 97 100 99 99 100 100 98 100 100 98 98 100 100 100
    18Sr2b   5’-TACAAAGGGCAGGGACGTAAT-3’
             % identity: 100 88 88 88 88 88 88 88 100 98 98 100 99 99 100 100 100 99 100 99 100 100
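To make the primer slide concrete, here is a minimal sketch of locating an amplicon between the F04/R22 primers by exact string matching. It assumes SSU_R22 is the reverse primer (so it binds as its reverse complement on the forward strand); the `amplicon` helper and the toy template are hypothetical illustrations, not real 18S sequence or the author's method. Real in silico PCR would also allow mismatches.

```python
# Minimal in silico amplification sketch (exact matches only).
# Primer sequences are from the slide; the template is a made-up toy.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the region from the forward primer through the reverse
    primer's binding site (inclusive), or None if either is missing."""
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


SSU_F04 = "GCTTGTCTCAAAGATTAAGCCC"
SSU_R22 = "GCCTGCTGCCTTCCTTGGA"

# Hypothetical template: flanks + forward primer + insert + reverse site.
toy_insert = "ACGTACGTAC"
toy_template = "TT" + SSU_F04 + toy_insert + reverse_complement(SSU_R22) + "GG"

product = amplicon(toy_template, SSU_F04, SSU_R22)
print(len(product), product)
```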
  • Key Questions
    1) How diverse are marine communities of microscopic eukaryotes?
    2) How structured are these communities in marine sediments?
    3) What has been the effect of anthropogenic disturbance on these communities?
  • Environmental Taxonomy (18S rRNA)
    Deep sea and shallow water marine sediment
    1.2 million reads, 454 GS FLX Titanium
    Bik et al. (2012), Molecular Ecology
  • Diverse Communities
  • Deep sea vs. Shallow communities
    [Figure: two PCoA plots of sediment samples (ShallowGulf, ShallowCali, Atlantic and Pacific sites), one at 95% clustering (2000 OCTUs) and one at 99% clustering (20,000 OCTUs); axes PC1 to PC3 each explain between 10.54% and 14.46% of the variation]
    **Same grouping patterns were observed using Region 2 of the 18S gene
    Bik et al. 2012, Molecular Ecology
  • OTUs clustered at 95% identity
    Bik et al. 2012, Molecular Ecology
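For readers unfamiliar with PCoA plots like the ones on this slide, the following is a rough sketch of classical principal coordinates analysis on a small OTU count table. Everything here is an assumption for illustration: the counts are invented, Bray-Curtis is swapped in as the distance measure (the slide does not say which metric was used), and NumPy is assumed to be available.

```python
import numpy as np


def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between rows (samples)."""
    n = counts.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            diff = np.abs(counts[i] - counts[j]).sum()
            dist[i, j] = diff / (counts[i] + counts[j]).sum()
    return dist


def pcoa(dist, n_axes=3):
    """Classical PCoA: double-center squared distances, then eigendecompose."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0, None)        # drop negative eigenvalues
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained


# Invented counts: rows are samples, columns are OTUs.
samples = ["shallow_1", "shallow_2", "deep_1", "deep_2"]
counts = np.array([
    [10, 8, 0, 0],
    [9, 9, 1, 0],
    [0, 1, 10, 9],
    [0, 0, 8, 10],
], dtype=float)

dist = bray_curtis(counts)
coords, explained = pcoa(dist)
for name, row in zip(samples, coords):
    print(name, np.round(row, 3))
print("fraction explained per axis:", np.round(explained, 3))
```

In this toy table the two shallow samples land close together and far from the deep ones, which is the kind of grouping the slide describes.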
  • Introduction of Bias
    • Sampling design (replicates, temporal, gear)
    • Preservation and Extraction methods
    • Primer bias (marker gene studies)
    • PCR bias (template composition, inhibitors)
    • Sequencing bias (depth of sequencing, platform specific considerations)
  • Variation in Read Number
    [Figure: bar chart of No. of Reads (0 to 400) per nematode species across replicates]
    Artificial control community – 1 individual per nematode species
    Porazinska et al. 2009, Molecular Ecology Resources
  • OTUs as ‘Clouds’
    99% cutoff
    97% cutoff
    How to correlate OTUs with biological species?
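The effect of the cutoff can be sketched with a toy greedy clustering routine. This is only an illustration under strong assumptions: reads are invented, all have the same length, and identity is a simple per-position match fraction with no alignment. Real OTU pickers align sequences and use more careful centroid selection; `greedy_otus` is a hypothetical helper, not any tool's API.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, cutoff):
    """Assign each read to the first centroid it matches at >= cutoff;
    otherwise the read founds a new cluster."""
    centroids = []
    clusters = []
    for read in reads:
        for idx, centroid in enumerate(centroids):
            if identity(read, centroid) >= cutoff:
                clusters[idx].append(read)
                break
        else:
            centroids.append(read)
            clusters.append([read])
    return clusters


# Invented 20-bp reads: one_off differs from base at 1 site, two_off at 2.
base = "ACGTACGTACGTACGTACGT"
one_off = "ACGTACGTACGTACGTACGA"
two_off = "TCGTACGTACGTACGTACGA"
distant = "GGGGGGGGGGGGGGGGGGGG"
reads = [base, one_off, base, two_off, distant]

for cutoff in (0.99, 0.95, 0.90):
    print(cutoff, len(greedy_otus(reads, cutoff)), "OTUs")
```

The same five reads fall into a different number of OTUs at each cutoff, which is why a single threshold is hard to map onto biological species.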
  • Head-Tail Pattern in Nematode OTUs
    OCTU  Reads  OCTU Length  Bit Score  E-Value   Match bp  Total bp  % Similarity  Chimera  DB match
    27    63     266          525        e-146     265       265       100           -1       B. seani 175
    12    9      265          500        e-138     261       264       98.86         -1       B. seani 175
    170   8      264          496        e-137     261       264       98.86         0        B. seani 175
    513   1      264          494        e-136     259       262       98.85         -2       B. seani 175
    579   2      263          492        e-136     258       261       98.85         -2       B. seani 175
    570   1      262          492        e-136     258       261       98.85         -1       B. seani 175
    394   1      263          490        e-135     260       264       98.48         1        B. seani 175
    19    2      269          488        e-135     264       269       98.14         0        B. seani 175
    658   1      266          486        e-134     260       265       98.11         -1       B. seani 175
    412   2      264          480        e-132     260       265       98.11         1        B. seani 175
    465   9      254          478        e-132     251       254       98.82         0        B. seani 175
    1164  1      268          478        e-132     261       267       97.75         -1       B. seani 175
    304   1      261          474        e-130     255       260       98.08         -1       B. seani 175
    868   1      244          460        e-126     242       245       98.78         1        B. seani 175
    514   2      274          458        e-126     263       272       96.69         -2       B. seani 175
    683   1      250          426        e-116     241       249       96.79         -1       B. seani 175
    627   1      230          422        e-115     223       226       98.67         -4       B. seani 175
    171   3      212          400        e-108     209       211       99.05         -1       B. seani 175
    1223  1      202          355        5.00E-95  198       204       97.06         2        B. seani 175
    Slide annotations mark the top row (OCTU 27) as the ‘Head’ and rows further down as the ‘Tail’.
    Porazinska et al. 2010, Zootaxa
    Artificial control community containing known nematode species, all with corresponding full length reference 18S sequences
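As a small worked example of the head-tail idea, the sketch below takes four rows from the table on this slide, recomputes % similarity as match bp / total bp, and treats the OCTU with the most reads as the head and the rest as its tail. Picking the head by read count is an assumption made for illustration; the slide does not spell out the rule used.

```python
# Rows copied from the slide: (OCTU id, reads, match bp, total bp).
rows = [
    (27, 63, 265, 265),
    (12, 9, 261, 264),
    (170, 8, 261, 264),
    (513, 1, 259, 262),
]


def percent_similarity(match_bp, total_bp):
    """Percent of aligned positions that match the reference."""
    return 100.0 * match_bp / total_bp


# Assumption: the head is the OCTU with the most reads.
head = max(rows, key=lambda row: row[1])
tail = [row for row in rows if row is not head]

total_reads = sum(row[1] for row in rows)
head_fraction = head[1] / total_reads

print("head OCTU:", head[0], "reads:", head[1])
for octu, reads, match_bp, total_bp in tail:
    sim = percent_similarity(match_bp, total_bp)
    print("tail OCTU:", octu, "reads:", reads, "similarity: %.2f" % sim)
print("fraction of reads in head: %.2f" % head_fraction)
```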
  • Assigning Taxonomy to OTUs
    • BLAST approaches: accuracy is critically dependent on reference databases
    • Eukaryote sequence databases are patchy and sparsely sampled
    SILVA 108 Ref rRNA Database (16S/18S)
    Bacteria    530,197
    Archaea     25,658
    Eukaryotes  62,587
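To put the SILVA 108 counts from this slide in proportion, a quick calculation of each domain's share of the reference sequences (the counts are from the slide; the percentages are simply derived from them):

```python
# Sequence counts per domain as listed on the slide (SILVA 108 Ref).
silva_108 = {
    "Bacteria": 530_197,
    "Archaea": 25_658,
    "Eukaryotes": 62_587,
}

total = sum(silva_108.values())
shares = {domain: 100.0 * n / total for domain, n in silva_108.items()}

for domain, pct in shares.items():
    print(f"{domain}: {pct:.1f}% of {total:,} sequences")
```

Eukaryotes make up roughly a tenth of the reference set, which illustrates the sparse sampling the slide refers to.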
  • Errors vs. Rare Taxa
    • Chimeras – hybrid sequences formed during PCR that do not exist in nature
    • ‘Jumping off points’ in conserved amplicon regions
    • Mostly low-read OTUs restricted to single samples
    How do we separate the ‘rare biosphere’ from erroneous sequences?
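One simple heuristic that follows from the last bullet is to flag OTUs that have very few reads and occur in only one sample. The sketch below is hypothetical: the OTU table is invented and the `max_reads` threshold is an arbitrary assumption. A flag like this only marks candidates for closer inspection, since genuinely rare taxa can look the same.

```python
def flag_suspect_otus(otu_table, max_reads=2):
    """Return OTU names with <= max_reads total reads that occur in
    exactly one sample (candidates for chimeras or other errors)."""
    suspects = []
    for otu, counts in otu_table.items():
        total = sum(counts.values())
        samples_present = sum(1 for c in counts.values() if c > 0)
        if total <= max_reads and samples_present == 1:
            suspects.append(otu)
    return suspects


# Invented OTU table: reads per OTU per sample.
toy_table = {
    "OTU_A": {"site1": 120, "site2": 95, "site3": 40},
    "OTU_B": {"site1": 1, "site2": 0, "site3": 0},
    "OTU_C": {"site1": 1, "site2": 1, "site3": 0},
    "OTU_D": {"site1": 0, "site2": 0, "site3": 2},
}

suspects = flag_suspect_otus(toy_table)
print(suspects)
```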
  • Important Challenges: Phylogenetic
    – rRNA data needs to be interpreted in a phylogenetic context, but eukaryotic guide trees are not comprehensive
    – Phylogenetic placement of short sequences can help you identify taxon sampling problems in the reference dataset that would not be obvious by BLAST searches
  • BLAST vs. Phylogeny
  • Explicitly Phylogenetic Approaches
    [Diagram labels: Aligned OTU sequences; Guide Tree; Evolutionary Placement of short reads; Edge PCoA; Community ‘fingerprints’; Taxonomy assignment, Exploiting head-tail]
  • http://phylosift.wordpress.com @PhyloSift
  • Development of new tools
    How does OTU picking affect biological interpretations of sequence data?
    Shift towards Illumina…processing 10x as much data?!
  • Visualization
    Visual tools for enabling novel scientific discovery
    [Figure labels: Sample Sites; Abundance (vertical); OTUs / Species]
  • Important Challenges: Metadata
    – Genbank’s Short Read Archive is not accessible
    – MOTUs (Molecular Operational Taxonomic Units) are arbitrary constructions
    Pressing need for open access database resources for metadata analysis and comparative studies
  • Tools for Computational Analysis
    QIIME is popular and easy to use – available on Amazon Cloud if researchers don’t have local bioinformatic facilities
  • Acknowledgements
    UC Davis
    • Jonathan Eisen
    • Aaron Darling
    • Guillaume Jospin
    Former Lab Members
    • W. Kelley Thomas (Univ. of New Hampshire)
    • Way Sung (Univ. of New Hampshire)
    • Feseha Abebe-Akele (Univ. of New Hampshire)
    Collaborators
    • Simon Creer (Univ. of Wales, Bangor)
    • Vera Fonseca (Univ. of Wales, Bangor)
    • Dorota Porazinska (Univ. of Florida)
    • Robin Giblin-Davis (Univ. of Florida)
    • Jyotsna Sharma (University of Texas, San Antonio)
    • Ken Halanych (Auburn University)