[2013.11.01] visualizing omics_data

Published in: Technology, Education

  1. Visualizing omics data. Mads Albertsen. Introduction to community systems microbiology 2013. CENTER FOR MICROBIAL COMMUNITIES
  2. Agenda • Visualizing omics data • Re-introduction to 16S analysis • Hands-on 16S analysis in RStudio • There is so much to learn. How do I start? CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
  3. Visualizing data? Martin Krzywinski, http://mkweb.bcgsc.ca/
  4. Who - when, where and why? Re-introduction to 16S analysis
  5. Who - when, where and why?
  6. Who - when, where and why? Accumulibacter, Competibacter (http://en.wikipedia.org/wiki/File:EBPR_FISH_Floc.jpg); P. Larsen 2012; Bacillus anthracis (http://phil.cdc.gov/phil/details.asp?pid=2226)
  7. Taking advantage of evolution. "The affinities of all the beings of the same class have sometimes been represented by a great tree... The green and budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species." C. Darwin, 1872. "Nothing in biology makes sense, except in the light of evolution." T. Dobzhansky, 1973. http://tolweb.org
  8. Why do we use the 16S gene? Ribosomes are universal. rRNA = structural RNA. http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
  9. Why do we use the 16S gene? Universal primer: 8F. http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
  10. Why do we use the 16S gene? Advantages: • Universal gene (no horizontal gene transfer) • Conserved regions • Variable regions • Great databases and alignments. Problems: • Variable copy number • No universal (unbiased) primers • (Not directly correlated with activity) • (Lack of functional information). Ashelford et al. AEM. 2005;71:7724-7736. http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
  11. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. There are a lot of steps!
  12. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics • Standardization, standardization, standardization! • Use biological replicates and evaluate your variation! • Design a good experiment with realistic expectations of the outcome (most studies fail here!) AAU activated sludge standard @ midasfieldguide.org
  13. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. [Figure panels: storage (fresh; 24 h @ 4 °C; 24 h @ 20 °C); input (mg); eDNA removal (PMA; 650 W, 10 min); bead beating duration (s) against intensity (m s⁻¹).] AAU activated sludge standard @ midasfieldguide.org
  14. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. [Figure: mean frequency of the most common residue in a 50 bp window, plotted against position in the 16S gene (bp), with variable regions V1 to V9 and the labels V1.3, V3.4 and V4 marked.] Ashelford et al. AEM. 2005;71:7724-7736. AAU activated sludge standard @ midasfieldguide.org
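The quantity on this slide's y-axis can be sketched in a few lines. This is a minimal illustration, not the method used by Ashelford et al.: it assumes a gap-free alignment and uses consecutive, non-overlapping windows. The function names and the toy alignment are invented for the example; only the idea of "mean frequency of the most common residue per window" (50 bp on the slide) comes from the source.

```python
from collections import Counter


def column_conservation(alignment):
    """Frequency of the most common residue in each alignment column."""
    n_seqs = len(alignment)
    scores = []
    for column in zip(*alignment):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / n_seqs)
    return scores


def window_conservation(alignment, window=50):
    """Mean column conservation in consecutive, non-overlapping windows."""
    scores = column_conservation(alignment)
    means = []
    for start in range(0, len(scores), window):
        chunk = scores[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


# Toy alignment (invented): first 4 columns conserved, last 4 fully variable.
toy_alignment = [
    "ACGTACGT",
    "ACGTTGCA",
    "ACGTGTAC",
    "ACGTCATG",
]
profile = window_conservation(toy_alignment, window=4)
print(profile)
```

Windows with values near 1.0 correspond to conserved stretches (candidate primer sites); dips correspond to variable regions that carry the taxonomic signal.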
  15. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. PCR with modified 16S primers.
      27F (Illumina adapter, pad, linker, primer): 5’-AATGATACGGCGACCACCGAGATCTACAC GTACGTACG GT AGAGTTTGATCCTGGCTCAG-3’
      534R (Illumina adapter, barcode, pad, linker, primer): 5’-CAAGCAGAAGACGGCATACGAGAT TCCCTTGTCTCC ACGTACGTAC CCG ATTACCGCGGCTGCTGG-3’
      [Figure: PCR cycles 1 to 3 around the target region.] AAU activated sludge standard @ midasfieldguide.org
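The composition of the two fusion primers can be written out as a small sketch. The segment sequences are copied from the slide; assigning each space-separated segment to a label (adapter, barcode, pad, linker, primer) by its order on the slide is an assumption, and `build_fusion_primer` is a hypothetical helper, not part of any AAU tool.

```python
def build_fusion_primer(adapter, pad, linker, primer, barcode=""):
    """Concatenate segments 5'->3': adapter, barcode (optional), pad, linker, primer."""
    return adapter + barcode + pad + linker + primer


# Forward primer: no barcode on the slide.
forward = build_fusion_primer(
    adapter="AATGATACGGCGACCACCGAGATCTACAC",
    pad="GTACGTACG",
    linker="GT",
    primer="AGAGTTTGATCCTGGCTCAG",  # 27F
)

# Reverse primer: carries the sample barcode.
reverse = build_fusion_primer(
    adapter="CAAGCAGAAGACGGCATACGAGAT",
    barcode="TCCCTTGTCTCC",
    pad="ACGTACGTAC",
    linker="CCG",
    primer="ATTACCGCGGCTGCTGG",  # 534R
)

print(len(forward), forward)
print(len(reverse), reverse)
```

Only the barcode segment changes between samples, so a set of reverse primers could be generated by calling the helper once per barcode.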
  16. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. ≈ 500 bp target amplicon. Mardis, 2008 (PMID 18576944)
  17. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. ≈ 500 bp target amplicon. Read 1: 300 bp. Read 2: 300 bp. After sequencing: Read 1, Read 2, Barcode.
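With a ≈ 500 bp amplicon and two 300 bp reads, the reads should overlap by roughly 300 + 300 − 500 = 100 bp, and that overlap is what allows the pair to be merged. The sketch below illustrates the idea only: it requires an exact match in the overlap, whereas real merging tools tolerate mismatches and use quality scores. The function names and the 20 bp toy amplicon are invented for the example.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def expected_overlap(read1_len, read2_len, amplicon_len):
    """Number of bases covered by both reads of a pair."""
    return read1_len + read2_len - amplicon_len


def merge_pair(read1, read2, min_overlap=10):
    """Merge a pair on the longest exact overlap; return None if none is found."""
    read2_rc = reverse_complement(read2)
    max_len = min(len(read1), len(read2_rc))
    for overlap in range(max_len, min_overlap - 1, -1):
        if read1[-overlap:] == read2_rc[:overlap]:
            return read1 + read2_rc[overlap:]
    return None


print(expected_overlap(300, 300, 500))

# Toy example (invented): 20 bp amplicon, two 14 bp reads, 8 bp overlap.
amplicon = "ACGTTGCAAGGCTTACGATC"
read1 = amplicon[:14]
read2 = reverse_complement(amplicon[-14:])
merged = merge_pair(read1, read2, min_overlap=4)
print(merged == amplicon)
```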
  18. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. How many sequences are needed? It depends on your question! (Although 50,000 raw sequences per sample is usually fine.)
      AAU raw kit and chemical costs (DKK):
                                              Cost      Cost v2
      DNA extraction                          105       70 (a)
      Library preparation                     40        40
      Sequencing (min 100k reads / sample)    190 (b)   70 (c)
      Total                                   335       180
      (a) Kits discounted. (b) 50 samples per run. (c) 150 samples per run (can run up to 300).
  19. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. Merge → Cluster → Assign taxonomy (compare to database) → OTU table.
      OTU                   Count
      Accumulibacter        3
      Unknown               11
      Competibacter         3
      Bacillus anthracis    1
  20. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. Barcode (9 reads from sample A, 9 from sample B) → Merge → Cluster → Assign taxonomy (compare to database) → OTU table.
      OTU                   A    B
      Accumulibacter        2    1
      Unknown               3    8
      Competibacter         3    0
      Bacillus anthracis    1    0
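The step from barcoded, taxonomy-assigned reads to an OTU table is just counting. The sketch below rebuilds the slide's two-sample table from one (sample, OTU) pair per read. The counts are taken from the slide; `build_otu_table` is a hypothetical helper written for illustration, not the QIIME implementation.

```python
from collections import Counter, defaultdict


def build_otu_table(read_assignments):
    """Count reads per OTU and sample from (sample, otu) pairs."""
    table = defaultdict(Counter)
    for sample, otu in read_assignments:
        table[otu][sample] += 1
    return table


# Per-sample counts as shown on the slide.
slide_counts = {
    "Accumulibacter": {"A": 2, "B": 1},
    "Unknown": {"A": 3, "B": 8},
    "Competibacter": {"A": 3, "B": 0},
    "Bacillus anthracis": {"A": 1, "B": 0},
}

# Expand the counts into one (sample, otu) record per read.
reads = [
    (sample, otu)
    for otu, per_sample in slide_counts.items()
    for sample, count in per_sample.items()
    for _ in range(count)
]

otu_table = build_otu_table(reads)
for otu in slide_counts:
    print(f"{otu:<20}{otu_table[otu]['A']:>3}{otu_table[otu]['B']:>3}")
```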
  21. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. Sequence errors, chimeras and weird stuff. [Figure: the chance of a perfect read as a function of read length.] [Figure: chimeras.]
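The relationship between read length and the chance of an error-free read can be approximated with a simple model. The sketch assumes every base has the same, independent error probability, so a read of length L is perfect with probability (1 − e)^L. Both the model and the 1 % error rate in the usage lines are illustrative assumptions, not values from the slide.

```python
def perfect_read_probability(read_length, error_rate):
    """P(no errors) for a read, assuming independent per-base errors."""
    return (1.0 - error_rate) ** read_length


# Illustrative per-base error rate of 1 % (an assumption).
for length in (100, 300, 500):
    probability = perfect_read_probability(length, 0.01)
    print(f"{length} bp: {probability:.3f}")
```

Under this model the probability falls off geometrically with length, which is why long amplicons contain many reads with at least one error.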
  22. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. Merge → Cluster → Assign taxonomy (compare to database) → OTU table. Removing unique sequences makes the subsequent steps 10-100x faster and removes the majority of errors and chimeras. The effect depends on sequencing depth and sample complexity. Be careful!
      OTU                   Count
      Accumulibacter        3
      Unknown               11
      Competibacter         3
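Removing unique sequences means dropping every read whose exact sequence occurs only once in the data set. A minimal sketch, assuming exact string identity defines "the same sequence"; the function name and the toy reads are invented, and the read counts mirror the 3 / 11 / 3 / 1 example from the slides.

```python
from collections import Counter


def remove_unique_sequences(reads, min_count=2):
    """Keep only reads whose exact sequence occurs at least min_count times."""
    counts = Counter(reads)
    return [read for read in reads if counts[read] >= min_count]


# Toy reads (invented): three abundant sequences and one singleton.
reads = ["AAAA"] * 3 + ["CCCC"] * 11 + ["GGGG"] * 3 + ["TTTT"]
kept = remove_unique_sequences(reads)

print(len(reads), "reads before,", len(kept), "after")
print(sorted(set(kept)))
```

The singleton here could be a sequencing error or a genuinely rare organism, which is the reason for the slide's warning: at low depth or high complexity, real taxa are removed along with the noise.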
  23. AAU workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. Find sample IDs on Google Drive. Plain text file: 16SAMP-145, 16SAMP-146, 16SAMP-147, 16SAMP-148, 16SAMP-149, 16SAMP-150. 16S.V13.workflow.sh. OTU table (+ R version):
      OTU                   A    B
      Accumulibacter        2    1
      Unknown               3    8
      Competibacter         3    0
  24. AAU workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. What 16S.V13.workflow.sh does:
      1. Find and unpack your samples
      2. Optional subsampling
      3. Remove potential phiX contamination (bowtie2)
      4. Merge read 1 and read 2 (FLASH)
      5. Remove reads outside length criteria
      6. Optional removal of unique reads and subsampling to even depth
      7. Format reads for QIIME
      8. Cluster reads to OTUs (UCLUST, QIIME)
      9. Assign taxonomy (RDP classifier, QIIME + database: MiDAS, Greengenes or SILVA)
      10. Generate OTU table (QIIME)
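Steps 5 and 6 of the list above (length filtering and subsampling to even depth) can be sketched in plain Python. This is not the content of 16S.V13.workflow.sh, which is not shown in the slides. The function names, the length limits and the toy reads are assumptions made for illustration; only the sample ID format is borrowed from the previous slide.

```python
import random


def filter_by_length(reads, min_len, max_len):
    """Keep reads whose length lies within [min_len, max_len]."""
    return [read for read in reads if min_len <= len(read) <= max_len]


def subsample_even_depth(samples, depth, seed=0):
    """Randomly draw `depth` reads per sample; drop samples with fewer reads."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {
        sample_id: rng.sample(reads, depth)
        for sample_id, reads in samples.items()
        if len(reads) >= depth
    }


# Toy data (invented reads, hypothetical length limits of 4 to 6 bp).
samples = {
    "16SAMP-145": ["ACGT", "ACGTA", "ACGTAC", "AC", "ACGTACGTAC", "TTTT"],
    "16SAMP-146": ["ACGT", "A", "GGGGG"],
}
filtered = {sid: filter_by_length(reads, 4, 6) for sid, reads in samples.items()}
even = subsample_even_depth(filtered, depth=3)

for sample_id, reads in even.items():
    print(sample_id, len(reads))
```

Samples that fall below the chosen depth are dropped rather than padded, so the depth should be picked after looking at the per-sample read counts.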
  25. Where do I start? • Get online (Twitter, blogs, seqanswers.com) • Learn basic multivariate statistics • Learn R (with RStudio) • Analysing Ecological Data (2007) by Zuur, Ieno & Smith