NGS microbiome

Published in: Technology


Transcript

  • 1. Next-Generation Sequencing of Microbial Genomes and Metagenomes
      Christine King, Farncombe Metagenomics Facility
      Human Microbiome Journal Club, July 13, 2012
  • 2. Overview
      • Next-generation sequencing
          • Applications
          • Instruments
          • Library prep and sequencing chemistry
          • Sequence quality
      • Project overview
          • Microbial genomes
          • Microbial communities
  • 3. DNA Sequencing
      • 1st generation
          • Sanger chain termination
          • Capillary electrophoresis
      • 2nd generation (NGS)
          • High throughput, “massively parallel”
          • Shorter reads
          • Sequencing-by-synthesis
      • 3rd generation
          • Single molecule
  • 4. Applications
      • DNA sequencing
          • De novo genomes
          • Resequencing
              • Shotgun (e.g. mutant strains)
              • Amplicon (e.g. HLA, cancer)
              • Sequence capture (e.g. exome)
          • Metagenome
              • Amplicon (e.g. 16S, COI, viral)
              • Shotgun
          • ChIP
      • RNA sequencing
          • Gene expression
          • Gene annotation, splice variants
  • 5. Instruments
  • 6. Instruments
      Instrument     Total # of reads  Read length (bp)  Output (Gb)  Cost per base  Run time  Technology
      GS FLX         1M                450               0.5          $$$$           ++        emPCR, SBS, light detection
      GS FLX+        1M                650               0.6          $$$$           ++        emPCR, SBS, light detection
      GS Jr          100K              450               0.05         $$$$           ++        emPCR, SBS, light detection
      GAIIx          640M              2x150             90           $$             +++       Bridge PCR, SBS, fluorophore
      HiSeq 2000     6B                2x100             600          $              +++       Bridge PCR, SBS, fluorophore
      MiSeq          12M               2x150             2            $$             ++        Bridge PCR, SBS, fluorophore
      PacBio RS      >10K              >1000             0.01         $$$$           +         Single-molecule seq, fluorophore
      SOLiD 5500xl   1.4B              75 + 35           155          $              +++       emPCR, probe ligation, fluorophore
      Ion PGM - 316  1M                >100              0.1          $$$            +         emPCR, SBS, pH change
      Ion PGM - 318  6M                >100              1            $$             +         emPCR, SBS, pH change
  • 7. Which instrument(s) to use?
      • Read length vs number of reads
      • Cost per base, per sample, per project (multiplexing?)
      • Accuracy
      • Run time, wait time
      Application      Length  # Reads  Accuracy  Instruments           Considerations
      De novo (small)  +++     ++       ++        MiSeq, 454, Ion       Mix lengths
      De novo (large)  +++     +++      ++        HiSeq, 454, SOLiD     Mix lengths, MP
      Re-seq (small)   ++      ++       ++        MiSeq, Ion            Multiplex?
      Re-seq (large)   ++      +++      ++        HiSeq, SOLiD          Enrichment?
      RNA-seq (count)  +       +++      +         Illumina, SOLiD, Ion  Ref? Size? Rare?
  • 8. Library Preparation
      • Goal: fragments of DNA, each end flanked by adapter sequences
      • Adapters contain binding sites for amplification and sequencing primers; they are platform- and chemistry-specific
      • Optional: sample-specific barcodes/indexes/MIDs/tags allow multiplexing during sequencing
      • Library QC: quantity, size
  • 9. Library Preparation
      Library types:
      • Shotgun (DNA)
          • May begin with ChIP
          • May follow with sequence capture
      • Mate pair (DNA)
      • Amplicon (DNA)
      • Total RNA
          • May enrich for mRNA (poly-A enrichment, rRNA depletion)
          • Convert to cDNA (then similar to DNA protocols)
      • Small RNA
          • RNA ligations, convert to cDNA after
  • 10. Library Preparation: Shotgun
      • Fragmentation
          • Sonication
          • Nebulization
          • Enzymatic
      • End repair
          • 3’ overhangs digested
          • 5’ overhangs filled
          • 5’ phosphate added
  • 11. Library Preparation: Shotgun
      • Adapter ligation
          • T-overhangs
          • Forked structure controls orientation
      • Library amplification
          • Few cycles
          • Enrich for correctly-adapted fragments
          • Required to complete adapter structure in some protocols
      • Size selection
          • Gel excision, AMPure beads
          • Limit insert size as needed, remove artifacts
  • 12. Library Preparation: Amplicon
      • Amplify region of interest using PCR
      • Primers contain adapter sequences
  • 13. Library Preparation: Mate Pair
      • Begin with large fragments (e.g. 3 kb, 20 kb)
      • Circularize and fragment again
          • Illumina: direct ligation
          • 454: Cre/Lox recombination
      • Enrich for fragments containing the junction
      • Proceed with shotgun library prep
  • 14. Library Preparation: Mate Pair
      • Why? Paired sequences are a known distance apart; improves genome assembly
      • Note: 454 calls these “paired end libraries”, not to be confused with Illumina’s “paired end sequencing”!
  • 15. Sequencing: Illumina
      • Cluster generation
          • Library fragments hybridize to oligos on the flow cell
          • New strand synthesized, original denatured, removed
          • Free end binds to adjacent oligos (bridge formation)
          • Complementary strand synthesized, denatured (both tethered to flow cell)
          • Repeat to form clonal cluster
          • Cleave one oligo, denature to leave ssDNA clusters
          • ~800K clusters/mm^2
  • 16. Sequencing: Illumina
      Variety of workflows:
      • Single- or paired-end reads
      • 0, 1, or 2 index reads
  • 17. Sequencing: Illumina
      • At each cycle, all 4 fluorescently-labeled nucleotides pass over the flow cell
      • Each cluster incorporates one nt (terminator) per cycle
      • Fluor is imaged, then cleaved
      • De-block and repeat
  • 18. Sequencing: Illumina
      Other terminology:
      • cBot: accessory instrument that performs cluster generation
      • Lanes: divisions (8) of HiSeq and GAIIx flow cells
      • PhiX: bacteriophage with small, balanced genome; PhiX library spiked in with samples for QC
      • Phasing/pre-phasing: nt incorporation falls behind or jumps ahead on a portion of strands in the cluster and contributes to noise
      • Chastity filter: measures signal purity (after intensity corrections); if the background signal is high, the cluster is discarded
      • BaseSpace: cloud computing site for processing MiSeq data
      File format: FASTQ
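The FASTQ format named on this slide stores each read as four lines: an `@` header, the bases, a `+` separator, and one quality character per base. The following is a minimal reading sketch, not a replacement for an established parser. It assumes unwrapped four-line records and Phred+33 quality encoding (an assumption; older Illumina pipelines used a different offset). The function name `read_fastq` and the example record are hypothetical.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, qualities) from a four-line-per-record FASTQ stream.

    Assumes Phred+33 encoding: quality = ord(character) - 33.
    Stops at end of file or at the first empty header line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header)
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ for: " + header)
        yield header[1:], seq, [ord(c) - 33 for c in qual]


# Hypothetical one-read example; 'I' is ASCII 73, so each base has Q = 40.
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
print(records)  # [('read1', 'ACGT', [40, 40, 40, 40])]
```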
  • 19. Sequencing: 454
      • emPCR: clonal amplification of bead-bound library in microdroplets
      • Library input amounts critical!
          • One molecule per bead
          • Titration procedure
  • 20. Sequencing: 454
      • Library capture: beads coated with complementary oligo
      • Amplification: droplet contains PCR reagents and the other oligo
      • Post-PCR: millions of identical fragments attached to the bead
  • 21. Sequencing: 454
      • Bead recovery: physical and chemical disruption
      • Enrichment: capture successfully amplified beads using biotinylated primers + magnetic streptavidin beads
  • 22. Sequencing: 454
      Deposit bead layers onto PicoTiterPlate:
      • Enzyme beads
      • Enriched DNA beads
      • More enzyme beads
      • PPiase beads
  • 23. Sequencing: 454
  • 24. Sequencing: 454
      Pyrosequencing
      • 4 nucleotides flow separately
      • If a nt is incorporated: PPi is released, then light is produced
          • APS + PPi → ATP (sulfurylase)
          • Luciferin + ATP → light + oxyluciferin (luciferase)
      • Amount of light proportional to # nt incorporated
      • Rinse and repeat with next nt
  • 25. Sequencing: 454
      • Camera captures light emitted from every well during every nucleotide flow
  • 26. Sequencing: 454
      • Flowgram: representation of a sequence, based on the pattern of light emitted from a single well
  • 27. Sequencing: 454
      Other terminology:
      • Lib-L/Lib-A: adapter variants, “ligated” or “annealed”
      • Titanium chemistry: ~450 bp reads on all instruments
      • XL+ chemistry: ~700 bp reads on the FLX+ instrument
      • Flow: one of the four nucleotides flows over the PTP
      • Cycle: a set of four flows, in order
      • Valley flow: a flow in which the number of bases incorporated in a given read is uncertain, e.g. 1.5 units of light (background signal, homopolymers)
      File format: sff (standard flowgram format)
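To make the flow, cycle, and valley-flow terms concrete, here is a rough sketch of turning a flowgram into bases by rounding each light signal to the nearest whole number of incorporations. It is an illustration only, not the instrument's base caller. The flow order `TACG`, the valley window of 0.4 to 0.6, and the example signal values are all assumptions made for this example.

```python
from typing import List, Sequence

FLOW_ORDER = "TACG"  # assumed nucleotide order within one cycle of four flows


def call_bases(flow_values: Sequence[float], flow_order: str = FLOW_ORDER) -> str:
    """Round each flow signal to a homopolymer length and emit that many bases."""
    bases = []
    for i, signal in enumerate(flow_values):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, int(signal + 0.5))  # round half up, never negative
        bases.append(nucleotide * count)
    return "".join(bases)


def valley_flows(flow_values: Sequence[float],
                 low: float = 0.4, high: float = 0.6) -> List[int]:
    """Return indices of flows whose fractional signal sits between two integers.

    The 0.4 to 0.6 window is a hypothetical threshold chosen for illustration.
    """
    return [i for i, s in enumerate(flow_values) if low <= (s % 1.0) <= high]


# Two cycles of made-up signal values for a single well.
flowgram = [1.02, 0.05, 2.10, 0.98, 0.03, 1.45, 0.00, 1.01]
print(call_bases(flowgram))    # TCCGAG
print(valley_flows(flowgram))  # [5]
```

The signal of 1.45 in the sixth flow is called as a single A here, but it is exactly the kind of ambiguous value the slide describes: it could be one or two bases, which is why homopolymer lengths are the main error source in flow-based reads.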
  • 28. Sequencing: Ion Torrent
      • Procedures and chemistry similar to 454
      • Instead of PPi, measure H+ release (pH change) via semiconductor chip
      • No expensive camera or laser required, no modified nucleotides
  • 29. Sequence Quality
      • Error probabilities determined using training sets, platform-specific biases
      • Expressed as a quality value (QV or Q score) per base
      • Similar to PHRED scores:
          • Q = -10 log10(P)
          • P = 10^(-Q/10)
      Phred (Q) score  Probability of error (P)  Base call accuracy
      10               1 in 10                   90%
      20               1 in 100                  99%
      30               1 in 1K                   99.9%
      40               1 in 10K                  99.99%
      50               1 in 100K                 99.999%
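The two formulas on this slide are inverses of each other, and the table rows follow directly from them. A small sketch of the conversion in both directions is given below; the function names are hypothetical.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """P = 10^(-Q/10): probability that the base call is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Q = -10 * log10(P); P must lie in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


# Reproduce the accuracy column of the table for Q = 10 .. 50.
for q in (10, 20, 30, 40, 50):
    p = phred_to_error_prob(q)
    print(f"Q{q}: P = {p:g}, accuracy = {(1 - p) * 100:g}%")
```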
  • 30. Project 1: Microbial Genome
      Considerations:
      • Reference genome?
      • How much coverage do I want?
      • How big is the genome?
      • How much data do I need?
          • bp needed = genome size x coverage
      • Which instrument/chemistry configuration to use?
      Coverage:
      • Depth (number of times a particular base is “covered” by a read, e.g. 25X)
      • Breadth (% of genome with at least 1X coverage)
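The "bp needed = genome size x coverage" rule translates directly into a read count and a rough multiplexing estimate. The sketch below uses a hypothetical 5 Mb genome at 25X depth with 150 bp reads, and takes the 2 Gb MiSeq output from the instrument table; real yields vary, and the estimate ignores reads lost to filtering.

```python
def bases_needed(genome_size_bp: int, depth: int) -> int:
    """bp needed = genome size x coverage depth."""
    return genome_size_bp * depth


def reads_needed(genome_size_bp: int, depth: int, read_length_bp: int) -> int:
    """Number of reads of a given length required, rounded up."""
    total = bases_needed(genome_size_bp, depth)
    return -(-total // read_length_bp)  # integer ceiling division


def genomes_per_run(run_output_bp: int, genome_size_bp: int, depth: int) -> int:
    """How many equally sized genomes fit in one run at the requested depth."""
    return run_output_bp // bases_needed(genome_size_bp, depth)


genome = 5_000_000  # hypothetical 5 Mb bacterial genome
print(bases_needed(genome, 25))                    # 125000000
print(reads_needed(genome, 25, 150))               # 833334
print(genomes_per_run(2_000_000_000, genome, 25))  # 16
```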
  • 31. Project 1: Microbial Genome
      • Sample preparation
          • Isolate high quality (not degraded) and high purity (no RNA) gDNA
          • Verify on a gel
          • Quantify using dsDNA-specific dye
      • Library preparation
          • Can do this yourself if you like
          • ~$200 per sample for Nextera
          • Cheaper protocols
          • Cheaper in bulk
          • Barcode compatibility
  • 32. Project 1: Microbial Genome
      Library QC
      • Insert size confirmed on BioAnalyzer (within range, no artifacts)
      • Pool barcoded libraries (normalize based on PicoGreen quantification)
      • Absolute quantification of library pools using qPCR
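Normalizing a pool from PicoGreen readings usually means converting each mass concentration to a molar one using the insert size from the BioAnalyzer, then taking a volume from each library that contributes the same number of molecules. The sketch below is one way to do that arithmetic, assuming an average mass of about 660 g/mol per base pair of double-stranded DNA. The library names, concentrations, and the 40 fmol target are made up for illustration.

```python
from typing import Dict, Tuple

BP_MASS_G_PER_MOL = 660.0  # approximate average mass of one dsDNA base pair


def to_nanomolar(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert ng/uL to nM for double-stranded fragments of a given mean length."""
    return conc_ng_per_ul * 1e6 / (BP_MASS_G_PER_MOL * mean_fragment_bp)


def pooling_volumes(libraries: Dict[str, Tuple[float, float]],
                    fmol_each: float) -> Dict[str, float]:
    """Volume (uL) of each library that contributes fmol_each femtomoles.

    libraries maps a name to (concentration in ng/uL, mean fragment size in bp).
    1 nM equals 1 fmol/uL, so volume = fmol / nM.
    """
    return {name: fmol_each / to_nanomolar(conc, size)
            for name, (conc, size) in libraries.items()}


# Hypothetical barcoded libraries: (ng/uL from PicoGreen, bp from BioAnalyzer).
libs = {"strainA": (6.6, 500), "strainB": (13.2, 500)}
volumes = pooling_volumes(libs, fmol_each=40)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
```

The qPCR step on the slide would then confirm the concentration of the finished pool, since dye-based readings also count fragments that lack complete adapters.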
  • 33. Project 1: Microbial Genome
      MiSeq sequencing
      • Dilute and denature library pool (optimal concentration requires titration...)
      • Spike in PhiX library as needed (e.g. 1%)
      • Prepare and load reagents, flow cell
      • Basic filtering and de-multiplexing performed automatically
      • Download FASTQ files from BaseSpace
  • 34. Project 1: Microbial Genome
      • Data processing
          • Additional filtering
          • Trim the ends
          • Remove PCR duplicates
      • Assembly: overlapping reads are assembled to each other based on sequence similarity = contigs
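Two of the data processing steps on this slide, trimming read ends and removing PCR duplicates, can be sketched in a few lines. This is a simplified illustration and not what any particular pipeline does: the quality cutoff of Q20 is an assumed value, and duplicates are detected here by exact sequence identity, whereas real tools often use mapping positions or paired-end information.

```python
from typing import List, Sequence, Tuple


def trim_low_quality_ends(seq: str, quals: Sequence[int],
                          min_q: int = 20) -> Tuple[str, List[int]]:
    """Strip bases below min_q from both ends of a read; interior bases are kept."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], list(quals[start:end])


def remove_exact_duplicates(reads: Sequence[str]) -> List[str]:
    """Keep the first copy of each distinct sequence, preserving input order."""
    seen = set()
    unique = []
    for seq in reads:
        if seq not in seen:
            seen.add(seq)
            unique.append(seq)
    return unique


print(trim_low_quality_ends("ACGTAC", [5, 30, 30, 30, 10, 8]))  # ('CGT', [30, 30, 30])
print(remove_exact_duplicates(["AC", "GT", "AC"]))              # ['AC', 'GT']
```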
  • 35. Project 1: Microbial Genome
      What’s next?
      • Polish the genome (hybrid assemblies, mate pair libraries)
      • Annotate (ORFs, RNA-seq)
      • Compare
  • 36. Project 2: Microbial Community
      • Shotgun metagenomics
          • Unbiased survey of community content
          • Random library fragments may provide very little taxonomic resolution (e.g. conserved, unknown)
          • Identify genes, classify by function
      • Targeted metagenomics
          • Limited survey of community content
          • Targeted loci provide excellent taxonomic resolution, but may exclude certain taxa
          • Identify OTUs, classify by taxonomy
  • 37. Project 2: Microbial Community
      16S rRNA
      • Multi-copy gene (1.5 kb)
      • Conserved and hypervariable regions
      • Extensive databases from known species
  • 38. Project 2: Microbial Community
      • Considerations:
          • Biases in sampling methods, culturing, DNA isolation, PCR...replicate
          • Available SOPs
          • How many reads per sample?
          • Read length matters!
      • Sample preparation:
          • Isolate DNA
          • PCR amplify, purify
              • High-fidelity polymerase
              • Barcoded primers
              • No primer dimers!
          • Normalize PCR products and pool
  • 39. Project 2: Microbial Community
      • 454 Sequencing
          • emPCR titrations with different library input
          • Bulk emPCR
          • Sequence
          • Basic filtering
          • Collect sff files
      • Data processing
          • De-multiplexing
          • Additional filtering
          • Trim the barcodes, primers
          • Check for chimeras
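De-multiplexing and barcode/primer trimming for amplicon reads can be pictured as a prefix match: each read starts with its sample barcode, then the PCR primer, then the region of interest. A minimal sketch follows, assuming exact matches only and barcodes of equal length. The barcodes, primer, sample names, and reads are all invented for the example, and real pipelines usually tolerate a small number of mismatches.

```python
from typing import Dict, Iterable, List, Tuple


def demultiplex(reads: Iterable[str], barcodes: Dict[str, str],
                primer: str) -> Tuple[Dict[str, List[str]], List[str]]:
    """Assign reads to samples by leading barcode, then strip barcode and primer.

    barcodes maps barcode sequence -> sample name. Reads that do not start with
    a known barcode followed by the primer are returned as unassigned.
    """
    by_sample: Dict[str, List[str]] = {sample: [] for sample in barcodes.values()}
    unassigned: List[str] = []
    for seq in reads:
        for barcode, sample in barcodes.items():
            prefix = barcode + primer
            if seq.startswith(prefix):
                by_sample[sample].append(seq[len(prefix):])
                break
        else:
            unassigned.append(seq)
    return by_sample, unassigned


# Hypothetical barcodes, primer, and reads.
sample_barcodes = {"ACGT": "gut", "TGCA": "soil"}
raw_reads = ["ACGTGGATTTT", "TGCAGGACCCC", "AAAAGGATTTT"]
by_sample, unassigned = demultiplex(raw_reads, sample_barcodes, primer="GGA")
print(by_sample)   # {'gut': ['TTTT'], 'soil': ['CCCC']}
print(unassigned)  # ['AAAAGGATTTT']
```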
  • 40. Project 2: Microbial Community
      Clustering
      • Sequences grouped by similarity = OTUs
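One simple way to group sequences by similarity is greedy clustering: each sequence joins the first existing cluster whose seed it matches closely enough, and otherwise starts a new cluster. The sketch below is a toy version of that idea, not the method used in the talk. It assumes sequences are already aligned to equal length, and the 97% default threshold is a commonly used convention that the slides do not specify.

```python
from typing import List, Sequence


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, non-empty sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: Sequence[str], threshold: float = 0.97) -> List[List[str]]:
    """Greedy clustering: the first member of each OTU acts as its seed."""
    seeds: List[str] = []
    otus: List[List[str]] = []
    for seq in seqs:
        for i, seed in enumerate(seeds):
            if identity(seq, seed) >= threshold:
                otus[i].append(seq)
                break
        else:
            seeds.append(seq)
            otus.append([seq])
    return otus


# Toy 10-base sequences, clustered at 90% so that one mismatch is tolerated.
toy = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC"]
print(greedy_otus(toy, threshold=0.9))
# [['AAAAAAAAAA', 'AAAAAAAAAT'], ['CCCCCCCCCC']]
```

The result of greedy clustering depends on input order, which is why sequences are often sorted by abundance or length before clustering.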
  • 41. Project 2: Microbial Community
      • Taxonomic identification
          • OTUs are classified by comparing to known 16S sequences
          • Level of classification (e.g. family vs genus)?
      • Diversity
          • Within sample
          • Between samples
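Within-sample and between-sample diversity are typically computed from a table of OTU counts per sample. The slides do not name specific metrics, so the sketch below picks two common ones as an assumption: the Shannon index for within-sample diversity and Bray-Curtis dissimilarity for between-sample comparison. The count vectors are made up.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[int]) -> float:
    """Within-sample diversity: H = -sum(p_i * ln(p_i)) over OTUs with counts > 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a: Sequence[int], b: Sequence[int]) -> float:
    """Between-sample dissimilarity: 0 for identical counts, 1 for no shared OTUs."""
    if len(a) != len(b):
        raise ValueError("samples must list the same OTUs in the same order")
    total = sum(a) + sum(b)
    if total <= 0:
        raise ValueError("both samples are empty")
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / total


# Hypothetical OTU counts for two samples (same OTU order in both).
sample1 = [10, 10, 0]
sample2 = [0, 10, 10]
print(round(shannon_index(sample1), 4))  # 0.6931
print(bray_curtis(sample1, sample2))     # 0.5
```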