NGS technologies - platforms and applications

10,234 views
9,516 views

Published on

AGRF in conjunction with EMBL Australia recently organised a workshop at Monash University Clayton. This workshop was targeted at beginners and biologists who are new to analysing Next-Gen Sequencing data. The workshop also aimed to provide users with a snapshot of bioinformatics and data analysis tips on how to begin to analyse project data. Next Gen Sequencing Platforms and Applications was presented by AGRF Next Gen Manager, Mr. Matt Tinning.
Presented: 1st August 2012

Published in: Education, Technology
5 Comments
24 Likes
Statistics
Notes
No Downloads
Views
Total views
10,234
On SlideShare
0
From Embeds
0
Number of Embeds
24
Actions
Shares
0
Downloads
2
Comments
5
Likes
24
Embeds 0
No embeds

No notes for slide

NGS technologies - platforms and applications

  1. 1. Next Gen Sequencing Platforms and Applications Matthew Tinning Australian Genome Research Facility1 August 2012
  2. 2. Next-Gen Sequencing TechnologiesRoche GS-FLX Life Technologies SOLiDIllumina HiSeq Life Technologies Ion Torrent
  3. 3. Roche GS-FLX
  4. 4. WorkflowSample FragmentationLibrary PreparationemPCR SetupemPCR AmplificationPyrosequencingData Analysis
  5. 5. Pyrosequencing
  6. 6. emPCREmulsion PCR is a method of clonal amplification which allows for millions of unique PCRs to be performed at once through the generation of micro-reactors.
  7. 7. emPCRThe Water-in-Oil-Emulsion
  8. 8. Massively Parallel Sequencing
  9. 9. Data Analysis T Base A Base C Base G Base Flow Flow Flow Flow Raw Image Files Image Base- Quality Processing calling Filtering SFF File
  10. 10. 454 Platform Updates GS20 • 100bp reads, ~20Mbp / run GS-FLX • 250bp reads ~100 Mbp / run (7.5 hrs) GS-FLX Titanium • 400bp reads ~400 Mbp / run (10 hrs)GS-FLX Titanium Plus • 700 bp reads ~700 Mbp/run (18 hrs) GS Junior • 400 bp reads ~ 35Mbp/run (10 hrs)
  11. 11. 454 Sequencing Output• *.sff (standard flowgram format)• *.fna (fasta)• *.qual (Phred quality scores)
  12. 12. Illumina HiSeq
  13. 13. Illumina Sequencing Technology Robust Reversible Terminator Chemistry Foundation 3’ 5’ DNA (0.1-1.0 ug) A G T C G A C T T A C C G G A T A A C T C C C G G A T T C Sample G A preparation Cluster growth T 5’ Sequencing1 2 3 4 5 6 7 8 9 T G C T A C G A T … Base callingImage acquisition
  14. 14. Platform Updates Solexa 1G • 18bp reads, ~1Gbp / run Illumina GA • 36bp reads ~3Gbp / run Illumina GAII • 75bp paired reads ~10Gbp / run (8 days) Illumina GAIIx • 75bp paired reads ~40Gbp / run (8 days) Illumina HiSeq 2000 • 100 bp paired reads ~200 Gbp/ run (10 days)Illumina HiSeq, v3 SBS • 100bp paired reads ~600Gbp / run (12 days) MiSeq • 150 paired reads ~1.5 Gb/run (27 hrs) Maximum yield / day 50,Gbp ~16x the human genome
  15. 15. Illumina Sequencing Output• *.fastq (sequence and corresponding quality score encoded with an ASCII character, phred- like quality score + 33)
  16. 16. Illumina fastq 1 2 3 4 5 67 8@HWI-ST226:253:D14WFACXX:2:1101:2743:29814 1:N:0:ATCACGTGCGGAAGGATCATTGTGGAATTCTCGGGTGCCAAGGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAAAAATTA+B@CFFFFFHHFFHJIIGHIHIJJIJIIJJGDCHIIIJJJJJJJGJGIHHEH@)=F@EIGHHEHFFFFDCBBD:@CC@C:<CDDDD50559<B######## 1. unique instrument ID and run ID 2. Flow cell ID and lane 3. tile number within the flow cell lane 4. x-coordinate of the cluster within the tile 5. y-coordinate of the cluster within the tile 6. the member of a pair, /1 or /2 (paired-end or mate-pair reads only) 7. N if the read passes filter, Y if read fails filter otherwise 8. Index sequence
  17. 17. Applied Biosystems SOLiD
  18. 18. Sequencing by Ligation
  19. 19. Base Interrogations
  20. 20. 2 Base encoding AT
  21. 21. emPCR and Enrichment3’ Modification allows covalent bonding to the slide surface
  22. 22. Platform Updates • 50bp Paired reads ~50Gbp / run SOLiD 3 (12 days) • 50bp Paired reads ~100Gbp / run SOLiD 4 (12 days) • 75bp Paired reads ~300Gbp / run 5500xl (14 days)Maximum yield / day 21,000,000,000bp7x the human genome3.5 hours of sequencing for a 1 fold coverage.....
  23. 23. SOLiD Colour Space Reads• *.csfasta (colour space fasta)• *.qual (Phred quality scores) >853_17_1660_F3 T32111011201320102312...... AA CC GG TT 0 Blue AC CA GT TG 1 Green AG CT GA TC 2 Yellow AT CG GC TA 3 Red
  24. 24. Applied Biosystems: Ion Torrent PGM
  25. 25. Ion Torrent• Ion Semiconductor Sequencing• Detection of hydrogen ions duringthe polymerization DNA• Sequencing occurs in microwellswith ion sensors• No modified nucleotides• No optics
  26. 26. Ion Torrent dNTP • DNA Ions  Sequence – Nucleotides flow sequentially over Ion semiconductor chip H+ – One sensor per well per sequencing reaction ∆ pH – Direct detection of natural DNA extension – Millions of sequencing reactions per chip ∆Q – Fast cycle time, real time detectionSensing Layer Sensor Plate ∆VBulk Drain Source To column receiverSilicon Substrate
  27. 27. Ion Torrent: System Updates314 Chip • 100bp reads ~10 Mb/run (1.5 hrs) • 100 bp reads ~100 Mbp / run (2 hrs)316 Chip • 200 bp reads ~200 Mbp/run (3 hrs)318 Chip • 200 bp reads ~1 Gbp / run (4.5 hrs)
  28. 28. Ion Torrent Reads• *.sff (standard flowgram format)• *.fastq (sequence and corresponding quality score encoded with an ASCII character, phred- like quality score + 33)
  29. 29. Summary of NGS Platforms• Clonal amplification of sequencing template – emPCR (454, SOLiD and Ion Torrent) – Bridge amplification (Illumina)• Sequencing by Synthesis – 454 Pyrosequencing – Illumina Reversible Terminator Chemistry – Ion Torrent Ion Semiconductor Sequencing• Sequencing by ligation – SOLiD – 2 base encoding• Dramatic reduction in cost of sequencing – GS-FLX provides > 100x decrease in costs compared to Sanger Sequencing – HiSeq and SOLiD > 100x decrease in costs over GS-FLX
  30. 30. Applications• DNA • Whole Genome – Shotgun & Mate Pair • Sequence Capture • Amplicon• RNA • mRNA • small RNA
  31. 31. Next Gen Sequencing Library Preparation
  32. 32. Sample preparation mRNA DNA chemical mechanicalFragmentationcDNA Synthesis Fragmentation Ligation of Amplification/ Sequencing Adaptors Library Fragment Size Selection
  33. 33. Shotgun Libraries• Illumina – Input: 1 ug of DNA – Fragmentation w/ Covaris – Size Selection w/ gel excission • Insert Size 300-400 bp • gel free method for captures – PCR “enrichment” (10 cycles)• 454 – Input 500 ng of DNA – Fragmentation w/ Nebulization – Small fragment removal (AMpure size exclusion) • Library size ~900 bp
  34. 34. Mate-Pair Libraries• Mate pair libraries for scafolding and structural variation – Input: 5-20 ug of DNA – 3kb, 8kb and 20Kb inserts – Size Select via gel electrophoresis – Adaptors for circularization via Cre recombinase (454) – PCR amplification (20 cycles)
  35. 35. Sequence Capture• Enrichment for specific targets via capture with oligonculeotide baits – Exome Capture • TruSeq Exome 62 Mb • NimbleGen SeqCap EZ Exome Library v2 & v4 • Agilent SureSelect XT/2 All Exon v4 (+UTRS) – Custom Capture • TruSeq Custom Enrichment (700 Kb- 15 Mb) • NimbleGen SeqCap EZ Choice (up to 50 Mb) • Agilent SureSelect XT/2 Custom (up to 34 Mb)
  36. 36. RNA-seq (cDNA libraries)• Shotgun library of cDNA – Isolation of Poly(A) RNA – (100 ng – 4 ug of total RNA) – Chemical Fragmentation of RNA – Random primed cDNA Synthesis & 2nd strand Synthesis – Follows standard “DNA” library protocol
  37. 37. Illumina small RNA• Illumina Small RNA Sample Preparation – Input: 1-10 ug of total RNA • 50-200 ng of small RNA – RNA-adaptor ligation before cDNA synthesis – Small RNA size selection via PAGE • Library fragment ~145-160bp (insert 20-33 nucleotides) – PCR “amplification” (11 cycles)
  38. 38. Sample requirementsDNA – OD260/280 1.8-2.0 RNA – RIN > 8.0gDNA 1 µg (Illumina) 500 ng (454) 5-20 ug (454 Paired-End)Total RNA 100 ng- 4 µg (mRNA-seq) 1-10 ug (small RNA)mRNA 10-100 ng (Illumina) 200 ng (454)small RNA 50-200 ng

×