Sequencing: The Next Generation

Talk given at IIT Indore, India on May 29, 2014

Published in: Technology, Health & Medicine

  1. Surya Saha, Cornell University & Boyce Thompson Institute. suryasaha@cornell.edu // Twitter: @SahaSurya. IIT Indore, May 29, 2014. Slides: http://bit.ly/IITIndoreSeq http://www.acgt.me/blog/2014/3/7/next-generation-sequencing-must-die
  2. 5/29/2014 IIT Indore 2 You are free to: copy, share, adapt, or re-mix; photograph, film, or broadcast; blog, live-blog, or post video of this presentation. Provided that: you attribute the work to its author and respect the rights and licenses associated with its components. Slide concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only: ccZero. Social media icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
  3. Sequencing over the Ages • 1953: DNA structure discovery • 1977: Sanger DNA sequencing by chain-terminating inhibitors • 1984: Epstein-Barr virus (170 kb) • 1987: ABI 370 sequencer • 1995: Haemophilus influenzae (1.83 Mb) • 2001: Homo sapiens (3.0 Gb) • 2005-2007: 454, Solexa, SOLiD • 2011: Ion Torrent, PacBio • 2014: Pinus taeda (24 Gb), MinION • Also on the timeline (2012, 2013; placement not recoverable): Illumina, Illumina HiSeq X, 454. Slide credit: Aureliano Bombarely
  4. It's all about the $£€¥ http://www.genome.gov/sequencingcosts/
  5. First generation sequencing
  6. Sanger method: Frederick Sanger (13 Aug 1918 – 19 Nov 2013) won the Nobel Prize in Chemistry in 1958 and 1980, and published the dideoxy chain termination method, or “Sanger method”, in 1977. http://dailym.ai/1f1XeTB
  7. Sanger method http://bit.ly/1g6Cudq http://bit.ly/1lcQO4J
  8. Maxam-Gilbert method
  9. Maxam-Gilbert method http://bit.ly/1noY0fu http://bit.ly/1lGvJCA
  10. First generation sequencing • Very high quality sequences (99.999%) • Very low throughput • Capillary sequencing (ABI 3730xl): run time 20 min to 3 h; read length 400-900 bp; reads per run 96 or 386; total nucleotides sequenced 1.9-84 kb; cost per Mb $2400. http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd
  11. Next generation sequencing
  12. http://bit.ly/1keDtZQ • Second generation • Third generation • Fourth generation • Next-next-generation • Next-next-next generation http://www.acgt.me/blog/2014/3/10/next-generation-sequencing-must-diepart-2
  13. Name the specific technology used to generate the data – Illumina HiSeq/MiSeq/NextSeq – Pacific Biosciences RS I/RS II – Ion Torrent Proton/PGM – SOLiD – 454 http://www.acgt.me/blog/2014/3/10/next-generation-sequencing-must-diepart-2
  14. 454 Pyrosequencing (GS FLX Titanium): one purified DNA fragment, to one bead, to one read. http://bit.ly/1ehwxWN http://bit.ly/1ehAcEh
  15. Illumina: four instruments compared, in slide order • Output: 15 Gb / 120 Gb / 1000 Gb / 1800 Gb • Number of reads: 25 million / 400 million / 4 billion / 6 billion • Read length: 2x300 bp / 2x150 bp / 2x125 bp (2x250 update mid-2014) / 2x150 bp • Cost: $99K / $250K / $740K / $10M. Source: Illumina
  16. Illumina: the same four-instrument comparison as slide 15, now asking: $1000 human genome??
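As a rough check on the "$1000 human genome??" question, the per-run outputs on the Illumina slide can be turned into fold coverage of a 3.0 Gb human genome (the size given on the timeline slide). This is only an illustrative sketch: the 30x target depth is an assumption (a commonly quoted figure, not one from the talk), and the function names are hypothetical.

```python
HUMAN_GENOME_GB = 3.0  # Homo sapiens genome size from the timeline slide
TARGET_COVERAGE = 30   # ASSUMPTION: commonly quoted depth, not from the talk

# Per-run outputs in Gb, in the order listed on the Illumina slide.
OUTPUTS_GB = [15, 120, 1000, 1800]


def fold_coverage(output_gb, genome_gb=HUMAN_GENOME_GB):
    """Average depth if one run's output were spread evenly over the genome."""
    return output_gb / genome_gb


def genomes_per_run(output_gb, coverage=TARGET_COVERAGE, genome_gb=HUMAN_GENOME_GB):
    """Whole genomes one run could cover at the target depth, rounded down."""
    return int(output_gb // (coverage * genome_gb))


for out in OUTPUTS_GB:
    print(f"{out:>5} Gb/run: {fold_coverage(out):6.1f}x coverage, "
          f"{genomes_per_run(out)} genome(s) at {TARGET_COVERAGE}x")
```

These are upper bounds on paper; usable yield is typically lower once duplicate and low-quality reads are filtered out.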
  17. Illumina http://1.usa.gov/1fP9ybl
  18. Illumina: Moleculo http://bit.ly/1aEPOBn
  19. Pacific Biosciences SMRT sequencing: Single Molecule Real Time sequencing http://bit.ly/1naxgTe
  20. Pacific Biosciences SMRT sequencing: error correction methods • Hierarchical genome-assembly process (HGAP) • PBJelly (English et al., PLoS ONE, 2012)
  21. Pacific Biosciences SMRT sequencing: read lengths • Mean read length: 8391 bp • Maximum subread length: 24585 bp http://www.igs.umaryland.edu/labs/grc/
  22. Oxford Nanopore https://www.nanoporetech.com/ • No data yet • Error model http://erlichya.tumblr.com/post/66376172948/hands-on-experience-with-oxford-nanopore-minion
  23. Others • Ion Torrent Proton/PGM • Nabsys • SOLiD
  24. Comparison
  25. Next generation sequencing (run time; read length; quality; total nucleotides sequenced; cost per Mb) • 454 Pyrosequencing: 24 h; 700 bp; Q20-Q30; 0.7 Gb; $10 • Illumina MiSeq: 27 h; 2x250 bp; >Q30; 15 Gb; $0.15 • Illumina HiSeq 2500: 11 days; 2x125 bp; >Q30; 1000 Gb; $0.05 • Ion Torrent: 2 h; 400 bp; >Q20; 50 Mb-1 Gb; $1 • Pacific Biosciences: 2 h; 5.5-8.5 kb; >Q30 consensus, >Q10 single; 400-800 Mb per SMRT cell; $0.33-$1. http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd
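The cost-per-Mb column on the comparison slide can be turned into a rough per-project estimate: genome size in Mb, times fold coverage, times cost per Mb. The sketch below uses the prices as transcribed from the slide (taking the low end of the Pacific Biosciences range). The 5 Mb microbial genome and the 30x depth in the usage lines are assumptions for illustration, not figures from the talk, and the function names are hypothetical.

```python
# Cost per Mb (USD) as listed on the comparison slide.
COST_PER_MB = {
    "454 Pyrosequencing": 10.00,
    "Illumina MiSeq": 0.15,
    "Illumina HiSeq 2500": 0.05,
    "Ion Torrent": 1.00,
    "Pacific Biosciences": 0.33,  # low end of the listed $0.33-$1 range
}


def sequencing_cost(genome_mb, coverage, cost_per_mb):
    """Raw sequencing cost in USD: bases to sequence (in Mb) times price per Mb."""
    return genome_mb * coverage * cost_per_mb


def estimate_all(genome_mb, coverage):
    """Cost per platform, rounded to cents, for one genome at the given depth."""
    return {
        platform: round(sequencing_cost(genome_mb, coverage, price), 2)
        for platform, price in COST_PER_MB.items()
    }


# ASSUMPTION: a hypothetical 5 Mb microbial genome sequenced to 30x.
for platform, cost in estimate_all(genome_mb=5, coverage=30).items():
    print(f"{platform:<22} ${cost:>10,.2f}")
```

The estimate covers sequencing reagents only. Library preparation, storage, and analysis are left out, which is the point of the "Real cost of Sequencing!!" slide later in the deck.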
  26. Next Generation Genomics: World Map of High-throughput Sequencers http://omicsmaps.com/
  27. http://bit.ly/18pfUId
  28. http://bit.ly/18pfUId
  29. Real cost of Sequencing!! Sboner et al., Genome Biology, 2011
  30. Library Types • Single end • Paired end (PE, 150-800 bp, Fwd: /1, Rev: /2) • Mate pair (MP, 2 kb to 20 kb) • Diagram labels: read orientations (F, R) for 454/Roche and Illumina libraries. Slide credit: Aureliano Bombarely
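The "Fwd: /1, Rev: /2" note on the library types slide refers to read-name suffixes that mark the two mates of a pair. A minimal sketch of grouping reads by that suffix follows. The read names are made up for illustration and the helper names are hypothetical. Newer Illumina headers encode the mate number differently, so this applies only to names that end in /1 or /2.

```python
from collections import defaultdict


def split_mate(read_name):
    """Return (template_name, mate) for names ending in /1 or /2, else (name, None)."""
    base, sep, mate = read_name.rpartition("/")
    if sep and mate in ("1", "2"):
        return base, int(mate)
    return read_name, None


def pair_reads(read_names):
    """Group read names into complete (forward, reverse) pairs and leftover singletons."""
    by_template = defaultdict(dict)
    singles = []
    for name in read_names:
        base, mate = split_mate(name)
        if mate is None:
            singles.append(name)  # no recognizable mate suffix
        else:
            by_template[base][mate] = name
    pairs = []
    for mates in by_template.values():
        if 1 in mates and 2 in mates:
            pairs.append((mates[1], mates[2]))
        else:
            singles.extend(mates.values())  # mate is missing from the input
    return pairs, singles


# Made-up read names: one complete pair, one orphaned mate, one single-end read.
pairs, singles = pair_reads(["readA/1", "readB/1", "readA/2", "readC"])
print("pairs:", pairs)
print("singles:", singles)
```

Keeping orphaned mates separate matters because most paired-end tools expect the forward and reverse inputs to list the same templates in the same order.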
  31. Implications of Choice of Library • Reads assemble into a consensus sequence (contig) • Contigs plus paired-read information give a scaffold (or supercontig), with gaps shown as NNNNN • Scaffolds plus genetic information (markers) give a pseudomolecule (or ultracontig). Slide credit: Aureliano Bombarely
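The contig-to-scaffold step on the slide above can be illustrated with a toy example: contigs whose order is known from paired-read links are joined, and each gap of estimated size is filled with N characters. This is a sketch of the representation only, not of how any real scaffolder orders or orients contigs. The sequences, gap sizes, and function name are invented for illustration.

```python
def build_scaffold(contigs, gap_sizes):
    """Join ordered contigs into one scaffold, padding each gap with N's.

    contigs   -- contig sequences, already ordered and oriented
    gap_sizes -- estimated gap length between each adjacent pair of contigs
    """
    if not contigs:
        raise ValueError("at least one contig is required")
    if len(gap_sizes) != len(contigs) - 1:
        raise ValueError("need exactly one gap size per adjacent contig pair")
    parts = [contigs[0]]
    for gap, contig in zip(gap_sizes, contigs[1:]):
        parts.append("N" * gap)  # unknown bases spanning the gap
        parts.append(contig)
    return "".join(parts)


# Toy input: three short contigs with estimated gaps of 5 and 2 bases.
print(build_scaffold(["ACGT", "GGCC", "TTAA"], [5, 2]))
```

In practice the gap estimate comes from the library insert size, which is why longer mate-pair libraries can bridge larger gaps than paired-end libraries.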
  32. Quality control: Encoding http://bit.ly/N28yUd The Phred score of a base is Q_phred = -10 log10(e), where e is the estimated probability of the base being incorrect.
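To make the Phred formula concrete, here is a small sketch that converts between error probability and Phred score and decodes a FASTQ quality string. It assumes the Phred+33 (Sanger) ASCII offset by default. Some older Illumina pipelines used an offset of 64, so the offset is left as a parameter. The function names and the example quality string are illustrative.

```python
import math


def phred_from_error(e):
    """Phred score for an estimated per-base error probability e (0 < e <= 1)."""
    return -10 * math.log10(e)


def error_from_phred(q):
    """Inverse of the Phred formula: error probability for score q."""
    return 10 ** (-q / 10)


def decode_quality(qual, offset=33):
    """Decode a FASTQ quality string into Phred scores.

    ASSUMPTION: Phred+33 encoding unless another offset is passed in.
    """
    return [ord(ch) - offset for ch in qual]


for q in (10, 20, 30):
    print(f"Q{q}: error probability {error_from_phred(q):g}")

print(decode_quality("I5!"))
```

By the same formula, the 99.999% accuracy quoted for first generation sequencing on slide 10 corresponds to roughly Q50.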
  33. Which technology to use?? • Microbial genomes • Eukaryotic genomes • Resequencing genomes • RNAseq and other XXXseq methods http://bit.ly/1ko9Kgh
  34. Looking into the Crystal ball • Desktop sequencing • Diagnostics in the clinic • Large scale environmental sequencing of microbes • But challenges remain..
  35. • International Society for Computational Biology (ISCB) • ISCB SC RSG India • > 1500 members • Contact – rsg-india@googlegroups.com – http://www.iscbsc.org/rsg/rsg-india – https://groups.google.com/forum/#!forum/compbio_discussion
  36. • Collaborate with student organizations • Organize workshops and journal clubs • Attend international meetings
  37. Position available at Solgenomics, Cassavabase project: Plant Breeding + Bioinformatician ● Familiar with breeding ● Programming in Perl, R, SQL, Hadoop ● Linux ● Africa ● Genius http://www.cassavabase.org/forum/posts.pl?topic_id=9
  38. Thank you!! Questions?? 5/29/2014 BTI Plant Bioinformatics Course 2014 38
