ICAR Soybean Indore 2014

Presented on June 7 at
Directorate of Soybean Research
Khandwa Road, Indore, M.P, India - 452 001

http://www.nrcsoya.nic.in/default.htm

Published in: Science, Technology, Business

  1. Surya Saha, Cornell University & Boyce Thompson Institute
     suryasaha@cornell.edu, @SahaSurya
     Directorate of Soybean Research, Indore, June 7, 2014
     Slides: http://bit.ly/Soybean_Indore_2014
     http://www.acgt.me/blog/2014/3/7/next-generation-sequencing-must-die
  2. 6/6/2014 Directorate of Soybean Research, Indore
     You are free to: copy, share, adapt, or re-mix; photograph, film, or broadcast; blog, live-blog, or post video of this presentation.
     Provided that: you attribute the work to its author and respect the rights and licenses associated with its components.
     Slide concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only: ccZero.
     Social media icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
  3. Sequencing
  4. Sequencing over the Ages (slide credit: Aureliano Bombarely)
     1953: DNA structure discovery
     1977: Sanger DNA sequencing by chain-terminating inhibitors
     1984: Epstein-Barr virus (170 Kb)
     1987: ABI 370 sequencer
     1995: Haemophilus influenzae (1.83 Mb)
     2001: Homo sapiens (3.0 Gb)
     2005 to 2007: 454, Solexa, SOLiD
     2011: Ion Torrent, PacBio
     2012 to 2014, "The Next Generation": Illumina HiSeq X, Pinus taeda (24 Gb), MinION
  5. It's all about the $£€¥ http://www.genome.gov/sequencingcosts/
  6. First generation sequencing
  7. Sanger method: Frederick Sanger (13 Aug 1918 – 19 Nov 2013) won the Nobel Prize in Chemistry in 1958 and 1980. He published the dideoxy chain termination method, or "Sanger method", in 1977. http://dailym.ai/1f1XeTB
  8. Sanger method http://bit.ly/1g6Cudq http://bit.ly/1lcQO4J
  9. First generation sequencing
     • Very high quality sequences (99.999%)
     • Very low throughput
     Capillary Sequencing (ABI3730xl): run time 20 min-3 h; read length 400-900 bp; reads/run 96 or 386; total nucleotides sequenced 1.9-84 Kb; cost/MB $2400
     http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd
  10. Next generation sequencing
  11. http://bit.ly/1keDtZQ
      • Second generation
      • Third generation
      • Fourth generation
      • Next-next-generation
      • Next-next-next generation
      http://www.acgt.me/blog/2014/3/10/next-generation-sequencing-must-diepart-2
  12. Use the specific technology used to generate the data
      – Illumina Hiseq/Miseq/NextSeq
      – Pacific Biosciences RS I/RS II
      – Ion Torrent Proton/PGM
      – SOLiD
      – 454
      http://www.acgt.me/blog/2014/3/10/next-generation-sequencing-must-diepart-2
  13. 454 Pyrosequencing: one purified DNA fragment, to one bead, to one read. http://bit.ly/1ehwxWN GS FLX Titanium http://bit.ly/1ehAcEh
  14. Illumina (four instruments, one value per instrument in each row; source: Illumina)
      Output: 15 Gb; 120 Gb; 1000 Gb; 1800 Gb
      Number of reads: 25 million; 400 million; 4 billion; 6 billion
      Read length: 2x300 bp; 2x150 bp; 2x125 bp (2x250 update mid-2014); 2x150 bp
      Cost: $99K; $250K; $740K; $10M
  15. Illumina (table repeated from slide 14): $1000 human genome??
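
The Illumina figures on slides 14 and 15 can be cross-checked with simple arithmetic: total output should be roughly the number of reads times the bases sequenced per read pair. The sketch below assumes that "Number of Reads" counts read pairs and that the output column is in gigabases (10^9 bases); neither is stated on the slide, and the function name is made up for illustration.

```python
# Hypothetical sanity check: output (Gb) ~= read pairs x 2 x read length.
# Assumes "Number of Reads" means read pairs and 1 Gb = 1e9 bases.

def expected_output_gb(read_pairs: float, read_length_bp: int) -> float:
    """Return the expected yield in gigabases for a paired-end run."""
    return read_pairs * 2 * read_length_bp / 1e9

# (read pairs, read length per end in bp, output quoted on the slide in Gb)
slide_columns = [
    (25e6, 300, 15),
    (400e6, 150, 120),
    (4e9, 125, 1000),
    (6e9, 150, 1800),
]

for pairs, length, quoted in slide_columns:
    estimate = expected_output_gb(pairs, length)
    print(f"{pairs:.0e} pairs x 2x{length} bp -> {estimate:.0f} Gb "
          f"(slide: {quoted} Gb)")
```

Under these assumptions all four columns of the table are internally consistent.
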
  16. Illumina http://1.usa.gov/1fP9ybl
  17. Illumina: Moleculo http://bit.ly/1aEPOBn
  18. Pacific Biosciences SMRT sequencing: Single Molecule Real Time sequencing http://bit.ly/1naxgTe
  19. Pacific Biosciences SMRT sequencing: error correction methods
      • Hierarchical genome-assembly process (HGAP)
      • PBJelly (English et al., PLOS ONE, 2012)
  20. Pacific Biosciences SMRT sequencing: read lengths http://www.igs.umaryland.edu/labs/grc/
      Mean read length: 8391 bp
      Maximum subread length: 24585 bp
  21. Oxford Nanopore https://www.nanoporetech.com/
      • No data yet
      • Error model http://erlichya.tumblr.com/post/66376172948/hands-on-experience-with-oxford-nanopore-minion
  22. Others
      • Ion Torrent Proton/PGM
      • Nabsys
      • SOLiD
  23. Comparison
  24. Next generation sequencing
      Columns: run time; read length; quality; total nucleotides sequenced; cost/MB
      • 454 Pyrosequencing: 24 h; 700 bp; Q20-Q30; 0.7 GB; $10
      • Illumina Miseq: 27 h; 2x250 bp; >Q30; 15 GB; $0.15
      • Illumina Hiseq 2500: 11 days; 2x125 bp; >Q30; 1000 GB; $0.05
      • Ion Torrent: 2 h; 400 bp; >Q20; 50 MB-1 GB; $1
      • Pacific Biosciences: 2 h; 5.5-8.5 kb; >Q30 consensus, >Q10 single; 400-800 MB/SMRT cell; $0.33-$1
      http://bit.ly/1clLps3 http://1.usa.gov/1cLqIRd
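
The cost/MB column in this comparison can be turned into a rough per-genome estimate. The sketch below reads "MB" as megabases and uses the 3.0 Gb Homo sapiens genome size quoted in the timeline slide; the 30x coverage depth and the function name are assumptions for illustration only, and the result covers sequencing alone.

```python
# Rough sequencing-cost estimate from the cost/MB figures on the slide.
# The 30x coverage depth is an illustrative assumption, not from the talk.

COST_PER_MB_USD = {
    "454 Pyrosequencing": 10.0,
    "Illumina Miseq": 0.15,
    "Illumina Hiseq 2500": 0.05,
    "Ion Torrent": 1.0,
}

def sequencing_cost_usd(genome_size_mb: float, coverage: float,
                        cost_per_mb: float) -> float:
    """Cost of sequencing a genome to the given average coverage."""
    return genome_size_mb * coverage * cost_per_mb

HUMAN_GENOME_MB = 3000  # 3.0 Gb, as quoted in the timeline slide

for platform, cost in COST_PER_MB_USD.items():
    total = sequencing_cost_usd(HUMAN_GENOME_MB, 30, cost)
    print(f"{platform}: ${total:,.0f}")
```
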
  25. Next Generation Genomics: World Map of High-throughput Sequencers http://omicsmaps.com/
  26. http://bit.ly/18pfUId
  27. Real cost of Sequencing!! Sboner, Genome Biology, 2011
  28. Library Types (slide credit: Aureliano Bombarely)
      • Single end
      • Paired end (PE, 150-800 bp; Fwd: /1, Rev: /2)
      • Mate pair (MP, 2 Kb to 20 Kb)
      [Figure: forward (F) and reverse (R) read orientations for each library type, with platform labels 454/Roche and Illumina]
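
The /1 and /2 suffixes mentioned for paired-end reads are what downstream tools use to match the two mates of a fragment. A minimal sketch of that matching, with made-up read names and a hypothetical helper function:

```python
# Sketch: group paired-end read names that use the /1 (forward) and
# /2 (reverse) suffix convention. The read names below are made up.

from collections import defaultdict

def pair_reads(read_names):
    """Map each fragment name to its (forward, reverse) read names."""
    mates = defaultdict(dict)
    for name in read_names:
        base, sep, suffix = name.rpartition("/")
        if sep and suffix in ("1", "2"):
            mates[base][suffix] = name
    # Keep only fragments where both mates are present.
    return {base: (m["1"], m["2"])
            for base, m in mates.items() if len(m) == 2}

names = ["frag_a/1", "frag_b/1", "frag_a/2", "frag_c/2"]
print(pair_reads(names))
```

Reads whose mate is missing (here `frag_b/1` and `frag_c/2`) are dropped, which mirrors how orphaned reads are usually set aside before paired-end analysis.
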
  29. Implications of Choice of Library (slide credit: Aureliano Bombarely)
      • Reads are assembled into a consensus sequence (contig)
      • Contigs plus paired-read information give a scaffold (or supercontig), with gaps shown as NNNNN
      • Scaffolds plus genetic information (markers) give a pseudomolecule (or ultracontig)
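
The relationship between scaffolds and contigs can be illustrated in code: a scaffold is a set of contigs joined by runs of N that stand in for gaps of estimated size. The sketch below uses a made-up sequence and hypothetical helper names.

```python
# Sketch: recover contigs and gap sizes from a scaffold sequence.
# Runs of N mark gaps between contigs; the sequence below is made up.

import re

def split_scaffold(scaffold: str, min_gap: int = 1):
    """Split a scaffold into contigs at runs of at least min_gap Ns."""
    pattern = re.compile(r"[Nn]{%d,}" % min_gap)
    return [contig for contig in pattern.split(scaffold) if contig]

def gap_lengths(scaffold: str):
    """Return the length of every run of N in the scaffold."""
    return [len(match.group()) for match in re.finditer(r"[Nn]+", scaffold)]

scaffold = "ACGTACGTNNNNNGGCCTTAANNTTGA"
print(split_scaffold(scaffold))
print(gap_lengths(scaffold))
```
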
  30. Quality control: Encoding http://bit.ly/N28yUd
      Phred score of a base: Q_phred = -10 log10(e), where e is the estimated probability of the base being incorrect
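
The Phred formula is easy to apply directly. The sketch below converts between error probability and quality score and decodes an ASCII quality string; the offset of 33 is an assumption (it is one common encoding, while some older files use 64), and the function names are made up for illustration.

```python
# Sketch of the Phred relationship Q = -10 * log10(e).
# The ASCII offset of 33 is an assumption; some older files use 64.

import math

def phred_quality(error_prob: float) -> float:
    """Phred score for a given probability that the base is wrong."""
    return -10 * math.log10(error_prob)

def error_probability(quality: float) -> float:
    """Invert the Phred formula: e = 10^(-Q/10)."""
    return 10 ** (-quality / 10)

def decode_quality_string(qual: str, offset: int = 33):
    """Decode an ASCII-encoded quality string into Phred scores."""
    return [ord(ch) - offset for ch in qual]

print(phred_quality(0.001))          # 1 error in 1000 bases
print(error_probability(20))         # chance of error at Q20
print(decode_quality_string("I5+"))  # three bases of decreasing quality
```

Q20 therefore corresponds to a 1% chance of a wrong base call and Q30 to 0.1%, which is how the Q20 and Q30 thresholds in the platform comparison should be read.
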
  31. Which technology to use?? http://bit.ly/1ko9Kgh
      • Microbial genomes
      • Eukaryotic genomes
      • Resequencing genomes
      • RNAseq and other XXXseq methods
  32. SOL Genomics Network
  34. The SGN Team!! Surya Saha, Tom Fisher-York, Hartmut Foerster, Suzy Strickler, Jeremy Edwards, Noe Fernandez, Naama Menda, Aure Bombarely, Aimin Yan, Isaak Tecle
  35. What's new on SGN?
      • Tomato genome release 2.5
        – Incorporates results from FISH
      • Nicotiana benthamiana genome sequence
        – Genome sequence and annotation
      • VIGS Tool
        – Select specific probes for VIGS
      • New BLAST interface
      • New Breeder functions
      • Later this year: Tomato genome release 3.0
  36. SGN Website http://solgenomics.net
  37. Main web page (front page): WEB ICONS TOOL BAR
  38. Main web page (front page): TOOL BAR (MENUS)
  39. But the DATA can also be edited: Community Data Curation [Figure labels: Locus, Locus Editor, Data]
  40. You need
      • An SGN account
      • Submitter / Locus Editor privileges, activated by an SGN curator
  41. Tools
  42. Genome Browser
  43. Genomes in SGN
  45. CassavaBase
  46. Cassava
      ● Tropical and subtropical regions
      ● Mainly grown for starchy roots
      ● Native to South America
      ● Major crop in Africa
      ● Food for 500 million people around the world
      ● Clonally propagated
      ● Accumulates toxic cyanogenic glucosides
      ● Requires processing before consumption
  47. NextGen Cassava Project
      ● Project: Adapt SGN database for Cassava Breeding
      ● Goal: Apply Genomic Selection to cassava breeding
      ● Predict breeding values from genotype information
      ● Shorten the breeding cycle
      ● Massive amounts of genotypic data (GBS)
      ● Phenotypic data
      ● Data management challenge
      ● Improve flowering
      ● http://nextgencassava.org
  48. CassavaBase http://cassavabase.org/
  49. SGN/Cassavabase behind the scenes
      ● Perl/Catalyst MVC Framework
      ● PostgreSQL Database
      ● Generic Model Organism Database (GMOD)
        – Chado relational database schema
        – GBrowse
        – JBrowse
      ● R
        – Experimental design
        – QTL mapping
        – Genomic selection
  50. Objectives: Provide cassava breeders and researchers access to data and tools in a centralized, user-friendly and reliable database.
      – Improve partner breeding program information tracking
      – Streamline management of genotypic and phenotypic data
      – Pipeline genotypic and phenotypic data through Genomic Selection prediction analyses
  51. Genomic Selection (slide credit: Jeremy Edwards)
      ● The 'training population' is genotyped and phenotyped to 'train' the genomic selection (GS) prediction model. Genotypic information from the breeding material is then fed into the model to calculate genomic estimated breeding values (GEBV) for these lines. From Heffner et al. 2009 Crop Sci. 49:1–12
      ● Information from a majority of lines in the breeding population (the training set) is used to create the prediction model. The model is then used to predict the phenotypes of the remaining lines (the validation set), using genotypic information only. The results from the model are compared to the actual data to give the prediction accuracy. Image courtesy of Martha Hamblin, Cornell University
      ● Flow diagram of a genomic selection breeding program. Breeding cycle time is shortened by removing phenotypic evaluation of lines before selection as parents for the next cycle. From Heffner et al. 2009 Crop Sci. 49:1–12
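
The training and validation workflow described on this slide can be sketched end to end on simulated data. The slide does not say which prediction model is used; ridge regression on marker genotypes is assumed here as one common choice. The marker effects, population sizes, noise level, and all function names are invented for illustration, and this is not the SolGS implementation.

```python
# Sketch of a genomic selection workflow on simulated data.
# Assumptions (not from the talk): ridge regression as the prediction
# model, genotypes coded -1/0/1, and made-up marker effects.

import random

def ridge_marker_effects(genotypes, phenotypes, lam=1.0):
    """Solve (X'X + lam*I) b = X'y by Gaussian elimination."""
    n = len(genotypes[0])
    aug = []  # augmented matrix [X'X + lam*I | X'y]
    for j in range(n):
        row = [sum(g[j] * g[k] for g in genotypes) + (lam if j == k else 0.0)
               for k in range(n)]
        row.append(sum(g[j] * y for g, y in zip(genotypes, phenotypes)))
        aug.append(row)
    # The matrix is symmetric positive definite, so no pivoting is needed.
    for i in range(n):
        for r in range(i + 1, n):
            factor = aug[r][i] / aug[i][i]
            for c in range(i, n + 1):
                aug[r][c] -= factor * aug[i][c]
    effects = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][k] * effects[k] for k in range(i + 1, n))
        effects[i] = (aug[i][n] - rest) / aug[i][i]
    return effects

def predict_gebv(genotypes, effects):
    """Genomic estimated breeding value = sum of marker effects."""
    return [sum(g * b for g, b in zip(line, effects)) for line in genotypes]

def pearson(xs, ys):
    """Pearson correlation, used here as the prediction accuracy."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

random.seed(2014)
TRUE_EFFECTS = [1.0, -0.5, 0.8, 0.0, 0.3, -1.2, 0.6, 0.0, -0.4, 0.9]
lines = [[random.choice((-1, 0, 1)) for _ in TRUE_EFFECTS] for _ in range(60)]
phenos = [sum(g * b for g, b in zip(line, TRUE_EFFECTS)) + random.gauss(0, 0.5)
          for line in lines]

# Training set: genotyped and phenotyped. Validation set: genotypes only.
train_x, valid_x = lines[:45], lines[45:]
train_y, valid_y = phenos[:45], phenos[45:]

mean_y = sum(train_y) / len(train_y)
effects = ridge_marker_effects(train_x, [y - mean_y for y in train_y])
gebvs = predict_gebv(valid_x, effects)
accuracy = pearson(gebvs, valid_y)
print(f"Prediction accuracy (r) on the validation set: {accuracy:.2f}")
```

The held-out phenotypes are used only at the last step, to score the predictions, which is the comparison the slide calls prediction accuracy.
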
  52. Data collection in the field (slide credit: Jeremy Edwards)
      ● Android tablets
      ● Field book app
        – Jesse Poland's group at USDA-ARS / Kansas State University
  53. Genotyping by sequencing (GBS) (slide credit: Jeremy Edwards)
      ● Tassel 4 pipeline from Ed Buckler's group
      ● Discovery vs production
      ● Filtering
      ● Imputation
      ● Storing in Cassavabase
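
The filtering and imputation steps listed for GBS can be illustrated with a toy genotype matrix. This is a sketch only: the thresholds, the mean-imputation rule, the marker names, and the function names are assumptions for illustration and do not describe what the Tassel pipeline actually does.

```python
# Sketch of SNP filtering and imputation on a toy genotype matrix.
# Calls are alternate-allele counts (0, 1, 2); None marks a missing call.
# Thresholds and the mean-imputation rule are illustrative assumptions.

def missing_rate(calls):
    """Fraction of lines with no genotype call at this marker."""
    return sum(c is None for c in calls) / len(calls)

def minor_allele_freq(calls):
    """Minor allele frequency among the observed calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq)

def filter_markers(matrix, max_missing=0.2, min_maf=0.05):
    """Keep markers with few missing calls and enough variation."""
    return {name: calls for name, calls in matrix.items()
            if missing_rate(calls) <= max_missing
            and minor_allele_freq(calls) >= min_maf}

def impute_mean(calls):
    """Replace missing calls with the marker's mean observed value."""
    observed = [c for c in calls if c is not None]
    mean = sum(observed) / len(observed)
    return [mean if c is None else c for c in calls]

markers = {
    "snp1": [0, 1, 2, None, 1],     # kept: 20% missing, MAF 0.5
    "snp2": [0, None, None, 0, 1],  # dropped: 40% missing
    "snp3": [0, 0, 0, 0, 0],        # dropped: monomorphic
}

kept = filter_markers(markers)
imputed = {name: impute_mean(calls) for name, calls in kept.items()}
print(imputed)
```
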
  54. Genotyping by sequencing (GBS)
  55. SolGS: A tool for genomic selection (slide credit: Jeremy Edwards)
      [Figure labels: Phenotyped & Genotyped Lines; Prediction Model; Genotyped Lines; Predicted Breeding Values]
  56. Cassava Trait Ontology (Kulakow et al. 2011; slide credit: Jeremy Edwards)
      ● Standard terminology
      ● Facilitate the sharing of information
      ● Allow users to query keywords related to traits
  57. Position available at Solgenomics, Cassavabase project: Plant Breeding + Bioinformatician
      ● Familiar with breeding
      ● Programming in Perl, R, SQL, Hadoop
      ● Linux
      ● Africa
      ● Genius
      http://www.cassavabase.org/forum/posts.pl?topic_id=9
  58. Thank you!! Questions??