Introduction to NGS

Introduction to NGS, by Ana Conesa. Massive sequencing data analysis workshop, Granada 2011.

Published in: Technology

  1. Introduction to NGS. Ana Conesa, Head of Genomics of Gene Expression Lab, Centro de Investigaciones Príncipe Felipe. aconesa@cipf.es, http://bioinfo.cipf.es/aconesa
  2. Next Generation Sequencing. NGS has brought high speed not only to genome sequencing and personal medicine, but has also changed the way we do genome research. Got a question on genome organization? SEQUENCE IT!!!!
  3. NGS technologies: cost-effective, fast, ultra-high throughput, cloning-free, short reads.
  4. Roche 454 pyrosequencing
  5. Roche 454 pyrosequencing
  6. Roche 454
  7. GS Junior, benchtop
  8. Solexa
  9. Solexa
  10. Solexa-HiSeq: 200 Gb/run in 8 days; 2x100 bp fragments; 2 billion reads per run.
  11. Helicos
  12. SOLiD
  13. SOLiD: sequencing output in “color space”; needs a reference genome to translate to base space.
  14. SOLiD 5500: fifth 3-based encoded primer; sequencing output in base space; no reference needed.
  15. 5500 xl-u SOLiD: 180 Gb/run (microbeads); 300 Gb/run (nanobeads); 35-75 bp fragments; 2.8 - 4.8 billion reads/run; 2x6 lanes/run; 96 bar-codes; 99.99% accuracy.
  16. Pacific Biosciences: real time DNA synthesis; up to 12000 nt??; 50 bases/second??
  17. Ion Torrent: $50,000; $500/sample; 1 hour/run; > 200 nt lengths; reads H+ released by DNA polymerase.
  18. Comparison. Roche 454: long fragments; errors: poly nts; low throughput; expensive; de novo sequencing, amplicon sequencing. Solexa: short fragments; errors: hexamer bias; high throughput; cheap; resequencing: ChipSeq, RNASeq, MethylSeq. SOLiD: short fragments; color-space; high throughput; cheap; resequencing: ChipSeq, RNASeq, MethylSeq.
  19. Applications: de novo sequencing, resequencing, exome sequencing, RNA-seq, genome annotation, Chip-seq, Methyl-seq, …
  20. Applications: de novo sequencing, resequencing, exome sequencing, RNA-seq, genome annotation, Chip-seq, Methyl-seq, …
  21. Basic steps of NGS data processing: QC and read cleaning.
  22. Basic steps of NGS data processing: QC and read cleaning; mapping.
  23. Basic steps of NGS data processing: QC and read cleaning; mapping; feature identification.
  24. Basic steps of NGS data processing: QC and read cleaning; mapping; feature identification (SNVs, indels, rearrangements, RPKM, splicing, DNA binding sites).
  25. RNA-seq: elucidate gene models; quantify gene expression.
  26. RNA-seq: elucidate gene models.
  27. RNA-seq protocol*: total RNA purification; mRNA preparation (oligodT, RiboZ); fragmentation; 1st strand synthesis; 2nd strand synthesis. Diagram labels: RNA, DNA. *Solexa Pair-End
  28. RNA-seq protocol (II): adenylation of 3’ ends; ligate adapters; amplification; library. Gel labels: 100bp lad, 400-200. SEQUENCING!
  29. Strand-specific RNA-seq
  30. Strand-specific RNA-seq
  31. File formats: fastq (sequence data and qualities); SAM/BAM (mapping data and qualities).
  32. Some figures. How much does it “cost” (computationally) to sequence a human transcriptome? One human transcriptome: 100 million reads. 1 Solexa run = 8 lanes; 25 M reads/lane; 2 x 4 G fastq/lane (PE). 32 G disk space. Mapping on a 12-core processor with 48 GB RAM and 4 TB disk: 24 hours. SAM (ASCII) / BAM (binary) output: 36 G / 9 G.
  33. Applications of RNAseq. Qualitative: alternative splicing; antisense expression; extragenic expression; alternative 5’ and 3’ usage; detection of fusion transcripts; … (Tophat/Cufflinks, Scripture, Alexa). Quantitative: differential expression; dynamic range of gene expression; … (edgeR, DESeq, baySeq, NOISeq).
  34. Advantages of RNAseq? RNAseq vs. microarrays: non-targeted transcript detection (vs. restricted to probes on array); no need of reference genome (vs. needs genome knowledge); strand specificity (vs. normally not strand specific); finds novel splicing sites (vs. exon arrays difficult to use); larger dynamic range (vs. smaller dynamic range); detects expression and SNVs (vs. does not provide sequence info); detects rare transcripts (vs. rare transcripts difficult); … and … are there any disadvantages?????
  35. Resequencing
  36. Exome Sequencing. DNA (patient; diagram shows Gene A, Gene B): (1) produce shotgun library; (2) capture exon sequences; (3) wash & sequence; (4) map against reference genome; (5) determine variants, filter, compare patients; result: candidate genes.
  37. Exome capture
  38. The principle: comparison of patients (Patient 1 to Patient 6). The candidate gene is the one that shares a mutation in all patients.
  39. ChipSeq
  40. MethylSeq
  41. MIDseq
  42. Census NGS methods
  43. Successful Stories
  44. Miller syndrome
  45. Species composition of metagenomic DNA extracted from mammoth hair.
  46. Conclusions: NGS is revolutionizing how we do genome research.
  47. Conclusions: NGS is revolutionizing how we do genome research. But it will also revolutionize our lives….
  48. Conclusions: NGS is revolutionizing how we do genome research. But it will also revolutionize our lives…. If we manage to process and analyze the data.
  49. YOUR SUCCESSFUL STORY??? Have a great MDA course!
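
The figures on slide 32 can be checked with a little arithmetic. The sketch below is not from the slides; it assumes the 32 G disk figure refers to the lanes needed for one 100 million read transcriptome at 25 M reads per lane, with two 4 G fastq files per paired-end lane. The constant and function names are made up for illustration.

```python
import math

# Figures quoted on slide 32 (names are illustrative, not from the slides).
READS_PER_TRANSCRIPTOME = 100_000_000  # one human transcriptome
READS_PER_LANE = 25_000_000            # reads per Solexa lane
FASTQ_FILES_PER_LANE = 2               # paired-end: one fastq file per mate
GB_PER_FASTQ_FILE = 4                  # "2 x 4 G fastq/lane"
SAM_GB, BAM_GB = 36, 9                 # mapping output sizes


def lanes_needed(total_reads: int, reads_per_lane: int) -> int:
    """Number of whole lanes required to reach total_reads."""
    return math.ceil(total_reads / reads_per_lane)


def fastq_disk_gb(lanes: int, files_per_lane: int, gb_per_file: float) -> float:
    """Disk space taken by the raw fastq files, in GB."""
    return lanes * files_per_lane * gb_per_file


lanes = lanes_needed(READS_PER_TRANSCRIPTOME, READS_PER_LANE)
disk_gb = fastq_disk_gb(lanes, FASTQ_FILES_PER_LANE, GB_PER_FASTQ_FILE)
sam_to_bam_ratio = SAM_GB / BAM_GB

print(f"lanes needed: {lanes}")
print(f"fastq disk space: {disk_gb} GB")
print(f"SAM is {sam_to_bam_ratio:.0f}x larger than BAM")
```

Under this reading, one transcriptome takes half of an 8-lane run, and the binary BAM output is a quarter of the size of the ASCII SAM output.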
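
The patient comparison on slide 38 amounts to a set intersection: a candidate gene is one that carries a mutation in every patient. The sketch below is only an illustration of that idea. The function name is hypothetical, the gene names are placeholders rather than real data, and a real analysis would first filter variants as outlined on slide 36.

```python
from typing import Dict, Set


def shared_candidate_genes(mutated_genes_by_patient: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes that are mutated in every patient."""
    gene_sets = list(mutated_genes_by_patient.values())
    if not gene_sets:
        return set()
    # Keep only the genes present in all per-patient sets.
    return set.intersection(*gene_sets)


# Placeholder data: gene names are invented for illustration only.
patients = {
    "Patient 1": {"GENE_A", "GENE_B", "GENE_C"},
    "Patient 2": {"GENE_B", "GENE_D"},
    "Patient 3": {"GENE_B", "GENE_C", "GENE_E"},
}

candidates = shared_candidate_genes(patients)
print(sorted(candidates))
```

Adding more patients can only shrink the candidate set, which is why comparing several patients narrows the search.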
