An introduction to second-generation sequencing is given, with a focus on the basics of production informatics: the approach to raw-data conversion and quality control is discussed.
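As a concrete illustration of the quality-control step, here is a minimal sketch, assuming standard four-line FASTQ records with Phred+33 quality encoding; the filename reads.fastq is a placeholder:

```python
# Minimal FASTQ quality-control sketch: read length and mean Phred score
# per record. Assumes Phred+33 encoding; "reads.fastq" is a placeholder.

def mean_phred(quality_line):
    """Mean Phred score of an ASCII (Phred+33) quality string."""
    return sum(ord(c) - 33 for c in quality_line) / len(quality_line)

with open("reads.fastq") as fastq:
    while True:
        header = fastq.readline().rstrip()
        if not header:
            break                       # end of file
        seq = fastq.readline().rstrip()
        fastq.readline()                # '+' separator line
        qual = fastq.readline().rstrip()
        print(header, len(seq), f"{mean_phred(qual):.1f}")
```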
The document discusses Sanger sequencing, a method of DNA sequencing. It provides a brief history of DNA sequencing, noting that Sanger developed an enzymatic DNA sequencing technique in 1977. The document then describes the key steps of Sanger sequencing, including separating the DNA strands, copying one strand with chemically altered bases that cause termination, and analyzing the fragments to reveal the DNA sequence. It also compares Sanger sequencing to the Maxam-Gilbert chemical degradation method.
Whole genome shotgun sequencing involves randomly breaking genomic DNA into small fragments, sequencing the fragments, and then reassembling the sequences using overlapping regions. The document outlines the history and procedure of shotgun sequencing. Genomic DNA is first fragmented, end-repaired, and size-selected into small, medium, and large fragments. Libraries are created for each size fragment and sequenced. A base caller filters poor calls and an assembler finds overlaps to generate continuous nucleotide sequences or contigs of the whole genome.
Nanopore DNA sequencing is a fourth generation sequencing technique that involves passing single strands of DNA through a nanopore and detecting changes in electrical current caused by each nucleotide base. There are two main types of nanopores - biological nanopores which are protein channels inserted into membranes, and solid-state nanopores fabricated in thin materials like silicon nitride or graphene. Some examples of biological nanopores used for sequencing are the alpha-hemolysin pore and the MspA pore. Nanopore sequencing has advantages over other techniques in being label-free, capable of very long reads, and requiring low sample amounts. However, challenges remain in slowing DNA translocation for higher resolution and reducing noise in the electrical signals.
This document discusses the history and various methods of DNA sequencing. It begins with a brief overview of DNA sequencing and its uses. It then outlines some of the major developments in DNA sequencing techniques, including the earliest RNA sequencing in 1972, Sanger sequencing in 1977, and the first complete genome of Haemophilus influenzae in 1995. The document proceeds to provide more detailed explanations of several DNA sequencing methods, such as Sanger sequencing, pyrosequencing, shotgun sequencing, Illumina sequencing, and SOLiD sequencing.
Microarrays allow researchers to examine gene expression patterns across thousands of genes simultaneously. A microarray contains probes for known genes that are used to detect complementary mRNA in a biological sample. Microarrays can be used to study gene expression differences between normal and diseased tissues, classify tumor subtypes, and diagnose cancers. They also show promise for personalized cancer treatment by predicting patient prognosis and response to therapy.
Whole genome sequencing is the process of determining the complete DNA sequence of an organism's genome. It involves sequencing all chromosomal and organellar DNA. Key methods include shotgun sequencing, which randomly fragments DNA for sequencing, and single molecule real time sequencing, which observes individual DNA polymerases incorporating nucleotides in real time using fluorescent tags. Whole genome sequencing has provided insights into evolutionary biology and may help predict disease susceptibility, though technical challenges remain such as fully sequencing repetitive regions.
This document provides an overview and introduction to RNA-seq analysis using Next Generation Sequencing. It discusses the RNA-seq workflow including mapping reads with TopHat2, transcript assembly with Cufflinks, and differential expression analysis. Key points covered include the advantages of RNA-seq over microarrays, the exponential drop in sequencing costs, mapping strategies for junction reads including TopHat, and running TopHat from the command line.
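A hedged sketch of how such a workflow might be driven from a script, assuming tophat2 and cufflinks are installed and that a Bowtie2 index prefix (genome_index), an annotation file (genes.gtf), and paired-end FASTQ files exist; all names here are placeholders:

```python
# Hedged sketch of the TopHat2 -> Cufflinks steps summarized above.
# All paths (genome_index, genes.gtf, reads_*.fastq) are placeholders.
import subprocess

# Map reads (including junction reads) against the reference;
# -G supplies known gene models to guide spliced alignment.
subprocess.run(
    ["tophat2", "-o", "tophat_out", "-G", "genes.gtf",
     "genome_index", "reads_1.fastq", "reads_2.fastq"],
    check=True,
)

# Assemble transcripts from the accepted alignments.
subprocess.run(
    ["cufflinks", "-o", "cufflinks_out", "tophat_out/accepted_hits.bam"],
    check=True,
)
```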
Deciphering DNA sequences is essential for virtually all branches of biological research. With the
advent of capillary electrophoresis (CE)-based Sanger sequencing, scientists gained the ability to
elucidate genetic information from any given biological system. This technology has become widely
adopted in laboratories around the world, yet has always been hampered by inherent limitations in
throughput, scalability, speed, and resolution that often preclude scientists from obtaining the essential
information they need for their course of study. To overcome these barriers, an entirely new technology
was required—Next-Generation Sequencing (NGS), a fundamentally different approach to sequencing
that triggered numerous ground-breaking discoveries and ignited a revolution in genomic science.
Pyrosequencing is a sequencing method that detects light signals from enzymatic reactions triggered by nucleotide additions during DNA synthesis. It was developed in 1996 and allows high-throughput sequencing. There are solid and liquid phase variants, with the latter using an additional enzyme to eliminate washing steps. The process involves preparing DNA fragments, attaching to beads, amplification by PCR, and sequencing by flowing nucleotides over wells containing DNA-coated beads and enzymes, detecting light signals with each nucleotide incorporation.
Nanopore sequencing is a fourth generation DNA sequencing technique that involves monitoring changes in electric current as DNA molecules pass through nanopores. There are two main types of nanopores: biological nanopores made of protein complexes like alpha-hemolysin, and solid state nanopores made in thin silicon nitride membranes. Nanopore sequencing has advantages of being label-free, producing long reads at high throughput with low material requirements, but challenges include slowing DNA translocation and reducing noise. Potential applications are in single molecule sensing for analysis of biomolecules.
Genome sequencing is the process of determining the order of nucleotide bases - A, C, G, and T - that make up an organism's DNA. Shotgun sequencing involves randomly breaking the genome into small fragments, sequencing those pieces, and reassembling the sequence by identifying overlapping regions. It was originally used by Sanger to sequence small genomes like viruses and bacteria. There are two main methods - hierarchical shotgun sequencing for larger genomes containing repeats, and whole genome shotgun sequencing for smaller genomes.
Next generation sequencing (NGS) refers to modern DNA sequencing technologies that allow for high-speed, low-cost sequencing of entire genomes. NGS works by massively parallel sequencing of millions of DNA fragments. The Illumina sequencing by synthesis method is the most commonly used NGS approach. It involves library preparation, cluster generation on a flow cell, sequencing via reversible dye-terminator chemistry, and computational analysis of sequenced reads. Key advantages of NGS include its scalability, unlimited dynamic range, tunable coverage levels, and ability to multiplex many samples simultaneously in a single run.
Whole genome sequencing is a technique to sequence the entire genome of an organism. It involves breaking the genome into small fragments, copying the fragments, sequencing the fragments, and reassembling the sequence data into the full genome. Key steps include isolating DNA, fragmenting it, ligating fragments into plasmids, amplifying the plasmids, sequencing the fragments using Sanger sequencing, and assembling the sequence reads into the complete genome. Whole genome sequencing allows researchers to discover coding and non-coding regions, predict disease susceptibility, and perform evolutionary studies by comparing species.
An open reading frame (ORF) is the part of a reading frame that contains no stop codons, i.e. a continuous stretch of codons with the potential to encode amino acids.
An ORF starts at a start codon and ends at a stop codon.
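As a minimal sketch of that definition (one strand, one reading frame, standard start/stop codons):

```python
# Toy ORF finder matching the definition above: scans one reading frame
# of one strand for ATG ... (TAA|TAG|TGA) stretches.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, frame=0):
    """Yield (start, end) positions of ORFs: start codon to in-frame stop."""
    codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
    start = None
    for idx, codon in enumerate(codons):
        if codon == "ATG" and start is None:
            start = idx
        elif codon in STOP_CODONS and start is not None:
            yield (frame + 3 * start, frame + 3 * (idx + 1))
            start = None

print(list(find_orfs("ATGAAATGA")))  # -> [(0, 9)]
```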
The document discusses genome sequencing and related topics. It begins by defining what a genome is - the complete set of DNA in an organism. It then discusses the different types of genomes, such as prokaryotic and eukaryotic, including nuclear, mitochondrial, and chloroplast genomes. The document also defines genomics as the comprehensive study of whole genomes and all gene interactions, distinguishing it from traditional genetics which focuses on single genes. It outlines some key milestones in genomic sequencing and the technical foundations that enabled sequencing whole genomes. Finally, it describes the main approaches used for genome sequencing projects, including hierarchical shotgun sequencing and whole genome shotgun sequencing.
The document describes the steps of Illumina sequencing. Genomic DNA is first fragmented and adapters are ligated to create single-stranded DNA fragments. These fragments are attached to a flow cell and undergo bridge amplification to create clusters of identical DNA fragments. Sequencing occurs through cycles of reversible terminator-based sequencing using fluorescently labeled dNTPs, imaging of the fluorescence, and cleavage of the label and terminator to allow the next cycle. After multiple cycles, the sequenced reads are aligned to the reference genome to determine the original sequence.
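As a toy illustration of the per-cycle base-calling step described above, the following sketch picks the brightest of four fluorescence channels in each cycle; the intensity values are invented:

```python
# Toy base-calling sketch for the imaging cycles described above: one
# four-channel fluorescence reading per cycle (values invented), and the
# brightest channel calls the base for that cycle.
CHANNELS = ("A", "C", "G", "T")

def call_bases(intensity_per_cycle):
    return "".join(
        CHANNELS[max(range(4), key=cycle.__getitem__)]
        for cycle in intensity_per_cycle
    )

cycles = [(900, 40, 30, 20), (15, 25, 870, 40), (10, 920, 30, 25)]
print(call_bases(cycles))  # -> "AGC"
```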
Whole genome analysis
History
Needs
Steps involved
Human genome data
NGS
Pyrosequencing
Illumina
SOLiD
Ion Torrent
PacBio
Applications
Problems
Benefits
Ion Torrent (Proton/PGM) and SOLiD sequencing are two types of next-generation sequencing technologies. Ion Torrent uses semiconductor sequencing to detect hydrogen ions released during DNA synthesis, while SOLiD uses ligation of octamer probes and fluorescent dyes to determine sequences in color space. Both have advantages such as fast run times and high throughput but also limitations including errors in homopolymers for Ion Torrent and issues with palindromic sequences for SOLiD.
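To make the color-space idea concrete, here is a minimal sketch of two-base color-space decoding; the transition table is the standard SOLiD encoding, while the primer base and color read are invented:

```python
# Decode a SOLiD color-space read to base space. Each color encodes the
# transition between adjacent bases; decoding starts from a known primer
# base. The example read below is invented.
TRANSITION = {
    0: {"A": "A", "C": "C", "G": "G", "T": "T"},  # same base
    1: {"A": "C", "C": "A", "G": "T", "T": "G"},
    2: {"A": "G", "G": "A", "C": "T", "T": "C"},
    3: {"A": "T", "T": "A", "C": "G", "G": "C"},
}

def decode_colorspace(primer_base, colors):
    bases, current = [], primer_base
    for color in colors:
        current = TRANSITION[color][current]
        bases.append(current)
    return "".join(bases)

print(decode_colorspace("T", [3, 0, 1, 2]))  # -> "AACT"
```

Note how a single miscalled color would corrupt every downstream base, which is one reason SOLiD data is usually aligned in color space rather than decoded first.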
Pyrosequencing is a sequencing by synthesis technique that uses a luciferase enzyme system to monitor DNA synthesis. It works by adding DNA polymerase and a single nucleotide to the DNA fragments, generating pyrophosphate that is converted to light. The light is detected and identifies the nucleotide incorporated. Pyrosequencing has applications in cDNA analysis, mutation detection, re-sequencing of disease genes, and identifying single nucleotide polymorphisms and typing bacteria and viruses.
This document discusses DNA microarrays, including:
1. DNA microarrays contain many DNA probes attached to a solid surface that allow measurement of gene expression levels or genotyping of many regions simultaneously through hybridization.
2. The core principle is hybridization - complementary nucleic acid sequences pair through hydrogen bonds, and fluorescent labeling allows detection of binding to quantify expression.
3. DNA microarrays have many applications including gene expression profiling, disease diagnosis, drug discovery, and toxicology research.
Pyrosequencing is a sequencing method that detects DNA polymerase activity by measuring the release of pyrophosphate using a cascade of enzymatic reactions that generate visible light. It utilizes emulsion PCR to amplify DNA fragments on beads in microreactors. The beads are then loaded into wells and sequenced by sequentially adding nucleotides and detecting light produced upon incorporation using a CCD camera. Key advantages are its accuracy, high throughput of up to 48,000 probes per day, and ease of automation. However, it requires specialized equipment and software.
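A toy sketch of how the per-flow light signals translate into sequence; the flow order and intensity values here are invented, and real base callers calibrate signals rather than simply rounding:

```python
# Toy flowgram-to-sequence sketch for pyrosequencing: nucleotides are
# flowed in a fixed order, and the light intensity at each flow is
# roughly proportional to the homopolymer length incorporated.
FLOW_ORDER = "TACG"  # repeated cyclically; order is instrument-specific

def flowgram_to_sequence(intensities):
    seq = []
    for i, signal in enumerate(intensities):
        n = round(signal)          # naive: real base callers calibrate
        seq.append(FLOW_ORDER[i % 4] * n)
    return "".join(seq)

# Invented intensities: ~0 means no incorporation, ~2 a two-base homopolymer.
print(flowgram_to_sequence([1.1, 0.05, 2.0, 0.9]))  # -> "TCCG"
```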
DNA microarrays allow analysis of gene expression across thousands of genes simultaneously. They consist of DNA probes attached to a solid surface in an organized grid pattern, with each spot representing a single gene. Samples are labeled with fluorescent dyes and hybridized to the chip. Complementary sequences pair via hydrogen bonds, while non-specific sequences are washed away. The signal intensity at each spot indicates the amount of target sequence present and thus gene expression levels. DNA microarrays have applications in clinical diagnosis, drug discovery, and other fields by profiling gene expression patterns.
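As a minimal numeric illustration of turning spot intensities into expression calls, assuming a two-channel design (all intensity values invented):

```python
# Minimal two-channel microarray sketch: the log2 ratio of sample vs.
# reference intensity per spot indicates up-/down-regulation.
# All intensities below are invented for illustration.
import math

spots = {
    "GENE_A": (5200.0, 1300.0),   # (sample channel, reference channel)
    "GENE_B": (800.0, 790.0),
    "GENE_C": (150.0, 1200.0),
}

for gene, (sample, reference) in spots.items():
    ratio = math.log2(sample / reference)
    call = "up" if ratio > 1 else "down" if ratio < -1 else "unchanged"
    print(f"{gene}: log2 ratio {ratio:+.2f} ({call})")
```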
This document summarizes a class seminar presentation on next generation sequencing technologies. It begins with an overview of DNA sequencing and its importance. It then reviews the history of sequencing technologies, from the discovery of DNA to the development of Sanger sequencing and next generation sequencing platforms. The document focuses on describing the Illumina/Solexa, 454, and SOLiD next generation sequencing methods. It explains the key steps in library preparation, cluster amplification, and sequencing by synthesis or ligation for these platforms. The advantages of next generation sequencing technologies over Sanger sequencing are also highlighted.
This document discusses different types of polymerase chain reaction (PCR) techniques. It begins by providing background on PCR and its development. It then describes several types of PCR including multiplex PCR, which allows for simultaneous detection of multiple pathogens; nested PCR, which increases specificity; reverse transcription PCR (RT-PCR) and quantitative real-time PCR (qRT-PCR), which are used to detect RNA; quantitative PCR, which measures specific target DNA/RNA amounts; and other variants like hot-start PCR, touchdown PCR, and methylation-specific PCR. Each type is briefly explained along with its uses and applications in medical research.
The document discusses Ion Torrent semiconductor sequencing. It begins by providing background on first and next generation sequencing. It then describes Ion Torrent sequencing, noting that it detects pH changes from nucleotide incorporation rather than using modified nucleotides or optics. The principle, procedure involving fragmentation, ligation, amplification and pH detection on a CMOS chip, applications in genetics and medicine, advantages of speed and lower cost, and challenges including high cost per nucleotide and analysis complexity are summarized.
This document provides an introduction to next generation sequencing (NGS) technologies. It begins with an outline of topics to be covered, including the evolution of NGS technologies, their descriptions and comparisons, bioinformatics challenges of NGS data analysis, and some aspects of NGS data analysis workflows and tools. The document then delves into explanations of specific NGS platforms, their performance characteristics, and the sequencing processes. It discusses the large computational infrastructure and data management needs of NGS, as well as quality control, preprocessing of NGS data, and popular analysis tools and workflows.
New Generation Sequencing Technologies: an overview (Paolo Dametto)
The document provides a history of DNA sequencing technologies. It begins with the discovery of DNA's structure in 1953 and the development of recombinant DNA technology in the 1970s. First-generation Sanger sequencing produced short reads at a throughput so low that sequencing the human genome took years of large-scale, worldwide effort. Next-generation sequencing (NGS) platforms introduced since 2005 have dramatically reduced costs while increasing throughput. NGS methods like Roche/454 pyrosequencing, Illumina/Solexa sequencing by synthesis, SOLiD ligation sequencing, and single-molecule real-time sequencing by Pacific Biosciences now enable large-scale genome and transcriptome analysis.
This document discusses next-generation sequencing techniques such as Illumina and 454 pyrosequencing for applications including microbial genome sequencing and metagenomic profiling of microbial communities, either from targeted gene markers or from shotgun sequencing. Key steps include library preparation, sequencing, and downstream bioinformatics analysis of the sequencing data for tasks like genome assembly, gene annotation, and taxonomic classification of microbial taxa.
This document summarizes a presentation on RNA-Seq and differential expression analysis. It discusses the history of sequencing technologies, how RNA-Seq works, and key steps in the analysis process like quality control, alignment, and differential expression. The goal is to leverage large datasets from declining sequencing costs to gain new insights into cancer through systems biology approaches.
Variant (SNPs/Indels) calling in DNA sequences, Part 2 (Denis C. Bauer)
Abstract: This session will focus on the steps involved in identifying genomic variants after an initial mapping has been achieved: improving the mapping, SNP and indel calling, and variant filtering/recalibration will be introduced.
Variant (SNPs/Indels) calling in DNA sequences, Part 1 (Denis C. Bauer)
This document discusses various topics related to mapping short sequencing reads to a reference genome, including:
- File formats like FASTQ that store sequencing reads and BAM/SAM formats for aligned reads.
- Alignment algorithms, including hash-table-based mappers (e.g., MAQ) and Burrows-Wheeler/suffix-tree-based mappers (e.g., BWA, Bowtie).
- Visualizing alignments using the Integrative Genomics Viewer (IGV).
- Performing quality control on BAM files by checking the percentage of mapped reads and coverage uniformity (see the sketch after this list).
- The next session will focus on identifying genomic variants from mapped reads through SNP/indel calling and filtering.
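Picking up the BAM quality-control bullet above, a minimal sketch using the pysam library; sample.bam is a placeholder for a coordinate-sorted BAM file:

```python
# Minimal BAM quality-control sketch: fraction of mapped reads.
# Requires pysam (pip install pysam); "sample.bam" is a placeholder.
import pysam

mapped = unmapped = 0
with pysam.AlignmentFile("sample.bam", "rb") as bam:
    for read in bam.fetch(until_eof=True):   # iterate all reads, no index needed
        if read.is_unmapped:
            unmapped += 1
        else:
            mapped += 1

total = mapped + unmapped
print(f"mapped: {mapped}/{total} ({100.0 * mapped / total:.1f}%)")
```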
The document discusses challenges in identifying causal variants for complex diseases from sequencing data. It notes that while ideal situations may involve finding a variant common in all affected individuals and absent in unaffected, reality involves sifting through around 3.5 million SNPs. Methods like genome-wide association studies and focusing on exonic variants can help prioritize, but functional variants may also reside outside of protein coding regions. Considering combinations of variants through statistical genetics approaches may be needed to explain disease heritability. Quality control, annotation, and filtering are important but finding causal variants remains difficult.
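As a toy version of the ideal affected-versus-unaffected filter described above, assuming a simple VCF whose FORMAT field starts with GT; the sample names and filename are invented:

```python
# Toy variant filter: keep variants where every affected sample carries
# an alternate allele and no unaffected sample does.
# Assumes GT is the first FORMAT field; all names below are invented.
AFFECTED, UNAFFECTED = {"child1", "child2"}, {"mother", "father"}

def has_alt(genotype_field):
    gt = genotype_field.split(":")[0]
    return "1" in gt.replace("|", "/").split("/")

with open("family.vcf") as vcf:
    for line in vcf:
        if line.startswith("##"):
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]          # sample names follow FORMAT
            continue
        calls = dict(zip(samples, fields[9:]))
        if all(has_alt(calls[s]) for s in AFFECTED) and \
           not any(has_alt(calls[s]) for s in UNAFFECTED):
            print(fields[0], fields[1], fields[3], fields[4])
```

In practice, as the paragraph notes, sequencing errors, incomplete penetrance, and non-coding functional variants mean such a strict filter is only a starting point.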
How to sequence a large eukaryotic genome - and how we sequenced the cod genome. A seminar I gave for the Computational Life Science (Univ. of Oslo) seminar series, September 28, 2011
Bridge amplification is a process used in Illumina sequencing that involves preparing genomic DNA samples by fragmenting and ligating them to adapters before attaching them to a flow cell surface. Primers on the surface initiate bridge amplification where unlabeled nucleotides and polymerase enzymes are added to synthesize new strands that become double stranded. The original strands are then washed away and the process repeats to amplify multiple copies of each fragment in parallel.
The document describes a workshop on molecular methods in water engineering, including amplicon sequencing and omics approaches. The agenda includes talks on amplicon sequencing principles and limitations, the importance of curated 16S databases, DNA extraction and primer selection, metagenomics and metatranscriptomics principles and challenges, and data informatics and management. The workshop aims to discuss the potential and limitations of novel molecular techniques for analyzing water systems.
This document provides an overview of community profiling using QIIME. It discusses how next-generation sequencing is generating massive amounts of microbial sequence data. QIIME is introduced as a widely used open-source bioinformatics pipeline for analyzing microbiome census data from high-throughput sequencing experiments. The document outlines the typical QIIME workflow, which involves preprocessing raw sequencing data, picking operational taxonomic units (OTUs), assigning taxonomy, computing diversity metrics, and building phylogenetic trees.
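One of the diversity metrics such pipelines compute is the Shannon index, H = -Σ p_i ln p_i; a minimal sketch over invented OTU counts:

```python
# Shannon diversity index over OTU abundances, one of the alpha-diversity
# metrics computed in pipelines like QIIME. OTU counts below are invented.
import math

def shannon(counts):
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

otu_counts = [120, 45, 30, 3, 2]   # reads per OTU in one sample
print(f"Shannon H = {shannon(otu_counts):.3f}")
```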
This document contains a histology portfolio with slides and descriptions of various tissues and cells. It includes sections on the cell, mitosis, epithelial tissue, connective tissue, blood smears, cartilage and bone, muscle tissue, and nervous tissue. For each section, there are photomicrographs at different magnifications of the featured tissues and descriptions of the key structures visible in the images.
Bioinformatics is an interdisciplinary field that merges biology, computer science, and information technology. It is applied in areas like genomics, proteomics, and systems biology. While some basic analysis can be done through user-friendly tools, truly customized work requires programming skills and an understanding of underlying algorithms. Bioinformatics is not just a service field but rather involves scientific experimentation throughout the entire analysis process from experimental design to evaluation. It is a dedicated field of research in its own right, not a quick or interchangeable task.
This document discusses genome size variation in organisms. It begins by defining the genome and describing genome organization in prokaryotes, viruses and eukaryotes. In eukaryotes, DNA is organized into chromosomes within the nucleus. The document then describes models of chromatin fiber structure and components of the nucleosome. It explains that genome size refers to the total DNA content and can be measured in picograms or megabases. Genome size varies significantly between plant species from 130 Mbp in Arabidopsis to 2.5 Gbp in maize. Factors influencing genome size include cell size, developmental rate, transposable elements and chromosomal mutations.
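For the picogram/megabase relationship mentioned above, a commonly cited conversion factor is 1 pg ≈ 978 Mbp; a short worked example using the genome sizes quoted in the summary:

```python
# Convert genome size between picograms (C-value) and megabase pairs,
# using the commonly cited factor of roughly 978 Mbp per picogram.
MBP_PER_PG = 978.0

def mbp_to_pg(megabases):
    return megabases / MBP_PER_PG

print(f"Arabidopsis, ~130 Mbp -> {mbp_to_pg(130):.3f} pg")
print(f"Maize, ~2500 Mbp      -> {mbp_to_pg(2500):.2f} pg")
```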
Evolution of DNA Sequencing - talk by Jonathan Eisen for the Bodega Workshop ... (Jonathan Eisen)
This document contains slides for a talk on the evolution of DNA sequencing technologies. It reviews early manual sequencing methods developed by Sanger and others. It then summarizes the development of next-generation sequencing platforms including Roche 454 pyrosequencing, Illumina sequencing by synthesis, and others. The slides describe the key steps in library preparation, cluster generation, sequencing chemistry, and data analysis for various platforms. It provides a historical timeline of major advances that have enabled massive parallel sequencing of DNA.
Data Management for Quantitative Biology - Data sources (Next generation tech... (QBiC_Tue)
Introduction to next generation sequencing (NGS); NGS data; data management of NGS data; third generation sequencing; NGS pipelines; NGS experimental design
Part 1 of RNA-seq for DE analysis: Defining the goal (Joachim Jacob)
First part of the training session 'RNA-seq for Differential expression' analysis. We explain how we can detect differential expression based on RNA-seq data. Interested in following this session? Please contact http://www.jakonix.be/contact.html
The Feulgen stain is a histological technique discovered in 1924 that uses acid hydrolysis and Schiff's reagent to specifically identify chromosomal material and DNA. It involves hydrolyzing tissue samples in hydrochloric acid to cleave nitrogen bases from DNA and form aldehyde groups, then staining the samples with Schiff's reagent to form a purple compound where aldehydes are present, selectively identifying DNA. The staining intensity is proportional to the DNA concentration and it allows DNA to be visualized microscopically.
Uses of Artificial Intelligence in Bioinformatics (Pragya Pai)
This presentation gives a basic overview of how Artificial Intelligence is used in bioinformatics.
Apollo is a web-based application that supports and enables collaborative genome curation in real time, allowing teams of curators to improve on existing automated gene models through an intuitive interface. Apollo allows researchers to break down large amounts of data into manageable portions to mobilize groups of researchers with shared interests.
A Workshop at the Stowers Institute for Medical Research.
The document provides an overview of plant genome sequence assembly, including:
1) A brief history of sequencing technologies and their improvements over time, from Sanger sequencing to newer technologies producing longer reads.
2) Key steps in a sequencing project including read processing, filtering, and corrections before assembly into contigs and scaffolds using appropriate software.
3) Factors to consider for experimental design and assembly optimization such as sequencing depth, library types, and software choices depending on the genome and data characteristics.
DNA sequencing: rapid improvements and their implications (Jeffrey Funk)
These slides analyze the rapid improvements in DNA sequencers and the implications of these improvements for drug discovery, new crops, materials creation, and new bio-fuels. Many of the rapid improvements come from "reductions in scale": as with integrated circuits, reducing the size of features on DNA sequencers has enabled many orders of magnitude of improvement. Unlike integrated circuits, the improvements are also due to changes in technology; for example, the moves from pyrosequencing to semiconductor and nanopore sequencing were needed to achieve the reductions in scale. Pyrosequencing also benefited from improvements in lasers and camera chips.
Processing Amplicon Sequence Data for the Analysis of Microbial Communities (Martin Hartmann)
This document provides an overview of next-generation sequencing (NGS) technologies and their usefulness for analyzing microorganisms associated with plants. It discusses how NGS methods allow addressing previously impossible questions about the composition, function, and interactions of microbial communities in environments like the rhizosphere and phyllosphere. While powerful, NGS platforms have limitations that can introduce errors or biases, but methods exist to overcome these issues. The review highlights applications of NGS in metagenomic studies of plant-associated microbiomes and how these new techniques are transforming the field.
This document provides an overview of different DNA sequencing technologies, including:
- Sanger sequencing, the first generation method using chain termination.
- Next generation sequencing methods like Illumina that use sequencing by synthesis and massively parallel approaches.
- Third generation long-read sequencing methods like PacBio and Oxford Nanopore that sequence single native DNA molecules and can detect modifications but have lower throughput.
It describes the key innovations, working mechanisms, and tradeoffs of read length, output, and accuracy between Sanger, next generation, and long-read third generation sequencing technologies. It also highlights the portability of Oxford Nanopore sequencing with the MinION device.
This document discusses computational methods and challenges for genome assembly using next-generation sequencing data. It describes the four main stages of genome assembly as preprocessing filtering, graph construction, graph simplification, and postprocessing filtering. Each stage processes the data from the previous stage to build the assembly graph and reduce complexity, though some assemblers delay filtering steps.
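A minimal sketch of the graph-construction stage in the de Bruijn style many short-read assemblers use (toy reads and a small k; real assemblers add the filtering and simplification stages described above):

```python
# Toy de Bruijn graph construction, the core of the graph-building stage:
# nodes are (k-1)-mers, and each k-mer in a read adds a directed edge.
from collections import defaultdict

def build_de_bruijn(reads, k=4):
    graph = defaultdict(list)   # (k-1)-mer -> list of successor (k-1)-mers
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph

reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]   # invented overlapping reads
for node, successors in build_de_bruijn(reads).items():
    print(node, "->", ", ".join(successors))
```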
The Transformation of Systems Biology Into A Large Data Science (Robert Grossman)
Systems biology is becoming a data-intensive science due to the exponential growth of genomic and biological data. Large projects now produce petabytes of data that require new computational infrastructure to store, manage, and analyze. Cloud computing provides elastic resources that can scale to support the increasing data needs of systems biology. Case studies show how clouds are used for large-scale data integration and analysis, running combinatorial analysis over genomic marks, and enabling reanalysis of biological data through elastic virtual machines. The Open Cloud Consortium is working to provide open cloud resources for biological and biomedical research through testbeds and proposed bioclouds.
Comparison between RNASeq and Microarray for Gene Expression Analysis (Yaoyu Wang)
Transcriptome profiling using RNA-Seq or microarrays allows determination of differential gene expression between samples, such as normal vs. tumor. While RNA-Seq and microarrays are generally concordant, RNA-Seq provides more information, such as alternative splicing and novel transcripts, but requires more computational resources. While the cost per sample of RNA-Seq is decreasing, storage and analysis of the large datasets require specialized infrastructure.
This document summarizes a presentation given by Luke Hickey of Pacific Biosciences on human genome sequencing using PacBio systems. It discusses PacBio sequencing technology developments, sequencing and assembly of the NA12878 genome, and the role of the NIST Genome in a Bottle (GIAB) reference materials. Specifically, it notes that PacBio sequenced the GIAB Ashkenazim trio genomes to high coverage and made the data publicly available. The sequencing and assembly of these genomes helps validate and improve PacBio sequencing technologies and supports the development and release of the trio as new NIST reference materials.
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS... (EMC)
This EMC Isilon sizing and performance guideline White Paper reviews the Key Performance Indicators (KPIs) that most strongly impact the production processes for the storage of data from Next-Generation Sequencing (NGS) workflows.
Introduction to Next-Generation Sequencing (NGS) Technology (QIAGEN)
The continuous evolution of NGS technology has led to an enormous diversification in NGS applications and dramatically decreased the costs to sequence a complete human genome.
In this presentation, we will discuss the following major topics:
• Basic overview of NGS sequencing technologies
• Next-generation sequencing workflow
• Spectrum of NGS applications
• QIAGEN universal NGS solutions
1) Wnt3a protein rapidly increases the frequency of miniature excitatory synaptic currents in hippocampal neurons through a mechanism involving calcium influx and post-translational modifications enhancing vesicle exocytosis.
2) While previous studies suggested Wnt signaling modulates neurotransmission, this is the first to demonstrate a direct effect of a purified Wnt ligand, Wnt3a, on synaptic transmission.
3) The results identify Wnt3a and its receptor LRP6 as key molecules in neurotransmission modulation and suggest crosstalk between canonical and Wnt/calcium signaling in central neurons.
Examining gene expression and methylation with next gen sequencing (Stephen Turner)
Slides on RNA-seq and methylation studies using next-gen sequencing given at the University of Miami Hussman Institute for Human Genomics "Genetic Analysis of Complex Human Diseases" course in 2012 (http://hihg.med.miami.edu/educational-programs/analysis-of-complex-human-diseases/genetic-analysis-of-complex-human-diseases/)
This document summarizes the work of Hans Jansen and Christiaan Henkel with long read nanopore sequencing. They have sequenced several genomes including carp, eel, king cobra, and Agrobacterium using MinION. Their longest reads were 120 kbp and 93.5 kbp. They also established the MinION Access Program to improve genomes by resolving repeats. As part of this, they formed the MinION Analysis and Reference Consortium to standardize protocols and understand variability between labs. Their work with the E. coli genome demonstrated sources of variation in read counts, lengths, and alignments between labs.
This document provides an overview of the course BIONF/BENG 203: Functional Genomics. It discusses the grading breakdown, course outline, sources of functional genomic data including expression data from microarrays and RNA-Seq, proteomic data from mass spectrometry, protein-protein interaction data, and systematic phenotyping data. High-throughput methods for measuring these various types of omics data are also summarized.
http://www.fao.org/about/meetings/wgs-on-food-safety-management/en/
Progress report 2016: GMI proficiency testing: Presentation from the Technical Meeting on the impact of Whole Genome Sequencing (WGS) on food safety management -23-25 May 2016, Rome, Italy.
The document discusses various DNA and RNA sequencing methods and technologies. It begins with an overview of sequencing-based markers like DNA sequencing, RNA sequencing, SNPs, epigenetic markers, and omics. The document then provides more details on the history and development of sequencing technologies, including early methods like Sanger and Maxam-Gilbert sequencing. It discusses next generation sequencing platforms like MPSS, 454 pyrosequencing, Illumina, Ion Torrent, ABI-SOLiD, and their approaches. The document concludes with an overview of third generation long-read sequencing technologies like SMRT and nanopore sequencing.
Cloud-native machine learning - Transforming bioinformatics research (Denis C. Bauer)
Cloud computing and artificial intelligence transform bioinformatics research
Denis Bauer, Transformational Bioinformatics Team
Genomic data is outpacing traditional Big Data disciplines, producing more information than astronomy, Twitter, and YouTube combined. As such, genomic research has leapfrogged to the forefront of Big Data and cloud solutions. We developed software platforms using the latest in cloud architecture, artificial intelligence, and machine learning to support every aspect of genome medicine, from disease gene detection through to validation and personalized medicine.
This talk outlines how we find disease genes for complex genetic diseases, such as ALS, using VariantSpark, a custom machine learning implementation capable of dealing with whole genome sequencing data of 80 million common and rare variants. To support disease gene validation, we created GT-Scan, an innovative web application that we think of as the "search engine for the genome"; it enables researchers to identify the optimal editing spot to create animal models efficiently. The talk concludes by demonstrating how cloud-based software distribution channels (digital marketplaces) can be harnessed to share bioinformatics tools internationally and make research more reproducible.
Translating genomics into clinical practice - 2018 AWS summit keynote (Denis C. Bauer)
CSIRO's part of the co-presented Keynote at the AWS Public Sector Summit in Canberra on genomics health care. Three key messages: 1) We need a shift from treatment towards prevention 2) Once you go serverless you never go back 3) DevOps 2.0: Hypothesis-driven architecture evolution
Going Server-less for Web-Services that need to Crunch Large Volumes of Data (Denis C. Bauer)
AgileIndia Breakout session on serverless applications. This talk covers how AWS serverless infrastructure can be used for a wide range of applications, such as compute intensive tasks (GT-Scan), tasks requiring continuous learning (CryptoBreeder), data intensive tasks (PhenGen Database).
How novel compute technology transforms life science research (Denis C. Bauer)
AgileIndia 2018 Keynote. This talk covers how ‘Datafication’ will make data ‘wider’ (more features describing a data point), which represents a paradigm shift for Machine Learning applications. It also covers serverless architecture, which can cater for even compute-intensive tasks. It concludes by stating that business and life-science research are not that different: so let’s build a community together!
How novel compute technology transforms life science research (Denis C. Bauer)
Unprecedented data volumes and pressure on turnaround time, driven by commercial applications, require bioinformatics solutions to evolve to meet these new demands. New compute paradigms and cloud-based IT solutions enable this transition. Here I present two solutions capable of meeting these demands: VariantSpark for genomic variant analysis, and GT-Scan2 for genome engineering applications.
VariantSpark classifies 3,000 individuals with 80 million genomic variants each in under 30 minutes. This Hadoop/Spark solution for machine learning on genomic data is hence capable of scaling up to population-size cohorts.
GT-Scan2 identifies CRISPR target sites by minimizing off-target effects and maximizing on-target efficiency. This optimization is powered by AWS Lambda functions, which offer an "always-on" web service that can instantaneously recruit enough compute resources to keep runtime stable, even for queries with several thousand potential target sites.
VariantSpark: applying Spark-based machine learning methods to genomic inform... (Denis C. Bauer)
Genomic information is increasingly used in medical practice, giving rise to the need for efficient analysis methodology able to cope with thousands of individuals and millions of variants. Here we introduce VariantSpark, which utilizes Hadoop/Spark along with its machine learning library, MLlib, providing the means of parallelisation for population-scale bioinformatics tasks. VariantSpark is the interface to the standard variant format (VCF), offers seamless genome-wide sampling of variants, and provides a pipeline for visualising results.
To demonstrate the capabilities of VariantSpark, we clustered more than 3,000 individuals with 80 million variants each to determine the population structure in the dataset. VariantSpark is 80% faster than the Spark-based genome clustering approach ADAM and a comparable implementation using Hadoop/Mahout, as well as Admixture, a commonly used tool for determining individual ancestries. It is over 90% faster than traditional implementations using R and Python. These benefits in speed, resource consumption, and scalability enable VariantSpark to open up advanced, efficient machine learning algorithms to genomic data (a toy sketch of this style of clustering follows below).
The package is written in Scala and available at https://github.com/BauerLab/VariantSpark.
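As a loose illustration of the kind of Spark MLlib clustering the abstract describes, here is a toy PySpark sketch; it is not VariantSpark's actual code, and the genotype matrix (alt-allele counts 0/1/2 per variant) is invented:

```python
# Toy population-clustering sketch in PySpark, loosely in the spirit of
# the abstract above (NOT VariantSpark's actual implementation).
# Rows are individuals; features are invented alt-allele counts (0/1/2).
from pyspark.sql import SparkSession
from pyspark.ml.clustering import KMeans
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.appName("genotype-kmeans").getOrCreate()

genotypes = [
    ("ind1", Vectors.dense([0, 0, 1, 2])),
    ("ind2", Vectors.dense([0, 1, 1, 2])),
    ("ind3", Vectors.dense([2, 2, 0, 0])),
    ("ind4", Vectors.dense([2, 1, 0, 0])),
]
df = spark.createDataFrame(genotypes, ["individual", "features"])

model = KMeans(k=2, seed=42).fit(df)           # clusters on "features"
model.transform(df).select("individual", "prediction").show()
spark.stop()
```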
Population-scale high-throughput sequencing data analysis (Denis C. Bauer)
This document provides an overview of a presentation on population-scale high-throughput sequencing data analysis. It discusses:
1) The background and goals of the CSIRO/Omics Project which aims to investigate colorectal cancer susceptibility using sequencing data from 500 individuals.
2) Methods for processing large-scale NGS data on high-performance computing clusters and cloud infrastructure using the NGSANE framework, which allows processing modules to be run in parallel.
3) Preliminary research outcomes identifying cancer-associated and microbiome changes from analysis of colorectal cancer and control samples.
The primary goal of my trip to Seattle was to establish a collaboration with a world-leading group on data integration. But by having chosen Seattle, a hub for technology companies, I also learned about synergies between business and research: Ilya Shmulevich from the Institute for Systems Biology makes use of Amazon's "Random Forest" implementation and Google's 600,000-CPU cluster for cancer genomic association discovery. I also met with experts from the University of Washington and Microsoft Research to learn about technological advancements for tackling Big Data and commoditizing parallelization. Finally, I observed a government-funded research agency invest in solutions geared towards their enterprise structure rather than adopt solutions designed for research institutes without an active computational community. In conclusion: CSIRO has unique properties and skill-sets that many collaborators would be interested in benefiting from; in return, such collaborations would propel CSIRO instantly to the forefront of technology, which, in particular for the analysis of big, unstructured datasets, could be very rewarding.
Allelic Imbalance for Pre-capture Whole Exome Sequencing (Denis C. Bauer)
Exome sequencing has emerged as an economical way of focusing DNA sequencing efforts on the most functionally understood regions of the genome. Pre-capture pooling, where one bait library is used to pull down the exonic regions of several pooled samples simultaneously, is a further financial improvement.
However, rare alleles in the pool might not attract baits at the same rate as reference-conforming sequences, and may hence be underrepresented. We investigated this potential issue by sequencing a HapMap family (4 individuals) using the pre-capture protocols from Illumina and Nimblegen. We did not observe clear evidence that heterozygous variants are missed, but noted a trend for indels to be imbalanced.
As our findings do not provide clear evidence to rule out allelic imbalance or bias having an impact on research findings, this may be especially critical for low-cellularity cancer tissue, where rare alleles are more ubiquitous.
The first steps of analysing sequencing data (2GS/NGS) have entered a transitional period: on the one hand, most analysis steps can be automated and standardized into a pipeline, while on the other, constantly evolving protocols and software updates make maintaining these analysis pipelines labour intensive.
I propose a centralized system within CSIRO that is flexible enough to cater for different analyses while also being generic enough to efficiently distribute the labour-intensive maintenance and extension amongst the user community.
Qbi Centre for Brain genomics (Informatics side) (Denis C. Bauer)
An overview of QBI’s production informatics framework with an emphasis on what service will be provided and how the resulting data is made available: from interactive quality control to integration with external data on the genome browser.
This session follows up on transcript quantification of RNAseq data and discusses statistical means of identifying differentially regulated transcripts and isoforms, contrasting these against microarray analysis approaches.
Abstract: The focus of this session will be on the differences between standard DNA mapping and RNAseq-specific transcript mapping: identifying splice variants and isoforms. Transcript quantification and the genomic variants that can be identified from RNAseq data will also be discussed.
Critical run files can be missing or corrupt after the Run folder is transferred from the HiSeq storage to the cluster storage. This presentation discusses the issue and suggests four workarounds.
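One generic safeguard against this class of problem (an illustrative suggestion, not necessarily one of the four workarounds in the presentation) is to verify file integrity with checksums after the transfer. A minimal Python sketch, with hypothetical paths:

```python
# Minimal sketch: verify that files survived a transfer intact by
# comparing MD5 checksums of the source and destination copies.
# The Run-folder paths below are hypothetical placeholders.
import hashlib
from pathlib import Path

def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 digest of a file, reading in 1 MB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_transfer(source_dir: Path, dest_dir: Path) -> list[Path]:
    """Return the relative paths of files that are missing or differ."""
    problems = []
    for src in source_dir.rglob("*"):
        if not src.is_file():
            continue
        dst = dest_dir / src.relative_to(source_dir)
        if not dst.exists() or md5sum(src) != md5sum(dst):
            problems.append(src.relative_to(source_dir))
    return problems

if __name__ == "__main__":
    bad = verify_transfer(Path("/hiseq/Run123"), Path("/cluster/Run123"))
    print("corrupt or missing:", bad or "none")
```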
Deciphering the regulatory code in the genome (Denis C. Bauer)
There are messages hidden within our genome that regulate when and for how long a gene is switched on. The presentation describes STREAM, a method targeted at deciphering this regulatory code.
This was our presentation of an imaginary product for the commercialization workshop. Note that all "research results" and illustrations are entirely made up and therefore do not necessarily reflect reality (i.e. biological processes). The presentation was created as part of the learning experience of how to pitch biological research to venture capitalists.
The presentation was given at CIBCB 2005 in San Diego and describes our approach to predicting recombination sites in protein sequences. Recombination is the method of choice for designing new proteins with desired new or enhanced properties.
The publication is:
Bauer, D.C., Bodén, M., Thier, R. and Gillam, E.M. "STAR: Predicting recombination sites from amino acid sequence." BMC Bioinformatics 2006 Oct 8; 7:437. PMID: 17026775.
2. Production Informatics and Bioinformatics (June 23, 2011). Per one-flowcell project: Basic Production Informatics produces the raw sequence reads; Advanced Production Informatics maps them to the genome and generates raw genomic features (e.g. SNPs); Bioinformatics Research analyses the data and uncovers the biological meaning.
4. What steps are involved in sequencing? (June 23, 2011). Sequencing-by-synthesis (SBS) technology: fragmentation, library generation, amplification, sequencing, analysis. Illumina marketing: "3 h 10 minutes wet-lab, 30 minutes dry-lab".
7. Output: 1.5 terabytes of data (June 23, 2011). Inspired by the anzska information booklet.
8. Sequencer Output Conversion: Production Informatics (June 23, 2011). 1.5 TB of data: 6 billion clusters with 100 bp reads (clusters × read length) = 600 billion data points. On the HiSeq, images are converted to flat files (*.bcl or *.cif) by CASAVA.
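As a quick sanity check of the slide's arithmetic, the number of base calls (data points) is simply clusters times read length:

```python
# Reproduce the slide's arithmetic: data points = clusters x read length.
clusters = 6_000_000_000   # 6 billion clusters per run
read_length = 100          # 100 bp reads
data_points = clusters * read_length
print(f"{data_points:,} data points")  # 600,000,000,000
```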
9. Multiplexing (June 23, 2011). 6 billion reads per run: 750 million reads per lane; currently 12-plex (soon 96-plex) in one run.
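Multiplexing relies on short barcode (index) sequences to assign each read back to its sample. The Python sketch below shows the idea with hypothetical barcodes; a real demultiplexer (e.g. as part of the vendor pipeline) would also tolerate mismatches in the index.

```python
# Minimal sketch of demultiplexing: route reads to samples by matching
# the index sequence attached to each read. Barcodes are hypothetical;
# a 12-plex run would carry 12 of them, and real tools allow mismatches.
from collections import defaultdict

BARCODES = {"ATCACG": "sample_01", "CGATGT": "sample_02"}

def demultiplex(reads):
    """Group (index, sequence) pairs by sample; unknown indices go to 'undetermined'."""
    by_sample = defaultdict(list)
    for index, sequence in reads:
        sample = BARCODES.get(index, "undetermined")
        by_sample[sample].append(sequence)
    return by_sample

if __name__ == "__main__":
    reads = [("ATCACG", "ACGTACGT"), ("CGATGT", "TTGGCCAA"), ("NNNNNN", "ACGTTGCA")]
    for sample, seqs in demultiplex(reads).items():
        print(sample, len(seqs))
```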
14. Fastq: Quality control (June 23, 2011). Checks include base-pair quality scores, adapter contamination, and uneven amplification.
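To make the quality-score check concrete, here is a minimal Python sketch that reads a FASTQ file and reports the mean Phred quality per cycle. It assumes the now-standard Phred+33 ASCII encoding (older Illumina pipelines used Phred+64), and the file name is a placeholder.

```python
# Minimal sketch of basic FASTQ QC: mean Phred quality per read position
# (cycle). Assumes Phred+33 encoding; older Illumina used Phred+64.

def per_cycle_quality(fastq_path):
    """Return the mean Phred quality at each read position."""
    totals, counts = [], []
    with open(fastq_path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 3:  # the quality string is every 4th line
                continue
            for pos, char in enumerate(line.rstrip("\n")):
                if pos >= len(totals):
                    totals.append(0)
                    counts.append(0)
                totals[pos] += ord(char) - 33  # Phred+33 decoding
                counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]

if __name__ == "__main__":
    # "reads.fastq" is a placeholder path for illustration.
    for cycle, q in enumerate(per_cycle_quality("reads.fastq"), start=1):
        print(f"cycle {cycle}: mean Q = {q:.1f}")
```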
15. Three things to remember (June 23, 2011): don't be fooled by marketing; Fastq files are not directly usable; basic run QC can be made from the fastq file. "All modern genomics projects are now bottlenecked at the stage of data analysis rather than data production" (Ewan Birney, European Bioinformatics Institute, Wellcome Trust). See also David S. Roos, "Bioinformatics: Trying to Swim in a Sea of Data", Science, 16 February 2001, Vol. 291, No. 5507, pp. 1260-1261, DOI: 10.1126/science.291.5507.1260.
16. Next Week (June 23, 2011). Abstract: This session will focus on identifying SNPs from whole-genome, exome-capture, or targeted resequencing data. The approaches of mapping, local realignment, recalibration, SNP calling, and SNP recalibration will be introduced and quality metrics discussed.
19. Helicos true Single Molecule Sequencing (tSMS)™ technology (June 23, 2011). Sequencing by synthesis, but much more sensitive, so no amplification is required.
20. Life Technologies - Ion Torrent (June 23, 2011). A hydrogen ion is released by the incorporation of a nucleotide and is measured by a semiconductor; which base was incorporated is determined by which nucleotide wash cycle the signal coincides with.
21. PacBio (June 23, 2011). A polymerase is immobilized at the bottom of a well; fluorescent nucleotides float around and, when one is incorporated, it is held still for tens of milliseconds, which is the signal that is recorded. There is no upper limit on the read length. http://www.pacificbiosciences.com/smrt-biology/smrt-technology?page=4
22. Nanopore (June 23, 2011). The molecule is pulled through a pore, and the change in the current across the membrane caused by the different nucleotides is recorded. http://www.nanoporetech.com/sections/index/82
A PCR-like reaction in which a labeled nucleotide, incorporated at random, terminates the reaction. These fragments of different lengths are then separated on a gel, and the sequence can be read manually from the labeled end nucleotides.
Some of you have done some library prep already, so you have a feel for how realistic 3 h 10 min is for this. This seminar series goes through the analysis steps required to answer the question the data was generated for, so by the end of the series you will also have a feel for how realistic 30 minutes is for the data analysis.