Transcript discovery and gene model correction using next-generation sequencing data. Sucheta Tripathy, VBI, 11th Nov 2010
Brief History of Sequencing
  • Sanger dideoxy sequencing method (1977).
  • Maxam-Gilbert chemical degradation method (1977).
  • Two labs developed automated sequencers in 1986: Leroy Hood at Caltech (commercialized by Applied Biosystems, AB) and Wilhelm Ansorge at EMBL (commercialized by Pharmacia-Amersham and GE Healthcare).
Brief History of Sequencing
  • Hypoxanthine-guanine phosphoribosyltransferase (HGPRT)
  • Alu sequences
Brief History of Sequencing
  • 1991: EMBL filed a patent on media-less, solid-support-based sequencing.
  • 1996: Hitachi Laboratory developed a high-throughput capillary array sequencer.
NextGen Sequencing Methods
  • 454 sequencing methods (2006)
  • Principles of pyrophosphate detection (1985, 1988)
  • Illumina (Solexa) genome sequencing methods (2007)
  • Applied Biosystems ABI SOLiD System (2007)
  • Helicos single-molecule sequencing (HeliScope, 2007)
  • Pacific Biosciences single-molecule real-time (SMRT) technology (2010)
  • Sequenom: nanotechnology-based sequencing
  • BioNanomatrix: nanofluidics
  • RNAP technology
Figure 1. (A) Outline of the GS 454 DNA sequencer workflow. Library construction (I) ligates 454-specific adapters to DNA fragments (indicated as A and B) and couples amplification beads with DNA in an emulsion PCR to amplify fragments before sequencing (II). The beads are loaded into the picotiter plate (III). (B) Schematic illustration of the pyrosequencing reaction which occurs on nucleotide incorporation to report sequencing-by-synthesis. (Adapted from http://www.454.com.)
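The pyrosequencing readout in panel (B) can be illustrated with a small sketch. It assumes nucleotides are flowed one at a time in a repeating order and that the light signal scales with the number of bases incorporated in that flow. The flow order `TACG`, the function name, and the example signal values are illustrative assumptions, not taken from the slide.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Turn per-flow light signals into a base sequence.

    Each signal is rounded to the nearest whole number of incorporated
    bases, so a signal near 2.0 is read as a homopolymer of length 2.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        run_length = int(signal + 0.5)  # nearest integer for non-negative signals
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Hypothetical normalized signals for eight flows (T, A, C, G, T, A, C, G).
example_signals = [1.02, 0.05, 2.10, 0.00, 0.10, 0.97, 0.03, 1.10]
decoded = decode_flowgram(example_signals)
print(decoded)
```

Rounding is the simplest possible base caller. Long homopolymers are where it tends to fail, because the signal difference between, say, 7 and 8 incorporations is small relative to the noise.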
Outline of the Illumina Genome Analyzer workflow. Similar fragmentation and adapter ligation steps take place (I), before applying the library onto the solid surface of a flow cell. Attached DNA fragments form ‘bridge’ molecules which are subsequently amplified via an isothermal amplification process, leading to a cluster of identical fragments that are subsequently denatured for sequencing primer annealing (II). Amplified DNA fragments are subjected to sequencing-by-synthesis using 3′ blocked labelled nucleotides (III). (Adapted from the Genome Analyzer brochure, http://www.solexa.com.)
(A) Primers hybridise to the P1 adapter within the library template. A set of four fluorescence-labelled di-base probes competes for ligation to the sequencing primer. These probes have partly degenerate DNA sequence (indicated by n and z). Specificity of the di-base probe is achieved by interrogating the first and second base in each ligation reaction (CA in this case for the complementary strand). (B) Sequence determination by the SOLiD DNA sequencing platform is performed in multiple ligation cycles, using different primers, each one shorter than the previous one by a single base. The number of ligation cycles determines the eventual read length, whilst for each sequence tag, six rounds of primer reset occur [from primer (n) to primer (n − 4)]. (Adapted and modified from http://www.appliedbiosystems.com.)
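Because only four dyes report sixteen possible di-base combinations, each colour stands for four dinucleotides, and a read is decoded by walking from a known first base. The sketch below uses the commonly described two-base encoding, in which the colour is the XOR of 2-bit base codes. The numeric colour labels 0 to 3 and the function names are assumptions for illustration; the mapping of numbers to actual dyes is not given on the slide.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(sequence):
    """Encode each adjacent base pair as one colour (0-3)."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b]
            for a, b in zip(sequence, sequence[1:])]


def decode_colors(first_base, colors):
    """Recover the base sequence from a known first base and a colour list."""
    bases = [first_base]
    for color in colors:
        previous = BASE_TO_BITS[bases[-1]]
        bases.append(BITS_TO_BASE[previous ^ color])
    return "".join(bases)


colors = encode_colors("CATG")
print(colors, decode_colors("C", colors))
```

One consequence of this scheme is that a single miscalled colour changes every base decoded after it, which is why colour-space reads are usually aligned in colour space rather than decoded first.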
Cost Adapted from Eric Lander, 2010
Throughput
  • Standard ABI “Sanger” sequencing: 96 samples/day; read length ~650 bp; total = 450,000 bases of sequence data.
  • 454 was the game changer! ~400,000 different templates (reads)/day; read length ~250 bp; total = 100,000,000 bases of sequence data.
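The totals on this slide are reads per day multiplied by read length. A minimal sketch of that arithmetic follows; the function name is hypothetical. Note that 96 × 650 gives 62,400 bases, so the slide's 450,000-base Sanger total presumably reflects several runs per day. That reading is an assumption, and the slide's figure is used as given for the comparison.

```python
def bases_per_day(reads_per_day, read_length_bp):
    """Daily yield in bases: number of reads times bases per read."""
    return reads_per_day * read_length_bp


# 454 figures from the slide: ~400,000 reads/day at ~250 bp.
yield_454 = bases_per_day(400_000, 250)

# Sanger total taken directly from the slide rather than recomputed.
yield_sanger_slide = 450_000

fold_increase = yield_454 / yield_sanger_slide
print(yield_454, round(fold_increase))
```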
Throughput
  • 454 Life Sciences/Roche Genome Sequencer FLX: currently produces 400-600 million bases per day per machine. Published 1 million bases of Neanderthal DNA in 2006; in May 2007, published the complete genome of James Watson (3.2 billion bases, ~20x coverage).
  • Solexa/Illumina: 10 Gb per machine per week. In May 2008, published complete genomes for 3 HapMap subjects (14x coverage).
  • ABI SOLiD: 20 Gb per machine per week.
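Coverage ties these numbers together: the bases needed are genome size times the target coverage. The sketch below reads the slide's "3.2 billion bases ~20x coverage" as a 3.2-billion-base genome sequenced to about 20x, which is an interpretation, and asks how many machine-days that would take at the quoted 400-600 million bases per day. The function names are hypothetical, and the result is back-of-envelope arithmetic, not a description of how the actual project was run.

```python
def bases_required(genome_size_bp, coverage):
    """Total sequenced bases needed to reach a given average coverage."""
    return genome_size_bp * coverage


def machine_days(total_bases, bases_per_machine_day):
    """Days one machine would need at a constant daily yield."""
    return total_bases / bases_per_machine_day


total = bases_required(3_200_000_000, 20)
days_at_low_yield = machine_days(total, 400_000_000)
days_at_high_yield = machine_days(total, 600_000_000)
print(total, days_at_low_yield, round(days_at_high_yield))
```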
RNASeq
  • Catalogue all species of transcripts: mRNA, non-coding RNA, small RNA.
  • Determine splicing patterns and other post-transcriptional modifications.
  • Quantify the expression levels.
Zhong Wang et al., Nat. Rev. Genetics, 2009
Other Applications
  • SNP detection
  • Splice variant discovery
  • Identification of miRNA targets
  • TF binding sites
  • Genome methylation pattern
  • RNA editing
  • Metagenomic projects
  • Gene expression analysis
Differences from other expression sequencing methods
  • EST: low throughput, expensive, NOT quantitative.
  • SAGE, CAGE, MPSS: high throughput, digital gene expression levels; based on expensive Sanger sequencing methods; only a portion of the transcript is analyzed; isoforms are indistinguishable.
Advantages
  • Zero or very low background noise.
  • Sensitive to isoform discovery.
  • Both low and highly expressed genes can be quantified.
  • Highly reproducible.
Data Analysis
  • Mapping reads to the reference assembly
  • Filtering output: reads mapping > x number of times
  • Downstream data analysis
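The filtering step can be sketched as counting how many locations each read maps to and dropping reads above a threshold x. The tuple layout `(read_id, chromosome, position)`, the function name, and the example records are assumptions for illustration; a real pipeline would read these from the aligner's output.

```python
from collections import Counter


def filter_multimappers(alignments, max_hits):
    """Keep alignments only for reads that map to at most max_hits locations."""
    hits_per_read = Counter(read_id for read_id, _chrom, _pos in alignments)
    return [aln for aln in alignments if hits_per_read[aln[0]] <= max_hits]


# Hypothetical alignments: r1 maps once, r2 twice, r3 three times.
example_alignments = [
    ("r1", "chr1", 100),
    ("r2", "chr1", 200),
    ("r2", "chr2", 300),
    ("r3", "chr1", 50),
    ("r3", "chr2", 60),
    ("r3", "chr3", 70),
]
kept = filter_multimappers(example_alignments, max_hits=2)
print(kept)
```

Choosing x is a trade-off: x = 1 keeps only uniquely mapped reads and undercounts genes in repeated or duplicated regions, while a larger x keeps more data at the cost of ambiguous assignments.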
Mapping
  • One or two mismatches (< 35 bases)
  • One insertion/deletion
  • K-mer based seeding
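K-mer seeding with a mismatch limit can be sketched as follows: index every k-mer of the reference, look up each k-mer of the read to get candidate positions, then compare the full read at each candidate and keep it if the mismatch count is within the limit. This is a toy illustration under stated assumptions (no indels, forward strand only, hypothetical function names and sequences) and not the algorithm of any particular aligner.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read, reference, index, k, max_mismatches=2):
    """Return sorted (start, mismatches) pairs for placements within the limit."""
    hits = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for seed_pos in index.get(seed, ()):
            start = seed_pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # read would hang off the end of the reference
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.add((start, mismatches))
    return sorted(hits)


reference = "TTGACCAGTCAGGCATCGATTGCAAGT"
read = "CAGTCATGCATC"  # reference[5:17] with one substitution
index = build_kmer_index(reference, k=4)
placements = map_read(read, reference, index, k=4)
print(placements)
```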
Transcript abundance.
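The body of this slide did not survive extraction. One widely used abundance measure from this period is RPKM (reads per kilobase of transcript per million mapped reads), introduced by Mortazavi et al. (2008), which the deck cites. The sketch below assumes that measure; whether the slide used it is not known, and the function name and example numbers are hypothetical.

```python
def rpkm(reads_mapped_to_gene, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = 1e9 * C / (N * L), with C reads on the gene, N total mapped
    reads in the sample, and L the gene (transcript) length in bases.
    """
    return 1e9 * reads_mapped_to_gene / (total_mapped_reads * gene_length_bp)


# Hypothetical gene: 500 reads on a 2,000 bp transcript, 10 million mapped reads.
example_value = rpkm(500, 2_000, 10_000_000)
print(example_value)
```

Dividing by length corrects for longer transcripts collecting more reads, and dividing by total mapped reads corrects for sequencing depth, so values are comparable across genes and across samples.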
Integrated Pipeline
CLC bio Genomics Workbench.
Galaxy Server.
ERANGE: a full package for RNASeq and ChIP-Seq data analysis.
DESeq (Bioconductor package; uses a negative binomial model similar to that of the edgeR package).
An overview of the MapSplice pipeline. © The Author(s) 2010. Published by Oxford University Press. Wang K et al. Nucl. Acids Res. 2010;38:e178
Larsen et al., 2010
Denoeud et al., 2008

Transcripts discovered/corrected
  • 10,000 new transcription start sites discovered in rhesus macaque (Liu et al., NAR 2010).
  • 602 transcriptionally active regions and numerous introns in Candida albicans (Bruno et al., Genome Research 2010).
  • 96% of the genes were corrected in Laccaria bicolor (Larsen et al., PLoS One 2010).
  • 16,923 regions in mouse (Mortazavi et al., 2008).
  • 3,724 novel isoforms (Trapnell et al., 2010).
Bioinformatics Challenges
  • Store, retrieve and analyze large amounts of data.
  • Matching of reads to multiple locations.
  • Short reads with higher copy number and long reads representing less expressed genes.
References
  • Wilhelm J. Ansorge. Next-generation DNA sequencing techniques. New Biotechnology, Volume 25, Issue 4, April 2009, Pages 195-203.
  • Zhong Wang, Mark Gerstein, and Michael Snyder. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009 January; 10(1): 57–63.
  • Peter E. Larsen et al. Using Deep RNA Sequencing for the Structural Annotation of the Laccaria Bicolor Mycorrhizal Transcriptome. PLoS One. 2010; 5(7): e9780.
  • Wang et al. MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. NAR, 2010.
  • Denoeud et al. Annotating genomes with massive-scale RNA sequencing. Genome Biology, 2008.
  • Trapnell C, Williams BA, Pertea G, Mortazavi AM, Kwan G, van Baren MJ, Salzberg SL, Wold B, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
  • Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. doi:10.1093/bioinformatics/btp120
  • Mortazavi et al. Nature Methods, May 2008.

Editor's Notes

  1. CAGE: cap analysis of gene expression; MPSS: massively parallel signature sequencing; SAGE: serial analysis of gene expression.
  2. An overview of the MapSplice pipeline. The algorithm contains two phases: tag alignment (Step 1–Step 4) and splice inference (Step 5–Step 6). In the ‘tag alignment’ phase, candidate alignments of the mRNA tags to the reference genome are determined. In the ‘splice inference’ phase, splice junctions that appear in one or more tag alignments are analyzed to determine a splice significance score based on the quality and diversity of alignments that include the splice. Ambiguous candidate alignments are resolved by selecting the alignment with the overall highest quality match and highest confidence splice junctions.
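To make "diversity of alignments" concrete, here is one simple way to measure it: the Shannon entropy of the start positions of the reads that cross a junction. A junction supported by reads starting at many different positions scores higher than one supported by a stack of identical reads, which is more likely to be an artifact. This is an illustrative stand-in, not MapSplice's actual scoring formula; the function name and example positions are hypothetical.

```python
import math
from collections import Counter


def junction_diversity(read_starts):
    """Shannon entropy (in bits) of the start positions of junction-spanning reads."""
    counts = Counter(read_starts)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Four reads stacked at one position versus four reads at distinct positions.
stacked = junction_diversity([100, 100, 100, 100])
spread = junction_diversity([100, 101, 102, 103])
print(stacked, spread)
```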