SlideShare a Scribd company logo
Analysis Tools for RNA-seq and
Isoform Characterization
Slides: t.co/nc7siKRm6W
Gunnar R¨atsch
Biomedical Data Science Group
Computational Biology Center
Memorial Sloan Kettering Cancer Center
gxr #RNA #MMR #SplAdder #riboDiff #Cancer
Biomedical Data Sciences Group
Facts
Cost of collecting data drops, amounts increase exponentially.
We have more data than accurate algorithms.
Group’s research
Data Science Algorithms, Models & Tools
Machine Learning,
Bioinformatics.
Biology & Medicine Problem Setting & Goals
RNA processing regulation,
Clinical data analysis.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1
Memorial Sloan-Kettering Cancer Center
Biomedical Data Sciences Group
Facts
Cost of collecting data drops, amounts increase exponentially.
We have more data than accurate algorithms.
Group’s research
Data Science Algorithms, Models & Tools
Machine Learning,
Bioinformatics.
Biology & Medicine Problem Setting & Goals
RNA processing regulation,
Clinical data analysis.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1
Memorial Sloan-Kettering Cancer Center
Biomedical Data Sciences Group
Facts
Cost of collecting data drops, amounts increase exponentially.
We have more data than accurate algorithms.
Group’s research
Data Science Algorithms, Models & Tools
Machine Learning,
Bioinformatics.
Biology & Medicine Problem Setting & Goals
RNA processing regulation,
Clinical data analysis.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1
Memorial Sloan-Kettering Cancer Center
Learning About the Central Dogma
Goal: Learn to predict what these processes accomplish:
Given the DNA, . . . , predict all gene products
f (DNA, ) = RNA g(RNA, ) = protein
Estimating f , g amounts to cracking the codes of
transcription, epigenetics, splicing, . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 2
Memorial Sloan-Kettering Cancer Center
RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
RNA-seq based Transcriptome Characterization
oqtans.org
cloud.oqtans.org
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
Transcript Quantitation and Dependence on Alignments
0 1 2 3 4 5 6 7 8
0.5
0.55
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
maximal number of mismatches
Pearsoncorrelationwithtrueabundance
True read origins
Alignments to
genome
92%
53%
84%
False alignments, multi-mappers etc. lead to weaker results
Simulated human reads from transcripts of known abundance (Fluxsimulator, [Sammeth, 2009]), 3% error rate, alignment w/
PALMapper [Jean et al., 2010], quantification w/ rQuant [Bohnert et al., 2009], Person correlation over considered transcripts.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 4
Memorial Sloan-Kettering Cancer Center
bioRxiv dx.doi.org/10.1101/017103
Efficient BAM file postprocessor for RNA- & DNA-seq
100M alignments in 20 minutes (10 threads)
Suitable for large-scale projects
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/mmr (C++)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
bioRxiv dx.doi.org/10.1101/017103
Efficient BAM file postprocessor for RNA- & DNA-seq
100M alignments in 20 minutes (10 threads)
Suitable for large-scale projects
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/mmr (C++)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
bioRxiv dx.doi.org/10.1101/017103
Efficient BAM file postprocessor for RNA- & DNA-seq
100M alignments in 20 minutes (10 threads)
Suitable for large-scale projects
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/mmr (C++)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
location 1 location 2
Coverage
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
Read not mapped to location 1 ... ... but mapped to location 2
Read pairCoverage
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
+
Read not mapped to location 1 ... ... but mapped to location 2
Read pair Variance measure Evaluation windowCoverage Average coverage
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
+
+
Read pair Variance measure Evaluation windowCoverage Average coverage
Read not mapped to location 1 ... ... but mapped to location 2
Read mapped to location 1 ... ... and not mapped to location 2
Assignment1Assignment2
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
Multiple Mapper Resolution
Results for simulated DNA-seq
Smooths coverage as expected on an artificial dataset
Simulated reads from tiling a part of A. thaliana genome, alignment w/ PALMapper [Jean et al., 2010] (with -a option), visual-
ization with IGV [Robinson et al., 2011].
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 7
Memorial Sloan-Kettering Cancer Center
Multiple Mapper Resolution
Results for simulated DNA-seq
Smooths coverage as expected on an artificial dataset
≥ ≥
Simulated reads from tiling a part of A. thaliana genome, alignment w/ PALMapper [Jean et al., 2010] (with -a option), visual-
ization with IGV [Robinson et al., 2011].
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 7
Memorial Sloan-Kettering Cancer Center
Multiple Mapper Resolution
Results for simulated RNA-seq
Improves performance of transcript quantification
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
1
0 1 2 4 6 all
Correlation
Mismatches
original alignments
best alignment only
MMR treated alignment
Simulated reads (75nt) from subset of human annotated transcripts with Fluxsimulator [Sammeth, 2009], PALMapper alignments
[Jean et al., 2010], rQuant quantitation Bohnert et al. [2009], Pearson correlation over all considered transcripts.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 8
Memorial Sloan-Kettering Cancer Center
LARGE-SCALE BIOLOGY ARTICLE
Nonsense-Mediated Decay of Alternative Precursor mRNA
Splicing Variants Is a Major Determinant of the Arabidopsis
Steady State TranscriptomeC W
Gabriele Drechsel,a,1 André Kahles,b,1 Anil K. Kesarwani,a Eva Stauffer,a,2 Jonas Behr,b Philipp Drewe,b
Gunnar Rätsch,b and Andreas Wachtera,3
a Center for Plant Molecular Biology, University of Tübingen, 72076 Tuebingen, Germany
b Computational Biology Center, Sloan-Kettering Institute, New York, New York 10065
ORCID IDs: 0000-0002-3411-0692 (A.K.); 0000-0001-5486-8532 (G.R.); 0000-0002-3132-5161 (A.W.).
The nonsense-mediated decay (NMD) surveillance pathway can recognize erroneous transcripts and physiological mRNA
such as precursor mRNA alternative splicing (AS) variants. Currently, information on the global extent of coupled AS and NM
The Plant Cell, Vol. 25: 3726–3742, October 2013, www.plantcell.org ã 2013 American Society of Plant Biologists. All rights reserved.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 9
bioRxiv dx.doi.org/10.1101/017095
Analysis of alternative isoforms with RNA-seq data
Analyses known and identifies novel splicing events
Quantifies & visualizes splicing-related data
Suitable for large-scale projects (1000’s of samples)
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/spladder (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
bioRxiv dx.doi.org/10.1101/017095
Analysis of alternative isoforms with RNA-seq data
Analyses known and identifies novel splicing events
Quantifies & visualizes splicing-related data
Suitable for large-scale projects (1000’s of samples)
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/spladder (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
bioRxiv dx.doi.org/10.1101/017095
Analysis of alternative isoforms with RNA-seq data
Analyses known and identifies novel splicing events
Quantifies & visualizes splicing-related data
Suitable for large-scale projects (1000’s of samples)
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/spladder (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
Solution
Augment annotation with RNA-Seq evidence
Use single splicing events instead of full transcripts
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
Solution
Augment annotation with RNA-Seq evidence
Use single splicing events instead of full transcripts
Annotation
Alignment
Data
Augmented
Splicing
Graph
Detected
Splice
Events
Quantified
Splice
Events
Differential
Analysis/
sQTL Tests
Kahles et al., bioRxiv, 2015
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
Solution
Augment annotation with RNA-Seq evidence
Use single splicing events instead of full transcripts
Annotation
Alignment
Data
Augmented
Splicing
Graph
Detected
Splice
Events
Quantified
Splice
Events
Differential
Analysis/
sQTL Tests
Kahles et al., bioRxiv, 2015
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
T1E1 T1E2 T1E3 T1E4
T2E1 T2E2 T2E3
T3E1 T3E2 T3E3
T4E1 T4E2
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
T1E1 T1E2 T1E3 T1E4
T2E1 T2E2 T2E3
T3E1 T3E2 T3E3
E1 E3 E4 E6
E2 E5
T4E1 T4E2
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
coverage
split alignments
New cassette exon
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
coverage
split alignments
New cassette exon
coverage
New retained intron
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
SplAdder Graph Augmentation
coverage
split alignments
A New cassette exon
coverage
B New retained intron
split alignments
C New intron
split alignments
D Alternative splice sites on both intron ends
split alignments
E New start-terminal node / New end-terminal node
split alignments
F Alternative 3’splice site / New end-terminal node
split alignments
G Alternative 5’splice site / New start terminal node
split alignments
H New exon skip
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 13
Memorial Sloan-Kettering Cancer Center
SplAdder Event Extraction
E1 E3 E4 E6
E2 E5
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14
Memorial Sloan-Kettering Cancer Center
SplAdder Event Extraction
E1 E3 E4 E6
E2 E5
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14
Memorial Sloan-Kettering Cancer Center
SplAdder Event Extraction
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
Exon Skip
Multiple Exon Skip
Alternative 5’Splice Site
Intron Retention
Alternative 3’Splice Site
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14
Memorial Sloan-Kettering Cancer Center
SplAdder Event Quantification and Visualization
Exon Skip
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
SplAdder Event Quantification and Visualization
Exon Skip
a
cb
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
SplAdder Event Quantification and Visualization
Exon Skip
a
cb
PSI =
b + c
2 · a + b + c
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
SplAdder Event Quantification and Visualization
Alternative 5’ Site
a
b
Exon Skip
a
cb
PSI =
b + c
2 · a + b + c
PSI =
b
a + b
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
SplAdder Event Quantification and Visualization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
SplAdder Event Quantification and Visualization
Summary
SplAdder effectively augments the annotation
Enables quantitative analysis of events instead of transcripts
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
Splicing Analysis Across Multiple Cancer Types
Goals
1 Identify cancer-specific splicing patterns
2 Identify variants regulating splicing in same gene (cis)
3 Identify variants regulating splicing in other cancer genes (trans)
TCGA provides RNA-seq and matching exome data
RNA-seq Find & quantify splicing events
Exome Identify variants in exons & flanking intronic regions
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 16
Memorial Sloan-Kettering Cancer Center
Splicing Analysis Across Multiple Cancer Types
Goals
1 Identify cancer-specific splicing patterns
2 Identify variants regulating splicing in same gene (cis)
3 Identify variants regulating splicing in other cancer genes (trans)
TCGA provides RNA-seq and matching exome data
RNA-seq Find & quantify splicing events
Exome Identify variants in exons & flanking intronic regions
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 16
Memorial Sloan-Kettering Cancer Center
Splicing Variation Across 4,700 Samples
Event Statistics Exon Skip Eve
A B
Analysis of a total of 4,700 RNA-seq samples from TCGA normal (tn), TCGA tumors (tc), Encode (ec) and Geuvadis (gv). Align-
ment w/ STAR [Dobin et al., 2013], analysis w/ SplAdder (SplA) and Gencode annotation (Anno). Figure from [Kahles, 2014].
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 17
Memorial Sloan-Kettering Cancer Center
Uniform analysis of Large-Scale RNA-seq Data
Large-scale Compute
4,700 RNA-seq libraries (≈100 TB)
⇒ STAR ≈ 6 CPU years
⇒ SplAdder ≈ 0.5 CPU years
[Kahles et al.]
Unified community resources
Docker with ICGC RNA-seq alignment SOP
bioweb.me/ICGC-RNA-SOP
Syncronize with Encode, gTex, TCGA, . . .
[ICGC PCAWG-3 WG]
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 18
Memorial Sloan-Kettering Cancer Center
Uniform analysis of Large-Scale RNA-seq Data
Large-scale Compute
4,700 RNA-seq libraries (≈100 TB)
⇒ STAR ≈ 6 CPU years
⇒ SplAdder ≈ 0.5 CPU years
[Kahles et al.]
Unified community resources
Docker with ICGC RNA-seq alignment SOP
bioweb.me/ICGC-RNA-SOP
Syncronize with Encode, gTex, TCGA, . . .
[ICGC PCAWG-3 WG]
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 18
Memorial Sloan-Kettering Cancer Center
bioRxiv dx.doi.org/10.1101/017095
Analysis of Ribosome profiling and RNA-seq data
Study translation efficiency
Adjusts for expression differences
Accurate method based on dispersion estimates and GLMs
Open Source bioweb.me/ribodiff (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
bioRxiv dx.doi.org/10.1101/017095
Analysis of Ribosome profiling and RNA-seq data
Study translation efficiency
Adjusts for expression differences
Accurate method based on dispersion estimates and GLMs
Open Source bioweb.me/ribodiff (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
bioRxiv dx.doi.org/10.1101/017095
Analysis of Ribosome profiling and RNA-seq data
Study translation efficiency
Adjusts for expression differences
Accurate method based on dispersion estimates and GLMs
Open Source bioweb.me/ribodiff (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
Application to Ribosome Profiling
Found a motif that was strongly in enriched in detected genes:
G-quadruplex structures
Memorial Sloan-Kettering Cancer Center
Application to Ribosome Profiling
Found a motif that was strongly in enriched in detected genes:
G-quadruplex structures
Memorial Sloan-Kettering Cancer Center
(Analysis based on related but different strategy [Wolfe et al., 2014].)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 20
Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
Acknowledgements
R¨atsch Laboratory
Andre Kahles
Yi Zhong
Philipp Drewe @ MDC Berlin
Theofanis Karaletsos
Kjong Van Lehmann
Jonas Behr @ ETH Basel
Regina Bohnert @ Molecular Health
Geraldine Jean @ University of Nantes
Cancer Biology
Guido Wendel
Kamini Singh, . . .
Cancer Genomics Projects
Angela Brooks, Broad
Alvis Brazma, EBI
Matt Wilkerson, UNC
Niki Schultz, MSKCC
Chris Sander, MSKCC
Funding from MSKCC, Max Planck Society,
European Union, German Research Foundation,
Geoffrey Beene Foundation & NIH
Thank you!
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 22
Acknowledgements
R¨atsch Laboratory
Andre Kahles
Yi Zhong
Philipp Drewe @ MDC Berlin
Theofanis Karaletsos
Kjong Van Lehmann
Jonas Behr @ ETH Basel
Regina Bohnert @ Molecular Health
Geraldine Jean @ University of Nantes
Cancer Biology
Guido Wendel
Kamini Singh, . . .
Cancer Genomics Projects
Angela Brooks, Broad
Alvis Brazma, EBI
Matt Wilkerson, UNC
Niki Schultz, MSKCC
Chris Sander, MSKCC
Funding from MSKCC, Max Planck Society,
European Union, German Research Foundation,
Geoffrey Beene Foundation & NIH
Thank you!
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 22
References I
J. Behr, G. Schweikert, J. Cao, F. De Bona, G. Zeller, S. Laubinger, S. Ossowski,
K. Schneeberger, D. Weigel, and G. R¨atsch. Rna-seq and tiling arrays for improved gene
finding. Oral presentation at the CSHL Genome Informatics Meeting, September 2008. URL
http:
//www.fml.tuebingen.mpg.de/raetsch/lectures/RaetschGenomeInformatics08.pdf.
R. Bohnert, J. Behr, and G R¨atsch. Transcript quantification with RNA-Seq data. BMC
Bioinformatics, 10(S13):P5, 2009. URL
http://www.biomedcentral.com/1471-2105/10/S13/P5.
RM Clark, G Schweikert, C Toomajian, S Ossowski, G Zeller, P Shinn, N Warthmann, TT Hu,
G Fu, DA Hinds, H Chen, KA Frazer, DH Huson, B Sch¨olkopf, M Nordborg, G R¨atsch,
JR Ecker, and D Weigel. Common sequence polymorphisms shaping genetic diversity in
arabidopsis thaliana. Science, 317(5836):338–342, 2007. ISSN 1095-9203 (Electronic). doi:
10.1126/science.1138632.
Alexander Dobin, Carrie a Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha,
Philippe Batut, Mark Chaisson, and Thomas R Gingeras. STAR: ultrafast universal RNA-seq
aligner. Bioinformatics (Oxford, England), 29(1):15–21, January 2013. ISSN 1367-4811. doi:
10.1093/bioinformatics/bts635. URL http://www.ncbi.nlm.nih.gov/pubmed/23104886.
Mitchell Guttman, Manuel Garber, Joshua Z Levin, Julie Donaghey, James Robinson, Xian
Adiconis, Lin Fan, Magdalena J Koziol, Andreas Gnirke, Chad Nusbaum, John L Rinn,
Eric S Lander, and Aviv Regev. Ab initio reconstruction of cell type-specific transcriptomes
in mouse reveals the conserved multi-exonic structure of lincrnas. Nat Biotechnol, 28(5):
503–10, May 2010. doi: 10.1038/nbt.1633.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 23
References II
G Jean, A Kahles, VT Sreedharan, F De Bona, and G R¨atsch. Rna-seq read alignments with
palmapper. Curr Protoc Bioinformatics, Unit 11.6, 2010.
Andre Kahles. Novel Methods for the Computational Analysis of RNA-Seq Data with
Applications to Alternative Splicing. PhD thesis, University of T¨ubingen, T¨ubingen,
Germany, September 2014.
G. R¨atsch and S. Sonnenburg. Accurate splice site detection for Caenorhabditis elegans. In
K. Tsuda B. Schoelkopf and J.-P. Vert, editors, Kernel Methods in Computational Biology.
MIT Press, 2004.
G. R¨atsch, S. Sonnenburg, and B. Sch¨olkopf. RASE: recognition of alternatively spliced exons
in C. elegans. Bioinformatics, 21(Suppl. 1):i369–i377, June 2005.
James T. Robinson, Helga Thorvaldsd´ottir, Wendy Winckler, Mitchell Guttman, Eric S. Lander,
Gad Getz, and Jill P. Mesirov. Integrative genomics viewer. Nature Biotechnology, 29:
24–26, 2011.
M. Sammeth. The Flux Simulator. Website, 2009. http://flux.sammeth.net/simulator.html.
Gabriele Schweikert, Alexander Zien, Georg Zeller, Jonas Behr, Christoph Dieterich,
Cheng Soon Ong, Petra Philips, Fabio De Bona, Lisa Hartmann, Anja Bohlen, Nina Kr¨uger,
S¨oren Sonnenburg, and Gunnar R¨atsch. mgene: Accurate svm-based gene finding with an
application to nematode genomes. Genome Research, 2009. URL
http://genome.cshlp.org/content/early/2009/06/29/gr.090597.108.full.pdf+html.
Advance access June 29, 2009.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 24
References III
S. Sonnenburg, G. R¨atsch, A. Jagota, and K.-R. M¨uller. New methods for splice-site
recognition. In Proc. International Conference on Artificial Neural Networks, 2002.
S¨oren Sonnenburg, Alexander Zien, and Gunnar R¨atsch. ARTS: Accurate Recognition of
Transcription Starts in Human. Bioinformatics, 22(14):e472–480, 2006.
Cole Trapnell, Brian A Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J van
Baren, Steven L Salzberg, Barbara J Wold, and Lior Pachter. Transcript assembly and
quantification by rna-seq reveals unannotated transcripts and isoform switching during cell
differentiation. Nature Biotech, advance online publication, May 2010. doi:
10.1038/nbt.1621. URL http://dx.doi.org/10.1038/nbt.1621.
A Wolfe, K Singh, Y Zhong, P Drewe, others, G R¨atsch, and HG Wendel. Rna g-quadruplexes
cause eif4a-dependent oncogene translation in cancer. Nature, 2014. doi:
10.1038/nature13485.
G Zeller, RM Clark, K Schneeberger, A Bohlen, D Weigel, and G Ratsch. Detecting
polymorphic regions in arabidopsis thaliana with resequencing microarrays. Genome Res, 18
(6):918–929, 2008. ISSN 1088-9051 (Print). doi: 10.1101/gr.070169.107.
A. Zien, G. R¨atsch, S. Mika, B. Sch¨olkopf, T. Lengauer, and K.-R. M¨uller. Engineering Support
Vector Machine Kernels That Recognize Translation Initiation Sites. BioInformatics, 16(9):
799–807, September 2000.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 25

More Related Content

What's hot

Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
VHIR Vall d’Hebron Institut de Recerca
 
Transcript detection in RNAseq
Transcript detection in RNAseqTranscript detection in RNAseq
Transcript detection in RNAseq
Denis C. Bauer
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processing
mikaelhuss
 
Part 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goalPart 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goal
Joachim Jacob
 
RNASeq DE methods review Applied Bioinformatics Journal Club
RNASeq DE methods review Applied Bioinformatics Journal ClubRNASeq DE methods review Applied Bioinformatics Journal Club
RNASeq DE methods review Applied Bioinformatics Journal ClubJennifer Shelton
 
RNA-seq Data Analysis Overview
RNA-seq Data Analysis OverviewRNA-seq Data Analysis Overview
RNA-seq Data Analysis Overview
Sean Davis
 
RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3
BITS
 
Differential gene expression
Differential gene expressionDifferential gene expression
Differential gene expression
Denis C. Bauer
 
RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1
BITS
 
RNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the TranscriptomeRNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the Transcriptome
Sean Davis
 
Rna seq pipeline
Rna seq pipelineRna seq pipeline
Rna seq pipeline
Karan Veer Singh
 
RNA-Seq with R-Bioconductor
RNA-Seq with R-BioconductorRNA-Seq with R-Bioconductor
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential Expression
Jatinder Singh
 
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
Jan Aerts
 
DEseq, voom and vst
DEseq, voom and vstDEseq, voom and vst
DEseq, voom and vstQiang Kou
 
RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities
Paolo Dametto
 
Dgaston dec-06-2012
Dgaston dec-06-2012Dgaston dec-06-2012
Dgaston dec-06-2012
Dan Gaston
 

What's hot (20)

Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
 
Transcript detection in RNAseq
Transcript detection in RNAseqTranscript detection in RNAseq
Transcript detection in RNAseq
 
Rna seq
Rna seqRna seq
Rna seq
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processing
 
Part 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goalPart 1 of RNA-seq for DE analysis: Defining the goal
Part 1 of RNA-seq for DE analysis: Defining the goal
 
RNASeq DE methods review Applied Bioinformatics Journal Club
RNASeq DE methods review Applied Bioinformatics Journal ClubRNASeq DE methods review Applied Bioinformatics Journal Club
RNASeq DE methods review Applied Bioinformatics Journal Club
 
presentation
presentationpresentation
presentation
 
RNA-seq Data Analysis Overview
RNA-seq Data Analysis OverviewRNA-seq Data Analysis Overview
RNA-seq Data Analysis Overview
 
RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3RNA-seq: Mapping and quality control - part 3
RNA-seq: Mapping and quality control - part 3
 
Differential gene expression
Differential gene expressionDifferential gene expression
Differential gene expression
 
RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1RNA-seq: general concept, goal and experimental design - part 1
RNA-seq: general concept, goal and experimental design - part 1
 
RNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the TranscriptomeRNA-seq: A High-resolution View of the Transcriptome
RNA-seq: A High-resolution View of the Transcriptome
 
Rna seq pipeline
Rna seq pipelineRna seq pipeline
Rna seq pipeline
 
RNA-Seq with R-Bioconductor
RNA-Seq with R-BioconductorRNA-Seq with R-Bioconductor
RNA-Seq with R-Bioconductor
 
RNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential ExpressionRNASeq - Analysis Pipeline for Differential Expression
RNASeq - Analysis Pipeline for Differential Expression
 
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
 
DEseq, voom and vst
DEseq, voom and vstDEseq, voom and vst
DEseq, voom and vst
 
RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities RNA sequencing: advances and opportunities
RNA sequencing: advances and opportunities
 
RNA-Seq
RNA-SeqRNA-Seq
RNA-Seq
 
Dgaston dec-06-2012
Dgaston dec-06-2012Dgaston dec-06-2012
Dgaston dec-06-2012
 

Similar to Talk ABRF 2015 (Gunnar Rätsch)

Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Elia Brodsky
 
Molecular Biology Software Links
Molecular Biology Software LinksMolecular Biology Software Links
Molecular Biology Software Links
university of education,Lahore
 
RNA-seq based Genome Annotation with mGene.ngs and MiTie
RNA-seq based Genome Annotation with mGene.ngs and MiTieRNA-seq based Genome Annotation with mGene.ngs and MiTie
RNA-seq based Genome Annotation with mGene.ngs and MiTie
Gunnar Rätsch
 
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Golden Helix Inc
 
1073958 wp guide-develop-pcr_primers_1012
1073958 wp guide-develop-pcr_primers_10121073958 wp guide-develop-pcr_primers_1012
1073958 wp guide-develop-pcr_primers_1012Elsa von Licy
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
GenomeInABottle
 
140127 abrf interlaboratory study proposal
140127 abrf interlaboratory study proposal140127 abrf interlaboratory study proposal
140127 abrf interlaboratory study proposalGenomeInABottle
 
160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshop
GenomeInABottle
 
Tools for Transcriptome Data Analysis
Tools for Transcriptome Data AnalysisTools for Transcriptome Data Analysis
Tools for Transcriptome Data Analysis
SANJANA PANDEY
 
Data analysis pipelines for NGS applications
Data analysis pipelines for NGS applicationsData analysis pipelines for NGS applications
Data analysis pipelines for NGS applications
Vall d'Hebron Institute of Research (VHIR)
 
Assign 2.0 software for the analysis of Phred quality values for quality con...
Assign 2.0  software for the analysis of Phred quality values for quality con...Assign 2.0  software for the analysis of Phred quality values for quality con...
Assign 2.0 software for the analysis of Phred quality values for quality con...
Crystal Sanchez
 
Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09
Sean Davis
 
Achieve Complete Coverage of the SARS-CoV-2 Genome
Achieve Complete Coverage of the SARS-CoV-2 GenomeAchieve Complete Coverage of the SARS-CoV-2 Genome
Achieve Complete Coverage of the SARS-CoV-2 Genome
Camille Cappello
 
RNA Seq Data Analysis
RNA Seq Data AnalysisRNA Seq Data Analysis
RNA Seq Data Analysis
Ravi Gandham
 
Analyzing Genomic Data with PyEnsembl and Varcode
Analyzing Genomic Data with PyEnsembl and VarcodeAnalyzing Genomic Data with PyEnsembl and Varcode
Analyzing Genomic Data with PyEnsembl and Varcode
Alex Rubinsteyn
 
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing TracesDetecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Thermo Fisher Scientific
 
2023 GIAB AMP Update
2023 GIAB AMP Update2023 GIAB AMP Update
2023 GIAB AMP Update
GenomeInABottle
 
20140711 4 e_tseng_ercc2.0_workshop
20140711 4 e_tseng_ercc2.0_workshop20140711 4 e_tseng_ercc2.0_workshop
20140711 4 e_tseng_ercc2.0_workshop
External RNA Controls Consortium
 

Similar to Talk ABRF 2015 (Gunnar Rätsch) (20)

Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...
 
Molecular Biology Software Links
Molecular Biology Software LinksMolecular Biology Software Links
Molecular Biology Software Links
 
RNA-seq based Genome Annotation with mGene.ngs and MiTie
RNA-seq based Genome Annotation with mGene.ngs and MiTieRNA-seq based Genome Annotation with mGene.ngs and MiTie
RNA-seq based Genome Annotation with mGene.ngs and MiTie
 
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
 
1073958 wp guide-develop-pcr_primers_1012
1073958 wp guide-develop-pcr_primers_10121073958 wp guide-develop-pcr_primers_1012
1073958 wp guide-develop-pcr_primers_1012
 
Use of NCBI Databases in qPCR Assay Design
Use of NCBI Databases in qPCR Assay DesignUse of NCBI Databases in qPCR Assay Design
Use of NCBI Databases in qPCR Assay Design
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
 
140127 abrf interlaboratory study proposal
140127 abrf interlaboratory study proposal140127 abrf interlaboratory study proposal
140127 abrf interlaboratory study proposal
 
160627 giab for festival sv workshop
160627 giab for festival sv workshop160627 giab for festival sv workshop
160627 giab for festival sv workshop
 
Tools for Transcriptome Data Analysis
Tools for Transcriptome Data AnalysisTools for Transcriptome Data Analysis
Tools for Transcriptome Data Analysis
 
Data analysis pipelines for NGS applications
Data analysis pipelines for NGS applicationsData analysis pipelines for NGS applications
Data analysis pipelines for NGS applications
 
Assign 2.0 software for the analysis of Phred quality values for quality con...
Assign 2.0  software for the analysis of Phred quality values for quality con...Assign 2.0  software for the analysis of Phred quality values for quality con...
Assign 2.0 software for the analysis of Phred quality values for quality con...
 
Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09
 
Achieve Complete Coverage of the SARS-CoV-2 Genome
Achieve Complete Coverage of the SARS-CoV-2 GenomeAchieve Complete Coverage of the SARS-CoV-2 Genome
Achieve Complete Coverage of the SARS-CoV-2 Genome
 
RNA Seq Data Analysis
RNA Seq Data AnalysisRNA Seq Data Analysis
RNA Seq Data Analysis
 
Analyzing Genomic Data with PyEnsembl and Varcode
Analyzing Genomic Data with PyEnsembl and VarcodeAnalyzing Genomic Data with PyEnsembl and Varcode
Analyzing Genomic Data with PyEnsembl and Varcode
 
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing TracesDetecting and Quantifying Low Level Variants in Sanger Sequencing Traces
Detecting and Quantifying Low Level Variants in Sanger Sequencing Traces
 
2023 GIAB AMP Update
2023 GIAB AMP Update2023 GIAB AMP Update
2023 GIAB AMP Update
 
20140711 4 e_tseng_ercc2.0_workshop
20140711 4 e_tseng_ercc2.0_workshop20140711 4 e_tseng_ercc2.0_workshop
20140711 4 e_tseng_ercc2.0_workshop
 
KO poster 8:13
KO poster 8:13KO poster 8:13
KO poster 8:13
 

Recently uploaded

24 Upakrama.pptx class ppt useful in all
24 Upakrama.pptx class ppt useful in all24 Upakrama.pptx class ppt useful in all
24 Upakrama.pptx class ppt useful in all
DrSathishMS1
 
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.GawadHemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
NephroTube - Dr.Gawad
 
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
Catherine Liao
 
Ocular injury ppt Upendra pal optometrist upums saifai etawah
Ocular injury  ppt  Upendra pal  optometrist upums saifai etawahOcular injury  ppt  Upendra pal  optometrist upums saifai etawah
Ocular injury ppt Upendra pal optometrist upums saifai etawah
pal078100
 
Antiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptxAntiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptx
Rohit chaurpagar
 
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
kevinkariuki227
 
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model SafeSurat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Savita Shen $i11
 
Superficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptxSuperficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptx
Dr. Rabia Inam Gandapore
 
Flu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore KarnatakaFlu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore Karnataka
addon Scans
 
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdfMANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
Jim Jacob Roy
 
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTSARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
Dr. Vinay Pareek
 
Cervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptxCervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptx
Dr. Rabia Inam Gandapore
 
BRACHYTHERAPY OVERVIEW AND APPLICATORS
BRACHYTHERAPY OVERVIEW  AND  APPLICATORSBRACHYTHERAPY OVERVIEW  AND  APPLICATORS
BRACHYTHERAPY OVERVIEW AND APPLICATORS
Krishan Murari
 
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptxMaxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Dr. Rabia Inam Gandapore
 
Prix Galien International 2024 Forum Program
Prix Galien International 2024 Forum ProgramPrix Galien International 2024 Forum Program
Prix Galien International 2024 Forum Program
Levi Shapiro
 
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...
Oleg Kshivets
 
Ophthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE examOphthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE exam
KafrELShiekh University
 
POST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its managementPOST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its management
touseefaziz1
 
Are There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdfAre There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdf
Little Cross Family Clinic
 
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdfBENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
DR SETH JOTHAM
 

Recently uploaded (20)

24 Upakrama.pptx class ppt useful in all
24 Upakrama.pptx class ppt useful in all24 Upakrama.pptx class ppt useful in all
24 Upakrama.pptx class ppt useful in all
 
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.GawadHemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
Hemodialysis: Chapter 3, Dialysis Water Unit - Dr.Gawad
 
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
The POPPY STUDY (Preconception to post-partum cardiovascular function in prim...
 
Ocular injury ppt Upendra pal optometrist upums saifai etawah
Ocular injury  ppt  Upendra pal  optometrist upums saifai etawahOcular injury  ppt  Upendra pal  optometrist upums saifai etawah
Ocular injury ppt Upendra pal optometrist upums saifai etawah
 
Antiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptxAntiulcer drugs Advance Pharmacology .pptx
Antiulcer drugs Advance Pharmacology .pptx
 
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
 
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model SafeSurat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
 
Superficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptxSuperficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptx
 
Flu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore KarnatakaFlu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore Karnataka
 
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdfMANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
 
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTSARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
 
Cervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptxCervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptx
 
BRACHYTHERAPY OVERVIEW AND APPLICATORS
BRACHYTHERAPY OVERVIEW  AND  APPLICATORSBRACHYTHERAPY OVERVIEW  AND  APPLICATORS
BRACHYTHERAPY OVERVIEW AND APPLICATORS
 
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptxMaxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
Maxilla, Mandible & Hyoid Bone & Clinical Correlations by Dr. RIG.pptx
 
Prix Galien International 2024 Forum Program
Prix Galien International 2024 Forum ProgramPrix Galien International 2024 Forum Program
Prix Galien International 2024 Forum Program
 
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...
 
Ophthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE examOphthalmology Clinical Tests for OSCE exam
Ophthalmology Clinical Tests for OSCE exam
 
POST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its managementPOST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its management
 
Are There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdfAre There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdf
 
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdfBENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
 

Talk ABRF 2015 (Gunnar Rätsch)

  • 1. Analysis Tools for RNA-seq and Isoform Characterization Slides: t.co/nc7siKRm6W Gunnar R¨atsch Biomedical Data Science Group Computational Biology Center Memorial Sloan Kettering Cancer Center gxr #RNA #MMR #SplAdder #riboDiff #Cancer
  • 2. Biomedical Data Sciences Group Facts Cost of collecting data drops, amounts increase exponentially. We have more data than accurate algorithms. Group’s research Data Science Algorithms, Models & Tools Machine Learning, Bioinformatics. Biology & Medicine Problem Setting & Goals RNA processing regulation, Clinical data analysis. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1 Memorial Sloan-Kettering Cancer Center
  • 3. Biomedical Data Sciences Group Facts Cost of collecting data drops, amounts increase exponentially. We have more data than accurate algorithms. Group’s research Data Science Algorithms, Models & Tools Machine Learning, Bioinformatics. Biology & Medicine Problem Setting & Goals RNA processing regulation, Clinical data analysis. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1 Memorial Sloan-Kettering Cancer Center
  • 4. Biomedical Data Sciences Group Facts Cost of collecting data drops, amounts increase exponentially. We have more data than accurate algorithms. Group’s research Data Science Algorithms, Models & Tools Machine Learning, Bioinformatics. Biology & Medicine Problem Setting & Goals RNA processing regulation, Clinical data analysis. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1 Memorial Sloan-Kettering Cancer Center
  • 5. Learning About the Central Dogma Goal: Learn to predict what these processes accomplish: Given the DNA, . . . , predict all gene products f (DNA, ) = RNA g(RNA, ) = protein Estimating f , g amounts to cracking the codes of transcription, epigenetics, splicing, . . . c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 2 Memorial Sloan-Kettering Cancer Center
  • 6. RNA-seq based Transcriptome Characterization c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3 Memorial Sloan-Kettering Cancer Center
  • 7. RNA-seq based Transcriptome Characterization c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3 Memorial Sloan-Kettering Cancer Center
  • 8. RNA-seq based Transcriptome Characterization c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3 Memorial Sloan-Kettering Cancer Center
  • 9. RNA-seq based Transcriptome Characterization c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3 Memorial Sloan-Kettering Cancer Center
  • 10. RNA-seq based Transcriptome Characterization c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3 Memorial Sloan-Kettering Cancer Center
  • 11. RNA-seq based Transcriptome Characterization oqtans.org cloud.oqtans.org c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3 Memorial Sloan-Kettering Cancer Center
  • 12. Transcript Quantitation and Dependence on Alignments 0 1 2 3 4 5 6 7 8 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 maximal number of mismatches Pearsoncorrelationwithtrueabundance True read origins Alignments to genome 92% 53% 84% False alignments, multi-mappers etc. lead to weaker results Simulated human reads from transcripts of known abundance (Fluxsimulator, [Sammeth, 2009]), 3% error rate, alignment w/ PALMapper [Jean et al., 2010], quantification w/ rQuant [Bohnert et al., 2009], Person correlation over considered transcripts. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 4 Memorial Sloan-Kettering Cancer Center
  • 13. bioRxiv dx.doi.org/10.1101/017103 Efficient BAM file postprocessor for RNA- & DNA-seq 100M alignments in 20 minutes (10 threads) Suitable for large-scale projects Improved accuracy for transcript quantification and prediction Open Source bioweb.me/mmr (C++) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
  • 14. bioRxiv dx.doi.org/10.1101/017103 Efficient BAM file postprocessor for RNA- & DNA-seq 100M alignments in 20 minutes (10 threads) Suitable for large-scale projects Improved accuracy for transcript quantification and prediction Open Source bioweb.me/mmr (C++) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
  • 15. bioRxiv dx.doi.org/10.1101/017103 Efficient BAM file postprocessor for RNA- & DNA-seq 100M alignments in 20 minutes (10 threads) Suitable for large-scale projects Improved accuracy for transcript quantification and prediction Open Source bioweb.me/mmr (C++) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
  • 16. Multiple Mapper Resolution Principle (Iterated over all reads, N times) Use the change of local coverage around read mapping ... ... and use its smoothness to identify “better” mapping location location 1 location 2 Coverage c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6 Memorial Sloan-Kettering Cancer Center
  • 17. Multiple Mapper Resolution Principle (Iterated over all reads, N times) Use the change of local coverage around read mapping ... ... and use its smoothness to identify “better” mapping location Read not mapped to location 1 ... ... but mapped to location 2 Read pairCoverage c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6 Memorial Sloan-Kettering Cancer Center
  • 18. Multiple Mapper Resolution Principle (Iterated over all reads, N times) Use the change of local coverage around read mapping ... ... and use its smoothness to identify “better” mapping location + Read not mapped to location 1 ... ... but mapped to location 2 Read pair Variance measure Evaluation windowCoverage Average coverage c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6 Memorial Sloan-Kettering Cancer Center
  • 19. Multiple Mapper Resolution Principle (Iterated over all reads, N times) Use the change of local coverage around read mapping ... ... and use its smoothness to identify “better” mapping location + + Read pair Variance measure Evaluation windowCoverage Average coverage Read not mapped to location 1 ... ... but mapped to location 2 Read mapped to location 1 ... ... and not mapped to location 2 Assignment1Assignment2 c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6 Memorial Sloan-Kettering Cancer Center
  • 20. Multiple Mapper Resolution Results for simulated DNA-seq Smooths coverage as expected on an artificial dataset Simulated reads from tiling a part of A. thaliana genome, alignment w/ PALMapper [Jean et al., 2010] (with -a option), visual- ization with IGV [Robinson et al., 2011]. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 7 Memorial Sloan-Kettering Cancer Center
  • 21. Multiple Mapper Resolution Results for simulated DNA-seq Smooths coverage as expected on an artificial dataset ≥ ≥ Simulated reads from tiling a part of A. thaliana genome, alignment w/ PALMapper [Jean et al., 2010] (with -a option), visual- ization with IGV [Robinson et al., 2011]. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 7 Memorial Sloan-Kettering Cancer Center
  • 22. Multiple Mapper Resolution Results for simulated RNA-seq Improves performance of transcript quantification 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1 0 1 2 4 6 all Correlation Mismatches original alignments best alignment only MMR treated alignment Simulated reads (75nt) from subset of human annotated transcripts with Fluxsimulator [Sammeth, 2009], PALMapper alignments [Jean et al., 2010], rQuant quantitation Bohnert et al. [2009], Pearson correlation over all considered transcripts. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 8 Memorial Sloan-Kettering Cancer Center
  • 23. LARGE-SCALE BIOLOGY ARTICLE Nonsense-Mediated Decay of Alternative Precursor mRNA Splicing Variants Is a Major Determinant of the Arabidopsis Steady State TranscriptomeC W Gabriele Drechsel,a,1 André Kahles,b,1 Anil K. Kesarwani,a Eva Stauffer,a,2 Jonas Behr,b Philipp Drewe,b Gunnar Rätsch,b and Andreas Wachtera,3 a Center for Plant Molecular Biology, University of Tübingen, 72076 Tuebingen, Germany b Computational Biology Center, Sloan-Kettering Institute, New York, New York 10065 ORCID IDs: 0000-0002-3411-0692 (A.K.); 0000-0001-5486-8532 (G.R.); 0000-0002-3132-5161 (A.W.). The nonsense-mediated decay (NMD) surveillance pathway can recognize erroneous transcripts and physiological mRNA such as precursor mRNA alternative splicing (AS) variants. Currently, information on the global extent of coupled AS and NM The Plant Cell, Vol. 25: 3726–3742, October 2013, www.plantcell.org ã 2013 American Society of Plant Biologists. All rights reserved. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 9
  • 24. bioRxiv dx.doi.org/10.1101/017095 Analysis of alternative isoforms with RNA-seq data Analyses known and identifies novel splicing events Quantifies & visualizes splicing-related data Suitable for large-scale projects (1000’s of samples) Improved accuracy for transcript quantification and prediction Open Source bioweb.me/spladder (python) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
  • 25. bioRxiv dx.doi.org/10.1101/017095 Analysis of alternative isoforms with RNA-seq data Analyses known and identifies novel splicing events Quantifies & visualizes splicing-related data Suitable for large-scale projects (1000’s of samples) Improved accuracy for transcript quantification and prediction Open Source bioweb.me/spladder (python) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
  • 26. bioRxiv dx.doi.org/10.1101/017095 Analysis of alternative isoforms with RNA-seq data Analyses known and identifies novel splicing events Quantifies & visualizes splicing-related data Suitable for large-scale projects (1000’s of samples) Improved accuracy for transcript quantification and prediction Open Source bioweb.me/spladder (python) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
  • 27. SplAdder Ideas Major Problems in Transcriptome Analysis 1 Gene annotations are incomplete and often inaccurate 2 Whole transcript isoforms are difficult to predict/quantify c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11 Memorial Sloan-Kettering Cancer Center
  • 28. SplAdder Ideas Major Problems in Transcriptome Analysis 1 Gene annotations are incomplete and often inaccurate 2 Whole transcript isoforms are difficult to predict/quantify c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11 Memorial Sloan-Kettering Cancer Center
  • 29. SplAdder Ideas Major Problems in Transcriptome Analysis 1 Gene annotations are incomplete and often inaccurate 2 Whole transcript isoforms are difficult to predict/quantify Solution Augment annotation with RNA-Seq evidence Use single splicing events instead of full transcripts c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11 Memorial Sloan-Kettering Cancer Center
  • 30. SplAdder Ideas Major Problems in Transcriptome Analysis 1 Gene annotations are incomplete and often inaccurate 2 Whole transcript isoforms are difficult to predict/quantify Solution Augment annotation with RNA-Seq evidence Use single splicing events instead of full transcripts Annotation Alignment Data Augmented Splicing Graph Detected Splice Events Quantified Splice Events Differential Analysis/ sQTL Tests Kahles et al., bioRxiv, 2015 c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11 Memorial Sloan-Kettering Cancer Center
  • 31. SplAdder Ideas Major Problems in Transcriptome Analysis 1 Gene annotations are incomplete and often inaccurate 2 Whole transcript isoforms are difficult to predict/quantify Solution Augment annotation with RNA-Seq evidence Use single splicing events instead of full transcripts Annotation Alignment Data Augmented Splicing Graph Detected Splice Events Quantified Splice Events Differential Analysis/ sQTL Tests Kahles et al., bioRxiv, 2015 c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11 Memorial Sloan-Kettering Cancer Center
  • 32. SplAdder Graph Augmentation Principle Collapse annotated transcripts into graph representation Use RNA-Seq evidence to add new nodes and edges T1E1 T1E2 T1E3 T1E4 T2E1 T2E2 T2E3 T3E1 T3E2 T3E3 T4E1 T4E2 c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12 Memorial Sloan-Kettering Cancer Center
  • 33. SplAdder Graph Augmentation Principle Collapse annotated transcripts into graph representation Use RNA-Seq evidence to add new nodes and edges T1E1 T1E2 T1E3 T1E4 T2E1 T2E2 T2E3 T3E1 T3E2 T3E3 E1 E3 E4 E6 E2 E5 T4E1 T4E2 c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12 Memorial Sloan-Kettering Cancer Center
  • 34. SplAdder Graph Augmentation Principle Collapse annotated transcripts into graph representation Use RNA-Seq evidence to add new nodes and edges coverage split alignments New cassette exon c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12 Memorial Sloan-Kettering Cancer Center
  • 35. SplAdder Graph Augmentation Principle Collapse annotated transcripts into graph representation Use RNA-Seq evidence to add new nodes and edges coverage split alignments New cassette exon coverage New retained intron c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12 Memorial Sloan-Kettering Cancer Center
  • 36. SplAdder Graph Augmentation coverage split alignments A New cassette exon coverage B New retained intron split alignments C New intron split alignments D Alternative splice sites on both intron ends split alignments E New start-terminal node / New end-terminal node split alignments F Alternative 3’splice site / New end-terminal node split alignments G Alternative 5’splice site / New start terminal node split alignments H New exon skip c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 13 Memorial Sloan-Kettering Cancer Center
  • 37. SplAdder Event Extraction E1 E3 E4 E6 E2 E5 c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14 Memorial Sloan-Kettering Cancer Center
  • 38. SplAdder Event Extraction E1 E3 E4 E6 E2 E5 c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14 Memorial Sloan-Kettering Cancer Center
  • 39. SplAdder Event Extraction E1 E3 E4 E6 E2 E5 E1 E3 E4 E6 E2 E5 E1 E3 E4 E6 E2 E5 E1 E3 E4 E6 E2 E5 E1 E3 E4 E6 E2 E5 Exon Skip Multiple Exon Skip Alternative 5’Splice Site Intron Retention Alternative 3’Splice Site c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14 Memorial Sloan-Kettering Cancer Center
  • 40. SplAdder Event Quantification and Visualization Exon Skip c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15 Memorial Sloan-Kettering Cancer Center
  • 41. SplAdder Event Quantification and Visualization Exon Skip a cb c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15 Memorial Sloan-Kettering Cancer Center
  • 42. SplAdder Event Quantification and Visualization Exon Skip a cb PSI = b + c 2 · a + b + c c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15 Memorial Sloan-Kettering Cancer Center
  • 43. SplAdder Event Quantification and Visualization Alternative 5’ Site a b Exon Skip a cb PSI = b + c 2 · a + b + c PSI = b a + b c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15 Memorial Sloan-Kettering Cancer Center
  • 44. SplAdder Event Quantification and Visualization c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15 Memorial Sloan-Kettering Cancer Center
  • 45. SplAdder Event Quantification and Visualization Summary SplAdder effectively augments the annotation Enables quantitative analysis of events instead of transcripts c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15 Memorial Sloan-Kettering Cancer Center
  • 46. Splicing Analysis Across Multiple Cancer Types Goals 1 Identify cancer-specific splicing patterns 2 Identify variants regulating splicing in same gene (cis) 3 Identify variants regulating splicing in other cancer genes (trans) TCGA provides RNA-seq and matching exome data RNA-seq Find & quantify splicing events Exome Identify variants in exons & flanking intronic regions c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 16 Memorial Sloan-Kettering Cancer Center
  • 47. Splicing Analysis Across Multiple Cancer Types Goals 1 Identify cancer-specific splicing patterns 2 Identify variants regulating splicing in same gene (cis) 3 Identify variants regulating splicing in other cancer genes (trans) TCGA provides RNA-seq and matching exome data RNA-seq Find & quantify splicing events Exome Identify variants in exons & flanking intronic regions c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 16 Memorial Sloan-Kettering Cancer Center
  • 48. Splicing Variation Across 4,700 Samples Event Statistics Exon Skip Eve A B Analysis of a total of 4,700 RNA-seq samples from TCGA normal (tn), TCGA tumors (tc), Encode (ec) and Geuvadis (gv). Align- ment w/ STAR [Dobin et al., 2013], analysis w/ SplAdder (SplA) and Gencode annotation (Anno). Figure from [Kahles, 2014]. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 17 Memorial Sloan-Kettering Cancer Center
  • 49. Uniform analysis of Large-Scale RNA-seq Data Large-scale Compute 4,700 RNA-seq libraries (≈100 TB) ⇒ STAR ≈ 6 CPU years ⇒ SplAdder ≈ 0.5 CPU years [Kahles et al.] Unified community resources Docker with ICGC RNA-seq alignment SOP bioweb.me/ICGC-RNA-SOP Syncronize with Encode, gTex, TCGA, . . . [ICGC PCAWG-3 WG] c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 18 Memorial Sloan-Kettering Cancer Center
  • 50. Uniform analysis of Large-Scale RNA-seq Data Large-scale Compute 4,700 RNA-seq libraries (≈100 TB) ⇒ STAR ≈ 6 CPU years ⇒ SplAdder ≈ 0.5 CPU years [Kahles et al.] Unified community resources Docker with ICGC RNA-seq alignment SOP bioweb.me/ICGC-RNA-SOP Syncronize with Encode, gTex, TCGA, . . . [ICGC PCAWG-3 WG] c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 18 Memorial Sloan-Kettering Cancer Center
  • 51. bioRxiv dx.doi.org/10.1101/017095 Analysis of Ribosome profiling and RNA-seq data Study translation efficiency Adjusts for expression differences Accurate method based on dispersion estimates and GLMs Open Source bioweb.me/ribodiff (python) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
  • 52. bioRxiv dx.doi.org/10.1101/017095 Analysis of Ribosome profiling and RNA-seq data Study translation efficiency Adjusts for expression differences Accurate method based on dispersion estimates and GLMs Open Source bioweb.me/ribodiff (python) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
  • 53. bioRxiv dx.doi.org/10.1101/017095 Analysis of Ribosome profiling and RNA-seq data Study translation efficiency Adjusts for expression differences Accurate method based on dispersion estimates and GLMs Open Source bioweb.me/ribodiff (python) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
  • 54. Application to Ribosome Profiling Found a motif that was strongly in enriched in detected genes: G-quadruplex structures Memorial Sloan-Kettering Cancer Center Application to Ribosome Profiling Found a motif that was strongly in enriched in detected genes: G-quadruplex structures Memorial Sloan-Kettering Cancer Center (Analysis based on related but different strategy [Wolfe et al., 2014].) c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 20
  • 55. Summary MMR improves alignment choice for multi-mappers ⇒ Helps improving accuracy of tools like Cufflinks SplAdder identifies, quantifies & visualizes alternative splicing ⇒ Finds unannotated alternative splicing, tumor/normal splicing differences; splicing reprogramming; sQTLs riboDiff accurately detects differential translation efficiency ⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in 5’ UTR that interacts with compound via eIF4a Tools (+ six other ones) are open source and available ⇒ ratschlab.org/tools ⇒ (more) Docker images come soon . . . c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21 Memorial Sloan-Kettering Cancer Center
  • 56. Summary MMR improves alignment choice for multi-mappers ⇒ Helps improving accuracy of tools like Cufflinks SplAdder identifies, quantifies & visualizes alternative splicing ⇒ Finds unannotated alternative splicing, tumor/normal splicing differences; splicing reprogramming; sQTLs riboDiff accurately detects differential translation efficiency ⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in 5’ UTR that interacts with compound via eIF4a Tools (+ six other ones) are open source and available ⇒ ratschlab.org/tools ⇒ (more) Docker images come soon . . . c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21 Memorial Sloan-Kettering Cancer Center
  • 57. Summary MMR improves alignment choice for multi-mappers ⇒ Helps improving accuracy of tools like Cufflinks SplAdder identifies, quantifies & visualizes alternative splicing ⇒ Finds unannotated alternative splicing, tumor/normal splicing differences; splicing reprogramming; sQTLs riboDiff accurately detects differential translation efficiency ⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in 5’ UTR that interacts with compound via eIF4a Tools (+ six other ones) are open source and available ⇒ ratschlab.org/tools ⇒ (more) Docker images come soon . . . c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21 Memorial Sloan-Kettering Cancer Center
  • 58. Summary MMR improves alignment choice for multi-mappers ⇒ Helps improving accuracy of tools like Cufflinks SplAdder identifies, quantifies & visualizes alternative splicing ⇒ Finds unannotated alternative splicing, tumor/normal splicing differences; splicing reprogramming; sQTLs riboDiff accurately detects differential translation efficiency ⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in 5’ UTR that interacts with compound via eIF4a Tools (+ six other ones) are open source and available ⇒ ratschlab.org/tools ⇒ (more) Docker images come soon . . . c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21 Memorial Sloan-Kettering Cancer Center
  • 59. Acknowledgements R¨atsch Laboratory Andre Kahles Yi Zhong Philipp Drewe @ MDC Berlin Theofanis Karaletsos Kjong Van Lehmann Jonas Behr @ ETH Basel Regina Bohnert @ Molecular Health Geraldine Jean @ University of Nantes Cancer Biology Guido Wendel Kamini Singh, . . . Cancer Genomics Projects Angela Brooks, Broad Alvis Brazma, EBI Matt Wilkerson, UNC Niki Schultz, MSKCC Chris Sander, MSKCC Funding from MSKCC, Max Planck Society, European Union, German Research Foundation, Geoffrey Beene Foundation & NIH Thank you! c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 22
  • 60. Acknowledgements R¨atsch Laboratory Andre Kahles Yi Zhong Philipp Drewe @ MDC Berlin Theofanis Karaletsos Kjong Van Lehmann Jonas Behr @ ETH Basel Regina Bohnert @ Molecular Health Geraldine Jean @ University of Nantes Cancer Biology Guido Wendel Kamini Singh, . . . Cancer Genomics Projects Angela Brooks, Broad Alvis Brazma, EBI Matt Wilkerson, UNC Niki Schultz, MSKCC Chris Sander, MSKCC Funding from MSKCC, Max Planck Society, European Union, German Research Foundation, Geoffrey Beene Foundation & NIH Thank you! c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 22
  • 61. References I J. Behr, G. Schweikert, J. Cao, F. De Bona, G. Zeller, S. Laubinger, S. Ossowski, K. Schneeberger, D. Weigel, and G. R¨atsch. Rna-seq and tiling arrays for improved gene finding. Oral presentation at the CSHL Genome Informatics Meeting, September 2008. URL http: //www.fml.tuebingen.mpg.de/raetsch/lectures/RaetschGenomeInformatics08.pdf. R. Bohnert, J. Behr, and G R¨atsch. Transcript quantification with RNA-Seq data. BMC Bioinformatics, 10(S13):P5, 2009. URL http://www.biomedcentral.com/1471-2105/10/S13/P5. RM Clark, G Schweikert, C Toomajian, S Ossowski, G Zeller, P Shinn, N Warthmann, TT Hu, G Fu, DA Hinds, H Chen, KA Frazer, DH Huson, B Sch¨olkopf, M Nordborg, G R¨atsch, JR Ecker, and D Weigel. Common sequence polymorphisms shaping genetic diversity in arabidopsis thaliana. Science, 317(5836):338–342, 2007. ISSN 1095-9203 (Electronic). doi: 10.1126/science.1138632. Alexander Dobin, Carrie a Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson, and Thomas R Gingeras. STAR: ultrafast universal RNA-seq aligner. Bioinformatics (Oxford, England), 29(1):15–21, January 2013. ISSN 1367-4811. doi: 10.1093/bioinformatics/bts635. URL http://www.ncbi.nlm.nih.gov/pubmed/23104886. Mitchell Guttman, Manuel Garber, Joshua Z Levin, Julie Donaghey, James Robinson, Xian Adiconis, Lin Fan, Magdalena J Koziol, Andreas Gnirke, Chad Nusbaum, John L Rinn, Eric S Lander, and Aviv Regev. Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincrnas. Nat Biotechnol, 28(5): 503–10, May 2010. doi: 10.1038/nbt.1633. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 23
  • 62. References II G Jean, A Kahles, VT Sreedharan, F De Bona, and G R¨atsch. Rna-seq read alignments with palmapper. Curr Protoc Bioinformatics, Unit 11.6, 2010. Andre Kahles. Novel Methods for the Computational Analysis of RNA-Seq Data with Applications to Alternative Splicing. PhD thesis, University of T¨ubingen, T¨ubingen, Germany, September 2014. G. R¨atsch and S. Sonnenburg. Accurate splice site detection for Caenorhabditis elegans. In K. Tsuda B. Schoelkopf and J.-P. Vert, editors, Kernel Methods in Computational Biology. MIT Press, 2004. G. R¨atsch, S. Sonnenburg, and B. Sch¨olkopf. RASE: recognition of alternatively spliced exons in C. elegans. Bioinformatics, 21(Suppl. 1):i369–i377, June 2005. James T. Robinson, Helga Thorvaldsd´ottir, Wendy Winckler, Mitchell Guttman, Eric S. Lander, Gad Getz, and Jill P. Mesirov. Integrative genomics viewer. Nature Biotechnology, 29: 24–26, 2011. M. Sammeth. The Flux Simulator. Website, 2009. http://flux.sammeth.net/simulator.html. Gabriele Schweikert, Alexander Zien, Georg Zeller, Jonas Behr, Christoph Dieterich, Cheng Soon Ong, Petra Philips, Fabio De Bona, Lisa Hartmann, Anja Bohlen, Nina Kr¨uger, S¨oren Sonnenburg, and Gunnar R¨atsch. mgene: Accurate svm-based gene finding with an application to nematode genomes. Genome Research, 2009. URL http://genome.cshlp.org/content/early/2009/06/29/gr.090597.108.full.pdf+html. Advance access June 29, 2009. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 24
  • 63. References III S. Sonnenburg, G. R¨atsch, A. Jagota, and K.-R. M¨uller. New methods for splice-site recognition. In Proc. International Conference on Artificial Neural Networks, 2002. S¨oren Sonnenburg, Alexander Zien, and Gunnar R¨atsch. ARTS: Accurate Recognition of Transcription Starts in Human. Bioinformatics, 22(14):e472–480, 2006. Cole Trapnell, Brian A Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J van Baren, Steven L Salzberg, Barbara J Wold, and Lior Pachter. Transcript assembly and quantification by rna-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotech, advance online publication, May 2010. doi: 10.1038/nbt.1621. URL http://dx.doi.org/10.1038/nbt.1621. A Wolfe, K Singh, Y Zhong, P Drewe, others, G R¨atsch, and HG Wendel. Rna g-quadruplexes cause eif4a-dependent oncogene translation in cancer. Nature, 2014. doi: 10.1038/nature13485. G Zeller, RM Clark, K Schneeberger, A Bohlen, D Weigel, and G Ratsch. Detecting polymorphic regions in arabidopsis thaliana with resequencing microarrays. Genome Res, 18 (6):918–929, 2008. ISSN 1088-9051 (Print). doi: 10.1101/gr.070169.107. A. Zien, G. R¨atsch, S. Mika, B. Sch¨olkopf, T. Lengauer, and K.-R. M¨uller. Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites. BioInformatics, 16(9): 799–807, September 2000. c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 25