The document describes a presentation given by Gunnar Rätsch on tools for RNA-seq analysis and isoform characterization. It discusses the increasing amounts of biological data and challenges in developing accurate analysis algorithms. The presentation covers multiple tools developed by Rätsch's group for analyzing RNA-seq data, including tools for transcript quantification, multiple read mapping, alternative splicing analysis and detection of novel isoforms. The tools aim to improve RNA-seq analysis for large datasets and characterization of transcript isoforms and splicing.
A workshop is intended for those who are interested in and are in the planning stages of conducting an RNA-Seq experiment. Topics to be discussed will include:
* Experimental Design of RNA-Seq experiment
* Sample preparation, best practices
* High throughput sequencing basics and choices
* Cost estimation
* Differential Gene Expression Analysis
* Data cleanup and quality assurance
* Mapping your data
* Assigning reads to genes and counting
* Analysis of differentially expressed genes
* Downstream analysis/visualizations and tables
AGRF in conjunction with EMBL Australia recently organised a workshop at Monash University Clayton. This workshop was targeted at beginners and biologists who are new to analysing Next-Gen Sequencing data. The workshop also aimed to provide users with a snapshot of bioinformatics and data analysis tips on how to begin to analyse project data. An introduction to RNA-seq data analysis was presented by AGRF Senior Bioinformatician Dr. Sonika Tyagi.
Presented: 1st August 2012
A workshop is intended for those who are interested in and are in the planning stages of conducting an RNA-Seq experiment. Topics to be discussed will include:
* Experimental Design of RNA-Seq experiment
* Sample preparation, best practices
* High throughput sequencing basics and choices
* Cost estimation
* Differential Gene Expression Analysis
* Data cleanup and quality assurance
* Mapping your data
* Assigning reads to genes and counting
* Analysis of differentially expressed genes
* Downstream analysis/visualizations and tables
AGRF in conjunction with EMBL Australia recently organised a workshop at Monash University Clayton. This workshop was targeted at beginners and biologists who are new to analysing Next-Gen Sequencing data. The workshop also aimed to provide users with a snapshot of bioinformatics and data analysis tips on how to begin to analyse project data. An introduction to RNA-seq data analysis was presented by AGRF Senior Bioinformatician Dr. Sonika Tyagi.
Presented: 1st August 2012
Course: Bioinformatics for Biomedical Research (2014).
Session: 4.1- Introduction to RNA-seq and RNA-seq Data Analysis.
Statistics and Bioinformatisc Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
Abstract: The focus in this session will be put on the differences between standard DNA mapping and RNAseq-specific transcript mapping: identifying splice variants and isoforms. The issue of transcript quantification and genomic variants that can be identified from RNAseq data will be discussed.
Part 1 of RNA-seq for DE analysis: Defining the goalJoachim Jacob
First part of the training session 'RNA-seq for Differential expression' analysis. We explain how we can detect differential expression based on RNA-seq data. Interested in following this session? Please contact http://www.jakonix.be/contact.html
This session will follow up from transcript quantification of RNAseq data and discusses statistical means of identifying differentially regulated transcripts, and isoforms and contrasts these against microarray analysis approaches.
RNA-seq: A High-resolution View of the TranscriptomeSean Davis
The molecular microscopes that we use to examine human biology have advanced significantly with the advent of next generation sequencing. RNA-seq is one application of this technology that leads to a very high-resolution view of the transcriptome. With these new technologies come increased data analysis and data handling burdens as well as the promise of new discovery. These slides present a high-level overview of the RNA-seq technology with a focus on the analysis approaches, quality control challenges, and experimental design.
RNA Sequence data analysis,Transcriptome sequencing, Sequencing steady state RNA in a sample is known as RNA-Seq. It is free of limitations such as prior knowledge about the organism is not required.
RNA-Seq is useful to unravel inaccessible complexities of transcriptomics such as finding novel transcripts and isoforms.
Data set produced is large and complex; interpretation is not straight forward.
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...Jan Aerts
Talk by A Tovchigrechko at BOSC2012: "MGTAXA: a toolkit and webserver for predicting taxonomy of the metagenomic sequences with Galaxy frontend and parallel computational backend"
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Elia Brodsky
This workshop will address critical issues related to Transcriptomics data:
Processing raw Next Generation Sequencing (NGS) data:
1. Next Generation Sequencing data preprocessing:
Trimming technical sequences
Removing PCR duplicates
2. RNA-seq based quantification of expression levels:
Conventional pipelines (looking at known transcripts)
Identification of novel isoforms
Analysis of Expression Data Using Machine Learning:
3. Unsupervised analysis of expression data:
Principal Component Analysis
Clustering
4. Supervised analysis:
Differential expression analysis
Classification, gene signature construction
5. Gene set enrichment analysis
The workshop will include hands-on exercises utilizing public domain datasets:
breast cancer cell lines transcriptomic profiles (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r110),
patient-derived xenograft (PDX) mouse model of tumor and stroma transcriptomic profiles (http://www.oncotarget.com/index.php?journal=oncotarget&page=article&op=view&path[]=8014&path[]=23533), and
processed data from The Cancer Genome Atlas samples (https://cancergenome.nih.gov/).
Team: The workshops are designed by the researchers at the Tauber Bioinformatics Research Center at University of Haifa, Israel in collaboration with academic centers across the US. Technical support for the workshops is provided by the Pine Biotech team. https://edu.t-bio.info/a-critical-approach-to-transcriptomic-data-analysis/
Course: Bioinformatics for Biomedical Research (2014).
Session: 4.1- Introduction to RNA-seq and RNA-seq Data Analysis.
Statistics and Bioinformatisc Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.
Abstract: The focus in this session will be put on the differences between standard DNA mapping and RNAseq-specific transcript mapping: identifying splice variants and isoforms. The issue of transcript quantification and genomic variants that can be identified from RNAseq data will be discussed.
Part 1 of RNA-seq for DE analysis: Defining the goalJoachim Jacob
First part of the training session 'RNA-seq for Differential expression' analysis. We explain how we can detect differential expression based on RNA-seq data. Interested in following this session? Please contact http://www.jakonix.be/contact.html
This session will follow up from transcript quantification of RNAseq data and discusses statistical means of identifying differentially regulated transcripts, and isoforms and contrasts these against microarray analysis approaches.
RNA-seq: A High-resolution View of the TranscriptomeSean Davis
The molecular microscopes that we use to examine human biology have advanced significantly with the advent of next generation sequencing. RNA-seq is one application of this technology that leads to a very high-resolution view of the transcriptome. With these new technologies come increased data analysis and data handling burdens as well as the promise of new discovery. These slides present a high-level overview of the RNA-seq technology with a focus on the analysis approaches, quality control challenges, and experimental design.
RNA Sequence data analysis,Transcriptome sequencing, Sequencing steady state RNA in a sample is known as RNA-Seq. It is free of limitations such as prior knowledge about the organism is not required.
RNA-Seq is useful to unravel inaccessible complexities of transcriptomics such as finding novel transcripts and isoforms.
Data set produced is large and complex; interpretation is not straight forward.
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...Jan Aerts
Talk by A Tovchigrechko at BOSC2012: "MGTAXA: a toolkit and webserver for predicting taxonomy of the metagenomic sequences with Galaxy frontend and parallel computational backend"
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic...Elia Brodsky
This workshop will address critical issues related to Transcriptomics data:
Processing raw Next Generation Sequencing (NGS) data:
1. Next Generation Sequencing data preprocessing:
Trimming technical sequences
Removing PCR duplicates
2. RNA-seq based quantification of expression levels:
Conventional pipelines (looking at known transcripts)
Identification of novel isoforms
Analysis of Expression Data Using Machine Learning:
3. Unsupervised analysis of expression data:
Principal Component Analysis
Clustering
4. Supervised analysis:
Differential expression analysis
Classification, gene signature construction
5. Gene set enrichment analysis
The workshop will include hands-on exercises utilizing public domain datasets:
breast cancer cell lines transcriptomic profiles (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r110),
patient-derived xenograft (PDX) mouse model of tumor and stroma transcriptomic profiles (http://www.oncotarget.com/index.php?journal=oncotarget&page=article&op=view&path[]=8014&path[]=23533), and
processed data from The Cancer Genome Atlas samples (https://cancergenome.nih.gov/).
Team: The workshops are designed by the researchers at the Tauber Bioinformatics Research Center at University of Haifa, Israel in collaboration with academic centers across the US. Technical support for the workshops is provided by the Pine Biotech team. https://edu.t-bio.info/a-critical-approach-to-transcriptomic-data-analysis/
RNA-seq based Genome Annotation with mGene.ngs and MiTieGunnar Rätsch
Talk in gene discovery session at PAGXXII (https://pag.confex.com/pag/xxii/webprogram/Session2128.html)
Joint work with Jonas Behr, Gabriele Schweikert, Andre Kahles and others.
Abstract: High throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in detection of expressed genes and transcripts. However, the immense dynamic range of gene expression, limitations and biases of the sequencing technology, as well as the observed complexity of the transcriptional landscape pose profound computational challenges. We discuss several of these challenges and based on illustrative simulation examples, we identify the limits of state-of-the-art tools in reconstructing multiple alternative transcripts even if sufficient information is provided. We propose a novel framework, called MiTie, for simultaneous transcript reconstruction and quantification based on combinatorial optimization. We use the negative binomial distribution to define a likelihood function and use a regularization approach to select a small number of transcripts quantitatively explaining the observed read data. We show that the resulting regularized maximum likelihood problem can be formulated as a mixed integer programming problem (MIP) which can be solved optimally using standard optimization approaches. We will also describe an extension of the discriminative gene finding system mGene that takes advantage of RNA-seq reads. We demonstrate that the extended system mGene.ngs can significantly more accurately predict transcript annotations when using RNA-seq data and also better than tools for transcriptome reconstruction that are solely based on RNA-seq data. Finally, we illustrate how a combination of gene finding and transcriptome reconstruction methods like MiTie can be used to accurately annotate newly sequenced genomes without prior annotations.
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...Golden Helix Inc
With a focus on scalable architecture and optimized native code that fully utilizes the CPU and RAM available, we can scale genomic analysis into sizes conventionally considered Big Data on a single host. In this webcast, we demonstrate recent innovations and features in Golden Helix solutions that enable the analysis of big data on your own terms.
Presentation carried out by Sergi Beltran Agulló, from the CNAG, at the course: Identification and analysis of sequence variants in sequencing projects: fundamentals and tools .
Achieve Complete Coverage of the SARS-CoV-2 GenomeCamille Cappello
Utilize multiple overlapping amplicons in a single tube, using a rapid, 2-hour workflow to prepare ready-to-sequence libraries. The PCR1+PCR2 workflow generates robust libraries even from low input quantities of DNA that may be subsequently quantified and normalized with conventional methods such as Qubit® or Agilent Bioanalyzer, or optionally using the included Swift Normalase reagents.
Provides coverage of >99% of the SARS-CoV-2 genome from limited viral titers
Analyzing Genomic Data with PyEnsembl and VarcodeAlex Rubinsteyn
PyEnsembl and Varcode are two new libraries being developed at Mount Sinai's Hammerlab to facilitate the analysis of genomic variants with an eye toward Pythonic interfaces, data representations, and coding conventions. PyEnsembl provides access to genomic sequence and annotation data. PyEnsembl's API consists primarily of objects such as Gene, Transcript, and Exon, along with methods for querying these objects by properties such as their chromosomal locations. PyEnsembl can be used to answer fundamental questions such as "which genes overlap a genomic location?" and "what is the nucleotide sequence of a particular transcript?". Varcode sits on top of PyEnsembl and uses it to annotate genomic mutations. Varcode can be used to quickly answer questions such as "what's the degree of overlap between two sets of mutations" and "which genes are affected by each mutation?". Additionally, Varcode can predict the altered amino acid sequence arising from a mutation, which is useful for predicting properties of the mutant protein (such as presentation to the adaptive immune system). This talk will show some basic examples of PyEnsembl and Varcode in action and give a brief glimpse of how they can be used as part of a personalized cancer vaccine pipeline.
Detecting and Quantifying Low Level Variants in Sanger Sequencing TracesThermo Fisher Scientific
Automated fluorescent dye-terminator DNA Sequencing using capillary electrophoresis (also known as CE or Sanger sequencing) has been instrumental in the detailed characterization of the human genome and is now widely used as gold standard method for verification of mutation findings, notably in tumor samples. The primary information of the DNA sequencing process is the identification of the nucleotides and of possible sequence variants. A largely unexplored feature of fluorescent Sanger sequencing traces is the quantitative information embedded therein. With the growing need for quantifying somatic mutations in tumor tissue it is desirable to exploit the potential of the quantitative information obtained from sequencing traces.
Materials and Methods
To this end, we have developed a software tool that converts a Sanger sequencing trace file into a .comma separated value (.csv) file containing numerical data of peak data characteristics that can be explored and analyzed using conventional spreadsheet software. The web-based tool can be accessed at: http://apps.lifetechnologies.com/ab1peakreporter .
The output file contains the peak height and quality values for each nucleotide and peak height ratios for all 4 bases at any given locus allowing the detection and assessment of subtle changes at any given allele.
Results and Discussion
We demonstrate the utility of this tool by analyzing mixed DNA samples with known amounts of spiked in variant alleles from the human TP53 gene ranging from 2.5%, 5%, 7.5%, 10%, 15% and 25% and show that the minor alleles could be readily detected below the 10% level.
Conclusion
Enabling high sensitivity detection of minor alleles with a widely available and simple to use technology like Sanger sequencing will be useful for verification of results obtained from next generation sequencing (NGS) platforms.
- Video recording of this lecture in English language: https://youtu.be/lK81BzxMqdo
- Video recording of this lecture in Arabic language: https://youtu.be/Ve4P0COk9OI
- Link to download the book free: https://nephrotube.blogspot.com/p/nephrotube-nephrology-books.html
- Link to NephroTube website: www.NephroTube.com
- Link to NephroTube social media accounts: https://nephrotube.blogspot.com/p/join-nephrotube-on-social-media.html
Anti ulcer drugs and their Advance pharmacology ||
Anti-ulcer drugs are medications used to prevent and treat ulcers in the stomach and upper part of the small intestine (duodenal ulcers). These ulcers are often caused by an imbalance between stomach acid and the mucosal lining, which protects the stomach lining.
||Scope: Overview of various classes of anti-ulcer drugs, their mechanisms of action, indications, side effects, and clinical considerations.
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...kevinkariuki227
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Verified Chapters 1 - 19, Complete Newest Version.pdf
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Verified Chapters 1 - 19, Complete Newest Version.pdf
Flu Vaccine Alert in Bangalore Karnatakaaddon Scans
As flu season approaches, health officials in Bangalore, Karnataka, are urging residents to get their flu vaccinations. The seasonal flu, while common, can lead to severe health complications, particularly for vulnerable populations such as young children, the elderly, and those with underlying health conditions.
Dr. Vidisha Kumari, a leading epidemiologist in Bangalore, emphasizes the importance of getting vaccinated. "The flu vaccine is our best defense against the influenza virus. It not only protects individuals but also helps prevent the spread of the virus in our communities," he says.
This year, the flu season is expected to coincide with a potential increase in other respiratory illnesses. The Karnataka Health Department has launched an awareness campaign highlighting the significance of flu vaccinations. They have set up multiple vaccination centers across Bangalore, making it convenient for residents to receive their shots.
To encourage widespread vaccination, the government is also collaborating with local schools, workplaces, and community centers to facilitate vaccination drives. Special attention is being given to ensuring that the vaccine is accessible to all, including marginalized communities who may have limited access to healthcare.
Residents are reminded that the flu vaccine is safe and effective. Common side effects are mild and may include soreness at the injection site, mild fever, or muscle aches. These side effects are generally short-lived and far less severe than the flu itself.
Healthcare providers are also stressing the importance of continuing COVID-19 precautions. Wearing masks, practicing good hand hygiene, and maintaining social distancing are still crucial, especially in crowded places.
Protect yourself and your loved ones by getting vaccinated. Together, we can help keep Bangalore healthy and safe this flu season. For more information on vaccination centers and schedules, residents can visit the Karnataka Health Department’s official website or follow their social media pages.
Stay informed, stay safe, and get your flu shot today!
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdfJim Jacob Roy
Cardiac conduction defects can occur due to various causes.
Atrioventricular conduction blocks ( AV blocks ) are classified into 3 types.
This document describes the acute management of AV block.
Prix Galien International 2024 Forum ProgramLevi Shapiro
June 20, 2024, Prix Galien International and Jerusalem Ethics Forum in ROME. Detailed agenda including panels:
- ADVANCES IN CARDIOLOGY: A NEW PARADIGM IS COMING
- WOMEN’S HEALTH: FERTILITY PRESERVATION
- WHAT’S NEW IN THE TREATMENT OF INFECTIOUS,
ONCOLOGICAL AND INFLAMMATORY SKIN DISEASES?
- ARTIFICIAL INTELLIGENCE AND ETHICS
- GENE THERAPY
- BEYOND BORDERS: GLOBAL INITIATIVES FOR DEMOCRATIZING LIFE SCIENCE TECHNOLOGIES AND PROMOTING ACCESS TO HEALTHCARE
- ETHICAL CHALLENGES IN LIFE SCIENCES
- Prix Galien International Awards Ceremony
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...Oleg Kshivets
RESULTS: Overall life span (LS) was 2252.1±1742.5 days and cumulative 5-year survival (5YS) reached 73.2%, 10 years – 64.8%, 20 years – 42.5%. 513 LCP lived more than 5 years (LS=3124.6±1525.6 days), 148 LCP – more than 10 years (LS=5054.4±1504.1 days).199 LCP died because of LC (LS=562.7±374.5 days). 5YS of LCP after bi/lobectomies was significantly superior in comparison with LCP after pneumonectomies (78.1% vs.63.7%, P=0.00001 by log-rank test). AT significantly improved 5YS (66.3% vs. 34.8%) (P=0.00000 by log-rank test) only for LCP with N1-2. Cox modeling displayed that 5YS of LCP significantly depended on: phase transition (PT) early-invasive LC in terms of synergetics, PT N0—N12, cell ratio factors (ratio between cancer cells- CC and blood cells subpopulations), G1-3, histology, glucose, AT, blood cell circuit, prothrombin index, heparin tolerance, recalcification time (P=0.000-0.038). Neural networks, genetic algorithm selection and bootstrap simulation revealed relationships between 5YS and PT early-invasive LC (rank=1), PT N0—N12 (rank=2), thrombocytes/CC (3), erythrocytes/CC (4), eosinophils/CC (5), healthy cells/CC (6), lymphocytes/CC (7), segmented neutrophils/CC (8), stick neutrophils/CC (9), monocytes/CC (10); leucocytes/CC (11). Correct prediction of 5YS was 100% by neural networks computing (area under ROC curve=1.0; error=0.0).
CONCLUSIONS: 5YS of LCP after radical procedures significantly depended on: 1) PT early-invasive cancer; 2) PT N0--N12; 3) cell ratio factors; 4) blood cell circuit; 5) biochemical factors; 6) hemostasis system; 7) AT; 8) LC characteristics; 9) LC cell dynamics; 10) surgery type: lobectomy/pneumonectomy; 11) anthropometric data. Optimal diagnosis and treatment strategies for LC are: 1) screening and early detection of LC; 2) availability of experienced thoracic surgeons because of complexity of radical procedures; 3) aggressive en block surgery and adequate lymph node dissection for completeness; 4) precise prediction; 5) adjuvant chemoimmunoradiotherapy for LCP with unfavorable prognosis.
Explore natural remedies for syphilis treatment in Singapore. Discover alternative therapies, herbal remedies, and lifestyle changes that may complement conventional treatments. Learn about holistic approaches to managing syphilis symptoms and supporting overall health.
The prostate is an exocrine gland of the male mammalian reproductive system
It is a walnut-sized gland that forms part of the male reproductive system and is located in front of the rectum and just below the urinary bladder
Function is to store and secrete a clear, slightly alkaline fluid that constitutes 10-30% of the volume of the seminal fluid that along with the spermatozoa, constitutes semen
A healthy human prostate measures (4cm-vertical, by 3cm-horizontal, 2cm ant-post ).
It surrounds the urethra just below the urinary bladder. It has anterior, median, posterior and two lateral lobes
It’s work is regulated by androgens which are responsible for male sex characteristics
Generalised disease of the prostate due to hormonal derangement which leads to non malignant enlargement of the gland (increase in the number of epithelial cells and stromal tissue)to cause compression of the urethra leading to symptoms (LUTS
1. Analysis Tools for RNA-seq and
Isoform Characterization
Slides: t.co/nc7siKRm6W
Gunnar R¨atsch
Biomedical Data Science Group
Computational Biology Center
Memorial Sloan Kettering Cancer Center
gxr #RNA #MMR #SplAdder #riboDiff #Cancer
2. Biomedical Data Sciences Group
Facts
Cost of collecting data drops, amounts increase exponentially.
We have more data than accurate algorithms.
Group’s research
Data Science Algorithms, Models & Tools
Machine Learning,
Bioinformatics.
Biology & Medicine Problem Setting & Goals
RNA processing regulation,
Clinical data analysis.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1
Memorial Sloan-Kettering Cancer Center
3. Biomedical Data Sciences Group
Facts
Cost of collecting data drops, amounts increase exponentially.
We have more data than accurate algorithms.
Group’s research
Data Science Algorithms, Models & Tools
Machine Learning,
Bioinformatics.
Biology & Medicine Problem Setting & Goals
RNA processing regulation,
Clinical data analysis.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1
Memorial Sloan-Kettering Cancer Center
4. Biomedical Data Sciences Group
Facts
Cost of collecting data drops, amounts increase exponentially.
We have more data than accurate algorithms.
Group’s research
Data Science Algorithms, Models & Tools
Machine Learning,
Bioinformatics.
Biology & Medicine Problem Setting & Goals
RNA processing regulation,
Clinical data analysis.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 1
Memorial Sloan-Kettering Cancer Center
5. Learning About the Central Dogma
Goal: Learn to predict what these processes accomplish:
Given the DNA, . . . , predict all gene products
f (DNA, ) = RNA g(RNA, ) = protein
Estimating f , g amounts to cracking the codes of
transcription, epigenetics, splicing, . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 2
Memorial Sloan-Kettering Cancer Center
6. RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
7. RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
8. RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
9. RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
10. RNA-seq based Transcriptome Characterization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
11. RNA-seq based Transcriptome Characterization
oqtans.org
cloud.oqtans.org
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 3
Memorial Sloan-Kettering Cancer Center
12. Transcript Quantitation and Dependence on Alignments
0 1 2 3 4 5 6 7 8
0.5
0.55
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
maximal number of mismatches
Pearsoncorrelationwithtrueabundance
True read origins
Alignments to
genome
92%
53%
84%
False alignments, multi-mappers etc. lead to weaker results
Simulated human reads from transcripts of known abundance (Fluxsimulator, [Sammeth, 2009]), 3% error rate, alignment w/
PALMapper [Jean et al., 2010], quantification w/ rQuant [Bohnert et al., 2009], Person correlation over considered transcripts.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 4
Memorial Sloan-Kettering Cancer Center
13. bioRxiv dx.doi.org/10.1101/017103
Efficient BAM file postprocessor for RNA- & DNA-seq
100M alignments in 20 minutes (10 threads)
Suitable for large-scale projects
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/mmr (C++)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
14. bioRxiv dx.doi.org/10.1101/017103
Efficient BAM file postprocessor for RNA- & DNA-seq
100M alignments in 20 minutes (10 threads)
Suitable for large-scale projects
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/mmr (C++)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
15. bioRxiv dx.doi.org/10.1101/017103
Efficient BAM file postprocessor for RNA- & DNA-seq
100M alignments in 20 minutes (10 threads)
Suitable for large-scale projects
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/mmr (C++)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 5
16. Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
location 1 location 2
Coverage
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
17. Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
Read not mapped to location 1 ... ... but mapped to location 2
Read pairCoverage
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
18. Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
+
Read not mapped to location 1 ... ... but mapped to location 2
Read pair Variance measure Evaluation windowCoverage Average coverage
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
19. Multiple Mapper Resolution
Principle (Iterated over all reads, N times)
Use the change of local coverage around read mapping ...
... and use its smoothness to identify “better” mapping location
+
+
Read pair Variance measure Evaluation windowCoverage Average coverage
Read not mapped to location 1 ... ... but mapped to location 2
Read mapped to location 1 ... ... and not mapped to location 2
Assignment1Assignment2
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 6
Memorial Sloan-Kettering Cancer Center
20. Multiple Mapper Resolution
Results for simulated DNA-seq
Smooths coverage as expected on an artificial dataset
Simulated reads from tiling a part of A. thaliana genome, alignment w/ PALMapper [Jean et al., 2010] (with -a option), visual-
ization with IGV [Robinson et al., 2011].
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 7
Memorial Sloan-Kettering Cancer Center
21. Multiple Mapper Resolution
Results for simulated DNA-seq
Smooths coverage as expected on an artificial dataset
≥ ≥
Simulated reads from tiling a part of A. thaliana genome, alignment w/ PALMapper [Jean et al., 2010] (with -a option), visual-
ization with IGV [Robinson et al., 2011].
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 7
Memorial Sloan-Kettering Cancer Center
22. Multiple Mapper Resolution
Results for simulated RNA-seq
Improves performance of transcript quantification
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
1
0 1 2 4 6 all
Correlation
Mismatches
original alignments
best alignment only
MMR treated alignment
Simulated reads (75nt) from subset of human annotated transcripts with Fluxsimulator [Sammeth, 2009], PALMapper alignments
[Jean et al., 2010], rQuant quantitation Bohnert et al. [2009], Pearson correlation over all considered transcripts.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 8
Memorial Sloan-Kettering Cancer Center
23. LARGE-SCALE BIOLOGY ARTICLE
Nonsense-Mediated Decay of Alternative Precursor mRNA
Splicing Variants Is a Major Determinant of the Arabidopsis
Steady State TranscriptomeC W
Gabriele Drechsel,a,1 André Kahles,b,1 Anil K. Kesarwani,a Eva Stauffer,a,2 Jonas Behr,b Philipp Drewe,b
Gunnar Rätsch,b and Andreas Wachtera,3
a Center for Plant Molecular Biology, University of Tübingen, 72076 Tuebingen, Germany
b Computational Biology Center, Sloan-Kettering Institute, New York, New York 10065
ORCID IDs: 0000-0002-3411-0692 (A.K.); 0000-0001-5486-8532 (G.R.); 0000-0002-3132-5161 (A.W.).
The nonsense-mediated decay (NMD) surveillance pathway can recognize erroneous transcripts and physiological mRNA
such as precursor mRNA alternative splicing (AS) variants. Currently, information on the global extent of coupled AS and NM
The Plant Cell, Vol. 25: 3726–3742, October 2013, www.plantcell.org ã 2013 American Society of Plant Biologists. All rights reserved.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 9
24. bioRxiv dx.doi.org/10.1101/017095
Analysis of alternative isoforms with RNA-seq data
Analyses known and identifies novel splicing events
Quantifies & visualizes splicing-related data
Suitable for large-scale projects (1000’s of samples)
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/spladder (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
25. bioRxiv dx.doi.org/10.1101/017095
Analysis of alternative isoforms with RNA-seq data
Analyses known and identifies novel splicing events
Quantifies & visualizes splicing-related data
Suitable for large-scale projects (1000’s of samples)
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/spladder (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
26. bioRxiv dx.doi.org/10.1101/017095
Analysis of alternative isoforms with RNA-seq data
Analyses known and identifies novel splicing events
Quantifies & visualizes splicing-related data
Suitable for large-scale projects (1000’s of samples)
Improved accuracy for transcript quantification and prediction
Open Source bioweb.me/spladder (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 10
27. SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
28. SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
29. SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
Solution
Augment annotation with RNA-Seq evidence
Use single splicing events instead of full transcripts
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
30. SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
Solution
Augment annotation with RNA-Seq evidence
Use single splicing events instead of full transcripts
Annotation
Alignment
Data
Augmented
Splicing
Graph
Detected
Splice
Events
Quantified
Splice
Events
Differential
Analysis/
sQTL Tests
Kahles et al., bioRxiv, 2015
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
31. SplAdder Ideas
Major Problems in Transcriptome Analysis
1 Gene annotations are incomplete and often inaccurate
2 Whole transcript isoforms are difficult to predict/quantify
Solution
Augment annotation with RNA-Seq evidence
Use single splicing events instead of full transcripts
Annotation
Alignment
Data
Augmented
Splicing
Graph
Detected
Splice
Events
Quantified
Splice
Events
Differential
Analysis/
sQTL Tests
Kahles et al., bioRxiv, 2015
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 11
Memorial Sloan-Kettering Cancer Center
32. SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
T1E1 T1E2 T1E3 T1E4
T2E1 T2E2 T2E3
T3E1 T3E2 T3E3
T4E1 T4E2
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
33. SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
T1E1 T1E2 T1E3 T1E4
T2E1 T2E2 T2E3
T3E1 T3E2 T3E3
E1 E3 E4 E6
E2 E5
T4E1 T4E2
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
34. SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
coverage
split alignments
New cassette exon
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
35. SplAdder Graph Augmentation
Principle
Collapse annotated transcripts into graph representation
Use RNA-Seq evidence to add new nodes and edges
coverage
split alignments
New cassette exon
coverage
New retained intron
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 12
Memorial Sloan-Kettering Cancer Center
36. SplAdder Graph Augmentation
coverage
split alignments
A New cassette exon
coverage
B New retained intron
split alignments
C New intron
split alignments
D Alternative splice sites on both intron ends
split alignments
E New start-terminal node / New end-terminal node
split alignments
F Alternative 3’splice site / New end-terminal node
split alignments
G Alternative 5’splice site / New start terminal node
split alignments
H New exon skip
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 13
Memorial Sloan-Kettering Cancer Center
37. SplAdder Event Extraction
E1 E3 E4 E6
E2 E5
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14
Memorial Sloan-Kettering Cancer Center
38. SplAdder Event Extraction
E1 E3 E4 E6
E2 E5
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14
Memorial Sloan-Kettering Cancer Center
39. SplAdder Event Extraction
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
E1 E3 E4 E6
E2 E5
Exon Skip
Multiple Exon Skip
Alternative 5’Splice Site
Intron Retention
Alternative 3’Splice Site
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 14
Memorial Sloan-Kettering Cancer Center
40. SplAdder Event Quantification and Visualization
Exon Skip
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
41. SplAdder Event Quantification and Visualization
Exon Skip
a
cb
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
42. SplAdder Event Quantification and Visualization
Exon Skip
a
cb
PSI =
b + c
2 · a + b + c
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
43. SplAdder Event Quantification and Visualization
Alternative 5’ Site
a
b
Exon Skip
a
cb
PSI =
b + c
2 · a + b + c
PSI =
b
a + b
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
44. SplAdder Event Quantification and Visualization
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
45. SplAdder Event Quantification and Visualization
Summary
SplAdder effectively augments the annotation
Enables quantitative analysis of events instead of transcripts
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 15
Memorial Sloan-Kettering Cancer Center
46. Splicing Analysis Across Multiple Cancer Types
Goals
1 Identify cancer-specific splicing patterns
2 Identify variants regulating splicing in same gene (cis)
3 Identify variants regulating splicing in other cancer genes (trans)
TCGA provides RNA-seq and matching exome data
RNA-seq Find & quantify splicing events
Exome Identify variants in exons & flanking intronic regions
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 16
Memorial Sloan-Kettering Cancer Center
47. Splicing Analysis Across Multiple Cancer Types
Goals
1 Identify cancer-specific splicing patterns
2 Identify variants regulating splicing in same gene (cis)
3 Identify variants regulating splicing in other cancer genes (trans)
TCGA provides RNA-seq and matching exome data
RNA-seq Find & quantify splicing events
Exome Identify variants in exons & flanking intronic regions
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 16
Memorial Sloan-Kettering Cancer Center
48. Splicing Variation Across 4,700 Samples
Event Statistics Exon Skip Eve
A B
Analysis of a total of 4,700 RNA-seq samples from TCGA normal (tn), TCGA tumors (tc), Encode (ec) and Geuvadis (gv). Align-
ment w/ STAR [Dobin et al., 2013], analysis w/ SplAdder (SplA) and Gencode annotation (Anno). Figure from [Kahles, 2014].
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 17
Memorial Sloan-Kettering Cancer Center
49. Uniform analysis of Large-Scale RNA-seq Data
Large-scale Compute
4,700 RNA-seq libraries (≈100 TB)
⇒ STAR ≈ 6 CPU years
⇒ SplAdder ≈ 0.5 CPU years
[Kahles et al.]
Unified community resources
Docker with ICGC RNA-seq alignment SOP
bioweb.me/ICGC-RNA-SOP
Syncronize with Encode, gTex, TCGA, . . .
[ICGC PCAWG-3 WG]
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 18
Memorial Sloan-Kettering Cancer Center
50. Uniform analysis of Large-Scale RNA-seq Data
Large-scale Compute
4,700 RNA-seq libraries (≈100 TB)
⇒ STAR ≈ 6 CPU years
⇒ SplAdder ≈ 0.5 CPU years
[Kahles et al.]
Unified community resources
Docker with ICGC RNA-seq alignment SOP
bioweb.me/ICGC-RNA-SOP
Syncronize with Encode, gTex, TCGA, . . .
[ICGC PCAWG-3 WG]
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 18
Memorial Sloan-Kettering Cancer Center
51. bioRxiv dx.doi.org/10.1101/017095
Analysis of Ribosome profiling and RNA-seq data
Study translation efficiency
Adjusts for expression differences
Accurate method based on dispersion estimates and GLMs
Open Source bioweb.me/ribodiff (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
52. bioRxiv dx.doi.org/10.1101/017095
Analysis of Ribosome profiling and RNA-seq data
Study translation efficiency
Adjusts for expression differences
Accurate method based on dispersion estimates and GLMs
Open Source bioweb.me/ribodiff (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
53. bioRxiv dx.doi.org/10.1101/017095
Analysis of Ribosome profiling and RNA-seq data
Study translation efficiency
Adjusts for expression differences
Accurate method based on dispersion estimates and GLMs
Open Source bioweb.me/ribodiff (python)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 19
54. Application to Ribosome Profiling
Found a motif that was strongly in enriched in detected genes:
G-quadruplex structures
Memorial Sloan-Kettering Cancer Center
Application to Ribosome Profiling
Found a motif that was strongly in enriched in detected genes:
G-quadruplex structures
Memorial Sloan-Kettering Cancer Center
(Analysis based on related but different strategy [Wolfe et al., 2014].)
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 20
55. Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
56. Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
57. Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
58. Summary
MMR improves alignment choice for multi-mappers
⇒ Helps improving accuracy of tools like Cufflinks
SplAdder identifies, quantifies & visualizes alternative splicing
⇒ Finds unannotated alternative splicing, tumor/normal splicing
differences; splicing reprogramming; sQTLs
riboDiff accurately detects differential translation efficiency
⇒ Ribosome footprinting revealed RNA G-Quadruplex elements in
5’ UTR that interacts with compound via eIF4a
Tools (+ six other ones) are open source and available
⇒ ratschlab.org/tools
⇒ (more) Docker images come soon . . .
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 21
Memorial Sloan-Kettering Cancer Center
59. Acknowledgements
R¨atsch Laboratory
Andre Kahles
Yi Zhong
Philipp Drewe @ MDC Berlin
Theofanis Karaletsos
Kjong Van Lehmann
Jonas Behr @ ETH Basel
Regina Bohnert @ Molecular Health
Geraldine Jean @ University of Nantes
Cancer Biology
Guido Wendel
Kamini Singh, . . .
Cancer Genomics Projects
Angela Brooks, Broad
Alvis Brazma, EBI
Matt Wilkerson, UNC
Niki Schultz, MSKCC
Chris Sander, MSKCC
Funding from MSKCC, Max Planck Society,
European Union, German Research Foundation,
Geoffrey Beene Foundation & NIH
Thank you!
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 22
60. Acknowledgements
R¨atsch Laboratory
Andre Kahles
Yi Zhong
Philipp Drewe @ MDC Berlin
Theofanis Karaletsos
Kjong Van Lehmann
Jonas Behr @ ETH Basel
Regina Bohnert @ Molecular Health
Geraldine Jean @ University of Nantes
Cancer Biology
Guido Wendel
Kamini Singh, . . .
Cancer Genomics Projects
Angela Brooks, Broad
Alvis Brazma, EBI
Matt Wilkerson, UNC
Niki Schultz, MSKCC
Chris Sander, MSKCC
Funding from MSKCC, Max Planck Society,
European Union, German Research Foundation,
Geoffrey Beene Foundation & NIH
Thank you!
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 22
61. References I
J. Behr, G. Schweikert, J. Cao, F. De Bona, G. Zeller, S. Laubinger, S. Ossowski,
K. Schneeberger, D. Weigel, and G. R¨atsch. Rna-seq and tiling arrays for improved gene
finding. Oral presentation at the CSHL Genome Informatics Meeting, September 2008. URL
http:
//www.fml.tuebingen.mpg.de/raetsch/lectures/RaetschGenomeInformatics08.pdf.
R. Bohnert, J. Behr, and G R¨atsch. Transcript quantification with RNA-Seq data. BMC
Bioinformatics, 10(S13):P5, 2009. URL
http://www.biomedcentral.com/1471-2105/10/S13/P5.
RM Clark, G Schweikert, C Toomajian, S Ossowski, G Zeller, P Shinn, N Warthmann, TT Hu,
G Fu, DA Hinds, H Chen, KA Frazer, DH Huson, B Sch¨olkopf, M Nordborg, G R¨atsch,
JR Ecker, and D Weigel. Common sequence polymorphisms shaping genetic diversity in
arabidopsis thaliana. Science, 317(5836):338–342, 2007. ISSN 1095-9203 (Electronic). doi:
10.1126/science.1138632.
Alexander Dobin, Carrie a Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha,
Philippe Batut, Mark Chaisson, and Thomas R Gingeras. STAR: ultrafast universal RNA-seq
aligner. Bioinformatics (Oxford, England), 29(1):15–21, January 2013. ISSN 1367-4811. doi:
10.1093/bioinformatics/bts635. URL http://www.ncbi.nlm.nih.gov/pubmed/23104886.
Mitchell Guttman, Manuel Garber, Joshua Z Levin, Julie Donaghey, James Robinson, Xian
Adiconis, Lin Fan, Magdalena J Koziol, Andreas Gnirke, Chad Nusbaum, John L Rinn,
Eric S Lander, and Aviv Regev. Ab initio reconstruction of cell type-specific transcriptomes
in mouse reveals the conserved multi-exonic structure of lincrnas. Nat Biotechnol, 28(5):
503–10, May 2010. doi: 10.1038/nbt.1633.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 23
62. References II
G Jean, A Kahles, VT Sreedharan, F De Bona, and G R¨atsch. Rna-seq read alignments with
palmapper. Curr Protoc Bioinformatics, Unit 11.6, 2010.
Andre Kahles. Novel Methods for the Computational Analysis of RNA-Seq Data with
Applications to Alternative Splicing. PhD thesis, University of T¨ubingen, T¨ubingen,
Germany, September 2014.
G. R¨atsch and S. Sonnenburg. Accurate splice site detection for Caenorhabditis elegans. In
K. Tsuda B. Schoelkopf and J.-P. Vert, editors, Kernel Methods in Computational Biology.
MIT Press, 2004.
G. R¨atsch, S. Sonnenburg, and B. Sch¨olkopf. RASE: recognition of alternatively spliced exons
in C. elegans. Bioinformatics, 21(Suppl. 1):i369–i377, June 2005.
James T. Robinson, Helga Thorvaldsd´ottir, Wendy Winckler, Mitchell Guttman, Eric S. Lander,
Gad Getz, and Jill P. Mesirov. Integrative genomics viewer. Nature Biotechnology, 29:
24–26, 2011.
M. Sammeth. The Flux Simulator. Website, 2009. http://flux.sammeth.net/simulator.html.
Gabriele Schweikert, Alexander Zien, Georg Zeller, Jonas Behr, Christoph Dieterich,
Cheng Soon Ong, Petra Philips, Fabio De Bona, Lisa Hartmann, Anja Bohlen, Nina Kr¨uger,
S¨oren Sonnenburg, and Gunnar R¨atsch. mgene: Accurate svm-based gene finding with an
application to nematode genomes. Genome Research, 2009. URL
http://genome.cshlp.org/content/early/2009/06/29/gr.090597.108.full.pdf+html.
Advance access June 29, 2009.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 24
63. References III
S. Sonnenburg, G. R¨atsch, A. Jagota, and K.-R. M¨uller. New methods for splice-site
recognition. In Proc. International Conference on Artificial Neural Networks, 2002.
S¨oren Sonnenburg, Alexander Zien, and Gunnar R¨atsch. ARTS: Accurate Recognition of
Transcription Starts in Human. Bioinformatics, 22(14):e472–480, 2006.
Cole Trapnell, Brian A Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J van
Baren, Steven L Salzberg, Barbara J Wold, and Lior Pachter. Transcript assembly and
quantification by rna-seq reveals unannotated transcripts and isoform switching during cell
differentiation. Nature Biotech, advance online publication, May 2010. doi:
10.1038/nbt.1621. URL http://dx.doi.org/10.1038/nbt.1621.
A Wolfe, K Singh, Y Zhong, P Drewe, others, G R¨atsch, and HG Wendel. Rna g-quadruplexes
cause eif4a-dependent oncogene translation in cancer. Nature, 2014. doi:
10.1038/nature13485.
G Zeller, RM Clark, K Schneeberger, A Bohlen, D Weigel, and G Ratsch. Detecting
polymorphic regions in arabidopsis thaliana with resequencing microarrays. Genome Res, 18
(6):918–929, 2008. ISSN 1088-9051 (Print). doi: 10.1101/gr.070169.107.
A. Zien, G. R¨atsch, S. Mika, B. Sch¨olkopf, T. Lengauer, and K.-R. M¨uller. Engineering Support
Vector Machine Kernels That Recognize Translation Initiation Sites. BioInformatics, 16(9):
799–807, September 2000.
c Gunnar R¨atsch (cBio@MSKCC) Tools for RNA-seq and Isoform Characterization ABRF Annual Meeting 2015, St. Louis 25