4.
• Bulk vs. Single Cell
• Transcript discovery
• Differential gene expression
• Allele-specific expression
• Detection of RNA editing
• Viral detection
• Gene fusion detection
• Alternative splicing
• De novo transcript assembly
• …
“The analysis goals of RNA-Seq experiments are diverse. Each of these analysis goals has distinct requirements and challenges”
Source: Griffith et al. (2015)
7. Short Read Mapping
• Naïve mapping to the genome won’t work: reads spanning exon-exon junctions do not align contiguously
• Use splice-aware mappers (e.g. STAR), which are typically run with a gene annotation to locate known junctions
• Recent (fast) alternative: pseudo-aligners (e.g. Kallisto), which map to the transcriptome
Source: Wikipedia
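The junction problem above can be illustrated with a toy example. This is a minimal sketch, not how STAR or Kallisto work internally: the sequences are invented, and `maps_contiguously` and `maps_spliced` are hypothetical helper names that use exact string matching only.

```python
# Toy sequences, invented purely for illustration.
EXON1 = "ATGGCCTTAGACCGT"
INTRON = "GTAAGTCCTTAACCGGTTTCAG"
EXON2 = "GGATCCAAGTTCTGA"

genome = EXON1 + INTRON + EXON2   # exons separated by an intron
transcript = EXON1 + EXON2        # mature mRNA: intron spliced out

# A read that covers the end of exon 1 and the start of exon 2.
junction_read = EXON1[-6:] + EXON2[:6]


def maps_contiguously(read, reference):
    """Exact, ungapped match: what a naive (non-splice-aware) mapper looks for."""
    return reference.find(read) != -1


def maps_spliced(read, reference, min_anchor=4):
    """Try every split of the read into two anchors that both match the
    reference, in order, with at least one base skipped between them."""
    for cut in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:cut], read[cut:]
        start = reference.find(left)
        while start != -1:
            if reference.find(right, start + len(left) + 1) != -1:
                return True
            start = reference.find(left, start + 1)
    return False


print(maps_contiguously(junction_read, genome))      # naive mapping to the genome
print(maps_contiguously(junction_read, transcript))  # mapping to the transcriptome
print(maps_spliced(junction_read, genome))           # split (spliced) mapping to the genome
```

The read is absent from the genome as a contiguous string but present in the spliced transcript, which is why one either needs a mapper that allows split alignments or a transcriptome reference.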
8. File Formats
• Plain reads: FASTQ (usually >= 20M reads per sample)
• Aligned reads: BAM
Courtesy: Jonathan Göke
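To make the FASTQ format concrete, here is a small sketch of a reader. It assumes the common four-lines-per-record layout (no wrapped sequences) and Phred+33 quality encoding; `read_fastq` and `mean_phred` are illustrative names, and the two records are made up.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return  # end of file
        if len(record) < 4 or not record[0].startswith("@") or not record[2].startswith("+"):
            raise ValueError("malformed FASTQ record")
        name, seq, _, qual = record
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield name[1:], seq, qual


def mean_phred(qual, offset=33):
    """Average base quality, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)


# Two made-up records standing in for a real file.
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTGCA\n+\n#####\n"
)
records = list(read_fastq(example))
print(len(records), "reads")
for name, seq, qual in records:
    print(name, len(seq), mean_phred(qual))
```

For a real, compressed file the same generator could be fed from `gzip.open(path, "rt")`. BAM is a compressed binary format and is normally read through a dedicated library or tool rather than parsed by hand.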
9. Common Analysis Flowchart
• Example: Tuxedo suite
• Large number of tools
• Number of representative tools listed in Griffith et al. (2015): >100
• Multiple versions available
Source: Griffith et al. (2015)
11. Installation: Experts Only?
• Often requires Unix/Linux knowledge
• Install package dependencies first
• Not being root/admin can complicate things
• How to install different versions of the same program?
Source: XKCD
15. Example: New Tuxedo Protocol (Pertea et al., 2016)
Source: https://ccb.jhu.edu/software/stringtie/index.shtml?t=manual
• Replaces the original Tuxedo tools (TopHat, Cuffdiff, etc.)
• How to scale to many samples?
• What if a step fails?
• How to make it usable for others?
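The scaling and failure questions above can be made concrete with a small sketch. It is not part of the Tuxedo protocol: `run_sample`, `run_all`, `align` and `assemble` are hypothetical names, and the two steps are placeholders standing in for real tool invocations. The idea is that each sample runs through the same ordered steps, a failing step is retried, and one failed sample does not stop the others.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sample(sample, steps, max_attempts=2):
    """Run the ordered steps for one sample; give up on the sample
    once a step has failed max_attempts times."""
    data = sample
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                data = step(data)
                break  # step succeeded, move on to the next one
            except Exception as err:
                if attempt == max_attempts:
                    return {"sample": sample, "ok": False,
                            "failed_step": step.__name__, "error": str(err)}
    return {"sample": sample, "ok": True, "result": data}


def run_all(samples, steps, workers=4):
    """Process samples in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: run_sample(s, steps), samples))


# Hypothetical placeholder steps; a real pipeline would call external tools here.
def align(sample):
    if sample == "sampleB":
        raise RuntimeError("aligner exited with an error")
    return sample + ".bam"


def assemble(bam):
    return bam.replace(".bam", ".gtf")


results = run_all(["sampleA", "sampleB", "sampleC"], [align, assemble])
for entry in results:
    print(entry)
```

Even this toy version has to handle retries, bookkeeping and parallelism by hand, and it still lacks resuming, logging and portability, which is the gap workflow managers are meant to fill.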
17. Putting it all Together
• Portable
• Validated
• Well documented
• Community maintained
Courtesy: Paolo Di Tommaso
Presented at Bio-IT World 2018
18. RNA-Seq Pipelines, Written in Nextflow and Using Containers
• NF-Core RNASeq: expression analysis and extensive QC
• Tuxedo: transcript-level expression analysis and differential expression (DE)
• CalliNGS-NF: variant calling with GATK, incl. allele-specific expression (ASE)
• …
19. Demo
• Example: GATK best practices for variant calling on RNA-Seq data
• Plus SNV post-processing and quantification for allele-specific expression
• Note: multiple samples, no software installation, fully orchestrated, etc.
Source: https://github.com/CRG-CNAG/CalliNGS-NF
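As a rough illustration of what quantifying allele-specific expression at a single heterozygous SNV can look like, the sketch below computes the reference-allele fraction and an exact two-sided binomial test against a balanced 50:50 expectation. This is a generic textbook approach with made-up counts, not necessarily what CalliNGS-NF does; the function names are illustrative.

```python
from math import comb


def allelic_ratio(ref_count, alt_count):
    """Fraction of reads at a heterozygous site that carry the reference allele."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_count / total


def binomial_two_sided_p(ref_count, alt_count, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    that are no more likely than the observed one.
    Intended for modest read depths; very deep sites need a numerically
    more careful implementation."""
    n = ref_count + alt_count
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Small tolerance so outcomes with equal probability are counted despite rounding.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# Made-up read counts at one site.
ref_reads, alt_reads = 30, 10
print(allelic_ratio(ref_reads, alt_reads))
print(binomial_two_sided_p(ref_reads, alt_reads))
```

In practice, reference mapping bias and multiple testing across many sites also need to be accounted for before calling a site allele-specific.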
20. Honorable Mentions
Cloud-based analytics portals:
And others…
Software for Single Cell Gene Expression:
Cell Ranger Pipelines and Loupe Cell Browser
21. We can do the Heavy Lifting for you: From Off-the-shelf to “Artisan” Analysis
http://www.igap.io | igap@gis.a-star.edu.sg