Differential analysis of gene regulation at transcript resolution with RNA-seq (2014)

Alyssa C Frazee, Geo Pertea, Andrew E Jaffe, Ben Langmead, Steven L Salzberg, Jeffrey T Leek
Preprint available at http://biorxiv.org/content/biorxiv/early/2014/03/30/003665.full.pdf

Cuffdiff2 is overconservative by comparison
Simulated data:
274 transcripts differentially expressed
0 were called by Cuffdiff2
80 were called by Ballgown

“78 of the top 100 transcripts called differentially expressed were truly differentially expressed for Ballgown versus 63 for Cuffdiff2, a 23% increase in truly differentially expressed genes (Figure 2d).”
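The 23% figure in the quote can be checked with a few lines of arithmetic. This is only a sketch based on the two counts quoted above; the variable names are illustrative and do not come from the paper.

```python
# Counts quoted above: true positives among the top 100 calls of each tool.
TOP_N = 100
ballgown_true = 78
cuffdiff2_true = 63

# Fraction of the top 100 calls that were truly differentially expressed.
ballgown_precision = ballgown_true / TOP_N
cuffdiff2_precision = cuffdiff2_true / TOP_N

# Relative increase of Ballgown over Cuffdiff2, in percent.
relative_increase_pct = 100 * (ballgown_true - cuffdiff2_true) / cuffdiff2_true

print(f"Ballgown precision in top {TOP_N}: {ballgown_precision:.2f}")
print(f"Cuffdiff2 precision in top {TOP_N}: {cuffdiff2_precision:.2f}")
print(f"Relative increase: {relative_increase_pct:.1f}%")
```

The relative increase is 15/63, a little under 24%, which the quote reports as 23%.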

Cuffdiff2 is overconservative by comparison

“We further investigated the source of the conservative bias of Cuffdiff2 and found that when we sampled reads with equal probability from each transcript, ignoring transcript length, Cuffdiff2 produced accurate measures of statistical significance (Supplementary Figure 1). This result suggests that the conservative bias may be due to transcript length normalization in the Cuffdiff2 software.”

A recent evaluation using biological samples in which expression has been confirmed with qRT-PCR agrees

“Cuffdiff has reduced sensitivity and specificity as measured by ROC analysis as well as the significant number of false positives in the null model test. We postulate that this is related to its normalization procedure, which attempts to account for both alternative isoform expression and length of transcripts.”

http://www.biomedcentral.com/content/pdf/gb-2013-14-9-r95.pdf

Ballgown can model continuous covariates
Example: RNA quality or RNA Integrity Number (RIN) as a continuous covariate
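One common way to test a continuous covariate such as RIN is to compare two nested linear models with an F-test: expression explained by an intercept only, versus expression explained by an intercept plus the covariate. The sketch below is plain Python with made-up numbers. It is not Ballgown's R interface, and the function name and data are hypothetical.

```python
def nested_f_statistic(expression, covariate):
    """F statistic for y ~ 1 (null) versus y ~ 1 + covariate (alternative)."""
    n = len(expression)
    mean_y = sum(expression) / n
    mean_x = sum(covariate) / n

    # Least-squares slope and intercept for the alternative model.
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(covariate, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sums of squares under each model.
    rss_null = sum((y - mean_y) ** 2 for y in expression)
    rss_alt = sum((y - (intercept + slope * x)) ** 2
                  for x, y in zip(covariate, expression))

    # One extra parameter in the alternative; n - 2 residual degrees of freedom.
    extra_params = 1
    df_alt = n - 2
    return ((rss_null - rss_alt) / extra_params) / (rss_alt / df_alt)


# Hypothetical data: RIN per sample and one transcript's expression.
rin = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
expr_tracks_rin = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
expr_unrelated = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]

f_tracks = nested_f_statistic(expr_tracks_rin, rin)
f_unrelated = nested_f_statistic(expr_unrelated, rin)
print(f"F (expression tracks RIN): {f_tracks:.1f}")
print(f"F (expression unrelated):  {f_unrelated:.2f}")
```

A large F statistic means that adding RIN to the model explains much more of the variation in expression than the intercept alone. A p-value would come from the F distribution with 1 and n - 2 degrees of freedom.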

Ballgown can be used with other standard DE tools
Example: eQTL with MatrixEQTL for 464 samples

Filters:
Kept transcripts with FPKM > 0.1
Removed SNPs with minor allele frequency < 5%
Cis eQTLs within 1000 kb

“57% and 78% of transcript-SNP pairs significant at FDR of 1% appeared in the list of significant transcript eQTL identified in the original analysis of the EUR and YRI populations individually. 14% of eQTL pairs were identified for transcripts that did not overlap Ensembl annotated transcripts (Figure 4).”
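The three filters listed above can be sketched as a small pre-filtering step before the eQTL test itself. This is an illustration in plain Python, not MatrixEQTL code. The record types, field names and example values are hypothetical. It assumes that SNPs below 5% minor allele frequency are the ones dropped, and that the cis distance is measured from the SNP to the nearest end of the transcript.

```python
from dataclasses import dataclass

FPKM_MIN = 0.1              # keep transcripts with FPKM above this value
MAF_MIN = 0.05              # assumption: SNPs below 5% MAF are dropped
CIS_WINDOW_BP = 1_000_000   # 1000 kb


@dataclass
class Transcript:
    name: str
    chrom: str
    start: int
    end: int
    mean_fpkm: float


@dataclass
class Snp:
    name: str
    chrom: str
    pos: int
    maf: float


def is_cis(tx: Transcript, snp: Snp) -> bool:
    """True if the SNP lies within the cis window of the transcript."""
    if tx.chrom != snp.chrom:
        return False
    if tx.start <= snp.pos <= tx.end:
        return True
    # Assumption: distance to the nearest transcript end.
    distance = min(abs(snp.pos - tx.start), abs(snp.pos - tx.end))
    return distance <= CIS_WINDOW_BP


def candidate_pairs(transcripts, snps):
    """Apply the expression, MAF and cis-distance filters in turn."""
    kept_tx = [t for t in transcripts if t.mean_fpkm > FPKM_MIN]
    kept_snps = [s for s in snps if s.maf >= MAF_MIN]
    return [(t.name, s.name)
            for t in kept_tx for s in kept_snps if is_cis(t, s)]


# Hypothetical records, for illustration only.
transcripts = [
    Transcript("tx1", "chr1", 2_000_000, 2_010_000, 5.0),
    Transcript("tx2", "chr1", 2_000_000, 2_010_000, 0.05),  # too lowly expressed
]
snps = [
    Snp("snp_a", "chr1", 1_500_000, 0.20),  # 500 kb upstream: cis
    Snp("snp_b", "chr1", 4_000_000, 0.30),  # about 2 Mb away: not cis
    Snp("snp_c", "chr1", 2_005_000, 0.01),  # rare variant: dropped
    Snp("snp_d", "chr2", 2_005_000, 0.40),  # other chromosome: not cis
]

pairs = candidate_pairs(transcripts, snps)
print(pairs)
```

Only the pairs that survive all three filters would be passed on to the association test.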

Ballgown pipeline runtimes for the Geuvadis datasets

FPKM based (e.g. Cufflinks) vs average coverage based (e.g. DESeq and edgeR)
FPKM (length normalized) vs average coverage (a count-based measure of expression, though not the raw counts that DESeq and edgeR take as input)
[Figure: panels labeled Geuvadis, Simulated (avg. cov.) and Simulated (FPKM); annotated "Similar"]
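To make the two expression measures concrete, the sketch below computes both for a single transcript. It uses the usual textbook definitions (FPKM as fragments per kilobase of transcript per million mapped fragments, average coverage as mean per-base read depth). These definitions, the function names and the numbers are assumptions for illustration and are not taken from the slides.

```python
def fpkm(fragments: int, length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def average_coverage(per_base_depth: list[int]) -> float:
    """Mean number of reads covering each base of the transcript."""
    return sum(per_base_depth) / len(per_base_depth)


# Hypothetical transcript: 100 fragments on a 1000 bp transcript,
# in a library of one million mapped fragments.
example_fpkm = fpkm(fragments=100, length_bp=1000,
                    total_mapped_fragments=1_000_000)

# Hypothetical per-base depth for a very short (4 bp) toy transcript.
example_cov = average_coverage([2, 2, 4, 4])

print(f"FPKM: {example_fpkm:.1f}")
print(f"Average coverage: {example_cov:.1f}")
```

FPKM divides by both transcript length and library size, while average coverage as written here divides only by transcript length, so it stays closer to a count.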

Journal club slides to discuss "Differential analysis of gene regulation at transcript resolution with RNA-seq" (2014).
