• Input: pooled RNAs from 20 tissues
• Approach: prepare double-stranded cDNAs -> CCS library -> PacBio sequencing
• Output: 476,000 CCS reads, mean=1kb
• 61% of reads cover all introns and most of the first and last exons
• CCS reads cover short transcripts (<1.2 kb) well (generally >90%), but coverage
stays low for long transcripts, especially those >2.4 kb
[Figure: panels labeled "Missing 3’ ends" and "Missing 5’ ends"]
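The coverage figures above (fraction of a transcript covered, bases missing at the 5’ and 3’ ends) can be illustrated with a minimal Python sketch. The `ReadAlignment` class, the `coverage_summary` function, and the example coordinates are hypothetical and are not taken from the study; the sketch assumes a read aligned to a transcript as one contiguous interval in 0-based, half-open transcript coordinates.

```python
from dataclasses import dataclass


@dataclass
class ReadAlignment:
    """Hypothetical read aligned to a transcript.

    Coordinates are 0-based, half-open, measured from the transcript 5' end.
    """
    start: int
    end: int


def coverage_summary(transcript_length: int, aln: ReadAlignment) -> dict:
    """Return the covered fraction and the bases missing at each end."""
    if not (0 <= aln.start < aln.end <= transcript_length):
        raise ValueError("alignment must lie within the transcript")
    covered = aln.end - aln.start
    return {
        "fraction_covered": covered / transcript_length,
        "missing_5p": aln.start,                    # bases lost at the 5' end
        "missing_3p": transcript_length - aln.end,  # bases lost at the 3' end
    }


# Made-up examples: a short transcript that is almost fully covered,
# and a long transcript whose read lacks a large part of the 5' end.
short_tx = coverage_summary(1000, ReadAlignment(start=20, end=990))
long_tx = coverage_summary(3000, ReadAlignment(start=1400, end=2950))

print(short_tx)
print(long_tx)
```

In this toy case the short transcript is covered at well over 90%, while the long one is only about half covered, mostly because of missing 5’ sequence.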
• ERCC: a mixture of known, quantified RNAs
• An estimated 67% of molecules contain splice sites
• CSMM: consensus split-mapped molecule (accurate CCS reads with splicing sites?)
• Detected splice sites match annotated splice sites well
• PacBio (versus 454) exhibits much higher power to detect isoforms with >=10 introns
• Estimate: 21,000 genes and 139,000 isoforms can be detected with high-depth seq
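Checking the splice sites of a split-mapped read against an annotation can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the representation of an intron as a `(chrom, start, end, strand)` tuple, and the toy coordinates are all hypothetical.

```python
def introns_from_exons(exon_blocks):
    """Derive introns as the gaps between consecutive aligned exon blocks.

    exon_blocks: list of (start, end) tuples, 0-based half-open genome coordinates.
    """
    blocks = sorted(exon_blocks)
    return [(blocks[i][1], blocks[i + 1][0]) for i in range(len(blocks) - 1)]


def classify_read(chrom, strand, exon_blocks, annotated_introns):
    """Split a read's introns into annotated and novel sets."""
    introns = [(chrom, s, e, strand) for s, e in introns_from_exons(exon_blocks)]
    annotated = [i for i in introns if i in annotated_introns]
    novel = [i for i in introns if i not in annotated_introns]
    return annotated, novel


# Toy annotation with two introns on a hypothetical locus.
annotation = {("chr1", 200, 300, "+"), ("chr1", 400, 500, "+")}

# Read A matches the annotation at every intron.
read_a = [(100, 200), (300, 400), (500, 600)]
# Read B uses a different donor site for its second intron.
read_b = [(100, 200), (300, 380), (500, 600)]

known_a, novel_a = classify_read("chr1", "+", read_a, annotation)
known_b, novel_b = classify_read("chr1", "+", read_b, annotation)

print(len(known_a), len(novel_a))
print(len(known_b), len(novel_b))
```

Exact coordinate matching is the simplest possible criterion; a real comparison would likely need to tolerate alignment errors near splice junctions.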
• Full-length RNA of up to 1.5 kb can readily be
monitored with little sequence loss at the 5’ end
• With 476k CCS reads (>300 bp), 14,000 spliced
genes were identified
• The majority of introns are consistent with
annotations, but >10% are novel.
• Isoforms can be monitored at a single-molecule level
without amplification or fragmentation
• The majority of reads represent all splice sites of the
transcript
• Unannotated splice isoforms: long non-coding RNAs
with few introns and isoforms of known protein-
coding genes with many introns