Next-generation sequencing technologies like ion semiconductor sequencing and pyrosequencing have enabled applications of RNA-Seq. RNA-Seq involves sequencing cDNA to analyze transcriptomes, identify differentially expressed genes between conditions, and reconstruct transcripts through methods like genome-guided assembly or by building de Bruijn graphs from k-mers.
5. Pyrosequencing
• pyrophosphate (PPi) is translated into light emission in two enzymatic steps
• ATP sulfurylase converts PPi into ATP
• luciferase converts luciferin into oxyluciferin and emits light (ATP is the fuel for this reaction)
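As a rough illustration of how the light signal is read out: each incorporated nucleotide releases one PPi, so the light emitted in a flow is approximately proportional to the number of bases incorporated in that flow. The sketch below assumes this proportionality and a cyclic dispensation order of T, A, C, G; the function `pyrogram` and its parameters are hypothetical names chosen for illustration, not part of any sequencing software.

```python
def pyrogram(new_strand, flow_order="TACG", n_flows=8):
    """Simulate idealised pyrosequencing signals.

    new_strand: the sequence being synthesised (assumed to be known here).
    Returns one (nucleotide, signal) pair per flow, where signal is the
    number of bases incorporated, i.e. the number of PPi released.
    """
    signals = []
    pos = 0
    for flow in range(n_flows):
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # A homopolymer run is incorporated in a single flow.
        while pos < len(new_strand) and new_strand[pos] == nucleotide:
            incorporated += 1
            pos += 1
        signals.append((nucleotide, incorporated))
    return signals


signals = pyrogram("TTAGC")
```

A signal of 2 in the first T flow corresponds to the "TT" homopolymer; flows with signal 0 emit no light because no nucleotide was incorporated.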
7. Cyclic reversible termination
• incorporation of modified nucleotides
• fluorescence imaging
• washing away of the fluorescent dye and removal of the terminating group
14. Workflow of RNA-Seq
• sample preparation: ribo-minus and strand-specific libraries (paired-end)
• sequencing
• read mapping
Fragmentation of cDNA
Purification
Adapter ligation
Size-based purification of ligation products
PCR of ligation products
Purification and sequencing of the fragments
17. Read mapping
• unspliced read alignment (reads match the genome perfectly)
  • seed methods: after mapping, the seed is extended with the Smith-Waterman method
  • Burrows-Wheeler transform methods: transform the genome into an efficient data structure
• spliced aligners
  • exon first
  • seed and extend
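To make the Burrows-Wheeler idea concrete, here is a minimal sketch of exact read matching by backward search over a BWT of a toy genome. It is a teaching example under simplifying assumptions: the suffix array is built by naive sorting and occurrence counts are recomputed on every step, whereas real aligners use compressed indexes. The function names `bwt_index`, `occ` and `find_exact` are hypothetical.

```python
def bwt_index(genome):
    """Build a suffix array, the BWT string and the first-column offsets."""
    text = genome + "$"  # sentinel, sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    first = {}  # first[ch]: number of characters in text smaller than ch
    total = 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    return suffix_array, bwt, first


def occ(bwt, ch, i):
    """Number of occurrences of ch in bwt[:i] (naive, for clarity)."""
    return bwt[:i].count(ch)


def find_exact(read, suffix_array, bwt, first):
    """Return all genome positions where the read matches exactly."""
    lo, hi = 0, len(bwt)
    for ch in reversed(read):  # backward search: last base first
        if ch not in first:
            return []
        lo = first[ch] + occ(bwt, ch, lo)
        hi = first[ch] + occ(bwt, ch, hi)
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


sa, bwt, first = bwt_index("ACGTACGTGA")
positions = find_exact("ACG", sa, bwt, first)  # 0-based start positions
```

The search touches the index once per read base, independent of genome length, which is why this family of methods is fast for reads that match the genome exactly.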
18. Read mapping (spliced alignment methods)
• exon first (2 steps):
  • reads are mapped with an unspliced read aligner
  • unmapped reads are split into fragments and aligned independently
• seed and extend:
  • break reads into shorter (k-mer) seeds
  • seed regions are evaluated with sensitive alignment methods (Smith-Waterman)
  • much slower method
A pseudogene is dysfunctional and does not contain introns. In the first step of exon-first approaches, reads are aligned to the genome with an unspliced read mapper, so reads that span an exon-exon junction of the gene can align contiguously to the intron-less pseudogene and be assigned to it instead of to the gene.
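The seed-and-extend strategy described on this slide can be sketched as follows: index all k-mers of the genome, look up the k-mers of a read as seeds, and score the region around each seed hit with Smith-Waterman local alignment. This is a minimal illustration, not how any particular aligner is implemented; the scoring values, the `flank` parameter and all function names are assumptions, and splicing (introns) is not modelled.

```python
from collections import defaultdict


def kmer_index(genome, k):
    """Map every k-mer of the genome to its start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


def seed_and_extend(read, genome, index, k, flank=5):
    """Return (best_score, window_start) over all seed hits of the read."""
    best = (0, None)
    for offset in range(len(read) - k + 1):
        for hit in index.get(read[offset:offset + k], []):
            # Window around the position implied by the seed hit.
            start = max(0, hit - offset - flank)
            end = min(len(genome), hit - offset + len(read) + flank)
            score = smith_waterman(read, genome[start:end])
            if score > best[0]:
                best = (score, start)
    return best


genome = "TTGACCATGCAGTACGGATCCTTAGC"
read = "ATGCAGTTCGGATC"  # genome[6:20] with one mismatch
score, window_start = seed_and_extend(read, genome, kmer_index(genome, 4), k=4)
```

Every seed hit triggers a quadratic-time alignment, which illustrates why this approach is more sensitive but much slower than exact index lookups.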
20. Transcriptome reconstruction
• genome-guided reconstruction
  • exon identification: used for very short reads (36 bp). Reads are first mapped to the genome; the unmapped reads are then tested against all possible exon-exon junctions. Not able to identify full transcript structures.
  • genome-guided assembly
• genome-independent reconstruction
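The junction test used in exon identification can be sketched like this: build a library of all pairwise exon-exon junction sequences for a gene and check each unmapped read against it. The exon sequences, read length and function names below are made up for illustration; exact substring matching stands in for a real aligner.

```python
from itertools import combinations


def junction_library(exons, read_len):
    """All junctions between an exon and any downstream exon.

    exons: exon sequences of one gene in genomic order (toy input).
    Each side contributes at most read_len - 1 bases, so a read of
    length read_len found in a junction sequence must span the junction.
    """
    flank = read_len - 1
    return {
        (i, j): exons[i][-flank:] + exons[j][:flank]
        for i, j in combinations(range(len(exons)), 2)
    }


def find_junctions(unmapped_reads, library):
    """For each read, list the exon pairs whose junction contains it."""
    return {
        read: [pair for pair, seq in library.items() if read in seq]
        for read in unmapped_reads
    }


exons = ["AAACCCGG", "TTGGAACC", "GATTACAT"]
library = junction_library(exons, read_len=6)
hits = find_junctions(["CGGGAT", "AACCGA", "TTTTTT"], library)
```

The result links exons pairwise but says nothing about which junctions occur together in one molecule, which is the limitation noted above: full transcript structures cannot be identified this way.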
22. Transcriptome reconstruction
• genome-guided: align reads to the genome and use spliced reads to build a transcript graph (node: read fragment, edge: link between fragments); one path through the graph represents an isoform of the transcript
• genome-independent: break reads into k-mers and build a de Bruijn graph; alignment to the genome is done for annotation purposes
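A toy version of the k-mer approach: break reads into k-mers, connect overlapping (k-1)-mers in a de Bruijn graph, and spell out each path from a source to a sink as a candidate isoform. This is a simplified sketch assuming error-free reads, a single strand and an acyclic graph; the sequences and function names are invented for the example.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Nodes are (k-1)-mers; each k-mer adds an edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def spell_paths(graph):
    """Spell the sequence of every simple source-to-sink path."""
    targets = {t for successors in graph.values() for t in successors}
    sources = sorted(node for node in graph if node not in targets)
    sequences = []

    def walk(node, seq, visited):
        successors = graph.get(node, ())
        if not successors:  # sink reached: one candidate transcript
            sequences.append(seq)
            return
        for nxt in sorted(successors):
            if nxt not in visited:
                walk(nxt, seq + nxt[-1], visited | {nxt})

    for source in sources:
        walk(source, source, {source})
    return sequences


# Two toy isoforms: the second one skips the middle exon.
isoform_1 = "ATGGCCTTAAGGCACGTA"
isoform_2 = "ATGGCCCACGTA"
reads = [s[i:i + 8] for s in (isoform_1, isoform_2) for i in range(len(s) - 7)]
transcripts = spell_paths(de_bruijn(reads, k=5))
```

The skipped exon shows up as a bubble in the graph, and the two paths through it recover the two isoforms without any reference genome.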
24. Differential gene expression for RNA-Seq
• comparing the read counts for one gene across different experiments (conditions)
• the problem is that a read cannot always be assigned uniquely to one gene, because genes overlap and have different isoforms
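The assignment problem can be seen in a small counting sketch: reads overlapping exactly one gene are counted for it, while reads overlapping several genes are tallied separately as ambiguous. Coordinates, gene names and the function `count_reads` are hypothetical, and isoforms and normalisation are deliberately left out.

```python
from collections import Counter


def count_reads(reads, genes):
    """Count reads per gene using half-open (start, end) intervals.

    reads: list of (start, end) alignments.
    genes: dict mapping gene name -> (start, end).
    Returns (per-gene counts of uniquely assigned reads, ambiguous count).
    """
    counts = Counter({name: 0 for name in genes})
    ambiguous = 0
    for r_start, r_end in reads:
        hits = [
            name
            for name, (g_start, g_end) in genes.items()
            if r_start < g_end and g_start < r_end  # intervals overlap
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif len(hits) > 1:
            ambiguous += 1  # cannot be assigned to a single gene
    return counts, ambiguous


genes = {"geneA": (100, 500), "geneB": (450, 900)}  # overlapping toy genes
reads = [(120, 170), (460, 510), (600, 650), (950, 1000)]
counts, ambiguous = count_reads(reads, genes)
```

Running the same count on each condition gives the per-gene numbers to compare; how the ambiguous reads are handled (discarded, split, or modelled) is a design choice that affects the result.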