Using RNA-Seq to Analyze Embryonic Pluripotency
1. Using RNA-Seq to conduct systems-level analysis of
embryonic pluripotency, self-renewal and differentiation
David-Emlyn Parfitt
Shen Lab, Irving Cancer Research Center
2. The molecular regulators of self-renewal and pluripotency are
not completely defined or characterized
[Figure: embryo stages and the stem cell lines derived from them]
Mouse blastocyst (3.5 days), inner cell mass: mESC
Mouse egg cylinder (5.5 days), epiblast: mEpiSC
Human blastocyst (5-7 days): hESC
Self-renewal and pluripotency: Nanog, Oct4, Sox2; JAK-STAT, MAPK
Novel Master Regulators?
3. Defining the molecular networks associated with stem cell
self-renewal, pluripotency and differentiation
Which tool to use for expression profiling?
[Workflow diagram: 150 combinatory chemical treatments; genome-wide GEP data; algorithmic analysis (ARACNe, MINDy); ESC/EpiSC "interactome"; master regulator analysis; rank; in vitro and in vivo validation]
4. Gene Expression Profiling:
Microarrays vs RNA-Sequencing
Arrays:
Well defined technique
High throughput
Discrete measurement
Background noise + batch effect
No distinction between isoforms/alleles
5. Gene Expression Profiling:
Microarrays vs RNA-Sequencing
RNA Sequencing:
Total RNA
Fragment
Reverse-transcribe to cDNA
6. Gene Expression Profiling:
Microarrays vs RNA-Sequencing
RNA Sequencing:
Total RNA*, reverse-transcribed to cDNA
Algorithmic and logistic challenge
Lengthy library preparation
Single base resolution
Low background noise
Distinction of isoform and allelic expression
Low amount of RNA needed
*Including non-coding RNAs, depending on purification protocol
7. RNA-Sequencing Methodology:
Deciding the parameters
Read length?
-Efficiency vs faithfulness
Single end or paired end reads?
-Efficiency vs faithfulness
-Alignment accuracy
Number of reads?
-Depth of coverage
-Cost
How many reads are needed to effectively cover
the mouse exome (~50 Mb; the full genome is ~2.7 Gb)?
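The trade-off between number of reads and depth of coverage can be put into rough numbers. The sketch below takes the ~50 Mb figure quoted on the slide as the target size and assumes reads are spread evenly over it, which real transcriptomes do not satisfy (abundant transcripts soak up most reads). The function name and the read counts tried are illustrative assumptions, not values from the original analysis.

```python
def mean_coverage(n_reads: int, read_length_bp: int, target_size_bp: int) -> float:
    """Average number of times each target base is sequenced,
    assuming reads are distributed uniformly over the target."""
    return n_reads * read_length_bp / target_size_bp


TARGET_BP = 50_000_000  # ~50 Mb, the target size quoted on the slide

# Illustrative read counts (assumed for this sketch), all with 100 bp reads.
for n_reads in (20_000_000, 30_000_000, 100_000_000):
    depth = mean_coverage(n_reads, 100, TARGET_BP)
    print(f"{n_reads / 1e6:.0f} M reads -> {depth:.0f}x mean coverage")
```

Because expression spans several orders of magnitude, mean coverage is only a first-pass guide; rare transcripts see far less than the average depth.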
8. Deciding the parameters:
How many 100 bp reads are necessary for comprehensive
coverage of the mouse genome?
RPKM:
Normalized measurement of transcript abundance
Reads per kilobase of exon model per million mapped
reads
RPKM for a given transcript does not change, in expectation,
when the total number of reads changes, and transcripts of
equal abundance have the same RPKM regardless of their length
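A minimal sketch of the RPKM calculation, to make the normalization concrete. The function name and the toy transcript (length, read counts, library sizes) are assumptions chosen for illustration; they are not data from the experiment.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions_mapped = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_mapped)


# Hypothetical transcript: 2 kb long, 400 reads out of 20 million mapped.
shallow = rpkm(400, 2_000, 20_000_000)

# Same sample sequenced 5x deeper: the transcript's reads and the
# library size both scale by 5, so the ratio is unchanged.
deep = rpkm(2_000, 2_000, 100_000_000)

print(shallow, deep)
```

Dividing by library size removes the dependence on sequencing depth, and dividing by transcript length removes the advantage long transcripts have in collecting reads.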
10. Deciding the parameters:
How many 100 bp reads are necessary for comprehensive
coverage of the mouse genome?
100 million 100 bp single-end (SE) reads
11. Setting the transcript ‘detection’ threshold
                                          RA-72H-1  RA-72H-2  CM    CM
Number of raw reads (million)             97.3      88        87    95
Number of mapped reads (million)          97        87.7      87    94
Transcripts with RPKM > 0.01 (of 27,641)  72%       77%       84%   84%
12. Setting the transcript ‘detection’ threshold
                                          RA-72H-1  RA-72H-2  CM    CM
Number of raw reads (million)             97.3      88        87    95
Number of mapped reads (million)          97        87.7      87    94
Transcripts with RPKM > 1 (of 27,641)     49%       48%       51%   52%
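The percentages in these tables are the share of annotated transcripts whose RPKM exceeds a chosen cutoff. A small sketch of that count follows; the RPKM values are made up for illustration and the function name is an assumption.

```python
def fraction_detected(rpkm_values, threshold):
    """Fraction of transcripts whose RPKM is strictly above the threshold."""
    detected = sum(1 for value in rpkm_values if value > threshold)
    return detected / len(rpkm_values)


# Made-up RPKM values for eight transcripts, for illustration only.
toy_rpkm = [0.0, 0.005, 0.02, 0.4, 0.9, 1.5, 12.0, 85.0]

# The two cutoffs compared on the slides.
for threshold in (0.01, 1):
    share = fraction_detected(toy_rpkm, threshold)
    print(f"RPKM > {threshold}: {share:.1%} of transcripts")
```

Raising the cutoff from 0.01 to 1 trades sensitivity for confidence: fewer transcripts count as detected, but those that do are less likely to be background.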
13. RPKM is constant, regardless of number of reads
[Two scatter plots: r² = 0.9 and r² = 0.97]
“RPKM for a particular transcript does not change
when overall number of reads changes”
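The slide does not say exactly which quantities the r² values compare. Assuming they come from plotting per-transcript RPKM at one read depth against RPKM at another, the statistic could be computed as sketched below. The RPKM values are made up, and the log2 transform is an assumption added so that highly expressed transcripts do not dominate the fit.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


# Made-up RPKM values for the same five transcripts at two read depths.
rpkm_shallow = [0.8, 2.1, 9.5, 31.0, 120.0]
rpkm_deep = [1.0, 2.0, 10.0, 30.0, 118.0]

# Log-transform (an assumption of this sketch) before correlating.
log_shallow = [math.log2(v) for v in rpkm_shallow]
log_deep = [math.log2(v) for v in rpkm_deep]

print(f"r^2 = {r_squared(log_shallow, log_deep):.3f}")
```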
14. RPKM becomes relatively constant with increased read
number
[Plot: median RPKM (y-axis, 0.5 to 0.95) against reads in millions (x-axis, 20 to 80); labeled values 0.749 and 0.725]
i.e. we are not detecting significantly more genes/transcripts above
20-30 million reads
15. How many 100 bp reads are necessary for comprehensive
coverage of the mouse genome?
[Plot: percent of final transcripts (y-axis, 0.7 to 1) against reads in millions (x-axis, 0 to 100), one curve per transcript abundance bin (RPKM): [0.01,3.74), [3.75,7.5), [7.5,15), [15,30), [30,60), [60,∞)]
Between 20 and 30 million 100 bp reads are sufficient to capture
~100% of the most abundant transcripts and 95% of the least
abundant
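A saturation curve of this kind can be produced by downsampling the full read set and asking how many of the final transcripts are still seen. The sketch below is a toy version under stated assumptions: read counts are made up, each read is kept independently with a fixed probability, and "detected" means at least one surviving read rather than an RPKM cutoff. It is not the procedure used to generate the plot.

```python
import random


def downsample_counts(counts, fraction, rng):
    """Keep each read independently with probability `fraction`."""
    return {
        transcript: sum(1 for _ in range(n) if rng.random() < fraction)
        for transcript, n in counts.items()
    }


def detected(counts):
    """Transcripts with at least one read (a simplification of an RPKM cutoff)."""
    return {t for t, n in counts.items() if n > 0}


# Made-up read counts: two abundant and three rare transcripts.
full_counts = {"high_a": 5000, "high_b": 2000, "low_a": 6, "low_b": 3, "low_c": 1}
rng = random.Random(0)  # fixed seed so the sketch is repeatable

final = detected(full_counts)
for fraction in (0.1, 0.25, 0.5, 1.0):
    sub = downsample_counts(full_counts, fraction, rng)
    recovered = len(detected(sub) & final) / len(final)
    print(f"{fraction:.0%} of reads: {recovered:.0%} of final transcripts detected")
```

Abundant transcripts survive even aggressive downsampling, while rare ones drop out first, which is why the low-abundance bins are the last to approach 100%.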