David-Emlyn Parfitt, Columbia Illumina seminar 11/9/2011

Published in: Health & Medicine, Technology


  1. Using RNA-Seq to conduct systems-level analysis of embryonic pluripotency, self-renewal and differentiation. David-Emlyn Parfitt, Shen Lab, Irving Cancer Research Center
  2. The molecular regulators of self-renewal and pluripotency are not completely defined or characterized. Figure labels: mouse blastocyst (3.5 days), inner cell mass, mESC; mouse egg cylinder (5.5 days), epiblast, mEpiSC; human blastocyst (5-7 days), hESC; mEpiSC ≈ hESC. Self-renewal and pluripotency: Nanog, Oct4, Sox2, JAK-STAT, MAPK. Novel master regulators?
  3. Defining the molecular networks associated with stem cell self-renewal, pluripotency and differentiation. Which tool to use for expression profiling? Workflow labels: 150 combinatory chemical treatments; genome-wide GEP data; algorithmic analysis (ARACNe, MINDy); ESC/EpiSC "interactome"; master regulator analysis; rank; in vitro and in vivo validation.
  4. Gene Expression Profiling: Microarrays vs RNA-Sequencing. Arrays: well-defined technique; high throughput; discrete measurement; background noise + batch effect; no distinction between isoforms/alleles.
  5. Gene Expression Profiling: Microarrays vs RNA-Sequencing. RNA sequencing: total RNA; fragment; reverse-transcribe to cDNA.
  6. Gene Expression Profiling: Microarrays vs RNA-Sequencing. RNA sequencing: total RNA*; reverse-transcribe to cDNA. Algorithmic and logistic challenge; lengthy library preparation; single base resolution; low background noise; distinction of isoform and allelic expression; low amount of RNA needed. *Including non-coding RNAs, depending on purification protocol.
  7. RNA-Sequencing Methodology: Deciding the parameters. Read length? Efficiency vs faithfulness. Single-end or paired-end reads? Efficiency vs faithfulness; alignment accuracy. Number of reads? Depth of coverage; cost. How many to effectively cover the mouse genome (~50 Mb)?
  8. Deciding the parameters: How many 100 bp reads are necessary for comprehensive coverage of the mouse genome? RPKM: normalized measurement of transcript abundance (reads per kilobase of exon per million mapped reads). RPKM for a particular transcript does not change when the overall number of reads changes, and it is the same for transcripts with the same abundance.
  10. Deciding the parameters: How many 100 bp reads are necessary for comprehensive coverage of the mouse genome? 100 million 100 bp single-end (SE) reads.
  11. Setting the transcript ‘detection’ threshold
                                            RA-72H-1   RA-72H-2   CM    CM
      Number of raw reads (million)         97.3       88         87    95
      Number of mapped reads (million)      97         87.7       87    94
      Transcripts w. RPKM > 0.01 (/27641)   72%        77%        84%   84%
  12. Setting the transcript ‘detection’ threshold
                                            RA-72H-1   RA-72H-2   CM    CM
      Number of raw reads (million)         97.3       88         87    95
      Number of mapped reads (million)      97         87.7       87    94
      Transcripts w. RPKM > 1 (/27641)      49%        48%        51%   52%
  13. RPKM is constant, regardless of number of reads. Plot labels: r² = 0.9; r² = 0.97. “RPKM for a particular transcript does not change when overall number of reads changes”
  14. RPKM becomes relatively constant with increased read number. Plot: median RPKM (y-axis, 0.5 to 0.95) against reads in millions (x-axis, 20 to 80); labeled values 0.749 and 0.725. I.e., we are not detecting significantly more genes/transcripts above 20-30 million reads.
  15. How many 100 bp reads are necessary for comprehensive coverage of the mouse genome? Plot: percent of final transcripts (y-axis, 0.7 to 1) against reads in millions (x-axis, 0 to 100), grouped by transcript abundance (RPKM): [60,), [30,60), [15,30), [7.5,15), [3.75,7.5), [0.01,3.74). Between 20 and 30 million 100 bp reads are sufficient to capture ~100% of the most abundant transcripts and 95% of the least abundant.
  16. Acknowledgements. Shen Lab: Michael Shen, Hui Zhao, Shen Lab members. Califano Lab: Andrea Califano, Mariano Alvarez, Yufeng Shen, Xiaoyun Sun, Olivier Couronne, Erin Bush.
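
The RPKM normalization in slide 8 and the detection thresholds in slides 11 and 12 can be illustrated with a minimal Python sketch. It assumes the usual definition, RPKM = reads × 10⁹ / (transcript length in bp × total mapped reads). The function names and the toy counts below are hypothetical, chosen only for illustration; they are not the speaker's pipeline or data.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # reads / (length / 1e3) / (total / 1e6) == reads * 1e9 / (length * total)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def library_rpkm(counts):
    """Map {transcript: (reads, length_bp)} to {transcript: RPKM}."""
    total = sum(reads for reads, _ in counts.values())
    return {
        name: rpkm(reads, length, total)
        for name, (reads, length) in counts.items()
    }


def fraction_detected(rpkm_values, threshold):
    """Fraction of transcripts whose RPKM is strictly above the threshold."""
    values = list(rpkm_values)
    if not values:
        raise ValueError("no transcripts given")
    return sum(v > threshold for v in values) / len(values)


# Hypothetical toy library: transcript -> (mapped reads, length in bp).
toy_counts = {
    "tx_a": (500, 2000),
    "tx_b": (20, 1000),
    "tx_c": (0, 1500),
}

shallow = library_rpkm(toy_counts)

# Idealized deeper run of the same sample: every count is exactly 4x larger.
# Real data only scale like this in expectation, so low counts stay noisy.
deep = library_rpkm(
    {name: (4 * reads, length) for name, (reads, length) in toy_counts.items()}
)

for name in toy_counts:
    print(name, shallow[name], deep[name])

print("fraction with RPKM > 1:", fraction_detected(shallow.values(), 1.0))
```

In this idealized case the RPKM of each transcript is unchanged by the fourfold increase in depth, which is the property slide 8 states. With real reads, sampling noise at low depth is what makes the saturation curves in slides 14 and 15 worth plotting.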