Gene Expression One Cell at a Time
Experimental design and analysis of single-cell RNA-Seq data
David Cook
Vanderhyden Lab, uOttawa
DavidPCook
dpcook
dcook082@uottawa.ca
Conclusions from bulk analysis can be representative of nothing
Bulk analysis
Sample Interpretation
Conclusions from bulk analysis can be representative of nothing
Impossible to tell whether differences are due to sample composition or to the cancer cells themselves
TCGA, Nature, 2011
Bulk summaries hide underlying structure
X Mean: 54.26
Y Mean: 47.83
X SD: 16.76
Y SD: 26.93
Corr: -0.06
Matejka and Fitzmaurice (Autodesk Research, Toronto)
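A toy illustration of this point, using made-up numbers that are not from the talk: two hypothetical samples share the same mean expression for a gene, which is all a bulk measurement would report, yet one is homogeneous and the other is a mix of two populations.

```python
from statistics import mean, pstdev

# Hypothetical per-cell expression values for one gene (made-up numbers).
uniform_sample = [5.0] * 10            # every cell expresses the gene at 5
mixed_sample = [0.0] * 5 + [10.0] * 5  # half the cells are off, half are high

def summarize(cells):
    """Return (mean, population SD); a bulk assay only sees the mean."""
    return mean(cells), pstdev(cells)

uniform_mean, uniform_sd = summarize(uniform_sample)
mixed_mean, mixed_sd = summarize(mixed_sample)

print(f"uniform: mean={uniform_mean:.1f}, sd={uniform_sd:.1f}")
print(f"mixed:   mean={mixed_mean:.1f}, sd={mixed_sd:.1f}")
```

The means match while the spread differs, and only per-cell data exposes that difference.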
Single-cell exposes this heterogeneity
scRNA-Seq
Fur
Venom
Sample Interpretation
Applications of scRNA-Seq
Single-Cell Platforms
Fluidigm C1
Pros
Allows visual inspection of captured cells
Customizability
Cons
Only two inlets for cell samples
Throughput can’t keep up with field
Relatively long prep time
[Figure: capture-site images of a live cell, a dead cell, and multiple live cells; stains: Calcein AM (live), Ethidium homodimer-1 (dead)]
Droplet-based methods
Pros
Very high throughput
Up to 8 unique samples per run
System cost relatively low
Cons
Limited customizability
Zheng et al., Nature Comm, 2017
Plate methods
Common Chemistry: RT and 3’ Enrichment
Only 3’ end of transcript is PCR amplified
Why 3’ enrichment?
1 kb cDNA (5’ to 3’): ten 100 bp reads needed for 1x coverage
200 bp 3’ fragment: two 100 bp reads needed for 1x coverage
Consequence: Lose nearly all information about isoform usage (sorry, Matt)
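The read counts on this slide follow from simple division. A minimal sketch, assuming reads tile the target perfectly end to end (an idealisation; the function name is hypothetical):

```python
import math

def reads_for_coverage(target_length_bp, read_length_bp, coverage=1):
    """Minimum reads needed to cover a target at the given depth,
    assuming reads tile the target perfectly with no overlap."""
    return math.ceil(coverage * target_length_bp / read_length_bp)

full_length = reads_for_coverage(1000, 100)  # 1 kb cDNA, 100 bp reads
three_prime = reads_for_coverage(200, 100)   # 200 bp 3' fragment, 100 bp reads

print(full_length, three_prime)
```

Under this assumption, restricting sequencing to the 3’ fragment needs one fifth of the reads per transcript, which is the trade made against isoform information.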
Single-Cell Platforms
Platform        Cost per cell        Cells per run    Flexibility/Customizable
10x Genomics    +                    ~1000-46000      +
BioRad ddSeq    ++                   ~300-10000       +
Fluidigm C1     ++++                 96 or 800        +++
Plate methods   Protocol dependent   10 - >10k        +++++
Cost
10x Genomics
Reagent Kit (20 samples): $20,000
One sample = ~600-6000 cells
Microfluidics Chips (Six 8-sample chips): $1,440
Fluidigm C1 (HT assays)
Reagent Kit (5 runs): $5,000
One run = ~800 cells
Integrated Fluidic Circuit (IFC, 1 run): $2,000
Sequencing
NextSeq500 High Output
1 run ($3700) enough for ~2-3k cells
HiSeq4000
1 lane (~$2700) enough for ~2-3k cells
(Often need to purchase entire flow cell)
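A back-of-envelope sketch of per-cell cost using only the prices and cell numbers quoted on this slide. It assumes the listed consumables are the only costs (no instrument, labour, or library QC), so treat the results as rough lower bounds; all variable names are illustrative.

```python
def cost_per_cell(cost, cells):
    """Cost divided evenly over the cells recovered."""
    return cost / cells

# 10x Genomics: kit covers 20 samples; six 8-sample chips cover 48 samples.
tenx_per_sample = 20_000 / 20 + 1_440 / (6 * 8)
tenx_low_yield = cost_per_cell(tenx_per_sample, 600)    # ~600 cells recovered
tenx_high_yield = cost_per_cell(tenx_per_sample, 6000)  # ~6000 cells recovered

# Fluidigm C1 (HT): kit covers 5 runs; one IFC is used per run of ~800 cells.
c1_per_run = 5_000 / 5 + 2_000
c1_per_cell = cost_per_cell(c1_per_run, 800)

# Sequencing: one NextSeq500 High Output run spread over ~2-3k cells.
seq_per_cell_3k = cost_per_cell(3_700, 3_000)
seq_per_cell_2k = cost_per_cell(3_700, 2_000)

print(f"10x prep: ${tenx_high_yield:.2f}-${tenx_low_yield:.2f} per cell")
print(f"C1 prep:  ${c1_per_cell:.2f} per cell")
print(f"Sequencing: ${seq_per_cell_3k:.2f}-${seq_per_cell_2k:.2f} per cell")
```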
Experimental Design
How many cells?
Depends on what you’re looking at
More cells = better detection of rare populations
Macosko et al., Cell, 2015
Pollen et al., Nature Biotech, 2014
More heterogeneity? More cells
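One way to put numbers on "more cells = better detection of rare populations" is a simple binomial model. This is an assumption of the sketch, not something stated in the talk: each captured cell is treated as an independent draw, and capture is assumed to be unbiased across cell types. The 1% frequency and the 10-cell threshold are hypothetical.

```python
from math import comb

def prob_at_least(k, n_cells, frequency):
    """P(at least k cells of a given type among n_cells), binomial model."""
    p_fewer = sum(
        comb(n_cells, i) * frequency**i * (1 - frequency) ** (n_cells - i)
        for i in range(k)
    )
    return 1 - p_fewer

# Hypothetical question: chance of capturing >= 10 cells of a 1% population.
p_small = prob_at_least(10, 800, 0.01)   # e.g. one ~800-cell run
p_large = prob_at_least(10, 3000, 0.01)  # e.g. a ~3000-cell sample

print(f"800 cells:  {p_small:.3f}")
print(f"3000 cells: {p_large:.3f}")
```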
Sequencing: How deep do you need to go?
Depends on what you want
Svensson et al., Nature Methods, 2017
Rough Guideline
Aim for 100,000 reads per cell
50,000 per cell is probably fine
Zheng et al., Nature Comm, 2017: 16k reads/cell (>60k PBMCs)
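The depth guideline translates directly into a read budget. A small sketch using the per-cell depths quoted above; the run output of 300 million reads is a hypothetical placeholder, not an instrument specification.

```python
def total_reads_needed(n_cells, reads_per_cell):
    """Reads required to sequence n_cells at the target depth."""
    return n_cells * reads_per_cell

def cells_supported(run_output_reads, reads_per_cell):
    """Whole number of cells one run can cover at the target depth."""
    return run_output_reads // reads_per_cell

depths = {"aim": 100_000, "probably fine": 50_000, "Zheng et al. PBMCs": 16_000}
hypothetical_run_output = 300_000_000  # assumed usable reads per run

for label, depth in depths.items():
    needed = total_reads_needed(3_000, depth)
    supported = cells_supported(hypothetical_run_output, depth)
    print(f"{label}: {needed:,} reads for 3,000 cells; {supported:,} cells per run")
```

Halving the depth doubles the number of cells a fixed budget can cover, which is the trade-off behind "depends on what you want".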
Sample numbers and batch effects
Hicks et al., BioRxiv, 2016
Mix biological variables in individual runs!
Sample numbers and batch effects
Tung et al., Scientific Reports, 2017
Analyzing scRNA-Seq Data
Project Background
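[Figure: ovarian surface epithelium (OSE) in Control/Placebo vs. Estrogen (E2) treated mice. Panels: "Areas of columnar OSE" (y-axis: % ovarian surface that has columnar cells) and "Areas of hyperplastic OSE" (y-axis: % ovarian surface that is hyperplastic), each comparing Control vs. Estradiol, with * marked on each plot]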
Hormone replacement therapy increases risk of ovarian cancer
Exogenous estrogen enhances cancer progression in mouse models
Prolonged estrogen exposure causes ovarian epithelial dysplasia in normal mice
General Analysis Workflow
Sequencing
Processing
QC & Filtering
Normalization (and imputation?)
Clustering
Differential expression, trajectory analysis, network analysis, etc.
Alignment, transcript quantification, and import into R
Kallisto – Pseudoalignment to the transcriptome
Bray et al., Nature Biotech, 2016
tximport package to import a gene-level expression matrix into R
Soneson et al., F1000, 2016
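Conceptually, going from transcript-level quantification to a gene-level matrix means summing the transcripts that belong to each gene. The sketch below shows only that summation for one cell, in Python rather than R. It is not the tximport implementation, and the identifiers and counts are hypothetical.

```python
from collections import defaultdict

def summarize_to_gene(transcript_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level counts."""
    gene_counts = defaultdict(float)
    for transcript_id, count in transcript_counts.items():
        gene_id = tx2gene.get(transcript_id)
        if gene_id is None:
            continue  # transcript missing from the mapping: skipped in this sketch
        gene_counts[gene_id] += count
    return dict(gene_counts)

# Hypothetical transcript-to-gene map and estimated counts for a single cell.
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
cell_counts = {"tx1": 12.5, "tx2": 3.5, "tx3": 7.0}

gene_level = summarize_to_gene(cell_counts, tx2gene)
print(gene_level)
```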
Filtering scRNA-Seq Data
Remove dead cells (Ethidium homodimer-1 positive) and capture sites with multiple live cells (Fluidigm specific)
Before filtering: 800 cells, 30735 genes
After filtering: 636 cells, 14300 genes
Remove genes that are not detected in at least 10 cells
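The gene filter can be sketched in a few lines. This version assumes "detected" means a count greater than zero and stores the matrix as a plain dictionary of per-cell counts; both are choices made for the sketch, and the data are made up.

```python
def filter_genes(counts, min_cells=10):
    """Keep genes detected (count > 0) in at least min_cells cells.

    counts maps each gene to its list of per-cell counts.
    """
    return {
        gene: values
        for gene, values in counts.items()
        if sum(1 for v in values if v > 0) >= min_cells
    }

# Hypothetical counts for 3 genes across 12 cells.
counts = {
    "geneA": [3, 1, 2, 5, 1, 1, 4, 2, 1, 1, 2, 3],  # detected in 12 cells
    "geneB": [0, 0, 7, 0, 0, 0, 1, 0, 0, 2, 0, 0],  # detected in 3 cells
    "geneC": [1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1],  # detected in 10 cells
}

kept = filter_genes(counts, min_cells=10)
print(sorted(kept))
```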
Finding and controlling for technical variables
Data exploration is critical
SCEset:
Exprs. matrices (Genes x Cells): Raw Counts, Normalized, Log-transformed, Z-scores
Cell metadata: phenoData
Gene metadata: featureData
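The container described here keeps expression matrices and their annotations together so that they stay in sync during exploration. A minimal Python stand-in for that idea follows. The class and field names are hypothetical and do not mirror any real package API.

```python
from dataclasses import dataclass

@dataclass
class ExpressionContainer:
    """Hypothetical SCESet-like container; names are illustrative only."""
    assays: dict         # assay name -> matrix as a list of rows (genes x cells)
    cell_metadata: dict  # cell ID -> annotations (the phenoData role)
    gene_metadata: dict  # gene ID -> annotations (the featureData role)

    def is_consistent(self):
        """True if every assay has one row per gene and one column per cell."""
        n_genes, n_cells = len(self.gene_metadata), len(self.cell_metadata)
        return all(
            len(matrix) == n_genes and all(len(row) == n_cells for row in matrix)
            for matrix in self.assays.values()
        )

sce = ExpressionContainer(
    assays={"counts": [[5, 0, 2], [1, 3, 0]]},
    cell_metadata={
        "cell1": {"condition": "Control", "ifc_row": 1},
        "cell2": {"condition": "Estradiol", "ifc_row": 1},
        "cell3": {"condition": "Estradiol", "ifc_row": 2},
    },
    gene_metadata={"geneA": {"n_cells_detected": 2}, "geneB": {"n_cells_detected": 2}},
)
print(sce.is_consistent())
```

Keeping technical annotations such as the IFC row next to the counts is what makes it easy to colour plots by each variable when looking for drivers of variation.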
Finding and controlling for technical variables
1. Library Size
Scaling each library by a size factor
• Counts per million (CPM)
• DESeq
• TMM
• Pooling-based size factors (Lun et al., Genome Biology, 2016)
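Of the options listed, counts per million is the simplest: each cell's counts are divided by that cell's library size and multiplied by one million. A minimal sketch with made-up counts; the other methods estimate size factors differently and are not shown.

```python
def counts_per_million(cell_counts):
    """Scale one cell's raw counts so that they sum to one million."""
    library_size = sum(cell_counts.values())
    if library_size == 0:
        raise ValueError("cell has no counts; it should be filtered out first")
    return {gene: count * 1_000_000 / library_size for gene, count in cell_counts.items()}

# Two hypothetical cells with the same composition but a 10-fold depth difference.
shallow_cell = {"geneA": 10, "geneB": 30}
deep_cell = {"geneA": 100, "geneB": 300}

shallow_cpm = counts_per_million(shallow_cell)
deep_cpm = counts_per_million(deep_cell)
print(shallow_cpm)
print(deep_cpm)
```

After scaling, the two cells look identical, which is the intended effect of removing library size as a technical variable.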
Finding and controlling for technical variables
2. Cell Cycle (or other confounding biological processes we aren’t interested in)
Stegle et al., Nature Rev. Genetics, 2015
Cell cycle classification using “scran” package
Cell cycle not driving large amounts of variation at this point
Finding and controlling for technical variables
3. Other technical variables
Finding variables that drive variation
Coloured by IFC Column
Finding and controlling for technical variables
3. Other technical variables
removeBatchEffect() – limma package
$Y_i = \beta_0 + \beta_1\,\mathrm{TotalFeatures}_i + \beta_2\,\mathrm{IFC.Row}_i + \beta_3\,\mathrm{Condition}_i + \varepsilon_i$
Removes the effect of the technical covariates on a per-gene basis
Note: IFC.Column tackled same way, but split by condition beforehand
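To make the per-gene regression concrete, here is a simplified Python sketch of the idea, not the limma implementation. For one gene it fits expression against a single technical covariate and the condition by ordinary least squares, then subtracts only the fitted technical term. The covariate is centred so the gene's overall level is unchanged. Using one covariate, the helper names, and the noise-free toy data are all assumptions of the sketch.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def remove_covariate(y, covariate, condition):
    """Subtract the fitted technical effect while keeping the condition effect."""
    mean_cov = sum(covariate) / len(covariate)
    centred = [c - mean_cov for c in covariate]
    design = [[1.0, c, k] for c, k in zip(centred, condition)]
    _, beta_cov, _ = fit_ols(design, y)
    return [yi - beta_cov * c for yi, c in zip(y, centred)]

# Toy gene: expression = 2 + 0.5 * total_features + 3 * condition, no noise.
total_features = [1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5]
condition = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
expression = [2 + 0.5 * t + 3 * k for t, k in zip(total_features, condition)]

adjusted = remove_covariate(expression, total_features, condition)
print([round(v, 3) for v in adjusted])
```

Including the condition in the fit matters: without it, part of the biological difference could be attributed to the technical covariate and removed along with it.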
[Figure: post-normalization data; panel labelled "Odd IFC Column"]
Finding and controlling for technical variables
3. Other technical variables
O_o?
Data Imputation
Van Dijk et al., BioRxiv, 2017
[Figure: expression before imputation vs. after imputation]
Clustering
Nature Methods, 2017
Differential Expression
Currently no “standard”; a variety of methods perform pretty well:
• DESeq
• edgeR
• Monocle
• SCDE (single cell differential expression)
Differential Expression
PC2 Values
PC1 Values
PC3 Values
Proteolysis
Actin cytoskeleton organization
Cell adhesion
Innate immune response(?)
Oxidation-reduction process
Oxidative stress
Positive regulation of apoptosis
Metabolic pathways
MAPK signaling
PI3K-Akt signaling
Negative regulation of apoptosis
Cell differentiation
Oxidation-reduction process
Oxidative stress
Proton transport
Mitophagy
Response to wound healing
Response to hypoxia
Apoptosis
Negative regulation of cell cycle
Oxidation-reduction process
Cell adhesion
Rho protein signaling
TCA cycle
Trajectory Analysis
Leveraging asynchrony to reconstruct cellular response trajectories
Wagner et al., Nature Biotech (review), 2016
A few methods:
• Monocle
• Diffusion pseudotime
• PHATE
• Wishbone
• Waterfall
• Wanderlust
• SLICER
• TSCAN
• And more
Trajectory Analysis
Reversed Graph Embedding (monocle)
Qiu et al., BioRxiv, 2017
Trajectory Analysis
Can we model the phenotype divergence where estrogen-treated cells progress to form foci?
Does this model foci formation?
Trajectory Analysis
GREB1
Z-projection
Trajectory Analysis
Discovering new transcriptional dynamics
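Full model: $Y_i = \beta_0 + \beta_1\,\mathrm{Pseudotime}_i + \beta_2\,\mathrm{Branch}_i + \beta_3\,\mathrm{Pseudotime}_i \cdot \mathrm{Branch}_i + \varepsilon_i$
Reduced model: $Y_i = \beta_0 + \beta_1\,\mathrm{Pseudotime}_i + \beta_2\,\mathrm{Branch}_i + \varepsilon_i$
Likelihood ratio test to find significant genes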
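For reference, the generic form of the likelihood ratio test comparing the two models is sketched below. Here $\ell$ denotes a maximised log-likelihood. The Gaussian shortcut on the last line is an assumption made for illustration; the error model used in practice may differ.

```latex
\Lambda = -2\left[\ell_{\text{reduced}} - \ell_{\text{full}}\right]
\;\overset{H_0}{\sim}\; \chi^2_{d},
\qquad d = \#\text{parameters}_{\text{full}} - \#\text{parameters}_{\text{reduced}}

% With Gaussian errors and n cells, the statistic reduces to
\Lambda = n \,\ln\!\left(\frac{\mathrm{RSS}_{\text{reduced}}}{\mathrm{RSS}_{\text{full}}}\right)
```

The models differ only in the interaction term $\beta_3$, so $d = 1$ when Branch has two levels. A small p-value for a gene suggests its expression over pseudotime depends on the branch. Testing every gene calls for a multiple-testing correction.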
Trajectory Analysis
Still working on it! But here’s the type of stuff you pull out
Qiu et al., BioRxiv, 2017
Where is the field going?
Trajectory Analysis
• Larger data sets
• Combining the technology with perturbations
• Collecting multiple –omics datasets from individual cells
Dixit et al., Cell, 2016
BioRxiv, 2017
Disclaimer: Much of this will be obsolete in a matter of months
Staying on the ball with scRNA-Seq
Nature Methods, Jan 23rd, 2017
Science, March 3rd, 2017
Nature Methods, March 27th, 2017
Nature Methods, March 6th, 2017
Nature Methods, April 17, 2017
Nature Biotechnology, May 1st, 2017
Resources
Sean Davis’s “Awesome Single Cell” list
https://github.com/seandavi/awesome-single-cell
10x Genomics Public Datasets
https://support.10xgenomics.com/single-cell/datasets
1.3 Million brain cells from E18 mice
68k PBMCs
Fun Tutorials
Seurat: http://satijalab.org/seurat/get_started.html
Monocle (find on Bioconductor)
Thank You!!!

scRNA-Seq Lecture - Stem Cell Network RNA-Seq Workshop 2017

  • 1. Gene Expression One Cell at a Time Experimental design and analysis of single-cell RNA-Seq data David Cook Vanderhyden Lab, uOttawa DavidPCook dpcook dcook082@uottawa.ca
  • 2. Conclusions from bulk analysis can be representative of nothing Bulk analysis InterpretationSample
  • 3. Conclusions from bulk analysis can be representative of nothing Impossible to conclude if differences are due to composition or cancer cells themselves TCGA, Nature, 2011
  • 4. Bulk summaries hide underlying structure X Mean: Y Mean: X SD: Y SD: Corr: 54.26 47.83 16.76 26.93 -0.06 Matejka and Fitzmaurice (Autodesk Research, Toronto)
  • 5. Single-cell exposes this heterogeneity scRNA-Seq Fur Venom Sample Interpretation
  • 8. Fluidigm C1 Pros Allows visual inspection of captured cells Customizability Cons Only two inlets for cell samples Throughput can’t keep up with field Relatively long prep time Live Cell Dead Cell Multiple Live Cells Calcein AM Ethidium homodimer-1
  • 9. Droplet-based methods Pros Very high throughput Up to 8 unique samples per run System cost relatively low Cons Limited customizability Zheng et al., Nature Comm, 2017
  • 11. Common Chemistry: RT and 3’ Enrichment Only 3’ end of transcript is PCR amplified
  • 12. Why 3’ enrichment? 5’ 3’1kb cDNA Ten 100bp reads needed for 1x coverage 100bp reads 5’ 3’ 200bp 3’ fragment Two 100bp reads needed for 1x coverage 100bp reads Consequence: Lose nearly all information about isoform usage (sorry, Matt)
  • 13. Single-Cell Platforms 10x Genomics BioRad ddSeq Fluidigm C1 Plate methods Cost per cell Cells per run Flexibility/Customizable + ~1000-46000 + ++ ~300-10000 + ++++ 96 or 800 +++ Protocol Dependent 10 - >10k +++++
  • 14. Cost 10x Genomics Reagent Kit (20 samples): $20,000 One sample = ~600-6000 cells Microfluidics Chips (Six 8-sample chips): $1,440 Fluidigm C1 (HT assays) Reagent Kit (5 runs): $5,000 One run = ~800 cells Integrated Fluidics Circuit (1 run): $2000 Sequencing NextSeq500 High Output 1 run ($3700) enough for ~2-3k cells HiSeq4000 1 lane (~$2700) enough for ~2-3k cells (Often need to purchase entire flow cell)
  • 16. How many cells? Depends on what you’re looking at More cells = better detection of rare populations Mocosko et al,. Cell, 2015 Pollen et al,. Nature Biotech, 2014 More heterogeneity? More cells
  • 17. Sequencing: How deep do you need to go? Depends on what you want Svensson et al., Nature Methods, 2017 Rough Guideline Aim for 100,000 reads per cell 50,000 per cell is probably fine Zheng et al., Nature Comm, 2017 16k reads/cell (>60k PBMCs) Zheng et al., Nature Comm, 2017
  • 18. Sample numbers and batch effects Hicks et al., BioRxiv, 2016 Mix biological variables in individual runs!
  • 19. Sample numbers and batch effects Tung et al., Scientific Reports, 2017
  • 21. Project Background. Hormone replacement therapy increases risk of ovarian cancer. Exogenous estrogen enhances cancer progression in mouse models. Prolonged estrogen exposure causes ovarian epithelial dysplasia in normal mice. (Figure panels: areas of columnar OSE, % ovarian surface that has columnar cells, control vs. estradiol; areas of hyperplastic OSE, % ovarian surface that is hyperplastic, control vs. estradiol; placebo vs. E2.)
  • 22. General Analysis Workflow: Sequencing -> Processing -> QC & Filtering -> Normalization (and imputation?) -> Clustering -> Differential expression, trajectory analysis, network analysis, etc.
  • 23. Alignment, transcript quantification, and import into R. Kallisto: pseudoalignment to the transcriptome (Bray et al., Nature Biotech, 2016). tximport package to dump gene-level expression matrix into R (Soneson et al., F1000, 2016).
  • 25. Filtering scRNA-Seq Data. Before filtering: 800 cells, 30735 genes. After filtering: 636 cells, 14300 genes. Filter genes that are not detected in at least 10 cells. (Image labels: dead cell, multiple live cells; ethidium homodimer-1, Fluidigm specific.)
  • 27. Finding and controlling for technical variables. Data exploration is critical. SCEset: expression matrices (raw counts, log-transformed, Z-scores, normalized; axes: cells, genes), cell metadata (phenoData), gene metadata (featureData).
  • 28. Finding and controlling for technical variables. 1. Library size: scaling each library by a size factor. Options: counts per million (CPM); DESeq; TMM; pooling-based size factors (Lun et al., Genome Biology, 2016).
  • 29. Finding and controlling for technical variables. 2. Cell cycle (or other confounding biological processes we aren’t interested in) (Stegle et al., Nature Rev. Genetics, 2015). Cell cycle classification using the “scran” package. Cell cycle not driving large amounts of variation at this point.
  • 30. Finding and controlling for technical variables. 3. Other technical variables: finding variables that drive variation. (Figure: coloured by IFC Column.)
  • 31. Finding and controlling for technical variables. 3. Other technical variables: removeBatchEffect() from the limma package. $Y_i = \beta_0 + \beta_1 \text{TotalFeatures}_i + \beta_2 \text{IFC.Row}_i + \beta_3 \text{Condition}_i + \varepsilon_i$. Removes the effect of the technical covariates on a per-gene basis. Note: IFC.Column tackled same way, but split by condition beforehand. (Figure labels: post-normalization; odd IFC Column.)
  • 32. Finding and controlling for technical variables 3. Other technical variables O_o?
  • 34. Data Imputation Van Dijk et al., BioRxiv, 2017
  • 35. Data Imputation (figure panels: before imputation, after imputation)
  • 39. Differential Expression. Currently no “standard”; a variety of methods perform pretty well: DESeq, edgeR, Monocle, SCDE (single cell differential expression).
  • 40. Differential Expression (figure axes: PC1 values, PC2 values, PC3 values). Terms shown: Proteolysis; Actin cytoskeleton organization; Cell adhesion; Innate immune response(?); Oxidation-reduction process; Oxidative stress; Positive regulation of apoptosis; Metabolic pathways; MAPK signaling; PI3K-Akt signaling; Negative regulation of apoptosis; Cell differentiation; Oxidation-reduction process; Oxidative stress; Proton transport; Mitophagy; Response to wound healing; Response to hypoxia; Apoptosis; Negative regulation of cell cycle; Oxidation-reduction process; Cell adhesion; Rho protein signaling; TCA cycle.
  • 41. Trajectory Analysis. Leveraging asynchrony to reconstruct cellular response trajectories (Wagner et al., Nature Biotech Reviews, 2016). A couple of methods: Monocle, Diffusion pseudotime, PHATE, Wishbone, Waterfall, Wanderlust, SLICER, TSCAN, and more.
  • 43. Trajectory Analysis Can we model the phenotype divergence where estrogen-treated cells progress to form foci? Does this model foci formation?
  • 45. Trajectory Analysis. Discovering new transcriptional dynamics. Full model: $Y_i = \beta_0 + \beta_1 \text{Pseudotime}_i + \beta_2 \text{Branch}_i + \beta_3 \text{Pseudotime}_i \text{Branch}_i + \varepsilon_i$. Reduced model: $Y_i = \beta_0 + \beta_1 \text{Pseudotime}_i + \beta_2 \text{Branch}_i + \varepsilon_i$. Likelihood ratio test to find sig. genes.
  • 46. Trajectory Analysis Still working on it! But here’s the type of stuff you pull out Qiu et al., BioRxiv, 2017
  • 47. Where is the field going?
  • 48. Trajectory Analysis • Larger data sets • Combining the technology with perturbations • Collecting multiple –omics datasets from individual cells Dixit et al., Cell, 2016 BioRxiv, 2017
  • 49. Disclaimer: Much of this will be obsolete in a matter of months
  • 50. Staying on the ball with scRNA-Seq Nature Methods, Jan 23rd, 2017 Science, March 3rd, 2017 Nature Methods, March 27th, 2017 Nature Methods, March 6th, 2017 Nature Methods, April 17, 2017 Nature Biotechnology, May 1st, 2017
  • 51. Resources. Sean Davis’s “Awesome Single Cell” list: https://github.com/seandavi/awesome-single-cell. 10x Genomics Public Datasets: https://support.10xgenomics.com/single-cell/datasets (1.3 million brain cells from E18 mice; 68k PBMCs). Fun tutorials: Seurat: http://satijalab.org/seurat/get_started.html; Monocle (find on Bioconductor).
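
The gene filter from slide 25 (drop genes not detected in at least 10 cells) can be sketched in a few lines. This is only an illustration on toy data, not the code used for the talk: the function name `filter_genes`, the gene names, and the counts are all made up, and "detected" is assumed to mean a count greater than zero.

```python
def filter_genes(counts, min_cells=10):
    """Keep genes detected (count > 0) in at least `min_cells` cells.

    `counts` maps a gene name to its list of per-cell counts.
    """
    kept = {}
    for gene, per_cell in counts.items():
        n_detected = sum(1 for c in per_cell if c > 0)
        if n_detected >= min_cells:
            kept[gene] = per_cell
    return kept


# Toy data (hypothetical): 12 cells, 3 genes.
toy_counts = {
    "GeneA": [3, 0, 1, 2, 5, 1, 1, 4, 2, 1, 0, 7],  # detected in 10 cells
    "GeneB": [0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0, 0],  # detected in 1 cell
    "GeneC": [1] * 12,                               # detected in all 12 cells
}

filtered = filter_genes(toy_counts, min_cells=10)
print(sorted(filtered))
```

With real data the same rule is usually applied to a genes-by-cells matrix; the threshold of 10 cells is the one quoted on the slide and may need adjusting for smaller experiments.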

Editor's Notes

  1. CPM is not robust to high amounts of differential expression. DESeq and TMM don’t hold up well to lots of zero counts.
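
A small worked example may help show why CPM is sensitive to strong differential expression. The sketch below uses two made-up cells that differ in a single gene; the helper `cpm` and all counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def cpm(counts):
    """Scale one cell's counts to counts per million of its library size."""
    total = sum(counts)
    return [c * 1_000_000 / total for c in counts]


# Two hypothetical cells with four genes each.
# Genes 1-3 have identical raw counts; only gene 4 is strongly up in cell B.
cell_a = [100, 100, 100, 100]
cell_b = [100, 100, 100, 700]

cpm_a = cpm(cell_a)
cpm_b = cpm(cell_b)

# Ratio of CPM values (cell B / cell A) for each gene.
ratios = [b / a for a, b in zip(cpm_a, cpm_b)]
print(cpm_a)
print(cpm_b)
print(ratios)
```

Because gene 4 takes up most of cell B's library, the three unchanged genes come out lower in CPM for cell B even though their raw counts are identical. This is the composition effect that the other size-factor methods listed on slide 28 try to address.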