Single Cell RNA-seq reveals ectopic and aberrant lung
resident cell populations in IPF
UAB Division of Pulmonary
and Critical Care Medicine
Special Journal Club
September 26th, 2019
Presenter: Thi Nguyen
Dr. Duncan’s lab
bioRxiv preprint Sep. 6, 2019
Outline
• Background (authors + 10X chromium)
• Figures
• Summary of findings
• Discussion
• Comparative analyses of scRNAseq studies in IPF
Background
Naftali Kaminski, MD
Section Chief
Pulmonary, Critical Care
& Sleep Medicine
Yale School of Medicine
Ivan O. Rosas
Associate Professor
Harvard Medical School
Main research interests:
• pioneer in high-throughput genomic approaches to elucidate mechanisms and
improve IPF diagnosis/treatment
• integrate ‘omics’ data with clinical information for personalized medicine
Single Cell RNA-seq for IPF project:
• April 27th 2017, Three Lakes Partners, in collaboration with MATTER, announced
the $1 million IPF Catalyst Challenge
• November 9-11th 2017, PFF Summit 2017, Nashville, TN: Taylor Adams presented a
poster with promising results from scRNAseq (5 IPF + 5 CT)
• 2018, their team won the cash prize, proposed to sequence 100 lungs within
3-4 months, and promised to share results in short order.
• Sept. 2019, posted a bioRxiv preprint at the same time as Kropski’s group.
10X Chromium technology to partition single cells
Fig 1 A, B. Profile human lung heterogeneity with
scRNAseq
312,928 cells
38 cell types
Fig.1C. cell-type marker genes expression
312,928 cells
38 cell types
Fig.2. Identification of aberrant basaloid cells in
IPF and COPD lungs
UMAP:
uniform manifold approximation and projection
• nonlinear dimensionality reduction technique
Fig.2C. Epithelial cell gene expression and predicted
transcription factor activity
483 aberrant basaloid cells
(448 IPF vs 33 COPD)
Fig.2D. Aberrant basaloid cells in IPF lungs
Fig.2E. Correlation matrix of epithelial cells in an independent
dataset
Fig.3. Disease-Enriched Vascular Endothelial cells
Locations of Peribronchial endothelial cells (pVE)
independent dataset validation of pVE
Fig 4. IPF Fibroblast and Myofibroblast archetype
analysis
partition based graph abstraction
(PAGA)
Fig.4 C, D. Fb and MyoFb lineage trajectory
analysis
Diffusion Pseudotime (DPT) :
• measures progression through branching lineages using random-walk-based distance
in diffusion map space.
Fig 5. Gene regulatory network analysis
node= gene
edge = correlation
node size = PageRank centralities
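A minimal sketch of how PageRank centrality could be computed on a small correlation network, to make the "node size" encoding concrete. The gene names, edge weights, damping factor, and the `pagerank` helper are illustrative assumptions, not values or code from the preprint.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on an undirected, weighted graph.

    edges maps (gene_a, gene_b) -> positive weight, standing in here
    for the strength of a correlation between two genes.
    """
    # Build a symmetric adjacency map from the edge list.
    neighbours = {}
    for (a, b), w in edges.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    nodes = sorted(neighbours)
    n = len(nodes)
    rank = {g: 1.0 / n for g in nodes}
    # Total edge weight per node (assumed positive for every node).
    strength = {g: sum(neighbours[g].values()) for g in nodes}

    for _ in range(max_iter):
        # Every node receives a baseline share, then weighted shares
        # from each of its neighbours.
        new = {g: (1.0 - damping) / n for g in nodes}
        for g in nodes:
            for h, w in neighbours[g].items():
                new[h] += damping * rank[g] * w / strength[g]
        change = sum(abs(new[g] - rank[g]) for g in nodes)
        rank = new
        if change < tol:
            break
    return rank


# Hypothetical toy network: one hub gene correlated with three others.
toy_edges = {
    ("HUB", "G1"): 0.9,
    ("HUB", "G2"): 0.8,
    ("HUB", "G3"): 0.7,
    ("G3", "G4"): 0.6,
}
scores = pagerank(toy_edges)
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{gene}\t{score:.3f}")
```

In this toy network the hub gene receives the largest score, which is the property a node-size encoding would make visible.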
Fig 5. IPF GRN analysis
node size = PageRank
centrality
https://en.wikipedia.org/wiki/PageRank
Summary
• single cell atlas of IPF lungs
• found the aberrant basaloid cells in IPF
• found an ectopic VE cell population
• lineage analysis: Fb and myoFb are independent cell
types that become invasive and fibrotic in IPF
• the IPF GRN shifts from a balanced, diverse
network to a more fragmented, modular one.
Discussion
• What is the origin of the aberrant basaloid cells in IPF?
• Where do the COL15A1+ VE cells come from?
• Are myoFb just differentiated Fb or are they from an
independent lineage?
• Can we target these cells to cure IPF?
• How does knowledge of the IPF cell atlas advance IPF research?
What are the next questions that scRNAseq technology can help
us answer?
comparative analysis of 2 scRNAseq studies on IPF
Earlier scRNAseq studies on IPF
• FACS sort for CD45-CD31-CD326+
HTII-280+ AT2 cells
• 3 IPF (325 cells) + 3 Ct lungs (215 cells)
• IPF AT2 cells coexpress AT1, AT2 and conducting
airway selective markers -> indeterminate state
of differentiation not seen in normal lung
• 8 Ct + 4 IPF + 2 SS + 1 polymyositis + 1 HP
(biopsies)
• 76,070 cells
• distinct population of alveolar MΦ with high
expression of profibrotic genes
• found KRT5+TP63+ SOX2+ cells in both
normal and fibrotic lungs
Discussion
Discussion:
Didn’t we know this all along?
scRNAseq isn’t the answer to everything
1. cells need to be dissociated into single cells
2. can scRNAseq recover every cell? How representative are 10^4 cells out of 10^12 cells?
3. FACS sorting/frozen cells cause artifacts
4. low capture efficiency/high dropout -> unable to detect low-abundance transcripts
5. information about cells’ original spatial context is lost
6. low starting material -> data are noisier, more variable than bulk
7. curse of dimensionality
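
Point 2 above can be put into numbers with a short calculation. This is a rough sketch that assumes cells are drawn independently and without bias, which tissue dissociation and sorting do not guarantee; the function names and the example frequency of 1 in 1,000 are illustrative.

```python
import math

cells_profiled = 1e4   # order of magnitude quoted on the slide
cells_in_lung = 1e12   # order of magnitude quoted on the slide
fraction_sampled = cells_profiled / cells_in_lung
print(f"fraction of cells sampled: {fraction_sampled:.0e}")


def prob_at_least_one(frequency, n_cells):
    """Chance that n_cells unbiased draws contain at least one cell
    of a type present at the given frequency."""
    return 1.0 - (1.0 - frequency) ** n_cells


def cells_needed(frequency, confidence=0.95):
    """Smallest number of draws that reaches the requested chance of
    seeing at least one cell of the type."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


# A cell type making up 1 in 1,000 cells (hypothetical frequency).
rare = 1e-3
print(f"P(seen in 10^4 cells): {prob_at_least_one(rare, 10_000):.5f}")
print(f"cells needed for 95% chance: {cells_needed(rare)}")
```

Under these assumptions the tiny sampled fraction matters less than the frequency of the cell type of interest: what limits detection is how rare a population is relative to the number of cells profiled.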

Editor's Notes

  1. This study was a collaboration between Kaminski’s group at Yale, a pioneer in high-throughput genomic approaches to studying IPF, and Ivan Rosas’s group, which supplied the clinical lung specimens. With a strong interest in integrating high-throughput ‘omics’ data to generate tools for precision medicine in IPF, Kaminski started doing single-cell RNA-seq back in 2017, but with limited samples. In April 2017, Three Lakes Partners, a venture philanthropy with a mission to end IPF, in collaboration with MATTER, the healthcare technology incubator and innovation hub, announced a $1 million cash award for the IPF Catalyst Challenge. At the PFF Summit 2017 in Nashville, a postbaccalaureate researcher in Kaminski’s lab, Taylor Adams, presented the team’s first single-cell data in a poster with promising results from scRNAseq of only 5 IPF and 5 CT lungs. This research attracted Three Lakes’ attention. In 2018, their team won the cash prize, proposed to sequence 100 lungs within 3-4 months, and promised to share results in short order. This funding enabled the team to accelerate scRNAseq at unprecedented scale. Earlier this month, they posted a bioRxiv preprint at the same time as Kropski’s group from Vanderbilt. Dr. Kaminski has a strong interest in integrating high-throughput ‘omics’ data, such as genome-scale DNA variants, coding and non-coding RNAs, microbiome and metabolome information, with clinical information to generate tools for personalized medicine of lung diseases that are significantly more precise, predictive and patient centered than anything that is currently available.
Three Lakes Partners, a venture philanthropy committed to ending idiopathic pulmonary fibrosis (IPF), in collaboration with MATTER, the healthcare technology incubator and innovation hub, announced its $1 million IPF Catalyst Challenge this evening during a gathering of some of the world's foremost IPF experts and healthcare thought leaders at MATTER's headquarters in Chicago's Merchandise Mart. In fact, a postbaccalaureate researcher in Kaminski’s lab, Taylor Adams, presented the team’s first single-cell data in a poster at last year’s Pulmonary Fibrosis Foundation Summit in Nashville. That presentation, Kaminski believes, first attracted Three Lakes’ attention. Using single-cell transcriptomics, Kaminski and his team plan to sequence the RNA (ribonucleic acid) of every cell in more than 100 donor lungs affected by IPF and other lung diseases. Ivan O. Rosas, MD, a physician at Brigham and Women’s Hospital in Boston and associate professor of pulmonary and critical care at Harvard Medical School, is providing the lungs for this research as part of an ongoing collaboration. 
  2. Both the Kaminski group and the Kropski group used 10X Genomics’ Chromium technology, which partitions single cells (or sometimes nuclei) into nanoliter-scale oil droplets. Each droplet contains a uniquely barcoded gel bead; these droplets are called Gel Beads-in-Emulsion (GEMs). Inside the droplet, the cell is lysed and its mRNA is captured on the uniquely barcoded bead. The mRNA is then reverse transcribed to make cDNA, PCR amplified, pooled, and sequenced on a high-throughput platform. Using this technology, many novel cell types have been discovered. A notable example is the discovery of the pulmonary ionocyte in 2018, which was published in Nature.
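
Because every read from a droplet carries that droplet's bead barcode, pooled reads can be assigned back to individual cells after sequencing. The sketch below illustrates only this grouping idea; the barcodes, the read list, and the `count_by_barcode` helper are hypothetical, and a real pipeline would also align reads and collapse PCR duplicates, steps not covered here.

```python
from collections import Counter, defaultdict

# Hypothetical pooled reads: (cell barcode, gene the read was assigned to).
reads = [
    ("AAAC", "KRT5"), ("AAAC", "KRT5"), ("AAAC", "TP63"),
    ("GGTT", "SOX2"), ("GGTT", "KRT5"),
    ("CCAT", "COL15A1"),
]


def count_by_barcode(reads):
    """Group pooled reads by cell barcode and count reads per gene."""
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        counts[barcode][gene] += 1
    return counts


counts = count_by_barcode(reads)
for barcode in sorted(counts):
    print(barcode, dict(counts[barcode]))
```

Each barcode then corresponds to one row of a cell-by-gene count table, which is the starting point for the clustering shown in the figures.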
  3. Back to the IPF study. Here is an overview of the experimental design. They profiled a total of 79 human lungs: 32 explants from IPF, 18 from COPD, and 29 control lungs from unused donors. The lungs were dissociated into single-cell suspensions and stored in liquid nitrogen by the Ivan Rosas group before being handed over to the Kaminski group. They then used the single-cell barcoding technology to capture each cell's mRNA, make cDNA, PCR amplify it, and sequence it. The next steps were data processing, exploratory analysis, and validation by IHC. In total they successfully sequenced 312,928 cells from the distal lung parenchyma, as shown in the UMAP representation here. UMAP (uniform manifold approximation and projection) is a newer non-linear dimensionality reduction technique. It runs faster, is more reproducible, and preserves global structure better than older techniques such as t-SNE. In this UMAP, each dot represents one cell, and each cell's relationship to other cells reflects its position in the multidimensional space of gene expression. Human cells have about 20,000 genes, so with such high dimensionality we need a dimensionality reduction technique to visualize the data. Typically scRNAseq captures only about 15% of total transcripts, so each cell has about 3,000 gene expression values, which correspond to 3,000 dimensions. What UMAP does is take a high-dimensional dataset and reduce it to a low-dimensional one while retaining much of the information in the original, in such a way that the clustering of cells in the high-dimensional space is preserved. These cells are grouped into 38 discrete cell types, shown color coded here, which fall into 4 broader cell categories. High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret.
One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space. The manifold we embed into is, of course, just the low-dimensional Euclidean space.
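To make "reduce the dimensions but keep the clusters" concrete, here is a toy sketch. Note the swap: it uses PCA (a linear method, computed by power iteration) as a simple stand-in, not UMAP, which is non-linear and needs a dedicated implementation. The 4-gene toy cells, the noise level and all function names are assumptions for illustration.

```python
import random

def leading_axis(rows, iterations=200):
    """Return (gene means, first principal axis) using power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # d x d covariance matrix of the genes
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    axis = [float(j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in nxt) ** 0.5
        axis = [x / norm for x in nxt]
    return means, axis

def project(rows, means, axis):
    """Reduce each cell from d dimensions to one coordinate along the axis."""
    return [sum((r[j] - means[j]) * a for j, a in enumerate(axis)) for r in rows]

rng = random.Random(0)

def toy_cell(high_genes):
    """A fake 4-gene cell: 'high' genes sit near 5, the others near 0."""
    return [(5.0 if g in high_genes else 0.0) + rng.gauss(0, 0.5) for g in range(4)]

# Two made-up cell types, 20 cells each
cells = [toy_cell({0, 1}) for _ in range(20)] + [toy_cell({2, 3}) for _ in range(20)]
means, axis = leading_axis(cells)
scores = project(cells, means, axis)  # 4 dimensions reduced to 1
```

Even after collapsing 4 dimensions to 1, the two toy cell types remain separated along the new coordinate. That is the same idea the UMAP plot relies on, only there with thousands of dimensions reduced to 2.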
  4. Fig. 1C shows a heatmap of marker gene expression for these 38 discrete cell types in 4 broader categories: epithelial, stromal, myeloid and lymphoid. For each cell type, they show only the top 5 genes most differentially expressed between that cell type and the other cell types in the same category. Each column shows the average expression for one subject. They also group the columns hierarchically by disease status along the top. They forgot to include a legend for the disease color code, but I guess blue is normal, yellow is COPD and red is IPF. Overall, this figure is meant to show the validity of their cell classification. Figure legend: Heatmap of 235 marker genes for all 38 identified cell types, categorized into 4 broad cell categories. Each cell type is represented by the top 5 genes ranked by false-detection-rate adjusted p-value of a Wilcoxon rank-sum test between the average expression per subject value for each cell type against the other average subject expression of the other cell types in their respective grouping. Each column represents the average expression value for one subject, hierarchically grouped by disease status and cell type. Gene expression values are unity normalized from 0 to 1.
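Two steps in that legend are easy to sketch: ranking genes by an adjusted p-value, and unity normalization to the 0 to 1 range. The legend does not say which adjustment was used, so the Benjamini-Hochberg procedure below is an assumption. The gene names and p-values are made up, and the Wilcoxon test itself is not reproduced here.

```python
def benjamini_hochberg(pvalues):
    """Adjust p-values with the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def unity_normalize(values):
    """Rescale one gene's per-subject average expression to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a flat gene carries no contrast
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw p-values for one cell type versus the rest of its category
raw_p = {"GENE_A": 0.01, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.5}
genes = list(raw_p)
adj = dict(zip(genes, benjamini_hochberg([raw_p[g] for g in genes])))
top_markers = sorted(genes, key=lambda g: adj[g])[:2]  # the figure keeps the top 5

# Hypothetical average expression of one marker gene across three subjects
scaled = unity_normalize([2.0, 4.0, 6.0])
```

Normalizing each gene separately is what lets weakly and strongly expressed markers share a single color scale in the heatmap.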
  5. UMAP of all epithelial cells labeled by cell type (left), disease (right), or subject. Boxplots show the distribution, per subject, of each cell type as a proportion of all epithelial cells, stratified by disease group. You can see that the epithelial cell repertoire of the IPF lung has an increased proportion of airway epithelial cells and a decreased proportion of alveolar epithelial cells. They also mention profound changes in epithelial cell gene expression in IPF lungs compared to COPD or control lungs (see the IPF Cell Atlas data mining site). Among the epithelial cells, they identified a population that was transcriptionally distinct from any previously described epithelial cell type, which they called aberrant basaloid cells.
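The numbers behind those boxplots are just per-subject proportions. Here is a minimal sketch, with made-up subject IDs and cell counts, of how each data point is obtained; the function name is hypothetical.

```python
from collections import Counter, defaultdict

def celltype_proportions(cells):
    """cells: iterable of (subject, cell_type) pairs, one per epithelial cell.

    Returns {subject: {cell_type: fraction of that subject's epithelial cells}}.
    """
    per_subject = defaultdict(Counter)
    for subject, cell_type in cells:
        per_subject[subject][cell_type] += 1
    return {
        subject: {ct: n / sum(counts.values()) for ct, n in counts.items()}
        for subject, counts in per_subject.items()
    }

# Hypothetical annotated cells from one IPF and one control subject
cells = (
    [("IPF_01", "Basal")] * 3 + [("IPF_01", "ATII")] * 1
    + [("CTRL_01", "Basal")] * 1 + [("CTRL_01", "ATII")] * 3
)
props = celltype_proportions(cells)
```

Each subject contributes one fraction per cell type, and the boxplot then summarizes those fractions within each disease group.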
  6. Heatmap of average gene expression and predicted transcription factor activity per subject across each epithelial cell type. The columns are again grouped by disease status and cell type. They also zoom in to provide more detail for the annotation of aberrant basaloid cells. These cells are transcriptionally distinct from other epithelial cells. In addition to epithelial markers, they express basal cell markers such as TP63, KRT17, LAMB3 and LAMC2, but do not express well-established basal markers such as KRT5 and KRT15. They express EMT markers such as VIM, CDH2, FN1, COL1A1, TNC and HMGA2, and senescence-related genes such as CDKN1A, CDKN2A, CCND, etc. These cells also express the highest levels of IPF-related molecules such as MMP7, the alphaVbeta6 integrin subunits and EPHB2. They are predicted to express the transcription factor SOX9, which is important for distal airway development, repair and oncogenesis. These cells were not found in control lungs.
  7. IHC of aberrant basaloid cells in IPF lungs: the epithelial cells covering fibroblastic foci are p63+ KRT17+ basaloid cells that stain positive for COX2, p21 and HMGA2, while basal cells in the bronchi do not.
  8. To validate their results, they reanalyzed the IPF single-cell data from Reyfman et al., published earlier this year. The correlation matrix shows color-coded Spearman's rho correlation coefficients, with good correlation between analogous cell subsets in their study and the Reyfman study. Hierarchical clustering is also applied to the cell populations to show the relationships among the epithelial cell subsets; the aberrant basaloid cells most closely resemble basal cells.
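Each cell of that correlation matrix is a Spearman's rho between two expression profiles. Spearman's rho is the Pearson correlation of the ranks, which makes it robust to the different scales of two datasets. Here is a small sketch with made-up profiles; the variable names and values are illustrative only.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical average expression of the same 5 genes in three cell populations
basal_study_1 = [9.1, 7.5, 0.2, 0.1, 3.3]
basal_study_2 = [120.0, 80.0, 3.0, 1.0, 40.0]  # other dataset, different scale
at2_study_2 = [0.3, 0.1, 8.8, 9.5, 2.0]

rho_same = spearman_rho(basal_study_1, basal_study_2)
rho_diff = spearman_rho(basal_study_1, at2_study_2)
```

The two "basal" profiles correlate perfectly despite their different units because only the gene ordering matters, which is why a rank-based coefficient suits a comparison across studies.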
  9. Next they explore the endothelial cell repertoire in the IPF lung. Cluster analysis of vascular endothelial (VE) cells shows 4 populations characterized as capillary, arterial or venous VE. They also discovered an abnormal, ectopic fifth VE population that expresses COL15A1. From the Human Protein Atlas, they learned that COL15A1+ VE cells are restricted to the vasculature near major airways, so they named this population peribronchial VE. A. UMAPs of endothelial and mesenchymal cells labelled by cell type, disease status and subject. In the subject plot, each subject is represented by a unique color. B. Heatmap showing the characteristics of the 5 VE subtypes. Each column is one individual cell, grouped by disease and by subject as shown in the rows on top. Boxplots show the distribution of each VE cell type as a percentage of all VE cells within each disease group. You can see that peribronchial VE cells are found in all disease states but are substantially more abundant in IPF.
  10. They then localize these endothelial cells in the lung by staining for CD31, a pan-endothelial cell marker, and COL15A1. In control lungs, these cells are confined to the bronchial vasculature surrounding large proximal airways, but in IPF they can be found in the distal lung at the edge of fibroblastic foci.
  11. Violin plots of the expression of pan-VE markers and peribronchial VE-specific markers across VE cells from distal and airway lung samples in an independent dataset. These confirm that the COL15A1+ VE cells are an ectopic VE population in the distal lung in IPF. A violin plot is a hybrid of a box plot and a kernel density plot: it is similar to a box plot, with a rotated kernel density estimate added on each side of the central line to show the shape of the distribution. While a box plot shows only summary statistics such as the mean/median and interquartile range, a violin plot shows the full distribution of the data, which makes it more informative. The difference is particularly useful when the distribution is multimodal (more than one peak). Wider sections of the violin represent a higher probability that members of the population take on the given value; narrower sections represent a lower probability.
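The width of a violin at a given value is a kernel density estimate. Here is a minimal sketch with a Gaussian kernel, using made-up bimodal expression values and an arbitrarily chosen bandwidth. Both are assumptions for illustration; plotting libraries choose the bandwidth automatically.

```python
import math
import statistics

def gaussian_kde(samples, bandwidth):
    """Return a function estimating the probability density at any value x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical expression of one marker: half the cells are off, half are on
expression = [0.0, 0.1, 0.2, 0.1, 0.0, 3.0, 3.1, 2.9, 3.0, 3.2]
density = gaussian_kde(expression, bandwidth=0.3)

median = statistics.median(expression)
# Violin half-widths along the expression axis, from 0.0 to 3.5 in steps of 0.5
widths = [density(i * 0.5) for i in range(8)]
```

In this toy example the median falls between the two peaks, where the density is close to zero. A box plot would center its box on a value almost no cell actually has, while the violin shows two bulges with a thin waist between them.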
  12. They also characterize the fibroblasts (Fb) and myofibroblasts (myoFb) in IPF. They define these mesenchymal cells by selecting PDGFRB+ cells that are negative for known smooth muscle cell markers. This strategy identifies 2 distinct stromal populations, shown in the heatmap in A. Fb are defined as cells expressing CD34 and ECM proteins such as FBN1, FBLN2 and VIT, while myoFb are cells expressing high levels of cytoskeletal markers such as MYLK, NEBL, MYO10, etc. Each column is the average expression per cell type for one subject, and the columns are grouped by disease. You can see that in the IPF group, myoFb have higher expression of genes such as COL8A1 and ACTA2. B. UMAPs of myoFb and Fb color coded by cell type, disease and unsupervised Louvain sub-clusters. Louvain is an unsupervised clustering method (PCA first, then a k-nearest-neighbor graph) in which you can adjust the resolution to get more or fewer clusters. Here they evidently settled on 8 sub-clusters. Unsupervised means the clusters come from the relationships among the data points themselves, not from predetermined classes or groups. Next, they apply a lineage reconstruction technique called PAGA to these Fb and myoFb sub-clusters. This technique converts the clusters and their relationships into a graph of nodes and edges. Each node represents a sub-cluster and each edge represents connectivity between sub-clusters in the so-called phenotype space. The strength of connectivity between nodes is calculated and denoted here as edge confidence. You can see that connectivity among the Fb sub-clusters is much stronger than between any Fb sub-cluster and a myoFb sub-cluster.
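The idea of turning clusters into a graph can be sketched as follows: build a k-nearest-neighbor graph of cells, then ask how many of its edges cross between each pair of clusters. This is a crude proxy and not PAGA's actual edge-confidence statistic. The 2D coordinates, the cluster labels and the normalization by cluster sizes are all assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def knn_edges(points, k):
    """Undirected k-nearest-neighbour edges between 2D points, by index."""
    edges = set()
    for i, p in enumerate(points):
        dists = sorted(
            ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2, j)
            for j, q in enumerate(points)
            if j != i
        )
        for _dist, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def cluster_connectivity(edges, labels):
    """Share of possible cell-to-cell links between two clusters that are kNN edges."""
    sizes = Counter(labels)
    between = Counter()
    for i, j in edges:
        if labels[i] != labels[j]:
            between[tuple(sorted((labels[i], labels[j])))] += 1
    return {
        (a, b): between[(a, b)] / (sizes[a] * sizes[b])
        for a, b in combinations(sorted(sizes), 2)
    }

# Hypothetical cells: two neighbouring Fb sub-clusters and a distant myoFb one
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (2, 0), (2, 1), (3, 0), (3, 1),
          (10, 0), (10, 1), (11, 0), (11, 1)]
labels = ["Fb_1"] * 4 + ["Fb_2"] * 4 + ["myoFb_1"] * 4

connectivity = cluster_connectivity(knn_edges(points, k=3), labels)
```

In this toy layout the two Fb sub-clusters share neighbor edges while the myoFb sub-cluster shares none with either. That is the pattern the PAGA graph on the slide shows with thick and thin edges.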
  13. To analyze the lineage trajectories of Fb and myoFb, they implemented the diffusion pseudotime (DPT) algorithm, which uses scRNAseq data to try to reconstruct the developmental progression of the cells. They then apply UMAP dimensionality reduction again after DPT to obtain figure C, again colored by cell type, disease status and subject. D and E are heatmaps of Fb and myoFb ordered by DPT distance along UMAP manifolds that transition from a control-enriched region towards an IPF-enriched archetype. The color encodes the distance. These heatmaps show a continuous trajectory for both the Fb and myoFb lineages from normal to IPF.
  14. Next they perform gene regulatory network (GRN) analysis. They apply the bigSCale approach to control and IPF cells, excluding the COPD samples. In this method, cells are recursively clustered down to sub-clusters, and Z-scores are calculated from differential expression between sub-clusters. They then construct a gene correlation matrix using Pearson correlation coefficients and cosine distance, and keep the nodes that pass the cosine correlation filter and have a GO annotation as gene regulators. This gives a network of 13,000 nodes in control and 12,427 in IPF, with about 300,000 edges. In this network, nodes are genes, an edge is a correlation suggesting a regulatory relationship, and node size is PageRank centrality. PageRank (PR) is an algorithm used by Google Search to rank web pages in its search results. It works by counting the number and quality of links to a page to get a rough estimate of how important the website is; the underlying assumption is that more important websites are likely to receive more links from other websites. Here it measures the importance of a node in the network in the same way. The largest clusters are color coded. The top cell types in each cluster are highlighted, and each cluster is colored by its most dominant cell type. Overall, you can see that the IPF network is denser and has more discrete, more isolated clusters than the control network. The control network is more diverse and its cells are more spread out, both within and across clusters. Also, in the IPF network the aberrant basaloid cell cluster is very dense and sits near the epithelial cell cluster, which is very isolated from the rest of the cell types.
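For anyone curious how PageRank centrality is actually computed, here is a minimal power-iteration sketch on a tiny made-up network. The gene names and edges are hypothetical, and the damping factor of 0.85 is the commonly used default rather than a value taken from the paper.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed edge list [(source, target), ...]."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node first receives the same small baseline share
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for src in nodes:
            # A node with no outgoing links spreads its rank over all nodes
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

# Hypothetical network: three genes point to GENE_A, which points to GENE_B
edges = [
    ("GENE_B", "GENE_A"),
    ("GENE_C", "GENE_A"),
    ("GENE_D", "GENE_A"),
    ("GENE_A", "GENE_B"),
]
centrality = pagerank(edges)
most_central = max(centrality, key=centrality.get)
```

GENE_A ends up with the highest centrality because the most links point to it, and GENE_B ranks second because its single incoming link comes from an important node. For an undirected correlation network, each edge can be entered in both directions.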
  15. Next, they use the PageRank algorithm to rank the genes that most influence other genes in the network. The algorithm is based on the assumption that the genes with the most connections (edges) to other genes are the most influential in the network. The cartoon illustrates the basic principle of PageRank: the size of each face is proportional to the total size of the other faces pointing to it. Here is the same GRN as before, with the top 300 nodes (genes) highlighted in red; these are the nodes that differ most between the control and IPF networks by PageRank centrality. Node size corresponds to PageRank centrality. You can see that the genes driving the difference between the IPF and control networks belong to the BMP/WNT signaling pathways. E. Gene set enrichment of these 300 genes gives results related to cellular aging, response to TGF-beta1, epithelial tube formation and SMC differentiation.
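Picking the "most different" nodes can be sketched as ranking genes by how much their centrality changes between the two networks. The notes do not say exactly how the difference was scored, so the absolute difference used below is an assumption. The centrality values and the function name are made up.

```python
def top_shifted_genes(control_rank, ipf_rank, top_n):
    """Genes with the largest absolute change in centrality between networks.

    A gene missing from one network is treated as having centrality 0 there.
    """
    genes = set(control_rank) | set(ipf_rank)
    delta = {
        g: abs(ipf_rank.get(g, 0.0) - control_rank.get(g, 0.0)) for g in genes
    }
    # Sort by decreasing change; break ties alphabetically for a stable order
    return sorted(genes, key=lambda g: (-delta[g], g))[:top_n]

# Hypothetical PageRank centralities in the control and IPF networks
control_rank = {"GENE_A": 0.10, "GENE_B": 0.30, "GENE_C": 0.60}
ipf_rank = {"GENE_A": 0.50, "GENE_B": 0.25, "GENE_C": 0.25}

shifted = top_shifted_genes(control_rank, ipf_rank, top_n=2)
```

On the slide the same idea is applied with `top_n` set to 300, and the resulting genes are the ones passed on to gene set enrichment.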
  16. discover the shift in alveolar epithelial cells gene expression -> airway EC -> aberrant basaloid cells
  17. discover the shift in alveolar epithelial cells gene expression -> airway EC -> aberrant basaloid cells
  18. CD326 (EPCAM); HTII-280, a marker for alveolar type II (AT2) cells
  19. CD326 (EPCAM); HTII-280, a marker for alveolar type II (AT2) cells
  20. discover the shift in alveolar epithelial cells gene expression -> airway EC -> aberrant basaloid cells
  21. The pseudostratified epithelium of the mouse trachea and human airways contains a population of basal cells expressing Trp-63 (p63) and cytokeratins 5 (Krt5) and Krt14. Using a KRT5-CreER(T2) transgenic mouse line for lineage tracing, we show that basal cells generate differentiated cells during postnatal growth and in the adult during both steady state and epithelial repair. We have fractionated mouse basal cells by FACS and identified 627 genes preferentially expressed in a basal subpopulation vs. non-BCs. Analysis reveals potential mechanisms regulating basal cells and allows comparison with other epithelial stem cells. To study basal cell behaviors, we describe a simple in vitro clonal sphere-forming assay in which mouse basal cells self-renew and generate luminal cells, including differentiated ciliated cells, in the absence of stroma. The transcriptional profile identified 2 cell-surface markers, ITGA6 and NGFR, which can be used in combination to purify human lung basal cells by FACS. Like those from the mouse trachea, human airway basal cells both self-renew and generate luminal daughters in the sphere-forming assay. Idiopathic pulmonary fibrosis is a common form of interstitial lung disease resulting in alveolar remodeling and progressive loss of pulmonary function because of chronic alveolar injury and failure to regenerate the respiratory epithelium. Histologically, fibrotic lesions and honeycomb structures expressing atypical proximal airway epithelial markers replace alveolar structures, the latter normally lined by alveolar type 1 (AT1) and AT2 cells. Bronchial epithelial stem cells (BESCs) can give rise to AT2 and AT1 cells or honeycomb cysts following bleomycin-mediated lung injury. However, little is known about what controls this binary decision or whether this decision can be reversed. 
Here we report that inactivation of Fgfr2b in BESCs impairs their contribution to both alveolar epithelial regeneration and honeycomb cysts after bleomycin injury. By contrast overexpression of Fgf10 in BESCs enhances fibrosis resolution by favoring the more desirable outcome of alveolar epithelial regeneration over the development of pathologic honeycomb cysts.