High-Resolution Transcriptome Analysis: One Cell at a Time - Jian-Bing Fan



Gene function and regulation inside mammalian cells occur spatially and temporally within the context of the local microenvironment. Each individual cell is at a particular stage of gene expression, which defines specific cellular functions and phenotypes such as cell growth, proliferation, and interactions with other cells. A comprehensive molecular characterization of individual cells will help uncover the structure and dynamics of the cell lineage tree within a tissue or organ, in health and in disease, thus leading to a leap forward in biology and medicine.
This talk will focus on recent developments in single-cell transcriptome methodologies and their applications in cancer and stem cell research. The criteria for effective single-cell transcriptome analysis are (1) the ability to measure gene expression reliably and (2) the ability to profile a large number of individual cells cost-effectively. This talk will also discuss efforts toward the development of novel in-situ sequencing platforms that could carry out targeted expression analysis of hundreds to thousands of genes in millions of individual cells simultaneously, either in tissue at single-cell spatial resolution or in a heterogeneous cell population in tissue culture.

Published in: Health & Medicine, Technology

  1. 1. High-Resolution Transcriptome Analysis: One Cell at a Time
      AMATA 2013, Queensland, Australia, October 16, 2013
      Jian-Bing Fan, Senior Director, Scientific Research
      © 2013 Illumina, Inc. All rights reserved. Illumina, IlluminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, Infinium, iSelect, MiSeq, Nextera, NuPCR, SeqMonitor, Solexa, TruSeq, TruSight, VeraCode, the pumpkin orange color, and the Genetic Energy streaming bases design are trademarks or registered trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.
  2. 2. The Intuitive Beauty of RNA-Seq Data
      • All junctions are covered uniformly in RNA-Seq
  3. 3. RNA-Seq has evolved in 5 years
      • New methods: Stranded vs. Non-stranded
        – New Stranded RNA Prep kits
      • New methods: Poly-A vs. Total RNA
        – RiboZero kits are the method of choice for rRNA reduction
        – Total RNA methods reveal ncRNAs and allow “RIN independent” preps
      • Lower input levels
        – Standard input for all TruSeq RNA kits today is only 100 ng total RNA
      • Methods for studying highly degraded RNA
        – Can sequence RNA from FFPE samples
      • Single Cell RNA Sequencing Methods
  4. 4. Why single cells
      • Cellular heterogeneity
        – What is a cell type?
        – How many cell types are there?
      • Non-symptomatic somatic mutations
        – Cells at terminal differentiation contain “substantial” variations
      • Development and cellular differentiation
        – Cell lineage
        – Reprogramming
      • Metagenomes
      • Circulating cells (liquid biopsy)
        – CTC
        – Stem cells
        – Fetal cells
  5. 5. Single cell transcriptional landscapes
  6. 6. Unbiased cell-type discovery
      Sten Linnarsson, MBB, Mol Neuro
  7. 7. STRT (single-cell tagged reverse transcription)
      • Based on template switching at the 5’ end of mRNA
      • Barcoding already at the RT step, pooling before amplification
      • Sequence ~50 bp from 5’ end of mRNA (= TSS)
      • Highly multiplexed: 96 cells at a time
      Sten Linnarsson, MBB, Mol Neuro
  8. 8. STRT (single-cell tagged reverse transcription)
      • Reverse transcription, with TdT activity adding Cs
      • Template switching, PCR
      • Fragmentation, retaining 5’ end
      • P2 adapter
      • P1 adapter (library PCR)
      • Finished library
      Sten Linnarsson, MBB, Mol Neuro
  9. 9. Reproducibility
      [Figure: scatter plots on logarithmic axes from 1 to 10,000. ES cells panel: mRNA molecules (ES cell #1) versus mRNA molecules (ES cell #2). Synthetic mRNA panel: number of molecules (single well) on both axes. R² values shown: 0.97 and 0.98.]
      Sten Linnarsson, MBB, Mol Neuro
  10. 10. Distinguish cell types by clustering
      1. 96 individual cells, representing 3 different cell types, were profiled.
      2. Transcripts from each cell were tagged with a short 5-base code (during RT), pooled across the 96 cells for amplification, and made into a sequencing library for mRNA-Seq.
      3. Cell neighborhood was calculated based on individual cell expression profiles.
      4. The result is a set of clusters of mutually similar cells, which reflected the true identity of the cells.
      Cluster labels in the figure: Embryonic stem cells, Neuroblastoma (Neuro2A), Embryonic fibroblasts (MEF)
      Sten Linnarsson, Karolinska Inst
      Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Islam S, Kjällquist U, Moliner A, Zajac P, Fan JB, Lönnerberg P, Linnarsson S. Genome Research. 2011.
  11. 11. Cell type specific expression pattern
      Gene expression mapped on the cellular landscape. The number of hits to each gene, normalized to transcripts per million (t.p.m.) sequencing reads, is shown on a logarithmic color scale (inset, upper left). The left column shows housekeeping genes selected from a range of average t.p.m. levels. The middle column shows genes known as ES cell markers. The right column shows genes that were determined in this study to be preferentially expressed in Neuro2A.
  12. 12. Single-cell transcriptional profiling
  13. 13. Clontech SMARTer ultra low RNA kit for Illumina sequencing
  14. 14. Sequencing the transcriptome of a single cell
      Workflow: Sort cells → Smart-Seq amplification → cDNA → Illumina library prep → NGS sequencing
      Cells     RNA
      1         0.01 ng
      10        0.1 ng
      100       1 ng
      1000      10 ng
      10000     100 ng
      Labels on the slide: Good, Bad
  15. 15. SMARTer™ technology overview
      Key aspects of the SMARTer™ protocol:
      • Switching mechanism at 5’ end of RNA template
      • Single-tube, single-enzyme cDNA synthesis
      • SMARTer oligo provides increased template-switching efficiency of RT
      • Minimal handling of starting material lowers the probability of RNA degradation
      • Enrichment for full-length cDNA transcripts
  16. 16. Workflow overview
      • cDNA synthesis (shown for both workflows): Total RNA → SMART cDNA synthesis → full-length ds cDNA amplification
      • Library prep, Covaris workflow: Covaris → end repair → A-tailing → adapter ligation → PCR amplification
      • Library prep, Nextera workflow: Nextera tagmentation → PCR amplification
      • Timing and handling notes on the slide: ~5 hours; 1 day (“DAY TWO”); < 2 hours; automatable; SPRI purification
  17. 17. Primary sequencing metrics
      [Figure: % unique reads, % mapped reads, and % rRNA gene for replicates 1 and 2 at 10 ng, 1 ng, 0.1 ng, 0.05 ng, and 0.01 ng input; axes run from 0 to 120 and from 0 to 18,000.]
  18. 18. Reproducibility with various amounts of input RNA
      • Scatter plots comparing gene counts (i.e., log2 RPKM values) for replicate samples prepared using 10 ng, 1 ng, and 0.1 ng of mouse brain total RNA
      • Input levels represent the amount of RNA obtained from ~500, 50, and 5 cells, respectively
      • Reproducibility typically decreases as the amount of input decreases
  19. 19. Base Coverage
      • Sequencing coverage of SMARTer ultra low library (x-axis: % distance from 5’)
      • 724 genes analyzed for average coverage across the entire length of the transcripts
      • The graphs show consistent results between the 1 ng, 0.1 ng, 0.5 ng and 0.01 ng input amounts of mouse brain total RNA
  20. 20. Accuracy of SMARTer ultra low compared to TaqMan
      [Figure: MAQC UHR/Brain scatter plots of log2 qPCR ratio (brain vs UHR) against log2 sequencing count ratio (brain vs UHR), both axes from -10 to 10.]
      Input (total RNA)   Genes retained   Correlation (R)   Slope
      1 ng                705              0.942             0.913
      0.1 ng              581              0.856             0.754
  21. 21. Performance summary
      • Sensitive cDNA synthesis technology combined with Illumina next-generation sequencing
      • Single-tube protocol, robust library generation starting from picogram quantities of total RNA
      • High mapping rate, wide dynamic range, accurate gene quantification, and uniform transcript coverage
      • The SMARTer kit has been used and validated by more than 100 labs around the world
      • Fluidigm C1 Single-Cell Autoprep system has been customized for the SMARTer assay
  22. 22. Example 1: Gene-expression “landscape” of hematopoietic stem cells (HSCs)
  23. 23. Transcriptional ‘architecture’ of the first steps of the human hematopoietic hierarchy
      ‘Distances’ between hematopoietic populations, as measured by difference in expression in the downstream population relative to that in its progenitor (over twofold difference; FDR < 0.05), overlaid on the present hierarchical model of human hematopoietic differentiation.
      John Dick, University of Toronto
      The transcriptional architecture of early human hematopoiesis identifies multilevel control of lymphoid commitment. Elisa Laurenti, Sergei Doulatov, Sasan Zandi, Ian Plumb, Jing Chen, Craig April, Jian-Bing Fan & John E Dick. Nature Immunology. 2013.
  24. 24. Example 2: Single-cell transcriptome analysis of mammalian cell cycle
  25. 25. Single-cell transcriptomes of different cell cycle stages
      [Figure: expression of Cdt1 and Geminin; fluorescence (dR) versus cycle number (1 to 49), with curves labeled G1 and G2.]
      Li et al., Biotechnol. Adv. 2013
      John Zhong, USC
  26. 26. Molecular map of cell cycle
      Single-cell transcriptomes can be organized by similarity into a molecular map to reconstruct stepwise cell cycle events at the molecular level
      John Zhong, USC
  27. 27. Example 3: NIH single cell analysis program (SCAP)
  28. 28. (no text on slide)
  29. 29. The UCSD (PI)/Harvard/Scripps/Illumina team
      • Harvard: George Church
      • TSRI: Jerold Chun
      • UCSD: Kun Zhang, Wei Wang
      • Illumina: Jian-Bing Fan, Mostafa Ronaghi
      Diagram labels: Samples, Data, Methods
  30. 30. NIH Single Cell Analysis Program
      • Three centers funded by the National Institutes of Health's Common Fund, through its Single Cell Analysis Program (SCAP)
        – UCSD, USC and UPenn
      • Single-cell sequencing and in-situ mapping of mRNA transcripts in human brains:
        – Generating total-RNAseq data on 10,000 microdissected single cells or flow-sorted single nuclei from human cortex to create a 3D transcriptional map of the human brain.
        – Development and optimization of an in-situ RNA sequencing technology.
        – In-situ mapping of ~500 transcripts in 36 cortex sections, and integration with 10,000 sets of total-RNAseq data.
        – Includes UCSD (Kun Zhang (PI), Wei Wang), Scripps (Jerold Chun), Harvard (George Church), Illumina (Jian-Bing Fan, Mostafa Ronaghi).
  31. 31. Approach
      • Sample preparation (TSRI).
        – Microdissection of neurons and glia.
        – Flow sorting of neuronal and nonneuronal nuclei.
      • Single-cell total-RNAseq (Illumina & UCSD).
        – RNA transcripts +/- A-tails.
        – Long and short transcripts.
        – Strand-specificity.
        – Batch processing in 96-well plates.
      • RNA in situ sequencing (UCSD, Harvard & Illumina).
        – In-situ conversion of single RNA molecules into DNA nanoballs (rolonies).
        – In-situ decoding and counting by hybridization or sequencing on an automated confocal microscope with customized fluidic devices.
  32. 32. Single-cell transcriptome sequencing methods
      • Surani/LifeTech: Full-length mRNA (Tang et al. 2009)
      • STRT: mRNA 5’-end sequencing (Islam et al. 2011)
      • CEL-seq: mRNA 3’-end sequencing (Hashimshony et al. 2012)
      • Smart-seq: Full-length mRNA (Ramskold et al. 2012)
      • Smart-seq2: Full-length mRNA (Picelli et al. 2013)
      • Total-RNAseq (UCSD/Illumina, being developed)
        – Full length
        – Strand specific
        – mRNAs and ncRNAs
        – High throughput
  33. 33. Context is important
      Murray et al. Nat. Methods, 2008
  34. 34. RNA FISH
      • RNA FISH + epifluorescent imaging (Raj et al. Nat. Methods, 2008)
      • Barcoded RNA FISH + STORM (Lubeck et al. Nat. Methods, 2012)
  35. 35. In situ sequencing for RNA analysis in preserved tissue and cells
      Ke and Nilsson et al. Nat. Methods, 2013
  36. 36. Fluorescent in situ sequencing (FISSEQ)
      Jay Lee and George Church, Harvard
  37. 37. Two sequencing chemistries
      Jay Lee and George Church, Harvard
  38. 38. Characterization of the 3D RNA-Seq library
      The system was able to sequence the whole transcriptome in situ in 3D, mapping over 100,000 reads and 6000 clusters, and detecting mRNA, ncRNA, and antisense RNA, which can then strongly indicate the cell type.
      Jay Lee and George Church, Harvard
  39. 39. Single cell sequencing applications
      • Cancer
        – Early diagnosis of cancer
          · Circulating tumor cells may be present before …
          · Limited clinical samples and early stage cancers
          · Heterogeneity in tumors
        – Change in clonal population post-treatment
      • Brain transcriptome
        – 3-D transcriptome map of a brain at high resolution
      • Human cell lineage tree in health and disease (European Commission)
      • Embryo to Adult
        – Accumulation of somatic mutations with cell division
        – Stem cell differentiation
        – Cellular origin mapping
      • Fetal cells
      • Single cell microbes (metagenomes)
  40. 40. Summary
      • Single cell transcriptomes provided comprehensive molecular characterization of individual cells and revealed unique cell types/stages; discovered cell types correspond to marker-based cell types
      • Systematic whole-organism cell mapping is feasible
        – Millions of single-cell transcriptomes needed
      • Future technology development and integration
        – Isolation, identification & characterization of cells from all organs and systems in health, disease, & post-mortem
        – Molecular characterization of individual cells (e.g. single cell RNA-Seq)
        – Platforms: Next-gen sequencing, microfluidics, DNA arrays, & other analyses of individual cells
        – Three-dimensional subcellular transcriptome sequencing in situ
        – Real-time measurement
        – Computer Science & Systems: Extremely large-scale data capture, analysis, coalescence & management tools, methods & algorithms, cell lineage analysis & reconstruction algorithms, interactive data analyses & presentation.
        – Mathematics & Statistics
  41. 41. Acknowledgements
      • STRT technology development: Sten Linnarsson (Karolinska Inst), Saiful Islam (Karolinska Inst)
      • SMART kit development: Shujun Luo (Illumina), Gary Schroth (Illumina), Richard Sandberg (Ludwig Institute for Cancer Research), Daniel Ramskold (Ludwig Institute for Cancer Research), Andrew Farmer (Clontech)
      • HSC and cell cycle projects: John Dick (Ontario Cancer Institute, University of Toronto), Elisa Laurenti (Ontario Cancer Institute), John Zhong (University of Southern California)
      • NIH SCAP: Kun Zhang (PI; UCSD), Wei Wang (UCSD), Jerold Chun (Scripps), Jian-Bing Fan (Illumina), Mostafa Ronaghi (Illumina), Jay Lee (Harvard), George Church (Harvard)
  42. 42. Thank You
  43. 43. Fluorescent in situ sequencing (FISSEQ)
      Jay Lee and George Church, Harvard
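
Slides 10 and 11 describe tagging each cell's transcripts with a short 5-base code during RT, pooling the cells, and then reporting per-gene hits normalized to transcripts per million (t.p.m.). The bookkeeping behind those two steps might be sketched in Python as below. This is an illustration under stated assumptions, not the STRT pipeline itself: it assumes reads arrive as (sequence, gene) pairs with the gene already assigned by an upstream aligner and the cell barcode occupying the first five bases. The function names, barcode sequences, and toy reads are all hypothetical.

```python
from collections import Counter, defaultdict

BARCODE_LEN = 5  # slide 10 mentions a short 5-base code added during RT


def demultiplex(reads, barcode_to_cell):
    """Group gene-assigned reads by cell, using the leading barcode.

    `reads` is an iterable of (read_sequence, gene_name) pairs. The gene
    assignment is assumed to come from an upstream aligner (not shown).
    Reads whose barcode is not in `barcode_to_cell` are skipped.
    """
    counts = defaultdict(Counter)
    for sequence, gene in reads:
        cell = barcode_to_cell.get(sequence[:BARCODE_LEN])
        if cell is not None:
            counts[cell][gene] += 1
    return counts


def to_tpm(gene_counts):
    """Scale one cell's gene counts so that they sum to one million."""
    total = sum(gene_counts.values())
    if total == 0:
        return {}
    return {gene: n * 1_000_000 / total for gene, n in gene_counts.items()}


# Hypothetical toy data: two known barcodes and six pooled reads.
barcodes = {"ACGTA": "cell_1", "TTGCA": "cell_2"}
reads = [
    ("ACGTAGGGCTA", "Actb"),
    ("ACGTAGGGCTT", "Actb"),
    ("ACGTAGGGAAA", "Pou5f1"),
    ("ACGTAGGGCCC", "Actb"),
    ("TTGCAGGGCTA", "Actb"),
    ("NNNNNGGGCTA", "Actb"),  # unrecognized barcode, so this read is dropped
]

per_cell = demultiplex(reads, barcodes)
tpm = {cell: to_tpm(gene_counts) for cell, gene_counts in per_cell.items()}
```

With the toy reads above, `tpm["cell_1"]` puts three quarters of the million on Actb and one quarter on Pou5f1. Normalizing each cell separately is what makes profiles comparable across cells with different read depths, before any clustering of the kind described on slide 10.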