Pine Biotech - Profiling Tumor-associated Macrophages Using RNA-Seq

Using the T-BioInfo Platform to analyze publicly available data from a scientific publication, we were able to further explore the determination of macrophage types based upon gene expression.

Published in: Education

  1. Profiling Tumor-associated Macrophages Using RNA-Seq. Based on “Expression Profiling of Macrophages Reveals Multiple Populations with Distinct Biological Roles in an Immunocompetent Orthotopic Model of Lung Cancer” (Journal of Immunology 2016. doi:10.4049/jimmunol.1502364)
  2. Introduction
     Tumor-associated macrophages are a type of white blood cell found near or inside tumors. There is evidence for their involvement in both pro-tumor and anti-tumor processes. The current widely used classification for macrophages is M1/M2. Using data from GSE76033, one can look at the expression profiles of all of the macrophage samples to see whether there is a trend or grouping that reveals genes specific to M1/M2 or to subgroups within those classifications.
     Cells were taken from a cancerous lung and placed directly into the left lung of immunocompetent mice. At 2 weeks and 3 weeks after the injection procedure, the mice were sacrificed, along with uninjected control mice [1]. Next, samples that have commonalities between time points were selected.
     Using an RNA-Seq pipeline, estimated levels of expression for each gene and isoform were generated across all samples. Unsupervised analysis using principal component analysis (PCA) shows groupings by time and macrophage type. A supervised approach using Factor Regression Analysis shows the effects of different factors on genes in the samples. In this way, specific genes and isoforms that are uniquely expressed in macrophage subtypes were identified.
     [1] Poczobutt JM, De S, Yadav VK, et al. Expression Profiling of Macrophages Reveals Multiple Populations with Distinct Biological Roles in an Immunocompetent Orthotopic Model of Lung Cancer. J Immunol. 2016. doi:10.4049/jimmunol.1502364.
     #   Run         SiglecF   CD11b   Ly6G   CD64     CD11c   Cancer  Cell type  Weeks
     1   SRR3002645  SiglecF+                          CD11c+  No      Mac A      0
     2   SRR3002646  SiglecF+                          CD11c+  No      Mac A      0
     3   SRR3002647  SiglecF+                          CD11c+  No      Mac A      0
     4   SRR3002648            CD11b+  Ly6G-  CD64low  CD11c+  No      Mac B1     0
     5   SRR3002649            CD11b+  Ly6G-  CD64low  CD11c+  No      Mac B1     0
     6   SRR3002650            CD11b+  Ly6G-  CD64low  CD11c+  No      Mac B1     0
     7   SRR3002651            CD11b+  Ly6G-  CD64med  CD11c-  No      Mac B2     0
     8   SRR3002652            CD11b+  Ly6G-  CD64med  CD11c-  No      Mac B2     0
     9   SRR3002653            CD11b+  Ly6G-  CD64med  CD11c-  No      Mac B2     0
     10  SRR3002654  SiglecF+                          CD11c+  Yes     Mac A      2
     11  SRR3002655  SiglecF+                          CD11c+  Yes     Mac A      2
     12  SRR3002656  SiglecF+                          CD11c+  Yes     Mac A      2
     13  SRR3002657            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     2
     14  SRR3002658            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     2
     15  SRR3002659            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     2
  3. RNA-seq pipeline
     The RNA-seq pipeline estimates expression levels for all annotated and non-annotated genomic elements:
     1. Mapping with TopHat
     2. Finding isoforms with Cufflinks
     3. GTF file of isoforms is made by Cuffmerge
     4. Mapping with Bowtie-2t on the new transcriptome
     RSEM output tables of genes, isoforms and exons are then prepared for machine learning analysis:
     • Removing genomic elements that did not have any expression (all zeros) in the RSEM table
     • Quantile Normalization
     • Principal Component Analysis
     • Factor Regression Analysis
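The two preparation steps named on this slide, dropping all-zero elements and quantile normalization, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the T-BioInfo implementation: the function names and the toy matrix are made up, rows are assumed to be genomic elements and columns samples, and ties are broken by order of appearance rather than averaged.

```python
import numpy as np

def drop_all_zero_rows(expr):
    """Keep only rows (genomic elements) with at least one non-zero value."""
    keep = np.any(expr != 0, axis=1)
    return expr[keep]

def quantile_normalize(expr):
    """Give every column (sample) the same distribution of values.

    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are not averaged (a simplification).
    """
    order = np.argsort(expr, axis=0, kind="stable")      # per-sample row order
    sorted_vals = np.take_along_axis(expr, order, axis=0)
    reference = sorted_vals.mean(axis=1)                 # mean value per rank
    normalized = np.empty_like(expr, dtype=float)
    np.put_along_axis(normalized, order, reference[:, None], axis=0)
    return normalized

# Toy matrix (made-up numbers): 5 elements x 3 samples, one all-zero row.
toy = np.array([
    [5.0, 4.0, 3.0],
    [0.0, 0.0, 0.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
])
filtered = drop_all_zero_rows(toy)
normalized = quantile_normalize(filtered)
print(normalized)
```

After normalization every sample column contains the same set of values, only in a different row order, which is what makes samples comparable before PCA.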
  4. Principal Component Analysis
     Principal Component Analysis is a data reduction technique that represents the structure of the dataset on principal components. The components that explain the largest percentage of variability are chosen as principal. Using gene and isoform expression profiles across all samples, PCA shows a good separation between macrophage groups and time. Grouping by type can be seen in the PCA graphs on the bottom, with Mac A encircled in blue, Mac B1 and 2wk Mac B2 in red, and 3wk MacB2 along with MacB3 in green. The graph on the bottom right shows the normal distribution of gene expression, with no strong outliers.
     [Figure: Normal Distribution of Gene Expression]
     [Figure: PCA (13.26%, 10.78%) of Isoform Expression of All Samples after Quantile Normalization. Points are labeled by condition and cell type: NoC/2wk MacA, NoC MacB1, NoC/2wk/3wk MacB2, 2wk/3wk MacB3.]
     [Figure: PCA (16.04%, 13.21%) of Gene Expression of All Samples after Quantile Normalization. Points are labeled by time and cell type: 0wk/2wk MacA, 0wk MacB1, 0wk/2wk/3wk MacB2, 2wk/3wk MacB3.]
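As a rough illustration of how the percentages in the plot titles arise, the sketch below runs PCA on samples through a singular value decomposition and reports the share of variance carried by the first two components. It assumes a matrix with rows as genes or isoforms and columns as samples; the function name and the random input are placeholders, not the data or code used for the slides.

```python
import numpy as np

def pca_samples(expr, n_components=2):
    """PCA of samples. expr: rows = genes/isoforms, columns = samples.

    Returns the sample coordinates on the leading components and the
    percentage of total variance each of those components explains.
    """
    X = expr.T.astype(float)            # samples x features
    X = X - X.mean(axis=0)              # center every feature
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)
    return scores, explained[:n_components] * 100.0

# Placeholder input: 50 features x 6 samples of random values.
rng = np.random.default_rng(0)
expr = rng.normal(size=(50, 6))
scores, percent = pca_samples(expr)
print(scores.shape, percent)
```

Plotting the two columns of `scores` against each other, colored by cell type and time point, gives a scatter of the kind described on this slide.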
  5. PCA Comparison of Factor Groups
     [Figure: PCA 28.65%/10.10% of 0wk/2wk MacA/B2 Genes over Samples. Groups: MacA 0wk, MacB2 0wk, MacA 2wk, MacB2 2wk.]
     [Figure: PCA 21.80%/10.93% of 0wk/2wk MacA/B2 Isoforms over Samples. Groups: MacA 0wk, MacB2 0wk, MacA 2wk, MacB2 2wk.]
     [Figure: PCA 15.05%/10.49% of 2wk/3wk MacB2/B3 Genes over Samples. Groups: MacB2 2wk, MacB3 2wk, MacB2 3wk, MacB3 3wk.]
     [Figure: PCA 11.85%/10.47% of 2wk/3wk MacB2/B3 Isoforms over Samples. Groups: MacB2 2wk, MacB3 2wk, MacB2 3wk, MacB3 3wk.]
     Because there are no matching factors on all samples, we divided the project into 2 distinct groups that share 1 overlapping group of replicates. Even before performing factor analysis, we can see grouping appear in the PCAs of the divided factor groups as outlined on slide 3. This factor analysis compared a combination of 0 weeks vs 2 weeks and Mac A vs Mac B2, alongside 2 weeks vs 3 weeks and MacB2 vs MacB3.
  6. Factor Regression Analysis
     • MacA cells, in the presence or absence of the tumor, were negative for Ly6C and weak for MHC II, suggesting alveolar macrophages.
     • MacB1 cells expressed low levels of Ly6C, and their MHC II expression varied, suggesting two cell types.
     • MacB2 cells expressed high levels of Ly6C and no MHC II, which is typical for monocytes.
     • MacB3 cells were negative for Ly6C and expressed high levels of MHC II, which is typical for macrophages.

     #   Run         SiglecF   CD11b   Ly6G   CD64     CD11c   Cancer  Cell type  Weeks
     1   SRR3002645  SiglecF+                          CD11c+  No      Mac A      0
     2   SRR3002646  SiglecF+                          CD11c+  No      Mac A      0
     3   SRR3002647  SiglecF+                          CD11c+  No      Mac A      0
     4   SRR3002648            CD11b+  Ly6G-  CD64low  CD11c+  No      Mac B1     0
     5   SRR3002649            CD11b+  Ly6G-  CD64low  CD11c+  No      Mac B1     0
     6   SRR3002650            CD11b+  Ly6G-  CD64low  CD11c+  No      Mac B1     0
     7   SRR3002651            CD11b+  Ly6G-  CD64med  CD11c-  No      Mac B2     0
     8   SRR3002652            CD11b+  Ly6G-  CD64med  CD11c-  No      Mac B2     0
     9   SRR3002653            CD11b+  Ly6G-  CD64med  CD11c-  No      Mac B2     0
     10  SRR3002654  SiglecF+                          CD11c+  Yes     Mac A      2
     11  SRR3002655  SiglecF+                          CD11c+  Yes     Mac A      2
     12  SRR3002656  SiglecF+                          CD11c+  Yes     Mac A      2
     13  SRR3002657            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     2
     14  SRR3002658            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     2
     15  SRR3002659            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     2
     16  SRR3002660            CD11b+  Ly6G-  CD64hi   CD11c+  Yes     Mac B3     2
     17  SRR3002661            CD11b+  Ly6G-  CD64hi   CD11c+  Yes     Mac B3     2
     18  SRR3002662            CD11b+  Ly6G-  CD64hi   CD11c+  Yes     Mac B3     2
     19  SRR3002663            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     3
     20  SRR3002664            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     3
     21  SRR3002665            CD11b+  Ly6G-  CD64med  CD11c-  Yes     Mac B2     3
     22  SRR3002666            CD11b+  Ly6G-  CD64hi   CD11c+  Yes     Mac B3     3
     23  SRR3002667            CD11b+  Ly6G-  CD64hi   CD11c+  Yes     Mac B3     3
     24  SRR3002668            CD11b+  Ly6G-  CD64hi   CD11c+  Yes     Mac B3     3

     First group: F1: no cancer vs cancer; F2: MacA vs MacB2
     Second group: F1: 2wk vs 3wk after tumor; F2: MacB2 vs MacB3

     To use Factor Regression Analysis, every level of every factor needs to be present across the samples. In this project, a number of time points and cell types are not fully represented across all samples. Because there are no matching factors on all samples, we divided the project into 2 distinct groups that share 1 overlapping group of replicates. This factor analysis compared a combination of 0 weeks vs 2 weeks and Mac A vs Mac B2, alongside 2 weeks vs 3 weeks and MacB2 vs MacB3.
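The slides do not describe how the platform's Factor Regression Analysis is computed, so the following is only a generic stand-in: an ordinary least-squares fit of each gene against two binary factors, laid out like the first group (no cancer vs cancer, MacA vs MacB2, three replicates per combination). The function name, the additive model without an interaction term, and the expression values are all assumptions made for illustration.

```python
import numpy as np

# First group layout: 4 factor combinations x 3 replicates = 12 samples.
cancer = np.repeat([0, 0, 1, 1], 3)   # F1: 0 = no cancer (0wk), 1 = cancer (2wk)
cell = np.repeat([0, 1, 0, 1], 3)     # F2: 0 = MacA, 1 = MacB2
design = np.column_stack([np.ones(12), cancer, cell])

def fit_factors(expr, design):
    """Least-squares fit per gene. expr: genes x samples.

    Returns one row per gene: (intercept, F1 effect, F2 effect).
    """
    coef, *_ = np.linalg.lstsq(design, expr.T, rcond=None)
    return coef.T

# Made-up expression values: one gene driven only by cell type,
# one gene driven only by the presence of cancer.
expr = np.vstack([
    5.0 + 3.0 * cell,
    2.0 + 1.5 * cancer,
])
effects = fit_factors(expr, design)
print(effects)
```

A balanced layout like this one is what makes the two effects separable, which is why samples lacking a matching counterpart (for example Mac B1, present only at 0 weeks) cannot be fitted in the same model.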
  7. Top Genes influenced by Factors
     Taking a sample of the top genes from each Factor Regression Analysis comparison shows high expression of the selected genes for one factor (in the top right, Mac A) and low expression of those same genes in samples for another factor (Mac B2 in the top right). We can see similar results, with a smaller expression gap, in the bottom right between Mac B2 (low) and Mac B3 (high). The same comparison is made for time as a factor on the left. Qualifying genes which survived Factor Analysis filtering and match pathway analysis for PPAR Pathways and Cytokine-Cytokine Receptor Interaction are seen on the far right.
     [Figure: four panels of top genes, titled "0wk vs 2wk – MacA and MacB2", "0wk and 2wk – MacA vs MacB2", "2wk vs 3wk – MacB2 and MacB3" and "2wk vs 3wk – MacB2 and MacB3", with samples grouped as A/B2 at 0 and 2 weeks and B2/B3 at 2 and 3 weeks; plus two pathway panels, "PPAR Pathways" (A, B2) and "Cytokine-Cytokine Receptor Interaction" (B2, B3).]
     Gene legends of the four top-gene panels, in the order they were extracted:
     • ENSMUSG00000048078, ENSMUSG00000028031, ENSMUSG00000086503, ENSMUSG00000036446, ENSMUSG00000029838, ENSMUSG00000026069, ENSMUSG00000035799, ENSMUSG00000026204
     • ENSMUSG00000039934, ENSMUSG00000061397, ENSMUSG00000020838, ENSMUSG00000048834, ENSMUSG00000010651, ENSMUSG00000039013, ENSMUSG00000000794, ENSMUSG00000026065
     • ENSMUSG00000030000, ENSMUSG00000020950, ENSMUSG00000025738, ENSMUSG00000074480, ENSMUSG00000024968, ENSMUSG00000002459, ENSMUSG00000039476, ENSMUSG00000028197, ENSMUSG00000021136, ENSMUSG00000076614
     • ENSMUSG00000036905, ENSMUSG00000026548, ENSMUSG00000018920, ENSMUSG00000089929, ENSMUSG00000060586, ENSMUSG00000024663, ENSMUSG00000093809, ENSMUSG00000050777, ENSMUSG00000045404, ENSMUSG00000010307
     PPAR Pathways genes: ENSMUSG00000030162, ENSMUSG00000010651, ENSMUSG00000015846, ENSMUSG00000030546, ENSMUSG00000022853, ENSMUSG00000028607, ENSMUSG00000002108, ENSMUSG00000015568, ENSMUSG00000026003, ENSMUSG00000025059, ENSMUSG00000062908, ENSMUSG00000020777, ENSMUSG00000031808, ENSMUSG00000024900, ENSMUSG00000002944
     Cytokine-Cytokine Receptor Interaction genes: ENSMUSG00000073889, ENSMUSG00000009185, ENSMUSG00000022514, ENSMUSG00000007613, ENSMUSG00000097328, ENSMUSG00000024401, ENSMUSG00000030745, ENSMUSG00000018920, ENSMUSG00000002897, ENSMUSG00000024620, ENSMUSG00000068227, ENSMUSG00000000791, ENSMUSG00000000489, ENSMUSG00000071714
  8. Pathway Analysis of MacA vs MacB2 Genes
     The authors report that, out of 16 PPAR signaling genes, 12 were highly expressed in MacA cells. Using DAVID analysis, we were able to find similar results. In the pathway graph, genes with red stars were found exclusively in our analysis, genes with blue stars were found exclusively in the authors' analysis, and the overlapping genes are marked with grey stars. As all of the authors' results were also in our results, there are only red and grey stars, showing that our analysis found additional genes in the PPAR Pathway.
     [Figure: PPAR Pathways, expression in A and B2 samples at 0 weeks and 2 weeks.]
     Genes shown: ENSMUSG00000030162, ENSMUSG00000010651, ENSMUSG00000015846, ENSMUSG00000030546, ENSMUSG00000022853, ENSMUSG00000028607, ENSMUSG00000002108, ENSMUSG00000015568, ENSMUSG00000026003, ENSMUSG00000025059, ENSMUSG00000062908, ENSMUSG00000020777
  9. Pathway Analysis of MacB2 and MacB3
     The authors report that cluster B3 (genes highly expressed in both MacB3-2wk and MacB3-3wk) was enriched in pathways related to chemokine and cytokine signaling. Using Factor Analysis (below) and DAVID (left), we can see similar patterns. As on the previous slide, our analysis found all the same genes as the authors' analysis, and more.
     [Figure: expression of pathway genes in B2 and B3 samples.]
     Genes shown: ENSMUSG00000073889, ENSMUSG00000009185, ENSMUSG00000022514, ENSMUSG00000007613, ENSMUSG00000097328, ENSMUSG00000024401, ENSMUSG00000030745, ENSMUSG00000018920, ENSMUSG00000002897, ENSMUSG00000024620, ENSMUSG00000068227, ENSMUSG00000000791, ENSMUSG00000000489, ENSMUSG00000071714, ENSMUSG00000050395, ENSMUSG00000028362, ENSMUSG00000042333, ENSMUSG00000004296
  10. Conclusion
     • The macrophage population plays a critical role in controlling tumors, including their growth and progression. Understanding the model of cancer progression and the interactions between cancer cells and the macrophages and adaptive immune cells that are present therefore makes it possible to define gene expression signatures that could assist in clinical determinations.
     • Several biologically significant pathways were identified from gene expression after Factor Analysis, including cytokine-cytokine receptor interaction and PPAR signaling.
     • The expression profiles of genes and isoforms were consistent, as can be seen in the PCA on slide 5.
     • Overall, this study demonstrates the complex interactions between the macrophage population and the presence of tumors.
  11. Data
     All of the factor analysis data can be found in the following files.
     • Genes:
       • Unfiltered: expression_genes.txt
       • MacA and MacB2: FA of MacA and MacB2
       • MacB2 and MacB3: FA of MacB2 and MacB3
     • Isoforms:
       • Unfiltered: expression_isoforms.txt
       • MacA and MacB2: FA of MacA and MacB2 isoforms
       • MacB2 and MacB3: FA of MacB2 and MacB3 isoforms
     • PCA: PCA
  12. Educational Dataset
     Running a full pipeline on unfiltered samples can take a long time and produce many additional results that are difficult to interpret, such as unannotated genes and transcripts. To simplify the project for educational use, we took all the reads from all samples that aligned to a selection of significant and insignificant genes, and extracted them into small FastQ files. On average, these files are 6.2% of the original size and take significantly less time to run (approx. 3 hours). Significant genes are selected from the original set through Factor Regression Analysis, and a selected number of insignificant genes are also included to show students the difference in significance. These smaller datasets are available for download at the links below:
     • MacA vs MacB2: http://pine-biotech.com/data/edu-macab2.tar.xz
     • MacB2 vs MacB3: http://pine-biotech.com/data/edu-macb23.tar.xz
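The slide does not say which tool was used to extract the reads, so the sketch below only illustrates the idea: given the IDs of reads that aligned to the selected genes (assumed to be known already from the alignment step), keep the matching records of a FASTQ file. The function name, the read IDs and the sequences are made up, and the input is assumed to be well-formed four-line FASTQ records.

```python
def filter_fastq(lines, keep_ids):
    """Yield the 4-line FASTQ records whose read ID is in keep_ids.

    lines: iterable of text lines without trailing newlines.
    The read ID is the first word of the header line, minus the '@'.
    """
    it = iter(lines)
    # zip over the same iterator consumes the input four lines at a time.
    for header, seq, plus, qual in zip(it, it, it, it):
        read_id = header[1:].split()[0]
        if read_id in keep_ids:
            yield header, seq, plus, qual

# Made-up records; in practice the lines would come from an open file.
records = [
    "@read1 sample=demo", "ACGT", "+", "IIII",
    "@read2", "GGCC", "+", "IIII",
    "@read3", "TTAA", "+", "IIII",
]
kept = list(filter_fastq(records, {"read1", "read3"}))
for record in kept:
    print("\n".join(record))
```

Because the generator streams through the input one record at a time, the same approach works on full-size files without loading them into memory.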
