This document summarizes a study that analyzed whole transcriptome profiles of patient-derived xenograft (PDX) models of various cancer types using RNA sequencing. Specifically:
- Human tumor cells from 79 samples spanning breast, lung, GI, ovarian, endometrial and leukemia (CLL) cancers were implanted in mouse models. RNA from both human tumor and mouse stromal cells was extracted and analyzed.
- Unsupervised analysis identified batch effects across samples and specific samples with high stromal or cancer expression. This helped distinguish tumor vs stromal signals and identify potential biomarkers.
- The study aims to better understand the relationship between human tumors and mouse stroma in PDX models and identify biomarkers for personalized cancer treatment.
1. Whole Transcriptome Profiling of Cancer Tumors in Mouse PDX Models
Based on Breast Cancer Samples taken from the publication “Whole
transcriptome profiling of patient-derived xenograft models as a tool to
identify both tumor and stromal specific biomarkers”
(James R. Bradford et al.; DOI: 10.18632/oncotarget.8014)
2. Article Background and Summary
•Human tumor cells from patients with varying cancer types and stages were implanted in four
different mouse models. RNA from the human tumor cells and the mouse stromal cells was then
extracted and analyzed using unsupervised and supervised methods on the T-Bio platform.
This is a large, integrative total RNA project with biological replicates. The study covers
many cancer subtypes, which could be broken into educational sections that demonstrate the
strength of the T-Bio algorithms and approach.
•This study is the first comprehensive analysis across PDX models. It focused on identifying the
specific stromal cell types, investigating the relationship between human tumor and mouse stroma,
and identifying specific biomarkers for both tumor and stroma.
• Types of Cancer investigated:
– Breast, Lung, GI, Ovarian, Endometrial and Leukemia
3. Data Information
Extracted Molecule: Total RNA
Extraction Protocol: 50 mg of tissue was cut from the frozen tumors
and RNA was isolated using the RNeasy Lipid Tissue Mini Kit (Qiagen)
Genome: mouse/human (human cells are placed into
immunocompromised mice to grow tumors)
Instrument Model: Illumina HiSeq 2000
Sample #: 79
Sample Type:
• Lung (37 samples)
– Lung adenocarcinoma (18)
– Lung Squamous (14)
– Small Cell Lung Cancer (3)
– Lung (other) (2)
• Breast (19 samples)
– Breast TN (13)
– Breast ER+ (5)
– Breast HER2+ (1)
• GI (12 samples)
– CRC (8)
– Pancreatic (2)
– Ampullary (2)
• Ovarian (7 samples)
• Endometrial (3 samples)
• CLL (1 sample)
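The subtype counts above can be cross-checked against the per-type totals and the overall sample number (79). The following is a minimal Python sketch of that arithmetic; the dictionary layout and the names `SAMPLES` and `totals_by_type` are illustrative assumptions, and only the counts come from the slide.

```python
# Sample counts per cancer type and subtype, copied from the slide.
# The nesting and key names are an illustrative assumption.
SAMPLES = {
    "Lung": {
        "Lung adenocarcinoma": 18,
        "Lung Squamous": 14,
        "Small Cell Lung Cancer": 3,
        "Lung (other)": 2,
    },
    "Breast": {"Breast TN": 13, "Breast ER+": 5, "Breast HER2+": 1},
    "GI": {"CRC": 8, "Pancreatic": 2, "Ampullary": 2},
    "Ovarian": {"Ovarian": 7},
    "Endometrial": {"Endometrial": 3},
    "CLL": {"CLL": 1},
}


def totals_by_type(samples):
    """Sum the subtype counts within each cancer type."""
    return {cancer: sum(subtypes.values()) for cancer, subtypes in samples.items()}


totals = totals_by_type(SAMPLES)
grand_total = sum(totals.values())

for cancer, n in totals.items():
    print(f"{cancer}: {n}")
print(f"Total: {grand_total}")
```

The per-type sums match the totals given on the slide (37 lung, 19 breast, 12 GI), and the grand total matches the stated sample number.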
Data Generation:
RNA libraries were made with the Illumina TruSeq RNA
Sample Preparation kit (unstranded) according to the
manufacturer’s protocol. These libraries were then
submitted for 100 bp paired-end sequencing on the
Illumina HiSeq 2000 platform, using one lane per three
to six PDX models. A concatenated human
(GRCh37/hg19) and mouse (GRCm38/mm10)
genome was then constructed to form a single
genome of 43 chromosomes (23 from human and
20 from mouse). This was indexed using STAR
(https://github.com/alexdobin/STAR/releases) together
with a GTF-formatted file combining annotations for
human and mouse genes downloaded from Ensembl
version 75.
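Human and mouse references both contain chromosomes named "1", "2", "X" and so on, so a concatenated genome needs species-specific sequence names. The sketch below shows one way this could be done in Python. The slides do not say how the sequences were named, so the `human_` and `mouse_` prefixes and the function names are assumptions for illustration only.

```python
def prefix_fasta(lines, prefix):
    """Yield FASTA lines with each sequence name prefixed, e.g. '>1' -> '>human_1'."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield ">" + prefix + line[1:]
        elif line:
            yield line


def prefix_gtf(lines, prefix):
    """Yield GTF lines with the seqname (first column) prefixed.

    Comment lines starting with '#' are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        yield line if line.startswith("#") else prefix + line


def combine(human_lines, mouse_lines, prefixer):
    """Concatenate human and mouse records after prefixing (hypothetical prefixes)."""
    return list(prefixer(human_lines, "human_")) + list(prefixer(mouse_lines, "mouse_"))


# Toy input standing in for the real reference files.
human_fa = [">1 toy human chromosome", "ACGT", ">X", "GGCC"]
mouse_fa = [">1", "TTAA"]
combined_fa = combine(human_fa, mouse_fa, prefix_fasta)

human_gtf = ["#!genome-build GRCh37", "1\tensembl\tgene\t1\t4\t.\t+\t.\tgene_id \"G1\";"]
mouse_gtf = ["1\tensembl\tgene\t1\t4\t.\t-\t.\tgene_id \"G2\";"]
combined_gtf = combine(human_gtf, mouse_gtf, prefix_gtf)
```

In practice the same functions would be applied to open file handles, and the combined FASTA and GTF would be passed to STAR's genome generation mode (`--runMode genomeGenerate` with `--genomeFastaFiles` and `--sjdbGTFfile`). After alignment, the prefix on the mapped chromosome indicates whether a read is of human (tumor) or mouse (stromal) origin.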
4. T-Bioinfo Analysis Steps
1. RNA-seq of 79 cancer samples: gene, isoform, and exon expression profiles of cancer
tumor/stroma samples.
2. Junk RNA on non-mapped reads from the previous RNA-seq step: repetitive elements and
Kchain abundances.
3. Machine Learning: an unsupervised BiAssociation and clustering approach allowed
identification of some specific samples. Batch effect correction was also used in this project.
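The slides do not state which batch effect correction method was applied. As an illustration of the general idea only, the Python sketch below uses per-batch mean centering, a deliberately simple stand-in technique: for each gene, the batch mean is subtracted and the overall mean is added back. The function name, data layout, sample names, and batch labels are all hypothetical.

```python
from collections import defaultdict


def center_by_batch(expr, batch_of):
    """Per-gene, per-batch mean centering (a simple stand-in, not the study's method).

    expr:     {sample: {gene: log-scale expression}}; every sample has every gene.
    batch_of: {sample: batch label}.
    Returns a new dict of the same shape with batch means aligned to the grand mean.
    """
    genes = sorted({g for values in expr.values() for g in values})
    batches = defaultdict(list)
    for sample in expr:
        batches[batch_of[sample]].append(sample)

    corrected = {sample: {} for sample in expr}
    for gene in genes:
        grand_mean = sum(expr[s][gene] for s in expr) / len(expr)
        for members in batches.values():
            batch_mean = sum(expr[s][gene] for s in members) / len(members)
            for s in members:
                corrected[s][gene] = expr[s][gene] - batch_mean + grand_mean
    return corrected


# Toy data: batch B is shifted upward by 4 units relative to batch A.
expr = {
    "s1": {"GENE1": 1.0},
    "s2": {"GENE1": 3.0},
    "s3": {"GENE1": 5.0},
    "s4": {"GENE1": 7.0},
}
batch_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}
corrected = center_by_batch(expr, batch_of)
```

After centering, both batches share the same mean for each gene while within-batch differences are preserved. Note that this approach would also remove real biological differences if batch is confounded with cancer type, which is one reason dedicated methods are usually preferred.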
5. Unsupervised analysis of transcriptome sequencing data
allowed for identification of the following:
● Batch effects across a number of samples, which were then corrected
● Samples with stroma specificity, identified by analysis of stromal
expression
● Cancer-specific samples, which are currently under investigation
● More updates soon!
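One simple way to flag stroma-rich samples is to compare how many reads map to mouse versus human sequences in each sample. The Python sketch below assumes reads were aligned to a combined reference whose mouse chromosomes carry a hypothetical `mouse_` prefix. The slides do not describe the actual procedure, so the function names, sample names, and counts here are invented for illustration.

```python
def stromal_fraction(counts_by_chrom, mouse_prefix="mouse_"):
    """Fraction of mapped reads assigned to mouse (stromal) chromosomes.

    counts_by_chrom: {chromosome name: mapped read count} for one sample.
    mouse_prefix:    assumed naming convention for mouse sequences.
    """
    total = sum(counts_by_chrom.values())
    if total == 0:
        return 0.0
    mouse = sum(n for chrom, n in counts_by_chrom.items() if chrom.startswith(mouse_prefix))
    return mouse / total


def rank_by_stroma(samples):
    """Return sample names ordered from highest to lowest stromal fraction."""
    return sorted(samples, key=lambda name: stromal_fraction(samples[name]), reverse=True)


# Toy read counts for two hypothetical PDX samples.
samples = {
    "PDX_A": {"human_1": 900, "mouse_1": 100},
    "PDX_B": {"human_1": 500, "mouse_1": 500},
}
fractions = {name: stromal_fraction(counts) for name, counts in samples.items()}
ranking = rank_by_stroma(samples)
```

Samples at the top of the ranking carry the largest share of mouse-derived reads and would be natural candidates for closer inspection of stromal expression.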
Preliminary Conclusions