MICROARRAYS AND
DATA ANALYSIS
FINAL PROJECT
Kiranmayee Bakshy
08/19/2014
Introduction
• Expression data from 46 cultured human ovarian
carcinoma cell lines with and without Cisplatin
treatment
• Array: A-AFFY-141 - Affymetrix GeneChip Human
Gene 1.0 ST Array [HuGene-1_0-st-v1] (GPL6244)
• Technology type: in situ oligonucleotide
• Experiment type: transcription profiling by array
• Samples: 171
• NCBI GEO accession no. GSE47856
(http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE47856)
Reference: Miow QH, Tan TZ, Ye J, Lau JA, Yokomizo T, Thiery JP, Mori S. Epithelial-mesenchymal status renders differential responses to cisplatin in ovarian cancer. Europe PMC 24858042
Background
• Chemoresistance to platinum-based anti-cancer drugs such as cisplatin is a critical obstacle in cancer treatment.
• Epithelial-mesenchymal transition (EMT) has been linked to drug resistance as a contributing mechanism.
• The study was designed to explore the connection between cellular responses to cisplatin and EMT in ovarian cancer.
• Expression microarrays were used to estimate EMT status as a binary phenotype.
• Bioassays measuring cell number, proliferation rate and apoptosis were conducted to quantify phenotypic responses to cisplatin treatment.
Data Analysis pipeline
• Load raw CEL files into R and normalize using RMA
• Outlier analysis (CV vs. mean plot, hierarchical clustering dendrogram and average correlation plot)
• Run statistical tests and fold change to select differentially expressed genes
• Dimensionality reduction/clustering (PCA)
• Classification (QDA)
• Report top 5 up-regulated and down-regulated gene names and their functions
Dataset:
Total no. of samples: 171
Total no. of probesets: 33297
Annotation classes:
Epithelial-like: 86 samples
Mesenchymal-like: 85 samples
Outlier analysis
[Figures: outlier-analysis plots; sample GSM1160845 is labeled as the outlier]
• The outlier GSM1160845 was removed from the data matrix
• 13707 probesets with low expression values (mean < 5) were also removed, leaving 19590
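The two clean-up steps above can be sketched in a few lines. This is a minimal illustration in Python rather than the R code used for the project, and it assumes that the average correlation plot is based on Pearson correlation between samples. The function names and the toy values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_correlations(samples):
    """Mean correlation of each sample with every other sample.

    `samples` maps a sample ID to its list of expression values.
    """
    ids = list(samples)
    return {
        i: sum(pearson(samples[i], samples[j]) for j in ids if j != i) / (len(ids) - 1)
        for i in ids
    }

def filter_low_expression(rows, cutoff=5.0):
    """Keep probesets whose mean expression is at least `cutoff`."""
    return {p: vals for p, vals in rows.items() if sum(vals) / len(vals) >= cutoff}

# Toy data (made up): three samples that agree and one that does not.
toy_samples = {
    "S1": [2.0, 4.0, 6.0, 8.0, 10.0],
    "S2": [2.1, 4.2, 5.9, 8.1, 9.8],
    "S3": [1.9, 3.8, 6.2, 7.9, 10.1],
    "S4": [9.0, 3.0, 7.0, 2.0, 5.0],
}
avg = average_correlations(toy_samples)
suspect = min(avg, key=avg.get)  # sample with the lowest average correlation

# Toy probesets (made up): only the first has a mean of at least 5.
toy_rows = {"probe_a": [6.0, 7.0, 8.0], "probe_b": [3.0, 4.0, 4.5]}
kept = filter_low_expression(toy_rows)
```

A sample whose average correlation sits well below the rest is a candidate outlier; the final decision in the slides also drew on the CV vs. mean plot and the dendrogram.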
Statistical analysis
Student's t-test and fold change
No. of probes with p-value:
  < 0.05: 9507
  < 0.01: 7074
  < 0.05/no. of probes: 2133
Linear fold change:
  Min.: -8.651798
  Max.: 33.05966
Threshold for selecting differentially expressed genes:
p-value < 0.05/no. of probes (Bonferroni correction)
and
log2 fold change magnitude > log2(2) = 1, i.e. more than a twofold change in either direction
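The selection rule can be sketched as follows. This is an illustrative Python version, not the project's R code. It assumes that fold change is measured on the log2 scale, that the cutoff applies in both directions, and that the Bonferroni denominator is the number of probes tested (with the 19590 probes left after filtering, 0.05/19590 is about 2.6e-6; the slides do not state which probe count was used). All names and values are hypothetical.

```python
def select_differential(pvalues, log2_fc, alpha=0.05, min_abs_log2_fc=1.0):
    """Return probe IDs passing both the Bonferroni and fold-change cutoffs.

    `pvalues` and `log2_fc` map probe IDs to a p-value and a log2 fold change.
    """
    bonferroni_cutoff = alpha / len(pvalues)  # alpha divided by number of tests
    return [
        probe
        for probe, p in pvalues.items()
        if p < bonferroni_cutoff and abs(log2_fc[probe]) > min_abs_log2_fc
    ]

# Toy values (made up). With 4 probes the Bonferroni cutoff is 0.05 / 4 = 0.0125.
toy_p = {"A": 1e-9, "B": 1e-9, "C": 0.04, "D": 1e-9}
toy_fc = {"A": 2.0, "B": 0.5, "C": 3.0, "D": -1.7}
selected = select_differential(toy_p, toy_fc)
```

Here probe B fails the fold-change cutoff and probe C fails the corrected p-value cutoff, so only A and D are kept.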
Visualization of differentially
expressed genes
656 differentially expressed genes were selected from the analysis based
on the threshold
Dimensionality reduction of
differentially expressed genes
Around 50% of the variability in the data can be explained by the first two principal components.
Principal component analysis
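The "around 50%" figure is a proportion of variance explained, which follows directly from the eigenvalues of the covariance matrix. The sketch below shows only that arithmetic, in Python; the eigenvalues are made up and are not taken from this dataset.

```python
from itertools import accumulate

def explained_variance_ratio(eigenvalues):
    """Share of total variance carried by each component, largest first."""
    total = sum(eigenvalues)
    return [ev / total for ev in sorted(eigenvalues, reverse=True)]

# Hypothetical eigenvalues for illustration only.
toy_eigenvalues = [4.0, 1.0, 3.0, 2.0]
ratios = explained_variance_ratio(toy_eigenvalues)
cumulative = list(accumulate(ratios))  # running total across components
first_two = cumulative[1]              # variance explained by PC1 + PC2
```

Reading off the cumulative share at the second component gives the kind of figure quoted on the slide.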
Spectral k-means clustering of 50 random
differentially expressed genes
Spectral k-means clustering is useful in this case because the variability can be
summarized well by a few eigenvectors.
Classification – Quadratic Discriminant Analysis
Confusion matrix from QDA (rows: actual membership; columns: predicted membership)

                    Epithelial-like   Mesenchymal-like
Epithelial-like           36                 0
Mesenchymal-like           0                34

• Training set: 50 epithelial-like, 50 mesenchymal-like
• Test set: 36 epithelial-like, 34 mesenchymal-like
• QDA was performed on the first three principal components of the training set.
QDA predicted all the samples of the test set correctly
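The confusion matrix and the resulting accuracy can be reproduced from label lists. The Python sketch below does not fit a QDA model; it only tabulates actual versus predicted labels, using the test-set counts reported on the slide. The helper names are hypothetical.

```python
def confusion_matrix(actual, predicted, labels):
    """Count samples for each (actual, predicted) label pair."""
    counts = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    return counts

def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

labels = ["Epithelial-like", "Mesenchymal-like"]
actual = ["Epithelial-like"] * 36 + ["Mesenchymal-like"] * 34
predicted = list(actual)  # the slide reports no misclassified test samples
cm = confusion_matrix(actual, predicted, labels)
acc = accuracy(cm)
```

The training and test sets together hold 100 + 70 = 170 samples, which matches the 171 original samples minus the one removed outlier.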
AFFYMETRIX_EXON_GENE_ID | GENE NAME | GENE SYMBOL | FUNCTION
8102792 | protocadherin 18 | PCDH18 | Potential calcium-dependent cell-adhesion protein
7899167 | lin-28 homolog (C. elegans) | LIN28A | Acts as a 'translational enhancer', driving specific mRNAs to polysomes and thus increasing the efficiency of protein synthesis
7906878 | discoidin domain receptor tyrosine kinase 2 | DDR2 | This tyrosine kinase receptor for fibrillar collagen mediates fibroblast migration and proliferation
7906900* | discoidin domain receptor tyrosine kinase 2 | DDR2 | This tyrosine kinase receptor for fibrillar collagen mediates fibroblast migration and proliferation
7926368 | vimentin | VIM | Class-III intermediate filaments found in mesenchymal cells

DAVID functional annotations of top 5 discriminant genes (Negative)
* Unmapped in DAVID; information obtained from NetAffx
No pathways or GO information was suggested by DAVID.
AFFYMETRIX_EXON_GENE_ID | GENE NAME | GENE SYMBOL | FUNCTION
8026490 | urothelial cancer associated 1 | UCA1 | Role in bladder cancer progression and embryonic development
8041853 | epithelial cell adhesion molecule | EPCAM | Carcinoma-associated antigen; EpCAM up-regulates c-myc and induces cell proliferation
8098439 | epithelial cell adhesion molecule | EPCAM | Carcinoma-associated antigen; EpCAM up-regulates c-myc and induces cell proliferation
8147351 | mal, T-cell differentiation protein 2 | MAL2 | Member of the machinery of polarized transport
8148040 | epithelial splicing regulatory protein 1 | ESRP1 | mRNA splicing factor that regulates the formation of epithelial cell-specific isoforms

DAVID functional annotations of top 5 discriminant genes (Positive)
No pathways or GO information was suggested by DAVID.
Conclusions:
• The outlier observed in this dataset was GSM1160845, a mesenchymal-like ovarian cancer cell line treated with cisplatin.
• 656 of 19590 genes were selected as differentially expressed based on the threshold.
• The QDA classification model trained on 100 samples correctly predicted the class of all 70 test-set samples.
• All of the top 5 positively and top 5 negatively regulated genes obtained in this analysis are involved in cellular processes such as cell adhesion, migration, proliferation and protein synthesis.
• The authors reported an epithelial gene set consisting of known epithelial cell markers such as DDR1, KRT8, KRT18, CDH1, CDH3, CLDN3, CLDN4 and EPCAM, and a mesenchymal gene set consisting of the known mesenchymal cell markers ZEB1, CDH2, VIM and TWIST1.
