Gargi Mukherjee … Rutgers University, New Jersey
Kevin Raines … Stanford University, California
Srikanth Sastry … JNC, Bengaluru, India
Sebastian Doniach … Stanford University, California
Gyan Bhanot … Rutgers University, New Jersey
Michael Biehl … University of Groningen, The Netherlands
Predicting Recurrence in Clear Cell
Renal Cell Carcinoma
Analysis of TCGA data using Outlier Analysis and GMLVQ
WCCI 2016, Vancouver / BC 2 /15
overview
gene expression in tumor cells
specific example: clear cell Renal Cell Carcinomas (ccRCC)
• outlier analysis: identification of a panel of prognostic genes
with respect to recurrence
• risk score: prediction of individual recurrence risk
based on outlier status w.r.t. selected genes
• machine learning: analysis of extreme cases of low / high risk
distance-based classification and relevance learning
(Generalized Matrix Relevance LVQ)
clinical data: recurrence-free intervals
clear cell Renal Cell Carcinoma (ccRCC)
publicly available datasets:
The Cancer Genome Atlas (TCGA) cancergenome.nih.gov
also hosted at Broad Institute gdac.broadinstitute.org
data
[Figure: data matrix of 20532 genes by samples: 65 normal samples and 469 tumor samples, 65 of which are matched to the normal samples]
clear cell renal cell carcinoma
TCGA data @ Broad Institute
mRNA-Seq expression data X
normalized, log-transformed:
Y=log(1+X)
65 normal samples
65 matched tumor samples
469 tumor samples in total
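A minimal sketch of the transform Y = log(1 + X), assuming the natural logarithm (the slide does not state the base) and a toy samples-by-genes table. The names `log_transform` and `raw` are illustrative only and do not come from the original analysis.

```python
import math

def log_transform(counts):
    """Apply Y = log(1 + X) element-wise to a samples-by-genes table."""
    # math.log1p(x) computes log(1 + x) accurately, also for small x
    return [[math.log1p(x) for x in row] for row in counts]

# Toy expression values: 2 samples, 3 genes (made-up numbers)
raw = [[0.0, 9.0, 99.0],
       [3.0, 0.0, 999.0]]
Y = log_transform(raw)
```

A zero count maps to zero, so genes that are not expressed in a sample stay at the origin after the transform.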
recurrence data:
[Figure: number of recurrences vs. days after diagnosis]
outlier analysis
randomized split of the 469 tumor samples: 380 training samples, 89 test samples
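One way such a randomized split could be drawn is sketched below. The fixed seed, the integer sample identifiers and the name `random_split` are assumptions made for illustration; the slides do not say how the split was generated.

```python
import random

def random_split(sample_ids, n_train, seed=0):
    """Shuffle the sample identifiers and cut them into training and test parts."""
    rng = random.Random(seed)      # fixed seed only to make the sketch reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# 469 tumor samples, identified here simply by their index
train_ids, test_ids = random_split(range(469), n_train=380)
```

The two parts are disjoint by construction, so no test sample contributes to the gene selection on the training set.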
per gene: determine mean μ and standard deviation σ of Y
for each gene: identify outlier samples
Y > μ + σ: "high outlier"
Y < μ - σ: "low outlier"
restrict the following analysis to genes with
≥ 20 high outlier samples
or ≥ 20 low outlier samples
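The per-gene outlier rule and the "at least 20 outliers" filter can be sketched as follows. The helper names and the toy values are hypothetical, and the population standard deviation is an assumption, since the slides do not specify which estimator was used.

```python
import statistics

def outlier_status(values):
    """Per sample: +1 for a high outlier, -1 for a low outlier, 0 otherwise (one gene)."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)   # assumption: population standard deviation
    return [1 if y > mu + sigma else -1 if y < mu - sigma else 0 for y in values]

def keep_gene(status, min_outliers=20):
    """Keep a gene if it has enough high outliers or enough low outliers."""
    n_high = sum(1 for s in status if s == 1)
    n_low = sum(1 for s in status if s == -1)
    return n_high >= min_outliers or n_low >= min_outliers

# Toy gene: 30 samples at the mean, 5 clearly below, 5 clearly above
y_gene = [5.0] * 30 + [1.0] * 5 + [9.0] * 5
status = outlier_status(y_gene)
```

With only 5 high and 5 low outliers, this toy gene would be discarded by the default threshold of 20.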
construct two binary outlier matrices (380 samples each):
- high-outlier matrix: "1" for high-outlier samples, "0" else
- low-outlier matrix: "1" for low-outlier samples, "0" else
Kaplan-Meier (KM) analysis per gene:
test for significant association of outlier status of samples with recurrence
- 1546 "high-outlier genes" with KM log rank p < 0.001
- 1628 "low-outlier genes" with KM log rank p < 0.0005
→ PCA
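The per-gene test can be illustrated with a small two-group log-rank test written in plain Python. This is a sketch of the standard test, not the authors' code: `group` stands for one gene's column of a binary outlier matrix, and the follow-up times and event flags are made up.

```python
import math

def logrank_p(times, events, group):
    """Two-group log-rank test.

    times  : follow-up time per sample (e.g. days after diagnosis)
    events : 1 if a recurrence was observed at that time, 0 if censored
    group  : 1 if the sample is an outlier for the gene, 0 otherwise
    """
    observed = expected = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [g for tt, g in zip(times, group) if tt >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for tt, e in zip(times, events) if tt == t and e)
        d1 = sum(1 for tt, e, g in zip(times, events, group) if tt == t and e and g)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance of the outlier-group event count
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0  # no information, e.g. one of the groups is empty
    chi2 = (observed - expected) ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail probability, 1 d.o.f.

# Toy data: the 5 outlier samples recur early, the other 5 are censored late
times = [100, 200, 300, 400, 500, 1500, 1600, 1700, 1800, 1900]
events = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
group = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p = logrank_p(times, events, group)
```

Running this once per gene and keeping the genes below the chosen p-value threshold would mirror the selection step described on the slide.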
PCA reveals four clusters of genes (A, B, C, D)
[Figure: PCA of the high-outlier genes (two clusters of 71 and 1475 genes) and of the low-outlier genes (two clusters of 226 and 1402 genes)]
genes in small clusters (B,D):
outlier status associated
with late recurrence
genes in large clusters (A,C):
outlier status associated
with early recurrence
recurrence risk score
top 20 genes (by KM p-value) from each cluster A, B, C, D
→ reference set of 80 genes
for each sample:
- determine outlier status with respect to the 80 genes (compare Y with μ ± σ)
- add up contributions per gene:
  -1 if the sample is an outlier w.r.t. a gene in A or C (early rec.)
   0 if the sample is not an outlier w.r.t. the gene
  +1 if the sample is an outlier w.r.t. a gene in B or D (late rec.)
recurrence risk score: -40 ≤ R ≤ +40
observe: median = 2 over the 380 training samples
crisp classification w.r.t. recurrence risk:
high risk (early recurrence) if R < 2
low risk (late recurrence) if R ≥ 2
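The scoring rule can be written down in a few lines. In this sketch the outlier status per gene is assumed to be already known, and the gene names `g1` to `g4`, the dictionaries and the function names are hypothetical placeholders for the 80-gene panel.

```python
def risk_score(outlier_flags, gene_cluster):
    """Sum the per-gene contributions: -1 for clusters A/C, +1 for clusters B/D.

    outlier_flags : dict gene -> True if the sample is an outlier w.r.t. that gene
    gene_cluster  : dict gene -> "A", "B", "C" or "D"
    """
    score = 0
    for gene, is_outlier in outlier_flags.items():
        if not is_outlier:
            continue                       # non-outlier genes contribute 0
        score += -1 if gene_cluster[gene] in ("A", "C") else 1
    return score

def risk_class(score, threshold=2):
    """Crisp classification at the training-set median R = 2."""
    return "high risk" if score < threshold else "low risk"

# Toy panel with one gene per cluster instead of 20
gene_cluster = {"g1": "A", "g2": "B", "g3": "C", "g4": "D"}
outlier_flags = {"g1": True, "g2": True, "g3": False, "g4": True}
R = risk_score(outlier_flags, gene_cluster)
label = risk_class(R)
```

With 40 genes contributing -1 and 40 contributing +1 at most, the score stays within the stated range of -40 to +40.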
recurrence risk prediction
KM plots with respect to high / low risk groups:
training set (380 samples): log rank p < 1.e-16
test set (89 samples): log rank p < 1.e-4
• risk score R is predictive of the actual recurrence risk
• the 80 selected genes can serve as a prognostic panel
extreme case analysis
[Figure: number of recurrences vs. time after diagnosis]
recurrence ≤ 2 years (early): 109 samples, class 2, high risk
recurrence > 5 years (late or no recurrence): 107 samples, class 1, low risk
all other samples: (undefined)
2 classes:
• 80-dim. feature vectors (gene expression)
• representation by one prototype vector per class
• adaptive distance measure with relevance matrix Λ for comparison of samples and prototypes
• distance-based classification, e.g. Nearest Prototype Classifier (NPC)
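The formulas on this slide did not survive extraction. The sketch below uses the form of the adaptive distance that is standard in the GMLVQ literature, d(x, w) = (x - w)^T Λ (x - w) with Λ = Ω^T Ω, and should be read as an assumption about what the slide showed. The two-dimensional prototypes and the matrices are made-up toy values.

```python
def adaptive_distance(x, w, omega):
    """d(x, w) = (x - w)^T Λ (x - w) with Λ = Ω^T Ω, computed as |Ω (x - w)|^2."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)

def nearest_prototype(x, prototypes, omega):
    """Nearest Prototype Classifier: return the label of the closest prototype."""
    return min(prototypes, key=lambda label: adaptive_distance(x, prototypes[label], omega))

# Toy setting: 2 features instead of 80, one prototype per class
prototypes = {1: [0.0, 0.0], 2: [2.0, 4.0]}
x = [1.5, 0.0]

identity = [[1.0, 0.0], [0.0, 1.0]]        # plain squared Euclidean distance
first_only = [[1.0, 0.0], [0.0, 0.0]]      # relevance matrix that ignores feature 2

label_euclidean = nearest_prototype(x, prototypes, identity)
label_adapted = nearest_prototype(x, prototypes, first_only)
```

The same sample is assigned to different classes under the two matrices, which is the reason for adapting Λ during training instead of fixing the metric in advance.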
GMLVQ classifier
Generalized Matrix Relevance Learning Vector Quantization (GMLVQ)
training of prototypes and relevance matrix
= minimization of an appropriate cost function
with respect to performance on labeled training set
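The slide does not spell out the cost function. A common choice in the GMLVQ literature is the GLVQ cost, which sums (d_J - d_K) / (d_J + d_K) over the training samples, where d_J is the distance to the closest prototype of the correct class and d_K the distance to the closest prototype of any other class. The sketch below only evaluates this cost for fixed prototypes and a fixed matrix Ω; it does not perform the training, and all names and toy values are illustrative assumptions.

```python
def sq_dist(x, w, omega):
    """Adaptive squared distance |Ω (x - w)|^2."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    return sum(sum(o * d for o, d in zip(row, diff)) ** 2 for row in omega)

def glvq_cost(samples, labels, prototypes, omega):
    """Sum of (d_J - d_K) / (d_J + d_K); each term lies between -1 and +1."""
    total = 0.0
    for x, y in zip(samples, labels):
        d_correct = sq_dist(x, prototypes[y], omega)
        d_wrong = min(sq_dist(x, w, omega) for c, w in prototypes.items() if c != y)
        # assumes d_correct + d_wrong > 0, i.e. the prototypes do not coincide
        total += (d_correct - d_wrong) / (d_correct + d_wrong)
    return total

# Toy data: two samples sitting exactly on the two prototypes
prototypes = {1: [0.0, 0.0], 2: [2.0, 0.0]}
omega = [[1.0, 0.0], [0.0, 1.0]]
samples = [[0.0, 0.0], [2.0, 0.0]]

cost_correct = glvq_cost(samples, [1, 2], prototypes, omega)   # labels match
cost_swapped = glvq_cost(samples, [2, 1], prototypes, omega)   # labels swapped
```

Negative terms correspond to correctly classified samples, so lowering the cost moves prototypes and Ω toward better performance on the labeled training set.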
[Figure: two panels over the genes of clusters A, B, C, D: components (low expression | high expression) and diagonal elements of Λ]
GMLVQ classifier
ROC of GMLVQ classifier (Leave-One-Out of the 216 extreme samples)
KM plot w.r.t. all 469 samples
( L-1-O for 216 samples, plus 253 undefined )
log rank p < 1.e-7
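An ROC curve is often summarized by the area under the curve (AUC). A minimal sketch of its computation from classifier scores is given below; it assumes that each score was obtained with the corresponding sample held out (Leave-One-Out) and that a higher score indicates the positive class. The scores and labels are made up.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy held-out scores for 2 positive and 2 negative samples
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = roc_auc(scores, labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.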
extreme case analysis (107+109 samples)
[Figure: ROC curves of the GMLVQ classifier and the risk score classifier; legend: AUC=0.84; marked point: R=2]
the set of 80 genes is also diagnostic:
• GMLVQ separates normal from tumor cells (close to) perfectly
• PCA of corresponding gene expressions:
65 normal samples
105 low risk samples (late recurrence)
109 high risk samples (early recurrence)
gradient from normal to high risk:
diagnostics?
remarks and open questions
• GMLVQ suggests an even smaller panel of prognostic genes (12?)
  identify a minimum panel for diagnostics and prognostics
• 80 genes do not necessarily reflect biological mechanisms
  compare, e.g., with known pathways / modules of genes
• prospective studies required with respect to use as an assay
• can the performance be improved further?
  study more sophisticated classifier systems
  include further clinical information (diet, lifestyle, family history, … )
• more direct, multivariate identification of relevant genes?
  e.g. PCA+GMLVQ and back-transform
easy-to-use GMLVQ-classifier: www.cs.rug.nl/~biehl/gmlvq

More Related Content

Similar to 2016: Predicting Recurrence in Clear Cell Renal Cell Carcinoma

coad_machine_learning
coad_machine_learningcoad_machine_learning
coad_machine_learningFord Sleeman
 
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...QIAGEN
 
NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...
NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...
NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...European School of Oncology
 
20160219 - S. De Toffol - Dal Sanger al NGS nello studio delle mutazioni BRCA
20160219 - S. De Toffol -  Dal Sanger al NGS nello studio delle mutazioni BRCA �20160219 - S. De Toffol -  Dal Sanger al NGS nello studio delle mutazioni BRCA �
20160219 - S. De Toffol - Dal Sanger al NGS nello studio delle mutazioni BRCA Roberto Scarafia
 
BRITEREU_finalposter
BRITEREU_finalposterBRITEREU_finalposter
BRITEREU_finalposterElsa Fecke
 
Radiomics and Deep Learning for Lung Cancer Screening
Radiomics and Deep Learning for Lung Cancer ScreeningRadiomics and Deep Learning for Lung Cancer Screening
Radiomics and Deep Learning for Lung Cancer ScreeningWookjin Choi
 
Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...
Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...
Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...QIAGEN
 
IMM_752_kSORT_Whitepaper_2016_revfinal_NoCrops
IMM_752_kSORT_Whitepaper_2016_revfinal_NoCropsIMM_752_kSORT_Whitepaper_2016_revfinal_NoCrops
IMM_752_kSORT_Whitepaper_2016_revfinal_NoCropsKevin Jaglinski
 
How to do successful gene expression analysis - Siena 20100625
How to do successful gene expression analysis - Siena 20100625How to do successful gene expression analysis - Siena 20100625
How to do successful gene expression analysis - Siena 20100625Biogazelle
 
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation SourcesUpdates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation SourcesGolden Helix
 
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation SourcesUpdates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation SourcesDelaina Hawkins
 
Bioinformatics-R program의 실례
Bioinformatics-R program의 실례Bioinformatics-R program의 실례
Bioinformatics-R program의 실례mothersafe
 
The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...
The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...
The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...ExternalEvents
 
Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...
Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...
Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...vshidham
 
Pharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced CardiotoxicityPharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced CardiotoxicityGolden Helix
 
Pharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced CardiotoxicityPharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced CardiotoxicityGolden Helix Inc
 
Population-Based DNA Variant Analysis
Population-Based DNA Variant AnalysisPopulation-Based DNA Variant Analysis
Population-Based DNA Variant AnalysisGolden Helix
 
Q biomarkersomaticmutation
Q biomarkersomaticmutationQ biomarkersomaticmutation
Q biomarkersomaticmutationElsa von Licy
 
Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...Despoina Kalfakakou
 

Similar to 2016: Predicting Recurrence in Clear Cell Renal Cell Carcinoma (20)

coad_machine_learning
coad_machine_learningcoad_machine_learning
coad_machine_learning
 
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
Step by Step, from Liquid Biopsy to a Genomic Biomarker: Liquid Biopsy Series...
 
NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...
NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...
NY Prostate Cancer Conference - K. Touijer - Session 4: Predicting clinical a...
 
20160219 - S. De Toffol - Dal Sanger al NGS nello studio delle mutazioni BRCA
20160219 - S. De Toffol -  Dal Sanger al NGS nello studio delle mutazioni BRCA �20160219 - S. De Toffol -  Dal Sanger al NGS nello studio delle mutazioni BRCA �
20160219 - S. De Toffol - Dal Sanger al NGS nello studio delle mutazioni BRCA
 
BRITEREU_finalposter
BRITEREU_finalposterBRITEREU_finalposter
BRITEREU_finalposter
 
Radiomics and Deep Learning for Lung Cancer Screening
Radiomics and Deep Learning for Lung Cancer ScreeningRadiomics and Deep Learning for Lung Cancer Screening
Radiomics and Deep Learning for Lung Cancer Screening
 
Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...
Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...
Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t...
 
IMM_752_kSORT_Whitepaper_2016_revfinal_NoCrops
IMM_752_kSORT_Whitepaper_2016_revfinal_NoCropsIMM_752_kSORT_Whitepaper_2016_revfinal_NoCrops
IMM_752_kSORT_Whitepaper_2016_revfinal_NoCrops
 
Project_702
Project_702Project_702
Project_702
 
How to do successful gene expression analysis - Siena 20100625
How to do successful gene expression analysis - Siena 20100625How to do successful gene expression analysis - Siena 20100625
How to do successful gene expression analysis - Siena 20100625
 
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation SourcesUpdates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
 
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation SourcesUpdates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
Updates to VSClinical ACMG Guidelines & a Tour of Cancer Annotation Sources
 
Bioinformatics-R program의 실례
Bioinformatics-R program의 실례Bioinformatics-R program의 실례
Bioinformatics-R program의 실례
 
The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...
The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...
The National Center for Biotechnology Information (NCBI) Pathogen Analysis Pi...
 
Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...
Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...
Recent Advances in Pathologic Evaluation of Melanoma Sentinel Lymph Nodes. Sl...
 
Pharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced CardiotoxicityPharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
 
Pharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced CardiotoxicityPharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
Pharmacogenomic Prediction of Antracycline-induced Cardiotoxicity
 
Population-Based DNA Variant Analysis
Population-Based DNA Variant AnalysisPopulation-Based DNA Variant Analysis
Population-Based DNA Variant Analysis
 
Q biomarkersomaticmutation
Q biomarkersomaticmutationQ biomarkersomaticmutation
Q biomarkersomaticmutation
 
Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...Bioinformatics as a tool for understanding clinically significant variations ...
Bioinformatics as a tool for understanding clinically significant variations ...
 

More from University of Groningen

Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024University of Groningen
 
Evidence for tissue and stage-specific composition of the ribosome: machine l...
Evidence for tissue and stage-specific composition of the ribosome: machine l...Evidence for tissue and stage-specific composition of the ribosome: machine l...
Evidence for tissue and stage-specific composition of the ribosome: machine l...University of Groningen
 
The statistical physics of learning revisted: Phase transitions in layered ne...
The statistical physics of learning revisted: Phase transitions in layered ne...The statistical physics of learning revisted: Phase transitions in layered ne...
The statistical physics of learning revisted: Phase transitions in layered ne...University of Groningen
 
Interpretable machine-learning (in endocrinology and beyond)
Interpretable machine-learning (in endocrinology and beyond)Interpretable machine-learning (in endocrinology and beyond)
Interpretable machine-learning (in endocrinology and beyond)University of Groningen
 
2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...University of Groningen
 
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...University of Groningen
 
2020: So you thought the ribosome was constant and conserved ...
2020: So you thought the ribosome was constant and conserved ... 2020: So you thought the ribosome was constant and conserved ...
2020: So you thought the ribosome was constant and conserved ... University of Groningen
 
Prototype-based classifiers and their applications in the life sciences
Prototype-based classifiers and their applications in the life sciencesPrototype-based classifiers and their applications in the life sciences
Prototype-based classifiers and their applications in the life sciencesUniversity of Groningen
 
Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learningUniversity of Groningen
 
The statistical physics of learning - revisited
The statistical physics of learning - revisitedThe statistical physics of learning - revisited
The statistical physics of learning - revisitedUniversity of Groningen
 
2013: Sometimes you can trust a rat - The sbv improver species translation ch...
2013: Sometimes you can trust a rat - The sbv improver species translation ch...2013: Sometimes you can trust a rat - The sbv improver species translation ch...
2013: Sometimes you can trust a rat - The sbv improver species translation ch...University of Groningen
 
2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classification2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classificationUniversity of Groningen
 
2015: Distance based classifiers: Basic concepts, recent developments and app...
2015: Distance based classifiers: Basic concepts, recent developments and app...2015: Distance based classifiers: Basic concepts, recent developments and app...
2015: Distance based classifiers: Basic concepts, recent developments and app...University of Groningen
 
2016: Classification of FDG-PET Brain Data
2016: Classification of FDG-PET Brain Data2016: Classification of FDG-PET Brain Data
2016: Classification of FDG-PET Brain DataUniversity of Groningen
 

More from University of Groningen (20)

Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
Interpretable machine learning in endocrinology, M. Biehl, APPIS 2024
 
ESE-Eyes-2023.pdf
ESE-Eyes-2023.pdfESE-Eyes-2023.pdf
ESE-Eyes-2023.pdf
 
APPIS-FDGPET.pdf
APPIS-FDGPET.pdfAPPIS-FDGPET.pdf
APPIS-FDGPET.pdf
 
stat-phys-appis-reduced.pdf
stat-phys-appis-reduced.pdfstat-phys-appis-reduced.pdf
stat-phys-appis-reduced.pdf
 
prototypes-AMALEA.pdf
prototypes-AMALEA.pdfprototypes-AMALEA.pdf
prototypes-AMALEA.pdf
 
stat-phys-AMALEA.pdf
stat-phys-AMALEA.pdfstat-phys-AMALEA.pdf
stat-phys-AMALEA.pdf
 
Evidence for tissue and stage-specific composition of the ribosome: machine l...
Evidence for tissue and stage-specific composition of the ribosome: machine l...Evidence for tissue and stage-specific composition of the ribosome: machine l...
Evidence for tissue and stage-specific composition of the ribosome: machine l...
 
The statistical physics of learning revisted: Phase transitions in layered ne...
The statistical physics of learning revisted: Phase transitions in layered ne...The statistical physics of learning revisted: Phase transitions in layered ne...
The statistical physics of learning revisted: Phase transitions in layered ne...
 
Interpretable machine-learning (in endocrinology and beyond)
Interpretable machine-learning (in endocrinology and beyond)Interpretable machine-learning (in endocrinology and beyond)
Interpretable machine-learning (in endocrinology and beyond)
 
Biehl hanze-2021
Biehl hanze-2021Biehl hanze-2021
Biehl hanze-2021
 
2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...2020: Prototype-based classifiers and relevance learning: medical application...
2020: Prototype-based classifiers and relevance learning: medical application...
 
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
2020: Phase transitions in layered neural networks: ReLU vs. sigmoidal activa...
 
2020: So you thought the ribosome was constant and conserved ...
2020: So you thought the ribosome was constant and conserved ... 2020: So you thought the ribosome was constant and conserved ...
2020: So you thought the ribosome was constant and conserved ...
 
Prototype-based classifiers and their applications in the life sciences
Prototype-based classifiers and their applications in the life sciencesPrototype-based classifiers and their applications in the life sciences
Prototype-based classifiers and their applications in the life sciences
 
Prototype-based models in machine learning
Prototype-based models in machine learningPrototype-based models in machine learning
Prototype-based models in machine learning
 
The statistical physics of learning - revisited
The statistical physics of learning - revisitedThe statistical physics of learning - revisited
The statistical physics of learning - revisited
 
2013: Sometimes you can trust a rat - The sbv improver species translation ch...
2013: Sometimes you can trust a rat - The sbv improver species translation ch...2013: Sometimes you can trust a rat - The sbv improver species translation ch...
2013: Sometimes you can trust a rat - The sbv improver species translation ch...
 
2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classification2013: Prototype-based learning and adaptive distances for classification
2013: Prototype-based learning and adaptive distances for classification
 
2015: Distance based classifiers: Basic concepts, recent developments and app...
2015: Distance based classifiers: Basic concepts, recent developments and app...2015: Distance based classifiers: Basic concepts, recent developments and app...
2015: Distance based classifiers: Basic concepts, recent developments and app...
 
2016: Classification of FDG-PET Brain Data
2016: Classification of FDG-PET Brain Data2016: Classification of FDG-PET Brain Data
2016: Classification of FDG-PET Brain Data
 

Recently uploaded

Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 

Recently uploaded (20)

Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
2016: Predicting Recurrence in Clear Cell Renal Cell Carcinoma

  • 1. Predicting Recurrence in Clear Cell Renal Cell Carcinoma: Analysis of TCGA data using Outlier Analysis and GMLVQ. Authors: Gargi Mukherjee (Rutgers University, New Jersey); Kevin Raines (Stanford University, California); Srikanth Sastry (JNC, Bengaluru, India); Sebastian Doniach (Stanford University, California); Gyan Bhanot (Rutgers University, New Jersey); Michael Biehl (University of Groningen, The Netherlands).
  • 2. WCCI 2016, Vancouver / BC. Overview. Data: gene expression in tumor cells and clinical data (recurrence-free intervals); specific example: clear cell Renal Cell Carcinoma (ccRCC). Outlier analysis: identification of a panel of prognostic genes with respect to recurrence. Risk score: prediction of individual recurrence risk based on outlier status w.r.t. selected genes. Machine learning: analysis of extreme cases of low / high risk; distance-based classification and relevance learning (Generalized Matrix Relevance LVQ).
  • 3. Data: clear cell Renal Cell Carcinoma (ccRCC). Publicly available datasets: The Cancer Genome Atlas (TCGA), cancergenome.nih.gov; also hosted at the Broad Institute, gdac.broadinstitute.org.
  • 4. Data: clear cell renal cell carcinoma, TCGA data @ Broad Institute. mRNA-Seq expression data X for 20532 genes, normalized and log-transformed: Y = log(1 + X). 65 normal samples; 65 matched tumor samples; 469 tumor samples in total. Recurrence data: days after diagnosis. [Figure: number of recurrences.]
  • 5. Outlier analysis: randomized split into 380 training samples and 89 test samples.
  • 6. Outlier analysis (380 training samples). Per gene: determine mean μ and standard deviation σ of Y. For each gene, identify outlier samples: Y > μ + σ is a “high outlier”; Y < μ - σ is a “low outlier”. Restrict the following analysis to genes with ≥ 20 high-outlier samples or ≥ 20 low-outlier samples.
  • 7. Outlier analysis. Kaplan-Meier (KM) analysis per gene: test for significant association of outlier status of samples with recurrence. Result: 1546 “high-outlier genes” with KM log rank p < 0.001; 1628 “low-outlier genes” with KM log rank p < 0.0005. Construct two binary outlier matrices (380 samples × 1546 genes and 380 samples × 1628 genes): “1” for high-outlier samples, “0” else; “1” for low-outlier samples, “0” else. Next step: PCA.
  • 8. Outlier analysis: PCA reveals four clusters of genes, labeled A, B, C, D. Cluster sizes: 71 and 1475 (high-outlier genes, 1546 in total); 226 and 1402 (low-outlier genes, 1628 in total). Genes in the small clusters (B, D): outlier status associated with late recurrence. Genes in the large clusters (A, C): outlier status associated with early recurrence.
  • 9. Recurrence risk score. Top 20 genes (by KM p-value) from each cluster A, B, C, D: reference set of 80 genes. For each sample: determine the outlier status with respect to the 80 genes (Y > μ + σ or Y < μ - σ) and add up contributions per gene: -1 if the sample is an outlier w.r.t. a gene in A or C (early recurrence); 0 if the sample is not an outlier w.r.t. the gene; +1 if the sample is an outlier w.r.t. a gene in B or D (late recurrence). Recurrence risk score: -40 ≤ R ≤ +40. Observe: median R = 2 over the 380 training samples. Crisp classification w.r.t. recurrence risk: high risk (early recurrence) if R < 2; low risk (late recurrence) if R ≥ 2.
  • 10. Recurrence risk prediction. KM plots with respect to high / low risk groups: training set (380 samples), log rank p < 1.e-16; test set (89 samples), log rank p < 1.e-4. The risk score R is predictive of the actual recurrence risk. The 80 selected genes can serve as a prognostic panel.
  • 11. Extreme case analysis. Number of recurrences: ≤ 2 years (early): 109 samples, class 2, high risk; > 5 years (late or no recurrence): 107 samples, class 1, low risk; in between: undefined. 2 classes: 80-dim. feature vectors (gene expression); representation by one prototype vector per class; adaptive distance measure with relevance matrix for comparison of samples and prototypes; distance-based classification, e.g. Nearest Prototype Classifier (NPC).
  • 12. GMLVQ classifier: Generalized Matrix Relevance Learning Vector Quantization (GMLVQ). Training of prototypes and relevance matrix = minimization of an appropriate cost function with respect to performance on the labeled training set. [Figure panels over gene clusters A, B, C, D: components of …; diagonal elements of Λ; low expression | high expression.]
  • 13. GMLVQ classifier. ROC of the GMLVQ classifier (Leave-One-Out over the 216 extreme samples). KM plot w.r.t. all 469 samples (Leave-One-Out for 216 samples, plus 253 undefined): log rank p < 1.e-7.
  • 14. Extreme case analysis (107 + 109 samples). [ROC figure: GMLVQ classifier; risk score classifier; AUC = 0.84; marked point R = 2.]
  • 15. The set of 80 genes is also diagnostic: GMLVQ separates normal from tumor cells (close to) perfectly. PCA of the corresponding gene expressions: 65 normal samples; 105 low risk samples (late recurrence); 109 high risk samples (early recurrence). Gradient from normal to high risk: diagnostics?
  • 16. Remarks and open questions. Prospective studies are required with respect to use as an assay. Can the performance be improved further? Study more sophisticated classifier systems; include further clinical information (diet, life style, family history, …). GMLVQ suggests an even smaller panel of prognostic genes (12?): identify a minimum panel for diagnostics and prognostics. The 80 genes do not necessarily reflect biological mechanisms: compare, e.g., with known pathways / modules of genes. More direct, multivariate identification of relevant genes? E.g. PCA + GMLVQ and back-transform. Easy-to-use GMLVQ classifier: www.cs.rug.nl/~biehl/gmlvq
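
The outlier rule from slides 4 to 6 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names, the synthetic data, the array layout (one row per sample, one column per gene), and the use of the population standard deviation (ddof=0) are not taken from the slides.

```python
import numpy as np

def outlier_flags(Y):
    """Flag high/low outlier samples per gene.

    Y: (n_samples, n_genes) array of log-transformed expression, Y = log(1 + X).
    Returns two boolean arrays of the same shape as Y.
    """
    mu = Y.mean(axis=0)            # per-gene mean
    sigma = Y.std(axis=0)          # per-gene std; ddof=0 is an assumption
    high = Y > mu + sigma          # "high outlier"
    low = Y < mu - sigma           # "low outlier"
    return high, low

def genes_with_enough_outliers(high, low, min_outliers=20):
    """Keep genes with at least min_outliers high (or low) outlier samples."""
    keep_high = high.sum(axis=0) >= min_outliers
    keep_low = low.sum(axis=0) >= min_outliers
    return keep_high, keep_low

# Synthetic stand-in for the 380 training samples (not TCGA data).
rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, scale=50.0, size=(380, 100))
Y = np.log1p(X)                    # Y = log(1 + X)

high, low = outlier_flags(Y)
keep_high, keep_low = genes_with_enough_outliers(high, low)
```

The boolean arrays `high[:, keep_high]` and `low[:, keep_low]` then play the role of the two binary outlier matrices on slide 7 ("1" for outlier samples, "0" else), before any Kaplan-Meier filtering.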
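
The risk score on slide 9 is a signed count over the 80 reference genes. A minimal sketch, assuming the outlier status has already been determined per sample and gene (using μ and σ from the training set), and assuming a boolean mask that marks which of the 80 genes belong to clusters A or C. The names `risk_score`, `classify`, `is_outlier` and `early_mask` are hypothetical.

```python
import numpy as np

def risk_score(is_outlier, early_mask):
    """Recurrence risk score R per sample.

    is_outlier: (n_samples, n_genes) boolean outlier status.
    early_mask: (n_genes,) boolean, True for genes in clusters A or C
                (outlier status associated with early recurrence).
    Contribution per gene: -1 (outlier, A or C), +1 (outlier, B or D), 0 else.
    """
    weights = np.where(early_mask, -1, 1)
    return (is_outlier.astype(int) * weights).sum(axis=1)

def classify(R, threshold=2):
    """High risk if R < threshold, low risk otherwise (threshold 2 = training median on the slides)."""
    return np.where(R < threshold, "high", "low")

# Toy example: 40 genes from A/C followed by 40 genes from B/D.
early_mask = np.array([True] * 40 + [False] * 40)
is_outlier = np.zeros((3, 80), dtype=bool)
is_outlier[0, 40:] = True   # outlier in every B/D gene
is_outlier[1, :40] = True   # outlier in every A/C gene
# sample 2 is not an outlier for any gene

R = risk_score(is_outlier, early_mask)
groups = classify(R)
```

With 40 genes on each side, the score is bounded by -40 ≤ R ≤ +40, as stated on the slide. Note that a sample with no outlier status at all gets R = 0 and therefore falls into the high-risk group under the R < 2 rule.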
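
The formulas on slides 11 and 12 did not survive in the transcript. The sketch below therefore assumes the standard formulation from the GMLVQ literature, which may differ in detail from what the slides showed: the adaptive distance d(w, x) = (x - w)^T Λ (x - w) with Λ = Ω^T Ω, and a cost built from the relative distance difference (d_J - d_K) / (d_J + d_K), where d_J is the distance to the closest prototype of the correct class and d_K to the closest prototype of any other class. Function names and the toy data are hypothetical, and no training loop is shown.

```python
import numpy as np

def gmlvq_distance(x, w, omega):
    """d(w, x) = (x - w)^T Lambda (x - w) with Lambda = Omega^T Omega."""
    diff = omega @ (x - w)
    return float(diff @ diff)

def nearest_prototype(x, prototypes, labels, omega):
    """Nearest Prototype Classifier: label of the closest prototype."""
    distances = [gmlvq_distance(x, w, omega) for w in prototypes]
    return labels[int(np.argmin(distances))]

def relative_distance(x, y, prototypes, labels, omega):
    """(d_J - d_K) / (d_J + d_K); negative values mean x is classified correctly."""
    d = np.array([gmlvq_distance(x, w, omega) for w in prototypes])
    correct = np.array(labels) == y
    d_J = d[correct].min()     # closest prototype with the correct label
    d_K = d[~correct].min()    # closest prototype with a different label
    return (d_J - d_K) / (d_J + d_K)

# Toy example in 2 dimensions (the slides use 80-dim. feature vectors).
prototypes = np.array([[0.0, 0.0], [2.0, 0.0]])  # one prototype per class
labels = [1, 2]                                  # class 1: low risk, class 2: high risk
omega = np.eye(2)                                # Lambda = identity: plain squared Euclidean

x = np.array([0.5, 0.0])
predicted = nearest_prototype(x, prototypes, labels, omega)
mu = relative_distance(x, 1, prototypes, labels, omega)
```

In this formulation, training adapts the prototypes and Ω so that the sum of (a monotonic function of) these relative distances over the labeled training set becomes small; the diagonal elements of Λ can then be read as per-gene relevances.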