Oncology oracle

Oncology Oracle
Gregory Koytiger

Use cell line data to better match cancer patients and
therapies

Cell lines are a rich source of drug sensitivity data,
enables the training of 98 individual drug models
39, 571 Unique Drug - Cell line
Sensitivity Values
Predict
675 Human cancer cell lines with
RNA-seq Gene Expression

Linear models provide ease of interpretability and
transparency
Drug Sensitivity
Linear Model

Ridge regression keeps only a few significant genes
in model
Drug Sensitivity
Linear Model
Ridge Regression

Automatic Relevance Determnation Regression tunes
model sparsity from data
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance
Determination Regression

Start with only the most highly varying genes
8000 of the highest varying genes in the Genentech data were chosen
as features to reduce noise and for computational tractability
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance

Docetaxel model separates sensitive and resistant
breast cancer patients with 0.87 ROC AUC
False Positive Rate
True Positive
Rate
Increasing Predicted Docetaxel Sensitivity
91% of sensitive patients treated
85% of resistant patients avoid unnecessary treatment

Model predicts Cisplatin response in Lung Squamous
Cell Carcinoma

About Me
©2014NatureAmerica,Inc.Allrightsreserved.
T E C H N I C A L R E P O RT S
Functional interpretation of genomic variation is critical
to understanding human disease, but it remains difficult to
predict the effects of specific mutations on protein interaction
networks and the phenotypes they regulate. We describe an
analytical framework based on multiscale statistical mechanics
that integrates genomic and biophysical data to model the
human SH2-phosphoprotein network in normal and cancer
cells. We apply our approach to data in The Cancer Genome
Atlas (TCGA) and test model predictions experimentally.
We find that mutations mapping to phosphoproteins often
create new interactions but that mutations altering SH2
domains result almost exclusively in loss of interactions.
Some of these mutations eliminate all interactions, but many
cause more selective loss, thereby rewiring specific edges
in highly connected subnetworks. Moreover, idiosyncratic
mutations appear to be as functionally consequential as
recurrent mutations. By synthesizing genomic, structural and
biochemical data, our framework represents a new approach to
the interpretation of genetic variation.
TCGA and similar projects have generated extensive data on the muta -
tional landscape of tumors1. To understand the functional consequences
of these mutations, it is necessary to ascertain how they alter the pro-
tein-protein interaction (PPI) networks involved in regulating cellular
Low-throughput methods such as fluorescence polarization spec-
troscopy provide precise interaction data on a few dozen PIDs and
ligands but cannot easily be scaled to the full proteome2, whereas
high-throughput array-based methods provide greater scale but
suffer from systematic artifacts and high false positive and false
negative rates, resulting in data sets that only partly agree2. An
additional challenge is modeling the effects of mutations for pro-
teins with multiple binding domains and/or multiple sites of phos-
phorylation9, a reality for most signaling proteins (for example, the
CRK oncoprotein). Existing methods are either limited to individual
domains10–13 or are insufficiently precise to discern the effects of single-
residue changes14,15.
The MSM framework we have developed combines genomic, bind-
ing and structural data and reconciles inconsistencies within and
among data sets to generate PID networks for normal and cancer
cells. We develop a bottom-up first-principles approach, involving a
single mathematical equation based on statistical mechanical ensem-
bles, that models domains, proteins and networks, and we then apply
this approach to the analysis of SH2 networks and mutations found
in TCGA16. We validate newly predicted interactions experimentally
and demonstrate the sensitivity of MSM to single-residue mutations
that cause subtle changes in binding affinity. Our analysis provides
mechanistic insights into an important PID cancer network and vali-
dates a computational approach to PID networks that can be applied
A multiscale statistical mechanical framework integrates
biophysical and genomic data to assemble cancer networks
Mohammed AlQuraishi1–3, Grigoriy Koytiger1,3, Anne Jenney1, Gavin MacBeath2 & Peter K Sorger1
PTPN11 (SHP2)GRB2
STAT5A
STAT6
STAT1
STAT4
STAT2
STAT3STA
P1TENC1
TNS1
TNS3
TNS4
RIN
1
SH2D5
SH2B1
SH2B2
SH2B3
SHC1
SHC3
PLCG1
PLCG2
BCAR3
SH2D3C
SH2D3A
FER
FES
CRKCRKLMATKHSH2DSH2D2A
SHBSHFSHDSHEFGRYES1
SRC
H
C
K
LCK
LYN
BLK
SLA2
PTK6
PTPN6
PTPN11
ABL1
GRAP
GRB2
NCK1
NCK2
PIK3R1
PIK3R3
PIK3R22NHC
ITK
TXK
BTK
TEC
BMX
SYK
ZAP70
GRB7
GRB14
GRB10
SH2D1A
SH2D1B
IN
PPL1
BLNK
LCP2
DAPP1VAV1VAV3VAV2RASA1SH3BP2
JAK2
JAK3
SUPT6H
CBL
ANKS1A
ANKS1B
GULP1
NUMB
NUMBL
DAB1
DAB2
APBA1
APBA2
APBA3
CCM
2
D
O
K
1
DO
K2
DO
K4
DOK6
DOK5FRS3IRS1
IRS4
APBB1APBB2APBB3
SHC2
APPL1
EPS8L2
STAT5A
STAT6
STAT1
STAT4
STAT2
STAT3STA
P1TENC1
TNS1
TNS3
TNS4
RIN
1
SH2D5
SH2B1
SH2B2
SH2B3
SHC1
SHC3
PLCG1
PLCG2
BCAR3
SH2D3C
SH2D3A
FER
FES
CRKCRKLMATKHSH2DSH2D2A
SHBSHFSHDSHEFGRYES1
SRC
H
C
K
LCK
LYN
BLK
SLA2
PTK6
PTPN6
PTPN11
ABL1
GRAP
GRB2
NCK1
NCK2
PIK3R1
PIK3R3
PIK3R2CHN2
ITK
TXK
BTK
TEC
BMX
SYK
ZAP70
GRB7
GRB14
GRB10
SH2D1A
SH2D1B
IN
PPL1
BLNK
LCP2
DAPP1VAV1VAV3VAV2RASA1SH3BP2
JAK2
JAK3
SUPT6H
CBL
ANKS1A
ANKS1B
GULP1
NUMB
NUMBL
DAB1
DAB2
APBA1
APBA2
APBA3
CCM
2
D
O
K
1
DO
K2
DO
K4
DOK6
DOK5FRS3IRS1
IRS4
APBB1APBB2APBB3
SHC2
APPL1
EPS8L2
Ph.D. Harvard University
Department of Chemistry and Chemical Biology
Postdoc Harvard Medical School
Department of Systems Biology
Management Consultant
Dean & Co.

Model predicts Gemcitabine response in Lung
Cancer Patients

Drug Train Size Spearman r Spearman p
17-AAG 401 0.35 0.00
681640 282 0.17 0.01
A-443654 107 0.04 0.66
A-770041 107 0.30 0.00
ABT-263 294 0.15 0.01
ABT-888 296 0.09 0.14
AEW541 281 0.14 0.02
AG-014699 293 0.27 0.00
AICAR 296 0.29 0.00
AKT inhibitor VIII 301 0.25 0.00
AMG-706 294 0.20 0.00
AP-24534 301 0.17 0.00
AS601245 300 0.12 0.03
ATRA 296 0.20 0.00
AUY922 300 0.09 0.12
AZ628 106 0.30 0.00
AZD-0530 109 0.13 0.17
AZD-2281 296 0.21 0.00
AZD0530 282 0.27 0.00
AZD6244 394 0.17 0.00
AZD6482 283 0.21 0.00
AZD7762 296 0.25 0.00
AZD8055 292 0.24 0.00
Afatinib 240 0.18 0.01
Axitinib 296 0.18 0.00
BAY 61-3606 300 0.22 0.00
BI-2536 107 0.03 0.73
BIBW2992 296 0.33 0.00
BIRB 0796 294 0.14 0.02
BMS-509744 107 0.17 0.09
BMS-536924 107 0.41 0.00
BMS-708163 294 0.19 0.00
BMS-754807 300 0.12 0.04
BX-795 293 0.37 0.00
Bexarotene 303 0.11 0.06
Bicalutamide 306 0.16 0.01
Bleomycin 300 0.38 0.00
Bortezomib 122 0.00 0.96
Bosutinib 298 0.25 0.00
Drug Train Size Spearman rSpearman p
Bryostatin 1 300 0.13 0.02
CCT007093 294 0.05 0.37
CCT018159 287 0.21 0.00
CEP-701 296 0.25 0.00
CGP-082996 107 0.25 0.01
CGP-60474 107 -0.07 0.49
CHIR-99021 301 0.05 0.36
CI-1040 294 0.49 0.00
CMK 107 0.08 0.40
Camptothecin 296 0.24 0.00
Cisplatin 296 0.14 0.01
Cyclopamine 106 0.13 0.18
Cytarabine 299 0.28 0.00
DMOG 301 0.27 0.00
Dasatinib 117 0.33 0.00
Docetaxel 305 0.46 0.00
Doxorubicin 330 0.12 0.03
EHT 1864 294 0.11 0.06
Elesclomol 296 0.28 0.00
Embelin 300 0.18 0.00
Epothilone B 300 0.23 0.00
Erlotinib 346 0.29 0.00
Etoposide 323 0.15 0.01
FH535 300 0.21 0.00
FTI-277 301 0.03 0.60
GDC-0449 296 -0.03 0.61
GDC0941 291 0.12 0.04
GNF-2 107 -0.04 0.71
GSK-1904529A 300 0.14 0.01
GSK-650394 300 0.23 0.00
GSK269962A 106 -0.05 0.64
GW 441756 296 0.18 0.00
GW843682X 107 0.07 0.44
Gefitinib 300 0.26 0.00
Gemcitabine 316 0.12 0.04
IPA-3 300 0.30 0.00
Imatinib 128 0.18 0.04
Irinotecan 165 0.13 0.09
JNJ-26854165 293 0.34 0.00

JNK Inhibitor VIII294 0.11 0.06
JNK-9L 301 0.04 0.52
JW-7-52-1 107 -0.03 0.77
KIN001-135 107 0.09 0.38
KU-55933 293 0.20 0.00
L-685458 275 0.22 0.00
LAQ824 300 -0.03 0.57
LBW242 281 -0.04 0.48
LFM-A13 300 0.03 0.58
Lapatinib 354 0.32 0.00
Lenalidomide 297 0.00 0.93
MG-132 107 -0.23 0.02
MK-2206 275 0.16 0.01
MS-275 105 0.12 0.24
Methotrexate 304 0.25 0.00
Midostaurin 301 0.24 0.00
Mitomycin C 316 0.12 0.03
NSC-87877 301 0.19 0.00
NU-7441 293 0.10 0.08
NVP-BEZ235 292 0.14 0.01
NVP-TAE684 109 0.20 0.03
Nilotinib 383 0.07 0.15
Nutlin-3 282 0.23 0.00
Nutlin-3a 294 0.33 0.00
OSI-906 300 0.25 0.00
OSU-03012 300 0.23 0.00
Obatoclax Mesylate300 0.35 0.00
PAC-1 300 0.16 0.01
PD-0325901 401 0.49 0.00
PD-0332991 378 0.22 0.00
PD-173074 294 -0.06 0.31
PF-02341066 109 0.01 0.91
PF-4708671 293 -0.03 0.62
PF-562271 301 0.23 0.00
PF2341066 282 0.24 0.00
PHA-665752 348 0.21 0.00
PLX4720 398 0.24 0.00
Paclitaxel 369 0.17 0.00
Panobinostat 278 0.44 0.00
Parthenolide 106 0.07 0.49
Pazopanib 304 0.06 0.27
Ponatinib 273 0.14 0.02
Pyrimethamine 106 0.07 0.46
QS11 300 0.11 0.06
RAF265 258 -0.02 0.77
RDEA119 292 0.51 0.00
RO-3306 294 0.14 0.02
Rapamycin 109 0.02 0.85
Roscovitine 105 -0.06 0.54
S-Trityl-L-cysteine107 0.05 0.62
SB 216763 281 0.08 0.17
SB590885 282 0.19 0.00
SL 0101-1 291 0.00 0.99
Salubrinal 106 -0.18 0.07
Shikonin 301 0.21 0.00
Sirolimus 106 0.05 0.63
Sorafenib 353 0.02 0.73
Sunitinib 125 0.31 0.00
TAE684 282 0.14 0.02
TGX221 107 0.11 0.25
TKI258 282 0.09 0.13
TW 37 293 0.24 0.00
Temsirolimus 298 0.27 0.00
Thapsigargin 300 0.30 0.00
Tipifarnib 300 0.08 0.17
Topotecan 295 0.28 0.00
VX-680 107 -0.03 0.75
VX-702 296 -0.08 0.18
Vinblastine 306 0.29 0.00
Vinorelbine 312 0.15 0.01
Vorinostat 307 0.23 0.00
WH-4-023 107 0.18 0.07
WZ-1-84 107 0.19 0.05
XMD8-85 106 0.36 0.00
Z-LLNle-CHO 107 -0.17 0.07
ZD-6474 277 0.13 0.04
ZM-447439 283 0.02 0.77

Rescaling Breast Cancer dataset
NCI-60 U95 Array Clincal U95 Array
Combat Batch
Correct
NCI-60 U95 Array Clincal U95 ArrayNCI-60 RNA-Seq
Fit Gene Specific
Linear Model
Apply Linear
Rescaling
Clincal “RNA-Seq”
Prune genes with correlation p<0.001

ARD Regression
y
λ
~ N (0, λIn)
w
w ~ N (0, α )
α
X
y = Xw +
λ1 λ2
p(λ) = Γ(λ1, λ2)
γi
1 γi
2
p(αi ) = Γ(γi
1, γi
2)
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance

Oncology oracle

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Oncology oracle

Similar to Oncology oracle (20)

Recently uploaded

Recently uploaded (20)

Oncology oracle