2. Use cell line data to better match cancer patients and
therapies
3. Cell lines are a rich source of drug sensitivity data,
enables the training of 98 individual drug models
39, 571 Unique Drug - Cell line
Sensitivity Values
Predict
675 Human cancer cell lines with
RNA-seq Gene Expression
4. Linear models provide ease of interpretability and
transparency
Drug Sensitivity
Linear Model
5. Ridge regression keeps only a few significant genes
in model
Drug Sensitivity
Linear Model
Ridge Regression
6. Automatic Relevance Determnation Regression tunes
model sparsity from data
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance
Determination Regression
7. Start with only the most highly varying genes
8000 of the highest varying genes in the Genentech data were chosen
as features to reduce noise and for computational tractability
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance
Determination Regression
8. Docetaxel model separates sensitive and resistant
breast cancer patients with 0.87 ROC AUC
False Positive Rate
True Positive
Rate
Increasing Predicted Docetaxel Sensitivity
91% of sensitive patients treated
85% of resistant patients avoid unnecessary treatment
14. Rescaling Breast Cancer dataset
NCI-60 U95 Array Clincal U95 Array
Combat Batch
Correct
NCI-60 U95 Array Clincal U95 ArrayNCI-60 RNA-Seq
Fit Gene Specific
Linear Model
Apply Linear
Rescaling
Clincal “RNA-Seq”
Prune genes with correlation p<0.001
15. ARD Regression
y
λ
~ N (0, λIn)
w
w ~ N (0, α )
α
X
y = Xw +
λ1 λ2
p(λ) = Γ(λ1, λ2)
γi
1 γi
2
p(αi ) = Γ(γi
1, γi
2)
Drug Sensitivity
Linear Model
Ridge Regression
Automatic Relevance
Determination Regression