1. 1
—
APPLICATION OF MACHINE LEARNING TECHNIQUES TO PATIENT STRATIFICATION
ALEXANDER IVLIEV, PHD
OCTOBER 2016
2. 2
—
AGENDA
01 Why is it important?
02 Technical definition of the problem
03 Example from our research
04 Example from published research
4. 4
—
CASE STUDY - GEFITINIB
Early stratification would have prevented the withdrawal, but retrospective analysis did allow a new marketing approval.
Timeline: Launched with accelerated approval → Withdrawn from market after phase III study → NDA resubmission with companion diagnostic
New approval: 1st line therapy for metastatic NSCLC with exon 19 deletions or exon 21 (L858R) substitutions
• Accelerated approval was granted, but a post-marketing trial was requested for full approval
• Phase III studies showed no improvement in overall survival
• Drug withdrawn from the market
• Retrospective data analysis:
  - 10% partial response in genotype-unselected patients
  - >70% of responders have activating mutations in EGFR
  - Best responses seen in East Asians, females and non-smokers
• NDA resubmitted with narrowed labelling and a mutation-based companion diagnostic
• Unfortunately, market share was lost to Tarceva while the patient segment was being elucidated
5. 5
—
TECHNICAL DEFINITION OF THE PROBLEM
[Figure: data matrix with axes "Patients" and "Genes". Patient groups are labeled "Class 1, Disease type 1" and "Class 2, Disease type 2"; a new patient is labeled "Class ?".]
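The problem on this slide is supervised classification: learn from patients with known classes, then assign a class to a new patient from the expression profile. A minimal sketch follows, assuming a nearest-centroid rule. The slide does not name a classifier, so this choice is only for illustration, and the expression values are made-up toy numbers.

```python
from math import dist
from statistics import fmean


def centroid(profiles):
    """Per-gene mean across a list of patient expression profiles."""
    return [fmean(gene_values) for gene_values in zip(*profiles)]


def fit(training):
    """training: dict mapping class label -> list of patient profiles."""
    return {label: centroid(profiles) for label, profiles in training.items()}


def predict(centroids, profile):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))


# Hypothetical toy data: 3 genes per patient, invented expression values.
training = {
    "Class 1": [[5.1, 0.9, 3.0], [4.8, 1.2, 3.1], [5.3, 1.0, 2.9]],
    "Class 2": [[1.0, 4.9, 3.0], [1.3, 5.2, 3.2], [0.8, 5.0, 2.8]],
}
centroids = fit(training)

new_patient = [4.9, 1.1, 3.0]  # the "Class ?" patient
predicted = predict(centroids, new_patient)
print(predicted)
```

With real data the matrix would have thousands of genes and far fewer patients, so feature selection and validation on held-out patients would matter much more than in this toy case.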
6. 6
—
AIM: TO DISCOVER PATIENT SELECTION BIOMARKERS FOR ERLOTINIB USING PRECLINICAL DATA SUCH AS CELL LINE VIABILITY SCREENS
[Figure: gene expression + 240 cell lines (Sensitive / Resistant); panel label: Clinical trial design for erlotinib]
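One simple way to turn sensitive and resistant cell lines into candidate biomarkers is to rank genes by how strongly their expression differs between the two groups. The sketch below uses Welch's t-statistic for that ranking. This is an assumption for illustration, since the slides do not state which statistic or model was used, and the gene names, cell line names and values are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    return (fmean(a) - fmean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(expression, sensitive, resistant):
    """Rank genes by |t| between sensitive and resistant cell lines.

    expression: dict gene -> dict cell_line -> expression value.
    A positive t means higher expression in the sensitive lines.
    """
    scores = {}
    for gene, values in expression.items():
        s = [values[line] for line in sensitive]
        r = [values[line] for line in resistant]
        scores[gene] = welch_t(s, r)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy screen: 3 sensitive and 3 resistant cell lines.
sensitive = ["line1", "line2", "line3"]
resistant = ["line4", "line5", "line6"]
lines = sensitive + resistant
expression = {
    "GENE_A": dict(zip(lines, [8.0, 8.4, 7.9, 3.1, 2.8, 3.5])),
    "GENE_B": dict(zip(lines, [5.0, 5.6, 4.7, 5.2, 4.9, 5.5])),
    "GENE_C": dict(zip(lines, [2.0, 2.5, 1.8, 6.0, 6.6, 5.9])),
}

ranked = rank_genes(expression, sensitive, resistant)
for gene, t in ranked:
    print(f"{gene}\t{t:.2f}")
```

With 240 cell lines and thousands of genes, the scores would also need multiple-testing correction, and any resulting signature would need validation on data not used for the ranking.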
12. 12
—
BIOLOGICAL INTERPRETATION
Network Building Algorithms:
• Analyze network
• Analyze network (transcription factors)
• Analyze network (receptors)
• Transcription regulation
• Shortest paths
• Trace pathways
• Self regulation
• Direct interactions
• Auto expand
• Expand by one interaction
• Manual expand
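As an illustration of what a "Shortest paths" option computes, the sketch below runs a breadth-first search over a small directed interaction network. It is a generic textbook algorithm, not the implementation of the tool shown on the slide, and the network is a simplified illustrative example rather than data from the presentation.

```python
from collections import deque


def shortest_path(network, source, target):
    """Breadth-first search over a directed interaction network.

    network: dict node -> list of nodes it acts on.
    Returns the nodes on one shortest path, or None if target is unreachable.
    """
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in network.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Simplified, illustrative network (not taken from the slides).
network = {
    "EGFR": ["GRB2", "PIK3CA"],
    "GRB2": ["SOS1"],
    "SOS1": ["KRAS"],
    "PIK3CA": ["AKT1"],
    "KRAS": ["RAF1"],
}

path = shortest_path(network, "EGFR", "KRAS")
print(" -> ".join(path))
```

In this picture, "Expand by one interaction" corresponds to taking only the direct neighbors of the seed genes, that is, a single step of the same search.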
13. 13
—
AGENDA
01 Why is it important?
02 Technical definition of the problem
03 Example from our research
04 Published research: community effort
14. 14
—
MAQC CONSORTIUM
Leming Shi et al. MAQC-II study of common practices for the development and validation of microarray-based predictive models. Nat. Biotech., 2010
• 6 datasets
• 13 endpoints
• 36 data analysis teams
• 30,000 models
15. 15
—
LESSONS LEARNED
The number one factor predicting how accurate a model's predictions will be is endpoint complexity:
• It’s easy to tell boys from girls
• It’s hard to predict the outcome of cancer treatment
Leming Shi et al. MAQC-II study of common practices for the development and validation of microarray-based predictive models. Nat. Biotech., 2010
16. 16
—
LESSONS LEARNED
“The top-performing teams were mainly industrial participants with many years of experience in microarray data analysis, whereas bottom-performing teams were mainly less-experienced graduate students or researchers.”
Leming Shi et al. MAQC-II study of common practices for the development and validation of microarray-based predictive models. Nat. Biotech., 2010
17. 17
—
LESSONS LEARNED
“Applying good modeling practices appeared to be more important than the actual choice of a particular algorithm.”
“Many models with similar performance can be developed from a given data set.”
Leming Shi et al. MAQC-II study of common practices for the development and validation of microarray-based predictive models. Nat. Biotech., 2010