
idalab seminar #10 Precision Medicine


Precision medicine aims to deliver the right treatment at the right time to the right person. This goes hand in hand with a deeper genetic understanding of a patient’s disease. An important milestone is the FDA’s recent approval of an anti-cancer drug based solely on a tumor’s biomarker and not on where in the body the tumor started. How can statistics and machine learning help to identify a patient’s best treatment option? I will walk you through a typical Phase II clinical trial to discuss the challenge of predicting a patient’s response to a treatment that she has not yet received, to demonstrate the opportunity of analysing a large set of biomarkers in a clinical trial with relatively few patients, and to point out the need to define concrete patient subgroups.

Published in: Health & Medicine

  1. 1. Agency for Data Science Machine learning & AI Mathematical modelling Data strategy Dr. Nicole Krämer Precision Medicine – How to Identify Biomarkers that Help Patients Choose Their Best Treatment Option idalab seminar #10 | May 18th 2018
  2. 2. Precision Medicine How to Identify Biomarkers that Help Patients Choose Their Best Treatment Option Dr. Nicole Krämer 18 May 2018
  3. 3. Precision medicine: the right treatment at the right time to the right person. “Doctors have always recognized that every patient is unique, and doctors have always tried to tailor their treatments as best they can to individuals. You can match a blood transfusion to a blood type — that was an important discovery. What if matching a cancer cure to our genetic code was just as easy, just as standard?” (President Obama, January 30, 2015) Sources: Table of Pharmacogenomic Biomarkers in Drug Labeling. Li et al., Genotyping and Genomic Profiling of Non–Small-Cell Lung Cancer: Implications for Current and Future Therapies. Journal of Clinical Oncology, 2013; 31(8). NSCLC: non–small-cell lung cancer.
  4. 4. Pembrolizumab (Keytruda) • PD-1 inhibitor / cancer immunotherapy • Accelerated approval by the Food & Drug Administration (FDA) based on a tumor’s biomarker and not on where in the body the tumor started • The FDA’s first tissue/site-agnostic approval
  5. 5. The (classical) phases of drug development. Main objectives: safety and efficacy. Efficacy endpoints: response rates, tumor shrinkage, … Efficacy endpoints: overall survival, progression-free survival, … Approval by regulatory authorities (e.g. EMA, FDA). Evidence for biomarkers: mechanism of the drug; preclinical data; evidence from similar drugs (challenging for first-in-class drugs). Selection and validation of biomarkers. Different strategies for phase III design: inclusion criterion; biomarker-stratified design; adaptive/enrichment design.
  6. 6. Properties of (biomarker) variables. A variable is predictive if the relative treatment benefit (experimental vs. control) depends on the biomarker. Source: Mok TS, Wu Y-L, Thongprasert S, et al. Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med. 2009;361(10):947-957.
  7. 7. Properties of (biomarker) variables. A variable is prognostic if it informs about a likely outcome in the absence of treatment or irrespective of the treatment received. Example: within each treatment arm, EGFR-positive patients do better than EGFR-negative patients.
  8. 8. (Hypothetical) case study • Randomized phase II trial comparing two treatments, A (experimental) versus B (control), in a parallel design. • Endpoint: objective response (Yes/No). • Different types of biomarkers measured at baseline: mutations based on a cancer panel (~300 genes); immunohistochemistry for 10 selected genes; high-throughput sequencing (~1000 variables). Trial population (n=150): A, experimental arm (75 patients); B, control arm (75 patients). Image sources: Wikipedia.
  9. 9. How can we identify patients who benefit from treatment A? Why not simply learn a classification tree? Classification trees in a nutshell: 1. For each node, select the variable that splits the data best. 2. Test if the inclusion of this variable improves prediction. 3. Find the optimal cut-off based on some information criterion. Relative benefit needed! If you use the endpoint of the trial to learn a classifier, you will only identify prognostic biomarkers! Important: We do not observe the endpoint of interest, because each patient receives only one of the two treatments! Two approaches: interaction models and counterfactual models.
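  10. 10. Interaction models. Odds: $\text{odds} = \frac{\text{rate}}{100 - \text{rate}}$. Odds ratio = relative treatment benefit. Model: $\log \frac{P(Y = 1)}{1 - P(Y = 1)} = \beta_0 + \beta_1 \cdot T + \beta_2 \cdot BM + \beta_3 \cdot T \cdot BM$, with treatment indicator $T$ and biomarker status $BM$. Odds ratio for a biomarker-positive patient: $\exp(\beta_1 + \beta_3)$; for a biomarker-negative patient: $\exp(\beta_1)$. The biomarker is predictive if $\beta_3 \neq 0$.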
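To make the role of the interaction coefficient concrete, here is a minimal sketch of a logistic model with a treatment-by-biomarker interaction. The coefficient values and the function names are hypothetical and chosen only for illustration; nothing here is fitted to trial data.

```python
import math

def odds(rate_percent):
    """Odds from a response rate given in percent: rate / (100 - rate)."""
    return rate_percent / (100.0 - rate_percent)

def response_probability(beta, treatment, biomarker):
    """P(Y = 1) under the interaction model; beta = (b0, b1, b2, b3)."""
    b0, b1, b2, b3 = beta
    linear = b0 + b1 * treatment + b2 * biomarker + b3 * treatment * biomarker
    return 1.0 / (1.0 + math.exp(-linear))

def treatment_odds_ratio(beta, biomarker):
    """Odds ratio of experimental vs. control for a given biomarker status."""
    p_experimental = response_probability(beta, 1, biomarker)
    p_control = response_probability(beta, 0, biomarker)
    return odds(100 * p_experimental) / odds(100 * p_control)

# Hypothetical coefficients (assumption, not estimates from any trial).
beta = (-1.0, 0.2, 0.5, 1.1)
or_positive = treatment_odds_ratio(beta, 1)  # equals exp(b1 + b3)
or_negative = treatment_odds_ratio(beta, 0)  # equals exp(b1)
print(or_positive, or_negative)
```

With a non-zero interaction coefficient the two odds ratios differ, which is exactly what makes the biomarker predictive in this model.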
  11. 11. Example 1: microRNA. 1000 continuous biomarker variables, normalized and filtered prior to the analysis. Univariate screening: 1. For each biomarker, compute the interaction p-value. 2. Adjust all p-values for multiple testing. 3. Select those biomarkers with an adjusted p-value < 0.05. Multivariate selection: 1. Learn a sparse high-dimensional classifier using all biomarkers and their treatment interactions. 2. Select those biomarkers with a non-zero interaction term.
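Steps 2 and 3 of the univariate screening could look roughly like the sketch below. The slide does not say which multiple-testing correction was used; the Benjamini-Hochberg procedure is an assumption here, and the biomarker names and p-values are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical interaction p-values for five biomarkers (assumption).
p_values = {"miR-a": 0.001, "miR-b": 0.012, "miR-c": 0.04, "miR-d": 0.2, "miR-e": 0.6}
names = list(p_values)
adjusted = benjamini_hochberg([p_values[n] for n in names])

# Keep biomarkers whose adjusted interaction p-value is below 0.05.
selected = [n for n, q in zip(names, adjusted) if q < 0.05]
print(selected)
```

Other corrections, such as Bonferroni, fit the same slot; the choice determines how strictly false positives are controlled.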
  12. 12. Example 1: microRNA. Both approaches only return a list of biomarkers, but not a subgroup of patients. • Typically, a cut-off value is optimized that defines subgroups of patients: “biomarker high” versus “biomarker low”. • Caveat: Due to the model fitting and the cut-off optimization, the effect in these subgroups may be inflated.
  13. 13. Example 2: Gene set enrichment analysis (GSEA) • GSEA determines whether an a priori defined set of genes shows statistically significantly different results. • GSEA is competitive, i.e. analyses are always relative to all genes in the panel. • Idea: 1. Rank the genes based on a score that measures the association of the biomarker with the endpoint. 2. Compute a running-sum statistic by walking along the list of genes → enrichment score. 3. Derive p-values based on the magnitude of the enrichment score. Source: A. Subramanian et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS 102 (43). 2005.
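The running-sum idea can be illustrated with a deliberately simplified sketch. It uses an unweighted running sum and a gene-set permutation test, both of which are simplifying assumptions rather than the weighted statistic and permutation scheme of the cited paper. The gene names and the ranking are hypothetical.

```python
import random

def enrichment_score(ranked_genes, gene_set):
    """Maximum deviation from zero of an unweighted running sum."""
    hits = sum(1 for g in ranked_genes if g in gene_set)
    misses = len(ranked_genes) - hits
    if hits == 0 or misses == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up on a gene-set member, step down otherwise.
        running += 1.0 / hits if gene in gene_set else -1.0 / misses
        if abs(running) > abs(best):
            best = running
    return best

def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Share of random gene sets of equal size with a score at least as extreme."""
    rng = random.Random(seed)
    observed = abs(enrichment_score(ranked_genes, gene_set))
    size = sum(1 for g in ranked_genes if g in gene_set)
    count = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, size))
        if abs(enrichment_score(ranked_genes, random_set)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical ranking of 20 genes, strongest association first.
ranked = [f"gene{i}" for i in range(1, 21)]
pathway = {"gene1", "gene2", "gene3", "gene5"}
es = enrichment_score(ranked, pathway)
p = permutation_p_value(ranked, pathway)
print(es, p)
```

Because random gene sets are drawn from the same ranked list, the test is competitive in the sense of the slide: a pathway is judged relative to all genes in the panel.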
  14. 14. Example 2: GSEA • Genomic profiling of ~300 cancer-related genes. Prior to the analysis, genes with very low mutation rates were excluded. • Gene sets (pathways) from publicly available databases. • Score variable: interaction p-values from a single-gene analysis. Notes: • Subgroups may be defined for each gene set. • A “sparse” cancer panel may not be optimal for this approach. • Different gene set methods exist, but they are mostly applied to detect prognostic biomarkers.
      Pathway | Genes in the pathway | Genes included in the panel | Patients with a mutation in the pathway | P-value obtained from GSEA
      Pathway 1 | 29 | 6 | 20 | 0.1432
      Pathway 2 | 22 | 10 | 60 | 0.2754
      Pathway 3 | 37 | 14 | 45 | 0.0989
      …
  15. 15. Counterfactual models (“virtual twin”). 1. In each treatment arm, train a classifier for Y (e.g. probability of a response). 2. For each patient, predict the endpoint under both treatments. 3. Define the predicted relative treatment benefit: $f_A - f_B$. 4. Learn a tree on the predicted relative treatment benefit. Here $Y = f_A(X_1, \ldots, X_p, Z_1, \ldots, Z_q)$ is the predicted endpoint with treatment A and $Y = f_B(X_1, \ldots, X_p, Z_1, \ldots, Z_q)$ is the predicted endpoint with treatment B, with biomarkers $X_1, \ldots, X_p$ and other characteristics $Z_1, \ldots, Z_q$. Source: Foster, J. C., Taylor, J. M., & Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30(24), 2867-2880.
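Steps 1 to 3 of the counterfactual approach can be sketched on toy data. The per-arm predictor below is a deliberately simple stand-in (the observed response rate per biomarker status), not the classifier used in the cited paper, and step 4 (learning a tree on the predicted benefit) is omitted. All data and names are hypothetical.

```python
from collections import defaultdict

def fit_rate_model(patients):
    """Stand-in predictor: observed response rate per biomarker status."""
    counts = defaultdict(lambda: [0, 0])  # status -> [responders, total]
    for p in patients:
        counts[p["biomarker"]][0] += p["response"]
        counts[p["biomarker"]][1] += 1
    return {status: r / n for status, (r, n) in counts.items()}

# Hypothetical toy trial: arm, binary biomarker status, objective response.
trial = (
    [{"arm": "A", "biomarker": 1, "response": r} for r in (1, 1, 1, 0)]
    + [{"arm": "A", "biomarker": 0, "response": r} for r in (1, 0, 0, 0)]
    + [{"arm": "B", "biomarker": 1, "response": r} for r in (1, 0, 0, 0)]
    + [{"arm": "B", "biomarker": 0, "response": r} for r in (1, 0, 0, 0)]
)

# Step 1: one model per treatment arm.
f_a = fit_rate_model([p for p in trial if p["arm"] == "A"])
f_b = fit_rate_model([p for p in trial if p["arm"] == "B"])

# Steps 2 and 3: predict both outcomes per biomarker status and take the
# difference, i.e. the predicted relative benefit of A over B.
benefit = {status: f_a[status] - f_b[status] for status in f_a}
print(benefit)
```

In this toy example only biomarker-positive patients show a predicted benefit from treatment A, so a tree learned on the benefit would split on the biomarker.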
  16. 16. Example 3: The virtual twin method • Immunohistochemistry (IHC) expression for 10 selected genes. • Classifier: random forests (as investigated in the original paper). • Predicted relative benefit: difference in objective response rates. This was transformed into a binary classifier (“Yes”/“No”) based on clinical relevance. • Note: The method directly returns a subgroup of patients, defined by two biomarkers. Some thoughts: • Complex subgroups may be hard to interpret. • In simulations, there is a considerable number of false positives. → Use a different classifier? • In general, the prediction accuracy of any classifier may be limited due to the expected low signal-to-noise ratio in clinical trials.
  17. 17. Summary • The incorporation of biomarkers into drug development is becoming increasingly important. • Predictive biomarkers play a key role in precision medicine, as they help to select the right treatment for the right patient. • Data science can help tremendously in the identification of useful biomarkers. • Biomarkers can be used for many different purposes … • Data science is needed in many different areas of drug development … … let’s talk!
  18. 18. Biostatistical services at Staburo: Clinical Statistics; Translational Medicine & Biomarkers; Statistical Programming with CDISC; Pharmacokinetics/-dynamics; Health Technology Assessment; Non-clinical Statistics