New Statistical Learning Methods for Estimating the
Optimal Dynamic Treatment Regime
Lu Wang
Associate Professor
University of Michigan, Ann Arbor
luwang@umich.edu
December 10, 2019
Lu Wang ACWL and T-RL for Optimal DTR December 10, 2019 1 / 27
Overview
1 Introduction: Dynamic Treatment Regime (DTR)
2 Adaptive Contrast Weighted Learning (ACWL)
3 Tree-based Reinforcement Learning (T-RL)
4 Summary and Other Ongoing Research
Motivating Example: Cancer Management
Esophageal Cancer Patients, MD Anderson, 1998 to 2012
Motivation of Dynamic Treatment Regimes (DTRs)
Chronic Disease management: routinely adjust, change, add, or
discontinue treatment based on progress, side effects, patient burden,
compliance, etc.
→ Dynamic Treatment Regimes (DTRs)
Goals: To account for patient heterogeneity and to personalize health
care strategy:
One Size Fits All → Targeted/Individualized (degree of tailoring by inherent characteristics)
Once and for All → Dynamic (degree of tailoring by time-varying characteristics)
Evidence-based Personalized Health Care
To provide meaningfully improved health outcomes for patients by
delivering the right drug with the right dose at the right time.
One size fits all → Individualized therapy (degree of tailoring)
One size fits all: low predictability; suboptimal health outcomes
Tailoring steps: assess patient response heterogeneity; stratify patient population; optimize benefits/risk, etc.
Individualized therapy: better predictability; improved health outcomes
[Figure labels: Family History, Medical, Genetics, Lifestyle, Environment, Engine]
Key Notation
Stage $j$, $j = 1, \dots, T$, with $K_j$ ($\ge 2$) treatment options
$A_j$: treatment at stage $j$, taking values $a_j \in \mathcal{A}_j = \{1, \dots, K_j\}$
$X_j$: covariate history just prior to assigning $A_j$
$Y$: overall outcome of interest
DTR $g = (g_1, \dots, g_T)$: a set of decision rules, with
$g_j$ mapping the domain of $H_j = (A_1, \dots, A_{j-1}, X_j)$ to $\mathcal{A}_j$.
$Y^*(a)$: counterfactual outcome under treatment $a$
Assumptions for Identifiability using Observational Data
Use causal framework to assess DTRs under three assumptions
(Murphy et al. 2001; Robins and Hernan, 2009)
1 Consistency (no interference)
If a patient’s observed treatment history is compatible with a
given DTR, his or her observed clinical outcomes are the same as
the counterfactual ones under the DTR.
2 No Unmeasured Confounding
Treatment decision at a given time is independent of future
observations and counterfactual outcomes, conditional on all
previous covariate and treatment history.
3 Positivity
With probability one, every subject follows a specific DTR with a
positive probability that is bounded away from 0.
Adaptive Contrast Weighted Learning (ACWL)
Tao and Wang, 2017 Biometrics
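For $T = 1$:
$$g^{opt} = \arg\max_{g \in \mathcal{G}} E_H\left[\sum_{a=1}^{K} E\{Y^*(a) \mid H\}\, I\{g(H) = a\}\right].$$
Under the Causal Assumptions 1-3, we have
$$g^{opt} = \arg\max_{g \in \mathcal{G}} E_H\left[\sum_{a=1}^{K} \mu_a(H)\, I\{g(H) = a\}\right],$$
where $\mu_a(H) \triangleq E(Y \mid A = a, H)$.
Incorporate order statistics $\mu_{(1)}(H) \le \dots \le \mu_{(K)}(H)$, then
$$g^{opt} = \arg\max_{g \in \mathcal{G}} E_H\left[\sum_{a=1}^{K} \mu_{(a)}(H)\, I\{g(H) = l_a(H)\}\right],$$
where $\mu_{(a)}(H) = \mu_{l_a}(H)$.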
Property of $g^{opt}$
$$g^{opt} = \arg\min_{g \in \mathcal{G}} E_H\left[\sum_{a=1}^{K-1} \{\mu_{(K)}(H) - \mu_{(a)}(H)\}\, I\{g(H) = l_a(H)\}\right].$$
Minimizes the expected loss in the outcome due to sub-optimal
treatments in the entire population of interest.
Classify as many patients as possible to their optimal treatment
$l_K$ (i.e., letting $I\{g(H) = l_a(H)\} = 0$, $a = 1, \dots, K-1$) while
putting more emphasis on patients with larger contrasts (i.e.,
larger $\mu_{(K)}(H) - \mu_{(a)}(H)$).
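The step from the arg max form to this arg min form can be written out. The derivation below is a sketch that assumes the ordering $l_1(H), \dots, l_K(H)$ is well defined (no ties), so that exactly one indicator $I\{g(H) = l_a(H)\}$ equals one for each $H$:

```latex
\begin{align*}
\sum_{a=1}^{K} \mu_{(a)}(H)\, I\{g(H) = l_a(H)\}
  &= \mu_{(K)}(H) \sum_{a=1}^{K} I\{g(H) = l_a(H)\}
     - \sum_{a=1}^{K} \{\mu_{(K)}(H) - \mu_{(a)}(H)\}\, I\{g(H) = l_a(H)\} \\
  &= \mu_{(K)}(H)
     - \sum_{a=1}^{K-1} \{\mu_{(K)}(H) - \mu_{(a)}(H)\}\, I\{g(H) = l_a(H)\}.
\end{align*}
```

The $a = K$ term drops out because its contrast is zero, and $\mu_{(K)}(H)$ does not depend on $g$, so maximizing the expectation of the left-hand side over $g$ is the same as minimizing the expectation of the remaining sum.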
Bounds of the Objective Function
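Let $C_1(H) \triangleq \mu_{(K)}(H) - \mu_{(K-1)}(H)$ and $C_2(H) \triangleq \mu_{(K)}(H) - \mu_{(1)}(H)$;
then, for $a = 1, \dots, K-1$,
$$0 \le C_1(H) \le \mu_{(K)}(H) - \mu_{(a)}(H) \le C_2(H).$$
Therefore,
$$E_H\left[C_1(H)\, I\{g(H) \ne l_K(H)\}\right]
\le E_H\left[\sum_{a=1}^{K-1} \{\mu_{(K)}(H) - \mu_{(a)}(H)\}\, I\{g(H) = l_a(H)\}\right]
\le E_H\left[C_2(H)\, I\{g(H) \ne l_K(H)\}\right].$$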
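As an illustration of these bounds, the sketch below computes the two contrasts for a single patient from a list of estimated conditional means and checks that the loss from any assigned treatment lies between $C_1$ and $C_2$ times the indicator that the assignment differs from the optimal treatment. The function names and the numbers are hypothetical, and treatments are indexed from 0 rather than 1.

```python
def adaptive_contrasts(mu):
    """Return (best treatment, C1, C2) for one patient.

    mu[a] is an estimated conditional mean outcome under treatment a
    (0-based index; assumed values, not taken from the talk).
    """
    order = sorted(range(len(mu)), key=lambda a: mu[a])  # l_1, ..., l_K
    best = order[-1]                                     # l_K
    c1 = mu[best] - mu[order[-2]]  # smallest loss from a sub-optimal choice
    c2 = mu[best] - mu[order[0]]   # largest loss from a sub-optimal choice
    return best, c1, c2


def loss(mu, assigned):
    """Expected loss in outcome when `assigned` is given instead of the best."""
    return max(mu) - mu[assigned]


mu_hat = [1.0, 2.5, 4.0]  # hypothetical estimates for K = 3 treatments
best, c1, c2 = adaptive_contrasts(mu_hat)

# Check the sandwich bound for every possible assignment.
bounds_hold = all(
    c1 * (a != best) <= loss(mu_hat, a) <= c2 * (a != best)
    for a in range(len(mu_hat))
)
```

With one contrast per patient, the problem becomes a weighted misclassification problem: the label is `best` and the weight is `c1` or `c2`.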
Adaptive Contrasts for Transformation to Weighted
Classification
Contrasts $C_1$ and $C_2$: minimum and maximum expected losses in
the outcome given sub-optimal treatments, adaptive to each
patient's own treatment effect ordering
In the best case, where sub-optimal treatments lead only to
minimal expected losses in the outcome, $g^{opt}$ is equal to
$$\arg\min_{g \in \mathcal{G}} E_H\left[C_1(H)\, I\{g(H) \ne l_K(H)\}\right].$$
In the worst case, where sub-optimal treatments all lead to
maximal expected losses in the outcome, $g^{opt}$ is equal to
$$\arg\min_{g \in \mathcal{G}} E_H\left[C_2(H)\, I\{g(H) \ne l_K(H)\}\right].$$
Estimate µA(H) to Construct C1(H) and C2(H)
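Given estimated propensity score $\hat\pi_a(H)$, the AIPW estimator
$\hat\mu_a^{AIPW}$ for $E\{Y^*(a)\}$ is
$$\hat\mu_a^{AIPW} = P_n\left[\frac{I(A = a)}{\hat\pi_a(H)}\, Y + \left\{1 - \frac{I(A = a)}{\hat\pi_a(H)}\right\} \hat\mu_a(H)\right],$$
where $P_n$ denotes the empirical average over the $n$ subjects.
Lemma (Double Robustness)
$\hat\mu_a^{AIPW}$ is a consistent estimator of $E\{Y^*(a)\}$ if either the propensity
model $\pi_a(H)$ or the conditional mean model $\mu_a(H)$ is correctly specified.
At a subject level,
$$\hat\mu_a^{AIPW}(H) = \frac{I(A = a)}{\hat\pi_a(H)}\, Y + \left\{1 - \frac{I(A = a)}{\hat\pi_a(H)}\right\} \hat\mu_a(H).$$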
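The subject-level and population-level AIPW formulas can be sketched in a few lines. This is a minimal illustration, assuming the propensity scores and conditional means have already been estimated by some model; the function names and the toy data are hypothetical.

```python
def aipw_terms(y, a_obs, a, pi_hat, mu_hat):
    """Subject-level AIPW terms for treatment `a`.

    y[i]      : observed outcome for subject i
    a_obs[i]  : observed treatment for subject i
    pi_hat[i] : estimated P(A = a | H_i), assumed to be in (0, 1]
    mu_hat[i] : estimated E(Y | A = a, H_i)
    """
    terms = []
    for yi, ai, pi, mu in zip(y, a_obs, pi_hat, mu_hat):
        w = (1.0 if ai == a else 0.0) / pi      # I(A = a) / pi_a(H)
        terms.append(w * yi + (1.0 - w) * mu)   # weighted outcome + augmentation
    return terms


def aipw_mean(y, a_obs, a, pi_hat, mu_hat):
    """Empirical average P_n of the subject-level terms."""
    terms = aipw_terms(y, a_obs, a, pi_hat, mu_hat)
    return sum(terms) / len(terms)


# Hypothetical data: four subjects, two treatments, known 50/50 assignment.
y = [3.0, 1.0, 4.0, 2.0]
a_obs = [1, 2, 1, 2]
pi_hat = [0.5, 0.5, 0.5, 0.5]
mu_hat = [3.0, 2.0, 4.0, 3.0]

terms = aipw_terms(y, a_obs, 1, pi_hat, mu_hat)
estimate = aipw_mean(y, a_obs, 1, pi_hat, mu_hat)
```

For subjects who did not receive treatment `a`, the weight is zero and the term reduces to the model prediction `mu_hat[i]`; for those who did, the inverse-probability-weighted outcome is corrected by the prediction.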
Adaptive Contrast Weighted Learning (ACWL) with T > 1
For stage T, the assumptions and method derivation are the same
as in a single stage scenario.
For stage j, T − 1 ≥ j ≥ 1, we incorporate Q-learning to conduct
backward induction. The stage-specific pseudo outcome POj for
estimating treatment effect ordering and adaptive contrasts is a
predicted counterfactual outcome under optimal treatments at all
future stages, i.e.,
$$PO_j = E\left\{Y^*(A_1, \dots, A_j, g^{opt}_{j+1}, \dots, g^{opt}_T)\right\}.$$
Replace Y with POj to apply the same method as in T = 1.
Simulation Results
[Figure: simulation results comparing Q-learning, BOWL, and ACWL at Stage 1 and Stage 2]
Tree-based Reinforcement Learning (T-RL): Motivation
Tao, Wang, and Almirall (2018), Annals of Applied Statistics
Supervised learning (SL):
True label is known
Then tree-based methods are easy to understand and interpret
without distributional assumptions for data, e.g., Classification
and Regression Tree (CART)
In a DTR problem, label (the optimal treatment) at each stage is not
directly observed
ACWL uses semiparametric regression to estimate the label first
and then converts RL to SL to apply existing classification methods
such as CART
T-RL: Methodological Basis and Feature
Batch-mode RL using a sequence of unsupervised decision trees with
backward induction
Maintain the RL feature without the extra step of conversion to
SL, and thus reduce estimation uncertainty (improvement upon
ACWL)
Incorporate multiple treatment comparisons directly, to improve
efficiency (improvement upon ACWL)
Also allow multi-categorical and ordinal treatments
Similar to ACWL, we expect T-RL to be robust and efficient by
combining robust semiparametric regression estimators with
nonparametric machine learning methods.
Example of an Unsupervised Decision Tree
The selected split at each node must improve the expected
counterfactual outcome.
Nodes Ωm, m = 1, 2, . . . , are regions defined by subsets of the covariate space.
Tree-based RL (T-RL): Purity Measures
Use the purity measure to improve the expected counterfactual
outcome and apply doubly robust AIPW estimators as in ACWL.
For a given partition ω and ωc of node Ω, let gj,ω,a1,a2 denote the
decision rule that assigns treatment a1 to subjects in ω and
treatment a2 to subjects in ωc at stage j.
We define the purity measure $P_j(\Omega, \omega)$ as
$$P_j(\Omega, \omega) = \max_{a_1, a_2 \in \mathcal{A}_j} P_n\left[\sum_{a_j=1}^{K_j} \hat\mu^{AIPW}_{j,a_j}(H_j)\, I\{g_{j,\omega,a_1,a_2}(H_j) = a_j\}\, I(H_j \in \Omega)\right].$$
$P_j(\Omega, \omega)$ evaluates the performance of the best decision rule that
assigns a single treatment to each of the two arms of the partition.
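The purity measure can be evaluated by brute force over all treatment pairs. The sketch below mirrors the formula term by term; it assumes the subject-level AIPW estimates are already available as a table, uses 0-based treatment indices, and all names and numbers are hypothetical.

```python
def purity(mu_aipw, in_omega, in_node):
    """Evaluate P_j(Omega, omega) for one candidate partition.

    mu_aipw[i][a] : AIPW estimate for subject i under treatment a
    in_omega[i]   : True if subject i falls in omega, False if in its complement
    in_node[i]    : True if subject i belongs to the node Omega
    Returns (purity, a1, a2), where a1 is given in omega and a2 in its complement.
    """
    n = len(mu_aipw)
    k = len(mu_aipw[0])
    best = None
    for a1 in range(k):
        for a2 in range(k):
            total = 0.0
            for i in range(n):
                if in_node[i]:  # I(H_j in Omega)
                    total += mu_aipw[i][a1] if in_omega[i] else mu_aipw[i][a2]
            value = total / n   # P_n averages over all n subjects
            if best is None or value > best[0]:
                best = (value, a1, a2)
    return best


# Hypothetical AIPW estimates: 4 subjects, 2 treatments.
mu_aipw = [[1.0, 3.0], [2.0, 4.0], [5.0, 1.0], [6.0, 2.0]]
in_omega = [True, True, False, False]
in_node = [True, True, True, True]

result = purity(mu_aipw, in_omega, in_node)
```

Because the rule assigns one treatment per arm, the maximization separates: the best `a1` depends only on subjects in `omega` and the best `a2` only on those in its complement, so a faster version could maximize each arm on its own.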
T-RL: Recursive Partitioning
The best split ωopt is chosen to maximize the improvement in the
purity, Pj(Ω, ω) − Pj(Ω).
Stopping Rules, given minimum node size n0 and minimum
purity improvement λ
1 If the node size is less than 2n0, the node will not be split.
2 If all possible splits of a node result in a child node with size smaller
than n0, the node will not be split.
3 If the current tree depth reaches the user-specified maximum depth,
the tree growing process will stop.
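4 If the maximum purity improvement $P_j(\Omega, \hat\omega^{opt}) - P_j(\Omega)$ is less
than $\lambda$, the node will not be split, where
$$\hat\omega^{opt} = \arg\max_{\omega} \left[P_j(\Omega, \omega) : \min\{n P_n I(H_j \in \omega),\, n P_n I(H_j \in \omega^c)\} \ge n_0\right].$$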
T-RL Algorithm
1 Start with stage j = T.
2 Obtain AIPW estimates $\hat\mu^{AIPW}_{j,a_j}(H_j)$, $a_j = 1, \dots, K_j$.
3 At root node $\Omega_{j,m}$, $m = 1$, set values for $\lambda$ and $n_0$.
4 At node $\Omega_{j,m}$, evaluate the four Stopping Rules. If any of them
is satisfied, assign the single best treatment
$$\arg\max_{a_j \in \mathcal{A}_j} P_n\left[\hat\mu^{AIPW}_{j,a_j}(H_j)\, I(H_j \in \Omega_{j,m})\right]$$
to all subjects in $\Omega_{j,m}$. Otherwise, split $\Omega_{j,m}$ into child nodes
$\Omega_{j,2m}$ and $\Omega_{j,2m+1}$ by $\hat\omega^{opt}$.
5 Set m = m + 1 and repeat Step 4 until all nodes are terminal.
6 If j > 1, set j = j − 1 and repeat steps 2 to 5. If j = 1, stop.
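Steps 3 to 5 for a single stage can be sketched as a recursive tree builder. This is an illustrative sketch, not the authors' implementation: it assumes the AIPW estimates from Step 2 are supplied as a table, considers only splits of the form "covariate <= cutoff", uses recursion instead of the node index m, takes the unsplit purity to be the value of the single best treatment in the node, and indexes treatments from 0. All function names are hypothetical. Backward induction over stages (Step 6) is not shown.

```python
def node_purity(mu, idx, n):
    """Value and index of the single best treatment for subjects idx."""
    k = len(mu[0])
    sums = [sum(mu[i][a] for i in idx) / n for a in range(k)]
    a_best = max(range(k), key=lambda a: sums[a])
    return sums[a_best], a_best


def best_split(x, mu, idx, n, n0):
    """Search covariates and cutoffs; each child must keep >= n0 subjects."""
    best = None  # (purity, covariate, cutoff, left indices, right indices)
    for p in range(len(x[0])):
        for cut in sorted({x[i][p] for i in idx}):
            left = [i for i in idx if x[i][p] <= cut]
            right = [i for i in idx if x[i][p] > cut]
            if min(len(left), len(right)) < n0:
                continue
            # The max over (a1, a2) separates into one max per arm.
            value = node_purity(mu, left, n)[0] + node_purity(mu, right, n)[0]
            if best is None or value > best[0]:
                best = (value, p, cut, left, right)
    return best


def grow(x, mu, idx, n, n0, lam, max_depth, depth=0):
    """Grow one stage's tree, applying the four stopping rules."""
    base, a_best = node_purity(mu, idx, n)
    leaf = {"treatment": a_best}
    if len(idx) < 2 * n0 or depth >= max_depth:   # rules 1 and 3
        return leaf
    split = best_split(x, mu, idx, n, n0)
    if split is None or split[0] - base < lam:    # rules 2 and 4
        return leaf
    _, p, cut, left, right = split
    return {
        "covariate": p,
        "cutoff": cut,
        "left": grow(x, mu, left, n, n0, lam, max_depth, depth + 1),
        "right": grow(x, mu, right, n, n0, lam, max_depth, depth + 1),
    }


def predict(tree, xi):
    """Follow the splits down to a leaf and return its treatment."""
    while "treatment" not in tree:
        go_left = xi[tree["covariate"]] <= tree["cutoff"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["treatment"]


# Hypothetical data: treatment 1 is better when x < 0.5, treatment 0 otherwise.
x = [[0.1], [0.2], [0.3], [0.4], [0.6], [0.7], [0.8], [0.9]]
mu = [[0.0, 1.0]] * 4 + [[1.0, 0.0]] * 4
tree = grow(x, mu, list(range(len(x))), n=len(x), n0=2, lam=0.01, max_depth=3)
```

On this toy input the root is split once and neither child is split further, because no admissible split of a child improves its purity by at least `lam`.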
Simulation 1: Tree-type DTR
Simulation 2: Non-tree-type DTR
Summary
ACWL transforms batch-mode RL to SL and incorporates existing
classification methods: flexible and easy to implement.
T-RL directly deals with batch-mode RL and uses unsupervised
trees to learn optimal DTRs with purity measures maximizing the
counterfactual mean outcome.
ACWL and T-RL both combine semiparametric regression with
machine learning and can handle multiple stages and multiple
treatments.
Overall, both methods work well across different scenarios. T-RL
has slightly better robustness. T-RL works better for tree-type
DTRs while ACWL works better for non-tree-type ones.
Other Ongoing Research Projects
Stochastic Tree-based Reinforcement Learning (ST-RL) for
estimating optimal DTRs
Explore different structures of optimal DTRs
with restricted arms from observational data
with restricted tailoring variables
Some special DTR frameworks: e.g. Nested Test-and-Treat DTRs
Multivariate utility function: multiple competing clinical priorities
Continuous treatment options, e.g., radiation doses
Mobile health: continuous decisions and data updating
Acknowledgment
My Ph.D. students:
Yebin Tao, Yilun Sun, Nina Zhou, Ming Tang, Kelly Speth,
Jincheng Shen, Mochuan Liu, Yingchao Zhong.
My collaborators:
Daniel Almirall, Jeremy Taylor, Peter Thall, Stewart Wang.
Thank You!
luwang@umich.edu
What is Model Inheritance in Odoo 17 ERPCeline George
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 

Recently uploaded (20)

ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 

Causal Inference Opening Workshop - New Statistical Learning Methods for Estimating the Optimal Dynamic Treatment Regime - Lu Wang, December 10, 2019

  • 1. New Statistical Learning Methods for Estimating the Optimal Dynamic Treatment Regime Lu Wang Associate Professor University of Michigan, Ann Arbor luwang@umich.edu December 10, 2019 Lu Wang ACWL and T-RL for Optimal DTR December 10, 2019 1 / 27
  • 2. Overview: (1) Introduction: Dynamic Treatment Regime (DTR); (2) Adaptive Contrast Weighted Learning (ACWL); (3) Tree-based Reinforcement Learning (T-RL); (4) Summary and Other Ongoing Research
  • 3. Motivating Example: Cancer Management Esophageal Cancer Patients, MD Anderson, 1998 to 2012
  • 4. Motivation of Dynamic Treatment Regimes (DTRs). Chronic disease management: routinely adjust, change, add, or discontinue treatment based on progress, side effects, patient burden, compliance, etc. → Dynamic Treatment Regimes (DTRs). Goals: to account for patient heterogeneity and to personalize health care strategy. Degree of tailoring on inherent characteristics: One Size Fits All → Targeted/Individualized. Degree of tailoring on time-varying characteristics: Once and for All → Dynamic.
  • 7. Evidence-based Personalized Health Care. To provide meaningfully improved health outcomes for patients by delivering the right drug with the right dose at the right time. Degree of tailoring: from one size fits all (low predictability, suboptimal health outcomes) to individualized therapy (better predictability, improved health outcomes). Assess patient response heterogeneity; stratify patient population; optimize benefits/risk, etc. Figure labels: Family History, Medical, Genetics, Lifestyle, Environment, Engine.
  • 8. Key Notation. Stage $j$, $j = 1, \ldots, T$, with $K_j$ ($\geq 2$) treatment options. $A_j$: treatment at stage $j$, with value $a_j \in \mathcal{A}_j = \{1, \ldots, K_j\}$. $X_j$: covariate history just prior to assigning $A_j$. $Y$: overall outcome of interest. DTR $g = (g_1, \ldots, g_T)$: a set of decision rules, with $g_j$ mapping the domain of $H_j = (A_1, \ldots, A_{j-1}, X_j)$ to $\mathcal{A}_j$. $Y^*(a)$: counterfactual outcome under treatment $a$.
  • 9. Assumptions for Identifiability using Observational Data. Use the causal framework to assess DTRs under three assumptions (Murphy et al. 2001; Robins and Hernan, 2009). (1) Consistency (no interference): if a patient's observed treatment history is compatible with a given DTR, his or her observed clinical outcomes are the same as the counterfactual ones under the DTR. (2) No Unmeasured Confounding: the treatment decision at a given time is independent of future observations and counterfactual outcomes, conditional on all previous covariate and treatment history. (3) Positivity: with probability one, every subject follows a specific DTR with a positive probability that is bounded away from 0.
  • 10. Adaptive Contrast Weighted Learning (ACWL) (Tao and Wang, 2017, Biometrics). For $T = 1$:
    $$g^{opt} = \arg\max_{g \in \mathcal{G}} E_H\Big[\sum_{a=1}^{K} E\{Y^*(a) \mid H\}\, I\{g(H) = a\}\Big].$$
    Under Causal Assumptions 1-3, we have
    $$g^{opt} = \arg\max_{g \in \mathcal{G}} E_H\Big[\sum_{a=1}^{K} \mu_a(H)\, I\{g(H) = a\}\Big],$$
    where $\mu_a(H) \triangleq E(Y \mid A = a, H)$. Incorporating the order statistics $\mu_{(1)}(H) \leq \cdots \leq \mu_{(K)}(H)$, we then have
    $$g^{opt} = \arg\max_{g \in \mathcal{G}} E_H\Big[\sum_{a=1}^{K} \mu_{(a)}(H)\, I\{g(H) = l_a(H)\}\Big],$$
    where $\mu_{(a)}(H) = \mu_{l_a}(H)$.
  • 11. Property of $g^{opt}$:
    $$g^{opt} = \arg\min_{g \in \mathcal{G}} E_H\Big[\sum_{a=1}^{K-1} \{\mu_{(K)}(H) - \mu_{(a)}(H)\}\, I\{g(H) = l_a(H)\}\Big].$$
    $g^{opt}$ minimizes the expected loss in the outcome due to sub-optimal treatments in the entire population of interest. It classifies as many patients as possible to their optimal treatment $l_K$ (i.e., letting $I\{g(H) = l_a(H)\} = 0$, $a = 1, \ldots, K - 1$) while putting more emphasis on patients with larger contrasts (i.e., larger $\mu_{(K)}(H) - \mu_{(a)}(H)$).
  • 12. Bounds of the Objective Function. Let $C_1(H) \triangleq \mu_{(K)}(H) - \mu_{(K-1)}(H)$ and $C_2(H) \triangleq \mu_{(K)}(H) - \mu_{(1)}(H)$. Then, for $a = 1, \ldots, K - 1$,
    $$0 \leq C_1(H) \leq \mu_{(K)}(H) - \mu_{(a)}(H) \leq C_2(H).$$
    Therefore,
    $$E_H\big[C_1(H)\, I\{g(H) \neq l_K(H)\}\big] \leq E_H\Big[\sum_{a=1}^{K-1} \{\mu_{(K)}(H) - \mu_{(a)}(H)\}\, I\{g(H) = l_a(H)\}\Big] \leq E_H\big[C_2(H)\, I\{g(H) \neq l_K(H)\}\big].$$
  • 13. Adaptive Contrasts for Transformation to Weighted Classification. The contrasts $C_1$ and $C_2$ are the minimum and maximum expected losses in the outcome given sub-optimal treatments, adaptive to each patient's own treatment effect ordering. In the best case, where sub-optimal treatments lead only to minimal expected losses in the outcome, $g^{opt}$ is equal to
    $$\arg\min_{g \in \mathcal{G}} E_H\big[C_1(H)\, I\{g(H) \neq l_K(H)\}\big].$$
    In the worst case, where sub-optimal treatments all lead to maximal expected losses in the outcome, $g^{opt}$ is equal to
    $$\arg\min_{g \in \mathcal{G}} E_H\big[C_2(H)\, I\{g(H) \neq l_K(H)\}\big].$$
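The adaptive contrasts and the weighted classification loss on this slide can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes an `(n, K)` array `mu` of estimated conditional means is already available, codes treatments as `0..K-1` rather than `1..K`, and reads the indicator in the objective as a misclassification indicator, $I\{g(H) \neq l_K(H)\}$. The function names are hypothetical.

```python
import numpy as np

def adaptive_contrasts(mu):
    """Return (best_arm, c1, c2) per subject from an (n, K) array of estimated means.

    best_arm: index of the largest mean, i.e. l_K(H) (0-based, an assumption)
    c1:       largest mean minus second-largest mean, i.e. C_1(H)
    c2:       largest mean minus smallest mean, i.e. C_2(H)
    """
    mu = np.asarray(mu, dtype=float)
    sorted_mu = np.sort(mu, axis=1)          # ascending order statistics per subject
    c1 = sorted_mu[:, -1] - sorted_mu[:, -2]
    c2 = sorted_mu[:, -1] - sorted_mu[:, 0]
    best_arm = np.argmax(mu, axis=1)
    return best_arm, c1, c2

def weighted_misclassification(assigned, best_arm, weights):
    """Empirical version of E_H[C(H) I{g(H) != l_K(H)}] for a candidate rule."""
    assigned = np.asarray(assigned)
    return float(np.mean(weights * (assigned != best_arm)))

# Toy example: two subjects, three treatments.
mu = [[1.0, 3.0, 2.0],
      [5.0, 0.0, 4.0]]
best_arm, c1, c2 = adaptive_contrasts(mu)
loss = weighted_misclassification([1, 1], best_arm, c2)
```

Any classifier that accepts per-subject weights could then be trained with `best_arm` as the label and `c1` or `c2` as the weight.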
  • 14. Estimate $\mu_A(H)$ to Construct $C_1(H)$ and $C_2(H)$. Given the estimated propensity score $\hat\pi_a(H)$, the AIPW estimator $\hat\mu_a^{AIPW}$ for $E\{Y^*(a)\}$ is
    $$\hat\mu_a^{AIPW} = \mathbb{P}_n\Big[\frac{I(A = a)}{\hat\pi_a(H)}\, Y + \Big\{1 - \frac{I(A = a)}{\hat\pi_a(H)}\Big\}\hat\mu_a(H)\Big],$$
    where $\mathbb{P}_n$ denotes the empirical average. Lemma (Double Robustness): $\hat\mu_a^{AIPW}$ is a consistent estimator of $E\{Y^*(a)\}$ if either the propensity model $\pi_a(H)$ or the conditional mean model $\mu_a(H)$ is correctly specified. At a subject level,
    $$\hat\mu_a^{AIPW}(H) = \frac{I(A = a)}{\hat\pi_a(H)}\, Y + \Big\{1 - \frac{I(A = a)}{\hat\pi_a(H)}\Big\}\hat\mu_a(H).$$
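As a rough sketch of the subject-level AIPW formula on this slide, the function below computes the estimate for every subject and every treatment arm at once. It assumes the propensity scores and conditional means have already been fitted by some model and are passed in as `(n, K)` arrays; treatments are coded `0..K-1` here for array indexing, and the name `aipw_subject_level` is hypothetical.

```python
import numpy as np

def aipw_subject_level(y, a, propensity, mu_hat):
    """Subject-level AIPW estimates, one column per treatment arm.

    y:          (n,) observed outcomes
    a:          (n,) observed treatments coded 0..K-1 (an assumption of this sketch)
    propensity: (n, K) estimated P(A = a | H), assumed bounded away from 0
    mu_hat:     (n, K) estimated E(Y | A = a, H)
    """
    y = np.asarray(y, dtype=float)
    a = np.asarray(a)
    propensity = np.asarray(propensity, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    k = mu_hat.shape[1]
    # indicator[i, arm] = 1 if subject i actually received that arm
    indicator = (a[:, None] == np.arange(k)[None, :]).astype(float)
    ratio = indicator / propensity
    return ratio * y[:, None] + (1.0 - ratio) * mu_hat

# Toy example: two subjects, two arms, constant propensity 0.5.
est = aipw_subject_level(
    y=[2.0, 4.0],
    a=[0, 1],
    propensity=[[0.5, 0.5], [0.5, 0.5]],
    mu_hat=[[1.0, 3.0], [1.0, 3.0]],
)
population_means = est.mean(axis=0)  # empirical average over subjects, per arm
```

For the arm a subject did not receive, the estimate falls back to the regression prediction; for the received arm, the prediction is corrected by the inverse-probability-weighted residual.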
  • 15. Adaptive Contrast Weighted Learning (ACWL) with $T > 1$. For stage $T$, the assumptions and method derivation are the same as in the single-stage scenario. For stage $j$, $T - 1 \geq j \geq 1$, we incorporate Q-learning to conduct backward induction. The stage-specific pseudo-outcome $PO_j$ for estimating the treatment effect ordering and adaptive contrasts is a predicted counterfactual outcome under optimal treatments at all future stages, i.e.,
    $$PO_j = E\big\{Y^*(A_1, \ldots, A_j, g^{opt}_{j+1}, \ldots, g^{opt}_T)\big\}.$$
    Replace $Y$ with $PO_j$ to apply the same method as in $T = 1$.
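One simple way to picture the backward-induction step is sketched below. It assumes that a stage-$(j+1)$ regression has already produced an `(n, K)` array of predicted outcomes, one column per treatment, and that the estimated optimal rule for stage $j+1$ is available as a vector of 0-based arm indices. Both inputs and the name `pseudo_outcome` are assumptions of this sketch; the talk does not specify how the prediction is computed.

```python
import numpy as np

def pseudo_outcome(mu_hat_next, g_next):
    """Predicted outcome for each subject under the estimated optimal next-stage arm.

    mu_hat_next: (n, K) predicted outcomes at stage j+1, one column per arm
    g_next:      (n,) estimated optimal arm at stage j+1 (0-based, assumed)
    """
    mu_hat_next = np.asarray(mu_hat_next, dtype=float)
    g_next = np.asarray(g_next, dtype=int)
    rows = np.arange(mu_hat_next.shape[0])
    return mu_hat_next[rows, g_next]

# Toy example: two subjects, two arms.
po = pseudo_outcome([[1.0, 3.0], [5.0, 2.0]], [1, 0])
```

The resulting vector would take the place of the observed outcome when the single-stage method is applied at stage $j$.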
  • 16. Simulation Results [figure: Stage 1 and Stage 2 panels comparing Q-learning, BOWL, and ACWL]
  • 17. Tree-based Reinforcement Learning (T-RL): Motivation (Tao, Wang and Almirall, 2018, Annals of Applied Statistics). Supervised learning (SL): the true label is known, and tree-based methods, e.g., Classification and Regression Tree (CART), are easy to understand and interpret without distributional assumptions for the data. In a DTR problem, the label (the optimal treatment) at each stage is not directly observed. ACWL uses semiparametric regression to estimate the label first and then converts RL to SL in order to apply existing classification methods such as CART.
  • 18. T-RL: Methodological Basis and Features. Batch-mode RL using a sequence of unsupervised decision trees with backward induction. Maintains the RL feature without the extra step of conversion to SL, and thus reduces estimation uncertainty (improvement upon ACWL). Incorporates multiple treatment comparisons directly, to improve efficiency (improvement upon ACWL). Also allows multi-categorical and ordinal treatments. Similar to ACWL, we expect T-RL to be robust and efficient by combining robust semiparametric regression estimators with nonparametric machine learning methods.
  • 19. Example of an Unsupervised Decision Tree. The selected split at each node must improve the expected counterfactual outcome. Nodes $\Omega_m$, $m = 1, 2, \ldots$, are regions defined by subsets of the covariate space.
  • 20. Tree-based RL (T-RL): Purity Measures. Use the purity measure to improve the expected counterfactual outcome, and apply doubly robust AIPW estimators as in ACWL. For a given partition $\omega$ and $\omega^c$ of node $\Omega$, let $g_{j,\omega,a_1,a_2}$ denote the decision rule that assigns treatment $a_1$ to subjects in $\omega$ and treatment $a_2$ to subjects in $\omega^c$ at stage $j$. We define the purity measure $P_j(\Omega, \omega)$ as
    $$\max_{a_1, a_2 \in \mathcal{A}_j} \mathbb{P}_n\Big[\sum_{a_j=1}^{K_j} \hat\mu^{AIPW}_{j,a_j}(H_j)\, I\{g_{j,\omega,a_1,a_2}(H_j) = a_j\}\, I(H_j \in \Omega)\Big].$$
    $P_j(\Omega, \omega)$ evaluates the performance of the best decision rule that assigns a single treatment to each of the two arms under the partition.
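The purity measure on this slide can be computed directly once subject-level AIPW estimates are in hand. The sketch below assumes an `(n, K)` array `aipw` of such estimates and two boolean masks marking membership in the node and in the left part of the partition. Because the two sides may be assigned treatments independently, the joint maximum over $(a_1, a_2)$ separates into one maximum per side. The function name `split_purity` is hypothetical.

```python
import numpy as np

def split_purity(aipw, in_node, in_left):
    """Purity of splitting a node into (left, right) when each side gets one treatment.

    aipw:    (n, K) subject-level AIPW estimates for the current stage
    in_node: (n,) boolean mask, True for subjects in the node
    in_left: (n,) boolean mask, True for subjects on the left side of the split
    The average is taken over all n subjects, matching P_n with I(H in node).
    """
    aipw = np.asarray(aipw, dtype=float)
    in_node = np.asarray(in_node, dtype=bool)
    in_left = np.asarray(in_left, dtype=bool)
    n = aipw.shape[0]
    left_sums = aipw[in_node & in_left].sum(axis=0)    # per-arm totals on the left
    right_sums = aipw[in_node & ~in_left].sum(axis=0)  # per-arm totals on the right
    return float((left_sums.max() + right_sums.max()) / n)

# Toy example: arm 0 is better for the first two subjects, arm 1 for the last two.
aipw = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
all_in = [True, True, True, True]
good = split_purity(aipw, all_in, [True, True, False, False])
poor = split_purity(aipw, all_in, [True, False, False, False])
```

A split that separates the two groups of subjects cleanly attains a higher purity than one that mixes them.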
  • 21. T-RL: Recursive Partitioning. The best split $\omega^{opt}$ is chosen to maximize the improvement in purity, $P_j(\Omega, \omega) - P_j(\Omega)$. Stopping Rules, given minimum node size $n_0$ and minimum purity improvement $\lambda$: (1) If the node size is less than $2n_0$, the node will not be split. (2) If all possible splits of a node result in a child node with size smaller than $n_0$, the node will not be split. (3) If the current tree depth reaches the user-specified maximum depth, the tree growing process will stop. (4) If the maximum purity improvement $P_j(\Omega, \hat\omega^{opt}) - P_j(\Omega)$ is less than $\lambda$, the node will not be split, where
    $$\hat\omega^{opt} = \arg\max_{\omega} \big[P_j(\Omega, \omega) : \min\{n\mathbb{P}_n I(H_j \in \omega),\, n\mathbb{P}_n I(H_j \in \omega^c)\} \geq n_0\big].$$
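A hedged sketch of the split search with Stopping Rules 1, 2 and 4 is given below for a single numeric covariate. It assumes threshold splits of the form `x <= t`, an `(n, K)` array of AIPW estimates, and a node purity $P_j(\Omega)$ defined as the best single-treatment value within the node; that last definition and all function names are assumptions of this illustration. Rule 3 (maximum depth) belongs to the caller that grows the tree and is not shown.

```python
import numpy as np

def node_purity(aipw, in_node):
    """Best value from assigning one treatment to the whole node (assumed P_j(node))."""
    return aipw[in_node].sum(axis=0).max() / aipw.shape[0]

def two_sided_purity(aipw, left, right):
    """Purity when the left and right parts each receive their own best treatment."""
    n = aipw.shape[0]
    return (aipw[left].sum(axis=0).max() + aipw[right].sum(axis=0).max()) / n

def best_split(x, aipw, in_node, n0, lam):
    """Return (threshold, improvement) for the best split, or None if no split is made."""
    x = np.asarray(x, dtype=float)
    aipw = np.asarray(aipw, dtype=float)
    in_node = np.asarray(in_node, dtype=bool)
    if in_node.sum() < 2 * n0:                      # Rule 1: node too small
        return None
    base = node_purity(aipw, in_node)
    best = None
    for t in np.unique(x[in_node])[:-1]:            # candidate thresholds
        left = in_node & (x <= t)
        right = in_node & (x > t)
        if min(left.sum(), right.sum()) < n0:       # child smaller than n0
            continue
        gain = two_sided_purity(aipw, left, right) - base
        if best is None or gain > best[1]:
            best = (float(t), float(gain))
    if best is None or best[1] < lam:               # Rules 2 and 4
        return None
    return best

x = [0.0, 1.0, 2.0, 3.0]
aipw = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
node = [True, True, True, True]
chosen = best_split(x, aipw, node, n0=1, lam=0.01)
too_small = best_split(x, aipw, node, n0=3, lam=0.01)
too_weak = best_split(x, aipw, node, n0=1, lam=0.6)
```

In the toy data, splitting at `x <= 1` separates the subjects who benefit from each arm, so it yields the largest improvement.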
  • 22. T-RL Algorithm. (1) Start with stage $j = T$. (2) Obtain AIPW estimates $\hat\mu^{AIPW}_{j,a_j}(H_j)$, $a_j = 1, \ldots, K_j$. (3) At root node $\Omega_{j,m}$, $m = 1$, set values for $\lambda$ and $n_0$. (4) At node $\Omega_{j,m}$, evaluate the four Stopping Rules. If any of them is satisfied, assign a single best treatment $\arg\max_{a_j \in \mathcal{A}_j} \mathbb{P}_n[\hat\mu^{AIPW}_{j,a_j}(H_j)\, I(H_j \in \Omega_{j,m})]$ to all subjects in $\Omega_{j,m}$. Otherwise, split $\Omega_{j,m}$ into child nodes $\Omega_{j,2m}$ and $\Omega_{j,2m+1}$ by $\hat\omega^{opt}$. (5) Set $m = m + 1$ and repeat Step 4 until all nodes are terminal. (6) If $j > 1$, set $j = j - 1$ and repeat Steps 2 to 5. If $j = 1$, stop.
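Steps 3 to 5 for a single stage can be sketched as a small recursive tree builder. This is an illustration under several assumptions, not the authors' code: it takes the stage's AIPW estimates as a precomputed `(n, K)` array, uses threshold splits on numeric covariates, grows the tree by recursion instead of the node index $m$, and codes treatments from 0. The outer loop over stages (Steps 1, 2 and 6) and the computation of pseudo-outcomes are omitted. The name `grow_tree` and the dictionary layout are hypothetical.

```python
import numpy as np

def grow_tree(X, aipw, in_node=None, depth=0, n0=2, lam=0.01, max_depth=3):
    """Grow one stage's decision tree; leaves hold the single best treatment."""
    X = np.asarray(X, dtype=float)
    aipw = np.asarray(aipw, dtype=float)
    n = aipw.shape[0]
    if in_node is None:
        in_node = np.ones(n, dtype=bool)            # root node contains everyone
    arm_sums = aipw[in_node].sum(axis=0)
    leaf = {"treatment": int(arm_sums.argmax())}    # best single treatment in node
    # Stopping Rules 3 and 1: depth limit reached or node too small to split
    if depth >= max_depth or in_node.sum() < 2 * n0:
        return leaf
    base = arm_sums.max() / n
    best = None                                     # (gain, feature, threshold)
    for f in range(X.shape[1]):
        for t in np.unique(X[in_node, f])[:-1]:
            left = in_node & (X[:, f] <= t)
            right = in_node & (X[:, f] > t)
            if min(left.sum(), right.sum()) < n0:
                continue
            purity = (aipw[left].sum(axis=0).max()
                      + aipw[right].sum(axis=0).max()) / n
            gain = purity - base
            if best is None or gain > best[0]:
                best = (gain, f, float(t))
    # Stopping Rules 2 and 4: no admissible split or improvement below lambda
    if best is None or best[0] < lam:
        return leaf
    _, f, t = best
    return {
        "feature": f,
        "threshold": t,
        "left": grow_tree(X, aipw, in_node & (X[:, f] <= t), depth + 1, n0, lam, max_depth),
        "right": grow_tree(X, aipw, in_node & (X[:, f] > t), depth + 1, n0, lam, max_depth),
    }

X = [[0.0], [1.0], [2.0], [3.0]]
aipw = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
tree = grow_tree(X, aipw, n0=1, lam=0.01, max_depth=2)
```

On the toy data the tree makes one split on the single covariate and then stops, because further splits bring no purity improvement.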
  • 23. Simulation 1: Tree-type DTR
  • 24. Simulation 2: Non-tree-type DTR
  • 25. Summary. ACWL transforms batch-mode RL to SL and incorporates existing classification methods: flexible and easy to implement. T-RL directly deals with batch-mode RL and uses unsupervised trees to learn optimal DTRs with purity measures maximizing the counterfactual mean outcome. ACWL and T-RL both combine semiparametric regression with machine learning and can handle multiple stages and multiple treatments. Overall, both methods work well across different scenarios. T-RL has slightly better robustness. T-RL works better for tree-type DTRs while ACWL works better for non-tree-type ones.
  • 26. Other Ongoing Research Projects. Stochastic Tree-based Reinforcement Learning (ST-RL) for estimating optimal DTRs. Explore different structures of optimal DTRs with restricted arms from observational data with restricted tailoring variables. Some special DTR frameworks, e.g., Nested Test-and-Treat DTRs. Multivariate utility function: multiple competing clinical priorities. Continuous treatment options, e.g., radiation doses. Mobile health: continuous decisions and data updating.
  • 27. Acknowledgment My Ph.D. students: Yebin Tao, Yilun Sun, Nina Zhou, Ming Tang, Kelly Speth, Jincheng Shen, Mochuan Liu, Yingchao Zhong. My collaborators: Daniel Almirall, Jeremy Taylor, Peter Thall, Stewart Wang. Thank You! luwang@umich.edu