Sequential Multiple Assignment Randomized Trials
SMARTs: An Introduction
John Sperger
2023-11-30
Outline
Sequential Multiple Assignment Randomized Trials
Precision Health & Medicine
SMARTs Redux
Analysis of SMART Data
Designing a SMART
The BEST Trial
1/48
Learning Objectives
At the end of today’s lecture you should be able to:
• Explain how clinical considerations motivated SMART designs.
• Create example research questions that could be answered by a SMART but not
other designs.
• Contrast SMARTs with other trial designs, especially cross-over designs.
• Explain how Q-learning can be applied to analyze data from a SMART.
I also want to instill:
• Excitement about the kinds of clinical and scientific questions that can be
answered by SMARTs.
• A healthy respect for the complexity of SMARTs¹ and recognition that, like all
trial designs (perhaps excluding the 3+3 design), they are appropriate for
specific goals and not a panacea.
¹ In the way you can still enjoy swimming in the ocean in spite of the myriad ways it could kill you.
2/48
Sequential Multiple Assignment Randomized Trial
Definition: A SMART is a randomized trial where some or all participants are
randomized at two or more decision points.
• Sequential: participants are followed over multiple time periods usually
without a washout period.
• Multiple Assignment: participants (may) receive multiple treatments during
the course of the study.
• Randomized: treatment assignment is randomized for at least some patients at
each decision point.
• Trial: left as an exercise to the reader.
3/48
Example SMART – Partially Deterministic²
² Kosorok, Michael R. and Moodie, Erica E. M., eds. Adaptive Treatment Strategies in Practice:
Planning Trials and Analyzing Data for Personalized Medicine. Vol. 21. ASA-SIAM Series on Statistics
and Applied Mathematics. SIAM, 2015.
4/48
Example SMART – Fully Randomized³
³ Kosorok and Moodie, eds., Adaptive Treatment Strategies in Practice: Planning
Trials and Analyzing Data for Personalized Medicine.
5/48
SMARTs - The big idea
• Identify critical decision making points over the course of treatment.
• Randomize decisions according to realistic clinical options at those decision
points.
6/48
Clinical Motivations for SMARTs
Mimic real-world clinical decision making
Many health issues involve multiple decisions made over time according to either a
fixed schedule or key events that necessitate a decision (e.g. disease progression).
Multiple options at each decision point (do nothing is always an option).
Clinicians try to make the best decision using their expert judgment based on:
• the patient’s medical history.
• the treatment options including efficacy, side effect burden, and cost.
• patient preferences.
7/48
Statistical Motivations for SMARTs
SMARTs can avoid the causal issues with observational longitudinal data.
• The sequential ignorability assumption, also called the sequential randomization
assumption (SRA), becomes increasingly implausible over time.⁴
• Even in large datasets certain sequences of treatments may be too rare to estimate
reliably (positivity violation). Because healthcare systems, insurance, regulations,
etc. differ, a rarely observed treatment sequence is not necessarily suboptimal.
⁴ Informally, you can think of this as the longitudinal analogue of the no-unmeasured-confounding
assumption.
8/48
Common Clinical Settings
Mental Health: ADHD, bipolar disorder, depression, OCD, schizophrenia,
suicide prevention
Substance Use Disorders
Chronic Diseases
Oncology⁵
General Well-being: Mobile Health
⁵ Giulia Lorenzoni et al. “Use of Sequential Multiple Assignment Randomized Trials (SMARTs) in
Oncology: Systematic Review of Published Studies”. In: British Journal of Cancer 128.7 (7 Mar. 2023),
pp. 1177–1188.
9/48
Precision Health⁶ & Medicine
Statistical/mathematical attempt to formalize clinical decision making based on
patient characteristics and apply evidence-based decision making.
Unofficial motto: “the right treatment for the right patient given at the right time”
Tailoring treatments to patients is not new, but recent developments in trial designs,
estimation methods, and data collection (EHRs, sensors) have made it possible to
consider smaller subgroups of patients.
⁶ “Precision Health” is a newer term that emphasizes the determinants of health that go beyond the
clinical setting.
11/48
Treatment Policies or Dynamic Treatment Regimes (DTRs)
A treatment policy $\pi$ is a function that maps contexts to actions, $\pi : \mathcal{X} \to \mathcal{A}$.
Most commonly referred to as a dynamic treatment regime (DTR) in the statistical
literature. Other common names are treatment rule and individualized treatment
rule.
I’ve included a slide in the appendix with many more terms that have been used (Appendix slide 7).
12/48
Single Stage Treatment Policy⁷
⁷ Sophia K. Smith et al. “A SMART Approach to Optimizing Delivery of an mHealth Intervention
among Cancer Survivors with Posttraumatic Stress Symptoms”. In: Contemporary Clinical Trials 110
(Nov. 2021), p. 106569.
13/48
Two-stage Treatment Policy⁸
⁸ Smith et al., “A SMART Approach to Optimizing Delivery of an mHealth Intervention among
Cancer Survivors with Posttraumatic Stress Symptoms”.
14/48
Types of Biomarkers⁹
Prescriptive biomarkers are also called tailoring biomarkers/covariates.
The FDA uses prognostic, predictive, and pharmacodynamic (response) biomarkers. Prescriptive and moderating
biomarkers both qualify as predictive under the FDA's categories; response biomarkers are not shown here.
⁹ Michael R. Kosorok and Eric B. Laber. “Precision Medicine”. In: Annual Review of Statistics and Its
Application 6.1 (2019), pp. 263–286.
15/48
Risk Prediction¹⁰ vs. Policy Estimation
Risk calculators and risk prediction tools typically
• Are based on data from large observational studies
• Minimize the mean squared prediction error for
the observed outcomes
• Should not be used to answer “what-if” questions
by changing parameters for patient characteristics
such as age, weight, medications, etc.
In contrast, precision medicine is fundamentally causal: what would be expected to happen if I
changed the treatment my patient is on?
¹⁰ https://www.cvriskcalculator.com/ 10-year risk of heart disease or stroke using the ASCVD algorithm published in the 2013 ACC/AHA
Guideline on the Assessment of Cardiovascular Risk.
16/48
Notation
Let $[K]$ denote the set $\{1, \dots, K\}$ for a positive integer $K$. For individual $n \in [N]$ at
stage $s \in [S]$, their data are given by:
• $X_{sn} \in \mathcal{X}_s \subseteq \mathbb{R}^{d_s}$ denotes the covariates
• $A_{sn} \in \mathcal{A}_s$ denotes the treatment arm
• $Y_{sn} \in \mathbb{R}$ denotes the response
We’ll consider a two-stage SMART for exposition. The generalization to finitely
many more stages is straightforward. The study data consist of iid replicates
$\{X_{1n}, A_{1n}, Y_{1n}, X_{2n}, A_{2n}, Y_{2n}\}_{n=1}^{N}$ (1)
Depending on author, context, etc., $Y_{sn}$ may be included in $X_{(s+1)n}$ for $s = 1, \dots, S - 1$.
Then there is only a single $Y$, and it is the ultimate response ($Y_{Sn}$ in the other notation).
17/48
General Notation & Potential Outcomes
For a process $Z$ define $\bar{Z}_n = (Z_1, \dots, Z_n)$ for a positive integer $n$.
Denote the potential outcome under a treatment sequence $a = (a_1, a_2)$ for an
individual with covariates $x = (x_1, x_2)$ by
$Y^*(a, x) = Y^*(X_1 = x_1, A_1 = a_1, X_2 = x_2, A_2 = a_2)$
We will suppress the dependence of $Y^*$ on $x$ and write
$Y^*(a) = Y^*(A_1 = a_1, A_2 = a_2)$
18/48
What does it mean for a policy to be optimal?
Define the value of a policy $\pi$ as
$V(\pi) = \mathbb{E}_X[Y^*(a = \pi(X))]$ (2)
An optimal policy $\pi^*$ is any policy that satisfies
$V(\pi^*) \ge V(\pi)$ for all $\pi \in \Pi$ (3)
19/48
Exercise: Write Out the Optimal Policy
Suppose we have three treatments $A_1, A_2, A_3$ (read here as 0/1 indicators of the assigned
treatment) and a single binary tailoring variable $X \in \{0, 1\}$ where $X \sim \mathrm{Bernoulli}(0.5)$.
$\mathbb{E}[Y] = 0.4A_1 + 0.3A_2 + 0.5A_3 + 0.3XA_2$
Questions
What is the optimal policy $\pi^*$ that does not involve $X$? That is, what is the policy $\pi$
that maximizes $\mathbb{E}[Y]$ if treatment must be assigned without using the value of $x$?
Does the optimal policy change if $X \sim \mathrm{Bernoulli}(0.95)$?
What is the optimal policy if we can use the observed $x$ to assign treatment? Does
the mean of $X$ matter in this case?
20/48
Exercise continued
Table: Expected Response by Covariate and Treatment Value
A    X    E[Y]
1    0    0.4
1    1    0.4
2    0    0.3
2    1    0.6
3    0    0.5
3    1    0.5
$\mathbb{E}[\pi(X)] = \mathbb{E}[\pi(X) \mid X = 0]\,P(X = 0) + \mathbb{E}[\pi(X) \mid X = 1]\,P(X = 1)$
21/48
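The table can be checked numerically. The sketch below is only an illustration: it hard-codes the six expected responses from the table, and the function names (`policy_value`, `best_static_policy`, `best_tailored_policy`) are hypothetical helpers, not part of any package. It computes the value of a policy by averaging over the distribution of X, as in the decomposition on the slide.

```python
# Expected response E[Y | A = a, X = x], copied from the table on the slide.
MEAN_RESPONSE = {
    (1, 0): 0.4, (1, 1): 0.4,
    (2, 0): 0.3, (2, 1): 0.6,
    (3, 0): 0.5, (3, 1): 0.5,
}
TREATMENTS = (1, 2, 3)


def policy_value(policy, p_x1):
    """Value of `policy` (a dict mapping x to a treatment) when P(X = 1) = p_x1."""
    return ((1 - p_x1) * MEAN_RESPONSE[(policy[0], 0)]
            + p_x1 * MEAN_RESPONSE[(policy[1], 1)])


def best_static_policy(p_x1):
    """Best single treatment when the value of X cannot be used."""
    return max(TREATMENTS, key=lambda a: policy_value({0: a, 1: a}, p_x1))


def best_tailored_policy():
    """Best treatment within each level of X; P(X = 1) plays no role here."""
    return {x: max(TREATMENTS, key=lambda a: MEAN_RESPONSE[(a, x)]) for x in (0, 1)}
```

Calling `best_static_policy` with different values of `p_x1` shows how the untailored answer depends on the covariate distribution, while `best_tailored_policy` takes no such argument because the rule is chosen separately within each level of X.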
Embedded DTRs – What Are They
An embedded policy or DTR is a DTR that is directly observable in the study.¹¹
We can redefine a SMART as a multistage trial wherein participants are randomized
to follow an embedded treatment regime.
¹¹ May involve aggregation or subsetting.
23/48
Identify the Embedded DTRs
24/48
Typical SMART Characteristics
Common characteristics
• Pragmatic Inclusion/Exclusion Criteria
• Pre-established efficacy for individual interventions or intervention components
• Fixed randomization probabilities
• The embedded regimes are constructed to test an intervention strategy.
• Discovery, not confirmatory inference, may be the motivation for the design.
The output is an estimated optimal policy or promising biomarkers for further
study.
25/48
Microrandomized Trials, mHealth, and JITAIs
SMARTs typically refer to designs with a finite number of decision points. What
happens with an indefinite time horizon or decisions in continuous time?
Microrandomized trials and just-in-time adaptive interventions (JITAIs)¹² extend
the idea of randomizing critical decision points to an indefinite time horizon.
Markov Decision Processes
Markov property: loosely, for any time $s \in \mathbb{N}^+$,
$P(X_{s+1} \mid \bar{X}_s) = P(X_{s+1} \mid X_s)$, where $\bar{X}_s = (X_1, \dots, X_s)$ is the history up to time $s$.
¹² Predrag Klasnja et al. “Microrandomized Trials: An Experimental Design for Developing
Just-in-Time Adaptive Interventions”. In: Health Psychology: Official Journal of the Division of Health
Psychology, American Psychological Association 34S.0 (Dec. 2015), pp. 1220–1228.
26/48
SMART vs. Cross-over Design¹³
¹³ Kosorok and Moodie, eds., Adaptive Treatment Strategies in Practice: Planning
Trials and Analyzing Data for Personalized Medicine.
27/48
Test your knowledge
What considerations (e.g. clinical setting, intervention(s), outcomes, etc.) would
typically suggest using a
1. SMART?
2. Cross-over trial?
3. Cross-sectional design?
4. Microrandomized trial?
28/48
Key Estimands
Non-decision-making Estimands
• First-stage average treatment effects
• Response probabilities
Treatment Policies
• Identification of the best embedded treatment policy/eDTR
• Optimal policy $\pi^*$
• Optimal policy $\pi^*_{\mathcal{F}}$ in a restricted function class $\mathcal{F}$ (e.g. optimal policy among
linear decision rules, trees of depth $c$, or embedded policies)
Value of a Policy and Value Comparisons
• Value of a fixed policy or policies (usually the embedded policies) $V(\pi)$
• Value of the estimated optimal policy $V(\hat{\pi})$
• Value of the optimal policy $V(\pi^*)$
• Comparison of non-overlapping policies, e.g. most intensive vs. least intensive
The difference between policy and value estimation is analogous to the difference between estimating
the parameters of a linear model and estimating the average treatment effect.
30/48
Analyzing the First Stage as the Primary Aim
The first stage of a SMART is identical to a standard RCT and standard methods can
be applied.
In the early days of SMARTs it was more common to make these analyses the
primary aim and use them for power calculations. As SMARTs have become more
widespread this is less common.
• First stage power isn’t likely to be informative of the power for other analyses.
• Doesn’t justify the need for a SMART design versus a simpler design.
• It may still be of interest to ensure power for both a first-stage analysis and
another analysis.
31/48
Multiple Comparisons with the Best¹⁴
Consider the embedded policies $\pi_i$ from a study. Let $\theta_i = V(\pi_i)$ be the value for the
$i$-th embedded DTR and $\hat{\theta}_i$ be a consistent estimator of $\theta_i$.
We consider $\pi_i$ statistically indistinguishable from optimal if and only if
$\frac{\hat{\theta}_i - \hat{\theta}_j}{\sqrt{\mathrm{Var}(\hat{\theta}_i - \hat{\theta}_j)}} \ge -c_{i,1-\alpha}$ for all $j \ne i$
where $c_{i,1-\alpha} > 0$ is chosen so that the set of indistinguishable DTRs includes the best
eDTR with probability at least $1 - \alpha$. This produces a confidence set $\hat{\Pi}_\alpha$:
$\hat{\Pi}_\alpha = \{\pi_i : \hat{\theta}_i \ge \max_{j \ne i} [\hat{\theta}_j - c_{i,1-\alpha} \sqrt{\mathrm{Var}(\hat{\theta}_i - \hat{\theta}_j)}]\}$
¹⁴ William J Artman et al. “Power Analysis in a SMART Design: Sample Size Estimation for
Determining the Best Embedded Dynamic Treatment Regime”. In: Biostatistics 21.3 (July 1, 2020),
pp. 432–448; Donald G. Edwards and Jason C. Hsu. “Multiple Comparisons With the Best Treatment”.
In: Journal of the American Statistical Association 78.384 (1983), pp. 965–971.
32/48
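The membership rule for the confidence set can be sketched in a few lines. This is a minimal illustration, assuming the value estimates, their estimated covariance matrix, and the critical values are already available; the slides do not show how the critical values are computed, so they are plain inputs here, and the function name `mcb_confidence_set` is hypothetical.

```python
import numpy as np


def mcb_confidence_set(theta_hat, cov, crit):
    """Indices of embedded DTRs that are indistinguishable from the best.

    theta_hat: estimated values of the embedded DTRs, shape (m,)
    cov:       estimated covariance matrix of theta_hat, shape (m, m)
    crit:      critical values c_{i, 1-alpha}, shape (m,), supplied by the caller
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    cov = np.asarray(cov, dtype=float)
    crit = np.asarray(crit, dtype=float)
    m = theta_hat.size
    keep = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        # Var(theta_i - theta_j) = Var(theta_i) + Var(theta_j) - 2 Cov(theta_i, theta_j)
        sd = np.sqrt([cov[i, i] + cov[j, j] - 2 * cov[i, j] for j in others])
        z = (theta_hat[i] - theta_hat[others]) / sd
        # Keep pi_i only if every standardized difference clears -c_i.
        if np.all(z >= -crit[i]):
            keep.append(i)
    return keep
```

The loop is written to mirror the "for all j ≠ i" condition directly rather than to be fast; with the handful of embedded regimes in a typical SMART that is not a concern.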
Treatment Policy Estimation Approach Families
1. Regression-based approaches (indirect) estimate the outcome model, and then
the optimal policy is simply the argmax over the estimated regression functions.
Examples include Q-learning and Advantage-learning (A-learning).¹⁵
2. Classification-based approaches (direct, direct search) try to directly optimize
over the space of policies by turning the problem into a weighted classification
problem.
Examples: Outcome weighted learning (OWL),¹⁶ Direct-learning (D-learning)¹⁷
3. Combination approaches are also possible.
¹⁵ Lu Wang et al. “Evaluation of Viable Dynamic Treatment Regimes in a Sequentially Randomized
Trial of Advanced Prostate Cancer”. In: Journal of the American Statistical Association 107.498 (June
2012), pp. 493–508.
¹⁶ Yingqi Zhao et al. “Estimating Individualized Treatment Rules Using Outcome Weighted
Learning”. In: Journal of the American Statistical Association 107.499 (Sept. 1, 2012), pp. 1106–1118.
¹⁷ Zhengling Qi et al. “Multi-Armed Angle-Based Direct Learning for Estimating Optimal
Individualized Treatment Rules With Various Outcomes”. In: Journal of the American Statistical
Association 115.530 (Apr. 2, 2020), pp. 678–691.
33/48
Backwards Induction
In multistage learning, policies are estimated from back to front rather than front to back.
Fill out the expected payoffs in the image with an example where moving forwards would
result in a suboptimal policy:
A very age and culturally specific hint:
34/48
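One possible way to fill in such a picture is sketched below. The payoffs are entirely made up for illustration (they are not taken from the slide's image), and the action labels and function names are hypothetical. The first-stage action with the larger immediate payoff leads to poor second-stage options, so choosing greedily from front to back ends with a lower total than solving from the last stage backwards.

```python
# Hypothetical payoffs: reward observed after stage 1, and reward after stage 2
# for each (stage-1 action, stage-2 action) pair.
STAGE1_REWARD = {"a": 5, "b": 1}
STAGE2_REWARD = {("a", "c"): 0, ("a", "d"): 1, ("b", "c"): 10, ("b", "d"): 2}
STAGE2_ACTIONS = ("c", "d")


def backward_induction():
    """Solve stage 2 first, then pick the stage-1 action with the best total."""
    best2 = {a1: max(STAGE2_ACTIONS, key=lambda a2: STAGE2_REWARD[(a1, a2)])
             for a1 in STAGE1_REWARD}
    total = {a1: STAGE1_REWARD[a1] + STAGE2_REWARD[(a1, best2[a1])]
             for a1 in STAGE1_REWARD}
    a1 = max(total, key=total.get)
    return a1, best2[a1], total[a1]


def greedy_forward():
    """Pick the best immediate stage-1 reward, then the best stage-2 action."""
    a1 = max(STAGE1_REWARD, key=STAGE1_REWARD.get)
    a2 = max(STAGE2_ACTIONS, key=lambda a: STAGE2_REWARD[(a1, a)])
    return a1, a2, STAGE1_REWARD[a1] + STAGE2_REWARD[(a1, a2)]
```

With these numbers `backward_induction()` selects the path through "b" for a total of 11, while `greedy_forward()` commits to "a" and ends with a total of 6.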
Q-learning¹⁸
Assume that there are $S = 2$ stages, $K_s$ treatments in stage $s$, and we are interested
in the ultimate outcome $Y = Y_2$.
1. Estimate $\mathbb{E}[Y \mid X_1, A_1, X_2, A_2] \doteq \hat{f}_2(x_1, a_1, x_2, a_2)$ using your favorite regression
method.
2. Determine the second-stage estimated optimal treatment rule
$\hat{\pi}_2(x_1, a_1, x_2) = \arg\max_{k \in [K_2]} \hat{f}_2(x_1, a_1, x_2, a_2 = k)$
3. Calculate the expected response had everyone been given their estimated
optimal treatment $\tilde{A}_2 = \hat{\pi}_2(x_1, a_1, x_2)$: $\tilde{Y} = \hat{f}_2(x_1, a_1, x_2, \tilde{A}_2)$
4. Estimate $\mathbb{E}[\tilde{Y} \mid X_1, A_1] \doteq \hat{f}_1(x_1, a_1)$
5. Determine the first-stage estimated optimal treatment rule
$\hat{\pi}_1(x_1) = \arg\max_{k \in [K_1]} \hat{f}_1(x_1, a_1 = k)$
The estimated optimal policy is $\hat{\pi} = (\hat{\pi}_1, \hat{\pi}_2)$.
¹⁸ Typically what statisticians call Q-learning is referred to as Q-learning with function approximation in the CS literature. What CS people call
Q-learning is fully nonparametric and statisticians will refer to it as tabular Q-learning.
35/48
Q-learning
What is the estimation step of Q-learning when there is only a single stage and a
linear model is used?
36/48
Operator notation
• $X_1, \dots, X_n$ is an iid random sample from a fixed but unknown distribution $P$
• $g$ is a generic parametric function indexed by $\theta \in \Theta$
• $\hat{\theta} \in \Theta$ is a random variable constructed from the sample $X_1, \dots, X_n$
$P$ denotes the probability measure:
$P g(X; \hat{\theta}) = \int g(x; \hat{\theta}) \, dP(x)$
$\mathbb{P}_n$ denotes the corresponding empirical measure:
$\mathbb{P}_n g(X; \hat{\theta}) = n^{-1} \sum_{i=1}^{n} g(x_i; \hat{\theta})$
$\rightsquigarrow$ denotes convergence in distribution.
37/48
Value Functions
1. Conditional Value of the estimated optimal policy
$\mathbb{E}_X[\hat{\pi}_n(X) \mid \hat{\pi}_n] = P \hat{\pi}_n(X)$
2. (Expected) Value of an estimated optimal policy
$\mathbb{E}_X[\hat{\pi}_n(X)]$
3. Value of the optimal policy
$\mathbb{E}_X[\pi^*(X)] = P \pi^*(X)$
Are estimators of these functions asymptotically equivalent?
Brainstorm a scenario for each of the value functions where that estimand would
make the most sense.
38/48
Estimating the Value Function
• Model-based: seldom used. Why? Think about what a SMART provides by
design.
• Inverse Probability Weighted Estimator
For a two-stage SMART, a fixed regime $\pi$, and histories $H_1 = X_1$,
$H_2 = (X_1, A_1, X_2)$
$\hat{V}_n(\pi) = \mathbb{P}_n\left[\frac{Y \, 1\{A_1 = \pi_1(H_1)\} \, 1\{A_2 = \pi_2(H_2)\}}{\Pr(A_1 \mid H_1) \Pr(A_2 \mid H_2)}\right]$
• Augmented Inverse Probability Weighted Estimator
39/48
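The inverse probability weighted estimator translates almost term by term into code. The sketch below is an illustration only: the function name `ipw_value` and its argument names are hypothetical, the regime's recommendations are assumed to have been computed beforehand from each participant's history, and the randomization probabilities are assumed known by design, as they are in a SMART.

```python
import numpy as np


def ipw_value(y, a1, a2, rec1, rec2, p1, p2):
    """IPW estimate of V(pi) for a two-stage SMART.

    y:          final outcomes
    a1, a2:     treatments actually received at stages 1 and 2
    rec1, rec2: treatments the regime recommends, pi_1(H1) and pi_2(H2)
    p1, p2:     probabilities of the received treatments, Pr(A1 | H1) and Pr(A2 | H2)
    """
    y, a1, a2 = (np.asarray(v) for v in (y, a1, a2))
    # Indicator that the participant's observed sequence is consistent with the regime.
    follows = (a1 == np.asarray(rec1)) & (a2 == np.asarray(rec2))
    weights = follows / (np.asarray(p1, dtype=float) * np.asarray(p2, dtype=float))
    # Empirical mean of the weighted outcomes (the P_n operator).
    return float(np.mean(y * weights))
```

Participants whose treatments disagree with the regime at either stage get weight zero, and the remaining participants are up-weighted by the inverse of the probability of having received that sequence.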
Toy Problem: Max of Gaussian Means
Suppose we have a random iid sample of size $n$ where
$X_i \sim \mathrm{MVN}\left(\mu = \begin{pmatrix} 2 \\ -1 \end{pmatrix}, \Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)$
In the general case let $p$ denote the dimension of $\mu$ and assume that the covariance
matrix is the identity matrix. Suppose we are interested in $\theta$:
$\theta = \max_{j \in [p]} \mu_j = \bigvee_{j=1}^{p} \mu_j$
Here $\theta = \max\{2, -1\} = 2$
Let
$\hat{\theta}_n = \max_{j \in \{1,2\}} \mathbb{P}_n X_j$
What is the limiting distribution of $\sqrt{n}(\hat{\theta}_n - \theta)$ for $\mu = (2, -1)^T$?
Hint: don’t overthink it
40/48
Max of Gaussian Means continued
A: $\sqrt{n}(\hat{\theta}_n - \theta) \rightsquigarrow N(0, 1)$
Suppose $\mu = (0, 0)^T$
Now what is the limiting distribution of $\sqrt{n}(\hat{\theta}_n - \theta)$?
$\sqrt{n}(\hat{\theta}_n - \theta) \rightsquigarrow \max(N(0, 1), N(0, 1))$
Now the limiting distribution is the maximum of two independent standard normal
RVs.
Problem: the limiting distribution depends on the value of the parameter $\mu$
To analyze SMARTs we’ll need to address nonregular asymptotics.
41/48
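The two limiting distributions in the toy problem can be checked by simulation. The sketch below is illustrative only: the function name, sample size, number of replications, and seed are arbitrary choices. It draws the sample means directly, which is exact here because the covariance is the identity.

```python
import numpy as np


def root_n_errors(mu, n, reps, rng):
    """Simulate sqrt(n) * (theta_hat_n - theta) for theta = max_j mu_j."""
    mu = np.asarray(mu, dtype=float)
    # With identity covariance the vector of sample means is exactly N(mu, I / n),
    # so it is drawn directly instead of simulating n observations per replicate.
    sample_means = rng.normal(loc=mu, scale=1.0 / np.sqrt(n), size=(reps, mu.size))
    theta_hat = sample_means.max(axis=1)
    return np.sqrt(n) * (theta_hat - mu.max())


rng = np.random.default_rng(2023)
regular = root_n_errors([2.0, -1.0], n=400, reps=20000, rng=rng)
nonregular = root_n_errors([0.0, 0.0], n=400, reps=20000, rng=rng)

print("mu = (2, -1): mean", regular.mean(), "sd", regular.std())
print("mu = (0, 0):  mean", nonregular.mean(), "sd", nonregular.std())
```

When the means are well separated, the simulated errors should look like a standard normal. When the means are tied, they should be shifted to the right, with a mean close to $1/\sqrt{\pi} \approx 0.56$, the expected maximum of two independent standard normals.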
SMART Design Considerations
• Time between randomizations
• Number of stages
• What defines response status? (One day I’ll convince someone to run a SMART
where response status isn’t dichotomized)
• Will responders be re-randomized or continue treatment (deterministic)?
• Set of treatments available at each stage
• Outcomes used to assess response
• Can the coordinating center handle the additional logistical complexity?
• Does the study design match the scientific aim (true of all studies)?
43/48
Common Second-Stage Design Patterns
• Switch off ineffective treatment
• Step up (intensify) intervention
• Step down (relax) intervention
• Dose adjustment
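As a rough illustration of how a second-stage pattern turns into an assignment rule, the sketch below implements a partially deterministic design: responders continue their first-stage treatment, and non-responders are randomized with equal probability between two options. The option labels, the 40% response rate, and the function name are hypothetical choices for illustration, not taken from any particular trial.

```python
import random

# Hypothetical second-stage options for non-responders.
SECOND_STAGE_OPTIONS = ("switch", "step_up")


def assign_second_stage(responder, rng):
    """Responders continue deterministically; non-responders are re-randomized."""
    if responder:
        return "continue"
    return rng.choice(SECOND_STAGE_OPTIONS)


rng = random.Random(42)
# Simulated response status for ten participants (assumed 40% response rate).
participants = [{"id": i, "responder": rng.random() < 0.4} for i in range(10)]
for person in participants:
    person["stage2"] = assign_second_stage(person["responder"], rng)
    print(person)
```

Re-randomizing responders as well, or adding step-down and dose-adjustment arms, would only change the option sets passed to the random draw.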
44/48
Challenges with SMART Grant Proposals
• Need for increased communication between statisticians and clinicians when
designing the study and preparing the protocol
• Nonstandard sample size calculations
• Reviewers may be unfamiliar with SMART designs
• Reviewers may be unfamiliar with novel statistical methods
45/48
Choosing a Design
Never forget that the scientific or clinical question should determine the design, not
vice versa.
46/48
Choosing a name for your method
The success of the term SMART¹⁹,²⁰ is a bit of a branding coup. This is not to denigrate Susan
Murphy’s extensive contributions, but to highlight other researchers in the area at the time.
Multiphase Optimization Strategy (MOST)²¹
Bayesian trials with rerandomization²²
Trials using “outcome-driven re-randomizations”²³
19
S. A. Murphy. “An Experimental Design for the Development of Adaptive Treatment Strategies”.
In: Statistics in Medicine 24.10 (2005), pp. 1455–1481.
20
Interestingly, this paper never uses the term “SMART”, always “SMAR” trials.
21
Linda M. Collins et al. “A Strategy for Optimizing and Evaluating Behavioral Interventions”. In:
Annals of Behavioral Medicine 30.1 (Feb. 1, 2005), pp. 65–73.
22
Peter F. Thall, Randall E. Millikan, and Hsi-Guang Sung. “Evaluating Multiple Treatment Courses
in Clinical Trials”. In: Statistics in Medicine 19.8 (2000), pp. 1011–1028; Peter F. Thall et al.
“Hierarchical Bayesian Approaches to Phase II Trials in Diseases with Multiple Subtypes”. In: Statistics
in Medicine 22.5 (2003), pp. 763–780.
23
Maurizio Fava et al. “Background and Rationale for the Sequenced Treatment Alternatives to
Relieve Depression (STAR*D) Study”. In: Psychiatric Clinics of North America 26.2 (June 1, 2003),
pp. 457–494; Sonia M. Davis et al. “Statistical Approaches to Effectiveness Measurement and
Outcome-Driven Re-Randomizations in the Clinical Antipsychotic Trials of Intervention Effectiveness
(CATIE) Studies”. In: Schizophrenia Bulletin 29.1 (2003), pp. 73–80.
47/48
Appendices
Outline
References
Resources
Terminology
Markov Decision Processes
1/10
Artman, William J et al. “Power Analysis in a SMART Design: Sample Size
Estimation for Determining the Best Embedded Dynamic Treatment Regime”.
In: Biostatistics 21.3 (July 1, 2020), pp. 432–448.
Bidargaddi, N. et al. “Designing M-Health Interventions for Precision Mental
Health Support”. In: Translational Psychiatry 10.1 (July 7, 2020), pp. 1–8.
Collins, Linda M. et al. “A Strategy for Optimizing and Evaluating Behavioral
Interventions”. In: Annals of Behavioral Medicine 30.1 (Feb. 1, 2005), pp. 65–73.
Davis, Sonia M. et al. “Statistical Approaches to Effectiveness Measurement and
Outcome-Driven Re-Randomizations in the Clinical Antipsychotic Trials of
Intervention Effectiveness (CATIE) Studies”. In: Schizophrenia Bulletin 29.1
(2003), pp. 73–80.
Edwards, Donald G. and Jason C. Hsu. “Multiple Comparisons With the Best
Treatment”. In: Journal of the American Statistical Association 78.384 (1983),
pp. 965–971.
Fava, Maurizio et al. “Background and Rationale for the Sequenced Treatment
Alternatives to Relieve Depression (STAR*D) Study”. In: Psychiatric Clinics of
North America 26.2 (June 1, 2003), pp. 457–494.
1/10
Klasnja, Predrag et al. “Microrandomized Trials: An Experimental Design for
Developing Just-in-Time Adaptive Interventions”. In: Health Psychology: Official
Journal of the Division of Health Psychology, American Psychological Association
34S (Dec. 2015), pp. 1220–1228.
Kosorok, Michael R. and Eric B. Laber. “Precision Medicine”. In: Annual Review
of Statistics and Its Application 6.1 (2019), pp. 263–286.
Kosorok, Michael R. and Erica E. M. Moodie, eds. Adaptive Treatment Strategies
in Practice: Planning Trials and Analyzing Data for Personalized Medicine.
Vol. 21. ASA-SIAM Series on Statistics and Applied Mathematics. SIAM, 2015.
Lorenzoni, Giulia et al. “Use of Sequential Multiple Assignment Randomized
Trials (SMARTs) in Oncology: Systematic Review of Published Studies”. In:
British Journal of Cancer 128.7 (Mar. 2023), pp. 1177–1188.
Murphy, S. A. “An Experimental Design for the Development of Adaptive
Treatment Strategies”. In: Statistics in Medicine 24.10 (2005), pp. 1455–1481.
Qi, Zhengling et al. “Multi-Armed Angle-Based Direct Learning for Estimating
Optimal Individualized Treatment Rules With Various Outcomes”. In: Journal of
the American Statistical Association 115.530 (Apr. 2, 2020), pp. 678–691.
1/10
Smith, Sophia K. et al. “A SMART Approach to Optimizing Delivery of an
mHealth Intervention among Cancer Survivors with Posttraumatic Stress
Symptoms”. In: Contemporary Clinical Trials 110 (Nov. 2021), p. 106569.
Thall, Peter F., Randall E. Millikan, and Hsi-Guang Sung. “Evaluating Multiple
Treatment Courses in Clinical Trials”. In: Statistics in Medicine 19.8 (2000),
pp. 1011–1028.
Thall, Peter F. et al. “Hierarchical Bayesian Approaches to Phase II Trials in
Diseases with Multiple Subtypes”. In: Statistics in Medicine 22.5 (2003),
pp. 763–780.
Tsiatis, Anastasios A. et al. Dynamic Treatment Regimes: Statistical Methods for
Precision Medicine. New York: Chapman and Hall/CRC, Dec. 19, 2019. 618 pp.
Wang, Lu et al. “Evaluation of Viable Dynamic Treatment Regimes in a
Sequentially Randomized Trial of Advanced Prostate Cancer”. In: Journal of the
American Statistical Association 107.498 (June 2012), pp. 493–508.
Zhao, Yingqi et al. “Estimating Individualized Treatment Rules Using Outcome
Weighted Learning”. In: Journal of the American Statistical Association 107.499
(Sept. 1, 2012), pp. 1106–1118.
2/10
Resources
Books
Focused more on estimation than design but still probably the number one resource: Anastasios A. Tsiatis et al.
Dynamic Treatment Regimes: Statistical Methods for Precision Medicine. New York: Chapman and Hall/CRC,
Dec. 19, 2019. 618 pp.
Older reference: Kosorok, Michael R. and Moodie, Erica EM, eds. Adaptive Treatment Strategies in Practice:
Planning Trials and Analyzing Data for Personalized Medicine. Vol. 21. ASA-SIAM Series on Statistics and Applied
Mathematics. SIAM, 2015
Course Notes
• BIOS 740 – https://mkosorok.web.unc.edu/current-course/, Archive.org link
• SAMSI 2019 Precision Medicine Course –
https://www.slideshare.net/search?searchfrom=header&q=2019+PMED+Course
Papers
Michael R. Kosorok and Eric B. Laber. “Precision Medicine”. In: Annual Review of Statistics and Its Application 6.1
(2019), pp. 263–286
N. Bidargaddi et al. “Designing M-Health Interventions for Precision Mental Health Support”. In: Translational
Psychiatry 10.1 (July 7, 2020), pp. 1–8
3/10
Online Resources
Data Science for Dynamic Intervention Decision Making
https://d3c.isr.umich.edu/
(Archived) Penn State Methodology Center
https://wayback.archive-it.org/3524/20210524145320/https://www.methodology.psu.edu/ra/adap-inter/
4/10
Terminology Overview
The slides in this section are a reference for the many terms that have been used to
describe the same or very similar concepts.
In some cases a term denotes a specific subtype of the concept that corresponds to a
DTR. For example, “clinical decision support system” is a broad term that includes
systems that issue notifications or recommendations but don’t prescribe actions, but automated
decision support systems
While it was sufficiently exhausting, it is sadly not exhaustive.
6/10
Terminology Equivalents - DTRs
Adaptive treatment strategies, Adaptive interventions, Decision functions,
Dynamic treatment regimes, Multistage treatment strategies, Stepped-care strategies,
Treatment policies, Treatment regimes, Treatment rules,
Individualized treatment rules, Expert systems, Expert advice systems,
(Automated) Clinical Decision Support System, (Automated) Decision Aid,
(Automated) Decision Support Tool
7/10
The “Dynamic” in Dynamic Treatment Regime
Depending on the author, the dynamic in dynamic treatment regime can mean:
Dynamic = time-varying
Dynamic = depends on patient covariates (including responder status)
8/10
“Cheating” the Markov Assumption
When the number of stages $J$ is finite we can expand the state space to ensure the
Markov condition holds
$X_j^+ = \left(\bigcup_{i=1}^{j} (X_i, A_i, Y_i)\right) \cup X_j$
Then the Markov condition
$\mathbb{E}[Y_j \mid X_1^+, \ldots, X_j^+] = \mathbb{E}[Y_j \mid X_j^+]$
is trivially satisfied.
This trick is not available to us in the infinite horizon case, though we can continue
in the spirit of this approach by defining summary measurements.
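The state-expansion trick can be sketched with plain Python tuples. The example below follows the index range as written on the slide and uses a made-up two-stage trajectory; the function name and the data are hypothetical.

```python
def augmented_states(xs, actions, ys):
    """Return [X_1^+, ..., X_J^+]: the (X_i, A_i, Y_i) triples for i <= j, plus X_j."""
    states = []
    for j in range(1, len(xs) + 1):
        history = tuple(zip(xs[:j], actions[:j], ys[:j]))
        states.append(history + (xs[j - 1],))
    return states


# Hypothetical two-stage trajectory for one participant.
states = augmented_states(xs=[0.1, 0.5], actions=[1, 0], ys=[2.0, 3.0])

# Each augmented state carries the full history stored in the previous one, so
# conditioning on (X_1^+, ..., X_j^+) adds nothing beyond conditioning on X_j^+.
for earlier, later in zip(states, states[1:]):
    assert later[: len(earlier) - 1] == earlier[:-1]

for j, state in enumerate(states, start=1):
    print(f"X_{j}^+ =", state)
```

The augmented state grows with every stage, which is why this device only works for a finite horizon and why summary measurements are needed otherwise.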
10/10

An Introduction to Sequential Multiple Assignment Randomized Trials (SMARTs)

  • 1.
    (a) Sequential Multiple AssignmentRandomized Trials SMARTs: An Introduction John Sperger 2023-11-30
  • 2.
    Outline Sequential Multiple AssignmentRandomized Trials Precision Health & Medicine SMARTs Redux Analysis of SMART Data Designing a SMART The BEST Trial 1/48
  • 3.
    Learning Objectives At theend of today’s lecture you should be able to: • Explain how clinical considerations motivated SMART designs. • Create example research questions that could be answered by a SMART but not other designs. • Contrast SMARTs with other trial designs especially cross-over designs. • Explain how Q-learning can be applied to analyze data from a SMART. I also want to instill: • Excitement about the kinds of clinical and scientific questions that can be answered by SMARTs. • A healthy respect for the complexity of SMARTs1 and recognition that, like all trial designs (perhaps excluding the 3+3 design), they are appropriate for specific goals and not a panacea. 1 In the way you can still enjoy swimming in the ocean in spite of the myriad ways it could kill you. 2/48
  • 4.
    Sequential Multiple AssignmentRandomized Trial Definition: A SMART is a randomized trial where some or all participants are randomized at two or more decision points. 3/48
  • 5.
    Sequential Multiple AssignmentRandomized Trial Definition: A SMART is a randomized trial where some or all participants are randomized at two or more decision points. • Sequential: participants are followed over multiple time periods usually without a washout period. 3/48
  • 6.
    Sequential Multiple AssignmentRandomized Trial Definition: A SMART is a randomized trial where some or all participants are randomized at two or more decision points. • Sequential: participants are followed over multiple time periods usually without a washout period. • Multiple Assignment: participants (may) receive multiple treatments during the course of the study. 3/48
  • 7.
    Sequential Multiple AssignmentRandomized Trial Definition: A SMART is a randomized trial where some or all participants are randomized at two or more decision points. • Sequential: participants are followed over multiple time periods usually without a washout period. • Multiple Assignment: participants (may) receive multiple treatments during the course of the study. • Randomized: treatment assignment is randomized for at least some patients at each decision point. 3/48
  • 8.
    Sequential Multiple AssignmentRandomized Trial Definition: A SMART is a randomized trial where some or all participants are randomized at two or more decision points. • Sequential: participants are followed over multiple time periods usually without a washout period. • Multiple Assignment: participants (may) receive multiple treatments during the course of the study. • Randomized: treatment assignment is randomized for at least some patients at each decision point. • Trial: left as an exercise to the reader. 3/48
  • 9.
    Example SMART –Partially Deterministic2 2 Kosorok, Michael R. and Moodie, Erica EM, eds. Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. Vol. 21. ASA-SIAM Series on Statistics and Applied Mathematics. SIAM, 2015. 4/48
  • 10.
    Example SMART –Fully Randomized3 3 Kosorok, Michael R. and Moodie, Erica EM, Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. 5/48
  • 11.
    SMARTs - Thebig idea 6/48
  • 12.
    SMARTs - Thebig idea • Identify critical decision making points over the course of treatment. 6/48
  • 13.
    SMARTs - Thebig idea • Identify critical decision making points over the course of treatment. • Randomize decisions according to realistic clinical options at those decision points. 6/48
  • 14.
    Clinical Motivations forSMARTs Mimic real-world clinical decision making 7/48
  • 15.
    Clinical Motivations forSMARTs Mimic real-world clinical decision making Many health issues involve multiple decisions made over time according to either a fixed schedule or key events that necessitate a decision (e.g. disease progression). 7/48
  • 16.
    Clinical Motivations forSMARTs Mimic real-world clinical decision making Many health issues involve multiple decisions made over time according to either a fixed schedule or key events that necessitate a decision (e.g. disease progression). Multiple options at each decision point (do nothing is always an option). 7/48
  • 17.
    Clinical Motivations forSMARTs Mimic real-world clinical decision making Many health issues involve multiple decisions made over time according to either a fixed schedule or key events that necessitate a decision (e.g. disease progression). Multiple options at each decision point (do nothing is always an option). Clinicians try to make the best decision using their expert judgment based on: • the patient’s medical history. • the treatment options including efficacy, side effect burden, and cost. • patient preferences. 7/48
  • 18.
    Statistical Motivations forSMARTs SMARTs can avoid the causal issues with observational longitudinal data. • The Sequential Ignorability Assumption (SRA) becomes increasingly implausible over time. 4 • Even in large datasets certain sequences of treatments may be too rare to reliably estimate (positivity violation). With differences in healthcare systems, insurance, regulations etc. this doesn’t necessarily imply a rarely observed treatment sequence is suboptimal. 4 Informally you can think of this as the longitudinal analogue of the no unmeasured confounding assumptions. 8/48
  • 19.
    Common Clinical Settings MentalHealth — ADHD, Bipolar disorder, Depression, OCD, Schizophrenia, Suicide prevention Substance Use Disorders Chronic Diseases Oncology5 General Well-being — Mobile Health 5 Giulia Lorenzoni et al. “Use of Sequential Multiple Assignment Randomized Trials (SMARTs) in Oncology: Systematic Review of Published Studies”. In: British Journal of Cancer 128.7 (7 Mar. 2023), pp. 1177–1188. 9/48
  • 20.
    Outline Sequential Multiple AssignmentRandomized Trials Precision Health & Medicine SMARTs Redux Analysis of SMART Data Designing a SMART The BEST Trial 10/48
  • 21.
    Precision Health6 &Medicine Statistical/mathematical attempt to formalize clinical decision making based on patient characteristics and apply evidence-based decision making. 6 “Precision Health” is a newer term to emphasize on the determinants of health that go beyond the clinical setting 11/48
  • 22.
    Precision Health6 &Medicine Statistical/mathematical attempt to formalize clinical decision making based on patient characteristics and apply evidence-based decision making. Unofficial motto - “the right treatment for the right patient given at the right time” 6 “Precision Health” is a newer term to emphasize on the determinants of health that go beyond the clinical setting 11/48
  • 23.
    Precision Health6 &Medicine Statistical/mathematical attempt to formalize clinical decision making based on patient characteristics and apply evidence-based decision making. Unofficial motto - “the right treatment for the right patient given at the right time” Tailoring treatments to patients is not new, but recent developments in trial designs, estimation methods, and data collection (EHRs, sensors) have made it possible to consider smaller subgroups of patients. 6 “Precision Health” is a newer term to emphasize on the determinants of health that go beyond the clinical setting 11/48
  • 24.
    Treatment Policies orDynamic Treatment Regimes (DTRs) A treatment policy 𝜋 is a function maps contexts to actions 𝜋 ∶ 𝓧 ↦ 𝓐. Most commonly referred to as a dynamic treatment regime (DTR) in the statistical literature. Other common names are treatment rule and individualized treatment rule. I’ve included a slide in the appendix with many more terms that have been used (Appendix slide 7) 12/48
  • 25.
    Single Stage TreatmentPolicy7 7 Sophia K. Smith et al. “A SMART Approach to Optimizing Delivery of an mHealth Intervention among Cancer Survivors with Posttraumatic Stress Symptoms”. In: Contemporary Clinical Trials 110 (Nov. 2021), p. 106569. 13/48
  • 26.
    Two-stage Treatment Policy8 8 Smithet al., “A SMART Approach to Optimizing Delivery of an mHealth Intervention among Cancer Survivors with Posttraumatic Stress Symptoms”. 14/48
  • 27.
    Types of Biomarkers9 Prescriptivebiomarkers are also called tailoring biomarkers/covariates. 9 Michael R. Kosorok and Eric B. Laber. “Precision Medicine”. In: Annual Review of Statistics and Its Application 6.1 (2019), pp. 263–286. 15/48
  • 28.
    Types of Biomarkers9 Prescriptivebiomarkers are also called tailoring biomarkers/covariates. The FDA uses prognostic, predictive, and pharmacodynamic (response) biomarkers. Prescriptive and moderating biomarkers both qualify as predictive by FDA; response biomarkers are not shown here. 9 Kosorok and Laber, “Precision Medicine”. 15/48
  • 29.
    Risk Prediction10 vs.Policy Estimation Risk calculators and risk prediction tools typically • Are based on data from large observational studies • Minimize the mean squared prediction error for the observed outcomes • Should not be used to answer “what-if ” questions by changing parameters for patient characteristics such as age, weight, medications, etc. 10 https://www.cvriskcalculator.com/ 10-year risk of heart disease or stroke using the ASCVD algorithm published in 2013 ACC/AHA Guideline on the Assessment of Cardiovascular Risk. 16/48
  • 30.
    Risk Prediction10 vs.Policy Estimation Risk calculators and risk prediction tools typically • Are based on data from large observational studies • Minimize the mean squared prediction error for the observed outcomes • Should not be used to answer “what-if ” questions by changing parameters for patient characteristics such as age, weight, medications, etc. In contrast, precision medicine is fundamentally causal — what is expected to happen if I changed the treatment my patient was on? 10 https://www.cvriskcalculator.com/ 10-year risk of heart disease or stroke using the ASCVD algorithm published in 2013 ACC/AHA Guideline on the Assessment of Cardiovascular Risk. 16/48
  • 31.
    Notation Let [𝐾] denotethe set {1, … , 𝐾} for a positive integer 𝐾. For individual 𝑛 ∈ [𝑁] at stage 𝑠 ∈ [𝑆] their data is given by: 17/48
  • 32.
    Notation Let [𝐾] denotethe set {1, … , 𝐾} for a positive integer 𝐾. For individual 𝑛 ∈ [𝑁] at stage 𝑠 ∈ [𝑆] their data is given by: • 𝑋𝑠𝑛 ∈ 𝓧𝑠 ⊆ ℝ𝑑𝑠 denotes the covariates • 𝐴𝑠𝑛 ∈ 𝓐𝑠 denotes the treatment arm • 𝑌𝑠𝑛 ∈ ℝ denotes the response 17/48
  • 33.
    Notation Let [𝐾] denotethe set {1, … , 𝐾} for a positive integer 𝐾. For individual 𝑛 ∈ [𝑁] at stage 𝑠 ∈ [𝑆] their data is given by: • 𝑋𝑠𝑛 ∈ 𝓧𝑠 ⊆ ℝ𝑑𝑠 denotes the covariates • 𝐴𝑠𝑛 ∈ 𝓐𝑠 denotes the treatment arm • 𝑌𝑠𝑛 ∈ ℝ denotes the response We’ll consider a two-stage SMART for exposition. The generalization to finitely many more stages is straightforward. The study data is comprised of iid replicates 17/48
  • 34.
    Notation Let [𝐾] denotethe set {1, … , 𝐾} for a positive integer 𝐾. For individual 𝑛 ∈ [𝑁] at stage 𝑠 ∈ [𝑆] their data is given by: • 𝑋𝑠𝑛 ∈ 𝓧𝑠 ⊆ ℝ𝑑𝑠 denotes the covariates • 𝐴𝑠𝑛 ∈ 𝓐𝑠 denotes the treatment arm • 𝑌𝑠𝑛 ∈ ℝ denotes the response We’ll consider a two-stage SMART for exposition. The generalization to finitely many more stages is straightforward. The study data is comprised of iid replicates {𝑋1𝑛, 𝐴1𝑛, 𝑌1𝑛, 𝑋2𝑛, 𝐴2𝑛, 𝑌2𝑛}𝑁 𝑛=1 (1) 17/48
  • 35.
    Notation Let [𝐾] denotethe set {1, … , 𝐾} for a positive integer 𝐾. For individual 𝑛 ∈ [𝑁] at stage 𝑠 ∈ [𝑆] their data is given by: • 𝑋𝑠𝑛 ∈ 𝓧𝑠 ⊆ ℝ𝑑𝑠 denotes the covariates • 𝐴𝑠𝑛 ∈ 𝓐𝑠 denotes the treatment arm • 𝑌𝑠𝑛 ∈ ℝ denotes the response We’ll consider a two-stage SMART for exposition. The generalization to finitely many more stages is straightforward. The study data is comprised of iid replicates {𝑋1𝑛, 𝐴1𝑛, 𝑌1𝑛, 𝑋2𝑛, 𝐴2𝑛, 𝑌2𝑛}𝑁 𝑛=1 (1) Depending on author, context etc. 𝑌𝑠𝑛 may be included in 𝑋(𝑠+1)𝑛 for 𝑠 = 2, … , 𝑆 − 1. Then there is only a single 𝑌 and it is the ultimate response (𝑌𝑆𝑛 in the other notation) 17/48
  • 36.
    General Notation &Potential Outcomes For a process 𝑍 define 𝑍𝑛 = (𝑍1, … , 𝑍𝑛) for a positive integer 𝑛. 18/48
  • 37.
    General Notation &Potential Outcomes For a process 𝑍 define 𝑍𝑛 = (𝑍1, … , 𝑍𝑛) for a positive integer 𝑛. Denote the potential outcome under a treatment sequence 𝑎 = (𝑎1, 𝑎2) for an individual with covariates 𝑥 by 𝑌∗(𝑎, 𝑥) = 𝑌∗(𝑋1 = 𝑥1, 𝐴1 = 𝑎1, 𝑋2 = 𝑥2, 𝐴2 = 𝑎2) 18/48
  • 38.
    General Notation &Potential Outcomes For a process 𝑍 define 𝑍𝑛 = (𝑍1, … , 𝑍𝑛) for a positive integer 𝑛. Denote the potential outcome under a treatment sequence 𝑎 = (𝑎1, 𝑎2) for an individual with covariates 𝑥 by 𝑌∗(𝑎, 𝑥) = 𝑌∗(𝑋1 = 𝑥1, 𝐴1 = 𝑎1, 𝑋2 = 𝑥2, 𝐴2 = 𝑎2) We will suppress the dependence of 𝑌∗ on 𝑥 and write 𝑌∗(𝑎) = 𝑌∗(𝐴1 = 𝑎1, 𝐴2 = 𝑎2) 18/48
  • 39.
    What does itmean for a policy to be optimal? Define the value of a policy 𝜋 as V(𝜋) = I E𝑋[𝑌∗(𝑎 = 𝜋(𝑋))] (2) An optimal policy 𝜋∗ is any policy that satisfies V(𝜋∗) ≥ V(𝜋) for all 𝜋 ∈ Π (3) 19/48
  • 40.
    Exercise: Write Outthe Optimal Policy Suppose we have three treatments 𝐴1, 𝐴2, 𝐴3 and a single binary tailoring variable 𝑋 ∈ {0, 1} where 𝑋 ∼ Bernoulli(.5). I E[𝑌] = .4𝐴1 + .3𝐴2 + .5𝐴2 + .3𝑋𝐴2 20/48
  • 41.
    Exercise: Write Outthe Optimal Policy Suppose we have three treatments 𝐴1, 𝐴2, 𝐴3 and a single binary tailoring variable 𝑋 ∈ {0, 1} where 𝑋 ∼ Bernoulli(.5). I E[𝑌] = .4𝐴1 + .3𝐴2 + .5𝐴2 + .3𝑋𝐴2 Questions What is the optimal policy 𝜋∗ that does not involve 𝑋? That is what is the policy 𝜋 that maximizes 𝐸[𝑌] if treatment must be assigned without using the value of 𝑥? Does the optimal policy change if 𝑋 ∼ Bernoulli(.95)? What is the optimal policy if we can use the observed 𝑥 to assign treatment? Does the mean of 𝑋 matter in this case? 20/48
  • 42.
    Exercise continued Table: ExpectedResponse by Covariate and Treatment Value 𝐴 𝑋 𝐸[𝑌] 1 0 0.4 1 1 0.4 2 0 0.3 2 1 0.6 3 0 0.5 3 1 0.5 I E[𝜋(𝑋)] = I E[𝜋(𝑋)|𝑋 = 0]𝑃(𝑋 = 0) + I E[𝜋(𝑋)|𝑋 = 1]𝑃(𝑋 = 1) 21/48
  • 43.
    Outline Sequential Multiple AssignmentRandomized Trials Precision Health & Medicine SMARTs Redux Analysis of SMART Data Designing a SMART The BEST Trial 22/48
  • 44.
    Embedded DTRs –What Are They An embedded policy or DTR is a DTRs that is directly observable in the study.11 We can redefine a SMART as a multistage trial wherein participants are randomized to follow an embedded treatment regime. 11 May involve aggregation or subsetting. 23/48
  • 45.
  • 46.
    Typical SMART Characteristics Commoncharacteristics • Pragmatic Inclusion/Exclusion Criteria • Pre-established efficacy for individual interventions or intervention components • Fixed randomization probabilities 25/48
  • 47.
    Typical SMART Characteristics Commoncharacteristics • Pragmatic Inclusion/Exclusion Criteria • Pre-established efficacy for individual interventions or intervention components • Fixed randomization probabilities • The embedded regimes are constructed to test an intervention strategy. • Discovery, not confirmatory inference, may be the motivation for the design. The output is an estimated optimal policy or promising biomarkers for further study. 25/48
  • 48.
    Microrandomized Trials, mHealth,and JITAIs SMARTs typically refer to designs with a finite number of decision points. What happens with an indefinite time horizon or decisions in continuous time? 12 Predrag Klasnja et al. “Microrandomized Trials: An Experimental Design for Developing Just-in-Time Adaptive Interventions”. In: Health Psychology: Official Journal of the Division of Health Psychology, American Psychological Association 34S.0 (Dec. 2015), pp. 1220–1228. 26/48
  • 49.
    Microrandomized Trials, mHealth,and JITAIs SMARTs typically refer to designs with a finite number of decision points. What happens with an indefinite time horizon or decisions in continuous time? Microrandomized trials and Just-in-time Adaptive Interventions (JITAI)12 — extend the idea of randomizing critical decision points to an indefinite time horizon. 12 Klasnja et al., “Microrandomized Trials”. 26/48
  • 50.
    Microrandomized Trials, mHealth,and JITAIs SMARTs typically refer to designs with a finite number of decision points. What happens with an indefinite time horizon or decisions in continuous time? Microrandomized trials and Just-in-time Adaptive Interventions (JITAI)12 — extend the idea of randomizing critical decision points to an indefinite time horizon. Markov Decision Processes Markov property: loosely, if for any time 𝑠 ∈ ℕ+ 𝑃(𝑋𝑠+1|𝑋𝑠) = 𝑃(𝑋𝑠+1|𝑋𝑠) 12 Klasnja et al., “Microrandomized Trials”. 26/48
  • 51.
    SMART vs. Cross-overDesign13 13 Kosorok, Michael R. and Moodie, Erica EM, Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. 27/48
  • 52.
    SMART vs. Cross-overDesign13 13 Kosorok, Michael R. and Moodie, Erica EM, Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. 27/48
  • 53.
    Test your knowledge Whatconsiderations (e.g. clinical setting, intervention(s), outcomes, etc.) would typically suggest using a 1. SMART? 2. Cross-over trial? 3. Cross-sectional design? 4. A microrandomized trial? 28/48
  • 55.
    Key Estimands
    Non-decision-making Estimands
    • First-stage average treatment effects
    • Response probabilities
    Treatment Policies
    • Identification of the best embedded treatment policy (eDTR)
    • Optimal policy $\pi^*$
    • Optimal policy $\pi^*_{\mathcal{F}}$ in a restricted function class $\mathcal{F}$ (e.g. the optimal policy among linear decision rules, trees of depth $c$, or embedded policies)
    Value of a Policy and Value Comparisons
    • Value of a fixed policy or policies (usually the embedded policies), $V(\pi)$
    • Value of the estimated optimal policy, $V(\hat\pi)$
    • Value of the optimal policy, $V(\pi^*)$
    • Comparison of non-overlapping policies, e.g. most intensive vs. least intensive
    The difference between policy estimation and value estimation is analogous to the difference between estimating the parameters of a linear model and estimating the average treatment effect.
  • 59.
    Analyzing the First Stage as the Primary Aim
    The first stage of a SMART is identical to a standard RCT, and standard methods can be applied. In the early days of SMARTs it was more common to make these analyses the primary aim and use them for power calculations. As SMARTs have become more widespread this is less common.
    • First-stage power isn’t likely to be informative of the power for other analyses.
    • A first-stage primary aim doesn’t justify the need for a SMART design versus a simpler design.
    • It may still be of interest to ensure power for both a first-stage analysis and another analysis.
  • 60.
    Multiple Comparisons with the Best [14]
    Consider the embedded policies $\pi_i$ from a study. Let $\theta_i = V(\pi_i)$ be the value of the $i$-th embedded DTR and $\hat\theta_i$ be a consistent estimator of $\theta_i$.
    We consider $\pi_i$ statistically indistinguishable from optimal if and only if
    $$\frac{\hat\theta_i - \hat\theta_j}{\sqrt{\mathrm{Var}(\hat\theta_i - \hat\theta_j)}} \ge -c_{i,1-\alpha} \quad \text{for all } j \ne i,$$
    where $c_{i,1-\alpha} > 0$ is chosen so that the set of indistinguishable DTRs includes the best eDTR with probability at least $1 - \alpha$.
    This produces a confidence set $\hat\Pi_\alpha$:
    $$\hat\Pi_\alpha = \left\{ \pi_i : \hat\theta_i \ge \max_{j \ne i} \left[ \hat\theta_j - c_{i,1-\alpha} \sqrt{\mathrm{Var}(\hat\theta_i - \hat\theta_j)} \right] \right\}$$
    [14] William J. Artman et al. “Power Analysis in a SMART Design: Sample Size Estimation for Determining the Best Embedded Dynamic Treatment Regime”. In: Biostatistics 21.3 (July 1, 2020), pp. 432–448; Donald G. Edwards and Jason C. Hsu. “Multiple Comparisons With the Best Treatment”. In: Journal of the American Statistical Association 78.384 (1983), pp. 965–971.
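The inclusion rule above can be sketched in a few lines. This is only an illustration: the function name `mcb_confidence_set`, the toy estimates, and the critical value `c=2.0` are all made up here. In practice $c_{i,1-\alpha}$ has to be calibrated to the joint distribution of the estimators, which this sketch takes as given.

```python
import numpy as np

def mcb_confidence_set(theta_hat, cov, c):
    """Indices i of embedded DTRs kept in the confidence set.

    theta_hat: (m,) estimated values of the embedded DTRs
    cov:       (m, m) estimated covariance matrix of theta_hat
    c:         scalar or (m,) critical values c_{i,1-alpha} > 0 (assumed supplied)
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    cov = np.asarray(cov, dtype=float)
    c = np.broadcast_to(np.asarray(c, dtype=float), theta_hat.shape)
    m = theta_hat.size
    keep = []
    for i in range(m):
        in_set = True
        for j in range(m):
            if j == i:
                continue
            # Var(theta_i_hat - theta_j_hat) from the covariance matrix
            var_diff = cov[i, i] + cov[j, j] - 2.0 * cov[i, j]
            z = (theta_hat[i] - theta_hat[j]) / np.sqrt(var_diff)
            if z < -c[i]:  # pi_i is significantly worse than pi_j
                in_set = False
                break
        if in_set:
            keep.append(i)
    return keep

# Hypothetical estimates for four embedded DTRs with independent estimators.
theta_hat = [1.0, 0.9, 0.2, 0.1]
cov = 0.01 * np.eye(4)
confidence_set = mcb_confidence_set(theta_hat, cov, c=2.0)  # placeholder c
print(confidence_set)
```

With these toy numbers the first two policies cannot be separated from the best, while the last two are excluded.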
  • 63.
    Treatment Policy Estimation Approach Families
    1. Regression-based approaches (indirect) estimate the outcome model; the optimal policy is then simply the argmax over the estimated regression functions. Examples include Q-learning and Advantage-learning (A-learning) [15].
    2. Classification-based approaches (direct, direct search) try to directly optimize over the space of policies by turning the problem into a weighted classification problem. Examples: Outcome weighted learning (OWL) [16], Direct-learning (D-learning) [17].
    3. Combination approaches are also possible.
    [15] Lu Wang et al. “Evaluation of Viable Dynamic Treatment Regimes in a Sequentially Randomized Trial of Advanced Prostate Cancer”. In: Journal of the American Statistical Association 107.498 (June 2012), pp. 493–508.
    [16] Yingqi Zhao et al. “Estimating Individualized Treatment Rules Using Outcome Weighted Learning”. In: Journal of the American Statistical Association 107.499 (Sept. 1, 2012), pp. 1106–1118.
    [17] Zhengling Qi et al. “Multi-Armed Angle-Based Direct Learning for Estimating Optimal Individualized Treatment Rules With Various Outcomes”. In: Journal of the American Statistical Association 115.530 (Apr. 2, 2020), pp. 678–691.
  • 66.
    Backwards Induction
    In multistage learning, policies are estimated from back to front rather than front to back.
    Fill out the expected payoffs in the image with an example where moving forwards would result in a suboptimal policy.
    A very age and culturally specific hint:
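One possible answer to the exercise, as a sketch: a two-stage decision tree where picking the best immediate payoff at each step (front to back) loses to backwards induction. The payoffs and action labels below are hypothetical and are not the ones from the slide's image.

```python
# Hypothetical payoffs: stage-1 action gives an immediate reward, then a
# stage-2 action (whose reward depends on the stage-1 action) is chosen.
stage1_reward = {"L": 5.0, "R": 0.0}
stage2_reward = {
    "L": {"l": 0.0, "r": 1.0},
    "R": {"l": 10.0, "r": 2.0},
}

def greedy_forward():
    """Front to back: maximize the immediate reward at each stage."""
    a1 = max(stage1_reward, key=stage1_reward.get)
    a2 = max(stage2_reward[a1], key=stage2_reward[a1].get)
    return (a1, a2), stage1_reward[a1] + stage2_reward[a1][a2]

def backward_induction():
    """Back to front: solve stage 2 first, then plug its value into stage 1."""
    best_a2 = {a1: max(r2, key=r2.get) for a1, r2 in stage2_reward.items()}
    q1 = {
        a1: stage1_reward[a1] + stage2_reward[a1][best_a2[a1]]
        for a1 in stage1_reward
    }
    a1 = max(q1, key=q1.get)
    return (a1, best_a2[a1]), q1[a1]

greedy_policy, greedy_total = greedy_forward()
optimal_policy, optimal_total = backward_induction()
print(greedy_policy, greedy_total)    # ('L', 'r') 6.0
print(optimal_policy, optimal_total)  # ('R', 'l') 10.0
```

The forward pass is lured by the larger first-stage reward and ends with a total of 6, whereas working backwards finds the path worth 10.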
  • 68.
    Q-learning [18]
    Assume that there are $S = 2$ stages, $K_s$ treatments in stage $s$, and we are interested in the ultimate outcome $Y = Y_2$.
    1. Estimate $\mathbb{E}[Y \mid X_1, A_1, X_2, A_2] \doteq \hat f_2(x_1, a_1, x_2, a_2)$ using your favorite regression method.
    2. Determine the second-stage estimated optimal treatment rule $\hat\pi_2(x_1, a_1, x_2) = \arg\max_{k \in \{1, \dots, K_2\}} \hat f_2(x_1, a_1, x_2, k)$.
    3. Calculate the expected response had everyone been given their estimated optimal treatment $\tilde A_2 = \hat\pi_2(x_1, a_1, x_2)$: $\tilde Y = \hat f_2(x_1, a_1, x_2, \tilde A_2)$.
    4. Estimate $\mathbb{E}[\tilde Y \mid X_1, A_1] \doteq \hat f_1(x_1, a_1)$.
    5. Determine the first-stage estimated optimal treatment rule $\hat\pi_1(x_1) = \arg\max_{k \in \{1, \dots, K_1\}} \hat f_1(x_1, k)$.
    The estimated optimal policy is $\hat\pi = (\hat\pi_1, \hat\pi_2)$.
    [18] Typically what statisticians call Q-learning is referred to as Q-learning with function approximation in the CS literature. What CS people call Q-learning is fully nonparametric, and statisticians will refer to it as tabular Q-learning.
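The five steps can be sketched with ordinary least squares standing in for "your favorite regression". Everything specific here is an assumption made for illustration: the two treatments per stage coded 0/1, the particular linear working models in `design1` and `design2`, and the simulated data-generating model (under which the best rules are "treat at stage 1 if $x_1 > 0$" and "treat at stage 2 if $x_2 > 0$").

```python
import numpy as np

def design2(x1, a1, x2, a2):
    # Assumed stage-2 working model: main effects plus A1*X1 and A2*X2 interactions.
    return np.column_stack([np.ones_like(x1), x1, a1, a1 * x1, x2, a2, a2 * x2])

def design1(x1, a1):
    # Assumed stage-1 working model.
    return np.column_stack([np.ones_like(x1), x1, a1, a1 * x1])

def fit_ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def q_learning_two_stage(x1, a1, x2, a2, y, treatments=(0.0, 1.0)):
    treatments = np.asarray(treatments, dtype=float)

    # Step 1: regress Y on the full history and the stage-2 treatment.
    beta2 = fit_ols(design2(x1, a1, x2, a2), y)

    def q2(x1, a1, x2):
        # Predicted outcome under each candidate stage-2 treatment.
        return np.column_stack(
            [design2(x1, a1, x2, np.full_like(x1, k)) @ beta2 for k in treatments]
        )

    # Step 2: stage-2 rule is the argmax of the fitted regression.
    def pi2_hat(x1, a1, x2):
        return treatments[q2(x1, a1, x2).argmax(axis=1)]

    # Step 3: pseudo-outcome under the estimated optimal stage-2 treatment.
    y_tilde = q2(x1, a1, x2).max(axis=1)

    # Step 4: regress the pseudo-outcome on stage-1 history and treatment.
    beta1 = fit_ols(design1(x1, a1), y_tilde)

    # Step 5: stage-1 rule is the argmax of the stage-1 regression.
    def pi1_hat(x1):
        q1 = np.column_stack(
            [design1(x1, np.full_like(x1, k)) @ beta1 for k in treatments]
        )
        return treatments[q1.argmax(axis=1)]

    return pi1_hat, pi2_hat

# Simulated two-stage SMART with 1:1 randomization at both stages.
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
x2 = 0.5 * x1 + rng.normal(size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y = 0.5 * x1 + a1 * x1 + a2 * x2 + rng.normal(scale=0.5, size=n)

pi1_hat, pi2_hat = q_learning_two_stage(x1, a1, x2, a2, y)
agree1 = np.mean(pi1_hat(x1) == (x1 > 0))
agree2 = np.mean(pi2_hat(x1, a1, x2) == (x2 > 0))
print(agree1, agree2)
```

The agreement rates compare the estimated rules with the rules that are optimal under the simulated model. Note that the stage-1 working model is misspecified for the pseudo-outcome (its main effect is nonlinear in $x_1$), which is typical of Q-learning with linear models.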
  • 75.
    Q-learning
    What is the estimation step of Q-learning when there is only a single stage and a linear model is used?
  • 76.
    Operator notation
    • $X_1, \dots, X_n$ is an iid random sample from a fixed but unknown distribution $P$
    • $g$ is a generic parametric function indexed by $\theta \in \Theta$
    • $\hat\theta \in \Theta$ is a random variable constructed from the sample $X_1, \dots, X_n$
    $P$ denotes the probability measure: $P g(X; \hat\theta) = \int g(x; \hat\theta) \, dP(x)$
    $\mathbb{P}_n$ denotes the corresponding empirical measure: $\mathbb{P}_n g(X; \hat\theta) = n^{-1} \sum_{i=1}^{n} g(x_i; \hat\theta)$
    $\rightsquigarrow$ denotes convergence in distribution.
  • 77.
    Value Functions
    1. Conditional value of the estimated optimal policy: $\mathbb{E}_X[\hat\pi_n(X) \mid \hat\pi_n] = P \hat\pi_n(X)$
    2. (Expected) value of an estimated optimal policy: $\mathbb{E}_X[\hat\pi_n(X)]$
    3. Value of the optimal policy: $\mathbb{E}_X[\pi^*(X)] = P \pi^*(X)$
    Are estimators of these functions asymptotically equivalent?
    Brainstorm a scenario for each of the value functions where that estimand would make the most sense.
  • 80.
    Estimating the Value Function
    • Model-based: seldom used. Why? Think about what a SMART provides by design.
    • Inverse Probability Weighted Estimator. For a two-stage SMART, a fixed regime $\pi$, and histories $H_1 = X_1$, $H_2 = (X_1, A_1, X_2)$:
    $$\hat V_n(\pi) = \mathbb{P}_n \left[ \frac{Y \, 1\{A_1 = \pi_1(H_1)\} \, 1\{A_2 = \pi_2(H_2)\}}{\Pr(A_1 \mid H_1) \Pr(A_2 \mid H_2)} \right]$$
    • Augmented Inverse Probability Weighted Estimator
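A minimal sketch of the IPW estimator for a fixed two-stage regime, assuming the randomization probabilities are known by design (as they are in a SMART). The function name `ipw_value`, the simulated data-generating model, and the regime being evaluated are hypothetical choices for illustration.

```python
import numpy as np

def ipw_value(y, a1, a2, h1, h2, pi1, pi2, p1, p2):
    """IPW estimate of V(pi) for a fixed two-stage regime pi = (pi1, pi2).

    p1, p2: Pr(A1 | H1) and Pr(A2 | H2) evaluated at the observed treatments
            (scalars or per-participant arrays), known by design in a SMART.
    """
    y, a1, a2 = (np.asarray(v, dtype=float) for v in (y, a1, a2))
    # Indicator that the observed treatments match the regime at both stages.
    follows = (a1 == pi1(h1)) & (a2 == pi2(h2))
    weights = follows / (np.asarray(p1, dtype=float) * np.asarray(p2, dtype=float))
    return float(np.mean(y * weights))

# Simulated two-stage SMART with 1:1 randomization at both stages.
rng = np.random.default_rng(1)
n = 200_000
x1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
x2 = 0.5 * x1 + rng.normal(size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y = 0.5 * x1 + a1 * x1 + a2 * x2 + rng.normal(scale=0.5, size=n)

h1 = x1
h2 = np.column_stack([x1, a1, x2])
pi1 = lambda h: (h > 0).astype(float)        # treat at stage 1 if x1 > 0
pi2 = lambda h: (h[:, 2] > 0).astype(float)  # treat at stage 2 if x2 > 0

v_hat = ipw_value(y, a1, a2, h1, h2, pi1, pi2, p1=0.5, p2=0.5)
# Under this simulation V(pi) = E[max(X1, 0)] + E[max(X2, 0)], with Var(X2) = 1.25.
v_true = (1 + np.sqrt(1.25)) / np.sqrt(2 * np.pi)
print(v_hat, v_true)
```

Only participants whose observed treatment sequence is consistent with the regime contribute, each weighted by the inverse of the probability of receiving that sequence.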
  • 81.
    Toy Problem: Max of Gaussian Means
    Suppose we have a random iid sample of size $n$ where
    $$X_i \sim \mathrm{MVN}\left( \mu = \begin{pmatrix} 2 \\ -1 \end{pmatrix}, \; \Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)$$
    In the general case let $p$ denote the dimension of $\mu$ and assume that the covariance matrix is the identity matrix. Suppose we are interested in $\theta$:
    $$\theta = \max_{j \in [p]} \mu_j = \bigvee_{j=1}^{p} \mu_j$$
    Here $\theta = \max\{2, -1\} = 2$.
    Let $\hat\theta_n = \max_{j \in \{1, 2\}} \mathbb{P}_n X_j$.
    What is the limiting distribution of $\sqrt{n}(\hat\theta_n - \theta)$ for $\mu = (2, -1)^T$? Hint: don’t overthink it.
  • 85.
    Max of Gaussian Means continued
    A: $\sqrt{n}(\hat\theta_n - \theta) \rightsquigarrow N(0, 1)$
    Suppose $\mu = (0, 0)^T$. Now what is the limiting distribution of $\sqrt{n}(\hat\theta_n - \theta)$?
    $$\sqrt{n}(\hat\theta_n - \theta) \rightsquigarrow \max(Z_1, Z_2), \quad Z_1, Z_2 \sim N(0, 1) \text{ independent}$$
    Now the limiting distribution is the maximum of two independent standard normal RVs.
    Problem: the limiting distribution depends on the value of the parameter $\mu$.
    To analyze SMARTs we’ll need to address nonregular asymptotics.
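A quick simulation makes the nonregularity visible. This is a sketch with arbitrary choices of `n`, `reps`, and seed. To keep it cheap it draws the sample means directly from their exact $N(\mu_j, 1/n)$ distribution rather than simulating every observation.

```python
import numpy as np

def simulate_root_n_error(mu, n, reps, rng):
    """Draws of sqrt(n) * (theta_hat_n - theta) for theta = max_j mu_j."""
    mu = np.asarray(mu, dtype=float)
    # The mean of n iid N(mu_j, 1) draws is exactly N(mu_j, 1/n).
    xbar = mu + rng.normal(size=(reps, mu.size)) / np.sqrt(n)
    theta_hat = xbar.max(axis=1)
    return np.sqrt(n) * (theta_hat - mu.max())

rng = np.random.default_rng(2023)
regular = simulate_root_n_error([2.0, -1.0], n=400, reps=20_000, rng=rng)
nonregular = simulate_root_n_error([0.0, 0.0], n=400, reps=20_000, rng=rng)

mean_regular = regular.mean()        # close to 0, the mean of N(0, 1)
mean_nonregular = nonregular.mean()  # close to E[max(Z1, Z2)] = 1 / sqrt(pi)
print(mean_regular, mean_nonregular, 1 / np.sqrt(np.pi))
```

When the means are well separated the maximum behaves like a single sample mean, but when they are tied the scaled error is shifted upward: the maximum of two independent standard normals has mean $1/\sqrt{\pi}$, not 0.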
  • 90.
    SMART Design Considerations
    • Time between randomizations
    • Number of stages
    • What defines response status? (One day I’ll convince someone to run a SMART where response status isn’t dichotomized.)
    • Will responders be re-randomized or continue treatment (deterministic)?
    • Set of treatments available at each stage
    • Outcomes used to assess response
    • Can the coordinating center handle the additional logistical complexity?
    • Does the study design match the scientific aim (true of all studies)?
  • 91.
    Common Second-Stage Design Patterns
    • Switch off ineffective treatment
    • Step up (intensify) intervention
    • Step down (relax) intervention
    • Dose adjustment
  • 92.
    Challenges with SMART Grant Proposals
    • Need for increased communication between statisticians and clinicians when designing the study and preparing the protocol
    • Nonstandard sample size calculations
    • Reviewers may be unfamiliar with SMART designs
    • Reviewers may be unfamiliar with novel statistical methods
  • 93.
    Choosing a Design
    Never forget that the scientific or clinical question should determine the design, not vice versa.
  • 94.
    Choosing a name for your method
    The success of the term SMART [19][20] is a bit of a branding coup. This is not to denigrate Susan Murphy’s extensive contributions, but to highlight other researchers in the area at the time.
    • Multiphase Optimization Strategy (MOST) [21]
    • Bayesian trials with rerandomization [22]
    • Trials using “outcome-driven re-randomizations” [23]
    [19] S. A. Murphy. “An Experimental Design for the Development of Adaptive Treatment Strategies”. In: Statistics in Medicine 24.10 (2005), pp. 1455–1481.
    [20] Interestingly, this paper never uses “SMART”, always “SMAR” trials.
    [21] Linda M. Collins et al. “A Strategy for Optimizing and Evaluating Behavioral Interventions”. In: Annals of Behavioral Medicine 30.1 (Feb. 1, 2005), pp. 65–73.
    [22] Peter F. Thall, Randall E. Millikan, and Hsi-Guang Sung. “Evaluating Multiple Treatment Courses in Clinical Trials”. In: Statistics in Medicine 19.8 (2000), pp. 1011–1028; Peter F. Thall et al. “Hierarchical Bayesian Approaches to Phase II Trials in Diseases with Multiple Subtypes”. In: Statistics in Medicine 22.5 (2003), pp. 763–780.
    [23] Maurizio Fava et al. “Background and Rationale for the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) Study”. In: Psychiatric Clinics of North America 26.2 (June 1, 2003), pp. 457–494; Sonia M. Davis et al. “Statistical Approaches to Effectiveness Measurement and Outcome-Driven Re-Randomizations in the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) Studies”. In: Schizophrenia Bulletin 29.1 (2003), pp. 73–80.
  • 102.
    Artman, William J. et al. “Power Analysis in a SMART Design: Sample Size Estimation for Determining the Best Embedded Dynamic Treatment Regime”. In: Biostatistics 21.3 (July 1, 2020), pp. 432–448.
    Bidargaddi, N. et al. “Designing M-Health Interventions for Precision Mental Health Support”. In: Translational Psychiatry 10.1 (July 7, 2020), pp. 1–8.
    Collins, Linda M. et al. “A Strategy for Optimizing and Evaluating Behavioral Interventions”. In: Annals of Behavioral Medicine 30.1 (Feb. 1, 2005), pp. 65–73.
    Davis, Sonia M. et al. “Statistical Approaches to Effectiveness Measurement and Outcome-Driven Re-Randomizations in the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) Studies”. In: Schizophrenia Bulletin 29.1 (2003), pp. 73–80.
    Edwards, Donald G. and Jason C. Hsu. “Multiple Comparisons With the Best Treatment”. In: Journal of the American Statistical Association 78.384 (1983), pp. 965–971.
    Fava, Maurizio et al. “Background and Rationale for the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) Study”. In: Psychiatric Clinics of North America 26.2 (June 1, 2003), pp. 457–494.
  • 103.
    Klasnja, Predrag et al. “Microrandomized Trials: An Experimental Design for Developing Just-in-Time Adaptive Interventions”. In: Health Psychology: Official Journal of the Division of Health Psychology, American Psychological Association 34S.0 (Dec. 2015), pp. 1220–1228.
    Kosorok, Michael R. and Eric B. Laber. “Precision Medicine”. In: Annual Review of Statistics and Its Application 6.1 (2019), pp. 263–286.
    Kosorok, Michael R. and Moodie, Erica EM, eds. Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. Vol. 21. ASA-SIAM Series on Statistics and Applied Mathematics. SIAM, 2015.
    Lorenzoni, Giulia et al. “Use of Sequential Multiple Assignment Randomized Trials (SMARTs) in Oncology: Systematic Review of Published Studies”. In: British Journal of Cancer 128.7 (Mar. 2023), pp. 1177–1188.
    Murphy, S. A. “An Experimental Design for the Development of Adaptive Treatment Strategies”. In: Statistics in Medicine 24.10 (2005), pp. 1455–1481.
    Qi, Zhengling et al. “Multi-Armed Angle-Based Direct Learning for Estimating Optimal Individualized Treatment Rules With Various Outcomes”. In: Journal of the American Statistical Association 115.530 (Apr. 2, 2020), pp. 678–691.
  • 104.
    Smith, Sophia K. et al. “A SMART Approach to Optimizing Delivery of an mHealth Intervention among Cancer Survivors with Posttraumatic Stress Symptoms”. In: Contemporary Clinical Trials 110 (Nov. 2021), p. 106569.
    Thall, Peter F., Randall E. Millikan, and Hsi-Guang Sung. “Evaluating Multiple Treatment Courses in Clinical Trials”. In: Statistics in Medicine 19.8 (2000), pp. 1011–1028.
    Thall, Peter F. et al. “Hierarchical Bayesian Approaches to Phase II Trials in Diseases with Multiple Subtypes”. In: Statistics in Medicine 22.5 (2003), pp. 763–780.
    Tsiatis, Anastasios A. et al. Dynamic Treatment Regimes: Statistical Methods for Precision Medicine. New York: Chapman and Hall/CRC, Dec. 19, 2019. 618 pp.
    Wang, Lu et al. “Evaluation of Viable Dynamic Treatment Regimes in a Sequentially Randomized Trial of Advanced Prostate Cancer”. In: Journal of the American Statistical Association 107.498 (June 2012), pp. 493–508.
    Zhao, Yingqi et al. “Estimating Individualized Treatment Rules Using Outcome Weighted Learning”. In: Journal of the American Statistical Association 107.499 (Sept. 1, 2012), pp. 1106–1118.
  • 106.
    Resources
    Books
    • Focused more on estimation than design, but still probably the number one resource: Anastasios A. Tsiatis et al. Dynamic Treatment Regimes: Statistical Methods for Precision Medicine. New York: Chapman and Hall/CRC, Dec. 19, 2019. 618 pp.
    • Older reference: Kosorok, Michael R. and Moodie, Erica EM, eds. Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. Vol. 21. ASA-SIAM Series on Statistics and Applied Mathematics. SIAM, 2015.
    Course Notes
    • BIOS 740: https://mkosorok.web.unc.edu/current-course/, Archive.org link
    • SAMSI 2019 Precision Medicine Course: https://www.slideshare.net/search?searchfrom=header&q=2019+PMED+Course
    Papers
    • Michael R. Kosorok and Eric B. Laber. “Precision Medicine”. In: Annual Review of Statistics and Its Application 6.1 (2019), pp. 263–286.
    • N. Bidargaddi et al. “Designing M-Health Interventions for Precision Mental Health Support”. In: Translational Psychiatry 10.1 (July 7, 2020), pp. 1–8.
  • 107.
    Online Resources
    • Data Science for Dynamic Intervention Decision Making: https://d3c.isr.umich.edu/
    • (Archived) Penn State Methodology Center: https://wayback.archive-it.org/3524/20210524145320/https://www.methodology.psu.edu/ra/adap-inter/
  • 109.
    Terminology Overview
    The slides in this section are a reference for the many terms that have been used to describe the same or very similar concepts. In some cases these terms are a specific subtype of the term that corresponds to a DTR. For example, clinical decision support systems are a broad term that includes notifications or recommendations but don’t prescribe actions, but automated decision support systems
    While it was sufficiently exhausting, it is sadly not exhaustive.
  • 110.
    Terminology Equivalents - DTRs
    Adaptive treatment strategies, Adaptive interventions, Decision functions, Dynamic treatment regimes, Multistage treatment strategies, Stepped-care strategies, Treatment policies, Treatment regimes, Treatment rules, Individualized treatment rules, Expert systems, Expert advice systems, (Automated) Clinical Decision Support System, (Automated) Decision Aid, (Automated) Decision Support Tool
  • 111.
    The “Dynamic” in Dynamic Treatment Regime
    Depending on the author, the “dynamic” in dynamic treatment regime can mean:
    • Dynamic = time-varying
    • Dynamic = depends on patient covariates (including responder status)
  • 113.
    “Cheating” the Markov Assumption
    When the number of stages $J$ is finite we can expand the state space to ensure the Markov condition holds:
    $$X_j^+ = \left( \bigcup_{i=1}^{j} (X_i, A_i, Y_i) \right) \cup X_j$$
    Then the Markov condition
    $$\mathbb{E}[Y_j \mid X_1^+, \dots, X_j^+] = \mathbb{E}[Y_j \mid X_j^+]$$
    is trivially satisfied.
    This trick is not available to us in the infinite horizon case, though we can continue in the spirit of this approach by defining summary measurements.