Statistics in Clinical and Translational Research in Drug Development - Judith D. Goldberg, Sc.D. Professor
Division of Biostatistics New York University School of Medicine
1. Statistics in Clinical and
Translational Research in
Drug Development
Judith D. Goldberg, Sc.D.
Professor
Division of Biostatistics
New York University School of Medicine
MedicReS
International Congress on Good Medical Research
New York, New York October 16, 2014
JD Goldberg MedicReS 10162014
2. Personal Perspectives from:
Pharmaceutical industry drug and device development
Non profit health care research
Academia
FDA Advisory Committee Member
Expert Witness, Other
History
Current views
Future directions and challenges
Bioinformatics
Big Data
Personalized medicine
3. Statistics in Clinical and Translational
Research: Process
Planning
Problem formulation
What is the question (hypothesis)?
Study design
Type of study? Comparison?
What is the intervention? Outcome?
For whom?
When? For how long?
Sample size?
Data collection: forms design, database design, procedures,
timelines
Contingency plans? early stopping?
Analysis plan
4. Statistics in Clinical and Translational
Research: Process
Implementation
Study conduct
Study progress
accrual, data and safety monitoring
Data management
Study Completion
Study closeout
Data analysis
Interpretation
Reporting
5. Environment [early 1970s]
New statistical methods:
logistic regression
log linear models
Cox proportional hazards model
Batch computing, IBM cards, card readers,
sorters, tape back ups, …
Statistical computing:
SPSS, BMDP
limited software
6. Current Environment
Basic issues of study design, replication
need to be addressed
Software availability (R, SAS, STATA, …)
Emphasis on speed, efficiency, accelerated
development
Large amounts of data need special tools
Multiplicity makes usual p-values
uninterpretable – false discovery rate
Assumptions in pre-processing of data at
multiple steps influence results
Assumptions in analytic methods influence
results
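The multiplicity point can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure for controlling the false discovery rate. The slides do not name a specific FDR method, so this choice is an assumption; the function name `benjamini_hochberg` and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    by the Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from eight simultaneous tests (illustrative only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
naive = [p <= 0.05 for p in pvals]          # unadjusted threshold
fdr = benjamini_hochberg(pvals, q=0.05)     # FDR-controlled decisions
print(sum(naive), "unadjusted rejections;", sum(fdr), "after FDR control")
```

With these made-up values, fewer hypotheses survive FDR control than pass the unadjusted 0.05 threshold, which is the sense in which the usual p-values stop being interpretable under multiplicity.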
7. Changing Roles
Basic statistical issues remain the same
Focus on problem identification
Collaborative involvement throughout research
process
Planning
Implementation
Reporting
Statistical thinking has expanded
Tools and methods have changed
Advances in science
Explosion in amounts of data
Enabled by advances in computing
8. Biostatistics in Drug Development:
Today and Tomorrow
Issues:
Basic issues the same
Thinking has expanded
Problems more complex
High dimensional data: many variables, small numbers of observations
Environment:
Interdisciplinary research: TEAM SCIENCE
Challenges for statistics
Expanded role in problem formulation, complex research process, input
into all stages of development
Interactions with ‘bioinformatics’, informatics
Data sharing, regulations (e.g., privacy)
Combining data from multiple sources; warehousing
Making explicit requirements for IT infrastructure to enable and
enhance research process
9. New [and Old] Opportunities
Strategic input at all stages of drug development
Compound screening
Patent preparation ***
New study designs to address efficiency without
compromising science
Phase I/II; Phase II/III; adaptive designs
Incorporation of biomarkers
Safety evaluations from early development
through post marketing
Combining data from multiple sources
Comparative effectiveness
11. Pre Clinical Development
Drug Discovery
Compound Screening
High throughput in silico
Animal models
New Opportunity
Statistical methods for screening to
minimize false negative and false
positive leads
Use of experimental designs to optimize
screening and animal testing
12. Patents
Historically, little statistical involvement
File rapidly
Example:
• Survival curves in mice calculated incorrectly
Led to major patent litigation
Ignorance, but should not happen
• Lab notebooks at issue as well
Labelled ‘fraud’
Example:
• Patent claims that two drugs given together are
synergistic
13. Phase I
Investigator controlled treatment administration and structured
observations
Generally not randomized; can be circumstances where
randomization is used
Objectives:
Safety and tolerance; single and multiple dose
Dose finding (MTD: maximum tolerated dose, the dose associated with
serious but reversible side effects in a ‘sizeable’ proportion of
patients); use RPTD (recommended Phase II dose), one level down
Bioavailability: rate and extent to which active ingredient or
therapeutic compound is absorbed and available at site of action
Equivalence of formulations, drugs (bioequivalence)
Special populations, drug-drug interactions, fed/fast
Exploratory - tentative answers
Issues—ethics
Healthy volunteers vs patients
14. Phase II Designs
Objective: Preliminary evidence of efficacy
and side effects at fixed dose(s)
Parallel group randomized designs
Uncontrolled single group
Objectives:
Proof of concept, efficacy, mechanism, dose
ranging, pilot studies
15. Phase II Objectives -
continued
Estimate clinical endpoint with
specified precision
Proportion of patients who
respond
Average change from baseline
in diastolic blood pressure
Proportion of patients with
side effects
Proportion of patients who fail
(failure rate)
Dose response
16. Types of Phase II Designs
Single arm uncontrolled trial with specified
number of patients to estimate the response rate
with specified precision
Example: If 20% is the lowest acceptable response rate
for a new treatment and no responses are observed in
12 patients, then the exact binomial upper 95% confidence
limit is roughly 20% (about 22% for a one-sided limit).
Randomized phase II
Seamless Phase II/III
Response to pressure for more efficient study designs
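The zero-response example on this slide can be checked numerically. When 0 responses are seen in n patients, the exact one-sided upper confidence limit solves (1 - p)^n = α, giving p = 1 - α^(1/n). The sketch below assumes a one-sided 95% limit; the function name is hypothetical.

```python
def upper_limit_zero_responses(n, alpha=0.05):
    """Exact one-sided upper (1 - alpha) confidence limit for a binomial
    proportion when 0 responses are observed in n patients.
    Obtained by solving (1 - p)**n = alpha for p."""
    return 1.0 - alpha ** (1.0 / n)


# Compare a few sample sizes around the slide's example of 12 patients.
for n in (12, 13, 14):
    print(n, round(upper_limit_zero_responses(n), 3))
```

Under this assumption the limit is about 0.22 for 12 patients and first falls below 0.20 at 14 patients, so the 20% figure on the slide is best read as approximate.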
17. Phase II Single Arm Two Stage
Designs (Simon, 1989)
p0 uninteresting level of response
p1 interesting level of response
If true probability of response is no more
than p0, then the chance of accepting
treatment for further study is at most α
If true probability of response is at
least p1, chance of rejecting
treatment is at most β (1-power)
18. Simon Two Stage Design- cont.
Study ends at end of stage 1 only if the
treatment appears ineffective
Stop early only for lack of efficacy
Stage 1: If r1 or fewer responses are
observed in the first n1 patients, stop;
otherwise continue
Stage 2: If more than r responses (total
stage 1 + stage 2) are observed in n (total
patients), continue to study the drug
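The operating characteristics of a two-stage design of this kind can be computed directly from binomial probabilities. The sketch below is a minimal illustration, not the author's software. It assumes the usual convention that the drug is rejected if r1 or fewer responses occur in stage 1, or r or fewer in total. The design values (r1=1, n1=10, r=5, n=29 for p0=0.10, p1=0.30) are used only as an illustrative input.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); 0 if k is negative."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))


def two_stage_characteristics(r1, n1, r, n, p):
    """Return (probability of early termination, probability of rejecting
    the drug, expected sample size) when the true response rate is p."""
    n2 = n - n1
    pet = binom_cdf(r1, n1, p)            # stop after stage 1
    reject = pet
    # Continue to stage 2 when x1 > r1; reject if total responses <= r.
    for x1 in range(r1 + 1, n1 + 1):
        reject += binom_pmf(x1, n1, p) * binom_cdf(r - x1, n2, p)
    expected_n = n1 + (1 - pet) * n2
    return pet, reject, expected_n


# Illustrative design inputs (assumed for this example).
pet0, reject0, en0 = two_stage_characteristics(1, 10, 5, 29, p=0.10)
_, reject1, _ = two_stage_characteristics(1, 10, 5, 29, p=0.30)
alpha = 1 - reject0      # chance of accepting an uninteresting drug
power = 1 - reject1      # chance of accepting an interesting drug
print(round(pet0, 3), round(alpha, 3), round(power, 3), round(en0, 1))
```

The same function can be looped over candidate (r1, n1, r, n) values to search for a design that meets chosen α and β constraints.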
19. Adaptive Designs
Accumulating data as basis for
modifying trial without impacting
validity, integrity
20. Possible Adaptations:
Early stop (futility, early rejection)
Sample size re-assessment
Treatment allocation ratio change
Treatment arms changes (drop, add, modify)
Change hypotheses
Change study population (inclusion, exclusion)
Change test statistics
Combine trials (e.g., seamless Phase II/III)
21. Adaptive Designs: Sample Size
Re-assessment
When, how
Blinded, unblinded
FDA Draft Guidance (2010)
‘Revisions based on blinded interim evaluations of data (e.g.,
aggregate event rates, variance, discontinuation rates, baseline
characteristics) do not introduce statistical bias into the study
or into subsequent study revisions made by the same personnel.
Certain blinded-analysis-based changes, such as sample size
revisions based on aggregate event rates or variance of the
endpoint, are advisable procedures that can be considered and
planned at the design stage, but can also be applied
when not planned from the study outset if the study has
remained unequivocally blinded’. [13, lines 91-96].
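A blinded sample size re-assessment based on the variance of the endpoint might look like the following minimal sketch. It assumes a two-arm comparison of means with the standard normal-approximation formula n = 2(z_{1-α/2} + z_{power})² σ² / δ² per arm. The planning values, the interim data, and the function name `n_per_arm` are all hypothetical.

```python
from math import ceil
from statistics import NormalDist, variance


def n_per_arm(sigma, delta, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided two-sample comparison of means
    (normal approximation), given SD sigma and target difference delta."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / delta ** 2)


# Planning stage: assumed SD of 10 and target difference of 5 (made up).
planned = n_per_arm(sigma=10.0, delta=5.0)

# Blinded interim look: outcomes pooled over both arms, no treatment labels.
pooled_outcomes = [98, 112, 87, 120, 105, 91, 133, 79, 108, 116, 95, 124]
blinded_sd = variance(pooled_outcomes) ** 0.5

# Only allow the sample size to increase, never to shrink below the plan.
revised = max(planned, n_per_arm(blinded_sd, delta=5.0))
print(planned, round(blinded_sd, 1), revised)
```

The pooled (one-sample) variance ignores arm membership, so it tends to overstate the within-arm variance somewhat when a treatment effect exists; that conservatism is one reason a blinded estimate is treated as a planning quantity and not as an unblinded comparison.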
22. Seamless Phase II/III
Designs
Goal: Combine treatment selection
and confirmation into one trial to
speed development
During trial, choose optimal dose,
population based on interim data
Surrogate marker, early data on endpoint,
primary endpoint
Enrollment continues on selected dose,
treatment arm(s), and population
23. Intention To Treat (ITT)
Principle
Analyze all subjects randomized
Include all events
Beware of “look alikes”
Modified ITT: Analyze subjects who get
some intervention
Per Protocol: Analyze subjects who
comply according to the protocol
24. Dynamic Treatment Strategies
and SMART Trials
DTS: set of decision rules for
management of patients
Can be represented by time-indexed
mapping from patient state history and
previous treatments into set of possible
treatment strategies
SMART: experiment for comparing DTSs
Randomizes to different treatment
branches that separate DTSs
25. SMART continued
ITT randomizes at start
Treatment changes after initial
randomization are not randomized, and
analysis is over the distribution of implied DTSs
A DTS ITT trial converts to a SMART by
randomizing at the points where treatment
decisions would change
‘Sequential ignorability’ (generalizes
Rubin ignorability in this context)
26. SMART Analysis
G estimation, marginal means, optimal
semi parametric estimator (Moodie,
2007)
Patient information contributed to one
or more DTS until patient leaves that
DTS
Alternative to baseline randomization
among DTSs
27. Choices in Design of Randomized Controlled
Trials
Treatment regimen
Controls
Types of patients and severity of disease
Level of blinding
Parallel group or alternative design
Number and size of centers
Stratification
Interim analysis/monitoring
Adaptive
Bayesian
Length of observation period, need for retreatment
Methods of treatment delivery
Unit of analysis
Outcomes and their measures; measurement error
Meaningful effect size [statistical significance vs. clinical significance]
28. Defining the Question
Defined carefully in advance
Must be clinically relevant
Prioritize into primary, secondary, …
Design built around primary question(s)
Superiority, non inferiority, equivalence of
treatments with respect to outcome
Eligibility criteria define population studied
and inferences to be made
Surrogates desirable but risky
Need the relevant measure of efficacy
29. Who Should Be Studied?
Homogeneous                        vs.   Heterogeneous
• Well defined                           Not easily specified
• Mechanism of action well known         Not known if all groups respond similarly
• No dilution of results                 Easier to recruit
• Infer results specifically             Easier to generalize
30. Outcome measures
Occurrence of event
e.g., in-hospital mortality
Time to event
e.g., time to death, time to heart failure
Mean level of response
e.g., VO2, 6 min walk
Mean change from baseline in key variable
Response (yes/no)
31. Data analysis
Descriptive data analysis
Specify in advance
Primary
Secondary
Other
Statistical approach
Exploratory
33. What Data Should Be Analyzed?
Basic Intention-to-Treat Principle
Analyze what is randomized!
All subjects randomized, all events during
follow-up
Randomized controlled trial is the “gold standard”
Definitions
Exclusions
Screened but not randomized
Affects generalizability but validity OK
Withdrawals from Analysis
Randomized, but not included in data analysis
Possible to introduce bias!
34. Patient Closeout
ICH E9 Glossary
“Intention-to-treat principle - …It has
the consequence that subjects allocated
to a treatment group should be followed
up, assessed, and analyzed as members
of that group irrespective of their
compliance with the planned course of
treatment.”
35. Patient Withdrawn in
Analysis
Common Practice - 1980s
Over 3 years, 37/109 trials in New England Journal of
Medicine published papers with some patient data not
included
Typical Reasons
-Patient ineligible (in retrospect)
-Noncompliance
-Competing events
-Missing data
36. Patient Withdrawn in Analysis-continued
Patient INELIGIBLE after randomization
Concern ineligible patients may dilute treatment effect
Temptation to withdraw ineligibles
Withdrawal of ineligible patients, post hoc, may introduce
bias
37. Sources of Bias in Clinical Trials
• Patient selection
• Treatment assignment
• Evaluation of patient outcomes
• Dropouts, crossovers
• Loss to follow up
• Missing covariate data
• Missing outcome data
Methods to Minimize Bias
• Randomized Controls
• Double blind (masked)
• Analyze as randomized (intent to treat)
38. Betablocker Heart Attack Trial
(JAMA, 1982)
3837 post MI patients randomized
341 patients found by Central Review to be ineligible
Results: % Mortality
              Propranolol   Placebo
Eligible          7.3          9.6
Ineligible        6.7         11.3
Total             7.2          9.8
In the ineligible patients, treatment works best
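The arithmetic behind the remark about ineligible patients can be made explicit. The short sketch below uses only the mortality percentages shown on this slide and computes the absolute difference and the relative reduction for each row; the variable names are illustrative.

```python
# % mortality as (propranolol, placebo), taken from the slide.
mortality = {
    "Eligible": (7.3, 9.6),
    "Ineligible": (6.7, 11.3),
    "Total": (7.2, 9.8),
}

summary = {}
for group, (propranolol, placebo) in mortality.items():
    abs_diff = round(placebo - propranolol, 1)                     # percentage points
    rel_reduction = round(100 * (placebo - propranolol) / placebo, 1)  # percent
    summary[group] = (abs_diff, rel_reduction)
    print(f"{group:<11} difference {abs_diff} points, "
          f"relative reduction {rel_reduction}%")

# Row with the largest absolute difference in mortality.
largest = max(summary, key=lambda g: summary[g][0])
print("Largest apparent benefit:", largest)
```

Because the apparent benefit is largest among patients later judged ineligible, dropping them after the fact would change the estimated effect, which is the bias concern raised on the preceding slides.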
39. Data Analysis Issues
Heterogeneity among patients
Non compliance
Crossovers, dropouts
Approaches:
Censoring at time of crossover, dropout
Causal effects and principal stratification
methods
Complier average causal effects (CACE)
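As a rough illustration of the complier average causal effect, the sketch below uses the simple ratio (instrumental variable) estimator: the intention-to-treat effect on the outcome divided by the intention-to-treat effect on treatment receipt. It assumes randomization, no effect of assignment except through treatment received, and no defiers. The data and the function name `cace_estimate` are made up.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def cace_estimate(y_assigned_trt, y_assigned_ctl, d_assigned_trt, d_assigned_ctl):
    """Ratio estimator of the complier average causal effect.
    y_*: outcomes by randomized arm; d_*: 1 if treatment actually received."""
    itt_outcome = mean(y_assigned_trt) - mean(y_assigned_ctl)
    itt_receipt = mean(d_assigned_trt) - mean(d_assigned_ctl)
    if itt_receipt == 0:
        raise ValueError("assignment did not change treatment receipt")
    return itt_outcome / itt_receipt


# Hypothetical trial: 10 patients per arm, binary response outcome.
y_trt = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]   # 60% respond in the treatment arm
y_ctl = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # 20% respond in the control arm
d_trt = [1] * 8 + [0] * 2                # 80% of the treatment arm complied
d_ctl = [0] * 10                         # no control patient got treatment

itt = mean(y_trt) - mean(y_ctl)
cace = cace_estimate(y_trt, y_ctl, d_trt, d_ctl)
print(round(itt, 2), round(cace, 2))
```

The ITT contrast still answers the as-randomized question; the ratio rescales it to the subgroup of compliers and depends entirely on the stated assumptions.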
40. Data Analysis Issues
continued
Missing Data
Outcomes
Dummy variable to indicate whether outcome
observed or not vs covariates
Covariates
Multiple imputation
Inverse probability weighting
Propensity score adjustments for
balance
Sensitivity analyses
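Inverse probability weighting can be sketched in a few lines. The example below assumes that missingness depends only on an observed stratum (for example, disease severity), estimates the probability of being observed within each stratum, and weights each observed outcome by the inverse of that probability. The data and the function names are hypothetical.

```python
from collections import defaultdict


def complete_case_mean(records):
    """Mean of observed outcomes, ignoring missing values (None)."""
    observed = [y for _, y in records if y is not None]
    return sum(observed) / len(observed)


def ipw_mean(records):
    """Weighted mean of observed outcomes, where each observed record is
    weighted by 1 / (proportion observed in its stratum)."""
    n_total = defaultdict(int)
    n_observed = defaultdict(int)
    for stratum, y in records:
        n_total[stratum] += 1
        if y is not None:
            n_observed[stratum] += 1
    numerator = 0.0
    denominator = 0.0
    for stratum, y in records:
        if y is None:
            continue
        weight = n_total[stratum] / n_observed[stratum]
        numerator += weight * y
        denominator += weight
    return numerator / denominator


# Made-up data: (stratum, outcome); half of the severe outcomes are missing.
records = [
    ("mild", 10), ("mild", 12), ("mild", 11), ("mild", 13),
    ("severe", 20), ("severe", 22), ("severe", None), ("severe", None),
]
print(round(complete_case_mean(records), 2), round(ipw_mean(records), 2))
```

In this toy example the complete-case mean under-represents the severe stratum, while the weighted mean restores each stratum to its share of the randomized sample. A sensitivity analysis would vary the missingness assumptions.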
41. Example:
New Beta-blocker for
Hypertension
Changed paradigm of initial treatment of
mild-moderate hypertension from
monotherapy to low dose combination new
beta-blocker and diuretic (standard)
Designed experiment for regulatory
approval of new drug
Preserved monotherapy study
Primary efficacy based on increasing dose and
difference between maximum dose and placebo
Allowed study of combinations
43. Example:
Translational Research:
‘Bench to Bedside’
Issues and Environment
New laboratory science
Explosion of data: genomics, proteomics, …
Data management and computing
Cross disciplinary collaboration
Study design
Reduction of data within and across domains
Integration of diverse data domains
44. Translational Research Studies:
Biomarkers
Investigators are provided with small
number of patient samples for their
substudy in context of larger project
(e.g., clinical trial)
Issue:
Difficult to develop comprehensive,
integrated analysis of disease across all
domains of data
45. Systematic Missing-At-Random (SMAR)
Designs for Translational Research Studies
Belitskaya-Levy, Shao, and Goldberg (2008)*
Motivation: DOD Center of Excellence:
Locally Advanced Breast Cancer
Treatment and Prognosis
Goal: Identification of characteristics that predict pathological
response to treatment, progression, and survival
Based on clinical and laboratory data
genomics, molecular/biochemical markers, immunological,
hormonal markers
clinical, demographic, social, cultural data
Standard chemoradiation protocol and patient follow-up
• Multi-ethnic cohorts
• Multiple cancer centers worldwide
* The International Journal of Biostatistics: http://www.bepress.com/ijb/vol4/iss1/15
46. LABC: Statistical Challenges
in Design and Analysis
Large sample size required for
primary, secondary endpoints
Costly modern technologies for
laboratory studies (time, money)
Inability to measure all variables on
all patients
47. Statistical Solution:
Systematic Missing-At-Random
(SMAR) Design
Entire cohort is used for measurement of
endpoints, important covariates,
inexpensive variables
Nested random subsamples of the cohort
are used to measure more ‘expensive’
classes of variables
As cost of collection increases, random
subsamples are smaller
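The nested sampling idea can be sketched as follows. Each more expensive class of variables is measured on a random subsample of the patients selected at the previous level, so the missingness pattern is planned and monotone. The cohort size, subsample sizes, and function name are hypothetical, and the sketch ignores stratification by center.

```python
import random


def nested_subsamples(patient_ids, sizes, seed=0):
    """Draw nested simple random subsamples.
    sizes: decreasing subsample sizes, one per more expensive variable class.
    Returns a list of ID lists: the full cohort followed by each subsample."""
    rng = random.Random(seed)
    current = list(patient_ids)
    levels = [current]
    for size in sizes:
        if size > len(current):
            raise ValueError("each subsample must fit inside the previous one")
        current = rng.sample(current, size)   # sampled only from the prior level
        levels.append(current)
    return levels


# Hypothetical cohort of 200 patients and three increasingly costly assays.
cohort = list(range(1, 201))
levels = nested_subsamples(cohort, sizes=[100, 40, 10])
for label, ids in zip(["clinical", "assay A", "assay B", "assay C"], levels):
    print(label, len(ids))
```

Because selection into each subsample is random by design and does not depend on unobserved values, the unmeasured variables are missing at random, which is what supports likelihood-based analysis of the whole cohort.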
48. LABC Design: Data Structure
Types of Variables
Number of   Clinical   Genomics   Molecular   Immunology   Mutational   Hormonal
Patients                          markers                  analyses     assays
n1             +
n0             +          +          +            +            +           +
* Stratified by center
50. SMAR Designs: Advantages
Planned Missingness [monotone missing]
enables integrated analysis of entire cohort
with partially observed covariates across all domains
of data
statistically efficient
computationally efficient
cost effective allocation of resources
SMAR data are Missing-At-Random [MAR]
statistical likelihood based inference valid
SMAR designs are prospective
allows evaluation of efficacy, safety of
treatment, survival, …
51. SMAR Design: Summary
Enables integrated statistical analysis across all
data domains
Statistical theory holds
SMAR is MAR
Computationally efficient
Obtain cell probability estimates once prior to EM iteration
Can use outcome (Y) in calculation of cell
probabilities
Cost effective
Designed experiments
Can handle:
Discrete variables with multiple categories
Large numbers of observations; large numbers of variables
Heavy missingness
Two stage response dependent sampling to increase power
52. Example:
Active Controlled Clinical Trials*
Compare new to standard treatment
Dilemma:
design for superiority or non-inferiority
uncertainty about projected efficacy of
new treatment
simultaneous testing?
*Shao, Y., Mukhi, V., and Goldberg, J.D.: A Hybrid Bayesian-frequentist
approach to evaluate clinical trial designs for tests of superiority and non-inferiority.
Stat. in Medicine 27:504–519, 2008
53. Specification of Study
Objective
Decision to conduct a Superiority or
Non-Inferiority trial
δ0 (preliminary estimate of the true difference δ*) and
ε0 (pre-specified non-inferiority margin)
If δ0 >> ε0: Design Superiority
If δ0 < 0 or δ0 < ε0: Design Non-Inferiority
55. How to design?
Single stage
NI - Sup : Test non-inferiority; If non-inferior
then test superiority
Sup - NI : Test superiority; If superiority
fails then test non-inferiority
Adaptive or group sequential
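The NI - Sup sequence can be illustrated with a generic confidence-interval sketch. This is the standard fixed-sequence approach, not the hybrid Bayesian-frequentist method of the cited paper. It assumes that larger outcomes are better, that `diff` is the estimated new-minus-standard difference with standard error `se`, and that `margin` is a positive non-inferiority margin; all names and numbers are hypothetical.

```python
from statistics import NormalDist


def ni_then_superiority(diff, se, margin, alpha=0.05):
    """Test non-inferiority first, then superiority, from one two-sided
    (1 - alpha) confidence interval for the difference (new - standard)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = diff - z * se
    upper = diff + z * se
    non_inferior = lower > -margin          # rules out a loss larger than margin
    # Superiority is only claimed if non-inferiority was shown first.
    superior = non_inferior and lower > 0
    return {"ci": (lower, upper), "non_inferior": non_inferior,
            "superior": superior}


# Made-up estimates on a proportion scale with a margin of 0.05.
modest = ni_then_superiority(diff=0.03, se=0.02, margin=0.05)
strong = ni_then_superiority(diff=0.06, se=0.02, margin=0.05)
print(modest["non_inferior"], modest["superior"])
print(strong["non_inferior"], strong["superior"])
```

Testing in a pre-specified order means the second test is only reached when the first succeeds, which is the usual argument for why the overall type I error rate is not inflated in this sequence.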
56. Single-stage Simultaneous
Testing
Is it appropriate to conduct multiple
tests?
Is overall type I error rate controlled?
Is power adequate?
Are the discoveries reproducible?
57. Hybrid Bayesian - Frequentist Approach
[Mukhi, Shao, Goldberg]
Method:
Specification of uncertainty using
distribution and Bayes formula
Classical endpoint analysis
58. Advantages:
Hybrid Approach
Overall type I error rate is controlled
Using Closed Testing Principle
Pre-specification of ε0 (non-inferiority
margin) is necessary
Adequacy of power for non-inferiority depends on δ0
(preliminary estimate of difference) and ε0
Can plan to conduct simultaneous tests
under reasonable scenarios
59. Example: Patent Litigation
3 clinical trials to compare 2 devices
I: first in man randomized trial of 2
devices evaluated at 6, 12 months; ex US
II: randomized 2 group, evaluated at 6,
24 months; active control; single blind; ex
US
III: randomized 2 group; randomized
within group to 8 month evaluation
(invasive); US
Different control arms
60. Patent Claims
all require in part that the drug
delivery device has
“an in-stent diameter stenosis at 12
months . . . less than about 22%,
as measured by quantitative coronary
angiography.”
61. Example: Patent Claim of Synergy
Based on Randomized Trial Data
Trials designed to test combination
and each agent against placebo
Not designed to test for interaction
Endpoint              Sumatriptan & Naproxen   Sumatriptan      Naproxen         Placebo
                      n/N (%)                  n/N (%)          n/N (%)          n/N (%)
Sustained Response    115/250 (46.0)           66/229 (28.8)    61/247 (24.7)    41/241 (17.0)
Sustained Pain Free   63/250 (25.2)            25/229 (10.9)    29/247 (11.7)    12/241 (5.0)
Pain Response         N=250 (65)               N=229 (49)       N=247 (46)       N=241 (27)
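Because these trials were not designed to test for interaction, any assessment of synergy is after the fact. One possible way to quantify it, shown here only as a sketch, is an additive-scale interaction contrast on the response proportions: (combination - placebo) minus the sum of the two single-agent effects. The choice of scale is an assumption, since synergy can be defined in other ways. The counts are the Sustained Response row from the slide.

```python
from math import sqrt

# (responders, patients) for Sustained Response, from the slide.
arms = {
    "combination": (115, 250),
    "sumatriptan": (66, 229),
    "naproxen": (61, 247),
    "placebo": (41, 241),
}

# Observed response proportion in each arm.
p = {arm: x / n for arm, (x, n) in arms.items()}

# Additive interaction contrast: extra benefit of the combination beyond
# the sum of the two single-agent benefits over placebo.
contrast = p["combination"] - p["sumatriptan"] - p["naproxen"] + p["placebo"]

# Standard error from four independent binomial proportions.
se = sqrt(sum(p[arm] * (1 - p[arm]) / arms[arm][1] for arm in arms))
z = contrast / se
print(round(contrast, 3), round(se, 3), round(z, 2))
```

A contrast near zero is consistent with additive effects. A multiplicative (for example, odds ratio) scale could lead to a different conclusion, which is why the scale needs to be specified before a claim of synergy is evaluated.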
62. Inclusion Criteria for Clinical
Trials
Study        Lesion    Lesion    Number of   Percent Diameter   Vessel Reference
             Type      Length    Lesions     Stenosis           Diameter
SPIRIT I     de novo   <18 mm    1           >50%               3.0 mm
SPIRIT II    de novo   <28 mm    2           50% - 99%          2.5-4.25 mm
SPIRIT III   de novo   <28 mm    1 or 2      50% - 99%          2.5-3.75 mm
And Active Control Arms Differed
63. Comparison of Studies
Study Design/Patient Populations
Study        % Diabetic   % Male   Proportion of Patients   Follow-Up    Percent of Patients with
                                   With Multiple Stents     Evaluation   Follow-up Evaluation
                                                            Time
SPIRIT I     11%          70.1%    1 Stent: 100%            6 mos.       75%
                                                            12 mos.      74.1%
SPIRIT II    20.2%        70.9%    1 Stent: 70%             6 mos.       74.3%
                                   2 Stents: 23%            24 mos.      75%
                                   3 Stents: 5%
                                   4 Stents: 2%
SPIRIT III   29.6%        70.1%    1 Stent: 83%             8 mos.       80%
                                   2 Stents: 15%
                                   3 Stents: 1%
                                   4 Stents: 1%
64. Angiographic Evaluation
Times and Patient Numbers
Study   6 months   8 months   12 months   24 months
I       23                    22
II      223                               85
III                302
65. Analysis
Combined data from all 3 trials with
mixed effects regression models
Differences between two devices
Flawed because of study differences
Patent case won on ‘first principles’
Data not combinable
Different evaluation times
Different patient populations
66. Example:
Multicenter Randomized Clinical Trial PVSG-01:
32P vs Phlebotomy vs Chlorambucil
Issues and Environment:
Multiple endpoints
Long term follow-up
Changes in treatment, supportive care over time
Multiple analyses – ‘adjust’?
‘Intent to treat’ – not invented yet
Interim stopping rules- primitive
Data Safety Monitoring- ad hoc
Results:
Early stopping of treatment arm (chlorambucil)
Major impact on treatment of disease
67. Cumulative Survival by
Treatment: PVSG-01
Berk, Goldberg, et al, NEJM, 1981
68. Leukemia-free Survival from
Randomization
From Randomization Hazard Function
Berk, Goldberg, et al, NEJM, 1981
69. Cumulative Survival by
Treatment: 20 year data
[Figure panels: from randomization; conditional on surviving 7 years]
Berk, Wasserman, Fruchtman, and Goldberg, Chap. 15, Polycythemia and the
Myeloproliferative Disorders, ed. Wasserman, et al, Saunders, 1995.
70. Examples:
New areas for statistical collaboration
and methodology development
Proteomics
Imaging
Biomarkers
Genetics, gene-environment interactions
-----------------------------------------------------------
Adaptive clinical trial designs, other ‘new’ designs
Safety assessment
Combining data from multiple sources
Comparative effectiveness research
…
71. Where next?
Need for collaboration with scientists greater than ever
throughout research process from inception
Continue to exploit new technologies
Continue to make explicit the IT requirements for
infrastructure to enable new approaches
Continue to expand role of biostatistics in drug development
Includes compound screening, high throughput
technologies
Clinical translational research including clinical trials
(controlled and uncontrolled), meta-analysis, safety
evaluation, comparative effectiveness research
Continue to stretch the boundaries of statistics and
statistical thinking
Strategic input into drug development, from compound
identification and patent development through post marketing
effectiveness and safety evaluation
72. Thank you to collaborators and
colleagues:
Health Insurance Plan of Greater New York
Mount Sinai School of Medicine
Lederle Laboratories, American Cyanamid
D. Alemayehu, K. Koury, …
Bristol-Myers Squibb
New York University School of Medicine
Y. Shao, M. Liu, I. Belitskaya-Levy, V. Mukhi, …
Herman P. Friedman
…
73. Currently supported in part
by:
NYU Cancer Center Support Grant:
NCI P30 CA16087
NYU Clinical Translational Science Award:
1UL1RR029893
MPD Research Consortium: P01 CA108671
Locally Advanced Breast Cancer Center of
Excellence: DOD W81XWH-04-2-0905