Clinical Trials are not Enough
Stephen Senn
(c) Stephen Senn 1
Acknowledgements
This work is partly supported by the European Union’s 7th Framework Programme for
research, technological development and demonstration under grant agreement no.
602552 (“IDEAL”).
Basic Thesis
• Clinical trials are experiments
• At every stage of drug development, even phase III, they are
unrepresentative of ‘real world’ clinical practice
• The key to using their results to inform ‘real world’ decisions
is not to make the trial more representative
• The key is to use appropriate scales for analysis and transfer
the results into practical real-world decision making
• This will require
– A different attitude
– More modelling
– Extensive use of auxiliary real world data
An Example
• Dan Moerman’s analysis more than 30 years ago
of Smith-Kline-French’s development of Tagamet®
– Tagamet® (cimetidine) was the best selling drug of its
day
• 31 trials in 1692 patients
– in 17 countries
– Duodenal, Gastric, Mixed
– 13 significant according to Moerman
• Used Chi-square test with Yates’s correction
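For readers who want to see the test mentioned above, here is a minimal sketch of the chi-square statistic with Yates's continuity correction for a single 2x2 table. The counts are hypothetical and are not taken from Moerman's analysis or the Tagamet trials; the function name and table layout are assumptions made for illustration.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square statistic with Yates's correction for a 2x2 table.

    Hypothetical layout:
                   healed   not healed
        treatment     a          b
        placebo       c          d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a margin is zero; the statistic is undefined")
    # Continuity-corrected numerator, floored at zero.
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    stat = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of chi-square with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical trial: 30/40 healed on treatment, 20/40 on placebo.
stat, p_value = yates_chi_square(30, 10, 20, 20)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

Counting how many of 31 such tests are individually "significant" says little about whether the true effect differs between trials, which is the question the following slides take up.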
Here the analysis is on the risk-difference scale and there is significant
evidence of heterogeneity.
What the EBM movement used to
conclude from this sort of thing
• The treatment effect varies according to the type
of patient
• We want RCTs that automatically deliver the right
decision
• We can’t rely on the results from RCTs which
involve artificially selected patients
• We need large simple trials in representative
populations
• They need to be reported using numbers needed
to treat (NNTs)
Problems
• At best you can hope to recommend
treatments that are beneficial on average
• But in practice you can never guarantee that
trials are representative of practice anyway
• You lose the opportunity to study issues more
deeply
• NNTs are a terrible scale for doing analysis
Here the analysis is on the log-odds ratio scale and heterogeneity is
much reduced and not even ‘significant’.
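The dependence of heterogeneity on the scale of analysis can be sketched with a small calculation. The three trials below are invented so that the odds ratio is roughly constant while the control-group healing rate varies; they are not the Tagamet data. The function names and the use of Cochran's Q with inverse-variance weights are assumptions made for illustration.

```python
import math


def risk_difference(events_t, n_t, events_c, n_c):
    """Risk difference and its approximate variance for one trial."""
    p_t, p_c = events_t / n_t, events_c / n_c
    var = p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c
    return p_t - p_c, var


def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log-odds ratio and its approximate variance for one trial."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def cochran_q(estimates):
    """Inverse-variance pooled estimate and Cochran's Q statistic."""
    weights = [1 / var for _, var in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (est - pooled) ** 2 for w, (est, _) in zip(weights, estimates))
    return pooled, q


# Hypothetical trials: (healed on treatment, n, healed on placebo, n).
TRIALS = [(31, 100, 10, 100), (80, 100, 50, 100), (97, 100, 90, 100)]

rd_pooled, rd_q = cochran_q([risk_difference(*t) for t in TRIALS])
lor_pooled, lor_q = cochran_q([log_odds_ratio(*t) for t in TRIALS])

CRITICAL_5PCT_2DF = 5.991  # chi-square, 2 degrees of freedom, 5% level
print(f"Q on risk-difference scale: {rd_q:.2f}")
print(f"Q on log-odds ratio scale:  {lor_q:.2f}")
```

With these invented counts, Q exceeds the 5% critical value on the risk-difference scale but not on the log-odds ratio scale. The same data can therefore look heterogeneous or homogeneous depending on the scale chosen.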
“If you need statistics to prove it I don’t believe it”
You can’t prove it with statistics but everybody believes it
Thanks to Pat
Ballew’s blog site
That is to say, we see non-random variation too easily.
This example does NOT give strong evidence of treatment-
by-trial interaction
Lessons
• If not carefully studied, random variation can
be underestimated
• Differences from trial to trial in true effect
may be less than one thinks
• Finding a good scale is important
• BUT the additive scale is not necessarily the
relevant one
What not to do
• The solution is not to attempt to make trials
more representative
• The solution is to measure appropriately and
translate appropriately
• This requires the following
– Good scales
– Good analysis
– Good modelling
– Good supplementary real world data
Chasing sub-groups leads nowhere
Solution?
“a possible resolution is to use the additive
measure at the point of analysis and transform
to the relevant scale at the point of
implementation. This transformation at the
point of medical decision-making will require
auxiliary information on the level of background
risk of the patient.”
Senn, Statistics in Medicine, 2004
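The transformation described in the quotation can be sketched as follows, assuming the additive measure is a log-odds ratio and the relevant scale for the decision is absolute risk. The effect size and the two background risks are hypothetical, and the function name is an assumption made for illustration.

```python
import math


def absolute_benefit(baseline_risk, log_odds_ratio):
    """Translate a log-odds ratio into absolute terms for one patient.

    Returns the risk on treatment, the absolute risk reduction (ARR)
    and the number needed to treat (NNT) at the given background risk.
    """
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    logit_control = math.log(baseline_risk / (1.0 - baseline_risk))
    # The treatment effect is added on the log-odds scale.
    logit_treated = logit_control + log_odds_ratio
    treated_risk = 1.0 / (1.0 + math.exp(-logit_treated))
    arr = baseline_risk - treated_risk
    nnt = 1.0 / arr if arr > 0 else math.inf
    return treated_risk, arr, nnt


# Hypothetical effect from the trials: odds ratio of 0.5.
EFFECT = math.log(0.5)

# Two hypothetical patients with different background risks.
risk_low, arr_low, nnt_low = absolute_benefit(0.02, EFFECT)
risk_high, arr_high, nnt_high = absolute_benefit(0.20, EFFECT)
print(f"background risk 2%:  ARR = {arr_low:.4f}, NNT = {nnt_low:.1f}")
print(f"background risk 20%: ARR = {arr_high:.4f}, NNT = {nnt_high:.1f}")
```

The same additive effect gives very different absolute benefits for the two patients. This is why the background risk has to be supplied at the point of decision-making rather than averaged away in the trial report.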
How we already use modelling, data
and additive scales
• Interspecies scaling
• Bioequivalence
– log relative bioavailability is additive but
difference in absolute bioavailability is not
• Dose proportionality
• Use of additive scales in phase III
– Log hazard
– Log-odds ratio
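The bioequivalence point can be made concrete with a short derivation, assuming a simple multiplicative model that is not stated on the slide. Here $Y_{ij}$ is the exposure (for example AUC) of subject $i$ on formulation $j$, $s_i$ is a hypothetical subject effect and $\phi_j$ a hypothetical formulation effect, with $T$ the test and $R$ the reference formulation.

```latex
\begin{align*}
Y_{ij} &= s_i \,\phi_j
  && \text{(assumed multiplicative model)} \\
\log Y_{iT} - \log Y_{iR} &= \log \phi_T - \log \phi_R
  && \text{(the same for every subject $i$)} \\
Y_{iT} - Y_{iR} &= s_i\,(\phi_T - \phi_R)
  && \text{(varies with the subject effect $s_i$)}
\end{align*}
```

Under this assumption the log relative bioavailability is a single number that transfers from subject to subject, whereas the absolute difference does not.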
Controlled Clinical Trials, 1989
Implications of the Lubsen-Tijssen Model
• We need to study treatment benefit on an
additive scale, disaggregated from harm
• We will need real world data on harms
• We will need real world data on background risk
• We will need models
• We will need cooperation between
– Medics and statisticians working on clinical trials
– Statisticians, epidemiologists, health economists,
medics and others working in real world data
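A minimal sketch of the kind of calculation these implications point to, assuming a common reading of the Lubsen-Tijssen model: benefit is proportional to background risk, while the rate of harm is constant across patients. The slide's figure is not reproduced in the text, so this reading, the function names and all numbers are assumptions. Benefit and harm events are also weighted equally here, which a real analysis would not take for granted.

```python
def net_benefit(baseline_risk, relative_risk_reduction, harm_rate):
    """Expected events avoided minus expected harms caused, per patient."""
    return baseline_risk * relative_risk_reduction - harm_rate


def treatment_threshold(relative_risk_reduction, harm_rate):
    """Background risk above which expected benefit exceeds expected harm."""
    if relative_risk_reduction <= 0:
        raise ValueError("relative_risk_reduction must be positive")
    return harm_rate / relative_risk_reduction


# Hypothetical inputs: the first would come from trials,
# the second from real world data on harms.
RRR = 0.6          # relative risk reduction, assumed constant
HARM_RATE = 0.006  # serious harms per patient-year, assumed constant

threshold = treatment_threshold(RRR, HARM_RATE)
print(f"treat when background risk exceeds {threshold:.3f} per year")
for risk in (0.005, 0.05):
    print(f"background risk {risk}: net benefit {net_benefit(risk, RRR, HARM_RATE):+.4f}")
```

The threshold depends on three quantities from different sources: the relative effect, the harm rate and the patient's background risk. Only the first is something a clinical trial is designed to estimate well.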
Example of Atrial Fibrillation
• Such patients are at
higher risk of stroke
• Meta-analysis
(reproduced in Hart et al
2007) concluded that
warfarin has a beneficial
protective effect
• But there is a risk of
intracranial bleeding
• Who should get warfarin?
So you have atrial fibrillation
• Should you take warfarin?
• What else do you need to know?
– The difference in risk taking warfarin or not
– The rate of side effects
– The consequences of side-effects
• These cannot be answered (alone) by analysis of
RCTs with pre-specified efficacy measures on the
additive scale
• The RCTs have to be translated and supplemented
by real world data
Estimate based on 6 v 3 cases only
The reimburser’s perspective
• What benefit and harm to the population will
accrue from recommending warfarin
prophylaxis?
• How much will it cost?
• How can its use be optimised?
• Who should get it?
Reimburser’s needs
Requirements
• A means of separating
patients by risk
• A means of establishing risk
distribution in the
population of patients
above any threshold chosen
• A means of determining
expected benefits and costs
Solutions
• These figures cannot be
delivered by clinical trials
alone but will require
– Cohort studies/case control
studies
– Health surveys
– Economic modelling
We need to model background risk
• The sort of data set we could use is that provided
by the UK Clinical Practice Research Datalink
(CPRD)
• Could use this to model
– A) Predictors of risk of stroke
– B) Distribution of risk levels in the population
• The former is relevant to individuals making
decisions
• The latter is relevant to reimbursers
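The reimburser's use of a risk distribution can be sketched as below. The distribution, threshold, effect size, harm rate and cost are all hypothetical. In practice the distribution would be estimated from a source such as CPRD and the relative effect from trials. The function name and the constant-harm assumption are made for illustration only.

```python
def population_impact(risk_distribution, threshold,
                      relative_risk_reduction, harm_rate, cost_per_patient):
    """Expected per-patient impact of treating everyone at or above a threshold.

    risk_distribution: list of (background risk, share of population) pairs
    whose shares sum to 1.
    """
    treated_share = 0.0
    events_avoided = 0.0
    for risk, share in risk_distribution:
        if risk >= threshold:
            treated_share += share
            events_avoided += share * risk * relative_risk_reduction
    return {
        "treated_share": treated_share,
        "events_avoided": events_avoided,
        "harms_caused": treated_share * harm_rate,
        "cost": treated_share * cost_per_patient,
    }


# Hypothetical population: half at 1% annual risk, 30% at 4%, 20% at 10%.
DISTRIBUTION = [(0.01, 0.5), (0.04, 0.3), (0.10, 0.2)]

impact = population_impact(DISTRIBUTION, threshold=0.03,
                           relative_risk_reduction=0.6,
                           harm_rate=0.006, cost_per_patient=100.0)
for name, value in impact.items():
    print(f"{name}: {value:.4f}")
```

Rerunning the calculation over a range of thresholds would show how benefit, harm and cost trade off, which is one way of approaching the question of who should get the treatment.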
Is there a Trust Problem?
• Yes
• Clinical trials provide a “template of trust”
whereby regulators can mandate sponsors to
provide the proof
• Modelling + real world data cannot provide these
guarantees
• But this is no excuse
– Whether or not you model, others will
– You need to know as much as possible about your
own drugs and where and when to use them
Finally
I leave you with this thought
Any damn fool can analyse a clinical trial and frequently does
But doing it properly involves skilful analysis,
understanding what the results mean requires intelligence,
insight and experience,
and applying the results intelligently needs more of the
same plus modelling and real world data
And to whom do we look to provide these skills?
Statisticians!
