Clinical trials & observational studies in the age of AI
Paul Agapow
VP of Data Science, Mitra Bio
paul@mitrabio.tech
Introduction & disclaimer
● Have been an immunologist, molecular evolutionist,
bioinformatician, database manager, analyst, data scientist,
epi-informatician, “computer guy” ...
● More recently, using AIML, stats & computation to derisk therapy
development
● Nothing in this presentation implies the existence of any project or
policy at any company
Portrait via ChatGPT
When we do epidemiology, what are we doing?
“Why this thing?”
“I saw a thing!”
How can we infer causality?
The virtuous application of codified
judgement, e.g. Bradford Hill criteria
The history of clinical study methodologies is
the history of attempts to infer causality
● Randomized Clinical Trials (RCTs)
● Observational studies / “natural trials”
● Patient matching
● Propensity scores
● Digital twins
Simple biology only helps with simple patients
● Maybe all the low-hanging fruit has been
picked
○ E.g. single gene / single system diseases
● Most diseases are complex & systemic
● Many patients are complex
● Lifestyle, exposure, co-morbidities,
co-medications
● A cohort is rarely just a simple table
Building a wolf-husky detector
A relevant anecdote on two axes
Ribeiro, Singh, Guestrin, (2016) Why should I trust you? Explaining the predictions
of any classifier, in Proceedings of the 22nd ACM SIGKDD international
conference on knowledge discovery and data mining
Besse, Castets-Renard, et al. (2018). Can Everyday AI Be Ethical? Machine
Learning Algorithm Fairness DOI:10.13140/RG.2.2.22973.31207
● Build an ML system to distinguish between
Wolves and Eskimo Dogs (huskies)
● Trained upon a set of photos
● Tested upon a further set
Getting the right answer the wrong way
Biomedicine is the kingdom of accidental associations
But we keep making this mistake
● Patients with Long COVID have low
Vitamin D levels
● Women who undergo HRT have fewer
heart attacks
● Best predictor of a heart attack is
being referred for a heart check
● Catching pneumonia “protects” you
from influenza
● Low cholesterol is linked to risk of
Alzheimer’s, but it is alleged that the
reverse may be true ...
These are obvious errors that we
caught. What aren’t we seeing?
Interpreting clinical trials is complicated
More than 400 clinical trials are run in the UK each year, more than 4000 in the European Economic Area. But finding and
attributing clinical difference in a study is difficult when patient populations are complex.
● 80% of trials have trouble finding enough patients
● 30% of trials fail due to insufficient patients
● 40% of trials return equivocal results
It is very easy to break a trial, thwart randomization,
introduce bias & lower power
● Trial subjects are rarely typical
subjects
● Unconvincing placebos & controls
● Treatment response heterogeneity
● Variation in site and investigator
performance
● Dropouts, treatment adherence
● Dose escalation / selection
● May be unclear if AI & digital
endpoints are clinically real or
associative ...
Example: WA-ATOM trial
Testing effectiveness of atropine for controlling myopia
● Demonstrated a significant effect at
the one-year mark
● However, a substantial dropout in
second year of trial: 10% in treatment
group and 25% in placebo group
● Likely related to the severity of
myopia - those with faster
progression being more likely to drop
out, particularly in the placebo group.
● This left the remaining subjects in
both groups with more stable
myopia, so the two-year results were
not statistically significant
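A minimal simulation sketch of how severity-related dropout can shrink an apparent effect. Nothing here is WA-ATOM data: the progression distribution, the effect of -0.3, the 1.0 threshold and the 50% dropout chance are all assumptions chosen for illustration, and `simulate_dropout` is a hypothetical helper.

```python
import random
from statistics import mean


def simulate_dropout(n_per_arm=20000, effect=-0.3, seed=0):
    """Compare the treatment effect in all subjects vs. completers only.

    All numbers are illustrative assumptions, not trial data.
    """
    rng = random.Random(seed)
    full, completers = {}, {}
    for arm, shift in (("placebo", 0.0), ("treatment", effect)):
        # Two-year myopia progression (arbitrary units); treatment slows it.
        progression = [rng.gauss(1.0, 0.3) + shift for _ in range(n_per_arm)]
        # Same dropout rule in both arms: fast progressors (> 1.0) leave
        # half the time. Placebo has more fast progressors, so it loses more.
        stayed = [p for p in progression if not (p > 1.0 and rng.random() < 0.5)]
        full[arm], completers[arm] = progression, stayed
    return {
        "full_diff": mean(full["treatment"]) - mean(full["placebo"]),
        "completer_diff": mean(completers["treatment"]) - mean(completers["placebo"]),
        "dropout": {arm: 1 - len(completers[arm]) / n_per_arm for arm in full},
    }


result = simulate_dropout()
print(result)
```

Under these assumptions the completers-only difference is smaller in magnitude than the difference across everyone randomized, because the placebo arm has selectively lost its fastest progressors. Fewer remaining subjects also means less power.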
AI is a powerful tool for solving complex data problems but ...
AI is also an increasing focus of scepticism & disappointment
Modern “AI” is essentially associative
● AI historically encompassed a wide
variety of approaches: rule-based,
symbolic reasoning, situated robotics
...
● However, the wild success of neural
nets has caused this approach to
dominate the field
● Neural networks, at their root, link
inputs to outputs, map one space to
another
Modern AI does not reason
Leading to models that get the right answer the
wrong way (or sometimes just the wrong
answer):
● Analysis of chest scans that detects whether
the patient is lying down
● Inferring the brand / type and hence
location of analysis equipment
● Molecular network analyses home in on
“hubs”
● Infectious disease prediction looking for
basketball news
● AI literature reviews don’t understand
historical evolution ...
“AI systems don’t mean things, they
say things”
To summarize
● Biomedicine is full of wolves & huskies,
accidental correlations
● Clinical trials are especially prone to these
spurious connections & biases
● Similarly, contemporary AIML is essentially
correlational, associative
● Combined, these are a recipe for disaster
Causal analysis / inference
Aims to infer probabilities more broadly, under conditions that are
changing: changes induced by treatments or external interventions,
or internal confounding. Essentially, everything that can’t be defined
from a joint probability distribution
It’s a way of imagining the same patient with a different exposure.
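One common way to write the "same patient, different exposure" idea is potential-outcomes notation. The symbols below (Y_i(1), Y_i(0), tau_i, do) are standard in the causal inference literature but do not appear in these slides, so treat them as an assumed notation.

```latex
% Individual treatment effect: the same patient i with exposure (1) and without (0).
\tau_i = Y_i(1) - Y_i(0)

% Only one of the two potential outcomes is ever observed for a given patient,
% so in practice we target averages such as the average treatment effect:
\mathrm{ATE} = \mathbb{E}\left[\, Y(1) - Y(0) \,\right]

% Intervening is not the same as observing: in general
P\left(Y \mid \mathrm{do}(X = x)\right) \neq P\left(Y \mid X = x\right)
```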
A formal model for causality
● How entities affect each other
● Seek to identify how exposures
(treatments / interventions)
affect outcomes
● Confounders influence both
the exposure and the outcome
● Interactions can be positive or
negative
● Can include reverse causality,
where outcome changes other
entities
A chain of epidemiological events
[Figure: causal diagram linking Body Fat, Blood Pressure, Allele rs877087, Allele rs10898815-A, dCCB dose and Heart Failure]
Derived from: Pharmacogenomics J. 2024 Apr 17;24(3):12
Complicated by trial realities
[Figure: the causal diagram extended with trial-level nodes. Patient nodes: Body Fat, Blood Pressure, Allele rs877087, Allele rs10898815-A, Exposure, Patient outcome data, Patient population demographics. Trial nodes: Geography, Investigator and Trial Site, Site Performance. Observed data: protocol deviations, milestones missed, recruitment rate]
What a DAG tells you
● A belief about how the world works
○ Possibly wrong ...
● How entities affect each other ... and
what things don’t affect each other
● What parameters we need to know
or estimate
● Turns it into a table of states over
many observations
● Turns it into a missing data problem
● Which can be resolved by a number
of statistical or AIML approaches
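A DAG is also just a small data structure. The sketch below stores each node's parents, checks that the graph really is acyclic, and lists the common causes of an exposure and an outcome as candidate confounders. The node names echo the heart failure example, but the edges are assumptions made for illustration (the arrows of the original figure are not reproduced here), and `is_acyclic`, `ancestors` and `common_causes` are hypothetical helpers. Taking shared ancestors is a simplification of the full back-door criterion. Requires Python 3.9+ for `graphlib`.

```python
from graphlib import CycleError, TopologicalSorter

# node -> set of direct causes (parents). Edges are ASSUMED, not from the paper.
PARENTS = {
    "body_fat": set(),
    "allele_rs877087": set(),
    "allele_rs10898815_A": set(),
    "blood_pressure": {"body_fat"},
    "dccb_dose": {"blood_pressure"},
    "heart_failure": {
        "blood_pressure",
        "dccb_dose",
        "allele_rs877087",
        "allele_rs10898815_A",
    },
}


def is_acyclic(parents):
    """True if the graph has no directed cycles."""
    try:
        TopologicalSorter(parents).prepare()
        return True
    except CycleError:
        return False


def ancestors(parents, node):
    """All direct and indirect causes of a node."""
    seen = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def common_causes(parents, exposure, outcome):
    """Nodes upstream of both exposure and outcome: candidate confounders."""
    return ancestors(parents, exposure) & ancestors(parents, outcome)


print(is_acyclic(PARENTS))
print(sorted(common_causes(PARENTS, "dccb_dose", "heart_failure")))
```

Just as informative are the edges that are absent: in this assumed graph neither allele affects dose, which is itself a refutable claim.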
It’s a missing data problem
Why not put this all in some massive multivariable / ML model?
● Causality considerably shrinks the search
space, through weak and strong beliefs
● Proper distinction of confounders and
non-confounders
● Causal models can more easily give us
individual effects, not population averages
● The proposed causal model is an
opinionated view of the world that can be
refuted
Causal inference can give us better analytics on clinical trials
● Or any interventional study
● As stated, the power and information
within a trial can be thwarted
● Trials are designed for statistical
positivity, can be difficult to extend to
real populations or extract other
information
● Important but hidden states like
non-compliance
● Allows us to possibly deal with
attrition
Example: treatment effects
● Possibly largest application of CI in
clinical trials
● Because there are so many things
that can obscure the effect
● Calculate the individual treatment
effect for each patient in a trial
● ... giving a road to personalised and
predictive selection of treatment,
direct translation of clinical results
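A truly individual effect cannot be observed, since no patient is both treated and untreated. A common stand-in is the conditional average effect among similar patients. The sketch below assumes a randomized trial with a single binary biomarker and invented effect sizes (2.0 in biomarker-positive patients, 0.0 otherwise); `simulate_trial` and `conditional_effects` are hypothetical helpers, and real analyses would use richer covariates and models.

```python
import random
from statistics import mean


def simulate_trial(n=10000, seed=1):
    """Simulated randomized trial; all parameters are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        biomarker = rng.random() < 0.5
        treated = rng.random() < 0.5          # randomization
        effect = 2.0 if biomarker else 0.0    # effect depends on the biomarker
        outcome = rng.gauss(0.0, 1.0) + (effect if treated else 0.0)
        rows.append((biomarker, treated, outcome))
    return rows


def average_effect(rows):
    """Population-average effect: treated mean minus control mean."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def conditional_effects(rows):
    """Average effect within each biomarker stratum."""
    return {
        level: average_effect([r for r in rows if r[0] == level])
        for level in (False, True)
    }


rows = simulate_trial()
print("average effect:", average_effect(rows))
print("by biomarker:", conditional_effects(rows))
```

The population average sits between the two strata and describes neither group well, which is the argument for estimating effects conditional on patient characteristics.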
Example: non-compliance
● Trials analyse based on “intent to
treat”, although many patients do
not adhere to treatment
● Study analysed trial of
immunosuppressive treatment of
MS, with a proxy capturing
non-compliance
● Used 4 models to try to capture
non-compliance; estimated efficacy
varied wildly between them
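The four models used in that study are not named here. As a stand-in, this sketch compares three textbook estimators on simulated data: intent-to-treat, per-protocol, and an instrumental-variable (Wald) estimate that uses randomization as the instrument. Everything is assumed for illustration: a true effect of 1.0 in those who take the drug, healthier patients being more likely to comply, and no access to treatment in the control arm. `simulate_noncompliance` and `estimates` are hypothetical helpers.

```python
import random
from statistics import mean


def simulate_noncompliance(n=40000, effect=1.0, seed=2):
    """Simulated trial with non-compliance; parameters are illustrative only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        health = rng.gauss(0.0, 1.0)                # unobserved prognosis
        assigned = rng.random() < 0.5               # randomized arm
        complier = rng.gauss(health, 1.0) > -0.3    # healthier -> more compliant
        received = assigned and complier            # controls cannot get the drug
        outcome = health + (effect if received else 0.0) + rng.gauss(0.0, 0.5)
        rows.append((assigned, received, outcome))
    return rows


def estimates(rows):
    y_assigned = [y for z, _, y in rows if z]
    y_control = [y for z, _, y in rows if not z]
    y_adherent = [y for z, d, y in rows if z and d]
    # Intent-to-treat: compare arms as randomized (diluted by non-compliers).
    itt = mean(y_assigned) - mean(y_control)
    # Per-protocol: adherent treated patients vs. controls (selection bias).
    per_protocol = mean(y_adherent) - mean(y_control)
    # Wald / IV: ITT scaled by uptake in the treatment arm
    # (uptake in the control arm is zero by construction).
    uptake = len(y_adherent) / len(y_assigned)
    return {"itt": itt, "per_protocol": per_protocol, "iv": itt / uptake}


print(estimates(simulate_noncompliance()))
```

With these assumptions the intent-to-treat estimate understates the effect of actually taking the drug, the per-protocol estimate overstates it because adherent patients were healthier to begin with, and the IV estimate lands close to the simulated value.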
Causal inference can give us better analytics on Real World Data
● All the problems of analysing a clinical
trial are magnified when working with
RWD
● By their very nature, often weird,
biased, and difficult to compare across
studies
● Allows us to possibly deal with
missing data under looser
assumptions
● RWD is very attractive
○ Cheap, fast
○ Fewer ethical issues
○ More “real”, more generalizable
Example: paediatric wheeze
● Observational studies propose that
giving paracetamol to infants is a
cause of later wheeze / asthma
● But viral infections may prompt
administration of paracetamol
● Causal and other studies have
weakened the link
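A small worked sketch of how this kind of confounding can manufacture an association. The counts are invented for illustration and are not from any wheeze study. They are built so that paracetamol has no effect within either stratum, and they assume that infection raises both the chance of receiving paracetamol and the risk of later wheeze. `naive_risk_difference` and `standardized_risk_difference` are hypothetical helpers; the second applies simple standardization over the infection strata.

```python
# HYPOTHETICAL counts: (infection, paracetamol) -> (n_infants, n_with_wheeze)
COHORT = {
    (True, True): (2400, 720),
    (True, False): (600, 180),
    (False, True): (1400, 140),
    (False, False): (5600, 560),
}


def risk(cells):
    """Pooled wheeze risk over a list of (n_infants, n_with_wheeze) cells."""
    return sum(w for _, w in cells) / sum(n for n, _ in cells)


def naive_risk_difference(cohort):
    """Exposed vs. unexposed risk, ignoring infection."""
    exposed = [cell for (_, para), cell in cohort.items() if para]
    unexposed = [cell for (_, para), cell in cohort.items() if not para]
    return risk(exposed) - risk(unexposed)


def standardized_risk_difference(cohort):
    """Stratum-specific differences, weighted by stratum size."""
    total = sum(n for n, _ in cohort.values())
    difference = 0.0
    for infection in (True, False):
        exposed = cohort[(infection, True)]
        unexposed = cohort[(infection, False)]
        weight = (exposed[0] + unexposed[0]) / total
        difference += weight * (risk([exposed]) - risk([unexposed]))
    return difference


print("naive:", naive_risk_difference(COHORT))
print("standardized:", standardized_risk_difference(COHORT))
```

Ignoring infection, paracetamol looks harmful. Once infection is accounted for, the difference disappears, because in these made-up counts it was never there.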
Example: vaccine efficacy
● Measuring treatment effect of rotavirus vaccine in
Belgian children
● Complicated by other children in household,
attendance at daycare / preschool, medical history,
breastfeeding ...
● Causal analysis shows overwhelming efficacy
What now?
This is just the start
Causal inference could also be used to:
● Improve estimates of actual efficacy /
survival analysis using other covariates
● Test trials where several treatments
are used in parallel or apart, or where
treatments are adapted or vary over time
● Think more broadly about - and more
broadly solve for - confounders
● Analyse operational factors
● Actually use trials to learn in a much
broader sense
Necessary assumptions & limitations
● You need to have the right “picture of the
world”
● Acyclic graph
● Independence from non-parent nodes, of
runs
● Unknown and unknowable assumptions &
hidden states
● Hidden confounders
● Multiple confounders acting on same
target, colliders
● Complex models
● Data
Because nothing comes for free
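Colliders deserve a concrete picture, since adjusting for the wrong variable can create an association instead of removing one. In this sketch two causes are independent by construction and both feed a third variable (the collider). Selecting patients on the collider, as happens when enrolment depends on severity, makes the two causes look related. All values are simulated assumptions, and `pearson` and `collider_demo` are hypothetical helpers.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def collider_demo(n=20000, seed=3):
    """Correlation of two independent causes, before and after selection."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)                  # first cause
        b = rng.gauss(0.0, 1.0)                  # second, independent cause
        collider = a + b + rng.gauss(0.0, 0.5)   # common effect of both
        rows.append((a, b, collider))
    everyone = pearson([a for a, _, _ in rows], [b for _, b, _ in rows])
    # Keep only subjects with a high collider value, e.g. the "enrolled".
    kept = [(a, b) for a, b, c in rows if c > 1.0]
    selected = pearson([a for a, _ in kept], [b for _, b in kept])
    return everyone, selected


everyone, selected = collider_demo()
print("all subjects:", everyone)
print("selected on collider:", selected)
```

In the full sample the correlation is near zero, while in the selected subset it is clearly negative. Which variables are safe to condition on is exactly what the causal graph is meant to tell you.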
A practical limitation
● It might be technically plausible and
sound.
● But regulators have to be open to it.
● And submissions have to start to use it.
● These are three different things
Regulatory use and approvals
Explanations are a smoke test
● Explicit way of stress-testing your model
and conclusions: “Does this make sense?”
○ If your intuition is good
● Explanations create trust
○ Rightly or wrongly
● But always good for hypothesis
generation
Knowledge of biology is still a bottleneck
● You can’t make a model of what you don’t
understand
● You can’t use “all the information”
● We still need to observe
● We still need to exercise judgement
● Epidemiological analysis is not just a data
problem
Further reading
Ashenden ed. (2021), Academic Press
agapow.substack.com
