Maarten van Smeden, PhD
AI & Big Data
Amsterdam UMC
June 15, 2022
Algorithm based medicine
https://bit.ly/2CwW43A
Forsting, J Nucl Med, 2017, DOI: 10.2967/jnumed.117.190397
Proportion of studies indexed in Medline with the Medical Subject
Heading (MeSH) term “Artificial Intelligence” divided by the total number
of publications per year.
Faes et al. doi: 10.3389/fdgth.2022.833912
Reviewer #2
https://bit.ly/2TOdd0F
https://bit.ly/2v2aokk
Tech company business model
https://bit.ly/2HSp8X5; https://bit.ly/2Z0Pfop; https://bit.ly/2KIcpHG; https://bit.ly/33IJhr9
Other success stories
https://go.nature.com/2VG2hS7; https://bbc.in/2Z1drXQ; https://bit.ly/2TAfRIP
IBM Watson winning Jeopardy! (2011)
https://bbc.in/2TMvV8I
IBM Watson for oncology
https://bit.ly/2LxiWGj
Machine learning everywhere
https://bit.ly/2ka0HLq; https://go.nature.com/33TQgO6; https://bit.ly/2kp6X23; https://bit.ly/2lZuKWt; https://bit.ly/2lI298g
Examples where “ML” has done well
Example: retinal disease
Gulshan et al, JAMA, 2016, 10.1001/jama.2016.17216; Picture retinopathy: https://bit.ly/2kB3X2w
Diabetic retinopathy
Deep learning (= neural network)
• 128,000 images
• Transfer learning (pre-initialization)
• Sensitivity and specificity > 0.90
• Estimated from training data
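The last bullet is the caveat: performance estimated on the same data used to fit the model is typically optimistic. A minimal sketch of that contrast, using synthetic data and a generic classifier (not the Gulshan et al. pipeline):

```python
# Minimal sketch: apparent (training data) vs. held-out sensitivity/specificity.
# Synthetic data and a generic classifier; NOT the Gulshan et al. pipeline.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.8, 0.2],
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, stratify=y,
                                          random_state=42)

model = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)

def sens_spec(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return tp / (tp + fn), tn / (tn + fp)

print("apparent (training data):", sens_spec(y_tr, model.predict(X_tr)))
print("held-out (test data):    ", sens_spec(y_te, model.predict(X_te)))
```

A flexible model will usually show near-perfect apparent sensitivity and specificity on its own training data, while the held-out estimates are noticeably lower.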
Example: lymph node metastases
Bejnordi et al, JAMA, 2017, doi: 10.1001/jama.2017.14585. See our letter to the editor for a critical discussion: https://bit.ly/2kcYS0e
Deep learning competition
But:
• 390 teams signed up, 23 submitted
• “Only” 270 images for training
• Test AUC range: 0.56 to 0.99
Primary outcome: time to TB treatment.
Time to TB treatment was lowered from a median of 11 days with standard of care to 1 day with computer-aided X-ray screening.
10.1016/j.cell.2020.01.021
Misconceptions about machine learning methods
https://bit.ly/38A1ng0
“Everything is an ML method”
https://bit.ly/2lEVn33
“ML methods are new computing methods”
• Artificial intelligence (since 1956)
• Machine learning (since 1959)
• Deep learning (since 1962)
• Reinforcement learning (since 1965)
Courtesy: Prof Daniel Oberski
“ML methods come from computer science”
https://bit.ly/2zhbwPv; https://stanford.io/2TVp1xK; https://stanford.io/2ZfED0k
Leo Breiman: CART, random forest; education: physics/math; job title: Professor of Statistics
Jerome H. Friedman: gradient boosting; education: physics; job title: Professor of Statistics
Trevor Hastie: The Elements of Statistical Learning; education: statistics; job title: Professor of Statistics
“ML methods for prediction, statistics for explaining”
ML and causal inference, a small selection¹:
• Super learner (e.g. van der Laan)
• High-dimensional propensity scores (e.g. Schneeweiss)
• Causal random forest (e.g. Wager)
• The Book of Why (Pearl)
¹ See further: Kreif and Diaz-Ordaz; https://bit.ly/2m1eYdK
Two cultures
Breiman, Stat Sci, 2001, DOI: 10.1214/ss/1009213726
Faes et al. doi: 10.3389/fdgth.2022.833912
Language
Robert Tibshirani: https://stanford.io/2zqEGfr
Machine learning: large grant = $1,000,000
Statistics: large grant = $50,000
ML refers to a culture, not to methods
Distinguishing between statistics and machine learning
• Substantial overlap in the methods used by both cultures
• Substantial overlap in analysis goals
• Attempts to separate the two frequently result in disagreement
Pragmatic approach:
I’ll use “ML” to refer to models roughly outside the traditional regression types of analysis: decision trees (and their descendants), SVMs, neural networks (including deep learning), boosting, etc.
Beam & Kohane, JAMA, 2018, doi: 10.1001/jama.2017.18391
Adversarial examples
https://bit.ly/2N4mQFo; https://bit.ly/2W7X9rF
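As a loose illustration of the idea (a hypothetical sketch, not taken from the linked examples): a small gradient-sign perturbation of an input can substantially change the output of a fitted classifier, here a plain logistic regression in numpy.

```python
# Hypothetical illustration of an adversarial (FGSM-style) perturbation against
# a simple logistic regression model; weights are made up, not a trained system.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=10)   # pretend these are the weights of a fitted model
b = 0.0
x = rng.normal(size=10)   # an input to perturb
y = 1                     # its true label

def predict_proba(v):
    """Probability of class 1 under the logistic model."""
    return 1.0 / (1.0 + np.exp(-(w @ v + b)))

# For logistic regression with log-loss, the gradient of the loss w.r.t. the
# input is (p - y) * w.
grad_x = (predict_proba(x) - y) * w

# Take a small step in the direction that increases the loss (sign of the gradient).
epsilon = 0.5
x_adv = x + epsilon * np.sign(grad_x)

print("original prediction:   ", predict_proba(x))
print("adversarial prediction:", predict_proba(x_adv))
```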
“Class imbalance”
DOI: 10.1371/journal.pone.0158285
Potentially harmful idea: imbalance correction
• Imbalance between “events” and “non-events” is often considered problematic
• Examples exist where >99% of “non-events” were discarded to “correct” for imbalance
• This can have very bad consequences
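A minimal sketch of why this can backfire, using synthetic data and a plain logistic regression (illustrative only, not from a specific study): discarding non-events to force a 50/50 balance makes the event look far more common than it is, so predicted risks come out systematically too high even when discrimination barely changes.

```python
# Illustrative sketch: discarding non-events to "correct" imbalance ruins calibration.
# Synthetic data and a plain logistic regression; not from a specific study.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20000, n_features=10, weights=[0.95, 0.05],
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, stratify=y,
                                          random_state=1)

# Model fitted on the original (imbalanced) training data
full = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# "Corrected" model: throw away most non-events to force a 50/50 class balance
rng = np.random.default_rng(1)
events = np.where(y_tr == 1)[0]
non_events = np.where(y_tr == 0)[0]
keep = np.concatenate([events, rng.choice(non_events, size=len(events), replace=False)])
balanced = LogisticRegression(max_iter=1000).fit(X_tr[keep], y_tr[keep])

print("observed event rate (test):         ", y_te.mean())
print("mean predicted risk, original model:", full.predict_proba(X_te)[:, 1].mean())
print("mean predicted risk, balanced model:", balanced.predict_proba(X_te)[:, 1].mean())
```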
Recidivism Algorithm
ProPublica (2016) https://bit.ly/1XMKh5R
Skin cancer and rulers
Esteva et al., Nature, 2017, DOI: 10.1038/nature21056; https://bit.ly/2lE0vV0
Predicting mortality – the conclusion
PLOS ONE, 2018, DOI: 10.1371/journal.pone.0202344
Predicting mortality – the results
PLOS ONE, 2018, DOI: 10.1371/journal.pone.0202344
Predicting mortality – the media
PLOS ONE, 2018, DOI: 10.1371/journal.pone.0202344; https://bit.ly/2Q6H41R; https://bit.ly/2m3RLrn
HYPE!
“As of today, we have deployed the system in 16 hospitals, and
it is performing over 1,300 screenings per day”
medRxiv pre-print only, 23 March 2020, doi.org/10.1101/2020.03.19.20039354
FDA APPROVED
• Published on 7 April 2020
• 18 days between idea and article acceptance (sprint)
• Invited by BMJ as the first ever living review (marathon)
Living review (update 3)
doi: 10.1136/bmj.m1328
Living review (update 3)
doi: 10.1136/bmj.m1328
Living review (update 3)
Risk of bias assessment using PROBAST tool: https://www.probast.org/
doi: 10.1136/bmj.m1328
Clinical prediction modeling
DOI: 10.1016/j.jclinepi.2021.01.009
Living review (update 3)
doi: 10.1136/bmj.m1328
Living review (update 3)
Risk of bias assessment using PROBAST tool: https://www.probast.org/
doi: 10.1136/bmj.m1328
Systematic review clinical prediction models
Christodoulou et al. Journal of Clinical Epidemiology, 2019, doi: 10.1016/j.jclinepi.2019.02.004
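The review's title reports no performance benefit of machine learning over logistic regression for clinical prediction models. A toy sketch of the kind of head-to-head comparison such reviews aggregate, on synthetic tabular data with generic default models (not a re-analysis of the review):

```python
# Toy head-to-head comparison on synthetic tabular data: logistic regression vs.
# a flexible ML model, both with default settings. Not a re-analysis of the review.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=15, n_informative=5,
                           weights=[0.8, 0.2], random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, stratify=y,
                                          random_state=3)

for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("random forest", RandomForestClassifier(random_state=3))]:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```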
Sources of prediction error
Y = f(x) + ε, with E(ε) = 0, var(ε) = σ², and the values in x not random.
For a model k the expected test prediction error is:
σ² + bias²(f̂ₖ(x)) + var(f̂ₖ(x))
where σ² is the irreducible error (“what we don’t model”) and bias² + var together form the mean squared prediction error (“how we model”).
See equation 2.46 in Hastie et al., The Elements of Statistical Learning, https://stanford.io/2voWjra
[Figure: example model fits illustrating overfitting, underfitting, and “just right”]
In words, the two main components of error in predictions are:
• Mean squared prediction error: under the control of the modeler
• Irreducible error: not under the direct control of the modeler
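A minimal simulation sketch of this decomposition (my own synthetic example, not from the talk): repeatedly refit a model on fresh training samples, estimate bias² and variance of its prediction at a fixed x, and check that σ² + bias² + var matches the observed test error.

```python
# Illustrative simulation of the decomposition at a fixed test point x0:
# expected squared test error ≈ sigma^2 + bias^2(f_hat(x0)) + var(f_hat(x0)).
# Synthetic example; not taken from the talk.
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
x0 = 0.3  # fixed test point

def true_f(x):
    return np.sin(2 * np.pi * x)

def fit_and_predict(n=50, degree=3):
    """Fit a polynomial on one fresh training sample and predict at x0."""
    x = rng.uniform(0, 1, n)
    y = true_f(x) + rng.normal(0, sigma, n)
    coefs = np.polyfit(x, y, degree)
    return np.polyval(coefs, x0)

preds = np.array([fit_and_predict() for _ in range(5000)])
bias2 = (preds.mean() - true_f(x0)) ** 2
var = preds.var()

# Direct estimate of the expected squared test error at x0
y0 = true_f(x0) + rng.normal(0, sigma, preds.size)  # fresh noisy outcomes at x0
test_error = ((y0 - preds) ** 2).mean()

print("sigma^2 + bias^2 + var =", sigma**2 + bias2 + var)
print("observed test error    =", test_error)
```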
Bias-variance trade-off
[Figure, with the irreducible error indicated]
Irreducible error is often large
• Health, and the lack thereof, is complex to measure (‘no gold standard’)
• Predictors of disease are often imperfectly and only partly measured
• We often don’t know all the causal mechanisms at play
  • It is much easier to predict if you know the causal mechanisms!
• “Prediction is very difficult, especially if it’s about the future!” (Niels Bohr might have said this first)
Courtesy of Cecile Janssens: https://bit.ly/2Jf5ft6
What can we do to reduce “irreducible” error?
• Changing the information used:
  • Prognostication by text mining electronic health records, e.g. predicting life expectancy: https://bit.ly/2k8Ao8e
  • Analyzing social media posts, e.g. pharmacovigilance and adverse-event monitoring via Twitter posts: https://bit.ly/2m0KKrg
  • Speech signal processing, e.g. Parkinson’s disease: https://bit.ly/2v3ZdHR
  • Medical imaging
Bias-variance trade-off revisited: double descent
But…
Flexible algorithms are data hungry
From slide deck Ben van Calster: https://bit.ly/38Aqmjs
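One way to see this, as a hedged sketch with synthetic data and a generic flexible model (not from Ben van Calster's slides): compute a learning curve and note how much data the model needs before its cross-validated performance stabilizes.

```python
# Sketch: cross-validated performance of a flexible model vs. training set size.
# Synthetic data and a generic gradient boosting model; purely illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=4000, n_features=30, n_informative=10,
                           weights=[0.8, 0.2], random_state=7)

train_sizes, train_scores, test_scores = learning_curve(
    GradientBoostingClassifier(random_state=7), X, y,
    train_sizes=np.linspace(0.05, 1.0, 8), cv=5, scoring="roc_auc")

for n, tr, te in zip(train_sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)):
    print(f"n_train={int(n):5d}  train AUC={tr:.3f}  cv AUC={te:.3f}")
```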
Flexible algorithms are energy hungry
The costs of running (cloud computing) the Transformer algorithm are estimated at 1 to 3 million dollars
https://bit.ly/33Dj38X
Where do we go from here?
Sample size for development and validation
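As a hedged, minimal example of the kind of criterion such guidance involves (this is only one of several published criteria, namely the precision of the estimated overall outcome proportion; the function name is my own):

```python
# One simple sample size criterion for prediction model studies: enough participants
# to estimate the overall outcome proportion with a chosen precision. This is only
# one of several published criteria; the function name is my own.
import math

def n_for_outcome_proportion(expected_prevalence: float, margin: float,
                             z: float = 1.96) -> int:
    """Sample size so a 95% CI for the outcome proportion has half-width <= margin."""
    p = expected_prevalence
    return math.ceil((z / margin) ** 2 * p * (1 - p))

# Example: expected event rate 10%, margin of error of 2 percentage points
print(n_for_outcome_proportion(0.10, 0.02))  # 865
```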
Reporting
Pipeline of algorithmic medicine failure
Van Royen et al, ERJ, in press
Regulation
source: https://tinyurl.com/2yjtkz3h
https://www.leidraad-ai.nl/
What is the guideline?
• The guideline describes what the field considers good professional conduct in the development, testing and implementation of an Artificial Intelligence Prediction Algorithm (AIPA) in the medical sector, including public healthcare.
• It is part of the professional standard of the professionals involved.
• The guideline is not legally binding, but it is relevant because it describes what is considered good professional conduct.
Comply or explain
• The guideline distinguishes between requirements and recommendations for good professional conduct.
• Requirements are indicated by “mandatory”; recommendations are indicated by “(strongly) recommended”.
• Use of the field standard presupposes a comply-or-explain approach.
Target group
[Diagram: target groups of the guideline, covering developing AI, applying AI and assessing AI, plus society. Roles mentioned include healthcare provider, professional/scientific/medical associations, education/training, IT suppliers, citizen, validator, responsible developer, researcher, data manager, data supplier, (internal) supervisor, notified body, peer reviewer, privacy officer, insurer, patient(s) (associations), interest parties, political parties, and interested citizens.]
Phases of the guideline
• Phase 1: Collection and management of the data
• Phase 2: Development of the AIPA
• Phase 3: Validation of the AIPA
• Phase 4: Development of the required software
• Phase 5: Impact assessment of the AIPA in combination with the software
• Phase 6: Implementation and use of the AIPA with software in daily practice
Saskia Haitjema, Andre Dekker, Paul Algra, Amy Eikelenboom, Christian van Ginkel, Martine de Vries, Daniel Oberski, Desy Kakiay, Kicky van Leeuwen, Joran Lokkerbol, Evangelos Kanoulas, Gabrielle Davelaar, Wouter Veldhuis, Bart-Jan Verhoeff, Vincent Stirler, Daan van den Donk, Huib Burger, Giovanni Cina, Martijn van der Meulen, Maurits Kaptein, Floor van Leeuwen, Egge van der Poel, Marcel Hilgersom, Sade Faneyte, Jonas Teuwen, Teus Kappen, Ewout Steyerberg, Leo Hovestadt, René Drost, Bart Geerts, Anne de Hond, René Verhaart, Nynke Breimer, Karen Wiegant, Laure Wynants, Lysette Meuleman
https://www.leidraad-ai.nl/
DOI: 10.1093/eurheartj/ehac238
Concluding…
Algorithm based medicine
• Algorithms are high maintenance
  • Developed models need repeated testing and updating to remain useful over time and place
• Many new barriers: black-box proprietary algorithms, computing costs
• Regulation and quality control of algorithms
• Algorithms need testing, preferably in an experimental fashion
Old statistics wine in new machine learning bottles?
Lots of…
• Hype
• Rebranding traditional analysis as ML and AI
• Methodological reinventions
• Traditional issues such as low sample size, lack of adequate validation, poor reporting
Also, real developments in…
• Methods and architectures, allowing for modeling (unstructured) data that could previously not easily be used
• Software
• Computing power
• Clinical trials showing benefit of AI assistance
Email: M.vanSmeden@umcutrecht.nl
Twitter: @MaartenvSmeden
