Maarten van Smeden, PhD
Predictive Analytics course
Den Haag
21 feb 2023
Rage Against The Machine Learning
Den Haag, 21 Feb 2022 Twitter: @MaartenvSmeden
Terminology
In medical research, “artificial intelligence” usually
just means “machine learning” or “algorithm”
https://bit.ly/2CwW43A
Proportion of studies indexed in Medline with the Medical Subject
Heading (MeSH) term “Artificial Intelligence” divided by the total number
of publications per year.
Faes et al. doi: 10.3389/fdgth.2022.833912
Reviewer #2
https://bit.ly/2TOdd0F
Forsting, J Nuc Med, 2017, DOI: 10.2967/jnumed.117.190397
https://bit.ly/2v2aokk
Tech company business model
https://bit.ly/2HSp8X5; https://bit.ly/2Z0Pfop; https://bit.ly/2KIcpHG; https://bit.ly/33IJhr9
Other success stories
https://go.nature.com/2VG2hS7; https://bbc.in/2Z1drXQ; https://bit.ly/2TAfRIP
IBM Watson winning Jeopardy! (2011)
https://bbc.in/2TMvV8I
IBM Watson for oncology
https://bit.ly/2LxiWGj
Machine learning everywhere
https://bit.ly/2ka0HLq; https://go.nature.com/33TQgO6; https://bit.ly/2kp6X23; https://bit.ly/2lZuKWt; https://bit.ly/2lI298g
“As of today, we have deployed the system in 16 hospitals, and
it is performing over 1,300 screenings per day”
MedRxiv pre-print only, 23 March 2020,
doi.org/10.1101/2020.03.19.20039354
FDA APPROVED
Living review (update 4)
doi: 10.1136/bmj.m1328
doi: 10.1136/bmj-2021-069881
https://twitter.com/AndrewLBeam/status/1620855064033382401?s=20&t=VO9_LdFFCj_wcwIQLvKcIQ
Source: Ilse Kant (UMC Utrecht)
What are these machine learning methods?
https://bit.ly/38A1ng0
“Everything is an ML method”
https://bit.ly/2lEVn33
“ML methods come from computer science”
https://bit.ly/2zhbwPv; https://stanford.io/2TVp1xK; https://stanford.io/2ZfED0k
            Leo Breiman               Jerome H Friedman         Trevor Hastie
            CART, random forest       Gradient boosting         Elements of statistical learning
Education   Physics/Math              Physics                   Statistics
Job title   Professor of Statistics   Professor of Statistics   Professor of Statistics
“ML methods for prediction, statistics for explaining”
ML and causal inference, small selection (1):
• Superlearner (e.g. van der Laan)
• High dimensional propensity scores (e.g. Schneeweiss)
• The Book of Why (Pearl)
(1) See further: Kreif and DiazOrdaz; https://bit.ly/2m1eYdK
Two cultures
Breiman, Stat Sci, 2001, DOI: 10.1214/ss/1009213726
Faes et al. doi: 10.3389/fdgth.2022.833912
Language
Robert Tibshirani: https://stanford.io/2zqEGfr
Machine learning: large grant = $1,000,000
Statistics: large grant = $50,000
ML refers to a culture, not to methods
Distinguishing between statistics and machine learning
• Substantial overlap in the methods used by both cultures
• Substantial overlap in analysis goals
• Attempts to separate the two frequently result in disagreement
Pragmatic approach:
I’ll use “ML” to refer to models roughly outside of the traditional regression
types of analysis: decision trees (and descendants), SVMs, neural networks
(including Deep learning), boosting etc.
Beam & Kohane, JAMA, 2018, doi : 10.1001/jama.2017.18391
Examples where
“ML” has done well
Example: retinal disease
Gulshan et al, JAMA, 2016, 10.1001/jama.2016.17216; Picture retinopathy: https://bit.ly/2kB3X2w
Diabetic retinopathy
Deep learning (= Neural network)
• 128,000 images
• Transfer learning (preinitialization)
• Sensitivity and specificity > .90
• Estimated from training data
Example: lymph node metastases
Bejnordi et al, JAMA, 2018, doi: 10.1001/jama.2017.14585. See our letter to the editor for a critical discussion: https://bit.ly/2kcYS0e
Deep learning competition
But:
• 390 teams signed up, 23 submitted
• “Only” 270 images for training
• Test AUC range: 0.56 to 0.99
Primary outcome: time to TB treatment.
Time to TB treatment lowered from a median of 11 days in
standard of care to 1 day with computer aided X-ray screening
doi: 10.1016/j.cell.2020.01.021
Examples where
“ML” has done poorly
https://tinyurl.com/3knkuzs3
Adversarial examples
https://bit.ly/2N4mQFo; https://bit.ly/2W7X9rF
Recidivism Algorithm
ProPublica (2016) https://bit.ly/1XMKh5R
Skin cancer and rulers
Esteva et al., Nature, 2017, DOI: 10.1038/nature21056; https://bit.ly/2lE0vV0
Predicting mortality – the conclusion
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344
Predicting mortality – the results
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344
Predicting mortality – the media
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344; https://bit.ly/2Q6H41R; https://bit.ly/2m3RLrn
HYPE!
Systematic review clinical prediction models
Christodoulou et al. Journal of Clinical Epidemiology, 2019, doi: 10.1016/j.jclinepi.2019.02.004
Sources of prediction error
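$Y = f(x) + \varepsilon$
(with $E(\varepsilon) = 0$, $\mathrm{var}(\varepsilon) = \sigma^2$, values in $x$ are not random)
For a model $k$ the expected test prediction error is:
$\sigma^2 + \mathrm{bias}^2(\hat{f}_k(x)) + \mathrm{var}(\hat{f}_k(x))$
• $\sigma^2$: irreducible error ≈ what we don’t model
• $\mathrm{bias}^2(\hat{f}_k(x)) + \mathrm{var}(\hat{f}_k(x))$: mean squared prediction error ≈ how we model
See equation 2.46 in Hastie et al., The Elements of Statistical Learning, https://stanford.io/2voWjra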
Figure panels: overfitting, underfitting, “just right”
In words, two main components for error in predictions are:
• Mean squared prediction error
  • Under control of the modeler
• Irreducible error
  • Not under direct control of the modeler
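The decomposition above can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes a hypothetical true function f(x) = sin(2πx), a fixed grid of 50 x values, noise with σ = 0.5, and a k-nearest-neighbour average as the model (the setting of equation 2.46 in Hastie et al.). All function names and parameter values are illustrative assumptions.

```python
import math
import random
import statistics


def true_f(x):
    # Hypothetical "true" signal, chosen only for illustration.
    return math.sin(2 * math.pi * x)


def knn_predict(xs, ys, x0, k):
    # Average the outcomes of the k training points closest to x0.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


def decompose(k, x0=0.25, n=50, sigma=0.5, n_sim=2000, seed=1):
    """Monte Carlo estimate of the error components at a single point x0."""
    rng = random.Random(seed)
    xs = [i / (n - 1) for i in range(n)]  # fixed design: x is not random
    preds, sq_errors = [], []
    for _ in range(n_sim):
        # New training outcomes on the same x values in every repetition.
        ys = [true_f(x) + rng.gauss(0, sigma) for x in xs]
        pred = knn_predict(xs, ys, x0, k)
        # One new test observation at x0.
        y_new = true_f(x0) + rng.gauss(0, sigma)
        preds.append(pred)
        sq_errors.append((y_new - pred) ** 2)
    return {
        "k": k,
        "irreducible": sigma ** 2,
        "bias_sq": (statistics.fmean(preds) - true_f(x0)) ** 2,
        "variance": statistics.pvariance(preds),
        "simulated_error": statistics.fmean(sq_errors),
    }


if __name__ == "__main__":
    for k in (1, 5, 25):
        r = decompose(k)
        total = r["irreducible"] + r["bias_sq"] + r["variance"]
        print(f"k={k:2d} bias^2={r['bias_sq']:.3f} var={r['variance']:.3f} "
              f"sum={total:.3f} simulated={r['simulated_error']:.3f}")
```

In this sketch a small k gives a large variance term and a large k gives a large bias term, while the irreducible term stays at σ² whatever k is chosen.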
Bias-variance trade-off
Irreducible error is often large
• Health, and the lack thereof, is complex to measure (‘no gold standard’)
• Predictors of diseases are often imperfectly and only partly measured
• We often don’t know all the causal mechanisms at play
  • Much easier to predict if you know the causal mechanisms!
• “Prediction is very difficult, especially if it’s about the future!” (Niels Bohr might have said this first)
Courtesy Cecile Janssens: https://bit.ly/2Jf5ft6
What can we do to reduce “irreducible” error?
• Changing the information
• Prognostication by text mining electronic health records
• e.g. predicting life expectancy
https://bit.ly/2k8Ao8e
• Analyzing social media posts
• e.g. pharmacovigilance, adverse events monitoring via Twitter posts
https://bit.ly/2m0KKrg
• Speech signal processing
• e.g. Parkinson’s disease
https://bit.ly/2v3ZdHR
• Medical imaging
Bias-variance trade-off revisited: double descent
But…
Flexible algorithms are data hungry
From slide deck Ben van Calster: https://bit.ly/38Aqmjs
Flexible algorithms are energy hungry
The cloud computing costs of running the Transformer algorithm are estimated at 1 to 3 million dollars
https://bit.ly/33Dj38X
Algorithm based medicine
• Algorithms are high maintenance
• Developed models need repeated testing and updating to
remain useful over time and place
• Many new barriers: black box proprietary algorithms,
computing costs
• Regulation and quality control of algorithms
• Algorithms need testing, preferably in experimental fashion
https://twitter.com/DrHughHarvey/status/1230218991026819077
Old statistics wine in new machine learning bottles?
Lots of…
• Hype
• Rebranding traditional analysis as ML and AI
• Methodological reinventions
• Traditional issues such as low sample size, lack of adequate
validation, poor reporting
Also, real developments in…
• Methods and architectures, allowing for modeling (unstructured)
data that could previously not easily be used
• Software
• Computing power
• Clinical trials showing benefit of AI assistance
Source: Ilse Kant (UMC Utrecht), adapted from doi: 10.1080/08956308.1997.11671126
From research AI model to implemented AI application, innovation is ….
3000 ideas → 100 explorations → 10 well defined projects → 2 launches → 1 success
AI/ML MODELS USED IN PRACTICE
AI/ML MODELS THAT WILL NEVER BE USED IN PRACTICE
Pipeline of algorithmic medicine failure
Van Royen et al, ERJ, 2022, doi:10.1183/13993003.00250-2022, also credits to Laure Wynants
Utopia
Courtesy Anna Lohmann
“SOMETHING USEFUL”
Multivariable model
AI/ML models are…
• Expensive
• Not one-size-fits-all
• Many alternatives usually available
• Need crash testing (“impact”)
• Require regular MOT (“validation”)
• Require regular maintenance (“updating”)
• Require people to be trained how to operate them
• Can be dangerous when wrongly used
• Regulations apply
Step 2: from review to national guideline
www.leidraad-ai.nl
The guideline for diagnostic/prognostic applications
• What the healthcare field considers good professional conduct in the development, testing and implementation of AI-based prediction models in the medical sector, including public healthcare.
• Starting point: stakeholder opinions and review
• Use of the guideline can (hopefully) improve quality and lower
costs of healthcare
• Guideline is not legally binding
https://www.leidraad-ai.nl/
Email: M.vanSmeden@umcutrecht.nl
Twitter: @MaartenvSmeden