Oct 21, 2022
Statistics and machine learning:
friends or foes?
Ewout W. Steyerberg, PhD
Professor of Clinical Biostatistics and
Medical Decision Making
Dept of Biomedical Data Sciences
Leiden University Medical Center
Thanks to many, including Ben van Calster, Leuven;
Maarten van Smeden, Utrecht
Statistics and machine learning:
friends or foes?
• Introduction for debate
• Friction points: foes
• Commonalities between statistics and ML: friends
Statistics and Machine Learning (ML)
In medical research, “artificial intelligence” usually just means “machine learning” or
“algorithm”
Machine learning in medical research
Machine learning and AI everywhere
IBM Watson winning Jeopardy! (2011)
Dr Watson
Dr Watson lessons
Dr Watson lesson 1
Dr Watson lesson 2
Dr Watson lesson 3
Friction points between statistics and ML: foes
1. ML claims to be new and to supersede statistics
2. ML claims any data is relevant
3. ML makes promises it cannot keep
1. ML claims to be new and to supersede statistics
Reviewer comment
“Everything is an ML method”
Language

Statistics                  | Machine learning
----------------------------|---------------------------
Covariates                  | Features
Outcome variable            | Target
Model                       | Network, graphs
Parameters                  | Weights
Model for discrete var.     | Classifier
Model for continuous var.   | Regression
Log-likelihood              | Cross-entropy loss
Multinomial regression      | Softmax
Measurement error           | Noise
Subject / observation       | Sample / instance
Dummy coding                | One-hot encoding
Measurement invariance      | Concept drift
Prediction                  | Supervised learning
Latent variable modeling    | Unsupervised learning
Fitting                     | Learning
Prediction error            | Error
Sensitivity                 | Recall
Positive predictive value   | Precision
Contingency table           | Confusion matrix
Measurement error model     | Noise-aware ML
Structural equation model   | Gaussian Bayesian network
Gold standard               | Ground truth
Derivation – validation     | Training – test
Experiment                  | A/B test

https://www.analyticsvidhya.com/glossary-of-common-statistics-and-machine-learning-terms/
https://developers.google.com/machine-learning/glossary
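To make the overlap concrete, here is a minimal sketch (Python with scikit-learn on toy data; not part of the original slides) showing that "recall" and "precision" are simply sensitivity and positive predictive value computed from the same 2×2 contingency (confusion) matrix:

```python
# Minimal sketch: 'recall' and 'precision' are sensitivity and PPV
# computed from the same 2x2 contingency (confusion) matrix.
import numpy as np
from sklearn.metrics import recall_score, precision_score, confusion_matrix

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])  # observed outcome
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])  # classified outcome

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)   # statistics: sensitivity
ppv = tp / (tp + fp)           # statistics: positive predictive value

print(sensitivity, recall_score(y_true, y_pred))      # identical numbers
print(ppv, precision_score(y_true, y_pred))           # identical numbers
```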
Where to place Machine Learning?
https://codeburst.io/statistics-a-machine-learning-essential-ee537695786b
1. ML claims to be new and to supersede statistics
ML has developed from statistics
ML as part of statistics
Statistics as part of ML
ML in practice: models roughly outside the traditional regression family of analysis:
• Decision trees (and descendants such as random forests and XGBoost)
• Support vector machines (SVMs)
• Neural networks (including deep learning)
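As an illustration of how little the workflow changes, a minimal sketch (Python with scikit-learn on simulated data; model choices are illustrative, not a recommendation) fits logistic regression and the three "ML" model families above through the same fit / predict / validate steps:

```python
# Minimal sketch: logistic regression and typical 'ML' learners share
# the same fit / predict / validate workflow (scikit-learn, toy data).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X_dev, X_val, y_dev, y_val = train_test_split(X, y, random_state=1)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree":       DecisionTreeClassifier(max_depth=4),
    "SVM":                 SVC(probability=True),
    "neural network":      MLPClassifier(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_dev, y_dev)                 # 'training' = fitting
    p = model.predict_proba(X_val)[:, 1]    # predicted risks on validation data
    print(f"{name}: c-statistic (AUC) = {roc_auc_score(y_val, p):.3f}")
```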
2. ML claims any data is relevant
Typical context: Electronic Health Records (EHR); large administrative data sets
Uncover patterns in data that are there but remained hidden
Strong points of EHR data: large N, large sets of features
Weak points of EHR data: ‘quality’
- Selection of patients
- Start point definition
- End point definition
- Selective measurement
- Missing values
- …
More data is better? Lessons from meta-analysis
Meta-analysis practice:
- Risk of bias assessment
- Respect the clustered nature of the data
Example meta-analysis: Personal protective equipment for preventing highly infectious diseases
Big Data, Big Errors
3. ML makes promises it cannot keep
“Uncover patterns in data that are there but remained hidden”
Unsupervised learning
- Clustering is unstable and determined by the optimization criterion (see the sketch after this list)
Supervised learning
- Claim: trees / neural networks predict better than regression
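A minimal sketch of the clustering instability mentioned above (Python with scikit-learn on pure-noise data; settings are illustrative): two k-means runs that differ only in their random start can produce clearly different "clusters":

```python
# Minimal sketch: cluster labels from k-means depend on the random start
# and on the pre-specified k when the data have no clear cluster structure.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))   # pure noise: no true clusters

labels_a = KMeans(n_clusters=3, n_init=1, random_state=1).fit_predict(X)
labels_b = KMeans(n_clusters=3, n_init=1, random_state=2).fit_predict(X)

# Agreement between the two 'solutions' (1 = identical, ~0 = chance level);
# on noise data this is typically well below 1.
print("agreement across random starts:", adjusted_rand_score(labels_a, labels_b))
```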
Supervised learning example
Example from Maarten van Smeden, Utrecht; @MaartenvSmeden
Predicting mortality – the media
Findings not convincing:
- Cox (#4, 30 variables): max c = 0.793
- Random forest (#7, 600 variables): c = 0.797
- Elastic net (#9, 600 variables): c = 0.801
Random forest (RF) showed poor calibration
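A minimal sketch of how such a calibration check can be done (Python with scikit-learn on simulated data, not the study data): plot the observed event rate against the mean predicted risk per risk group.

```python
# Minimal sketch: checking calibration of predicted risks on held-out data
# (random forest on simulated data; grouped observed vs mean predicted risk).
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import calibration_curve

X, y = make_classification(n_samples=3000, n_features=20, random_state=1)
X_dev, X_val, y_dev, y_val = train_test_split(X, y, random_state=1)

rf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_dev, y_dev)
p_val = rf.predict_proba(X_val)[:, 1]                    # predicted risks

obs, pred = calibration_curve(y_val, p_val, n_bins=10)   # grouped calibration

plt.plot(pred, obs, marker="o", label="random forest")
plt.plot([0, 1], [0, 1], linestyle="--", label="ideal calibration")
plt.xlabel("mean predicted risk")
plt.ylabel("observed event rate")
plt.legend()
plt.show()
```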
Machine learning vs conventional modeling
“We found that random forests did not outperform Cox models despite their inherent ability to
accommodate nonlinearities and interactions. …
Elastic nets achieved the highest discrimination performance …, demonstrating
the ability of regularisation to select relevant variables and optimise model coefficients in an EHR context.”
Systematic review on ML vs classic modeling
Differences in discrimination
Commonalities between statistics and ML: friends
4. Research question is key
5. Complex data structures require innovative approaches
6. Some problems are really hard
4. Research question is key
From easy to hard questions
- Exploratory / descriptive
- Prediction / classification
- Causal
4. Research questions
Separate the question types:
- Exploratory: data mining
  “Enjoy the results, because you will never see these results again”
- Descriptive: patterns in the data to learn about nature; hypothesis generating; biomarker–disease associations
  ML provides more flexibility, but less interpretability?
- Prediction: machine learning / trees often show poor performance
  ML may provide benefits in specific circumstances?
Van der Ploeg et al. BMC Med Res Methodol 2014;14:137.
ML good for prediction?
Large N, small p
“Natural flexibility”?
Versus non-linear terms / interactions in regression?
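A minimal sketch of that comparison (Python with scikit-learn on simulated data with a non-linear signal; settings are illustrative): spline terms give a logistic regression much of the flexibility usually attributed to ML.

```python
# Minimal sketch: regression can be made flexible too, here with spline
# expansions of the predictors before a logistic regression.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))
# outcome depends non-linearly on the first predictor
logit = 1.5 * np.sin(2 * X[:, 0]) + 0.5 * X[:, 1]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

linear = LogisticRegression(max_iter=1000)
splines = make_pipeline(SplineTransformer(n_knots=5), LogisticRegression(max_iter=1000))
boosted = GradientBoostingClassifier()

for name, model in [("linear logistic", linear),
                    ("spline logistic", splines),
                    ("gradient boosting", boosted)]:
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: cross-validated c-statistic = {auc:.3f}")
```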
ML good for treatment selection rules?
High hopes
“The incorporation of new data modalities such as single-cell profiling, along with techniques that
rapidly find effective drug combinations will likely be instrumental in improving cancer care.”
Statistics good for treatment selection rules?
https://hbiostat.org/blog/post/path/index.html
Alternatives
1) Risk-based methods (11 papers): use only prognostic factors to define patient subgroups, relying on the mathematical dependency of the absolute risk difference on baseline risk (see the sketch after this list)
2) Treatment effect modeling methods (9 papers): use prognostic factors and treatment effect modifiers, including penalization or separate data sets for subgroup identification / effect estimation
3) Optimal treatment regime methods (12 papers): focus primarily on treatment effect modifiers to classify the trial population into those who benefit from treatment and those who do not
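A minimal sketch of the risk-based idea in 1) (Python with scikit-learn and pandas on simulated trial data; purely illustrative): a constant relative treatment effect translates into a larger absolute benefit at higher baseline risk.

```python
# Minimal sketch of a risk-based approach: model baseline risk from
# prognostic factors, then examine the absolute risk difference by risk quarter.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 4000
x = rng.normal(size=(n, 3))               # prognostic factors
treated = rng.binomial(1, 0.5, size=n)    # randomised treatment assignment
base_logit = -1 + x[:, 0] + 0.5 * x[:, 1]
# constant relative effect -> absolute benefit grows with baseline risk
y = rng.binomial(1, 1 / (1 + np.exp(-(base_logit - 0.7 * treated))))

# 1. baseline risk model from prognostic factors only (control arm)
risk_model = LogisticRegression().fit(x[treated == 0], y[treated == 0])
baseline_risk = risk_model.predict_proba(x)[:, 1]

# 2. absolute risk difference per baseline-risk quarter
df = pd.DataFrame({"risk_q": pd.qcut(baseline_risk, 4, labels=False),
                   "treated": treated, "y": y})
rates = df.groupby(["risk_q", "treated"])["y"].mean().unstack()
print((rates[0] - rates[1]).rename("absolute risk reduction by risk quarter"))
```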
5. Complex data structures require innovative approaches
Examples of successful ML:
- Image analysis: deep learning (DL)
  - Radiology, pathology, dermatology, ophthalmology, gastroenterology, cardiology, …
- Free text: natural language processing (NLP)
  - Mining electronic health records, building blocks for prediction, …
  - Pharmacovigilance in social media
6. Some problems are really hard
Prediction:
- Small N, small p → regression
- Small N, large p → hopeless
- Large N, small p → regression
- Large N, large p → ? (see the sketch below)
Treatment selection:
- Balance bias vs precision
- Causal interpretation
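For the large-N, large-p cell, a minimal sketch (Python with scikit-learn on simulated data; hyperparameters are illustrative) of penalised elastic-net logistic regression, in line with the EHR example quoted earlier:

```python
# Minimal sketch for the large-N, large-p cell: penalised (elastic net)
# logistic regression with many candidate predictors, few of them informative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=5000, n_features=600, n_informative=20,
                           random_state=1)
X_dev, X_val, y_dev, y_val = train_test_split(X, y, random_state=1)

enet = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga",
                       l1_ratio=0.5, C=0.1, max_iter=5000),
)
enet.fit(X_dev, y_dev)
p = enet.predict_proba(X_val)[:, 1]
print("validation c-statistic:", round(roc_auc_score(y_val, p), 3))
```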
Summary 21 Oct 2022
1. ML is not really new and needs to liaise with statistics
2. Data quality and bias: design is key, learn from clinical epidemiology
3. Don’t make too many promises
4. Research questions relate to description, prediction and causality
5. Recognize the power of ML for specific complex data structures
6. Work on the truly hard problems together
