Improving predictions: Ridge, Lasso and Stein’s paradox
RIVM Epi masterclass (22/3/18)
Maarten van Smeden
Post-doc clinical epidemiology/medical statistics, Leiden University Medical Center
This slide deck available:
https://www.slideshare.net/MaartenvanSmeden
Diagnostic / prognostic prediction
Clinical prediction models
•Diagnostic prediction: probability of disease D = d in patient i?
•Prognostic prediction: probability of developing health outcome Y = y within
(or up to) T years in patient i?
Apgar score (since 1952)
Just this morning
Rise of prediction models
•>110 models for prostate cancer (Shariat 2008)
•>100 models for Traumatic Brain Injury (Perel 2006)
•83 models for stroke (Counsell 2001)
•54 models for breast cancer (Altman 2009)
•43 models for type 2 diabetes (Collins 2011; Dieren 2012)
•31 models for osteoporotic fracture (Steurer 2011)
•29 models in reproductive medicine (Leushuis 2009)
•26 models for hospital readmission (Kansagara 2011)
•>25 models for length of stay in cardiac surgery (Ettema 2010)
•>350 models for CVD outcomes (Damen 2016)
The overview was created and first presented by Prof. KGM Moons (Julius Center, UMC Utrecht)
Reality
Bell et al. BMJ 2015;351:h5639
This talk
Key message
Regression shrinkage strategies, such as Ridge and Lasso, have the ability to
dramatically improve predictive performance of prediction models
Outline
•What is wrong with traditional prediction model development strategies?
•What is Ridge and Lasso?
•Some thoughts on when to consider Ridge/Lasso.
Setting
•Development data: with subjects (i = 1, . . . , N) for which an outcome is
observed (y: the outcome to predict), and P predictor variables (X: explanatory
variables to make a prediction of y)
•(External) validation data: with subjects that were not part of the
development data but have the same outcome and predictor variables observed.
Perhaps subjects from a different geographical area
•The goal is to develop a prediction model with as high as possible predictive
performance in validation (out-of-sample performance); performance in the
development sample is not directly relevant
•I’ll focus on the linear model for illustrative reasons
•N >> P
Linear model: OLS regression
Linear regression model
y = f(X) + ε, ε ∼ N(0, σ²)
•With linear main effects only: f̂(X) = β̂0 + β̂1x1 + β̂2x2 + . . . + β̂PxP
•Find β̂ that minimizes the (in-sample) squared prediction error: Σi (yi − f̂(xi))²
•Closed form solution: β̂ = (XᵀX)⁻¹Xᵀy
Question
Is f̂(·) the best estimator to predict for future individuals?
1955: Stein’s paradox
Stein’s paradox in words (rather simplified)
When one has three or more units (say, individuals), and for each unit one can
calculate an average score (say, average blood pressure), then the best guess of
future observations (blood pressure) for each unit is NOT its average score.
1961: James-Stein estimator: the next Berkeley Symposium
James and Stein. Estimation with quadratic loss. Proceedings of the fourth Berkeley symposium on mathematical
statistics and probability. Vol. 1. 1961.
1977: Baseball example
Efron and Morris (1977). Stein’s paradox in statistics. Scientific American, 236 (5): 119-127.
Lessons from Stein’s paradox
•Probably among the most surprising (and initially doubted) phenomena in
statistics
•Now a large “family”: shrinkage estimators reduce prediction variance to an
extent that typically outweighs the bias that is introduced
•Bias/variance trade-off principle has motivated many statistical developments
Bias, variance and prediction error1
Expected prediction error = irreducible error + bias² + variance
1 Friedman et al. (2001). The Elements of Statistical Learning. Vol. 1. New York: Springer.
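Stein's result and the decomposition above can be made concrete with a small Monte Carlo sketch (in Python/numpy for illustration; the deck itself uses R). The true means, number of units, and noise level below are hypothetical, chosen only to show that the biased James-Stein shrinkage estimator attains lower average squared error than the per-unit averages:

```python
import numpy as np

rng = np.random.default_rng(42)
p, reps = 10, 10_000
theta = np.linspace(0.0, 1.0, p)          # hypothetical true means

mse_avg, mse_js = 0.0, 0.0
for _ in range(reps):
    x = rng.normal(theta, 1.0)            # one noisy measurement per unit
    shrink = 1.0 - (p - 2) / np.sum(x ** 2)
    js = shrink * x                       # James-Stein: shrink toward 0
    mse_avg += np.mean((x - theta) ** 2)  # plain averages as estimates
    mse_js += np.mean((js - theta) ** 2)

print(mse_js / reps, mse_avg / reps)      # shrinkage wins on average
```

The shrinkage introduces bias, but the reduction in variance more than compensates, exactly the trade-off the decomposition describes.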
Illustration of regression shrinkage
Illustration of shrinkage
Was I just lucky?
Simulate 100 times
Not just lucky
•5% reduction in MSPE just by using a shrinkage estimator
•Van Houwelingen and le Cessie’s heuristic shrinkage factor
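A repeated-sampling experiment in the spirit of the one above can be sketched as follows (Python/numpy for illustration; the sample size, coefficients, noise level, and fixed penalty λ are hypothetical choices, not the values behind the slides). Averaged over 100 simulated development samples, the shrunk (ridge) predictions have lower out-of-sample MSPE than OLS:

```python
import numpy as np

rng = np.random.default_rng(1)
N, P, sigma, lam, sims = 25, 10, 2.0, 10.0, 100
beta = rng.normal(0, 0.5, P)              # hypothetical true coefficients

def fit(X, y, lam):
    # ridge closed form; lam = 0 gives the OLS solution
    return np.linalg.solve(X.T @ X + lam * np.eye(P), X.T @ y)

Xv = rng.normal(size=(1000, P))           # large validation sample
yv = Xv @ beta + rng.normal(0, sigma, 1000)

mspe_ols = mspe_ridge = 0.0
for _ in range(sims):                     # simulate 100 times
    X = rng.normal(size=(N, P))
    y = X @ beta + rng.normal(0, sigma, N)
    mspe_ols += np.mean((yv - Xv @ fit(X, y, 0.0)) ** 2)
    mspe_ridge += np.mean((yv - Xv @ fit(X, y, lam)) ** 2)

print(mspe_ridge / sims, mspe_ols / sims)  # ridge lower on average
```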
Heuristic argument for shrinkage
[Calibration plot: predicted vs. observed values; the fitted model line deviates from the ideal 45° line]
Typical calibration plot: “overfitting”
Overfitting
"Idiosyncrasies in the data are fitted rather than generalizable
patterns. A model may hence not be applicable to new patients,
even when the setting of application is very similar to the
development setting."
Steyerberg (2009). Clinical Prediction Models.
Ridge regression
Objective: minimize
Σi (yi − f̂(xi))² + λ Σp=1..P β̂p²
•Note: λ = 0 corresponds to the OLS solution
•Closed form solution: β̂ = (XᵀX + λIP)⁻¹Xᵀy, where IP is a P-dimensional
identity matrix
•In most software programs X is standardized and y centered for estimation
(output is mostly transformed back to original scale)
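A minimal sketch of the closed-form solution (Python/numpy for illustration; the simulated data and λ values are arbitrary assumptions). It checks two properties stated above: λ = 0 reproduces the OLS solution, and λ > 0 shrinks the coefficient vector toward zero:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(size=100)

def ridge(X, y, lam):
    # (X'X + lambda * I)^(-1) X'y ; lam = 0 gives the OLS solution
    P = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(P), X.T @ y)

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(ridge(X, y, 0.0), b_ols))                       # True
print(np.linalg.norm(ridge(X, y, 10.0)) < np.linalg.norm(b_ols))  # True
```

The second check holds for any λ > 0: a ridge solution with larger norm than OLS would be beaten by OLS on both terms of the objective.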
The challenge of ridge regression
finding a good value for the "tuning parameter": λ.
Diabetes data
Source: https://web.stanford.edu/~hastie/Papers/LARS/ (19/3/2018)
Details: Efron et al. (2004) Least angle regression. The Annals of Statistics.
Diabetes data
K-fold cross-validation to find “optimal” λ
•Usually K = 10 or K = 5
•Partition the dataset into K non-overlapping sub-datasets of equal size
(disjoint subsets)
•Fit the statistical model on all but one of the subsets (training set), and evaluate
the performance of the model on the left-out subset (test set)
•Fit and evaluate K times, each time leaving out a different subset
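The K-fold procedure above can be sketched as follows (Python/numpy for illustration; the simulated data, the candidate λ grid, and K = 5 are assumptions for the example, not the diabetes data):

```python
import numpy as np

rng = np.random.default_rng(2)
N, P, K = 100, 10, 5
X = rng.normal(size=(N, P))
y = X @ rng.normal(0, 1, P) + rng.normal(0, 2, N)

def ridge(Xt, yt, lam):
    return np.linalg.solve(Xt.T @ Xt + lam * np.eye(Xt.shape[1]), Xt.T @ yt)

folds = np.array_split(rng.permutation(N), K)   # K disjoint subsets
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]         # candidate tuning values
cv_mspe = []
for lam in lambdas:
    err = 0.0
    for k in range(K):                          # fit and evaluate K times
        test = folds[k]
        train = np.setdiff1d(np.arange(N), test)
        b = ridge(X[train], y[train], lam)
        err += np.mean((y[test] - X[test] @ b) ** 2)
    cv_mspe.append(err / K)                     # average test-fold MSPE

best_lam = lambdas[int(np.argmin(cv_mspe))]
print(best_lam, np.round(cv_mspe, 2))
```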
First fold of cross-validation (Diabetes data)
5-fold cross-validation (Diabetes data)
Diabetes data: Ridge regression results
AGE SEX BMI BP s1 s2 s3 s4 s5 s6
OLS -10.00 -239.80 519.80 324.40 -792.2 476.70 -101.00 177.10 751.30 67.60
Ridge -9.93 -239.68 520.11 324.25 -763.5 454.28 -88.23 173.37 740.69 67.66
Regression coefficients (data were standardized, outcome centered)
•log(λ) = 1.60 minimized average cross-validation MSPE
•R-code Ridge regression (glmnet package):
require(glmnet)
require(glmnetUtils)
df <- read.table("diabetes.txt", header = TRUE)
rcv <- cv.glmnet(y ~ ., data = df, alpha = 0, family = "gaussian", nfolds = 5)
fitr <- glmnet(y ~ ., data = df, alpha = 0, lambda = rcv$lambda.min)
coef(fitr)
Lasso regression
Objective: minimize
Σi (yi − f̂(xi))² + λ2 Σp=1..P |β̂p|
•Remember Ridge regression: Σi (yi − f̂(xi))² + λ Σp=1..P β̂p²
•No closed form solution for the Lasso: estimation proceeds iteratively
•Like Ridge regression, cross-validation is used for estimating λ2
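Since there is no closed form, software typically iterates a cyclic coordinate descent with a soft-thresholding update. A minimal sketch (Python/numpy for illustration; the simulated data, penalty, and iteration count are assumptions, and glmnet's actual implementation is considerably more refined):

```python
import numpy as np

def soft(z, g):
    # soft-thresholding operator: sign(z) * max(|z| - g, 0)
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def lasso_cd(X, y, lam, iters=200):
    # cyclic coordinate descent for (1/2N)||y - Xb||^2 + lam * ||b||_1
    N, P = X.shape
    b = np.zeros(P)
    for _ in range(iters):
        for j in range(P):
            r = y - X @ b + X[:, j] * b[j]          # partial residual
            z = X[:, j] @ r / N
            b[j] = soft(z, lam) / (X[:, j] @ X[:, j] / N)
    return b

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 8))
X = (X - X.mean(0)) / X.std(0)                      # standardize predictors
beta = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.normal(0, 1, 200)
y = y - y.mean()                                    # center outcome

b = lasso_cd(X, y, lam=0.2)
print(np.round(b, 2))                               # some entries exactly 0
```

The soft-threshold sets small coordinates exactly to zero, which is where the Lasso's built-in variable selection comes from.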
Diabetes data: Lasso regression results
AGE SEX BMI BP s1 s2 s3 s4 s5 s6
OLS -10.00 -239.80 519.80 324.40 -792.20 476.70 -101.00 177.10 751.30 67.60
Ridge -9.93 -239.68 520.11 324.25 -763.50 454.28 -88.23 173.37 740.69 67.66
Lasso 0.00 -184.39 520.52 290.18 -87.53 0.00 219.67 0.00 504.93 48.08
Regression coefficients (data were standardized, outcome centered)
•The Lasso shrinks some coefficients to exactly zero: built-in variable selection (!!!)
•R-code Lasso regression (glmnet package):
require(glmnet)
require(glmnetUtils)
df <- read.table("diabetes.txt", header = TRUE)
lcv <- cv.glmnet(y ~ ., data = df, alpha = 1, family = "gaussian", nfolds = 5)
fitl <- glmnet(y ~ ., data = df, alpha = 1, lambda = lcv$lambda.min)
coef(fitl)
The argument to use Ridge/Lasso
Key message
Regression shrinkage strategies, such as Ridge and Lasso, have the ability to
dramatically improve predictive performance of prediction models
Some arguments against Ridge/Lasso
•Interpretation of regression coefficient
•Shrinkage not needed due to sufficient sample size (e.g. based on rule of
thumb)
•Cross-validation can lead to unstable estimation of the λ parameter
•Difficult to implement
Interpretation of regression coefficients
•Shrinkage estimators such as Ridge and Lasso introduce bias in (‘shrink’) the
regression coefficient by design
•Most software programs do not provide standard errors and confidence intervals
for Ridge/Lasso regression coefficients
•Interpretation of coefficients is not / should not be the goal of a prediction
model
Note
Popular approaches to develop prediction models yield biased regression
coefficients and provide uninterpretable confidence intervals
Variable selection without shrinkage: parameters may need shrinkage to become unbiased
Available at: https://www.slideshare.net/MaartenvanSmeden
Some arguments against Ridge/Lasso
•Interpretation of regression coefficient
•Shrinkage not needed due to sufficient sample size
•Cross-validation can lead to unstable estimation of the λ parameter
•Difficult to implement
Sufficient sample size?
The benefit of regression shrinkage depends on:
•Sample size
•Correlations between predictor variables
•Sparsity of outcome and predictor variables
•The irreducible error component
•Type of outcome (continuous, binary, count, time-to-event,. . . )
•Number of candidate predictor variables
•Non-linear/interaction effects
•Weak/strong predictor balance
How to know that there is no need for shrinkage at some sample size?
Is a rule of thumb a rule of dumb?1
1 Direct quote from a tweet by prof Stephen Senn:
https://twitter.com/stephensenn/status/936213710770753536
Some arguments against Ridge/Lasso
•Interpretation of regression coefficient
•Shrinkage not needed due to sufficient sample size (e.g. based on rule of
thumb)
•Cross-validation can lead to unstable estimation of the λ parameter
•Difficult to implement
Estimating Ridge/Lasso
•“Programming” Ridge/Lasso regression isn’t hard with user-friendly software
such as the glmnet package in R
•Getting it right might be a bit tougher than traditional approaches. It’s all
about the tuning parameter (λ)
•K-fold cross-validation makes arbitrary partitions of data which may make
estimating the tuning parameter unstable (there are some suggestions to
circumvent the problems). Note: this is not a flaw of cross-validation: it means
that there is probably insufficient data to estimate how much shrinkage is really
needed!
Closing remarks
•Shrinkage is highly recommended when developing a prediction model (e.g. see
the TRIPOD reporting guidelines)
•Software and methodological developments have made Lasso and Ridge
regression relatively easy to implement and computationally fast
•The cross-validation procedure can provide insights about possible overfitting
(much like propensity score analysis can provide information about balance)
•Consider the Lasso instead of traditional backward/forward selection strategies
Slide deck available: https://www.slideshare.net/MaartenvanSmeden
Free R tutorial (~ 2 hours): http://www.r-tutorial.nl/
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physics
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 

Improving predictions: Lasso, Ridge and Stein's paradox

  • 1. Improving predictions: Ridge, Lasso and Stein’s paradox RIVM Epi masterclass (22/3/18) Maarten van Smeden Post-doc clinical epidemiology/medical statistics, Leiden University Medical Center
  • 2.
  • 3.
  • 4. This slide deck available: https://www.slideshare.net/MaartenvanSmeden
  • 5. Diagnostic / prognostic prediction Clinical prediction models •Diagnostic prediction: probability of disease D = d in patient i? •Prognostic prediction: probability of developing health outcome Y = y within (or up to) T years in patient i?
  • 8. Rise of prediction models •>110 models for prostate cancer (Shariat 2008) •>100 models for Traumatic Brain Injury (Perel 2006) •83 models for stroke (Counsell 2001) •54 models for breast cancer (Altman 2009) •43 models for type 2 diabetes (Collins 2011; Dieren 2012) •31 models for osteoporotic fracture (Steurer 2011) •29 models in reproductive medicine (Leushuis 2009) •26 models for hospital readmission (Kansagara 2011) •>25 models for length of stay in cardiac surgery (Ettema 2010) •>350 models for CVD outcomes (Damen 2016) The overview was created and first presented by Prof. KGM Moons (Julius Center, UMC Utrecht)
  • 9. Reality Bell et al. BMJ 2015;351:h5639
  • 10. This talk Key message Regression shrinkage strategies, such as Ridge and Lasso, have the ability to dramatically improve predictive performance of prediction models Outline •What is wrong with traditional prediction model development strategies? •What are Ridge and Lasso? •Some thoughts on when to consider Ridge/Lasso.
  • 11. Setting •Development data: with subjects (i = 1, . . . , N) for which an outcome is observed (y: the outcome to predict), and P predictor variables (X: explanatory variables to make a prediction of y) •(External) validation data: with subjects that were not part of the development data but have the same outcome and predictor variables observed. Perhaps subjects from a different geographical area •The goal is to develop a prediction model with as high as possible predictive performance in validation (out-of-sample performance); performance in the development sample is not directly relevant •I’ll focus on the linear model for illustrative reasons •N >> P
  • 13. Linear model: OLS regression Linear regression model y = f(X) + ε, ε ∼ N(0, σ²) •With linear main effects only: f̂(X) = β̂0 + β̂1x1 + β̂2x2 + . . . + β̂P xP •Find β̂ that minimizes the (in-sample) squared prediction error: Σi (yi − f̂(xi))² •Closed-form solution: (XᵀX)⁻¹Xᵀy Question Is f̂(.) the best estimator to predict for future individuals?
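The closed-form OLS fit above can be sketched in a few lines: for a single predictor plus intercept, (XᵀX)⁻¹Xᵀy reduces to the familiar slope/intercept formulas. A minimal illustration (the slides use R; this sketch is in Python, and the data are made up):

```python
# Minimal sketch: OLS via the closed form, which for one predictor plus an
# intercept is equivalent to the usual slope/intercept formulas.

def ols_fit(x, y):
    """Return (b0, b1) minimizing the in-sample squared prediction error."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    b0 = ybar - b1 * xbar
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]          # made-up data, roughly y = 2x
b0, b1 = ols_fit(x, y)
print(round(b0, 2), round(b1, 2))  # prints 0.15 1.94
```

The question on the slide is whether this in-sample optimum is also the best predictor out of sample; the Stein result that follows says it generally is not.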
  • 15. 1955: Stein’s paradox Stein’s paradox in words (rather simplified) When one has three or more units (say, individuals), and for each unit one can calculate an average score (say, average blood pressure), then the best guess of future observations (blood pressure) for each unit is NOT its average score.
  • 16. 1961: James-Stein estimator: the next Berkeley Symposium James and Stein. Estimation with quadratic loss. Proceedings of the fourth Berkeley symposium on mathematical statistics and probability. Vol. 1. 1961.
  • 17. 1977: Baseball example Efron and Morris (1977). Stein’s paradox in statistics. Scientific American, 236 (5): 119-127.
  • 18. Lessons from Stein’s paradox •Probably among the most surprising (and initially doubted) phenomena in statistics •Now a large “family”: shrinkage estimators reduce prediction variance to an extent that typically outweighs the bias that is introduced •Bias/variance trade-off principle has motivated many statistical developments Bias, variance and prediction error¹ Expected prediction error = irreducible error + bias² + variance ¹ Friedman et al. (2001). The elements of statistical learning. Vol. 1. New York: Springer series.
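The baseball-style shrinkage behind Stein's paradox amounts to pulling each unit's average toward the grand mean. A simplified, hypothetical Python sketch (positive-part James-Stein form, assuming a known common variance `sigma2`; the batting-average-like numbers are invented, not Efron and Morris's data):

```python
# Sketch of James-Stein-style shrinkage toward the grand mean.
# Simplified: assumes a known, common variance sigma2 (illustrative only).

def james_stein(z, sigma2):
    k = len(z)
    zbar = sum(z) / k
    ss = sum((zi - zbar) ** 2 for zi in z)
    # Positive-part shrinkage factor: 0 <= c <= 1; smaller c = more shrinkage
    c = max(0.0, 1.0 - (k - 3) * sigma2 / ss)
    return [zbar + c * (zi - zbar) for zi in z]

means = [0.40, 0.38, 0.35, 0.33, 0.30, 0.28, 0.25, 0.22]  # invented averages
shrunk = james_stein(means, sigma2=0.001)
# Every value moves toward the grand mean; extremes move the most.
```

The design choice this illustrates: each individual estimate is biased toward the grand mean, yet the collection of shrunken estimates has lower total prediction error, which is exactly the variance-for-bias trade the slide describes.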
  • 31. Not just lucky •5% reduction in MSPE just by shrinkage estimator •Van Houwelingen and le Cessie’s heuristic shrinkage factor
  • 32. Heuristic argument for shrinkage [calibration plot: predicted (x-axis) vs. observed (y-axis), with “ideal” and “model” lines] Typical calibration plot: “overfitting”
  • 34. Overfitting "Idiosyncrasies in the data are fitted rather than generalizable patterns. A model may hence not be applicable to new patients, even when the setting of application is very similar to the development setting." Steyerberg (2009). Clinical Prediction Models.
  • 35. Ridge regression Objective Σi (yi − f̂(xi))² + λ Σp=1..P β̂p² •Note: λ = 0 corresponds to the OLS solution •Closed-form solution: (XᵀX + λIP)⁻¹Xᵀy, where IP is a P-dimensional identity matrix •In most software programs X is standardized and y centered for estimation (output is mostly transformed back to the original scale) The challenge of ridge regression: finding a good value for the “tuning parameter” λ.
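With a single standardized predictor and centered outcome, the ridge closed form (XᵀX + λI)⁻¹Xᵀy reduces to a scalar, which makes the shrinkage directly visible. A minimal Python sketch (the slides use R; data are made up):

```python
# Ridge closed form for one standardized predictor and a centered outcome:
# (X'X + lambda*I)^{-1} X'y reduces to sum(x*y) / (sum(x^2) + lambda).

def ridge_coef(x, y, lam):
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

x = [-1.5, -0.5, 0.5, 1.5]        # centered predictor (made-up)
y = [-3.0, -1.0, 1.0, 3.0]        # centered outcome (made-up)
print(ridge_coef(x, y, lam=0.0))  # 2.0 -- lambda = 0 recovers OLS
print(ridge_coef(x, y, lam=5.0))  # 1.0 -- shrunk toward zero
```

Increasing λ only ever shrinks the coefficient toward zero here; choosing how much to shrink is the tuning problem the slide names.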
  • 36. Diabetes data Source: https://web.stanford.edu/~hastie/Papers/LARS/ (19/3/2018) Details: Efron et al. (2004) Least angle regression. The Annals of Statistics.
  • 38. K-fold cross-validation to find “optimal” λ •Usually K = 10 or K = 5 •Partition the dataset into K non-overlapping sub-datasets of equal size (disjoint subsets) •Fit statistical model on all but 1 of the subsets (training set), and evaluate performance of the model in the left-out subset (test set) •Fit and evaluate K times
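The partitioning step described above can be sketched as follows (Python rather than the slides' R; round-robin fold assignment for simplicity, whereas in practice folds are usually randomized):

```python
# Sketch of the K-fold partition: split subject indices into K disjoint
# test folds so that each subject is left out exactly once.

def kfold_indices(n, k):
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)  # round-robin assignment (no shuffling here)
    return folds

folds = kfold_indices(10, 5)
for test_idx in folds:
    train_idx = [i for i in range(10) if i not in test_idx]
    # fit the model on train_idx, evaluate MSPE on test_idx,
    # then average performance over the K folds
```

For tuning λ, this loop is run over a grid of λ values and the λ with the best average test-fold performance is selected.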
  • 39. First fold of cross-validation (Diabetes data)
  • 41. Diabetes data: Ridge regression results AGE SEX BMI BP s1 s2 s3 s4 s5 s6 OLS -10.00 -239.80 519.80 324.40 -792.2 476.70 -101.00 177.10 751.30 67.60 Ridge -9.93 -239.68 520.11 324.25 -763.5 454.28 -88.23 173.37 740.69 67.66 Regression coefficients (data were standardized, outcome centered) •log(λ) = 1.60 minimized average cross-validation MSPE •R-code Ridge regression (glmnet package): require(glmnet) require(glmnetUtils) df <- read.table("diabetes.txt",header=T) rcv <- cv.glmnet(y~.,df,alpha=0,family="gaussian",nfolds=5) fitr <- glmnet(y~.,df,alpha=0,lambda=rcv$lambda.min) coef(fitr)
  • 42. Lasso regression Objective Σi (yi − f̂(xi))² + λ2 Σp=1..P |β̂p| •Remember Ridge regression: Σi (yi − f̂(xi))² + λ Σp=1..P β̂p² •No closed-form solution for the Lasso: estimation proceeds iteratively •Like Ridge regression, cross-validation for estimating λ2
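One common way the iterative Lasso estimation proceeds is coordinate descent, whose per-coefficient update applies a soft-thresholding operator; this operator is also what sets some coefficients exactly to zero. A minimal Python sketch of the operator alone (not a full Lasso solver):

```python
# Soft-thresholding operator used in Lasso coordinate-descent updates:
# S(z, gamma) = sign(z) * max(|z| - gamma, 0)

def soft_threshold(z, gamma):
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0  # coefficients landing inside [-gamma, gamma] become exactly 0

print(soft_threshold(3.0, 1.0))   # 2.0  -- shrunk, but kept in the model
print(soft_threshold(-0.5, 1.0))  # 0.0  -- dropped: built-in variable selection
```

The ridge penalty, by contrast, shrinks coefficients multiplicatively and never sets them exactly to zero, which is why only the Lasso column in the next slide contains zeros.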
  • 43. Diabetes data: Lasso regression results AGE SEX BMI BP s1 s2 s3 s4 s5 s6 OLS -10.00 -239.80 519.80 324.40 -792.20 476.70 -101.00 177.10 751.30 67.60 Ridge -9.93 -239.68 520.11 324.25 -763.50 454.28 -88.23 173.37 740.69 67.66 Lasso 0.00 -184.39 520.52 290.18 -87.53 0.00 219.67 0.00 504.93 48.08 Regression coefficients (data were standardized, outcome centered) •Lasso shrinks some variables to zero: built-in variable selection (!!!) •R-code Lasso regression (glmnet package): require(glmnet) require(glmnetUtils) df <- read.table("diabetes.txt",header=T) lcv <- cv.glmnet(y~.,df,alpha=1,family="gaussian",nfolds=5) fitl <- glmnet(y~.,df,alpha=1,lambda=lcv$lambda.min) coef(fitl)
  • 44. The argument to use Ridge/Lasso Key message Regression shrinkage strategies, such as Ridge and Lasso, have the ability to dramatically improve predictive performance of prediction models
  • 45. Some arguments against Ridge/Lasso •Interpretation of regression coefficient •Shrinkage not needed due to sufficient sample size (e.g. based on rule of thumb) •Cross-validation can lead to unstable estimation of the λ parameter •Difficult to implement
  • 46. Interpretation of regression coefficients •Shrinkage estimators such as Ridge and Lasso introduce bias in (‘shrink’) the regression coefficients by design •Most software programs do not provide standard errors and confidence intervals for Ridge/Lasso regression coefficients •Interpretation of coefficients is not / should not be the goal of a prediction model Note Popular approaches to develop prediction models yield biased regression coefficients and provide uninterpretable confidence intervals
  • 48. Parameters may need shrinkage to become unbiased Available at: https://www.slideshare.net/MaartenvanSmeden
  • 49. Some arguments against Ridge/Lasso •Interpretation of regression coefficient •Shrinkage not needed due to sufficient sample size •Cross-validation can lead to unstable estimation of the λ parameter •Difficult to implement
  • 50. Sufficient sample size? The benefit of regression shrinkage depends on: •Sample size •Correlations between predictor variables •Sparsity of outcome and predictor variables •The irreducible error component •Type of outcome (continuous, binary, count, time-to-event, . . . ) •Number of candidate predictor variables •Non-linear/interaction effects •Weak/strong predictor balance How to know that there is no need for shrinkage at some sample size?
  • 51. Is a rule of thumb a rule of dumb?¹ ¹ direct quote from tweet by prof Stephen Senn: https://twitter.com/stephensenn/status/936213710770753536
  • 52. Some arguments against Ridge/Lasso •Interpretation of regression coefficient •Shrinkage not needed due to sufficient sample size (e.g. based on rule of thumb) •Cross-validation can lead to unstable estimation of the λ parameter •Difficult to implement
  • 53. Estimating Ridge/Lasso •“Programming” Ridge/Lasso regression isn’t hard with user friendly software such as the glmnet package in R •Getting it right might be a bit tougher than traditional approaches. It’s all about the tuning parameter (λ) •K-fold cross-validation makes arbitrary partitions of data which may make estimating the tuning parameter unstable (there are some suggestions to circumvent the problems). Note: this is not a flaw of cross-validation: it means that there is probably insufficient data to estimate how much shrinkage is really needed!
  • 54. Closing remarks •Shrinkage is highly recommended when developing a prediction model (e.g. see the TRIPOD guidelines for reporting) •Software and methodological developments have made Lasso and Ridge regression relatively easy to implement and computationally fast •The cross-validation procedure can provide insights about possible overfitting (much like propensity score analysis can provide information about balance) •Consider the Lasso instead of traditional backward/forward selection strategies
  • 55. Slide deck available: https://www.slideshare.net/MaartenvanSmeden Free R tutorial (~ 2 hours): http://www.r-tutorial.nl/
  • 56. AI and machine learning
  • 57. AI and machine learning