June, 2022
Is Open Science Better Science?
Ewout W. Steyerberg, PhD
Professor of Clinical Biostatistics and
Medical Decision Making
Thanks to many for assistance and inspiration,
including the GAP3 consortium, CENTER-TBI Study
Yes, but …
Open vs closed science
Long ago
- Performed by a few elite scientists
- Doing private experiments
- Discussion in small, closed communities
Probabilities to quantify uncertainty
• Christiaan Huygens 1657:
'Van rekeningh in spelen van geluck'
• Thomas Bayes 1763:
“An Essay towards solving a Problem in the Doctrine of Chances”
(read to the Royal Society by Richard Price)
• Pierre Laplace 1812:
Théorie analytique des probabilités
6-Jun-22
3 Insert > Header & footer
Open vs closed science
Long ago
- Performed by a few elite scientists
- Doing private experiments
- Discussion in small, closed communities
Recent
- Science as a profession
- Protect data + code as intellectual property
- Aim for shocking findings in high-impact-factor journals
https://www.sciencemag.org/news/2020/06/whos-blame-these-three-scientists-are-heart-surgisphere-covid-19-scandal
Overall claim
“Open Science will make research better”
Vote pro / neutral / con
“More data is better”
Vote pro / neutral / con
Today
Aims:
- Highlight some strong points in Open Science
- Hint at some challenges in Open Science
Reflections based on 30 years of personal research experience,
with a specific focus on prediction research / decision making
Open Science to better address
Big Research questions
Open science research questions: case 1
Example 1: Red cards and dark-skinned soccer players
https://psyarxiv.com/qkwst/
Open science research questions: case 1
• 29 teams involving 61 analysts; same dataset; same research question:
whether soccer referees are more likely to give red cards to dark-skin-toned
players than to light-skin-toned players
• Estimated odds ratios: 0.89–2.93 (median 1.3)
• 20 teams: statistically significant positive effect; 9 teams: non-significant relation
Estimated odds ratios by 29 research teams
“Logistic regression”
Open science research questions: case 1
• 29 teams involving 61 analysts; same dataset; same research question:
whether soccer referees are more likely to give red cards to dark-skin-toned
players than to light-skin-toned players
• Estimated odds ratios: 0.89–2.93 (median 1.3).
• 20 teams: statistically significant positive effect; 9 teams: non-significant relation.
• 21 unique combinations of covariates
• “Variation in analysis of complex data may be difficult to
avoid, even by experts with honest intentions”
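One way analytic choices can move an odds ratio is the decision whether to adjust for a covariate. The following is a minimal sketch, assuming hypothetical 2×2 counts (not the red-card data) and a Mantel-Haenszel adjustment; the slides do not say which methods the 29 teams used beyond the variation in covariate sets.

```python
from math import exp, log, sqrt


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: events / non-events in the exposed group
    c, d: events / non-events in the unexposed group
    """
    return (a * d) / (b * c)


def wald_ci(a, b, c, d, z=1.96):
    """Approximate 95% Wald confidence interval, built on the log-odds scale."""
    log_or = log(odds_ratio(a, b, c, d))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or - z * se), exp(log_or + z * se)


def mantel_haenszel_or(strata):
    """Odds ratio adjusted for a stratifying covariate (Mantel-Haenszel)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts, split by one made-up covariate with two levels.
strata = [(30, 70, 15, 35), (5, 45, 10, 90)]

# Analysis A: ignore the covariate and collapse the strata into one table.
collapsed = tuple(sum(col) for col in zip(*strata))
crude_or = odds_ratio(*collapsed)

# Analysis B: adjust for the covariate.
adjusted_or = mantel_haenszel_or(strata)

print(f"crude OR    = {crude_or:.2f}, 95% CI = {wald_ci(*collapsed)}")
print(f"adjusted OR = {adjusted_or:.2f}")
```

With these made-up counts the collapsed analysis gives an odds ratio of about 1.5, while the stratified analysis gives 1.0: same data, two defensible models, two different answers.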
Open science research questions: case 2
Example from Maarten van Smeden
@MaartenvSmeden
Predicting mortality – the media
Findings not convincing
Cox, #4, 30 vars, max c = 0.793
RF, #7, 600 vars, c = 0.797
Elastic, #9, 600 vars, c = 0.801
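The c values above are concordance statistics. As a reminder of what they measure, here is a minimal sketch for the binary-outcome case, where c equals the area under the ROC curve. This is an assumption for illustration: the models on the slide are survival models, for which a censoring-aware version of c would be needed, and the numbers below are made up.

```python
def c_statistic(scores, outcomes):
    """Concordance (c) statistic for a binary outcome.

    Fraction of (event, non-event) pairs in which the event has the
    higher predicted score; tied scores count as one half.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical predictions and observed outcomes.
predicted = [0.1, 0.4, 0.35, 0.8]
observed = [0, 0, 1, 1]
print(c_statistic(predicted, observed))  # 3 of 4 pairs concordant -> 0.75
```

On this scale the differences between 0.793, 0.797 and 0.801 amount to less than one extra concordant pair per hundred.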
Machine learning vs conventional modeling
1. Findings convincing?
“We found that random forests did not outperform Cox models despite their
inherent ability to accommodate nonlinearities and interactions. …
Elastic nets achieved the highest discrimination performance …, demonstrating
the ability of regularisation to select relevant variables and optimise model
coefficients in an EHR context.”
Machine learning vs conventional modeling
1. Findings convincing? Not in case-study
2. Systematic / “it depends”?
Open science research questions: case 2
• 243 real datasets from “the OpenML database”
• RF performed better than LR:
mean difference between RF and LR was 0.041 (95%-CI =[0.031,0.053]) for
the Area Under the ROC Curve
• Results were dependent on the inclusion criteria used to select the example
datasets
• ES: Results rely on 10 x 10-fold cross-validation
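"10 x 10-fold" is read here as 10-fold cross-validation repeated 10 times with a fresh shuffle each time. A minimal sketch of how such splits can be generated, assuming plain (non-stratified) folds and a hypothetical sample size of 50; the function name is illustrative and not taken from the study.

```python
import random


def repeated_kfold(n, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # new random partition in every repeat
        folds = [idx[f::k] for f in range(k)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield r, f, train_idx, test_idx


splits = list(repeated_kfold(n=50))
print(len(splits))  # 10 repeats x 10 folds = 100 train/test splits
```

Each model is then fitted on `train_idx` and scored on `test_idx`, and the 100 performance estimates are averaged. Every observation is still reused across splits, so this reduces the noise of the split but is not external validation.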
Open science research questions: case 2
• More clarification needed on when ML / RF works best; at the least, a large N is needed
Systematic review on ML vs classic modeling
Differences in discrimination
Thanks to Maarten van Smeden
Summary on examples of Open Science
to better address Big research questions
• 1 data set
• multiple modelers
• Multiple modeling options
• 1 neutral comparison; 243 OpenML datasets
• Review of 282 comparative studies: meta-research
Open Science: data sharing
→ Collaboration vs giving
Heterogeneity in data .. ignored
Data sharing
• Pro:
• Allowed for larger sample size in a rare disease
• Cons:
• Heterogeneity?
• Substantial politics / efforts
Open Science: analyses and interpretation
Analyses: OHDSI model
OHDSI: COVID and other research topics
The power of OHDSI
OMOP common data model enables sharing of
model development code
Performance for different outcomes in multiple cohorts
OHDSI: bridging data sharing - analyses
• Keep data local
• Run centrally available analyses, started locally
• Share results centrally
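The three steps above can be sketched as a simple pattern: every site runs the same shared function on its own records and returns only aggregates, which a coordinator then combines. This is an illustration of the idea only, with hypothetical function names and toy data; it does not use or reproduce any OHDSI tooling.

```python
def local_summary(records):
    """Runs at a site: reduce patient-level rows to aggregate counts only."""
    n = len(records)
    events = sum(1 for r in records if r["outcome"] == 1)
    return {"n": n, "events": events}


def pool(summaries):
    """Runs centrally: combine site-level aggregates, never raw rows."""
    total_n = sum(s["n"] for s in summaries)
    total_events = sum(s["events"] for s in summaries)
    return {"n": total_n, "events": total_events, "rate": total_events / total_n}


# Hypothetical patient-level data that stays at each site.
site_a = [{"outcome": 1}, {"outcome": 0}, {"outcome": 0}, {"outcome": 0}]
site_b = [{"outcome": 1}, {"outcome": 1}, {"outcome": 0},
          {"outcome": 0}, {"outcome": 0}, {"outcome": 0}]

# Only these small dictionaries leave the sites.
shared = [local_summary(site_a), local_summary(site_b)]
pooled = pool(shared)
print(pooled)
```

A common data model is what makes the shared function runnable at every site without changes: the field names it relies on (here just `outcome`) mean the same thing everywhere.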
Open Science: analyses and interpretation
Open Science challenge:
dealing with heterogeneity for prediction research
Heterogeneity
• Study design
• Selection of subjects
• Measurement of covariates
• Measurement of outcomes
• Associations of covariates with outcome
• Overall outcome rates
• Performance of prediction models
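One common way to quantify heterogeneity in cohort-specific estimates (for example log odds ratios or model performance measures) is a random-effects meta-analysis. A minimal sketch, assuming the DerSimonian-Laird estimator and made-up cohort estimates; the slides do not state which method was used.

```python
def dersimonian_laird(estimates, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    k = len(estimates)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-cohort variance
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0    # share of variation beyond chance
    w_re = [1.0 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se = (1.0 / sum(w_re)) ** 0.5
    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2, "q": q}


# Hypothetical cohort-specific log odds ratios and their variances.
log_ors = [0.10, 0.45, 0.80, 0.30]
variances = [0.02, 0.03, 0.02, 0.05]
result = dersimonian_laird(log_ors, variances)
print({name: round(value, 3) for name, value in result.items()})
```

A tau² above zero says the cohorts differ by more than sampling error alone would explain, which widens the uncertainty for a prediction in any new setting.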
Analyses: dealing with heterogeneity
15 cohorts: 11 RCTs, 4 observational studies
Heterogeneous case-mix
Heterogeneous predictor effects
Heterogeneous predictions
Heterogeneity → uncertainty in individual predictions
given that a prespecified logistic model is fitted
“Open Science is Better Science”
1. Research questions in competitions
• Red cards
• Neutral comparisons / meta-research
2. Data sharing
• Collaborative efforts most successful
3. Analyses
• OHDSI: modern, keep data local
• Heterogeneity
