Slides for the statistics in practice session at the Biometrisches Kolloquium (organized by the Deutsche Region der Internationalen Biometrischen Gesellschaft), 18 March 2021
Development and evaluation of prediction models: pitfalls and solutions
1. Maarten van Smeden, PhD
Biometrisches Kolloquium
16 March 2021
Development and evaluation of prediction models: pitfalls and solutions
Statistics in practice I
2. M.vanSmeden@umcutrecht.nl | Twitter: @MaartenvSmeden
Statistics in practice I: development and evaluation of prediction models
Aims of the lectures
• To give our views on the state of the art of clinical prediction modelling
• To highlight common pitfalls and potential solutions when developing or
evaluating clinical prediction models
• To argue the need for less model development and more model evaluation
• To plead for increased attention to quantifying differences in model
performance over time and place
3.
Agenda
• PART I (11:00 – 12:40)
• State of the medical prediction modeling art
• Just another prediction model
• Methods against overfitting
• Methods for deciding on appropriate sample size
• PART II (13:30 – 15:10)
• Model performance and validation
• Heterogeneity over time and place: there is no such thing as a validated model
• Applied example: ADNEX model
• Future perspective: machine learning and AI
4.
5. Source: https://bit.ly/2ODx6c2
State of the medical prediction modeling art
6.
What do we mean by a prediction model?
“… summarize the effects of predictors to provide individualized predictions of
the absolute risk of a diagnostic or prognostic outcome.”
Steyerberg, 2019
”Model development studies aim to derive a prediction model by selecting the
relevant predictors and combining them statistically into a multivariable model.”
TRIPOD statement, 2015
“… may be developed for scientific or clinical reasons, or both”
Altman & Royston, 2000
7.
What do I mean by a prediction model?
• Mathematical rule or equation derived from empirical data describing the relation between a health outcome (Y) and input (X)
• The aim is to find a function f̂(X) that yields accurate individual-level predictions ŷ_i
• Finding the optimal f̂(X) is usually not only a prediction error minimization problem, but also takes into account requirements regarding transparency, transportability, costs, sparsity, etc.
• Interpretation and contribution of individual components in f̂(X) are not of primary interest (exception: issues related to fairness)
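As a concrete sketch of this definition, the following fits a logistic regression as one possible choice of f̂(X) and uses it to produce individual-level predicted risks ŷ_i. The data, coefficients, and model choice are all hypothetical, purely for illustration:

```python
# Minimal illustration: derive f_hat(X) from empirical data and use it
# to produce individual-level predicted risks (not class labels).
# Simulated data; logistic regression is one possible choice of f_hat.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 3))                    # three predictors
lp = -1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1]      # true linear predictor
y = rng.binomial(1, 1 / (1 + np.exp(-lp)))     # binary health outcome Y

f_hat = LogisticRegression().fit(X, y)         # estimate f_hat(X)
y_hat = f_hat.predict_proba(X)[:, 1]           # one predicted risk per person
print(y_hat[:3])                               # e.g. risks between 0 and 1
```

Note that the output is a predicted probability per individual; thresholding it into a class is a separate (decision) step.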
8.
van Smeden et al., JCE, in press
9.
APGAR score: the oldest prediction model?
Apgar et al. JAMA, 1958
10.
Diagnostic prediction model
Wells et al., Lancet, 1997
11.
Prognostic prediction model
Conroy et al. EHJ, 2003.
13.
Complex model
Gulshan et al. JAMA, 2016
Diabetic retinopathy
Deep learning (= Neural network)
• 128,000 images
• Transfer learning (preinitialization)
• Sensitivity and specificity > .90
• Estimated from training data
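The last bullet is the pitfall: performance estimated from the training data is optimistic. A minimal simulation makes the point; the data are hypothetical and an unpruned decision tree stands in for any sufficiently flexible model:

```python
# Apparent (training-data) performance is optimistic: a flexible model
# evaluated on its own training data looks better than on new data.
# Simulated data; the unpruned tree is a stand-in for a complex model.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
y = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # only X0 is predictive

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
model = DecisionTreeClassifier().fit(X_tr, y_tr)  # no pruning: overfits

sens_apparent = recall_score(y_tr, model.predict(X_tr))  # on training data
sens_test = recall_score(y_te, model.predict(X_te))      # on new data
print(sens_apparent, sens_test)  # apparent sensitivity is perfect; test is not
```

The same gap arises for specificity, the c-statistic, and calibration: reporting only apparent performance overstates what the model will do in practice.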
14.
“Bedside” prediction models
15.
“Bedside” prediction models
Numerical example: 10-year risk of death due to cardiovascular disease
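A worked sketch of how such a bedside equation turns patient values into an individual risk, using the common survival-model form risk = 1 - S0 ** exp(lp). All coefficients and the baseline survival S0 below are made up for illustration; they are not those of any published CVD model:

```python
# Worked sketch of a "bedside" risk equation of the common form
# risk = 1 - S0 ** exp(linear_predictor). The coefficients and the
# baseline survival S0 are hypothetical, for illustration only.
import math

def ten_year_cvd_risk(age, sbp, smoker, s0=0.95):
    # centred, made-up coefficients (NOT from any published model)
    lp = 0.07 * (age - 60) + 0.01 * (sbp - 120) + 0.5 * smoker
    return 1 - s0 ** math.exp(lp)

risk = ten_year_cvd_risk(age=65, sbp=140, smoker=1)
print(f"{risk:.1%}")  # → 13.6% predicted 10-year risk for this individual
```

For a person at the reference values (lp = 0) the risk reduces to 1 - S0, the baseline risk; the linear predictor scales that baseline up or down.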
17.
Why bother?
“As of today, we have deployed the system in 16 hospitals, and it is
performing over 1,300 screenings per day”
MedRxiv pre-print only, 23 March 2020,
doi.org/10.1101/2020.03.19.20039354
20.
• Published on 7 April 2020
• 18 days between idea and article acceptance (sprint)
• Invited by BMJ as the first ever living review (marathon)
21.
Latest version (Update 3)
Wynants et al. BMJ, 2020
22.
Prediction models related to prognosis and diagnosis
3 main types of models
1. Patient X is infected / COVID-19 pneumonia
diagnostic
2. Patient X will die from COVID-19 / need respiratory support
prognostic
3. Currently healthy person X will get severe COVID-19
general population
Data extraction
• 43 researchers, duplicate reviews
• Extraction form based on CHARMS checklist & PROBAST (127 questions)
• Assessment of each prediction model separately
if more than one was developed/validated
UPDATE 3
• 232 models reviewed
• Peer reviewed articles
in Pubmed and Embase
• Pre-prints only until update 2
from bioRxiv, medRxiv, and arXiv
• Search: up to 1 July 2020
Some key numbers
Origin of data
• Single country: N = 174, 75% (of which from China: 56%)
• Multiple countries: 18%
• Unknown origin: 7%
Target setting
• Patients admitted to hospital: N = 119, 51%
• Patients at triage centre or fever clinic: 5%
• Patients in general practice: 1%
• Other/unclear: 42%
Some key numbers
Target population
• Confirmed COVID-19: N = 108, 47%
• Suspected COVID-19: 36%
• Other/unclear: 18%
Sample size
• Sample size development, median: 338 (IQR: 134 to 707)
• Number of events, median: 69 (IQR: 37 to 160)
• Sample size external validation, median: 189 (IQR: 76 to 312)
• Number of events, median: 40 (IQR: 24 to 122)
Models
• Logistic regression (including regularized): 35%
• Neural net (including deep learning): 33%
• Tree-based (including random forest): 7%
• Survival models (including Cox PH): 6%
• SVM: 4%
• Other/unclear: 15%
Reported performance – AUC range
• General population models: 0.71 to 0.99
• Diagnostic models: 0.65 to 0.99
• Diagnostic severity models: 0.80 to 0.99
• Diagnostic imaging models: 0.70 to 0.99
• Prognosis models: 0.54 to 0.99
prediction horizon varied from 1 to 37 days
Participants
• Inappropriate or unclear in/exclusion or study design
Predictors
• Scored “unknown” in imaging studies
Outcome
• Subjective or proxy outcomes
Analysis
• Small sample size
• Inappropriate or incomplete evaluation of performance
Conclusion update 3 living review
“…models are all at high or unclear risk of bias”
We do “not recommend any of the current prediction
models to be used in practice, but one diagnostic and
one prognostic model originated from higher quality
studies and should be (independently) validated in other
datasets”
https://twitter.com/EricTopol/status/1328086389024440321
Roberts et al. Nature ML, 2021
Landscape of clinical prediction models
• 37 models for treatment response in pulmonary TB (Peetluk, 2021)
• 35 models for in vitro fertilisation (Ratna, 2020)
• 34 models for stroke in type-2 diabetes (Chowdhury, 2019)
• 34 models for graft failure in kidney transplantation (Kabore, 2017)
• 31 models for length of stay in ICU (Verburg, 2016)
• 27 models for pediatric early warning systems (Trubey, 2019)
• 27 models for malaria prognosis (Njim, 2019)
• 26 models for postoperative outcomes colorectal cancer (Souwer, 2020)
• 26 models for childhood asthma (Kothalawa, 2020)
• 25 models for lung cancer risk (Gray, 2016)
• 25 models for re-admission after admitted for heart failure (Mahajan, 2018)
• 23 models for recovery after ischemic stroke (Jampathong, 2018)
• 23 models for delirium in older adults (Lindroth, 2018)
• 21 models for atrial fibrillation detection in community (Himmelreich, 2020)
• 19 models for survival after resectable pancreatic cancer (Stijker, 2019)
• 18 models for recurrence hep. carcinoma after liver transplantation (Al-Ameri, 2020)
• 18 models for future hypertension in children (Hamoen, 2018)
• 18 models for risk of falls after stroke (Walsh, 2016)
• 18 models for mortality in acute pancreatitis (Di, 2016)
• 17 models for bacterial meningitis (van Zeggeren, 2019)
• 17 models for cardiovascular disease in hypertensive population (Cai, 2020)
• 14 models for ICU delirium risk (Chen, 2020)
• 14 models for diabetic retinopathy progression (Haider, 2019)
• 408 models for COPD prognosis (Bellou, 2019)
• 363 models for cardiovascular disease general population (Damen, 2016)
• 263 prognosis models in obstetrics (Kleinrouweler, 2016)
• 258 models mortality after general trauma (Munter, 2017)
• 232 models related to COVID-19 (Wynants, 2020)
• 160 female-specific models for cardiovascular disease (Baart, 2019)
• 119 models for critical care prognosis in LMIC (Haniffa, 2018)
• 101 models for primary gastric cancer prognosis (Feng, 2019)
• 81 models for sudden cardiac arrest (Carrick, 2020)
• 74 models for contrast-induced acute kidney injury (Allen, 2017)
• 73 models for 28/30 day hospital readmission (Zhou, 2016)
• 68 models for preeclampsia (De Kat, 2019)
• 67 models for traumatic brain injury prognosis (Dijkland, 2019)
• 64 models for suicide / suicide attempt (Belsher, 2019)
• 61 models for dementia (Hou, 2019)
• 58 models for breast cancer prognosis (Phung, 2019)
• 52 models for pre‐eclampsia (Townsend, 2019)
• 52 models for colorectal cancer risk (Usher-Smith, 2016)
• 48 models for incident hypertension (Sun, 2017)
• 46 models for melanoma (Kaiser, 2020)
• 46 models for prognosis after carotid revascularisation (Volkers, 2017)
• 43 models for mortality in critically ill (Keuning, 2019)
• 42 models for kidney failure in chronic kidney disease (Ramspek, 2019)
• 40 models for incident heart failure (Sahle, 2017)
Original: https://xkcd.com/927/
[Parody of the xkcd “Standards” comic, with “standards” replaced by “prediction models”]
Agenda
• PART I (11:00 – 12:40)
• State of the medical prediction modeling art
• Just another prediction model
• Methods against overfitting
• Methods for deciding on appropriate sample size
• PART II (13:30 – 15:10)
• Model performance and validation
• Heterogeneity over time and place: there is no such thing as a validated
model
• Applied example: ADNEX model
• Future perspective: machine learning and AI
Diagnosis: “starting point of all medical actions”
"Usually doctors are right, but conservatively about 15
percent of all people are misdiagnosed.
Some experts think it's as high as 20 to 25 percent"
“Patient trust was essential in the healing process.
It could be won by a punctilious bedside manner,
by meticulous explanation,
and by mastery of prognosis, an art of demanding
experience, observation and logic”
Galen, 2nd century AD
PROGRESS framework
The PROGRESS (PROGnosis RESearch Strategy) framework classifies
prognosis research into four main types of study:
1. Overall prognosis research
2. Prognostic factor research
3. Prognostic model research
4. Predictors of treatment effect research
More details: https://www.prognosisresearch.com/progress-framework
Overall prognosis research
Adabag et al. JAMA, 2008
Prognostic factor research
Letellier et al. BJC, 2017
Predictors of treatment effect research
Bass et al. JCEM, 2010
PROGRESS framework
The PROGRESS (PROGnosis RESearch Strategy) framework classifies
prognosis research into four main types of study:
1. Overall prognosis research (diagnostic counterpart: prevalence studies)
2. Prognostic factor research (diagnostic counterpart: diagnostic test (accuracy) studies)
3. Prognostic model research (diagnostic counterpart: diagnostic model research)
4. Predictors of treatment effect research
More details: https://www.prognosisresearch.com/progress-framework
Focus of this lecture
The PROGRESS (PROGnosis RESearch Strategy) framework classifies
prognosis research into four main types of study:
1. Overall prognosis research (diagnostic counterpart: prevalence studies)
2. Prognostic factor research (diagnostic counterpart: diagnostic test (accuracy) studies)
3. Prognostic model research (diagnostic counterpart: diagnostic model research)
4. Predictors of treatment effect research
More details: https://www.prognosisresearch.com/progress-framework
The phases of prognostic/diagnostic model research
1. Development (including internal validation)
2. External validation(s) and updating
3. Software development and regulations
4. Impact assessment
5. Clinical implementation and scalability
• Most research focuses on development
• Few models are validated
• Even fewer models reach implementation
Picture courtesy: Laure Wynants
Not fit for purpose
• Wrong target population
• Expensive predictors
• Discrepancies between
development and use
• Complexity/transparency
No validation/impact
• Poor development
• Insufficient reporting
• Incentives
• If done, usually small
studies
Regulation/implementation
• MDR
• Soft- and hardware
• Model updating
• Quality control
Not adopted
• User time
• No prediction needed
Books (focus on development/validation)
Typical regression models
Binary logistic model
Pr(Y = 1) = expit(β0 + β1X1 + … + βPXP) = exp(Xβ) / [1 + exp(Xβ)]
Multinomial logistic model
Pr(Y = j) = exp(Xβj) / [1 + Σh=1…J−1 exp(Xβh)]
Cox model
h(X, t) = h0(t) exp(β1X1 + … + βPXP) = h0(t) exp(Xβ)
Remarks
• Xβ is the linear predictor
• Most predictive performance metrics (discrimination/calibration) do not directly generalize between outcome types
• Linearity/additivity assumptions can be relaxed
• Tree-based models (e.g. random forest), SVMs and neural networks are gaining ground
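The binary logistic model above is easy to apply once fitted; a minimal Python sketch (all coefficients and predictors are hypothetical, e.g. age and sex) of turning the linear predictor Xβ into a predicted risk:

```python
import math

def predict_risk(x, beta0, betas):
    """Predicted probability from a binary logistic model:
    Pr(Y = 1) = expit(beta0 + sum_i beta_i * x_i)."""
    lp = beta0 + sum(b * xi for b, xi in zip(betas, x))  # linear predictor X*beta
    return 1.0 / (1.0 + math.exp(-lp))                   # expit / inverse logit

# Hypothetical coefficients for illustration only (age in years, sex 0/1)
risk = predict_risk(x=[60, 1], beta0=-7.0, betas=[0.08, 0.7])
```

The same linear predictor reappears in the multinomial and Cox models; only the link from Xβ to the predicted quantity changes.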
Model users often unaware of models and equations
http://www.tbi-impact.org/?p=impact/calc
Reporting
Baseline hazard?
Example: Systematic review by Bellou et al, 2019
Prognostic models for COPD: 228 articles
(408 prognostic models, 38 external validations)
Not reported:
• 12%: modelling method (e.g. type of regression model)
• 24%: evaluation of discrimination (e.g. area under the ROC curve)
• 64%: missing data handling (e.g. multiple imputation)
• 70%: full model (e.g. no regression coefficients, intercept / baseline hazard)
• 78%: evaluation of calibration (e.g. calibration plot)
Bellou et al. BMJ, 2019
Reporting guidelines & risk of bias tools
• TRIPOD: reporting development/validation prediction models
• PROBAST: risk of bias development/validation prediction models
• STARD: reporting diagnostic accuracy studies
• QUADAS-2: risk of bias diagnostic accuracy studies
• REMARK: reporting (tumour marker) prognostic factor studies
• QUIPS: risk of bias in prognostic factor studies
• Currently in development: TRIPOD-AI, TRIPOD-cluster, PROBAST-AI,
STARD-AI, QUADAS-AI, DECIDE-AI etc.
More info: equator-network.org
Dichotomania
“Dichotomania is an obsessive compulsive disorder to which medical
advisors in particular are prone… Show a medical advisor some continuous
measurements and he or she immediately wonders: ‘Hmm, how can I make
these clinically meaningful? Where can I cut them in two? What ludicrous
side conditions can I impose on this?’”
Stephen Senn
It’s all in the title
Source: Gary Collins, https://bit.ly/30Kr3W2
Unnecessary dichotomizations of predictors
• Biologically implausible step functions in predicted risk
• Source of overfitting when cut-off is chosen based on maximizing
predictive performance
• Loss of information (e.g. Ensor et al. show dichotomizing BMI was
equivalent to throwing away 1/3 of the data)
Dichotomization/categorization remains very prevalent
• Wynants et al. 2020 (COVID prediction models): 48%
• Collins et al. 2011 (Diabetes T2 prediction models): 63%
• Mallett et al. 2010 (Prognostic models in cancer): 70%
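The information loss from dichotomization can be illustrated directly; a small numpy sketch (a generic normally distributed predictor, not the Ensor et al. BMI data) showing that a median split retains only about 2/π ≈ 64% of the linear information, i.e. roughly a third is thrown away:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200_000)           # continuous predictor
x_dich = (x > np.median(x)).astype(float)  # median split ("dichotomania")

# Squared correlation between the original predictor and its dichotomized
# version: the fraction of linear information the split retains.
# For a normal predictor this is 2/pi, roughly 0.64.
r2 = np.corrcoef(x, x_dich)[0, 1] ** 2
```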
Unnecessary dichotomizations of predictions
Predictimands
Van Geloven et al. EJE, 2020
Predictimands
Van Geloven et al. EJE, 2020
Agenda
• PART I (11:00 – 12:40)
• State of the medical prediction modeling art
• Just another prediction model
• Methods against overfitting
• Methods for deciding on appropriate sample size
• PART II (13:30 – 15:10)
• Model performance and validation
• Heterogeneity over time and place: there is no such thing as a validated
model
• Applied example: ADNEX model
• Future perspective: machine learning and AI
Classical consequence of overfitting
Bell et al. BMJ, 2015
Shortest routes to overfitting
Steyerberg et al. JCE, 2017
Shortest routes to overfitting
Steyerberg et al. JCE, 2017
Shortest routes to overfitting
Steyerberg et al. JCE, 2017
Routes to preventing overfitting
• Sample size: gather data on a sufficient* number of individuals
• Use it all on model development (avoid training-test splitting)
• Careful candidate predictors preselection
• Especially if data are small
• Avoid and be aware of the winner’s curse
• Data driven variable selection, model selection
• Avoid unnecessary dichotomizations
• Regression shrinkage / penalization / regularization
* The topic of the 4th part of this talk
Uniform shrinkage factor
A shrinkage factor < 1 indicates overfitting; lower values mean more overfitting
Formula from Riley et al.
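A sketch of one common heuristic for the uniform shrinkage factor, the van Houwelingen–Le Cessie estimate S = (LR − p)/LR (see the reference list); the likelihood-ratio values below are illustrative only:

```python
def heuristic_shrinkage(lr_chi2, p):
    """Van Houwelingen-Le Cessie heuristic uniform shrinkage factor:
    S = (LR - p) / LR, with LR the model likelihood-ratio chi-square and
    p the number of predictor parameters. S < 1 signals overfitting."""
    return (lr_chi2 - p) / lr_chi2

# Illustrative numbers: strong signal vs. weak signal with many parameters
s_ok = heuristic_shrinkage(lr_chi2=200.0, p=10)   # 0.95: little overfitting
s_bad = heuristic_shrinkage(lr_chi2=30.0, p=15)   # 0.50: severe overfitting
```

Multiplying the estimated coefficients by S is the simplest form of post-estimation shrinkage.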
Shrinkage/tuning/penalization hard to estimate
Riley et al. JCE, 2021, DOI: 10.1016/j.jclinepi.2020.12.005
Van Houwelingen, Statistica Neerlandica, 2001
“shrinkage works on the average but may fail in the particular unique
problem on which the statistician is working.”
Shrinkage works on the average
Simulation results, averaged over 4k scenarios and 20M datasets
van Smeden et al. SMMR, 2019
[Figure: calibration slope vs EPV, for ridge, lasso and maximum likelihood]
Shrinkage may not work on a particular dataset
Van Calster et al. SMMR, 2020
RMSD log(slope): root mean squared distance of log calibration slope to the ideal slope (value = log(1))
Maximum likelihood, Ridge, Lasso
“We conclude that, despite improved performance
on average, shrinkage often worked poorly in
individual datasets, in particular when it was most
needed. The results imply that shrinkage methods
do not solve problems associated with small
sample size or low number of events per variable.”
What about the variance in estimated risk?
[Figure: root mean squared prediction error vs EPV, for ridge, lasso and maximum likelihood]
van Smeden et al. SMMR, 2019
Sources of prediction error
Y = f(x) + ε
(with E(ε) = 0, var(ε) = σ², values of x fixed)
For a model k, the expected test prediction error is:
σ² + bias²(f̂k(x)) + var(f̂k(x))
= irreducible error (≈ what we don’t model) + mean squared prediction error (≈ how we model)
See equation 2.46 in Hastie et al., The Elements of Statistical Learning, https://stanford.io/2voWjra
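The decomposition can be checked numerically; a simulation sketch (quadratic truth, deliberately misspecified linear model; all settings are illustrative) in which the average squared test error at a point matches σ² + bias² + variance:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x ** 2               # true regression function
sigma = 0.5                        # sd of the irreducible noise
x_tr = np.linspace(-1.0, 1.0, 20)  # fixed design points
x0 = 0.8                           # test point

preds, errs = [], []
for _ in range(5000):
    y_tr = f(x_tr) + rng.normal(0.0, sigma, x_tr.size)
    b1, b0 = np.polyfit(x_tr, y_tr, 1)   # deliberately misspecified linear fit
    yhat0 = b0 + b1 * x0
    preds.append(yhat0)
    # squared error on an independent test draw at x0
    errs.append((f(x0) + rng.normal(0.0, sigma) - yhat0) ** 2)

preds = np.array(preds)
bias2 = (preds.mean() - f(x0)) ** 2  # squared bias of the model at x0
var = preds.var()                    # variance of the model at x0
# mean(errs) is close to sigma**2 + bias2 + var
```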
Agenda
• PART I (11:00 – 12:40)
• State of the medical prediction modeling art
• Just another prediction model
• Methods against overfitting
• Methods for deciding on appropriate sample size
• PART II (13:30 – 15:10)
• Model performance and validation
• Heterogeneity over time and place: there is no such thing as a validated
model
• Applied example: ADNEX model
• Future perspective: machine learning and AI
You get what you pay for
Sample size considerations
• Sample size matters when designing a study of any kind
• When the design of a prediction study precedes data collection (prospective)
• How many units/individuals/events do I need to collect data on?
• When designing a prediction study on existing data (retrospective)
• Is my dataset large enough to build a model?
• How much model complexity (e.g. number predictors considered) can I
afford?
The origin of the 1 in 10 rule
“For EPV values of 10 or greater, no major problems occurred. For EPV
values less than 10, however, the regression coefficients were biased in
both positive and negative directions”
Peduzzi et al. JCE, 1996
Peduzzi et al. JCE, 1996
?
More simulation studies
Citations based on Google Scholar, Oct 30 2020
citations: 5,736
“a minimum of 10 EPV […] may be too conservative”
“substantial problems even if the number of EPV exceeds 10”
For EPV values of 10 or greater, no major problems
citations: 2,438
citations: 216
!?!
Log(odds) is consistent but finite sample biased
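The finite-sample bias of the log-odds can be seen even in the simplest, intercept-only case; an exact enumeration sketch (n = 25 and p = 0.2 are illustrative values) showing that logit(p̂) is biased away from zero in small samples and that the bias shrinks as n grows:

```python
import math

def logit_bias(n=25, p=0.2):
    """Exact finite-sample bias of the log-odds estimator logit(k/n) in the
    intercept-only case, conditioning on non-degenerate samples
    (0 < k < n, where the estimate is finite)."""
    total, mean = 0.0, 0.0
    for k in range(1, n):
        pk = math.comb(n, k) * p ** k * (1 - p) ** (n - k)  # binomial pmf
        total += pk
        mean += pk * math.log(k / (n - k))
    return mean / total - math.log(p / (1 - p))  # estimate minus truth

bias = logit_bias()  # negative: the estimate is pushed away from zero
```

Consistency shows up as the bias vanishing with growing n, which is the point of the slide title: consistent, yet biased in small samples.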
Schaefer, Stat Med, 1983
• Examine the reasons for substantial differences between the earlier EPV
simulation studies (simulation technicality: handling of “separation”)
• Evaluate a possible solution to reduce the finite sample bias
van Smeden et al. BMC MRM, 2016
• Firth’s ”correction” aims to reduce finite sample bias in maximum
likelihood estimates, applicable to logistic regression
• It makes clever use of the Jeffreys prior (from the Bayesian literature) to penalize the log-likelihood, which shrinks the estimated coefficients
• Nice theoretical justifications
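A minimal numpy sketch of Firth's modified scoring iteration (the toy data and the step damping are illustrative choices, not from the slides); note how it yields finite coefficients on completely separated data, where ordinary maximum likelihood diverges:

```python
import numpy as np

def firth_logistic(X, y, n_iter=100):
    """Sketch of Firth's bias-reduced logistic regression: maximizes the
    log-likelihood penalized by 0.5*log|X'WX| (the Jeffreys prior), which
    shrinks coefficients and keeps them finite even under separation."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))
        W = pi * (1.0 - pi)
        XtWX = X.T @ (W[:, None] * X)
        # Diagonal of the hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = W * np.einsum("ij,jk,ik->i", X, np.linalg.inv(XtWX), X)
        # Firth-modified score; the ordinary score would be X'(y - pi)
        score = X.T @ (y - pi + h * (0.5 - pi))
        step = np.linalg.solve(XtWX, score)
        if np.max(np.abs(step)) > 1.0:  # damp large Newton steps
            step /= np.max(np.abs(step))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

# Completely separated toy data: ordinary maximum likelihood would diverge,
# Firth's correction returns finite, shrunken coefficients.
X = np.column_stack([np.ones(8), [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
beta = firth_logistic(X, y)
```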
[Figure: standard maximum likelihood vs Firth’s correction; averaged over 465 simulation conditions with 10,000 replications each]
Firth’s correction and predictive performance
Van Calster, SMMR, 2020, DOI: 10.1177/0962280220921415
RMSD log(slope): root mean squared distance of log calibration slope to the ideal slope (value = log(1))
Maximum likelihood, Firth’s correction
Beyond events per variable
• Sample size has a big influence on the performance of prediction models
• Giving appropriate weights to predictors is not a simple task
• Challenge is to avoid overfitting and lack of precision in the predictions
• Requires adequate sample size
• How much data is sufficient? It depends on:
• Model complexity (e.g. # predictors, EPV)
• Signal:noise ratio (e.g. R2 or C-index)
• Performance required (e.g. high vs low stake medical decisions)
Riley et al., BMJ, 2020
Our proposal
• Calculate sample size that is needed to
• minimise potential overfitting
• estimate probability (risk) precisely
• Sample size formulas for
• Continuous outcomes
• Time-to-event outcomes
• Binary outcomes (focus today)
Riley et al., BMJ, 2020
Example
Gupta et al. Lancet Resp Med, 2020
Example
• COVID-19 prognosis hospitalized
patients
• Composite outcome: “deterioration”
(in-hospital death, ventilator support,
ICU)
A priori expectations
• Event fraction at least 30%
• 40 candidate predictor parameters
• C-statistic of 0.71 (conservative estimate)
-> Cox-Snell R² of 0.24
Gupta et al. Lancet Resp Med, 2020
Restricted cubic splines with 4 knots: 3 degrees of freedom
Note: the EPV rule should also count the degrees of freedom of candidate predictors, not the number of variables!
Gupta et al. Lancet Resp Med, 2020
Calculate required sample size
Criterion 1. Shrinkage: expected heuristic shrinkage factor, S ≥ 0.9
(calibration slope, target < 10% overfitting)
Criterion 2. Optimism: Cox-Snell R2 apparent - Cox-Snell R2 validation < 0.05
(overfitting)
Criterion 3: A small margin of error in overall risk estimate < 0.05 absolute error
(precision estimated baseline risk)
(Criterion 4: a small margin of absolute error in the estimated risks)
Calculation
R code:
> require(pmsampsize)
> pmsampsize(type="b",rsquared=0.24,parameters=40,prevalence=0.3)
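pmsampsize is an R package; for illustration, the shrinkage criterion (criterion 1) can be recomputed in Python from the closed-form formula of Riley et al. (BMJ, 2020). Note that pmsampsize reports the maximum over all criteria; for this example, criterion 1 is the binding one:

```python
import math

def n_for_shrinkage(p, r2_cs, S=0.9):
    """Minimum n so the expected uniform shrinkage factor is at least S
    (criterion 1 of Riley et al., BMJ 2020):
    n = p / ((S - 1) * ln(1 - R2_CS / S))."""
    return math.ceil(p / ((S - 1) * math.log(1 - r2_cs / S)))

n = n_for_shrinkage(p=40, r2_cs=0.24)  # COVID-19 example from the slides
epv = (0.3 * n) / 40                   # events per parameter at 30% events
# n = 1290, giving roughly EPV 9.7, matching the first alternative scenario
```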
A few alternative scenarios
• rsquared=0.24,parameters=40,prevalence=0.3 -> EPV≥9.7
• rsquared=0.12,parameters=40,prevalence=0.3 -> EPV≥21.0
• rsquared=0.12,parameters=40,prevalence=0.5 -> EPV≥35.0
• rsquared=0.36,parameters=40,prevalence=0.2 -> EPV≥5
The sample size that meets all criteria is the MINIMUM required
• Why minimum? Other considerations may increase it, e.g. missing data, clustering, variable selection
Future directions
• More evidence/guidance to choose values for the sample size criteria
• Sample size for validation (for continuous outcome just published)
• Sample size for different outcomes (e.g. multi-category)
• Sample size taking into account variable selection
• Simulation based approaches
• High-dimensional data?
Steyerberg, 2019
Work in collaboration with:
• Carl Moons
• Hans Reitsma
• Ben Van Calster (Leuven)
• Laure Wynants (Maastricht)
• Richard Riley (Keele, materials for this presentation)
• Gary Collins (Oxford, materials for this presentation)
• Ewout Steyerberg (Leiden)
• Many others
Contact: M.vanSmeden@umcutrecht.nl
Reference list
Adabag et al. JAMA, 2008. doi: 10.1001/jama.2008.553
Altman & Royston, Stat Med, 2000, doi: 10.1002/(SICI)1097-0258(20000229)19:4<453::AID-SIM350>3.0.CO;2-5
Apgar et al. JAMA, 1958. doi: 10.1001/jama.1958.03000150027007
Bass et al., JCEM, 2010, doi:10.1210/jc.2010-0947
Bell et al. BMJ, 2015, doi: 10.1136/bmj.h5639
Bellou et al. BMJ, 2019, doi: 10.1136/bmj.l5358
Conroy et al. EHJ, 2003. doi: 10.1016/S0195-668X(03)00114-3
Collins et al. BMC Med, 2011, doi: 10.1186/1741-7015-9-103
Courvoisier et al. JCE, 2011, doi: 10.1016/j.jclinepi.2010.11.012
Gulshan et al. JAMA, 2016, doi: 10.1001/jama.2016.17216
Gupta et al. ERJ, 2020, doi: 10.1183/13993003.03498-2020
Gupta et al. Lancet Resp Med, 2020, doi: 10.1016/S2213-2600(20)30559-2
Letellier et al. BJC, 2017. doi: 10.1038/bjc.2017.352
Mallett et al. 2010, doi: 10.1186/1741-7015-8-21
Peduzzi et al. JCE, 1996, doi: 10.1016/s0895-4356(96)00236-3
Roberts et al. Nature ML, 2021, doi: 10.1038/s42256-021-00307-0
Steyerberg et al, JCE, 2017, doi: 10.1016/j.jclinepi.2017.11.013
Steyerberg, 2019, doi: 10.1007/978-3-030-16399-0
TRIPOD statement, 2015: Collins et al., Ann Intern Med, doi: 10.7326/M14-0697
Van Calster et al. SMMR, 2020, doi: 10.1177/0962280220921415
van Houwelingen, Statistica Neerlandica, 2001
van Houwelingen & Le Cessie, Stat Med, 1990, doi: 10.1002/sim.4780091109
van Geloven et al. EJE, 2020, doi: 10.1007/s10654-020-00636-1
van Smeden et al. Clinical prediction models: diagnosis versus prognosis, JCE, in press
van Smeden et al., SMMR, 2019, doi: 10.1177/0962280218784726
van Smeden et al. BMC MRM, 2016, doi: 10.1186/s12874-016-0267-3
Vittinghoff & McCulloch. AJE, 2007, doi: 10.1093/aje/kwk052
Riley et al. Stat Med, 2019, doi: 10.1002/sim.7992
Riley et al., BMJ, 2020, doi: 10.1136/bmj.m441
Riley et al. JCE, 2021, doi: 10.1016/j.jclinepi.2020.12.005
Sauerbrei & Royston. JRSA-A, 1999, doi: 10.1111/1467-985x.00122
Schaefer. Stat Med, 1983, doi: 10.1002/sim.4780020108
Wells et al. Lancet, 1997, doi: 10.1016/S0140-6736(97)08140-3
Wynants et al. 2020, BMJ, doi: 10.1136/bmj.m1328