Keynote at Norwegian Epidemiological Association conference, October 26 2022. Discussing absence of evidence fallacy, Table 2 fallacy, Winner's curse and Stein's paradox.
Development and evaluation of prediction models: pitfalls and solutionsMaarten van Smeden
Slides for the statistics in practice session for the Biometrisches Kolloqium (organized by the Deutsche Region der Internationalen Biometrischen Gesellschaft), 18 March 2021
Development and evaluation of prediction models: pitfalls and solutionsMaarten van Smeden
Slides for the statistics in practice session for the Biometrisches Kolloqium (organized by the Deutsche Region der Internationalen Biometrischen Gesellschaft), 18 March 2021
Clinical trials are about comparability not generalisability V2.pptxStephenSenn2
Lecture delivered at the September 2022 EFSPI meeting in Basle in which I argued that the patients in a clinical trial should not be viewed as being a representative sample of some target population.
Developing and validating statistical models for clinical prediction and prog...Evangelos Kritsotakis
Talk on clinical prediction models presented at the Joint Seminar Series in Translational and Clinical Medicine organised by the University of Crete Medical School, the Institute of Molecular Biology and Biotechnology of the Foundation for Research and Technology Hellas (IMBB-FORTH), and the University of Crete Research Center (UCRC), Heraklion [online], Greece, April 7, 2021.
Whatever happened to design based inferenceStephenSenn2
Given as the Sprott lecture, University of Waterloo September 2022
Abstract
What exactly should we think about appropriate analyses for designed experiments and why? If conditional inference trumps marginal inference, why should we care about randomisation? Isn’t everything just modelling? The Rothamsted School held that design matters. Taking an example of applying John Nelder’s general balance approach to a notorious problem, Lord’s paradox, I shall show that there may be some lessons for two fashionable topics: causal analysis and big data. I shall conclude that if we want not only to make good estimates but estimate how good our estimates are, design does matter.
How to establish and evaluate clinical prediction models - StatsworkStats Statswork
A clinical prediction model can be used in various clinical contexts, including screening for asymptomatic illness, forecasting future events such as disease, and assisting doctors in their decision-making and health education. Despite the positive effects of clinical prediction models on practice, prediction modelling is a difficult process that necessitates meticulous statistical analysis and sound clinical judgments. Statswork offers statistical services as per the requirements of the customers. When you Order statistical Services at Statswork, we promise you the following always on Time, outstanding customer support, and High-quality Subject Matter Experts.
Read More With Us: https://bit.ly/3dxn32c
Why Statswork?
Plagiarism Free | Unlimited Support | Prompt Turnaround Times | Subject Matter Expertise | Experienced Bio-statisticians & Statisticians | Statistics across Methodologies | Wide Range of Tools & Technologies Supports | Tutoring Services | 24/7 Email Support | Recommended by Universities
Contact Us:
Website: www.statswork.com
Email: info@statswork.com
United Kingdom: 44-1143520021
India: 91-4448137070
WhatsApp: 91-8754446690
Improving predictions: Lasso, Ridge and Stein's paradoxMaarten van Smeden
Slides of masterclass "Improving predictions: Lasso, Ridge and Stein's paradox" at the (Dutch) National Institute for Public Health and the Environment (RIVM)
Clinical trials are about comparability not generalisability V2.pptxStephenSenn2
Lecture delivered at the September 2022 EFSPI meeting in Basle in which I argued that the patients in a clinical trial should not be viewed as being a representative sample of some target population.
Developing and validating statistical models for clinical prediction and prog...Evangelos Kritsotakis
Talk on clinical prediction models presented at the Joint Seminar Series in Translational and Clinical Medicine organised by the University of Crete Medical School, the Institute of Molecular Biology and Biotechnology of the Foundation for Research and Technology Hellas (IMBB-FORTH), and the University of Crete Research Center (UCRC), Heraklion [online], Greece, April 7, 2021.
Whatever happened to design based inferenceStephenSenn2
Given as the Sprott lecture, University of Waterloo September 2022
Abstract
What exactly should we think about appropriate analyses for designed experiments and why? If conditional inference trumps marginal inference, why should we care about randomisation? Isn’t everything just modelling? The Rothamsted School held that design matters. Taking an example of applying John Nelder’s general balance approach to a notorious problem, Lord’s paradox, I shall show that there may be some lessons for two fashionable topics: causal analysis and big data. I shall conclude that if we want not only to make good estimates but estimate how good our estimates are, design does matter.
How to establish and evaluate clinical prediction models - StatsworkStats Statswork
A clinical prediction model can be used in various clinical contexts, including screening for asymptomatic illness, forecasting future events such as disease, and assisting doctors in their decision-making and health education. Despite the positive effects of clinical prediction models on practice, prediction modelling is a difficult process that necessitates meticulous statistical analysis and sound clinical judgments. Statswork offers statistical services as per the requirements of the customers. When you Order statistical Services at Statswork, we promise you the following always on Time, outstanding customer support, and High-quality Subject Matter Experts.
Read More With Us: https://bit.ly/3dxn32c
Why Statswork?
Plagiarism Free | Unlimited Support | Prompt Turnaround Times | Subject Matter Expertise | Experienced Bio-statisticians & Statisticians | Statistics across Methodologies | Wide Range of Tools & Technologies Supports | Tutoring Services | 24/7 Email Support | Recommended by Universities
Contact Us:
Website: www.statswork.com
Email: info@statswork.com
United Kingdom: 44-1143520021
India: 91-4448137070
WhatsApp: 91-8754446690
Improving predictions: Lasso, Ridge and Stein's paradoxMaarten van Smeden
Slides of masterclass "Improving predictions: Lasso, Ridge and Stein's paradox" at the (Dutch) National Institute for Public Health and the Environment (RIVM)
The absence of a gold standard: a measurement error problemMaarten van Smeden
Talk about gold standard problems and solutions in medicine and epidemiology. Invited by the department of infectious disease epidemiology, University Medical Center Utrecht
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes
on Io’s surface have been monitored from both spacecraft and ground-based telescopes.
Here, we present the highest spatial resolution images of Io ever obtained from a groundbased telescope. These images, acquired by the SHARK-VIS instrument on the Large
Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images
show that a plume deposit from a powerful eruption at Pillan Patera has covered part
of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive
optics at visible wavelengths.
ISI 2024: Application Form (Extended), Exam Date (Out), EligibilitySciAstra
The Indian Statistical Institute (ISI) has extended its application deadline for 2024 admissions to April 2. Known for its excellence in statistics and related fields, ISI offers a range of programs from Bachelor's to Junior Research Fellowships. The admission test is scheduled for May 12, 2024. Eligibility varies by program, generally requiring a background in Mathematics and English for undergraduate courses and specific degrees for postgraduate and research positions. Application fees are ₹1500 for male general category applicants and ₹1000 for females. Applications are open to Indian and OCI candidates.
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills MN
Travis Hills of Minnesota developed a method to convert waste into high-value dry fertilizer, significantly enriching soil quality. By providing farmers with a valuable resource derived from waste, Travis Hills helps enhance farm profitability while promoting environmental stewardship. Travis Hills' sustainable practices lead to cost savings and increased revenue for farmers by improving resource efficiency and reducing waste.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Exposé invité Journées Nationales du GDR GPL 2024
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market—which includes goods like functional meals, drinks, and dietary supplements that provide health advantages beyond basic nutrition—is growing significantly. As healthcare expenses rise, the population ages, and people want natural and preventative health solutions more and more, this industry is increasing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and provide significant chances for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxRASHMI M G
Abnormal or anomalous secondary growth in plants. It defines secondary growth as an increase in plant girth due to vascular cambium or cork cambium. Anomalous secondary growth does not follow the normal pattern of a single vascular cambium producing xylem internally and phloem externally.
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxMAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation makes them the most convenient, least labor-intensive, live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poorquality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larva. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
20240520 Planning a Circuit Simulator in JavaScript.pptx
Improving epidemiological research: avoiding the statistical paradoxes and fallacies that nobody talks about
1. Maarten van Smeden, PhD
28th Norwegian Epidemiological Association
(NOFE) conference
October 26, 2022
Improving epidemiological research:
avoiding the statistical paradoxes and
fallacies that nobody else talks about
2. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Slides available at:
https://www.slideshare.net/MaartenvanSmeden
I have no conflicts of interest to declare
3. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Index
Part I:
• Absence of evidence fallacy
• Table 2 fallacy
• Winner’s curse
• Stein’s paradox
Part II:
• Good statistical practices
• Avoiding statistical fallacies and paradoxes
6. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Cartoon of Jim Borgman, published by the Cincinnati Inquirer and King Features Syndicate April 27 1997
7. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Cookbook review
“We selected 50 common ingredients from random
recipes of a cookbook”
21. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Hypothetical example
Tea
Coffee
22. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Hypothetical example
Tea
Coffee
“No health consequences for Coffee”
“Negative health consequences for Tea”
23. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Hypothetical example
Tea
Coffee
“No health consequences for Coffee”
“No effect of Coffee”
“Coffee is healthy”
“Coffee is better for health than tea”
“Negative health consequences for Tea”
“Tea is bad for health”
“Tea kills”
26. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Observational (non-randomized) study
A
L
Y
exposure outcome
confounder
27. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Observational (non-randomized) study
A
L
Y
Diet Diabetes
confounder
28. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Diet -> diabetes, age a counfounder?
Numerical example adapted from Peter Tennant with permission
Age No diabetes Diabetes No diabetes Diabetes RR
< 50 years 19 1 37 3 1.50
≥ 50 years 28 12 12 8 1.33
Total 47 13 49 11 0.88
Traditional Exotic diet
50%
40%
30%
20%
10%
≥ 50 years
> 50 years
Total
Diabetes
risk
< 50 years
29. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Diet -> diabetes, weight loss a confounder?
Weight No diabetes Diabetes No diabetes Diabetes RR
Lost 19 1 37 3 1.50
Gained 28 12 12 8 1.33
Total 47 13 49 11 0.88
Traditional Exotic diet
50%
40%
30%
20%
10%
Gained wt
Lost wt
Total
Diabetes
risk
< 50 years
30. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Beware of the colliders
A
L
Y
Diet Diabetes
Weight loss
31. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Careful selection of confounders, e.g. through DAGs
From Tennant et al, IJE, 2021, doi: 10.1093/ije/dyaa213
32. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Example of multivariable model table
33. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Table 2 fallacy
Common approach:
• Fit a multivariable regression model using all “risk factors” of
interest
• Presents estimates of regression coefficients as mutually
adjusted for each other
34. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Example of Table 2 fallacy
35. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Example of Table 2 fallacy
39. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Distribution of 1.1m z-values in medical literature
Source: https://agbarnett.github.io/talks/TRI/slides#13
41. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Myth 1: The number of variables in a model should be reduced
until there are 10 events per variables.
Myth 2: Only variables with proven univariable-model significance
should be included in a model.
Myth 3: Insignificant effects should be eliminated from a model.
Myth 4: The reported P-value quantifies the type I error of a
variable being falsely selected.
Myth 5: Variable selection simplifies analysis.
42. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Myth 1: The number of variables in a model should be reduced
until there are 10 events per variables.
Myth 2: Only variables with proven univariable-model significance
should be included in a model.
Myth 3: Insignificant effects should be eliminated from a model.
Myth 4: The reported P-value quantifies the type I error of a
variable being falsely selected.
Myth 5: Variable selection simplifies analysis.
43. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Variable selection can be very instable in low N
44. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Illustrative example
45. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Illustrative example
46. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Illustrative example
47. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Illustrative example
48. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Winner’s curse
49. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Variable selection often makes things worse
Investigated 3960 scenario’s, backward elimination made
exposure effect estimation worse 97% of the time
(in remaining 3% improvements were neglegible)
50. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Variable selection…
• Is often instable (e.g. small N or high collinearity)
• Can create testimation bias
• Can invalidate inferential statistics: default p-values and
confidence intervals not valid (post-selection inference literature)
• Can be a source of model overfitting
52. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
1955: Stein’s paradox
53. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Stein’s paradox in words (rather simplified)
When one has three or more units (say, individuals), and
for each unit one can calculate an average score (say,
average blood pressure), then the best guess of future
observations for each unit (say, blood pressure tomorrow)
is NOT the average score.
54. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
1961: James-Stein estimator: the next Berkley Symposium
• James and Stein. Estimation with quadratic loss. Proceedings of the fourth Berkeley
symposium on mathematical statistics and probability. Vol. 1. 1961.
55. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
1977: Baseball example
Squared error reduced from .077 to .022
56. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Stein’s paradox
• Probably among the most surprising (and initially
doubted) phenomena in statistics
• Now a large “family”: shrinkage estimators reduce
prediction variance to an extent that typically outweighs
the bias that is introduced
• Bias/variance trade-off principle has motivated many
statistical and machine learning developments
Expected prediction error = irreducible error + bias2 + variance
67. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Simulation: 100 times
68. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Not just lucky
• 5% reduction in MSPE just by shrinkage estimator
• Van Houwelingen and le Cessie’s heuristic shrinkage factor
69. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Shrinkage
Post-estimation shrinkage factor estimation
• Van Houwelingen & Le Cessie, 1990: uniform shrinkage factor
• Sauerbrei 1999: parameterwise shrinkage factors
Regularized regression (shrinkage during estimation)
• Ridge regression: L2-penalty on regression coefficient
• Lasso: L1 penalty
• Elastic net: L1 and L2 penalty
71. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Consequences of shrinkage
• Can improve the accuracy of predictions on average1
• Can reduce (part of) the detrimental effects of overfitting
• In specific situations (e.g. Lasso) it can be used for automated
variable selection at reduced risk of winner’s curse
• No free lunch principle: shrinkage often introduces (by design) a
negative bias in regression coefficients
• Exception: Firth’s correction, e.g. see:
1On average is important here, see:
76. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
To explain or to predict?
Explanatory models
• Theory: interest in regression coefficients
• Testing and comparing existing causal theories
• e.g. aetiology of illness, effect of treatment
Prediction models
• Interest in (risk) predictions of future observations
• Causality not a primary concern
• Concerns about overfitting and optimism
• e.g. prognostic or diagnostic prediction model
Descriptive models
• Capture the data structure
77. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
To explain or to predict?
Explanatory models
• Theory: interest in regression coefficients
• Testing and comparing existing causal theories
• e.g. aetiology of illness, effect of treatment
Prediction models
• Interest in (risk) predictions of future observations
• Causality not a primary concern
• Concerns about overfitting and optimism
• e.g. prognostic or diagnostic prediction model
Descriptive models
• Capture the data structure
78. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
To explain or to predict?
Explanatory models
• Absence of absence fallacy: e.g. non-significant effect of
exposure interpreted as ”not working” (tx) or “not bad for health”
• Table 2 fallacy: e.g. regression coefficients of confounding
variables interpreted as themselves “adjusted” for confounding
• Winner’s curse: e.g. selected factors on average too extreme
values for the regression coefficients (i.e. biased)
• Stein’s paradox: shrinkage may lead to a bias that may not be
beneficial for inference
(but not always, see1)
1https://bit.ly/3VZQFJV
79. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
To explain or to predict?
Prediction models
• Absence of absence fallacy: e.g. non-significant result on
measure for miscalibration misinterpreted as good calibration
• Table 2 fallacy: e.g. predictors misinterpreted as causal effects
• Winner’s curse: e.g. final model with selected predictors results
in overfitting
• Stein’s paradox: shrinkage may improve predictions
(but not always, see1)
1https://bit.ly/3spjPo5
80. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Explanatory vs prediction models
81. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Specific guidance on conduct
82. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Specific guidance on reporting and risk of bias
83. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Many other fallacies and paradoxes to consider
• Ecological fallacy
• Lord’s paradox
• Simpson’s paradox
• Berkson’s paradox
• Prosecutors fallacy
• Gambler’s fallacy
• Lindley’s paradox
• Low birthweight paradox
• Noisy data fallacy
• Will Rogers phenomenon
• …
More info: https://bit.ly/3TylBz7
84. Maarten van Smeden, PhD
28th Norwegian Epidemiological Association
(NOFE) conference
October 26, 2022
Improving epidemiological research:
avoiding the statistical paradoxes and
fallacies that nobody else talks about
85. Maarten van Smeden, PhD
28th Norwegian Epidemiological Association
(NOFE) conference
October 26, 2022
Improving epidemiological research:
avoiding the statistical paradoxes and
fallacies that nobody else talks about
everyone should talk about and are well
described in literature
86. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
A gentle (1000 words) introduction
87. Tromsø, Oct 26, 2022 Twitter: @MaartenvSmeden
Email: M.vanSmeden@umcutrecht.nl
Twitter: @MaartenvSmeden
Slides available at:
https://www.slideshare.net/MaartenvanSmeden