Minimisation is an approach to allocating patients to treatment in clinical trials that forces a greater degree of balance than does randomisation. Here I explain why I dislike it.
Clinical trials are about comparability not generalisability (Stephen Senn)
It is a fundamental but common mistake to regard clinical trials as a form of representative inference. The key issue is comparability: experiments do not involve typical material. In clinical trials it is concurrent control that is key, and randomisation is a device for calculating standard errors in a way that appropriately reflects the design.
Generalisation beyond the clinical trial always involves theory.
Lecture delivered at the September 2022 EFSPI meeting in Basle in which I argued that the patients in a clinical trial should not be viewed as being a representative sample of some target population.
How to combine results from randomised clinical trials on the additive scale with real world data to provide predictions on the clinically relevant scale for individual patients
The importance of measurements of uncertainty is explained and illustrated with the help of a famous experiment in nutrition from 1930 and a complex cross-over trial from the 1990s.
There are many questions one might ask of a clinical trial, ranging from 'what was the effect in the patients studied?' through 'what was the effect in individual patients?' to 'what might the effect be in future patients?'. The extent to which the answers to these questions are similar depends on the assumptions made, and in some cases the design used may not permit any meaningful answer to be given at all.
A related issue is confusion between randomisation, random sampling, linear modelling and true multivariate modelling. These distinctions don't matter much for some purposes and under some circumstances, but for others they do.
A yet further issue is that causal analysis in epidemiology, which has brought valuable insights in many cases, has tended to stress point estimates and ignore standard errors. This has potentially misleading consequences.
An understanding of components of variation is key. Unfortunately, the development of two particular topics in recent years, evidence synthesis by the evidence-based medicine movement and personalised medicine by bench scientists, has paid scant attention to components of variation, to the questions being asked, or to both, resulting in confusion about many issues.
For instance, it is often claimed that numbers needed to treat indicate the proportion of patients for whom treatments work, that inclusion criteria determine the generalisability of results and that heterogeneity means that a random effects meta-analysis is required. None of these is true. The scope for personalised medicine has very plausibly been exaggerated and an important cause of variation in the healthcare system, physicians, is often overlooked.
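The point about numbers needed to treat (NNT) can be made with trivial arithmetic. A minimal sketch, with all risks invented for illustration: two quite different patterns of individual response can produce the same marginal risks and hence the same NNT, so the NNT cannot reveal the proportion of patients for whom the treatment works.

```python
# Hypothetical illustration: the NNT is just the reciprocal of the absolute
# risk reduction and says nothing about which (or how many) patients benefit.

def nnt(risk_control, risk_treated):
    """NNT = 1 / absolute risk reduction."""
    return 1.0 / (risk_control - risk_treated)

# Scenario A: treatment abolishes the event in a 'responder' subgroup of 20%.
# Scenario B: treatment lowers every patient's risk from 50% to 30%.
# Both give the same marginal risks, hence the same NNT of about 5,
# yet the proportion of patients 'for whom the treatment works' differs.
print(nnt(0.5, 0.3))  # about 5 patients treated per event prevented
```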
I shall argue that thinking about questions is important.
The replication crisis: are P-values the problem and are Bayes factors the solution? (Stephen Senn)
Today's posterior is tomorrow's prior. (Dennis Lindley1, p. 2)
It has been claimed that science is undergoing a replication crisis and that when looking for culprits, the cult of significance is the chief suspect. It has also been claimed that Bayes factors might provide a solution.
In my opinion, these claims are misleading and part of the problem is our understanding of the purpose and nature of replication, which has only recently been subject to formal analysis2. What we are or should be interested in is truth. Replication is a coherence not a correspondence requirement3 and one that has a strong dependence on the size of the replication study4.
Consideration of Bayes factors raises a puzzling question. Should the Bayes factor for a replication study be calculated as if it were the initial study? If the answer is yes, the approach is not fully Bayesian and furthermore the Bayes factors will be subject to exactly the same replication ‘paradox’ as P-values. If the answer is no, then in what sense can an initially found Bayes factor be replicated and what are the implications for how we should view replication of P-values?
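The two conventions can be contrasted in a toy conjugate-normal model. Everything below (the prior variance, the estimates, the standard errors) is an assumption for illustration, not anything from the talk:

```python
import math

def normpdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

tau2 = 1.0           # prior variance of theta under H1 (an assumption)
x1, se1 = 0.5, 0.2   # original study: estimate and standard error (invented)
x2, se2 = 0.4, 0.2   # replication study (invented)

# Convention 1: treat the replication as if it were the initial study,
# reusing the original prior N(0, tau2) for theta under H1.
bf_standalone = normpdf(x2, 0.0, tau2 + se2**2) / normpdf(x2, 0.0, se2**2)

# Convention 2 (fully Bayesian): today's posterior is tomorrow's prior,
# so carry the study-1 posterior for theta forward into study 2.
post_var = tau2 * se1**2 / (tau2 + se1**2)
post_mean = x1 * tau2 / (tau2 + se1**2)
bf_sequential = normpdf(x2, post_mean, post_var + se2**2) / normpdf(x2, 0.0, se2**2)

print(bf_standalone, bf_sequential)  # the two conventions give different answers
```

With these invented numbers the sequential Bayes factor is noticeably larger than the standalone one, making concrete the sense in which the 'as if initial' calculation is not fully Bayesian.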
A further issue is that little attention has been paid to false negatives and, by extension, to true negatives. Yet, as is well known from the theory of diagnostic tests, it is meaningless to consider the performance of a test in terms of false positives alone.
I shall argue that we are in danger of confusing evidence with the conclusions we draw and that any reforms of scientific practice should concentrate on producing evidence that is as reliable as it can be qua evidence. There are many basic scientific practices in need of reform. Pseudoreplication5, for example, and the routine destruction of information through dichotomisation6 are far more serious problems than many matters of inferential framing that seem to have excited statisticians.
References
1. Lindley DV. Bayesian statistics: A review. SIAM; 1972.
2. Devezer B, Navarro DJ, Vandekerckhove J, Ozge Buzbas E. The case for formal methodology in scientific reform. R Soc Open Sci. Mar 31 2021;8(3):200805. doi:10.1098/rsos.200805
3. Walker RCS. Theories of Truth. In: Hale B, Wright C, Miller A, eds. A Companion to the Philosophy of Language. John Wiley & Sons; 2017:532-553, chap 21.
4. Senn SJ. A comment on replication, p-values and evidence by S.N.Goodman, Statistics in Medicine 1992; 11:875-879. Letter. Statistics in Medicine. 2002;21(16):2437-44.
5. Hurlbert SH. Pseudoreplication and the design of ecological field experiments. Ecological monographs. 1984;54(2):187-211.
6. Senn SJ. Being Efficient About Efficacy Estimation. Research. Statistics in Biopharmaceutical Research. 2013;5(3):204-210. doi:10.1080/19466315.2012.754726
When estimating sample sizes for clinical trials there are several different views that might be taken as to what definition and meaning should be given to the sought-for treatment effect. However, if the concept of a ‘minimally important difference’ (MID) does have relevance to interpreting clinical trials (which can be disputed) then its value cannot be the same as the ‘clinically relevant difference’ (CRD) that would be used for planning them.
A doubly pernicious use of the MID is as a means of classifying patients as responders and non-responders. Not only does such an analysis lead to an increase in the necessary sample size but it misleads trialists into making causal distinctions that the data cannot support and has been responsible for exaggerating the scope for personalised medicine.
In this talk these statistical points will be explained using a minimum of technical detail.
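The sample-size penalty of responder dichotomisation can be sketched with the standard asymptotic-efficiency result for a Normally distributed outcome (the formula is textbook; the cut-points below are illustrative):

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def relative_sample_size(c):
    """Factor by which a dichotomised ('responder') analysis inflates the
    required sample size relative to analysing the continuous outcome,
    for a Normal outcome cut at c (standard asymptotic result)."""
    p = 1.0 - Phi(c)                    # probability of being a 'responder'
    return p * (1.0 - p) / phi(c) ** 2

print(relative_sample_size(0.0))        # median split: about 1.57 (pi/2)
print(relative_sample_size(1.0))        # more extreme cut-point: even worse
```

A median split thus needs roughly 57% more patients for the same power, and the penalty grows rapidly for more extreme cut-points.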
Prediction, Big Data, and AI: Steyerberg, Basel, Nov 1, 2019 (Ewout Steyerberg)
Title: "Clinical prediction models in the age of artificial intelligence and big data", presented at the Basel Biometrics Society seminar, Nov 1, 2019, Basel, by Ewout Steyerberg, with substantial input from Maarten van Smeden and Ben van Calster.
Dichotomania and other challenges for the collaborating biostatistician (Laure Wynants)
Conference presentation at ISCB 41 in the session "Biostatistical inference in practice: moving beyond false dichotomies".
A comment in Nature, signed by over 800 researchers, called for the scientific community to "retire statistical significance". The responses included a call to halt the use of the term "statistically significant" and changes to journals' author guidelines. The leading discourse among statisticians is that inadequate statistical training of clinical researchers and publishing practices are to blame for the misuse of statistical testing. In this presentation, we search our collective conscience by reviewing ethical guidelines for statisticians in light of the p-value crisis, examine what this implies for us when conducting analyses in collaborative work and in teaching, and ask whether the ATOM (accept uncertainty; be thoughtful, open and modest) principles can guide us.
Improving epidemiological research: avoiding the statistical paradoxes and fallacies (Maarten van Smeden)
Keynote at Norwegian Epidemiological Association conference, October 26 2022. Discussing absence of evidence fallacy, Table 2 fallacy, Winner's curse and Stein's paradox.
The Seven Habits of Highly Effective Statisticians (Stephen Senn)
If you know why the title of this talk is extremely stupid, then you clearly know something about control, data and reasoning: in short, you have most of what it takes to be a statistician. If you have studied statistics then you will also know that a large amount of anything, and this includes successful careers, is luck.
In this talk I shall try to share some of my experiences of being a statistician in the hope that it will help you make the most of whatever luck life throws you. In so doing, I shall try my best to overcome the distorting influence of that easiest of sciences, hindsight. Without giving too much away, I shall be recommending that you read, listen, think, calculate, understand, communicate, and do. I shall give you some examples of what I think works and what I think doesn't.
In all of this you should never forget the power of negativity and also the joy of being able to wake up every day and say to yourself 'I love the smell of data in the morning'.
Presentation on similarities and differences between statistical and machine learning research fields for the @UM_MiCHAMP Big Data & AI in Health Seminar Series; October 21, 2022
Disease screening and screening test validity (Tampiwa Chebani)
Full lecture covering screening tests and validity testing. Covers topics such as calculation and interpretation of sensitivity, specificity, positive predictive value and negative predictive value of a screening test.
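As a minimal sketch of those calculations (prevalence and test characteristics invented for illustration):

```python
# All numbers invented: 10,000 people screened, 1% prevalence,
# sensitivity 90%, specificity 95%.

def screening_summary(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 screening table."""
    return {
        "sensitivity": tp / (tp + fn),   # P(test positive | diseased)
        "specificity": tn / (tn + fp),   # P(test negative | healthy)
        "ppv": tp / (tp + fp),           # P(diseased | test positive)
        "npv": tn / (tn + fn),           # P(healthy | test negative)
    }

s = screening_summary(tp=90, fp=495, fn=10, tn=9405)
print(s["ppv"])  # about 0.15: most positives are false despite a 'good' test
```

At low prevalence the positive predictive value collapses even for a sensitive and specific test, which is why sensitivity and specificity alone do not characterise a screening programme.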
Improving predictions: Lasso, Ridge and Stein's paradox (Maarten van Smeden)
Slides of masterclass "Improving predictions: Lasso, Ridge and Stein's paradox" at the (Dutch) National Institute for Public Health and the Environment (RIVM)
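As a minimal sketch of the shrinkage idea behind ridge (lasso's soft-thresholding is not shown), with invented data and the standard closed form for a single centred predictor:

```python
# Invented toy data; ridge_slope implements the standard one-predictor
# closed form: slope = x'y / (x'x + lambda).

def ridge_slope(x, y, lam):
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / (sxx + lam)

x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [-2.1, -0.9, 0.1, 1.2, 1.9]

for lam in (0.0, 1.0, 10.0):
    print(lam, ridge_slope(x, y, lam))  # the slope shrinks towards 0 as lam grows
```

At lam = 0 this is ordinary least squares; increasing the penalty trades a little bias for less variance, which is the Stein-flavoured idea behind both methods.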
Ragui Assaad, University of Minnesota
Caroline Krafft, St. Catherine University
ERF Training on Applied Micro-Econometrics and Public Policy Evaluation
Cairo, Egypt July 25-27, 2016
www.erf.org.eg
This year marks the 70th anniversary of the Medical Research Council randomised clinical trial (RCT) of streptomycin in tuberculosis led by Bradford Hill. This is widely regarded as a landmark in clinical research. Despite its widespread use in drug regulation and in clinical research more widely, and its high standing with the evidence-based medicine movement, the RCT continues to attract criticism. I show that many of these criticisms are traceable to a failure to understand two key concepts in statistics: probabilistic inference and design efficiency. To these methodological misunderstandings can be added the practical one of failing to appreciate that entry into clinical trials is not simultaneous but sequential.
I conclude that although randomisation should not be used as an excuse for ignoring prognostic variables, it is valuable and that many standard criticisms of RCTs are invalid.
Talk given at ISCB 2016 Birmingham
For indications and treatments where their use is possible, n-of-1 trials represent a promising means of investigating potential treatments for rare diseases. Each patient permits repeated comparison of the treatments being investigated and this both increases the number of observations and reduces their variability compared to conventional parallel group trials.
However, whether the framework for analysis is randomisation-based or model-based produces puzzling differences in inferences. This can easily be shown by starting on the one hand with the randomisation philosophy associated with the Rothamsted school of inference and building up the analysis through the block + treatment structure approach associated with John Nelder's theory of general balance (as implemented in GenStat®), or starting on the other hand with a plausible variance-component approach through a mixed model. However, it can be shown that these differences are related not so much to the modelling approach per se as to the questions one attempts to answer: ranging from testing whether there was a difference between treatments in the patients studied, to predicting the true difference for a future patient, via making inferences about the effect in the average patient.
This in turn yields interesting insight into the long-run debate over the use of fixed or random effect meta-analysis.
Some practical issues of analysis will also be covered in R and SAS®, in which languages some functions and macros to facilitate analysis have been written. It is concluded that n-of-1 trials hold great promise in investigating chronic rare diseases but that careful consideration of matters of purpose, design and analysis is necessary to make best use of them.
Acknowledgement
This work is partly supported by the European Union's 7th Framework Programme for research, technological development and demonstration under grant agreement no. 602552 ("IDEAL").
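The 'average patient' versus 'future patient' distinction drawn in the abstract above can be sketched with a toy simulation (in Python rather than the R/SAS tools mentioned; all variance components are invented for illustration):

```python
import random
import statistics

random.seed(1)

# Assumed toy n-of-1 setting: each patient completes several treatment/control
# cycles; the per-patient true effect varies between patients, and each
# observed cycle difference adds within-patient noise on top.
TRUE_MEAN, SD_BETWEEN, SD_WITHIN, CYCLES = 1.0, 0.8, 0.5, 6

patient_effects = []
for _ in range(200):
    theta_i = random.gauss(TRUE_MEAN, SD_BETWEEN)        # this patient's true effect
    cycles = [theta_i + random.gauss(0.0, SD_WITHIN) for _ in range(CYCLES)]
    patient_effects.append(statistics.mean(cycles))      # estimated effect, patient i

# Question about the average patient: the mean of the per-patient effects.
avg_effect = statistics.mean(patient_effects)

# Question about a future patient: prediction must include the between-patient
# component, so the spread of patient effects matters, not just the standard
# error of the average.
sd_effects = statistics.stdev(patient_effects)

print(avg_effect, sd_effects)
```

The average effect is estimated precisely, yet the spread of individual effects remains large; which of the two quantities matters depends on which question is being asked.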
The Rothamsted school meets Lord's paradox (Stephen Senn)
Lord's 'paradox' is a notoriously difficult puzzle that is guaranteed to provoke discussion, dissent and disagreement. Two statisticians analyse some observational data and come to radically different conclusions, each of which has acquired defenders over the years since Lord first proposed his puzzle in 1967. It features in the recent Book of Why by Pearl and Mackenzie, who use it to demonstrate the power of Pearl's causal calculus, obtaining a solution they claim is unambiguously right. They also claim that statisticians have failed to get to grips with causal questions for well over a century, in fact ever since Karl Pearson developed Galton's idea of correlation and warned the scientific world that correlation is not causation.
However, only two years before Lord published his paradox, John Nelder outlined a powerful causal calculus for analysing designed experiments based on a careful distinction between block and treatment structure. This represents an important advance in formalising the approach to analysing complex experiments that started with Fisher 100 years ago, when he proposed splitting variability using the square of the standard deviation, which he called the variance; it continued with Yates and has been developed since the 1960s by Rosemary Bailey, amongst others. This tradition might be referred to as the Rothamsted School. It is fully implemented in Genstat® but, as far as I am aware, not in any other package.
With the help of Genstat®, I demonstrate how the Rothamsted School would approach Lord's paradox and come to a solution that is not the same as the one reached by Pearl and Mackenzie, although given certain strong but untestable assumptions it would reduce to theirs. I conclude that the statistical tradition may have more to offer in this respect than has been supposed.
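The two statisticians' disagreement can be reproduced in miniature. The simulation below is a toy version under an assumed generating model with no true group effect (all numbers invented); it is not the Rothamsted or Pearl analysis itself, just the classic change-score versus covariance-adjustment contrast:

```python
import random
import statistics

random.seed(2)

# Assumed generating model with NO true group effect: final weight is
# 0.5 * baseline plus noise; the groups differ only at baseline.
def simulate(n, baseline_mean):
    baseline = [random.gauss(baseline_mean, 1.0) for _ in range(n)]
    final = [0.5 * b + random.gauss(0.0, 1.0) for b in baseline]
    return baseline, final

bA, fA = simulate(2000, 0.0)   # group A
bB, fB = simulate(2000, 2.0)   # group B starts heavier

# Statistician 1: compare mean change scores (final minus baseline).
change_diff = (statistics.mean(f - b for f, b in zip(fB, bB))
               - statistics.mean(f - b for f, b in zip(fA, bA)))

# Statistician 2: covariance adjustment using the pooled within-group slope.
def within_sums(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy, sxx

sxyA, sxxA = within_sums(bA, fA)
sxyB, sxxB = within_sums(bB, fB)
slope = (sxyA + sxyB) / (sxxA + sxxB)

ancova_diff = (statistics.mean(fB) - statistics.mean(fA)
               - slope * (statistics.mean(bB) - statistics.mean(bA)))

print(change_diff, ancova_diff)  # the two analyses answer different questions
```

The change-score analysis reports a substantial group difference while the adjusted analysis reports roughly none; neither is 'wrong', they estimate different things, which is the heart of the puzzle.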
1. A law firm wants to determine the trend in its annual billings .docx (monicafrancis71118)
1. A law firm wants to determine the trend in its annual billings so that it can better forecast revenues. It plots the data on its billings for the past 10 years and finds that the scatter plot appears to be linear. What formula should they use to determine the trend line?
σ = √(∑(x − μ)² ÷ N)
F = s₁² ÷ s₂²
t = (x̄ − μ) ÷ (s ÷ √n)
Tₜ = b₀ + b₁t
3 points
QUESTION 2
1. A set of subjects, usually randomly sampled, selected to participate in a research study is called:
Population
Sample
Mode group
Partial selection
3 points
QUESTION 3
1. If a researcher accepts a null hypothesis when that hypothesis is actually true, she has committed:
a type I error
a type II error
no error
a causation
3 points
QUESTION 4
1. A binomial probability distribution is a discrete distribution (i.e., the x-variable is discrete).
True
False
3 points
QUESTION 5
1. The t-distribution is wider and flatter (i.e., has more variation) than the normal distribution.
True
False
3 points
QUESTION 6
1. A physician wants to estimate the average amount of time that patients spend in his waiting room. He asks his receptionist to record the waiting times for 28 of his patients and finds that the sample mean (x̄) is 37 minutes and the sample standard deviation (s) is 12 minutes. What formula would you use to construct the 95% confidence interval for the population mean of waiting times?
t = (x̄ − μ) ÷ (s ÷ √n)
µ = ∑ x ÷ N
x̄ - t(s ÷ √n) < µ < x̄ + t(s ÷ √n)
z = (x - µ) ÷ σ
3 points
QUESTION 7
1. When the alternative hypothesis states that the difference between two groups can only be in one direction, we call this a:
One-tailed test
Bi-directional test
Two-tailed test
Non-parametric test
3 points
QUESTION 8
1. For any probability distribution, the probability of any x-value occurring within any given range is equal to the area under the distribution and above that range.
True
False
3 points
QUESTION 9
1. The formula for ____________ is (Row total × Column total) ÷ T
Observed frequencies
Degrees of freedom
Expected frequencies
Sampling error
3 points
QUESTION 10
1. State Senator Hanna Rowe has ordered an investigation of the large number of boating accidents that have occurred in the state in recent summers. Acting on her instructions, her aide, Geoff Spencer, has randomly selected 9 summer months within the last few years and has compiled data on the number of boating accidents that occurred during each of these months. The mean number of boating accidents to occur in these 9 months was 31 (x̄), and the standard deviation (s) in this sample was 9 boating accidents per month. Geoff was told to construct a 90% confidence interval for the true mean number of boating accidents per month. What formula should Geoff use?
x̄ - t(s ÷ √n) < µ < x̄ + t(s ÷ √n)
F = s₁² ÷ s₂²
z = (x - µ) ÷ σ
x̄ - z(σ ÷ √n) < µ < x̄ + z(σ ÷ √n)
The Rothamsted School and the analysis of designed experiments (Stephen Senn)
A historical account is given of the approach of "The Rothamsted School" to the analysis of designed experiments. The link between the way that experiments are designed and how they should be analysed is fundamental to this approach. The key figures are R. A. Fisher, Frank Yates and John Nelder.
It is argued that, when it comes to nuisance parameters, an assumption of ignorance is harmful. On the other hand, this raises problems as to how far one should go in searching for further data when combining evidence.
Whatever happened to design-based inference? (Stephen Senn)
Given as the Sprott Lecture, University of Waterloo, September 2022.
Abstract
What exactly should we think about appropriate analyses for designed experiments and why? If conditional inference trumps marginal inference, why should we care about randomisation? Isn’t everything just modelling? The Rothamsted School held that design matters. Taking an example of applying John Nelder’s general balance approach to a notorious problem, Lord’s paradox, I shall show that there may be some lessons for two fashionable topics: causal analysis and big data. I shall conclude that if we want not only to make good estimates but estimate how good our estimates are, design does matter.
The response to the COVID-19 crisis by various vaccine developers has been extraordinary, both in terms of speed of response and the delivered efficacy of the vaccines. It has also raised some fascinating issues of design, analysis and interpretation. I shall consider some of these issues, taking as my examples five vaccines: Pfizer/BioNTech, AstraZeneca/Oxford, Moderna, Novavax and J&J Janssen, but concentrating mainly on the first two. Among the matters covered will be concurrent control, efficient design, issues of measurement raised by two-shot vaccines and implications for roll-out, and the surprising effectiveness of simple analyses. Differences between the five development programmes as they affect statistics will be covered, but some essential similarities will also be discussed.
The statistical revolution of the 20th century was largely concerned with developing methods for analysing small datasets. Student’s paper of 1908 was the first in the English literature to address the problem of second order uncertainty (uncertainty about the measures of uncertainty) seriously and was hailed by Fisher as heralding a new age of statistics. Much of what Fisher did was concerned with problems of what might be called ‘small data’, not only as regards efficient analysis but also as regards efficient design and in addition paying close attention to what was necessary to measure uncertainty validly.
I shall consider the history of some of these developments, in particular those that are associated with what might be called the Rothamsted School, starting with Fisher and having its apotheosis in John Nelder’s theory of General Balance and see what lessons they hold for the supposed ‘big data’ revolution of the 21st century.
Clinical trials: quo vadis in the age of covid? (Stephen Senn)
A discussion of the role of clinical trials in the age of COVID. My contribution to the phastar 2020 life sciences summit https://phastar.com/phastar-life-science-summit
What should we expect from reproducibility? (Stephen Senn)
Is there really a reproducibility crisis and if so are P-values to blame? Choose any statistic you like and carry out two identical independent studies and report this statistic for each. In advance of collecting any data, you ought to expect that it is just as likely that statistic 1 will be smaller than statistic 2 as vice versa. Once you have seen statistic 1, things are not so simple, but if they are not so simple, that is because you have other information in some form. However, it is at least instructive that you need to be careful in jumping to conclusions about what to expect from reproducibility. Furthermore, the forecasts of good Bayesians ought to obey a Martingale property. On average you should be in the future where you are now but, of course, your inferential random walk may lead to some peregrination before it homes in on “the truth”. But you certainly can’t generally expect that a probability will get smaller as you continue. P-values, like other statistics, are a position not a movement. Although often claimed, there is no such thing as a trend towards significance.
Using these and other philosophical considerations I shall try and establish what it is we want from reproducibility. I shall conclude that we statisticians should probably be paying more attention to checking that standard errors are being calculated appropriately and rather less to the inferential framework.
Personalised medicine: a sceptical view (Stephen Senn)
Some grounds for believing that the current enthusiasm about personalised medicine is exaggerated, founded on poor statistics and represents a disappointing loss of ambition.
Sample size determination in clinical trials is considered from various ethical and practical perspectives. It is concluded that cost is a missing dimension and that the value of information is key.
An early and overlooked causal revolution in statistics was the development of the theory of experimental design, initially associated with the "Rothamsted School". An important stage in the evolution of this theory was the experimental calculus developed by John Nelder in the 1960s with its clear distinction between block and treatment factors in designed experiments. This experimental calculus produced appropriate models automatically from more basic formal considerations but was, unfortunately, only ever implemented in Genstat®, a package widely used in agriculture but rarely so in medical research. In consequence its importance has not been appreciated and the approach of many statistical packages to designed experiments is poor. A key feature of the Rothamsted School approach is that identification of the appropriate components of variation for judging treatment effects is simple and automatic.
The impressive more recent causal revolution in epidemiology, associated with Judea Pearl, seems to have no place for components of variation, however. By considering the application of Nelder’s experimental calculus to Lord’s Paradox, I shall show that this reveals that solutions that have been proposed using the more modern causal calculus are problematic. I shall also show that lessons from designed clinical trials have important implications for the use of historical data and big data more generally.
Views of the role of hypothesis falsification in statistical testing do not divide as cleanly between frequentist and Bayesian views as is commonly supposed. This can be shown by considering the two major variants of the Bayesian approach to statistical inference and the two major variants of the frequentist one.
A good case can be made that the Bayesian, de Finetti, just like Popper, was a falsificationist. A thumbnail view, which is not just a caricature, of de Finetti’s theory of learning, is that your subjective probabilities are modified through experience by noticing which of your predictions are wrong, striking out the sequences that involved them and renormalising.
On the other hand, in the formal frequentist Neyman-Pearson approach to hypothesis testing, you can, if you wish, shift conventional null and alternative hypotheses, making the latter the strawman and by ‘disproving’ it, assert the former.
The frequentist, Fisher, however, at least in his approach to testing of hypotheses, seems to have taken a strong view that the null hypothesis was quite different from any other and that there was a strong asymmetry in the inferences that followed from the application of significance tests.
Finally, to complete a quartet, the Bayesian geophysicist Jeffreys, inspired by Broad, specifically developed his approach to significance testing in order to be able to ‘prove’ scientific laws.
By considering the controversial case of equivalence testing in clinical trials, where the object is to prove that ‘treatments’ do not differ from each other, I shall show that there are fundamental differences between ‘proving’ and falsifying a hypothesis and that this distinction does not disappear by adopting a Bayesian philosophy. I conclude that falsificationism is important for Bayesians also, although it is an open question as to whether it is enough for frequentists.
In Search of Lost Infinities: What is the “n” in big data? (Stephen Senn)
In designing complex experiments, agricultural scientists, with the help of their statistician collaborators, soon came to realise that variation at different levels had very different consequences for estimating different treatment effects, depending on how the treatments were mapped onto the underlying block structure. This was a key feature of the Rothamsted approach to design and analysis and a strong thread running through the work of Fisher, Yates and Nelder, being expressed in topics such as split-plot designs, recovering inter-block information and fractional factorials. The null block-structure of an experiment is key to this philosophy of design and analysis. However, modern techniques for analysing experiments stress models rather than symmetries and this modelling approach requires much greater care in analysis, with the consequence that you can easily make mistakes and often will.
In this talk I shall underline the obvious, but often unintentionally overlooked, fact that understanding variation at the various levels at which it occurs is crucial to analysis. I shall take three examples, an application of John Nelder’s theory of general balance to Lord’s Paradox, the use of historical data in drug development and a hybrid randomised non-randomised clinical trial, the TARGET study, to show that the data that many, including those promoting a so-called causal revolution, assume to be ‘big’ may actually be rather ‘small’. The consequence is that there is a danger that the size of standard errors will be underestimated or even that the appropriate regression coefficients for adjusting for confounding may not be identified correctly.
I conclude that an old but powerful experimental design approach holds important lessons for observational data about limitations in interpretation that mere numbers cannot overcome. Small may be beautiful, after all.
Unfortunately, some have interpreted Numbers Needed to Treat as indicating the proportion of patients on whom the treatment has had a causal effect. This interpretation is very rarely, if ever, necessarily correct. It is certainly inappropriate if based on a responder dichotomy. I shall illustrate the problem using simple causal models.
One also sometimes encounters the claim that the extent to which two distributions of outcomes overlap from a clinical trial indicates how many patients benefit. This is also false and can be traced to a similar causal confusion.
Presidents' invited lecture ISCB Vigo 2017
Discusses various issues to do with how randomised clinical trials should be analysed. See also https://errorstatistics.com/2017/07/01/s-senn-fishing-for-fakes-with-fisher-guest-post/
History of how and why a complex cross-over trial was designed to prove the equivalence of two formulations of a beta-agonist and what the eventual results were. Presented at the Newton Institute 28 July 2008. Warning: following the important paper by Kenward & Roger (Biostatistics, 2010), I no longer think the random effects analysis is appropriate, although, in fact, the results are pretty much the same as for the fixed effects analysis.
The history of p-values is covered to try and shed light on a mystery: why did Student and Fisher agree numerically but disagree in terms of interpretation?
2. (c) Stephen Senn 2008 2
The Value of Marketing
• I used to work in the pharmaceutical industry and know the value of marketing
• In those days I would have taken this opportunity to promote my new book
• Now that I am an academic, modesty forbids me to do so
3. Outline
• Why you must condition on covariates
• The true value of balance
• What is minimisation
• How to balance if you must
• Variance matters
• Why I hate minimisation
4. Terminology
• Regression
  – (Usually) that field of statistics that deals with estimating parameters of linear equations using (ordinary) least squares
• Covariate – something you model and which will help sharpen your inferences about the effect of treatment but is not of direct interest
  – Example: age in a placebo-controlled trial of a beta-blocker in hypertension
• Unbiased estimator
  – A means of estimating a quantity that would, on average over all similar circumstances in which you used it, equal the thing you were estimating
5. Typical MRC Stuff
‘The central telephone randomisation system used a minimisation algorithm to balance the treatment groups with respect to eligibility criteria and other major prognostic factors.’ (p24)
‘All comparisons involved logrank analyses of the first occurrence of particular events during the scheduled treatment period after randomisation among all those allocated the vitamins versus all those allocated matching placebo capsules (ie, they were “intention-to-treat” analyses).’ (p24)
1. (2002) MRC/BHF Heart Protection Study of cholesterol lowering with simvastatin in 20,536 high-risk individuals: a randomised placebo-controlled trial. Lancet 360:7-22
6. Three Games with Two Dice
• The object is to call the odds for getting a score of ten in rolling two dice
  – A red die and a black die
• The game is played three different ways
  – Game 1. The two dice are rolled together; you call the odds before they are rolled
  – Game 2. The red die is rolled and you are shown the score; you then call the odds before the black die is rolled
  – Game 3. You call the odds. The red die is rolled first but you are not shown it, and then the black one is rolled
• How should you bet?
8.
Game 2: Either the probability = 0 or the probability = 1/6
Game 3: The probability = ½ × 0 + ½ × 1/6 = 1/12
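The probabilities for the three games can be checked by brute-force enumeration. A minimal sketch (my own illustration, not part of the slides):

```python
from fractions import Fraction
from itertools import product

# Enumerating the two-dice game: the chance that red + black = 10.
outcomes = list(product(range(1, 7), repeat=2))  # (red, black)

# Game 1 (and game 3): no information about either die.
p_ten = Fraction(sum(r + b == 10 for r, b in outcomes), len(outcomes))

# Game 2: condition on the red die's score.
def p_ten_given_red(r):
    return Fraction(sum(r + b == 10 for b in range(1, 7)), 6)

# Averaging the conditional answers over the red die recovers the game 1 answer:
avg = sum(p_ten_given_red(r) for r in range(1, 7)) / 6
print(p_ten, avg)  # both 1/12
```

The conditional probability is 0 when the red die shows less than 4 and 1/6 when it shows 4, 5 or 6, which is exactly the ½ × 0 + ½ × 1/6 calculation on the slide.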
9. The Morals
• You can’t treat game 2 like game 1.
  – You must condition on the information you receive in order to bet wisely
  – You must use the actual data from the red die
• You can treat game 3 like game 1.
  – You can use the probability distribution that the red die has
• You can’t ignore an observed prognostic covariate in analysing a clinical trial just because you randomised
  – That would be to treat game 2 like game 1
• You can ignore an unobserved covariate precisely because you did randomise
  – Because you are entitled to treat game 3 like game 1
10. The Reality
Trialists continue to use their randomisation as an excuse for ignoring prognostic information, and they continue to worry about the effect of factors they have not measured. Neither practice is logical.
11. The True Value of Balance
• It is generally held as being self-evident that a trial which is not balanced is not valid.
• Trials are examined at baseline to establish their validity.
• In fact the matter is not so simple...
12. A Tale of Two Tables

Trial 1              Treatment
Sex        Verum     Placebo    TOTAL
Male       34        26         60
Female     15        25         40
TOTAL      49        51         100

Trial 2              Treatment
Sex        Verum     Placebo    TOTAL
Male       26        26         52
Female     15        15         30
TOTAL      41        41         82
13. Choices, Choices
Trial two is balanced whereas trial one is not. One might think that trial two provides the more reliable information. However, the reverse is the case.
Trial one contains a comparable trial to trial two within it. It is simply trial two with the addition of 8 further male patients in the verum group and 10 further female patients in the placebo group.
How could more information be worse than less?
14. Stratification
All we need to do is compare like with like. If we compare males with males and females with females we shall obtain two unbiased estimators of the treatment effect. These can then be combined in some appropriate way. This technique is called stratification.
A similar approach called analysis of covariance is available to deal with continuous covariates such as height, age or a baseline measurement.
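A small sketch of what stratification amounts to, using the group sizes of trial 1 above. The outcome data, the stratum mean shifts and the true effect are invented purely for illustration; the weights are the usual ones for a difference in means with a common variance:

```python
import numpy as np

# Stratification: estimate the treatment effect within each stratum and pool
# with weights proportional to the information each stratum carries
# (n_verum * n_placebo / (n_verum + n_placebo) for a difference in means).
rng = np.random.default_rng(3)
strata = {"male": (34, 26, 1.0), "female": (15, 25, 2.0)}  # (n_verum, n_placebo, stratum shift)
true_effect = 1.5

est, wt = [], []
for n_v, n_p, shift in strata.values():
    verum = rng.normal(loc=shift + true_effect, size=n_v)
    placebo = rng.normal(loc=shift, size=n_p)
    est.append(verum.mean() - placebo.mean())  # unbiased within the stratum
    wt.append(n_v * n_p / (n_v + n_p))         # inverse-variance weight (equal sigma)

pooled = np.average(est, weights=wt)
print(round(pooled, 2))  # recovers roughly the true effect of 1.5
```

Each stratum-specific difference is unbiased even though the trial as a whole is unbalanced by sex; pooling merely trades off their precisions.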
15. Validity and Efficiency
• So if balance is not necessary to produce unbiased estimates, what is its value?
• The answer is that it leads to efficient estimates
• If you condition on covariates, the estimate will be more efficient the closer they are to being balanced
• However, if you don’t condition on observed balanced covariates your inferences are invalid
  – Example: analysing a matched-pairs design using the two-sample t-test
16. A Definition
Minimization. A procedure for allocating patients sequentially to clinical trials taking account of the demographic characteristics of the given patient and of those already entered on the trial in such a way as to attempt to deliberately balance the groups. In practice this achieves a higher degree of balance than for randomization, although the difference for even moderately sized trials is small.
Where did I find this definition? I read it in a book I rather like. In fact, I wrote it myself.
17. What is minimisation?
• Approach to ‘balancing’ clinical trials
• Proposed by Taves (1974) and Pocock & Simon (1975)
• Uses marginal balance
• Reflects the reality that patients in clinical trials are treated when they present
• Avoids the ‘deep-freeze microwave’ fallacy
18. An Example
We can explain this with the help of an example of Pocock’s (Clinical Trials: A Practical Approach, Wiley, 1983, p85). Suppose that we wish to balance patients in a clinical trial for advanced breast cancer with respect to 4 factors: performance status (ambulatory and non-ambulatory), age (< 50, ≥ 50), disease-free interval (< 2 years, ≥ 2 years), dominant metastatic lesion (visceral, osseous, soft tissue).
Suppose that the next patient to be entered is ambulatory, age < 50, disease-free interval ≥ 2 years and visceral metastasis, and suppose that 80 patients have been entered already so that the position is as given in the table below.
19.

Factor                  Level         No. on each treatment    Next patient
                                      A        B
Performance status      Ambulatory    30       31               *
                        Non-Amb.      10        9
Age                     < 50          18       17               *
                        ≥ 50          22       23
Disease-free interval   < 2 years     31       32
                        ≥ 2 years      9        8               *
Dominant met. lesion    Visceral      19       21               *
                        Osseous        8        7
                        Soft tissue   13       12

Using the levels of the factors for this particular patient we see that the sum for A is 30+18+9+19 = 76 while for B it is 31+17+8+21 = 77. Therefore we assign the patient to A.
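The marginal-balance rule just described is easy to sketch in code. The level labels below are my own shorthand; the counts are Pocock’s, as in the table above:

```python
# Minimisation in the style of Taves: assign the incoming patient to the
# treatment whose marginal totals, summed over that patient's own factor
# levels, are currently smaller. Counts are the worked example (80 patients in).
counts = {  # level -> [patients on A, patients on B]
    "ambulatory": [30, 31], "non-ambulatory": [10, 9],
    "age<50": [18, 17], "age>=50": [22, 23],
    "dfi<2y": [31, 32], "dfi>=2y": [9, 8],
    "visceral": [19, 21], "osseous": [8, 7], "soft-tissue": [13, 12],
}

def minimise(patient_levels, counts):
    """Return 'A' or 'B' for a patient described by one level per factor."""
    sum_a = sum(counts[lev][0] for lev in patient_levels)
    sum_b = sum(counts[lev][1] for lev in patient_levels)
    if sum_a < sum_b:
        return "A"
    if sum_b < sum_a:
        return "B"
    return "A"  # ties are usually broken at random; deterministic here for brevity

patient = ["ambulatory", "age<50", "dfi>=2y", "visceral"]
print(minimise(patient, counts))  # sum for A is 76, for B 77, so: A
```

Note that only the marginal totals enter the decision; nothing about the joint distribution of the factors is used, which is precisely the “apples and pears” complaint made later in the deck.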
20. What you learn in your first regression course

\[
\mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix},\quad
\mathbf{X} = \begin{pmatrix} 1 & X_{11} & \cdots & X_{k1} \\ 1 & X_{12} & \cdots & X_{k2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & X_{1n} & \cdots & X_{kn} \end{pmatrix},\quad
\boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix},\quad
\boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix},
\qquad \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}.
\]

Here \(\mathbf{Y}\) is the response vector, \(\mathbf{X}\) the design matrix, \(\boldsymbol{\beta}\) the parameter vector and \(\boldsymbol{\varepsilon}\) the disturbance-term vector. The estimator

\[
\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}
\]

is unbiased, \(E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta}\), and the variance of the estimates is \(V(\hat{\boldsymbol{\beta}}) = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\), where \(\sigma^2\) is the variance of the disturbance terms.
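These formulae translate directly into a few lines of NumPy. The data below are simulated purely for illustration; the effect sizes are my own inventions:

```python
import numpy as np

# OLS as on the slide: beta-hat = (X'X)^{-1} X'Y, V(beta-hat) = sigma^2 (X'X)^{-1}.
rng = np.random.default_rng(42)
n = 100
treatment = np.repeat([0.0, 1.0], n // 2)  # 0/1 treatment indicator
covariate = rng.normal(size=n)             # e.g. a centred baseline covariate
X = np.column_stack([np.ones(n), treatment, covariate])
beta = np.array([1.0, 2.0, 0.5])
y = X @ beta + rng.normal(scale=1.0, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])  # unbiased estimate of sigma^2
var_beta_hat = sigma2_hat * XtX_inv            # estimated V(beta-hat)
```

The (2, 2) element of `var_beta_hat` (counting from the top left) is the estimated variance of the treatment effect; it is this element that the next slide calls the variance multiplier.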
21. The Value of Balance

\[
\operatorname{var}(\hat{\boldsymbol{\beta}}) = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}
= \sigma^2 \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{12} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1k} & a_{2k} & \cdots & a_{kk} \end{pmatrix},
\qquad a_{22} \ge 2/n,
\]

where \(a_{22}\) is the variance multiplier for the treatment effect. The value of \(\sigma^2\) depends on the model. For a given model, the value of \(a_{22}\) depends on the design, and it only achieves its lower bound when covariates are balanced.
22. [Figure: ‘Efficiency of Randomised Trial Compared to Balanced One’; efficiency (0.0 to 1.0) plotted against the number of patients in the trial (0 to 60).]
23. [Figure: ‘Efficiency of Randomised Design Compared to a Balanced One’; balanced numbers, one covariate; 10 × 1000 runs at each sample size (6 to 24); curves for the run mean and the formula; efficiency ranging from about 0.6 to 1.0.]
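A simulation behind a comparison like this can be sketched as follows. This is my own reconstruction, not the original code: with equal arm sizes and one covariate, the variance multiplier for the treatment effect is inflated by 1/(1 - r^2), where r is the sample correlation between the treatment indicator and the covariate, so the efficiency of a randomised allocation relative to an exactly balanced one (r = 0) is 1 - r^2:

```python
import numpy as np

# Monte Carlo sketch: efficiency of a randomised design relative to one in
# which a single covariate is exactly balanced between the two arms.
rng = np.random.default_rng(1)

def mean_efficiency(n, runs=1000):
    t = np.repeat([-0.5, 0.5], n // 2)  # equal arm sizes ("balanced numbers")
    effs = []
    for _ in range(runs):
        x = rng.normal(size=n)           # one covariate
        rng.shuffle(t)                   # randomised allocation
        r = np.corrcoef(t, x)[0, 1]
        effs.append(1.0 - r * r)         # efficiency relative to a balanced design
    return float(np.mean(effs))

for n in (6, 12, 24):
    print(n, round(mean_efficiency(n), 3))  # efficiency rises towards 1 with n
```

The efficiency loss from randomising is therefore worst in small trials and soon becomes negligible, which is the point of the figure.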
24. Minimisation Doesn’t Even Minimise Well
• In fact minimisation achieves marginal and partly irrelevant balance
  – Adding together apples and pears
• The best solution, as ‘enny fule kan kno’, is to work with the design matrix
• Allocate so that the variance multiplier for the treatment effect is as favourable as possible
25. Atkinson’s Approach

\[
(\mathbf{X}'\mathbf{X})^{-1} = \mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{pmatrix},
\qquad \operatorname{var}(\hat{\beta}_i) = \sigma^2 a_{ii}.
\]

Choose allocations such that \(a_{22}\) is minimised.
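A sketch of allocating by working with the design matrix, in the spirit described above: for the incoming patient, try each treatment, compute the resulting a22, and choose the treatment that makes it smaller. The covariate values below are invented for illustration:

```python
import numpy as np

# Sequential allocation via the design matrix: give the next patient whichever
# treatment makes the variance multiplier a22 for the treatment effect
# (the (2,2) element of (X'X)^{-1}) smaller.

def a22(X):
    # Column order assumed: intercept, treatment, covariate(s).
    return np.linalg.inv(X.T @ X)[1, 1]

def allocate_next(X_so_far, covariates):
    """Try treatment -1 and +1 for the new patient; return the better one."""
    rows = {t: np.concatenate([[1.0, t], covariates]) for t in (-1.0, 1.0)}
    scores = {t: a22(np.vstack([X_so_far, row])) for t, row in rows.items()}
    return min(scores, key=scores.get)

# A few patients already allocated (intercept, treatment +/-1, one covariate):
X = np.array([[1, -1,  0.2],
              [1,  1,  0.1],
              [1, -1, -0.4],
              [1,  1,  0.6]], dtype=float)
print(allocate_next(X, np.array([0.3])))
```

Unlike the marginal sums of minimisation, this criterion automatically trades off balance in treatment numbers against balance in the covariates, because both enter \((\mathbf{X}'\mathbf{X})^{-1}\).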
26. Non-Linear Models
• This is more complicated
• Variance depends not just on the design matrix but on the response
• Nevertheless the design matrix is important
• Furthermore the problems in not conditioning are worse
  – Gail et al, Robinson & Jewell, Ford et al
27. A Problem
• All such sequential balancing methods restrict the randomisation strongly, to a degree beyond that necessary to balance the factor by the end of the trial
• This may lead to invalid variance estimates
  – incompatible with the Fisher philosophy of randomised experiments
    • see also Nelder’s general balance
• Student’s (and Taves’s) argument would be that it leads to conservative inference
  – and that this is good
28. Don’t Forget the Variance Estimate

Full “correct” model    Reduced, randomised    Reduced, minimised
Treatment               Treatment              Treatment
Covariate
Error                   Error                  Error
Total                   Total                  Total
29. A Red Herring
• The minimisers’ common defence is ‘conservative inference’
• ‘So what if our reported standard errors are higher than the true ones?’
• ‘The result is conservative inference’
• If you like conservative inference, why not just multiply all your standard errors by two?
• The next slide shows the consequence of conservative inference
30.
Variance, Randomisation and Meta-Analysis

Consider the meta-analysis of two otherwise identical trials: one randomised, one balanced.

Trial         True variance   Weight (should be)   Estimated variance   Weight (will be)
Randomised    Higher          Less                 Lower                More
Balanced      Lower           More                 Higher               Less
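A toy illustration with made-up numbers (not from any real trial): fixed-effect meta-analysis weights trials by inverse variance, so a trial that conservatively over-reports its variance receives less weight than its true precision deserves.

```python
# Hypothetical variances: the balanced trial is truly a little more precise,
# but 'conservative' analysis makes it report an inflated variance.
true_var = {"randomised": 1.2, "balanced": 1.0}
reported_var = {"randomised": 1.2, "balanced": 1.6}

def inv_var_weights(variances):
    """Normalised inverse-variance weights as used in fixed-effect meta-analysis."""
    w = {trial: 1.0 / v for trial, v in variances.items()}
    total = sum(w.values())
    return {trial: wi / total for trial, wi in w.items()}

should_be = inv_var_weights(true_var)       # weights the trials deserve
will_be = inv_var_weights(reported_var)     # weights they will actually get
```

The balanced trial should dominate the pooled estimate but, on its reported variance, it is down-weighted instead: conservatism in one trial is anti-conservatism in the meta-analysis.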
31.
How to Eliminate the Effect of Covariates by Allocation Alone

           A     B
Males     50    50
Females   50    50

This eliminates the effect of sex from the unadjusted treatment difference.

           A     B
Males    100     0
Females    0   100

This eliminates the effect of sex from the unadjusted within-treatment variance estimate.
32.
How to Eliminate the Effect of a Covariate from Estimated Treatment Effect and Variance

           A     B
Male      47    53
Female    53    47

You do this by conditioning on sex (modelling), whether or not sex and treatment are balanced.
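A minimal sketch, with entirely hypothetical data, of what conditioning buys: fitting sex in the model removes its variation from both the treatment estimate and the error term, whether or not the arms happen to be balanced on sex.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trial: true treatment effect 1.0, sex adds 2.0 to the outcome.
n = 200
sex = rng.integers(0, 2, size=n)                     # 0 = male, 1 = female
treat = rng.permutation(np.repeat([0, 1], n // 2))   # randomised allocation
y = 1.0 * treat + 2.0 * sex + rng.normal(scale=0.5, size=n)

# Unadjusted analysis: difference in arm means; sex variation stays in the error.
unadjusted = y[treat == 1].mean() - y[treat == 0].mean()

# Conditioning (modelling): fit sex alongside treatment in a linear model.
X = np.column_stack([np.ones(n), treat, sex])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = float(beta[1])

# Residuals show where the sex variation went.
resid_adjusted = y - X @ beta
resid_unadjusted = y - np.where(treat == 1, y[treat == 1].mean(),
                                y[treat == 0].mean())
```

Both estimates are unbiased under randomisation, but the adjusted analysis has a much smaller residual variance, and hence a smaller (and honest) standard error.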
33.
What About Bayesians?
• Belief dictates the model
• The model dictates the analysis
• The design determines efficiency
  – Design does not dictate analysis
• Randomised designs are (slightly) less efficient
• Why randomise?
  – Randomisation prevents your having to factor in your beliefs about how the trialists will behave
• Balance what you can and randomise what you can’t, but neither balance nor randomisation is an excuse for not conditioning
34.
Why I Hate Minimisation
Reasons
• It is not based on sound design theory
• Its contribution to improving efficiency is minimal
• It violates randomisation-based analysis
• People who use it don’t even do good model-based analysis (MRC, EORTC etc. balance but don’t condition)
  – See the example of the MRC/BHF trial
• We should fit covariates, not find elaborate excuses to ignore them
35.
In Short
I’d rather be hanged for a sheep than a lamb.
And I for one will not let the minimisers pull the wool over my eyes.
36.
My Philosophy of Clinical Trials
• Your (reasonable) beliefs dictate the model
• You should try to measure what you think is important
• You should try to fit what you have measured
  – Caveat: random regressors and the Gauss–Markov theorem
• If you can balance what is important, so much the better
  – But fitting is more important than balancing
• Randomisation deals with unmeasured covariates
  – You can use the distribution in probability of unmeasured covariates
  – For measured covariates you must use the actual observed distribution
• Claiming to do ‘conservative inference’ is just a convenient way of hiding bad practice
  – Who, apart from the MRC, thinks that analysing a matched-pairs t as a two-sample t is acceptable?
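The matched-pairs point can be made numerically. A sketch with simulated pairs (all numbers hypothetical): ignoring the pairing leaves the shared pair effect in the standard error, so the 'conservative' two-sample analysis reports a far larger SE than the correct paired one.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical matched pairs: a strong shared pair effect, small noise,
# true treatment difference 1.0.
n_pairs = 50
pair_effect = rng.normal(scale=3.0, size=n_pairs)
a = pair_effect + rng.normal(scale=0.5, size=n_pairs)        # treatment A
b = pair_effect + 1.0 + rng.normal(scale=0.5, size=n_pairs)  # treatment B

# Correct paired analysis: the pair effect cancels in the differences.
d = b - a
se_paired = d.std(ddof=1) / np.sqrt(n_pairs)

# Wrong two-sample analysis: pairing ignored, pair effect inflates the SE.
se_two_sample = np.sqrt(a.var(ddof=1) / n_pairs + b.var(ddof=1) / n_pairs)
```

The two-sample SE is valid for a design that was never run; calling the resulting loss of power 'conservative' is exactly the defence the deck objects to.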
37.
What’s Out and What’s In

Out                                           In
Log-rank test                                 Proportional hazards
t-test on change scores                       Analysis of covariance fitting baseline
Chi-square tests on 2 × 2 tables              Logistic regression fitting covariates
Responder analysis and dichotomies            Analysis of original values
Balancing as an excuse for not conditioning   Modelling as a guide for designs
38.
"overpaid, oversexed and over here".
Tommy Trinder (1909-1989) on the subject of the GIs in WWII
“Over-hyped, overused and overdue for retirement”
Stephen Senn on minimisation
A plea to all right-thinking statisticians: help me consign this piece of garbage to the rubbish dump of history.
39.
Finally
• I leave you with this thought
• Did you know that there are only 260 shopping days until Christmas?
• May I make a small suggestion?
Editor's Notes
From Senn, Statistical Issues in Drug Development (second edition due January 2008)