1) The document discusses the challenge of studying rare diseases due to the small amount of available data. It proposes that N-of-1 trials, where individual patients are repeatedly randomized to treatment or control, could help address this issue.
2) It provides examples of how careful experimental design and statistical analysis are important even with small data sets. Factors like randomization, blocking, and replication can increase efficiency and validity.
3) Analyzing an N-of-1 trial for a rare disease, the document explores objectives like determining if one treatment is better, estimating average effects, and predicting effects for future patients. It discusses randomization and sampling philosophies and mixed effects models.
The statistical revolution of the 20th century was largely concerned with developing methods for analysing small datasets. Student’s paper of 1908 was the first in the English literature to address the problem of second order uncertainty (uncertainty about the measures of uncertainty) seriously and was hailed by Fisher as heralding a new age of statistics. Much of what Fisher did was concerned with problems of what might be called ‘small data’, not only as regards efficient analysis but also as regards efficient design and in addition paying close attention to what was necessary to measure uncertainty validly.
I shall consider the history of some of these developments, in particular those that are associated with what might be called the Rothamsted School, starting with Fisher and having its apotheosis in John Nelder’s theory of General Balance, and see what lessons they hold for the supposed ‘big data’ revolution of the 21st century.
Clinical trials: quo vadis in the age of COVID? Stephen Senn
A discussion of the role of clinical trials in the age of COVID. My contribution to the phastar 2020 life sciences summit https://phastar.com/phastar-life-science-summit
The response to the COVID-19 crisis by various vaccine developers has been extraordinary, both in terms of speed of response and the delivered efficacy of the vaccines. It has also raised some fascinating issues of design, analysis and interpretation. I shall consider some of these issues, taking as my example, five vaccines: Pfizer/BioNTech, AstraZeneca/Oxford, Moderna, Novavax, and J&J Janssen but concentrating mainly on the first two. Among matters covered will be concurrent control, efficient design, issues of measurement raised by two-shot vaccines and implications for roll-out, and the surprising effectiveness of simple analyses. Differences between the five development programmes as they affect statistics will be covered but some essential similarities will also be discussed.
The Rothamsted school meets Lord's paradox. Stephen Senn
Lord’s ‘paradox’ is a notoriously difficult puzzle that is guaranteed to provoke discussion, dissent and disagreement. Two statisticians analyse some observational data and come to radically different conclusions, each of which has acquired defenders over the years since Lord first proposed his puzzle in 1967. It features in the recent Book of Why by Pearl and Mackenzie, who use it to demonstrate the power of Pearl’s causal calculus, obtaining a solution they claim is unambiguously right. They also claim that statisticians have failed to get to grips with causal questions for well over a century, in fact ever since Karl Pearson developed Galton’s idea of correlation and warned the scientific world that correlation is not causation.
However, only two years before Lord published his paradox, John Nelder outlined a powerful causal calculus for analysing designed experiments based on a careful distinction between block and treatment structure. This represented an important advance in formalising the approach to analysing complex experiments that started with Fisher 100 years ago, when he proposed splitting variability using the square of the standard deviation, which he called the variance; it was continued by Yates and has been developed since the 1960s by Rosemary Bailey, amongst others. This tradition might be referred to as The Rothamsted School. It is fully implemented in Genstat® but, as far as I am aware, not in any other package.
With the help of Genstat®, I demonstrate how the Rothamsted School would approach Lord’s paradox and come to a solution that is not the same as the one reached by Pearl and Mackenzie, although given certain strong but untestable assumptions it would reduce to it. I conclude that the statistical tradition may have more to offer in this respect than has been supposed.
When estimating sample sizes for clinical trials there are several different views that might be taken as to what definition and meaning should be given to the sought-for treatment effect. However, if the concept of a ‘minimally important difference’ (MID) does have relevance to interpreting clinical trials (which can be disputed) then its value cannot be the same as the ‘clinically relevant difference’ (CRD) that would be used for planning them.
A doubly pernicious use of the MID is as a means of classifying patients as responders and non-responders. Not only does such an analysis lead to an increase in the necessary sample size but it misleads trialists into making causal distinctions that the data cannot support and has been responsible for exaggerating the scope for personalised medicine.
In this talk these statistical points will be explained using a minimum of technical detail.
Talk given at RSS 2016 Manchester
I consider the problems that the ASA faced in getting a P-value statement together, not in terms of the process, but by looking at the opinions expressed in 21 published commentaries on the agreed statement. I then trace the history of the development of P-values. I show that the perceived problem with P-values is not just one of a supposed inadequacy of frequentist statistics but reflects a struggle at the very heart of Bayesian inference. I conclude that replacing P-values by automatic Bayesian approaches is unlikely to abolish controversy. It may be better to try and embrace diversity than to pretend it is not there.
Unfortunately, some have interpreted Numbers Needed to Treat as indicating the proportion of patients on whom the treatment has had a causal effect. This interpretation is very rarely, if ever, necessarily correct. It is certainly inappropriate if based on a responder dichotomy. I shall illustrate the problem using simple causal models.
One also sometimes encounters the claim that the extent to which two distributions of outcomes overlap from a clinical trial indicates how many patients benefit. This is also false and can be traced to a similar causal confusion.
How to combine results from randomised clinical trials on the additive scale with real world data to provide predictions on the clinically relevant scale for individual patients
Personalised medicine: a sceptical view. Stephen Senn
Some grounds for believing that the current enthusiasm about personalised medicine is exaggerated, is founded on poor statistics and represents a disappointing loss of ambition.
An early and overlooked causal revolution in statistics was the development of the theory of experimental design, initially associated with the "Rothamsted School". An important stage in the evolution of this theory was the experimental calculus developed by John Nelder in the 1960s with its clear distinction between block and treatment factors in designed experiments. This experimental calculus produced appropriate models automatically from more basic formal considerations but was, unfortunately, only ever implemented in Genstat®, a package widely used in agriculture but rarely so in medical research. In consequence its importance has not been appreciated and the approach of many statistical packages to designed experiments is poor. A key feature of the Rothamsted School approach is that identification of the appropriate components of variation for judging treatment effects is simple and automatic.
The impressive, more recent causal revolution in epidemiology, associated with Judea Pearl, seems to have no place for components of variation, however. By considering the application of Nelder’s experimental calculus to Lord’s Paradox, I shall show that solutions that have been proposed using the more modern causal calculus are problematic. I shall also show that lessons from designed clinical trials have important implications for the use of historical data and big data more generally.
There are many questions one might ask of a clinical trial, ranging from what was the effect in the patients studied to what might the effect be in future patients via what was the effect in individual patients? The extent to which the answer to these questions is similar depends on various assumptions made and in some cases the design used may not permit any meaningful answer to be given at all.
A related issue is confusion between randomisation, random sampling, linear model and true multivariate based modelling. These distinctions don’t matter much for some purposes and under some circumstances but for others they do.
Talk given at ISCB 2016 Birmingham
For indications and treatments where their use is possible, n-of-1 trials represent a promising means of investigating potential treatments for rare diseases. Each patient permits repeated comparison of the treatments being investigated and this both increases the number of observations and reduces their variability compared to conventional parallel group trials.
However, whether the framework used for analysis is randomisation-based or model-based produces puzzling differences in inference. This can easily be shown by starting on the one hand with the randomisation philosophy associated with the Rothamsted school of inference and building up the analysis through the block + treatment structure approach associated with John Nelder’s theory of general balance (as implemented in GenStat®), or starting on the other hand with a plausible variance component approach through a mixed model. However, it can be shown that these differences are related not so much to the modelling approach per se as to the questions one attempts to answer: ranging from testing whether there was a difference between treatments in the patients studied, to predicting the true difference for a future patient, via making inferences about the effect in the average patient.
This in turn yields interesting insight into the long-run debate over the use of fixed or random effect meta-analysis.
Some practical issues of analysis will also be covered in R and SAS®, in which languages some functions and macros to facilitate analysis have been written. It is concluded that n-of-1 trials hold great promise in investigating chronic rare diseases but that careful consideration of matters of purpose, design and analysis is necessary to make best use of them.
Acknowledgement
This work is partly supported by the European Union’s 7th Framework Programme for research, technological development and demonstration under grant agreement no. 602552. “IDEAL”
The Seven Habits of Highly Effective Statisticians. Stephen Senn
If you know why the title of this talk is extremely stupid, then you clearly know something about control, data and reasoning: in short, you have most of what it takes to be a statistician. If you have studied statistics then you will also know that a large amount of anything, and this includes successful careers, is luck.
In this talk I shall try to share some of my experiences of being a statistician in the hope that it will help you make the most of whatever luck life throws you. In so doing, I shall try my best to overcome the distorting influence of that easiest of sciences, hindsight. Without giving too much away, I shall be recommending that you read, listen, think, calculate, understand, communicate, and do. I shall give you some examples of what I think works and what I think doesn’t.
In all of this you should never forget the power of negativity and also the joy of being able to wake up every day and say to yourself ‘I love the smell of data in the morning’.
There are many valid criticisms of P-values but the criticism that they are largely responsible for the reproducibility crisis has been accepted rather lightly in some quarters. Whatever the inferential statistic that is used, it is quite illogical to assume that as the sample size increases it will tend to show more evidence against the null hypothesis. This applies to Bayesian posterior probabilities as much as it does to P-values. In the context of P-values it can be referred to as the trend towards significance fallacy but more generally, for reasons I shall explain, it could be referred to as the anticipated evidence fallacy.
The anticipated evidence fallacy is itself an example of the overstated evidence fallacy. I shall also discuss this fallacy and other relevant matters affecting reproducible science including the problem of false negatives.
Sample size determination in clinical trials is considered from various ethical and practical perspectives. It is concluded that cost is a missing dimension and that the value of information is key.
This year marks the 70th anniversary of the Medical Research Council randomised clinical trial (RCT) of streptomycin in tuberculosis led by Bradford Hill. This is widely regarded as a landmark in clinical research. Despite its widespread use in drug regulation and in clinical research more widely and its high standing with the evidence based medicine movement, the RCT continues to attract criticism. I show that many of these criticisms are traceable to a failure to understand two key concepts in statistics: probabilistic inference and design efficiency. To these methodological misunderstandings can be added the practical one of failing to appreciate that entry into clinical trials is not simultaneous but sequential.
I conclude that although randomisation should not be used as an excuse for ignoring prognostic variables, it is valuable and that many standard criticisms of RCTs are invalid.
What should we expect from reproducibility? Stephen Senn
Is there really a reproducibility crisis and if so are P-values to blame? Choose any statistic you like and carry out two identical independent studies and report this statistic for each. In advance of collecting any data, you ought to expect that it is just as likely that statistic 1 will be smaller than statistic 2 as vice versa. Once you have seen statistic 1, things are not so simple, but if they are not so simple it is because you have other information in some form. However, it is at least instructive that you need to be careful in jumping to conclusions about what to expect from reproducibility. Furthermore, the forecasts of good Bayesians ought to obey a Martingale property. On average you should be in the future where you are now but, of course, your inferential random walk may lead to some peregrination before it homes in on “the truth”. But you certainly can’t generally expect that a probability will get smaller as you continue. P-values, like other statistics, are a position, not a movement. Although often claimed, there is no such thing as a trend towards significance.
Using these and other philosophical considerations I shall try to establish what it is we want from reproducibility. I shall conclude that we statisticians should probably be paying more attention to checking that standard errors are being calculated appropriately and rather less to the inferential framework.
Views of the role of hypothesis falsification in statistical testing do not divide as cleanly between frequentist and Bayesian views as is commonly supposed. This can be shown by considering the two major variants of the Bayesian approach to statistical inference and the two major variants of the frequentist one.
A good case can be made that the Bayesian, de Finetti, just like Popper, was a falsificationist. A thumbnail view, which is not just a caricature, of de Finetti’s theory of learning, is that your subjective probabilities are modified through experience by noticing which of your predictions are wrong, striking out the sequences that involved them and renormalising.
On the other hand, in the formal frequentist Neyman-Pearson approach to hypothesis testing, you can, if you wish, shift conventional null and alternative hypotheses, making the latter the strawman and by ‘disproving’ it, assert the former.
The frequentist, Fisher, however, at least in his approach to testing of hypotheses, seems to have taken a strong view that the null hypothesis was quite different from any other and that there was a strong asymmetry in the inferences that followed from the application of significance tests.
Finally, to complete a quartet, the Bayesian geophysicist Jeffreys, inspired by Broad, specifically developed his approach to significance testing in order to be able to ‘prove’ scientific laws.
By considering the controversial case of equivalence testing in clinical trials, where the object is to prove that ‘treatments’ do not differ from each other, I shall show that there are fundamental differences between ‘proving’ and falsifying a hypothesis and that this distinction does not disappear by adopting a Bayesian philosophy. I conclude that falsificationism is important for Bayesians also, although it is an open question as to whether it is enough for frequentists.
There are many questions one might ask of a clinical trial, ranging from what was the effect in the patients studied to what might the effect be in future patients via what was the effect in individual patients? The extent to which the answer to these questions is similar depends on various assumptions made and in some cases the design used may not permit any meaningful answer to be given at all.
A related issue is confusion between randomisation, random sampling, linear model and true multivariate based modelling. These distinctions don’t matter much for some purposes and under some circumstances but for others they do.
A yet further issue is that causal analysis in epidemiology, which has brought valuable insights in many cases, has tended to stress point estimates and ignore standard errors. This has potentially misleading consequences.
An understanding of components of variation is key. Unfortunately, the development of two particular topics in recent years, evidence synthesis by the evidence based medicine movement and personalised medicine by bench scientists, has paid scant attention to components of variation, or to the questions being asked, or both, resulting in confusion about many issues.
For instance, it is often claimed that numbers needed to treat indicate the proportion of patients for whom treatments work, that inclusion criteria determine the generalisability of results and that heterogeneity means that a random effects meta-analysis is required. None of these is true. The scope for personalised medicine has very plausibly been exaggerated and an important cause of variation in the healthcare system, physicians, is often overlooked.
I shall argue that thinking about questions is important.
The Rothamsted School & The analysis of designed experiments. Stephen Senn
A historical account is given of the approach of "The Rothamsted School" to the analysis of designed experiments. The link between the way that experiments are designed and how they should be analysed is fundamental to this approach. The key figures are RA Fisher, Frank Yates and John Nelder
Clinical trials are about comparability not generalisability. Stephen Senn
It is a fundamental but common mistake to regard clinical trials as being a form of representative inference. The key issue is comparability. Experiments do not involve typical material. In clinical trials, it is concurrent control that is key, and randomisation is a device for ensuring that standard errors are calculated appropriately and reflect the design.
Generalisation beyond the clinical trial always involves theory.
Clinical trials are about comparability not generalisability. Stephen Senn
Lecture delivered at the September 2022 EFSPI meeting in Basle in which I argued that the patients in a clinical trial should not be viewed as being a representative sample of some target population.
It is argued that when it comes to nuisance parameters an assumption of ignorance is harmful. On the other hand this raises problems as to how far one should go in searching for further data when combining evidence.
In Search of Lost Infinities: What is the “n” in big data? Stephen Senn
In designing complex experiments, agricultural scientists, with the help of their statistician collaborators, soon came to realise that variation at different levels had very different consequences for estimating different treatment effects, depending on how the treatments were mapped onto the underlying block structure. This was a key feature of the Rothamsted approach to design and analysis and a strong thread running through the work of Fisher, Yates and Nelder, being expressed in topics such as split-plot designs, recovering inter-block information and fractional factorials. The null block-structure of an experiment is key to this philosophy of design and analysis. However, modern techniques for analysing experiments stress models rather than symmetries and this modelling approach requires much greater care in analysis, with the consequence that you can easily make mistakes and often will.
In this talk I shall underline the obvious, but often unintentionally overlooked, fact that understanding variation at the various levels at which it occurs is crucial to analysis. I shall take three examples, an application of John Nelder’s theory of general balance to Lord’s Paradox, the use of historical data in drug development and a hybrid randomised non-randomised clinical trial, the TARGET study, to show that the data that many, including those promoting a so-called causal revolution, assume to be ‘big’ may actually be rather ‘small’. The consequence is that there is a danger that the size of standard errors will be underestimated or even that the appropriate regression coefficients for adjusting for confounding may not be identified correctly.
I conclude that an old but powerful experimental design approach holds important lessons for observational data about limitations in interpretation that mere numbers cannot overcome. Small may be beautiful, after all.
Presidents' invited lecture ISCB Vigo 2017
Discusses various issues to do with how randomised clinical trials should be analysed. See also https://errorstatistics.com/2017/07/01/s-senn-fishing-for-fakes-with-fisher-guest-post/
History of how and why a complex cross-over trial was designed to prove the equivalence of two formulations of a beta-agonist and what the eventual results were. Presented at the Newton Institute 28 July 2008. Warning: following the important paper by Kenward & Roger (Biostatistics, 2010), I no longer think the random effects analysis is appropriate, although, in fact, the results are pretty much the same as for the fixed effects analysis.
Minimisation is an approach to allocating patients to treatment in clinical trials that forces a greater degree of balance than does randomisation. Here I explain why I dislike it.
The history of P-values is covered to try and shed light on a mystery: why did Student and Fisher agree numerically but disagree in terms of interpretation?
The challenge of small data
1. The Challenge of “Small Data”
Rare Diseases and Ways to Study Them
Stephen Senn
2. Acknowledgements
Many thanks for the invitation
This work is partly supported by the European Union’s 7th Framework Programme for research, technological development and demonstration under grant agreement no. 602552 (“IDEAL”)
Some of this work is joint with Artur Araujo and Sonia Leite in my group and Steven Julious at Sheffield University
3. Outline
• The roots of modern statistics
• Small data
• Careful design of experiments
• Some examples of problems with judging causality from associations in the health care field
• Rare diseases
• N-of-1 trials as a possible solution (in some cases)
• Some statistical issues
4. Warning
I am a statistician
This means that whatever you believe in, I don’t
5. William Sealy Gosset
1876-1937
• Born Canterbury 1876
• Educated Winchester and Oxford
• First in mathematical moderations 1897 and first in degree in Chemistry 1899
• Starts with Guinness in 1899 in Dublin
• Autumn 1906-spring 1907 with Karl Pearson at UCL
• 1908 publishes ‘The probable error of a mean’
• First method available to judge ‘significance’ in small samples
6. Ronald Aylmer Fisher
1890-1962
• Most influential statistician ever
• Also major figure in evolutionary biology
• Educated Harrow and Cambridge
• Statistician at Rothamsted agricultural station 1919-1933
• Developed theory of small sample inference and many modern concepts
• Likelihood, variance, sufficiency, ANOVA
• Developed theory of experimental design
• Blocking, Randomisation, Replication
7. Characteristics of development of statistics in the first half of the 20th century
• Numerical work was arduous and long
• Human computers
• Desk calculators
• Careful thought as to how to perform a calculation paid dividends
• Much development of inferential theory for small samples
• Design of experiments became a new subject in its own right developed by statisticians
• Orthogonality
• Made calculation easier (eg decomposition of variance terms in ANOVA)
• Increased efficiency
• Randomisation
• “Guaranteed” properties of statistical analysis
• Dealt with hidden confounders
• Factorial experimentation
• Efficient way to study multiple influences
8. A big data analyst is an expert at reaching misleading conclusions with huge data sets, whereas a statistician can do the same with small ones
9. TARGET study
• Trial of more than 18,000 patients in osteoarthritis over one year or more
• Two sub-studies
• Lumiracoxib v ibuprofen
• Lumiracoxib v naproxen
• Stratified by aspirin use or not
• Has some features of a randomised trial but also some of a non-randomised study
11. Data Filtering: Some Examples
• Oscar winners lived longer than actors who didn’t win an Oscar
• A 20 year follow-up study of women in an English village found higher survival amongst smokers than non-smokers
• Transplant receivers on highest doses of cyclosporine had higher probability of graft rejection than on lower doses
• Left-handers observed to die younger on average than right-handers
• Obese infarct survivors have better prognosis than non-obese
12. Moral
• What you don’t see can be important
• For some purposes just piling on data does not really help
• What helps are
• Careful design
• Thinking!
13. We tend to believe “the truth is in there”, but sometimes it isn’t and the danger is we will find it anyway
14. Rare Diseases
• As far as the Food and Drug Administration is concerned, a rare disease is anything that affects fewer than 200,000 people in the US
• However, many diseases are much rarer than this
• But there are at least 7,000 rare diseases
• Thus the total number of persons affected is considerable
15. N-of-1 studies
• Studies in which patients are repeatedly randomised to treatment and control
• Increased efficiency because
• Each patient acts as own control
• More than one judgement of effect per patient
• However, only possible for chronic diseases
• Possible randomisation in k cycles of treatment
• Implies 2^k possible sequences
16. A simulated example
• Twelve patients suffering from a chronic rare respiratory complaint
• For example cystic fibrosis
• Each patient is randomised in three pairs of periods, comparing two treatments A and B
• Adequate washout is built in to the design
• Thus we have 12 x 3 x 2 = 72 observations altogether
• Efficacy is measured using forced expiratory volume in one second (FEV1) in ml
• How should we analyse such an experiment?
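To make the later analyses concrete, here is a minimal R sketch (the talk mentions R and SAS® as languages in which supporting functions and macros were written) that simulates data with the structure just described. The baseline of 2000 mL, the 200 mL treatment effect and the assumed components of variation, including a patient-by-treatment component so that the effect genuinely varies from patient to patient, are invented purely for illustration and are not figures from the talk.

set.seed(602552)   # arbitrary seed
n_pat <- 12; n_cycle <- 3
d <- expand.grid(Period = 1:2, Cycle = factor(1:n_cycle), Patient = factor(1:n_pat))
# Randomise the order of A and B afresh within every patient-by-cycle pair
d$Treatment <- factor(unlist(lapply(seq_len(n_pat * n_cycle),
                                    function(i) sample(c("A", "B")))))
# Assumed components of variation (mL): patient, patient-by-cycle,
# patient-by-treatment and residual
sd_pat <- 350; sd_cyc <- 120; sd_int <- 150; sd_res <- 110; effect <- 200
pat_eff <- rnorm(n_pat, 0, sd_pat)[d$Patient]
cyc_eff <- rnorm(n_pat * n_cycle, 0, sd_cyc)[interaction(d$Patient, d$Cycle)]
int_eff <- rnorm(n_pat * 2, 0, sd_int)[interaction(d$Patient, d$Treatment)]
d$FEV1 <- 2000 + pat_eff + cyc_eff + int_eff + effect * (d$Treatment == "B") +
  rnorm(nrow(d), 0, sd_res)
str(d)   # 72 rows: Period, Cycle, Patient, Treatment, FEV1 (mL)

Nothing hinges on these particular numbers; the point is simply to have a data frame d with the 12 x 3 x 2 structure to hand for the analyses sketched alongside the later slides.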
18. Possible objectives of an analysis
• Is one of the treatments better?
• Significance tests
• What can be said about the average effect in the patients that were studied?
• Estimates, confidence intervals
• What can be said about the average effects in future patients?
• What can be said about the effect for a given patient in the trial?
• What can be said about a future patient not in the trial?
19. Two different philosophies
Randomisation philosophy
• The patients in a clinical trial are taken as fixed
• The population about which inference is made is all possible randomisations
• The patients don’t change, only the pattern of assignments of treatments changes
Sampling philosophy
• The patients are regarded as a sample from some possible population of patients
• This is usually handled by adding error terms corresponding to various components of variance
• This approach is much more common
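As a minimal sketch of the randomisation philosophy (an illustration added here, not an analysis taken from the talk), the simulated data frame d from the earlier sketch can be re-randomised: the patients and their responses are held fixed, and only the pattern of treatment assignments within each patient-by-cycle pair is permuted, giving the reference distribution against which the observed treatment difference is judged.

obs_diff <- with(d, mean(FEV1[Treatment == "B"]) - mean(FEV1[Treatment == "A"]))
rerandomise <- function(data) {
  # flip the A/B labels independently within each patient-by-cycle pair,
  # i.e. range over the assignments the design could have produced
  pairs <- split(seq_len(nrow(data)), interaction(data$Patient, data$Cycle))
  for (idx in pairs) data$Treatment[idx] <- sample(data$Treatment[idx])
  with(data, mean(FEV1[Treatment == "B"]) - mean(FEV1[Treatment == "A"]))
}
perm_diffs <- replicate(5000, rerandomise(d))
mean(abs(perm_diffs) >= abs(obs_diff))   # two-sided randomisation P-value

Under the sampling philosophy, by contrast, the same question would be handled by a model with error terms for the various components of variance, as in the mixed-model sketch given later.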
20. Is one of the treatments better?
Significance tests
Rothamsted School
• Leading statisticians such as Fisher, Yates, Nelder, Bailey
• Developed analysis of variance not in terms of linear models but in terms of symmetry
• High point was John Nelder’s theory of general balance (1965)
General Balance
1) Establish and define block structure
2) Establish and define treatment structure
3) Given randomisation the analysis then follows automatically
Here the block structure is
Patient/Cycle (GenStat®)
Patient(Cycle) (SAS®)
The treatment structure is
Treatment
21. The general balance approach
BLOCKSTRUCTURE Patient/Cycle
TREATMENTSTRUCTURE Treatment
ANOVA[FPROBABILITY=YES;NOMESSAGE=residual] Y
Analysis of variance
Variate: FEV1 (mL)
Source of variation d.f. s.s. m.s. v.r. F pr.
Patient stratum 11 1458791. 132617. 10.04
Patient.Cycle stratum 24 316885. 13204. 1.04
Patient.Cycle.*Units* stratum
Treatment 1 641089. 641089. 50.57 <.001
Residual 35 443736. 12678.
Total 71 2860501.
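For readers without GenStat®, a rough base-R analogue of the same decomposition (a sketch only, applied here to the simulated data frame d from the earlier sketch rather than to the data behind the table above) is:

# Error(Patient/Cycle) plays the role of BLOCKSTRUCTURE Patient/Cycle: the
# analysis of variance is split into Patient, Patient:Cycle and within-cycle
# strata, and Treatment is judged against the residual in the lowest stratum,
# just as in the GenStat® output above.
fit_strata <- aov(FEV1 ~ Treatment + Error(Patient/Cycle), data = d)
summary(fit_strata)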
22. Comparing two models
The first is without a patient by treatment interaction
NB Analysis with proc glm of SAS®
The second is with a patient by treatment interaction
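A sketch of the corresponding comparison in R (the slide itself uses proc glm in SAS®), again using the simulated data frame d:

# First model: patients, and cycles within patients, as fixed effects but no
# patient-by-treatment interaction; the residual has 35 degrees of freedom,
# matching the analysis of variance on the previous slide.
m1 <- lm(FEV1 ~ Patient/Cycle + Treatment, data = d)
# Second model: adds a fixed patient-by-treatment interaction, allowing the
# treatment effect to differ from patient to patient.
m2 <- lm(FEV1 ~ Patient/Cycle + Treatment + Patient:Treatment, data = d)
anova(m1, m2)   # does the effect appear to vary between the patients studied?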
23. Any damn fool can analyse a clinical trial and frequently does
24. Two more difficult questions
The average effects in future patients?
• This may require a mixed effects model
• Allow for a random treatment-by-patient interaction
• The possibility that there may be variation in the effect from patient to patient
• Strong assumptions may be involved
The average effect for a given patient?
• The same random effect model can be used to predict long-term average effects for patients in the trial
• A weighted estimate is used whereby the patient’s own results are averaged with the general result
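One possible mixed-model sketch of these two questions, using the lme4 package (an assumption; the talk refers to R and SAS® functions and macros without specifying them) and the simulated data frame d:

library(lme4)
# Random patient and patient-by-cycle terms reproduce the block structure;
# the random Patient:Treatment term allows the treatment effect to vary from
# patient to patient, which is what widens inference to future patients.
fit_mixed <- lmer(FEV1 ~ Treatment + (1 | Patient) + (1 | Patient:Cycle) +
                    (1 | Patient:Treatment), data = d)
summary(fit_mixed)   # average effect and estimated components of variation
# Shrunk patient-by-treatment deviations: within a patient, the B minus A
# difference of these, added to the fixed Treatment effect, gives that
# patient's weighted (partially pooled) treatment effect.
ranef(fit_mixed)$`Patient:Treatment`

An alternative parameterisation such as (Treatment | Patient) expresses the same idea with a different covariance structure; which is appropriate is one of the strong assumptions referred to above.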
26. The difference between mathematical and applied statistics is that the former is full of lemmas whereas the latter is full of dilemmas
29. Morals
• There is still a role for small data analysis
• Design is crucial
• Analysis depends on purpose
• And also on design and vice versa
• Results depend on philosophical framework
• Calculation is difficult, yes, but so is thinking
[Diagram linking Purpose, Design and Analysis]
30. To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of
RA Fisher