This document discusses statistical issues related to using patient-reported outcome (PRO) measures in clinical trials. It notes that PROs are often measured on an ordinal scale with skewed distributions and are subject to floor and ceiling effects. Standard analysis methods may not be appropriate. It also discusses challenges like regression to the mean, baseline adjustment, and evaluating change over time without a control group. While a multi-scale PRO like a health-related quality of life instrument can be used as a primary endpoint, its multidimensional nature raises issues around validation and methodology compared to single measures. Regulatory agencies provide guidance on properly using PROs, including multi-scale ones, as primary endpoints.
Medical research relies heavily on statistical inference to generalise findings and to assess the uncertainty involved in applying them to new patients. SPSS and similar packages have made complex statistical calculations possible with little or no understanding of statistical inference. As a consequence, research findings are misunderstood, their presentation is confusing, and their reliability is massively overestimated.
Avoid overfitting in precision medicine: How to use cross-validation to relia... - Nicole Krämer
The identification of patient subgroups who may derive benefit from a treatment is of crucial importance in precision medicine. Many different algorithms have been proposed and studied in the literature.
We illustrate that many of these algorithms overfit in the sense that the treatment benefit for the identified patients is substantially overestimated. Further, we show that with cross-validation, it is possible to obtain more realistic estimates.
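As an illustration of the general point, the sketch below (written in R; the data-generating model, the simple subgroup rule and all variable names are hypothetical, not the algorithms studied in the talk) contrasts a naive estimate of the benefit in a data-driven subgroup with a cross-validated one.

```r
# Sketch: naive versus cross-validated estimates of the treatment benefit in a
# data-driven subgroup. Purely illustrative; the "rule" selects the half of the
# patients with the largest model-predicted benefit.
set.seed(2024)
n     <- 400
x     <- rnorm(n)
treat <- rbinom(n, 1, 0.5)
y     <- 0.5 * x + rnorm(n)              # note: no true treatment benefit at all
dat   <- data.frame(x, treat, y)

benefit_rule <- function(train) {
  fit <- lm(y ~ x * treat, data = train)
  pred_benefit <- function(newdata) {
    d1 <- d0 <- newdata
    d1$treat <- 1
    d0$treat <- 0
    predict(fit, d1) - predict(fit, d0)
  }
  cut <- median(pred_benefit(train))
  function(newdata) pred_benefit(newdata) > cut   # "top half" by predicted benefit
}

observed_benefit <- function(d) mean(d$y[d$treat == 1]) - mean(d$y[d$treat == 0])

# Naive: select the subgroup and estimate its benefit on the SAME data.
naive <- observed_benefit(dat[benefit_rule(dat)(dat), ])

# 5-fold cross-validation: learn the rule on four folds, estimate on the fifth.
fold <- sample(rep(1:5, length.out = n))
cv <- sapply(1:5, function(k) {
  rule <- benefit_rule(dat[fold != k, ])
  test <- dat[fold == k, ]
  observed_benefit(test[rule(test), ])
})
c(naive = naive, cross_validated = mean(cv))
```

Because the simulated data contain no true benefit, the cross-validated estimate should hover around zero, whereas the naive estimate is typically pushed upwards by the selection; that gap is the overestimation the abstract describes.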
Scrambler Therapy May Relieve Chronic Neuropathic Pain More Effectively Than Guideline-Based Drug Management: Results of a Pilot, Randomized, Controlled Trial
Outcomes from 45 Years of Clinical Practice (Paul Clement) - Scott Miller
Paul Clement is one of my heroes. He has been tracking the outcomes of his clinical services for decades. I was stunned when, in 1994, he published results from his private work over a two-decade period. Now we have the data from 45 years. Read it!
The Rothamsted school meets Lord's paradox - Stephen Senn
Lord's ‘paradox’ is a notoriously difficult puzzle that is guaranteed to provoke discussion, dissent and disagreement. Two statisticians analyse some observational data and come to radically different conclusions, each of which has acquired defenders over the years since Lord first proposed his puzzle in 1967. It features in the recent Book of Why by Pearl and Mackenzie, who use it to demonstrate the power of Pearl's causal calculus, obtaining a solution they claim is unambiguously right. They also claim that statisticians have failed to get to grips with causal questions for well over a century, in fact ever since Karl Pearson developed Galton's idea of correlation and warned the scientific world that correlation is not causation.
However, only two years before Lord published his paradox, John Nelder outlined a powerful causal calculus for analysing designed experiments based on a careful distinction between block and treatment structure. This represents an important advance in formalising the approach to analysing complex experiments that started with Fisher 100 years ago, when he proposed splitting variability using the square of the standard deviation, which he called the variance; it continued with Yates and has been developed since the 1960s by Rosemary Bailey, amongst others. This tradition might be referred to as the Rothamsted School. It is fully implemented in Genstat® but, as far as I am aware, not in any other package.
With the help of Genstat®, I demonstrate how the Rothamsted School would approach Lord's paradox and come to a solution that is not the same as the one reached by Pearl and Mackenzie, although given certain strong but untestable assumptions it would reduce to it. I conclude that the statistical tradition may have more to offer in this respect than has been supposed.
Talk given at ISCB 2016 Birmingham
For indications and treatments where they can be used, n-of-1 trials represent a promising means of investigating potential treatments for rare diseases. Each patient permits repeated comparison of the treatments being investigated, and this both increases the number of observations and reduces their variability compared to conventional parallel group trials.
However, whether the framework used for analysis is randomisation-based or model-based produces puzzling differences in inference. This can easily be shown by starting, on the one hand, with the randomisation philosophy associated with the Rothamsted school of inference and building up the analysis through the block + treatment structure approach associated with John Nelder's theory of general balance (as implemented in GenStat®), or starting, on the other hand, with a plausible variance-component approach through a mixed model. However, it can be shown that these differences are related not so much to the modelling approach per se as to the questions one attempts to answer: ranging from testing whether there was a difference between treatments in the patients studied, to predicting the true difference for a future patient, via making inferences about the effect in the average patient.
This in turn yields interesting insight into the long-run debate over the use of fixed or random effect meta-analysis.
Some practical issues of analysis will also be covered in R and SAS®, in which languages some functions and macros to facilitate analysis have been written. It is concluded that n-of-1 trials hold great promise for investigating chronic rare diseases, but that careful consideration of matters of purpose, design and analysis is necessary to make best use of them.
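By way of illustration only (this is not one of the functions or macros referred to above), a mixed-model analysis of hypothetical n-of-1 series data might look like the following in R, using the lme4 package; the fixed effect addresses the "average patient" question, while the random-effect variance is what is needed when predicting the effect in a future patient.

```r
# Sketch: mixed-model analysis of a series of n-of-1 trials, each patient
# receiving active (1) and placebo (0) repeatedly. Data and names are hypothetical.
library(lme4)

set.seed(1)
n_patients <- 10
n_cycles   <- 4
d <- expand.grid(patient = factor(1:n_patients),
                 cycle   = 1:n_cycles,
                 treat   = c(0, 1))
patient_effect <- rnorm(n_patients, mean = 2, sd = 1)   # patient-specific benefit
d$y <- 10 + patient_effect[as.integer(d$patient)] * d$treat + rnorm(nrow(d))

# Random intercept and random treatment effect by patient
fit <- lmer(y ~ treat + (1 + treat | patient), data = d)

fixef(fit)["treat"]   # effect in the 'average' patient
VarCorr(fit)          # between-patient variation, relevant for a future patient
```

The contrast between reporting only the fixed effect and also propagating the between-patient variance is essentially the fixed- versus random-effects distinction mentioned in connection with meta-analysis.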
Acknowledgement
This work is partly supported by the European Union's 7th Framework Programme for research, technological development and demonstration under grant agreement no. 602552 ("IDEAL").
The statistical revolution of the 20th century was largely concerned with developing methods for analysing small datasets. Student's paper of 1908 was the first in the English literature to address seriously the problem of second-order uncertainty (uncertainty about the measures of uncertainty) and was hailed by Fisher as heralding a new age of statistics. Much of what Fisher did was concerned with problems of what might be called 'small data', not only as regards efficient analysis but also as regards efficient design, paying close attention in addition to what was necessary to measure uncertainty validly.
I shall consider the history of some of these developments, in particular those associated with what might be called the Rothamsted School, starting with Fisher and having its apotheosis in John Nelder's theory of General Balance, and see what lessons they hold for the supposed 'big data' revolution of the 21st century.
An early and overlooked causal revolution in statistics was the development of the theory of experimental design, initially associated with the "Rothamsted School". An important stage in the evolution of this theory was the experimental calculus developed by John Nelder in the 1960s, with its clear distinction between block and treatment factors in designed experiments. This experimental calculus produced appropriate models automatically from more basic formal considerations but was, unfortunately, only ever implemented in Genstat®, a package widely used in agriculture but rarely so in medical research. In consequence its importance has not been appreciated, and the approach of many statistical packages to designed experiments is poor. A key feature of the Rothamsted School approach is that identification of the appropriate components of variation for judging treatment effects is simple and automatic.
The impressive, more recent causal revolution in epidemiology, associated with Judea Pearl, seems to have no place for components of variation, however. By considering the application of Nelder's experimental calculus to Lord's Paradox, I shall show that solutions proposed using the more modern causal calculus are problematic. I shall also show that lessons from designed clinical trials have important implications for the use of historical data and big data more generally.
Multiple sclerosis is a demyelinating disease affecting the brain, optic nerves and spinal cord, characterised by frequent relapses. There are now a number of effective treatment options for MS. Previously, only clinical parameters were used to evaluate the efficacy of MS treatments; now, disability and MRI parameters must also be considered. All of these are captured in NEDA (no evidence of disease activity). This presentation looks at the definition and classification of NEDA, and at NEDA rates with various treatment options.
There are many questions one might ask of a clinical trial, ranging from what the effect was in the patients studied, through what the effect was in individual patients, to what the effect might be in future patients. The extent to which the answers to these questions are similar depends on the assumptions made, and in some cases the design used may not permit any meaningful answer to be given at all.
A related issue is confusion between randomisation, random sampling, linear modelling and true multivariate modelling. These distinctions do not matter much for some purposes and under some circumstances, but for others they do.
Paxil Study 329 Retracted: A Critical Statistical Analysis - Carlo Carandang
I give a lecture regarding the statistical methodology employed in the 2001 Paxil (paroxetine) Study 329: Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. Keller MB, Ryan ND, Strober M, Klein RG, Kutcher SP, Birmaher B, Hagino OR, Koplewicz H, Carlson GA, Clarke GN, Emslie GJ, Feinberg D, Geller B, Kusumakar V, Papatheodorou G, Sack WH, Sweeney M, Wagner KD, Weller EB, Winters NC, Oakes R, McCafferty JP. J Am Acad Child Adolesc Psychiatry. 2001 Jul;40(7):762-72.
The two statistical cornerstones of replicability: addressing selective infer... - jemille6
Tukey's last published work in 2020 was an obscure entry on multiple comparisons in the Encyclopedia of Behavioral Sciences, addressing the two topics in the title. Replicability was not mentioned at all, nor was any other connection made between the two topics. I shall demonstrate how these two topics critically affect replicability using recently completed studies. I shall review how these have been addressed in the past, and review in more detail the available ways to address selective inference. My conclusion is that conducting many small replicability studies without strict standardization is the way to assure replicability of results in science, and we should introduce policies to make this happen.
Effective strategies to monitor clinical risks using biostatistics - Pubrica
In clinical science, biostatistics is essential for data collection, analysis, presentation, and interpretation. Epidemiology, clinical trials, population genetics, systems biology, and other disciplines all benefit from it, and it aids in the evaluation of a drug's effectiveness and safety in clinical trials.
7. The statistician's task
To eliminate as much uncertainty as possible (by design) and to quantify (in the analysis) what is left.
8. Statistical characteristics of PROs
1. Measured on an ordinal scale
2. Discrete distribution
3. Truncated distribution
4. Skewed distribution
5. Floor and/or ceiling effects
Are standard analysis methods appropriate?
10. Lillegraven S, Kristiansen IS, Kvien TK. Comparison of utility measures and their relationship with other health status measures in 1041 patients with rheumatoid arthritis. Ann Rheum Dis 2010;69:1762-1767.
12. The EQ-5D index
Simulation studies at RC Syd show that:
Type-1 error rate (risk of false positive findings): the EQ-5D index is best analysed using methods that assume a Gaussian distribution, at least when n is moderately large (around 20-50). Non-parametric alternatives perform poorly at any sample size.
Type-2 error rate (risk of false negative findings): no single method can be recommended; all investigated methods perform poorly, whatever the distributional shape.
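As a minimal sketch of this kind of simulation (not the RC Syd code, and with a purely illustrative EQ-5D-like data-generating distribution), the R snippet below draws two groups from the same skewed, ceiling-affected distribution and records how often a Gaussian-assumption test and a non-parametric test falsely declare a difference.

```r
# Illustrative type-1 error simulation for an EQ-5D-like outcome (sketch only).
# Both groups come from the SAME distribution, so every "significant" result is
# a false positive; the rejection rate estimates the type-1 error.
set.seed(42)

r_eq5d_like <- function(n) {
  x <- 1 - rbeta(n, shape1 = 1.5, shape2 = 6)  # skewed towards high values
  pmin(x + 0.1, 1)                             # ceiling effect: point mass at 1
}

sim_once <- function(n_per_group) {
  a <- r_eq5d_like(n_per_group)
  b <- r_eq5d_like(n_per_group)
  c(t_test   = t.test(a, b)$p.value < 0.05,
    wilcoxon = wilcox.test(a, b, exact = FALSE)$p.value < 0.05)
}

res <- replicate(2000, sim_once(n_per_group = 30))
rowMeans(res)   # empirical type-1 error rates (nominal level 0.05)
```

The same skeleton can be used to study the type-2 error rate by adding a treatment shift to one of the groups.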
14. Baseline versus change
Simulated data (n = 1000), pre and post measurements:
correlation(pre, post) = 0
delta = post - pre
correlation(pre, delta) = -0.7
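A small R sketch reproduces this (the n of 1000 is from the slide; the particular distribution is an illustrative assumption): when pre and post have equal variances and are uncorrelated, the change score post - pre is necessarily correlated with baseline at about -1/sqrt(2), roughly -0.7.

```r
# Sketch: "change" is correlated with baseline even when pre and post are
# completely unrelated. With var(pre) = var(post) and cor(pre, post) = 0,
# cor(pre, post - pre) = -1/sqrt(2), roughly -0.71.
set.seed(123)
n    <- 1000
pre  <- rnorm(n, mean = 50, sd = 10)
post <- rnorm(n, mean = 50, sd = 10)   # independent of pre

delta <- post - pre
cor(pre, post)    # close to 0
cor(pre, delta)   # close to -0.7
```

Squaring that correlation gives roughly 0.5, which is the arithmetic behind the next slide's statement that 50% of the "change" can be explained by baseline.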
15. Baseline versus change
50% of the "change" can be explained by baseline.
When comparing "change" in different groups, always adjust for imbalance at baseline (e.g. using ANCOVA).
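The following is a minimal R sketch of the ANCOVA adjustment recommended here, with hypothetical data and variable names: the follow-up score is regressed on treatment group with the baseline score as a covariate, rather than comparing raw change scores.

```r
# Sketch: baseline-adjusted group comparison (ANCOVA) versus a naive comparison
# of change scores, when the groups are imbalanced at baseline.
set.seed(7)
n     <- 200
group <- rep(c(0, 1), each = n / 2)                  # 0 = control, 1 = treated
pre   <- rnorm(n, mean = 50, sd = 10) + 3 * group    # baseline imbalance
post  <- 0.6 * pre + 2 * group + rnorm(n, sd = 8)    # true treatment effect = 2

# Naive analysis of change scores (distorted by the baseline imbalance)
summary(lm(I(post - pre) ~ group))

# ANCOVA: adjust the follow-up score for baseline
summary(lm(post ~ pre + group))   # the 'group' coefficient is the adjusted effect
```

In expectation the change-score comparison is pulled away from the true effect of 2 by the baseline imbalance, whereas the ANCOVA coefficient for group recovers it; whether this carries over to multi-modal outcomes such as the EQ-5D index is exactly the open question raised on a later slide.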
16. RTM - Regression to the mean
If the first measurement of a variable is extreme, the second measurement will tend to be closer to the average. Note that this is a purely statistical phenomenon.
Galton F. Regression towards mediocrity in hereditary stature. J Anth Inst Gr Br Ire. 1886;15:246–263.
19. RTM - Regression to the mean
The phenomenon explains the placebo effect in clinical trials and apparent treatment effects found in some studies on homeopathic drugs, bible reading, etc.
20. RTM - Easy to quantify (for Normally distributed endpoints)
Barnett AG, van der Pols JC, Dobson AJ. Regression to the mean: what it is and how to deal with it. Int J Epidemiol 2005;34:215–220.
21. Hypothetical example of RTM in SF-36 PF
Mean = 80, SD = 17, cut-off = 60
r      RTM effect (points)
0.0    28.4
0.1    25.5
0.2    22.7
0.3    19.9
0.4    17.0
0.5    14.2
0.6    11.3
0.7    8.5
0.8    5.7
0.9    2.8
1.0    0.0
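The expected RTM effect for a Normally distributed endpoint can be computed from the formula given by Barnett et al. (2005): sigma * (1 - r) * phi(z) / Phi(z), with z = (cut-off - mean) / SD, for selection below the cut-off (which is the direction that reproduces the slide's numbers). The short R sketch below is illustrative code, not taken from the presentation.

```r
# Sketch: expected regression-to-the-mean effect for a Normally distributed
# endpoint when subjects are selected because their baseline falls BELOW a
# cut-off (formula from Barnett, van der Pols & Dobson, Int J Epidemiol 2005).
rtm_effect <- function(mu, sigma, cutoff, r) {
  z <- (cutoff - mu) / sigma
  sigma * (1 - r) * dnorm(z) / pnorm(z)   # expected shift back towards the mean
}

# Hypothetical SF-36 PF example from the slide: mean 80, SD 17, cut-off 60
r <- seq(0, 1, by = 0.1)
round(rtm_effect(mu = 80, sigma = 17, cutoff = 60, r = r), 1)
# gives 28.4 25.5 22.7 19.9 17.0 14.2 11.3 8.5 5.7 2.8 0.0, matching the table
```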
22. Regression to the mean
Evaluation of a single group's development over time should be avoided, or should at least include a comparison with the expected RTM effect.
23. Practical consequences
When evaluating change, use a control group.
Adjust for baseline imbalance.
The validity of this procedure with multi-modal data (e.g. the EQ-5D index) is unknown.
25. Can a multi-scale PRO be used as a primary endpoint in a randomized trial?
26. Can a multi-scale PRO be used as a primary endpoint in a randomized trial?
27. Can a PRO be used as a primary endpoint in a randomized trial?
PRO as primary endpoint
“A PRO measurement can be the clinical trial’s primary endpoint measure, a co-primary endpoint measure ... or a secondary endpoint measure whose analysis is considered according to a hierarchical sequence.”
FDA. Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. Guidance for Industry.
28. Can a multi-scale PRO be used as a primary endpoint in a randomized trial?
PRO and HRQL
“The term PRO is proposed as an umbrella term to cover both single dimension and multi-dimension measures of symptoms, health-related quality of life (HRQL), health status, adherence to treatment, satisfaction with treatment, etc.”
“In the context of drug approval, HRQL is considered to represent a specific type/subset of PROs, distinguished by its multi-dimensionality.”
EMEA. Reflection paper on the regulatory guidance for the use of health-related quality of life (HRQL) measures in the evaluation of medicinal products.
29. Can a multi-scale PRO be used as a primary endpoint in a randomized trial?
HRQL as primary endpoint
“In general, the methodology for assessing the effect on HRQL is similar to the methodology used in any efficacy trial, except for issues related to the nature of the instruments, which are generally composed of multi-items, and multi-domains. Briefly, it is recommended that HRQL instrument be previously validated for the condition studied...”
EMEA. Reflection paper on the regulatory guidance for the use of health-related quality of life (HRQL) measures in the evaluation of medicinal products.
30. FDA. Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims. Guidance for Industry.