Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

The revenge of RA Fisher

263 views

Published on

Presidents' invited lecture ISCB Vigo 2017
Discusses various issues to do with how randomised clinical trials should be analysed. See also https://errorstatistics.com/2017/07/01/s-senn-fishing-for-fakes-with-fisher-guest-post/

Published in: Data & Analytics
  • Be the first to comment

  • Be the first to like this

The revenge of RA Fisher

  1. 1. The Revenge of RA Fisher Thoughts on Randomgate and its Wider Implications Stephen Senn The revenge of RA Fisher 1
  2. 2. Acknowledgements & and an apology Acknowledgments • Thanks to KyungMann and ISCB for inviting me • I appreciate that I am a late replacement for Butch (to whom we all wish the best) • I apologise to the very many of you who will be disappointed • Thanks to Bill Rosenberger & Dieter Hilgers for information on randomisation • Thanks to Dr Helen Senn for reminding me about publication bias • This work is partly supported by the European Union’s 7th Framework Programme for research, technological development and demonstration under grant agreement no. 602552. “IDeAL” Apology • I am going to talk about randomisation • I was the joint presidents’ invited lecturer at ISCB/SCT 2003 London (Grazia Valsecchi & John Lachin) • And I talked about randomisation then! The revenge of RA Fisher 2
  3. 3. The revenge of RA Fisher 3 A Quote from the Master “Two things are necessary, however: (a) that a sharp distinction should be drawn between those components of error which are to be eliminated in the field, and those which are not to be eliminated; and that while the elimination of the one class shall be complete, no attempt shall be made to eliminate the other; (b) that the statistical process of the estimation of error shall be modified so as to take account of the field arrangement, and so that the components of error actually eliminated in the field shall equally be eliminated in the statistical laboratory.” (My emphasis.) R.A. Fisher, CP 48 pp507-508 A slide from 2003
  4. 4. Outline • Carlisle’s bombshell • Three explanations • 1. Declaration of independence • 3. Balancing act • 2. Minding your Ps and Qs • Conclusion and recommendations The revenge of RA Fisher 4 Note that the order is 1,3,2 No I have not randomly permuted them 1,2,3 is the natural order to think about them but 1,3,2 is in increasing order of difficulty
  5. 5. The revenge of RA Fisher 5
  6. 6. Randomgate “I wanted to determine whether data distributions in trials published in specialist anaesthetic journals have been different to distributions in non-specialist medical journals. I analysed the distribution of 72, 261 means o f 29, 789 variables in 5087 randomised , controlled trials published in eight journals between January 2000 and December 2015: Anaesthesia (399); Anesthesia and Analgesia (1288); Anesthesiology (541); British Journal of Anaesthesia (618); Canadian Journal of Anesthesia (384); European Journal of Anaesthesiology (404); Journal of the American Medical Association (518) and New England Journal of Medicine (935). I chose these journals as I had electronic access to the full text.” The revenge of RA Fisher 6
  7. 7. Statistics of Carlisle’s Statistics Identifier Description Number Statistic Calculation Description Result V1 Journals 8 V2 Years 6.0 S1 V1xV2 Journal years 48.0 V3 Papers 5087 S2 V3/(V1xV2) Papers/Journal year 106.0 V4 Variables 29789 S3 V4/V3 Variables/Paper 5.9 V5 Means 72261 S4 V5/V4 Means/Variable 2.4 V6 S5 V4/V3 Means/Paper 14.2 The revenge of RA Fisher 7
  8. 8. Little Ps have further Ps upon their backs to bite them • Carlisle then proceeds to look at the P of the Ps • That is to say he considers if their distribution is random • Calculates the balance three different ways (but all assuming simple randomisation) • Chooses that P-value closest to 0.5 • Combines P-values for different variables by averaging the Z scores • Calculates a one-sided version and subtracts from 1 • Closer to 1 the greater the imbalance • Closer to 0 the greater the balance • Then does a QQ plot • Finds too many values close to 1 and too many close to 0 The revenge of RA Fisher 8
  9. 9. The revenge of RA Fisher 9
  10. 10. The revenge of RA Fisher 10
  11. 11. The revenge of RA Fisher 11
  12. 12. The revenge of RA Fisher 12
  13. 13. Three assumptions in Carlisle’s analysis • Carlisle assumes his P-values would have a uniform distribution given that the null hypothesis is true. • This requires however 1. Independent covariates 2. From randomly chosen randomised trials 3. Analysed as randomised • I shall take these in the order of increasing difficulty 1,3,2 The revenge of RA Fisher 13
  14. 14. Declaration of independence We hold these truths to be self-evident But are they? The revenge of RA Fisher 14
  15. 15. Independent covariates? • Many commentators have picked up on this • Carlisle combines P-values using what he calls Stouffer’s method • Weighted Z-statistics • He calculates the variance of the weighted Z-statistic as if they were independent • Ok for combining different trials • But covariates are not independent • Example of asthma • Sex & height • Height and FEV1 • FEV1 and age The revenge of RA Fisher 15
  16. 16. All outcome measures are presumed to be equi-correlated with correlation 𝜌 ≥ 0 This implies they are conditionally independent given a latent variable with whom they have correlation √𝜌 The lines give Z- inflation as a function of the number of measures combined for 1 to 10 measures Obviously for 1 measure there is no inflation The revenge of RA Fisher 16
  17. 17. The revenge of RA Fisher 17
  18. 18. The revenge of RA Fisher 18
  19. 19. The revenge of RA Fisher 19 NEJM Baseline P- values Expected variance under H0 is 1/12 This shows that the variance of the combined P- values is very different from that of the individual P- values
  20. 20. Balancing act Being a statistician means never having to say you are certain But are you certain how certain you are? The revenge of RA Fisher 20
  21. 21. How are trials really randomised? The revenge of RA Fisher 21 • Looked at BMJ, JAMA, Lancet, NEJM in 2010 • 258 trials Method Number Percent Permuted Blocks 125 49 Minimisation 29 11 Simple randomisation 4 2 Other 4 2 Unclear 96 37 TOTAL 258 100
  22. 22. An illustration of what Fisher warned us of The master’s warning • If you randomise one way and analyse another you are in for trouble • Your estimate may be reasonable • Consistent • Precise • You won’t know how precise it is • It is not clear what relationship the standard error as calculated has to the variation the statistic would have over all randomisations Hills and Arrmitage 1979 • Trial of enuresis • Patients randomised to one of two sequences • Active treatment in period 1 followed by placebo in period 2 • Placebo in period 1 followed by active treatment in period 2 • Treatment periods were 14 days long • Number of dry nights measured The revenge of RA Fisher 22
  23. 23. The revenge of RA Fisher 23 Hills, M, Armitage, P. The two- period cross-over clinical trial, British Journal of Clinical Pharmacology 1979; 8: 7-20.
  24. 24. Important points to note • Because every patient acts as his or her own control all patient level covariates (of which there could be thousands and thousands) are perfectly balanced • Differences in these covariates can have no effect on the difference between results under treatment and the results under placebo • However, period level covariates (changes within the lives of patients) could have an effect • My normal practice is to fit a period effect as well as patient effects, however, I shall omit doing so to simplify • I am going to analyse these data two ways • One is reasonable (similar to H & A’s analysis) and the other is silly The revenge of RA Fisher 24
  25. 25. 0.7 4 0.5 2 0.3 0 0.1 -2-4 0.6 0.2 0.4 0.0 Density Permutatedtreatment effect Blue diamond shows treatment effect whether or not we condition on patient as a factor. It is identical because the trial is balanced by patient. However the permutation distribution is quite different and our inferences are different whether we condition (red) or not (black) and clearly balancing the randomisation by patient and not conditioning the analysis by patient is wrong The revenge of RA Fisher 25
  26. 26. Not fitting patient effect Estimate s.e. t(56) t pr. 2.172 0.964 2.25 0.0282 (P-value for permutation is 0.034) Two Parametric Approaches Fitting patient effect Estimate s.e. t(28) t pr . 2.172 0.616 3.53 0.00147 (P-value for Permutation is 0.0014) 26The revenge of RA Fisher
  27. 27. What happens if you balance but don’t condition? Approach Variance of estimated treatment effect over all randomisations* Mean of variance of estimated treatment effect over all randomisations* Completely randomised Analysed as such 0.987 0.996 Randomised within- patient Analysed as such 0.534 0.529 Randomised within- patient Analysed as completely randomised 0.534 1.005 *Based on 10000 random permutations The revenge of RA Fisher 27 That is to say, permute values respecting the fact that they come from a cross- over but analysing them as if they came from a parallel group trial
  28. 28. Fisher’s point of view • Balancing and not conditioning is an unforgiveable sin • Your estimated standard errors bear an unknown relationship to the true standard errors • It is like analysing a match pairs design as if it were a completely randomised design • If you do this, you deserve to fail Stat 1 The revenge of RA Fisher 28
  29. 29. Implications for Carlisle’s analysis • Trials are ‘randomised’ using balancing approaches • Permuted block • Within centres • Minimisation • This eliminates various effects from the true variance at baseline • It adds them to the error variance • It will produce an excess of trials that are balanced ‘too good to be true’ The revenge of RA Fisher 29
  30. 30. Minding your P’s and Q’s Missing data matters Except when studying missing data, apparently The revenge of RA Fisher 30
  31. 31. Summary of Observational Studies (Based on Song et al, 2009) Favours negative Favours positive Analysis produced using Guido Schwarzer’s meta package in R The revenge of RA Fisher 31
  32. 32. Summary of randomised studies The revenge of RA Fisher 32
  33. 33. The revenge of RA Fisher 33
  34. 34. The revenge of RA Fisher 34
  35. 35. The revenge of RA Fisher 35
  36. 36. The revenge of RA Fisher 36
  37. 37. Implications • You don’t get to see a random set of randomised trials • You get to see a selected set • Although it is true that looking forward for a randomised trial all covariates should have a null distribution, publication acts as a data-filter • There are four things we need to know • What effect does imbalance have on significance under the null • What effect does imbalance have on significance under the alternative • What effect does significance have on publication • What is the distribution of true effects • The first of these is ‘easy’ but the rest aren’t The revenge of RA Fisher 37
  38. 38. See Senn, Statistics in Medicine, 1989 The revenge of RA Fisher 38
  39. 39. Conclusions and recommendations So long, and thanks for all the P-values The revenge of RA Fisher 39
  40. 40. We need to change our publication mode for trials • Trials are missing not at random • Or at least not completely at random • Even where the trials are present, they are poorly documented • In particular details of analysis and randomisation are frequently missing • Self-publication is the solution The revenge of RA Fisher 40
  41. 41. The elephant in the room • We can all breathe a sign of relief (probably) • There are three reasons why P-values for (naïve) baseline tests might not have the supposed distribution • This may let (most of) the trials off the hook as regards dishonesty or incompetence • Problem is that, at least as regards one aspect, all Carlisle was doing was following a (regrettably) common practice • This has unpleasant consequences for analyses of outcomes A forgotten moment in statistical history St Exupéry tries to fit a Normal Distribution to an elephant using Python The revenge of RA Fisher 41
  42. 42. The really worrying thing is we are analysing clinical trials all wrong • Fisher warned us what would happen • We are balancing but not fitting • We are failing stat 1 • The result is wasted resources, wasted lives, wasted time and increased uncertainty • We need to stop using simplicity as an excuse for not conditioning • We should use models that reflect (at least) the complexity of the randomisation procedure • Finally, if it is fraud you are interested in The revenge of RA Fisher 42
  43. 43. The revenge of RA Fisher 43 Presented at the International Society for Clinical Biostatistics and the Society for Clinical Trials, Boston, 7-10 July, 1997.
  44. 44. The revenge of RA Fisher 44 Buyse et al
  45. 45. The revenge of RA Fisher 45 Finally, I leave you with this thought Those who don’t take randomisation seriously are condemned to make conclusions at random RA Fisher demonstrating the theory of the random walk and hitting an absorbing barrier

×