Common statistical errors in medical publications

Presentation by Petra Graham, Macquarie University, to the AHMEN meeting 19 June 2017 in Sydney, Australia

1. Common Statistical Errors In Medical Publications
Dr Petra Graham
Macquarie University
petra.graham@mq.edu.au
2. Errors are surprisingly common
• Statistical errors in medical journals are surprisingly common. For example:
• Olsen (2003) found that 54% of a sample of 141 papers published in Infection and Immunity had errors in reporting, analysis or both.
• Yim et al. (2010) found 79% of a sample of 139 papers published in the Korean Journal of Pain had errors.
• Nieuwenhuis et al. (2011) found that 15% of articles reviewed in the top-ranking journals Science, Nature, Nature Neuroscience, Neuron and The Journal of Neuroscience had used the wrong method.
3. Types of errors
Errors can be broadly classified into three main areas:
• Errors in design
• Errors in analysis
• Reporting and interpretational errors
Various publications describe these problems, e.g. Clark (2011), Lang (2004), Olsen (2003), Strasak et al. (2007).
4. Common Design Errors:
• Lack of a sample size calculation (or wrong calculation) – see the sketch below
• Studies with too few subjects are underpowered – a difference won't be found even if a real difference exists (Altman and Bland, 1995)
Example: results of Sung et al. (1993)
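To make the slide's point concrete, here is a minimal sketch of an a priori sample size calculation using statsmodels. The effect size (0.5), power (80%) and significance level (5%) are illustrative assumptions, not figures from Sung et al. (1993).

```python
# Sketch of an a priori sample size / power calculation (illustrative numbers).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Patients needed per arm to detect a standardised effect of 0.5
# with 80% power at a two-sided 5% significance level.
n_per_arm = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05,
                                 alternative="two-sided")
print(f"required sample size per arm: {n_per_arm:.0f}")   # about 64

# Conversely, the power actually achieved by a study with only 20 per arm.
achieved_power = analysis.solve_power(effect_size=0.5, nobs1=20, alpha=0.05,
                                      alternative="two-sided")
print(f"power with 20 per arm: {achieved_power:.2f}")      # roughly 0.3-0.4
```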
5. Another example: Yang et al. (2017)
6. More Design Errors:
• Primary outcome measures unclear
• Randomisation method unclear
• Hypotheses unclear
• An a priori analysis plan should be made so that it's clear that the research isn't the result of a "fishing expedition"
7. Errors in analysis
• Testing for equality of baseline characteristics in RCTs
• Potentially misleading, not meaningful, not needed
Example: Yang et al. (2017) report 19 baseline comparisons – expect 5% of tests (about 1) to be spuriously significant by chance alone.
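The arithmetic behind "19 comparisons, expect about 1 spurious result" can be sketched in a few lines (assuming independent tests at a 5% level):

```python
# How many of 19 baseline tests would be "significant" at alpha = 0.05 by chance?
alpha, n_tests = 0.05, 19

expected_false_positives = alpha * n_tests        # 0.95, i.e. about one test
prob_at_least_one = 1 - (1 - alpha) ** n_tests    # about 0.62

print(f"expected spurious 'significant' results: {expected_false_positives:.2f}")
print(f"probability of at least one:             {prob_at_least_one:.2f}")
```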
8. More on analysis errors:
• Use of the wrong test, e.g.
  • Two-sample t-test (for independent groups) used where a paired t-test (for dependent groups) should have been, and vice versa (see the sketch below)
  • Parametric methods used where non-parametric methods should have been used (i.e. in skewed data, small samples)
  • Methods not appropriate for the data type, e.g. linear regression used with an ordinal response
• Failure to adjust p-values for multiple testing (to avoid Type I errors)
• Failure to carefully define all of the tests used in the methods section
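A minimal sketch of the paired-versus-independent point and of a standard p-value adjustment. The before/after data are simulated and the Holm method is an illustrative choice; neither comes from the slides.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)

# Simulated before/after measurements on the SAME 20 patients (dependent data).
before = rng.normal(120, 10, size=20)
after = before - rng.normal(3, 4, size=20)    # small true improvement

# Correct here: the paired t-test works on within-patient differences.
p_paired = stats.ttest_rel(before, after).pvalue
# Wrong here: the two-sample t-test ignores the pairing and loses power.
p_independent = stats.ttest_ind(before, after).pvalue
print(f"paired p = {p_paired:.3f}, independent p = {p_independent:.3f}")

# Adjusting a set of p-values for multiple testing (Holm method shown).
raw_p = [0.01, 0.04, 0.20, 0.03]
reject, adjusted_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")
print("Holm-adjusted p-values:", np.round(adjusted_p, 3))
```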
9. And more on analysis errors:
• In RCTs, within-group comparisons are performed but the between-group tests are not (or are ignored)
• Watson et al. (2009) compared an anti-ageing product (n=30) with a placebo ("vehicle") (n=30).
• They found the test product showed significant improvement in facial wrinkles compared to baseline assessment (P = 0.013), with no significant improvement given by the vehicle (P = 0.11).
• But there was no significant difference between test and vehicle (P = 0.72).
• Media suggested this was the first anti-ageing cream "proven to work."
• But the treatment vs placebo comparison is what matters – this is the only comparison that shows whether the treatment works (or not)!
See Bland and Altman (2011) for a useful discussion of this paper.
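A small simulation of the same trap, using invented data rather than Watson et al.'s: both arms drift from baseline, so the within-group tests tend to look "significant" while the between-group test, the question the RCT actually asks, typically does not.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 30

# Change from baseline in each arm; both arms improve a little
# (regression to the mean, placebo response, measurement drift, ...).
change_treatment = rng.normal(-3.0, 4.0, size=n)
change_placebo = rng.normal(-2.5, 4.0, size=n)

# Within-group tests against zero: the misleading comparisons.
p_within_trt = stats.ttest_1samp(change_treatment, 0.0).pvalue
p_within_plc = stats.ttest_1samp(change_placebo, 0.0).pvalue

# Between-group test: the comparison that matters in an RCT.
p_between = stats.ttest_ind(change_treatment, change_placebo).pvalue

print(f"treatment vs baseline: p = {p_within_trt:.3f}")
print(f"placebo vs baseline:   p = {p_within_plc:.3f}")
print(f"treatment vs placebo:  p = {p_between:.3f}")
```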
10. And more on analysis:
• Continuous data made binary or into ordinal categories (or ordinal categories made binary) without justification
• May be done to "find"/increase significance
• Typically a great loss of information results from dichotomisation (see the sketch below)
• Failure to show/comment on assumptions required for testing
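A small sketch (simulated data) of the information lost by dichotomising: the same modest group difference is analysed on the continuous scale and again after an arbitrary median split.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.normal(0.0, 1.0, size=60)
group_b = rng.normal(0.5, 1.0, size=60)   # modest true difference

# Analysis on the continuous scale.
p_continuous = stats.ttest_ind(group_a, group_b).pvalue

# Same data dichotomised at the pooled median ("high" vs "low").
cut = np.median(np.concatenate([group_a, group_b]))
table = [[int(np.sum(group_a > cut)), int(np.sum(group_a <= cut))],
         [int(np.sum(group_b > cut)), int(np.sum(group_b <= cut))]]
p_dichotomised = stats.fisher_exact(table)[1]

# Cutting a continuous measurement into two bins throws information away,
# so the dichotomised analysis usually gives a noticeably larger p-value.
print(f"continuous analysis:   p = {p_continuous:.4f}")
print(f"dichotomised analysis: p = {p_dichotomised:.4f}")
```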
11. Errors/Deficiencies in Reporting Statistics
• Failure to use (or define the use of) a variability measure (e.g. SD)
• Use of mean and standard deviation (SD) in skewed data – median and quartiles are preferable
• Using standard error (SE) of the mean instead of SD in descriptive statistics, or confusing the two – SE is used because it is smaller so "looks" better (see the sketch below)
• Reporting thresholds for p-values rather than the actual p-values
• Reporting a p-value but no data (i.e. estimate and interval, change and interval, etc.) – like the anti-ageing cream study
• Reporting significance of a test or analysis that is not shown or described
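A small sketch (simulated data) of why SD and SE are not interchangeable as descriptive statistics: SD describes patient-to-patient spread and stays roughly constant, while SE describes the precision of the mean and shrinks as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(3)
for n in (10, 100, 1000):
    x = rng.normal(50, 8, size=n)   # individual values always vary by about 8
    sd = x.std(ddof=1)              # spread of individual patients
    se = sd / np.sqrt(n)            # precision of the sample mean
    print(f"n = {n:4d}  mean = {x.mean():6.2f}  SD = {sd:5.2f}  SE = {se:5.2f}")
```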
12. Errors in conclusions
• Correlation is not causation!
• Make sure that conclusions don't suggest causation
13. Errors in conclusions
• Conclusions are drawn that are not supported by results
• Interpreting "not significant" as "not different" or "equivalent"
Examples: Yang et al. (2017); Sung et al. (1993)
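A minimal sketch (a simulated small trial, not either of the cited studies) of why "not significant" is not "no difference": with few patients the confidence interval is wide enough to contain both no effect and clinically important effects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
a = rng.normal(10.0, 5.0, size=12)    # control arm
b = rng.normal(13.0, 5.0, size=12)    # treated arm: a real 3-unit difference

p = stats.ttest_ind(a, b).pvalue      # often > 0.05 with samples this small

diff = b.mean() - a.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
t_crit = stats.t.ppf(0.975, df=len(a) + len(b) - 2)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se

# The wide interval shows that values from "no effect" up to large effects
# remain plausible, so the result is not evidence of equivalence.
print(f"p = {p:.2f}, difference = {diff:.1f}, approx 95% CI ({ci_low:.1f}, {ci_high:.1f})")
```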
14. Errors in conclusions
• Making too much of potentially spurious results in the conclusions
15. References – useful summaries of errors
Altman DG, Bland JM. Statistics notes: Absence of evidence is not evidence of absence. BMJ 1995; 311: 485.
Bland JM, Altman DG. Comparisons against baseline within randomised groups are often used and can be highly misleading. Trials 2011; 12: 264.
Clark GT, Mulligan R. Fifteen common mistakes encountered in clinical research. Journal of Prosthodontic Research 2011; 55: 1-6.
Lang T. Twenty Statistical Errors Even YOU Can Find in Biomedical Research Articles. Croatian Medical Journal 2004; 45(4): 361-370.
Nieuwenhuis S, et al. Erroneous analyses of interactions in neuroscience: a problem of significance. Nature Neuroscience 2011; 14: 1105-1107.
Olsen CH. Guest commentary: Review of the Use of Statistics in Infection and Immunity. Infection and Immunity 2003; 71(12): 6689-6692.
Strasak AM, et al. Statistical errors in medical research – a review of common pitfalls. Swiss Medical Weekly 2007; 137: 44-49.
Yim KH, et al. Analysis of Statistical Methods and Errors in the Articles Published in the Korean Journal of Pain. Korean Journal of Pain 2010; 23: 35-41.
16. References – examples
Sung et al. Octreotide infusion or emergency sclerotherapy for variceal haemorrhage. Lancet 1993; 342: 637-641.
Watson REB, et al. A cosmetic 'anti-ageing' product improves photoaged skin: a double-blind, randomized controlled trial. Br J Dermatol 2009; 161: 419-426.
Yang et al. Finding the Optimal Volume and Intensity of Resistance Training Exercise for Type 2 Diabetes: The FORTE Study, a Randomized Trial. Diabetes Research and Clinical Practice 2017; 130: 98-107.
http://www.tylervigen.com/spurious-correlations (correlation plots)
Thanks to Deb Wyatt, Michael Martin and the MedStats Google Group users for some great examples and references.
17. Additional thoughts
• Several of you asked me how you could get in contact with statisticians to include as reviewers for papers or on your editorial boards. There are several approaches that can be taken:
1. Email the anzstat mailing list (http://www.maths.uq.edu.au/research/research_centres/anzstat/). This is a list for people interested in statistics. Because you could identify people at any stage of their career, or non-statisticians, it would be important to ask for a CV and maybe check references.
2. Approach university department heads in stats/maths/biostatistics and ask for recommendations on people to invite.
3. I plan to talk to the Statistical Society of Australia about putting together a registry of statisticians willing to help.
