How to improve the chance of getting your manuscript accepted for publication
Jonas Ranstam PhD

[Figure: from anecdotal evidence to evidence based medicine]
- Anecdotal evidence (case reports)
- Randomised clinical trial of streptomycin and tuberculosis (1948) (Bradford Hill)
- Case-control study of smoking and lung cancer (1950) (Bradford Hill)
- Cohort study of smoking and lung cancer (1954) (Bradford Hill)
- Evidence based medicine (The Cochrane Collaboration 1993)

Plan
1. Methodological background
2. General guidelines
3. Special recommendations
   a) case reports
   b) mechanical experiments
   c) in vitro/cadaver experiments
   d) cross-sectional studies
   e) epidemiological studies
   f) randomized trials
4. Summary

What is statistics used for?
1. Describing data (statistics in the plural)
2. Interpreting uncertain data (statistics in the singular)

Two kinds of uncertainty
1. Uncertainty of measurement
2. Uncertainty of sampling

1. Uncertainty of measurement
The precision of the measurement instrument used. The precision of the Finapres non-invasive blood pressure monitor is on average 12.1 mm Hg.

2. Uncertainty of sampling
Individual effects vary between subjects. Different samples of subjects yield different observed mean effects.

Example
Assume that the cumulative 10-year revision rate of the Oxford knee prosthesis is 8% and that two groups of 100 patients receiving the prosthesis are randomly selected and followed over time.
The two groups are likely to get different numbers of patients revised during follow-up.

[Figure: 375 randomly ordered patients, of whom 30 (8%) will be revised within 10 years]
[Figure: two samples, one with 6% revised and one with 12% revised]
H0: The two samples represent the same population
H1: The two samples represent different populations

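The sampling example above can be illustrated with a small simulation. This is only a sketch: the function names, the seed and the number of simulated pairs are assumptions made for illustration, and each patient is assumed to have the same, independent 8% revision risk.

```python
import random


def count_revised(n_patients, revision_risk, rng):
    """Number of patients revised when each has the same revision risk."""
    return sum(rng.random() < revision_risk for _ in range(n_patients))


def simulate_pairs(n_pairs, n_patients=100, revision_risk=0.08, seed=1):
    """Draw n_pairs of independent groups from the same population."""
    rng = random.Random(seed)  # fixed seed (an assumption) makes the run reproducible
    return [
        (count_revised(n_patients, revision_risk, rng),
         count_revised(n_patients, revision_risk, rng))
        for _ in range(n_pairs)
    ]


pairs = simulate_pairs(1000)
share_different = sum(a != b for a, b in pairs) / len(pairs)
print(f"Pairs with different numbers revised: {share_different:.0%}")
print("First five pairs:", pairs[:5])
```

Both groups in every pair come from the same population, so any difference between them reflects sampling uncertainty only.
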
P-value
The probability of observing an effect at least as large as the one observed if only sampling uncertainty were at work (i.e. if H0 were true).
12/100 vs. 6/100, Fisher's exact test p = 0.22

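As a hedged illustration, the two-sided Fisher's exact test for the 12/100 vs. 6/100 comparison can be sketched with the standard library alone. The function name is an assumption, and the two-sided rule used here (summing all tables that are no more probable than the observed one) is one common convention; the slide does not state which variant was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x events in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(col1, row1)
    probabilities = [table_probability(x) for x in range(lowest, highest + 1)]
    observed = table_probability(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in probabilities if p <= observed * (1 + 1e-9))


# 12 of 100 revised in one group, 6 of 100 in the other.
p_value = fisher_exact_two_sided(12, 88, 6, 94)
print(f"p = {p_value:.2f}")
```
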
P-values are often misunderstood
They cannot
- describe clinical relevance (they depend on sample size)
- show that a difference “does not exist”, because n.s. is absence of evidence, not evidence of absence

Confidence interval
A range of values that, at the specified confidence level, is expected to include the population parameter being estimated.
12/100 vs. 6/100, RR = 2.0 (95% CI: 0.7 - 5.6)
[Figure: the interval on a relative risk axis marked 1/2, 1 and 2, with regions labelled p < 0.05 and n.s.]

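A minimal sketch of how a relative risk and its confidence interval can be computed for the same data. The slide does not say which interval method was used; the large-sample (Wald) interval on the log scale below is an assumption, so its limits need not match the quoted 0.7 - 5.6 exactly. The function name is also hypothetical.

```python
from math import exp, log, sqrt


def relative_risk_ci(events_1, n_1, events_2, n_2, z=1.96):
    """Relative risk with an approximate 95% Wald interval on the log scale."""
    rr = (events_1 / n_1) / (events_2 / n_2)
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = sqrt(1 / events_1 - 1 / n_1 + 1 / events_2 - 1 / n_2)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


rr, lower, upper = relative_risk_ci(12, 100, 6, 100)
print(f"RR = {rr:.1f} (95% CI: {lower:.1f} - {upper:.1f})")
print("Interval includes RR = 1:", lower < 1 < upper)
```

An interval that includes 1 corresponds to a statistically non-significant difference at the 5% level, which agrees with the p-value reported for these data.
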
Important assumptions
Many statistical methods, like Student's t-test and ANOVA, are based on the assumption of Gaussian distribution and homogeneous variance.
If the assumptions are not met, use alternative (non-parametric) methods, like the Mann-Whitney U-test or Kruskal-Wallis non-parametric ANOVA.

Important assumptions
Most conventional methods (both parametric and non-parametric) require independent observations.
- Patients are independent
- Patients' knees, hips, shoulders, feet, etc. are not

How Many Patients? How Many Limbs? Analysis of Patients or Limbs in the Orthopaedic Literature: A Systematic Review
Bryant et al. JBJS Am. 2006;88:41-45.
Our findings suggest that a high proportion (42%) of clinical studies in high-impact-factor orthopaedic journals involve the inappropriate use of multiple observations from single individuals, potentially biasing results. Orthopaedic researchers should attend to this issue when reporting results.

Important assumptions
Most conventional methods (both parametric and non-parametric) require independent observations.
Include only one observation per patient, or use a statistical method that can handle dependent data, e.g. multilevel or mixed effects models.
Always present both the number of observations and the number of patients.

Multiplicity
In contrast to many other forms of precision, statistical precision depends on the number of performed measurements (significance tests).

Multiplicity
Each significance test at a 5% significance level has a 5% risk of a false positive test.
Repeated testing increases the risk of at least one false positive test.

Number of tests    Risk of at least one false positive
 1                 0.05
 2                 0.10
 5                 0.23
10                 0.40

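The figures in the table follow from the formula 1 - (1 - alpha)^n, assuming that the tests are independent and that every null hypothesis is true. A short sketch (the function name is hypothetical):

```python
def risk_of_false_positive(n_tests, alpha=0.05):
    """Risk of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 2, 5, 10):
    print(f"{n_tests:>2} tests: {risk_of_false_positive(n_tests):.4f}")
```

With correlated tests the risk grows more slowly, so the independence assumption gives an upper reference point rather than an exact figure for real studies.
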
Statistical Methods
“Describe statistical methods with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results.”
Required for analytical methods (statistical models, hypothesis tests, confidence intervals).
Descriptions are often unclear, vague or ambiguous. They need to be clear and detailed.

Results
“When possible, quantify findings and present them with appropriate indicators of measurement error or uncertainty (such as confidence intervals).”
Measures of statistical precision (p-values and confidence intervals) are necessary for generalizing results beyond the patients examined.

Results
“Avoid relying solely on statistical hypothesis testing, such as the use of P values, which fails to convey important information about effect size.”
Describe both your observations and how you interpret them (use confidence intervals or p-values).

Clinically        Statistically significant
significant       yes     no
yes               a       b
no                c       d

Stating only that there was, or was no, (statistically significant) difference is too simplistic.

Example
Two side effects with a new osteoporosis treatment:
- A statistically significant reduction in body hair growth rate by 5% (p = 0.04)
- A statistically insignificant increase in systolic blood pressure by 25 mm Hg (p = 0.06)

Confidence intervals are better than p-values
In contrast to p-values they do
- relate to clinical significance
- show when a difference “does not exist”
because they present lower and upper limits of potential clinical effects/differences

P-value and confidence interval
[Figure: six confidence intervals on an effect axis marked with 0 and the region of clinically significant effects]

P-value [2 alternatives]    Conclusion from confidence interval [6 alternatives]
p < 0.05                    Statistically but not clinically significant effect
p < 0.05                    Statistically and clinically significant effect
p < 0.05                    Statistically, but not necessarily clinically, significant effect
n.s.                        Inconclusive
n.s.                        Neither statistically nor clinically significant effect
p < 0.05                    Statistically significant reversed effect

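The six conclusions can be written as a small decision rule. This is a sketch under stated assumptions: effects are measured on a scale where 0 means no effect, the expected direction is positive, and `clinical_threshold` (a hypothetical name) is the smallest effect regarded as clinically significant. It covers only the six alternatives on the slide and does not separately assess clinically relevant effects in the reversed direction.

```python
def interpret_interval(lower, upper, clinical_threshold):
    """Classify a confidence interval (lower, upper) for an effect estimate."""
    if clinical_threshold <= 0 or lower > upper:
        raise ValueError("need clinical_threshold > 0 and lower <= upper")
    if upper < 0:
        return "statistically significant reversed effect"
    if lower > 0:
        # The interval excludes 0, which corresponds to p < 0.05.
        if upper < clinical_threshold:
            return "statistically but not clinically significant effect"
        if lower >= clinical_threshold:
            return "statistically and clinically significant effect"
        return "statistically, but not necessarily clinically, significant effect"
    # The interval includes 0, which corresponds to n.s.
    if upper < clinical_threshold:
        return "neither statistically nor clinically significant effect"
    return "inconclusive"


# Hypothetical intervals with a clinically significant effect starting at 5 units.
for interval in [(1, 3), (6, 9), (2, 8), (-2, 8), (-2, 3), (-6, -1)]:
    print(interval, "->", interpret_interval(*interval, clinical_threshold=5))
```

The two n.s. cases show why a p-value alone is not enough: only the interval tells an inconclusive result apart from one that excludes clinically significant effects.
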
When there is a difference in data
Do not write that there is not a difference!

There were indeed differences: they are 0.45 and 0.57.
Better alternative:
“The observed differences in extraction torques between the two types of uncoated distal pins can be explained by chance.”

Avoid non-technical use of technical terms and use clear expressions
- significant: clinically or statistically?
- no difference: statistically insignificant?
- statistical difference: statistically significant?
- matched: selected or just comparable?
- correlation: relation, regression?
- normal: Gaussian distribution?
- random: mathematical algorithm?
- etc.

Mechanical experiments
What do p-values and confidence intervals relate to?
- Measurement uncertainty (Perhaps)
- Sampling uncertainty (No, there is no information on subject variation. The findings cannot be generalized beyond the device.)

In vitro/cadaver experiments
What do p-values and confidence intervals relate to?
- Measurement uncertainty (Perhaps)
- Sampling uncertainty (Perhaps, if the observations provide information on variation between subjects)

Example
In a study with 60 observations, 20 specimens had been taken from each of 3 subjects.
The specimens were distributed randomly between one control group and one experimental group.
What do significance tests of these two groups tell us?

Epidemiological studies
- Exploratory, hypothesis generating; multiplicity issues considered less important than validity issues
- External validity (source of subjects)
- Internal validity (confounding)

Results
Uniform Requirements: “Where scientifically appropriate, analyses of the data by variables such as age and sex should be included.”
Observational studies require adjustment for known and suspected confounding factors to produce valid effect estimates.
This adjustment is usually performed using statistical modelling (e.g. ANCOVA or regression analysis). The purpose is to increase validity.

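A minimal sketch of what regression adjustment does, using made-up, noise-free data in which treated patients are older and age itself raises the outcome. All names and numbers are hypothetical, and the least squares fit is written out with the standard library only; in practice a statistical package would be used.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gauss-Jordan elimination with pivoting."""
    n = len(vector)
    rows = [list(matrix[i]) + [vector[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def least_squares(design, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean(values):
    return sum(values) / len(values)


# Hypothetical data: true treatment effect 2.0, age effect 0.1 per year.
ages = [40, 45, 50, 55, 60, 65, 70, 75]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
outcome = [2.0 * t + 0.1 * a for t, a in zip(treated, ages)]

# Crude comparison: difference in means, ignoring age.
crude = (mean([y for y, t in zip(outcome, treated) if t == 1])
         - mean([y for y, t in zip(outcome, treated) if t == 0]))

# Adjusted comparison: regression on intercept, treatment and age.
design = [[1.0, t, a] for t, a in zip(treated, ages)]
intercept, adjusted, age_slope = least_squares(design, outcome)

print(f"crude effect:    {crude:.2f}")
print(f"adjusted effect: {adjusted:.2f}")
```

The crude estimate mixes the treatment effect with the age difference between the groups, while the adjusted estimate recovers the effect that was built into the data.
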
Results
Automatic stepwise regression (forward or backward) is not an adequate method for confounding adjustment.

Clinical trials
“The ICMJE member journals will require, as a condition of consideration for publication in their journals, registration in a public trials registry.”
“The ICMJE recommends that journals publish the trial registration number at the end of the Abstract.”

Clinical trials
“When reporting experiments on human subjects, authors should indicate whether the procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2000 (5).”

WORLD MEDICAL ASSOCIATION DECLARATION OF HELSINKI
Ethical Principles for Medical Research Involving Human Subjects
27. ...Reports of experimentation not in accordance with the principles laid down in this Declaration should not be accepted for publication.

Purpose of a randomized trial
To test a hypothesis with control of random and systematic errors.
- No bias (randomization & blinding)
- No multiplicity problems

Randomization
- Mathematical algorithm
- Stratified
- Concealment of outcome
- Reproducible

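A hedged sketch of an allocation list that is generated by an algorithm, stratified and reproducible. The use of permuted blocks, the block composition, the stratum names and the seed are all assumptions made for illustration; the slide does not prescribe a particular scheme. Concealment is a matter of trial procedure and is not something the code itself can provide.

```python
import random


def allocation_schedule(stratum_sizes, seed, block=("A", "A", "B", "B")):
    """Permuted-block allocation list per stratum, reproducible from the seed."""
    rng = random.Random(seed)  # the same seed always yields the same list
    schedule = {}
    for stratum in sorted(stratum_sizes):  # fixed order keeps the list reproducible
        sequence = []
        while len(sequence) < stratum_sizes[stratum]:
            shuffled = list(block)
            rng.shuffle(shuffled)  # random order within each block
            sequence.extend(shuffled)
        schedule[stratum] = sequence[:stratum_sizes[stratum]]
    return schedule


# Hypothetical strata and sizes.
schedule = allocation_schedule({"women": 8, "men": 12}, seed=20240101)
for stratum, sequence in schedule.items():
    print(stratum, "".join(sequence))
```

Keeping the seed and the code with the trial documentation allows the list to be regenerated and checked later.
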
Study populations
- Intention-to-treat (ITT) principle: Analyze all randomized subjects according to planned treatment regimen.
- Full analysis set (FAS): The set of subjects that is as close as possible to the ideal implied by the ITT principle.
- Per protocol (PP) set: The set of subjects who complied with the protocol sufficiently to ensure that they are likely to exhibit the effects of treatment according to the underlying scientific model.

FAS vs. PP-set
FAS
  + no selection bias
  - misclassification problem (effect dilution)
PP-set
  + no contamination problem
  - possible selection bias (confounding)
When the FAS and PP-set lead to essentially the same conclusions, confidence in the trial is supported.

Endpoints
- Primary: The variable capable of providing the most clinically relevant evidence directly related to the primary objective of the trial
- Secondary: Either measurements supporting the primary endpoint or effects related to secondary objectives

Statistical analyses
- Confirmatory: The result concerns a primary endpoint and the p-value or confidence interval accounts for potential multiplicity. The result can support a claim of superiority, equivalence or non-inferiority.
- Exploratory: All other analyses. The result is either supporting or explanatory, or simply just a new hypothesis.

Reporting
“For reports of randomized controlled trials authors should refer to the CONSORT statement.”

Include with the manuscript
- Study Protocol
- Statistical Analysis Plan

Clinical trials
International regulatory guidelines
- ICH Topic E9 - Statistical Principles for Clinical Trials
- EMEA Points to consider: baseline covariates, missing data, multiplicity issues, etc.
- and similar documents from the FDA
These guidelines can all be found on the internet.

The responsibilities of a statistical reviewer
“To make sure that the authors spell out for the reader the limitations imposed upon the conclusions by the design of the study, the collection of data, and the analyses performed.”
Shor S. The responsibilities of a statistical reviewer. Chest 1972;61:486-487.

Read the manuscript from end to beginning, and look for weaknesses in the links between:
1. Conclusion
2. Discussion (Discussion section)
3. Results (Results section)
4. Methods (Material & methods section)
5. Data (Material & methods section)
6. Hypothesis (Introduction)
Make sure the chain holds all the way!

Summary
1. Present statistical methods in detail, and the number of observations included in each analysis.
2. Present data, statistical results and your conclusions
   - data description vs. results interpretation
   - clinical vs. statistical significance
   - absence of evidence is not evidence of absence
3. Adjust for confounding factors in observational studies (but do not use stepwise regression)
4. Comply with the CONSORT checklist in randomized studies