# Lund 2009


1. Biostatistical analysis: some basic issues. Jonas Ranstam, PhD
2. Plan: statistical principles and their consequences when writing manuscripts for publication in scientific journals. How does this information relate to your research? General discussion.
3. Statistics is not mathematics. "Mathematics is about deduction; statistics starts with data." (John Nelder)
4. Litmus test: a simple test for the acidity of a substance. Student's t-test: a simple test for the significance of a finding.
6. Differences between litmus tests and statistical tests. Concentrated hydrochloric acid always turns blue litmus paper red. A clinically significant difference in systolic blood pressure is not always statistically significant.
7. Differences between litmus tests and statistical tests. Whenever a blue litmus paper remains blue, it has not been exposed to concentrated hydrochloric acid (evidence of absence). A statistically insignificant difference in systolic blood pressure may be clinically significant (absence of evidence).
8. Differences between litmus tests and statistical tests. The interpretation of a blue litmus paper turning red is independent of the number of tests performed (no multiplicity issues). The interpretation of a statistical test showing significance depends on the number of tests performed (multiplicity issues).
9. Differences between litmus tests and statistical tests. A sample is litmus tested to know more about the sample. A sample is statistically tested to know more about the population from which the sample is drawn.
10. A population of black and white dots. Population types: finite (having 100 dots) or superpopulation (infinite, but symbolized by the 100 dots).
11. Sampling dots from the population.
12. Sampling uncertainty. The sample is usually obvious, but what is the population? What do the sampled items represent: a finite or a super-population? What is the sampling uncertainty?
13. Sampling uncertainty. The Central Limit Theorem (CLT) states: the mean of the sampling distribution of means equals the mean of the population from which the samples were drawn; the variance of the sampling distribution of means equals the variance of the population from which the samples were drawn, divided by the size of the samples; and the sampling distribution of means will approximate a Gaussian distribution as sample size increases. In symbols: ȳ ~ N(μp; σp/√n).
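The CLT statement above is easy to check empirically. A minimal sketch; the population (a hypothetical exponential distribution with mean 2, deliberately non-Gaussian) and the sample size n = 50 are arbitrary choices for illustration:

```python
import random
import statistics
from math import sqrt

random.seed(1)  # reproducible simulation

# Hypothetical population: exponential with mean 2 (so sigma_p = 2 as well),
# chosen because it is clearly non-Gaussian.
mu_p, sigma_p, n = 2.0, 2.0, 50

# Draw many samples of size n and record each sample mean.
means = [statistics.fmean(random.expovariate(1 / mu_p) for _ in range(n))
         for _ in range(10_000)]

print(statistics.fmean(means))   # close to mu_p
print(statistics.stdev(means))   # close to sigma_p / sqrt(n) ≈ 0.283
```

Despite the skewed population, a histogram of `means` would look approximately Gaussian, as the theorem promises.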
14. Measurement uncertainty. A measurement, Yi, has two components: the measured object's true measure, μ0, and a measurement error, ei: Yi = μ0 + ei. A measurement error is typically unknown, but the population of errors may have a known distribution: ei ~ N(μe; σe). If μe ≠ 0, measurements will be biased, and the greater σe, the lower the measurements' precision.
15. Combining sampling uncertainty and measurement errors. As long as measurements are not biased, measurement errors reduce statistical precision. This may, however, in turn lead to (dilution) bias, or "attenuation by errors".
16. Alt. 1. Statistical hypothesis testing. Sampling uncertainty described by a probability. H0: π = π0; H1: π ≠ π0. p = Prob(drawing a sample looking like H1 | H0). If unlikely, reject H0 ("statistically significant").
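The hypothesis test of H0: π = π0 sketched above can be implemented as a one-sample proportion z-test. The data (62 responders out of 100, π0 = 0.5) are hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def prop_ztest(successes, n, pi0):
    """Two-sided one-sample z-test of H0: pi = pi0 (normal approximation)."""
    p_hat = successes / n
    z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample: 62 of 100 patients respond; H0: pi = 0.5.
z, p = prop_ztest(62, 100, 0.5)
print(z, p)  # z = 2.4, p ≈ 0.016, so H0 is rejected at the 5% level
```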
17. Alt. 2. Confidence interval. Sampling uncertainty described by a range of values likely (usually 95%) to include an estimated parameter: πE (πL, πU).
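The interval form of the same result can be sketched with the simple Wald (normal-approximation) interval, using the same hypothetical 62/100 sample as before:

```python
from math import sqrt
from statistics import NormalDist

def prop_ci(successes, n, level=0.95):
    """Wald (normal-approximation) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical sample: 62 responders of 100.
low, high = prop_ci(62, 100)
print(round(low, 3), round(high, 3))  # (0.525, 0.715): the range excludes 0.5
```

Unlike a bare p-value, the interval shows the effect size and its precision directly, which is the point of the slides that follow.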
18. P-values vs. confidence intervals (intervals plotted against a reference line at effect = 0):
    - p < 0.05: statistically significant effect
    - n.s.: inconclusive
    - p < 0.05: statistically significant reversed effect
19. P-values vs. confidence intervals (intervals plotted against effect = 0 and a clinically significant effect threshold):
    - p < 0.05: statistically, but clinically insignificant effect
    - p < 0.05: statistically and clinically significant effect
    - p < 0.05: statistically, but not necessarily clinically, significant effect
    - n.s.: inconclusive
    - n.s.: neither statistically nor clinically significant effect
    - p < 0.05: statistically significant reversed effect
20. ICMJE Uniform requirements for manuscripts... Results: "When possible, quantify findings and present them with appropriate indicators of measurement error or uncertainty (such as confidence intervals)." "Avoid relying solely on statistical hypothesis testing, such as the use of P values, which fails to convey important information about effect size."
21. Present observed data using a) central tendency (mean, median or mode), b) variability (SD or range) and c) the number of observations (n). Describe parameter estimates a) with 95% confidence intervals (≈2 SEM) and b) the number of observations in the sample (n).
22. Many scientists do not understand the difference between sample and population. ±SD or ±SEM? SD is a measure of variability; SEM is a measure of sampling uncertainty. Note: ±1 SEM corresponds to a 68% confidence interval, ±2 SEM to a 95% confidence interval.
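The SD/SEM distinction above in code; the blood pressure values are made up for illustration:

```python
import statistics
from math import sqrt

# Hypothetical sample of systolic blood pressures (mmHg).
x = [138, 145, 129, 151, 142, 136, 148, 133, 140, 147]
n = len(x)

mean = statistics.fmean(x)
sd = statistics.stdev(x)   # variability of individual observations
sem = sd / sqrt(n)         # sampling uncertainty of the sample mean

# +/- 2 SEM gives an approximate 95% confidence interval for the
# population mean; +/- 1 SEM would give only about 68% coverage.
print(f"mean = {mean:.1f}, SD = {sd:.1f}, SEM = {sem:.1f}")
print(f"approx. 95% CI: ({mean - 2 * sem:.1f}, {mean + 2 * sem:.1f})")
```

Reporting mean ± SEM makes the data look less variable than mean ± SD; the slide's point is that the two answer different questions.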
23. ICMJE Uniform requirements for manuscripts... Methods: describe statistical methods with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results. This part of the manuscript could be written prior to the experiment.
24. Sample size calculation. An essential part of designing a study, or experiment, is planning for uncertainty. Both too much and too little uncertainty in results are unethical, because research resources are scarce and should be used rationally.
25. Sample size calculation. Sampling uncertainty increases with the variability of the studied variable. Sampling uncertainty decreases with the number of independent observations in the sample. Smaller sampling uncertainty is required to detect small differences than large ones.
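The three determinants listed above (variability, target difference, error rates) all appear in the standard normal-approximation sample-size formula for comparing two proportions. A rough sketch; the figures in the examples that follow were presumably computed with more refined formulas (e.g. with continuity correction), so this simple version gives smaller values:

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Crude normal-approximation sample size per group for detecting a
    difference between two proportions (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 5% false positive rate
    z_beta = NormalDist().inv_cdf(power)           # 20% false negative rate
    variability = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variability / (p1 - p2) ** 2)

# A large difference (30% vs 3% falling ill) needs far fewer patients
# than a small one (30% vs 21%).
print(n_per_group(0.30, 0.03))  # 26 per group
print(n_per_group(0.30, 0.21))  # 365 per group
```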
26. Example 1. A vaccine trial. Without protection, 30% will fall ill. Investigating a protective effect with a 5% false positive and 20% false negative error rate requires:

    | Protection | Number of patients |
    | --- | --- |
    | 90% | 72 |
    | 80% | 94 |
    | 70% | 128 |
    | 60% | 180 |
    | 50% | 268 |
    | 40% | 428 |
27. Example 2. Side effect surveillance. Guillain-Barré syndrome: incidence = 1×10⁻⁵ person-years. To investigate the side effect with a 5% false positive and 20% false negative error rate requires:

    | Risk increase | Number of patients |
    | --- | --- |
    | 100 times | 1 098 |
    | 50 times | 2 606 |
    | 20 times | 9 075 |
    | 10 times | 26 366 |
    | 5 times | 92 248 |
    | 2 times | 992 360 |
28. Multiplicity. Each tested null hypothesis has, at the 5% significance level, a 5% chance of being a false positive. Testing n null hypotheses at the 5% significance level leads to an overall chance of at least one false positive test of 1 − 0.95^n:

    | n | Overall p |
    | --- | --- |
    | 1 | 0.050 |
    | 2 | 0.098 |
    | 3 | 0.143 |
    | 4 | 0.186 |
    | 5 | 0.226 |
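The overall-error column follows directly from 1 − 0.95^n; a one-line check:

```python
def familywise_alpha(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent tests,
    each performed at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests

for n in range(1, 6):
    print(n, familywise_alpha(n))  # 0.050, 0.098, 0.143, 0.186, 0.226 (rounded)
```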
29. Multiplicity (continued). Some experiments have multiplicity problems severe enough to make conventional statistical testing meaningless.
30. Sampling from a population without variation.
31. Sampling from a population without variation. Without variation there is no sampling uncertainty to evaluate. Do not test deterministic outcomes, like the sex distribution after matching for sex or baseline imbalance after randomisation.
32. How to report laboratory experiments. Guidelines to promote the transparency of research have been developed in clinical medicine and epidemiology: CONSORT, STROBE, PRISMA, etc. No such guidelines exist for reporting laboratory experiments.
33. Would reporting guidelines be useful? A systematic review (Roberts et al. BMJ 2002;324:474-476) shows that reporting is generally inadequate: only 2 of 44 papers described the allocation of analysis units.
34. General principles. The experiment should be described in a way that makes it possible for the reader to repeat it. The statistical analysis should be described with enough detail to allow a reader with access to the original data to verify the reported results.
35. Introduction section. What is the purpose of the experiment? What hypotheses will you test (in general terms)?
36. Material and methods section. What is the design of your experiment? What are your sample and population? What is your analysis unit?
37. Statistics section. What statistical methods did you use? Did you check whether the assumptions were fulfilled? How did you do that? Were the assumptions fulfilled?
38. Results section (observation). What was the mean or median value? What was the variation (SD or range)? How many observations did you have?
39. Results section (inference). What hypotheses did you test (specifically)? What was the effect size (parameter estimate)? What was its 95% confidence interval? If you present p-values, present the actual p-value, e.g. p = 0.45 or p = 0.003, unless p < 0.0001. Do NOT write p > 0.05, "n.s.", p = 0.0000, or stars.
40. Discussion section. What is your strategy for multiplicity? What is your interpretation of the presented p-values? Can you see any problems with bias? What is your overall conclusion?
41. Discussion.
    1. Describe an experiment you have performed or plan to perform, and state its purpose.
    2. Present the sample and the population from which the sample is drawn.
    3. Suggest how the sampling uncertainty could be presented in a manuscript submitted for publication in a scientific journal, using p-values and confidence intervals.

    Help:
    Ranstam J. Sampling uncertainty in medical research. Osteoarthritis Cartilage 2009;17:1416-1419.
    Ranstam J. Reporting laboratory experiments. Osteoarthritis Cartilage 2009 (in press) doi:10.1016/j.joca.2009.07.006.