Hypothesis testing by Dr. O. Yusuf as part of the 5th Research Summer School - Jeddah at KAIMRC - WR

1. Sampling Distributions, Standard Error, Confidence Interval. Oyindamola Bidemi YUSUF, KAIMRC-WR
2. SAMPLE: Why do we sample? Note: information in a sample may not fully reflect what is true in the population. We have introduced sampling error by studying only some of the population. Can we quantify this error?
3. SAMPLING VARIATIONS: When taking repeated samples, it is unlikely that the estimates would be exactly the same in each sample. However, they should be close to the true value. By quantifying the variability of these estimates, the precision of the estimate is obtained. Sampling error is thereby assessed.
4. SAMPLING DISTRIBUTIONS: Distribution of sample estimates (means, proportions, variances). Take repeated samples and calculate estimates. The distribution is approximately normal.
5. Mathematicians have examined the distribution of these sample estimates, and their results are expressed in the central limit theorem.
6. CENTRAL LIMIT THEOREM: For sufficiently large samples, sampling distributions of the mean are approximately normally distributed regardless of the nature of the variable in the parent population. The mean of the sampling distribution is equal to the true population mean. The mean of sample means is an unbiased estimate of the true population mean. The standard deviation (SD) of the sampling distribution is directly proportional to the population SD and inversely proportional to the square root of the sample size.
7. SUMMARY: DISTRIBUTION OF SAMPLE ESTIMATES. Shape: normal. Mean = true population mean. Standard deviation = population standard deviation divided by the square root of the sample size. This standard deviation is called the standard error.
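The summary above can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, SD 1), the sample size of 50, and the 5000 repeats are assumptions chosen for the demonstration, not values from the slides.

```python
import math
import random
import statistics


def simulate_sample_means(n, repeats, seed=0):
    """Draw `repeats` samples of size `n` and return the mean of each one.

    The population is exponential with rate 1 (an assumed, deliberately
    skewed population whose true mean and SD are both 1).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 50
means = simulate_sample_means(n=n, repeats=5000)

mean_of_means = statistics.fmean(means)   # should be close to 1
observed_se = statistics.stdev(means)     # SD of the sample means
theoretical_se = 1.0 / math.sqrt(n)       # population SD / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"observed SE: {observed_se:.3f}, theoretical SE: {theoretical_se:.3f}")
```

Even though the assumed population is skewed, the sample means cluster around the population mean, and their spread is close to the population SD divided by the square root of the sample size.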
8. ESTIMATION: A major purpose or objective of health research is to estimate certain population characteristics or phenomena. The characteristic or phenomenon can be quantitative, such as average SYSTOLIC BLOOD PRESSURE of adult men, or qualitative, such as proportion with MALNUTRITION. The estimate can be a POINT or an INTERVAL ESTIMATE.
9. Point estimates: A parameter is a value in a population, e.g. a mean or a proportion. We estimate the value of a parameter using data collected from a sample. This estimate is called a sample statistic and is a POINT ESTIMATE of the parameter, i.e. it takes a single value.
10. STANDARD ERROR: Used to describe the variability of sample means. Depends on the variability of individual observations and on the sample size. Relationship: Standard error = Standard deviation / √(sample size)
11. Diagram: Sample 1 mean, Sample 2 mean, Sample 3 mean, ..., Sample n mean give the mean of the means. This mean of the means also has a standard deviation, which is the standard error (SE).
12. Standard Deviation or Standard Error? Quote the standard deviation if interest is in the variability of individuals as regards the level of the factor being investigated, e.g. SBP, age and cholesterol level. Quote the standard error if emphasis is on the estimate of a population parameter. It is a measure of uncertainty in the sample statistic as an estimate of the population parameter.
13. Interpreting SE: A large SE indicates that the estimate is imprecise. A small SE indicates that the estimate is precise. How can SE be reduced?
14. Answer: If the sample size is increased. If the data are less variable.
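Both routes to a smaller SE follow directly from the formula SE = SD / √(sample size). A minimal sketch, assuming a hypothetical helper named `standard_error` and illustrative values of SD and sample size:

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: sample SD divided by sqrt(sample size)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)


# Larger samples: quadrupling the sample size halves the SE.
for n in (16, 64, 256):
    print(f"SD = 14, n = {n:3d}: SE = {standard_error(14.0, n):.3f}")

# Less variable data: halving the SD halves the SE at a fixed sample size.
for sd in (14.0, 7.0):
    print(f"SD = {sd:4.1f}, n = 16: SE = {standard_error(sd, 16):.3f}")
```

Because the sample size enters through a square root, halving the SE by sampling alone requires four times as many subjects.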
15. INTERVAL ESTIMATE: Is SE particularly useful? It is more helpful to incorporate this measure of precision into an interval estimate for the population parameter. How? By using knowledge of the theoretical probability distribution of the sample statistic to calculate a confidence interval (CI).
16. It is not sufficient to rely on a single estimate. Other samples could yield other plausible estimates. It is more reassuring to have a range of values within which the plausible mean values lie.
17. WHAT IS A CONFIDENCE INTERVAL? The CI is a range of values, above and below a finding, in which the actual value is likely to fall. The confidence interval represents the accuracy or precision of an estimate. The 95% confidence level is commonly chosen only by convention. If other surveys had been done in the same way, then 95 per cent of the time, or 19 times out of 20, the interval calculated would contain the actual value.
18. CONFIDENCE INTERVAL: Statistic ± 1.96 S.E.(Statistic). 95% of the distribution of sample means lies within 1.96 standard errors (the SD of the sampling distribution) of the population mean.
19. Interpretation: If the experiment is repeated many times, the interval would contain the true population mean on 95% of occasions, i.e. it is a range of values within which we are 95% certain that the true population mean lies.
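The repeated-experiment interpretation can be illustrated by simulation. The sketch below is not from the slides: it assumes a normal population with mean 90 and SD 14, samples of 100, and 4000 repeated experiments, and it counts how often the interval "mean ± 1.96 SE" contains the true mean.

```python
import math
import random
import statistics


def ci_95(sample):
    """Return (lower, upper) limits: sample mean +/- 1.96 standard errors."""
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - 1.96 * se, mean + 1.96 * se


def coverage(true_mean, true_sd, n, repeats, seed=1):
    """Fraction of repeated experiments whose 95% CI contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = ci_95(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / repeats


rate = coverage(true_mean=90.0, true_sd=14.0, n=100, repeats=4000)
print(f"proportion of intervals containing the true mean: {rate:.3f}")
```

The proportion should come out close to 0.95. With much smaller samples the multiplier 1.96 gives somewhat lower coverage, which is why a t multiplier is usually preferred there.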
20. Issues in CI interpretation: How wide is it? A wide CI indicates that the estimate is imprecise. A narrow one indicates a precise estimate. Width is dependent on the size of the SE, which in turn depends on the sample size.
21. Factors affecting CI: A narrow or small confidence interval indicates that if we were to ask the same question of a different sample, we are reasonably sure we would get a similar result. A wide confidence interval indicates that we are less sure, and perhaps information needs to be collected from a larger number of people to increase our confidence.
22. Confidence intervals are influenced by the number of people that are being surveyed. Typically, larger surveys will produce estimates with smaller confidence intervals compared to smaller surveys.
23. Why are CIs important? Because confidence intervals represent the range of values that are likely if we were to repeat the survey. They are important to consider when generalizing results. Consider random sampling and the application of the correct statistical test. They are like comfort zones that encompass the true population parameter.
24. Calculating confidence limits: The mean diastolic blood pressure from 16 subjects is 90.0 mm Hg, and the standard deviation is 14 mm Hg. Calculate its standard error and 95% confidence limits.
25. Standard error = Standard deviation / √(sample size) = 14 / √16
26. 95% CI: Statistic ± 1.96 S.E.(Statistic)
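27. ANSWERS: Standard error: 3.5. 95% confidence limits: 82.54 to 97.46. These limits use the t multiplier 2.131 (15 degrees of freedom), which suits a sample of 16; the multiplier 1.96 gives 83.14 to 96.86.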
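The worked example can be reproduced in a few lines. A minimal sketch: the normal multiplier comes from the standard library, while the t multiplier 2.131 for 15 degrees of freedom is a table value hard-coded here as an assumption, because the standard library has no t quantile function.

```python
import math
from statistics import NormalDist

mean_dbp = 90.0   # mean diastolic blood pressure, mm Hg
sd = 14.0         # standard deviation, mm Hg
n = 16            # number of subjects

se = sd / math.sqrt(n)

# Normal multiplier for a 95% interval (approximately 1.96).
z = NormalDist().inv_cdf(0.975)
z_limits = (mean_dbp - z * se, mean_dbp + z * se)

# t multiplier for 15 degrees of freedom, taken from a t table (assumed).
T_15_DF = 2.131
t_limits = (mean_dbp - T_15_DF * se, mean_dbp + T_15_DF * se)

print(f"standard error: {se:.2f}")
print(f"95% limits with 1.96:  {z_limits[0]:.2f} to {z_limits[1]:.2f}")
print(f"95% limits with 2.131: {t_limits[0]:.2f} to {t_limits[1]:.2f}")
```

The limits of roughly 82.5 to 97.5 quoted in the answer match the t-based interval rather than the interval built with 1.96.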
28. CI for a proportion: P ± 1.96 S.E.(P), where SE(P) = √(p(1-p)/n). Online calculators are available.
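The proportion formula can be applied directly. In the sketch below the function name `proportion_ci` and the survey counts (40 of 200 children with malnutrition) are hypothetical, chosen only to illustrate the arithmetic.

```python
import math


def proportion_ci(successes, n, multiplier=1.96):
    """Return (p, se, lower, upper) for the interval p +/- multiplier * SE(p)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se, p - multiplier * se, p + multiplier * se


# Hypothetical survey: 40 of 200 children found to have malnutrition.
p, se, lower, upper = proportion_ci(successes=40, n=200)
print(f"p = {p:.3f}, SE(p) = {se:.4f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

This normal approximation works best when the sample is reasonably large and the proportion is not close to 0 or 1; the function does not clip the limits to the range 0 to 1.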
29. In summary: SD versus SE. Meaning and interpretation of CI. Shopping for the right sampling distribution.
30. THANK YOU