
Basics of statistics

• This rule has been discussed earlier. Emphasize that there is still 0.3% of the distribution falling outside the 3 standard deviation limits.

1. Basic Statistics. Fundamentals of Hypothesis Testing: One-Sample, Two-Sample Tests (Chap 9-1)
2. What is biostatistics? Statistics is the science and art of collecting, summarizing, and analyzing data that are subject to random variation. Biostatistics is the application of statistics and mathematical methods to the design and analysis of health, biomedical, and biological studies.
3. Different Tests of Significance
   1. One-sample z-test or t-test
      a. Compares one sample mean versus a population mean
   2. Two-sample t-test
      a. Compares one sample mean versus another sample mean
         a. Independent t-tests (equal samples)
         b. Dependent t-tests (dependent/paired samples)
   3. One-way analysis of variance (ANOVA)
      a. Compares several sample means
4. How to properly use Biostatistics
   • Develop an underlying question of interest
   • Generate a hypothesis
   • Design a study (protocol)
   • Collect data
   • Analyze data: descriptive statistics, statistical inference
5. Relationship between population and sample (simple random sampling)
6. Sampling Techniques (from the population)
   • Simple random sample: bias-free sample
   • Stratified random sample: bias-free sample
   • Systematic sampling: biased sample
   • Cluster sampling: bias-free sample
   • Convenience sampling: biased sample
7. 7. Example How are my 10 patients doing after I put them on an anti-hypertensive medications?  Describe the results of your 10 patients Chap 9-7
8. 8. Example What is the in hospital mortality rate after open heart surgery at SAL hospital so far this year  Describe the mortality What is the in hospital mortality after open heart surgery likely to be this year, given results from last year  Estimate probability of death for patients like those seen in the previous year. Chap 9-8
9. 9. Misuse of statistics About 25% of biological research is flawed because of incorrect conclusions drawn from confounded experimental designs and misuse of statistical methods Chap 9-9
10. What is a Hypothesis? A hypothesis is a claim (assumption) about a population parameter, for example: "I claim that mean CVD in India is at least 3!" Hypothesis testing assesses whether the difference between the value of a sample statistic and the corresponding hypothesized parameter value is larger than chance alone would explain. (Clip art © 1984-1994 T/Maker Co.)
11. Hypothesis Testing Process
   • Identify the population and assume the population mean age is 50 (H0: µ = 50).
   • Take a sample (X̄ = 20).
   • Is X̄ = 20 likely if µ = 50? No, not likely!
   • Reject the null hypothesis.
12. Reason for Rejecting H0 (sampling distribution of X̄): It is unlikely that we would get a sample mean of this value (X̄ = 20) if in fact µ = 50 were the population mean. Therefore, we reject the null hypothesis that µ = 50.
13. Components of Biostatistics
   • Descriptive
   • Statistical inference
      • Estimation: confidence intervals
      • Hypothesis testing: P-values
14. Normal Distribution: A variable is said to be normally distributed, or to have a normal distribution, if its distribution has the shape of a normal curve.
15. Normal distribution
   • Bell-shaped
   • Symmetrical about the mean (no skewness)
   • Total area under the curve = 1
   • Approximately 68% of the distribution is within one standard deviation of the mean
   • Approximately 95% of the distribution is within two standard deviations of the mean
   • Approximately 99.7% of the distribution is within three standard deviations of the mean
   • Mean = Median = Mode
16. Empirical Rule
   • About 68% of the area lies within 1 standard deviation of the mean
   • About 95% of the area lies within 2 standard deviations of the mean
   • About 99.7% of the area lies within 3 standard deviations of the mean
   (Figure: normal curve with the horizontal axis marked at −3σ, −2σ, −σ, µ, +σ, +2σ, +3σ.)
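The percentages in the empirical rule can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `area_within` is made up for this illustration and does not come from the slides.

```python
from statistics import NormalDist

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k standard deviations."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Print the area within 1, 2, and 3 standard deviations, and the area left outside.
for k in (1, 2, 3):
    inside = area_within(k)
    print(f"within {k} SD: {inside:.4f}   outside: {1 - inside:.4f}")
```

The area outside three standard deviations comes to roughly 0.3%, which matches the remainder implied by the 99.7% figure.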
17. (No text recovered for this slide.)
18. Level of Significance, α
   • Is designated by α (level of significance)
   • Typical values are .01, .05, .10
   • Is selected by the researcher at the beginning
   • Provides the critical value(s) of the test
19. The z-Test for Comparing Population Means: critical values for the standard normal distribution
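20. Level of Significance and the Rejection Region (running example: "I claim that mean CVD in India is at least 3!")
   • H0: µ ≥ 3, H1: µ < 3: rejection region of area α in the left tail
   • H0: µ ≤ 3, H1: µ > 3: rejection region of area α in the right tail
   • H0: µ = 3, H1: µ ≠ 3: rejection regions of area α/2 in each tail
   The critical value(s) mark the boundary of each rejection region.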
21. Hypothesis Testing
   1. State the research question.
   2. State the statistical hypothesis.
   3. Set the decision rule.
   4. Calculate the test statistic.
   5. Decide if the result is significant.
   6. Interpret the result as it relates to your research question.
22. Rejection & Nonrejection Regions
   Test type          Sign in Ha   Rejection region
   Two-tailed test    ≠            Both sides
   Left-tailed test   <            Left side
   Right-tailed test  >            Right side
23. The Null Hypothesis, H0
   • States the (numerical) assumption to be tested, e.g.: the average number of CVD in India is at least three (H0: µ ≥ 3)
   • Is always about a population parameter (H0: µ ≥ 3), not about a sample statistic (H0: X̄ ≥ 3)
24. The Null Hypothesis, H0 (continued)
   • Testing begins with the assumption that the null hypothesis is true
   • Similar to the notion of innocent until proven guilty
25. The Alternative Hypothesis, H1
   • Is the opposite of the null hypothesis, e.g.: the average number of CVD in India is less than 3 (H1: µ < 3)
   • Never contains the "=" sign
   • May or may not be accepted
26. General Steps in Hypothesis Testing, e.g.: Test the assumption that the true mean number of CVD in India is at least three (σ known)
   1. State the H0: H0: µ ≥ 3
   2. State the H1: H1: µ < 3
   3. Choose α: α = .05
   4. Choose n: n = 100
   5. Choose the test: Z test
27. General Steps in Hypothesis Testing (continued)
   6. Set up critical value(s): reject H0 if Z < −1.645 (left tail of area α)
   7. Collect data: 100 persons surveyed
   8. Compute the test statistic and p-value: computed test statistic = −2, p-value = .0228
   9. Make the statistical decision: reject the null hypothesis
   10. Express the conclusion: the true mean number of CVD is less than 3 in the human population
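Steps 6 to 9 above can be reproduced in a few lines. This is a sketch only, using the standard-library `statistics.NormalDist` and the values stated on the slide (α = .05, test statistic = −2); the variable names are chosen here for illustration.

```python
from statistics import NormalDist

alpha = 0.05
z_stat = -2.0  # computed test statistic reported on the slide

std_normal = NormalDist()
critical_value = std_normal.inv_cdf(alpha)  # left-tail cutoff, about -1.645
p_value = std_normal.cdf(z_stat)            # P(Z <= -2), about 0.0228

# Left-tailed test: reject H0 when the statistic falls below the critical value.
reject_h0 = z_stat < critical_value
print(round(critical_value, 3), round(p_value, 4), reject_h0)
```

Both routes agree here: the statistic lies below the critical value, and the p-value is smaller than α.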
28. The z-Test for Comparing Population Means: critical values for the standard normal distribution
29. p-Value Approach to Testing
   • Convert the sample statistic (e.g. X̄) to a test statistic (e.g. Z, t, or F statistic)
   • Obtain the p-value from a table or computer
   • Compare the p-value with α
      • If p-value ≥ α, do not reject H0
      • If p-value < α, reject H0
30. Comparison of Critical-Value & P-Value Approaches
   Step 1 (both): State the null and alternative hypotheses.
   Step 2 (both): Decide on the significance level, α.
   Step 3 (both): Compute the value of the test statistic.
   Step 4: Critical-value approach: determine the critical value(s). P-value approach: determine the P-value.
   Step 5: Critical-value approach: if the value of the test statistic falls in the rejection region, reject H0; otherwise, do not reject H0. P-value approach: if P < α, reject H0; otherwise, do not reject H0.
   Step 6 (both): Interpret the result of the hypothesis test.
31. Result Probabilities (H0: Innocent)
   Jury trial:
     Verdict            Truth: Innocent      Truth: Guilty
     Innocent           Correct              Error
     Guilty             Error                Correct
   Hypothesis test:
     Decision           H0 True              H0 False
     Do not reject H0   1 − α                Type II error (β)
     Reject H0          Type I error (α)     Power (1 − β)
32. Type I & II Errors Have an Inverse Relationship: If you reduce the probability of one error (α or β), the probability of the other increases, provided everything else is unchanged.
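The trade-off between α and β can be illustrated numerically. The sketch below assumes a right-tailed z test with known σ; the numbers (µ0 = 368, an assumed true mean of 372.5, σ = 15, n = 25) and the function name `type_ii_error` are hypothetical choices for illustration, not results from the slides.

```python
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """beta for a right-tailed z test of H0: mu <= mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / n ** 0.5                    # standard error of the sample mean
    z_crit = std_normal.inv_cdf(1 - alpha)   # right-tail critical value
    # H0 is not rejected when Z < z_crit; under mu_true, Z is shifted by (mu_true - mu0) / se.
    return std_normal.cdf(z_crit - (mu_true - mu0) / se)

# Hypothetical scenario: same sample size and true mean, three choices of alpha.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=368, mu_true=372.5, sigma=15, n=25)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

With everything else held fixed, each smaller α yields a larger β (and lower power), which is the inverse relationship the slide describes.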
33. Critical Values Approach to Testing
   • Convert the sample statistic (e.g. X̄) to a test statistic (e.g. Z, t, or F statistic)
   • Obtain critical value(s) for a specified α from a table or computer
      • If the test statistic falls in the critical region, reject H0
      • Otherwise, do not reject H0
34. One-tail Z Test for Mean (σ Known)
   • Assumptions
      • Population is normally distributed
      • If not normal, requires large samples
      • Null hypothesis has ≤ or ≥ sign only
   • Z test statistic: $Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
35. Rejection Region
   • H0: µ ≥ µ0, H1: µ < µ0: reject H0 in the left tail (area α). Z must be significantly below 0 to reject H0.
   • H0: µ ≤ µ0, H1: µ > µ0: reject H0 in the right tail (area α). Small values of Z don't contradict H0. Don't reject H0!
36. Example: One Tail Test
   Q. Does an average box of cereal contain more than 368 grams of cereal? A random sample of 25 boxes showed X̄ = 372.5. The company has specified σ to be 15 grams. Test at the α = 0.05 level.
   H0: µ ≤ 368
   H1: µ > 368
37. Finding Critical Value: One Tail
   What is Z given α = 0.05? (σ_Z = 1; the area to the left of the critical value is .95 and the right tail is α = .05.)
   Standardized Cumulative Normal Distribution Table (portion):
     Z      .04     .05     .06
     1.6    .9495   .9505   .9515
     1.7    .9591   .9599   .9608
     1.8    .9671   .9678   .9686
     1.9    .9738   .9744   .9750
   Critical value = 1.645
38. Example Solution: One Tail Test
   H0: µ ≤ 368
   H1: µ > 368
   α = 0.05
   n = 25
   Critical value: 1.645
   Test statistic: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = 1.50$
   Decision: Do not reject at α = .05, because 1.50 < 1.645.
   Conclusion: No evidence that the true mean is more than 368.
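The arithmetic of this one-tail example can be sketched as follows, using only the values stated on the slides (X̄ = 372.5, µ0 = 368, σ = 15, n = 25, α = 0.05). The variable names are chosen for this illustration.

```python
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 372.5, 368, 15, 25, 0.05

z_stat = (x_bar - mu0) / (sigma / n ** 0.5)   # (372.5 - 368) / (15 / 5) = 1.50
z_crit = NormalDist().inv_cdf(1 - alpha)      # right-tail cutoff, about 1.645

# Right-tailed test: reject H0 only if the statistic exceeds the critical value.
reject_h0 = z_stat > z_crit
print(f"Z = {z_stat:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Because 1.50 does not exceed the critical value, the sketch reaches the same decision as the slide: do not reject H0.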
39. p-Value Solution
   • The p-value is P(Z ≥ 1.50) = 0.0668.
   • Use the alternative hypothesis to find the direction of the rejection region.
   • From the Z table: look up the Z value of the sample statistic, 1.50, to obtain .9332. Then 1.0000 − .9332 = .0668.
40. p-Value Solution (continued)
   • (p-value = 0.0668) ≥ (α = 0.05): do not reject.
   • The test statistic 1.50 is in the do-not-reject region (below the critical value 1.645).
41. Example: Two-Tail Test
   Q. Does an average box of cereal contain 368 grams of cereal? A random sample of 25 boxes showed X̄ = 372.5. The company has specified σ to be 15 grams. Test at the α = 0.05 level.
   H0: µ = 368
   H1: µ ≠ 368
42. Example Solution: Two-Tail Test
   H0: µ = 368
   H1: µ ≠ 368
   α = 0.05
   n = 25
   Critical value: ±1.96 (area .025 in each tail)
   Test statistic: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{372.5 - 368}{15/\sqrt{25}} = 1.50$
   Decision: Do not reject at α = .05.
   Conclusion: No evidence that the true mean is not 368.
43. p-Value Solution
   • p-value = 2 × 0.0668 = 0.1336
   • (p-value = 0.1336) ≥ (α = 0.05): do not reject.
   • The test statistic 1.50 is in the do-not-reject region (between −1.96 and 1.96).
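The one-tail and two-tail p-values for the cereal example differ only in how the tail area is counted. A minimal sketch, with a helper `z_test_p_value` that is hypothetical and introduced only for this illustration:

```python
from statistics import NormalDist

def z_test_p_value(z_stat: float, tail: str) -> float:
    """p-value for a z statistic; tail is 'right', 'left', or 'two'."""
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z_stat)
    if tail == "left":
        return std_normal.cdf(z_stat)
    if tail == "two":
        # Count both tails beyond |z|.
        return 2 * (1 - std_normal.cdf(abs(z_stat)))
    raise ValueError("tail must be 'right', 'left', or 'two'")

alpha = 0.05
p_one = z_test_p_value(1.50, "right")  # one-tail example, about 0.0668
p_two = z_test_p_value(1.50, "two")    # two-tail example, about 0.1336
print(round(p_one, 4), round(p_two, 4), p_one < alpha, p_two < alpha)
```

Neither p-value falls below α = 0.05, so H0 is not rejected in either version of the test.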
44. Connection to Confidence Intervals: For X̄ = 372.5, σ = 15, and n = 25, the 95% confidence interval is
   $372.5 - (1.96)\,15/\sqrt{25} \le \mu \le 372.5 + (1.96)\,15/\sqrt{25}$, or $366.62 \le \mu \le 378.38$.
   If this interval contains the hypothesized mean (368), we do not reject the null hypothesis. It does. Do not reject.
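The interval above can be recomputed directly. This is a sketch assuming a known σ and a normal sampling distribution, as on the slide; the function name `z_confidence_interval` is made up for this illustration.

```python
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin

low, high = z_confidence_interval(372.5, 15, 25)
hypothesized_mean = 368
# The two-tail test at alpha = 0.05 does not reject H0 when the interval covers the hypothesized mean.
print(f"{low:.2f} <= mu <= {high:.2f}; contains 368: {low <= hypothesized_mean <= high}")
```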
45. What is a t Test?
   • Commonly used definition: comparing two means to see if they are significantly different from each other
   • Technical definition: any statistical test that uses the t family of distributions
46. Independent Samples t Test: Use this test when you want to compare the means of two independent samples on a given variable. (Diagram: Independent Mean #1 and Independent Mean #2, compared using a t test.)
   • "Independent" means that the members of one sample do not include, and are not matched with, members of the other sample.
   • Example: Compare the average height of 50 randomly selected men to that of 50 randomly selected (text truncated in source)
47. Dependent Samples t Test
   • Used to compare the means of a single sample or of two matched or paired samples
   • Example: If a group of students took a math test in March and that same group of students took the same math test two months later in May, we could compare their average scores on the two test dates using a dependent samples t test.
48. Comparing the Two t Tests
   Independent samples:
   • Tests the equality of the means from two independent groups (diagram: Person #1 in the treatment group, Person #2 in the control group)
   • Relies on the t distribution to produce the probabilities used to test statistical significance
   Dependent samples:
   • Tests the equality of the means between related groups or of two variables within the same group (diagram: Person #1 before treatment, Person #1 after treatment)
   • Relies on the t distribution to produce the probabilities used to test statistical significance
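To make the contrast concrete, here is a rough sketch of both test statistics in plain Python. The scores are hypothetical, the function names are invented for this illustration, and the independent-samples version uses the pooled-variance form, which is one common choice and assumes equal variances in the two groups; the slides do not specify which form they intend.

```python
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (mean(a) - mean(b)) / se, na + nb - 2

def paired_t(before, after):
    """t statistic and degrees of freedom for paired samples, from per-subject differences."""
    diffs = [y - x for x, y in zip(before, after)]
    se = stdev(diffs) / len(diffs) ** 0.5
    return mean(diffs) / se, len(diffs) - 1

# Hypothetical math scores for six students, for illustration only.
march = [62, 70, 55, 81, 66, 74]
may = [65, 74, 58, 83, 71, 77]

t_pair, df_pair = paired_t(march, may)    # correct analysis if the same students sat both tests
t_ind, df_ind = independent_t(may, march) # what you would get by wrongly ignoring the pairing
print(f"paired: t = {t_pair:.3f}, df = {df_pair}")
print(f"independent: t = {t_ind:.3f}, df = {df_ind}")
```

In this made-up data the paired statistic is much larger than the independent one, because pairing removes the student-to-student variability. The resulting t values would still need to be compared against a t table at the matching degrees of freedom.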
49. Types
   • One sample: compare with population
   • Unpaired: compare with control
   • Paired: same subjects, pre-post
   • Z-test: large samples (> 30)
50. Compare Means (or medians)
   Example: Compare the blood pressures of two or more groups, or compare the BP of one group with a theoretical value.
   1 Group:
   1. One-sample t test
   2. Wilcoxon signed-rank test
   2 Groups:
   1. Unpaired t test
   2. Paired t test
   3. Mann-Whitney test
   4. Welch's corrected t test
   5. Wilcoxon matched pairs test
51. 3-26 Groups:
   1. One-way ANOVA
   2. Repeated measures ANOVA
   3. Kruskal-Wallis test
   4. Friedman test
   (All with post tests)
   Data entry formats: raw data; average data (mean, SD, & N); average data (mean, SEM, & N)
52. Is there a difference between you ... means, who is meaner?
53. Statistical Analysis: control group mean versus treatment group mean. Is there a difference? (Slide downloaded from the Internet)
54. What does difference mean? The mean difference is the same for all three cases: medium variability, high variability, low variability.
55. What does difference mean? Which one shows the greatest difference: medium variability, high variability, or low variability?
56. t Test: σ Unknown
   • Assumptions
      • Population is normally distributed
      • If not normal, requires a large sample
   • t test statistic with n − 1 degrees of freedom: $t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
57. Example: One-Tail t Test
   Does an average box of cereal contain more than 368 grams of cereal? A random sample of 36 boxes showed X̄ = 372.5 and s = 15. Test at the α = 0.01 level. σ is not given.
   H0: µ ≤ 368
   H1: µ > 368
58. Example Solution: One-Tail
   H0: µ ≤ 368
   H1: µ > 368
   α = 0.01
   n = 36, df = 35
   Critical value: 2.4377
   Test statistic: $t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{372.5 - 368}{15/\sqrt{36}} = 1.80$
   Decision: Do not reject at α = .01.
   Conclusion: No evidence that the true mean is more than 368.
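The t example can be checked the same way as the z examples. Python's standard library has no t distribution, so this sketch simply takes the critical value 2.4377 from the slide rather than computing it; the variable names are chosen for illustration.

```python
x_bar, mu0, s, n = 372.5, 368, 15, 36
t_crit = 2.4377  # critical value for alpha = 0.01, df = 35, as quoted on the slide

t_stat = (x_bar - mu0) / (s / n ** 0.5)  # (372.5 - 368) / (15 / 6) = 1.80
df = n - 1                               # degrees of freedom

# Right-tailed test: reject H0 only if the statistic exceeds the critical value.
reject_h0 = t_stat > t_crit
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

Since 1.80 is below 2.4377, the decision matches the slide: do not reject H0.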
59. The t Table: Because the t distribution changes shape as n increases, there is a separate curve for each sample size (or degrees of freedom). However, there is not enough space in the table to list the probabilities corresponding to every possible t score, so the t table lists only commonly used critical regions (at popular alpha levels).
60. Z-distribution versus t-distribution
61. The z-Test for Comparing Population Means: critical values for the standard normal distribution
62. Summary: We can use the z distribution for testing hypotheses involving one or two independent samples. To use z:
   • The samples must be independent and normally distributed
   • The sample size must be greater than 30
   • Population parameters must be known
63. (No text recovered for this slide.)