# BASIC STATISTICS

1. BASIC BIOSTATISTICS. Diane Flynn, LTC, MC; Colin Greene, LTC, MC
2. Objectives
   - Overview of Biostatistical Terms and Concepts
   - Application of Statistical Tests
3. Why Use Statistics?
   - Descriptive Statistics
     - identify patterns
     - lead to hypothesis generation
   - Inferential Statistics
     - distinguish true differences from random variation
     - allow hypothesis testing
4. Why Use Statistics? [Figure: Cardiovascular Mortality in Males. SMR (y-axis, 0 to 1.2) by decade, '35-'44 through '75-'84, for Bangor and Roseto. Source: AJPH 1992.]
5. Types of Data
   - Numerical
     - Continuous
     - Discrete
   - Categorical
     - Ordinal
     - Nominal
6. Descriptive Statistics
   - Identifies patterns in the data
   - Identifies outliers
   - Guides choice of statistical test
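7. Percentage of Specimens Testing Positive for RSV

   | Region | Jul | Aug | Sep | Oct | Nov | Dec | Jan | Feb | Mar | Apr | May | Jun |
   |---|---|---|---|---|---|---|---|---|---|---|---|---|
   | South | 2 | 2 | 5 | 7 | 20 | 30 | 15 | 20 | 15 | 8 | 4 | 3 |
   | Northeast | 2 | 3 | 5 | 3 | 12 | 28 | 22 | 28 | 22 | 20 | 10 | 9 |
   | West | 2 | 2 | 3 | 3 | 5 | 8 | 25 | 27 | 25 | 22 | 15 | 12 |
   | Midwest | 2 | 2 | 3 | 2 | 4 | 12 | 12 | 12 | 10 | 19 | 15 | 8 |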
8. Descriptive Statistics. [Figure: Percentage of Specimens Testing Positive for RSV, 1998-99. Percentage (y-axis, 0 to 35) by month, Jul through Jul, with one series each for South, Northeast, West, and Midwest.]
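One pattern a descriptive look at the RSV percentages can surface is when each region peaks. The sketch below is only an illustration of that idea: the numbers are copied from the RSV slide, while the names `MONTHS`, `RSV_PERCENT_POSITIVE`, and `peak_month` are hypothetical and not part of the original presentation.

```python
MONTHS = ["Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
          "Jan", "Feb", "Mar", "Apr", "May", "Jun"]

# Percentage of specimens testing positive for RSV, copied from the RSV slide.
RSV_PERCENT_POSITIVE = {
    "South":     [2, 2, 5, 7, 20, 30, 15, 20, 15, 8, 4, 3],
    "Northeast": [2, 3, 5, 3, 12, 28, 22, 28, 22, 20, 10, 9],
    "West":      [2, 2, 3, 3, 5, 8, 25, 27, 25, 22, 15, 12],
    "Midwest":   [2, 2, 3, 2, 4, 12, 12, 12, 10, 19, 15, 8],
}


def peak_month(values):
    """Return (month, value) for the first month with the highest percentage."""
    peak = max(values)
    return MONTHS[values.index(peak)], peak


if __name__ == "__main__":
    for region, values in RSV_PERCENT_POSITIVE.items():
        month, peak = peak_month(values)
        print(f"{region}: peak of {peak}% in {month}")
```

When two months tie for the highest value, as in the Northeast row, this sketch reports only the earlier one.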
9. Describing the Data with Numbers: Measures of Central Tendency
   - MEAN: average
   - MEDIAN: middle value
   - MODE: most frequently observed value(s)
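The three measures of central tendency can be computed with Python's standard `statistics` module. This is a minimal sketch; the `readings` list is hypothetical data invented for illustration and does not come from the slides.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 120, 120, 124, 130, 135, 160]

mean = statistics.mean(readings)        # arithmetic average
median = statistics.median(readings)    # middle value of the sorted data
modes = statistics.multimode(readings)  # most frequent value(s), as a list

print(f"mean={mean:.1f}, median={median}, modes={modes}")
```

In this made-up sample the single high reading pulls the mean above the median, which is one reason to look at more than one measure.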
10. Distribution of Course Grades. [Figure: number of students (y-axis, 0 to 14) by grade (x-axis, A through F).]
11. Describing the Data with Numbers: Measures of Dispersion
    - RANGE
    - STANDARD DEVIATION
    - SKEWNESS
12. Measures of Dispersion
    - RANGE: the spread from the lowest to the highest value
    - STANDARD DEVIATION: how closely values cluster around the mean
    - SKEWNESS: the symmetry, or lack of symmetry, of the curve
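The dispersion measures listed here can be sketched in a few lines of Python. The data are hypothetical, and the helpers `value_range` and `skewness` are illustrative names. Skewness is computed as the moment coefficient `m3 / m2**1.5`, which is one common definition among several and is an assumption here, not something the slides specify.

```python
import statistics


def value_range(values):
    """Distance between the lowest and highest value."""
    return max(values) - min(values)


def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5, using population moments.

    Zero for symmetric data, positive for a longer right tail,
    negative for a longer left tail.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 120, 120, 124, 130, 135, 160]

print("range:", value_range(readings))
print("sample standard deviation:", round(statistics.stdev(readings), 2))
print("skewness:", round(skewness(readings), 2))
```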
14. Standard Deviation. [Figure: two distribution curves, Curve A and Curve B, labeled with their standard deviations σA and σB.]
16. Skewness. [Figure: Curve A and Curve B, with negative skew labeled and the positions of the mode, median, and mean marked.]
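17. The Normal Distribution
    - Mean = median = mode
    - Skew is zero
    - About 68% of values fall within 1 SD of the mean
    - About 95% of values fall within 2 SDs of the mean
    - [Figure: normal curve with the mean, median, and mode at the center and 1σ and 2σ marked.]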
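The 68% and 95% figures can be checked numerically with `statistics.NormalDist` from the Python standard library. The helper name `share_within` is hypothetical; this is a quick sanity check rather than part of the original slides.

```python
from statistics import NormalDist


def share_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal distribution lying within k SDs of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2):
    print(f"within {k} SD: {share_within(k):.4f}")
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to any normally distributed measurement.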
18. Inferential Statistics: used to determine the likelihood that a conclusion based on data from a sample is true.
19. Terms: p value: the probability that a difference at least as large as the one observed would occur by chance alone, that is, if there were truly no difference.
20. Hypertension Trial

    | Drug | Baseline mean SBP | F/u mean SBP |
    |---|---|---|
    | A | 150 | 130 |
    | B | 150 | 125 |
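Whether a 5-point gap between two drugs could be due to chance depends on the patient-level data, which the slide does not give. The sketch below illustrates the idea of a p value with an exact permutation test, a technique chosen here for illustration and not named in the slides. The per-patient falls in SBP are invented so that the group means match the 20 and 25 mmHg falls in the trial table; the function name `permutation_p_value` is likewise hypothetical.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and returns the share of splits whose absolute mean
    difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    count = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical per-patient falls in SBP (mmHg); only the means (20 and 25)
# correspond to the slide, the individual values are made up.
fall_drug_a = [14, 18, 20, 22, 26]
fall_drug_b = [19, 23, 25, 27, 31]

print("p value:", permutation_p_value(fall_drug_a, fall_drug_b))
```

With only five invented patients per group, the result says nothing about the real trial; it only shows how the same mean difference can be more or less convincing depending on sample size and spread.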
21. Terms: confidence interval: the range of values we can be reasonably certain includes the true value.
22. 30 Day % Mortality

    | Study | IC STK | Control | p | N |
    |---|---|---|---|---|
    | Khaja | 5.0 | 10.0 | 0.55 | 40 |
    | Anderson | 4.2 | 15.4 | 0.19 | 50 |
    | Kennedy | 3.7 | 11.2 | 0.02 | 250 |
23. 95% Confidence Intervals. [Figure: intervals for Khaja (n=40), Anderson (n=50), and Kennedy (n=250) plotted on an axis from -0.40 to 0.20.]
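To see why the smaller trials give wider intervals, the following sketch computes an approximate 95% confidence interval for the difference in 30-day mortality in each study. It rests on two assumptions that the slides do not state: that N is the total enrollment split evenly between the two arms, and that a simple normal-approximation (Wald) interval is adequate. The function name `risk_difference_ci` is hypothetical, and the resulting numbers are not claimed to match the original figure.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(p_treated, p_control, n_treated, n_control, level=0.95):
    """Wald (normal-approximation) interval for p_treated - p_control."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_treated - p_control
    se = sqrt(p_treated * (1 - p_treated) / n_treated
              + p_control * (1 - p_control) / n_control)
    return diff - z * se, diff + z * se


# (IC STK % mortality, control % mortality, N) from the 30-day mortality slide.
TRIALS = {
    "Khaja": (5.0, 10.0, 40),
    "Anderson": (4.2, 15.4, 50),
    "Kennedy": (3.7, 11.2, 250),
}

# ASSUMPTION: N is the total sample, divided equally between the two arms.
INTERVALS = {
    name: risk_difference_ci(treated / 100, control / 100, n // 2, n // 2)
    for name, (treated, control, n) in TRIALS.items()
}

for name, (low, high) in INTERVALS.items():
    print(f"{name}: {low:+.3f} to {high:+.3f}")
```

Under these assumptions, only the Kennedy interval excludes zero, which agrees with it being the only study in the table with p below 0.05.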
24. Types of Errors

    | Conclusion (rows) / Truth (columns) | No difference | Difference |
    |---|---|---|
    | No difference | Correct | TYPE II ERROR (β) |
    | Difference | TYPE I ERROR (α) | Correct |

    Power = 1 - β
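The relationship between α, β, and power can be made concrete for one simple case. The sketch below assumes a two-sided z test with a known standard error, which is a simplification chosen for illustration and not a test the slides name; `z_test_power` is a hypothetical helper.

```python
from statistics import NormalDist


def z_test_power(true_difference, standard_error, alpha=0.05):
    """Power (1 - beta) of a two-sided z test at significance level alpha.

    Power is the probability of concluding "difference" when the true
    difference equals true_difference.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_difference / standard_error
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


# With no true difference, the chance of concluding "difference" is alpha,
# the Type I error rate.
print("no true difference:", round(z_test_power(0.0, 1.0), 3))

# As the true difference grows relative to the standard error, power rises
# and beta (the Type II error rate) falls.
for diff in (1.0, 2.0, 2.8):
    power = z_test_power(diff, 1.0)
    print(f"difference={diff}: power={power:.2f}, beta={1 - power:.2f}")
```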
25. What Test Do I Use?
    1. What type of data?
    2. How many samples?
    3. Are the data normally distributed?
    4. What is the sample size?