Basic stats


Published in: Technology, Business

  1. Basic Statistics: A Brief Introduction. Allison Titcomb, Ph.D., ICYF, SFCR, U of A
  2. Types of Data
      • Stevens' Levels (Scales) of Measurement:
          • Nominal (Categories)
              • Numbers indicate difference in kind
              • e.g., ethnicity, gender, ID numbers
          • Ordinal (Ordered)
              • Numbers represent rank orderings; distances are not equal (e.g., grades, rank orderings on a survey)
  3. Stevens' Levels cont.
      • Interval
          • Equal intervals, “arbitrary” zero
          • Ratios have no meaning
          • e.g., temperature in degrees F
      • Ratio
          • Equal intervals, absolute zero
          • Equal ratios are equivalent
          • e.g., weight, height
  4. Other Types of Data
      • Qualitative (nominal and ordinal) vs. Quantitative (interval and ratio)
      • Discrete (a countable set of distinct values) vs. Continuous (can potentially take on any numerical value)
      • Dichotomous (only 2 values)
  5. What kind of data are these?
      • Number of crimes in a county
      • Religious preference
      • Pass/Fail on a test
      • Income
      • Other examples?
  6. Data Reduction
      • Descriptive Statistics, a.k.a. Summary Statistics
          • numbers that represent some characteristics of the set of scores
          • unorganized → organized
          • graph, shape
  7. More data reduction
      • Frequency Distributions
          • bar diagram/histogram
              • discrete vs. continuous data
              • nominal level
          • (ordinal data: why don’t you graph it?)
  8. More data reduction
      • Frequency distributions
          • interval/ratio
          • shapes include skewed, bimodal, J-shaped…
      • (See samples on board/overhead)
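As a rough illustration of a frequency distribution, the sketch below tallies a small set of scores and prints a crude text histogram. The scores are made up for illustration and the helper name `frequency_table` is hypothetical, not something from the slides.

```python
from collections import Counter

# Hypothetical quiz scores (invented for this example).
scores = [3, 5, 4, 5, 2, 5, 4, 3, 5, 4]


def frequency_table(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


table = frequency_table(scores)
for value, count in table:
    # One '#' per observation gives a simple bar for each score.
    print(f"{value}: {'#' * count} ({count})")
```

The bars grow toward the high scores, so this made-up distribution piles up at the top of the scale with a tail toward the low end.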
  9. More data reduction
      • Measures of Central Tendency
          • describing and typifying
          • used for comparison
          • Mean (typical/average score, sensitive to extreme scores)
          • Median (middlemost score)
          • Mode (most “common” score)
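The three measures of central tendency can be compared on a small data set. This is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical and include one extreme value to show how it pulls the mean but not the median.

```python
import statistics

# Hypothetical scores; 40 is a deliberately extreme value.
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)      # sum of scores divided by their count
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequently occurring score

print(f"mean={mean}, median={median}, mode={mode}")
```

Here the single extreme score moves the mean well above most of the data, while the median and mode stay near the bulk of the scores.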
  10. More data reduction
      • Measures of Variability
          • dispersion/degree of heterogeneity
          • Range
          • Variance (degree of variability of individual scores)
          • Standard Deviation (square root of variance; typical “distance” between individual scores and the mean of the sample)
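The measures of variability listed above can be sketched as follows. The data are hypothetical. The example uses the sample formulas (dividing by n - 1), which is an assumption on my part, since the slide does not say which version is intended; the population variance is shown for contrast.

```python
import statistics

# Hypothetical scores (invented for this example).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)            # highest minus lowest score
sample_variance = statistics.variance(scores)      # divides by n - 1
sample_sd = statistics.stdev(scores)               # square root of the sample variance
population_variance = statistics.pvariance(scores) # divides by n

print(f"range={value_range}")
print(f"sample variance={sample_variance:.3f}, sample SD={sample_sd:.3f}")
print(f"population variance={population_variance}")
```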
  11. More data reduction
      • Things that contribute to variability
          • natural variability (true variance, tough to measure)
          • sampling error
          • measurement error
          • systematic variance
          • MAX MIN CON
  12. More data reduction
      • Normal Curve
          • With large numbers, many things are “normally distributed”
          • majority of individuals measured are clustered close to the mean
          • symmetric; mean, median, mode at same point; range is approx. 6 standard deviations
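The "range is approx. 6 standard deviations" rule of thumb follows from how much of a normal distribution lies within 3 standard deviations on either side of the mean. A minimal sketch using `statistics.NormalDist` (Python 3.8+); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within ±{k} SD: {share_within(k):.4f}")
```

Roughly 68% of values fall within 1 standard deviation, 95% within 2, and 99.7% within 3, which is why nearly all observations span about 6 standard deviations in total.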
  13. More data reduction
      • Measures of relationship
          • Pearson’s Product Moment Correlation, more fondly known simply as “r”
          • Correlation coefficient
          • 2 sets of scores; question is the relationship between the 2. Is there a relationship?
          • Allows us to predict; reliability
  14. Correlation
      • Describing the relationship
          • Direction
              • positive (high w/high, low w/low)
              • negative (low w/high, high w/low)
          • Magnitude
              • +1.0 vs. -1.0
              • low correlation, no correlation
          • Draw a picture, a.k.a. scatterplot
          • Assumption is that it is Linear
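Pearson's r can be computed directly from its definition: the sum of the products of paired deviations, divided by the square root of the product of the two sums of squared deviations. The sketch below is one straightforward way to write it; the data and variable names are hypothetical, and the function assumes neither variable is constant (otherwise the denominator is zero).

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired scores.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # high goes with high: perfect positive line
falling = [10, 8, 6, 4, 2]   # high goes with low: perfect negative line

r_positive = pearson_r(hours, rising)
r_negative = pearson_r(hours, falling)
print(r_positive, r_negative)
```

Because r measures only linear association, a strong curved relationship can still produce an r near zero, which is why drawing the scatterplot first is worthwhile.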
  15. Inferential Statistics
      • Statistics is never having to say you’re certain; judgment/leap/inference; generalization
      • population parameters and sample statistics
      • based on probability (relative frequency of occurrence of an event in the “long run”)
  16. Inferential Statistics
      • Errors in Statistical Reasoning
          • Null hypothesis: the “no difference” hypothesis
      • Types of Errors (See Handout)
          • Type I
              • rejecting the null when it’s true
              • crying wolf/false alarm/trigger happy
              • in law, we don’t want to convict the innocent
              • “controlled” by alpha level (e.g., 0.05)
  17. Inferential Statistics
      • Type II
          • NOT rejecting the null when it’s false
          • “nice puppy” as the wolf bites your fingers
          • In medicine, we’d rather treat someone who isn’t sick than NOT treat someone who is (HMOs might change that)
          • Beta, effect size, power of a test, alpha level
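Type I and Type II error rates can be made concrete with a small simulation. The sketch below is an illustration under stated assumptions, not a procedure from the slides: it uses a one-sample z test with a known standard deviation of 1 (chosen only because it needs nothing beyond the standard library), a sample size of 25, and a hypothetical true effect of 0.5 for the "null is false" case. All function names are invented for this example.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value for H0: mean == mu0, with known population SD sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, alpha=0.05, n=25, trials=2000, seed=0):
    """Fraction of simulated samples in which H0: mean == 0 is rejected."""
    rng = random.Random(seed)  # local generator so results are repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / trials


# Null is true: every rejection is a false alarm (Type I error).
type_i_rate = rejection_rate(true_mean=0.0)

# Null is false: rejections are correct, so this rate is the power.
power = rejection_rate(true_mean=0.5)
type_ii_rate = 1 - power  # misses (Type II errors)

print(f"Type I rate ~ {type_i_rate:.3f}")
print(f"power ~ {power:.3f}, Type II rate ~ {type_ii_rate:.3f}")
```

The Type I rate should land close to the chosen alpha, while the Type II rate depends on the effect size and sample size as well as alpha.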
  18. Inferential Statistics
      • Major types of statistical tests
      • Don’t forget: What’s the question?
          • “t test” (or “t-test statistic”): two means
              • “t for two”; Gosset at Guinness, writing as “Student” (hence Student’s t)
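For comparing two means, the t statistic is the difference between the group means divided by its standard error. The sketch below uses the pooled-variance (equal variances assumed) form for two independent groups, which is one common variant and an assumption here, since the slide does not specify. The data and the name `pooled_t` are hypothetical.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical scores for two independent groups.
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]

t_stat, df = pooled_t(treatment, control)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

The resulting t is then compared against the t distribution with the given degrees of freedom, using a table or a statistics package, to obtain a p-value.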
  19. Inferential Statistics
      • F test
          • for more than two means
          • a t test is a baby F test
          • between/within; Fisher was an agricultural researcher (ever heard of a split plot design?)
          • interactions
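The between/within idea can be sketched as a one-way F ratio: variability between group means divided by variability within groups. The example also illustrates the "baby F test" remark, since with exactly two groups F equals the square of the pooled two-sample t statistic. The data and the name `one_way_f` are hypothetical, and this covers only the simple one-way case, not interactions.

```python
def one_way_f(*groups):
    """F ratio for a one-way design: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: how far each score sits from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical scores for three groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]
group_c = [3, 4, 5, 6, 7]

f_two_groups = one_way_f(group_a, group_b)
f_three_groups = one_way_f(group_a, group_b, group_c)

# For group_a vs. group_b the pooled t statistic is 4 (mean difference 4,
# standard error 1), and F for the same two groups is t squared.
print(f_two_groups, f_three_groups)
```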
  20. Inferential Statistics
      • Chi square
          • nominal data (e.g., Democrats and Republicans; males and females)
      • Correlation
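For nominal data arranged in a table of counts, the chi-square statistic sums (observed - expected)² / expected over all cells, where each expected count comes from the row and column totals. A minimal sketch with invented counts; the function name `chi_square` is hypothetical.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if row and column categories were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical 2 x 2 table of counts: rows are two groups,
# columns are two answer categories.
table = [
    [30, 20],
    [20, 30],
]

chi2, df = chi_square(table)
print(f"chi-square = {chi2:.2f} with {df} degree(s) of freedom")
```

With 1 degree of freedom, the critical value at alpha = 0.05 is about 3.84, so this made-up table would just reach significance at that level.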
  21. Inferential Statistics
      • Statistical vs. Practical Significance
          • Cost/Benefit question: Ask “Compared to What?”
          • Statistically significant other?
          • Significant findings do NOT eliminate the need for replication
  22. Inferential Statistics
      • p = 0.0001 vs. p = 0.01: a smaller p-value is NOT a larger effect size
      • Non-significant findings often do not get published: bias in the literature?
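One way to see why a p-value is not an effect size: hold the effect fixed and change only the sample size. The sketch below assumes a one-sample z test with known standard deviation, where the test statistic is the standardized effect d times the square root of n. That test, the effect of 0.2, and the sample sizes are all illustrative choices, not taken from the slides.

```python
import math
from statistics import NormalDist


def two_sided_p(effect_size_d, n):
    """Two-sided p-value of a one-sample z test for a standardized effect d."""
    z = effect_size_d * math.sqrt(n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same hypothetical effect size, two different sample sizes.
p_small_n = two_sided_p(0.2, 100)
p_large_n = two_sided_p(0.2, 400)

print(f"d = 0.2, n = 100: p = {p_small_n:.4f}")
print(f"d = 0.2, n = 400: p = {p_large_n:.6f}")
```

The effect is identical in both cases; only the amount of data differs, yet the p-values differ by orders of magnitude.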