Chapter34

Research Methods in Education 6th Edition

Published in: Technology

  1. 1. APPROACHES TO QUANTITATIVE DATA ANALYSIS © LOUIS COHEN, LAWRENCE MANION & KEITH MORRISON
  2. 2. STRUCTURE OF THE CHAPTER • Scales of data • Parametric and non-parametric data • Descriptive and inferential statistics • Kinds of variables • Hypotheses • One-tailed and two-tailed tests • Distributions • Statistical significance • Hypothesis testing • Effect size • A note on symbols
  3. 3. FOUR SCALES OF DATA NOMINAL ORDINAL INTERVAL RATIO It is incorrect to apply statistics which can only be used at a higher scale of data to data at a lower scale.
  4. 4. PARAMETRIC AND NON-PARAMETRIC STATISTICS • Parametric statistics: where characteristics of, or factors in, the population are known; • Non-parametric statistics: where the characteristics of, or factors in, the population are unknown.
  5. 5. DESCRIPTIVE AND INFERENTIAL STATISTICS • Descriptive statistics: to summarize features of the sample or simple responses of the sample (e.g. frequencies or correlations). • No attempt is made to infer or predict population parameters. • Inferential statistics: to infer or predict population parameters or outcomes from simple measures, e.g. from sampling and from statistical techniques. • Based on probability.
  6. 6. DESCRIPTIVE STATISTICS • The mode (the score obtained by the greatest number of people); • The mean (the average score); • The median (the score obtained by the middle person in a ranked group of people, i.e. it has an equal number of scores above it and below it); • Minimum and maximum scores; • The range (the distance between the highest and the lowest scores); • The variance (a measure of how far scores are from the mean: the average of the squared deviations of individual scores from the mean);
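The measures listed on this slide can be computed with Python's standard `statistics` module. This is a minimal sketch: the `scores` list is hypothetical data invented for illustration, and the population variance is used because the slide defines variance as the average of the squared deviations.

```python
import statistics

# Hypothetical test scores, invented for illustration only.
scores = [1, 2, 2, 3, 4, 6, 10]

mode = statistics.mode(scores)            # score obtained by most people
mean = statistics.mean(scores)            # average score
median = statistics.median(scores)        # middle score of the ranked list
minimum, maximum = min(scores), max(scores)
score_range = maximum - minimum           # distance from lowest to highest
variance = statistics.pvariance(scores)   # average squared deviation from the mean

print(mode, mean, median, minimum, maximum, score_range, variance)
```

Statistical packages often report the sample variance instead, which divides by N - 1 rather than N (`statistics.variance` in Python).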
  7. 7. SIMPLE STATISTICS • Frequencies (raw scores and percentages) – Look for skewness, intensity, distributions and spread (kurtosis); • Mode – For nominal and ordinal data • Mean – For interval and ratio data • Standard deviation – For interval and ratio data
  8. 8. [Figure: dot plot of the scores 1, 2, 3, 4, 20 on a scale of 1 to 20, with the mean marked] Mean = 6 High standard deviation
  9. 9. [Figure: dot plot of the scores 1, 2, 6, 10, 11 on a scale of 1 to 20, with the mean marked] Mean = 6 Moderately high standard deviation
  10. 10. [Figure: dot plot of the scores 5, 6, 6, 6, 7 on a scale of 1 to 20, with the mean marked] Mean = 6 Low standard deviation
  11. 11. STANDARD DEVIATION • The standard deviation is a standardised measure of the dispersal of the scores, i.e. how far away from the mean/average each score is. It is calculated, in its most simplified form, as: $S.D. = \sqrt{\frac{\sum d^2}{N - 1}}$ or $S.D. = \sqrt{\frac{\sum d^2}{N}}$ • $d^2$ = the deviation of the score from the mean (average), squared • $\sum$ = the sum of • $N$ = the number of cases • A low standard deviation indicates that the scores cluster together, whilst a high standard deviation indicates that the scores are widely dispersed.
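As a worked illustration, the sketch below applies both forms of the formula to the three sets of five scores shown in slides 8 to 10 (1, 2, 3, 4, 20; 1, 2, 6, 10, 11; and 5, 6, 6, 6, 7), each of which has a mean of 6. The function name and the `sample` flag are illustrative choices, not part of the slides.

```python
import math

def standard_deviation(scores, sample=True):
    """Square root of the summed squared deviations divided by N - 1 (or by N)."""
    n = len(scores)
    mean = sum(scores) / n
    sum_squared_deviations = sum((x - mean) ** 2 for x in scores)
    divisor = n - 1 if sample else n
    return math.sqrt(sum_squared_deviations / divisor)

# The three data sets from slides 8 to 10; each has a mean of 6.
sd_high = standard_deviation([1, 2, 3, 4, 20])
sd_moderate = standard_deviation([1, 2, 6, 10, 11])
sd_low = standard_deviation([5, 6, 6, 6, 7])

print(round(sd_high, 2), round(sd_moderate, 2), round(sd_low, 2))
```

The ordering of the three results matches the slide labels: the more widely the scores are spread around the same mean, the higher the standard deviation.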
  12. 12. DESCRIPTIVE STATISTICS • The standard deviation (a measure of the dispersal or range of scores: the square root of the variance); • The standard error (the standard deviation of sample means); • The skewness (how far the data are asymmetrical in relation to a ‘normal’ curve of distribution); • Kurtosis (how steep or flat is the shape of a graph or distribution of data; a measure of how peaked a distribution is and how steep is the slope or spread of data around the peak).
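The remaining measures can be sketched in a few lines. The formulas here are assumptions in the sense that the slide gives none: the standard error is estimated from a single sample as the sample standard deviation divided by the square root of N, and skewness and kurtosis use the simple moment-based definitions (kurtosis is reported as excess kurtosis, so a normal curve scores 0). Statistical packages often apply small-sample corrections that give slightly different values.

```python
import math

def central_moment(scores, order):
    """Average of the deviations from the mean raised to the given power."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** order for x in scores) / len(scores)

def standard_error(scores):
    """Estimated standard error of the mean: sample S.D. / sqrt(N)."""
    n = len(scores)
    sample_sd = math.sqrt(central_moment(scores, 2) * n / (n - 1))
    return sample_sd / math.sqrt(n)

def skewness(scores):
    """Moment-based skewness: 0 for symmetrical data, positive for a long right tail."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so that a normal distribution gives 0."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

skewed = [1, 2, 3, 4, 20]      # one extreme high score
symmetrical = [5, 6, 6, 6, 7]  # balanced around the mean

print(standard_error(skewed), skewness(skewed), skewness(symmetrical))
```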
  13. 13. INFERENTIAL STATISTICS • Can use descriptive statistics. • Correlations • Regression • Multiple regression • Difference testing • Factor analysis • Structural equation modelling
  14. 14. DEPENDENT AND INDEPENDENT VARIABLES • An independent variable is an antecedent variable, that which causes, in part or in total, a particular outcome; it is a stimulus that influences a response, a factor which may be modified (e.g. under experimental or other conditions) to affect an outcome. • A dependent variable is the outcome variable, that which is caused, in total or in part, by the input, antecedent variable. It is the effect, consequence of, or response to, an independent variable.
  15. 15. DEPENDENT AND INDEPENDENT VARIABLES • In using statistical tests which require independent and dependent variables, exercise caution in assuming which is or is not the dependent or independent variable, as the direction of causality may not be one-way or in the direction assumed.
  16. 16. FIVE KEY INITIAL QUESTIONS 1. What kind (scales) of data are there? 2. Are the data parametric or non-parametric? 3. Are descriptive or inferential statistics required? 4. Do dependent and independent variables need to be identified? 5. Are the relationships considered to be linear or non-linear?
  17. 17. CATEGORICAL, DISCRETE AND CONTINUOUS VARIABLES • A categorical variable is a variable which has categories of values, e.g. the variable ‘sex’ has two values: male and female. • A discrete variable has a finite number of values of the same item, with no intervals or fractions of the value, e.g. a person cannot have half an illness or half a mealtime. • A continuous variable can vary in quantity, e.g. money in the bank, monthly earnings. There are equal intervals, and, usually, a true zero, e.g. it is possible to have no money in the bank.
  18. 18. CATEGORICAL, DISCRETE AND CONTINUOUS VARIABLES • Categorical variables match categorical data. • Continuous variables match interval and ratio data.
  19. 19. KINDS OF ANALYSIS • Univariate analysis: looks for differences amongst cases within one variable. • Bivariate analysis: looks for a relationship between two variables. • Multivariate analysis: looks for a relationship between two or more variables.
  20. 20. HYPOTHESES • Null hypothesis (H0) • Alternative hypothesis (H1) • The null hypothesis is the stronger hypothesis, requiring rigorous evidence not to support it. • One should commence with the former and cast the research in the form of a null hypothesis, and only turn to the latter in the case of finding the null hypothesis not supported.
  21. 21. HYPOTHESES • Direction of hypothesis: states the kind of difference or relationship between two conditions or two groups of participants • One-tailed (directional), e.g.: ‘people who study in silent surroundings achieve better than those who study in noisy surroundings’. (‘Better’ indicates the direction.) • Two-tailed (no direction), e.g.: ‘there is a difference between people who study in silent surroundings and those who study in noisy surroundings’. (There is no indication of which is the better.)
  22. 22. ONE-TAILED AND TWO-TAILED TESTS • A one-tailed test makes assumptions about the population and the direction of the outcome, e.g. one group will score more highly than another on a test. • A two-tailed test makes no assumptions about the population and the direction of the outcome, e.g. there will be a difference in the test scores.
  23. 23. THE NORMAL CURVE OF DISTRIBUTION
  24. 24. THE NORMAL CURVE OF DISTRIBUTION • A smooth, perfectly symmetrical, bell-shaped curve. • It is symmetrical about the mean and its tails are assumed to meet the x-axis at infinity. • Statistical calculations often assume that the population is distributed normally and then compare the data collected from the sample to the population, allowing inferences to be made about the population.
  25. 25. THE NORMAL CURVE OF DISTRIBUTION Assumes that: – 68.3 per cent of people fall within 1 standard deviation of the mean; – 27.1 per cent are between 1 standard deviation and 2 standard deviations away from the mean; – 4.3 per cent are between 2 and 3 standard deviations away from the mean; – 0.3 per cent are more than 3 standard deviations away from the mean.
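These percentages can be checked against the normal distribution itself. The sketch below uses the standard identity that the share of a normal population lying within k standard deviations of the mean is erf(k / sqrt(2)); the function name is an illustrative choice. The computed shares agree with the slide's figures to within about 0.1 of a percentage point.

```python
import math

def share_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

within_1 = share_within(1)
between_1_and_2 = share_within(2) - share_within(1)
between_2_and_3 = share_within(3) - share_within(2)
beyond_3 = 1 - share_within(3)

for label, share in [
    ("within 1 S.D.", within_1),
    ("between 1 and 2 S.D.", between_1_and_2),
    ("between 2 and 3 S.D.", between_2_and_3),
    ("beyond 3 S.D.", beyond_3),
]:
    print(f"{label}: {share * 100:.1f} per cent")
```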
  26. 26. SKEWNESS The curve is not symmetrical or bell-shaped
  27. 27. KURTOSIS (STEEPNESS OF THE CURVE)
  28. 28. STATISTICAL SIGNIFICANCE • If the findings hold true 95% of the time then the statistical significance level (p) = 0.05 • If the findings hold true 99% of the time then the statistical significance level (p) = 0.01 • If the findings hold true 99.9% of the time then the statistical significance level (p) = 0.001
  29. 29. CORRELATION (Shoe size, Hat size): (1, 1), (2, 2), (3, 3), (4, 4), (5, 5) Perfect positive correlation: +1
  30. 30. CORRELATION (Hand size, Foot size): (1, 1), (2, 2), (3, 3), (4, 4), (5, 5) Perfect positive correlation: +1
  31. 31. CORRELATION (Hand size, Foot size): (1, 2), (2, 1), (3, 4), (4, 3), (5, 5) Positive correlation: <+1
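To make the coefficients concrete, the sketch below computes a Pearson product-moment correlation for the hand and foot sizes in slides 30 and 31. The slides do not name the coefficient, so the choice of Pearson's r is an assumption, and the reversed list used for the negative case is invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

hand = [1, 2, 3, 4, 5]
foot_perfect = [1, 2, 3, 4, 5]    # slide 30: every pair matches
foot_imperfect = [2, 1, 4, 3, 5]  # slide 31: the ranks are partly swapped
foot_reversed = [5, 4, 3, 2, 1]   # hypothetical data for a negative correlation

r_perfect = pearson_r(hand, foot_perfect)
r_imperfect = pearson_r(hand, foot_imperfect)
r_negative = pearson_r(hand, foot_reversed)

print(r_perfect, r_imperfect, r_negative)
```

The second data set gives a coefficient of 0.8: still positive, but below +1 because the pairs no longer line up exactly.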
  32. 32. [Figure: line graph] PERFECT POSITIVE CORRELATION
  33. 33. [Figure: line graph] PERFECT NEGATIVE CORRELATION
  34. 34. [Figure: line graph] MIXED CORRELATION
  35. 35. CORRELATIONS Statistical significance is a function of the coefficient and the sample size: – the smaller the sample, the larger the coefficient has to be in order to obtain statistical significance; – the larger the sample, the smaller the coefficient can be in order to obtain statistical significance; – statistical significance can be attained either by having a large coefficient together with a small sample or by having a small coefficient together with a large sample.
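One common way to see this trade-off, offered here as an assumption because the slide does not name a test, is to convert a Pearson coefficient r from a sample of N cases into a t statistic, t = r * sqrt((N - 2) / (1 - r^2)), which is then referred to a t distribution with N - 2 degrees of freedom. The sketch below computes only the t statistic; the coefficient of 0.3 and the sample sizes are hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic for testing a Pearson coefficient r from n cases against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# The same hypothetical coefficient observed in samples of different sizes.
for n in (10, 30, 100, 1000):
    print(n, round(t_statistic(0.3, n), 2))
```

The same coefficient produces a larger t statistic as the sample grows, which is why a small coefficient can reach statistical significance in a large sample while a small sample needs a large coefficient.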
  36. 36. CORRELATIONS • Begin with a null hypothesis (e.g. there is no relationship between the size of hands and the size of feet). The task is to look for evidence that does not support the null hypothesis, i.e. the burden of responsibility is to show that the null hypothesis is not supported. • If the hypothesis is not supported for 95 per cent or 99 per cent or 99.9 per cent of the population, then there is a statistically significant relationship between the size of hands and the size of feet at the 0.05, 0.01 and 0.001 levels of significance respectively. • These levels of significance – the 0.05, 0.01 and 0.001 levels – are the levels at which statistical significance is frequently taken to be demonstrated.
  37. 37. HYPOTHESIS TESTING • Commence with a null hypothesis • Set the level of significance (α) to be used to support or not to support the null hypothesis (the alpha (α) level); the alpha level is determined by the researcher. • Compute the data. • Determine whether the null hypothesis is supported or not supported. • Avoid Type I and Type II errors.
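The steps above can be sketched with an exact permutation test. This test is a swapped-in choice made for illustration, since the slide does not prescribe one; it needs only the standard library. The scores for the silent and noisy study groups are hypothetical, as is the alpha level of 0.05. The null hypothesis is that the group labels make no difference, so every way of splitting the pooled scores into two groups is equally likely.

```python
from itertools import combinations

def exact_permutation_p(group_a, group_b, one_tailed=True):
    """Share of all possible regroupings whose mean difference is at least as
    extreme as the observed difference (group_a mean minus group_b mean)."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = sum_a / n_a - (total - sum_a) / n_b
        if one_tailed:
            extreme += diff >= observed - 1e-12            # direction matters
        else:
            extreme += abs(diff) >= abs(observed) - 1e-12  # either direction
        splits += 1
    return extreme / splits

# Step 1: null hypothesis: study surroundings make no difference to scores.
# Step 2: set the alpha level (chosen by the researcher).
ALPHA = 0.05

# Step 3: compute the data (hypothetical scores).
silent = [72, 75, 78, 80, 85]
noisy = [60, 65, 70, 71, 74]
p_one_tailed = exact_permutation_p(silent, noisy, one_tailed=True)
p_two_tailed = exact_permutation_p(silent, noisy, one_tailed=False)

# Step 4: decide whether the null hypothesis is supported.
null_supported = p_two_tailed >= ALPHA
print(p_one_tailed, p_two_tailed, null_supported)
```

The one-tailed value counts only regroupings in which the first group does better, so it is never larger than the two-tailed value for the same data.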
  38. 38. TYPE I AND TYPE II ERRORS • Null Hypothesis: there is no statistically significant difference between x and y. • TYPE I ERROR – The researcher rejects the null hypothesis when it is in fact true (like convicting an innocent person) ∴ set a more stringent significance level (e.g. 0.01 rather than 0.05) • TYPE II ERROR – The researcher accepts the null hypothesis when it is in fact false (like finding a guilty person innocent) ∴ set a less stringent significance level, increase sample size.
  39. 39. EFFECT SIZE • Increasingly seen as preferable to statistical significance. • A way of quantifying the difference between two groups. It indicates how big the effect is, something that statistical significance does not. • For example, if one group has had an experimental treatment and the other has not (the control group), then the effect size is a measure of the effectiveness of the treatment.
  40. 40. EFFECT SIZE • It is calculated thus: $\text{Effect size} = \frac{\text{mean of experimental group} - \text{mean of control group}}{\text{standard deviation of the control group}}$ • Statistics for calculating effect size include $r^2$, adjusted $R^2$, $\eta^2$, $\omega^2$, Cramer’s V, Kendall’s W, Cohen’s d, Eta, Eta$^2$. • Different kinds of statistical treatments use different effect size calculations, e.g. $\text{Effect size (Eta}^2) = \frac{\text{Sum of squares between groups}}{\text{Total sum of squares}}$
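The Eta squared ratio can be computed directly from its two sums of squares. In this minimal sketch the function name and the two small groups of scores are hypothetical; the between-groups sum of squares weights each group's squared distance from the grand mean by the group's size.

```python
def eta_squared(groups):
    """Sum of squares between groups divided by the total sum of squares."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total

# Hypothetical scores for two groups.
eta_sq = eta_squared([[1, 2, 3], [4, 5, 6]])
print(round(eta_sq, 3))
```

The result is the proportion of the total variation in scores that is associated with group membership, so it runs from 0 (identical group means) to 1 (no variation within groups).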
  41. 41. EFFECT SIZE • In using Cohen’s d: 0-0.20 = weak effect; 0.21-0.50 = modest effect; 0.51-1.00 = moderate effect; >1.00 = strong effect
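The sketch below combines the effect size calculation from slide 40 (difference in group means divided by the standard deviation of the control group) with the interpretation bands above. The scores are hypothetical, and using the sample standard deviation is an assumption. Note also that many texts define Cohen's d with a pooled standard deviation of both groups rather than the control group alone, which gives a somewhat different value.

```python
import statistics

def effect_size(experimental, control):
    """Difference in means divided by the control group's standard deviation."""
    difference = statistics.mean(experimental) - statistics.mean(control)
    return difference / statistics.stdev(control)  # sample S.D. (an assumption)

def describe_effect(d):
    """Label an effect size using the bands given on the slide."""
    size = abs(d)
    if size <= 0.20:
        return "weak"
    if size <= 0.50:
        return "modest"
    if size <= 1.00:
        return "moderate"
    return "strong"

# Hypothetical post-test scores.
experimental_scores = [14, 15, 16, 17, 18]
control_scores = [12, 14, 15, 16, 18]

d = effect_size(experimental_scores, control_scores)
print(round(d, 2), describe_effect(d))
```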
  42. 42. THE POWER OF A TEST • An estimate of the ability of the test to separate the effect size from random variation.
