Making Sense of Statistical Significance
Chapter 7: Effect Size and Power

Effect Size
An effect can be statistically significant without having much practical significance.
Effect size is a measure of the difference between populations.
It tells us how much something changes after a specific intervention.
It indicates the extent to which two populations do not overlap, that is, how much the populations are separated due to the experimental procedure.
With a smaller effect size, the populations will overlap more.

Effect Size and Distribution Overlap
Figuring the Effect Size
Raw Score Effect Size: calculated by taking the difference between the Population 1 mean and the Population 2 mean.
Standardized Effect Size: calculated by dividing the raw score effect size for each study by each study’s population standard deviation.
This standardizes the difference between means in the same way a Z score gives us a way to compare two scores on different measures.

Effect Size Example
If Population 1 had a mean of 90, Population 2 had a mean of 50, and the population standard deviation was 20, the effect size would be:
(90 – 50) / 20 = 2
This indicates that the effect of the experimental manipulation (e.g., reading program) is to increase the scores (e.g., reading level) by 2 standard deviations.
Copyright © 2011 by Pearson Education, Inc. All rights reserved

Formula for Calculating the Effect Size
Effect Size = (Population 1 M – Population 2 M) / Population SD
Population 1 M = the mean for the population that receives the experimental manipulation
Population 2 M = the mean of the known population (the basis for the comparison distribution)
Population SD = the standard deviation of the population of individuals
A negative effect size would mean that the mean of Population 1 is lower than the mean of Population 2.

Example of Calculating the Effect Size
For the sample of 64 fifth graders, the best estimate of the Population 1 mean is the sample mean of 220. The mean of Population 2 is 200 and the population standard deviation is 48.
Effect Size = (Population 1 M – Population 2 M) / Population SD
Effect Size = (220 – 200) / 48
Effect Size = .42

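The calculation in the fifth-grader example can be sketched in a few lines of Python. This is only an illustration of the formula Effect Size = (Population 1 M – Population 2 M) / Population SD; the function name `effect_size` and its parameter names are assumptions made for this sketch, not something the slides define.

```python
def effect_size(pop1_mean: float, pop2_mean: float, pop_sd: float) -> float:
    """Standardized effect size: (Population 1 M - Population 2 M) / Population SD."""
    if pop_sd <= 0:
        raise ValueError("population SD must be positive")
    return (pop1_mean - pop2_mean) / pop_sd


# Fifth-grader example: sample mean 220, known population mean 200, SD 48.
d = effect_size(220, 200, 48)
print(round(d, 2))  # 0.42

# Earlier example: means of 90 and 50 with a population SD of 20.
print(effect_size(90, 50, 20))  # 2.0
```

Swapping the two means flips the sign, which matches the note that a negative effect size means Population 1 has the lower mean.
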
Effect Size Conventions
Standard rules about what to consider a small, medium, and large effect size, based on what is typical in behavioral and social science research.
Cohen’s effect size conventions for mean differences:

How Big?               Effect Size (Cohen’s d)
No practical effect    Less than .20
Small effect size      .20-.49
Medium effect size     .50-.79
Large effect size      .80 or greater

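As a small illustration, the conventions above can be written as a lookup. The function name `cohen_label` is hypothetical, and taking the absolute value of d is an assumption of this sketch (it treats the sign as direction only, so a negative effect is labeled by its size).

```python
def cohen_label(d: float) -> str:
    """Label an effect size using the cutoffs listed on the conventions slide."""
    size = abs(d)  # assumption: the sign only indicates direction
    if size < 0.20:
        return "no practical effect"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


for d in (0.10, 0.42, -0.60, 2.0):
    print(d, cohen_label(d))
```
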
A More General Importance of Effect Size
Knowing the effect size of a study lets you compare results with effect sizes found in other studies, even when the other studies have different population standard deviations.
Knowing what is a small or a large effect size helps you evaluate the overall importance of a result, that is, its PRACTICAL SIGNIFICANCE!
A result may be statistically significant without having a very large effect.
Meta-Analysis: a procedure that combines results from different studies, even results using different methods or measurements.
This is a quantitative rather than a qualitative review of the literature.
Effect sizes are a crucial part of this procedure.

Statistical Power: The Ability to Achieve Your Goals!
Statistical power is the probability that the study will produce a statistically significant result when the research hypothesis is really true.
When a study has only a small chance of being significant even if the research hypothesis is true, the study has low power.
When a study has a high chance of being significant when the research hypothesis is actually true, the study has high power.

Remember….
If the research hypothesis is false, we do not want to get significant results.
If we reject the null hypothesis when the research hypothesis is false, we commit a TYPE I ERROR.
But, even if the research hypothesis is true, we do not always get significant results.
When we FAIL to reject the null hypothesis when the research hypothesis is true, we commit a TYPE II ERROR.

 
What Determines the Power of a Study?
Effect Size and Power
The larger the mean difference between the populations, the greater the chance of getting a significant result in the study.
Since the difference between population means is the main component of effect size, the bigger the effect size, the greater the power.
Effect size is also determined by the standard deviation of the population.
The smaller the standard deviation, the bigger the effect size.
The smaller the standard deviation, the greater the power.

Sample Size
The more people there are in the study, the greater the power is.
The larger the sample size, the smaller the standard deviation of the distribution of means becomes.
The smaller the standard deviation of the distribution of means, the narrower the distribution of means, and the less overlap there is between the distributions, which leads to higher power.
Remember that although sample size and effect size both influence power, they are independent of each other.

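The link between sample size and power can be sketched numerically. The example below assumes a one-tailed Z test with a known population standard deviation and an effect in the predicted (positive) direction; the function name `power_one_tailed`, the .05 default significance level, and the effect size of 0.2 are all assumptions chosen for illustration. Because the standard deviation of the distribution of means is Population SD / sqrt(N), the predicted mean sits d * sqrt(N) standard deviations above the comparison mean.

```python
from math import sqrt
from statistics import NormalDist


def power_one_tailed(d: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a one-tailed Z test (known SD, effect in predicted direction)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # cutoff Z on the comparison distribution
    shift = d * sqrt(n)                     # predicted mean, in units of SD of the means
    return 1 - std_normal.cdf(z_crit - shift)


# Same effect size, increasing sample size: power rises with N.
for n in (16, 64, 256):
    print(n, round(power_one_tailed(0.2, n), 3))
```

Holding N fixed and raising d has the same kind of effect, which reflects the point that effect size and sample size each influence power separately.
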
Figuring Needed Sample Size for a Given Level of Power
The main reason researchers consider power is to help them decide how many people to include in their studies.
Sample size has an important influence on power.
Researchers need to ensure that they have enough people in the study that they will be able to see an effect if there is one.

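One way to turn this idea into a number is sketched below. It assumes a one-tailed Z test with a known population standard deviation, for which the required N is ((Z for the significance level + Z for the desired power) / d) squared, rounded up. The function name `needed_n` is hypothetical, and the 80% power target and .05 significance level are assumed defaults that do not come from the slides.

```python
from math import ceil
from statistics import NormalDist


def needed_n(d: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Smallest N giving at least `power` for a one-tailed Z test with known SD."""
    if d <= 0:
        raise ValueError("d must be a positive predicted effect size")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # cutoff for significance
    z_power = std_normal.inv_cdf(power)      # distance needed beyond the cutoff
    return ceil(((z_alpha + z_power) / d) ** 2)


# Smaller predicted effects call for more participants.
for d in (0.2, 0.5, 0.8):
    print(d, needed_n(d))
```
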
Other Influences on Power
Significance Level
Less extreme significance levels (e.g., p < .10) mean more power because the shaded rejection area of the lower curve is bigger and more of the area in the upper curve is shaded.
More extreme significance levels (e.g., p < .001) mean less power because the shaded region in the lower curve is smaller.
One- vs. Two-Tailed Tests
Using a two-tailed test makes it harder to get significance on any one tail.
Power is less with a two-tailed test than a one-tailed test.

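Both influences can be checked with a short calculation. The sketch below assumes a Z test with a known population standard deviation and a true effect in the predicted (positive) direction; the function name `power`, the `tails` parameter, and the example values d = 0.3 and N = 30 are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def power(d: float, n: int, alpha: float = 0.05, tails: int = 1) -> float:
    """Approximate power of a Z test with known SD, one- or two-tailed."""
    std_normal = NormalDist()
    shift = d * sqrt(n)  # predicted mean in units of the SD of the distribution of means
    if tails == 1:
        z_crit = std_normal.inv_cdf(1 - alpha)
        return 1 - std_normal.cdf(z_crit - shift)
    # Two-tailed: alpha is split across both tails, so each cutoff is more extreme.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


# Less extreme significance levels give more power.
for alpha in (0.10, 0.05, 0.001):
    print("alpha", alpha, round(power(0.3, 30, alpha=alpha), 3))

# A one-tailed test has more power than a two-tailed test at the same alpha.
print("one-tailed", round(power(0.3, 30, tails=1), 3))
print("two-tailed", round(power(0.3, 30, tails=2), 3))
```
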
Statistical Significance vs. Practical Significance
It is possible for a study with a small effect size to be significant.
Though the results are statistically significant, they may not have any practical significance.
e.g., if you tested a psychological treatment and your result is not big enough to make a difference that matters when treating patients
Evaluating the practical significance of study results is important when studying hypotheses that have practical implications.
e.g., whether a therapy treatment works, whether a particular math tutoring program actually helps to improve math skills, or whether sending mailing reminders increases the number of people who respond to the Census

More things to think about….
With a small sample size, if a result is statistically significant, it is likely to be practically significant.
In a study with a large sample size, the effect size should also be considered.

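A rough calculation shows why sample size changes how a significant result should be read. The sketch assumes a one-tailed Z test with a known population standard deviation, where the Z score of the sample mean equals the observed effect size times sqrt(N). The function name `smallest_significant_d` and the .05 default are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def smallest_significant_d(n: int, alpha: float = 0.05) -> float:
    """Smallest observed effect size that reaches significance in a one-tailed Z test."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return z_crit / sqrt(n)  # Z = d * sqrt(N), so d = Z / sqrt(N)


# With a small N only a sizable effect can be significant;
# with a very large N even a tiny effect can be.
for n in (16, 100, 10_000):
    print(n, round(smallest_significant_d(n), 3))
```
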
Role of Power When a Result is Not Statistically Significant
A nonsignificant result from a study with low power is truly inconclusive.
A nonsignificant result from a study with high power suggests that:
  the research hypothesis is false, or
  there is less of an effect than was predicted when calculating power.

Aron chpt 7 ed effect size f2011
