2. STATISTICAL TESTING
1. Is the difference between the means for men and women statistically
significant, or is it just due to chance?
2. Is the magnitude of the difference large enough to justify an
intervention?
4. T
Assumptions and how to test each one:
• Normal distribution: histogram & Q-Q plot
• No significant outliers: boxplot
• Homogeneity of variance: Levene’s test
• Independence (in most cases): study design (probability sampling method)
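The boxplot check for outliers can be sketched in a few lines. This is a minimal illustration, assuming the common 1.5 × IQR whisker rule (the slide does not name a specific rule); the function name `iqr_outliers` and the sample values are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the boxplot whiskers (k * IQR rule)."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2 (median), Q3
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements; 60 is deliberately extreme
sample = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 60]
print(iqr_outliers(sample))
```

A value flagged this way is only a candidate outlier; whether it is "significant" enough to affect the t-test remains a judgment call.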
6. Types of t-test
• One-sample t-test: compares one sample mean with a hypothesized value
• Two-sample t-test (independent samples): compares the means of two
different samples, assuming either equal or unequal variances
• Paired-sample t-test (dependent samples): compares the means of two
related (dependent) sets of measurements
7. Types of t-test
• One-sample t-test: is the mean heart rate of a group of people equal
to 65 or not?
• Two-sample t-test (independent samples): are the mean heart rates for
two groups of people the same or not?
• Paired-sample t-test (dependent samples): is the mean difference in heart
rate for a group of people before and after exercise zero or not?
8. Specific assumptions of the t-test
1. Both the independent t-test and the dependent t-test are parametric
tests based on the normal distribution. Therefore, they assume:
a) The sampling distribution is normally distributed. In the
dependent t-test this means that the sampling distribution
of the differences between scores should be normal, not
the scores themselves.
b) Data are measured at least at the interval level.
2. The two sample (Independent) t-test, because it is used to test
different groups of people, also assumes:
a) Variances in these populations are roughly equal
(homogeneity of variance).
b) Scores in different treatment conditions are independent
(because they come from different people).
9. Comparison of the z-test and t-test
t-test
• When groups are comparatively small
• When the population standard deviation (σ) is unknown
z-test
• When groups are large
• When σ is known
12. Examples
A hospital has a random sample of cholesterol measurements for
men. These patients were seen for issues other than cholesterol.
They were not taking any medications for high cholesterol. The
hospital wants to know if the unknown mean cholesterol for
patients is different from a goal level of 200 mg.
We measure the grams of protein for a sample of energy bars. The
label claims that the bars have 20 grams of protein. We want to
know if the labels are correct or not.
13. One-sample t-test
Energy Bar - Grams of Protein
20.70 27.46 22.15 19.85 21.29 24.75
20.75 22.91 25.34 20.33 21.54 21.08
22.14 19.56 21.10 18.04 24.12 19.95
19.72 18.28 16.26 17.46 20.53 22.12
25.06 22.44 19.08 19.88 21.39 22.33 25.79
Hypotheses:
Ho: μ = 20
H1: μ ≠ 20 (Two-sided test)
Calculated Sample Mean: 21.40
Calculated Standard Error: 0.456
Calculated Test Statistic: 3.07
Level of significance (α): 0.05
Critical t-Value (df=30): ±2.042 (|t| = 3.07 exceeds it)
Decision: Reject Ho
Practical Conclusion: Labels are incorrect; mean
protein content >20g.
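The figures on this slide can be reproduced from the 31 listed measurements. The sketch below uses only the Python standard library; the helper name `one_sample_t` is illustrative, and the critical value is taken from a t-table rather than computed.

```python
import math
import statistics

# Grams of protein for the 31 energy bars listed on the slide
protein = [
    20.70, 27.46, 22.15, 19.85, 21.29, 24.75,
    20.75, 22.91, 25.34, 20.33, 21.54, 21.08,
    22.14, 19.56, 21.10, 18.04, 24.12, 19.95,
    19.72, 18.28, 16.26, 17.46, 20.53, 22.12,
    25.06, 22.44, 19.08, 19.88, 21.39, 22.33, 25.79,
]

def one_sample_t(values, mu0):
    """Return (mean, standard error, t statistic, degrees of freedom)."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # sample SD / sqrt(n)
    return mean, se, (mean - mu0) / se, n - 1

mean, se, t_stat, df = one_sample_t(protein, mu0=20)

# Two-sided critical value for df = 30, alpha = 0.05, read from a t-table
T_CRIT = 2.042
reject_h0 = abs(t_stat) > T_CRIT

print(f"mean={mean:.2f} se={se:.3f} t={t_stat:.2f} df={df} reject H0: {reject_h0}")
```

The test statistic is simply the distance between the sample mean and the hypothesized value, expressed in standard errors.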
16. Student's T-test
How can we determine, to a reasonable
degree of scientific certainty, if one
variety of barley yields more than
another?
William Sealy Gosset
1908
17. Why the name “Student's t-test”?
William Sealy Gosset
1908
19. Student's T-test
• The t-test is an inferential, parametric statistical test. It
determines whether there is a statistically significant difference
between the means of two groups or conditions.
• The two-group version is also known as the independent-samples t-test,
two-sample t-test, between-samples t-test and unpaired-samples t-test.
20. T-test
Dependent t-test
Compares two means based on
related data.
E.g., Data from the same people
measured at different times.
Data from ‘matched’ samples.
Independent t-test
Compares two means based on
independent data
E.g., data from different groups of
people
Significance testing
Testing the significance of
Pearson’s correlation coefficient
Testing the significance of b in
regression.
22. Example
We measure the grams of protein in two different brands of 30 energy
bars. Our two groups are the two brands. Our measurement is the grams
of protein for each energy bar. Our idea is that the mean grams of protein
for the underlying populations of the two brands may be different. We
want to know whether we have evidence that the mean grams of protein for
the two brands of energy bars are different.
23. Hypothesis
Null Hypothesis (Ho): μ1 = μ2; the mean grams of protein for the two
brands of energy bars are equal, i.e., there is no difference between
the two population means.
Alternative Hypothesis (Ha): μ1 ≠ μ2; the mean grams of protein for
the two brands of energy bars are not equal, i.e., there is a
difference between the two population means.
25. T test
• Significance level: 0.05
• Degrees of freedom: (n1 + n2) - 2 = (16 + 16) - 2 = 30
• Critical Value: 2.042
• T-Value: 2.3 (>2.042)
• We reject H0
• We conclude: Mean grams of protein for the two brands of energy bars
are not equal.
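The slides give the t-value but not the raw data, so the calculation itself can only be sketched. The example below assumes the equal-variance (pooled) form of the two-sample t-test, which is the one that uses df = n1 + n2 - 2; the function name `pooled_t` and the two small samples are hypothetical.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se
    return t, df

# Hypothetical grams of protein for two brands (not the data behind the slide)
brand_a = [21, 22, 23, 24, 25]
brand_b = [18, 20, 22, 24, 26]

t_stat, df = pooled_t(brand_a, brand_b)
print(f"t={t_stat:.3f} df={df}")
```

The decision step is then the same as on the slide: compare |t| with the critical value for the computed degrees of freedom.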
28. Paired T-test
Estimate whether the means of two related
measurements are significantly different from one
another
Used when two continuous variables are related
Same participant at different times
Different sites on the same person
Cases and their matched controls.
Also known as within-subjects, repeated-measures
and dependent-samples t test.
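A paired t-test is equivalent to a one-sample t-test on the within-pair differences, tested against zero. The sketch below illustrates that idea with the standard library; the function name `paired_t` and the before/after readings are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic computed on within-pair differences; returns (t, df)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    # H0: the mean difference is zero
    return statistics.mean(diffs) / se, n - 1

# Hypothetical readings for the same five participants at two time points
before = [72, 75, 68, 80, 77]
after = [70, 71, 67, 76, 74]

t_stat, df = paired_t(before, after)
print(f"t={t_stat:.3f} df={df}")
```

Because only the differences enter the calculation, it is the differences (not the raw scores) that need to be approximately normal.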
29. Example
Within a group of overweight and physically inactive
individuals, does participation in an exercise-training
program lead to a reduction in cholesterol concentrations
when compared to following a calorie-controlled diet?
30. Which intervention is more effective in lowering
cholesterol levels in overweight males: Exercise or weight
loss?
This study found that overweight,
physically inactive male participants had
statistically significantly lower cholesterol
concentrations (5.80 ± 0.38 mmol/L) at
the end of an exercise-training
programme compared to after a calorie-
controlled diet (6.15 ± 0.52
mmol/L), t(38) = 2.428, p = 0.020.
31. T-tests
Do the mean(s) come from the same group?
• Yes: paired t-test
• No, comparing the means of 2 groups: Student's t-test
• No, comparing the mean of a group against a known mean: one-sample t-test
32. Summary of Comparing Means
Number of samples?
• One: one-sample t-test
  H0: μ = c; df = n − 1
• Two: dependent?
  • Yes: paired-sample t-test
    H0: μd = 0; df = n − 1
  • Independent: equal variance?
    • Yes: independent-sample t-test (Student’s t-test)
      H0: μ1 − μ2 = 0; df = n1 + n2 − 2
    • Unsure: independent-sample t-test (approximation of d.f.)
      H0: μ1 − μ2 = 0; df approximated
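The slide does not say how the degrees of freedom are approximated when equal variances cannot be assumed. A common choice is the Welch-Satterthwaite formula, sketched below as an assumption rather than as the method the slide necessarily had in mind; the function name `welch_df` and the inputs are hypothetical.

```python
def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation of the degrees of freedom.

    var1, var2 are sample variances; n1, n2 are sample sizes.
    """
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# With equal variances and equal sizes the result matches n1 + n2 - 2
print(welch_df(4.0, 16, 4.0, 16))

# With unequal variances the approximated df is smaller
print(welch_df(1.0, 16, 9.0, 16))
```

A smaller df means a larger critical value, so the unequal-variance test is more cautious when the group variances differ.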
34. Analysis of Variance - ANOVA
2 groups, 1 parameter ➔ t-test
3+ groups, 2+ parameters ➔ ANOVA
35. ANOVA
The concept of ANOVA is similar to regression in
that it is used to investigate and model the
relationship between a dependent (response)
variable and one or more independent
(explanatory) variables.
It differs in two ways:
• Here the independent variables are categorical
• No assumption is made about the nature of the
relationship
ANOVA (Analysis of Variance) is like a 'big sibling' of the
two-sample t-test. While the t-test compares two means to
see if they are equal, ANOVA takes this a step further. It
helps us compare more than two means to see if they are
all equal or if at least one of them is different from the rest.
36. ANOVA – Analysis of Variance
The hypotheses in the simple ANOVA test are the
following:
H0: μ1 = μ2 = … = μk
H1: at least two μ’s differ
The test statistic for ANOVA is the F-statistic.
37. Example ANOVA research question
• Are there differences in the degree of religious
commitment between countries (UK, USA, and Australia)?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
38. Example ANOVA research question
• Do university students have different levels of satisfaction for
educational, social, and campus-related domains?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
39. Example ANOVA research question
• Are there differences in the degree of religious commitment between
countries (UK, USA, and Australia) and gender (male and female)?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
40. Example ANOVA research question
• Does couples' relationship satisfaction differ between males and
females and before and after having children?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
41. Example ANOVA research question
• Are there differences in university student satisfaction between males and
females (gender) after controlling for level of academic performance?
• 1-way ANOVA
• 1-way repeated measures ANOVA
• Factorial ANOVA
• Mixed ANOVA
• ANCOVA
42. ANOVA
ANOVA partitions the sums of squares (variation around the mean)
into:
• Explained variance (between groups)
• Unexplained variance (within groups), also called error variance
The F statistic is the between-group variance divided by the within-group variance:
F = model variance / error variance
p = probability of observing mean differences between groups at least this large if the null hypothesis were true (i.e., by chance alone)
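The partition into between-group and within-group variation can be written out directly. The following is a minimal sketch using the standard library; the function name `one_way_anova_f` and the three small groups are hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)

    # Explained variation: group means around the grand mean
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Unexplained variation: observations around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three groups
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

A large F means the group means are spread out far more than the scatter within each group would suggest.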
43. One-way ANOVA
Comparing means between 3+ groups
Are there differences in satisfaction levels between students who get different grades?
Average Grade X1 X2 X3
44. The model
Software output: [figure not reproduced]
The mathematical model:
Xij = µ + αi + εij
• µ: general mean
• αi: effect of group
• εij: individual variation
45. Factorial ANOVA
Comparing means of groups defined by 2+ parameters
Mean Height Country #1 Country #2 Country #3
Male X1.1 X1.2 X1.3
Female X2.1 X2.2 X2.3
Does height differ significantly according to origin and gender ?
46. The model
Software output: [figure not reproduced]
The mathematical model:
Xijk = µ + αi + βj + (αβ)ij + εijk
• µ: general mean
• αi: effect of variable 1
• βj: effect of variable 2
• (αβ)ij: effect of interaction between 1 & 2
• εijk: individual variation
47. Post-Hoc tests
If the ANOVA is significant, at least one group mean differs, but which one?
Several tests can be run after the ANOVA:
• Tukey's Honestly Significant Difference (Tukey HSD)
• Conservative tests include Bonferroni correction
• Liberal tests include Student-Newman-Keuls test
• Fisher's Least Significant Difference (LSD)
They tell us which groups are significantly different, and how they are different.
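Of the options listed, the Bonferroni correction is the simplest to illustrate: each pairwise p-value is multiplied by the number of comparisons and capped at 1. The sketch below assumes the pairwise p-values have already been obtained; the function name `bonferroni` and the p-values are hypothetical.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values (each multiplied by m, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values for three pairwise comparisons
raw = [0.010, 0.020, 0.500]
adjusted = bonferroni(raw)

for p, p_adj in zip(raw, adjusted):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"raw={p:.3f} adjusted={p_adj:.3f} {verdict}")
```

The adjustment is conservative: the second comparison here would pass at 0.05 unadjusted but no longer does after correction.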
48. What one-way ANOVAs can tell us
Problem: Can completing courses at different levels (beginner, intermediate, and
advanced) affect how quickly individuals can solve a problem? In essence, we want to
understand if the complexity of the courses has an impact on problem-solving speed.
What can we conclude from a one-way ANOVA ?
49. What one-way ANOVAs can tell us
The short way:
The mean value of group #1 is significantly higher than that of groups #2 & 3.
In a paper:
There was a statistically significant difference between groups as determined by one-way ANOVA
(F(2,27) = 4.467, p = .021). A Tukey post hoc test revealed that the time to complete the problem was
statistically significantly lower after taking the intermediate (23.6 ± 3.3 min, p = .046) and advanced
(23.4 ± 3.2 min, p = .034) course compared to the beginners course (27.2 ± 3.0 min). There was no
statistically significant difference between the intermediate and advanced groups (p = .989).
50. What factorial ANOVAs can tell us
The short way:
There is an effect of the interaction between variables #2 & 3 on the mean value of the continuous variable.
In a paper:
A two-way ANOVA was conducted that examined the effect of gender and education level on interest in
politics. There was a statistically significant interaction between the effects of gender and education level on
interest in politics, F(2, 54) = 4.643, p = .014.
Simple main effects analysis showed that males were significantly more interested in politics than females
when educated to university level (p = .002), but there were no differences between genders when educated
to school (p = .465) or college level (p = .793).
51. Quick recap
If we want to compare …
• Mean of 1 continuous variable between 2 levels of a categorical variable ➔ t-test
• Mean of 1 continuous variable between 3+ levels of a categorical variable ➔ One-Way ANOVA
• Mean of 1 continuous variable between 3+ levels of 2+ categorical variables ➔ Factorial ANOVA