Guide to Statistical Tests for Comparing Groups

Introduction to Applied Statistics and Applied Statistical Methods Practical guidelines
Prof. Dr. Chang Zhu
Table of Contents
LECTURE 3
PARAMETRIC TESTS
Independent t-test
Dependent t-test (paired-samples t-test)
NON-PARAMETRIC TESTS
Wilcoxon rank-sum test and Mann-Whitney test
The Wilcoxon signed-rank test
ASSIGNMENT
REPORTING THE RESULT IN APA STYLE

LECTURE 3
PARAMETRIC TESTS
Independent t-test
A TV company has started a reality TV show in which 32 members of the public are left to fend for
themselves on a desert island. The company has asked a psychologist to monitor the psychological
well-being of the contestants, and he records a number of indices of mental health. He is initially
interested in the amount of stress experienced by the contestants during their first week on the island
and hypothesises that:
(1) the females will report higher levels of stress than the males at the start as well as at the end of
the week (H1)
(2) the level of stress experienced by all the participants will have increased by the end of the week of
the reality TV show (H2)
The data file is named TVshow.sav and can be found on Pointcarre.

H1 concerns the difference in the level of stress between two separate groups, males and females. Hence,
an independent t-test will be used.
In SPSS, choose Analyse > Compare Means > Independent-Samples T Test
Select the two variables StressatStart and StressatEnd and move them to the Test Variable(s) box by
clicking the arrow button.
Select Gender and move it to the Grouping Variable box, then click on Define Groups to indicate the codes
that we have assigned for the two groups. In our data, 1 is Female and 2 is Male, so we will type 1 and 2 in
Group 1 and 2, respectively.
After finishing, click on Continue to return to the main dialog box. Then click on OK to run the analysis.

The first table Group Statistics tells us the descriptive statistics for both groups measured at two different
times: at the start and the end of the week.
Group Statistics
                                 Gender   N    Mean    Std. Deviation   Std. Error Mean
Stress at the start of the week  Female   16   14.81    5.307           1.327
                                 Male     16   18.94    7.954           1.988
Stress at the end of the week    Female   16   25.38   12.468           3.117
                                 Male     16   23.19   11.220           2.805
To find the answer to the first hypothesis (H1) we should look at the table labelled Independent Samples
Test.
When we conduct analyses that involve different groups, we should make sure that the variances in the
different groups are equal, i.e. that the homogeneity of variance assumption is satisfied. Levene's test is
used to test this assumption in SPSS and the result is given in the output table Independent Samples Test.
The output shows that both p-values are bigger than .05 (p = .083 and .847), meaning that the variances of
the two groups are not significantly different from each other. Hence, the homogeneity of variance
assumption is satisfied. Therefore, we should read the result of the t-test in the row labelled Equal
variances assumed.
What can we obtain from the part of the table labelled t-test for Equality of Means?
First, comparing the level of stress between females and males at the start of the week, we see that p =
.095 (2-tailed). SPSS reports a 2-tailed test because it makes no prediction about the direction of the
difference (higher or lower). H1 is directional, so we need a one-tailed test, which we obtain by dividing
the p value by 2: p = .048 < .05 (one-tailed). Halving is only meaningful for the direction in which the
means actually differ, so we must also check which group scored higher.
What can we conclude? Based on the result, we come to the conclusion that:
At the start of the week, on average, the male participants experienced a higher level of stress (M = 18.94,
SE = 1.99) than the females (M = 14.81, SE = 1.33). This difference was significant, t(30) = -1.73, p < .05
(one-tailed), but it lies in the direction opposite to the one predicted.
Therefore, hypothesis 1 is not supported, because the psychologist assumed that the females would
experience a higher level of stress than the males.
Independent Samples Test
(F and Sig.: Levene's Test for Equality of Variances; t to Upper: t-test for Equality of Means;
Lower and Upper: 95% Confidence Interval of the Difference)

                                   F       Sig.   t        df       Sig.         Mean         Std. Error   Lower    Upper
                                                                    (2-tailed)   Difference   Difference
Stress at the start of the week
  Equal variances assumed          3.211   .083   -1.726   30       .095         -4.125       2.390        -9.007   .757
  Equal variances not assumed                     -1.726   26.146   .096         -4.125       2.390        -9.037   .787
Stress at the end of the week
  Equal variances assumed          .038    .847   .522     30       .606         2.188        4.193        -6.376   10.751
  Equal variances not assumed                     .522     29.673   .606         2.188        4.193        -6.380   10.755

At the end of the week, the females scored higher than the males, as predicted, but p = .606 (2-tailed),
which gives a one-tailed p = .303 > .05. The difference is therefore non-significant and H1 is again not
supported.
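
The t values in the Independent Samples Test table can be reproduced from the Group Statistics summary values. The following is a minimal sketch in Python (not part of the SPSS workflow), assuming the pooled-variance formula for the Equal variances assumed row. The function names are illustrative, the p value is approximated by numerical integration of the t density, and small deviations from the SPSS output are expected because the inputs are rounded.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus twice the area under the density on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Independent t-test (equal variances assumed) from means, SDs and group sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff, df, se_diff

def one_tailed_p(p_two, t, predicted_sign):
    """Halve the 2-tailed p only if t has the predicted sign; otherwise use 1 - p/2."""
    same_direction = (t > 0) == (predicted_sign > 0)
    return p_two / 2 if same_direction else 1 - p_two / 2

# Group 1 = Female, group 2 = Male (values from the Group Statistics table)
t_start, df, se_start = pooled_t(14.81, 5.307, 16, 18.94, 7.954, 16)
t_end, _, se_end = pooled_t(25.38, 12.468, 16, 23.19, 11.220, 16)
p_start = two_tailed_p(t_start, df)
p_end = two_tailed_p(t_end, df)

# H1 predicts female - male > 0, i.e. a positive t
for label, t, p in (("start", t_start, p_start), ("end", t_end, p_end)):
    print(f"{label}: t({df}) = {t:.2f}, p (2-tailed) = {p:.3f}, "
          f"p (1-tailed, H1) = {one_tailed_p(p, t, +1):.3f}")
```

The one_tailed_p helper makes the direction check explicit: the halved p value applies to H1 only when the females actually score higher than the males.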
Dependent t-test (paired-samples t-test)
Hypothesis 2 (H2) requires analysing the difference in the level of stress of each participant between the
beginning and the end of the week of the reality TV show. Therefore, a dependent or paired-samples t-test
will be used.
The paired-samples t-test requires that the differences between the scores at the beginning and the end of
the week are normally distributed, i.e. the K-S test should be non-significant.
To check this, you should create a new variable whose value is the difference between the two scores of a
given participant.
In SPSS, choose Transform > Compute Variable
In the box Target Variable, type the name of this new variable, e.g. difference. Then select the
variable StressatStart and move it to the Numeric Expression area. Choose the minus sign (-) from the
numeric pad, and move StressatEnd to the Numeric Expression area.
Click OK to create the new variable.
Then conduct the K-S test (test of normality) for this newly created variable (difference) to check the
assumption. Your output may look like this:
Tests of Normality
                     Kolmogorov-Smirnov(a)       Shapiro-Wilk
            Gender   Statistic   df   Sig.       Statistic   df   Sig.
difference  Female   .177        16   .194       .932        16   .266
            Male     .138        16   .200*      .933        16   .268
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
Since the K-S test is non-significant (p > .05), the normality assumption is met and we can proceed to the
paired-samples t-test.
In SPSS, choose Analyse > Compare Means > Paired-Samples T Test
Select the pair of variables (StressatStart and StressatEnd) and move them to the Paired Variables area by
clicking on the arrow button.

Click on OK to run the analysis.
In the output, the first table Paired Samples Statistics tells us that the stress scores at the end of the
week are higher than those at the beginning.
Paired Samples Statistics
                                         Mean    N    Std. Deviation   Std. Error Mean
Pair 1  Stress at the start of the week  16.88   32    6.973           1.233
        Stress at the end of the week    24.28   32   11.720           2.072
As indicated by the table Paired Samples Test, the p value is .013 (2-tailed), which is significant. We can
come to the conclusion:
On average, the participants experienced a higher level of stress at the end of the week (M = 24.28, SE =
2.07) than at the beginning of the week (M = 16.88, SE = 1.23), t(31) = -2.64, p < .05.
Paired Samples Test
Pair 1: Stress at the start of the week - Stress at the end of the week

Paired Differences
  Mean                                        -7.406
  Std. Deviation                              15.848
  Std. Error Mean                              2.802
  95% Confidence Interval of the Difference   Lower -13.120, Upper -1.693
t                                             -2.644
df                                            31
Sig. (2-tailed)                               .013
Therefore, hypothesis 2 (H2) is supported.
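
The figures in the Paired Samples Test table follow directly from the mean and standard deviation of the difference scores. The sketch below, in Python rather than SPSS, illustrates the arithmetic. The function names are illustrative, the critical value 2.0395 (the 97.5th percentile of t with 31 degrees of freedom) is supplied by hand as an assumption, and the three-participant data set in the second call is made up purely to show the raw-score route.

```python
import math
import statistics

def paired_t_from_summary(mean_diff, sd_diff, n, t_crit):
    """t statistic, standard error and confidence interval for a paired-samples t-test."""
    se = sd_diff / math.sqrt(n)          # standard error of the mean difference
    t_stat = mean_diff / se              # df = n - 1
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return t_stat, se, ci

def paired_t(first, second, t_crit):
    """Same test from raw scores: compute first - second for every participant."""
    diffs = [a - b for a, b in zip(first, second)]
    return paired_t_from_summary(statistics.mean(diffs), statistics.stdev(diffs),
                                 len(diffs), t_crit)

# Summary values from the Paired Samples Test table (start minus end, n = 32)
T_CRIT_DF31 = 2.0395  # assumed critical value for a 95% interval with 31 df
t_stat, se, ci = paired_t_from_summary(-7.406, 15.848, 32, T_CRIT_DF31)
print(f"t(31) = {t_stat:.3f}, SE = {se:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")

# Hypothetical raw scores for three participants (critical value for 2 df: 4.3027)
t_demo, se_demo, ci_demo = paired_t([10, 12, 14], [13, 14, 18], 4.3027)
print(f"demo: t(2) = {t_demo:.3f}")
```

A negative t here simply reflects the order of subtraction (start minus end): the scores at the end of the week are higher.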

NON-PARAMETRIC TESTS
When the assumption of normality is violated or the variables are measured on an ordinal scale, we opt for
non-parametric tests, which are the counterparts of the two types of t-test.
Wilcoxon rank-sum test and Mann-Whitney test
e.g. we want to know whether people who intend to get a PhD or PsyD (Doctor of Psychology) in psychology
are more likely than PhD holders to rely on a calendar or day planner to remember what they are supposed
to be doing (i.e., are people who might become professors more absent-minded than other people?).
The ordinal variable planner measures the extent to which a person relies on a calendar/day planner,
ranging from 1 (strongly agree) to 5 (strongly disagree). The data file is named planner_use.sav.
(The idea and data for this example are adapted from
http://academic.udayton.edu/gregelvers/psy216/spss/ordinaldata.htm)
In SPSS, choose Analyse > Nonparametric Tests > Legacy Dialogs > 2 Independent-Samples
Select the variable planner and move it to the Test Variable List box by clicking the arrow button.
Select the variable phd and move it to the Grouping Variable box, then click on Define Groups to indicate
the codes that we have assigned for the two groups. In our data, 1 is for those who intend to go for a PhD
and 2 is for PhD degree holders, so we will type 1 and 2 in Group 1 and 2, respectively.
After finishing, click on Continue to return to the main dialog box.
Click on Exact to access the Exact Tests dialog box. With large samples, the suggested option is the Monte
Carlo method. As our samples are small, we will choose the Exact option. Click on Continue to return to the
main dialog box.
Click on Options to access the Options dialog box, select Descriptive and click Continue to return to the main
dialog box.
To run the analysis, click OK.

In the output, the first table we should look at is the one labelled Ranks, which reports the mean rank for
each group, e.g. for the first group (those who intend to do a PhD degree), the number of participants is
11 and the mean rank is 27.32.
Ranks
                                        Intend To Get PhD or PsyD   N    Mean Rank   Sum of Ranks
I rely on a calendar / day-planner to   Intend to do a PhD          11   27.32       300.50
remember what I am supposed to do.      PhD holder                  35   22.30       780.50
                                        Total                       46
The important table is named Test Statistics, which shows us the p-value of the Mann-Whitney U test when
exact significance is selected: p = .127 > .05 (1-tailed).
Test Statistics(b)
                                 I rely on a calendar / day-planner to remember what I am supposed to do.
Mann-Whitney U                   150.500
Wilcoxon W                       780.500
Z                                -1.169
Asymp. Sig. (2-tailed)           .242
Exact Sig. [2*(1-tailed Sig.)]   .284(a)
Exact Sig. (2-tailed)            .252
Point Probability                .006
a. Not corrected for ties.
b. Grouping Variable: Intend To Get PhD or PsyD
Hence, our conclusion is that people who intend to do a PhD do not differ significantly from PhD degree
holders with regard to the use of a day planner to remember what they are supposed to be doing, U =
150.50, z = -1.17, p > .05, ns.
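
The Mann-Whitney U reported by SPSS can be recovered from the Sum of Ranks column of the Ranks table. The following Python sketch is a hedged illustration with illustrative function names: it assumes the usual relation U = R - n(n + 1)/2 for each group, with the smaller of the two values reported, and the two short lists of ratings at the end are hypothetical data used only to show how tied scores receive average ranks.

```python
def u_from_rank_sum(rank_sum, n):
    """U for one group from its rank sum R and size n: U = R - n(n + 1) / 2."""
    return rank_sum - n * (n + 1) / 2

def midranks(values):
    """Rank values from 1 to N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Rank the pooled scores, sum the ranks per group, return the smaller U."""
    ranks = midranks(list(group1) + list(group2))
    r1 = sum(ranks[:len(group1)])
    r2 = sum(ranks[len(group1):])
    u1 = u_from_rank_sum(r1, len(group1))
    u2 = u_from_rank_sum(r2, len(group2))
    return min(u1, u2), r1, r2

# Rank sums from the Ranks table: 11 who intend to do a PhD, 35 PhD holders
u_intend = u_from_rank_sum(300.50, 11)
u_holder = u_from_rank_sum(780.50, 35)
print("U =", min(u_intend, u_holder))

# Hypothetical ratings on the 1-5 scale, with ties
u_demo, r1_demo, r2_demo = mann_whitney_u([1, 2, 2, 4], [2, 3, 5, 5, 4])
print("demo U =", u_demo)
```

The sketch stops at U. The z value and the exact p value in the SPSS output additionally account for ties, which is not reproduced here.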
The Wilcoxon signed-rank test
e.g. we want to know whether students in matched pairs (each pair having the same GPA score) differ in the
degree to which they like a course when one member of the pair has access to an online quiz program and
the other has no access to the quiz.
The data file is named quiz_access.sav.

In SPSS, choose Analyse > Nonparametric Tests > Legacy Dialogs > 2 Related Samples
Select the pair of variables (quiz and no_quiz) and move them to the Test Pairs area by clicking on the
arrow button. Under Test Type, choose Wilcoxon.
Click on Exact to access the Exact Tests dialog box. With large samples, the suggested option is the Monte
Carlo method. As our samples are small, we will choose the Exact option. Click on Continue to return to the
main dialog box.
Click on Options to access the Options dialog box, select Descriptive and click Continue to return to the
main dialog box.
To run the analysis, click OK.
In the output, the first table we should look at is the one labelled Ranks, which reports the number of
negative ranks, positive ranks and ties. For example, it indicates that there are 8 negative ranks (N = 8),
i.e. 8 pairs in which the no-quiz participant likes the class less than the quiz peer.
Ranks
                                 N      Mean Rank   Sum of Ranks
no_quiz - quiz  Negative Ranks   8(a)   4.50        36.00
                Positive Ranks   0(b)   .00         .00
                Ties             4(c)
                Total            12
a. no_quiz < quiz
b. no_quiz > quiz
c. no_quiz = quiz

The important table is named Test Statistics, which shows us the p-value of the Wilcoxon Signed Ranks test
when exact significance is selected: p = .004 < .05 (1-tailed).
Test Statistics(b)
                         no_quiz - quiz
Z                        -2.539(a)
Asymp. Sig. (2-tailed)   .011
Point Probability        .004
a. Based on positive ranks.
b. Wilcoxon Signed Ranks Test
The Wilcoxon test statistic is denoted by the letter T and is the smaller of the two sums of ranks. Hence,
our conclusion is that the participants who have access to the online quiz program like the course more
than those who do not have access, T = 0, p < .05.
Alternatively, we can use the z value to write the result:
The participants who have access to the online quiz-program like the course more than those who do not
have access, z = -2.54, p < .05.
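
To see where T and the exact one-tailed p value come from, the signed-rank procedure can be sketched in Python. This is an illustration under stated assumptions, not the SPSS algorithm: the function names are illustrative, and the quiz and no_quiz scores below are hypothetical values chosen only so that they reproduce the Ranks table (8 negative ranks with a rank sum of 36, no positive ranks, 4 ties). The exact p value is obtained by enumerating every possible assignment of signs to the ranks.

```python
from itertools import product

def midranks(values):
    """Rank values from 1 to N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def wilcoxon_signed_rank(first, second):
    """Return T, the positive and negative rank sums, and the exact one-tailed p."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # tied pairs are dropped
    ranks = midranks([abs(d) for d in diffs])
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t_stat = min(positive, negative)
    # Under H0 each rank is equally likely to be positive or negative, so count
    # the sign patterns whose positive rank sum is at most the observed T.
    as_extreme = sum(
        1 for signs in product((0, 1), repeat=len(ranks))
        if sum(r for r, s in zip(ranks, signs) if s) <= t_stat
    )
    return t_stat, positive, negative, as_extreme / 2 ** len(ranks)

# Hypothetical liking scores for 12 matched pairs (not the real quiz_access.sav data)
quiz = [4, 5, 3, 4, 5, 4, 5, 3, 4, 3, 5, 2]
no_quiz = [3, 3, 2, 3, 2, 3, 3, 2, 4, 3, 5, 2]

t_stat, positive, negative, p_exact = wilcoxon_signed_rank(no_quiz, quiz)
print(f"T = {t_stat}, negative rank sum = {negative}, exact p (1-tailed) = {p_exact:.3f}")
```

With all 8 non-tied differences negative, only one of the 2^8 = 256 sign patterns is as extreme as the observed one, which is why the exact one-tailed p value is 1/256, or about .004.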

ASSIGNMENT
1. Self-practice: familiarize yourself with the paired-samples tests (optional)
Read the parts on the paired-samples t-test and the Wilcoxon signed-rank test, and use the data sets
TVshow.sav and quiz_access.sav (on Pointcarre) to conduct the analyses.
2. Group work
You can choose one of the two options:
a) Think of an imaginary research study (as interesting and fascinating as possible) that you are about to
conduct.
- Decide on the variables (e.g. anxiety of SPSS use) and their measurement level
- Decide on the groups that are involved in the study (male/female; treatment/control group)
If there is a certain intervention, please describe it. For example, you can help the treatment
group to overcome the anxiety of SPSS use by offering it a more simplified explanation
compared to the common textbook that is used.
- State your hypothesis
e.g. There is a difference in the level of anxiety of SPSS use between the group provided
with a simplified explanation of statistics concepts and the group that uses the common
textbook.
- Create a data set with the variables you have defined, with at least 15 cases (participants)
for each group or condition.
- Conduct the appropriate test based on your research design (independent or paired
samples; parametric or non-parametric).
- Give the conclusion based on the test results.
b) Search for a research article that uses one of the tests for differences (independent/paired-
samples t-test; the Mann-Whitney or Wilcoxon rank-sum test; the Wilcoxon signed-rank test)
Briefly summarize the following:
- The variables measured in the study
- The groups that the analyses were conducted for
- The study hypotheses
- The tests that were used to test the hypotheses
- The study's conclusion (What has been found?)
Submission: please submit your group work (the Word document and SPSS file) in the Assignment section.
Please see the example of how to present your results for this assignment in APA style on the next page.

REPORTING THE RESULT IN APA STYLE
Group (1): (indicate your group members here)
Your submission should include 2 parts:
I. Study description
- In this study we compare the difference in (name the variables) between (name the groups).
- Briefly describe your hypothesis if you already have some ideas about the way in which the 2 groups
differ.
e.g. group A will have a higher level of anxiety than group B.
- Indicate the tests of differences that you used.
e.g. An independent t-test was used to find out the differences between the 2 groups.
II. Data analysis results
If you conduct t-tests, briefly describe the test of homogeneity of variance (Levene's test).
e.g. Levene's test shows that the variances were equal for the two groups, F = 3.21, p > .05
The results of the t-test are presented in Table 1.
Table 1
Results Of Independent Samples T-Test Comparing The Stress Levels Measured At The Beginning And The
End Of The Week For The Female And Male Groups
Variables                        Gender   n    Mean (SD)        Std. Error Mean   t       df   Sig. (2-tailed)
Stress at the start of the week  Female   16   14.81 (5.307)    1.33              -1.73   30   .095
                                 Male     16   18.94 (7.954)    1.99
Stress at the end of the week    Female   16   25.38 (12.468)   3.12              .52     30   .606
                                 Male     16   23.19 (11.220)   2.80
At the start of the week, on average, the male participants experienced a higher level of stress (M = 18.94,
SE = 1.99) than the females (M = 14.81, SE = 1.33). This difference was significant, t(30) = -1.73, p < .05
(one-tailed), but it lies in the direction opposite to the one predicted.
Therefore, hypothesis 1 is not supported, because the psychologist assumed that the females would
experience a higher level of stress than the males.
At the end of the week, p (one-tailed) = .303 > .05, so the test is non-significant and hypothesis 1 is
again not supported.
Notes on reporting the results in APA style:
When we report the result in a table, we do not use vertical lines. The table number and its caption (in
italics, with the first letter of each word capitalized) should be on separate lines.
The letters used to indicate the test statistics should be in italics. For example:
- Mean: M; standard deviation: SD; standard error: SE
- Levene's test: F; test of difference: t; significance: p
Any statistic whose absolute value cannot exceed 1 (such as p) should be presented without a zero before
the decimal point, e.g. we write p = .04, but not p = 0.04.
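
When many results have to be reported, the leading-zero rule is easy to apply programmatically. The helper functions below are a small, hypothetical Python sketch (the names apa_number and apa_t are illustrative and not part of any library), assuming two decimals for test statistics and three for p values. Italics still have to be applied in the word processor.

```python
def apa_number(value, decimals=2):
    """Format a number and drop the zero before the decimal point when |value| < 1."""
    text = f"{value:.{decimals}f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text

def apa_t(t, df, p):
    """Build a string such as 't(30) = -1.73, p = .095' for a t-test result."""
    # t can exceed 1, so it keeps its leading zero; p cannot, so it loses it
    return f"t({df}) = {t:.2f}, p = {apa_number(p, 3)}"

print(apa_number(0.04))
print(apa_t(-1.726, 30, 0.095))
```

Only statistics that cannot exceed 1 in absolute value should be passed through apa_number; means and t values are formatted with their leading zero.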
