HFS3283 paired t-test and ANOVA

HFS3283
PAIRED T-TEST
&
ONE WAY-ANOVA
DR. SHARIFAH WAJIHAH WAFA BTE SST WAFA
School of Nutrition and Dietetics
Faculty of Health Sciences
sharifahwajihah@unisza.edu.my
KNOWLEDGE FOR THE BENEFIT OF HUMANITY

Topic Learning Outcomes
At the end of this lecture, the student should be able to:
1. Understand the structure of a research study appropriate for an ANOVA test
2. Understand how to evaluate the assumptions underlying this test
3. Interpret SPSS outputs and report the results

Overview
t-tests
1. One-sample t-test
2. Independent samples t-test
3. Paired samples t-test
ANOVAs
1. 1-way ANOVA
2. 1-way repeated measures ANOVA
3. Factorial ANOVA

Why a t-test or ANOVA?
•A t-test or ANOVA is used to determine
whether a sample of scores is from the same
population as another sample of scores.
•These are inferential tools for examining
differences between group means.
• Is the difference between two sample means
‘real’ or due to chance?

t-tests
•One-sample
One group of participants, compared with fixed,
pre-existing value (e.g., population norms)
•Independent
Compares mean scores on the same variable
across different populations (groups)
•Paired
Same participants, with repeated measures

Major assumptions
•Normally distributed variables
•Homogeneity of variance
In general, t-tests and ANOVAs are robust to
violation of assumptions, particularly with
large cell sizes, but don't be complacent.

Use of t in t-tests
•t reflects the ratio of the difference between
group means to the variability within groups
(the standard error of the difference)
•Is the t large enough that it is unlikely that the
two samples have come from the same
population?
•Decision: Is t larger than the critical value for t?
(see t tables – depends on critical α and N)

One-tail vs. two-tail tests
• Two-tailed test rejects null hypothesis if
obtained t-value is extreme in either direction
• One-tailed test rejects null hypothesis if
obtained t-value is extreme in one direction
(you choose – too high or too low)
• One-tailed tests are more powerful than two-tailed
tests in the predicted direction (the one-tailed
p-value is half the two-tailed p-value), but they can
only identify differences in that one direction.

One sample t-test
• Compare one group (a sample) with a fixed, pre-
existing value (e.g., population norms)
• Do uni students sleep less than
the recommended amount?
e.g., Given a sample of N = 190 uni
students who sleep M = 7.5 hrs/day (SD
= 1.5), does this differ significantly from 8
hrs/day (α = .05)?
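
As a rough check on the sleep example, the one-sample t statistic can be computed by hand from the summary values on the slide. The sketch below uses only Python's standard library; the variable names are illustrative assumptions, not part of the original lecture.

```python
import math

# Summary values taken from the slide
n = 190            # sample size
sample_mean = 7.5  # mean hours of sleep per day
sample_sd = 1.5    # standard deviation
reference = 8.0    # recommended hours of sleep per day

# Standard error of the mean, then the one-sample t statistic
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - reference) / standard_error
df = n - 1

print(f"t({df}) = {t_stat:.2f}")
```

This gives a t of about -4.59 with 189 degrees of freedom, whose magnitude is well beyond the two-tailed critical value of roughly 1.97 at α = .05.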

One sample t-test

Independent groups t-test
• Compares mean scores on the same variable
across different populations (groups)
• Do males & females differ in the
amount of sleep they get?

Assumptions
(Indep. samples t-test)
•Level of measurement (LOM)
– IV is ordinal / categorical
– DV is interval / ratio
•Homogeneity of Variance: If variances unequal (Levene’s
test), adjustment made
•Normality: t-tests robust to modest departures from
normality, otherwise consider use of Mann-Whitney U test
•Independence of observations (one participant’s score is
not dependent on any other participant’s score)

Do males and females differ in
amount of sleep per night?

Independent samples t-test
•Comparison between means of 2 independent
sample variables = t-test
(e.g., what is the difference in Educational Satisfaction
between male and female students?)
•Comparison between means of 3+
independent sample variables = 1-way
ANOVA
(e.g., what is the difference in Educational Satisfaction
between students enrolled in four different faculties?)

PAIRED T-TEST

Paired samples t-test
→ 1-way repeated measures ANOVA
• Same participants, with repeated measures
• Data is sampled within subjects. Measures are
repeated e.g.,:
–Time e.g., pre- vs. post-intervention
–Measures e.g., approval ratings of
brand X and brand Y
• The paired t-test will show whether the differences
observed in the 2 measures will be found reliably in
repeated samples.

Assumptions
(Paired samples t-test)
• Level of measurement (LOM):
– IV: Two measures from same participants (w/in
subjects)
• a variable measured on two occasions or
• two different variables measured on the same occasion
– DV: Continuous (Interval or ratio)
• Normal distribution of difference scores (robust to
violation with larger samples)
• Independence of observations (one participant’s score is
not dependent on another’s score)

Example
• In this example, we want to compare the weight
change among obese children after 6 weeks in a
weight management program.
• Five obese children are selected at random from
School A.
• We are interested in the following research question:
Does the intervention have an effect on the body weight of
the obese children?
• The weight (kg), both pre- and post-treatment,
is recorded in columns 1 and 2 (see next slide) for
each of the 5 children (P1-P5):

Example 4.1 (cont.)
Person   Column 1         Column 2         Column 3                  Column 4
         X1: 0 month      X2: 6 month      $(X_1 - \bar{X}_1)^2$     $(X_2 - \bar{X}_2)^2$
P1       52               51               4                         1
P2       54.5             52               0.25                      0
P3       53               52               1                         0
P4       54.5             52               0.25                      0
P5       56               53               4                         1
Mean     54               52               1.9                       0.4
The means of columns 3 and 4 are the variances of X1 and X2:
$$ s_x^2 = \frac{\sum (X - \bar{X})^2}{n} $$
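
The means and variances for the five children can be checked with a short Python sketch. It assumes the slide's convention of dividing the sum of squared deviations by n (statistical software usually divides by n - 1 instead); the helper names `mean` and `variance_n` are hypothetical.

```python
# Weights (kg) for P1-P5, as listed on the slide
pre = [52, 54.5, 53, 54.5, 56]    # X1: pre-treatment
post = [51, 52, 52, 52, 53]       # X2: post-treatment

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def variance_n(values):
    """Variance with divisor n, the convention used on the slide."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)

print("means:", mean(pre), mean(post))
print("variances:", variance_n(pre), variance_n(post))
```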

Example 4.1 (cont.)
• Unlike the independent samples t-test, on each row
the numbers in columns 1 and 2 come from the
same people.
• Person 2, for example, weighed 54.5 kg in the
pre-treatment, but dropped to 52 kg in the
post-treatment (after all treatment was completed).
• It appears that this treatment may have reduced
body weight.

Example 4.1 (cont.)
• The paired t-test will allow us to see if the
improvement that we see in this sample is reliable.
• If we selected another 5 obese children
at random from the weight management program,
would we still see an improvement?
• Without having to go through the trouble and
expense of repeated sampling (called replication), we
can estimate whether the difference in the 2 means
is so large in magnitude that we would likely find the
same result if we chose another 5 persons.

Example 4.1 (cont.)
$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_{x_1}^2 + S_{x_2}^2 - 2\, r_{x_1 x_2} S_{x_1} S_{x_2}}{n - 1}}}, \quad df = n - 1 $$

Example 4.1 (cont.)
• This paired “t” needs a couple more values
that we have not yet computed.
• First, we need to find the Standard Deviation
of X1 and X2, called Sx1 and Sx2.
• These are simply the square roots of the
variances:
$S_{x_1} = \sqrt{1.9} = 1.3784$ and $S_{x_2} = \sqrt{0.4} = 0.6325$

Example 4.1 (cont.)
• Second, we need to find the correlation
between the pre- and post-treatment weights ($r_{x_1 x_2}$),
or likewise columns 1 and 2.
• Another section will illustrate how to compute
a correlation.
• This computation is somewhat long, so we’ll
avoid it for now.
• I’ll just tell you the correlation is:
rx1x2=0.9177.
• Any scientific or statistical calculator can get
you this answer.

Example 4.1 (cont.)





$$ t = \frac{54 - 52}{\sqrt{\dfrac{1.9 + 0.4 - 2(0.9177)(1.3784)(0.6325)}{5 - 1}}} = 4.78, \quad df = 4 $$
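
The calculation can be reproduced from the raw weights. This is a minimal sketch, assuming the divisor-n variances used earlier in the example and computing the correlation directly rather than taking it as given; all variable names are illustrative.

```python
import math

pre = [52, 54.5, 53, 54.5, 56]   # X1: pre-treatment weights (kg)
post = [51, 52, 52, 52, 53]      # X2: post-treatment weights (kg)
n = len(pre)

mean_pre = sum(pre) / n
mean_post = sum(post) / n

# Variances and covariance with divisor n, matching the slide
var_pre = sum((x - mean_pre) ** 2 for x in pre) / n
var_post = sum((y - mean_post) ** 2 for y in post) / n
cov = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post)) / n

# Pearson correlation between the two measures
r = cov / math.sqrt(var_pre * var_post)

# Paired t using the variance/correlation form of the formula
s_pre, s_post = math.sqrt(var_pre), math.sqrt(var_post)
t_stat = (mean_pre - mean_post) / math.sqrt(
    (var_pre + var_post - 2 * r * s_pre * s_post) / (n - 1)
)

print(f"r = {r:.4f}, t({n - 1}) = {t_stat:.2f}")
```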

Example 4.1 (cont.)
• Finally, this computed “t” statistic must be compared
with the critical value of the t-distribution.
• The critical value of the “t” is the highest magnitude
we should expect to find if there is really no
difference between the population means of X1 and
X2, or in other words, no difference between weight
in the pre and post treatment in the weight
management program.
• Since we expect there should be a weight loss, this is
a 1-tailed test.

Example 4.1 (cont.)
• The critical value (C.V.) t(4), α=.05 (one-tailed) = 2.132,
therefore we reject the null hypothesis because the
absolute value of our “t” at 4.78 is greater than the
critical value.
• This is a 1-tailed t-test, so we must verify this
conclusion by noting that the post-treatment mean
of 52 kg is lower than the pre-treatment mean
of 54 kg.

Example 4.1 (cont.)
Our research conclusion states the facts in simple
terms:
Mean weight decreased significantly from the
pre-treatment
(M = 54) to the post-treatment (M = 52),
t(4) = 4.78, p < .05 (one-tailed).

Example 4.1 Using SPSS
• First, we must set up the variables in SPSS.
• Although not strictly necessary, it is good practice to
give a unique code to each participant (“personid”).
• Unlike the independent samples t-test, the paired t-
test has separate entries for 2 dependent variables,
rather than an independent and dependent:
– DependentVariable1 = pretreat
(for Pre-treatment)
– DependentVariable2 = posttreat
(for Post-treatment)

• In our example, the variables are set up as
follows in the SPSS variable view:

• To run a Paired Samples t Test in SPSS, click
Analyze > Compare Means > Paired-Samples
T Test.

• You will be presented with the Paired-Samples T Test dialogue
box, as shown below.
• Transfer the variables pretreat and posttreat into the Paired
Variables: box.

• Paired Sample Statistics Table
– The first table, titled Paired Samples Statistics, is where SPSS Statistics
has generated descriptive statistics for the variables. You could use the
results here to describe the characteristics of the pre- and post
treatment.

• Paired Samples Test Table
– The Paired Samples Test table is where the results of the dependent t-
test are presented.
• You are essentially conducting a one-sample t-test on the differences between
the groups.
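
One way to see this equivalence is to compute the difference score for each child and run a one-sample t-test on those differences against zero. The sketch below assumes the five pre/post weights from Example 4.1 and uses the sample standard deviation (divisor n - 1) of the differences; the names are illustrative.

```python
import math

pre = [52, 54.5, 53, 54.5, 56]   # pre-treatment weights (kg)
post = [51, 52, 52, 52, 53]      # post-treatment weights (kg)

# Difference score for each participant
diffs = [a - b for a, b in zip(pre, post)]
n = len(diffs)

mean_diff = sum(diffs) / n
# Sample standard deviation of the differences (divisor n - 1)
sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))

# One-sample t-test of the mean difference against 0
t_stat = mean_diff / (sd_diff / math.sqrt(n))

print(f"mean difference = {mean_diff}, t({n - 1}) = {t_stat:.2f}")
```

The result agrees with the t of 4.78 obtained from the variance and correlation form of the paired t formula.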

• You should focus your attention first on the mean
values for the pre- and post-treatment.
• As before, the means (Pre-treatment=54 and
Post-treatment=52) give us our conclusion.
• Namely, we conclude that weight decreased from
pre- to post-treatment.
• The significance test tells us that a decrease this
large is unlikely to be due to chance alone, so we
would expect to see it in other samples of 5 persons
from the weight management program as well.

Example 4.1 Conclusion
• Our test is 1-tailed, so we must divide the
2-tailed probability provided by SPSS in half
(p=.009/2 = .0045).
• When expressed to 2 decimal places, this value
rounds to “.00”, so the smallest value we report
in APA style is “p<.01.”
• In short, we can now write our conclusion as follows:
Weight of obese children decreased
significantly from the pre-treatment
(M = 54) to the post-treatment (M = 52),
t(4) = 4.78, p < .01 (one-tailed).
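
The p-values can also be approximated without SPSS. The following is a sketch only: it integrates the Student t density numerically with Simpson's rule, using the standard library's `math.gamma`. The function names `t_pdf` and `upper_tail_p` are hypothetical helpers, not SPSS or library functions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t

p_one = upper_tail_p(4.78, 4)   # one-tailed p for t(4) = 4.78
p_two = 2 * p_one               # two-tailed p, as SPSS reports it

print(f"one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

The two-tailed value rounds to .009, in line with the SPSS output, and the one-tailed value is below .01.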

t-tests
• Difference between a set value and a variable →
one-sample t-test
• Difference between two independent groups →
independent samples t-test
= BETWEEN-SUBJECTS
• Difference between two related measures (e.g.,
repeated over time or two related measures at one
time) → paired samples t-test
= WITHIN-SUBJECTS

ANOVA

Introduction to ANOVA
(Analysis of Variance)
• Extension of a t-test to assess differences in the
central tendency (M) of several groups or variables.
• DV variance is partitioned into between-group and
within-group variance
• Levels of measurement:
• Single DV: metric,
• 1 or more IVs: categorical

Introduction
• ANOVA is an acronym for ANalysis Of VAriance.
• The adjective oneway means that there is a single
variable that defines group membership (called a
factor).
• Comparisons of means using more than one variable
is possible with other kinds of ANOVA analysis.

When to use a One-way ANOVA
• One-way ANOVA is a generalization of the
independent samples t-test.
• Recall that the independent samples t-test is
used to compare the mean values of 2
different groups.
• A One-way ANOVA does the same thing, but it
has the advantage of allowing comparisons
between more than 2 groups.

When to use a Oneway ANOVA
(continued)
• In health, for example, we often want to contrast
several conditions in an experiment; such as a
control, a standard treatment, and a newer
“experimental” treatment.
• Because Oneway ANOVA is simply a
generalization of the independent samples t-test,
we use this procedure (to follow) to recalculate
our previous 2 groups example.
• Later, we will do an example with more than 2
groups.

Example
• Let’s return to our example
of the nasi lemak (NL) vs. roti canai (RC) diet, BUT
now we add a third diet: nasi dagang (ND)
• Our research question is:
“Is there any difference in weight
gain between a 1-week exclusive diet
of NL, RC or ND?”

Example -con’t
           X1: NL   X2: RC   X3: ND   $(X_1 - \bar{X}_1)^2$   $(X_2 - \bar{X}_2)^2$   $(X_3 - \bar{X}_3)^2$
           1        3        3        1                       1                       0
           2        4        2        0                       0                       1
           2        4        3        0                       0                       0
           2        4        3        0                       0                       0
           3        5        4        1                       1                       1
Mean       2        4        3
Sum                                   2                       2                       2
Variance                              0.4                     0.4                     0.4
$$ s_x^2 = \frac{\sum (X - \bar{X})^2}{n} $$
H0: μ1= μ2= μ3
Ha: At least one pair is different.

Comparing the groups
• Averages within groups:
– NL: 2
– RC: 4
– ND: 3
• Total average:
$$ \bar{X}_T = \frac{5(2) + 5(4) + 5(3)}{5 + 5 + 5} = 3 $$
• Variance around the mean matters for comparison.
• We must compare the variance within the groups to
the variance between the group means.
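
The group averages and the total average can be verified with a few lines of Python. This sketch assumes the 15 weight-gain values from the data table of the diet example; the dictionary layout and names are illustrative choices.

```python
# Weight gain per participant for each diet, from the example table
groups = {
    "NL": [1, 2, 2, 2, 3],
    "RC": [3, 4, 4, 4, 5],
    "ND": [3, 2, 3, 3, 4],
}

# Mean of each group
group_means = {name: sum(vals) / len(vals) for name, vals in groups.items()}

# Total (grand) mean: group means weighted by group size
n_total = sum(len(vals) for vals in groups.values())
grand_mean = sum(len(vals) * group_means[name] for name, vals in groups.items()) / n_total

print(group_means, grand_mean)
```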





Variance within and between
groups
• Sum of squares within groups:
– SSW = $\sum_j \sum_i (X_{ij} - \bar{X}_j)^2$
= (1-2)^2 + (2-2)^2 + (2-2)^2 + (2-2)^2 + …… + (5-4)^2 + ……
= 6
• Compare it with sum of squares between groups:
– SSB = $\sum_j n_j (\bar{X}_j - \bar{X}_T)^2$
= (2-3)^2 + (2-3)^2 + (2-3)^2 + …… + (4-3)^2 + ……
= 5(2-3)^2 + 5(4-3)^2 + 5(3-3)^2 = 10
– Comparing these, we also need to take into account the
number of observations and sizes of groups
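
Both sums of squares can be computed directly from the raw data. The sketch below assumes the three diet groups from the example; `ssw` and `ssb` are illustrative names for the within-group and between-group sums of squares.

```python
groups = {
    "NL": [1, 2, 2, 2, 3],
    "RC": [3, 4, 4, 4, 5],
    "ND": [3, 2, 3, 3, 4],
}

all_values = [x for vals in groups.values() for x in vals]
grand_mean = sum(all_values) / len(all_values)

ssw = 0.0  # sum of squares within groups
ssb = 0.0  # sum of squares between groups
for vals in groups.values():
    group_mean = sum(vals) / len(vals)
    # Each observation's squared distance from its own group mean
    ssw += sum((x - group_mean) ** 2 for x in vals)
    # Group mean's squared distance from the grand mean, once per observation
    ssb += len(vals) * (group_mean - grand_mean) ** 2

print(f"SSW = {ssw}, SSB = {ssb}")
```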

Adjusting for group sizes
• Divide by the number of degrees of freedom:
$$ MSB = \frac{SSB}{K - 1}, \qquad MSW = \frac{SSW}{n - K} $$
Both are estimates of population
variance of error under H0
n: number of observations
K: number of groups
• F:
$$ F = \frac{MSB}{MSW} $$
reject H0 if this is large

Example– Test statistic thresholds
• If populations are normal, with the same
variance, then we can show that under the
null hypothesis,
$$ F = \frac{MSB}{MSW} \sim F_{K-1,\, n-K} $$
(the F distribution, with K-1 and n-K degrees of freedom)
• Reject at significance level α if
$$ \frac{MSB}{MSW} > F_{K-1,\, n-K,\, \alpha} $$
Find this value in a table

Example– con’t
$$ MSW = \frac{SSW}{n - K} = \frac{6}{15 - 3} = 0.5 $$
$$ MSB = \frac{SSB}{K - 1} = \frac{10}{3 - 1} = 5.0 $$
$$ F = \frac{MSB}{MSW} = \frac{5.0}{0.5} = 10.0 $$
$F_{3-1,\, 15-3,\, 0.05} = 3.89$
Since 10.0 > 3.89, we reject the null hypothesis in our case.
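
The same arithmetic as a small Python sketch, starting from the sums of squares found earlier. The critical value 3.89 is taken from the slide (an F table lookup), not computed here, and the variable names are illustrative.

```python
# Values from the worked example
ssb = 10.0      # sum of squares between groups
ssw = 6.0       # sum of squares within groups
n = 15          # total number of observations
k = 3           # number of groups
critical_value = 3.89   # F(2, 12) at alpha = 0.05, read from a table

df_between = k - 1
df_within = n - k

msb = ssb / df_between   # mean square between groups
msw = ssw / df_within    # mean square within groups
f_ratio = msb / msw

reject_h0 = f_ratio > critical_value
print(f"F({df_between}, {df_within}) = {f_ratio:.1f}, reject H0: {reject_h0}")
```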

Example – ANOVA table
Next, we need to fill-in the so-called ANOVA table:
Source of variation   Sum of squares   Deg. of freedom   Mean squares   F ratio
Between groups        SSB              K-1               MSB            MSB/MSW
Within groups         SSW              n-K               MSW
Total                 SST              n-1

Example – ANOVA table (cont.)
Filling in the ANOVA table with the values from our example:
Source of Variance (SV)   Sum of Squares (SS)   Degrees of Freedom (df)   Mean Squares (MS)   F-ratio (F)      Critical Value (CV)   Reject Decision (Reject?)
Between                   10                    3-1 = 2                   10/2 = 5.0          5.0/0.5 = 10.0   3.89                  Is F-ratio > CV? YES
Within                    6                     15-3 = 12                 6/12 = 0.5
Total                     16                    15-1 = 14

Example– F-tests
• In our case, when diet has no effect,
differences between diets are entirely due to
chance. Numerator and denominator will be
similar. F-ratio should have value around 1.00
• When the diet does have an effect then the
between-diet differences (numerator) should
be larger than chance (denominator). F-ratio
should be noticeably larger than 1.00

Example Using SPSS
• 1. Quick Data Check
– We first want to get an idea of what our data basically look like. A nice
option for the data at hand is running a histogram of weight for each
of the three groups separately. The screenshot below walks you
through doing so.

Example Using SPSS
The frequency distributions appear approximately
normal in shape

Normality assumption

Example 1 Using SPSS (cont.)
2. Running SPSS One-Way ANOVA

2. Running SPSS One-Way ANOVA (con’t.)
Under button. Tick the checkbox as shown
below:

2. Running SPSS One-Way ANOVA (con’t.)
Click the button. Tick the Descriptive checkbox in
the –Statistics– area, as shown below:

Example Using SPSS (cont.)
3. SPSS One-Way ANOVA Output
– Two sections (boxes) appear in the output: Descriptives
– “N” in the first column refers to the number of cases used for calculating the
descriptive statistics. These numbers being equal to our sample sizes tells us
that there are no missing values on the dependent variable.
– The mean weights are the core of our output. After all, our main research
question is whether these differ for different diets.

• 3. SPSS One-Way ANOVA Output-con’t
– The second section, ANOVA table
• The significance level is 0.003 (p <0.01), and, therefore, there is a
statistically significant difference in the mean weight gain between the
different diets.
• Which of the specific groups differed?
• Find this out in the Multiple Comparisons table which contains the results
of post-hoc tests.

3. SPSS One-Way ANOVA Output-con’t
• The table below, Multiple Comparisons, shows which groups differed from
each other.
• There is a significant difference in weight gain between NL diet and RC diet
(p = 0.002). However, there were no significant differences between NL diet
and ND diet (p = 0.105), or between RC diet and ND diet (p = 0.105).

In the Literature
• First and foremost, report descriptive
statistics.
• Regarding the significance test, report
– the F value;
– df1, the numerator degrees of freedom;
– df2, the denominator degrees of freedom;
– the p value

In the Literature
• There was a statistically significant difference between groups
as determined by one-way ANOVA (F(2,12) = 10.0, p<0.01). A
Tukey post-hoc test revealed that weight gain was statistically
significantly lower in NL diet (M= 2.00, SD= 0.71) compared to
RC diet (M= 4.00, SD = 0.71, p <0.01). However, ND diet (M=
3.00, SD= 0.71) did not significantly differ from NL and RC diet.
Table 1: Type of diet associated with weight gain
Diet          Mean (SD)     F statistic (df)   p-value
Nasi Lemak    2.00 (0.71)   10.0 (2,12)        0.003
Roti Canai    4.00 (0.71)
Nasi Dagang   3.00 (0.71)

Steps in solving One-Way ANOVA
post hoc Test Problems
The following is a guide to the decision process for answering
homework problems about one-way ANOVA post hoc test
problems:
1. Is the dependent variable ordinal or interval level and
does independent variable define groups?
– No: Incorrect application of a statistic
– Yes: Compute the skewness and kurtosis for the
variable to test assumption of normality.
2. Assumption of normality satisfied? (skew, kurtosis
between -1.0 and +1.0)
– Yes: go to step 4
– No: go to step 3
3. Sample size 10+ in each group to apply
Central Limit Theorem?
– No: Incorrect application of a statistic
– Yes: go to step 4
4. Compute the one-way ANOVA with Tukey
HSD post hoc option selected
5. Is the p-value for the F ratio test <= alpha?
– No: False
– Yes: Examine Tukey HSD post hoc test result
6. Is the p-value for the Tukey
HSD post hoc test <= alpha?

recap
• FIVE assumptions
– dependent variable should be measured at the interval or
ratio level (i.e., they are continuous).
– independent variable should consist of two or more
categorical, independent groups.
– should have independence of observations, which means
that there is no relationship between the observations in
each group or between the groups themselves.
– homoscedasticity: the dependent variable has the same
variance within each population;
– normality: the dependent variable is normally
distributed within each population.

Any Questions?
Concepts?
Equations?
