2. • Compare the magnitude of a problem between groups
• Identify risk factors
• Evaluate the efficacy of interventions (test hypotheses)
– Experiments
• Program evaluation
3. • Levels of comparison
– Two independent groups (parameters)
– Two related parameters
– Several independent groups (parameters)
4. Comparison of means (using the t-test)
• There are different types of t-test:
a) One-sample t-test:
– used to test a hypothesis about a single population mean
b) Independent samples t-test (independent means t-test):
• used to compare the mean scores of two different populations, for example the experimental and control groups in an experimental study
• different participants are assigned to each condition (this is sometimes called the independent-measures t-test)
5. c) Paired samples t-test (dependent means t-test):
– used to compare the mean scores of the same group of people on two different occasions
– the test is used when there are two experimental conditions
– the same participants take part in both conditions of the experiment
– the test is sometimes referred to as the matched-pairs t-test
6. • Assumptions of the t-test:
– The data are normally distributed.
– In the dependent t-test, this means that the sampling distribution of the differences between scores should be normal, not the scores themselves.
– Data are measured at least on an interval scale.
– The independent samples t-test also assumes:
• Variances in the two populations are roughly equal (homogeneity of variance).
• Scores are independent (because they come from different people).
7. Performing t-tests using SPSS (Examples)
a) Testing the hypothesized mean for a single population (one-sample t-test):
Example
a) Researchers collected birth weight data from a randomly selected sample of 50 newborns. The mean and standard deviation of birth weight were 3.1 kg and 0.57 kg respectively. Test the hypothesis that the population mean birth weight is not different from 3.5 kg at α = 0.05.
– Perform the test manually (do it yourself)
b) Using SPSS and the data file “data for exercise 50”, test the same hypothesis at the same significance level.
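The manual part of this exercise can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_sample_t` is hypothetical, and the critical value 2.010 is a two-sided t-table value for α = 0.05 with 49 degrees of freedom, entered by hand rather than computed.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    standard_error = sample_sd / math.sqrt(n)
    t_stat = (sample_mean - mu0) / standard_error
    return t_stat, n - 1

# Summary statistics from the birth weight example (n = 50, H0: mean = 3.5 kg).
t_stat, df = one_sample_t(sample_mean=3.1, sample_sd=0.57, n=50, mu0=3.5)

# Assumed two-sided critical value for alpha = 0.05 and df = 49 (from a t table).
T_CRITICAL = 2.010

reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t({df}) = {t_stat:.2f}, reject H0: {reject_h0}")
```

With these inputs the statistic falls well beyond the critical value, so the sketch rejects the hypothesis that the population mean is 3.5 kg.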
8. Analyze → Compare Means → One-Sample T Test → move the quantitative variable into the ‘test variables’ field → type the hypothesized mean in the ‘test value’ field → OK
The output from this analysis is given below:
9. b) Two independent groups (two independent samples t-test):
Example: Using the data file “CREG”, test the hypothesis that the mean birth weights of male and female newborns are not different at α = 0.05.
Analyze → Compare Means → Independent-Samples T Test → drag the quantitative variable into the ‘test variables’ field → move the qualitative variable into the ‘grouping variable’ field → OK
The sample dialog box and outputs are provided in the tables below
10. Reporting the result of the independent t-test
• The mean birth weights of male and female newborns were compared using a two independent samples t-test. The mean (±SD) birth weights of males [3.070 (±0.560) kg] and females [3.142 (±0.581) kg] were not significantly different at α = 0.05, t(48) = -0.442, p = 0.66.
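As a rough check on a report like this, the equal-variance t statistic can be recomputed from the summary statistics. The sketch below is illustrative only: `pooled_t` is a hypothetical helper, and the group sizes (25 males, 25 females) are an assumption, since the report gives only the total degrees of freedom (48). With different group sizes the statistic would differ slightly, so it need not reproduce the reported value exactly.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) two-sample t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Means and SDs are from the report; the group sizes (25 and 25) are assumed.
t_ind, df_ind = pooled_t(3.070, 0.560, 25, 3.142, 0.581, 25)
print(f"t({df_ind}) = {t_ind:.3f}")
```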
11. c) Paired samples t-test:
• A type of t-test applied under the following conditions:
– Self-pairing (before and after)
– Matching of similar groups (by age or sex, for instance)
– Measurements are taken at two distinct points in time
– The differences between paired observations are used rather than the individual observations.
• Advantages:
– Controls for extraneous factors
– Biological variability is eliminated
– Makes the comparison more precise
12. Example: Using the data file “CREG”, test the hypothesis that the mean fasting blood sugar (FBS) before and after exercising in a gym for one month among office workers is not different at α = 0.05. Calculate the effect size.
Solution:
• Check assumptions (normality of the differences)
• To do so, first compute the differences between the baseline FBS and the value after one month of gym exercise
• Then, visualize the data and test using the Shapiro-Wilk test
13. Analyze → Compare Means → Paired-Samples T Test → move the two quantitative variables that you want to compare for each subject into the ‘paired variables’ box → then click OK.
The output generated from this procedure is shown in the following two tables
14. Reporting the result of the paired t-test
The mean (±SD) difference in FBS before and after exercise was 11.14 (±10.86). This difference was significant, t(49) = 7.25, p < 0.001. The effect size for the effect of exercise on FBS was 0.62, which is very high.
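The reported t statistic can be reproduced from the mean and SD of the differences. The sketch below also computes Cohen's d for paired data, which is only one of several common effect size measures; the slides do not say which measure produced the reported effect size, so the two numbers are not expected to agree. The sample size n = 50 is inferred from the reported 49 degrees of freedom, and both function names are hypothetical.

```python
import math

def paired_t(mean_diff, sd_diff, n):
    """Paired t statistic computed from the mean and SD of the differences."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1

def cohens_d_paired(mean_diff, sd_diff):
    """Cohen's d for paired data: mean difference in units of its SD."""
    return mean_diff / sd_diff

# n = 50 is inferred from df = 49 in the report.
t_paired, df_paired = paired_t(mean_diff=11.14, sd_diff=10.86, n=50)
d = cohens_d_paired(11.14, 10.86)
print(f"t({df_paired}) = {t_paired:.2f}, Cohen's d = {d:.2f}")
```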
15. Analysis of Variance (ANOVA)
Introduction
• A method of testing whether the means of more than two groups differ without increasing the Type I error rate, or
• A statistical technique for comparing means for multiple (usually more than two) independent populations without increasing the Type I error rate
16. Purpose of ANOVA
The purpose of ANOVA is to test whether an independent variable has an effect on a dependent variable without increasing the Type I error rate, where:
• the dependent variable is interval/ratio level (continuous), and
• the independent variable is nominal, ordinal, or interval level
17. One-Way ANOVA
– Takes into account only one source of variation (factor) while comparing means or variances.
– Is used to test statistical hypotheses that propose differences between group means.
– Tests whether an independent variable has an effect on a dependent variable without increasing the Type I error rate, where:
– the dependent variable is interval/ratio level (continuous), and
– the independent variable is nominal, ordinal, or interval level
18. Why One-Way ANOVA?
• When more than two groups are to be compared, using multiple t-tests is unreliable:
– Tedious: k groups require kC2 = k(k-1)/2 pairwise comparisons.
– Greater probability of making a Type I error (rejecting H0 when it is true).
– When a single t-test is performed at a confidence level of 95%, we are willing to accept a Type I error rate of 5% = 1 in 20.
– If we do 20 t-tests on random data, we expect about 1 to be significant by chance.
Example:
– Comparing three groups pairwise with t-tests (3 comparisons at α = 0.05) gives an overall Type I error rate of about 1 - 0.95^3 ≈ 14.3%, treating the three tests as independent.
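The 14.3% figure follows from the number of pairwise comparisons. The following sketch shows the arithmetic; it treats the tests as independent, which is a simplifying assumption, and the function name `familywise_error_rate` is hypothetical.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Chance of at least one Type I error across all pairwise t-tests,
    treating the tests as independent (a simplifying assumption)."""
    n_comparisons = comb(n_groups, 2)
    return n_comparisons, 1 - (1 - alpha) ** n_comparisons

for k in (3, 4, 5):
    m, fwer = familywise_error_rate(k)
    print(f"{k} groups: {m} comparisons, familywise error rate = {fwer:.3f}")
```

The error rate grows quickly with the number of groups, which is the motivation for a single overall F test.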
20. i) The sum of squares due to differences between the group means (SSB).
ii) The sum of squares due to differences between the observations within each group (SSW). This is also called the residual sum of squares.
SST = SSB + SSW
SST = total sum of squared deviations of each observation about the grand mean
SSB = total sum of squared deviations of the group means about the grand mean (each weighted by its group size)
SSW = total sum of squared deviations of each observation about its group mean
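Written out as formulas, the three sums of squares take the following form. The notation is an assumption introduced here for illustration: k groups, n_j observations in group j, x_ij the i-th observation in group j, x̄_j the mean of group j, and x̄ the grand mean.

```latex
\begin{aligned}
SST &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}\right)^2 \\
SSB &= \sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{x}\right)^2 \\
SSW &= \sum_{j=1}^{k}\sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_j\right)^2 \\
SST &= SSB + SSW
\end{aligned}
```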
21. • Properties of the F test:
– If F is close to 1, the two variances (between groups and within groups) are likely to be equal.
– The larger the value of F, the greater the chance that the two variances are not the same.
– Does not identify the group or groups that differ.
– Is robust (provides dependable results even when there are violations of the assumptions).
– Violations are more critical when sample sizes are small or the group sizes (n) are not equal.
– If there are real differences, the between-groups variation will be larger.
– If violations can’t be controlled, or you think that there may be an increased chance of a Type I error, use α = 0.01.
22. Relationship Between t and F
• F = t² (when two groups are compared)
• F and t are based on the same mathematical model; the t-test is just a special case of ANOVA.
• It is OK to use the F test when comparing 2 means.
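The relationship between t and F can be checked numerically. The sketch below uses made-up data for two groups (the values in `group_1` and `group_2` are purely illustrative) and hypothetical helper names; it computes the pooled-variance t statistic and the one-way ANOVA F statistic and compares t² with F.

```python
from statistics import mean

def sum_sq(values, center):
    """Sum of squared deviations of values about center."""
    return sum((v - center) ** 2 for v in values)

def pooled_t_stat(a, b):
    """Equal-variance two-sample t statistic."""
    ssw = sum_sq(a, mean(a)) + sum_sq(b, mean(b))
    pooled_var = ssw / (len(a) + len(b) - 2)
    se = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / se

def one_way_f(groups):
    """One-way ANOVA F statistic: MS between / MS within."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum(sum_sq(g, mean(g)) for g in groups)
    ms_between = ssb / (len(groups) - 1)
    ms_within = ssw / (len(all_values) - len(groups))
    return ms_between / ms_within

# Illustrative data only, not taken from the slides.
group_1 = [5.1, 4.9, 6.2, 5.8, 5.5]
group_2 = [6.8, 7.1, 6.4, 7.5, 6.9, 7.0]

t_val = pooled_t_stat(group_1, group_2)
f_val = one_way_f([group_1, group_2])
print(f"t^2 = {t_val**2:.4f}, F = {f_val:.4f}")
```

The two printed values agree up to floating-point rounding, which holds for the equal-variance t-test with exactly two groups.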
23. Assumptions in ANOVA
• All populations from which the samples were drawn are normally distributed. One of the following is used to check normality:
– Kolmogorov–Smirnov test
– Q-Q plot
– Shapiro-Wilk test
• Each of the populations has the same variance (all variances are equal). Use one of the following to check homogeneity of the variances:
– Rule of thumb: the ratio of the largest to the smallest sample SD must be less than 2:1
– Levene's test
– Bartlett's test
24. • The observations are independent and drawn randomly from the population.
• Results of ANOVA are approximate rather than exact.
• Data are parametric.
• Variables required for one-way ANOVA:
– Two variables:
• One categorical independent variable with three or more distinct categories. This could be a continuous variable categorized into three or more groups.
• One continuous dependent variable
25. Running One-Way ANOVA using SPSS
i) In the menu, click Analyze → Compare Means → One-Way ANOVA → drag the quantitative variable into ‘dependent list’ and the factor into ‘Factor’ → Options → select descriptive, fixed and random effects, and homogeneity of variances → Continue → OK.
ii) Examine the output and, if the ANOVA test is significant, do a post hoc test to identify which groups differ.
26. Example
Assume that 25 patients with blisters were randomly assigned to two treatments and a placebo group. Groups: Treatment A, Treatment B, Placebo.
The number of days until the blisters healed after starting treatment was measured and compared across groups.
Data
• A: 5,6,6,7,7,8,9,10
• B: 7,7,8,9,9,10,10,11
• P: 7,9,9,10,10,10,11,12,13
Group means: A = 7.25, B = 8.875, P = 10.11
1) Compare the three groups using appropriate graphs
2) Describe the data using numerical summaries
3) Is there a significant difference among these means at α = 0.05? Test the hypothesis using:
a) manual methods
b) SPSS software
27. Solutions:
1) First show graphical displays to visualize the comparison
• Side-by-side box plots (Graphs > Legacy Dialogs > Boxplot)
Box-and-whisker plot comparing the number of days required for blisters to heal in the three groups
28. 2)Numerical Summary of the
result
3) Test the Hypothesis
a)Manual Methods
1)Describe the data:
– Done in the previous table
2) Check assumptions
• Each group is approximately normal
– check this by looking at histograms and/or normal quantile plots, or use a formal normality test
– ANOVA can handle some non-normality, but not severe outliers
29. • Standard deviations of the groups are approximately equal
– Rule of thumb: the ratio of the largest to the smallest sample standard deviation must be less than 2:1, or equivalently
– Twice the smallest SD is greater than the largest SD.
3) State the hypotheses:
• H0: The mean numbers of days for the three treatments are equal.
• Ha: Not all of the treatment mean numbers of days are equal.
4) Calculate the test statistic
30. DF (numerator) = K-1 = 3-1 = 2
DF (denominator) = N-K = 25-3 = 22
MSb = SSB/(K-1) = 34.72/2 = 17.36
Fc = MSb/MSw = 17.36/2.69 = 6.45
5) Decision and conclusion:
– Since Fc = 6.45 > FT = 3.44, we reject H0
– At least one of the means is significantly different.
See the summary in the following table
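The manual calculation for the blister data can be reproduced with a short Python sketch. It uses only the data given in the example; the variable names are illustrative, and the critical value 3.44 is the F-table value quoted in the slide rather than something the code computes. Because the code works from unrounded group means, its sum of squares between groups differs slightly from the rounded slide value, while F still rounds to 6.45.

```python
from statistics import mean

# Days until healing, from the blister example.
groups = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

all_days = [x for g in groups.values() for x in g]
grand_mean = mean(all_days)
k, n_total = len(groups), len(all_days)

# Between-groups and within-groups sums of squares.
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ssw = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

df_between, df_within = k - 1, n_total - k
ms_between = ssb / df_between
ms_within = ssw / df_within
f_stat = ms_between / ms_within

F_CRITICAL = 3.44  # F-table value for alpha = 0.05 with (2, 22) df, as quoted in the slide
print(f"SSB = {ssb:.2f}, SSW = {ssw:.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, reject H0: {f_stat > F_CRITICAL}")
```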
31. 3 (b) Use SPSS to test the same hypothesis that was tested manually
Standard deviation check
Compare the largest and smallest standard deviations:
• largest: 1.764
• smallest: 1.458
• 1.458 x 2 = 2.916 > 1.764
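The same rule-of-thumb check can be done directly from the raw blister data. This is a minimal sketch; the dictionary name `blister_days` and the other variable names are illustrative.

```python
from statistics import stdev

# Days until healing, from the blister example.
blister_days = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

# Sample standard deviation (n - 1 denominator) for each group.
sds = {name: stdev(days) for name, days in blister_days.items()}
largest, smallest = max(sds.values()), min(sds.values())

# Rule of thumb: the largest SD should be less than twice the smallest SD.
ratio_ok = largest < 2 * smallest
print({name: round(sd, 3) for name, sd in sds.items()}, "rule satisfied:", ratio_ok)
```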
One way ANOVA Output
32. • Interpretation of the output
– The mean number of days until the blisters healed differed significantly among the three groups, F(2,22) = 6.45, p = 0.006.
33. 3.2.4. Multiple Comparison
• ANOVA does not provide any information on which population or populations differ from the others.
• Multiple-comparison procedures are used to provide information on this point.
• All are essentially based on the t-test but include appropriate corrections for the fact that we are comparing more than one pair of means.
• There is a long list of multiple-comparison procedures, but the most frequently used are:
– Bonferroni’s t-method
– Tukey’s HSD (Honestly Significant Difference)
– Scheffé’s procedure
36. Application of ANOVA (Summary)
• Applicable to quantitative variables
• The number of groups to be compared is more than two.
• More appropriate when a randomized experimental design is employed.
• But it can also be used for observational designs and surveys.
• When ANOVA is significant and H0 is rejected, it is important to do a post hoc test (pairwise comparison). The Bonferroni method, for example, divides α by the number of comparisons, k (α/k).
• When the assumptions required for ANOVA are not met, a nonparametric test equivalent to ANOVA (the Kruskal-Wallis test) is performed.
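The α/k adjustment mentioned in the summary can be illustrated for the three blister groups. This sketch only lists the pairwise comparisons and the adjusted significance level; it does not run the post hoc tests themselves, and the function name `bonferroni_alpha` is hypothetical.

```python
from itertools import combinations

def bonferroni_alpha(group_names, alpha=0.05):
    """List all pairwise comparisons and the Bonferroni-adjusted alpha."""
    pairs = list(combinations(group_names, 2))
    return pairs, alpha / len(pairs)

pairs, adjusted_alpha = bonferroni_alpha(["A", "B", "P"])
for first, second in pairs:
    print(f"{first} vs {second}: compare the p-value with {adjusted_alpha:.4f}")
```

With three groups there are three pairwise comparisons, so each one is judged at α/3 instead of α.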