Parametric & Non-Parametric tests SPSS WORKSHOP

Parametric &
Non-Parametric tests

Parametric tests
• For the most accurate results, it is essential to use the highest-level
statistical test that the type of data supports
• Many tests suitable for quantitative data make strong assumptions
about the distribution of the variables in the populations compared
• Tests which make such distributional assumptions about the variable
being analyzed are called 'parametric tests'
• On the other hand, with fairly large sample sizes, many of the
assumptions for the parametric tests may hold approximately
• In general, parametric tests are more powerful in detecting
differences between populations when the underlying assumptions
hold
Dept. of Biostats, SJMC, Bangalore

Independent vs. paired samples
• Paired samples : Each observation in the first group has a
corresponding observation in the second group (corresponding
observations are typically not independent!)
• Independent samples : Observations in each of the two groups
are not related to each other

Comparison of means

Student’s t-test
Student’s t-test (Independent t-test)
• Used to assess the statistical significance of the difference
between two population means
Assumptions
• Sample observations are random and independent
• Outcome variable must be continuous and normally
distributed
• The variance of the outcome variable is the same in
the two groups (Homogeneity of variance )
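
For reference, under the equal-variance assumption listed above, the independent t statistic is usually written with a pooled variance. The symbols below (sample means, standard deviations and sizes of groups 1 and 2) are notation introduced here for illustration and do not appear on the slides.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
df = n_1 + n_2 - 2
```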

One group of observations
(One sample t-test)
Compare the mean of a single group of observations with a
specified value.
Example data:
Comparison of mean dietary intake of a particular group of
individuals with the recommended daily intake.
Data: Average daily energy intake (ADEI) over 10 days of 11
healthy women (Manocha et al., 1986)

Example for independent t-test:
A study was done to investigate the nature of lung
destruction in cigarette smokers before the development
of marked emphysema. Two lung destructive index
measurements were made on the lungs of lifelong
nonsmokers and smokers who died suddenly. These
indices are as given below.
Can we conclude that the smokers have generally greater
lung damage as measured than nonsmokers?

Example for paired t-test:
A study on the effect of a particular drug on pulse rate was
observed on 8 patients before and after the administration of
the drug.
Is the drug administered effective in changing the pulse rate in
those 8 patients?
Patient Before drug After drug
1 58 66
2 65 69
3 68 75
4 70 68
5 66 73
6 75 75
7 62 68
8 72 69
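
The paired t-test for the pulse-rate data above can be sketched in plain Python. This is a minimal illustration rather than the workshop's SPSS procedure: the variable names are chosen here, and the critical value 2.365 (two-sided 5% level, 7 degrees of freedom) is taken from standard t tables, not from the slides.

```python
import math
import statistics

before = [58, 65, 68, 70, 66, 75, 62, 72]
after = [66, 69, 75, 68, 73, 75, 68, 69]

# Paired design: analyse the within-patient differences (after - before).
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
t_stat = mean_diff / (sd_diff / math.sqrt(n))

# Assumption: two-sided 5% critical value of Student's t with n - 1 = 7 df.
T_CRIT_7DF = 2.365

print(f"mean difference = {mean_diff:.3f}, SD = {sd_diff:.3f}")
print(f"t = {t_stat:.3f} on {n - 1} df")
print("reject H0" if abs(t_stat) > T_CRIT_7DF else "do not reject H0")
```

On these eight patients the mean change is 3.375 beats and t is about 2.17, which is below 2.365, so at the two-sided 5% level the data do not show a change in pulse rate.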

Tests of association
By
JOHN MICHAEL RAJ

Chi-square test
• The Chi-square test can be used for two applications
• Independence between two variables
• The null hypothesis for this test is that the variables are
independent (i.e. that there is no statistical association)
• The alternative hypothesis is that there is a statistical
relationship or association between the two variables
• Test for equality of proportions between two or more
groups
• The null hypothesis for this test is that the proportions are
equal across the groups
• The alternative hypothesis is that the proportions are not
equal (test for a difference in either direction)

Chi-square Test
• Step 1: Hypotheses (always two-sided):
H0: Independent
HA: Not independent
• Step 2: Calculate the test statistic:
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} \sim \chi^2$ with $df = (r - 1)(c - 1)$
• Step 3: Calculate the p-value
p-value = $P(\chi^2_{df} > \chi^2_{\text{calc}})$
• Step 4: Draw a conclusion
• p-value < α: reject independence
• p-value > α: do not reject independence

Test statistic
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$
• where the $O_i$ are the observed frequencies and
the $E_i$ are the expected counts
• Basically, the deviation between the expected and
observed counts is computed
• Expected frequencies are calculated from the
row and column marginal totals

Dept. of Biostats, CMC, Vellore

Testing for Independence: Example
• Contingency tables (cross-classified tables)
can be used
• Eg:
                      Hypertension
Type II diabetes      Yes      No
Yes                     5      57
No                     51    2105
• How to view the association?
• Proportions of the groups will help in
comparison

Expected frequency
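Observed counts, with expected counts in parentheses:
                      Hypertension
Type II diabetes      Yes           No                Total
Yes                   5 (1.6)       57 (60.4)            62
No                    51 (54.4)     2105 (2101.6)      2156
Total                 56            2162               2218
Expected count = RT × CT / N, e.g. for the first cell
(62 × 56)/2218 = 1.6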

Decision making
• If χ²calc ≥ χ²tab at (r-1)(c-1) df, then the null is rejected
$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} = 7.95$
χ²tab = 3.84 at 1 df
• Since χ²calc ≥ χ²tab at 1 df, the null is rejected, concluding that
there is an association between type II diabetes and
hypertension.
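
The expected counts and the chi-square statistic for this 2 × 2 table can be reproduced with a short Python sketch. The variable names are chosen here for illustration; the critical value 3.84 is the tabulated value quoted on the slide.

```python
# Observed 2 x 2 table from the slide:
# rows = type II diabetes (yes, no), columns = hypertension (yes, no).
observed = [[5, 57],
            [51, 2105]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CHI2_CRIT_1DF = 3.84  # tabulated 5% critical value at 1 df, as on the slide

print([[round(e, 1) for e in row] for row in expected])
print(f"chi-square = {chi2:.2f} on {df} df")
print("reject H0" if chi2 >= CHI2_CRIT_1DF else "do not reject H0")
```

The sketch gives expected counts of 1.6, 60.4, 54.4 and 2101.6 and a statistic of about 7.95 on 1 df, matching the values on the slides.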

A few notes before applying a
nonparametric test!
➢In practice, of course, no distribution is exactly
Normal. Fortunately, our usual methods for
inference about population means (the one-sample
and two-sample t procedures and analysis of
variance) are quite robust.
➢That is, the results of inference are not very
sensitive to moderate lack of Normality, especially
when the samples are reasonably large.
➢The problem is serious if plots suggest that the data are
clearly not Normal, especially when we have only a
few observations.

Steps before opting for a
nonparametric test!
➢If lack of Normality is due to outliers, it may be
legitimate to remove the outlier
➢Sometimes we can transform our data so that
their distribution is more nearly Normal
➢In some settings, other standard distributions
replace the Normal distributions as models for
the overall pattern in the population
➢Modern bootstrap methods and permutation
tests do not require Normality or any other
specific form of sampling distribution.

Why do I advocate parametric in the
class of nonparametric statistics?
Easy interpretation.
Can tolerate mild to moderate violation of
assumptions when sample is sufficiently large.
Nonparametric methods are roughly 95% as efficient as the
corresponding parametric tests when the parametric
assumptions are satisfied.

Introduction
❑Make no assumptions about the form of the population
distribution, hence the name "distribution-free tests"
❑Answer the same sort of questions as the parametric
tests: for each parametric test (PT) there is an
alternative non-parametric test (NP)

When are non-parametric tests used?
Assumptions of parametric test are violated
❑Non-normal or skewed
❑Unequal variance
❑Data is on an ordinal scale
❑Very few observations

Assigning Ranks
❑Arrange the data in ascending or descending
order
❑Assign the rank 1 to the first item, rank 2 to the second
item, and similarly for the rest of the items

Example
Assigning ranks for the following 5 scores 6, 9, 8, 3, 4
Original score   Arranged score   Rank
6                3                1
9                4                2
8                6                3
3                8                4
4                9                5

How to Handle Ties
Example
Calculate the ranks for the following scores
Original scores   Arranged scores   Ranks
4                 3                 1
9                 4                 2.5
3                 4                 2.5
4                 5                 4
8                 7                 5
10                8                 6
7                 9                 7
5                 10                8
The two tied scores of 4 occupy positions 2 and 3, so each
receives the average rank (2+3)/2 = 2.5
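
Rank assignment with ties can be sketched in a few lines of Python. The helper name `average_ranks` is hypothetical, introduced here only for illustration; it gives tied scores the mean of the positions they occupy in the sorted list, as in the example above.

```python
def average_ranks(scores):
    """Rank scores in ascending order; tied scores share the mean of their positions."""
    positions = {}
    for pos, value in enumerate(sorted(scores), start=1):
        positions.setdefault(value, []).append(pos)
    return [sum(positions[s]) / len(positions[s]) for s in scores]


scores = [4, 9, 3, 4, 8, 10, 7, 5]
ranks = average_ranks(scores)

# Print each original score next to its rank.
for score, rank in zip(scores, ranks):
    print(score, rank)
```

Both scores of 4 receive rank 2.5, and the untied scores receive the same ranks as in the table.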

Mann Whitney U Test

Mann-Whitney U Test
❑Also known as Wilcoxon Rank Sum test
❑Alternative to the parametric independent t-test
❑To test whether two independent groups have been drawn
from the same population
Assumptions:
❑The two samples are selected independently and at random
from their respective populations
❑Variable of interest is continuous
❑Measurement scale is at least ordinal.

Mann-Whitney U Test
Compare the distribution of scores on a quantitative variable
obtained from two independent groups.
BMI of young adults
Men (n = 13): 18.19 23.79 25.76 21.20 15.79 26.45 29.85
26.66 17.58 25.86 21.54 23.75 22.83
Women (n = 12): 18.86 25.86 16.54 18.87 17.87 18.73 15.75
17.77 17.46 18.28 30.47 30.03
Ho: There is no difference between the two
population distributions
Ha: There is a difference between the two population
distributions

Procedure
Men (n = 13)                Women (n = 12)
BMI (kg/m²)   Rank          BMI (kg/m²)   Rank
18.19         8             18.86         11
23.79         17            25.86         19.5
25.76         18            16.54         3
21.20         13            18.87         12
15.79         2             17.87         7
26.45         21            18.73         10
17.58         5             15.75         1
29.85         23            17.77         6
26.66         22            17.46         4
25.86         19.5          18.28         9
21.54         14            30.47         25
23.75         16            30.03         24
22.83         15
Sum = 193.5                 Sum = 131.5
Ranking tied observations: the value 25.86 occurs in both
groups, so both receive the average rank (19+20)/2 = 19.5

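Test statistic: U = Min(U1, U2)
U1 = n1·n2 + n1(n1+1)/2 − R1 = 156 + 91 − 193.5 = 53.5
U2 = n1·n2 + n2(n2+1)/2 − R2 = 156 + 78 − 131.5 = 102.5
(check: U1 + U2 = n1·n2 = 156)
U = 53.5
Rule: The calculated value should be less than the
critical value to reject the null hypothesis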
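
The rank sums and U statistics for the BMI example can be checked with a short Python sketch. It is an illustration under stated assumptions, not the SPSS procedure: the variable names are chosen here, and U is computed as n1·n2 + n(n+1)/2 minus the rank sum for each group, one common convention. A useful check is that U1 + U2 must equal n1·n2 = 13 × 12 = 156.

```python
men = [18.19, 23.79, 25.76, 21.20, 15.79, 26.45, 29.85,
       26.66, 17.58, 25.86, 21.54, 23.75, 22.83]
women = [18.86, 25.86, 16.54, 18.87, 17.87, 18.73, 15.75,
         17.77, 17.46, 18.28, 30.47, 30.03]

# Rank the pooled sample; tied values share the mean of their positions.
positions = {}
for pos, value in enumerate(sorted(men + women), start=1):
    positions.setdefault(value, []).append(pos)
rank_of = {value: sum(p) / len(p) for value, p in positions.items()}

n1, n2 = len(men), len(women)
r1 = sum(rank_of[x] for x in men)    # rank sum, men
r2 = sum(rank_of[x] for x in women)  # rank sum, women

# Assumed convention: U_j = n1*n2 + n_j(n_j + 1)/2 - R_j
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
u = min(u1, u2)

print(f"R1 = {r1}, R2 = {r2}")
print(f"U1 = {u1}, U2 = {u2}, U = {u}")
```

With these data the rank sums are 193.5 and 131.5, giving U1 = 53.5 and U2 = 102.5, so U = 53.5. The critical value still has to be looked up in a Mann-Whitney table for n1 = 13 and n2 = 12.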

Wilcoxon Signed – Rank Test

Wilcoxon Signed-Rank Test
❑An alternative to the parametric paired t-test
❑Used to compare two samples from populations that are not
independent, e.g. a variable measured in each subject before and
after an intervention
Assumptions
❑Samples must be paired
❑Pairs are randomly selected from the larger population
❑Measurements should be continuous

Example
A drug was designed to lower systolic blood pressure.
The systolic blood pressure was measured before and
after administration of the drug. Find whether the
drug is effective in lowering systolic blood pressure?

Procedure
Systolic blood pressure; Difference = After − Before; ranks are
assigned to the absolute differences
Subject   Before   After   Difference   Rank   Sign
1         170      175     5            2      +
2         168      171     3            1      +
3         199      178     -21          6      -
4         183      152     -31          9      -
5         178      159     -19          5      -
6         208      183     -25          7      -
7         194      176     -18          4      -
8         186      159     -27          8      -
9         156      145     -11          3      -
10        210      177     -33          10     -
Sum of positive ranks: W+ = 2+1 = 3
Sum of negative ranks: W- = 52
Test statistic: W = Min(W+, W-) = 3

Decision making
How to decide the significance?
We reject H0 because the calculated value (W = 3) is less than
the tabulated critical value (8).
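
The signed-rank calculation for the blood-pressure example can be reproduced with a minimal Python sketch. The variable names are chosen here for illustration, and the critical value 8 is simply the tabulated value quoted on the slide. Ranks are assigned to the absolute differences, so the smallest difference (3) receives rank 1.

```python
before = [170, 168, 199, 183, 178, 208, 194, 186, 156, 210]
after = [175, 171, 178, 152, 159, 183, 176, 159, 145, 177]

# Difference = after - before; zero differences would be dropped.
diffs = [a - b for a, b in zip(after, before)]
nonzero = [d for d in diffs if d != 0]

# Rank the absolute differences; ties share the mean of their positions.
positions = {}
for pos, value in enumerate(sorted(abs(d) for d in nonzero), start=1):
    positions.setdefault(value, []).append(pos)
rank_of = {value: sum(p) / len(p) for value, p in positions.items()}

w_plus = sum(rank_of[abs(d)] for d in nonzero if d > 0)
w_minus = sum(rank_of[abs(d)] for d in nonzero if d < 0)
w = min(w_plus, w_minus)

W_CRIT = 8  # tabulated critical value quoted on the slide for these 10 pairs

print(f"W+ = {w_plus}, W- = {w_minus}, W = {w}")
print("reject H0" if w < W_CRIT else "do not reject H0")
```

The sketch gives W+ = 3 and W- = 52, so W = 3, which is below the tabulated value of 8.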
