The document discusses parametric and non-parametric tests used in statistical analysis, highlighting the importance of data distribution assumptions for parametric tests. It explains specific tests such as the Student's t-test for independent and paired samples, chi-square test for independence, and non-parametric alternatives like the Mann-Whitney U test and Wilcoxon signed-rank test. The text emphasizes when to use non-parametric tests, particularly in cases of non-normality or unequal variances.
Parametric tests
• For the most accurate results, it is essential to use the highest level of
statistical test available for that type of data
• Many tests suitable for quantitative data make large assumptions
about the distribution of the variables in the populations compared
• Tests which make such distributional assumptions about the variable
being analyzed are called ‘parametric tests’
• On the other hand, with fairly large sample sizes, many of the
assumptions for the parametric tests may hold approximately
• In general, parametric tests are more powerful in detecting
differences between populations when the underlying assumptions
hold
Dept. Of Biostats,SJMC, Bangalore
Independent vs. paired samples
• Paired samples : Each observation in the first group has a
corresponding observation in the second group (corresponding
observations are typically not independent!)
• Independent samples : Observations in each of the two groups
are not related to each other
Student’s t-test
Student’s t-test (Independent t-test)
• Used to assess the statistical significance of the difference
between two population means
Assumptions
• Sample observations are random and independent
• Outcome variable must be continuous and normally
distributed
• The variance of the outcome variable is the same in
the two groups (homogeneity of variance)
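Under these assumptions the independent t-test statistic is usually written with a pooled variance estimate. The formula below is the standard textbook form. The symbols (sample sizes n_1, n_2, sample means \bar{x}_1, \bar{x}_2, sample variances s_1^2, s_2^2) are notation introduced here for illustration and do not appear in the slides.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
\mathrm{df} = n_1 + n_2 - 2
```

The pooled variance s_p^2 is where the homogeneity-of-variance assumption enters: both groups are taken to estimate one common variance.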
One group of observations
(One sample t-test)
Compare the mean of a single group of observations with a
specified value.
Example data:
Comparison of mean dietary intake of a particular group of
individuals with the recommended daily intake.
Data: Average daily energy intake (ADEI) over 10 days of 11
healthy women (Manocha et al., 1986)
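For reference, the one-sample t statistic in its standard form is sketched below. The symbols (sample mean \bar{x}, sample standard deviation s, sample size n, specified value \mu_0) are notation assumed here, not taken from the slides.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\qquad
\mathrm{df} = n - 1
```

For the 11 women in the example this gives 10 degrees of freedom, with \mu_0 set to the recommended daily intake.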
Example for independent t-test:
A study was done to investigate the nature of lung
destruction in cigarette smokers before the development
of marked emphysema. Two lung destructive index
measurements were made on the lungs of lifelong
nonsmokers and smokers who died suddenly. These
indices are as given below.
Can we conclude that the smokers have generally greater
lung damage as measured than nonsmokers?
Example for paired t-test:
The effect of a particular drug on pulse rate was studied in
8 patients, with pulse rate recorded before and after the
administration of the drug.
Is the drug effective in changing the pulse rate in
those 8 patients?
Patient Before drug After drug
1 58 66
2 65 69
3 68 75
4 70 68
5 66 73
6 75 75
7 62 68
8 72 69
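A minimal sketch of how the paired t-test could be worked for the table above, using only the Python standard library. The variable names are illustrative, and the critical value 2.365 (two-sided 5% level, 7 degrees of freedom) is taken from standard t tables rather than from the slides.

```python
import math
import statistics

# Pulse rates from the table above (8 patients).
before = [58, 65, 68, 70, 66, 75, 62, 72]
after = [66, 69, 75, 68, 73, 75, 68, 69]

# The paired t-test works on the within-patient differences.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
t_stat = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

# Assumption: two-sided 5% critical value of t with 7 df, from standard tables.
T_CRIT_7DF = 2.365
reject_null = abs(t_stat) >= T_CRIT_7DF

print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.3f}, df = {df}")
print("reject H0 at the 5% level:", reject_null)
```

With these data the mean increase is 3.375 beats and t is about 2.17 on 7 df, which falls short of 2.365, so under this sketch the change is not significant at the two-sided 5% level.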
Chi-square test
• The Chi-square test can be used for two applications
• Independence between two variables
• The null hypothesis for this test is that the variables are
independent (i.e. that there is no statistical association)
• The alternative hypothesis is that there is a statistical
relationship or association between the two variables
• Test for equality of proportions between two or more
groups
• The null hypothesis for this test is that the proportions are
equal
• The alternative hypothesis is that the proportions are not
equal (test for a difference in either direction)
Test statistic
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
• Where the O_i's are the observed frequencies and
the E_i's are the expected counts
• Basically, the deviation between expected and
observed is computed
• Expected frequencies are calculated based on
the row and column marginal totals
Dept. of Biostats, CMC, Vellore
Testing for Independence: Example
• Contingency tables (cross-classified tables)
can be used
• E.g.:
                      Hypertension
Type II diabetes      Yes      No
Yes                   5        57
No                    51       2105
• How to view the association?
• Proportions of the groups will help in
comparison
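As a quick illustration of comparing proportions, the sketch below computes the proportion with hypertension in each diabetes group. It assumes the table is laid out with Type II diabetes in the rows and hypertension in the columns. The variable names are illustrative.

```python
# 2x2 table from above. Assumed layout: rows = Type II diabetes (Yes, No),
# columns = hypertension (Yes, No).
table = [[5, 57],
         [51, 2105]]

# Proportion with hypertension within each diabetes group (row-wise).
prop_diabetic = table[0][0] / sum(table[0])
prop_nondiabetic = table[1][0] / sum(table[1])

print(f"hypertension among diabetics:     {prop_diabetic:.4f}")
print(f"hypertension among non-diabetics: {prop_nondiabetic:.4f}")
```

Under that layout the proportions are 5/62 and 51/2156, roughly 8.1% versus 2.4%, which is the descriptive difference the chi-square test then assesses.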
Decision making
• If χ²calc ≥ χ²tab at (r−1)(c−1) df, then the null hypothesis is rejected
$$\chi^2_{calc} = \sum_i \frac{(O_i - E_i)^2}{E_i} = 7.95$$
χ²tab = 3.84 at 1 df
• Since χ²calc ≥ χ²tab at 1 df, the null hypothesis is rejected, concluding that
there is an association between type II diabetes and
hypertension.
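The calculation behind the value 7.95 can be sketched as follows: expected counts come from the row and column totals, and the statistic sums the squared deviations over the four cells. This is a plain-Python illustration with no continuity correction, which is an assumption made here because the uncorrected statistic reproduces the stated figure. The names are illustrative.

```python
# Observed counts from the diabetes / hypertension table.
observed = [[5, 57],
            [51, 2105]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (O - E)^2 / E.
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

CHI2_TAB_1DF = 3.84  # tabulated value at 1 df, as quoted in the text
reject_null = chi2 >= CHI2_TAB_1DF

print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {reject_null}")
```

The computed statistic rounds to 7.95 on 1 df, matching the figure above and exceeding 3.84.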
A few notes before applying a
nonparametric test!
➢In practice, of course, no distribution is exactly
Normal. Fortunately, our usual methods for
inference about population means (the one-sample
and two-sample t procedures and analysis of
variance) are quite robust.
➢That is, the results of inference are not very
sensitive to moderate lack of Normality, especially
when the samples are reasonably large.
➢The problem is serious if plots suggest that the data are
clearly not Normal, especially when we have only a
few observations.
Steps before opting for a
nonparametric test!
➢If lack of Normality is due to outliers, it may be
legitimate to remove the outlier
➢Sometimes we can transform our data so that
their distribution is more nearly Normal
➢In some settings, other standard distributions
replace the Normal distributions as models for
the overall pattern in the population
➢Modern bootstrap methods and permutation
tests do not require Normality or any other
specific form of sampling distribution.
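To make the permutation idea concrete, here is a small sketch of an exact sign-flip permutation test, applied to the pulse-rate data from the paired example earlier. Choosing the sum of differences as the test statistic and enumerating every sign pattern are choices made here for illustration, not a method given in the slides.

```python
from itertools import product

# Pulse-rate data from the paired example (8 patients).
before = [58, 65, 68, 70, 66, 75, 62, 72]
after = [66, 69, 75, 68, 73, 75, 68, 69]
diffs = [a - b for a, b in zip(after, before)]
observed = abs(sum(diffs))

# Under H0 (no drug effect) each difference is equally likely to be
# positive or negative, so enumerate all 2^8 sign patterns.
extreme = 0
total = 0
for signs in product((1, -1), repeat=len(diffs)):
    total += 1
    permuted_sum = sum(s * d for s, d in zip(signs, diffs))
    if abs(permuted_sum) >= observed:
        extreme += 1

p_value = extreme / total  # two-sided exact p-value
print(f"{extreme} of {total} sign patterns are as extreme; p = {p_value:.4f}")
```

No Normality assumption is used anywhere in this calculation: the reference distribution is built from the data themselves.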
Why do I advocate parametric tests in a
class on nonparametric statistics?
Easy interpretation.
Can tolerate mild to moderate violation of
assumptions when the sample is sufficiently large.
Nonparametric methods are only about 95% as efficient as
parametric methods when the parametric assumptions are
satisfied.
Introduction
❑Make no assumptions about the distribution of the data.
So-called “distribution-free tests”
❑Answer the same sort of questions as the parametric
tests: for each parametric test (PT) there is an
alternative non-parametric test (NP)
When are non-parametric tests used?
Assumptions of parametric test are violated
❑Non-normal or skewed
❑Unequal variance
❑Data is on an ordinal scale
❑Very few observations
Assigning Ranks
❑Arrange the data in ascending order or descending
order
❑Assign the rank 1 to the first item, rank 2 to the second
item, and similarly for the rest of the items
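The ranking step can be sketched as below. The function name `assign_ranks` is hypothetical, and giving tied values the average of the ranks they span is the usual convention, assumed here because the slides do not say how ties are handled. The demonstration data are the "before" pulse rates from the paired example earlier.

```python
def assign_ranks(values):
    """Return ascending ranks (1 = smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the end of the run of values tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end (0-based) hold ranks pos+1..end+1; average them.
        avg_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


before = [58, 65, 68, 70, 66, 75, 62, 72]
print(assign_ranks(before))
print(assign_ranks([10, 20, 20, 30]))  # small tie illustration
```

Ranks are returned in the original order of the observations, so each value can be matched back to its subject or group.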
Mann-Whitney U Test
❑Also known as Wilcoxon Rank Sum test
❑Alternative to the parametric independent t-test
❑To test whether two independent groups have been drawn
from the same population
Assumptions:
❑The two samples are selected independently and at random
from their respective populations
❑Variable of interest is continuous
❑Measurement scale is at least ordinal.
Mann-Whitney U Test
Compare the distribution of scores on a quantitative variable
obtained from two independent groups.
BMI of young adults
Men:   18.19 23.79 25.76 21.20 15.79 26.45 29.85
       26.66 17.58 25.86 21.54 23.75 22.83
Women: 18.86 25.86 16.54 18.87 17.87 18.73 15.75
       17.77 17.46 18.28 30.47 30.03
H0: There is no difference between the two
population distributions
Ha: There is a difference between the two population
distributions
Test statistic: U = min(U1, U2)
U1 = 53.5, U2 = 115.5
U = 53.5
Rule: The calculated value should be less than the
critical value to reject the null hypothesis
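A minimal sketch of the U computation by direct pair counting, using the BMI values exactly as listed (13 men, 12 women). The helper name `u_statistic` is hypothetical. With the values as listed the two statistics must sum to 13 × 12 = 156, so the larger one computes to 102.5 in this sketch; a stated U2 of 115.5 would correspond to two groups of 13, so one value may be missing from the listing, though that is only a guess. The smaller statistic, which is the test statistic, agrees with the stated 53.5.

```python
men = [18.19, 23.79, 25.76, 21.20, 15.79, 26.45, 29.85,
       26.66, 17.58, 25.86, 21.54, 23.75, 22.83]
women = [18.86, 25.86, 16.54, 18.87, 17.87, 18.73, 15.75,
         17.77, 17.46, 18.28, 30.47, 30.03]


def u_statistic(x, y):
    """Count pairs (x_i, y_j) with x_i > y_j; a tie counts as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


u_women = u_statistic(women, men)  # pairs in which the woman has the higher BMI
u_men = u_statistic(men, women)    # pairs in which the man has the higher BMI
u = min(u_women, u_men)            # test statistic

print(f"U(women) = {u_women}, U(men) = {u_men}, U = {u}")
```

Pair counting is equivalent to the rank-sum formula and makes the meaning of U explicit: it is the number of cross-group pairs won by the group with fewer wins.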
Wilcoxon Signed-Rank Test
❑An alternative to the parametric paired t-test
❑Used to compare two samples from populations that are not
independent, e.g., a variable measured in each subject before and
after an intervention
Assumptions
❑Samples must be paired
❑Pairs are randomly selected from the larger population
❑Measurements should be continuous
Example
A drug was designed to lower systolic blood pressure.
The systolic blood pressure was measured before and
after administration of the drug. Is the
drug effective in lowering systolic blood pressure?
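The blood pressure readings for this example are not reproduced in the text, so the sketch below uses the pulse-rate data from the paired t-test example as a stand-in to show how the signed-rank statistic is formed. Dropping zero differences and giving tied absolute differences their average rank are the usual conventions, assumed here. The helper name `midrank` is hypothetical.

```python
# Stand-in data: pulse rates from the paired t-test example (8 patients).
before = [58, 65, 68, 70, 66, 75, 62, 72]
after = [66, 69, 75, 68, 73, 75, 68, 69]

# Step 1: paired differences, dropping zeros (assumed convention).
diffs = [a - b for a, b in zip(after, before) if a != b]

# Step 2: rank the absolute differences, averaging ranks for ties.
abs_sorted = sorted(abs(d) for d in diffs)


def midrank(value):
    """Average rank of `value` within abs_sorted (ranks start at 1)."""
    first = abs_sorted.index(value) + 1
    count = abs_sorted.count(value)
    return first + (count - 1) / 2


# Step 3: sum the ranks separately for positive and negative differences.
w_plus = sum(midrank(abs(d)) for d in diffs if d > 0)
w_minus = sum(midrank(abs(d)) for d in diffs if d < 0)
w = min(w_plus, w_minus)  # test statistic

print(f"n = {len(diffs)}, W+ = {w_plus}, W- = {w_minus}, W = {w}")
```

As a check, W+ and W- always sum to n(n + 1)/2, where n is the number of non-zero differences (here 7, giving 28).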