This document discusses methods for testing whether a data set is normally distributed. It describes both graphical and statistical tests for normality, including Q-Q plots and the Kolmogorov-Smirnov, Shapiro-Wilk, and Lilliefors tests, as well as the W/S and D'Agostino tests. It then provides detailed worked examples of performing the Kolmogorov-Smirnov, W/S, and D'Agostino tests for normality by hand.
Tests of Normality
              Kolmogorov-Smirnov(a)        Shapiro-Wilk
              Statistic   df     Sig.      Statistic   df     Sig.
Age              .110     1048   .000        .931      1048   .000
a. Lilliefors Significance Correction
Tests of Normality
              Kolmogorov-Smirnov(a)        Shapiro-Wilk
              Statistic   df     Sig.      Statistic   df     Sig.
TOTAL_VALU       .283      149   .000        .463       149   .000
a. Lilliefors Significance Correction
Tests of Normality
              Kolmogorov-Smirnov(a)        Shapiro-Wilk
              Statistic   df     Sig.      Statistic   df     Sig.
Z100             .071      100   .200*       .985       100   .333
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction
Statistical tests for normality are more precise since actual probabilities are calculated.
The Kolmogorov-Smirnov and Shapiro-Wilk tests for normality calculate the probability that the sample was drawn from a normal population.
The hypotheses used are:
H0: The sample data are not significantly different from a normal population.
Ha: The sample data are significantly different from a normal population.
Typically, we are interested in finding a difference between groups.
When we are, we ‘look’ for small probabilities.
• If an event is rare (probability less than 5%) and we actually observe it, that is of interest.
When testing normality, we are not looking for a difference.
• In effect, we want our data set to be NO DIFFERENT than
normal.
So when testing for normality:
• Probabilities > 0.05 mean the data are normal.
• Probabilities < 0.05 mean the data are NOT normal.
Kolmogorov‐Smirnov Normality Test
• Based on comparing the observed frequencies and the expected
frequencies.
• Expected frequencies are estimated using z‐scores.
• Cumulative expected frequencies are determined from the z-table:
• For negative z-scores, take the tabled probability directly.
• For positive z-scores, use (1 − the tabled probability).
Kolmogorov-Smirnov D statistic:

  Di = | rel Fi − rel F̂i |   and   D'i = | rel F(i−1) − rel F̂i |

where rel Fi is the observed cumulative relative frequency, rel F̂i is the expected cumulative relative frequency, and rel F(i−1) is the observed cumulative relative frequency of the previous observation. Use absolute values.
Dcritical = 0.106
D = 0.108
Since 0.108 > 0.106, reject H0.
The sample data set is significantly different from normal (D = 0.108, 0.05 > p > 0.025).
From SPSS:
Tests of Normality
              Kolmogorov-Smirnov(a)        Shapiro-Wilk
              Statistic   df     Sig.      Statistic   df     Sig.
VAR00001         .110       70   .036        .965        70   .045
a. Lilliefors Significance Correction
Non-Normally Distributed Data
              Kolmogorov-Smirnov(a)        Shapiro-Wilk
              Statistic   df     Sig.      Statistic   df     Sig.
Average PM10     .142       72   .001        .841        72   .000
a. Lilliefors Significance Correction
Remember that LARGE probabilities denote normally distributed data.
Normally Distributed Data
              Kolmogorov-Smirnov(a)        Shapiro-Wilk
              Statistic   df     Sig.      Statistic   df     Sig.
Asthma Cases     .069       72   .200*       .988        72   .721
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction
In this case (the Asthma Cases data) the probabilities are greater than 0.05 (the typical alpha level), so we accept H0… these data are not different from normal.
For the Average PM10 data above, the probabilities are less than 0.05 (the typical alpha level), so we reject H0… these data are significantly different from normal.
Important: As the sample size increases, the normality tests become MORE restrictive and it becomes harder to declare that the data are normally distributed.
For Continuous Data:
Village            Pop. Density   Obs. Cum. Freq.   Z-score   Z Prob.   Exp. Cum. Freq.   Di       D'i
Aranza 4.13 0.0588 ‐1.40 0.0808 0.0808 0.0220 0.0808
Corupo 4.53 0.1176 ‐0.94 0.1736 0.1736 0.0560 0.1148
San Lorenzo 4.69 0.1764 ‐0.75 0.2266 0.2266 0.0502 0.1090
Cheranatzicurin 4.76 0.2352 ‐0.67 0.2514 0.2514 0.0162 0.0750
Nahuatzen 4.77 0.2940 ‐0.66 0.2546 0.2546 0.0394 0.0194
Pomacuaran 4.96 0.3528 ‐0.44 0.3300 0.3300 0.0228 0.0360
Sevina 4.97 0.4116 ‐0.43 0.3336 0.3336 0.0780 0.0192
Arantepacua 5.00 0.4704 ‐0.39 0.3483 0.3483 0.1221 0.0633
Cocucho 5.04 0.5292 ‐0.35 0.3632 0.3632 0.1660 0.1072
Charapan 5.10 0.5880 ‐0.28 0.3897 0.3897 0.1983 0.1395
Comachuen 5.25 0.6468 ‐0.10 0.4602 0.4602 0.1866 0.1278
Pichataro 5.36 0.7056 0.02 0.4920 0.5080 0.1976 0.1388
Quinceo 5.94 0.7644 0.69 0.2451 0.7549 0.0095 0.0493
Nurio 6.06 0.8232 0.83 0.2033 0.7967 0.0265 0.0323
Turicuaro 6.19 0.8820 0.98 0.1635 0.8365 0.0455 0.0133
Urapicho 6.30 0.9408 1.11 0.1335 0.8665 0.0743 0.0155
Capacuaro 7.73 0.9996 2.76 0.0029 0.9971 0.0025 0.0563
Mean = 5.34
SD = 0.866
N = 17
For the observed cumulative frequency, simply add 1/n = 1/17 = 0.0588 for each observation: for Aranza 1/17 = 0.0588, for Corupo 0.0588 + 0.0588 = 0.1176, for San Lorenzo 0.0588 + 0.0588 + 0.0588 = 0.1764, etc.
Di Max = 0.1983    D'i Max = 0.1395
D = 0.1983 (use the larger of the two values)
Dcritical = 0.207 (from the table)
Since 0.1983 < 0.207, accept H0.
The sample data set is not significantly different from normal (D = 0.1983, 0.10 > p > 0.05).
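As a check on the hand calculation, here is a minimal sketch (not part of the original slides) that reproduces the Di and D'i computation for the village population densities, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.stats import norm

# Village population densities from the table above
x = np.sort(np.array([4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 4.97, 5.00, 5.04,
                      5.10, 5.25, 5.36, 5.94, 6.06, 6.19, 6.30, 7.73]))
n = len(x)

obs_cum = np.arange(1, n + 1) / n                      # observed cumulative relative frequency
z = (x - x.mean()) / x.std(ddof=1)                     # z-scores from the sample mean and SD
exp_cum = norm.cdf(z)                                  # expected cumulative relative frequency

d_i = np.abs(obs_cum - exp_cum)                        # D_i
d_prime = np.abs(np.r_[0.0, obs_cum[:-1]] - exp_cum)   # D'_i uses the previous observed value

D = max(d_i.max(), d_prime.max())                      # use the larger of the two maxima
print(f"D = {D:.4f}")                                  # ~0.198, versus Dcritical = 0.207
```

Note that the Lilliefors correction used by SPSS adjusts the critical values for the fact that the mean and SD are estimated from the sample, so its significance value will not match the standard Kolmogorov-Smirnov table exactly.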
From SPSS:
Tests of Normality
              Kolmogorov-Smirnov(a)        Shapiro-Wilk
              Statistic   df     Sig.      Statistic   df     Sig.
VAR00001         .197       17   .077        .883        17   .035
a. Lilliefors Significance Correction
Notice that there is disagreement between the Kolmogorov‐Smirnov and the Shapiro‐
Wilk tests.
W/S Test for Normality
• A fairly simple test that requires only the sample standard deviation and the data range.
• Based on the q statistic, which is the 'studentized' range, or the range expressed in standard deviation units.
• Should not be confused with the Shapiro-Wilk test.

  q = w / s

where q is the test statistic, w is the range of the data, and s is the standard deviation.
Village            Population Density
Aranza 4.13
Corupo 4.53
San Lorenzo 4.69
Cheranatzicurin 4.76
Nahuatzen 4.77
Pomacuaran 4.96
Sevina 4.97
Arantepacua 5.00
Cocucho 5.04
Charapan 5.10
Comachuen 5.25
Pichataro 5.36
Quinceo 5.94
Nurio 6.06
Turicuaro 6.19
Urapicho 6.30
Capacuaro 7.73
Standard deviation (s) = 0.866
Range (w) = 3.6
n = 17
q = w / s = 3.6 / 0.866 = 4.16

Critical range (from the table): 3.06 to 4.31
The W/S test uses a critical range. IF the calculated value falls WITHIN the range,
then accept Ho. IF the calculated value falls outside the range then reject Ho.
Since 4.16 falls between 3.06 and 4.31, we accept H0.
Since we have a critical range, it is difficult to determine a probability range for our results. Therefore we simply state our alpha level.
The sample data set is not significantly different from normal (W/S = 4.16, p > 0.05).
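A minimal sketch (not from the original slides) of the same q calculation, assuming NumPy is available:

```python
import numpy as np

# Village population densities from the table above
x = np.array([4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 4.97, 5.00, 5.04,
              5.10, 5.25, 5.36, 5.94, 6.06, 6.19, 6.30, 7.73])

w = x.max() - x.min()     # range of the data
s = x.std(ddof=1)         # sample standard deviation
q = w / s                 # studentized range
print(f"q = {q:.2f}")     # ~4.16; compare with the tabled critical range 3.06 to 4.31 for n = 17
```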
D’Agostino Test for Normality
• A very powerful test for departures from normality.
• Based on the D statistic, which gives an upper and lower critical
value.
• First the data are ordered from smallest to largest or largest to smallest.

  D = T / √(n³ · SS),   where  T = Σ [ i − (n + 1)/2 ] · Xi

Here D is the test statistic, SS is the sum of squares of the data, n is the sample size, and i is the order or rank of observation X.
Village            Population Density    i    Mean Deviates²
Aranza 4.13 1 1.46410
Corupo 4.53 2 0.65610
San Lorenzo 4.69 3 0.42250
Cheranatzicurin 4.76 4 0.33640
Nahuatzen 4.77 5 0.32490
Pomacuaran 4.96 6 0.14440
Sevina 4.97 7 0.13690
Arantepacua 5.00 8 0.11560
Cocucho 5.04 9 0.09000
Charapan 5.10 10 0.05760
Comachuen 5.25 11 0.00810
Pichataro 5.36 12 0.00040
Quinceo 5.94 13 0.36000
Nurio 6.06 14 0.51840
Turicuaro 6.19 15 0.72250
Urapicho 6.30 16 0.92160
Capacuaro 7.73 17 5.71210
Mean = 5.34 SS = 11.9916
(n + 1)/2 = (17 + 1)/2 = 9

T = Σ (i − 9) Xi = (1 − 9)(4.13) + (2 − 9)(4.53) + (3 − 9)(4.69) + … + (17 − 9)(7.73) = 63.23

D = T / √(n³ · SS) = 63.23 / √((17³)(11.9916)) = 0.26050

Dcritical = 0.2587, 0.2860 (from the table)
If the calculated value falls within the critical range, accept H0.
Since 0.2587 < D = 0.26050 < 0.2860, accept H0.
The sample data set is not significantly different from normal (D = 0.26050, p > 0.05).
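A minimal sketch (not from the original slides) of the D'Agostino D calculation for the same data, assuming NumPy is available:

```python
import numpy as np

# Village population densities, ordered from smallest to largest
x = np.sort(np.array([4.13, 4.53, 4.69, 4.76, 4.77, 4.96, 4.97, 5.00, 5.04,
                      5.10, 5.25, 5.36, 5.94, 6.06, 6.19, 6.30, 7.73]))
n = len(x)
i = np.arange(1, n + 1)                      # ranks 1..n

ss = np.sum((x - x.mean()) ** 2)             # sum of squares (SS = 11.9916)
T = np.sum((i - (n + 1) / 2) * x)            # T = sum of [i - (n+1)/2] * X_i
D = T / np.sqrt(n ** 3 * ss)                 # D = T / sqrt(n^3 * SS)
print(f"T = {T:.2f}, D = {D:.5f}")           # T ~ 63.23, D ~ 0.26050
```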
Example mean deviate² (for Aranza): (4.13 − 5.34)² = (−1.21)² = 1.46410
Use the next lower n on the table if your sample size is NOT listed.
[Charts not reproduced: values of D as Capacuaro's population density is increased, and as Aranza's population density is decreased.]
Decreasing Aranza’s
population density
slightly made the data
set more symmetrical.
Which normality test should I use?
Kolmogorov-Smirnov:
• Not sensitive to problems in the tails.
• Works reasonably well with data sets of n < 50.
Shapiro-Wilk:
• Doesn't work well if several values in the data set are the same.
• Works best for data sets with n > 50, but can be used with smaller data sets.
• The better of the two SPSS tests.
W/S:
• Simple, but effective.
• Not available in SPSS.
D’Agostino:
• Probably the most powerful of all the normality tests.
• Not available in SPSS.
Normality tests under differing conditions

                         Kolmogorov-Smirnov      Shapiro-Wilk
Data Type           N    Statistic    Prob       Statistic    Prob
Random Normal      30      0.116      0.200        0.987      0.962
Random Normal     100      0.059      0.200        0.986      0.382
Random Normal    1000      0.019      0.200        0.998      0.386
Random Normal    2000      0.018      0.120        0.999      0.154
Notice that as the sample size increases, the probabilities decrease.
In other words, it gets harder to meet the normality assumption as
the sample size increases since even small differences are detected.
Normality Test        Statistic   Calculated Value   Probability
Anderson-Darling          A            1.0958          0.006219
Cramér-von Mises          W            0.1776          0.009396
Shapiro-Francia           W            0.9186          0.013690
Shapiro-Wilk              W            0.9224          0.014770
Kolmogorov-Smirnov        D            0.1577          0.023850
Pearson chi-square        P           12.5000          0.051700
Different normality tests produce vastly different probabilities. This is due
to where in the distribution (central, tails) or what moment (skewness, kurtosis)
they are examining.
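To see this for yourself, here is a sketch (not from the original slides) that runs several of SciPy's built-in normality tests on one hypothetical sample; not all of the tests in the table above are available in SciPy, so Shapiro-Wilk, D'Agostino-Pearson, and Anderson-Darling are used as stand-ins:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=5.3, scale=0.9, size=50)   # hypothetical sample

sw_stat, sw_p = stats.shapiro(x)              # Shapiro-Wilk
k2_stat, k2_p = stats.normaltest(x)           # D'Agostino-Pearson K^2 (skewness + kurtosis)
ad = stats.anderson(x, dist='norm')           # Anderson-Darling (returns critical values)

print(f"Shapiro-Wilk:       W  = {sw_stat:.3f}, p = {sw_p:.3f}")
print(f"D'Agostino-Pearson: K2 = {k2_stat:.3f}, p = {k2_p:.3f}")
print(f"Anderson-Darling:   A2 = {ad.statistic:.3f}, 5% critical value = {ad.critical_values[2]:.3f}")
```

Even on the same sample, these tests typically return noticeably different statistics and probabilities, which is the point being made above.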
When is non‐normality a problem?
• Normality can be a problem when the sample size is small (< 50).
• Highly skewed data create problems.
• Highly leptokurtic data are problematic, but not as much as
skewed data.
• Normality becomes a serious concern when there is “activity” in
the tails of the data set.
• Outliers are a problem.
• “Clumps” of data in the tails are worse.
Final Words Concerning Normality Testing:
1. Since it IS a test, state a null and alternate hypothesis.
2. If you perform a normality test, do not ignore the results.
3. If the data are not normal, use non‐parametric tests.
4. If the data are normal, use parametric tests.
AND MOST IMPORTANTLY:
5. If you have groups of data, you MUST test each group for
normality.