3. The Shapiro–Wilk test is a test of normality in
frequentist statistics. It was published in 1965 by Samuel Sanford
Shapiro and Martin Wilk.
Assumption:
The data are a random sample.
4. Samuel Sanford Shapiro
(born July 13, 1930) is an
American statistician and engineer. He is a professor emeritus of
statistics at Florida International University. He is known for his
co-authorship of the Shapiro–Wilk test. A native of New York
City, Shapiro graduated from City College of New York with a
degree in statistics in 1952, and took an MS in industrial
engineering at Columbia University in 1954. He briefly served
as a statistician in the US Army Chemical Corps, before earning
an MS (1960) and PhD (1963) in statistics at Rutgers
University. In 1972 he joined the faculty at Florida International
University.
5. Martin Bradbury Wilk,
OC (18 December 1922 – 19 February 2013) was
a Canadian statistician, academic, and the former Chief
Statistician of Canada. In 1965, together with Samuel Shapiro,
he developed the Shapiro–Wilk test, which can indicate whether
a sample of numbers would be unusual if it came from
a Gaussian distribution. With Ramanathan Gnanadesikan he
developed a number of important graphical techniques for data
analysis, including the Q–Q plot and P–P plot.
Born in Montréal, Québec, he received a Bachelor of
Engineering degree in Chemical Engineering from McGill
University in 1945. From 1945 to 1950, he was a Research
Chemical Engineer on the Atomic Energy Project at the National
Research Council of Canada. From 1951 to 1955, he was a
Research Associate, Instructor, and Assistant Professor at Iowa
State University, where he received a Master of Science in
Statistics in 1953 and a Ph.D. in Statistics in 1955. From 1955 to
1957, he was a Research Associate and Assistant Director of the
Statistical Techniques Research Group at Princeton University.
6. From 1959 to 1963, he was a Professor and Director of Research
in Statistics at Rutgers University. In 1956, he joined Bell
Telephone Laboratories and in 1970 joined American Telephone
and Telegraph Company. From 1976 to 1980, he was the
Assistant Vice President-Director of Corporate Planning. From
1980 to 1985, he was the Chief Statistician of Canada. In 1981, he
was appointed an Adjunct Professor of Statistics at Carleton
University. In 1999, he was made an Officer of the Order of
Canada for his "insightful guidance on important matters
related to our country's national statistical system".
7. Procedure:
(1) Hypothesis
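H0: the data are normally distributed
H1: the data are not normally distributed
(2) Level of Significance
α = 0.01
(3) Test Statistic
$W = \left[ \dfrac{b}{s\sqrt{n-1}} \right]^2$
where b is a weighted sum of differences between order statistics and s is the sample standard deviation.
The test is carried out in the following steps.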
8. Step 1. Order the data from least to greatest, labeling the
observations as $x_i$ for $i = 1 \ldots n$. Using the notation $x_{(j)}$, let the
jth order statistic from any data set represent the jth
smallest value.
Step 2. Compute the differences $x_{(n-i+1)} - x_{(i)}$ for each $i = 1 \ldots n$.
Then determine $k$ as the greatest integer less than or equal
to $n/2$.
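Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration rather than part of the original procedure: the function name `ordered_differences` is hypothetical, and it returns only the first k differences because only those enter the sum in Step 4. The sample values are the ten observations used in Example 2.

```python
def ordered_differences(data):
    """Return k and the first k differences x_(n-i+1) - x_(i)."""
    x = sorted(data)            # Step 1: order statistics x_(1) <= ... <= x_(n)
    n = len(x)
    k = n // 2                  # greatest integer less than or equal to n / 2
    # Step 2: pair the i-th smallest value with the i-th largest (0-based here).
    diffs = [x[n - 1 - i] - x[i] for i in range(k)]
    return k, diffs


sample = [20, 38, 40, 45, 50, 63, 70, 75, 79, 86]
k, diffs = ordered_differences(sample)
print(k, diffs)  # 5 [66, 41, 35, 25, 13]
```

The remaining differences (i > k) are the negatives of the first k, so they carry no new information.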
Step 3. Use Table G-4 in Appendix G to determine the Shapiro-Wilk
coefficients, $a_{n-i+1}$, for $i = 1 \ldots n$. Note that while these
coefficients depend only on the sample size (n), the order of the
coefficients must be preserved when used in step 4 below. The
coefficients can be determined for any sample size from n = 3 up
to n = 50.
Step 4. Compute the quantity $b$ given by the following formula:
$b = \sum_{i=1}^{k} b_i = \sum_{i=1}^{k} a_{n-i+1}\left(x_{(n-i+1)} - x_{(i)}\right)$
Note that the values $b_i$ are simply intermediate quantities
represented by the terms in the sum of the right-hand
expression in the above equation.
9. Step 5. Calculate the sample standard deviation (s) of the data set,
using the n − 1 denominator. Then compute the Shapiro-Wilk test
statistic using the following formula:
$W = \left[ \dfrac{b}{s\sqrt{n-1}} \right]^2$
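Steps 4 and 5 can be sketched as a single function. This is an illustration under stated assumptions, not a reference implementation: `shapiro_wilk_w` is a hypothetical name, the caller must supply the k tabulated coefficients in table order, and the three-value sample is made up for the demonstration. The coefficient 0.7071 is the standard tabulated value for n = 3.

```python
import math
import statistics


def shapiro_wilk_w(data, coeffs):
    """Compute W = (b / (s * sqrt(n - 1)))**2 from tabulated coefficients.

    coeffs holds a_(n), a_(n-1), ..., a_(n-k+1) in the order given by the table.
    """
    x = sorted(data)
    n = len(x)
    k = n // 2
    if len(coeffs) != k:
        raise ValueError(f"expected {k} coefficients, got {len(coeffs)}")
    # Step 4: weighted sum of the differences between paired order statistics.
    b = sum(a * (x[n - 1 - i] - x[i]) for i, a in enumerate(coeffs))
    # Step 5: sample standard deviation (n - 1 denominator).
    s = statistics.stdev(x)
    return (b / (s * math.sqrt(n - 1))) ** 2


# Hypothetical sample of size 3 with the n = 3 coefficient.
w = shapiro_wilk_w([2.0, 4.0, 9.0], [0.7071])
print(round(w, 3))
```

Because s√(n−1) is the square root of the sum of squared deviations from the mean, W cannot exceed 1 (up to rounding of the tabulated coefficients), which gives a quick sanity check on hand calculations.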
Critical region and conclusion:
Given the significance level (α) of the test (for
example, 0.01 or 0.05), determine the critical point of the Shapiro-Wilk
test with n observations using Table G-5 in Appendix G.
Compare the Shapiro-Wilk statistic (W) against the critical point
($w_c$). If the test statistic exceeds the critical point, accept normality
as a reasonable model for the underlying population; however, if
$W < w_c$, reject the null hypothesis of normality at the α-level and
decide that another distributional model would provide a better
fit.
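The decision rule can be written as a small helper. This is a sketch only: the name `shapiro_wilk_decision` is hypothetical, and treating W exactly equal to the critical point as "do not reject" is an assumption, since the rule above does not address ties. The values 0.679 and 0.868 are those of the nickel example.

```python
def shapiro_wilk_decision(w, w_critical):
    """Apply the decision rule: reject normality when W < w_c."""
    if not 0.0 < w <= 1.0:
        # W is a squared ratio bounded by 1, so anything else signals an error.
        raise ValueError("W should lie in (0, 1]; check the arithmetic")
    if w < w_critical:
        return "reject H0: significant evidence of nonnormality"
    return "do not reject H0: normality is a reasonable model"


print(shapiro_wilk_decision(0.679, 0.868))
```

Note that small values of W lead to rejection, which is the opposite of many familiar tests where large statistics are significant.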
10. Example Calculation of the Shapiro-Wilk Test for Normality
Use the Shapiro-Wilk test for normality to determine whether the
following data set, representing the total concentration of nickel
in a solid waste, follows a normal distribution: 58.8, 19, 39, 3.1, 1,
81.5, 151, 942, 262, 331, 27, 85.6, 56, 14, 21.4, 10, 8.7, 64.4, 578, and
637.
Solution
Step 1. Order the data from smallest to largest and list, as in
Table F-2. Also list the data in reverse order alongside the first
column.
Step 2. Compute the differences $x_{(n-i+1)} - x_{(i)}$ in column 4 of the table by
subtracting column 2 from column 3. Because the
total number of samples is n = 20, the largest integer less than or
equal to n/2 is k = 10.
11. Step 3. Look up the coefficients $a_{n-i+1}$ from Table G-4 in Appendix G
and list in column 5.
Step 4. Multiply the differences in column 4 by the coefficients in
column 5 and add the first k products ($b_i$) to get the quantity $b$, using
Equation F.1.
$b = 0.4734(941.0) + 0.3211(633.9) + \cdots + 0.0140(2.8) = 932.88$
Step 5. Compute the standard deviation of the sample, s = 259.72,
then use Equation F.2 to calculate the Shapiro-Wilk test statistic:
$W = \left[ \dfrac{b}{s\sqrt{n-1}} \right]^2 = \left[ \dfrac{932.88}{259.72\sqrt{19}} \right]^2 = 0.679$
Step 6. Use Table G-5 in Appendix G to determine the .01-level
critical point for the Shapiro-Wilk test when n = 20. This gives $w_c$
= 0.868. Then, compare the observed value of W = 0.679 to the
1-percent critical point. Since W < 0.868, the sample shows
significant evidence of nonnormality by the Shapiro-Wilk test.
The data should be transformed using natural logs and rechecked
using the Shapiro-Wilk test before proceeding with further
statistical analysis.
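The nickel example can be checked numerically, including the recommended recheck on log-transformed data. This is a sketch under assumptions: only three of the ten n = 20 coefficients appear in the text, so the remaining values are filled in from standard Shapiro-Wilk tables, and `w_statistic` is a hypothetical helper. It uses the equivalent form W = b² / Σ(x − x̄)², since s√(n−1) is the square root of that sum of squares.

```python
import math

# Nickel concentrations from the example above.
nickel = [58.8, 19, 39, 3.1, 1, 81.5, 151, 942, 262, 331,
          27, 85.6, 56, 14, 21.4, 10, 8.7, 64.4, 578, 637]

# Coefficients a_(n-i+1) for n = 20, i = 1..10. The first two and the last
# appear in the text; the others are assumed from standard published tables.
A20 = [0.4734, 0.3211, 0.2565, 0.2085, 0.1686,
       0.1334, 0.1013, 0.0711, 0.0422, 0.0140]
W_CRITICAL = 0.868  # 1-percent critical point for n = 20, from the text


def w_statistic(data, coeffs):
    """Return W = b**2 / sum of squared deviations from the mean."""
    x = sorted(data)
    n = len(x)
    b = sum(a * (x[n - 1 - i] - x[i]) for i, a in enumerate(coeffs))
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)  # equals (s * sqrt(n - 1))**2
    return b * b / ss


w_raw = w_statistic(nickel, A20)
w_log = w_statistic([math.log(v) for v in nickel], A20)
print(f"raw data: W = {w_raw:.3f}, reject normality: {w_raw < W_CRITICAL}")
print(f"log data: W = {w_log:.3f}, reject normality: {w_log < W_CRITICAL}")
```

On the raw data this reproduces the hand-calculated W of about 0.679. The log-transformed data give a W above the critical point, so the lognormal model is not rejected at the 1-percent level.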
13. Example 2: Determine whether the following data are normally distributed: 20, 38, 40, 45, 50, 63, 70, 75, 79, 86
(1) Hypothesis
H0: the data are normally distributed
H1: the data are not normally distributed
(2) Level of significance
α = 0.01
(3) Test statistic
$W = \left[ \dfrac{b}{s\sqrt{n-1}} \right]^2$
(4) Calculation:
x(i)   x(n-i+1)   x(n-i+1) - x(i)   a(n-i+1)   b_i
20     86          66               0.5739     37.88
38     79          41               0.3291     13.49
40     75          35               0.2141      7.49
45     70          25               0.1224      3.06
50     63          13               0.0399      0.52
63     50         -13
70     45         -25
75     40         -35
79     38         -41
86     20         -66
b = sum of the first k = 5 products b_i = 62.44
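14. $W = \left[ \dfrac{b}{s\sqrt{n-1}} \right]^2$
b = 62.44, n = 10, s = 21.25 (sample standard deviation, n − 1 denominator)
Substituting into the formula above, we get
$W = \left[ \dfrac{62.44}{21.25\sqrt{10-1}} \right]^2 = \left( \dfrac{62.44}{63.75} \right)^2 = 0.959$
(5) Critical region:
$w_c$ = 0.781 for α = 0.01 and n = 10; reject H0 if W < 0.781
(6) Decision
Use Table G-5 in Appendix G to determine the .01-level critical point for the Shapiro-Wilk test
when n = 10. This gives $w_c$ = 0.781. Then compare the observed value of W = 0.959 to the
1-percent critical point. Since W > 0.781, the sample shows no significant evidence of
nonnormality by the Shapiro-Wilk test, and normality is accepted as a reasonable model for the data.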
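The arithmetic of Example 2 can be reproduced with a short script. This is a sketch rather than a definitive implementation: the last coefficient is taken as 0.0399, the standard table value for n = 10, which matches the tabulated product 13 × 0.0399 ≈ 0.52. The standard deviation uses `statistics.stdev`, which divides by n − 1.

```python
import math
import statistics

data = [20, 38, 40, 45, 50, 63, 70, 75, 79, 86]
# Coefficients a_(n-i+1) for n = 10; the last value is taken as 0.0399
# (standard table value, consistent with the product b_5 = 0.52).
A10 = [0.5739, 0.3291, 0.2141, 0.1224, 0.0399]

x = sorted(data)
n = len(x)
# One row per pair: x_(i), x_(n-i+1), their difference, and the coefficient.
rows = [(x[i], x[n - 1 - i], x[n - 1 - i] - x[i], a) for i, a in enumerate(A10)]
for low, high, diff, a in rows:
    print(f"{low:>4} {high:>4} {diff:>4} {a:.4f} {a * diff:6.2f}")

b = sum(a * diff for _, _, diff, a in rows)
s = statistics.stdev(x)                  # sample standard deviation (n - 1)
w = (b / (s * math.sqrt(n - 1))) ** 2
print(f"b = {b:.2f}, s = {s:.2f}, W = {w:.3f}")
```

This gives b = 62.44, s = 21.25, and W = 0.959. Note that `statistics.pstdev`, which divides by n, would give 20.16 and push W above 1, which is impossible for this statistic.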