Session – 9
Measures of Dispersions
Standard Deviation
Standard deviation is the square root of the sum of the squared deviations from the mean, divided by the number of observations. It is also called the 'mean error deviation', the 'mean square error deviation' or the 'root mean square deviation'. It is a second-moment measure of dispersion. Since the sum of squares of deviations is a minimum when the deviations are taken from the mean, the deviations are measured only from the mean (not from the median or mode).
The standard deviation is the Root Mean Square (RMS) average of all the deviations from the mean. It is denoted by sigma ($\sigma$).
Characteristics of standard deviation
1. Standard deviation and coefficient of variation possess all the properties that a good measure of dispersion should possess.
2. The process of squaring the deviations eliminates negative signs and makes mathematical computations easy.
Merits
1. It is based on all observations.
2. It can be smoothly handled algebraically.
3. It is a well-defined and definite measure of dispersion.
4. It is of great importance when comparing the variability of two series.
Demerits
1. It is difficult to calculate and understand.
2. It gives more weightage to extreme values as the deviation is squared.
3. It is not useful in economic studies.
Standard deviation
If the variate $x_i$ takes the values $x_1, x_2, \ldots, x_n$, the standard deviation, denoted by $\sigma$, is defined by
$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$
The quantity $\sigma^2$ is called the variance.
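The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `population_sd` is an assumption made here, and the sample values are the ten marks used in Problem 1 of this session.

```python
import math

def population_sd(values):
    """Population standard deviation: sqrt(sum((x - mean)^2) / N)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n  # sigma squared
    return math.sqrt(variance)

# Marks of the ten students from Problem 1 of this session
marks = [5, 10, 20, 25, 40, 42, 45, 48, 70, 80]
sd = population_sd(marks)
print(round(sd, 3))
```

Note that the divisor is N, not N - 1, which matches the definition used throughout these notes.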
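Alternate Expressions
For raw data: $\sigma^2 = \frac{\sum x^2}{n} - (\bar{x})^2$
For grouped data: $\sigma^2 = \frac{\sum f x^2}{n} - (\bar{x})^2$
For grouped data with the step deviation method: $\sigma = h \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$, where $d = \frac{x - a}{h}$, $a$ is the assumed mean and $h$ is the class width.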
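As a rough check that the grouped-data expression and the step deviation expression agree, the sketch below evaluates both on the frequency table of Problem 2 in this session (midpoints 5.5 to 55.5, a = 25.5, h = 10). The function names are illustrative assumptions, not part of the original notes.

```python
import math

def sd_grouped(midpoints, freqs):
    """sigma = sqrt(sum(f*x^2)/N - mean^2) for grouped data."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    mean_sq = sum(f * x * x for f, x in zip(freqs, midpoints)) / n
    return math.sqrt(mean_sq - mean ** 2)

def sd_step_deviation(midpoints, freqs, a, h):
    """sigma = h * sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2), with d = (x - a) / h."""
    n = sum(freqs)
    d = [(x - a) / h for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    return h * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

midpoints = [5.5, 15.5, 25.5, 35.5, 45.5, 55.5]
freqs = [3, 16, 26, 31, 16, 8]

direct = sd_grouped(midpoints, freqs)
stepped = sd_step_deviation(midpoints, freqs, a=25.5, h=10)
print(round(direct, 3), round(stepped, 3))
```

Both routes should give the same value; the step deviation form only keeps the intermediate numbers small for hand calculation.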
Coefficient of variation
It is defined as the ratio of the standard deviation to the mean.
In percentage form, the CV is given by $CV = \frac{\sigma}{\bar{x}} \times 100$
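A minimal sketch of the percentage form, assuming the mean and SD are already known. The helper name `coefficient_of_variation` is an assumption for illustration; the figures 15 and 50 are the SD and mean of batsman A from a later problem in this session.

```python
def coefficient_of_variation(sd, mean):
    """CV in percent: (sd / mean) * 100. Undefined when the mean is zero."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * sd / mean

cv_a = coefficient_of_variation(sd=15, mean=50)
print(cv_a)
```

Because the CV is a pure ratio, it can compare the variability of series measured in different units or with very different means.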
Problems
1. Ten students of a class have obtained the following marks in a particular subject out of 100. Calculate the SD and CV for the data given below.

Sl. No.   Marks (x)   d = x - x̄ (x̄ = 38.5)   d² = (x - x̄)²
1         5           -33.5                   1122.25
2         10          -28.5                   812.25
3         20          -18.5                   342.25
4         25          -13.5                   182.25
5         40          1.5                     2.25
6         42          3.5                     12.25
7         45          6.5                     42.25
8         48          9.5                     90.25
9         70          31.5                    992.25
10        80          41.5                    1722.25
          Σx = 385                            Σd² = Σ(x - x̄)² = 5320.50

$\bar{x} = \frac{\sum x}{N} = \frac{385}{10} = 38.5$
$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}} = \sqrt{\frac{5320.5}{10}} = 23.066$
$CV = \frac{\sigma}{\bar{x}} \times 100 = \frac{23.066}{38.5} \times 100 = 59.9\%$
2. Compute the standard deviation and coefficient of variation for the following data on the marks of 100 students.

Class      f     Class boundary   Midpoint x   d     fd     fd²
1 – 10     3     0.5 – 10.5       5.5          -2    -6     12
11 – 20    16    10.5 – 20.5      15.5         -1    -16    16
21 – 30    26    20.5 – 30.5      25.5         0     0      0
31 – 40    31    30.5 – 40.5      35.5         1     31     31
41 – 50    16    40.5 – 50.5      45.5         2     32     64
51 – 60    8     50.5 – 60.5      55.5         3     24     72
N = Σf = 100                                         Σfd = 65   Σfd² = 195

Take a = 25.5 and h = 10.
$d = \frac{x - a}{h} = \frac{x - 25.5}{10}$; for example, for x = 15.5, $d = \frac{15.5 - 25.5}{10} = -1$
$\bar{x} = a + h \frac{\sum fd}{N} = 25.5 + 10 \times \frac{65}{100} = 25.5 + 6.5 = 32$
$\sigma = h \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = 10 \sqrt{\frac{195}{100} - \left(\frac{65}{100}\right)^2} = 12.359$
$CV = \frac{\sigma}{\bar{x}} \times 100 = \frac{12.359}{32} \times 100 = 38.62\%$
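3. The AM and SD of a set of nine items are 43 and 5 respectively. If an item of value 63 is added, find the mean and SD of the enlarged set.
$\bar{x} = \frac{\sum x_i}{N}$, so $\sum x_i = \bar{x} \times N = 43 \times 9 = 387$ for 9 items
$\sum x = 387 + 63 = 450$ for 10 items
Modified mean: $\bar{x} = \frac{\sum x}{N} = \frac{450}{10} = 45$
For 9 items, $\bar{x} = 43$ and $\sigma = 5$. Using $\sigma^2 = \frac{\sum x^2}{N} - (\bar{x})^2$:
$25 = \frac{\sum x^2}{9} - 43^2 = \frac{\sum x^2}{9} - 1849$
$\frac{\sum x^2}{9} = 25 + 1849 = 1874$
$\sum x^2 = 1874 \times 9 = 16866$ for 9 items
If 63 is added,
$\sum x^2 = 16866 + (63)^2 = 20835$ for 10 items
Modified $\sigma^2 = \frac{\sum x^2}{N} - (\bar{x})^2 = \frac{20835}{10} - 45^2 = 58.5$
$\sigma = \sqrt{58.5} \approx 7.65$ is the modified SD.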
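The working for this problem (recover Σx and Σx² from the mean and SD, add the new item, then recompute) can be sketched as follows. The function name `add_observation` is an assumption made for illustration.

```python
import math

def add_observation(n, mean, sd, new_value):
    """Return (mean, sd) after adding one value to a set summarised by n, mean, sd."""
    total = n * mean                     # sum of x for the original items
    total_sq = n * (sd ** 2 + mean ** 2) # sum of x^2, from sigma^2 = sum(x^2)/n - mean^2
    n_new = n + 1
    total += new_value
    total_sq += new_value ** 2
    new_mean = total / n_new
    new_var = total_sq / n_new - new_mean ** 2
    return new_mean, math.sqrt(new_var)

new_mean, new_sd = add_observation(n=9, mean=43, sd=5, new_value=63)
print(new_mean, round(new_sd, 2))
```

Only the summary figures are needed; the individual nine items never have to be known.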
4. The mean of 5 observations is 4.4 and the variance is 8.24. If three of the five observations are 1, 2 and 6, find the values of the other two observations.
We know that $\bar{x} = \frac{\sum x}{N}$, so $4.4 = \frac{\sum x}{5}$ and $\sum x = 22$.
$\sigma^2 = \frac{\sum x^2}{N} - (\bar{x})^2$
$8.24 = \frac{\sum x^2}{5} - (4.4)^2 = \frac{\sum x^2}{5} - 19.36$
$\frac{\sum x^2}{5} = 8.24 + 19.36 = 27.6$
$\sum x^2 = 138$
$\sum x^2 = 1^2 + 2^2 + 6^2 + x_1^2 + x_2^2$
$138 = 1 + 4 + 36 + x_1^2 + x_2^2$
$x_1^2 + x_2^2 = 97$ ---- (1)
$\sum x = 1 + 2 + 6 + x_1 + x_2$
$22 = 9 + x_1 + x_2$
$x_1 + x_2 = 13$ ---- (2)
From (2), $x_2 = 13 - x_1$. Substituting in (1):
$x_1^2 + (13 - x_1)^2 = 97$
$x_1^2 + 169 + x_1^2 - 26x_1 = 97$
$2x_1^2 - 26x_1 + 72 = 0$
$x_1^2 - 13x_1 + 36 = 0$
$(x_1 - 4)(x_1 - 9) = 0$, so $x_1 = 4$ or $9$. The other two observations are 4 and 9.
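The same elimination can be sketched in Python: the known mean gives the sum of the two missing values, the known variance gives the sum of their squares, and the pair is then the two roots of a quadratic. The function name `missing_two` is an assumption for illustration.

```python
import math

def missing_two(n, mean, variance, known):
    """Find the two unknown observations given n, mean, variance and the known items."""
    s = n * mean - sum(known)                                # x1 + x2
    q = n * (variance + mean ** 2) - sum(k * k for k in known)  # x1^2 + x2^2
    p = (s * s - q) / 2                                      # x1 * x2
    disc = s * s - 4 * p                                     # discriminant of t^2 - s*t + p = 0
    if disc < 0:
        raise ValueError("no real pair of observations fits the given mean and variance")
    root = math.sqrt(disc)
    return (s - root) / 2, (s + root) / 2

x1, x2 = missing_two(n=5, mean=4.4, variance=8.24, known=[1, 2, 6])
print(round(x1, 6), round(x2, 6))
```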
5. The mean and S.D. of the frequency distribution of a continuous random variable x are 40.604 and 7.92 respectively. The distribution after a change of origin and scale is given below. Determine the actual class intervals.

d    -3    -2    -1    0     1     2     3     4
f    3     15    45    57    50    36    25    9

d     f     fd     fd²    MV      CI
-3    3     -9     27     22.5    20-25
-2    15    -30    60     27.5    25-30
-1    45    -45    45     32.5    30-35
0     57    0      0      37.5    35-40
1     50    50     50     42.5    40-45
2     36    72     144    47.5    45-50
3     25    75     225    52.5    50-55
4     9     36     144    57.5    55-60
N = 240     Σfd = 149     Σfd² = 695

The mid-values (MV) and class intervals (CI) follow from MV = a + hd once a and h are found:
$\bar{x} = a + h \frac{\sum fd}{N}$
$40.604 = a + h \times \frac{149}{240}$
$40.604 = a + 0.62h$ ----- (1)
$\sigma = h \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$
$7.92 = h \sqrt{\frac{695}{240} - \left(\frac{149}{240}\right)^2} = h \sqrt{2.895 - (0.620)^2}$
$7.92 = h \times 1.5844$
$h = 4.999 \approx 5$
Put h = 5 in equation (1):
$40.604 = a + 0.62 \times 5$
$a = 37.5$
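A short sketch of the same recovery of h and a, followed by rebuilding the class intervals as a + hd ± h/2. Rounding h to the nearest whole number and a to one decimal is an assumption made here, mirroring the hand calculation; the variable names are illustrative.

```python
import math

d = [-3, -2, -1, 0, 1, 2, 3, 4]
f = [3, 15, 45, 57, 50, 36, 25, 9]
given_mean, given_sd = 40.604, 7.92

n = sum(f)
mean_d = sum(fi * di for fi, di in zip(f, d)) / n             # sum(fd) / N
var_d = sum(fi * di * di for fi, di in zip(f, d)) / n - mean_d ** 2

h = given_sd / math.sqrt(var_d)   # from sigma = h * sqrt(var_d)
a = given_mean - h * mean_d       # from mean = a + h * sum(fd) / N

# Assumption: class width and origin are "round" values, as in the worked solution
h_rounded = round(h)
a_rounded = round(a, 1)

intervals = [(a_rounded + h_rounded * di - h_rounded / 2,
              a_rounded + h_rounded * di + h_rounded / 2) for di in d]
print(round(h, 3), round(a, 3))
print(intervals)
```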
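Combined Standard Deviation
Suppose we have different samples of sizes $n_1, n_2, n_3, \ldots$ having means $\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$ and standard deviations $\sigma_1, \sigma_2, \sigma_3, \ldots$; then the combined standard deviation $\sigma$ can be computed by the following formula (shown for two samples):
$\sigma^2 (n_1 + n_2) = n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)$
where $d_1 = \bar{x}_1 - \bar{x}$, $d_2 = \bar{x}_2 - \bar{x}$ and $\bar{x}$ is the combined mean.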
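The combined formula extends naturally to any number of groups. The sketch below is one way to write it, assuming each group is summarised as a tuple (n, mean, sd); the function name `combined_mean_sd` is illustrative. The sample figures are those of the two worked problems that follow in this session.

```python
import math

def combined_mean_sd(groups):
    """groups: iterable of (n, mean, sd). Returns (combined mean, combined sd)."""
    groups = list(groups)
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    # sigma^2 * sum(n_i) = sum(n_i * (sigma_i^2 + d_i^2)), with d_i = mean_i - mean
    variance = sum(n * (s ** 2 + (m - mean) ** 2) for n, m, s in groups) / total_n
    return mean, math.sqrt(variance)

mean_1, sd_1 = combined_mean_sd([(50, 54.1, 8), (100, 50.3, 7)])
mean_2, sd_2 = combined_mean_sd([(1000, 75, 5), (1500, 60, 4.5)])
print(round(mean_1, 2), round(sd_1, 2))
print(round(mean_2, 2), round(sd_2, 2))
```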
1. The means of two samples of sizes 50 and 100 are 54.1 and 50.3 respectively, and their standard deviations are 8 and 7 respectively. Obtain the SD for the combined group.
$n_1 = 50$, $\bar{x}_1 = 54.1$, $\sigma_1 = 8$
$n_2 = 100$, $\bar{x}_2 = 50.3$, $\sigma_2 = 7$
$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{(50 \times 54.1) + (100 \times 50.3)}{50 + 100}$
$\bar{x} = 51.56$
$\sigma^2 (n_1 + n_2) = n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)$
$d_1 = \bar{x}_1 - \bar{x} = 54.1 - 51.56 = 2.54$, $d_1^2 = 6.45$
$d_2 = \bar{x}_2 - \bar{x} = 50.3 - 51.56 = -1.26$, $d_2^2 = 1.59$
$150 \sigma^2 = 50 (8^2 + 6.45) + 100 (7^2 + 1.59)$
$3 \sigma^2 = (64 + 6.45) + 2 (49 + 1.59)$
$3 \sigma^2 = 70.45 + 2 \times 50.59 = 171.63$
$\sigma^2 = 57.21$, $\sigma = 7.56$
2. The mean wage is Rs. 75 per day and the SD of wages is Rs. 5 per day for a group of 1000 workers; the same are Rs. 60 and Rs. 4.5 for another group of 1500 workers. Find the mean and standard deviation for the entire group.
We have by data, $\bar{x}_1 = 75$, $\sigma_1 = 5$, $n_1 = 1000$
$\bar{x}_2 = 60$, $\sigma_2 = 4.5$, $n_2 = 1500$
Let $\bar{x}$ and $\sigma$ be the mean and SD of the entire group.
Consider $\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
i.e., $\bar{x} = \frac{1000 \times 75 + 1500 \times 60}{1000 + 1500} = 66$
Also we have,
$(n_1 + n_2) \sigma^2 = n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)$,
where $d_1 = \bar{x}_1 - \bar{x} = 75 - 66 = 9$; $d_2 = \bar{x}_2 - \bar{x} = 60 - 66 = -6$
$(1000 + 1500) \sigma^2 = 1000 (5^2 + 9^2) + 1500 (4.5^2 + (-6)^2)$
$\sigma^2 = 76.15$ or $\sigma = 8.73$
3. The arithmetic means of the runs scored by three batsmen are 50, 48 and 12 respectively. The SDs of their runs are 15, 12 and 2 respectively. Who is the most consistent of the three batsmen? If one of these three is to be selected, who is to be selected?

              A     B     C
AM (x̄)        50    48    12
SD (σ)        15    12    2

$CV_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{15}{50} \times 100 = 30\%$
$CV_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{12}{48} \times 100 = 25\%$
$CV_C = \frac{\sigma_C}{\bar{x}_C} \times 100 = \frac{2}{12} \times 100 = 16.67\%$
Evaluation Criteria
1. A lower CV indicates a more consistent player, and hence the most consistent player is Player C.
2. The highest run scorer is A, with $\bar{x}_A = 50$.
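4. The coefficients of variation of two series are 75% and 90%, with SDs 15 and 18 respectively. Compute their means.
$CV_A = 75\%$
$CV_B = 90\%$
$\sigma_A = 15$
$\sigma_B = 18$
$CV = \frac{\sigma}{\bar{x}} \times 100$
$75 = \frac{15}{\bar{x}_A} \times 100$, so $\bar{x}_A = 20$
$90 = \frac{18}{\bar{x}_B} \times 100$, so $\bar{x}_B = 20$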
5. Goals scored by two teams A and B in a football season are as shown below. By calculating the CV of each, find which team may be considered more consistent.

No. of goals   No. of matches (f)    fx                  fx²
(x)            A-team    B-team      Team A    Team B    Team A    Team B
0              27        17          0         0         0         0
1              9         9           9         9         9         9
2              8         6           16        12        32        24
3              5         5           15        15        45        45
4              4         3           16        12        64        48
Total          Σf = 53   Σf = 40     Σfx = 56  Σfx = 48  Σfx² = 150  Σfx² = 126

$\bar{x}_A = \frac{\sum fx}{N} = \frac{56}{53} = 1.0566$
$\bar{x}_B = \frac{\sum fx}{N} = \frac{48}{40} = 1.2$
$\sigma_A = \sqrt{\frac{\sum fx^2}{N} - (\bar{x}_A)^2} = \sqrt{\frac{150}{53} - (1.0566)^2} = \sqrt{1.714} = 1.309$
$\sigma_B = \sqrt{\frac{\sum fx^2}{N} - (\bar{x}_B)^2} = \sqrt{\frac{126}{40} - (1.2)^2} = \sqrt{1.71} = 1.308$
$CV_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{1.309}{1.0566} \times 100 = 123.9\%$
$CV_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{1.308}{1.2} \times 100 = 109\%$
Since $CV_B < CV_A$, team B is the more consistent team.
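The comparison of the two teams can be reproduced with a small helper for a discrete frequency distribution. The name `freq_cv` is an assumption for illustration; the data are the goal counts and match frequencies from the table in this problem.

```python
import math

def freq_cv(values, freqs):
    """CV (percent) of a discrete frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, values)) / n
    variance = sum(f * x * x for f, x in zip(freqs, values)) / n - mean ** 2
    return 100.0 * math.sqrt(variance) / mean

goals = [0, 1, 2, 3, 4]
cv_a = freq_cv(goals, [27, 9, 8, 5, 4])   # Team A, 53 matches
cv_b = freq_cv(goals, [17, 9, 6, 5, 3])   # Team B, 40 matches

more_consistent = "B" if cv_b < cv_a else "A"
print(round(cv_a, 1), round(cv_b, 1), more_consistent)
```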
6. The prices x and y of shares A and B respectively are given below. State which share is more stable in its value.

Price A (x)   x - x̄ (x̄ = 53)   (x - x̄)²   Price B (y)   y - ȳ (ȳ = 105)   (y - ȳ)²
55            2                4          108           3                 9
54            1                1          107           2                 4
52            -1               1          105           0                 0
53            0                0          105           0                 0
56            3                9          106           1                 1
58            5                25         107           2                 4
52            -1               1          104           -1                1
50            -3               9          103           -2                4
51            -2               4          104           -1                1
49            -4               16         101           -4                16
Σx = 530                       Σ(x - x̄)² = 70   Σy = 1050                Σ(y - ȳ)² = 40

$\bar{x}_A = \frac{\sum x}{N} = \frac{530}{10} = 53$
$\bar{x}_B = \frac{\sum y}{N} = \frac{1050}{10} = 105$
$\sigma_A = \sqrt{\frac{70}{10}} = 2.646$
$\sigma_B = \sqrt{\frac{40}{10}} = 2$
$CV_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{2.646}{53} \times 100 = 4.99\%$
$CV_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{2}{105} \times 100 = 1.905\%$
Since $CV_B$ is smaller, share B is more stable.
7. A student, while computing the coefficient of variation, obtained the mean and SD of 100 observations as 40 and 5.1 respectively. It was later discovered that he had wrongly copied an observation as 50 instead of 40. Calculate the correct coefficient of variation.
>> We have $\bar{x} = \frac{\sum x}{n}$, i.e. $40 = \frac{\sum x}{100}$
Incorrect $\sum x = 4000$
Now correct $\sum x = 4000 - 50 + 40 = 3990$
Correct $\bar{x} = \frac{3990}{100} = 39.9$
Let us consider $\sigma^2 = \frac{\sum x^2}{n} - (\bar{x})^2$
i.e. $(5.1)^2 = \frac{\sum x^2}{100} - (40)^2$, or $\frac{\sum x^2}{100} = (40)^2 + (5.1)^2 = 1626.01$
Incorrect $\sum x^2 = 100 \times 1626.01 = 162601$
Now correct $\sum x^2 = 162601 - (50)^2 + (40)^2 = 161701$
Correct $\sigma^2 = \frac{\text{correct } \sum x^2}{n} - (\text{correct } \bar{x})^2$
i.e., correct $\sigma^2 = \frac{161701}{100} - (39.9)^2 = 25$, so correct $\sigma = 5$
Now correct coefficient of variation $= \frac{\sigma}{\bar{x}} \times 100 = \frac{5}{39.9} \times 100 = 12.53\%$
Hence correct C.V. = 12.53%
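The correction procedure (rebuild Σx and Σx², swap the wrong value for the right one, recompute) can be sketched as below. The function name `correct_observation` is an assumption for illustration.

```python
import math

def correct_observation(n, mean, sd, wrong, right):
    """Return corrected (mean, sd) after replacing one miscopied value."""
    total = n * mean - wrong + right
    total_sq = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2
    new_mean = total / n
    new_sd = math.sqrt(total_sq / n - new_mean ** 2)
    return new_mean, new_sd

new_mean, new_sd = correct_observation(n=100, mean=40, sd=5.1, wrong=50, right=40)
new_cv = 100.0 * new_sd / new_mean
print(round(new_mean, 2), round(new_sd, 2), round(new_cv, 2))
```

The same idea handles an omitted observation: subtract the value from both sums and reduce n by one before recomputing.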
8. The mean and SD of 21 observations are 30 and 5 respectively. It was subsequently noted that one of the observations, 10, was incorrect. Omit it and determine the mean and SD of the rest.
>> We have $\bar{x} = \frac{\sum x}{n}$, i.e. $30 = \frac{\sum x}{21}$ or $\sum x = 630$
Incorrect $\sum x = 630$
Now omitting the incorrect value 10,
New $\sum x = 630 - 10 = 620$, $n = 21 - 1 = 20$
New $\bar{x} = \frac{620}{20} = 31$
Next consider $\sigma^2 = \frac{\sum x^2}{n} - (\bar{x})^2$
i.e. $(5)^2 = \frac{\sum x^2}{21} - (30)^2$, or $\frac{\sum x^2}{21} = 900 + 25 = 925$
Incorrect $\sum x^2 = 925 \times 21 = 19425$
Again omitting the incorrect value 10,
New $\sum x^2 = 19425 - (10)^2 = 19325$, $n = 20$
Hence new $\sigma^2 = \frac{\text{new } \sum x^2}{20} - (\text{new } \bar{x})^2 = \frac{19325}{20} - (31)^2 = 5.25$
New $\sigma = \sqrt{5.25} = 2.29$
9. The mean of 200 items was 50. Later on it was discovered that two items were misread as 92 and 8 instead of 192 and 88. Find the correct mean.
>> We have $\bar{x} = \frac{\sum x}{n}$, i.e. $50 = \frac{\sum x}{200}$ or $\sum x = 10000$
Incorrect $\sum x = 10000$
Correct $\sum x = 10000 - 92 - 8 + 192 + 88 = 10180$
Correct mean $= \frac{10180}{200} = 50.9$
10. Find the missing frequencies in the following data, given that the median is 137.2.

Class       100-110   110-120   120-130   130-140   140-150   150-160   160-170   170-180
Frequency   15        44        133       f1        125       f2        35        16        N = 600

>> We prepare the table with the column of cumulative frequencies and use the formula for the median.

Class      Frequency   cf
100-110    15          15
110-120    44          59
120-130    133         192
130-140    f1          192 + f1          (median class)
140-150    125         317 + f1
150-160    f2          317 + f1 + f2
160-170    35          352 + f1 + f2
170-180    16          368 + f1 + f2
           N = 600

Median $= l + \frac{h}{f} \left(\frac{N}{2} - c\right)$
We can take the median class as 130-140 since the median is given to be 137.2, so l = 130, h = 10, f = f1, c = 192.
$137.2 = 130 + \frac{10}{f_1} (300 - 192)$
i.e., $137.2 - 130 = \frac{1080}{f_1}$
i.e., $7.2 f_1 = 1080$ or $f_1 = 150$
But the last cumulative frequency must be equal to N = 600,
i.e. $368 + f_1 + f_2 = 600$
$368 + 150 + f_2 = 600$, so $f_2 = 82$
Thus f1 = 150, f2 = 82
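The two steps of this solution (solve the median formula for the median-class frequency, then use the total N for the other gap) can be sketched as follows. The function name and parameter names are assumptions for illustration; rounding to whole numbers assumes the frequencies are integers.

```python
def missing_frequencies(median, lower, width, cum_before, total, known_sum):
    """Solve Median = l + (h / f) * (N/2 - c) for f, then fill the second gap from N.

    lower      : lower limit l of the median class
    width      : class width h
    cum_before : cumulative frequency c before the median class
    known_sum  : sum of all frequencies that are given
    """
    f1 = width * (total / 2 - cum_before) / (median - lower)
    f2 = total - known_sum - f1
    return round(f1), round(f2)

f1, f2 = missing_frequencies(median=137.2, lower=130, width=10,
                             cum_before=192, total=600,
                             known_sum=15 + 44 + 133 + 125 + 35 + 16)
print(f1, f2)
```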
Relationship between various measures of dispersion
For a normal (or approximately normal) distribution, the following relationships hold among the various measures of dispersion:
1. Mean ± QD covers 50% of the observations of the distribution
2. Mean ± MD covers 57.5% of the observations
3. Mean ± 1σ includes 68.27% of the observations
4. Mean ± 2σ includes 95.45% of the observations
5. Mean ± 3σ includes 99.73% of the observations
6. $QD = 0.6745 \sigma \approx \frac{2}{3} \sigma$
7. $MD = \sqrt{\frac{2}{\pi}} \sigma \approx \frac{4}{5} \sigma$
8. $QD = \frac{5}{6} MD$
9. Combining the results, we get 3 QD = 2 SD and 5 MD = 4 SD, which is also equal to 6 QD.
10. Range = 6 times SD.
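These coverage figures can be checked numerically for a normal distribution using Python's standard `statistics.NormalDist`. This is only a sanity check under the assumption of exact normality; the variable names are illustrative.

```python
import math
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Proportion of a normal distribution lying within mean ± k*sigma."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

qd_over_sd = std_normal.inv_cdf(0.75)   # quartile deviation in units of sigma
md_over_sd = math.sqrt(2 / math.pi)     # mean deviation in units of sigma

for k in (1, 2, 3):
    print(k, round(100 * coverage(k), 2))
print(round(qd_over_sd, 4), round(md_over_sd, 4))
print(round(100 * coverage(qd_over_sd), 1), round(100 * coverage(md_over_sd), 1))
```

The ratios 2/3 and 4/5 quoted in the list are convenient approximations of these values, so relations such as 3 QD = 2 SD hold only approximately.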