Measure of Dispersion: Dispersion, Range,
Standard Deviation
Mr. Gaurav S. Patil
M. Pharm (Pharmaceutics)
gauravpatil2901@gmail.com
BP801T: Biostatistics and Research Methodology
Content:
• Dispersion
• Range
• Mean Deviation
• Variance and Standard Deviation
• Root Mean Square Deviation
• Coefficient of Variation
• Pharmaceutical Problems
2/29/2024 2
Gaurav S. Patil
Introduction
The measures of central tendency (mean, median and mode) are not
adequate to describe data.
Two data sets can have the same mean, but they can be entirely
different.
Thus to describe data, one needs to know the extent of variability.
This is given by the measures of dispersion.
In statistics, dispersion (also called variability, scatter, or spread) is the
extent to which a distribution is stretched or squeezed.
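The point that two data sets can share a mean and still differ can be sketched with a small example. The two lists below are hypothetical values chosen only for illustration; they are not taken from the slides.

```python
from statistics import mean

# Hypothetical data sets chosen only for illustration.
set_a = [48, 49, 50, 51, 52]
set_b = [10, 30, 50, 70, 90]

# Both sets have the same mean.
mean_a = mean(set_a)
mean_b = mean(set_b)

# The spread (here: maximum minus minimum) is very different.
spread_a = max(set_a) - min(set_a)
spread_b = max(set_b) - min(set_b)

print("means:", mean_a, mean_b)
print("spreads:", spread_a, spread_b)
```

The mean alone cannot tell these two sets apart; a measure of dispersion can.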
Importance of dispersion
Measures of dispersion are an important tool in biostatistical studies
because biological phenomena are more variable than physical and
chemical phenomena.
For example:
Individual variation is found in hemoglobin percentage, in RBC and WBC
counts, and even in the cure rate achieved with the same drug in
different patients of the same age and sex.
Objectives of measures of dispersion
• To judge the reliability of a measure of central tendency.
• To obtain a correct picture of the distribution or dispersion of values in the
series.
• To make a comparative study of variability of two or more series or
samples.
• To identify causes of variability in samples in order to exercise corrective
measures as in the case of body temperature, blood pressure and pulse
rate, etc.
• To use dispersion values for further statistical analysis.
Types of measures of dispersion
[Figure: Measures of dispersion]
Range
• Range is the difference between the highest and the lowest values among
the observations in a sample.
• The greater the range, the larger the variability in the data.
• Range is often used when the number of observations is small (<10).
• It is calculated with the following formula:
• Range = Maximum value − Minimum value
• Example: Among 20 observations of seed oil content in groundnut, the
highest value is 65% and the lowest is 25%. The range is 65 − 25 = 40
percentage points.
• Thus, it is a measure of the spread of values in a sample.
Range
• Problem: In a clinical trial assessing the effectiveness of a new pain
medication, researchers administer different doses to patients. The doses
administered (in mg) are 50, 75, 100, 125, and 150. Calculate the range of the
doses administered.
• Solution: Range = Maximum value − Minimum value
= 150 mg − 50 mg = 100 mg
The range of doses administered in the trial is 100 mg, which indicates the
spread of dosage levels among patients.
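The range calculation for the dose problem can be sketched in a few lines of Python. The function name `value_range` is a hypothetical helper introduced here, not something defined in the slides.

```python
def value_range(values):
    """Return the range: maximum value minus minimum value."""
    return max(values) - min(values)

# Doses (in mg) from the clinical trial problem.
doses_mg = [50, 75, 100, 125, 150]

dose_range = value_range(doses_mg)
print(dose_range)  # 100
```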
Range
• Advantages of Range:
• Simple Measure: The range is easy to calculate and understand, providing a
quick overview of data spread.
• Useful for Initial Assessment: It can be useful for initial screening of variability,
especially in large datasets or exploratory analyses.
• Disadvantages of Range:
• Limited Information: It offers no insight into how the values are distributed
between the two extremes, so important details may be missed.
• Not Suitable for Inference: Because it depends only on the two extreme values,
the range is sensitive to outliers and is a weak basis for inferences about the
population.
Coefficient of Range:
• Coefficient of Range: The relative measure of the range is known as the coefficient of range.
• Formula: Coefficient of Range = $\frac{H - L}{H + L}$, where H is the highest value and L is the lowest value.
• Example: In a pharmaceutical manufacturing process, the potency of a certain
medication is tested by measuring the amount of active ingredient per dosage unit. In a
batch of tablets, the highest measured potency is 95 mg per tablet, and the lowest
measured potency is 85 mg per tablet.
• Solution: H = 95 (highest potency); L = 85 (lowest potency)
• Coefficient of Range = $\frac{H - L}{H + L} = \frac{95 - 85}{95 + 85} = \frac{10}{180} \approx 0.0556$
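A minimal sketch of the coefficient of range for the tablet potency example follows. The helper name `coefficient_of_range` is an assumption made for illustration.

```python
def coefficient_of_range(values):
    """Return (H - L) / (H + L) for the highest (H) and lowest (L) values."""
    highest = max(values)
    lowest = min(values)
    return (highest - lowest) / (highest + lowest)

# Highest and lowest measured potency (mg per tablet) from the example.
potencies_mg = [95, 85]

coeff = coefficient_of_range(potencies_mg)
print(round(coeff, 4))  # 0.0556
```

Because the numerator and denominator carry the same unit, the result is unit free, which is what makes it usable for comparing batches measured on different scales.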
Mean Deviation (MD)
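• Mean Deviation: The mean deviation is the average of the absolute values of the
deviations from a chosen central value such as the mean, median, or mode.
• It is often calculated from the median because the mean deviation is smallest when
measured from the median.
• Formulas for the mean deviation about a central value A (the mean x̄ or the median):
• Ungrouped data: $\mathrm{MD} = \frac{\sum |x_i - A|}{n}$
• Grouped data: $\mathrm{MD} = \frac{\sum f_i |x_i - A|}{\sum f_i}$
• Where,
• n = number of observations in the data
• x_i = value of the i-th data point (the class mid-value for grouped data)
• f_i = frequency of the i-th value or class
• x̄ = mean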
Mean Deviation (MD)
• Merits of MD
• It utilizes all the observations;
• It is easy to understand and calculate; and
• It is not much affected by extreme values.
• Demerits of MD
• The signs of the deviations are ignored (negative deviations are simply made positive);
• It is not amenable to further algebraic treatment; and
• It cannot be calculated for open-end classes.
Mean Deviation (Examples)
• Example: Find the mean deviation for the data 1, 2, 3, 4, 5, 6, 7.
• Solution: First find the mean of the data.
• Mean (x̄) = (1 + 2 + 3 + 4 + 5 + 6 + 7)/7 = 28/7 = 4
• Then find the absolute deviations |xi − x̄|: |1 − 4| = 3; |2 − 4| = 2; |3 − 4| = 1;
|4 − 4| = 0; |5 − 4| = 1; |6 − 4| = 2; |7 − 4| = 3
• So, ∑|xi − x̄| = 3 + 2 + 1 + 0 + 1 + 2 + 3 = 12
• Therefore, mean deviation: $\mathrm{MD} = \frac{\sum |x_i - \bar{x}|}{n} = \frac{12}{7} \approx 1.71$
• So, the mean deviation for the given data is 1.71.
• Exercise: Find the mean deviation for the data 10, 15, 20, 25, 30.
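The worked example for 1, 2, ..., 7 can be checked with a short sketch. The function `mean_deviation` and its optional `center` argument are hypothetical names introduced here; by default the deviations are taken from the mean.

```python
from statistics import mean

def mean_deviation(values, center=None):
    """Average absolute deviation of the values from a central value.

    If no central value is given, the arithmetic mean is used.
    """
    if center is None:
        center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)

data = [1, 2, 3, 4, 5, 6, 7]
md = mean_deviation(data)
print(round(md, 2))  # 1.71
```

Passing `center=statistics.median(values)` instead would give the mean deviation from the median.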
Mean Deviation (Grouped Data)
• Example: Find the mean deviation for the given data (grouped data).

X    1    2    3    4    5    6    7
F    3    5    8    12   10   7    5

• Solution:

X    F    F·X    |x − x̄|    F|x − x̄|
1    3    3      3.24       9.72
2    5    10     2.24       11.20
3    8    24     1.24       9.92
4    12   48     0.24       2.88
5    10   50     0.76       7.60
6    7    42     1.76       12.32
7    5    35     2.76       13.80
∑    50   212               67.44

• Mean: $\bar{x} = \frac{\sum f x}{\sum f}$, where f = frequency and x = observation
• x̄ = 212/50 = 4.24
• $\mathrm{MD} = \frac{\sum f |x - \bar{x}|}{\sum f} = \frac{67.44}{50} = 1.3488 \approx 1.35$
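For the frequency table above, the weighted mean and the mean deviation can be recomputed with a small sketch. The helper names `grouped_mean` and `grouped_mean_deviation` are assumptions made for illustration.

```python
def grouped_mean(values, freqs):
    """Frequency-weighted mean: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)

def grouped_mean_deviation(values, freqs, center):
    """Frequency-weighted mean absolute deviation from a central value."""
    weighted = sum(f * abs(x - center) for x, f in zip(values, freqs))
    return weighted / sum(freqs)

# Values (X) and frequencies (F) from the example table.
x = [1, 2, 3, 4, 5, 6, 7]
f = [3, 5, 8, 12, 10, 7, 5]

x_bar = grouped_mean(x, f)                # 212 / 50
md = grouped_mean_deviation(x, f, x_bar)  # sum(f * |x - x_bar|) / 50

print(round(x_bar, 2), round(md, 4))
```

Summing the column F|x − x̄| by machine is a quick way to catch addition slips in a hand-built table.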
Mean Deviation from the Median (Grouped Data)
• Example: Find the mean deviation from the median for the given data.

X    0-10   10-20   20-30   30-40   40-50
F    3      5       7       9       4

• Solution:

Class    F    Mid value (Xi)   C.F.
0-10     3    5                3
10-20    5    15               8
20-30    7    25               15
30-40    9    35               24
40-50    4    45               28
∑        28

• We have N = 28, so N/2 = 14.
• The 14th observation falls in the class 20-30, which is therefore the
median class.
• Hence the median can be calculated from the data:
• L = 20 (lower limit of the median class); N/2 = 14; CF = 8 (cumulative frequency
of the preceding class); f = 7 (frequency of the median class); i = 10 (class width)
• $\text{Median} = L + \frac{\frac{N}{2} - CF}{f} \times i$
• Median = 20 + ((14 − 8)/7) × 10 ≈ 28.57
Mean Deviation from the Median (continued)
• The calculated median is 28.57.

Class    F    Mid value (Xi)   C.F.   |Xi − Median|   F|Xi − Median|
0-10     3    5                3      23.57           70.71
10-20    5    15               8      13.57           67.86
20-30    7    25               15     3.57            25.00
30-40    9    35               24     6.43            57.86
40-50    4    45               28     16.43           65.71
∑        28                                           287.14

(The products in the last column are computed from the unrounded median.)
• Formula for mean deviation from the median:
$\mathrm{MD} = \frac{\sum f |x_i - \text{median}|}{\sum f} = \frac{287.14}{28} \approx 10.255$
• The mean deviation of the given data from the median is 10.255.
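The grouped median formula and the mean deviation from the median can be sketched together. The function `grouped_median` is a hypothetical helper that assumes equal-width classes given by their lower limits; it is an illustration of the formula L + ((N/2 − CF)/f) × i, not a general-purpose routine.

```python
def grouped_median(lower_limits, width, freqs):
    """Median of grouped data: L + ((N/2 - CF) / f) * i.

    Assumes equal class widths and classes listed in ascending order.
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0  # CF: cumulative frequency of the preceding classes
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")

# Classes 0-10, 10-20, ..., 40-50 with their frequencies.
lower_limits = [0, 10, 20, 30, 40]
width = 10
freqs = [3, 5, 7, 9, 4]
mid_values = [lower + width / 2 for lower in lower_limits]

median = grouped_median(lower_limits, width, freqs)
md_median = sum(
    f * abs(x - median) for x, f in zip(mid_values, freqs)
) / sum(freqs)

print(round(median, 2), round(md_median, 3))  # 28.57 10.255
```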
Coefficient of mean deviation
• The coefficient of mean deviation is a relative measure of variability
or dispersion in a dataset.
• It is calculated by dividing the mean deviation by the average (mean or
median) from which the deviations were measured.
• Formula: Coefficient of Mean Deviation (CMD) = Mean Deviation / Mean
(or Median)
Mean Deviation (Exercise)
1. The marks of 7 students in Statistics are 16, 24, 13, 18, 15, 10, 23.
Find the mean deviation from the mean.
2. Find the mean deviation for the given data:

X    0-10   10-20   20-30   30-40   40-50
F    5      8       15      16      6

Hint: The first exercise uses ungrouped data and the second uses grouped
data.
Variance
• Variance (σ²), or (standard deviation)²:
• Variance is an absolute measure of dispersion.
• The square of the standard deviation is known as the variance.
• Variance is the average of the squared deviations of the values from
their mean.
• It has a significant role in inferential statistics.
• Formula: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{N}$
• Where, σ² = variance
• x_i = value of the i-th element
• x̄ = mean
• N = number of observations
Variance
• Properties of variance:
• Variance can never be negative.
• All observations are considered.
• Its unit is the square of the unit of the data.
• Merits of variance:
• It is rigidly defined;
• It utilizes all the observations;
• Squaring is a better technique to get rid of negative deviations;
• It is the most popular measure of dispersion.
Variance
• Demerits of variance:
• It is not unit free;
• Although easy to understand, calculation may require a calculator or a
computer; and
• Its unit is square of the unit of the variable due to which it is difficult to
judge the magnitude of dispersion compared to standard deviation.
Variance (Examples)
• Example: Find the variance of the data 1, 2, 3, 4, 5, 6, 7.
• Solution: Formula for variance: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{N}$
• First find the mean of the data:
• Mean (x̄) = (1 + 2 + 3 + 4 + 5 + 6 + 7)/7 = 28/7 = 4
• Now find the deviations (xi − x̄): 1 − 4 = −3; 2 − 4 = −2; 3 − 4 = −1; 4 − 4 = 0; 5 − 4 = 1; 6 − 4 = 2; 7 − 4 = 3
• ∑(xi − x̄)² = (−3)² + (−2)² + (−1)² + (0)² + (1)² + (2)² + (3)² = 9 + 4 + 1 + 0 + 1 + 4 + 9 = 28
• So, ∑(xi − x̄)² = 28
• Variance: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{N} = \frac{28}{7} = 4$
• So, the variance of the given data is 4.
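The variance example can be reproduced with a short sketch. The function `population_variance` is a hypothetical helper that divides by N, as the slides do; Python's standard library offers the same calculation as `statistics.pvariance`, while `statistics.variance` divides by N − 1 and would give a different result.

```python
from statistics import pvariance

def population_variance(values):
    """Average of squared deviations from the mean (divides by N)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / n

data = [1, 2, 3, 4, 5, 6, 7]

var_manual = population_variance(data)  # 28 / 7
var_stdlib = pvariance(data)            # same definition, from the stdlib

print(var_manual, var_stdlib)
```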
Coefficient of Variation (CV)
• The relative measure of dispersion based on the standard deviation is known as
the coefficient of variation.
• The credit for the coefficient of variation goes to Karl Pearson.
• C.V. is not expressed in the units of the data but as a percentage.
• C.V. should be used only for variables that take non-negative values; it
cannot be calculated when the mean is zero.
• Formula: $\mathrm{CV} = \frac{\sigma}{\bar{x}} \times 100$
• Where σ = standard deviation and x̄ = mean
Coefficient of Variation (Example)
• Example: The per-kg prices (in Rs) of a pharmaceutical powder in Pune in different
months were as follows. Calculate the variance and C.V.

Month    Price (x)   (x − x̄)²
Jan      60          81
Feb      50          1
March    35          256
April    45          36
May      48          9
June     55          16
July     64          169
∑        357         568

• Mean (x̄) = Σx/n = 357/7 = Rs 51
• Variance: σ² = Σ(x − x̄)²/n = 568/7 ≈ 81.14
• σ = √81.14 ≈ 9.008
• $\mathrm{CV} = \frac{\sigma}{\bar{x}} \times 100 = \frac{9.008}{51} \times 100 \approx 17.66\%$
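The price example can be verified with a brief sketch. It assumes the April price is 45, the value used in the worked table (the one consistent with Σx = 357); `coefficient_of_variation` is a hypothetical helper name, and `statistics.pstdev` divides by N as the slides do.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    return pstdev(values) / mean(values) * 100

# Monthly prices (Rs per kg), Jan to July; April taken as 45 (assumption).
prices = [60, 50, 35, 45, 48, 55, 64]

x_bar = mean(prices)                   # 357 / 7 = 51
sigma = pstdev(prices)                 # sqrt(568 / 7)
cv = coefficient_of_variation(prices)  # sigma / x_bar * 100

print(x_bar, round(sigma, 2), round(cv, 2))
```

Rounding σ to 9 before dividing shifts the second decimal of the C.V., so it is safer to round only the final result.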
Standard Deviation (S.D.)
• Standard deviation (σ) is defined as the positive square root of the variance.
• It was introduced by Karl Pearson and is a widely used measure of dispersion.
• SD can be calculated for individual series, discrete series, and
continuous series.
• Formula for σ (ungrouped data): $\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$
• Formula for σ (grouped data, when a frequency distribution is given):
$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}}$
• Where x_i = i-th observation, x̄ = mean, and N = Σf_i = number of observations.
Standard Deviation (S.D.)
• Example for S.D. (ungrouped data): Find the S.D. of the data 3, 6, 9, 12, 15.
• Formulas: variance $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{N}$ and standard deviation $\sigma = \sqrt{\sigma^2}$
• Step I: Calculate the mean: x̄ = (3 + 6 + 9 + 12 + 15)/5 = 45/5 = 9
• Step II: Calculate the squared differences from the mean, (xi − x̄)²:
(3 − 9)² = 36, (6 − 9)² = 9, (9 − 9)² = 0, (12 − 9)² = 9, (15 − 9)² = 36
• Step III: Sum the squared differences: ∑(xi − x̄)² = 36 + 9 + 0 + 9 + 36 = 90
• Step IV: Calculate the variance: σ² = 90/5 = 18
• Step V: S.D. is the square root of the variance: σ = √18 ≈ 4.24
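The steps for the ungrouped example can be sketched as follows. The function `population_sd` is a hypothetical helper that mirrors the slide's steps and divides by N; the standard library's `statistics.pstdev` uses the same definition and serves as a cross-check.

```python
import math
from statistics import pstdev

def population_sd(values):
    """Square root of the average squared deviation from the mean."""
    n = len(values)
    x_bar = sum(values) / n                               # Step I
    squared_diffs = [(x - x_bar) ** 2 for x in values]    # Step II
    variance = sum(squared_diffs) / n                     # Steps III and IV
    return math.sqrt(variance)                            # Step V

data = [3, 6, 9, 12, 15]
sd = population_sd(data)

print(round(sd, 2))            # 4.24
print(round(pstdev(data), 2))  # 4.24
```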
Standard Deviation (S.D.)
• Example: Calculate the SD for the given grouped data:

Class Interval   10-20   20-30   30-40   40-50   50-60
Frequency        5       8       12      7       3

• Solution:
• Formula: $\sigma = \sqrt{\frac{\sum f (x_i - \bar{x})^2}{N}}$, where N = Σf

Class Interval   Mid Point (x)   Frequency (f)   f·x     f(x − x̄)²
10-20            15              5               75      1724.49
20-30            25              8               200     587.76
30-40            35              12              420     24.49
40-50            45              7               315     914.29
50-60            55              3               165     1377.55
∑                                35              1175    ≈ 4628.57

Standard Deviation (S.D.)
• Step I: Calculate the mid value (x) of each class.
• Step II: Take the product of frequency and mid value (f·x) for each class.
• Step III: Calculate the mean: x̄ = Σfx/Σf = 1175/35 ≈ 33.57
• Step IV: Find Σf(x − x̄)², as shown in the fifth column (entries are computed with
the unrounded mean).
• Step V: Calculate the variance: σ² = Σf(x − x̄)²/Σf = 4628.57/35 ≈ 132.24
• S.D. is the square root of the variance: σ = √132.24 ≈ 11.50
• Now, C.V.: $\mathrm{CV} = \frac{\sigma}{\bar{x}} \times 100 = \frac{11.50}{33.57} \times 100 \approx 34.3\%$
• For the given example, the S.D. is 11.50 and the C.V. is about 34.3%.
Coefficient of Standard Deviation
• The relative measure of SD is known as the coefficient of standard deviation.
• Formula: $C_v = \frac{\text{Standard Deviation}}{\text{Mean}} = \frac{\sigma}{\bar{x}}$
• Where C_v = coefficient of standard deviation, σ = standard deviation and x̄ = mean
• For the example above, $C_v = \frac{11.50}{33.57} \approx 0.343$
• For the given example, the coefficient of S.D. is 0.343.
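The grouped-data calculation (classes 10-20 to 50-60 with frequencies 5, 8, 12, 7, 3) can be recomputed with a short sketch. The function `grouped_stats` is a hypothetical helper; it uses class mid-points and divides by N = Σf, following the formula on the slide.

```python
import math

def grouped_stats(mid_points, freqs):
    """Return (mean, variance, standard deviation) for grouped data."""
    n = sum(freqs)  # N = sum of frequencies
    x_bar = sum(f * x for x, f in zip(mid_points, freqs)) / n
    variance = sum(
        f * (x - x_bar) ** 2 for x, f in zip(mid_points, freqs)
    ) / n
    return x_bar, variance, math.sqrt(variance)

# Mid-points and frequencies of the classes 10-20, 20-30, ..., 50-60.
mid_points = [15, 25, 35, 45, 55]
freqs = [5, 8, 12, 7, 3]

x_bar, variance, sd = grouped_stats(mid_points, freqs)
coeff_sd = sd / x_bar        # coefficient of standard deviation
cv_percent = coeff_sd * 100  # coefficient of variation in percent

print(round(x_bar, 2), round(variance, 2), round(sd, 2))
print(round(coeff_sd, 3), round(cv_percent, 1))
```

Note that the divisor is the total frequency (35), not the number of classes (5); confusing the two is an easy slip when working from a table.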
