Slide 1
Lecture by
Dr Zahid Khan
King Faisal University, KSA.
Measures of dispersion
Slide 2
Learning Objectives
• Calculate common measures of dispersion from grouped and ungrouped data (including the range, interquartile range, mean deviation, and standard deviation)
• Calculate and interpret the coefficient of variation
Slide 3
What are measures of dispersion? (Definition)
Measures of central tendency do not reveal the variability present in the data.
Dispersion is the scatter of a data series around its average.
Dispersion is the extent to which values in a distribution differ from the average of the distribution.
Slide 4
Why do we need measures of dispersion? (Significance)
• To determine the reliability of an average
• To serve as a basis for the control of variability
• To compare the variability of two or more series
• To facilitate the use of other statistical measures
Slide 5
Dispersion Example
Number of minutes 20 clients waited to see a consulting doctor

Consultant Doctor
     X          Y
  05   15    15   16
  12   03    12   18
  04   19    15   14
  37   11    13   17
  06   34    11   15

• X: Mean waiting time 14.6 minutes
• Y: Mean waiting time 14.6 minutes
• What is the difference in the two series?
X: High variability, less consistency.
Y: Low variability, more consistency.
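As a rough check of the example above, the sketch below recomputes the mean and the range for each doctor from the waiting times listed on the slide. The variable and function names are illustrative assumptions, not part of the lecture.

```python
from statistics import mean

# Waiting times in minutes, as listed on the slide (10 clients per doctor).
doctor_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
doctor_y = [15, 16, 12, 18, 15, 14, 13, 17, 11, 15]


def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


for name, times in (("X", doctor_x), ("Y", doctor_y)):
    print(f"Doctor {name}: mean = {mean(times):.1f} min, "
          f"range = {value_range(times)} min")
# Doctor X: mean = 14.6 min, range = 34 min
# Doctor Y: mean = 14.6 min, range = 7 min
```

Both series share the same mean, yet the range of X is almost five times that of Y, which is the difference the slide points to.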
Slide 6
Characteristics of an Ideal Measure of Dispersion
1. It should be rigidly defined.
2. It should be easy to understand and easy to calculate.
3. It should be based on all the observations of the data.
4. It should be amenable to further mathematical treatment.
5. It should be least affected by sampling fluctuations.
6. It should not be unduly affected by extreme values.
Slide 7
Measures of Dispersion
• There are three main measures of dispersion:
– The range
– The Interquartile range (IQR)
– Variance / standard deviation
Slide 8
Measures of dispersion
• Range
The range is defined as the difference between the largest score and the smallest score in the set of data, XL - XS.
– It is sensitive to extreme scores.
– To compensate, calculate the interquartile range (the distance between the 25th and 75th percentile points), which represents the range of scores for the middle half of a distribution.
The range is usually used in combination with other measures of dispersion.
Slide 9
Range
Source: www.animatedsoftware.com/statglos/sgrange.htm
Slide 10
Interquartile range
• Measures the range of the middle 50% of the values only
• Is defined as the difference between the upper and lower quartiles
Interquartile range = upper quartile - lower quartile = Q3 - Q1
Slide 11
The Semi-Interquartile Range
• The semi-interquartile range (or SIR) is defined as the difference between the third and first quartiles, divided by two
– The first quartile is the 25th percentile
– The third quartile is the 75th percentile
• SIR = (Q3 - Q1) / 2
Slide 12
SIR Example
• What is the SIR for the data below?
Data: 2, 4, 6, 8, 10, 12, 14, 20, 30, 60
• 25% of the scores are below 5
– 5 is the first quartile (25th percentile), lying between 4 and 6
• 25% of the scores are above 25
– 25 is the third quartile (75th percentile), lying between 20 and 30
• SIR = (Q3 - Q1) / 2 = (25 - 5) / 2 = 10
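The arithmetic of the SIR example can be sketched as follows. The quartiles are taken directly from the slide (Q1 = 5, Q3 = 25) rather than recomputed, because textbooks and software use different quartile conventions and may give slightly different values for the same data. The function names are illustrative assumptions.

```python
def interquartile_range(q1, q3):
    """IQR = Q3 - Q1, the spread of the middle 50% of the values."""
    return q3 - q1


def semi_interquartile_range(q1, q3):
    """SIR = (Q3 - Q1) / 2."""
    return interquartile_range(q1, q3) / 2


# Quartiles as read off the slide (assumed, not recomputed here).
q1, q3 = 5, 25
print(interquartile_range(q1, q3))       # 20
print(semi_interquartile_range(q1, q3))  # 10.0
```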
Slide 13
When To Use the SIR
• The SIR is often used with skewed data as it is insensitive to extreme scores
Slide 14
The mean deviation
• Measures the ‘average’ distance of each observation from the mean of the data
• Gives equal weight to each observation
• Generally more sensitive than the range or interquartile range, since a change in any value will affect it
Slide 15
Actual and absolute deviations from mean
A set of x values has a mean of $\bar{x}$.
• The residual of a particular x-value is:
Residual or deviation = $x - \bar{x}$
• The absolute deviation is:
$|x - \bar{x}|$
Slide 16
Mean deviation
• The mean of the absolute deviations:
$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$
Slide 17
To calculate mean deviation
1. Calculate the mean of the data: find $\bar{x}$.
2. Subtract the mean from each observation and record the differences: for each x, find $x - \bar{x}$.
3. Record the absolute value of each residual: find $|x - \bar{x}|$ for each x.
4. Calculate the mean of the absolute values: add up the absolute values and divide by n.
$\text{Mean deviation} = \frac{\sum |x - \bar{x}|}{n}$
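The four steps above can be sketched in Python as follows. The function name is an illustrative assumption, and the data are the waiting times from the Dispersion Example slide, reused here only to show the calculation.

```python
from statistics import mean


def mean_deviation(values):
    """Mean of the absolute deviations from the mean."""
    x_bar = mean(values)                         # Step 1: mean of the data
    residuals = [x - x_bar for x in values]      # Step 2: x - x_bar for each x
    absolute = [abs(r) for r in residuals]       # Step 3: |x - x_bar|
    return sum(absolute) / len(values)           # Step 4: sum and divide by n


# Waiting times in minutes from the Dispersion Example slide.
doctor_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
doctor_y = [15, 16, 12, 18, 15, 14, 13, 17, 11, 15]

print(round(mean_deviation(doctor_x), 2))  # 9.32
print(round(mean_deviation(doctor_y), 2))  # 1.68
```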
Slide 18
The standard deviation
• Measures the variation of observations from the mean
• The most common measure of dispersion
• Takes into account every observation
• Measures the ‘average deviation’ of observations from the mean
• Works with squares of residuals rather than absolute values, which makes it easier to use in further calculations
Slide 19
Standard deviation of a population σ
• Every observation in the population is used.
• The square of the population standard deviation is called the variance.
$\text{Standard deviation} = \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
$\text{Variance} = \sigma^2$
Slide 20
Standard deviation of a sample s
• In practice, most populations are very large and it is more common to calculate the sample standard deviation.
$\text{Sample standard deviation} = s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• Where: n is the number of observations in the sample, so the divisor is (n - 1)
Slide 21
To calculate standard deviation
1. Calculate the mean: $\bar{x}$
2. Calculate the residual for each x: $x - \bar{x}$
3. Square the residuals: $(x - \bar{x})^2$
4. Calculate the sum of the squares: $\sum (x - \bar{x})^2$
5. Divide the sum in Step 4 by (n - 1): $\frac{\sum (x - \bar{x})^2}{n - 1}$
6. Take the square root of the quantity in Step 5: $\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
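A minimal sketch of the six steps, written out one line per step and then compared against `statistics.stdev` from the Python standard library, which also uses the (n - 1) divisor. The function name is an illustrative assumption, and the data are the waiting times from the Dispersion Example slide.

```python
import math
from statistics import mean, stdev


def sample_standard_deviation(values):
    """Sample standard deviation, following the six steps on the slide."""
    x_bar = mean(values)                       # Step 1: mean
    residuals = [x - x_bar for x in values]    # Step 2: residuals
    squares = [r ** 2 for r in residuals]      # Step 3: squared residuals
    total = sum(squares)                       # Step 4: sum of squares
    variance = total / (len(values) - 1)       # Step 5: divide by (n - 1)
    return math.sqrt(variance)                 # Step 6: square root


# Waiting times in minutes from the Dispersion Example slide.
doctor_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
doctor_y = [15, 16, 12, 18, 15, 14, 13, 17, 11, 15]

for name, times in (("X", doctor_x), ("Y", doctor_y)):
    by_hand = sample_standard_deviation(times)
    library = stdev(times)
    print(f"Doctor {name}: s = {by_hand:.2f} (library: {library:.2f})")
```

For these data the result is roughly 12.16 minutes for X and 2.17 minutes for Y, so the standard deviation separates the two series even though their means are identical.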
Slide 22
Uses of the standard deviation
– The standard deviation enables us to determine, with a great deal of accuracy, where the values of a frequency distribution are located in relation to the mean. We can do this according to a theorem devised by the Russian mathematician P.L. Chebyshev (1821-1894).
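Chebyshev's theorem states that, for any k > 1, at least 1 - 1/k² of the values lie within k standard deviations of the mean. The sketch below compares that lower bound with the observed fraction for the Doctor X waiting times from the Dispersion Example slide. It uses the population standard deviation (`statistics.pstdev`) so that the bound holds exactly for the listed values. The function names are illustrative assumptions.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def fraction_within(values, k):
    """Observed fraction of values within k population SDs of the mean."""
    centre = mean(values)
    spread = pstdev(values)
    inside = [x for x in values if abs(x - centre) <= k * spread]
    return len(inside) / len(values)


# Waiting times in minutes for Doctor X from the Dispersion Example slide.
doctor_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]

for k in (1.5, 2):
    print(f"k = {k}: bound = {chebyshev_lower_bound(k):.3f}, "
          f"observed = {fraction_within(doctor_x, k):.1f}")
# k = 1.5: bound = 0.556, observed = 0.8
# k = 2: bound = 0.750, observed = 1.0
```

The bound is deliberately conservative: it holds for any distribution, so the observed fraction is usually well above it.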
Slide 23
Coefficient of variation
• Is a measure of relative variability used to:
– measure changes that have occurred in a population over time
– compare variability of two populations that are expressed in different units of measurement
• Is expressed as a percentage rather than in terms of the units of the particular data
Slide 24
Formula for coefficient of variation
• Denoted by V
$V = \frac{s}{\bar{x}} \times 100\%$
where $\bar{x}$ = the mean of the sample
s = the standard deviation of the sample
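As an illustration of the formula, the sketch below computes V for the two waiting-time series from the Dispersion Example slide. Both series are in minutes, so the example shows only the calculation, not a comparison across different units. The function name is an illustrative assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """V = (s / x_bar) * 100, using the sample standard deviation s."""
    return stdev(values) / mean(values) * 100


# Waiting times in minutes from the Dispersion Example slide.
doctor_x = [5, 15, 12, 3, 4, 19, 37, 11, 6, 34]
doctor_y = [15, 16, 12, 18, 15, 14, 13, 17, 11, 15]

print(f"V for X: {coefficient_of_variation(doctor_x):.1f}%")
print(f"V for Y: {coefficient_of_variation(doctor_y):.1f}%")
```

V comes out at roughly 83% for X and 15% for Y: relative to the same mean, X is far more variable.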
Slide 25
Summary
• Measures of dispersion
– no ideal measure of dispersion exists
• The standard deviation is the most important measure of dispersion
– it is the most frequently used
– its value is affected by every observation in the data
– extreme values in the data may distort it
Slide 26
REFERENCES
1. Mathematical Statistics, S.P. Gupta
2. Statistics for Management, Richard I. Levin, David S. Rubin
3. Biostatistics: A Foundation for Analysis in the Health Sciences
Slide 27
THANK YOU
