Chapter three
Numerical Descriptive Measures
 Objectives
 Describe data using measures of central tendency, such as
the mean, median, mode, and midrange.
 Summarize data using measures of variation, such as the
range, variance, and standard deviation.
 Determine the position of a data value in a data set using
various measures of position, such as percentiles, deciles and
quartiles.
A. Measures of central tendency
• A measure of central tendency is a single value that locates the
center of a histogram or a frequency distribution curve.
• The main measures are the mean, the median, and the mode, each
defined for both ungrouped and grouped data sets.
The mean
◦ The most commonly used measure of central tendency is the
mean, also known as the arithmetic average.
• It is calculated by adding all the values in the group and then
dividing by the number of values.
• It summarizes the essential features of a data set and enables
comparison between data sets.

• The mean is the sum of the values divided by the number of
values. The mean of a set of numbers $x_1, x_2, \dots, x_n$ is
typically denoted by $\bar{x}$.
• This is the arithmetic mean, the "standard" average, often simply
called the "mean".
• For ungrouped data, the mean is obtained by dividing the sum of
all values by the number of values in the data set.
The Mean for Ungrouped Data
• Mean for population data: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
• Mean for sample data: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
• Example 1: Find the mean score of 10 students in a midterm exam
in a class if their scores are
25 27 30 23 16 27 29 14 20 28
$\mu = \frac{25 + 27 + 30 + 23 + 16 + 27 + 29 + 14 + 20 + 28}{10} = \frac{239}{10} = 23.9$
• Example 2: Suppose we take a sample of 4 students from the class
in Example 1 and find their scores to be 23, 27, 16, and 29.
Find the mean of these scores.
$\bar{x} = \frac{23 + 27 + 16 + 29}{4} = \frac{95}{4} = 23.75$
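The two ungrouped examples can be checked with a minimal Python sketch. It uses `statistics.mean` from the standard library; the variable names are illustrative and not part of the slides.

```python
from statistics import mean

# Midterm scores of the 10 students (the whole class, i.e. the population).
scores = [25, 27, 30, 23, 16, 27, 29, 14, 20, 28]
# Scores of the 4 sampled students.
sample = [23, 27, 16, 29]

# In both cases the mean is the sum of the values divided by their count.
population_mean = mean(scores)
sample_mean = mean(sample)

print(population_mean, sample_mean)
```

The population and sample formulas differ only in notation ($\mu$ and N versus $\bar{x}$ and n); the arithmetic is the same.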
ii. Weighted Mean
• If $x_1, x_2, \dots, x_n$ are the values of the items and
$w_1, w_2, \dots, w_n$ are the corresponding weights, then the
weighted mean $\bar{x}_w$ is given by
$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{w_1 x_1 + w_2 x_2 + \dots + w_n x_n}{w_1 + w_2 + \dots + w_n}$
Example: A student's final marks in Mathematics, Physics, Chemistry and Biology are A,
B, D and C respectively. If the respective credits for these courses are 4, 4, 3
and 2, determine the approximate average mark the student has obtained.
Solution: Taking A = 4, B = 3, C = 2 and D = 1 as grade points,
$x_i$       4    3    1    2
$w_i$       4    4    3    2
$x_i w_i$   16   12   3    4
$\bar{x}_w = \frac{16 + 12 + 3 + 4}{4 + 4 + 3 + 2} = \frac{35}{13} \approx 2.69$
That is, the average mark of the student is 2.69.
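A minimal sketch of the weighted mean in Python, applied to the grade example. The function name `weighted_mean` and the 4-point grade scale (A = 4, B = 3, C = 2, D = 1) are assumptions read off the table of $x_i$ values.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

grade_points = [4, 3, 1, 2]  # A, B, D, C on the assumed 4-point scale
credits = [4, 4, 3, 2]       # course credits used as weights

result = weighted_mean(grade_points, credits)
print(round(result, 2))
```

With equal weights the function reduces to the ordinary arithmetic mean.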
iii. Combined mean
• When a set of observations is divided into k groups, where
$\bar{x}_1$ is the mean of the $n_1$ observations in group 1,
$\bar{x}_2$ is the mean of the $n_2$ observations in group 2, …, and
$\bar{x}_k$ is the mean of the $n_k$ observations in group k, the
combined mean $\bar{x}_c$ of all observations taken together is given by
$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \dots + n_k \bar{x}_k}{n_1 + n_2 + \dots + n_k}$
Example: There are two classes, Class A and Class B. Class A has 30 students
with an average score of 70 on a test. Class B has 20 students with an average
score of 80. What is the combined average score for both classes?
Solution:
$\bar{x}_c = \frac{30(70) + 20(80)}{30 + 20} = \frac{3700}{50} = 74$
The combined mean for all 50 students is 74.
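The combined mean is a weighted mean of the group means, with the group sizes as weights. A small Python sketch under that reading (the function name `combined_mean` is illustrative):

```python
def combined_mean(group_means, group_sizes):
    """Combine group means, weighting each by its group size."""
    if len(group_means) != len(group_sizes):
        raise ValueError("need one size per group mean")
    total = sum(group_sizes)
    if total == 0:
        raise ValueError("group sizes must not sum to zero")
    return sum(n * m for m, n in zip(group_means, group_sizes)) / total

# Class A: 30 students with mean 70; Class B: 20 students with mean 80.
overall = combined_mean([70, 80], [30, 20])
print(overall)
```

Simply averaging the two class means would give 75, which ignores that Class A is larger.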
Note:
If a constant c is added to or subtracted from every value in the data set, the
mean increases or decreases by that constant:
• New Mean = Old Mean + c, when c is added;
• New Mean = Old Mean - c, when c is subtracted.
If each value in the data set is multiplied by a constant k, the mean is also
multiplied by k: New Mean = k × Old Mean.
Question 1: If the mean of a data set is 50, what will the new mean be if a constant
value of 5 is added to every value in the data set?
Solution: Given mean = 50 and constant = 5, New mean = 50 + 5 = 55.
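These two properties can be verified numerically. The sketch below uses a small made-up data set whose mean is 50; the values are illustrative only and do not come from the slides.

```python
from statistics import mean

data = [48, 50, 52]  # hypothetical values chosen so that the mean is 50
c, k = 5, 3          # constant to add and constant to multiply by

old_mean = mean(data)
shifted_mean = mean(x + c for x in data)  # every value increased by c
scaled_mean = mean(x * k for x in data)   # every value multiplied by k

print(old_mean, shifted_mean, scaled_mean)
```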
The midrange
The midrange (MR) is defined as the sum of the lowest and highest
values in the data set divided by 2.
$MR = \frac{\text{Lowest value} + \text{Highest value}}{2}$
Example: Find the midrange (MR) for the following data:
11, 13, 20, 30, 9, 4, 15
Solution: The lowest value is 4 and the highest value is 30, so
$MR = \frac{4 + 30}{2} = \frac{34}{2} = 17$
Note that the midrange is a weak measure of central tendency, since
it depends on only two of the values in the data set.
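A short Python sketch of the midrange, which also illustrates why it is considered weak: appending one extreme value (here a hypothetical 100, not in the original data) moves it a long way.

```python
def midrange(data):
    """Return (lowest value + highest value) / 2."""
    if not data:
        raise ValueError("data must not be empty")
    return (min(data) + max(data)) / 2

data = [11, 13, 20, 30, 9, 4, 15]
mr = midrange(data)

# One added extreme value changes the result, because only the
# minimum and maximum enter the calculation.
mr_with_outlier = midrange(data + [100])

print(mr, mr_with_outlier)
```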
Mean for Grouped data
• If data are given in the form of a continuous frequency distribution
with k classes, the sample mean can be computed as
$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} = \frac{x_1 f_1 + x_2 f_2 + \dots + x_k f_k}{f_1 + f_2 + \dots + f_k}$
where $x_i$ is the midpoint of class i, $f_i$ is its frequency, and
$x_i f_i$ is the product of midpoint and frequency.
Example: Find the mean of the following frequency distribution.
Class boundary      60-62  62-64  64-66  66-68  68-70  70-72  Total
Frequency ($f_i$)   5      18     42     20     8      7      100
$x_i$               61     63     65     67     69     71
$x_i f_i$           305    1134   2730   1340   552    497    6558
Solution:
$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i} = \frac{6558}{100} = 65.58$
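The grouped-mean calculation can be reproduced with a few lines of Python. This is a sketch using the class boundaries and frequencies from the table; the variable names are illustrative.

```python
boundaries = [(60, 62), (62, 64), (64, 66), (66, 68), (68, 70), (70, 72)]
frequencies = [5, 18, 42, 20, 8, 7]

# Each class is represented by its midpoint x_i.
midpoints = [(low + high) / 2 for low, high in boundaries]

# Sum of x_i * f_i divided by the sum of f_i.
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))
n = sum(frequencies)
grouped_mean = sum_xf / n

print(sum_xf, n, grouped_mean)
```

Because every observation in a class is replaced by the class midpoint, the result is an approximation of the mean of the underlying raw data.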
 Median
• It is the value of the middle item of a series when the items are
arranged in ascending or descending order of value.
• It divides the series into two halves.
• It is a positional average.
• When n is odd:
$\text{Median} = \left(\frac{n + 1}{2}\right)^{\text{th}} \text{ value}$
03/08/2025 By: Menberu T.
• Example: Find the median for the data set:
312, 257, 421, 289, 526, 374, 497
• Solution: First, rank the data in increasing order:
x1   x2   x3   x4   x5   x6   x7
257  289  312  374  421  497  526
• Since there are 7 values in this data set, the median is the
$\left(\frac{7 + 1}{2}\right)^{\text{th}}$ = 4th item in the ranked data. Therefore
Median = 4th item = 374
Median for an even number of observations
• Step 1: Arrange the data in either ascending or descending
order.
• Step 2: If the number of observations n is even, identify the
(n/2)th and [(n/2) + 1]th observations.
• Step 3: The average of the two observations identified in
Step 2 is the median of the given data.
• Example: Find the median for the data set:
8, 12, 7, 17, 14, 45, 10, 13, 17, 13, 9, 11
• Solution: First, we rank the data in increasing order:
x1  x2  x3  x4  x5  x6  x7  x8  x9  x10  x11  x12
7   8   9   10  11  12  13  13  14  17   17   45
• Since there are 12 values in this data set, the median is the
average of the two middle values, whose ranks are 12/2 = 6 and
12/2 + 1 = 7:
Median = (x6 + x7)/2 = (12 + 13)/2 = 12.5
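The odd and even cases can be combined in one small Python function that follows the ranking steps described above. The name `median_by_rank` is illustrative; the result is cross-checked against `statistics.median` from the standard library.

```python
from statistics import median

def median_by_rank(data):
    """Median via ranking: middle value (odd n) or mean of the two middle values (even n)."""
    ranked = sorted(data)
    n = len(ranked)
    if n == 0:
        raise ValueError("data must not be empty")
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th value; subtract 1 for 0-based indexing.
        return ranked[(n + 1) // 2 - 1]
    # Even n: average of the (n/2)th and (n/2 + 1)th values.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2

odd_data = [312, 257, 421, 289, 526, 374, 497]
even_data = [8, 12, 7, 17, 14, 45, 10, 13, 17, 13, 9, 11]

odd_median = median_by_rank(odd_data)
even_median = median_by_rank(even_data)

print(odd_median, even_median)
print(median(odd_data), median(even_data))  # library cross-check
```

The value 45 in the second data set is far from the rest, yet it does not pull the median upward the way it would pull the mean.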
Median for grouped data
• For grouped data, the median is obtained by the following formula:
$\text{Median} = L + \left(\frac{\frac{n}{2} - cf}{f}\right) h$
• where L = lower limit of the median class
• n = number of observations
• f = frequency of the median class
• cf = cumulative frequency of the class preceding the median class
• h = class width
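Example: The water percentage in the body of a species of fish is given below.
Calculate the median.
Class interval   15-24  25-34  35-44  45-54  55-64  Total
Frequency        7      17     16     6      4      50
Solution: Construct the less-than cumulative frequency distribution:
Class interval     15-24  25-34  35-44  45-54  55-64  Total
Frequency          7      17     16     6      4      50
Cumulative freq.   7      24     40     46     50
• Since n = 50, n/2 = 50/2 = 25, so the median class is 35-44, the
first class whose cumulative frequency reaches 25.
• L = 35
• f = 16
• h = 9
• cf = 24
• Median: $\tilde{x} = L + \left(\frac{\frac{n}{2} - cf}{f}\right) h = 35 + \left(\frac{25 - 24}{16}\right)(9) \approx 35.56$
• If the class boundaries 34.5-44.5 (width 10) are used instead of the
class limits, the same formula gives 34.5 + (1/16)(10) ≈ 35.13.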
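A sketch of the grouped-median formula in Python. It locates the median class from the cumulative frequencies and then interpolates. The function name is illustrative, and the values L = 35 and h = 9 are taken exactly as used in the worked solution.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median = L + ((n/2 - cf) / f) * h for the first class whose
    cumulative frequency reaches n/2."""
    n = sum(frequencies)
    half = n / 2
    cumulative = 0  # cf: cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must contain at least one positive value")

lower_limits = [15, 25, 35, 45, 55]
frequencies = [7, 17, 16, 6, 4]

# h = 9 as in the worked solution above.
result = grouped_median(lower_limits, frequencies, width=9)
print(round(result, 2))
```

Passing lower class boundaries (14.5, 24.5, ...) and `width=10` to the same function gives the boundary-based variant of the calculation.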
The mode
• The mode is another measure of central tendency: it is the most
frequently occurring value in a data set.
• Data set with no mode: each value occurs only once.
• Data set with one mode: only one value occurs with the highest
frequency. Such a data set is called unimodal.
• Data set with two modes: two values occur with the same (highest)
frequency. The distribution is then said to be bimodal.
• Data set with more than two modes: more than two values occur
with the same (highest) frequency. The data set is then said to be
multimodal.
• Example: Find the mode for the given data set:
22, 19, 21, 19, 27, 21, 29, 22, 19, 25, 21, 22, 25
• Solution: Each of the three values 19, 21, and 22 occurs three
times, which is the highest frequency in the data set. Therefore
each of these is a mode, and the data set is multimodal with
modes 19, 21, and 22.
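The modes can be found by counting frequencies. The sketch below uses `collections.Counter` and `statistics.multimode` (available from Python 3.8) from the standard library; the variable names are illustrative.

```python
from collections import Counter
from statistics import multimode

data = [22, 19, 21, 19, 27, 21, 29, 22, 19, 25, 21, 22, 25]

counts = Counter(data)                # value -> number of occurrences
highest = max(counts.values())        # the highest frequency
modes = sorted(v for v, c in counts.items() if c == highest)

print(highest, modes)
print(sorted(multimode(data)))        # library cross-check
```

One caveat: when every value occurs exactly once, `multimode` returns all the values, whereas the convention used here describes such a data set as having no mode.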
Mode for grouped data
• The formula for calculating the mode of grouped data is:
$\text{Mode} = L + \left(\frac{f_1 - f_0}{2 f_1 - f_0 - f_2}\right) h$
• In this formula, the variables are:
• L: the lower limit of the modal class
• h: the size of the class interval
• f1: the frequency of the modal class
• f0: the frequency of the class preceding the modal class
• f2: the frequency of the class succeeding the modal class
Example: The following table shows the distribution of scores obtained by
students in an exam:
Score Range   Number of Students (Frequency)
50 – 60       8
60 – 70       12
70 – 80       25
80 – 90       10
90 – 100      5
What is the mode of the exam scores?
Answer: The modal class is 70 – 80, the class with the highest frequency.
• L = lower boundary of the modal class = 70
• f1 = frequency of the modal class = 25
• f0 = frequency of the class before the modal class = 12
• f2 = frequency of the class after the modal class = 10
• h = class width = 10
• Using the formula: $\text{Mode} = 70 + \left(\frac{25 - 12}{2(25) - 12 - 10}\right)(10) = 70 + \frac{130}{28} \approx 74.64 \approx 75$
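A Python sketch of the grouped-mode calculation. It assumes the usual interpolation formula Mode = L + (f1 - f0) / (2·f1 - f0 - f2) · h, which matches the symbols L, h, f0, f1 and f2 defined above; the function name is illustrative.

```python
def grouped_mode(lower_bounds, frequencies, width):
    """Interpolated mode of a frequency distribution (assumed standard formula)."""
    # Index of the modal class: the class with the highest frequency.
    i = max(range(len(frequencies)), key=lambda j: frequencies[j])
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                      # class before
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # class after
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: neighbouring classes tie with the modal class")
    return lower_bounds[i] + (f1 - f0) / denominator * width

lower_bounds = [50, 60, 70, 80, 90]
frequencies = [8, 12, 25, 10, 5]

mode_value = grouped_mode(lower_bounds, frequencies, width=10)
print(round(mode_value, 2))
```

Rounded to the nearest whole score, the result agrees with the mode of 75 quoted in the example.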
Relationships Between Mean, Median and Mode:
For a moderately skewed distribution, the mean, median and mode are
approximately related by an empirical rule: the mode equals 3 times
the median minus 2 times the mean.
That is, Mean – Mode = 3 (Mean – Median), or equivalently
Mode = 3 Median – 2 Mean.
Example: If the difference between the mean and the mode of a
population is 48 and the median is 12, find the mean.
Solution:
• Mean – Mode = 3(Mean – Median);
• 48 = 3(Mean – 12);
• 16 = Mean – 12;
• Mean = 28.
B. Measures of dispersion
• An average can represent a series only as well as a single
figure can; it cannot reveal the entire story of the
phenomenon under study.
• Averages tell nothing about the scatter of observations
within the distribution.
• A measure of dispersion shows the degree to which numerical
data tend to spread around an average value (the mean).
• To measure the degree of scatter, statistics called measures
of dispersion are calculated.
• Range = highest value – lowest value
• It is the difference between the highest value and the lowest
value. Because it depends on only these two values, it is the
weakest measure of dispersion.
• Variance
• First calculate the mean, then subtract the mean from each
value in the group, square each result, add the squares, and
divide the total by the number of values:
$\text{Var}(x) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$
• The variance measures how far a set of numbers is spread out:
it describes how far the numbers lie from the mean (expected
value).
• Standard deviation
• The most reliable measure of the degree to which the data are
spread around the mean.
• It is the square root of the variance: $SD = \sqrt{\text{Var}(x)}$
Example: Find the mean, median, mode, range, variance and standard
deviation for the following raw data.
ID   Age of respondent
1    53
2    44
3    56
4    70
5    45
6    62
7    36
8    23
9    56
10   55
Solution: A) Mean
$\bar{x} = \frac{\sum x_i}{n} = \frac{53 + 44 + 56 + 70 + 45 + 62 + 36 + 23 + 56 + 55}{10} = \frac{500}{10} = 50$
B) Median: first arrange the raw data in ascending or descending order:
23, 36, 44, 45, 53, 55, 56, 56, 62, 70. Since n = 10 is even, the median is the
average of the 5th and 6th values:
Median = (53 + 55)/2 = 54
C) Mode: 56 is the mode of the given data, since it occurs most frequently
(twice). The data set is unimodal.
D) Range = largest value – lowest value = 70 – 23 = 47
E) Variance = $\frac{\sum (x_i - \bar{x})^2}{n}$
ID   $x_i$   $x_i - \bar{x}$   $(x_i - \bar{x})^2$
1    53      3                 9
2    44      -6                36
3    56      6                 36
4    70      20                400
5    45      -5                25
6    62      12                144
7    36      -14               196
8    23      -27               729
9    56      6                 36
10   55      5                 25
$\sum (x_i - \bar{x})^2 = 1636$, so Variance = $\frac{\sum (x_i - \bar{x})^2}{n} = \frac{1636}{10} = 163.6$
F) $SD = \sqrt{\text{Variance}} = \sqrt{163.6} \approx 12.79$
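The whole worked example can be reproduced with Python's standard `statistics` module. This is a sketch; `pvariance` and `pstdev` are used because the solution divides by n (the population form), not by n - 1.

```python
from statistics import mean, median, multimode, pvariance, pstdev

ages = [53, 44, 56, 70, 45, 62, 36, 23, 56, 55]

age_mean = mean(ages)
age_median = median(ages)        # average of the two middle values, since n is even
age_modes = multimode(ages)      # list of the most frequent value(s)
age_range = max(ages) - min(ages)
age_variance = pvariance(ages)   # sum of squared deviations divided by n
age_sd = pstdev(ages)            # square root of the population variance

print(age_mean, age_median, age_modes, age_range)
print(age_variance, round(age_sd, 2))
```

Replacing `pvariance` and `pstdev` with `variance` and `stdev` would give the sample versions, which divide by n - 1 and are slightly larger.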
Measure of dispersion for Grouped Data
• Sample variance formula for grouped data: $s^2 = \frac{\sum f_i (m_i - \bar{x})^2}{n - 1}$
• Population variance formula for grouped data: $\sigma^2 = \frac{\sum f_i (m_i - \bar{x})^2}{n}$
• where,
• $f_i$ is the frequency of the ith interval
• $m_i$ is the midpoint of the ith interval
• $\bar{x}$ is the mean of the grouped data
• n is the total frequency, $n = \sum f_i$
• Example: Find the variance and the standard deviation for the following
frequency distribution of a sample:
Class     f    m    fm    m - x̄    (m - x̄)²   f(m - x̄)²
5 – 9     2    7    14    -11.5    132.25     264.50
10 – 14   4    12   48    -6.5     42.25      169.00
15 – 19   7    17   119   -1.5     2.25       15.75
20 – 24   3    22   66    3.5      12.25      36.75
25 – 29   1    27   27    8.5      72.25      72.25
30 – 34   3    32   96    13.5     182.25     546.75
Total     20        370                       1105
• Mean = 370/20 = 18.5
• Variance = 1105/19 ≈ 58.158
• Standard deviation = $\sqrt{58.158} \approx 7.626$
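A Python sketch of the grouped sample variance for this frequency distribution. It computes the class midpoints, the grouped mean, and then the sum of f(m - x̄)² divided by n - 1; the variable names are illustrative.

```python
import math

class_limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
frequencies = [2, 4, 7, 3, 1, 3]

midpoints = [(low + high) / 2 for low, high in class_limits]
n = sum(frequencies)

# Grouped mean: sum of f * m divided by n.
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Sum of f * (m - mean)^2, the numerator of the variance.
sum_sq = sum(f * (m - grouped_mean) ** 2 for f, m in zip(frequencies, midpoints))

sample_variance = sum_sq / (n - 1)   # sample form: divide by n - 1
sample_sd = math.sqrt(sample_variance)

print(grouped_mean, sum_sq)
print(round(sample_variance, 3), round(sample_sd, 3))
```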
C. Measures of relationship
1. Coefficient of variation
The coefficient of variation (CV) is a normalized, relative measure of dispersion.
It is also known as unitized risk or the variation coefficient.
It is defined as the ratio of the standard deviation to the mean, usually
expressed as a percentage:
$CV = \frac{SD}{\text{Mean}} \times 100\%$
Example: If the standard deviation of a given distribution is 0.20
and the mean is 0.50, what is the coefficient of variation (CV)?
CV = (0.20/0.50) × 100% = 40%
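A minimal Python sketch of the coefficient of variation as a percentage. The function name is illustrative; the guard for a zero mean is an added assumption, since the ratio is undefined in that case.

```python
def coefficient_of_variation(sd, mean_value):
    """Return (standard deviation / mean) * 100, i.e. the CV in percent."""
    if mean_value == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean_value * 100

cv = coefficient_of_variation(0.20, 0.50)
print(round(cv, 2))
```

Because the CV has no units, it can be used to compare the spread of variables measured on different scales.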
2. Covariance
The covariance between X and Y is a measure of how much the two
variables change together.
A positive covariance means the variables are positively related,
while a negative covariance means the variables are inversely
related.
The formula for calculating the covariance of sample data is:
$\text{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
Note: divide by N for population data and by n - 1 for sample data.
• The magnitude of the covariance often has no direct meaning,
because it depends on the units of X and Y. Thus we focus on the sign.
3. Correlation
Covariance only shows the direction of a relationship; it has no upper or
lower bound.
Correlation tells the degree to which the variables tend to move
together.
The most familiar measure of dependence between two quantities is
Pearson's correlation.
It is obtained by dividing the covariance of the two variables by the
product of their standard deviations.
The Pearson correlation is defined only if both standard
deviations are finite and nonzero.
The correlation coefficient is symmetric: corr(X, Y) = corr(Y, X).
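The Pearson correlation is +1 if there is a perfect positive linear
relationship and −1 if there is a perfect negative linear relationship.
If the variables are independent, Pearson's correlation
coefficient is 0. The converse does not hold: a correlation of 0 rules
out only a linear relationship.
The sample correlation coefficient is written
$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$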
The correlation between two random variables, X and Y, is
a measure of the degree of linear association between
the two variables.
The population correlation, denoted by $\rho$, can take on any
value from -1 to 1.
• $\rho = -1$ indicates a perfect negative linear relationship
• $-1 < \rho < 0$ indicates a negative linear relationship
• $\rho = 0$ indicates no linear relationship
• $0 < \rho < 1$ indicates a positive linear relationship
• $\rho = 1$ indicates a perfect positive linear relationship
The absolute value of $\rho$ indicates the strength or exactness of the
relationship.
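Example: Find the covariance and the Pearson correlation for the following
hypothetical raw data.
With $\bar{x} = 70/5 = 14$ and $\bar{y} = 60/5 = 12$:
$x_i$   $y_i$   $x_i - \bar{x}$   $y_i - \bar{y}$   $(x_i - \bar{x})(y_i - \bar{y})$   $(x_i - \bar{x})^2$   $(y_i - \bar{y})^2$
10      18      -4                6                 -24                                16                    36
30      6       16                -6                -96                                256                   36
8       12      -6                0                 0                                  36                    0
16      15      2                 3                 6                                  4                     9
6       9       -8                -3                24                                 64                    9
Total                                               -90                                376                   90
$\text{Cov}(X, Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n} = \frac{-90}{5} = -18$ (dividing by n, that is, treating the data as a population)
$r(x, y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} = \frac{-90}{\sqrt{376 \times 90}} = \frac{-90}{\sqrt{33840}} \approx \frac{-90}{183.96} \approx -0.49$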
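The covariance and correlation in this example can be recomputed from the deviations with a short Python sketch. Both divisors for the covariance are shown: n (as in the worked solution) and n - 1 (the sample form). Variable names are illustrative.

```python
import math

x = [10, 30, 8, 16, 6]
y = [18, 6, 12, 15, 9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and squared deviations.
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)

cov_population = sxy / n        # divisor n, as in the worked solution
cov_sample = sxy / (n - 1)      # divisor n - 1 for sample data

# Pearson correlation: the divisor cancels, so r is the same either way.
r = sxy / math.sqrt(sxx * syy)

print(sxy, sxx, syy)
print(cov_population, cov_sample, round(r, 2))
```

The sign of the covariance and of r agree: both indicate an inverse relationship between x and y in this data.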
D. Shape of Frequency Distribution
Skewness
• It refers to the symmetry or asymmetry of the distribution.
• A distribution is symmetric if its left half is a mirror image of
its right half.
• The skewness value can be positive (longer right tail) or negative
(longer left tail); it is zero for a symmetric distribution.
• A symmetric distribution with a single peak and a bell shape is
known as a normal distribution.
Kurtosis:
• It refers to the peakedness or flatness of the distribution.
• Higher kurtosis means that more of the variance is the result
of infrequent extreme deviations.
• The fourth standardized moment is defined as
$KU = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{(n - 1) S^4}$
where S is the standard deviation.
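A hedged Python sketch of the kurtosis measure. It assumes the formula reads KU = Σ(x_i - x̄)⁴ / ((n - 1)·S⁴) and that S is the sample standard deviation (divisor n - 1); both are interpretations of the slide, and the function name is illustrative. The ages data from the dispersion example is reused as input.

```python
from statistics import mean, stdev

def kurtosis(data):
    """Fourth-moment kurtosis: sum((x - mean)^4) / ((n - 1) * S^4).

    Assumes S is the sample standard deviation (n - 1 divisor).
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = mean(data)
    s = stdev(data)
    if s == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return sum((x - x_bar) ** 4 for x in data) / ((n - 1) * s ** 4)

ages = [53, 44, 56, 70, 45, 62, 36, 23, 56, 55]
ku = kurtosis(ages)
print(round(ku, 3))
```

Other textbooks and software use slightly different divisors or subtract 3 to report excess kurtosis, so values from different sources may not match this one exactly.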

Stat Chapter 3.pptx, proved detail statistical issues
