2. Introduction
Averages, or measures of central
tendency, are single values about
which a set of observations tends to
cluster.
They provide a summary of the data and
a basis for comparison.
Measures of central tendency: mean, median, and mode.
Measures of non-central location (quantiles): quartiles, deciles, and
percentiles.
4.1 Common Measures of Location: Introduction
3. ARITHMETIC MEAN
The most popular measure of central tendency.
Defined as the sum of all observations divided by the total number of observations.
Should only be used for interval and ratio data.
We shall compute two means: one for the sample and one for a finite population of values.
4. ARITHMETIC MEAN
Let $X_i$ represent the $i$th observation on a variable (characteristic) $X$.
Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
Sample mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
5. Suppose ten (10) rice farmers realized
the following yields, in cavans:
92 110 104 110 88
115 123 125 131 115
The mean yield is:
$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{1113}{10} = 111.3$ cavans
ARITHMETIC MEAN: Example
6. Interpretations:
1. If one were to divide the total yield
of the ten farmers equally among
them, each would have a yield of
111.3 cavans.
2. Most of the yields of the ten farmers
are close to 111.3 cavans.
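As a quick check, the mean yield can be recomputed directly from the ten listed values. This is a minimal sketch; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean

# Yields (in cavans) of the ten rice farmers, copied from the example.
yields = [92, 110, 104, 110, 88, 115, 123, 125, 131, 115]

# Arithmetic mean: the sum of all observations divided by their number.
total = sum(yields)
n = len(yields)
mean_yield = total / n

print(total, n, mean_yield)
# The standard-library routine should agree with the manual computation.
print(mean(yields))
```

Doing the sum and the division separately mirrors the formula; `statistics.mean` is shown only as a cross-check.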
7. A small company consists of the owner,
the manager, the salesperson, and two
technicians. The salaries are listed as
PhP50,000, 20,000, 12,000, 9,000,
and 9,000, respectively. (Assume this
is the population.) Then the population
mean will be _____.
8. Properties of the Arithmetic Mean
1. The mean is a unique value because a
set of data has only one arithmetic
mean.
2. The mean reflects the magnitude of
every observation since every
observation contributes to the value of
the mean.
3. It is easily affected by the presence of
extreme values; hence, it is not a good
measure of central tendency when
there are extreme observations.
4. The sum of the deviations of the
observations from the mean is always zero.
9. 5. Means of subgroups may be
combined. When properly weighted,
the resulting number is called a
weighted arithmetic mean.
Formula for weighted arithmetic mean:
$\bar{X}_w = \frac{\sum_{i=1}^{k} W_i X_i}{\sum_{i=1}^{k} W_i} = \frac{W_1 X_1 + W_2 X_2 + \dots + W_k X_k}{W_1 + W_2 + \dots + W_k}$
10. Example:
Suppose there are three sections in Stat
21 with 20, 25, and 30 students,
respectively, with corresponding mean
scores of 75, 80, and 82 in the first long
exam. The overall mean score will be:
6. The mean is increased (decreased) by
a constant when that constant is added to
(subtracted from) every observation in
the data. The same logic applies when every
observation is multiplied or divided by a
constant: the mean is multiplied or divided
by that constant.
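The weighted mean of the Stat 21 example and property 6 can both be checked numerically. The sketch below uses the section sizes as weights; the small data set used for property 6 and the constant `c` are hypothetical values chosen only for illustration.

```python
# Section sizes act as the weights W_i; section means are the X_i.
sizes = [20, 25, 30]
section_means = [75, 80, 82]

# Weighted arithmetic mean: sum of W_i * X_i divided by the sum of W_i.
weighted_mean = sum(w * x for w, x in zip(sizes, section_means)) / sum(sizes)
print(round(weighted_mean, 2))

# Property 6 on a small made-up data set (hypothetical values).
data = [2, 4, 6, 8]
c = 10
base_mean = sum(data) / len(data)
shifted_mean = sum(x + c for x in data) / len(data)  # add c to every value
scaled_mean = sum(x * c for x in data) / len(data)   # multiply every value by c
print(base_mean, shifted_mean, scaled_mean)
```

With these inputs the weighted mean is 5960/75, about 79.47, which differs from the unweighted average of the three section means because the sections are of unequal size.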
11. MEDIAN
The median (Md) is the middle value of an
array
It can be used for ordinal, interval, and ratio
data
To find the median, first sort the values
(arrange them in order), then follow one of
these two procedures:
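If N is odd, the median is obtained by simply picking out
the middle value of the array: $Md = X_{\left(\frac{N+1}{2}\right)}$
If N is even, the median is just the mean of the two
middle values in the array: $Md = \frac{X_{\left(\frac{N}{2}\right)} + X_{\left(\frac{N}{2}+1\right)}}{2}$
Here $X_{(k)}$ denotes the $k$th value in the sorted array.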
12. 1. Suppose we have the following data:
104, 123, 92, 88, 131
Find the median.
2. Suppose we have the following data:
66, 42, 55, 24, 16, 28
Find the median.
MEDIAN (Example)
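The two-case procedure can be sketched in a few lines and applied to both data sets. The function name `median_by_position` is an illustrative choice; `statistics.median` is used only as a cross-check.

```python
from statistics import median

def median_by_position(values):
    """Sort the values, then pick the middle one or average the two middle ones."""
    array = sorted(values)
    n = len(array)
    mid = n // 2
    if n % 2 == 1:
        return array[mid]                     # odd N: the single middle value
    return (array[mid - 1] + array[mid]) / 2  # even N: mean of the two middle values

data1 = [104, 123, 92, 88, 131]
data2 = [66, 42, 55, 24, 16, 28]
print(median_by_position(data1), median_by_position(data2))
print(median(data1), median(data2))
```

Note that Python lists are indexed from 0, so the position (N+1)/2 in the formula corresponds to index `n // 2` for odd N.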
13. Properties of the Median
1. It is a positional value and hence,
unlike the mean, it is not affected by
the presence of extreme values.
2. The median is not amenable to
further computation; hence,
medians of subgroups cannot be
combined in the same manner as
means.
14. MODE
The mode (Mo) is the value which
occurs most frequently in the given
data set.
It can be used for all scales of
measurement.
It is the only measure among the three
averages that can be used for nominal
data.
the most preferred, best liked, most
15. The following data represent the duration
(in days) of U.S. space shuttle voyages
for the years 1992-94. Find the mode.
Data set: 8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10,
14, 11, 8, 14, 11.
MODE (Example)
Array: 6, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11,
11, 14, 14, 14.
Mode = 8.
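The same result can be obtained by tallying frequencies instead of sorting by hand. This is a minimal sketch using `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

# Shuttle voyage durations in days, copied from the data set above.
durations = [8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11]

# Count how often each value occurs, then take the most frequent one.
counts = Counter(durations)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)
```

`most_common(1)` is only safe here because a single value has the highest frequency; data sets with ties need the extra rules discussed on the following slides.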
16. 1. If all values occur with equal frequency (whether that
frequency is 1 or greater than 1), there is no modal value.
Example:
Twelve strains of bacteria were tested to see how long they
could remain alive outside their normal environment. The time,
in minutes, is given below. Find the mode.
Data set: 2, 5, 3, 2, 5, 8, 7, 10, 8, 10, 7, 3.
Answer: There is no mode since each data value
occurs with the same frequency of two.
17. 2. In cases where two adjacent values
occur with the same frequency, which is
larger than the frequencies of the
others, the mode may be taken as the
arithmetic mean of the two adjacent
values if the variable is continuous.
Example:
Find the mode of the following data set.
9.2, 11.5, 12.1, 12.1, 12.1, 15.4, 15.4,
15.4, 17.1, 17.1, 19.9
Answer: Mo = (12.1 + 15.4)/2 = 13.75
18. 3. When two nonadjacent values occur
such that the frequency of each is
greater than the frequencies in the
adjacent intervals, each value
may be taken as a mode and the set
of observations may be described as
bimodal.
Example:
Eleven different automobiles were tested at
a speed of 15 mph for stopping distances.
The distance, in feet, is given below. Find the
mode.
Data set: 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26.
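The tie-handling conventions for the mode can be sketched as a small function. It covers two of the rules in these slides: no mode when every value is equally frequent, and several modes when more than one value shares the highest frequency. The function name `modes` is hypothetical, and the averaging rule for adjacent values of a continuous variable is deliberately left out.

```python
from collections import Counter

def modes(values):
    """Return the list of modal values, or an empty list when there is no mode."""
    counts = Counter(values)
    top = max(counts.values())
    # Every distinct value equally frequent: by convention there is no mode.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    # Otherwise report every value that reaches the highest frequency.
    return sorted(v for v, c in counts.items() if c == top)

bacteria = [2, 5, 3, 2, 5, 8, 7, 10, 8, 10, 7, 3]
stopping = [15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26]
print(modes(bacteria))
print(modes(stopping))
```

For the stopping distances the function reports two values, 18 and 24, which is the bimodal case; for the bacteria data it reports an empty list.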
19. Properties of the Mode
The mode is determined by the
frequency and not by the value of the
observations.
It cannot be manipulated algebraically
and hence modes of subgroups cannot
be combined like the mean.
The mode can be defined in both
qualitative and quantitative data.
20. MODE OF QUALITATIVE DATA
Table 4.1. Distribution of a sample of 70
students according to the brand of
shampoo they use
SHAMPOO             Number of Students
Pantene             12
Rejoice             10
Palmolive           13
Head & Shoulders    7
Dove                18
Others              10
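For qualitative data the mode is simply the category with the largest count. The sketch below reads the modal brand off Table 4.1; the dictionary layout is an illustrative choice.

```python
# Table 4.1: number of students by shampoo brand.
shampoo_counts = {
    "Pantene": 12,
    "Rejoice": 10,
    "Palmolive": 13,
    "Head & Shoulders": 7,
    "Dove": 18,
    "Others": 10,
}

# The modal category is the key with the largest frequency.
modal_brand = max(shampoo_counts, key=shampoo_counts.get)
print(modal_brand, shampoo_counts[modal_brand])

# Sanity check: the frequencies should add up to the sample size of 70.
print(sum(shampoo_counts.values()))
```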
21. Comparison of the Mean,
Median, and Mode
When the distribution of the observations
is fairly symmetric or when there are no
extreme observations, the mean is the
most meaningful measure of central
tendency.
When extreme observations are present,
the median is a more
meaningful measure inasmuch as it is
not affected by these extreme values.
If the data are qualitative (nominal), the mode is the only
measure that one can use to describe the
data.
22. In the case of a perfectly symmetric bell-shaped
distribution, all three
measures are equal.
23. For a positively skewed distribution:
Mo < Md < µ
24. For a negatively skewed distribution:
µ < Md < Mo
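Both orderings can be illustrated with the salary data from the small-company example, where one very large salary pulls the mean upward. This is only a sketch on one data set; the mirrored list is a hypothetical construction used to produce a negatively skewed case.

```python
from statistics import mean, median, mode

# Salaries from the small-company example: one large value, so positively skewed.
salaries = [50_000, 20_000, 12_000, 9_000, 9_000]
mo, md, mu = mode(salaries), median(salaries), mean(salaries)
print(mo, md, mu)

# Mirroring the values (hypothetical data) reverses the direction of the skew.
mirrored = [-x for x in salaries]
mo_neg, md_neg, mu_neg = mode(mirrored), median(mirrored), mean(mirrored)
print(mu_neg, md_neg, mo_neg)
```

In the first case the mode is smallest and the mean largest; in the mirrored case the order is reversed.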
25. 4.2 Common Measures of
Variation
Another important characteristic of a set of
data is the extent to which they differ among
themselves
The mean gives a description of the “center”
of a data set but it tells nothing about how
spread or variable the data values are.
We can even have data sets having the same
mean and yet they are not identical data sets
simply because of the different values the
data sets contain.
In statistics, we usually determine variation of
individual data values relative to their mean
by computing measures of variation.
26. Range
The range, R, of a set of numbers is the
difference between the largest and the smallest values.
The data must be at least ordinal in scale.
R = 0 means that the data values are all identical.
The larger the difference between the two
extreme values, the larger the range.
The larger the value of R, the more spread out the
data values are.
Example: Using data sets A, B, and C above, we
find
$R_A = 5 - 5 = 0$
$R_B = 7 - 3 = 4$
$R_C = 6 - 4 = 2$
27. Properties of the Range
1. The range is easy to calculate and easy to
understand
2. Its main shortcoming is that it tells us
nothing about the dispersion of the data
that fall between the two extremes. Thus,
it is a poor measure of variation
particularly if the size of the sample or
population is large. Consider the following
sets of data, both with a range of 12:
Set A: 3, 4, 5, 6, 8, 9, 10, 12, 15
Set B: 3, 7, 7, 7, 8, 8, 8, 9, 15
28. 3. When the sample size is quite small,
the range can be an adequate
measure of variation.
4. It is used primarily when we are
interested in getting a quick, though
perhaps not very accurate, picture of
the variability of a set of data without
going through excessive
calculations.
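Property 2 can be checked on Sets A and B from the earlier slide: both have the same range even though their values are spread differently. The sketch below uses the population standard deviation, defined later in this section, as the second measure; the helper name `value_range` is illustrative.

```python
from statistics import pstdev

set_a = [3, 4, 5, 6, 8, 9, 10, 12, 15]
set_b = [3, 7, 7, 7, 8, 8, 8, 9, 15]

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

# Both sets share the same two extremes, so the ranges coincide.
print(value_range(set_a), value_range(set_b))

# A measure that uses every observation tells the two sets apart.
print(round(pstdev(set_a), 2), round(pstdev(set_b), 2))
```

The range depends only on the two extremes, which is why it cannot distinguish the evenly spread Set A from Set B, whose values are concentrated near the middle.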
29. Average Deviation (Based on
the Median)
the average amount of scatter of the
values in a distribution from the
median, ignoring the signs of the
deviations
This is best used when the median is
the appropriate measure of central
tendency (in the presence of extreme
values/skewed distributions).
$A.D. = \frac{\sum_{i=1}^{n} \left| X_i - Md \right|}{n}$
30. Average Deviation (Example)
Find the average deviation of the following
data representing the average relative
humidity at 1:30 p.m. in a certain city, for
each month of the year.
71, 64, 53, 43, 37, 32, 28, 28, 31, 42,
59, 70
Solution:
We first compute the median.
Array: 28, 28, 31, 32, 37, 42, 43,
53, 59, 64, 70, 71
Md=(42+43)/2=42.5
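The remaining step, averaging the absolute deviations from the median, can be sketched as follows. The variable names are illustrative; the computation follows the A.D. formula given above.

```python
from statistics import median

# Average relative humidity at 1:30 p.m. for each month of the year.
humidity = [71, 64, 53, 43, 37, 32, 28, 28, 31, 42, 59, 70]

md = median(humidity)

# Absolute deviation of each observation from the median, signs ignored.
abs_deviations = [abs(x - md) for x in humidity]
average_deviation = sum(abs_deviations) / len(humidity)

print(md, sum(abs_deviations), average_deviation)
```

With these twelve values the absolute deviations sum to 162, so the average deviation is 162/12 = 13.5.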
32. Properties of the Average
Deviation
1. The sum of the absolute deviations from
the median is never greater than the
sum of the absolute deviations from the
mean; the two are equal when the mean and the median coincide.
2. The main drawback of the average
deviation is that due to the absolute
values it does not lend itself readily to
further mathematical treatment.
33. Variance and Standard
Deviation
The variance of a set of numbers is
the mean of the squared deviations of
these numbers from their mean.
Population variance of X: $\sigma_x^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Sample variance of X: $S_x^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
34. To facilitate calculations, we have the
following computational formulas:
Population: $\sigma_x^2 = \frac{\sum_{i=1}^{N} X_i^2}{N} - \mu^2 = \frac{N \sum X_i^2 - \left(\sum X_i\right)^2}{N^2}$
Sample: $S_x^2 = \frac{n \sum X_i^2 - \left(\sum X_i\right)^2}{n(n-1)}$
35. Consider the scores on the first quiz of
a small class: 6, 7, 7, 7, 8, 8, 8, 9, 10
a) Treat this as population data,
compute the variance.
b) Treat this as sample data, compute
the variance.
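Both parts of this exercise can be worked with the computational formulas. The sketch below computes the two sums once and reuses them; `statistics.pvariance` and `statistics.variance` appear only as a cross-check, and the variable names are illustrative.

```python
from statistics import pvariance, variance

scores = [6, 7, 7, 7, 8, 8, 8, 9, 10]

n = len(scores)
sum_x = sum(scores)                  # sum of the observations
sum_x2 = sum(x * x for x in scores)  # sum of the squared observations

# a) Population variance: divide by N squared.
pop_var = (n * sum_x2 - sum_x ** 2) / n ** 2
# b) Sample variance: divide by n(n - 1).
samp_var = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

print(sum_x, sum_x2)
print(round(pop_var, 4), round(samp_var, 4))
print(round(pvariance(scores), 4), round(variance(scores), 4))
```

The numerator is the same in both cases (104 for these scores); only the divisor changes, which is why the sample variance is slightly larger.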
36. The standard deviation of a set of
data is the positive square root of its
variance.
$\sigma = \sqrt{\sigma^2}$ (population standard deviation)
$S = \sqrt{S^2}$ (sample standard deviation)
Example: (from the variance)
37. Properties of the Variance
1. The variance can never be negative since it is
a squared value. Like the range and the
average deviation, its minimum value is zero, which
indicates the absence of variability. A large variance
corresponds to a highly dispersed set of values.
2. If each observation of a set of data is
transformed to a new set by the addition (or
subtraction) of a constant c, the variance of the
original set of data is the same as the variance
of the new set.
3. If a set of data is transformed to a new set by
multiplying (or dividing) each observation by a
constant c, the variance of the new set is the
original variance multiplied by (or divided by)
$c^2$.
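Properties 2 and 3 can be verified exactly by working with fractions instead of floating-point numbers. This is a sketch on the quiz scores used earlier; the constant `c = 3` is an arbitrary, hypothetical choice.

```python
from fractions import Fraction
from statistics import pvariance

# Quiz scores as exact fractions so the comparisons below are exact.
data = [Fraction(x) for x in (6, 7, 7, 7, 8, 8, 8, 9, 10)]
c = 3

base = pvariance(data)
shifted = pvariance([x + c for x in data])  # add c to every observation
scaled = pvariance([x * c for x in data])   # multiply every observation by c

print(base, shifted, scaled)
print(shifted == base, scaled == c ** 2 * base)
```

Adding a constant moves every observation and the mean by the same amount, so the deviations are unchanged; multiplying by c scales each deviation by c and each squared deviation by c squared.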
38. Coefficient of Variation
The coefficient of variation, CV,
expresses the standard deviation as a
percentage of the mean.
It can then be used to compare the
variability of two or more data sets
expressed in different units of
measurement or data sets with different
means.
Population: $CV = \frac{\sigma}{\mu} \times 100\%$
Sample: $CV = \frac{S}{\bar{X}} \times 100\%$
39. Five repeated measurements of the
length of a room gave a mean of 240
inches with a standard deviation of
0.10 inch. Can you say that the
measurements are extremely
accurate?
Coefficient of Variation
(Example)
40. 2. The weights of ten (10) boxes of a
certain brand of cereal have a mean
content of 278 grams with a standard
deviation of 9.64 grams. If these
boxes were purchased at ten (10)
different stores and the average price
per box is ₱34.83 with a standard
deviation of ₱2.43, can you conclude
that the weights are relatively more
homogeneous than the prices?
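Both coefficient-of-variation examples reduce to the same one-line computation. The sketch below applies the formula to the figures quoted in the two problems; the function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100

# Room-length measurements: mean 240 inches, standard deviation 0.10 inch.
cv_room = coefficient_of_variation(0.10, 240)

# Cereal boxes: weights in grams and prices in pesos.
cv_weight = coefficient_of_variation(9.64, 278)
cv_price = coefficient_of_variation(2.43, 34.83)

print(round(cv_room, 3))
print(round(cv_weight, 2), round(cv_price, 2))
print(cv_weight < cv_price)
```

Because the CV is unit-free, the weights (grams) and prices (pesos) can be compared directly; the smaller CV indicates the relatively more homogeneous set.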