The PPT describes in detail the Measures of Central Tendency (arithmetic mean, median, mode, percentiles, quartiles), the Measures of Variability (range, mean absolute deviation, variance, standard deviation, z-score, coefficient of variation) and the Measures of Shape (skewness and kurtosis), for both ungrouped and grouped data.
2. Learning Objectives
Measures of Central Tendency
Measures of Variability
Measures of Central Tendency and Variability : Grouped Data
Measures of Shape
3. Measures of Central Tendency
◦ Measures of Central Tendency yield information about “particular places
or locations in a group of numbers.”
◦ For example, Measures of Central Tendency can yield such information as the
average offer price, the middle offer price and the most frequently
occurring offer price.
◦ Common measures of central tendency :-
Mean, Mode, Median, Percentiles, Quartiles.
4. Mode
◦ Mode is the most frequently occurring value in a set of data.
◦ Mode is applicable to all levels of data measurement (Nominal, Ordinal,
Ratio, Interval).
◦ Mode can be used to determine which category occurs most frequently.
◦ When two values tie for the most frequently occurring value, both modes are
listed and the data are said to be Bimodal.
◦ Data sets with more than two modes are referred to as Multimodal.
5. Example of Mode
7 11 14 15 15
15.5 19 19 19 19
21 22 23 24 25
27 27 28 34 43
Table 1 makes it easier to see that 19 is the most frequently
occurring number (4 times). So here the Mode is 19.
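The mode of the data in Table 1 can be checked with a short Python sketch. The helper name modes is an illustrative assumption, not part of the slides; it returns every value tied for the highest frequency, so it also covers bimodal and multimodal data.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

# Data from Table 1.
data = [7, 11, 14, 15, 15, 15.5, 19, 19, 19, 19,
        21, 22, 23, 24, 25, 27, 27, 28, 34, 43]

print(modes(data))             # [19]
print(modes([1, 1, 2, 2, 3]))  # [1, 2] (a bimodal data set)
```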
6. Median
◦ Median is the middle value in an ordered array of numbers.
◦ For an array with an odd number of terms, the median is the middle
number.
◦ For an array with an even number of terms, the median is the average of
the two middle numbers.
◦ Steps used to determine the median.
Step 1. Arrange the observations in an ordered data array.
Step 2. For an odd number of terms, find the middle term of the
ordered array. It is the median.
Step 3. For an even number of terms, find the average of the middle
two terms. This is the median.
7. Example of Median
Suppose a business researcher wants to determine the median for the
following numbers.
15 11 14 3 21 17 22 16 19 16 5 7 19 8 9 20 4
Step 1: Now the researcher arranges the numbers in an ordered array.
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
Because the array contains 17 terms (an odd number of terms), the Median is the
middle (9th) number, i.e. 15.
If one number (for example, 22) is eliminated from the list, the array
contains only 16 terms.
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21
Now, for an even number of terms, the median is determined by averaging the
2 middle values, 14 and 15. The resulting MEDIAN value is 14.5.
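The three steps can be sketched in Python as follows. The function name median is chosen here for illustration; it sorts the data and then applies the odd/even rule described above.

```python
def median(values):
    """Median of a list: middle term (odd n) or mean of the two middle terms (even n)."""
    ordered = sorted(values)          # Step 1: ordered array
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # Step 2: odd number of terms
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # Step 3: even number of terms

numbers = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 19, 8, 9, 20, 4]
print(median(numbers))        # 15

without_22 = [x for x in numbers if x != 22]
print(median(without_22))     # 14.5
```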
8. ARITHMETIC MEAN
◦ Arithmetic mean is the average of a group of numbers
◦ Arithmetic mean is applicable for interval and ratio data.
◦ Arithmetic mean is affected by each value in the data set, including
extreme values
◦ The arithmetic mean is computed by summing all numbers and dividing
by the number of numbers.
9. Example of Mean
Suppose a company has 5 departments with 24, 13, 19, 26 and 11 workers, respectively.
Compute the population mean number of workers.
Solution :-
Step 1. Add the workers of all departments.
24 + 13 + 19 + 26 + 11 = 93
Step 2. Divide the total number of workers by the number of departments.
93 / 5 = 18.6
Step 3. Hence, the Population Mean is 18.6 workers per department.
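The same computation, written as a minimal Python sketch (the variable names are illustrative):

```python
# Workers in each of the 5 departments.
workers = [24, 13, 19, 26, 11]

total_workers = sum(workers)                     # Step 1: 93
population_mean = total_workers / len(workers)   # Step 2: 93 / 5

print(total_workers)     # 93
print(population_mean)   # 18.6
```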
10. Percentiles
◦ Percentiles are measures of central tendency that divide a group
of data into 100 parts.
◦ The nth percentile is the value such that at least n% of the data
are below that value and at most (100 – n) % are above that
value.
◦ Example: the 87th percentile indicates that at least 87% of the data
are below the value and no more than 13% are above the value.
◦ Percentiles are widely used in reporting test results.
11. Steps in determining the location of a percentile
Step 1. Organize the numbers into an ascending-order array.
Step 2. Calculate the percentile location (i) by:
i = (P/100) × N
where
P = the percentile of interest
i = percentile location
N = number of values in the data set
Step 3. Determine the location by either (a) or (b).
a. If i is a whole number, the Pth percentile is the average of the value at the ith
location and the value at the (i + 1)st location.
b. If i is not a whole number, the Pth percentile is the value located at the
whole-number part of i, plus 1.
12. Example of Percentiles
◦ Suppose you want to determine the 80th percentile of 1240 numbers.
◦ So, P is 80 and N is 1240. First, order the numbers from lowest to
highest.
◦ Next, calculate the location of the 80th percentile.
i = 80/100 * (1240) = 992
◦ Because i = 992 is a whole number, the 80th percentile is the average of the 992nd
number and the 993rd number.
P80 = (992nd number + 993rd number) / 2
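The location rule can be sketched in Python. The function name percentile and the stand-in data (the integers 1 to 1240, used only because the slide does not list its 1240 numbers) are assumptions made for illustration; P is assumed to be an integer between 1 and 99.

```python
def percentile(values, p):
    """Pth percentile using the location rule i = (P/100) * N.

    p is assumed to be an integer with 0 < p < 100.
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)                 # Step 1: ascending order
    n = len(ordered)
    # Step 2: i = p * n / 100, kept as quotient and remainder so that
    # "is i a whole number?" is decided with exact integer arithmetic.
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # Step 3a: average of the ith and (i + 1)st values (1-based locations).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Step 3b: value at location (whole-number part of i) + 1.
    return ordered[whole]

stand_in = list(range(1, 1241))      # hypothetical data: 1, 2, ..., 1240
print(percentile(stand_in, 80))      # 992.5, the average of the 992nd and 993rd values

small = [14, 12, 19, 23, 5, 13, 28, 17]
print(percentile(small, 30))         # 13, since i = 2.4 gives location 3
```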
13. Quartiles
◦ Quartiles are the measures of central tendency that divide a group of data into
four subgroups or parts.
◦ The three quartiles are denoted as Q1, Q2 and Q3.
◦ The first quartile, Q1, separates the first, or lowest, one-fourth of the data
from the upper three-fourths and is equal to the 25th percentile.
◦ The second quartile, Q2, separates the second quarter of the data from the
third quarter. Q2 is located at the 50th percentile and equals the median of
the data.
◦ The third quartile, Q3, divides the first three-quarters of the data from the last
quarter and is equal to the value of the 75th percentile.
14. Measures of Variability: Ungrouped Data
◦ Measures of variability describe the spread or the dispersion of a set of
data.
◦ Using measures of variability with measures of central tendency makes
possible a more complete numerical description of the data.
◦ Different measures of variability :-
I. Range
II. Mean absolute deviation
III. Variance
IV. Standard deviation
V. Z score
VI. Coefficient of variation
15. Range
◦ Range is the difference between the largest value of a data set and the
smallest value of the set.
◦ It describes the distance to the outer bounds of the data set.
◦ It is easy to compute but is affected by extreme values.
◦ Interquartile Range :- It is the range of values between the first and
third quartiles.
RANGE = Highest value – Lowest value
INTERQUARTILE RANGE = Q3 – Q1
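Both formulas can be tried on the 20 values from the mode example. In this sketch the quartiles are found with the percentile location rule from the earlier slides; the helper names percentile, value_range and interquartile_range are illustrative assumptions.

```python
def percentile(values, p):
    """Pth percentile (integer 0 < p < 100) by the location rule i = (P/100) * N."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]

def value_range(values):
    """RANGE = highest value - lowest value."""
    return max(values) - min(values)

def interquartile_range(values):
    """INTERQUARTILE RANGE = Q3 - Q1 (75th minus 25th percentile)."""
    return percentile(values, 75) - percentile(values, 25)

data = [7, 11, 14, 15, 15, 15.5, 19, 19, 19, 19,
        21, 22, 23, 24, 25, 27, 27, 28, 34, 43]

print(value_range(data))           # 36
print(percentile(data, 25))        # 15.25 (Q1)
print(percentile(data, 75))        # 26.0  (Q3)
print(interquartile_range(data))   # 10.75
```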
16. Mean Absolute Deviation
◦ Mean absolute deviation is the average of the absolute values of the
deviations around the mean for a set of numbers.
◦ Subtracting the mean from each value of data yields the deviation from
the mean (X - µ).
◦ Mean Absolute Deviation: MAD = Σ|X − µ| / N
17. Example of Mean Absolute Deviation
X       X − µ    |X − µ|
5        −8        8
9        −4        4
16       +3        3
17       +4        4
18       +5        5
Sum       0       24
µ = 65 / 5 = 13
MAD = Σ|X − µ| / N = 24 / 5 = 4.8
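The example, with the values 5, 9, 16, 17 and 18, can be reproduced with a few lines of Python (variable names are illustrative):

```python
data = [5, 9, 16, 17, 18]

mu = sum(data) / len(data)                    # population mean: 65 / 5 = 13
deviations = [x - mu for x in data]           # -8, -4, +3, +4, +5 (they sum to 0)
abs_deviations = [abs(d) for d in deviations]

mad = sum(abs_deviations) / len(data)         # 24 / 5

print(mu)                    # 13.0
print(sum(deviations))       # 0.0
print(sum(abs_deviations))   # 24.0
print(mad)                   # 4.8
```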
18. Variance
◦ Variance is the average of the squared deviations about the arithmetic
mean for a set of numbers.
◦ The population variance is denoted by σ².
◦ Formula :- σ² = Σ(X − µ)² / N
19. Standard Deviation
◦ Standard deviation is the square root of variance.
◦ The term standard deviation was introduced by Karl Pearson in 1893.
◦ Standard deviation is the most popular and useful measure of variability.
◦ The median and mode are not used in computing the standard deviation.
◦ The standard deviation is represented by the Greek letter σ (sigma).
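As a small sketch, the population variance (the average squared deviation about the mean) and the standard deviation (its square root) can be computed for the values 5, 9, 16, 17 and 18 from the mean absolute deviation example. Reusing that data set here is a choice made for illustration.

```python
import math

data = [5, 9, 16, 17, 18]
n = len(data)

mu = sum(data) / n                                # population mean = 13
variance = sum((x - mu) ** 2 for x in data) / n   # (64 + 16 + 9 + 16 + 25) / 5
std_dev = math.sqrt(variance)                     # square root of the variance

print(variance)            # 26.0
print(round(std_dev, 3))   # 5.099
```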
21. Z-Scores
◦ A z-score represents the number of standard deviations a value (x) is above
or below the mean of a set of numbers when the data are normally
distributed.
◦ Z-scores allow translation of a value’s raw distance from the mean into units
of standard deviations.
◦ Z = (x − µ)/σ
22. Z-Scores
◦ If Z is negative, the raw value (x) is below the mean
◦ If Z is positive, the raw value (x) is above the mean
◦ For normally distributed data:
◦ between Z = ±1 lie approximately 68% of the values
◦ between Z = ±2 lie approximately 95% of the values
◦ between Z = ±3 lie approximately 99.7% of the values
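A minimal sketch of the z-score formula follows. The mean of 50 and standard deviation of 10 are hypothetical values chosen for illustration. The second part uses NormalDist from Python's standard statistics module to check the approximate percentages for a normal distribution.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical population: mean 50, standard deviation 10.
print(z_score(70, mu=50, sigma=10))   # 2.0  (above the mean)
print(z_score(35, mu=50, sigma=10))   # -1.5 (below the mean)

# Share of a normal distribution lying within +/- k standard deviations.
standard_normal = NormalDist()        # mean 0, standard deviation 1
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(k, round(share, 4))         # 1 0.6827, 2 0.9545, 3 0.9973
```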
23. Coefficient of Variation
◦ The coefficient of variation is the ratio of the standard deviation to the
mean expressed in percentage.
◦ The coefficient of variation is a relative comparison of a standard
deviation to its mean.
◦ The coefficient of variation can be useful in comparing standard
deviations that have been computed from data with different means.
◦ Coefficient of variation :- CV = (σ/µ) × 100
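As an illustration, the coefficient of variation can be computed for the values 5, 9, 16, 17 and 18 used in the mean absolute deviation example. Reusing that data set is an assumption made here for continuity.

```python
import math

data = [5, 9, 16, 17, 18]

mu = sum(data) / len(data)                                        # mean = 13
sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))   # about 5.099

cv = sigma / mu * 100   # standard deviation as a percentage of the mean

print(round(cv, 2))     # 39.22
```

Because the result is a percentage of the mean, it can be compared across data sets measured on different scales.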
24. Measures of Central Tendency and Variability: Grouped Data
Measures of Central Tendency
◦ Mean
◦ Median
◦ Mode
Measures of Variability
◦ Variance
◦ Standard Deviation
25. Measures of Central Tendency: Grouped Data
MEAN
◦ For mean with grouped data the midpoint of each class interval is used
to represent all the values in a class interval.
◦ Midpoint is weighted by the frequency of values in the class interval.
◦ Mean is computed by summing the products of class midpoint, and the
class frequency for each class and dividing that sum by the total
number of frequencies.
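The grouped mean can be sketched in Python using the frequency table from the "Mode of Grouped Data" slide (classes from 20-under 30 to 70-under 80). Representing each class as a (lower, upper, frequency) tuple is an illustrative choice.

```python
# (lower boundary, upper boundary, frequency) for each class interval.
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

total_frequency = sum(f for _, _, f in classes)                    # 50

# Each class midpoint is weighted by the class frequency.
weighted_sum = sum((lo + hi) / 2 * f for lo, hi, f in classes)     # 2150

grouped_mean = weighted_sum / total_frequency

print(total_frequency)   # 50
print(weighted_sum)      # 2150.0
print(grouped_mean)      # 43.0
```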
27. Measures of Central Tendency: Grouped Data
MEDIAN
◦ The median for grouped data is the middle value of an ordered array of
numbers.
◦ Formula to calculate median for grouped data :
M=L+[ (N/2–F)/f ]C
Where,
L means lower boundary of the median class
N means sum of frequencies
F means cumulative frequency before the median class.
f means frequency of the median class
C means the size of the median class
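The formula can be applied to the frequency table from the "Mode of Grouped Data" slide. In this sketch the function name grouped_median and the tuple layout (lower, upper, frequency) are assumptions for illustration; the median class is taken as the first class whose cumulative frequency reaches N/2.

```python
def grouped_median(classes):
    """Median of grouped data: M = L + ((N/2 - F) / f) * C."""
    n = sum(freq for _, _, freq in classes)        # N: sum of frequencies
    cumulative = 0                                 # F: cumulative frequency so far
    for lower, upper, freq in classes:
        if cumulative + freq >= n / 2:             # this is the median class
            width = upper - lower                  # C: size of the median class
            return lower + (n / 2 - cumulative) / freq * width
        cumulative += freq
    raise ValueError("classes must contain at least one observation")

classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

# N = 50, median class 40-under 50, L = 40, F = 24, f = 11, C = 10.
print(round(grouped_median(classes), 2))   # 40.91
```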
29. Measures of Central Tendency: Grouped Data
MODE
◦ Mode for the grouped data is the class midpoint of the modal class.
◦ The modal class is the class interval with the greatest frequency.
30. Mode of Grouped Data
Class Interval Frequency
20-under 30 6
30-under 40 18
40-under 50 11
50-under 60 11
60-under 70 3
70-under 80 1
Modal class: 30-under 40 (frequency 18)
Mode = (30 + 40) / 2 = 35
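A short Python sketch of the same lookup follows. The tuple layout (lower, upper, frequency) is an illustrative assumption; if several classes tied for the greatest frequency, this sketch would return only the first of them.

```python
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

# The modal class is the class interval with the greatest frequency.
modal_lower, modal_upper, modal_frequency = max(classes, key=lambda c: c[2])

# The grouped mode is the midpoint of the modal class.
grouped_mode = (modal_lower + modal_upper) / 2

print(modal_lower, modal_upper, modal_frequency)   # 30 40 18
print(grouped_mode)                                # 35.0
```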
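31. Variance and Standard Deviation of Grouped Data
Population:
σ² = Σ f(M − µ)² / N
σ = √σ²
Sample:
S² = Σ f(M − X̄)² / (n − 1)
S = √S²
where M is the class midpoint, f is the class frequency, and N (population)
or n (sample) is the sum of the frequencies.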
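As a sketch, both versions can be computed for the frequency table from the "Mode of Grouped Data" slide. The formulas used are the standard grouped-data ones: squared deviations of the class midpoints from the grouped mean, weighted by frequency, then divided by N (population) or n − 1 (sample). Treating the same table once as a population and once as a sample is done purely for illustration.

```python
import math

# (lower boundary, upper boundary, frequency) for each class interval.
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)                                                    # 50

mean = sum(m * f for m, f in zip(midpoints, freqs)) / n           # 43.0
ss = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs))   # 7200.0

population_variance = ss / n            # divide by N
sample_variance = ss / (n - 1)          # divide by n - 1

print(population_variance, math.sqrt(population_variance))   # 144.0 12.0
print(round(sample_variance, 2))                              # 146.94
print(round(math.sqrt(sample_variance), 2))                   # 12.12
```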
32. MEASURES OF SHAPE
Measures of shape are tools that can be used to describe the shape of a
distribution of data.
Measures of shape
◦ Skewness
◦ Kurtosis
33. Skewness
◦ Skewness is present when a distribution is asymmetrical, i.e. lacks symmetry.
◦ Symmetrical – A distribution of data in which the right half is a mirror
image of the left half is said to be symmetrical.
◦ The concept of skewness helps to understand the relationship of the
mean, median and mode.
34. Coefficient of Skewness
◦ The Coefficient of Skewness (Sk) compares the mean and median in light
of the magnitude of the standard deviation:
Sk = 3(µ − Md) / σ
where µ is the mean, Md is the median and σ is the standard deviation.
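A minimal sketch of the coefficient of skewness, Sk = 3(µ − Md)/σ, applied to the values 5, 9, 16, 17 and 18 from the mean absolute deviation example. Reusing that data set is an assumption made here for illustration.

```python
import math
from statistics import median

data = [5, 9, 16, 17, 18]

mu = sum(data) / len(data)                                        # mean = 13
md = median(data)                                                 # median = 16
sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))   # about 5.099

sk = 3 * (mu - md) / sigma

print(md)             # 16
print(round(sk, 3))   # -1.765
```

The mean lies below the median here, so the coefficient is negative, which indicates a negatively skewed (left-skewed) distribution.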
36. Kurtosis
◦ Kurtosis describes the amount of peakedness of a distribution.
◦ Distributions that are high and thin are referred to as Leptokurtic
distributions.
◦ Distributions that are flat and spread out are referred to as Platykurtic
distributions.
◦ Distributions that are normal in shape are referred to as Mesokurtic
distributions.