Descriptive statistics are used to describe and summarize key features of data. There are two main types: graphical and numerical. Numerical descriptive statistics include measures of central tendency such as the mean, median, and mode, which summarize the central or typical values in a data set. They also include measures of dispersion such as the range, variance, and standard deviation, which quantify how spread out the values are. The mean is the average value, the median is the middle value when the data are ordered, and the mode is the most frequent value. The range is the difference between the highest and lowest values. Variance is the average of the squared deviations from the mean, and standard deviation is the square root of the variance, which expresses the spread in the same units as the data.
2. Why is this important?
Descriptive statistics are part of any research study.
They are used to describe a population.
They organize, summarize, and describe the important features of the data.
6. Mean
Average value of the data set
The mean is the sum of the observations divided by the number of observations
7. Example 1
The following are the ages of all eight employees of a small company. Find the mean
age of these employees.
53 39 67 27 49 55 38 51
Solution:
$\text{Mean} = \frac{53 + 39 + 67 + 27 + 49 + 55 + 38 + 51}{8} = \frac{379}{8} = 47.375$
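The arithmetic in Example 1 can be checked with a short Python sketch. The function name `mean` and the variable names are illustrative choices, not part of the original slides.

```python
# Ages of the eight employees from Example 1
ages = [53, 39, 67, 27, 49, 55, 38, 51]


def mean(values):
    """Return the sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("mean requires at least one observation")
    return sum(values) / len(values)


mean_age = mean(ages)
print(mean_age)  # 47.375
```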
8. Median
The median is the middle value of a data set arranged in ascending order.
The advantage of using the median as a measure of central tendency is that it is
much less influenced by outliers than the mean.
$\text{Median} = \text{value of the } \left(\frac{n+1}{2}\right)\text{th term in a ranked data set}$
If n is odd, the median is the value of the middle term in the ranked data
If n is even, the median is the average of the two middle terms
9. Example 2
Consider the small data set A = 42, 21, 34, 65, 90, 45, 109. From the available data,
calculate the median.
Solution:
1. Arrange the data set in ascending order: 21 34 42 45 65 90 109 (n = 7).
2. Since n is odd, the median is the $\frac{7+1}{2} = 4$th term: Median = 45.
10. Example 3
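Consider the small data set C = 2, 5, 89, 40, 66, 33, 14, 23, 90, 101. From the available data,
calculate the median.
Solution:
1. Arrange the data set in ascending order: 2 5 14 23 33 40 66 89 90 101 (n = 10).
2. Position of the median $= \frac{n+1}{2} = \frac{10+1}{2} = 5.5$, so the median lies midway between the 5th and 6th terms.
3. Since n is even, the median is the average of the two middle terms: $\text{Median} = \frac{33 + 40}{2} = 36.5$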
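The odd and even cases of the median rule can be sketched in Python as follows. This is a minimal illustration using the data sets from Examples 2 and 3; the function name `median` and the variable names `data_a` and `data_c` are assumptions made for the sketch.

```python
def median(values):
    """Return the median: the middle term if n is odd,
    or the average of the two middle terms if n is even."""
    ranked = sorted(values)  # arrange the data in ascending order
    n = len(ranked)
    if n == 0:
        raise ValueError("median requires at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]  # odd n: the single middle term
    return (ranked[mid - 1] + ranked[mid]) / 2  # even n: average of the two middle terms


data_a = [42, 21, 34, 65, 90, 45, 109]             # Example 2, n = 7
data_c = [2, 5, 89, 40, 66, 33, 14, 23, 90, 101]   # Example 3, n = 10

print(median(data_a))  # 45
print(median(data_c))  # 36.5
```

Note that (n + 1) / 2 gives the position of the median in the ranked data, not the median itself; for n = 10 the position 5.5 points midway between the 5th and 6th terms.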
11. Mode
The value which occurs most frequently
A set of values may have more than one mode or no mode at all
A data set with a single mode is called unimodal
A data set with two modes is called bimodal
A data set with more than two modes is called multimodal
Mode is often used with categorical data
12. Example 4
The cholesterol levels of all six residents of Guyan Island are as follows:
120, 120, 140, 150, 160, 190
Determine the mode.
Solution:
Mode = 120, since it occurs twice and every other value occurs only once.
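A minimal Python sketch of finding the mode is given here. It returns a list so that unimodal, bimodal, and multimodal data sets are all covered. The function name `modes` is illustrative, and the sketch assumes the convention that a data set has no mode when every value occurs exactly once.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values (empty if there is no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumed convention: if each of several distinct values occurs only once,
    # the data set has no mode.
    if highest == 1 and len(counts) > 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)


cholesterol = [120, 120, 140, 150, 160, 190]  # data from Example 4
print(modes(cholesterol))  # [120]
```

Because the function only counts occurrences, it also works for categorical data such as a list of strings.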
14. Range
The range of a data set is obtained by taking the difference between the largest
observation and the smallest observation
Given that $x_1 \le x_2 \le \dots \le x_n$, the range is calculated as $\text{Range} = x_n - x_1$
15. Example 5
Consider the data set of 12, 24, 41, 51, 67, 67, 85, 99. Determine its range.
Solution:
1. The highest value = 99
2. The lowest value = 12
3. Range = 99 - 12 = 87
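The same calculation can be sketched in Python. The function name `value_range` is an illustrative choice; the data are those of Example 5.

```python
def value_range(values):
    """Return the difference between the largest and smallest observations."""
    if not values:
        raise ValueError("range requires at least one observation")
    return max(values) - min(values)


data = [12, 24, 41, 51, 67, 67, 85, 99]  # data from Example 5
print(value_range(data))  # 87
```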
16. Variance
Variance measures how far the values deviate from the mean
It is the average of the squared differences from the mean
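Written as a formula, the definition above corresponds to the population variance. The symbols are assumptions introduced here for illustration: $N$ is the number of observations, $x_i$ is the $i$th observation, and $\mu$ is the mean.

```latex
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2,
\qquad
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i
```

When the data are a sample rather than the whole population, the sum of squared differences is commonly divided by $n - 1$ instead of $N$.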
17. Standard Deviation
Standard deviation measures the dispersion of a dataset relative to its mean.
It is calculated as the square root of the variance.
Standard deviation is the usual measure of dispersion when the mean is used as the measure of central tendency.
If data points lie farther from the mean, there is greater deviation within the data
set. Thus, the more spread out the data, the higher the standard deviation.
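The relationship between variance and standard deviation can be sketched in Python as follows. This is a minimal illustration of the population form (dividing by the number of observations); the function names and the small data set are hypothetical and do not come from the original slides.

```python
import math


def population_variance(values):
    """Return the average of the squared differences from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one observation")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_std(values):
    """Return the square root of the population variance."""
    return math.sqrt(population_variance(values))


# Hypothetical data chosen so the arithmetic is easy to follow (mean = 5)
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev` for the population form, and `statistics.variance` and `statistics.stdev` for the sample form, which divides by n - 1.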