Descriptive statistics:
Numerical summary
Tufa Kolola
(MPH, Ass’t. Prof.)
Contents
§ Introduction
§ Measures of central tendency
§ Measures of relative standing
§ Shape of distribution
§ Measures of dispersion
Learning objectives
After the end of this session you will be able to:
§ Compute and interpret the mean, median, and
mode for a set of data
§ Construct and interpret a box and whiskers plot
§ Compute and interpret the range, variance,
standard deviation, and coefficient of variation for a
set of data
§ Use numerical measures along with graphs,
charts, and tables to describe data
Numerical summary measures
§ A numerical summary measure is a descriptive
measure that summarizes a data set with a
single number
§ Unlike frequency distributions, these measures indicate
the average (middle) value and the spread of
the values
Summary Measures
§ Measures of central tendency (location): Mean, Median, Mode, Weighted Mean
§ Measures of relative standing: Percentiles, Quartiles
§ Measures of dispersion (variation): Range, Interquartile Range, Variance,
Standard Deviation, Coefficient of Variation
Measures of central tendency (MCT)
§ On the scale of values of a variable, there is a certain
stage at which the largest number of items tend to cluster
§ Since this stage is usually in the centre of the distribution,
the tendency of statistical data to concentrate
at a certain value is called “central tendency”
§ The various methods of determining the point about
which the observations tend to concentrate are called
measures of central tendency (MCT)
Characteristics of
good MCT
1. It should be based on all the observations
2. It should not be affected by the extreme values
3. It should be as close to the minimum & maximum
number of values as possible
4. It should have a definite value
5. It should not be subjected to complicated and
tedious calculations
6. It should be capable of further algebraic treatment
7. It should be stable with regard to sampling
Measures of central tendency (MCT)
Center and Location: Mean, Median, Mode, Weighted Mean
Weighted mean: $\bar{X}_W = \frac{\sum_i w_i x_i}{\sum_i w_i}$
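The weighted mean divides the weighted sum of the values by the sum of the weights. A minimal sketch, assuming hypothetical clinic data (the function name, the ages, and the clinic sizes are all illustrative and not from the slides):

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical data: mean age in three clinics, weighted by clinic size.
clinic_mean_ages = [30.0, 40.0, 50.0]
clinic_sizes = [10, 20, 70]

overall_mean_age = weighted_mean(clinic_mean_ages, clinic_sizes)
print(overall_mean_age)  # (300 + 800 + 3500) / 100 = 46.0
```

With equal weights the result reduces to the ordinary arithmetic mean.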
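Arithmetic Mean:
ungrouped data
§ The mean is the average of a data set: the sum of
all the observations divided by the total number of
observations
– Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
– Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$
n = Sample Size
N = Population Size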
Arithmetic Mean
§ The most common measure of central tendency
§ Affected by extreme values (outliers)
Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
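The effect of a single extreme value on the mean can be checked directly. This short sketch uses the two five-value data sets from the slide and the standard-library `statistics.mean`; the variable names are illustrative.

```python
from statistics import mean

data = [1, 2, 3, 4, 5]
data_with_outlier = [1, 2, 3, 4, 10]  # same data, largest value replaced by 10

mean_plain = mean(data)                  # 15 / 5
mean_outlier = mean(data_with_outlier)   # 20 / 5

print(mean_plain, mean_outlier)
```

Changing one observation from 5 to 10 moves the mean from 3 to 4, even though four of the five values are unchanged.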
Grouped data
§ In calculating the mean from grouped data, we
assume that all values falling into a particular
class interval are located at the midpoint of the
interval. It is calculated as follows:
$\bar{x} = \frac{\sum_i m_i f_i}{\sum_i f_i}$
§ Where:
$m_i$ = the midpoint of the ith class interval
$f_i$ = the frequency of the ith class interval
Example. Compute the mean age of 169 subjects
from the grouped data
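The frequency table for this example is not preserved in the text, so the sketch below uses an assumed table. Its class limits and frequencies are illustrative only, chosen to total 169 subjects and to agree with the figures quoted in the median example (third class with frequency 47, cumulative frequency 70 before it). It applies the grouped-mean formula, treating every value in a class as sitting at the class midpoint.

```python
# Assumed (illustrative) age table: (lower limit, upper limit, frequency).
age_classes = [
    (10, 19, 4),
    (20, 29, 66),
    (30, 39, 47),
    (40, 49, 36),
    (50, 59, 12),
    (60, 69, 4),
]


def grouped_mean(classes):
    """Mean from grouped data: sum(m_i * f_i) / sum(f_i)."""
    total_frequency = sum(f for _, _, f in classes)
    weighted_midpoints = sum(((lo + hi) / 2) * f for lo, hi, f in classes)
    return weighted_midpoints / total_frequency


n_subjects = sum(f for _, _, f in age_classes)
mean_age = grouped_mean(age_classes)
print(n_subjects, round(mean_age, 2))
```

The result depends entirely on the assumed table; with the real frequencies the same function applies unchanged.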
Properties of
Arithmetic Mean
§ For a given set of data there is one and only one
arithmetic mean (uniqueness)
§ Easy to calculate and understand (simple)
§ Influenced by each and every value in a data set
§ Greatly affected by extreme values
§ Poor measure of location if the underlying
distribution is not normal (not Gaussian), particularly when it is skewed
§ For grouped data, if any class interval is
open-ended, the arithmetic mean cannot be calculated
Median: Ungrouped data
§ In an ordered array, the median is the “middle” number
– If n or N is odd, the median is the middle number
– If n or N is even, the median is the average of the
two middle numbers
[Figure: two dot plots on a 0–10 axis; Median = 3 in both]
§ The median is the value of the middle term in a
data set that has been ranked in increasing order
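The odd/even rule can be written out in a few lines. This is a sketch with an illustrative function name; the data sets reuse the values from the arithmetic mean example plus one hypothetical even-sized set.

```python
def median_of(values):
    """Median of ungrouped data: middle value, or average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    middle = n // 2
    if n % 2 == 1:                                     # odd n: single middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2  # even n: average of two


median_odd = median_of([1, 2, 3, 4, 5])
median_outlier = median_of([1, 2, 3, 4, 10])   # extreme value does not move the median
median_even = median_of([1, 2, 3, 4, 5, 6])    # hypothetical even-sized data set

print(median_odd, median_outlier, median_even)
```

Unlike the mean, the median stays at 3 when the largest value changes from 5 to 10.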
Grouped data
§ In calculating the median from grouped data, we
assume that the values within a class-interval are
evenly distributed through the interval
§ The first step is to locate the class interval in
which the median is located, using the following
procedure
§ Find n/2 and identify the first class interval whose
cumulative frequency is at least n/2
§ Then, use the following formula
$\tilde{x} = L_m + \left( \frac{\frac{n}{2} - F_c}{f_m} \right) W$
where,
Lm = lower true class boundary of the interval containing the
median
Fc = cumulative frequency of the interval immediately preceding
(the row just above) the median class interval
fm = frequency of the interval containing the median
W = class interval width
n = total number of observations
Example: Compute the median age of 169
subjects from the grouped data
§ n/2 = 169/2 = 84.5
§ n/2 = 84.5 falls in the 3rd class interval
§ Lower boundary = 29.5, upper boundary = 39.5, so W = 10
§ Frequency of the class = 47
§ (n/2 – Fc) = 84.5 – 70 = 14.5
Median = 29.5 + (14.5/47)(10) = 32.59 ≈ 33
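The grouped-median calculation can be reproduced with the numbers quoted in the example (lower boundary 29.5, preceding cumulative frequency 70, class frequency 47, width 10, n = 169). The function and parameter names below are illustrative.

```python
def grouped_median(lower_boundary, cum_freq_before, class_freq, width, n):
    """Median from grouped data: L_m + ((n/2 - F_c) / f_m) * W."""
    return lower_boundary + ((n / 2 - cum_freq_before) / class_freq) * width


median_age = grouped_median(
    lower_boundary=29.5,   # L_m, lower true boundary of the median class
    cum_freq_before=70,    # F_c, cumulative frequency before the median class
    class_freq=47,         # f_m, frequency of the median class
    width=10,              # W, class width (39.5 - 29.5)
    n=169,                 # total number of observations
)
print(round(median_age, 2))
```

The formula assumes observations are spread evenly within the median class, so the result is an interpolated estimate rather than an observed value.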
Properties of Median
§ There is only one median for a given set of data
§ The median is easy to calculate
§ The median is a positional average and hence it is
insensitive to very large or very small values
§ The median can be calculated even with open-ended
class intervals, provided the sample size is known
§ It is determined mainly by the middle points and
is less sensitive to the remaining data points
Mode: Ungrouped data
§ Value that occurs most often
§ Not affected by extreme values
§ Used for either numerical or categorical data
§ There may be no mode
§ There may be several modes
[Figure: dot plot on a 0–14 axis with Mode = 5; dot plot on a 0–6 axis with No Mode]
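The three cases (one mode, several modes, no mode) can be distinguished by counting occurrences. This is a sketch under one stated convention: when every distinct value occurs equally often, the data set is reported as having no mode. The function name and the sample data are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means 'no mode'."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Convention assumed here: if all distinct values tie, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 3, 5, 5, 5, 9, 12]))     # one mode
print(modes([0, 1, 2, 3, 4, 5, 6]))      # no mode
print(modes([1, 1, 2, 2, 3]))            # two modes
print(modes(["A", "B", "B", "O"]))       # works for categorical data too
```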
Mode: Grouped data
§ To find the mode of grouped data, we usually
refer to the modal class, where the modal class
is the class interval with the highest frequency
§ If a single value for the mode of grouped data
must be specified, it is taken as the mid-point of
the modal class interval
Properties of Mode
§ It is not affected by extreme values
§ It can be calculated for distributions with open-ended classes
§ Often its value is not unique
§ The main drawback of the mode is that it often does
not exist
Measures of
Relative Standing
§ Where does one particular measurement stand
in relation to the other measurements in the data
set?
§ Descriptive measures that locate the relative
position of an observation in relation to the other
observations are called measures of relative
standing
§ Measures of relative standing: Percentiles, Quartiles
– 1st quartile = 25th percentile
– 2nd quartile = 50th percentile = median
– 3rd quartile = 75th percentile
§ The pth percentile in a data
array is a number such that
p% of the observations of
the data set fall below it and
(100-p)% of the observations
fall above it (where 0 ≤ p ≤ 100)
§ The pth percentile in an ordered array of n values
is the value in the ith position, where
$i = \frac{p}{100}(n + 1)$
§ Example: The 60th percentile in an ordered array
of 19 values is the value in the 12th position:
$i = \frac{60}{100}(19 + 1) = 12$
§ Commonly used percentiles
– First (lower) decile = 10th percentile
– First (lower) quartile, Q1 = 25th percentile
– Second (middle) quartile, Q2 = 50th percentile
– Third quartile, Q3 = 75th percentile
– Ninth (upper) decile = 90th percentile
§ Quartiles split ordered data into 4 equal segments,
each holding 25% of the observations, with Q1, Q2, and Q3
as the dividing points
§ Q1 and Q3 are measures of non-central location
§ Q2 = the median
§ Each quartile has a position and a value
– With the data in an ordered array, the position of Qi
is $\frac{i(n + 1)}{4}$
– The value of Qi is the value associated with that
position in the ordered array
§ Example:
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
Position of Q1 = $\frac{1(9 + 1)}{4} = 2.5$, so Q1 = $\frac{12 + 13}{2} = 12.5$
The prices ($) of 18 brands of walking shoes:
40 60 65 65 67 68 68 70 70
70 70 70 70 74 75 75 90 95
✓ Position of Q1 = 0.25(18 + 1) = 4.75, so Q1 is 3/4 of the way
between the 4th and 5th ordered measurements, or
Q1 = 65 + .75(67 - 65) = 66.5
✓ Position of Q3 = 0.75(18 + 1) = 14.25, so Q3 is 1/4 of the way
between the 14th and 15th ordered measurements, or
Q3 = 74 + .25(75 - 74) = 74.25
✓ And IQR = Q3 – Q1 = 74.25 – 66.5 = 7.75
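The position rule i = (p/100)(n + 1), with linear interpolation between neighbouring ordered values when i is not a whole number, can be sketched as follows. The function names are illustrative; the shoe prices are the 18 values from the example in which the 5th ordered value is 67.

```python
import math


def percentile_position(p, n):
    """1-based position of the pth percentile among n ordered values."""
    return p * (n + 1) / 100


def percentile_value(values, p):
    """Value at the (n + 1)-rule position, interpolating between neighbours."""
    ordered = sorted(values)
    n = len(ordered)
    position = percentile_position(p, n)
    lower = math.floor(position)
    fraction = position - lower
    if lower < 1:            # position falls before the first value
        return ordered[0]
    if lower >= n:           # position falls at or beyond the last value
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


shoe_prices = [40, 60, 65, 65, 67, 68, 68, 70, 70,
               70, 70, 70, 70, 74, 75, 75, 90, 95]

q1 = percentile_value(shoe_prices, 25)
q3 = percentile_value(shoe_prices, 75)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other software may use a different position rule, so quartiles from a statistics package can differ slightly from these hand-calculated values.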
Shape of a Distribution
§ Describes how data are distributed
§ Measures of Shape
- Symmetric or skewed (asymmetric)
- Left-skewed (longer tail extends to left): Mean < Median < Mode
- Symmetric: Mean = Median = Mode
- Right-skewed (longer tail extends to right): Mode < Median < Mean
The Five Number Summary
§ One way to give a nice profile of a data set is the
“five-number summary,” which consists of:
1. The smallest measurement
2. The first quartile, Q1
3. The median, Q2
4. The third quartile, Q3
5. The largest measurement
§ Displayed visually using a box-and-whiskers plot
The Box-and-
Whisker plot
§ 5-number summary
– Median, Q1, Q3, Xsmallest, Xlargest
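§ Box Plot
– Graphical display of data using the 5-number summary
[Figure: box-and-whisker plot on an axis labelled 4, 6, 8, 10, 12, marking Xsmallest, Q1, Median, Q3, Xlargest]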
Distribution Shape &
Box-and-Whisker Plot
[Figure: box-and-whisker plots for left-skewed, symmetric, and right-skewed distributions, each marking Q1, Q2, Q3]
§ Skewed distributions usually have a long whisker in the
direction of the skewness
Shape of a Distribution
and Quartiles
§ If the distribution is symmetric, then the upper and
lower quartiles should be approximately equally
spaced from the median
§ If the upper quartile is farther from the median than
the lower quartile, then the distribution is positively
skewed
§ If the lower quartile is farther from the median than
the upper quartile, then the distribution is negatively
skewed
Outliers
§ An outlier is a value located at a distance of more than
1.5(IQR) from the box
✓ Lower fence: Q1 – 1.5 IQR
✓ Upper fence: Q3 + 1.5 IQR
§ Measurements beyond the upper or lower fence
are outliers and are marked with *
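The five-number summary and the 1.5(IQR) fences can be computed together. This sketch reuses the 18 walking-shoe prices from the quartile example and relies on the standard-library `statistics.quantiles` with `method="exclusive"`, which uses the same (n + 1) position rule; the variable names are illustrative.

```python
from statistics import quantiles

shoe_prices = [40, 60, 65, 65, 67, 68, 68, 70, 70,
               70, 70, 70, 70, 74, 75, 75, 90, 95]

# Quartiles under the (n + 1) position rule.
q1, q2, q3 = quantiles(shoe_prices, n=4, method="exclusive")
five_number_summary = (min(shoe_prices), q1, q2, q3, max(shoe_prices))

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Values beyond either fence would be drawn with * on a box plot.
outliers = [x for x in shoe_prices if x < lower_fence or x > upper_fence]

print(five_number_summary)
print(lower_fence, upper_fence, outliers)
```

For these prices the fences work out to 54.875 and 85.875, so the prices 40, 90, and 95 are flagged.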
Measures of dispersion (variation): Range, Interquartile Range, Variance,
Standard Deviation, Coefficient of Variation
Measures of dispersion
§ Measures that quantify the variation or dispersion
of a set of data from its central location
§ The amount may be small when the values are
close together and large when the values are far
apart from each other
§ If all the values are the same, there is no dispersion
§ How much are the observations spread out
around the mean value?
§ Measures of variation give information on the
spread or variability of the data values
Measures of variation
[Figure: distributions with the same center but different variation]
Measures of variation
§ The more spread out or dispersed the data, the larger
the measures of variation
§ The more concentrated the data, the smaller the
measures of variation
§ If all observations are equal, measures of variation
= zero
§ All measures of variation are non-negative
Range
§ Simplest measure of variation
§ Difference between the largest and the smallest
observations:
Range = x_maximum – x_minimum
Example: for data on a 0–14 axis with smallest value 1 and largest value 14,
Range = 14 - 1 = 13
Disadvantages of the Range
§ Ignores the way in which data are distributed: two data sets spread
differently between 7 and 12 both have Range = 12 - 7 = 5
§ Sensitive to outliers: a single extreme value can change the range
from 5 - 1 = 4 to 120 - 1 = 119
Interquartile Range
§ We can eliminate some outlier problems by using
the interquartile range
§ Eliminate some high- and low-valued observations
and calculate the range from the remaining values
§ Also known as midspread
– Spread in the middle 50%
§ Interquartile range = 3rd quartile – 1st quartile
§ Not affected by extreme values
Example: minimum = 12, Q1 = 30, median = 45, Q3 = 57, maximum = 70
(25% of the data in each segment)
Interquartile range = 57 – 30 = 27
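Variance
§ Shows variation about the mean
§ Average of squared deviations of values from the
mean
– Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
– Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$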
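Standard Deviation
§ Most commonly used measure of variation
§ Shows variation about the mean
§ Has the same units as the original data
- Sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
- Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$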
Variance vs.
Standard Deviation
§ Both measure the average “scatter” about the mean
§ Variance computations produce “squared” units which
makes interpretation more difficult
– For example, kg² has no direct physical interpretation.
§ Since it is the square root of the Variance, the
Standard Deviation is expressed in the same units as
the original data
§ Therefore, the Standard Deviation is the most
commonly used measure of variation
Comparing Standard Deviations
[Figure: dot plots of three data sets on an 11–21 axis, all with the same mean]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = .9258
Data C: Mean = 15.5, s = 4.57
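The individual observations behind Data A, B, and C are not preserved in the text. The sketch below uses assumed values, chosen only because they reproduce the reported means and standard deviations, and computes the sample standard deviation (n - 1 in the denominator) with the standard-library `statistics.stdev`.

```python
from statistics import mean, stdev

# Assumed observations (not given in the slides); they reproduce the
# reported summary statistics for each data set.
datasets = {
    "A": [11, 12, 13, 16, 16, 17, 18, 21],
    "B": [14, 15, 15, 15, 16, 16, 16, 17],
    "C": [11, 11, 11, 12, 19, 20, 20, 20],
}

summary = {name: (mean(values), stdev(values)) for name, values in datasets.items()}

for name, (m, s) in summary.items():
    print(f"Data {name}: mean = {m}, s = {s:.4f}")
```

All three sets share the mean 15.5; only the spread around it differs, which is what the standard deviation captures.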
Coefficient of Variation
§ Measures relative variation
§ Always in percentage (%)
§ Shows variation relative to mean
§ Is used to compare two or more sets of data
measured in different units
Population: $CV = \left( \frac{\sigma}{\mu} \right) \times 100\%$
Sample: $CV = \left( \frac{s}{\bar{x}} \right) \times 100\%$
Compare the Coefficient of
Variation between Data A, Data B
and Data C
[Figure: dot plots of three data sets on an 11–21 axis]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = .9258
Data C: Mean = 15.5, s = 4.57
§ Which data set is more spread out around the mean?
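The comparison can be worked through with the sample formula CV = (s / mean) × 100%, using only the means and standard deviations reported for the three data sets. The function and variable names are illustrative.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Sample coefficient of variation in percent: (s / mean) * 100."""
    if mean_value == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean_value * 100


# (mean, s) as reported for each data set.
reported = {"A": (15.5, 3.338), "B": (15.5, 0.9258), "C": (15.5, 4.57)}

cv = {name: coefficient_of_variation(s, m) for name, (m, s) in reported.items()}
most_spread = max(cv, key=cv.get)

for name, value in cv.items():
    print(f"Data {name}: CV = {value:.1f}%")
print("Most spread out relative to the mean:", most_spread)
```

Because the three means are equal, ranking by CV gives the same order as ranking by s; the CV becomes more informative when means or units differ.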
The Empirical Rule
§ If the data distribution is bell-shaped, then the interval:
§ $\mu \pm 1\sigma$ contains about 68% of the values in
the population
§ $\mu \pm 2\sigma$ contains about 95% of the values in
the population
§ $\mu \pm 3\sigma$ contains about 99.7% of the values
in the population
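For an exactly normal distribution, the three percentages of the empirical rule can be checked against the normal cumulative distribution function. A short sketch using the standard-library `statistics.NormalDist` (the helper name is illustrative):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Proportion of a normal population lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: share_within(k) for k in (1, 2, 3)}
for k, share in coverage.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

Real data that are only roughly bell-shaped will match these proportions approximately, which is why the rule says "about".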
§ Quantitative data are usually described by a
measure of central tendency and a measure of
dispersion
§ In describing data, it is important to select the
measure of central tendency that most accurately
represents the data
§ To do so, it is important to know whether the data are
symmetrical or skewed
Thank you

3. Descriptive statistics.pdf

  • 2. Contents § Introduction § Measures of central tendency § Measures of relative standing § Shape of distribution § Measures of dispersion 2
  • 3. Learning objectives After the end of this session you will be able to: § Compute and interpret the mean, median, and mode for a set of data § Construct and interpret a box and whiskers plot § Compute and interpret the range, variance, standard deviation coefficient of variation for a set of data § Use numerical measures along with graphs, charts, and tables to describe data 3
  • 4. Numerical summary measures Numerical summary measures : A descriptive measure which summarize the data set by a single number § Unlike frequency distributions, indicate the average value or (the middle) and the spread of the values 4
  • 5. Summary Measures Measures of central tendency (Location) Mean Median Mode Measures of Relative Standing Weighted Mean Numerical summary measures Measures of dispersion (Variation) Variance Standard Deviation Coefficient of Variation Range Percentiles Interquartile Range Quartiles 5
  • 6. Measures of central tendency(MCT) § On the scale of values of a variable, there is a certain stage at which the largest number of items tend to cluster § Since this stage is usually in the centre of distribution, the tendency of the statistical data to get concentrated at a certain value is called “central tendency” § The various methods of determining the point about which the observations tend to concentrate are called MCT 6
  • 7. Characteristics of good MCT 1. It should be based on all the observations 2. It should not be affected by the extreme values 3. It should be as close to the minimum & maximum number of values as possible 4. It should have a definite value 5. It should not be subjected to complicated and tedious calculations 6. It should be capable of further algebraic treatment 7. It should be stable with regard to sampling 7
  • 8. Measures of central tendency(MCT) Center and Location Mean Median Mode Weighted Mean        i i i W i i i W w x w w x w X 8
  • 9. Arithmetic Mean: ungrouped data § The Mean is the average of data set (Is the sum of all the observations divided by the total number of observations) – Sample mean – Population mean n = Sample Size N = Population Size n x x x n x x n n i i         2 1 1 N x x x N x N N i i          2 1 1 9
  • 10. Arithmetic Mean § The most common measure of central tendency § Affected by extreme values (outliers) 0 1 2 3 4 5 6 7 8 9 10 Mean = 3 0 1 2 3 4 5 6 7 8 9 10 Mean = 4 3 5 15 5 5 4 3 2 1       4 5 20 5 10 4 3 2 1       10
  • 11. Grouped data § In calculating the mean from grouped data, we assume that all values falling into a particular class interval are located at the midpoint of the interval. It is calculated as follows: Sample mean: $\bar{x} = \frac{\sum_i m_i f_i}{\sum_i f_i}$ § Where: $m_i$ = the midpoint of the ith class interval, $f_i$ = the frequency of the ith class interval
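The grouped-data mean can be sketched in Python as follows. The midpoints and frequencies below are hypothetical (the age table used in the slides is not reproduced in this transcript), and `grouped_mean` is an illustrative name.

```python
def grouped_mean(midpoints, frequencies):
    """Approximate mean from grouped data: sum(m_i * f_i) / sum(f_i)."""
    if len(midpoints) != len(frequencies):
        raise ValueError("need one frequency per class midpoint")
    total = sum(frequencies)
    if total == 0:
        raise ValueError("total frequency must be positive")
    return sum(m * f for m, f in zip(midpoints, frequencies)) / total


# Hypothetical classes 10-19, 20-29, 30-39 with midpoints 14.5, 24.5, 34.5.
midpoints = [14.5, 24.5, 34.5]
frequencies = [2, 3, 5]
approx_mean = grouped_mean(midpoints, frequencies)  # 275 / 10
print(approx_mean)
```

The result is an approximation, because every observation in a class is treated as if it sat at the class midpoint.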
  • 12. Example: Compute the mean age of 169 subjects from the grouped data
  • 13. Properties of Arithmetic Mean § For a given set of data there is one and only one arithmetic mean (uniqueness) § Easy to calculate and understand (simple) § Influenced by each and every value in a data set § Greatly affected by extreme values § A poor measure of location if the underlying distribution is not normal (Gaussian), for example when it is strongly skewed § For grouped data, the arithmetic mean cannot be calculated if any class interval is open-ended
  • 14. Median: Ungrouped data § The median is the value of the middle term in a data set that has been ranked in increasing order § In an ordered array, the median is the “middle” number – If n or N is odd, the median is the middle number – If n or N is even, the median is the average of the two middle numbers § [Figure: two dot plots on a 0–10 axis, each with Median = 3]
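The odd/even rule for the median can be sketched in Python. The first two lists reuse the values from the mean example on slide 10; the four-value list is a hypothetical addition to show the even case, and `median_of` is an illustrative name.

```python
def median_of(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


median_a = median_of([1, 2, 3, 4, 5])    # odd n: middle value
median_b = median_of([1, 2, 3, 4, 10])   # the outlier does not move the median
median_even = median_of([1, 2, 3, 4])    # even n: average of 2 and 3
print(median_a, median_b, median_even)
```

Both five-value lists have the same median, while their means differ (3 versus 4).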
  • 16. Grouped data § In calculating the median from grouped data, we assume that the values within a class interval are evenly distributed through the interval § The first step is to locate the class interval that contains the median: find n/2, then take the first class interval whose cumulative frequency is at least n/2 § Then, use the following formula
  • 17. Median: $\tilde{x} = L_m + \left(\frac{\frac{n}{2} - F_c}{f_m}\right) W$ where $L_m$ = lower true class boundary of the interval containing the median, $F_c$ = cumulative frequency of the interval immediately before (listed just above) the median class interval, $f_m$ = frequency of the interval containing the median, $W$ = class interval width, $n$ = total number of observations
  • 18. Example: Compute the median age of 169 subjects from the grouped data
  • 19. § n/2 = 169/2 = 84.5 § n/2 = 84.5 falls in the 3rd class interval § Lower class boundary = 29.5, upper class boundary = 39.5, so W = 10 § Frequency of the class, fm = 47 § (n/2 – Fc) = 84.5 – 70 = 14.5 § Median = 29.5 + (14.5/47) × 10 ≈ 32.59 ≈ 33
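The worked example on slide 19 can be reproduced with a small Python sketch of the grouped-median formula. The function and parameter names are illustrative; the numbers (29.5, 169, 70, 47, 10) are the ones given on the slide.

```python
def grouped_median(lower_boundary, n, cum_freq_before, class_freq, width):
    """L_m + ((n/2 - F_c) / f_m) * W for the class containing the median."""
    if class_freq <= 0:
        raise ValueError("the median class must have a positive frequency")
    return lower_boundary + ((n / 2 - cum_freq_before) / class_freq) * width


median_age = grouped_median(
    lower_boundary=29.5,   # L_m: lower true boundary of the median class
    n=169,                 # total number of observations
    cum_freq_before=70,    # F_c: cumulative frequency before the median class
    class_freq=47,         # f_m: frequency of the median class
    width=10,              # W: class width (39.5 - 29.5)
)
print(round(median_age, 2))
```

The value agrees with the slide's result of roughly 33 after rounding to a whole number.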
  • 20. Properties of Median § There is only one median for a given set of data (uniqueness) § The median is easy to calculate § The median is a positional average and hence is insensitive to very large or very small values § The median can be calculated even with open-ended intervals if the sample size is known § It is determined mainly by the middle points and is less sensitive to the remaining data points (weakness)
  • 21. Mode: Ungrouped data § Value that occurs most often § Not affected by extreme values § Used for either numerical or categorical data § There may be no mode § There may be several modes § [Figure: two dot plots, one on a 0–14 axis with Mode = 5 and one on a 0–6 axis with no mode]
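A short Python sketch can illustrate the three cases (one mode, several modes, no mode). The data lists are hypothetical, and the convention used here, reporting "no mode" when every distinct value occurs equally often, is an assumption made for this sketch.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means 'no mode'."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumed convention: if several distinct values all tie, report no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


single_mode = modes([1, 3, 5, 5, 5, 9])       # one value clearly most frequent
no_mode = modes([0, 1, 2, 3, 4, 5, 6])        # every value occurs once
several_modes = modes([1, 1, 2, 2, 3])        # two values tie for most frequent
print(single_mode, no_mode, several_modes)
```

Because the function only counts occurrences, it also works for categorical data such as a list of strings.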
  • 22. Mode: Grouped data § To find the mode of grouped data, we usually refer to the modal class, the class interval with the highest frequency § If a single value for the mode of grouped data must be specified, it is taken as the midpoint of the modal class interval
  • 23. Properties of Mode § It is not affected by extreme values § It can be calculated for distributions with open-ended classes § Often its value is not unique § The main drawback of the mode is that often it does not exist
  • 24. Measures of Relative Standing § Where does one particular measurement stand in relation to the other measurements in the data set? § Descriptive measures that locate the relative position of an observation in relation to the other observations are called measures of relative standing
  • 25. Measures of Relative Standing § Percentiles and quartiles: – 1st quartile = 25th percentile – 2nd quartile = 50th percentile = median – 3rd quartile = 75th percentile § The pth percentile in a data array is a number such that p% of the observations in the data set fall below it and (100 – p)% fall above it (where 0 ≤ p ≤ 100)
  • 26. Percentiles § The pth percentile in an ordered array of n values is the value in the ith position, where $i = \frac{p}{100}(n + 1)$ § Example: The 60th percentile in an ordered array of 19 values is the value in the 12th position: $i = \frac{p}{100}(n + 1) = \frac{60}{100}(19 + 1) = 12$
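The position rule on slide 26 can be sketched in Python. The function name `percentile_position` is illustrative; the example values (p = 60, n = 19) are the ones used on the slide.

```python
def percentile_position(p, n):
    """1-based position i = p/100 * (n + 1) of the pth percentile in n ordered values."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    if n < 1:
        raise ValueError("n must be at least 1")
    return p * (n + 1) / 100


position = percentile_position(60, 19)  # 60 * 20 / 100
print(position)
```

When the position is not a whole number, the percentile lies between two neighbouring ordered values, as in the quartile examples on the later slides.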
  • 27. Percentiles § Commonly used percentiles – First (lower) decile = 10th percentile – First (lower) quartile, Q1 = 25th percentile – Second (middle) quartile, Q2 = 50th percentile – Third (upper) quartile, Q3 = 75th percentile – Ninth (upper) decile = 90th percentile
  • 28. Quartiles § Quartiles split ordered data into 4 equal portions of 25% each, separated by Q1, Q2, and Q3 § Q1 and Q3 are measures of non-central location § Q2 = the median
  • 29. Quartiles § Each quartile has a position and a value – With the data in an ordered array, the position of $Q_i$ is $\frac{i(n + 1)}{4}$ – The value of $Q_i$ is the value associated with that position in the ordered array § Example: Data in ordered array: 11 12 13 16 16 17 18 21 22 (n = 9); position of $Q_1 = \frac{1(9 + 1)}{4} = 2.5$, so $Q_1 = \frac{12 + 13}{2} = 12.5$
  • 30. Example The prices ($) of 18 brands of walking shoes: 40 60 65 65 67 68 68 70 70 70 70 70 70 74 75 75 90 95 § The position of Q1 is 0.25(18 + 1) = 4.75, so Q1 is 3/4 of the way between the 4th and 5th ordered measurements: Q1 = 65 + 0.75(67 – 65) = 66.5
  • 31. Example The prices ($) of 18 brands of walking shoes: 40 60 65 65 67 68 68 70 70 70 70 70 70 74 75 75 90 95 § The position of Q3 is 0.75(18 + 1) = 14.25, so Q3 is 1/4 of the way between the 14th and 15th ordered measurements: Q3 = 74 + 0.25(75 – 74) = 74.25 § And IQR = Q3 – Q1 = 74.25 – 66.5 = 7.75
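The quartile calculations on slides 29 to 31 can be sketched in Python using the i(n + 1)/4 position rule with linear interpolation between neighbouring values. The function name `quartile` is illustrative; the data are the walking-shoe prices from slide 30 and the nine-value array from slide 29.

```python
def quartile(values, i):
    """Q_i at 1-based position i(n + 1)/4, interpolating between neighbours."""
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2 or 3")
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / 4
    whole = int(position)            # rank of the lower neighbour
    fraction = position - whole      # how far to move toward the upper neighbour
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


prices = [40, 60, 65, 65, 67, 68, 68, 70, 70, 70, 70, 70, 70, 74, 75, 75, 90, 95]
q1 = quartile(prices, 1)
q3 = quartile(prices, 3)
iqr = q3 - q1
q1_small = quartile([11, 12, 13, 16, 16, 17, 18, 21, 22], 1)
print(q1, q3, iqr, q1_small)
```

Statistical software may use other quartile conventions, so results from a library function can differ slightly from this rule.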
  • 32. Shape of a Distribution § Describes how data are distributed § Measures of shape: symmetric or skewed (asymmetric) – Left-skewed (longer tail extends to the left): Mean < Median < Mode – Symmetric: Mean = Median = Mode – Right-skewed (longer tail extends to the right): Mode < Median < Mean
  • 33. The Five-Number Summary § One way to profile a data set is the “five-number summary,” which consists of: 1. The smallest measurement 2. The first quartile, Q1 3. The median, Q2 4. The third quartile, Q3 5. The largest measurement § Displayed visually using a box-and-whisker plot
  • 34. The Box-and-Whisker Plot § 5-number summary – Median, Q1, Q3, Xsmallest, Xlargest § Box plot – Graphical display of data using the 5-number summary § [Figure: box plot on an axis from 4 to 12 marking Minimum, Q1, Median (Q2), Q3, and Maximum]
  • 35. Distribution Shape & Box-and-Whisker Plot § Skewed distributions usually have a long whisker in the direction of the skewness § [Figure: box plots marking Q1, Q2, and Q3 for left-skewed, symmetric, and right-skewed distributions]
  • 36. Shape of a Distribution and Quartiles § If the distribution is symmetric, then the upper and lower quartiles should be approximately equally spaced from the median § If the upper quartile is farther from the median than the lower quartile, then the distribution is positively skewed § If the lower quartile is farther from the median than the upper quartile, then the distribution is negatively skewed
  • 37. Outlier § A value located at a distance of more than 1.5(IQR) from the box – Lower fence: Q1 – 1.5 IQR – Upper fence: Q3 + 1.5 IQR § Measurements beyond the upper or lower fence are outliers and are marked with an asterisk (*)
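The fence rule can be sketched in Python. As an illustration it is applied to the walking-shoe prices from slide 30, with Q1 = 66.5 and Q3 = 74.25 taken from the slides rather than recomputed; the function names are illustrative.

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values lying strictly beyond either fence."""
    lower, upper = outlier_fences(q1, q3)
    return [x for x in values if x < lower or x > upper]


prices = [40, 60, 65, 65, 67, 68, 68, 70, 70, 70, 70, 70, 70, 74, 75, 75, 90, 95]
lower_fence, upper_fence = outlier_fences(66.5, 74.25)
outliers = find_outliers(prices, 66.5, 74.25)
print(lower_fence, upper_fence, outliers)
```

Under this rule the cheapest pair and the two most expensive pairs fall outside the fences, while the price of 60 does not.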
  • 38. Measures of Variation § Range § Interquartile Range § Variance: Population Variance, Sample Variance § Standard Deviation: Population Standard Deviation, Sample Standard Deviation § Coefficient of Variation
  • 39. Measures of Variation § Measures that quantify the variation or dispersion of a set of data from its central location § The amount is small when the values are close together and large when the values are far apart from each other § If all the values are the same, there is no dispersion § How much are the observations spread out around the mean value?
  • 40. Measures of Variation § Measures of variation give information on the spread or variability of the data values § [Figure: distributions with the same center but different variation]
  • 41. Measures of Variation § The more spread out or dispersed the data, the larger the measures of variation § The more concentrated the data, the smaller the measures of variation § If all observations are equal, measures of variation = zero § All measures of variation are non-negative
  • 42. Range § Simplest measure of variation § Difference between the largest and the smallest observations: $\text{Range} = x_{\text{maximum}} - x_{\text{minimum}}$ § Example (data plotted on a 0–14 axis): Range = 14 – 1 = 13
  • 43. Disadvantages of the Range § Ignores the way in which data are distributed: two differently distributed data sets on a 7–12 axis both have Range = 12 – 7 = 5 § Sensitive to outliers: – 1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5: Range = 5 – 1 = 4 – 1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120: Range = 120 – 1 = 119
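The sensitivity of the range to a single outlier can be checked with a few lines of Python, using the two 25-value data sets listed on slide 43. The function name `value_range` is illustrative.

```python
def value_range(values):
    """Largest observation minus smallest observation."""
    if not values:
        raise ValueError("range of an empty data set is undefined")
    return max(values) - min(values)


# Eleven 1s, eight 2s, four 3s, one 4, then either 5 or 120 as the last value.
typical = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

range_typical = value_range(typical)
range_outlier = value_range(with_outlier)
print(range_typical, range_outlier)
```

Changing one observation out of 25 moves the range from 4 to 119, which is why the range alone is a fragile summary of spread.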
  • 44. Interquartile Range § We can eliminate some outlier problems by using the interquartile range § It excludes the highest- and lowest-valued observations and calculates the range of the remaining values § Also known as the midspread: the spread in the middle 50% § Interquartile range = 3rd quartile – 1st quartile
  • 45. Interquartile Range § Not affected by extreme values § Example: X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70, with 25% of the observations in each of the four segments; Interquartile range = 57 – 30 = 27
  • 46. Variance § Shows variation about the mean § Average of squared deviations of values from the mean – Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ – Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
  • 47. Standard Deviation § Most commonly used measure of variation § Shows variation about the mean § Has the same units as the original data – Sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$ – Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
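The difference between the sample (n – 1) and population (N) divisors can be seen in a short Python sketch. It uses the values 1, 2, 3, 4, 5 from the mean example on slide 10 as illustrative data and cross-checks the hand-written formulas against Python's standard `statistics` module.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    if n < 1:
        raise ValueError("population variance needs at least one observation")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [1, 2, 3, 4, 5]
s2 = sample_variance(data)             # 10 / 4
sigma2 = population_variance(data)     # 10 / 5
s = math.sqrt(s2)                      # sample standard deviation
sigma = math.sqrt(sigma2)              # population standard deviation
print(s2, sigma2, round(s, 4), round(sigma, 4))
print(statistics.variance(data), statistics.pvariance(data))
```

The standard deviations are simply the square roots of the variances, which brings the result back to the units of the original data.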
  • 48. Variance vs. Standard Deviation § Both measure the average “scatter” about the mean § Variance computations produce “squared” units, which makes interpretation more difficult – For example, kg² has no direct practical meaning § Since it is the square root of the variance, the standard deviation is expressed in the same units as the original data § Therefore, the standard deviation is the most commonly used measure of variation
  • 49. Comparing Standard Deviations § Data A: Mean = 15.5, s = 3.338 § Data B: Mean = 15.5, s = 0.9258 § Data C: Mean = 15.5, s = 4.57 § [Figure: dot plots of the three data sets on an 11–21 axis]
  • 50. Coefficient of Variation § Measures relative variation § Always expressed as a percentage (%) § Shows variation relative to the mean § Is used to compare two or more sets of data measured in different units – Population: $CV = \left(\frac{\sigma}{\mu}\right) \times 100\%$ – Sample: $CV = \left(\frac{s}{\bar{x}}\right) \times 100\%$
  • 51. Compare the Coefficient of Variation between Data A, Data B and Data C § Data A: Mean = 15.5, s = 3.338 § Data B: Mean = 15.5, s = 0.9258 § Data C: Mean = 15.5, s = 4.57 § Which data set is more spread out around the mean?
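The comparison asked for on slide 51 can be worked through with a small Python sketch. The means and standard deviations are the summary values shown on the slide; the function name and the dictionary layout are illustrative choices.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


# (mean, sample standard deviation) for each data set, as given on the slide.
summaries = {"A": (15.5, 3.338), "B": (15.5, 0.9258), "C": (15.5, 4.57)}
cv = {name: coefficient_of_variation(s, mean) for name, (mean, s) in summaries.items()}
most_spread = max(cv, key=cv.get)

for name in sorted(cv):
    print(f"Data {name}: CV = {cv[name]:.2f}%")
print("Most spread out relative to the mean: Data", most_spread)
```

Because all three data sets share the same mean, ranking by CV gives the same order as ranking by standard deviation; the CV becomes more informative when the means or the units differ.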
  • 52. The Empirical Rule § If the data distribution is bell-shaped, then the interval: § $\mu \pm 1\sigma$ contains about 68% of the values in the population § $\mu \pm 2\sigma$ contains about 95% of the values in the population § $\mu \pm 3\sigma$ contains about 99.7% of the values in the population
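The percentages in the empirical rule can be checked against the normal distribution with Python's standard `statistics.NormalDist` class. This is a sketch that assumes an exactly normal population; the mean of 100 and standard deviation of 15 in the interval example are hypothetical values.

```python
from statistics import NormalDist


def coverage_within(k):
    """Share of a normal population lying within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def empirical_interval(mu, sigma, k):
    """Interval (mu - k*sigma, mu + k*sigma)."""
    return mu - k * sigma, mu + k * sigma


for k in (1, 2, 3):
    print(f"within {k} sd: {coverage_within(k):.4f}")

# Hypothetical population with mean 100 and standard deviation 15.
two_sigma_interval = empirical_interval(100, 15, 2)
print(two_sigma_interval)
```

The rounded figures of 68%, 95% and 99.7% on the slide match these normal-distribution values; for data that are only roughly bell-shaped they are approximations.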
  • 54. Summary § Quantitative data are usually described by a measure of central tendency and a measure of variation § In describing data, it is important to select the measure of central tendency that most accurately represents the data § To do so, it is important to know whether the data are symmetrical or skewed