SlideShare a Scribd company logo
Chapter 2
Descriptive statistics
By
Hirbo Shore (MPH, Assistant Prof. of Epidemiology)
Haramaya University, CHMS, School of Public Health
10/16/2019 1
The objective of the chapter
• At the end the session, students will able to;
• Describe descriptive statistics
• Compute and interpret measure of central tendency
• Compute and interpret measure of central tendency
• Compute and interpret measure of dispersion
Aspects of statistics
There two major categories of statistics
1. Descriptive statistics
2. Inferential statistics
Descriptive statistics
• Statistical procedures used to summarise, organise, and
simplify data.
• Encompasses tools devised to organize and display data in an
accessible fashion
• It involves the quantification of recurring phenomena.
Descriptive statistics,…
• This process should be carried out in such a way that reflects
overall findings
– Raw data is made more manageable
– Raw data is presented in a logical form
– Patterns can be seen from organised data
Descriptive statistics,…
Summarizing and organizing data can be achieved through:
1. Frequency Distributions
2. Graphical Representations
3. Measures of Central Tendency
4. Measures of variability
Simple Frequency Distributions
• It is useful for categorical variable
• the following information can be obtained
– it allows you to pick up at a glance some valuable
information, such as highest, lowest value.
– ascertain the general shape or form of the distribution
– make an informed guess about central tendency values
Table 1. Marital status of participants
Frequency distribution for numeric data
Frequency distribution of group data
Weight Group Frequency Cum. frequency Rel. frequency Cum. Rel. freq
1500-2000 0 0 0 0
2000-2500 13 13 13.54 13.54
2500-3000 21 34 21.88 35.42
3000-3500 38 72 39.58 75
3500-4000 18 90 18.75 93.75
4000-4500 6 96 6.25 100
Total 96 100
Frequency distribution for numeric data,…
• If there is no good knowledge on the data one can use the
following rule as a guide to construct class intervals;
• Sturge’s Rule
– K=1 + 3.322log 𝑛, k is number of class intervals, n is
sample size
K
– W=Range
, W is width of the class interval
– Range, the difference of maximum and minimum value
Diagrammatic Representation
• Qualitative data
– Bar graphs
– Pie chart
• Quantitative data
– Histogram
– Frequency polygon
– Line graph
– Scatter plot
Pie chart
Married
single
divorced
widow
Simple Bar Chart
• Displays the frequency distribution of a qualitative variable
• Take into account one or more classification factors
• a single bar for each category is drawn where the height of the
bar represents the frequency of the category
Simple bar chart,…
30
25
20
15
10
5
0
hemorrhage Embolism
Percent
hypertensive sepsis abortion
disorders
Causes of maternal mortality
Lale Say, et al. Global causes of maternal death: a W H O systematic analysis. T h e Lancet Global Health,
2014: 2(6): e323-e333
Multiple bar graph
Figure. Marital status and sex
Stacked bar graph
Malaria Types
Months P.Faciparum P.Vivax O.ovale P.malaria
August 197 123 12 8
October 176 148 15 11
Decembe
r 146 169 19 13
Total 519 440 46 32
30%
20%
10%
0%
40%
50%
60%
100%
90%
80%
70%
August October December
P.malaria
O.ovale
P.Vivax
P.Faciparum
10/16/2019 17
Figure. Stacked bar graph
Graphs for quantitative data
10/16/2019 18
Histogram
Weight Group Frequency
1500-2000 0
2000-2500 13
2500-3000 21
3000-3500 38
3500-4000 18
4000-4500 6
Total 96
Frequency
10/16/2019 19
40
35
30
25
20
15
10
5
0
1500-2000 2000-2500 2500-3000 3000-3500 3500-4000 4000-4500
Birth Weight
Frequency polygon
10/16/2019 20
Stem & leaf plots
of each
• Draw a vertical line and place the first digits
class‐called the “stem” on the left side of the line.
• The numbers on the right side of the vertical line present the
second digit of each observation; they are the “leaves”
10/16/2019 22
Figure. Stem –and-leaf plot of parity
Box and Whisker plot
• Used to display information when the objective is to illustrate
certain locations in the distribution.
• Abox is drawn with the top of the box at the third quartile and
the bottom at the first quartile.
• The location of the mid‐point of the distribution is indicated
with a horizontal line in the box.
• straight lines, or whiskers, are drawn from the center of the top
of the box to the largest observation and from the center of the
bottom of the box to the smallest observation.
Box and Whisker plot,…
• Points required to construct box and whisker plot
– Lowest value
– First quartile(Q1)
– Second quartile/median(Q2)
– Third quartile(Q3)
– Larges value
Box and Whisker plot,…
40
60
100
120
mweigh
t
80
Minimum
value
Q2
Q1
Q2
Minimum
value
Box and Whisker plot,…
40
60
100
120 female male
mweight
80
Graphs by sex
Scatter plot
• display the relationship between two characteristics
• Used when both the variables are quatitative
• When one of the characteristics is qualitative and the other is
quantitative, the data can be displayed in box and whisker
plots.
Scatter plot,…
20
00
25
00
35
00
40
00
birthwe
ight
3000
4 0 6 0 1 0 0 1 2 0
8 0
m w e i g h t
Scatter plot,…
200
0
250
0
300
0
350
0
400
0
40 60 120
80 100
mweight
birthweight Fitted values
Line graph
Figure Trends in contraceptive use
line,…
Figure Trends in exclusive breastfeeding
Line graph, … .
Figure. Trends in early childhood mortality rates
Numerical summary measures
• summarization of data by means of descriptive measures
• Statistic a descriptive measure computed from the values of a
sample
• Parameter a descriptive measure computed from the values of
a population
parameter
• a measure, quantity, or numerical value obtained from the
population values X1, X2, . . ., XN
• As the value of a parameter is generally unknown but constant,
• we are interested in estimating a parameter to understand the
characteristic of a population using the sample data
statistics
• Ameasure, quantity, or numerical value obtained from the
sample values x1, x2, . . ., xn.
• The values of statistics can be computed from the sample as
functions of observations.
• Used to approximate (estimate) parameters.
Statistics
• The data can be organized and presented using figures and
tables in order to display the important features and
characteristics contained in data.
• However, tables and figures may not reveal the underlying
characteristics very comprehensively.
• For instance, if we want to characterize the data with a single
value, it would be very useful not only for communicating to
everyone but also for making comparisons.
• Let us consider that we are interested in number of C-section
births among the women during their reproductive period in
different regions of a country.
• To collect information on this variable, we need to consider
women who have completed their reproductive period.
• With tables and figures, we can display their frequency
distributions.
• However, to compare tables and figures for different regions, it
would be a very difficult task.
• In such situations, we need a single statistic which contains the
salient feature of the data for a region.
• This single value can be a statistic such as arithmetic mean in
this case
• In case of a qualitative variable, it could be a proportion.
• With this single value, we can represent the whole data set in a
meaningful way to represent the corresponding population
characteristic and we may compare this value with similar
other values at different times or for different groups or
regions conveniently.
• Asingle value makes it very convenient to represent a whole
data set of small, medium, or large size.
• Four types of descriptive measures are commonly used for
characterizing underlying features of data:
1. Measures of central tendency,
2. Measures of dispersion,
3. Measures of skewness,
4. Measures of kurtosis.
Measures of Central Tendency
• The values of a variable often tend to be concentrated around
the center of the data.
• The center of the data can be determined by the measures of
central tendency.
• Ameasure of central tendency is considered to be a typical (or
a representative) value of the set of data as a whole.
• The measures of central tendency are sometimes referred as
the measures of location too.
• The most commonly used measures of central tendency are;
• mean
• median
• mode
Mean
Population Mean (µ)
• If X1, X2, . . ., XN are the population values, then the
population mean is
µ=
𝑥
𝑖
∑𝑁
= 𝑖=1
X1+𝑋2+⋯𝑋𝑁
𝑁 𝑁
• The population mean µ is a parameter.
• It is usually unknown, and we are interested to estimate its
value.
Sample mean
• If x1, x2, . . ., xn are the sample values, then the sample mean is
𝑋=
𝑥1+𝑥2,…+ 𝑥𝑛 𝑥
𝑖
∑𝑛
= 𝑖=1
𝑛
𝑛
• It is noteworthy that the sample mean x is a statistic. It is
known and we can calculate it from the sample.
• The sample mean x is used to approximate (estimate) the
population mean, µ.
• Amore precise way to compute the mean from a frequency
distribution is
𝑓 𝑥 +𝑓 𝑥 ,…+𝑓 𝑥 𝑓𝑖𝑥
𝑖
∑𝑛
𝑋= 1 1 2 2 𝑘 𝑘
= 𝑖=1
𝑓𝑖+𝑓2+⋯+𝑓𝑘 𝑛
where fi denotes the frequency of distinct sample observation
xi, i =1, . . ., k.
• Example 2.1
• Let us consider the data on number of children ever born
reported by 20 women in a sample: 4, 2, 1, 0, 2, 3, 2, 1, 1, 2, 3,
4, 3, 2, 1, 2, 2, 1, 3, 2.
• n = 20.
• We want to find the average number of children ever born
from this sample.
• The sample mean is
•
10/16/2019 49
𝑥=4+2+1+0+2+3+2+1+1+2+3+4+3+2+1+2+2+1+3+2
20
• 𝑥=41 =2.05
20
• Example 2.2
– Employing the same data shown above, we can compute
the arithmetic mean in a more precise way by using
frequencies. We can construct a frequency distribution table
from this data to summarize the data first
Afrequency distribution table for number of children ever born of 20 women
10/16/2019 50
𝑥=0x1+1x5+2x8+3x4+4x2
10/16/2019 51
1+5+8+4+2
41
20
𝑥= =2.05
𝑥=2.05
Features of mean
10/16/2019 52
• Simplicity: The mean is easily understood and easy to
compute.
• Uniqueness: There is one and only one mean for a given set of
data.
• The mean takes into account all values of the data.
• Disadvantage
– The arithmetic mean is influenced by extreme values
• Therefore, the mean may be distorted by extreme values
– cannot be used for qualitative data.
– can only be found for quantitative variables.
10/16/2019 53
median
10/16/2019 54
• The value which divides the ordered array into two equal parts.
• The numbers in the first part are less than or equal to the
median, and the numbers in the second part are greater than or
equal to the median.
• Notice that 50% (or less) of the data are ≤Median and 50% (or
less) of the data are≥ Median.
𝑟𝑎𝑛𝑘 =
• Calculating the Median;
– Let x1, x2, . . ., xn be the sample values.
– The sample size (n) can be odd or even.
– The steps for computing the median are discussed below:
• First, we order the sample to obtain the ordered array.
• Suppose that the ordered array is;
y1, y2, . . ., yn
• We compute the rank of the middle value(s):
𝑛 + 1
2
10/16/2019 55
• If the sample size (n) is an odd number, there is only
one value in the middle, and the rank will be an integer:
𝑟𝑎𝑛𝑘 = 𝑛+1
=m, (m is integer)
2
• The median is the middle value of the ordered
observations, which is
Median =ym
10/16/2019 56
• If the sample size (n) is an even number, there are two values
in the middle, and the rank will be an integer plus 0.5.
𝑟𝑎𝑛𝑘 = 𝑛+1
=m+0.5
10/16/2019 57
2
– Therefore, the ranks of the middle values are (m) and (m +
1).
– The median is the mean (average) of the two middle values
of the ordered observations:
median= 𝑦𝑚+𝑦𝑚+1
2
Example 2.4 (Odd number)
Find the median for the sample values: 10, 54, 21, 38, 53.
10/16/2019 58
• Solution
• n = 5 (odd number)
• There is only one value in the middle.
• The rank of the middle value is
median= 𝑛+1
=5+1
=3, (m=3)
2 2
The median = 38 (unit).
10/16/2019 59
• Example 2.5 (Even number) Find the median for the sample
values: 10, 35, 41, 16, 20, 32.
Solution
– n = 6 (even number)
– There are two values in the middle. The rank is
median= 𝑛+1
= 6+1
=3.5, (m=3)
2 2
10/16/2019 60
10/16/2019 61
Median
10/16/2019 62
Advantage
– Simplicity: The median is easily understood and easy to
compute.
– Uniqueness: There is only one median for a given set of
data.
– Not as drastically affected by extreme values as is the
mean.
• Disadvantages:
– Does not take into account all values of the sample after
ordering the data in ascending order.
– Can only be found for quantitative variables.
10/16/2019 63
Mode
10/16/2019 64
• Aset of values is that value which occurs most frequently (i.e.,
with the highest frequency).
• If all values are different or have the same frequencies, there
will be no mode.
• Aset of data may have more than one mode
Mode from seven sets of data
10/16/2019 65
Percentiles and Quartiles
10/16/2019 66
• Represent a measure of position
• The measures of position indicate the location of a data set
• These descriptive measures are called position parameters
because they can be used to designate certain positions on the
horizontal axis when the distribution of a variable is graphed
Percentile,…
10/16/2019 67
– Given a set of observations x1, x2; . . ., xn, the Pth percentile,
Pp, p = 0,1, 2,…, 100, is the value of X such that p percent
or less of the observations are less than Pp and (100 − p)
percent or more of the observations are greater than Pp.
Percentile,…
– The 10th percentile is denoted by P10 where P10
100
= 10 𝑛+1 th in the ordered value
– The 50th percentile is P50;
100
= 50 𝑛+1 th in the ordered value = median
10/16/2019 68
– The median is a special case of percentile and it represents
the 50th percentile
Quartile
10/16/2019 69
– Given a set of observations x1, x2, . . .,xn, the rth quartile, Qr
, r = 1,2,3, is the value of X such that (25x r) percent or
less of the observations are less than Qr and [100 − (25 r)]
percent or less of the observations are greater than Qr.
Quartile, …
10/16/2019 70
Measures of Dispersion
10/16/2019 71
• The dispersion of a set of observations refers to the variation
contained in the data.
• Ameasure of dispersion conveys information regarding the
amount of variability present in a set of data.
• The dispersion is small when the values are close together
• Measures of dispersion includes range, variance, standard
deviation, and coefficient of variation
Measures of Dispersion,…
Figure displaying small and large variations
10/16/2019 72
Measures of Dispersion,…
Figure displaying smaller and larger variations using the same
center
10/16/2019 73
Measures of Dispersion,…
• Example; Let us consider the hypothetical data sets of size 5
with same measure of central tendency
If we represent the samples by only measure of central tendency,
or by 𝒙 in this case, then all the samples have the same value
10/16/2019 74
Measures of Dispersion,…
10/16/2019 75
• In other words, although we observe that the measure of
central tendency remains same, the variation contained in each
data set, or the variation in values of a data set from its central
value are not same at all
• This indicates that the MCT alone fails to characterize
different types of properties of data sets
Measures of Dispersion,…
10/16/2019 76
• We want to represent the data by a small number of measures,
if possible by a single value, but sometimes a single value
cannot represent the data adequately or sufficiently
• Therefore, the measure of variation is needed to characterize
the data in addition to a measure of central tendency
Range
• The range is the difference between the largest value (Max)
and the smallest value (Min):
Range (R) = Max – Min
• Example
10/16/2019 77
Range,…
10/16/2019 78
• The unit of the range is the same as the unit of the data.
• The usefulness of the range is limited.
• A poor measure of the dispersion because it only takes into
account two of the values, maximum and minimum; however,
it plays a significant role in many applications to have a quick
view of the dispersion in a data set
Interquartile Range
10/16/2019 79
• Adisadvantage of the range is that it is computed using the
two extreme values of the sample data
• If the data contain extreme outlier values, then the range is
affected by those outliers.
• Interquartile range can be used to overcome this limitation of
the range
• Computed by taking the difference between the third quartile
and the first quartile
Interquartile Range,…
10/16/2019 80
• Provides a more meaningful measure that represents the range
of the middle 50% of the data.
• The interquartile range is defined as
IQR=Q3 - Q1
• As the extreme values are not considered, the IQR is a more
robust measure of variation in the data and is not affected by
outliers
Box-and-Whisker Plot
10/16/2019 81
• Measure of location in which value around which other values
are clustered.
• Shows the variation or spread in the data from the central
value
• Also shows whether the data are symmetric or not
• The box contains features included in the middle 50% of the
observations, and whiskers show the minimum and maximum
values
Box-and-Whisker Plot
• Represents the main features of the data very efficiently and
precisely
• Example: Consider the following data set:
14.6, 24.3, 24.9, 27.0, 27.2, 27.4, 28.2, 28.8, 29.9, 30.7, 31.5,
31.6, 32.3, 32.8, 33.3, 33.6, 34.3, 36.9, 38.3, 44.0.
n=20
𝑡ℎ
Q1= (𝑛+1)
=5.25th observation
10/16/2019 82
4
=5th +0.25(6th -5th)=> 27.2+0.25(27.4-27.2)
=27.2+0.05=27.25
Box-and-Whisker Plot,…
4 4
𝑡ℎ
Q3= 3(𝑛+1)
= 3(20+1)
=15.75th ordered observation
=15.75th ordered observation +0.75(16th -15th)
=33.3 +0.75(33.6-33.3)
=33.525
∴ IQR=33.525 -27.25 =6.275
10/16/2019 83
The Variance
10/16/2019 84
• Uses the mean as a point of reference
• an average measure of squared difference between each
observation and arithmetic mean
• Small when the observations are close to the mean, is large
when the observations are spread out from the mean, and is
zero (no variation) when all observations have the same value
(concentrated at the mean)
• Shows, on an average, how close the values of a variable are to
the arithmetic mean
The Variance,…
• Deviations of sample values from the sample mean
– Let x1, x2, . . ., xn be the sample values, and x be the sample mean.
– The deviation of the value xi from the sample mean x is
𝑥𝑖 − 𝑥
– The squared deviation is 𝑥𝑖 − 𝑥 𝟐
𝟐
10/16/2019 85
– The sum of squared deviations is ∑𝑛 𝑥𝑖 − 𝑥
𝑖
=1
The Population Variance, σ2
• Let X1, X2, . . ., XN be the population values. The population
variance, δ2, is defined by
∑𝑁
σ2 = 𝑖=1
2
(𝑥𝑖−μ)
𝑁
𝑋 −𝜇 2
+(𝑋 −𝜇)2+⋯+ 𝑋 −𝜇
10/16/2019 86
2
= 1 2 𝑁
(unit)2
𝑁
• It may be noted here that σ2 is a parameter because it is
obtained from the population values, it is generally unknown.
It is also always true that σ2 ≥0.
The Sample Variance, S2
• Let x1, x2, . . ., xn be the sample values. The sample variance s2
is defined by
(𝑥𝑖−𝑥)2
∑𝑛
= 𝑖=1
𝑛−1
∑𝑛
= 𝑖=1(𝑥1−𝑥)2 + 𝑥𝑖−𝑥 2
…,+(𝑥𝑖−𝑥)2
𝑛
−1
10/16/2019 87
S2 ,….
• Example
– We want to compute the sample variance of the following
sample values: 10, 21, 33, 53, 54.
10/16/2019 88
Standard Deviation
• The variance represents squared units, and therefore is not
appropriate measure of dispersion when we wish to express
the concept of dispersion in terms of the original unit.
• The standard deviation is the square root of the variance
– Population standard deviation is σ= 𝜎2
– Sample standard deviation is S= 𝑆2
10/16/2019 89
Standard Deviation,…
S=
(𝑥𝑖−𝑥)2
∑𝑛
𝑖=1
𝑛−1
For the previous example, the sample standard deviation is
S= 𝑆2= 376.6 = 19.41
10/16/2019 90
Coefficient of Variation (C.V.)
10/16/2019 91
• Used to compare the variation of two variables
• we cannot use the variance or the standard deviation directly to
compare variation of two variable, because
– the variables might have different units
– the variables might have different means
• For comparison, we need a measure of the relative variation
that does not depend on either the units or on the values of
means.
C.V., …
• The coefficient of variation is defined by
𝐶𝑉 = 𝑆
x100
𝑋
• C.V. is free of unit and the standard deviation is divided by the
corresponding arithmetic mean to make it comparable for
different means.
• To compare the variability of two sets of data
10/16/2019 92
C.V., …
10/16/2019 93
• The data set with the larger value of C.V. has larger variation
which is expressed in percentage.
• The relative variability of the data set 1 is larger than the
relative variability of the data set 2 if C.V.1 > C.V.2 and vice
versa
C.V., …
10/16/2019 94
• Example
Variable mean Standard deviation Coefficient of variation
Sample 1 66kg 4.5kg 6.8%
Sample 2 36kg 4.5kg 12:5%
Since C.V.2> C.V.1, the relative variability of the data set 2 is
larger than the relative variability of the data set 1.
Skewness
10/16/2019 95
• the degree of asymmetry or departure of symmetry of a
distribution.
• In a skew distribution, the mean tends to lie on the same side
of the longer tail.
• In a symmetric distribution, the mean, the mode, and the
median coincide.
Skewness, …
10/16/2019 96
Skewness, …
• Pearson’s Measure
– Skewness =
𝐦𝐞𝐚𝐧−𝐦𝐨𝐝𝐞
𝐬𝐭𝐚𝐧𝐝𝐚𝐫𝐝 𝐝𝐞𝐯𝐢𝐚𝐭𝐢𝐨𝐧
=0 for symmetry,
>0 for positively skewed distribution,
<0 for negatively skewed distribution.
10/16/2019 97
Skewness, …
• Alternatively
Skewness =
3𝑥(mean−m𝑒𝑑𝑖𝑎𝑛)
standard deviation
+0 for symmetry,
>0 for positively skewed distribution,
<0, for negatively skewed distribution.
10/16/2019 98
Skewness, …
• Bowley’s Measure
Skewness =
𝑄3−𝑄2 −(𝑄2−𝑄1)
𝑄3−𝑄1
Example: Consider the following data set:
14.6, 24.3, 24.9, 27.0, 27.2, 27.4, 28.2, 28.8, 29.9, 30.7,
31.5, 31.6, 32.3, 32.8, 33.3, 33.6, 34.3, 36.9, 38.3, 44.0
• Here, n = 20. 𝑥 = 30.58: There is no mode.
• The sample variance, s2=36.3417 and the sample standard
deviation, S= 6:0284
10/16/2019 99
Skewness, …
• Q1=27.25, Q2=31.1, Q3=33.525
Skewness = 3𝑥(30.58−31.1)
= -0.2588
6.0284
• This shows that there is negative skewness in the distribution
The Bowley’s measure of skewness is
Skewness= 33.525−31.1 − 31.1−27.25
= -0.2271
33.525−27.25
• This measure also confirms that there is negative skewness in
the distribution
10/16/2019 100
Kurtosis
10/16/2019 101
• The peakedness of a distribution, usually taken relative to a
normal distribution.
• Leptokurtic: a distribution with a relatively high peak
• Platykurtic: a curve which is relatively flat topped
• Mesokurtic: a curve which is neither too peaked nor flat
topped (normal distribution curve)
10/16/2019 102
• The measure of kurtosis is
A mesokurtic distribution has a kurtosis measure of 3 based, but
computer program reduce measure by 3, then mesokurtic distribution
will be equal to 0, leptokurtic distribution will have a kurtosis
measure > 0, and a platykurtic distribution will have a kurtosis
measure < 0.
10/16/2019 103
!!THANK YOU FOR
YOUR ATTENTION!!
10/16/2019 104

More Related Content

Similar to Biostatistics_descriptive stats.pptx

2. chapter ii(analyz)
2. chapter ii(analyz)2. chapter ii(analyz)
2. chapter ii(analyz)
Chhom Karath
 
R training4
R training4R training4
R training4
Hellen Gakuruh
 
Introduction to Statistics in Nursing.
Introduction to Statistics in Nursing.Introduction to Statistics in Nursing.
Introduction to Statistics in Nursing.
Johny Kutty Joseph
 
datacollectionandpresentation-181117043733-converted.pptx
datacollectionandpresentation-181117043733-converted.pptxdatacollectionandpresentation-181117043733-converted.pptx
datacollectionandpresentation-181117043733-converted.pptx
anjaliagarwal93
 
Descriptive
DescriptiveDescriptive
Descriptive
Mmedsc Hahm
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
IndhuGreen
 
Statistik dan Probabilitas Yuni Yamasari 2.pptx
Statistik dan Probabilitas Yuni Yamasari 2.pptxStatistik dan Probabilitas Yuni Yamasari 2.pptx
Statistik dan Probabilitas Yuni Yamasari 2.pptx
AisyahLailia
 
Edited economic statistics note
Edited economic statistics noteEdited economic statistics note
Edited economic statistics note
haramaya university
 
2. AAdata presentation edited edited tutor srudents(1).pptx
2. AAdata presentation edited edited tutor srudents(1).pptx2. AAdata presentation edited edited tutor srudents(1).pptx
2. AAdata presentation edited edited tutor srudents(1).pptx
ssuser504dda
 
Statistics for Data Analytics
Statistics for Data AnalyticsStatistics for Data Analytics
Statistics for Data Analytics
SSaudia
 
Chapter-one.pptx
Chapter-one.pptxChapter-one.pptx
Chapter-one.pptx
AbebeNega
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
Anand Thokal
 
STATISTICS.pptx
STATISTICS.pptxSTATISTICS.pptx
STATISTICS.pptx
Dr. Senthilvel Vasudevan
 
Introduction to statistics
Introduction to statisticsIntroduction to statistics
Introduction to statisticsAmira Talic
 
Statistics for machine learning shifa noorulain
Statistics for machine learning   shifa noorulainStatistics for machine learning   shifa noorulain
Statistics for machine learning shifa noorulain
ShifaNoorUlAin1
 
Biostatistics
BiostatisticsBiostatistics
Biostatistics
anju mathew
 
Statistics and data analysis
Statistics  and data analysisStatistics  and data analysis
Statistics and data analysis
Regent University
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
Bhagya Silva
 
1Basic biostatistics.pdf
1Basic biostatistics.pdf1Basic biostatistics.pdf
1Basic biostatistics.pdf
YomifDeksisaHerpa
 

Similar to Biostatistics_descriptive stats.pptx (20)

2. chapter ii(analyz)
2. chapter ii(analyz)2. chapter ii(analyz)
2. chapter ii(analyz)
 
determinatiion of
determinatiion of determinatiion of
determinatiion of
 
R training4
R training4R training4
R training4
 
Introduction to Statistics in Nursing.
Introduction to Statistics in Nursing.Introduction to Statistics in Nursing.
Introduction to Statistics in Nursing.
 
datacollectionandpresentation-181117043733-converted.pptx
datacollectionandpresentation-181117043733-converted.pptxdatacollectionandpresentation-181117043733-converted.pptx
datacollectionandpresentation-181117043733-converted.pptx
 
Descriptive
DescriptiveDescriptive
Descriptive
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
 
Statistik dan Probabilitas Yuni Yamasari 2.pptx
Statistik dan Probabilitas Yuni Yamasari 2.pptxStatistik dan Probabilitas Yuni Yamasari 2.pptx
Statistik dan Probabilitas Yuni Yamasari 2.pptx
 
Edited economic statistics note
Edited economic statistics noteEdited economic statistics note
Edited economic statistics note
 
2. AAdata presentation edited edited tutor srudents(1).pptx
2. AAdata presentation edited edited tutor srudents(1).pptx2. AAdata presentation edited edited tutor srudents(1).pptx
2. AAdata presentation edited edited tutor srudents(1).pptx
 
Statistics for Data Analytics
Statistics for Data AnalyticsStatistics for Data Analytics
Statistics for Data Analytics
 
Chapter-one.pptx
Chapter-one.pptxChapter-one.pptx
Chapter-one.pptx
 
Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
 
STATISTICS.pptx
STATISTICS.pptxSTATISTICS.pptx
STATISTICS.pptx
 
Introduction to statistics
Introduction to statisticsIntroduction to statistics
Introduction to statistics
 
Statistics for machine learning shifa noorulain
Statistics for machine learning   shifa noorulainStatistics for machine learning   shifa noorulain
Statistics for machine learning shifa noorulain
 
Biostatistics
BiostatisticsBiostatistics
Biostatistics
 
Statistics and data analysis
Statistics  and data analysisStatistics  and data analysis
Statistics and data analysis
 
Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
1Basic biostatistics.pdf
1Basic biostatistics.pdf1Basic biostatistics.pdf
1Basic biostatistics.pdf
 

More from MohammedAbdela7

Chap.VII.pptx
Chap.VII.pptxChap.VII.pptx
Chap.VII.pptx
MohammedAbdela7
 
Introduction to Pathology.pptx
Introduction to Pathology.pptxIntroduction to Pathology.pptx
Introduction to Pathology.pptx
MohammedAbdela7
 
preeclampsia.pptx
preeclampsia.pptxpreeclampsia.pptx
preeclampsia.pptx
MohammedAbdela7
 
Hypersensitivity reactions BY GROUP 1.pptx
Hypersensitivity reactions BY GROUP 1.pptxHypersensitivity reactions BY GROUP 1.pptx
Hypersensitivity reactions BY GROUP 1.pptx
MohammedAbdela7
 
inflammaton.pptx
inflammaton.pptxinflammaton.pptx
inflammaton.pptx
MohammedAbdela7
 
FINALLLL HMD.pptx
FINALLLL HMD.pptxFINALLLL HMD.pptx
FINALLLL HMD.pptx
MohammedAbdela7
 
Chap.-II.pptx
Chap.-II.pptxChap.-II.pptx
Chap.-II.pptx
MohammedAbdela7
 
Cellular Reactions to Injury.pptx
Cellular  Reactions  to Injury.pptxCellular  Reactions  to Injury.pptx
Cellular Reactions to Injury.pptx
MohammedAbdela7
 
by Group 8 PID & EP edited.pptx
by Group 8 PID & EP edited.pptxby Group 8 PID & EP edited.pptx
by Group 8 PID & EP edited.pptx
MohammedAbdela7
 
ACID-BASE BALANCE.pptx
ACID-BASE BALANCE.pptxACID-BASE BALANCE.pptx
ACID-BASE BALANCE.pptx
MohammedAbdela7
 
Autoimmunity group 2.ppt
Autoimmunity group 2.pptAutoimmunity group 2.ppt
Autoimmunity group 2.ppt
MohammedAbdela7
 
infection prevention.pptx
infection prevention.pptxinfection prevention.pptx
infection prevention.pptx
MohammedAbdela7
 
integumentery.pptx
integumentery.pptxintegumentery.pptx
integumentery.pptx
MohammedAbdela7
 
Medication and fluid therapy.pptx
Medication and fluid therapy.pptxMedication and fluid therapy.pptx
Medication and fluid therapy.pptx
MohammedAbdela7
 
Endocrine System Disorder.pptx
Endocrine System Disorder.pptxEndocrine System Disorder.pptx
Endocrine System Disorder.pptx
MohammedAbdela7
 
CVS and abdomen.pptx
CVS and abdomen.pptxCVS and abdomen.pptx
CVS and abdomen.pptx
MohammedAbdela7
 
Endocrine DOs.pptx
Endocrine DOs.pptxEndocrine DOs.pptx
Endocrine DOs.pptx
MohammedAbdela7
 
badnews.pptx
badnews.pptxbadnews.pptx
badnews.pptx
MohammedAbdela7
 
2 Assessment of patient with respiratory disorder.pptx
2 Assessment of patient with respiratory disorder.pptx2 Assessment of patient with respiratory disorder.pptx
2 Assessment of patient with respiratory disorder.pptx
MohammedAbdela7
 
Adult health.pptx
Adult health.pptxAdult health.pptx
Adult health.pptx
MohammedAbdela7
 

More from MohammedAbdela7 (20)

Chap.VII.pptx
Chap.VII.pptxChap.VII.pptx
Chap.VII.pptx
 
Introduction to Pathology.pptx
Introduction to Pathology.pptxIntroduction to Pathology.pptx
Introduction to Pathology.pptx
 
preeclampsia.pptx
preeclampsia.pptxpreeclampsia.pptx
preeclampsia.pptx
 
Hypersensitivity reactions BY GROUP 1.pptx
Hypersensitivity reactions BY GROUP 1.pptxHypersensitivity reactions BY GROUP 1.pptx
Hypersensitivity reactions BY GROUP 1.pptx
 
inflammaton.pptx
inflammaton.pptxinflammaton.pptx
inflammaton.pptx
 
FINALLLL HMD.pptx
FINALLLL HMD.pptxFINALLLL HMD.pptx
FINALLLL HMD.pptx
 
Chap.-II.pptx
Chap.-II.pptxChap.-II.pptx
Chap.-II.pptx
 
Cellular Reactions to Injury.pptx
Cellular  Reactions  to Injury.pptxCellular  Reactions  to Injury.pptx
Cellular Reactions to Injury.pptx
 
by Group 8 PID & EP edited.pptx
by Group 8 PID & EP edited.pptxby Group 8 PID & EP edited.pptx
by Group 8 PID & EP edited.pptx
 
ACID-BASE BALANCE.pptx
ACID-BASE BALANCE.pptxACID-BASE BALANCE.pptx
ACID-BASE BALANCE.pptx
 
Autoimmunity group 2.ppt
Autoimmunity group 2.pptAutoimmunity group 2.ppt
Autoimmunity group 2.ppt
 
infection prevention.pptx
infection prevention.pptxinfection prevention.pptx
infection prevention.pptx
 
integumentery.pptx
integumentery.pptxintegumentery.pptx
integumentery.pptx
 
Medication and fluid therapy.pptx
Medication and fluid therapy.pptxMedication and fluid therapy.pptx
Medication and fluid therapy.pptx
 
Endocrine System Disorder.pptx
Endocrine System Disorder.pptxEndocrine System Disorder.pptx
Endocrine System Disorder.pptx
 
CVS and abdomen.pptx
CVS and abdomen.pptxCVS and abdomen.pptx
CVS and abdomen.pptx
 
Endocrine DOs.pptx
Endocrine DOs.pptxEndocrine DOs.pptx
Endocrine DOs.pptx
 
badnews.pptx
badnews.pptxbadnews.pptx
badnews.pptx
 
2 Assessment of patient with respiratory disorder.pptx
2 Assessment of patient with respiratory disorder.pptx2 Assessment of patient with respiratory disorder.pptx
2 Assessment of patient with respiratory disorder.pptx
 
Adult health.pptx
Adult health.pptxAdult health.pptx
Adult health.pptx
 

Recently uploaded

Physiology of Chemical Sensation of smell.pdf
Physiology of Chemical Sensation of smell.pdfPhysiology of Chemical Sensation of smell.pdf
Physiology of Chemical Sensation of smell.pdf
MedicoseAcademics
 
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptxANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
Swetaba Besh
 
Novas diretrizes da OMS para os cuidados perinatais de mais qualidade
Novas diretrizes da OMS para os cuidados perinatais de mais qualidadeNovas diretrizes da OMS para os cuidados perinatais de mais qualidade
Novas diretrizes da OMS para os cuidados perinatais de mais qualidade
Prof. Marcus Renato de Carvalho
 
POST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its managementPOST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its management
touseefaziz1
 
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdfARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
Anujkumaranit
 
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTSARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
Dr. Vinay Pareek
 
Ocular injury ppt Upendra pal optometrist upums saifai etawah
Ocular injury  ppt  Upendra pal  optometrist upums saifai etawahOcular injury  ppt  Upendra pal  optometrist upums saifai etawah
Ocular injury ppt Upendra pal optometrist upums saifai etawah
pal078100
 
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdfBENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
DR SETH JOTHAM
 
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
kevinkariuki227
 
Superficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptxSuperficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptx
Dr. Rabia Inam Gandapore
 
micro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdfmicro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdf
Anurag Sharma
 
BRACHYTHERAPY OVERVIEW AND APPLICATORS
BRACHYTHERAPY OVERVIEW  AND  APPLICATORSBRACHYTHERAPY OVERVIEW  AND  APPLICATORS
BRACHYTHERAPY OVERVIEW AND APPLICATORS
Krishan Murari
 
KDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologistsKDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologists
د.محمود نجيب
 
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Savita Shen $i11
 
Flu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore KarnatakaFlu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore Karnataka
addon Scans
 
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
i3 Health
 
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model SafeSurat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Savita Shen $i11
 
Are There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdfAre There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdf
Little Cross Family Clinic
 
ACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIOND
ACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIONDACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIOND
ACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIOND
DR SETH JOTHAM
 
Cervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptxCervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptx
Dr. Rabia Inam Gandapore
 

Recently uploaded (20)

Physiology of Chemical Sensation of smell.pdf
Physiology of Chemical Sensation of smell.pdfPhysiology of Chemical Sensation of smell.pdf
Physiology of Chemical Sensation of smell.pdf
 
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptxANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
 
Novas diretrizes da OMS para os cuidados perinatais de mais qualidade
Novas diretrizes da OMS para os cuidados perinatais de mais qualidadeNovas diretrizes da OMS para os cuidados perinatais de mais qualidade
Novas diretrizes da OMS para os cuidados perinatais de mais qualidade
 
POST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its managementPOST OPERATIVE OLIGURIA and its management
POST OPERATIVE OLIGURIA and its management
 
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdfARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
 
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTSARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
ARTHROLOGY PPT NCISM SYLLABUS AYURVEDA STUDENTS
 
Ocular injury ppt Upendra pal optometrist upums saifai etawah
Ocular injury  ppt  Upendra pal  optometrist upums saifai etawahOcular injury  ppt  Upendra pal  optometrist upums saifai etawah
Ocular injury ppt Upendra pal optometrist upums saifai etawah
 
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdfBENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
BENIGN PROSTATIC HYPERPLASIA.BPH. BPHpdf
 
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
TEST BANK for Operations Management, 14th Edition by William J. Stevenson, Ve...
 
Superficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptxSuperficial & Deep Fascia of the NECK.pptx
Superficial & Deep Fascia of the NECK.pptx
 
micro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdfmicro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdf
 
BRACHYTHERAPY OVERVIEW AND APPLICATORS
BRACHYTHERAPY OVERVIEW  AND  APPLICATORSBRACHYTHERAPY OVERVIEW  AND  APPLICATORS
BRACHYTHERAPY OVERVIEW AND APPLICATORS
 
KDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologistsKDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologists
 
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
Phone Us ❤85270-49040❤ #ℂall #gIRLS In Surat By Surat @ℂall @Girls Hotel With...
 
Flu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore KarnatakaFlu Vaccine Alert in Bangalore Karnataka
Flu Vaccine Alert in Bangalore Karnataka
 
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
 
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model SafeSurat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
Surat @ℂall @Girls ꧁❤8527049040❤꧂@ℂall @Girls Service Vip Top Model Safe
 
Are There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdfAre There Any Natural Remedies To Treat Syphilis.pdf
Are There Any Natural Remedies To Treat Syphilis.pdf
 
ACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIOND
ACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIONDACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIOND
ACUTE SCROTUM.....pdf. ACUTE SCROTAL CONDITIOND
 
Cervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptxCervical & Brachial Plexus By Dr. RIG.pptx
Cervical & Brachial Plexus By Dr. RIG.pptx
 

Biostatistics_descriptive stats.pptx

  • 1. Chapter 2 Descriptive statistics By Hirbo Shore (MPH, Assistant Prof. of Epidemiology) Haramaya University, CHMS, School of Public Health 10/16/2019 1
  • 2. The objective of the chapter • At the end the session, students will able to; • Describe descriptive statistics • Compute and interpret measure of central tendency • Compute and interpret measure of central tendency • Compute and interpret measure of dispersion
  • 3. Aspects of statistics There two major categories of statistics 1. Descriptive statistics 2. Inferential statistics
  • 4. Descriptive statistics • Statistical procedures used to summarise, organise, and simplify data. • Encompasses tools devised to organize and display data in an accessible fashion • It involves the quantification of recurring phenomena.
  • 5. Descriptive statistics,… • This process should be carried out in such a way that reflects overall findings – Raw data is made more manageable – Raw data is presented in a logical form – Patterns can be seen from organised data
  • 6. Descriptive statistics,… Summarizing and organizing data can be achieved through: 1. Frequency Distributions 2. Graphical Representations 3. Measures of Central Tendency 4. Measures of variability
  • 7. Simple Frequency Distributions • It is useful for categorical variable • the following information can be obtained – it allows you to pick up at a glance some valuable information, such as highest, lowest value. – ascertain the general shape or form of the distribution – make an informed guess about central tendency values
  • 8. Table 1. Marital status of participants
  • 10. Frequency distribution of group data Weight Group Frequency Cum. frequency Rel. frequency Cum. Rel. freq 1500-2000 0 0 0 0 2000-2500 13 13 13.54 13.54 2500-3000 21 34 21.88 35.42 3000-3500 38 72 39.58 75 3500-4000 18 90 18.75 93.75 4000-4500 6 96 6.25 100 Total 96 100
  • 11. Frequency distribution for numeric data,… • If there is no good knowledge on the data one can use the following rule as a guide to construct class intervals; • Sturge’s Rule – K=1 + 3.322log 𝑛, k is number of class intervals, n is sample size K – W=Range , W is width of the class interval – Range, the difference of maximum and minimum value
  • 12. Diagrammatic Representation • Qualitative data – Bar graphs – Pie chart • Quantitative data – Histogram – Frequency polygon – Line graph – Scatter plot
  • 14. Simple Bar Chart • Displays the frequency distribution of a qualitative variable • Take into account one or more classification factors • a single bar for each category is drawn where the height of the bar represents the frequency of the category
  • 15. Simple bar chart,… 30 25 20 15 10 5 0 hemorrhage Embolism Percent hypertensive sepsis abortion disorders Causes of maternal mortality Lale Say, et al. Global causes of maternal death: a W H O systematic analysis. T h e Lancet Global Health, 2014: 2(6): e323-e333
  • 16. Multiple bar graph Figure. Marital status and sex
  • 17. Stacked bar graph Malaria Types Months P.Faciparum P.Vivax O.ovale P.malaria August 197 123 12 8 October 176 148 15 11 Decembe r 146 169 19 13 Total 519 440 46 32 30% 20% 10% 0% 40% 50% 60% 100% 90% 80% 70% August October December P.malaria O.ovale P.Vivax P.Faciparum 10/16/2019 17 Figure. Stacked bar graph
  • 18. Graphs for quantitative data 10/16/2019 18
  • 19. Histogram Weight Group Frequency 1500-2000 0 2000-2500 13 2500-3000 21 3000-3500 38 3500-4000 18 4000-4500 6 Total 96 Frequency 10/16/2019 19 40 35 30 25 20 15 10 5 0 1500-2000 2000-2500 2500-3000 3000-3500 3500-4000 4000-4500 Birth Weight
  • 21. Stem & leaf plots of each • Draw a vertical line and place the first digits class‐called the “stem” on the left side of the line. • The numbers on the right side of the vertical line present the second digit of each observation; they are the “leaves”
  • 22. 10/16/2019 22 Figure. Stem –and-leaf plot of parity
  • 23. Box and Whisker plot • Used to display information when the objective is to illustrate certain locations in the distribution. • Abox is drawn with the top of the box at the third quartile and the bottom at the first quartile. • The location of the mid‐point of the distribution is indicated with a horizontal line in the box. • straight lines, or whiskers, are drawn from the center of the top of the box to the largest observation and from the center of the bottom of the box to the smallest observation.
  • 24. Box and Whisker plot,… • Points required to construct box and whisker plot – Lowest value – First quartile(Q1) – Second quartile/median(Q2) – Third quartile(Q3) – Larges value
  • 25. Box and Whisker plot,… 40 60 100 120 mweigh t 80 Minimum value Q2 Q1 Q2 Minimum value
  • 26. Box and Whisker plot,… 40 60 100 120 female male mweight 80 Graphs by sex
  • 27. Scatter plot • display the relationship between two characteristics • Used when both the variables are quatitative • When one of the characteristics is qualitative and the other is quantitative, the data can be displayed in box and whisker plots.
  • 29. Scatter plot,… 200 0 250 0 300 0 350 0 400 0 40 60 120 80 100 mweight birthweight Fitted values
  • 30.
  • 31.
  • 32. Line graph Figure Trends in contraceptive use
  • 33. line,… Figure Trends in exclusive breastfeeding
  • 34. Line graph, … . Figure. Trends in early childhood mortality rates
  • 35. Numerical summary measures • summarization of data by means of descriptive measures • Statistic a descriptive measure computed from the values of a sample • Parameter a descriptive measure computed from the values of a population
  • 36. parameter • a measure, quantity, or numerical value obtained from the population values X1, X2, . . ., XN • As the value of a parameter is generally unknown but constant, • we are interested in estimating a parameter to understand the characteristic of a population using the sample data
  • 37. statistics • Ameasure, quantity, or numerical value obtained from the sample values x1, x2, . . ., xn. • The values of statistics can be computed from the sample as functions of observations. • Used to approximate (estimate) parameters.
  • 38. Statistics • The data can be organized and presented using figures and tables in order to display the important features and characteristics contained in data. • However, tables and figures may not reveal the underlying characteristics very comprehensively. • For instance, if we want to characterize the data with a single value, it would be very useful not only for communicating to everyone but also for making comparisons.
  • 39. • Let us consider that we are interested in number of C-section births among the women during their reproductive period in different regions of a country. • To collect information on this variable, we need to consider women who have completed their reproductive period.
  • 40. • With tables and figures, we can display their frequency distributions. • However, to compare tables and figures for different regions, it would be a very difficult task. • In such situations, we need a single statistic which contains the salient feature of the data for a region. • This single value can be a statistic such as arithmetic mean in this case
  • 41. • In case of a qualitative variable, it could be a proportion. • With this single value, we can represent the whole data set in a meaningful way to represent the corresponding population characteristic and we may compare this value with similar other values at different times or for different groups or regions conveniently. • Asingle value makes it very convenient to represent a whole data set of small, medium, or large size.
  • 42. • Four types of descriptive measures are commonly used for characterizing underlying features of data: 1. Measures of central tendency, 2. Measures of dispersion, 3. Measures of skewness, 4. Measures of kurtosis.
  • 43. Measures of Central Tendency • The values of a variable often tend to be concentrated around the center of the data. • The center of the data can be determined by the measures of central tendency. • Ameasure of central tendency is considered to be a typical (or a representative) value of the set of data as a whole.
  • 44. • The measures of central tendency are sometimes referred as the measures of location too. • The most commonly used measures of central tendency are; • mean • median • mode
  • 45. Mean Population Mean (µ) • If X1, X2, . . ., XN are the population values, then the population mean is µ= 𝑥 𝑖 ∑𝑁 = 𝑖=1 X1+𝑋2+⋯𝑋𝑁 𝑁 𝑁 • The population mean µ is a parameter. • It is usually unknown, and we are interested to estimate its value.
  • 46. Sample mean • If x1, x2, . . ., xn are the sample values, then the sample mean is 𝑋= 𝑥1+𝑥2,…+ 𝑥𝑛 𝑥 𝑖 ∑𝑛 = 𝑖=1 𝑛 𝑛 • It is noteworthy that the sample mean x is a statistic. It is known and we can calculate it from the sample. • The sample mean x is used to approximate (estimate) the population mean, µ.
  • 47. • Amore precise way to compute the mean from a frequency distribution is 𝑓 𝑥 +𝑓 𝑥 ,…+𝑓 𝑥 𝑓𝑖𝑥 𝑖 ∑𝑛 𝑋= 1 1 2 2 𝑘 𝑘 = 𝑖=1 𝑓𝑖+𝑓2+⋯+𝑓𝑘 𝑛 where fi denotes the frequency of distinct sample observation xi, i =1, . . ., k.
  • 48. • Example 2.1 • Let us consider the data on number of children ever born reported by 20 women in a sample: 4, 2, 1, 0, 2, 3, 2, 1, 1, 2, 3, 4, 3, 2, 1, 2, 2, 1, 3, 2. • n = 20. • We want to find the average number of children ever born from this sample. • The sample mean is
  • 49. • 10/16/2019 49 𝑥=4+2+1+0+2+3+2+1+1+2+3+4+3+2+1+2+2+1+3+2 20 • 𝑥=41 =2.05 20 • Example 2.2 – Employing the same data shown above, we can compute the arithmetic mean in a more precise way by using frequencies. We can construct a frequency distribution table from this data to summarize the data first
  • 50. Afrequency distribution table for number of children ever born of 20 women 10/16/2019 50
  • 52. Features of mean 10/16/2019 52 • Simplicity: The mean is easily understood and easy to compute. • Uniqueness: There is one and only one mean for a given set of data. • The mean takes into account all values of the data.
  • 53. • Disadvantage – The arithmetic mean is influenced by extreme values • Therefore, the mean may be distorted by extreme values – cannot be used for qualitative data. – can only be found for quantitative variables. 10/16/2019 53
  • 54. median 10/16/2019 54 • The value which divides the ordered array into two equal parts. • The numbers in the first part are less than or equal to the median, and the numbers in the second part are greater than or equal to the median. • Notice that 50% (or less) of the data are ≤Median and 50% (or less) of the data are≥ Median.
  • 55. 𝑟𝑎𝑛𝑘 = • Calculating the Median; – Let x1, x2, . . ., xn be the sample values. – The sample size (n) can be odd or even. – The steps for computing the median are discussed below: • First, we order the sample to obtain the ordered array. • Suppose that the ordered array is; y1, y2, . . ., yn • We compute the rank of the middle value(s): 𝑛 + 1 2 10/16/2019 55
  • 56. • If the sample size (n) is an odd number, there is only one value in the middle, and the rank will be an integer: 𝑟𝑎𝑛𝑘 = 𝑛+1 =m, (m is integer) 2 • The median is the middle value of the ordered observations, which is Median =ym 10/16/2019 56
  • 57. • If the sample size (n) is an even number, there are two values in the middle, and the rank will be an integer plus 0.5. 𝑟𝑎𝑛𝑘 = 𝑛+1 =m+0.5 10/16/2019 57 2 – Therefore, the ranks of the middle values are (m) and (m + 1). – The median is the mean (average) of the two middle values of the ordered observations:
  • 58. median= 𝑦𝑚+𝑦𝑚+1 2 Example 2.4 (Odd number) Find the median for the sample values: 10, 54, 21, 38, 53. 10/16/2019 58
  • 59. • Solution • n = 5 (odd number) • There is only one value in the middle. • The rank of the middle value is median= 𝑛+1 =5+1 =3, (m=3) 2 2 The median = 38 (unit). 10/16/2019 59
  • 60. • Example 2.5 (Even number) Find the median for the sample values: 10, 35, 41, 16, 20, 32. Solution – n = 6 (even number) – There are two values in the middle. The rank is median= 𝑛+1 = 6+1 =3.5, (m=3) 2 2 10/16/2019 60
  • 62. Median 10/16/2019 62 Advantage – Simplicity: The median is easily understood and easy to compute. – Uniqueness: There is only one median for a given set of data. – Not as drastically affected by extreme values as is the mean.
  • 63. • Disadvantages: – Does not take into account all values of the sample after ordering the data in ascending order. – Can only be found for quantitative variables. 10/16/2019 63
  • 64. Mode 10/16/2019 64 • Aset of values is that value which occurs most frequently (i.e., with the highest frequency). • If all values are different or have the same frequencies, there will be no mode. • Aset of data may have more than one mode
  • 65. Mode from seven sets of data 10/16/2019 65
  • 66. Percentiles and Quartiles 10/16/2019 66 • Represent a measure of position • The measures of position indicate the location of a data set • These descriptive measures are called position parameters because they can be used to designate certain positions on the horizontal axis when the distribution of a variable is graphed
  • 67. Percentile,… 10/16/2019 67 – Given a set of observations x1, x2; . . ., xn, the Pth percentile, Pp, p = 0,1, 2,…, 100, is the value of X such that p percent or less of the observations are less than Pp and (100 − p) percent or more of the observations are greater than Pp.
  • 68. Percentile,… – The 10th percentile is denoted by P10 where P10 100 = 10 𝑛+1 th in the ordered value – The 50th percentile is P50; 100 = 50 𝑛+1 th in the ordered value = median 10/16/2019 68 – The median is a special case of percentile and it represents the 50th percentile
  • 69. Quartile 10/16/2019 69 – Given a set of observations x1, x2, . . .,xn, the rth quartile, Qr , r = 1,2,3, is the value of X such that (25x r) percent or less of the observations are less than Qr and [100 − (25 r)] percent or less of the observations are greater than Qr.
  • 71. Measures of Dispersion 10/16/2019 71 • The dispersion of a set of observations refers to the variation contained in the data. • Ameasure of dispersion conveys information regarding the amount of variability present in a set of data. • The dispersion is small when the values are close together • Measures of dispersion includes range, variance, standard deviation, and coefficient of variation
  • 72. Measures of Dispersion,… Figure displaying small and large variations 10/16/2019 72
  • 73. Measures of Dispersion,… Figure displaying smaller and larger variations using the same center 10/16/2019 73
  • 74. Measures of Dispersion,… • Example; Let us consider the hypothetical data sets of size 5 with same measure of central tendency If we represent the samples by only measure of central tendency, or by 𝒙 in this case, then all the samples have the same value 10/16/2019 74
  • 75. Measures of Dispersion,… 10/16/2019 75 • In other words, although we observe that the measure of central tendency remains same, the variation contained in each data set, or the variation in values of a data set from its central value are not same at all • This indicates that the MCT alone fails to characterize different types of properties of data sets
  • 76. Measures of Dispersion,… 10/16/2019 76 • We want to represent the data by a small number of measures, if possible by a single value, but sometimes a single value cannot represent the data adequately or sufficiently • Therefore, the measure of variation is needed to characterize the data in addition to a measure of central tendency
  • 77. Range • The range is the difference between the largest value (Max) and the smallest value (Min): Range (R) = Max – Min • Example 10/16/2019 77
  • 78. Range,… 10/16/2019 78 • The unit of the range is the same as the unit of the data. • The usefulness of the range is limited. • A poor measure of the dispersion because it only takes into account two of the values, maximum and minimum; however, it plays a significant role in many applications to have a quick view of the dispersion in a data set
  • 79. Interquartile Range 10/16/2019 79 • Adisadvantage of the range is that it is computed using the two extreme values of the sample data • If the data contain extreme outlier values, then the range is affected by those outliers. • Interquartile range can be used to overcome this limitation of the range • Computed by taking the difference between the third quartile and the first quartile
  • 80. Interquartile Range,… 10/16/2019 80 • Provides a more meaningful measure that represents the range of the middle 50% of the data. • The interquartile range is defined as IQR=Q3 - Q1 • As the extreme values are not considered, the IQR is a more robust measure of variation in the data and is not affected by outliers
  • 81. Box-and-Whisker Plot 10/16/2019 81 • Measure of location in which value around which other values are clustered. • Shows the variation or spread in the data from the central value • Also shows whether the data are symmetric or not • The box contains features included in the middle 50% of the observations, and whiskers show the minimum and maximum values
  • 82. Box-and-Whisker Plot • Represents the main features of the data very efficiently and precisely • Example: Consider the following data set: 14.6, 24.3, 24.9, 27.0, 27.2, 27.4, 28.2, 28.8, 29.9, 30.7, 31.5, 31.6, 32.3, 32.8, 33.3, 33.6, 34.3, 36.9, 38.3, 44.0. n=20 𝑡ℎ Q1= (𝑛+1) =5.25th observation 10/16/2019 82 4 =5th +0.25(6th -5th)=> 27.2+0.25(27.4-27.2) =27.2+0.05=27.25
  • 83. Box-and-Whisker Plot,… 4 4 𝑡ℎ Q3= 3(𝑛+1) = 3(20+1) =15.75th ordered observation =15.75th ordered observation +0.75(16th -15th) =33.3 +0.75(33.6-33.3) =33.525 ∴ IQR=33.525 -27.25 =6.275 10/16/2019 83
  • 84. The Variance 10/16/2019 84 • Uses the mean as a point of reference • an average measure of squared difference between each observation and arithmetic mean • Small when the observations are close to the mean, is large when the observations are spread out from the mean, and is zero (no variation) when all observations have the same value (concentrated at the mean) • Shows, on an average, how close the values of a variable are to the arithmetic mean
  • 85. The Variance,… • Deviations of sample values from the sample mean – Let x1, x2, . . ., xn be the sample values, and x be the sample mean. – The deviation of the value xi from the sample mean x is 𝑥𝑖 − 𝑥 – The squared deviation is 𝑥𝑖 − 𝑥 𝟐 𝟐 10/16/2019 85 – The sum of squared deviations is ∑𝑛 𝑥𝑖 − 𝑥 𝑖 =1
  • 86. The Population Variance, σ2 • Let X1, X2, . . ., XN be the population values. The population variance, δ2, is defined by ∑𝑁 σ2 = 𝑖=1 2 (𝑥𝑖−μ) 𝑁 𝑋 −𝜇 2 +(𝑋 −𝜇)2+⋯+ 𝑋 −𝜇 10/16/2019 86 2 = 1 2 𝑁 (unit)2 𝑁 • It may be noted here that σ2 is a parameter because it is obtained from the population values, it is generally unknown. It is also always true that σ2 ≥0.
  • 87. The Sample Variance, S2 • Let x1, x2, . . ., xn be the sample values. The sample variance s2 is defined by (𝑥𝑖−𝑥)2 ∑𝑛 = 𝑖=1 𝑛−1 ∑𝑛 = 𝑖=1(𝑥1−𝑥)2 + 𝑥𝑖−𝑥 2 …,+(𝑥𝑖−𝑥)2 𝑛 −1 10/16/2019 87
  • 88. S2 ,…. • Example – We want to compute the sample variance of the following sample values: 10, 21, 33, 53, 54. 10/16/2019 88
  • 89. Standard Deviation • The variance represents squared units, and therefore is not appropriate measure of dispersion when we wish to express the concept of dispersion in terms of the original unit. • The standard deviation is the square root of the variance – Population standard deviation is σ= 𝜎2 – Sample standard deviation is S= 𝑆2 10/16/2019 89
  • 90. Standard Deviation,… S= (𝑥𝑖−𝑥)2 ∑𝑛 𝑖=1 𝑛−1 For the previous example, the sample standard deviation is S= 𝑆2= 376.6 = 19.41 10/16/2019 90
  • 91. Coefficient of Variation (C.V.) 10/16/2019 91 • Used to compare the variation of two variables • we cannot use the variance or the standard deviation directly to compare variation of two variable, because – the variables might have different units – the variables might have different means • For comparison, we need a measure of the relative variation that does not depend on either the units or on the values of means.
  • 92. C.V., … • The coefficient of variation is defined by 𝐶𝑉 = 𝑆 x100 𝑋 • C.V. is free of unit and the standard deviation is divided by the corresponding arithmetic mean to make it comparable for different means. • To compare the variability of two sets of data 10/16/2019 92
  • 93. C.V., … 10/16/2019 93 • The data set with the larger value of C.V. has larger variation which is expressed in percentage. • The relative variability of the data set 1 is larger than the relative variability of the data set 2 if C.V.1 > C.V.2 and vice versa
  • 94. C.V., … 10/16/2019 94 • Example Variable mean Standard deviation Coefficient of variation Sample 1 66kg 4.5kg 6.8% Sample 2 36kg 4.5kg 12:5% Since C.V.2> C.V.1, the relative variability of the data set 2 is larger than the relative variability of the data set 1.
  • 95. Skewness 10/16/2019 95 • the degree of asymmetry or departure of symmetry of a distribution. • In a skew distribution, the mean tends to lie on the same side of the longer tail. • In a symmetric distribution, the mean, the mode, and the median coincide.
  • 97. Skewness, … • Pearson’s Measure – Skewness = 𝐦𝐞𝐚𝐧−𝐦𝐨𝐝𝐞 𝐬𝐭𝐚𝐧𝐝𝐚𝐫𝐝 𝐝𝐞𝐯𝐢𝐚𝐭𝐢𝐨𝐧 =0 for symmetry, >0 for positively skewed distribution, <0 for negatively skewed distribution. 10/16/2019 97
  • 98. Skewness, … • Alternatively Skewness = 3𝑥(mean−m𝑒𝑑𝑖𝑎𝑛) standard deviation +0 for symmetry, >0 for positively skewed distribution, <0, for negatively skewed distribution. 10/16/2019 98
  • 99. Skewness, … • Bowley’s Measure Skewness = 𝑄3−𝑄2 −(𝑄2−𝑄1) 𝑄3−𝑄1 Example: Consider the following data set: 14.6, 24.3, 24.9, 27.0, 27.2, 27.4, 28.2, 28.8, 29.9, 30.7, 31.5, 31.6, 32.3, 32.8, 33.3, 33.6, 34.3, 36.9, 38.3, 44.0 • Here, n = 20. 𝑥 = 30.58: There is no mode. • The sample variance, s2=36.3417 and the sample standard deviation, S= 6:0284 10/16/2019 99
  • 100. Skewness, … • Q1=27.25, Q2=31.1, Q3=33.525 Skewness = 3𝑥(30.58−31.1) = -0.2588 6.0284 • This shows that there is negative skewness in the distribution The Bowley’s measure of skewness is Skewness= 33.525−31.1 − 31.1−27.25 = -0.2271 33.525−27.25 • This measure also confirms that there is negative skewness in the distribution 10/16/2019 100
  • 101. Kurtosis 10/16/2019 101 • The peakedness of a distribution, usually taken relative to a normal distribution. • Leptokurtic: a distribution with a relatively high peak • Platykurtic: a curve which is relatively flat topped • Mesokurtic: a curve which is neither too peaked nor flat topped (normal distribution curve)
  • 103. • The measure of kurtosis is A mesokurtic distribution has a kurtosis measure of 3 based, but computer program reduce measure by 3, then mesokurtic distribution will be equal to 0, leptokurtic distribution will have a kurtosis measure > 0, and a platykurtic distribution will have a kurtosis measure < 0. 10/16/2019 103
  • 104. !!THANK YOU FOR YOUR ATTENTION!! 10/16/2019 104