GIBSON MANDOZANA
BIOSTATISTICIAN-UZ-CRC
COMMUNITY MEDICINE
Outline
• Descriptive statistics: definition
• Types of data
  - Quantitative and qualitative data
• Data presentation
  - Bar graph, pie chart, histogram, line graph and boxplot
Descriptive Statistics
• Uses numerical, tabular and graphical methods to look for patterns in a data set
• Summarizes the information revealed in a data set
• Presents that information in a convenient form
Descriptive Statistics
1. Involves
• Presenting data
• Characterizing data
2. Purpose
• Describe data
[Figure: example summaries: X̄ = 30.5, S² = 113; a chart of $ amounts (axis 0, 25, 50) by quarter, Q1 to Q4]
Types of data
• Quantitative data are numerical values that measure some characteristic of an individual, such as height or salary.
• There are two types of numerical data:
• Continuous data occur when there is no limitation on the values that the characteristic being measured can take (other than the precision with which we can measure it).
  - Example: weight can be 171.2, 171.3, 171.4, etc.
• Discrete data are numeric data that can take only a finite or countable number of possible values.
  - Example: shoe size, number of brothers (data that represent counts are discrete)
Types of data
• Qualitative/categorical data occur when each individual can belong to only one of a number of distinct categories, such as male / female.
• Categorical data are expressed not in numbers but in descriptive natural language, e.g. favorite color = blue.
• They can be further classified into two types, depending on whether the categories are ordered:
  - Nominal: the categories are not ordered but simply have names (e.g. blood group A, B, AB, O, or marital status married/widowed/single). In this case there is no reason to suppose that being married is better (or worse) than being single.
  - Ordinal: the categories are ordered in some way, e.g. disease staging (advanced, moderate, mild) or degree of pain (severe, moderate, mild, none).
Types of data
[Diagram: classification of data types]
• Categorical (qualitative)
  - Nominal (no ranking)
  - Ordinal (ranked)
• Numerical (quantitative)
  - Discrete
  - Continuous
  - Interval data
  - Ratio data
Note: Interval data is numerical data expressed as an interval, e.g. age 15-25, 25-35.
Ratio data is derived from a ratio of numerical data, e.g.
Univariate Analysis
• Involves the examination, across cases, of one variable at a time. There are three major characteristics of a single variable that we tend to look at:
  - Frequency distribution
  - Central tendency
  - Dispersion
In most situations, we would describe all three of these characteristics for each of the variables in our study.
Frequency distribution
• Is a presentation of the number of times (the frequency) that each value (or group of values) occurs in the study population.
• Helps to give a picture of the shape of the distribution of the data.
• Can be displayed as a table, a bar chart, a histogram, or a frequency polygon.
• The method of display usually depends on the type of variable being described.
Frequency distribution - Qualitative data
• Categorical variables are qualitative in nature and are best displayed as a table or a bar chart.
• Example 1: a frequency table simply shows the number of times each specific observation appears in a sample or population.
Example 1
In the month of April, the number of accidents occurring in the workplace was
recorded as follows:
1 1 2 3 2 0
3 0 1 1 1 3
4 0 2 2 1 1
2 0 0 3 0 0
0 3 4 0 0 2
Tally Sheet
No. of Accidents   Tally
0                  |||| ||||
1                  |||| ||
2                  |||| |
3                  ||||
4                  ||
(Each |||| group is a bundle of five: four strokes crossed by a fifth.)
Frequency Distribution
No. of Accidents   Frequency
0                  10
1                   7
2                   6
3                   5
4                   2
Total              30
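The tally for Example 1 can be reproduced in a few lines of Python. This is only an illustrative sketch: the variable names are assumptions, and `collections.Counter` stands in for the hand-drawn tally sheet.

```python
from collections import Counter

# Example 1: number of workplace accidents, in the order listed on the slide.
accidents = [
    1, 1, 2, 3, 2, 0,
    3, 0, 1, 1, 1, 3,
    4, 0, 2, 2, 1, 1,
    2, 0, 0, 3, 0, 0,
    0, 3, 4, 0, 0, 2,
]

# Counter plays the role of the tally sheet: one count per distinct value.
frequency = Counter(accidents)

print("No. of Accidents  Frequency")
for value in sorted(frequency):
    print(f"{value:>16}  {frequency[value]:>9}")
print(f"{'Total':>16}  {sum(frequency.values()):>9}")
```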
Relative Frequency Distribution
The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.
A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
Percent Frequency Distribution
The percent frequency of a class is the relative frequency multiplied by 100.
A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
Relative Frequency and Percent Frequency Distributions
No. of Accidents   Relative Frequency   Percent Frequency
0                  .333                  33.3
1                  .233                  23.3
2                  .200                  20.0
3                  .167                  16.7
4                  .067                   6.7
Total              1.000                100.0
Worked entries: 2/30 = .067 (relative frequency for 4 accidents); .333(100) = 33.3% (percent frequency for 0 accidents).
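The two definitions translate directly into code. The sketch below starts from the Example 1 frequencies; the dictionary names are illustrative assumptions, not part of the original slides.

```python
# Frequencies from the Example 1 table (number of accidents -> count).
frequency = {0: 10, 1: 7, 2: 6, 3: 5, 4: 2}
total = sum(frequency.values())  # 30 observations

# Relative frequency = class frequency / total number of items.
relative = {value: count / total for value, count in frequency.items()}
# Percent frequency = relative frequency multiplied by 100.
percent = {value: 100 * rel for value, rel in relative.items()}

print("No. of Accidents  Relative  Percent")
for value in frequency:
    print(f"{value:>16}  {relative[value]:>8.3f}  {percent[value]:>7.1f}")
```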
Bar Chart
• A bar chart is a graph used to display frequency distributions for ordinal and nominal data.
• The categories into which the observations fall are presented along the horizontal axis.
• A vertical bar is drawn above each category; the height of the bar represents the frequency or relative frequency of observations in that class.
• The bars should be of equal width and separated from one another (so as not to imply continuity).
[Figure: Example 1 bar graph; horizontal axis: No. of Accidents (0 to 4); vertical axis: Frequency (1 to 10)]
Pie Chart
• The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
• First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
• Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
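As a rough illustration, the sector angles for the Example 1 accident data follow from the same rule (relative frequency times 360). The names below are illustrative assumptions.

```python
# Frequencies from Example 1 (number of accidents -> count).
frequency = {0: 10, 1: 7, 2: 6, 3: 5, 4: 2}
total = sum(frequency.values())

# Sector angle = relative frequency * 360 degrees.
angles = {value: 360 * count / total for value, count in frequency.items()}

for value, angle in angles.items():
    print(f"{value} accidents: {angle:.0f} degrees")

# The slide's own example: a relative frequency of .25 takes up 90 degrees.
print(0.25 * 360)
```

The angles always add up to 360 degrees, which is a quick check that no class has been left out.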
[Figure: Example 1 pie chart; sectors for 0, 1, 2, 3 and 4 accidents: 33.3%, 23.3%, 20%, 16.7% and 6.7%]
Frequency distribution - Numeric variable
• Numerical variables are quantitative in nature and are best displayed as a frequency histogram or a frequency polygon.
• A frequency histogram shows the frequencies relative to each other.
• The horizontal axis displays the true limits of the various intervals.
• The width of each bar is in proportion to the class interval that it represents.
• Typically there are no spaces between bars in a frequency histogram.
Frequency histogram
Example 2 (132 age values)
31 16 22 50 30 42 63 33 56 64 41 37
41 63 31 17 61 53 30 52 32 28 54 36
65 24 41 26 54 49 19 31 56 32 20 54
64 54 52 58 17 19 64 42 23 43 34 33
42 21 41 24 41 64 61 46 34 40 30 43
54 43 45 53 30 43 40 30 25 52 58 28
60 32 24 34 43 23 42 59 54 45 51 41
50 58 44 40 64 24 42 62 46 52 21 25
51 56 50 60 65 40 41 61 16 40 25 55
52 19 41 26 52 56 62 57 16 21 26 29
48 26 62 43 58 25 21 52 42 33 39 26
Frequency Distribution - Quantitative data
• Guidelines for Selecting Number of Classes
  - Use between 5 and 20 classes.
  - Data sets with a larger number of elements usually require a larger number of classes.
  - Smaller data sets usually require fewer classes.
Frequency Distribution
• Guidelines for Selecting Width of Classes
  - Use classes of equal width.
  - Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes
Frequency Distribution
For Example 2, if we choose six classes:
Approximate Class Width = (65 - 16)/6 = 8.2, rounded up to 9
We first prepare a Tally Sheet.
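The class-width rule can be sketched as follows. Starting the first class at 15 follows the tally sheet for Example 2; the variable names are illustrative assumptions.

```python
import math

smallest, largest = 16, 65   # extremes of the Example 2 ages
number_of_classes = 6

# Approximate class width = (largest - smallest) / number of classes.
approximate_width = (largest - smallest) / number_of_classes  # about 8.2
class_width = math.ceil(approximate_width)                    # round up

# Build the class limits, starting the first class at 15 as the slides do.
first_lower_limit = 15
classes = [
    (first_lower_limit + i * class_width,
     first_lower_limit + (i + 1) * class_width - 1)
    for i in range(number_of_classes)
]

print(class_width)
print(classes)
```

Rounding up rather than to the nearest integer guarantees that six classes are enough to cover the whole range of the data.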
Tally Sheet
Age       Tally
15 - 23   IIII IIII IIII I
24 - 32   IIII IIII IIII IIII IIII II
33 - 41   IIII IIII IIII IIII II
42 - 50   IIII IIII IIII IIII II
51 - 59   IIII IIII IIII IIII IIII III
60 - 68   IIII IIII IIII II
(Each IIII group is a bundle of five: four strokes crossed by a fifth.)
Frequency Distribution
Age      Frequency
15-23    16
24-32    27
33-41    22
42-50    22
51-59    28
60-68    17
Total    132
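The grouped frequency table can be rebuilt from the raw Example 2 values. The sketch below assumes six classes of width 9 starting at 15, as on the slides; the names are illustrative.

```python
# Example 2: the 132 ages, row by row as on the slide.
ages = [
    31, 16, 22, 50, 30, 42, 63, 33, 56, 64, 41, 37,
    41, 63, 31, 17, 61, 53, 30, 52, 32, 28, 54, 36,
    65, 24, 41, 26, 54, 49, 19, 31, 56, 32, 20, 54,
    64, 54, 52, 58, 17, 19, 64, 42, 23, 43, 34, 33,
    42, 21, 41, 24, 41, 64, 61, 46, 34, 40, 30, 43,
    54, 43, 45, 53, 30, 43, 40, 30, 25, 52, 58, 28,
    60, 32, 24, 34, 43, 23, 42, 59, 54, 45, 51, 41,
    50, 58, 44, 40, 64, 24, 42, 62, 46, 52, 21, 25,
    51, 56, 50, 60, 65, 40, 41, 61, 16, 40, 25, 55,
    52, 19, 41, 26, 52, 56, 62, 57, 16, 21, 26, 29,
    48, 26, 62, 43, 58, 25, 21, 52, 42, 33, 39, 26,
]

first_lower_limit, class_width, number_of_classes = 15, 9, 6

# Integer division maps each age to the index of its class (0 for 15-23, ...).
counts = [0] * number_of_classes
for age in ages:
    counts[(age - first_lower_limit) // class_width] += 1

for i, count in enumerate(counts):
    lower = first_lower_limit + i * class_width
    print(f"{lower}-{lower + class_width - 1}: {count}")
print("Total:", sum(counts))
```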
Relative Frequency and Percent Frequency Distribution
Age      Relative Frequency   Percent Frequency
15-23    .121                 12.1
24-32    .205                 20.5
33-41    .167                 16.7
42-50    .167                 16.7
51-59    .212                 21.2
60-68    .129                 12.9
Total    1.000                100.0
Worked entries: 16/132 = .121 (relative frequency for 15-23); .121(100) = 12.1 (percent frequency for 15-23).
Rounded entries need not add exactly to the totals.
Histogram
• Another common graphical presentation of quantitative data is a histogram.
• The variable of interest is placed on the horizontal axis.
• A rectangle is drawn above each class interval with its height corresponding to the interval's frequency, relative frequency, or percent frequency.
• Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.
[Figure: Example 2 histogram; horizontal axis: Age (classes 15-23, 24-32, 33-41, 42-50, 51-59, 60-68); vertical axis: Frequency (4 to 36)]
Histogram
• Symmetric
  - Left tail is the mirror image of the right tail
[Figure: symmetric histogram; vertical axis: Relative Frequency, 0 to .35]
Histogram
• Moderately skewed left
  - A longer tail to the left
[Figure: moderately left-skewed histogram; vertical axis: Relative Frequency, 0 to .35]
Histogram
• Moderately skewed right
  - A longer tail to the right
[Figure: moderately right-skewed histogram; vertical axis: Relative Frequency, 0 to .35]
Histogram
• Highly skewed right
  - A very long tail to the right
[Figure: highly right-skewed histogram; vertical axis: Relative Frequency, 0 to .35]
Frequency polygon
• A frequency polygon encloses the same area under the line as a histogram displays within its bars.
• It is constructed by placing a point above the center (midpoint) of each interval, at the height of that interval's frequency.
• The points are then connected by straight lines.
• Though a frequency polygon may look like a line graph, a frequency polygon must be closed at the ends.
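A possible way to compute the polygon's points, using the Example 2 age classes. Closing the polygon with zero-frequency points one class width beyond each end is an assumption about how the ends are closed; the names are illustrative.

```python
# Example 2 classes and frequencies from the grouped frequency table.
classes = [(15, 23), (24, 32), (33, 41), (42, 50), (51, 59), (60, 68)]
frequencies = [16, 27, 22, 22, 28, 17]
class_width = 9

# One point above the midpoint of each class, at the class frequency.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Close the polygon: add a zero-frequency point one class width beyond
# each end (an assumed convention for closing the ends).
xs = [midpoints[0] - class_width] + midpoints + [midpoints[-1] + class_width]
ys = [0] + frequencies + [0]
points = list(zip(xs, ys))

for x, y in points:
    print(x, y)
```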
[Figure: Histogram and Frequency Polygon, with the mode marked; horizontal axis: Birth weight (Kg), 2 to 5.25 in steps of 0.25; vertical axis: Frequency, 0 to 300]
Other ways of presenting data
Quantitative data
• Scatter plot
• Box plot
• Line graph
• Ogive
Scatter plot
• Used to depict the relationship between two different continuous measurements.
• Each point on the graph represents a pair of values.
[Figure: scatter plot of FVC against FEV1]
Box plot
• Uses summary measures such as the minimum, maximum, median and interquartile range to summarize a continuous or discrete variable.
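The summary measures behind a box plot can be computed with Python's standard library. This sketch reuses the Example 1 accident counts. Quartile conventions differ between textbooks and software, so the quartiles from `statistics.quantiles` (default method) are one reasonable choice, not the only one.

```python
import statistics

# Example 1: number of workplace accidents (a discrete variable).
accidents = [
    1, 1, 2, 3, 2, 0,
    3, 0, 1, 1, 1, 3,
    4, 0, 2, 2, 1, 1,
    2, 0, 0, 3, 0, 0,
    0, 3, 4, 0, 0, 2,
]

minimum, maximum = min(accidents), max(accidents)
median = statistics.median(accidents)

# Cut points dividing the data into four equal parts: Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(accidents, n=4)
interquartile_range = q3 - q1

print("min:", minimum, "max:", maximum)
print("median:", median)
print("Q1:", q1, "Q3:", q3, "IQR:", interquartile_range)
```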
Line graph
• Same as a scatter plot, but each value on the horizontal axis has a single corresponding measurement on the vertical axis.
• Adjacent points are connected by straight lines.
• Commonly, the horizontal axis is the time variable.
Cumulative Distributions
Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.
Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.
Cumulative percent frequency distribution: shows the percentage of items with values less than or equal to the upper limit of each class.
Cumulative Distributions
• Example 2
Age     Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
≤ 23    16                     .121                            12.1
≤ 32    43                     .326                            32.6
≤ 41    65                     .492                            49.2
≤ 50    87                     .659                            65.9
≤ 59    115                    .871                            87.1
≤ 68    132                    1.000                           100.0
Worked entries for ≤ 32: 16 + 27 = 43; 43/132 = .326; .326(100) = 32.6.
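The three cumulative distributions are running totals of the class frequencies. A minimal sketch using the Example 2 frequencies (names are illustrative assumptions):

```python
from itertools import accumulate

# Example 2: upper class limits and class frequencies.
upper_limits = [23, 32, 41, 50, 59, 68]
frequencies = [16, 27, 22, 22, 28, 17]
total = sum(frequencies)  # 132

# Running total of the frequencies, e.g. 16, then 16 + 27 = 43, and so on.
cumulative = list(accumulate(frequencies))
cumulative_relative = [count / total for count in cumulative]
cumulative_percent = [100 * rel for rel in cumulative_relative]

for limit, cum, rel, pct in zip(
    upper_limits, cumulative, cumulative_relative, cumulative_percent
):
    print(f"<= {limit}: {cum:>3}  {rel:.3f}  {pct:.1f}")
```

Dividing each cumulative count by the total, rather than adding up already-rounded relative frequencies, avoids accumulating rounding error.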
Ogive
• An ogive is a graph of a cumulative distribution.
• The data values are shown on the horizontal axis.
• Shown on the vertical axis are the:
  - cumulative frequencies, or
  - cumulative relative frequencies, or
  - cumulative percent frequencies
• The cumulative frequency (one of the above) of each class is plotted as a point.
• The plotted points are connected by straight lines.
Ogive
• Because the class limits for the age data are 15-23, 24-32, and so on, there appear to be one-unit gaps from 23 to 24, 32 to 33, and so on.
• These gaps are eliminated by plotting points halfway between the class limits.
• Thus, 23.5 is used for the 15-23 class, 32.5 is used for the 24-32 class, and so on.
• Example 2
[Figure: Example 2 ogive with cumulative percent frequencies; horizontal axis: Age; vertical axis: Cumulative Percent Frequency (20 to 100); labelled point (50.5, 66)]
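The ogive's plotting points for Example 2 can be sketched as below. Each point sits halfway between the upper limit of a class and the lower limit of the next (upper limit + 0.5). Starting the curve at (14.5, 0) is an added assumption so the line begins at zero; the names are illustrative.

```python
# Example 2 classes and frequencies.
classes = [(15, 23), (24, 32), (33, 41), (42, 50), (51, 59), (60, 68)]
frequencies = [16, 27, 22, 22, 28, 17]
total = sum(frequencies)

# Assumed starting point: zero percent just below the first class.
points = [(classes[0][0] - 0.5, 0.0)]

running_total = 0
for (lower, upper), count in zip(classes, frequencies):
    running_total += count
    # Plot at the boundary halfway between this class and the next.
    points.append((upper + 0.5, 100 * running_total / total))

for x, y in points:
    print(f"({x}, {y:.1f})")
```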
• THANK YOU
• SIYABONGA
• TATENDA

Data Types and Descriptive Statistics.ppt


Editor's Notes

  • #37 (Histogram and Frequency Polygon): Returning to the birth weight data, we first plotted a histogram based on the frequency in each group, and from this constructed a frequency polygon. The mode is the group with the highest frequency. There is no formula to calculate it; it is found by inspection. The ease with which it can be determined is one advantage of the mode. It gives a quick estimate of the centre of the group, and when the distribution is normal or nearly normal, this estimate is a fair description of the central tendency of the data. The mode is the only measure of central tendency that can be used with data on a nominal scale. The mode also has some disadvantages. It is unstable, as it may change if the method of grouping changes. It is a terminal statistic, as it does not give information that can be used for further calculation. It completely disregards extreme scores: it does not reflect how many there are, their values, or how far they are from the centre of the group.
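Finding the mode "by inspection" amounts to picking the class with the highest frequency. A small sketch using the two frequency tables from the slides (the birth weight counts are not given in the text, so they are not used here; names are illustrative assumptions):

```python
# Example 1: number of accidents -> frequency.
accident_frequency = {0: 10, 1: 7, 2: 6, 3: 5, 4: 2}

# Example 2: age class -> frequency.
age_frequency = {
    "15-23": 16, "24-32": 27, "33-41": 22,
    "42-50": 22, "51-59": 28, "60-68": 17,
}

# The mode is the value (or class) with the highest frequency.
mode_accidents = max(accident_frequency, key=accident_frequency.get)
modal_age_class = max(age_frequency, key=age_frequency.get)

print("Mode, Example 1:", mode_accidents)
print("Modal class, Example 2:", modal_age_class)
```

Because the modal class depends on where the class limits fall, regrouping the same ages with a different width could give a different answer, which is the instability described in the note.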