2. Purpose of presenting data
• The purpose of developing clearly understandable
tables and graphs is to facilitate:
– interpretation of data
– effective, rapid communication on complex issues
and situations
– displaying and reporting data in a way that makes it
easier to see patterns and relationships, shapes
of distributions, and trends
• Data can be displayed in tables or graphs
3. Line listing
• If conducting a study or investigating an outbreak, you
must compile information in an organized manner
• One common method is creating a line list or line
listing
• The line listing is one type of epidemiologic database,
and it is organized like a spreadsheet with rows and
columns
4. Line listing
• Each row is called a record or observation and
represents one person or a case of disease
• Each column is called a variable
– Contains information about one characteristic of the
individuals, such as sex, or date of birth
5. Table 1a: Line listing of hepatitis A cases,
January–February 2012
ID  Date of diagnosis  Village  Age  Sex  Jaundice  IV drugs
01  01/05              B        74   M    N         Y
02  01/06              J        29   M    Y         N
03  01/08              K        37   M    Y         N
04  01/19              J        3    F    N         N
05  01/30              C        35   F    Y         Y
06  02/04              B        23   M    Y         N
07  02/28              G        11   F    Y         N
08  02/28              R        21   F    Y         N
09  02/29              W        34   M    Y         N
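As a rough sketch of how a line listing maps onto a data structure, the snippet below stores Table 1a as CSV text and reads each row into one record. The column names (`date_of_diagnosis`, `iv_drugs`, and so on) and the helper `load_line_list` are illustrative assumptions, not part of the original slides.

```python
import csv
import io

# Table 1a typed in as CSV text: one row per case, one column per variable.
# The column names are assumed for illustration.
LINE_LIST_CSV = """id,date_of_diagnosis,village,age,sex,jaundice,iv_drugs
01,01/05,B,74,M,N,Y
02,01/06,J,29,M,Y,N
03,01/08,K,37,M,Y,N
04,01/19,J,3,F,N,N
05,01/30,C,35,F,Y,Y
06,02/04,B,23,M,Y,N
07,02/28,G,11,F,Y,N
08,02/28,R,21,F,Y,N
09,02/29,W,34,M,Y,N
"""


def load_line_list(text):
    """Return the line list as a list of dicts, one dict per record."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["age"] = int(row["age"])  # age is numeric; other fields stay text
    return rows


records = load_line_list(LINE_LIST_CSV)
# Each record is one person; each key is one variable.
jaundiced = [r["id"] for r in records if r["jaundice"] == "Y"]
print(len(records), "records;", len(jaundiced), "with jaundice")
```

Keeping one record per person makes it straightforward to filter or count on any variable later.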
6. Tabulating data cont...
• A table is a set of data arranged in rows and columns
• Almost any quantitative data can be organized in a table
• Tables are useful for demonstrating patterns, differences
and other relationships
• A table should be self-explanatory
• It should convey all the information necessary for the
reader to understand the data
7. Constructing tables
• Use a clear and concise title that describes
person, place and time
• Precede the title with a table number
• Label each row and each column
• Show totals for columns, where appropriate
8. Constructing tables
• Explain any codes, abbreviations, or symbols
in a footnote
• Note the source of the data below the table or
in a footnote if the data is not original
9. Table 1b: Reported cases of primary and secondary
syphilis by age, United States, 2002
Age group   Number of cases   Percent
<14         21                0.3
14-19       351               5.1
20-24       842               12.3
25-29       895               13.0
30-34       1,097             16.0
35-39       1,367             19.9
40-44       1,023             14.9
45-54       982               14.3
≥55         284               4.1
Total       6,862             100*
*The percentages actually total 99.9%, not 100.0%, because of rounding
Data source: CDC. Sexually Transmitted Disease Surveillance 2002.
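The rounding footnote can be checked with a few lines of code. This is a minimal sketch that recomputes the Percent column of Table 1b from the counts; the dictionary labels (including "55+" for the oldest group) are shorthand chosen for the example.

```python
# Counts from Table 1b; the labels are shorthand chosen for this example.
cases = {
    "<14": 21,
    "14-19": 351,
    "20-24": 842,
    "25-29": 895,
    "30-34": 1097,
    "35-39": 1367,
    "40-44": 1023,
    "45-54": 982,
    "55+": 284,
}

total = sum(cases.values())

# Percent of all cases in each age group, rounded to one decimal place.
percents = {group: round(100 * n / total, 1) for group, n in cases.items()}

# Adding the rounded values shows why the column does not reach 100.0.
percent_sum = round(sum(percents.values()), 1)

for group, pct in percents.items():
    print(f"{group:<6} {cases[group]:>6} {pct:>5}")
print("Total", total, "sum of rounded percents:", percent_sum)
```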
10. Frequency distribution
• A frequency distribution is an organized tabulation
showing exactly how many individuals are located in each
category on the scale of measurement.
• It displays the values a variable can take and how often
each value occurs in a given sample (the number of persons
or records with each value)
• A value is an individual entry in a column
• Frequency distributions can be presented in tables or
graphically
11. Frequency distribution table
• Is the simplest method for presenting a
summary of categorical data
• The most common frequency tables are one-way and
two-way tables
• Consists of at least two columns: one listing
categories on the scale of measurement (X)
and another for frequency (f).
• In the X column, values are listed from the
highest to lowest or lowest to highest, without
skipping any.
12. One-variable tables/one-way
frequency table
• This is the most basic and simple table
• Frequency distribution with only one variable
• The first column shows the categories of the
variable
• The second column shows the number of
persons or events that fall into each category
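A one-way frequency table can be built directly from raw values. The sketch below tallies the Sex column of the hepatitis A line listing (Table 1a); the function name `one_way_table` is an assumption made for this example.

```python
from collections import Counter


def one_way_table(values):
    """Return (category, count, percent) rows plus the grand total."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [
        (category, n, round(100 * n / total, 1))
        for category, n in sorted(counts.items())
    ]
    return rows, total


# Sex of the nine cases in Table 1a, in ID order.
sex = ["M", "M", "M", "F", "F", "M", "F", "F", "M"]

rows, total = one_way_table(sex)
for category, n, pct in rows:
    print(f"{category:<6} {n:>3} {pct:>5}")
print(f"{'Total':<6} {total:>3}")
```

The first column holds the categories and the second the number of records in each, as described above; the percent column is an optional extra.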
13. Example of a one-way frequency table
• Table 1c: District of residence of women who
delivered at MRRH in 2006
District         Number   %
Mbarara          2,045    66.5
Bushenyi         236      7.7
Ibanda           33       1.1
Isingiro         499      16.2
Kiruhura         133      4.3
Other district   131      4.3
Total            3,077    100
14. One-way frequency table cont..
• Table 1d: Reported cases of primary and secondary syphilis
by age, United States, 2002
Age group   Number of cases   Percent
<14         21                0.3
14-19       351               5.1
20-24       842               12.3
25-29       895               13.0
30-34       1,097             16.0
35-39       1,367             19.9
40-44       1,023             14.9
45-54       982               14.3
≥55         284               4.1
Total       6,862             100*
*Percentages total 99.9% because of rounding
15. Two- and three-variable tables/two-way
frequency table
• The two-by-two table is a favorite among epidemiologists
• Data can also be cross-classified by two or three variables
• Table 1e shows the number of syphilis cases classified
by both age group and sex of the patients
• This is a two-variable table with data categorized jointly
by those two variables
16. Table 1e: Reported cases of primary and secondary
syphilis by age and sex, United States, 2002
Age group   Male    Female   Number of cases
<14         9       12       21
14-19       135     216      351
20-24       533     309      842
25-29       668     227      895
30-34       877     220      1,097
35-39       1,121   246      1,367
40-44       845     178      1,023
45-54       825     157      982
≥55         255     29       284
Total       5,268   1,594    6,862
Data source: CDC. Sexually Transmitted Disease Surveillance 2002
17. Example of a two-way frequency table
• Table 1f: Frequency of maternal complications
by parity
Parity      No complications   Mother had complications   Total
Para 1      1,070              25                         1,095
Para 2      655                14                         669
Para 3      447                7                          454
Para 4      312                16                         328
5 or more   499                32                         531
Total       2,983              94                         3,077
18. The two-way frequency table
• Table 1g: Occurrence of maternal
complications by parity
            No complications   Had maternal complications
Parity      Number   %         Number   %
1           1,070    97.7      25       2.3
2           655      97.9      14       2.1
3           447      98.5      7        1.5
4           312      95.1      16       4.9
5 or more   499      94.0      32       6.0
Total       2,983    96.9      94       3.1
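The percentages in a two-way table like this one are row percentages: each cell divided by its row total. The following sketch recomputes them from the counts in Table 1f; the names `counts` and `row_percentages` are illustrative assumptions.

```python
# Counts from Table 1f: parity -> (no complications, had complications).
counts = {
    "1": (1070, 25),
    "2": (655, 14),
    "3": (447, 7),
    "4": (312, 16),
    "5 or more": (499, 32),
}


def row_percentages(no_comp, comp):
    """Return both cells of one row as percentages of the row total."""
    row_total = no_comp + comp
    return (
        round(100 * no_comp / row_total, 1),
        round(100 * comp / row_total, 1),
    )


# Column totals give the "Total" row of the table.
total_no_comp = sum(no for no, _ in counts.values())
total_comp = sum(c for _, c in counts.values())

table = {parity: row_percentages(*cells) for parity, cells in counts.items()}
table["Total"] = row_percentages(total_no_comp, total_comp)

for parity, (pct_no, pct_yes) in table.items():
    print(f"{parity:<10} {pct_no:>5} {pct_yes:>5}")
```

Row percentages make the parity groups comparable even though the groups differ greatly in size.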
19. Graphical display
• There are different types of graphs/diagrams
that can be used to display the frequency
distribution of data:
– Pie chart, bar chart/graph
– Histogram, ogive, line graph
– Scatter diagram
– Box-and-whisker plot, etc.
• The choice of the appropriate graph/diagram
depends on the type of variable(s) to be
presented:
– Categorical or numerical
20. Why use graphs to present data?
Because they...
• highlight the most important facts
• facilitate understanding of the data
• can convince readers
• can be easily remembered
21. Histogram
• A histogram is a type of frequency distribution
diagram. It is constructed by plotting frequency against
class boundaries.
• Can be described by:
– Its shape (may be symmetrical about the mean or
skewed)
– Centre
– Spread
• What observations can you make about the histogram
of birth weight of neonates admitted at MRRH?
22. Histogram cont..
• The shape of the histogram provides information
about the distribution of scores on the
continuous variable
• Scores are often approximately normally distributed,
with most scores occurring near the centre
• Scores may be skewed to the left or right
26. Plotting a histogram (exercise)
• Use the following data to plot a histogram. Use 5-year age
intervals
Age (years)   Frequency
15            2
16            1
17            10
18            30
19            69
20            243
21            131
22            192
23            220
24            236
25            253
26            189
27            178
28            219
29            145
30            285
31            65
32            148
33            67
34            65
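One possible way to prepare the data for this exercise is to add up the single-year frequencies within each 5-year interval before drawing the bars. The sketch below does that and prints a rough text histogram; the function `group_ages` and the scale of one `#` per 25 observations are assumptions made for illustration.

```python
# Single-year frequencies from the exercise table.
freq_by_age = {
    15: 2, 16: 1, 17: 10, 18: 30, 19: 69,
    20: 243, 21: 131, 22: 192, 23: 220, 24: 236,
    25: 253, 26: 189, 27: 178, 28: 219, 29: 145,
    30: 285, 31: 65, 32: 148, 33: 67, 34: 65,
}


def group_ages(freqs, width=5, start=15):
    """Sum frequencies into intervals of `width` years beginning at `start`."""
    grouped = {}
    for age, f in sorted(freqs.items()):
        lower = start + ((age - start) // width) * width
        label = f"{lower}-{lower + width - 1}"
        grouped[label] = grouped.get(label, 0) + f
    return grouped


grouped = group_ages(freq_by_age)

# Rough text histogram: one '#' per 25 observations (arbitrary scale).
for label, f in grouped.items():
    print(f"{label:<6} {'#' * round(f / 25)} {f}")
```

The grouped frequencies are what the bar heights of the histogram represent once the intervals are all the same width.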
27. When looking at a histogram
– Does it have one peak or two peaks?
– Are the data values spread out on the
graph?
– Are the data values clustered at the right or
left end?
– Are there data values at the extreme ends
(outliers)?
28. Shapes of Histograms
Bell shape
A special type of symmetric unimodal histogram
is one that is bell shaped.
A histogram is said to be symmetric if, when we
draw a vertical line down the center of the
histogram, the two sides are mirror images of each other.
Many statistical techniques
assume that the population
is approximately bell shaped.
Drawing the histogram
helps check the shape of
the population in question.
[Figure: bell-shaped histogram, horizontal axis: variable]
30. Shapes of Histograms
• Skewness is a measure of the asymmetry of a distribution
• The shape of the frequency distribution can be
symmetrical or asymmetrical
• A symmetric distribution has the same shape on both
sides of the mean (central location)
• When it is asymmetrical, we say it is skewed
• If outlying values occur only in one direction, the
distribution is said to be skewed
• Normally distributed data have a skewness of zero
31. Skewness cont...
• When the mean, mode and median are
approximately the same, the distribution is
approximately symmetric (as in a normal distribution);
otherwise it is skewed
• Distributions with fewer observations on the right
(a long tail toward higher values) are said to be skewed
right/positively skewed
• Distributions with fewer observations on the left
(a long tail toward lower values) are said to be skewed
left/negatively skewed.
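To make the direction of skew concrete, the sketch below compares the mean with the median and computes a simple moment-based skewness coefficient (third central moment divided by the 1.5 power of the second). The three small data sets are hypothetical, and this population formula is only one of several skewness measures in use.

```python
import statistics


def skewness(values):
    """Moment coefficient of skewness (population form, no bias correction)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples, invented for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 10]        # long tail to the right
left_skewed = [11 - x for x in right_skewed]        # mirror image

for name, data in [("symmetric", symmetric),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(name,
          "mean:", round(statistics.fmean(data), 2),
          "median:", statistics.median(data),
          "skewness:", round(skewness(data), 2))
```

In the right-skewed sample the mean is pulled above the median by the large value in the tail, and the coefficient is positive; the mirror-image sample gives the opposite sign.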
32. Shapes of Histograms cont...
Skewness
A skewed histogram is one with a long tail
extending to either the right or the left
[Figure: two histograms of frequency against the variable, one positively skewed and one negatively skewed]
34. Line graph
• A line graph shows patterns or trends over some
variable, often time
• A line graph allows you to inspect the mean
scores of a continuous variable across a number of
values of a categorical variable
• It is the method of choice for plotting rates over
time
36. Draw a line graph for the data below
Month      Number of babies delivered in Kyandondo County
February   43
March      60
April      80
May        94
June       110
37. Ogive
• Ogives are graphs used to estimate how
many observations lie below or above a particular
value in a data set
• To construct a cumulative frequency curve (ogive),
it is first necessary to form the frequency table.
• The upper class boundaries are taken
as the x-coordinates and the cumulative
frequencies as the y-coordinates, and the points
are plotted.
• The points are joined by a smooth freehand
curve to give the ogive.
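The construction steps above can be sketched in code: accumulate the class frequencies and pair each running total with the upper boundary of its class. The grouped data here are hypothetical, and the function name `ogive_points` is an assumption for the example.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [
    (0.5, 10.5, 4),
    (10.5, 20.5, 9),
    (20.5, 30.5, 15),
    (30.5, 40.5, 7),
]


def ogive_points(grouped):
    """Return (x, y) points: upper class boundary vs cumulative frequency."""
    uppers = [upper for _, upper, _ in grouped]
    cumulative = list(accumulate(f for _, _, f in grouped))
    # The curve starts at the lowest boundary with a cumulative frequency of 0.
    return [(grouped[0][0], 0)] + list(zip(uppers, cumulative))


points = ogive_points(classes)
for x, y in points:
    print(x, y)
```

Plotting these points and joining them gives the ogive; reading across from a y-value gives an estimate of the value below which that many observations lie.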
40. Bar chart/graph
• A bar chart is a graphical device for depicting qualitative
data
• Basically you need two variables: one categorical and one
numerical, such as a frequency
• Comparison of categories is based on the fact that the
length of the bar is proportional to the frequency of the
event in that category
• Bars for different categories are separated by spaces
• The bars can be drawn either vertically or horizontally
41. Bar chart cont...
• On the horizontal axis (x-axis) we specify the labels
that are used for each of the classes.
• A frequency or percent frequency scale can be
used for the vertical axis (y-axis)
• Using a bar of fixed width drawn above each
class label (x-axis), we extend the height
appropriately.
• The bars are separated to emphasize the fact
that each class is a separate category.
42. Types of bar graph
• Simple bar chart
– Only one variable is represented
• Component/stacked bar chart
– A single bar is used to indicate the composition of the
total, divided into sections according to the relative
proportions. More than one variable is represented
• Multiple/compound bar chart
– Each observation has more than one value, represented
by a group of bars
– These are useful to compare values across categories.
47. The following information shows the
favorite subjects of students at KIU-WC
• Draw a bar graph
Favorite course unit   Female students   Male students
Pharmacology           26                18
Biostatistics          20                20
Biochemistry           21                35
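As a plain-text stand-in for a multiple (grouped) bar chart, the sketch below draws one horizontal bar per sex within each course unit, with bar length equal to the count. It is only an illustration of the "length proportional to frequency" idea, not a substitute for a properly drawn graph; the function name `text_bars` is an assumption.

```python
# Counts from the table above: course unit -> students by sex.
favorites = {
    "Pharmacology": {"Female": 26, "Male": 18},
    "Biostatistics": {"Female": 20, "Male": 20},
    "Biochemistry": {"Female": 21, "Male": 35},
}


def text_bars(data, symbol="#"):
    """Return lines of a grouped horizontal bar chart, one symbol per student."""
    lines = []
    for category, groups in data.items():
        for group, count in groups.items():
            lines.append(f"{category:<14}{group:<7}{symbol * count} {count}")
        lines.append("")  # blank line separates the categories
    return lines


bar_lines = text_bars(favorites)
print("\n".join(bar_lines))
```

Grouping the two bars for each course unit side by side is what makes it a multiple bar chart rather than a simple one.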
48. Pie chart
• The pie chart is a commonly used graphical device for
presenting relative frequency/percentage distributions
for qualitative data.
• A circle is divided into a series of segments. Each
segment represents a particular category of the total
data set.
• First draw a circle; then use the relative frequencies to
subdivide the circle into sectors that correspond to the
relative frequency for each class.
• Since there are 360 degrees in a circle, each relative
frequency is multiplied by 360 to get the angle of its
sector in degrees.
50. Draw a pie-chart for the data
presented below
• The following are the commonest causes of injury
in rural and urban Uganda in people aged 30-39
years
Cause        Frequency
Burns        30
Cuts/stabs   42
Falls        36
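The sector angles for a pie chart follow from the 360-degree rule: relative frequency times 360. A minimal sketch using the injury data above (the function name `sector_angles` is an assumption for the example):

```python
# Frequencies from the injury table above.
injuries = {"Burns": 30, "Cuts/stabs": 42, "Falls": 36}


def sector_angles(freqs):
    """Return each category's sector angle in degrees."""
    total = sum(freqs.values())
    # angle = relative frequency x 360
    return {cause: 360 * n / total for cause, n in freqs.items()}


angles = sector_angles(injuries)
for cause, angle in angles.items():
    share = 100 * injuries[cause] / sum(injuries.values())
    print(f"{cause:<11} {share:5.1f}%  {angle:6.1f} degrees")
```

The angles should always add up to 360 degrees, which is a quick check before drawing the sectors.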
Editor's Notes
We’re going to review the most commonly used charts and graphs in Excel/PowerPoint. Later, we’ll have you use data to create your own graphics, which may go beyond those presented here.
Bar charts are used to compare data across categories.
Line graphs are used to display trends over time.
Pie charts show percentages or the contribution of each value to a total.
Outliers: Sample values that lie very far away from the majority of other sample values
A stacked bar chart is often used to represent components of a whole and compare the wholes (or multiple values). In a variant of a stacked bar chart, we make all of the bars the same height (or length) and show the components as percents of the total rather than as actual values. This type of chart is useful for comparing the contribution of different components to each of the categories of the main variable.
Here, you see the number of months female and male patients have been enrolled in HIV care, by age group. By looking within each bar, you see the age breakdown by gender, and by looking at both bars together, you can compare the number of months enrolled for both males and females.