1. BHIS HEALTH PLANNING & RESEARCH CAPACITY BUILDING 12TH-14TH FEBRUARY 2024
DATA VISUALIZATION
Dr. Adesina
Introduction
• Statistical decision making begins with analysis of data.
• The purpose of statistical analysis is to quantify a variable or
to quantify the magnitude of the association between an
exposure variable and an outcome variable.
• At the end of every data analysis exercise, the
findings/results must be communicated.
Introduction (continued)
• One essential method of communicating our findings is
data visualization.
• Data visualization
‒Makes data easily understood by people with varying
experiences of data and statistics
‒Makes it easy to absorb large volumes of data
‒Makes it easy to spot trends and patterns in data
‒Encourages interaction with data
‒Speeds up decision-making processes using data.
Applications of Data Visualization
• Tracking of project progress
• Stakeholder engagement
• Mapping
• Research dissemination
• Predictive Analysis
Discussions
• In what ways have you used visualization in your
departments?
Elements of Data Visualization
• Position
• Length
• Orientation
• Area
• Shape
• Volume
• Colour
• Texture
• Numbers
• Text
Examples of Data Visualization
• Tables
• Bar chart
• Box and Whisker
• Cartogram
• Circle View (Pie chart)
• Histogram
• Scatter plots
• Radial tree
• Dot Distribution Map
• Area charts
Tables
• The tables should be numbered e.g. Table 1, Table 2 etc.
• A title must be given to each table,
‒e.g. Table showing sex distribution.
• Label each row and each column clearly and concisely.
• Include the units of measurement for the data (for example,
years, mm Hg, mg/dl, rate per 100,000).
Tables (continued)
• Show totals for rows and columns. If you show percentages (%), also
give their total (always 100).
• Explain any codes, abbreviations, or symbols in a footnote.
• Acknowledge the source of the data in a footnote if the data are
not original.
Types of Tables
• Frequency distribution tables
• Cumulative frequency tables
• Grouped frequency tables
• Cross-tabulation
Frequency Table
A frequency table is a distribution arranged in rows and columns.
It shows, in absolute or relative terms, how often different values
of a variable are encountered in a sample. It is a one-variable table.
A qualitative frequency table presents categorical variables.
A quantitative frequency table presents discrete or continuous
numerical variables.
Table 2. Frequency distribution of a nominal variable (marital status).

Marital status   Absolute freq.   Relative freq. (%)
1. Single             183               52.3
2. Married             94               26.9
3. Widowed             22                6.3
4. Divorced            51               14.5
Total                 350              100.0
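As a rough illustration of how such a table can be produced from raw data, the sketch below counts each category and converts the counts to percentages. The `responses` list is hypothetical: it is simply rebuilt from the counts in Table 2, and the function name `frequency_table` is an assumption made for this example.

```python
from collections import Counter

# Hypothetical raw responses, rebuilt from the counts shown in Table 2.
responses = (["Single"] * 183 + ["Married"] * 94
             + ["Widowed"] * 22 + ["Divorced"] * 51)

def frequency_table(values):
    """Return {category: (absolute frequency, relative frequency in %)}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.items()}

table = frequency_table(responses)
for category, (absolute, relative) in table.items():
    print(f"{category:<10}{absolute:>5}{relative:>7.1f}")
print(f"{'Total':<10}{sum(a for a, _ in table.values()):>5}")
```

Computed directly, the Divorced share is 51/350, which rounds to 14.6%; the 14.5 shown in Table 2 appears to have been adjusted so that the rounded column adds up to exactly 100.0.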
Frequency table of an ordinal variable (educational qualification)

Educational qualification   Absolute freq.   Relative freq. (%)
1. Degree                        189               54.0
2. SSC                            87               24.9
3. FSLC                           74               21.1
Total                            350              100.0
Frequency table of a discrete numerical variable (parity)

Parity (number of children)   Absolute frequency   Relative frequency (%)
 0                                  110                  31.4
 1                                   74                  21.1
 2                                   38                  10.9
 3                                   15                   4.3
 4                                   20                   5.7
 5                                   56                  16.0
 6                                    0                   0.0
 7                                   13                   3.7
 8                                    9                   2.6
 9                                   13                   3.7
10                                    2                   0.6
Total                               350                 100.0
Cumulative Frequency Table
• This shows in absolute or relative terms, how many
observations take values that are “greater than” or “less
than” a specific value.
Cumulative frequency table of number of children (parity)

Parity   Absolute    Relative    Cumulative   Cumulative
         frequency   frequency   absolute     relative
                     (%)         frequency    frequency (%)
 0         110         31.4        110           31.4
 1          74         21.1        184           52.5
 2          38         10.9        222           63.4
 3          15          4.3        237           67.7
 4          20          5.7        257           73.4
 5          56         16.0        313           89.4
 6           0          0.0        313           89.4
 7          13          3.7        326           93.1
 8           9          2.6        335           95.7
 9          13          3.7        348           99.4
10           2          0.6        350          100.0
Total      350        100.0
Interpretation: 257 respondents (73.4%) have four (4) children or fewer, while
350 - 257 = 93 respondents have more than 4 children, representing
(100 - 73.4)% = 26.6%.
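The running totals and the interpretation above can be reproduced with a short sketch. The frequencies are copied from the parity table; the variable names are assumptions made for this example.

```python
from itertools import accumulate

# Absolute frequencies for parity 0 to 10, copied from the table above.
frequencies = [110, 74, 38, 15, 20, 56, 0, 13, 9, 13, 2]

# Running totals give the cumulative absolute frequencies.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]
cumulative_pct = [round(100 * c / total, 1) for c in cumulative]

# Respondents with four children or fewer, and with more than four.
at_most_four = cumulative[4]
more_than_four = total - at_most_four
print(at_most_four, more_than_four, round(100 * more_than_four / total, 1))
```

Working from the cumulative counts gives 52.6% for parity 1 (184/350), whereas the 52.5 in the table matches adding the already rounded relative frequencies (31.4 + 21.1).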
Grouped frequency table
• Data with many values are grouped into class
intervals.
• A class interval is defined by a lower and an upper value,
called class limits or class boundaries.
• Class width is the size of the class interval, obtained by
subtracting the lower boundary from the upper boundary.
• Class mid-point or class mark is the middle value of
the class interval; it is obtained by adding the upper
and lower class boundaries and dividing by two.
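The two definitions above amount to simple arithmetic, sketched below. The boundaries 14.5 and 24.5 are an assumption: they treat a class written as 15-24 as extending half a unit beyond its written limits, which is one common convention and is not stated in the slides.

```python
def class_width(lower_boundary, upper_boundary):
    """Size of the class interval: upper boundary minus lower boundary."""
    return upper_boundary - lower_boundary

def class_midpoint(lower_boundary, upper_boundary):
    """Class mark: the sum of the two boundaries divided by two."""
    return (lower_boundary + upper_boundary) / 2

# Assumed boundaries for a class written as 15-24 (hypothetical convention).
lower, upper = 14.5, 24.5
width = class_width(lower, upper)
midpoint = class_midpoint(lower, upper)
print(width, midpoint)
```

If ages are recorded as age at last birthday, the class would instead run from 15 up to (but not including) 25, which shifts the midpoint to 20; the convention should be stated alongside the table.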
Grouped frequency table for age distribution

Age (in years)   Absolute frequency   Relative freq. (%)
15-24                   67                  19.1
25-34                   89                  25.4
35-44                  106                  30.3
45-54                   33                   9.4
55-64                   28                   8.0
65-74                   23                   6.6
≥75                      4                   1.2
Total                  350                 100.0
Cross-tabulation (Contingency Table)
A two-variable table with cross-tabulated data is also known as a
contingency table.
Cross-tabulation is used to display the association between
exposure and outcome variables.
Conventionally, the rows correspond to exposure values and the
columns correspond to outcomes.
General structure of a contingency table

             Disease present   Disease absent   Total
Exposed             a                 b          a + b
Unexposed           c                 d          c + d
Total             a + c             b + d        a + b + c + d
Cross-tabulation of 188 people by occupation and disease.

                                       Occupation
Angular stomatitis            Professional   Skilled   Unskilled   Total
Present                             5           13         70        88
Absent                             20           30         50       100
Total                              25           43        120       188
Percentage with disease (%)       20.0         30.2       58.3      46.8
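The "percentage with disease" row can be checked with a few lines of code. The counts are copied from the cross-tabulation above; the dictionary layout and the helper name `percent_with_disease` are assumptions made for this sketch.

```python
# Counts copied from the cross-tabulation: {occupation: (present, absent)}.
counts = {
    "Professional": (5, 20),
    "Skilled": (13, 30),
    "Unskilled": (70, 50),
}

def percent_with_disease(present, absent):
    """Percentage of a group in which the disease is present."""
    return round(100 * present / (present + absent), 1)

by_occupation = {occ: percent_with_disease(p, a) for occ, (p, a) in counts.items()}

# Column totals give the overall percentage across all occupations.
total_present = sum(p for p, _ in counts.values())
total_absent = sum(a for _, a in counts.values())
overall = percent_with_disease(total_present, total_absent)

print(by_occupation, overall)
```

Comparing the percentages across columns, rather than the raw counts, is what makes the association between occupation and disease visible, because the groups differ in size.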
Graphs for Qualitative Variables
• The bar chart
• The pie chart
• Spot map
The bar chart
• The simplest bar chart is used to display the data from a one-variable
table.
• Each category of the variable is represented by a bar. The length of the
bar is proportional to the number of events (frequency).
• Variables shown in bar charts are categorical, such as nominal or ordinal
variables. Discrete numerical variables, e.g. year, can also be presented
using a bar chart.
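To make the idea of bar length proportional to frequency concrete, here is a minimal text-only sketch that needs no plotting library. The occupation counts are the ones used elsewhere in these slides; the function name `text_bar_chart` and the `max_width` parameter are assumptions for illustration. In practice a charting tool would normally be used instead.

```python
# Occupation counts as used in the occupation examples in these slides.
counts = {"Professional": 25, "Skilled": 43, "Unskilled": 120}

def text_bar_chart(data, max_width=40):
    """Return one line per category; bar length is proportional to frequency."""
    largest = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(max_width * value / largest)
        lines.append(f"{label:<13}{bar} {value}")
    return lines

chart = text_bar_chart(counts)
for line in chart:
    print(line)
```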
Simple bar chart of occupation distribution
[Figure: bar chart of frequency (axis 0 to 120) by occupation: Professional 25, Skilled 43, Unskilled 120]
Variant of a Bar Chart: Multiple Bar Chart
Pie chart
• The pie chart is used for categorical data. Each sector of the
circle has an area proportional to its class frequency.
• To draw a pie chart, we determine the sectoral angle corresponding
to each category (here, each occupation).
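The sectoral angle for a category is its share of the total multiplied by 360 degrees. The sketch below applies this to the occupation counts used in these slides; the function name `sector_angles` is an assumption made for the example.

```python
# Occupation counts as used in the occupation examples in these slides.
counts = {"Professional": 25, "Skilled": 43, "Unskilled": 120}

def sector_angles(data):
    """Angle in degrees for each category: (frequency / total) * 360."""
    total = sum(data.values())
    return {label: 360 * value / total for label, value in data.items()}

angles = sector_angles(counts)
for label, angle in angles.items():
    print(f"{label:<13}{angle:6.1f} degrees")
```

The angles come to roughly 47.9, 82.3 and 229.8 degrees, which add up to 360.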
Pie chart of occupation distribution
[Figure: pie chart with sectors for Unskilled, Skilled, and Professional]
Spot map
A spot map (geographic coordinate chart or dot density map)
shows the geographic distribution of disease conditions.
Clustering of dots may indicate a common-source epidemic, so
spot maps are used in outbreak investigations. A famous early
spot map was plotted by John Snow during the 1854 cholera
outbreak in London.
Graphs for Quantitative variables
• Histogram
• Frequency polygon
• Ogive
• Scatter plot
• Stem and leaf diagram
• Box and Whiskers plot
• Dotplot
Histogram
• A histogram is a graph of the frequency distribution of a
continuous variable. It uses adjoining columns to represent the
number of observations for each class interval in the distribution.
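The column heights of a histogram are just the counts of observations falling in each class interval. The sketch below shows that counting step; the readings are hypothetical values invented for illustration, and the class layout (start 60, width 5, five classes) is likewise an assumption.

```python
def class_counts(values, start, width, n_classes):
    """Count the observations in each class [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts

# Hypothetical diastolic blood pressure readings (not from the slides).
readings = [62, 64, 66, 67, 69, 70, 71, 72, 74, 75, 78, 80]
counts = class_counts(readings, start=60, width=5, n_classes=5)

# One adjoining "column" per class, drawn sideways as text.
for i, n in enumerate(counts):
    low = 60 + 5 * i
    print(f"{low}-{low + 5}: {'#' * n} ({n})")
```

Because the classes share their boundaries, the columns touch each other, which is what distinguishes a histogram from a bar chart.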
Frequency Polygon
A frequency polygon is a graphical display for a quantitative
variable, with class frequencies plotted against class marks.
The points are joined by straight lines. It can show the
distributions of more than one group on the same axes.
Frequency polygon
Fig. 7 Frequency distribution of diastolic blood pressure in females aged 45-64 years
[Figure: frequency polygon; x-axis: diastolic blood pressure (class marks 63, 67, 71, 75, 79); y-axis: frequency (0 to 16)]
Scatter plot
• Scatter plot is a plot of the values of a dependent variable y
versus the values of an independent variable, x, which are
quantitative.
• Slope refers to the direction of change in variable Y when
variable X gets bigger. If variable Y also gets bigger, the
slope is positive; but if variable Y gets smaller, the slope is
negative.
Scatter plot (continued)
• Strength refers to the degree of "scatter" in the plot. If the
dots are widely spread, the relationship between variables is
weak. If the dots are concentrated around a line, the
relationship is strong.
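Direction and strength can also be summarised numerically. One common choice, which the slides do not name and which is used here as an assumption, is Pearson's correlation coefficient r: its sign gives the direction of the slope, and values near +1 or -1 indicate dots concentrated around a line. The height and weight values below are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data: heights (cm) and weights (kg).
heights = [150, 155, 160, 165, 170, 175]
weights = [52, 55, 61, 63, 70, 72]

r = pearson_r(heights, weights)
direction = "positive" if r > 0 else "negative" if r < 0 else "flat"
print(f"r = {r:.2f}, slope is {direction}")
```

A coefficient close to zero corresponds to widely spread dots, that is, a weak linear relationship.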
Scatter diagram
[Figure: scatter diagram; one axis labelled HT]
Cumulative Frequency Graph (Ogive)
• A cumulative frequency graph (an ogive) is a curve showing
the cumulative frequency for a given set of data.
• The cumulative frequency is plotted on the y-axis against the
variable on the x-axis for un-grouped and grouped data. It is
plotted against the upper boundary of the class in grouped data.
• It is useful for determining medians, quartiles, and other
percentiles.
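Reading a percentile off an ogive means finding where the cumulative frequency first reaches the required share of the total. The sketch below does this for the ungrouped parity data shown earlier in these slides; the function name `value_at_percentile` is an assumption, and for grouped data one would interpolate within the class instead.

```python
from itertools import accumulate

# Parity values 0 to 10 and their absolute frequencies, from the parity table.
values = list(range(11))
frequencies = [110, 74, 38, 15, 20, 56, 0, 13, 9, 13, 2]

def value_at_percentile(values, frequencies, percent):
    """Smallest value whose cumulative frequency reaches percent% of the total."""
    cumulative = list(accumulate(frequencies))
    target = cumulative[-1] * percent / 100
    for value, running in zip(values, cumulative):
        if running >= target:
            return value
    return values[-1]

q1 = value_at_percentile(values, frequencies, 25)
median = value_at_percentile(values, frequencies, 50)
q3 = value_at_percentile(values, frequencies, 75)
print(q1, median, q3)
```

For these data the median is 1 child, because the cumulative frequency first passes 175 (half of 350) at parity 1.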
Box and Whiskers Plot
• A boxplot, sometimes called a box and whisker
plot, is used to display the distribution of quantitative data.
• A boxplot splits the data set into quartiles; the box
runs from the first quartile (Q1) to the third quartile
(Q3).
Box and Whiskers Plot (continued)
• The “box” spans the 25th to the 75th percentile (the interquartile
range) of the data, and the “whiskers” extend to the minimum and
maximum values.
• We mark the position of the median (50%) with a line inside the
box.
• Thus, with a box plot we can show (and compare) the central
location (median), dispersion (interquartile range and range), and
any tendency toward skewness, which is indicated if the median
line is not centered in the box.
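The five values that a box plot draws can be computed as in the sketch below. The weights are hypothetical, and the quartile method (`method="inclusive"` in Python's `statistics.quantiles`) is one of several conventions; other software may give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

# Hypothetical patient weights in kg (not from the slides).
hospital_a = [52, 55, 58, 60, 63, 66, 70, 74, 81]

minimum, q1, median, q3, maximum = five_number_summary(hospital_a)
iqr = q3 - q1

# If the median is not centred in the box, the data may be skewed.
lower_half = median - q1
upper_half = q3 - median
print(minimum, q1, median, q3, maximum, iqr, lower_half, upper_half)
```

Here the upper part of the box is longer than the lower part, which hints at a tail towards heavier weights.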
[Figure: box plots showing the distribution of weights (kg) of patients from Hospital A and Hospital B, each marked with the minimum value, Q1 (25%), Q2 (50%), Q3 (75%), and the maximum value]