ANALYSIS OF SURVEILLANCE DATA
Dr. Ronnie D. Domingo
Design form → Field data gathering → Data encoding → Data analysis → Report writing
Data Processing
• Sorting
• Coding
• Editing
• Summarizing
Data
analysis
Data processing
• A series of steps undertaken
to transform collected raw
data into a form suitable for
statistical analysis (Sanchez et al.,
1989)
Data sorting method
• Types of data sheets
• Numbering system for data
sheets (especially for surveys)
• The physical “container” for
these raw data
Data Coding
• Examples
Data              Possible codes
“Yes” answer      Y or 1
“No” answer       N or 2
No response       999 or U (for unknown)
Does not know     888 or D
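A minimal sketch of how the coding examples above could be applied during encoding. The code book uses the numeric variant of the codes; the function name `code_answer` and the treatment of a blank answer as "no response" are assumptions made for illustration.

```python
# Hypothetical code book built from the numeric codes in the examples above.
CODE_BOOK = {
    "yes": 1,
    "no": 2,
    "does not know": 888,
    "": 999,  # assumption: a blank answer is treated as "no response"
}

def code_answer(raw):
    """Return the numeric code for a raw answer; raise on unexpected values."""
    key = (raw or "").strip().lower()
    if key not in CODE_BOOK:
        raise ValueError(f"No code defined for answer: {raw!r}")
    return CODE_BOOK[key]

answers = ["Yes", "no", "", "Does not know"]
coded = [code_answer(a) for a in answers]
```

Raising an error on an unlisted answer, rather than guessing a code, keeps unexpected values visible for the editing step.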
Data editing/ validation
Examine the data for
four things: C.A.T.S.
• Completeness
• Accuracy
• Traceability
• Standard format
Spreadsheet from Hell
By Daniel W. Byrne
Spreadsheet from Heaven
By Daniel W. Byrne
GIGO
• Garbage In, Garbage Out
Form Level Validation:
• Applied at the stage of filling out the
online or printed form.
• Mandatory vs. optional fields
• Incomplete entries = “Submit” fails
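The form-level rule above (a form with incomplete mandatory entries cannot be submitted) can be sketched as follows. The field names and the `submit` function are hypothetical and not taken from any particular form system.

```python
# Hypothetical mandatory fields for a field data form.
MANDATORY_FIELDS = ("farmer_name", "date_of_visit", "species")

def missing_mandatory(record):
    """List the mandatory fields that are absent or blank in the record."""
    return [
        field for field in MANDATORY_FIELDS
        if not str(record.get(field, "") or "").strip()
    ]

def submit(record):
    """Return (accepted, missing_fields); submission fails if anything is missing."""
    missing = missing_mandatory(record)
    return (not missing, missing)

complete = {"farmer_name": "Fernando Cruz", "date_of_visit": "2016-02-03", "species": "pig"}
incomplete = {"farmer_name": " ", "species": "pig"}
```

Optional fields are simply left out of `MANDATORY_FIELDS`, so they never block submission.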
Field Level Validation:
• Field= space where you write
the answer
• “Farmer’s Name” field =
Fernan@do Cruz (invalid character)
• Date: 03-02-2016 (ambiguous: day-month or month-day?)
• Provide a list of possible
answers
• Other fields auto appear or
disappear
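A rough sketch of the field-level checks listed above, using only the Python standard library. The name pattern (ASCII letters only, a simplification), the choice of the ISO date format YYYY-MM-DD, and the species list are all assumptions for illustration.

```python
import re
from datetime import datetime

# Assumption: names contain only ASCII letters, spaces, periods,
# apostrophes and hyphens. Real names may need a wider pattern.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z .'\-]*")

# Hypothetical list of possible answers for a "species" field.
ALLOWED_SPECIES = {"pig", "dog", "cattle", "carabao"}

def valid_name(value):
    """True if the whole value matches the name pattern."""
    return NAME_PATTERN.fullmatch(value.strip()) is not None

def valid_iso_date(value):
    """True if the value parses as a year-month-day date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False

def valid_choice(value, allowed=ALLOWED_SPECIES):
    """True if the value is one of the listed possible answers."""
    return value in allowed
```

With these rules, "Fernan@do Cruz" is rejected because of the "@", and "03-02-2016" is rejected because it does not start with a four-digit year, which removes the day-versus-month ambiguity.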
Data Saving Validation:
• Option: keep the record as a
draft copy vs “Submit” as
final copy
• Gives the user time to review and
revise entries
Validation of Continuous
Variables
• Continuous variables: age,
height, weight, feed
consumption, size of lesion, eggs
per gram of feces, temperature,
etc.
• Check the following:
– Minimum value
– Maximum value
– Mean
– Median
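The four checks above can be computed in one pass with the standard library. The ages below are made-up values, with one entry (60) standing in for a possible typing error.

```python
import statistics

def summarize(values):
    """Return the four summary values used to screen a continuous variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

# Hypothetical ages in months; 60 could be a mistyped 6.
ages_months = [4, 5, 5, 6, 6, 7, 60]
summary = summarize(ages_months)
```

Here the maximum (60) and the gap between the mean (about 13.3) and the median (6) both point to a value that deserves a second look against the original data sheet.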
Validation techniques
Sample bar chart of lung score of pigs from several farm sources. The expected lung
scores should range from 0-55. Note “farmer117” registered an erroneous lung score
of 60.
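The same range rule can be checked without a chart. This is a small sketch; apart from the "farmer117" score of 60 mentioned in the caption, the sources and scores are invented for illustration.

```python
# Expected range of lung scores, as stated in the caption above.
LUNG_SCORE_RANGE = (0, 55)

def out_of_range(scores, low, high):
    """Return the entries whose value falls outside [low, high]."""
    return {
        source: score
        for source, score in scores.items()
        if not low <= score <= high
    }

# Hypothetical scores, except farmer117 which follows the caption.
lung_scores = {"farmer101": 12, "farmer117": 60, "farmer123": 0, "farmer130": 55}
flagged = out_of_range(lung_scores, *LUNG_SCORE_RANGE)
```

The boundary values 0 and 55 are accepted; only the score of 60 is flagged for correction.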
Validation of Categorical
Variables
• Categorical Variables –
– nominal (sick, healthy)
– ordinal (+,++, +++)
• Techniques:
– Frequency checks
– Cross Tabulations
Cross-check variables to detect
implausible combinations.
Example: a male dog recorded as
positive for metritis.
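A short sketch of a frequency check and a cross tabulation on categorical variables. The records, field names, and codes are hypothetical.

```python
from collections import Counter

# Hypothetical records; "sex" and "metritis" are categorical variables.
records = [
    {"id": 1, "sex": "F", "metritis": "pos"},
    {"id": 2, "sex": "M", "metritis": "neg"},
    {"id": 3, "sex": "M", "metritis": "pos"},
    {"id": 4, "sex": "f", "metritis": "neg"},
]

# Frequency check: unexpected categories (here a lowercase "f") stand out.
sex_counts = Counter(r["sex"] for r in records)

# Cross tabulation: count each (sex, metritis) combination.
crosstab = Counter((r["sex"].upper(), r["metritis"]) for r in records)

# Implausible combination: a male animal positive for metritis.
suspect_ids = [
    r["id"] for r in records
    if r["sex"].upper() == "M" and r["metritis"] == "pos"
]
```

The frequency table exposes inconsistent coding, while the cross tabulation exposes combinations that each look valid on their own but cannot occur together.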
Data Verification
• Comparing the output of two
encoders
• Comparing the data on the
screen against the original
paper document.
• Comparing the printout of the
computer database and the
original printed document.
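The first approach, comparing the output of two encoders, can be sketched as a field-by-field comparison keyed by record number. The data structure (a dictionary of records per encoder) and the function name are assumptions.

```python
def compare_entries(first, second):
    """Return {record_id: {field: (value_1, value_2)}} for every mismatch."""
    differences = {}
    for record_id in sorted(set(first) | set(second)):
        a = first.get(record_id, {})
        b = second.get(record_id, {})
        changed = {
            field: (a.get(field), b.get(field))
            for field in sorted(set(a) | set(b))
            if a.get(field) != b.get(field)
        }
        if changed:
            differences[record_id] = changed
    return differences

# Hypothetical output of two encoders working from the same data sheets.
encoder_a = {1: {"species": "pig", "weight_kg": 85}, 2: {"species": "dog", "weight_kg": 12}}
encoder_b = {1: {"species": "pig", "weight_kg": 58}, 2: {"species": "dog", "weight_kg": 12}}
differences = compare_entries(encoder_a, encoder_b)
```

A mismatch only shows that at least one encoder made an error; the original paper document still decides which value is correct.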
Summarizing the data
Data analysis
Data analysis: Tools
• Install statistical and
graphics software packages
• Examples: SAS, SPSS,
STATA, Epi Info, R software,
Open Epi, Win Epi, QGIS
• Check the provider for newer
software packages.
Types of Statistical Analysis
Descriptive statistics
• Measures of central tendency: mean, median, mode
• Measures of variation: range, variance, standard deviation, standard error, confidence limits
• Frequency distribution: counts or proportions in different groups; use frequency tables, histograms and other graphs for visual presentation
• Rates and ratios: incidence, prevalence, etc.
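As an illustration of the measures of variation listed above, the sketch below computes them for a small made-up sample. The confidence limits use a normal approximation (z = 1.96 for roughly 95%), which is an assumption; for small samples a t-based multiplier would usually be preferred.

```python
import math
import statistics

def describe_variation(values, z=1.96):
    """Range, sample variance, SD, standard error and approximate 95% limits."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    se = sd / math.sqrt(n)                 # standard error of the mean
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),
        "sd": sd,
        "se": se,
        "lower_limit": mean - z * se,
        "upper_limit": mean + z * se,
    }

# Hypothetical body weights in kilograms.
weights_kg = [10, 12, 14, 16, 18]
variation = describe_variation(weights_kg)
```

For this sample the mean is 14, the sample variance is 10, and the standard error is the square root of 2, so the approximate limits fall about 2.77 on either side of the mean.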
Inferential statistics
• Tests for difference: see next page
• Tests for association:
– Cohort study = relative risk, attributable risk
– Case-control study = odds ratio
– Experimental study = protective value
– Correlation and regression analysis = linear relationship, non-linear relationship
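The cohort and case-control measures named above follow from a 2x2 table. The sketch below uses the standard textbook formulas with made-up counts; the cell names a, b, c, d are a labelling assumption (a = exposed and diseased, b = exposed and healthy, c = unexposed and diseased, d = unexposed and healthy).

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed (cohort study)."""
    return (a / (a + b)) / (c / (c + d))

def attributable_risk(a, b, c, d):
    """Risk in the exposed minus risk in the unexposed (cohort study)."""
    return a / (a + b) - c / (c + d)

def odds_ratio(a, b, c, d):
    """Cross-product ratio ad / bc (case-control study)."""
    return (a * d) / (b * c)

# Hypothetical counts: 30 of 100 exposed animals and 10 of 100
# unexposed animals developed the disease.
a, b, c, d = 30, 70, 10, 90
rr = relative_risk(a, b, c, d)
ar = attributable_risk(a, b, c, d)
odds = odds_ratio(a, b, c, d)
```

With these counts the relative risk is 3, the attributable risk is 0.2, and the odds ratio is about 3.86. None of the functions guard against empty cells, so a zero in b, c, or the unexposed group raises a division error.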
From your sample, make
inferences about the larger
population
Inferential statistics
(deduce, generalize, extrapolate)
• Uses the theory of
probability to
make inferences
about larger
populations from
your sample.
• The pattern seen
in the analyzed
sample is
extrapolated to the
target population.
Sample flow chart to select the
appropriate statistical test
Essential components of a
common report in veterinary
practice
Generate information from
collected data.
Name the comic hero who caught this criminal?
The Phantom
Who visited this place?
Calling?
Every
disease
leaves a
distinct
mark
Two premises of modern
epidemiology:
Diseases in
populations
do not occur
in random
fashion
Diseases in
populations do
have multiple
determinants
Disease patterns are
described based on
three main
epidemiologic
variables: animal, place, and time
Reasons for the Epi Triad:
• The three = most important
• The result = significant
information
• The process = systematic
• The by-product = hypothesis
• The output = transferable to
the stakeholders
Information is
processed data
Basic Activities: CDC
• Count: aggregate the cases in the line listing by characteristic (e.g., place, animal, time)
• Divide: divide the number of cases by the relevant denominator
• Compare: compare incidence across groups
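The three activities above can be walked through on a tiny line listing. The places, case counts, and populations at risk are invented for illustration.

```python
from collections import Counter

# Hypothetical line listing: one row per case.
line_listing = [
    {"case_id": 1, "place": "A"}, {"case_id": 2, "place": "A"},
    {"case_id": 3, "place": "A"}, {"case_id": 4, "place": "A"},
    {"case_id": 5, "place": "B"}, {"case_id": 6, "place": "B"},
    {"case_id": 7, "place": "B"}, {"case_id": 8, "place": "B"},
    {"case_id": 9, "place": "B"},
]

# Hypothetical denominators: population at risk in each place.
population_at_risk = {"A": 200, "B": 500}

# Count: aggregate the cases by place.
case_counts = Counter(case["place"] for case in line_listing)

# Divide: cases over the relevant denominator.
incidence = {
    place: case_counts[place] / population_at_risk[place]
    for place in population_at_risk
}

# Compare: incidence in A relative to B.
incidence_ratio = incidence["A"] / incidence["B"]
```

Place B has more cases (5 versus 4), but after dividing by the denominators, place A has twice the incidence of B, which is why counts alone should not be compared across groups.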
Forms of analysis output
• Textual
• Tabular
• Graphical
Data Presentation: Graphical
(Horizontal bar graph)
[Horizontal bar graph. Vertical axis: farm code (SFB, BFB, PFB, RFB, PGF, AAF, RDF, SCF); horizontal axis: proportion of positive samples (%), 0 to 90. Data type: qualitative.]
Figure 1. Bar graph of the proportion of Mycoplasma
hyopneumoniae-positive samples per farm of origin, as detected
by the LAMP technique
Data Presentation: Graphical (Vertical bar graph)
[Vertical bar graph. Horizontal axis: province (Aurora, Bataan, Bulacan, N. Ecija, Pampanga, Tarlac, Zambales); vertical axis: 0 to 350,000. Data type: qualitative.]
Figure 1. Estimated dog population in the different provinces of Region III, 2013
Data Presentation: Graphical
(Line graph)
[Line graph. Horizontal axis: year, 2002 to 2013; vertical axis: 0 to 500. Data type: continuous quantitative.]
Figure 2. Secular trend of animal rabies in Central Luzon, 2002 to 2013.
Data Presentation: Graphical
(Pie Graph)
[Pie graph. Pampanga 30%, Bulacan 20%, Nueva Ecija 15%, Bataan 12%, Tarlac 10%, Zambales 7%, Aurora 6%.]
Figure 3. Rabies vaccine allotment to different provinces in Central Luzon, 2013
Animal
Which type of animals are prone
to develop the disease and which
type tends to be spared?
Common groupings
employed in epidemiology
Age
Sex
Species
Breed
Use
Classification of time
trends
• Short term
• Cyclical
• Seasonal
• Long-term
Graphs of endemic and sporadic
diseases
[Graph. Horizontal axis: month, January to December; vertical axis: incidence count of animal rabies, 0 to 25.]
Seasonal distribution of animal rabies in
Central Luzon, 2002-2011
Surra Prevalence –
CATT Percent Positive
by Municipality
Source: EAHMI, based on data
provided by PAHC and RADLs.
Types of Thematic Maps
1. Qualitative maps = maps that
show non-measurable
characteristics (e.g., low and
high rainfall).
2. Quantitative maps = maps
that depict areas with
measured variations.
Qualitative Map
Geographic distribution of Japanese encephalitis
Types of quantitative maps:
(a) Dot maps
(b) Choropleth maps
(c) Isopleth maps
(d) Proportional symbol maps
Dot Maps
Choropleth maps
• Geographic areas are shaded or colored according to a prearranged key,
each shading or color type corresponding to a range of values
• Commonly used in showing population density information
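The "prearranged key" of a choropleth map amounts to assigning each area's value to a class. A minimal sketch follows, with hypothetical class breaks and hypothetical percent-positive values per municipality.

```python
from bisect import bisect_right

# Hypothetical key: class boundaries in percent and matching labels.
CLASS_BREAKS = [10, 25, 50]
CLASS_LABELS = ["0 to <10%", "10 to <25%", "25 to <50%", "50% and above"]

def classify(value):
    """Return the key class whose range contains the value."""
    return CLASS_LABELS[bisect_right(CLASS_BREAKS, value)]

# Hypothetical percent-positive values by municipality.
percent_positive = {"Municipality A": 4.5, "Municipality B": 10.0, "Municipality C": 62.0}
shading = {area: classify(value) for area, value in percent_positive.items()}
```

Each class would then be drawn with its own shade or color; a value exactly on a boundary (such as 10.0) falls into the higher class under this rule.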
Isopleth Map
from iso meaning “equal”
and pleth meaning “quantity”;
lines join points of equal value.

Editor's Notes

  • #25 Descriptive statistics summarize your identified group of numbers. They do not draw conclusions about the data.
  • #58 Dots of the same size are superimposed over the study area. Dots could be piggery farms, buffalo populations, disease outbreaks, etc.