A few ice-breaking questions. Do not spend too much time on them.
Outline of the session.
You can connect this slide to the one we showed earlier in the course. We worked on the data collection side before. Now we are working on analysis to transform data into information.
This is the difference between data and information. The process between the two is analysis, or data reduction.
The major concept behind epidemiological data analysis is CDC: Count, Divide and Compare. This slide was shown earlier. It is a revision.
This exercise is to help the participants understand the denominator they need to work with. Sometimes, people have a hard time identifying the denominator they need to use before a comparison. So in this exercise, by comparing the incidence of diphtheria among people below and above the poverty line, we will help participants understand the denominator they need to work with. Ask the participants what they should COUNT, what denominator they would use to DIVIDE and what they would COMPARE (answers on next slide).
Answers to the questions on the previous slide. Try to understand the misconceptions among participants who did not identify the right denominator. That will help you understand your audience.
The CDC process will be repeated three times for the three basic types of epidemiological analyses: time, place and person.
For time, we use graphs. We can use either absolute numbers or rates, depending on whether a comparison is needed.
For this outbreak of a short duration, the population does not have time to change substantially. We can use the absolute numbers.
In contrast, here, for a five-year analysis, the population size increased, so we need to divide by the denominators to allow comparison (CDC).
Now we will discuss maps. These slides repeat the lectures on graphs, tables and maps. Make sure that people have the reflex of using maps to present information by geographical area. Too often, geographical data are shown in tables, which loses information.
This is an example of a map for analysis by geographical area. Ask the participants what steps were followed to prepare this map.
For the third type of analysis, by person, we usually report the data in tables.
That is the table to remember and to replicate when doing surveillance data analysis.
This graph presents more information about the person (i.e., immunization status).
In the context of IDSP, we routinely prepare seven different types of reports.
First is about completeness and timeliness.
Second is the weekly / monthly summary report.
Third is on trends.
Fourth is about crossing threshold values.
Fifth is about comparing reporting units.
Sixth compares the private and the public sector.
Seventh compares the results of the reporting in the public health care system and the laboratory.
Some people may be confused about the way we examine data collected using different case definitions from different reporting sources. Because the case definitions are different, we cannot simply add the cases together. But we can look synoptically at both types of reporting to see if trends emerge.
Computers can help as tools but do not replace thinking.
Epidemiologists may be inhibited by all kinds of technical considerations. But often they have not even started to look at the data. Once there is a willingness to look at the data, the technical hurdles can be addressed.
After analysis we have gone from data to information. Now, beyond that stage, the information needs to be INTERPRETED to decide on any relevant action. This interpretation for decision making should take place in the context of a technical committee.
The take home messages of the session.

Analysis and interpretation of surveillance data
Integrated Disease Surveillance Programme (IDSP)

Preliminary questions to the group
• Have you been involved in surveillance data analysis?
• What difficulties have you encountered in analyzing surveillance data?
• What would you like to learn about surveillance data analysis?

Outline of this session
1. The concept of data analysis
2. CDC for TPP
3. Reports
4. Interpretation of the information

What is data analysis?
• Data reduction
  Reduces the quantity of numbers to examine
  Because the human mind cannot handle too many bits of information at the same time
• Transforms raw data into information
  A list of cases becomes a monthly rate
Data > (Analysis) > Information > (Interpretation) > Action
Today we will focus on analysis
Why analyze?

Distribution of cases by sex
Data (line list of 30 records, REC and SEX):
1 M, 2 M, 3 M, 4 F, 5 M, 6 F, 7 F, 8 M, 9 M, 10 M, 11 F, 12 M, 13 M, 14 M, 15 F, 16 F, 17 F, 18 M, 19 M, 20 M, 21 F, 22 M, 23 M, 24 F, 25 M, 26 M, 27 M, 28 F, 29 M, 30 M
Analysis turns the data into information, shown as a table:
Sex      Frequency   Proportion
Female   10          33.3%
Male     20          66.7%
Total    30          100.0%
[Graph: distribution of cases by sex, Female and Male]
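The data reduction on this slide can be reproduced in a few lines. The sketch below is illustrative only: it takes the 30 sex codes from the line list and tabulates frequency and proportion. The function name `tabulate` is an assumption made for this example, not part of any IDSP tool.

```python
from collections import Counter

# Sex codes for records 1 to 30, copied from the line list on the slide
# (ten records per string).
SEX = list("MMMFMFFMMM" "FMMMFFFMMM" "FMMFMMMFMM")

def tabulate(values):
    """Return {category: (frequency, proportion in %)} for a list of codes."""
    counts = Counter(values)
    total = len(values)
    return {k: (n, round(100 * n / total, 1)) for k, n in counts.items()}

table = tabulate(SEX)
for category, (n, pct) in sorted(table.items()):
    print(f"{category}: {n} cases ({pct}%)")
```

Thirty raw records are reduced to two rows, which is the step from data to information that the slide describes.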

1. Count, Divide and Compare (CDC): An epidemiologist calculates rates and compares them
• Direct comparisons of absolute numbers of cases are not possible in the absence of rates
• CDC
  Count: Count (compile) cases that meet the case definition
  Divide: Divide cases by the corresponding population denominator
  Compare: Compare rates across age groups, districts etc.
CDC for TPP

Exercise
• How would you find out if diphtheria is more common among people who are below the poverty line?

Is diphtheria more common among poorer people?
• Count
  Count cases of diphtheria among families with and without a Below Poverty Line (BPL) card
• Divide
  Divide the cases of diphtheria among BPL people by the estimated BPL population size (e.g., census) to get the rate
  Divide the cases of diphtheria among non BPL people by the estimated non BPL population size (e.g., census) to get the rate
• Compare
  Compare the rates of diphtheria among BPL and non BPL people
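The three steps can be worked through with numbers. This is a minimal sketch with hypothetical case counts and population sizes (none of these figures come from the slides); the helper `rate_per_100k` and the choice of a rate per 100,000 are assumptions made for illustration.

```python
def rate_per_100k(cases, population):
    """DIVIDE: cases over the matching population denominator, per 100,000."""
    return 100_000 * cases / population

# COUNT: hypothetical diphtheria cases and census populations, illustration only.
bpl = {"cases": 30, "population": 200_000}
non_bpl = {"cases": 20, "population": 800_000}

bpl_rate = rate_per_100k(**bpl)
non_bpl_rate = rate_per_100k(**non_bpl)

# COMPARE: ratio of the two rates.
rate_ratio = bpl_rate / non_bpl_rate
print(f"BPL: {bpl_rate} per 100,000; non BPL: {non_bpl_rate} per 100,000")
print(f"Rate ratio: {rate_ratio}")
```

With these made-up figures the absolute counts are similar (30 versus 20), but the rates differ sixfold, which is why each group needs its own denominator.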

2. Time, place and person descriptive analysis
A. Time
   Incidence over time (Graph)
B. Place
   Map of incidence by area
C. Person
   Breakdown by age, sex or personal characteristics
   Table of incidence by age and sex

A. Present the results of the analysis over time using a GRAPH
• Absolute number of cases
  Avoid analysis over longer time periods, as the population size increases
• Incidence rates
  Allow analysis over longer time periods
  Analysis by week, month or year
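A short worked example shows why absolute numbers can mislead over a longer period. The yearly figures below are hypothetical and chosen only to illustrate the point; `incidence_per_10k` is an assumed helper name.

```python
def incidence_per_10k(cases, population):
    """Incidence per 10,000 population."""
    return 10_000 * cases / population

# Hypothetical yearly figures for one block, for illustration only.
years = {
    2000: {"cases": 100, "population": 50_000},
    2004: {"cases": 120, "population": 80_000},
}

rates = {year: incidence_per_10k(**d) for year, d in years.items()}
for year, rate in rates.items():
    print(f"{year}: {years[year]['cases']} cases, {rate} per 10,000")
```

In this made-up example the number of cases goes up while the incidence goes down, because the population grew faster than the case count.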

Absolute number of cases for analysis over a short time period
[Figure: Acute hepatitis (E) by week, Hyderabad, AP, India, March-June 2005. Y axis: Number of cases (0 to 120). X axis: First day of week of onset, March to June.]
Interpretation: The source of infection is persisting and continues to cause cases

Incidence rates for analysis over a longer time period
[Figure: Malaria in Kurseong block, Darjeeling District, West Bengal, India, 2000-2004. Series: Incidence of malaria; Incidence of Pf malaria. Y axis: Incidence of malaria per 10,000 (0 to 45). X axis: Months, January 2000 to December 2004.]
Interpretation: There is a seasonality in the end of the year and a trend towards increasing incidence year after year

B. Present the results of the analysis by place using a MAP
• Number of cases
  Spot map
  Does not control for population size
  Concentration of dots may represent high population density only
  May be misleading in areas with heterogeneous population density (e.g., urban areas)
• Incidence rates
  Incidence rate map
  Controls for population size

Incidence by area
[Figure: Map of incidence of acute hepatitis (E) by block, Hyderabad, AP, India, March-June 2005. Legend: Attack rate per 100,000 population (0, 1-19, 20-49, 50-99, 100+); open drain; pipeline crossing open sewage drain.]
Interpretation: Blocks with hepatitis are those supplied by pipelines crossing open sewage drains

C. Present the results of the analysis per person using an incidence TABLE
• Distribution of cases by:
  Age
  Sex
  Other characteristics (e.g., ethnic group, vaccination status)
• Incidence rate by:
  Age
  Sex
  Other characteristics

Incidence according to a characteristic
Probable cases of cholera by age and sex, Parbatia, Orissa, India, 2003
Characteristic          Number of cases   Population   Incidence
Age group (in years)
  0 to 4                6                 113          5.3%
  5 to 14               4                 190          2.1%
  15 to 24              5                 128          3.9%
  25 to 34              5                 144          3.5%
  35 to 44              6                 129          4.7%
  45 to 54              4                 88           4.5%
  55 to 64              8                 67           11.9%
  > 65                  3                 87           3.4%
Sex
  Male                  17                481          3.5%
  Female                24                465          5.2%
Total                   41                946          4.3%
Interpretation: Older adults and women are at increased risk of cholera
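The incidence column of this table is the "divide" step applied row by row. The sketch below recomputes it from the case counts and populations shown on the slide; the names `ROWS` and `incidence_pct` are assumptions made for this example.

```python
# (label, number of cases, population), copied from the cholera table on the slide.
ROWS = [
    ("0 to 4", 6, 113),
    ("5 to 14", 4, 190),
    ("15 to 24", 5, 128),
    ("25 to 34", 5, 144),
    ("35 to 44", 6, 129),
    ("45 to 54", 4, 88),
    ("55 to 64", 8, 67),
    ("> 65", 3, 87),
    ("Male", 17, 481),
    ("Female", 24, 465),
    ("Total", 41, 946),
]

def incidence_pct(cases, population):
    """Incidence as a percentage, rounded to one decimal place."""
    return round(100 * cases / population, 1)

incidence = {label: incidence_pct(c, p) for label, c, p in ROWS}
for label, value in incidence.items():
    print(f"{label:<10} {value}%")
```

Comparing the rows of the result is the "compare" step: the 55 to 64 age group and women stand out, in line with the interpretation on the slide.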

Distribution of cases according to a characteristic
[Figure: Immunization status of measles cases, Nai, Uttaranchal, India, 2004. Pie chart with two segments, Immunized and Unimmunized, labelled 19% and 81%.]
Interpretation: The outbreak is probably caused by a failure to vaccinate

Seven reports to be generated
1. Timeliness/completeness
2. Description by time, place and person
3. Trends over time
4. Threshold levels
5. Compare reporting units
6. Compare private / public
7. Compare providers with laboratory
Reports

Report 1: Completeness and timeliness
• A report is considered on time if it reaches the designated level within the prescribed time period
  Reflects alertness
• A report is said to be complete if all the reporting units within its catchment area submitted the reports on time
  Reflects reliability
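One way to turn these two definitions into numbers is sketched below. The reporting units, the delays and the three-day deadline are all hypothetical, and the function `timeliness_and_completeness` is an assumed name, not an IDSP routine.

```python
# Hypothetical delays (in days) before each unit's weekly report reached the
# district; None means the report was never received. Illustration only.
received_after_days = {"PHC A": 2, "PHC B": 5, "PHC C": None, "PHC D": 1}
DEADLINE_DAYS = 3  # assumed prescribed time period

def timeliness_and_completeness(delays, deadline):
    """Summarise how many expected reports arrived, and how many on time."""
    expected = len(delays)
    received = [d for d in delays.values() if d is not None]
    on_time = [d for d in received if d <= deadline]
    return {
        "received_pct": 100 * len(received) / expected,
        "on_time_pct": 100 * len(on_time) / expected,
        # Complete in the sense of the slide: every unit reported on time.
        "complete": len(on_time) == expected,
    }

summary = timeliness_and_completeness(received_after_days, DEADLINE_DAYS)
print(summary)
```

Here three of four reports arrived and two arrived within the deadline, so the compiled report would not count as complete.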

Report 2: Weekly/monthly summary report
• Based upon compiled data of all the reporting units
• Presented as tables, graphs and maps
• Takes into account the count, divide and compare principle:
  Absolute numbers of cases, deaths and case fatality ratio are sufficient for a single reporting unit level
  Incidence rates are required to compare reporting units

Report 3: Comparison with previous weeks/months/years
• Helps examine trends of diseases over time
• Weekly analysis compares the current week with data from the last three weeks
  Alerts authorities for immediate action
• Monthly and yearly analyses examine:
  Long term trends
  Cyclic patterns
  Seasonal patterns

Report 4: Crossing threshold values
• Comparison of rates with thresholds
• Thresholds that may be used:
  Pre-existing national/international thresholds
  Thresholds based on local historic data
    • Monthly average in the last three years (excluding epidemic periods)
  Increasing trends over a short duration of time (e.g., weeks)
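A threshold based on local historic data can be sketched as follows. The monthly rates, the month flagged as epidemic and the current value are hypothetical; `monthly_thresholds` is an assumed helper that averages each calendar month over the available years while skipping epidemic periods.

```python
from statistics import mean

def monthly_thresholds(history, epidemic_periods):
    """Average rate per calendar month, excluding (year, month) epidemic periods."""
    by_month = {}
    for (year, month), rate in history.items():
        if (year, month) in epidemic_periods:
            continue
        by_month.setdefault(month, []).append(rate)
    return {month: mean(rates) for month, rates in by_month.items()}

# Hypothetical July incidence per 10,000 for the last three years.
history = {(2002, 7): 10.0, (2003, 7): 40.0, (2004, 7): 14.0}
epidemic_periods = {(2003, 7)}  # assumed to have been an outbreak month

thresholds = monthly_thresholds(history, epidemic_periods)
current_rate = 18.0  # hypothetical rate for the current July
crossed = current_rate > thresholds[7]
print(f"Threshold: {thresholds[7]}, current: {current_rate}, crossed: {crossed}")
```

Leaving the epidemic month in the average would raise the threshold to about 21.3 and hide the signal, which is why the slide excludes epidemic periods.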

Report 5: Comparison between reporting units
• Compares
  Incidence rates
  Case fatality ratios
• Reference period
  Current month
• Sites concerned
  Block level and above

Report 6: Comparison between public and private sectors
• Compare trends in number of new cases/deaths
  Incidences are not available for private providers since no population denominators are available
• Good correlation may imply:
  The quality of information is good
  Events in the community are well represented
• Poor correlation may suggest:
  One of the data sources is less reliable
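The slide does not say how correlation should be measured. As one possible approach, the sketch below computes a Pearson correlation coefficient between weekly case counts from the two sectors; the counts are hypothetical and the choice of Pearson's coefficient is an assumption, not an IDSP prescription.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical weekly numbers of new cases, for illustration only.
public_cases = [10, 12, 20, 35, 30]
private_cases = [4, 5, 9, 15, 13]

r = pearson(public_cases, private_cases)
print(f"Correlation between public and private trends: {r:.2f}")
```

Only the trends are compared here, not incidences, since the private sector has no population denominator. A value close to 1 would be read cautiously as the two sources telling the same story.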

Report 7: Comparison of reports between the public health system and the laboratory
Elements to compare       Public health system                  Laboratories
Validation of reporting   Number of cases seen by providers     Number of laboratory diagnoses
Water borne disease       Cases of diarrheal diseases           Water quality
Vector borne disease      Cases of vector borne diseases        Entomological data

Making sense of different sources of information (“S” and “P” forms)
• It is not possible to mix data from different case definitions
  One cannot add cases coming from “S” and “P” forms (syndromic and presumptive diagnoses)
  It is not possible to add apples and oranges
• Use the different sources of information to cross validate (or “triangulate”)
  If there is an increase in the cases of dengue in the “P” forms, check if there is a surge in the number of fever cases in the “S” forms
Interpretation

What computers cannot do
Skills
• Contact reporting units
• Interpret laboratory tests
• Make judgment about:
  Epidemiologic linkage
  Duplicate records
  Data entry errors
• Declare a state of outbreak
Attitudes
• Looking for missing information
• Thinking
• Discussing
• Taking action

Expressed concerns versus reality
Concerns commonly expressed
• Statistics are difficult
• Multivariate analysis is complex
• Presentation of data is challenging
Mistake commonly observed
• Data are not looked at

Review of analysis results by the technical committee
• Meeting on a fixed day of every week
• Search for missing values
• Validity check
• Interpretation of the analysis bearing in mind
  The strengths and weaknesses of the data
  The disease profiles
  The need to calculate rates before comparisons
• Summary reports for dissemination
• Action

Take home messages
1. Link data collection and program implementation
   • Data > Information > Action
2. Count, divide and compare for time, place and person description
3. Share information through reports
4. Interpret with the technical committee to decide action on the basis of the information