SlideShare a Scribd company logo
1 of 45
Data Analysis:
Simple Statistical Tests
Goals
 Understand confidence intervals and p-
values
 Learn to use basic statistical tests
including chi square and ANOVA
Types of Variables
 Types of variables indicate which estimates you can
calculate and which statistical tests you should use
 Continuous variables:
 Always numeric
 Generally calculate measures such as the mean, median
and standard deviation
 Categorical variables:
 Information that can be sorted into categories
 Field investigation – often interested in dichotomous or
binary (2-level) categorical variables
 Cannot calculate mean or median but can calculate risk
Measures of Association
 Strength of the association between two variables,
such as an exposure and a disease
 Two measure of association used most often are the
relative risk, or risk ratio (RR), and the odds ratio
(OR)
 The decision to calculate an RR or an OR depends
on the study design
 Interpretation of RR and OR:
 RR or OR = 1: exposure has no association with disease
 RR or OR > 1: exposure may be positively associated with
disease
 RR or OR < 1: exposure may be negatively associated with
disease
Risk Ratio or Odds Ratio?
 Risk ratio
 Used when comparing outcomes of those who were
exposed to something to those who were not exposed
 Calculated in cohort studies
 Cannot be calculated in case-control studies because the
entire population at risk is not included in the study
 Odds ratio
 Used in case-control studies
 Odds of exposure among cases divided by odds of exposure
among controls
 Provides a rough estimate of the risk ratio
Analysis Tool: 2x2 Table
 Commonly used with dichotomous
variables to compare groups of people
 Table puts one dichotomous variable
across the rows and another
dichotomous variable along the
columns
 Useful in determining the association
between a dichotomous exposure and a
dichotomous outcome
Calculating an Odds Ratio
 Table displays data from a case control study conducted in
Pennsylvania in 2003 (2)
 Can calculate the odds ratio:
 *OR = ad = (218)(85) = 19.6
bc (45)(21) 
Outcome
Exposure
Hepatitis A
No Hepatitis
A
Total
Ate salsa 218 45 263
Did not eat
salsa
21 85 106
Total 239 130 369
Table 1. Sample 2x2 table for Hepatitis A at Restaurant A
Confidence Intervals
 Point estimate – a calculated estimate (like
risk or odds) or measure of association (risk
ratio or odds ratio)
 The confidence interval (CI) of a point
estimate describes the precision of the
estimate
 The CI represents a range of values on either side
of the estimate
 The narrower the CI, the more precise the point
estimate (3)
Confidence Intervals - Example
 Example—large bag of 500 red, green and
blue marbles:
 You want to know the percentage of green
marbles but don’t want to count every marble
 Shake up the bag and select 50 marbles to give
an estimate of the percentage of green marbles
 Sample of 50 marbles:

15 green marbles, 10 red marbles, 25 blue marbles
Confidence Intervals - Example
 Marble example continued:
 Based on sample we conclude 30% (15 out of 50)
marbles are green
 30% = point estimate
 How confident are we in this estimate?
 Actual percentage of green marbles could be
higher or lower, ie. sample of 50 may not reflect
distribution in entire bag of marbles
 Can calculate a confidence interval to
determine the degree of uncertainty
Calculating Confidence Intervals
 How do you calculate a confidence interval?
 Can do so by hand or use a statistical
program
 Epi Info, SAS, STATA, SPSS and Episheet are
common statistical programs
 Default is usually 95% confidence interval but
this can be adjusted to 90%, 99% or any
other level
Confidence Intervals
 Most commonly used confidence interval is the 95%
interval
 95% CI indicates that our estimated range has a 95%
chance of containing the true population value
 Assume that the 95% CI for our bag of marbles
example is 17-43%
 We estimated that 30% of the marbles are green:
 CI tells us that the true percentage of green marbles is most
likely between 17 and 43%
 There is a 5% chance that this range (17-43%) does not
contain the true percentage of green marbles
Confidence Intervals
 If we want less chance of error we could
calculate a 99% confidence interval
 A 99% CI will have only a 1% chance of error but
will have a wider range
 99% CI for green marbles is 13-47%
 If a higher chance of error is acceptable we
could calculate a 90% confidence interval
 90% CI for green marbles is 19-41%
Confidence Intervals
 Very narrow confidence intervals indicate a very
precise estimate
 Can get a more precise estimate by taking a larger
sample
 100 marble sample with 30 green marbles

Point estimate stays the same (30%)

95% confidence interval is 21-39% (rather than 17-43% for
original sample)
 200 marble sample with 60 green marbles

Point estimate is 30%

95% confidence interval is 24-36%
 CI becomes narrower as the sample size increases
Confidence Intervals
 Returning to example of Hepatitis A in a
Pennsylvania restaurant:
 Odds ratio = 19.6
 95% confidence interval of 11.0-34.9 (95% chance that the
range 11.0-34.9 contained the true OR)
 Lower bound of CI in this example is 11.0 (e.g., >1)

Odds ratio of 1 means there is no difference between the two
groups, OR > 1 indicates a greater risk among the exposed
 Conclusion: people who ate salsa were truly more likely to
become ill than those who did not eat salsa
Confidence Intervals
 Must include CIs with your point estimates to give a
sense of the precision of your estimates
 Examples:
 Outbreak of gastrointestinal illness at 2 primary schools in
Italy (4)

Children who ate corn/tuna salad had 6.19 times the risk of
becoming ill as children who did not eat salad

95% confidence interval: 4.81 – 7.98
 Pertussis outbreak in Oregon (5)

Case-patients had 6.4 times the odds of living with a 6-10 year-
old child than controls

95% confidence interval: 1.8 – 23.4
 Conclusion: true association between exposure and disease
in both examples
Analysis of Categorical Data
 Measure of association (risk ratio or
odds ratio)
 Confidence interval
 Chi-square test
 A formal statistical test to determine
whether results are statistically significant
Chi-Square Statistics
 A common analysis is whether Disease X
occurs as much among people in Group A as
it does among people in Group B
 People are often sorted into groups based on their
exposure to some disease risk factor
 We then perform a test of the association between
exposure and disease in the two groups
Chi-Square Test: Example
 Hypothetical outbreak of Salmonella on
a cruise ship
 Retrospective cohort study conducted
 All 300 people on cruise ship interviewed,
60 had symptoms consistent with
Salmonella
 Questionnaires indicate many of the case-
patients ate tomatoes from the salad bar
Chi-Square Test: Example (cont.)
 To see if there is a statistical difference in the amount
of illness between those who ate tomatoes (41/130)
and those who did not (19/170) we could conduct a
chi-square test
Salmonella?
Yes No Total
Tomatoes 41 89 130
No Tomatoes 19 151 170
Total 60 240 300
Table 2a. Cohort study: Exposure to tomatoes and Salmonella infection
Chi-Square Test: Example (cont.)
 To conduct a chi-square the following
conditions must be met:
 There must be at least a total of 30 observations
(people) in the table
 Each cell must contain a count of 5 or more
 To conduct a chi-square test we compare the
observed data (from study results) with the
data we would expect to see
Chi-Square Test: Example (cont.)
Salmonella?
Yes No Total
Tomatoes 130
No Tomatoes 170
Total 60 240 300
 Gives an overall distribution of people who ate tomatoes and
became sick
 Based on these distributions we can fill in the empty cells
with the expected values
Table 2b. Row and column totals for tomatoes and Salmonella infection
Chi-Square Test: Example (cont.)
 Expected Value = Row Total x Column Total
Grand Total
 For the first cell, people who ate tomatoes and
became ill:
 Expected value = 130 x 60 = 26
300
 Same formula can be used to calculate the expected
values for each of the cells
Chi-Square Test: Example (cont.)
Salmonella?
Yes No Total
Tomatoes
130 x 60 = 26
300
130 x 240 = 104
300
130
No Tomatoes
170 x 60 = 34
300  
170 x 240 = 136
300  
170
Total 60 240 300
 To calculate the chi-square statistic you use the observed values
from Table 2a and the expected values from Table 2c
 Formula is [(Observed – Expected)2
/Expected] for each cell of
the table
Table 2c. Expected values for exposure to tomatoes
Chi-Square Test: Example (cont.)
Salmonella?
Yes No Total
Tomatoes
(41-26)2
= 8.7
26
 
(89-104)2
= 2.2
104
 
130
No Tomatoes
(19-34)2
= 6.6
34
 
(151-136)2
= 1.7
136
 
170
Total 60 240 300
 The chi-square (χ2) for this example is 19.2
 8.7 + 2.2 + 6.6 + 1.7 = 19.2
Table 2d. Expected values for exposure to tomatoes
Chi-Square Test
 What does the chi-square tell you?
 In general, the higher the chi-square value,
the greater the likelihood there is a
statistically significant difference between the
two groups you are comparing
 To know for sure, you need to look up the p-
value in a chi-square table
 We will discuss p-values after a discussion of
different types of chi-square tests
Types of Chi-Square Tests
 Many computer programs give different
types of chi-square tests
 Each test is best suited to certain
situations
 Most commonly calculated chi-square
test is Pearson’s chi-square
 Use Pearson’s chi-square for a fairly large
sample (>100)
Types of Statistical Tests
Parade of
Statistics Guys
The right test...
 
To use when….
 
Pearson chi-square (uncorrected) Sample size >100
Expected cell counts > 10
Yates chi-square (corrected) Sample size >30
Expected cell counts ≥ 5
Mantel-Haenszel chi-square Sample size > 30
Variables are ordinal
Fisher’s exact test Sample size < 30 and/or
Expected cell counts < 5
Using Statistical Tests:
Examples from Actual Studies
 In each study, investigators chose the type of test
that best applied to the situation (Note: while the chi-
square value is used to determine the corresponding
p-value, often only the p-value is reported.)
 Pearson (Uncorrected) Chi-Square : A North Carolina study
investigated 955 individuals because they were identified as
partners of someone who tested positive for HIV. The study
found that the proportion of partners who got tested for HIV
differed significantly by race/ethnicity (p-value <0.001). The
study also found that HIV-positive rates did not differ by
race/ethnicity among the 610 who were tested (p = 0.4). (6)
Using Statistical Tests:
Examples from Actual Studies
 Additional examples:
 Yates (Corrected) Chi-Square: In an outbreak of Salmonella
gastroenteritis associated with eating at a restaurant, 14 of
15 ill patrons studied had eaten the Caesar salad, while 0 of
11 well patrons had eaten the salad (p-value <0.01). The
dressing on the salad was made from raw eggs that were
probably contaminated with Salmonella. (7)
 Fisher’s Exact Test: A study of Group A Streptococcus
(GAS) among children attending daycare found that 7 of 11
children who spent 30 or more hours per week in daycare
had laboratory-confirmed GAS, while 0 of 4 children
spending less than 30 hours per week in daycare had GAS
(p-value <0.01). (8)
P-Values
 Using our hypothetical cruise ship Salmonella
outbreak:
 32% of people who ate tomatoes got Salmonella
as compared with 11% of people who did not eat
tomatoes
 How do we know whether the difference
between 32% and 11% is a “real” difference?
 In other words, how do we know that our chi-
square value (calculated as 19.2) indicates a
statistically significant difference?
 The p-value is our indicator
P-Values
 Many statistical tests give both a
numeric result (e.g. a chi-square value)
and a p-value
 The p-value ranges between 0 and 1
 What does the p-value tell you?
 The p-value is the probability of getting the
result you got, assuming that the two
groups you are comparing are actually the
same
P-Values
 Start by assuming there is no difference in outcomes
between the groups
 Look at the test statistic and p-value to see if they
indicate otherwise
 A low p-value means that (assuming the groups are the
same) the probability of observing these results by chance is
very small

Difference between the two groups is statistically significant
 A high p-value means that the two groups were not that
different
 A p-value of 1 means that there was no difference between
the two groups
P-Values
 Generally, if the p-value is less than
0.05, the difference observed is
considered statistically significant, ie.
the difference did not happen by
chance
 You may use a number of statistical
tests to obtain the p-value
 Test used depends on type of data you
have
Chi-Squares and P-Values
 If the chi-square statistic is small, the observed and
expected data were not very different and the p-value
will be large
 If the chi-square statistic is large, this generally
means the p-value is small, and the difference could
be statistically significant
 Example: Outbreak of E. coli O157:H7 associated
with swimming in a lake (1)
 Case-patients much more likely than controls to have taken
lake water in their mouth (p-value =0.002) and swallowed
lake water (p-value =0.002)
 Because p-values were each less than 0.05, both exposures
were considered statistically significant risk factors
Note: Assumptions
 Statistical tests such as the chi-square assume that
the observations are independent
 Independence: value of one observation does not influence
value of another
 If this assumption is not true, you may not use the
chi-square test
 Do not use chi-square tests with:
 Repeat observations of the same group of people (e.g. pre-
and post-tests)
 Matched pair designs in which cases and controls are
matched on variables such as sex and age
Analysis of Continuous Data
 Data do not always fit into discrete categories
 Continuous numeric data may be of interest
in a field investigation such as:
 Clinical symptoms between groups of patients
 Average age of patients compared to average age
of non-patients
 Respiratory rate of those exposed to a chemical
vs. respiratory rate of those who were not exposed
ANOVA
 May compare continuous data through
the Analysis Of Variance (ANOVA) test
 Most statistical software programs will
calculate ANOVA
 Output varies slightly in different programs
 For example, using Epi Info software:

Generates 3 pieces of information: ANOVA
results, Bartlett’s test and Kruskal-Wallis test
ANOVA
 When comparing continuous variables between
groups of study subjects:
 Use a t-test for comparing 2 groups
 Use an f-test for comparing 3 or more groups
 Both tests result in a p-value
 ANOVA uses either the t-test or the f-test
 Example: testing age differences between 2 groups
 If groups have similar average ages and a similar
distribution of age values, t-statistic will be small and the p-
value will not be significant
 If average ages of 2 groups are different, t-statistic will be
larger and p-value will be smaller (p-value <0.05 indicates
two groups have significantly different ages)
ANOVA and Bartlett’s Test
 Critical assumption with t-tests and f-tests: groups
have similar variances (e.g., “spread” of age values)
 As part of the ANOVA analysis, software conducts a
separate test to compare variances: Bartlett’s test for
equality of variance
 Bartlett’s test:
 Produces a p-value
 If Bartlett’s p-value >0.05, (not significant) OK to use ANOVA
results
 Bartlett’s p-value <0.05, variances in the groups are NOT
the same and you cannot use the ANOVA results
Kruskal-Wallis Test
 Kruskal-Wallis test: generated by Epi Info
software
 Used only if Bartlett’s test reveals variances
dissimilar enough so that you can’t use ANOVA
 Does not make assumptions about variance,
examines the distribution of values within each
group
 Generates a p-value

If p-value >0.05 there is not a significant difference
between groups

If p-value < 0.05 there is a significant difference between
groups
Analysis of Continuous Data
Figure 1. Decision tree for analysis of continuous data.
Bartlett’s test for equality of variance
p-value >0.05?
YES NO
Use ANOVA test Use Kruskal-Wallis
test test
p<0.05 p>0.05 p<0.05 p>0.05
Difference between groups is
statistically significant
Difference between groups is
statistically significant
Difference between groups is
NOT statistically significant
Difference between groups is
NOT statistically significant
Conclusion
 In field epidemiology a few calculations and
tests make up the core of analytic methods
 Learning these methods will provide a good
set of field epidemiology skills.
 Confidence intervals, p-values, chi-square tests,
ANOVA and their interpretations
 Further data analysis may require methods to
control for confounding including matching
and logistic regression
References
1. Bruce MG, Curtis MB, Payne MM, et al. Lake-associated
outbreak of Escherichia coli O157:H7 in Clark County,
Washington, August 1999. Arch Pediatr Adolesc Med.
2003;157:1016-1021.
2. Wheeler C, Vogt TM, Armstrong GL, et al. An outbreak of
hepatitis A associated with green onions. N Engl J Med.
2005;353:890-897.
3. Gregg MB. Field Epidemiology. 2nd ed. New York, NY: Oxford
University Press; 2002.
4. Aureli P, Fiorucci GC, Caroli D, et al. An outbreak of febrile
gastroenteritis associated with corn contaminated by Listeria
monocytogenes. N Engl J Med. 2000;342:1236-1241.
References
5. Schafer S, Gillette H, Hedberg K, Cieslak P. A community-wide
pertussis outbreak: an argument for universal booster vaccination.
Arch Intern Med. 2006;166:1317-1321.
6. Centers for Disease Control and Prevention. Partner counseling and
referral services to identify persons with undiagnosed HIV --- North
Carolina, 2001. MMWR Morb Mort Wkly Rep.2003;52:1181-1184.
7. Centers for Disease Control and Prevention. Outbreak of Salmonella
Enteritidis infection associated with consumption of raw shell eggs,
1991. MMWR Morb Mort Wkly Rep. 1992;41:369-372.
8. Centers for Disease Control and Prevention. Outbreak of invasive
group A streptococcus associated with varicella in a childcare center --
Boston, Massachusetts, 1997. MMWR Morb Mort Wkly Rep.
1997;46:944-948.

More Related Content

Similar to 3 6 datatests-slides

Measures of association 2013
Measures of association 2013Measures of association 2013
Measures of association 2013dinahoefer11
 
Analytic StudiesThere are basically two types of studies experi.docx
Analytic StudiesThere are basically two types of studies experi.docxAnalytic StudiesThere are basically two types of studies experi.docx
Analytic StudiesThere are basically two types of studies experi.docxrossskuddershamus
 
COM 301 INFERENTIAL STATISTICS SLIDES.ppt
COM 301   INFERENTIAL STATISTICS SLIDES.pptCOM 301   INFERENTIAL STATISTICS SLIDES.ppt
COM 301 INFERENTIAL STATISTICS SLIDES.pptdanielayo912
 
session three epidemiology.pptx
session three epidemiology.pptxsession three epidemiology.pptx
session three epidemiology.pptxAxmedAbdiHasen
 
session three epidemiology.pptx
session three epidemiology.pptxsession three epidemiology.pptx
session three epidemiology.pptxAxmedAbdiHasen
 
.Analytical epidemiology.pptx
.Analytical epidemiology.pptx.Analytical epidemiology.pptx
.Analytical epidemiology.pptxsteffyjohn7
 
Descriptive epidemiology
Descriptive epidemiologyDescriptive epidemiology
Descriptive epidemiologyDalia El-Shafei
 
RMH Concise Revision Guide - the Basics of EBM
RMH Concise Revision Guide -  the Basics of EBMRMH Concise Revision Guide -  the Basics of EBM
RMH Concise Revision Guide - the Basics of EBMAyselTuracli
 
05 confidence interval & probability statements
05 confidence interval & probability statements05 confidence interval & probability statements
05 confidence interval & probability statementsDrZahid Khan
 
Assessing the performance of diagnostic tests
Assessing the performance of diagnostic testsAssessing the performance of diagnostic tests
Assessing the performance of diagnostic testsILRI
 
Soni_Biostatistics.ppt
Soni_Biostatistics.pptSoni_Biostatistics.ppt
Soni_Biostatistics.pptOgunsina1
 
Case control study
Case control studyCase control study
Case control studyswati shikha
 
Statistical analysis in pharmacokinetics.pptx
Statistical analysis in pharmacokinetics.pptxStatistical analysis in pharmacokinetics.pptx
Statistical analysis in pharmacokinetics.pptxMdHimelAhmedRidoy1
 
10-Interpretation& Causality by Mehdi Ehtesham
10-Interpretation& Causality  by Mehdi Ehtesham10-Interpretation& Causality  by Mehdi Ehtesham
10-Interpretation& Causality by Mehdi EhteshamResearchGuru
 

Similar to 3 6 datatests-slides (20)

Measures of association 2013
Measures of association 2013Measures of association 2013
Measures of association 2013
 
Analytic StudiesThere are basically two types of studies experi.docx
Analytic StudiesThere are basically two types of studies experi.docxAnalytic StudiesThere are basically two types of studies experi.docx
Analytic StudiesThere are basically two types of studies experi.docx
 
Biostatistics
BiostatisticsBiostatistics
Biostatistics
 
Statistics
StatisticsStatistics
Statistics
 
COM 301 INFERENTIAL STATISTICS SLIDES.ppt
COM 301   INFERENTIAL STATISTICS SLIDES.pptCOM 301   INFERENTIAL STATISTICS SLIDES.ppt
COM 301 INFERENTIAL STATISTICS SLIDES.ppt
 
session three epidemiology.pptx
session three epidemiology.pptxsession three epidemiology.pptx
session three epidemiology.pptx
 
session three epidemiology.pptx
session three epidemiology.pptxsession three epidemiology.pptx
session three epidemiology.pptx
 
Biostatistics.pptx
Biostatistics.pptxBiostatistics.pptx
Biostatistics.pptx
 
.Analytical epidemiology.pptx
.Analytical epidemiology.pptx.Analytical epidemiology.pptx
.Analytical epidemiology.pptx
 
Descriptive epidemiology
Descriptive epidemiologyDescriptive epidemiology
Descriptive epidemiology
 
Chi square test
Chi square testChi square test
Chi square test
 
RMH Concise Revision Guide - the Basics of EBM
RMH Concise Revision Guide -  the Basics of EBMRMH Concise Revision Guide -  the Basics of EBM
RMH Concise Revision Guide - the Basics of EBM
 
05 confidence interval & probability statements
05 confidence interval & probability statements05 confidence interval & probability statements
05 confidence interval & probability statements
 
Stats.pptx
Stats.pptxStats.pptx
Stats.pptx
 
Assessing the performance of diagnostic tests
Assessing the performance of diagnostic testsAssessing the performance of diagnostic tests
Assessing the performance of diagnostic tests
 
Soni_Biostatistics.ppt
Soni_Biostatistics.pptSoni_Biostatistics.ppt
Soni_Biostatistics.ppt
 
Statistical Journal club
Statistical Journal club Statistical Journal club
Statistical Journal club
 
Case control study
Case control studyCase control study
Case control study
 
Statistical analysis in pharmacokinetics.pptx
Statistical analysis in pharmacokinetics.pptxStatistical analysis in pharmacokinetics.pptx
Statistical analysis in pharmacokinetics.pptx
 
10-Interpretation& Causality by Mehdi Ehtesham
10-Interpretation& Causality  by Mehdi Ehtesham10-Interpretation& Causality  by Mehdi Ehtesham
10-Interpretation& Causality by Mehdi Ehtesham
 

3 6 datatests-slides

  • 2. Goals  Understand confidence intervals and p- values  Learn to use basic statistical tests including chi square and ANOVA
  • 3. Types of Variables  Types of variables indicate which estimates you can calculate and which statistical tests you should use  Continuous variables:  Always numeric  Generally calculate measures such as the mean, median and standard deviation  Categorical variables:  Information that can be sorted into categories  Field investigation – often interested in dichotomous or binary (2-level) categorical variables  Cannot calculate mean or median but can calculate risk
  • 4. Measures of Association  Strength of the association between two variables, such as an exposure and a disease  Two measure of association used most often are the relative risk, or risk ratio (RR), and the odds ratio (OR)  The decision to calculate an RR or an OR depends on the study design  Interpretation of RR and OR:  RR or OR = 1: exposure has no association with disease  RR or OR > 1: exposure may be positively associated with disease  RR or OR < 1: exposure may be negatively associated with disease
  • 5. Risk Ratio or Odds Ratio?  Risk ratio  Used when comparing outcomes of those who were exposed to something to those who were not exposed  Calculated in cohort studies  Cannot be calculated in case-control studies because the entire population at risk is not included in the study  Odds ratio  Used in case-control studies  Odds of exposure among cases divided by odds of exposure among controls  Provides a rough estimate of the risk ratio
  • 6. Analysis Tool: 2x2 Table  Commonly used with dichotomous variables to compare groups of people  Table puts one dichotomous variable across the rows and another dichotomous variable along the columns  Useful in determining the association between a dichotomous exposure and a dichotomous outcome
  • 7. Calculating an Odds Ratio  Table displays data from a case control study conducted in Pennsylvania in 2003 (2)  Can calculate the odds ratio:  *OR = ad = (218)(85) = 19.6 bc (45)(21)  Outcome Exposure Hepatitis A No Hepatitis A Total Ate salsa 218 45 263 Did not eat salsa 21 85 106 Total 239 130 369 Table 1. Sample 2x2 table for Hepatitis A at Restaurant A
  • 8. Confidence Intervals  Point estimate – a calculated estimate (like risk or odds) or measure of association (risk ratio or odds ratio)  The confidence interval (CI) of a point estimate describes the precision of the estimate  The CI represents a range of values on either side of the estimate  The narrower the CI, the more precise the point estimate (3)
  • 9. Confidence Intervals - Example  Example—large bag of 500 red, green and blue marbles:  You want to know the percentage of green marbles but don’t want to count every marble  Shake up the bag and select 50 marbles to give an estimate of the percentage of green marbles  Sample of 50 marbles:  15 green marbles, 10 red marbles, 25 blue marbles
  • 10. Confidence Intervals - Example  Marble example continued:  Based on sample we conclude 30% (15 out of 50) marbles are green  30% = point estimate  How confident are we in this estimate?  Actual percentage of green marbles could be higher or lower, ie. sample of 50 may not reflect distribution in entire bag of marbles  Can calculate a confidence interval to determine the degree of uncertainty
  • 11. Calculating Confidence Intervals  How do you calculate a confidence interval?  Can do so by hand or use a statistical program  Epi Info, SAS, STATA, SPSS and Episheet are common statistical programs  Default is usually 95% confidence interval but this can be adjusted to 90%, 99% or any other level
  • 12. Confidence Intervals  Most commonly used confidence interval is the 95% interval  95% CI indicates that our estimated range has a 95% chance of containing the true population value  Assume that the 95% CI for our bag of marbles example is 17-43%  We estimated that 30% of the marbles are green:  CI tells us that the true percentage of green marbles is most likely between 17 and 43%  There is a 5% chance that this range (17-43%) does not contain the true percentage of green marbles
  • 13. Confidence Intervals  If we want less chance of error we could calculate a 99% confidence interval  A 99% CI will have only a 1% chance of error but will have a wider range  99% CI for green marbles is 13-47%  If a higher chance of error is acceptable we could calculate a 90% confidence interval  90% CI for green marbles is 19-41%
  • 14. Confidence Intervals  Very narrow confidence intervals indicate a very precise estimate  Can get a more precise estimate by taking a larger sample  100 marble sample with 30 green marbles  Point estimate stays the same (30%)  95% confidence interval is 21-39% (rather than 17-43% for original sample)  200 marble sample with 60 green marbles  Point estimate is 30%  95% confidence interval is 24-36%  CI becomes narrower as the sample size increases
  • 15. Confidence Intervals  Returning to example of Hepatitis A in a Pennsylvania restaurant:  Odds ratio = 19.6  95% confidence interval of 11.0-34.9 (95% chance that the range 11.0-34.9 contained the true OR)  Lower bound of CI in this example is 11.0 (e.g., >1)  Odds ratio of 1 means there is no difference between the two groups, OR > 1 indicates a greater risk among the exposed  Conclusion: people who ate salsa were truly more likely to become ill than those who did not eat salsa
  • 16. Confidence Intervals  Must include CIs with your point estimates to give a sense of the precision of your estimates  Examples:  Outbreak of gastrointestinal illness at 2 primary schools in Italy (4)  Children who ate corn/tuna salad had 6.19 times the risk of becoming ill as children who did not eat salad  95% confidence interval: 4.81 – 7.98  Pertussis outbreak in Oregon (5)  Case-patients had 6.4 times the odds of living with a 6-10 year- old child than controls  95% confidence interval: 1.8 – 23.4  Conclusion: true association between exposure and disease in both examples
  • 17. Analysis of Categorical Data  Measure of association (risk ratio or odds ratio)  Confidence interval  Chi-square test  A formal statistical test to determine whether results are statistically significant
  • 18. Chi-Square Statistics  A common analysis is whether Disease X occurs as much among people in Group A as it does among people in Group B  People are often sorted into groups based on their exposure to some disease risk factor  We then perform a test of the association between exposure and disease in the two groups
  • 19. Chi-Square Test: Example  Hypothetical outbreak of Salmonella on a cruise ship  Retrospective cohort study conducted  All 300 people on cruise ship interviewed, 60 had symptoms consistent with Salmonella  Questionnaires indicate many of the case- patients ate tomatoes from the salad bar
  • 20. Chi-Square Test: Example (cont.)  To see if there is a statistical difference in the amount of illness between those who ate tomatoes (41/130) and those who did not (19/170) we could conduct a chi-square test Salmonella? Yes No Total Tomatoes 41 89 130 No Tomatoes 19 151 170 Total 60 240 300 Table 2a. Cohort study: Exposure to tomatoes and Salmonella infection
  • 21. Chi-Square Test: Example (cont.)  To conduct a chi-square the following conditions must be met:  There must be at least a total of 30 observations (people) in the table  Each cell must contain a count of 5 or more  To conduct a chi-square test we compare the observed data (from study results) with the data we would expect to see
  • 22. Chi-Square Test: Example (cont.) Salmonella? Yes No Total Tomatoes 130 No Tomatoes 170 Total 60 240 300  Gives an overall distribution of people who ate tomatoes and became sick  Based on these distributions we can fill in the empty cells with the expected values Table 2b. Row and column totals for tomatoes and Salmonella infection
  • 23. Chi-Square Test: Example (cont.)  Expected Value = Row Total x Column Total Grand Total  For the first cell, people who ate tomatoes and became ill:  Expected value = 130 x 60 = 26 300  Same formula can be used to calculate the expected values for each of the cells
  • 24. Chi-Square Test: Example (cont.) Salmonella? Yes No Total Tomatoes 130 x 60 = 26 300 130 x 240 = 104 300 130 No Tomatoes 170 x 60 = 34 300   170 x 240 = 136 300   170 Total 60 240 300  To calculate the chi-square statistic you use the observed values from Table 2a and the expected values from Table 2c  Formula is [(Observed – Expected)2 /Expected] for each cell of the table Table 2c. Expected values for exposure to tomatoes
  • 25. Chi-Square Test: Example (cont.) Salmonella? Yes No Total Tomatoes (41-26)2 = 8.7 26   (89-104)2 = 2.2 104   130 No Tomatoes (19-34)2 = 6.6 34   (151-136)2 = 1.7 136   170 Total 60 240 300  The chi-square (χ2) for this example is 19.2  8.7 + 2.2 + 6.6 + 1.7 = 19.2 Table 2d. Expected values for exposure to tomatoes
  • 26. Chi-Square Test  What does the chi-square tell you?  In general, the higher the chi-square value, the greater the likelihood there is a statistically significant difference between the two groups you are comparing  To know for sure, you need to look up the p- value in a chi-square table  We will discuss p-values after a discussion of different types of chi-square tests
  • 27. Types of Chi-Square Tests  Many computer programs give different types of chi-square tests  Each test is best suited to certain situations  Most commonly calculated chi-square test is Pearson’s chi-square  Use Pearson’s chi-square for a fairly large sample (>100)
  • 28. Types of Statistical Tests Parade of Statistics Guys The right test...   To use when….   Pearson chi-square (uncorrected) Sample size >100 Expected cell counts > 10 Yates chi-square (corrected) Sample size >30 Expected cell counts ≥ 5 Mantel-Haenszel chi-square Sample size > 30 Variables are ordinal Fisher’s exact test Sample size < 30 and/or Expected cell counts < 5
  • 29. Using Statistical Tests: Examples from Actual Studies  In each study, investigators chose the type of test that best applied to the situation (Note: while the chi- square value is used to determine the corresponding p-value, often only the p-value is reported.)  Pearson (Uncorrected) Chi-Square : A North Carolina study investigated 955 individuals because they were identified as partners of someone who tested positive for HIV. The study found that the proportion of partners who got tested for HIV differed significantly by race/ethnicity (p-value <0.001). The study also found that HIV-positive rates did not differ by race/ethnicity among the 610 who were tested (p = 0.4). (6)
  • 30. Using Statistical Tests: Examples from Actual Studies  Additional examples:  Yates (Corrected) Chi-Square: In an outbreak of Salmonella gastroenteritis associated with eating at a restaurant, 14 of 15 ill patrons studied had eaten the Caesar salad, while 0 of 11 well patrons had eaten the salad (p-value <0.01). The dressing on the salad was made from raw eggs that were probably contaminated with Salmonella. (7)  Fisher’s Exact Test: A study of Group A Streptococcus (GAS) among children attending daycare found that 7 of 11 children who spent 30 or more hours per week in daycare had laboratory-confirmed GAS, while 0 of 4 children spending less than 30 hours per week in daycare had GAS (p-value <0.01). (8)
  • 31. P-Values  Using our hypothetical cruise ship Salmonella outbreak:  32% of people who ate tomatoes got Salmonella as compared with 11% of people who did not eat tomatoes  How do we know whether the difference between 32% and 11% is a “real” difference?  In other words, how do we know that our chi- square value (calculated as 19.2) indicates a statistically significant difference?  The p-value is our indicator
  • 32. P-Values  Many statistical tests give both a numeric result (e.g. a chi-square value) and a p-value  The p-value ranges between 0 and 1  What does the p-value tell you?  The p-value is the probability of getting the result you got, assuming that the two groups you are comparing are actually the same
  • 33. P-Values  Start by assuming there is no difference in outcomes between the groups  Look at the test statistic and p-value to see if they indicate otherwise  A low p-value means that (assuming the groups are the same) the probability of observing these results by chance is very small  Difference between the two groups is statistically significant  A high p-value means that the two groups were not that different  A p-value of 1 means that there was no difference between the two groups
  • 34. P-Values  Generally, if the p-value is less than 0.05, the difference observed is considered statistically significant, ie. the difference did not happen by chance  You may use a number of statistical tests to obtain the p-value  Test used depends on type of data you have
  • 35. Chi-Squares and P-Values  If the chi-square statistic is small, the observed and expected data were not very different and the p-value will be large  If the chi-square statistic is large, this generally means the p-value is small, and the difference could be statistically significant  Example: Outbreak of E. coli O157:H7 associated with swimming in a lake (1)  Case-patients much more likely than controls to have taken lake water in their mouth (p-value =0.002) and swallowed lake water (p-value =0.002)  Because p-values were each less than 0.05, both exposures were considered statistically significant risk factors
  • 36. Note: Assumptions  Statistical tests such as the chi-square assume that the observations are independent  Independence: value of one observation does not influence value of another  If this assumption is not true, you may not use the chi-square test  Do not use chi-square tests with:  Repeat observations of the same group of people (e.g. pre- and post-tests)  Matched pair designs in which cases and controls are matched on variables such as sex and age
  • 37. Analysis of Continuous Data  Data do not always fit into discrete categories  Continuous numeric data may be of interest in a field investigation such as:  Clinical symptoms between groups of patients  Average age of patients compared to average age of non-patients  Respiratory rate of those exposed to a chemical vs. respiratory rate of those who were not exposed
  • 38. ANOVA  May compare continuous data through the Analysis Of Variance (ANOVA) test  Most statistical software programs will calculate ANOVA  Output varies slightly in different programs  For example, using Epi Info software:  Generates 3 pieces of information: ANOVA results, Bartlett’s test and Kruskal-Wallis test
  • 39. ANOVA  When comparing continuous variables between groups of study subjects:  Use a t-test for comparing 2 groups  Use an f-test for comparing 3 or more groups  Both tests result in a p-value  ANOVA uses either the t-test or the f-test  Example: testing age differences between 2 groups  If groups have similar average ages and a similar distribution of age values, t-statistic will be small and the p- value will not be significant  If average ages of 2 groups are different, t-statistic will be larger and p-value will be smaller (p-value <0.05 indicates two groups have significantly different ages)
  • 40. ANOVA and Bartlett’s Test  Critical assumption with t-tests and f-tests: groups have similar variances (e.g., “spread” of age values)  As part of the ANOVA analysis, software conducts a separate test to compare variances: Bartlett’s test for equality of variance  Bartlett’s test:  Produces a p-value  If Bartlett’s p-value >0.05, (not significant) OK to use ANOVA results  Bartlett’s p-value <0.05, variances in the groups are NOT the same and you cannot use the ANOVA results
  • 41. Kruskal-Wallis Test  Kruskal-Wallis test: generated by Epi Info software  Used only if Bartlett’s test reveals variances dissimilar enough so that you can’t use ANOVA  Does not make assumptions about variance, examines the distribution of values within each group  Generates a p-value  If p-value >0.05 there is not a significant difference between groups  If p-value < 0.05 there is a significant difference between groups
  • 42. Analysis of Continuous Data Figure 1. Decision tree for analysis of continuous data. Bartlett’s test for equality of variance p-value >0.05? YES NO Use ANOVA test Use Kruskal-Wallis test test p<0.05 p>0.05 p<0.05 p>0.05 Difference between groups is statistically significant Difference between groups is statistically significant Difference between groups is NOT statistically significant Difference between groups is NOT statistically significant
  • 43. Conclusion  In field epidemiology a few calculations and tests make up the core of analytic methods  Learning these methods will provide a good set of field epidemiology skills.  Confidence intervals, p-values, chi-square tests, ANOVA and their interpretations  Further data analysis may require methods to control for confounding including matching and logistic regression
  • 44. References 1. Bruce MG, Curtis MB, Payne MM, et al. Lake-associated outbreak of Escherichia coli O157:H7 in Clark County, Washington, August 1999. Arch Pediatr Adolesc Med. 2003;157:1016-1021. 2. Wheeler C, Vogt TM, Armstrong GL, et al. An outbreak of hepatitis A associated with green onions. N Engl J Med. 2005;353:890-897. 3. Gregg MB. Field Epidemiology. 2nd ed. New York, NY: Oxford University Press; 2002. 4. Aureli P, Fiorucci GC, Caroli D, et al. An outbreak of febrile gastroenteritis associated with corn contaminated by Listeria monocytogenes. N Engl J Med. 2000;342:1236-1241.
  • 45. References 5. Schafer S, Gillette H, Hedberg K, Cieslak P. A community-wide pertussis outbreak: an argument for universal booster vaccination. Arch Intern Med. 2006;166:1317-1321. 6. Centers for Disease Control and Prevention. Partner counseling and referral services to identify persons with undiagnosed HIV --- North Carolina, 2001. MMWR Morb Mort Wkly Rep.2003;52:1181-1184. 7. Centers for Disease Control and Prevention. Outbreak of Salmonella Enteritidis infection associated with consumption of raw shell eggs, 1991. MMWR Morb Mort Wkly Rep. 1992;41:369-372. 8. Centers for Disease Control and Prevention. Outbreak of invasive group A streptococcus associated with varicella in a childcare center -- Boston, Massachusetts, 1997. MMWR Morb Mort Wkly Rep. 1997;46:944-948.

Editor's Notes

  1. It’s the middle of summer, prime time for swimming, and your local hospital reports several children with Escherichia coli O157:H7 infection. A preliminary investigation shows that many of these children recently swam in a local lake. The lake has been closed so the health department can conduct an investigation. Now you need to find out whether swimming in the lake is significantly associated with E. coli infection, so you’ll know whether the lake is the real culprit in the outbreak. This situation occurred in Washington in 1999, and it’s one example of the importance of good data analysis, including statistical testing, in an outbreak investigation. (1) In this issue of FOCUS we will discuss confidence intervals and p-values, and introduce some basic statistical tests, including chi square and ANOVA.   
  2. Before starting any data analysis, it is important to know what types of variables you are working with. The types of variables tell you which estimates you can calculate, and later, which types of statistical tests you should use. Remember, continuous variables are numeric (such as age in years or weight) while categorical variables (whether yes or no, male or female, or something else entirely) are just what the name says—categories. For continuous variables, we generally calculate measures such as the mean (average), median and standard deviation to describe what the variable looks like. We use means and medians in public health all the time; for example, we talk about the mean age of people infected with E. coli, or the median number of household contacts for case-patients with chicken pox. In a field investigation, you are often interested in dichotomous or binary (2-level) categorical variables that represent an exposure (ate potato salad or did not eat potato salad) or an outcome (had Salmonella infection or did not have Salmonella infection). For categorical variables, we cannot calculate the mean or median, but we can calculate risk. Remember, risk is the number of people who develop a disease among all the people at risk of developing the disease during a given time period. For example, in a group of firefighters, we might talk about the risk of developing respiratory disease during the month following an episode of severe smoke inhalation.         
  3. We usually collect information on exposure and disease because we want to compare two or more groups of people. To do this, we calculate measures of association. Measures of association tell us the strength of the association between two variables, such as an exposure and a disease. The two measures of association that we use most often are the relative risk, or risk ratio (RR), and the odds ratio (OR). The decision to calculate an RR or an OR depends on the study design. We interpret the RR and OR as follows: RR or OR = 1: exposure has no association with disease RR or OR &amp;gt; 1: exposure may be positively associated with disease RR or OR &amp;lt; 1: exposure may be negatively associated with disease    
  4. The risk ratio, or relative risk, is used when we look at a population and compare the outcomes of those who were exposed to something to the outcomes of those who were not exposed. When we conduct a cohort study, we can calculate risk ratios. However, the case-control study design does not allow us to calculate risk ratios, because the entire population at risk is not included in the study. That’s why we use odds ratios for case-control studies. An odds ratio is the odds of exposure among cases divided by the odds of exposure among controls, and it provides a rough estimate of the risk ratio. So remember, for a cohort study, calculate a risk ratio, and for a case-control study, calculate an odds ratio. For more information about calculating risk ratios and odds ratios, take a look at FOCUS Volume 3, Issues 1 and 2.  
  5. This is a good time to discuss one of our favorite analysis tools—the 2x2 table. You should be familiar with 2x2 tables from previous FOCUS issues. They are commonly used with dichotomous variables to compare groups of people. The table has one dichotomous variable along the rows and another dichotomous variable along the columns. This set-up is useful because we usually are interested in determining the association between a dichotomous exposure and a dichotomous outcome. For example, the exposure might be eating salsa at a restaurant (or not eating salsa), and the outcome might be Hepatitis A (or no Hepatitis A).    
  6. Table 1 displays data from a case-control study conducted in Pennsylvania in 2003. (2) So what measure of association can we get from this 2x2 table? Since we do not know the total population at risk (everyone who ate salsa), we cannot determine the risk of illness among the exposed and unexposed groups. That means we should not use the risk ratio. Instead, we will calculate the odds ratio, which is exactly what the study authors did. They found an odds ratio of 19.6*, meaning that the odds of getting Hepatitis A among people who ate salsa were 19.6 times as high as among people who did not eat salsa.
  7. How do we know whether an odds ratio of 19.6 as in the preceding example is meaningful in our investigation? We can start by calculating the confidence interval (CI) around the odds ratio. When we calculate an estimate (like risk or odds), or a measure of association (like a risk ratio or an odds ratio), that number (in this case 19.6) is called a point estimate. The confidence interval of a point estimate describes the precision of the estimate. It represents a range of values on either side of the estimate. The narrower the confidence interval, the more precise the point estimate. (3) Your point estimate will usually be the middle value of your confidence interval.    
  8. An analogy can help explain the confidence interval. Say we have a large bag of 500 red, green, and blue marbles. We are interested in knowing the percentage of green marbles, but we do not have the time to count every marble. So we shake up the bag and select 50 marbles to give us an idea, or an estimate, of the percentage of green marbles in the bag. In our sample of 50 marbles, we find 15 green marbles, 10 red marbles, and 25 blue marbles.    
  9. Based on this sample, we can conclude that 30% (15 out of 50) of the marbles in the bag are green. In this example, 30% is the point estimate. Do we feel confident in stating that 30% of the marbles are green? We might have some uncertainty about this statement, since there is a chance that the actual percent of green marbles in the entire bag is higher or lower than 30%. In other words, our sample of 50 marbles may not accurately reflect the actual distribution of marbles in the whole bag of 500 marbles. One way to determine the degree of our uncertainty is to calculate a confidence interval.  
  10. So how do you calculate a confidence interval? It’s certainly possible to do so by hand. Most of the time, however, we use statistical programs such as Epi Info, SAS, STATA, SPSS, or Episheet to do the calculations. The default is usually a 95% confidence interval, but this can be adjusted to 90%, 99%, or any other level depending on the desired level of precision. For those interested in calculating confidence intervals by hand, the following resource may be helpful: Giesecke, J. Modern Infectious Disease Epidemiology. 2nd Ed. London: Arnold Publishing; 2002.  
  11. The most commonly used confidence interval is the 95% interval. When we use a 95% confidence interval, we conclude that our estimated range has a 95% chance of containing the true population value (e.g., the true percentage of green marbles in our bag). Let’s assume that the 95% confidence interval is 17-43%. How do we interpret this? Well, we estimated that 30% of the marbles are green, and the confidence interval tells us that the true percentage of green marbles in the bag is most likely between 17 and 43%. However, there is a 5% chance that this range (17-43%) does not contain the true percentage of green marbles.  
  12. In epidemiology we are usually comfortable with this 5% chance of error, which is why we commonly use the 95% confidence interval. However, if we want less chance of error, we might calculate a 99% confidence interval, which has only a 1% chance of error. This is a trade-off, since with a 99% confidence interval the estimated range will be wider than with a 95% confidence interval. In fact, with a 99% confidence interval, our estimate of the percentage of green marbles is 13-47%. That’s a pretty wide range! On the other hand, if we were willing to accept a 10% chance of error, we can calculate a 90% confidence interval (and in this case, the percentage of green marbles will be 19-41%).   
  13. Ideally, we would like a very narrow confidence interval, which would indicate that our estimate is very precise. One way to get a more precise estimate is to take a larger sample. If we had taken 100 marbles (instead of 50) from our bag and found 30 green marbles, the point estimate would still be 30%, but the 95% confidence interval would be a range of 21-39% (instead of our original range of 17-43%). If we had sampled 200 marbles and found 60 green marbles, the point estimate would be 30%, and with a 95% confidence interval the range would be 24-36%. You can see that the confidence interval becomes narrower as the sample size increases.  
  14. Let’s go back to our example of Hepatitis A in the Pennsylvania restaurant for one final review of confidence intervals. The odds ratio was 19.6, and the 95% confidence interval for this estimate was 11.0-34.9. This means there was a 95% chance that the range 11.0-34.9 contained the true odds ratio of Hepatitis A among people who ate salsa compared with people who did not eat salsa. Remember that an odds ratio of 1 means that there is no difference between the two groups, while an odds ratio greater than 1 indicates a greater risk among the exposed group. The lower bound of the confidence interval was 11.0, which is greater than 1. That means we can conclude that the people who ate salsa were truly more likely to become ill than the people who did not eat salsa.       
  15. It’s necessary to include confidence intervals with your point estimates. That way you can give a sense of the precision of your estimates. Here are two examples: In an outbreak of gastrointestinal illness at two primary schools in Italy, investigators reported that children who ate a cold salad of corn and tuna had 6.19 times the risk of becoming ill of children who did not eat salad (95% confidence interval: 4.81-7.98). (4) In a community-wide outbreak of pertussis in Oregon in 2003, case-patients had 6.4 times the odds of living with a 6-10 year-old child than controls (95% confidence interval: 1.8-23.4). (5) In both of these examples, one can conclude that there was an association between exposure and disease.    
  16. Analysis of Categorical Data You have calculated a measure of association (a risk ratio or odds ratio), and a confidence interval for a range of values around the point estimate. Now you want to use a formal statistical test to determine whether the results are statistically significant. Here, we will focus on the statistical tests that are used most often in field epidemiology. The first is the chi-square test.      
  17. Chi-Square Statistics As noted earlier, a common analysis in epidemiology involves dichotomous variables, and uses a 2x2 table. We want to know if Disease X occurs as much among people belonging to Group A as it does among people belonging to Group B. In epidemiology, we often put people into groups based on their exposure to some disease risk factor. To determine whether those persons who were exposed have more illness than those not exposed, we perform a test of the association between exposure and disease in the two groups. Let’s use a hypothetical example to illustrate this.       
  18. Let’s assume there was an outbreak of Salmonella on a cruise ship, and investigators conducted a retrospective cohort study to determine the source of the outbreak. They interviewed all 300 people on the cruise and found that 60 had symptoms consistent with Salmonella. Questionnaires indicated that many of the case-patients ate tomatoes from the salad bar. Table 2a shows the number of people who did and did not eat tomatoes from the salad bar.  
  19. Table 2a shows the number of people who did and did not eat tomatoes from the salad bar. To see if there is a significant difference in the amount of illness between those who ate tomatoes (41/130 or 32%) and those who did not (19/170 or 11%), one test we could conduct is a handy little statistic called χ2 (or chi-square).      
  20. In order to calculate a run-of-the mill chi-square, the following conditions must be met: There must be at least a total of 30 observations (people) in the table. Each cell must contain a count of 5 or more.   To conduct a chi-square test, we compare the observed data (from our study results) to the data we would expect to see.
  21. So how do we know what data would be expected? We need to know the size of our population, so we start with the totals from our observed data, as in Table 2b. This gives us the overall distribution of people who ate tomatoes and people who became sick. Based on these distributions, we can fill in the empty cells of the table with the expected values, using the totals as weights. A computer program will calculate the expected values, but it is good to know that these numbers do not just fall out of the sky; there is actually a simple method to calculate them!  
  22. Expected Value = Row Total x Column Total Grand Total For the first cell, people who ate tomatoes and became ill: Expected Value = 130 x 60 = 26 300 We can use this formula to calculate the expected values for each of the cells, as shown in Table 2c.  
  23. To calculate the chi-square statistic, you use the observed values from Table 2a and the expected values that we calculated in Table 2c. You use this formula: [(Observed – Expected)2/ Expected] for each cell in the table. Then you add these numbers together to find the chi-square statistic.   
  24. The chi-square (χ2) for this example is 19.2 ( 8.7 + 2.2 + 6.6 + 1.7 = 19.2).    
  25. What exactly does this number tell you? In general, the higher the chi-square value, the greater the likelihood that there is a statistically significant difference between the two groups you are comparing. To know for sure, though, you need to look up the p-value in a chi-square table. We will talk about the p-value and how the p-value is related to the chi-square test in the section below. First, though, let’s talk about different types of chi-square tests.      
  26. Many computer programs give several types of chi-square tests. Each of these chi-square tests is best suited to certain situations. The most commonly calculated chi-square test is Pearson’s chi-square, or the uncorrected chi-square. In fact, if you see output that is simply labeled “chi-square,” it is likely that it is actually Pearson’s chi-square. A general rule of thumb is to use Pearson’s chi square when you have a fairly large sample (&amp;gt;100). For a 2x2 table, the computer takes some shortcuts when calculating the chi-square, which does not work so well for smaller sample sizes but does make things faster to calculate.  
  27. The figure identifies the types of tests to use in different situations. If you have a sample with less than 30 people or if one of the cells in your 2x2 tables is less than 5, you will need to use Fisher’s exact test. If you have matched or paired data, you should use McNemar’s Test instead of a standard chi-square test (we’ll talk more about McNemar’s Test in the next issue of FOCUS).     
  28. Here are a few examples of studies that compared two groups using a chi-square test or Fisher’s exact test. In each study, the investigators chose the type of test that best applied to the situation. Remember that the chi-square value is used to determine the corresponding p-value. Many studies, including the ones below, report only the p-value rather than the actual chi-square value. Pearson (Uncorrected) Chi-Square : A North Carolina study investigated 955 individuals referred to the Department of Health and Human Services because they were partners of someone who tested positive for HIV. The study found that the proportion of partners who got tested for HIV differed significantly by race/ethnicity (p-value &amp;lt;0.001). The study also found that HIV-positive rates did not differ by race/ethnicity among the 610 who were tested (p = 0.4). (6)  
  29. Further examples: Yates (Corrected) Chi-Square: In an outbreak of Salmonella gastroenteritis associated with eating at a restaurant, 14 of 15 ill patrons studied had eaten the Caesar salad, while 0 of 11 well patrons had eaten the salad (p-value &amp;lt;0.01). The dressing on the salad was made from raw eggs that were probably contaminated with Salmonella. (7) Fisher’s Exact Test: A study of Group A Streptococcus (GAS) among children attending daycare found that 7 of 11 children who spent 30 or more hours per week in daycare had laboratory-confirmed GAS, while 0 of 4 children spending less than 30 hours per week in daycare had GAS (p-value &amp;lt;0.01). (8)  
  30. P-Values Let’s get back to our hypothetical cruise ship Salmonella outbreak. In our example, 32% of the people who ate tomatoes got Salmonella infection, compared to 11% of the people who did not eat tomatoes. (Although 32% and 11% look like they are different, we will need to check this with a statistical test. Often looks can be deceiving.) How do we know whether the difference between 32% and 11% is a “real” difference? In other words, how do we know that our chi-square value of 19.2 indicates a statistically significant difference? The p-value is our indicator!
  31. Many statistical tests give both a numeric result (e.g., our chi-square value of 19.2) and a p-value. The p-value is a number that ranges between 0 and 1. So what does the p-value tell you? The p-value is the probability of getting the result you got, assuming that the two groups you are comparing are actually the same.  
  32. In other words, we start by assuming that there is no difference in outcomes between the groups (e.g., the people who ate tomatoes and those who did not). Then we look at the test statistic and p-value to see if they indicate otherwise. A low p-value means that, assuming the groups are actually the same, the probability of observing these results just by chance is very small. Therefore, you can call the difference between two groups statistically significant. A high p-value means that the two groups were not that different. A p-value of 1 means that there was no difference at all between the two groups.
  33. Generally, if the p-value is less than 0.05, the difference observed is considered statistically significant. Although this is somewhat arbitrary, we often consider statistically significant differences to be real. If the p-value is below 0.05, we conclude that the difference observed is a true difference and did not happen by chance. A number of statistical tests can be employed to obtain the p-value. The test you use depends on the type of data you have.
  34. Earlier we discussed the chi-square test. If the chi-square statistic is small, this indicates that observed and expected data were not very different, and the p-value is large (&amp;gt;0.05 , i.e. the difference was not statistically significant). If the chi-square statistic is large, this generally means that the p-value is small, and the difference could be statistically significant. At the beginning of our discussion we mentioned an outbreak of E. coli O157:H7 associated with swimming in a lake. In that study, the investigators reported that “case-patients were significantly more likely than camper control subjects to have taken lake water into the mouth (p-value =0.002) and to have swallowed lake water (p-value =0.002).” (1) Because the p-values were each less than 0.05, both exposures were considered to be statistically significant risk factors.  
  35. A cautionary note… Statistical tests such as the chi-square are based on several assumptions about the data, including independence of observations. The assumption of independence means that the value of one observation does not influence the value of another observation. If this assumption is not actually true in your study, then it is incorrect to use the test. Situations in which the chi-square test should not be used include: repeat observations of the same group of people (e.g., pre- and post-tests) and matched pair designs in which cases and controls are matched on variables such as age and sex. These situations require special data analysis methods. Luckily for us, in most of the situations we face in field epidemiology, the assumptions of the chi-square test are met. For example, the people who ate the potato salad were not the same people as those who did not eat the potato salad.  
  36.  Analysis of Continuous Data Data do not always fit into discrete categories but continuous numeric data can also be of interest in a field investigation. You might want to compare clinical symptoms between groups of patients. For example, you might want to compare the temperature (fever) of adult patients with that of children. You could compare the average age of patients to the average age of non-patients, or the respiratory rate of those who were directly exposed to a chemical plume to the rate of those who were not exposed.  
  37. It is possible to make this type of comparison through a test called Analysis Of Variance (ANOVA). Most of the major statistical software programs will calculate ANOVA, but the output varies slightly in different programs. Here we will discuss the output from Epi Info to introduce the ANOVA test and two other useful tests. Epi Info generates 3 pieces of information when you conduct a test using ANOVA: the ANOVA results, Bartlett’s test, and the Kruskal-Wallis test. The next few slides describe how to interpret all of these tests.
  38. ANOVA When comparing continuous variables between groups of study subjects, a test that results in a p-value is used. However, instead of using chi-square tests as we did for categorical data, we use a t-test (for comparing 2 groups) or an f-test (for comparing 3 or more groups). The ANOVA procedure uses either the t-test or the f-test, depending on the number of groups being compared. A t-test compares averages between two groups, takes into account the variability in each group, and results in a statistic (t) that has a p-value. For example, let’s say that we are testing age differences between 2 groups. If the groups have very similar average ages and a similar “spread” or distribution of age values, the t-statistic will be relatively small and the p-value will not be significant (i.e., it will be &amp;gt;0.05). If the average ages of the 2 groups are quite different, the t-statistic will be larger, and the p-value will be smaller. If the p-value is less than 0.05, the groups have significantly different ages.  
  39. Bartlett’s Test In t-tests and f-tests, there is one critical assumption that you should know about. ANOVA assumes that the two groups have similar variances. In other words, the two groups have a similar “spread” of age values. Think of it as comparing apples to apples. As part of the ANOVA analysis, the software program conducts a separate test just to see if the variances of the two groups are comparable. This test is called Bartlett’s test for equality of variance. The basic assumption for this test is that the variances are comparable. If you see a p-value for the test that is larger than 0.05, which is not significant, everything is fine. You can use the results of your ANOVA. However, if Bartlett’s p-value is less than 0.05, the variances in the groups are not the same and you cannot use the results of the ANOVA. In this situation there is another test that you can use in place of ANOVA: the Kruskal-Wallis test.  
  40. Kruskal-Wallis Test In Epi Info, a third test result is given that you will only need if Bartlett’s test told you that the variances were not similar enough to use ANOVA (i.e., the p-value of Bartlett’s test was less than 0.05). The Kruskal-Wallis test does not make assumptions about the variance in the data. The Kruskal-Wallis test does not test averages but examines the distribution of the values within each of the groups. The test will result in a p-value. If the p-value is larger than 0.05, then you can conclude that there is not a significant difference between the groups. If the p-value is less than 0.05, there is a significant difference between the groups.  
  41. Figure 1 gives a summary of when to use each of these tests for analyzing continuous data.
  42. Conclusion Analysis of epidemiologic data is a vital link in implicating exposures in disease causation. You could take years’ worth of coursework just to learn epidemiologic and statistical methods. However, in field epidemiology, a few tried and true calculations and tests make up the core of analytic methods. Here we have presented confidence intervals, p-values, chi-square tests, ANOVA, and their interpretations. When you master these methods, your field epi skills will be among the best! The next issue of FOCUS will delve even further into data analysis to tackle methods that you can use to control for confounding, including matching and logistic regression.  
  43. 1. Bruce MG, Curtis MB, Payne MM, et al. Lake-associated outbreak of Escherichia coli O157:H7 in Clark County, Washington, August 1999. Arch Pediatr Adolesc Med. 2003;157:1016-1021. 2. Wheeler C, Vogt TM, Armstrong GL, et al. An outbreak of hepatitis A associated with green onions. N Engl J Med. 2005;353:890-897. 3. Gregg MB. Field Epidemiology. 2nd ed. New York, NY: Oxford University Press; 2002. 4. Aureli P, Fiorucci GC, Caroli D, et al. An outbreak of febrile gastroenteritis associated with corn contaminated by Listeria monocytogenes. N Engl J Med. 2000;342:1236-1241.   
  44. 5. Schafer S, Gillette H, Hedberg K, Cieslak P. A community-wide pertussis outbreak: an argument for universal booster vaccination. Arch Intern Med. 2006;166:1317-1321. 6. Centers for Disease Control and Prevention. Partner counseling and referral services to identify persons with undiagnosed HIV --- North Carolina, 2001. MMWR Morb Mort Wkly Rep.2003;52:1181-1184. 7. Centers for Disease Control and Prevention. Outbreak of Salmonella Enteritidis infection associated with consumption of raw shell eggs, 1991. MMWR Morb Mort Wkly Rep. 1992;41:369-372. 8. Centers for Disease Control and Prevention. Outbreak of invasive group A streptococcus associated with varicella in a childcare center -- Boston, Massachusetts, 1997. MMWR Morb Mort Wkly Rep. 1997;46:944-948.