2. 9/13/2010
2
Contingency Tables
• Used to display relationships among categorical variables
– Responses in the columns
– Predictors in the rows
• Statistical significance tested using Pearson chi‐square or Fisher’s exact tests
• Results interpreted using an odds ratio or relative risk

                            Pregnant?
                            Yes   No    Row Totals
Pregnancy Test?  Positive   27    3     30
                 Negative   4     26    30
Column Totals               31    29    60
Pearson’s Chi‐Squared Test
• Pearson’s chi‐square test assumes that columns and rows are independent
– Computation of expected values (Expij) assumes independence
• Chi‐square tests require large sample sizes with no empty cells & few small cell counts
• P‐values computed from the chi‐square distribution

                            Pregnant?
                            Yes     No
Pregnancy Test?  Positive   Obs11   Obs12   R1.
                 Negative   Obs21   Obs22   R2.
                            C.1     C.2     N..
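As a rough illustration of the test described above, the sketch below computes expected counts as Expij = Ri. × C.j / N.. and sums (Obs − Exp)² / Exp over the four cells of the pregnancy test table. The function name and the `yates` flag are illustrative assumptions, not part of the original slides; the 1‐df p‐value uses the identity P(X > x) = erfc(√(x/2)).

```python
import math

def pearson_chi_square_2x2(table, yates=False):
    """Return (statistic, p_value) for a 2x2 table given as [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - expected)
            if yates:
                # Yates' continuity correction shrinks each deviation by 0.5
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

observed = [[27, 3], [4, 26]]
chi2_plain, p_plain = pearson_chi_square_2x2(observed)
chi2_yates, p_yates = pearson_chi_square_2x2(observed, yates=True)
print(f"uncorrected: X2 = {chi2_plain:.4f}, p = {p_plain:.3e}")
print(f"Yates:       X2 = {chi2_yates:.4f}, p = {p_yates:.3e}")
```

With these counts the continuity‐corrected statistic comes out near 32.30, which suggests the X2 value reported later in the deck was computed with Yates' correction.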
Fisher’s Exact Test
• Also tests the independence of columns and rows
• Fisher’s test is valid for all sample sizes and cell counts
• Fisher’s test assumes column and row totals are fixed
– Fisher’s exact test may be inappropriate for some tables
• P‐values computed using the hypergeometric distribution:
  P(table) = C(a+b, a) × C(c+d, c) / C(n, a+c), where C(m, k) is the binomial coefficient
• P(table) is the probability of finding this specific table among all possible tables with the same row and column totals and sample size n = a + b + c + d
– The p‐value sums P(table) over every table at least as extreme as the one observed

                            Pregnant?
                            Yes   No
Pregnancy Test?  Positive   a     b     a+b
                 Negative   c     d     c+d
                            a+c   b+d   n
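A minimal sketch of the hypergeometric calculation, assuming the common two‐sided convention of summing every table whose probability is no larger than that of the observed table. The function name is illustrative and not taken from any particular package.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def table_prob(x):
        # Hypergeometric probability that the upper-left cell equals x
        # when all row and column totals are held fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = table_prob(a)
    # Sum the tables that are no more probable than the observed one
    # (small relative tolerance guards against floating-point ties)
    return sum(
        p for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )

p_fisher = fisher_exact_two_sided(27, 3, 4, 26)
print(f"Fisher exact p = {p_fisher:.3e}")
```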
Odds Ratios and Relative Risk
• Pearson’s chi‐square and Fisher’s exact tests indicate whether a relationship is statistically significant
– Did the results occur by chance?
• Odds ratios and relative risk indicate the magnitude of a relationship or its effect size
– Was there a large difference in the odds or risks among rows?

                            Pregnant?
                            Yes   No
Pregnancy Test?  Positive   a     b     a+b
                 Negative   c     d     c+d
                            a+c   b+d   n
Interpreting OR and RR
• The odds of pregnancy are OR = 58 5 times higher• The odds of pregnancy are OR = 58.5 times higher
for women who tested positive than the odds of
pregnancy for women who tested negative
• The risk of pregnancy is RR = 6.75 times higher for p g y g
women who tested positive than the odds of
women who tested negative
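The two effect sizes can be reproduced from the cell labels a, b, c, d used in the table, using the standard definitions OR = (a/b) / (c/d) = ad / bc and RR = [a / (a+b)] / [c / (c+d)]. The function names below are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds of the outcome in row 1 divided by the odds in row 2."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """Risk of the outcome in row 1 divided by the risk in row 2."""
    return (a / (a + b)) / (c / (c + d))

# Pregnancy test table: rows = test result, columns = pregnant yes / no
a, b, c, d = 27, 3, 4, 26
or_estimate = odds_ratio(a, b, c, d)
rr_estimate = relative_risk(a, b, c, d)
print(f"OR = {or_estimate:.2f}, RR = {rr_estimate:.2f}")
```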
Sensitivity and Specificity
• Sensitivity and specificity represent the performance of diagnostic tests
• Sensitivity is the proportion of actual positives correctly identified by the diagnostic test
• Specificity is the proportion of actual negatives correctly identified by the diagnostic test

                            Pregnant?
                            Yes   No
Pregnancy Test?  Positive   TP    FP
                 Negative   FN    TN

TP = true positives, FP = false positives, FN = false negatives, TN = true negatives
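In terms of the table cells, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). A short sketch using the pregnancy test counts from the earlier slides (variable names are illustrative):

```python
# Cell counts from the pregnancy test table
true_pos, false_pos = 27, 3    # test positive: pregnant / not pregnant
false_neg, true_neg = 4, 26    # test negative: pregnant / not pregnant

# Proportion of actual positives the test identifies
sensitivity = true_pos / (true_pos + false_neg)
# Proportion of actual negatives the test identifies
specificity = true_neg / (true_neg + false_pos)

print(f"sensitivity = {sensitivity:.2%}, specificity = {specificity:.2%}")
```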
Table Formats
Contingency Table format

                            Pregnant?
                            Yes   No
Pregnancy Test?  Positive   27    3     30
                 Negative   4     26    30
                            31    29    60

Summarized Table format

Test       Pregnant?   Count
Positive   Yes         27
Positive   No          3
Negative   Yes         4
Negative   No          26
• You may need to reformat your data table for some software
– Contingency table format for analysis in GraphPad Prism
– Summarized table format for analysis in JMP
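Converting between the two layouts is mechanical. The sketch below uses plain Python dictionaries and tuples as stand‐ins for the two formats; the helper names are illustrative, and real software will have its own import conventions.

```python
# Contingency table format: rows = test result, columns = pregnancy status
contingency = {
    "Positive": {"Yes": 27, "No": 3},
    "Negative": {"Yes": 4, "No": 26},
}

def to_summarized(table):
    """Flatten a contingency table into (test, pregnant, count) records."""
    return [
        (row, col, count)
        for row, cols in table.items()
        for col, count in cols.items()
    ]

def to_contingency(records):
    """Rebuild the nested contingency table from summarized records."""
    table = {}
    for row, col, count in records:
        table.setdefault(row, {})[col] = count
    return table

summarized = to_summarized(contingency)
for record in summarized:
    print(*record)
```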
Review Contingency Table Results
                            Pregnant?
                            Yes   No
Pregnancy Test?  Positive   27    3     30
                 Negative   4     26    30
                            31    29    60

Pearson Chi‐Square (with Yates’ continuity correction): X2 = 32.3026, p = 1.319e‐08
Fisher’s Exact Test: p = 1.975e‐09
Odds of pregnancy are OR = 58.5 times higher after positive pregnancy test
Risk of pregnancy is RR = 6.75 times higher after positive pregnancy test
Pregnancy test has 87.1% sensitivity and 89.66% specificity
More Complicated Models
• What if your contingency table is larger than 2 x 2?
– Pearson chi‐square and Fisher’s exact test for M x N tables
• What if your table contains paired data?
– McNemar’s Test for paired data
• What if your table has three variables?
– Mantel‐Haenszel‐Cochran (MHC) test
• What if you have a continuous predictor variable?
– Logistic regression models
• What about really complicated models?
– Generalized Linear Models (GLIM)
M x N Contingency Tables
                       Blood Types
                       A     B     AB    O
Ethnicity   Bambara    7     8     5     20    40
            Peul       12    3     3     12    30
            Tuareg     11    13    2     4     30
                       30    24    10    36    100
• Pearson chi‐square tests work the same for larger M x N tables, but
researchers need to remember the assumptions about cell counts
• Fisher’s exact test is difficult to compute for M x N tables, but it
can be computed using simulations in R or other software
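The Pearson calculation for the blood type table can be sketched as follows. The check for expected counts below 5 reflects a common rule of thumb for the cell count assumption, not a threshold stated in the slides. The p‐value uses the closed form of the chi‐square survival function that holds when the degrees of freedom are even, as they are here (df = 6).

```python
import math

blood_types = ["A", "B", "AB", "O"]
counts = {
    "Bambara": [7, 8, 5, 20],
    "Peul": [12, 3, 3, 12],
    "Tuareg": [11, 13, 2, 4],
}

rows = list(counts.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)

chi2 = 0.0
small_expected = 0  # cells whose expected count is below 5
for i, row in enumerate(rows):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed - expected) ** 2 / expected
        if expected < 5:
            small_expected += 1

df = (len(rows) - 1) * (len(blood_types) - 1)
# For even df = 2m: P(X > x) = exp(-x/2) * sum_{k<m} (x/2)^k / k!
half = chi2 / 2
p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

print(f"X2 = {chi2:.4f}, df = {df}, p = {p_value:.4f}")
print(f"cells with expected count < 5: {small_expected} of {len(rows) * len(blood_types)}")
```

All three small expected counts fall in the AB column, which is the kind of table where an exact or simulated test may be worth considering.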
Ordinal vs. Nominal Variables
• Ordinal variables have outcomes that are ordered
– Drug Dosages: 0 mg, 5 mg, 10 mg and 15 mg
– Symptom Severity: Mild, Moderate and Severe
• Nominal variables have outcomes that are unordered
– Blood Types: A, B, AB and O
– Ethnicity: Bambara, Peul and Tuareg
• Most tests assume nominal variables by default
– Ordinal variables require fewer odds ratio estimates
– Ordinal variables may allow for a simpler model
– E.g. compute odds ratios to compare Mild vs. Moderate and Moderate
vs. Severe, but do not compare Mild vs. Severe
McNemar’s Test
• McNemar’s test should be used if the table represents a matched pairs design experiment
– E.g. Some matched pairs designs arise from repeated sampling of patients pre‐ and post‐treatment
– E.g. Case‐control experiments may use McNemar’s test because case and control patients have been “matched” using key demographic variables like age, gender, race, ...

                      Test 2
                      Pos   Neg
Test 1   Positive     a     b     a+b
         Negative     c     d     c+d
                      a+c   b+d   n
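McNemar's test uses only the discordant pairs b and c from the table above. The sketch below shows the usual chi‐square form, (b − c)² / (b + c) with 1 df, alongside an exact binomial version. The counts are hypothetical, chosen only for illustration.

```python
import math

def mcnemar_chi_square(b, c):
    """Chi-square form of McNemar's test (no continuity correction)."""
    statistic = (b - c) ** 2 / (b + c)
    # Chi-square survival function with 1 degree of freedom
    return statistic, math.erfc(math.sqrt(statistic / 2))

def mcnemar_exact(b, c):
    """Exact two-sided p-value: discordant pairs ~ Binomial(b + c, 0.5)."""
    n = b + c
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant pair counts (not from the slides)
b, c = 15, 5
statistic, p_chi = mcnemar_chi_square(b, c)
p_exact = mcnemar_exact(b, c)
print(f"X2 = {statistic:.2f}, p = {p_chi:.4f}, exact p = {p_exact:.4f}")
```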
Mantel‐Haenszel‐Cochran Test

All Ages
                        Heart Attack?
                        Yes   No
Birth Control?   Yes    16    34
                 No     34    16

Age < 40
                        Heart Attack?
                        Yes   No
Birth Control?   Yes    8     32
                 No     2     8

Age > 40
                        Heart Attack?
                        Yes   No
Birth Control?   Yes    8     2
                 No     32    8

• Mantel‐Haenszel‐Cochran test determines if the relationship
between two table variables remains the same if the table is
“paneled” or split by a third table variable
• Often used to investigate Simpson’s Paradox
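The paradox in this example can be seen by comparing the crude odds ratio from the pooled table with the odds ratios inside each age group. The sketch below also computes the Mantel‐Haenszel common odds ratio and the stratified test statistic without continuity correction, using the standard formulas. It assumes the first age table is Age < 40 and the second is Age > 40.

```python
# Each stratum is (a, b, c, d): rows = birth control yes/no,
# columns = heart attack yes/no
strata = {
    "Age < 40": (8, 32, 2, 8),
    "Age > 40": (8, 2, 32, 8),
}

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

# Pooled ("All Ages") table is the cell-wise sum of the strata
pooled = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*pooled)
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Mantel-Haenszel common odds ratio across strata
mh_or = (
    sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
    / sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
)

# Stratified chi-square statistic (1 df, no continuity correction)
deviation = variance = 0.0
for a, b, c, d in strata.values():
    n = a + b + c + d
    deviation += a - (a + b) * (a + c) / n
    variance += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
cmh_statistic = deviation ** 2 / variance

print(f"crude OR = {crude_or:.3f}, stratum ORs = {stratum_ors}")
print(f"Mantel-Haenszel OR = {mh_or:.3f}, stratified X2 = {cmh_statistic:.3f}")
```

The pooled table suggests a strong association, while neither age group shows any, which is the pattern the slide attributes to Simpson's Paradox.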
Logistic Regression
• Logistic regression fits the relationship
between a continuous predictor and a
categorical response variable
– E.g. predict the gender of an unknown
person based on their height
– E.g. predict whether an animal will live or
die based on the dose of a drug
• The logistic regression plot represents
a change in log odds for each one
unit increase in the predictor variable
– E.g. If an unknown person is 61 inches tall,
their odds of being male are near zero
– E.g. if an unknown person is 68 inches tall,
their odds of being male are about 50‐50
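The height example can be mimicked with a logistic curve. The coefficients below are hypothetical, chosen only so that the predicted probability is 0.5 at 68 inches; they are not the fitted values behind the slide's plot.

```python
import math

# Hypothetical coefficients (assumptions for illustration only)
B1 = 1.0          # change in log odds of being male per inch of height
B0 = -68.0 * B1   # intercept chosen so the log odds are zero at 68 inches

def prob_male(height_inches):
    """Predicted probability of being male from the logistic curve."""
    log_odds = B0 + B1 * height_inches
    return 1.0 / (1.0 + math.exp(-log_odds))

for height in (61, 68, 75):
    p = prob_male(height)
    print(f"{height} in: P(male) = {p:.4f}, odds = {p / (1 - p):.4f}")
```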
Results from Logistic Regression
• Whole model results
– Likelihood Ratio Test (LRT)
– Model fit diagnostics
• Parameter estimates
– Regression coefficients
– Wald tests
• Odds ratios
– The odds of survival are 1.107
times higher after every one
unit increase in log(dose)
– Odds of survival are 12.794
times higher after every one
unit increase in dose
Why Use Both Wald and LRT?
• Likelihood Ratio tests compare the fit of two statistical models
– Most statistical models can be described with a likelihood function
– A likelihood ratio test (LRT) computes the log‐likelihood function under a full
model (dose and intercept) and reduced model (intercept) to test model fit
• Wald tests evaluate the statistical significance of model parameters
– Wald test statistics are constructed very similarly to Student’s t‐tests
– Results from Wald test should be consistent with LRT results
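To make the comparison concrete, the sketch below fits a simple logistic regression by Newton‐Raphson on hypothetical grouped dose data, then computes both the LRT (full model vs. intercept only) and the Wald test for the dose coefficient. The data and variable names are assumptions for illustration; statistical software reports the same quantities directly.

```python
import math

# Hypothetical grouped data: (dose, animals tested, animals that died)
data = [(1, 10, 1), (2, 10, 3), (3, 10, 5), (4, 10, 7), (5, 10, 9)]

def log_likelihood(b0, b1):
    """Binomial log-likelihood, dropping constants that cancel in the LRT."""
    total = 0.0
    for x, n, y in data:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        total += y * math.log(p) + (n - y) * math.log(1.0 - p)
    return total

def score_and_information(b0, b1):
    """Gradient of the log-likelihood and the 2x2 information matrix."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for x, n, y in data:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = n * p * (1.0 - p)
        g0 += y - n * p
        g1 += x * (y - n * p)
        i00 += w
        i01 += w * x
        i11 += w * x * x
    return g0, g1, i00, i01, i11

# Full model (intercept + dose): Newton-Raphson updates
b0, b1 = 0.0, 0.0
for _ in range(25):
    g0, g1, i00, i01, i11 = score_and_information(b0, b1)
    det = i00 * i11 - i01 * i01
    b0 += (i11 * g0 - i01 * g1) / det
    b1 += (i00 * g1 - i01 * g0) / det

# Reduced model (intercept only): the MLE is the logit of the pooled proportion
p_bar = sum(y for _, _, y in data) / sum(n for _, n, _ in data)
ll_reduced = log_likelihood(math.log(p_bar / (1.0 - p_bar)), 0.0)

# Likelihood ratio test, 1 degree of freedom
lrt = 2.0 * (log_likelihood(b0, b1) - ll_reduced)
p_lrt = math.erfc(math.sqrt(lrt / 2.0))

# Wald test: estimate divided by its standard error
_, _, i00, i01, i11 = score_and_information(b0, b1)
se_b1 = math.sqrt(i00 / (i00 * i11 - i01 * i01))
wald_z = b1 / se_b1
p_wald = math.erfc(abs(wald_z) / math.sqrt(2.0))

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"LRT X2 = {lrt:.3f}, p = {p_lrt:.2e}")
print(f"Wald z = {wald_z:.3f}, p = {p_wald:.2e}")
```

On data like these the two tests lead to the same conclusion, though the p‐values are not identical; large disagreements between them are usually a sign of small samples or an unstable fit.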
Estimate LD50 from Logistic Regression
• You can use interpolated values
or inverse prediction to estimate
LD50 from a logistic regression
• Open the Inverse Prediction menu
and enter Prob = 0.500 to estimate
LD50 by finding X at Y = 0.500
– Enter Prob = 0.90 for LD90, ...
• You may need to antilog your LD50
estimate if your predictor is on the
log scale (e.g. log10(dose))
Compute LD50 from Parameter Estimates
• Simple logistic regression is defined by the equation log(p / (1 – p)) = B0 + B1·X
• At the LD50, p = 0.5 and log(p / (1 – p)) = 0; therefore, by simple algebra, we find LD50 = ‐B0 / B1
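The same algebra extends to other lethal dose levels by solving the logistic equation for X at any probability. The sketch below assumes hypothetical parameter estimates and includes the antilog step for a log10(dose) predictor; the function name is illustrative.

```python
import math

def lethal_dose(b0, b1, prob=0.5, log10_scale=False):
    """Solve log(p / (1 - p)) = b0 + b1 * x for x at the given probability."""
    x = (math.log(prob / (1.0 - prob)) - b0) / b1
    # Antilog if the predictor was log10(dose)
    return 10 ** x if log10_scale else x

# Hypothetical parameter estimates (not from the slides)
B0, B1 = -4.0, 2.0
ld50 = lethal_dose(B0, B1)               # equals -B0 / B1
ld90 = lethal_dose(B0, B1, prob=0.90)
ld50_antilog = lethal_dose(B0, B1, log10_scale=True)
print(f"LD50 = {ld50:.3f}, LD90 = {ld90:.3f}, LD50 on dose scale = {ld50_antilog:.1f}")
```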
Generalized Linear Models
• Logistic regression, extensions of Pearson chi‐square tests and other
models can be defined as generalized linear models (GLIM)
• Each GLIM model is coerced into the form of a linear equation by
choosing the correct statistical distribution and link function
• Excluding logistic regression, most multifactor categorical models
must be specified using the GLIM procedures in your software
• GLIM procedures typically allow analysts to test for overdispersion,
where real data has more variance than expected from the model
Distribution Choices
• Modeling categorical responses directly
– Binomial and multinomial distributions
– Negative binomial distribution
• Modeling contingency table cell counts
– Poisson distribution models all cell counts as rare events
– Normal distribution models cell counts as common events
Overdispersion Parameters
• Traditional linear models, like linear regression, use independent
parameters to estimate the variance of the response data
– E.g. linear regression has independent mean μ = Xβ and variance σ²
• Many GLIM models, like logistic regression, have fixed relationships
between the variance and other model parameters
– E.g. logistic regression has mean μ = np and variance σ² = np(1 – p)
– E.g. log‐linear models have μ = σ² = λ = np for rare events with small p
• Overdispersion parameters are used to account for extra variability
in the responses, which cannot be explained by the model
– E.g. logistic regression modeled with variance σ² = φnp(1 – p)
– Want to know if multiplier φ > 2 to determine significance or importance
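One common way to estimate φ is the Pearson chi‐square statistic divided by its residual degrees of freedom. The sketch below applies that estimator to hypothetical grouped binomial data under an intercept‐only model; it is an illustration of the idea, not the procedure used by any specific software.

```python
# Hypothetical grouped binomial data: (trials, successes) per group
groups = [(20, 2), (20, 10), (20, 5), (20, 15), (20, 8)]

total_trials = sum(n for n, _ in groups)
total_successes = sum(y for _, y in groups)
p_hat = total_successes / total_trials  # intercept-only model: one common p

# Pearson chi-square: squared deviations scaled by the binomial variance np(1 - p)
pearson = sum(
    (y - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
    for n, y in groups
)
residual_df = len(groups) - 1  # groups minus the one estimated parameter
phi_hat = pearson / residual_df

print(f"p_hat = {p_hat:.2f}, Pearson X2 = {pearson:.3f}, phi_hat = {phi_hat:.3f}")
```

Here the estimate is well above 2, so by the slide's rule of thumb these made‐up data would be treated as overdispersed relative to a binomial model.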
Generalized Linear Mixed Models
• Generalized linear models can be advanced further by
including random effect variables
– These models are called generalized linear mixed models (GLMM)
– Random effect variables are included to account for paired designs,
repeated measures designs, split‐plot designs and other effects
– GLMM are typically fit using generalized estimating equations (GEE), often
using linearization techniques (e.g. SAS PROC GLIMMIX)
• Sometimes complicated GLM and GLMM must be fit
using nonlinear modeling procedures in your software
– Probit model with binomial errors or Poisson loss function models in JMP
– Probit‐Normal models and Poisson‐Normal models in SAS PROC NLMIXED
Random vs. Fixed Effects
Subject effects are random Gender effects are fixed
• Subject effects are random because the subjects in an experiment
are a sample from the population of all possible subjects
• Gender effects are fixed because there are only two genders
Split‐plot Design
12 mice: 6 infected, 6 uninfected
3 infected males, 3 infected females, …
4 samples taken from each mouse
Each sample treated with one of 2 different drugs
Whole plot (mouse) EUs: Infection, gender
Subplot (sample) EUs: drug treatment
• Split‐plot designs model experiments where whole plots and
subplots represent different experimental units (EUs)
– Whole plots are often locations, subjects, objects or factors that
are difficult to change (e.g. temperature in an incubator)
– Subplot effects are typically the effects of highest interest
– Subplot effects are tested with higher power than whole plot effects
References
• Agresti A. 2002. Categorical Data Analysis. Second Ed. Wiley‐Interscience.
• Reed LJ and H Muench. 1938. A Simple Method of Estimating Fifty Percent
Endpoints. The American Journal of Hygiene. 27(3):493‐497.
• SAS Institute Inc. 2007. SAS 9.1.3 Documentation. Cary, NC. SAS Institute Inc.
• SAS Institute Inc. 2010. JMP Statistics and Graphics Guide. Cary, NC. SAS
Institute Inc.