GEE Data Examples
The TLC Data
SAS Codes to Create a Binary Outcome
SAS file name: SAS demo TLC genmod
/* Reshape from wide (y0, y1, y4, y6) to long: one row per subject per visit */
data tlc ; set ala.tlc ;
  y=y0 ; time=0 ; week=0 ; output ;
  y=y1 ; time=1 ; week=1 ; output ;
  y=y4 ; time=2 ; week=4 ; output ;
  y=y6 ; time=3 ; week=6 ; output ;
run ;
/* Drop the baseline row and code the binary outcome */
data tlc ; set tlc ;
  if week=0 then delete ;
  if y>=20 then lead_normal=0 ;
  if y ne . and y < 20 then lead_normal=1 ;  /* a missing y leaves lead_normal missing */
run ;
proc print ; run ;
Note: the event/success is a normal blood lead level (lead_normal=1)
id  trt  y0    y1    y4    y6    y     time  week  lead_normal
1   P    30.8  26.9  25.8  23.8  26.9  1     1     0
1   P    30.8  26.9  25.8  23.8  25.8  2     4     0
1   P    30.8  26.9  25.8  23.8  23.8  3     6     0
2   A    26.5  14.8  19.5  21.0  14.8  1     1     1
2   A    26.5  14.8  19.5  21.0  19.5  2     4     1
2   A    26.5  14.8  19.5  21.0  21.0  3     6     0
3   A    25.8  23.0  19.1  23.2  23.0  1     1     0
3   A    25.8  23.0  19.1  23.2  19.1  2     4     1
3   A    25.8  23.0  19.1  23.2  23.2  3     6     0
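The wide-to-long reshaping and the outcome coding can also be traced by hand. The following is a minimal sketch in Python (chosen only so it runs without SAS); the list `wide` is a hypothetical in-memory copy of the three subjects printed above, and the function name `to_long` is made up for illustration.

```python
# Hypothetical in-memory copy of the three subjects shown above (wide format).
wide = [
    {"id": 1, "trt": "P", "y0": 30.8, "y1": 26.9, "y4": 25.8, "y6": 23.8},
    {"id": 2, "trt": "A", "y0": 26.5, "y1": 14.8, "y4": 19.5, "y6": 21.0},
    {"id": 3, "trt": "A", "y0": 25.8, "y1": 23.0, "y4": 19.1, "y6": 23.2},
]


def to_long(rows, threshold=20.0):
    """Return one row per subject per post-baseline visit with the binary outcome."""
    long_rows = []
    # Weeks 1, 4, 6 get time = 1, 2, 3; the week-0 (baseline) row is dropped.
    for row in rows:
        for time, week in enumerate((1, 4, 6), start=1):
            y = row[f"y{week}"]
            # A missing lead level (None) keeps a missing outcome, as in the SAS code.
            lead_normal = None if y is None else int(y < threshold)
            long_rows.append({"id": row["id"], "trt": row["trt"], "y": y,
                              "time": time, "week": week,
                              "lead_normal": lead_normal})
    return long_rows


long_data = to_long(wide)
for r in long_data:
    print(r["id"], r["trt"], r["y"], r["time"], r["week"], r["lead_normal"])
```

Each subject contributes three rows, which is why the GEE output later reports a cluster size of 3.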
TLC Data
Blood lead levels were measured repeatedly in the TLC trial.
Binary outcome: blood lead level < 20 μg/dL (no lead poisoning)
Proportion with no lead poisoning in the two groups:
Days   Group A   Group P
 7     0.78      0.16
28     0.76      0.26
42     0.54      0.26
[Figure: TLC Data (continuous lead level)]
[Figure: TLC Data (continuous lead level by group)]
[Figure: TLC Data (binary: normal lead by group)]
/* GEE logistic model with main effects only and an exchangeable working correlation */
proc genmod data=tlc descending ;
  class id trt ;
  model lead_normal =trt week / d=bin link=logit ;
  repeated subject=id / type=exch corrw modelse ;
  output out=pprobs p=pred xbeta=xbeta ;
run ;
Note: By default, GENMOD reports empirical (i.e., robust) standard error estimates for a GEE model. I added the "modelse" option so that the model-based standard errors are printed as well and the two sets of results can be compared.
GEE Model Information
Correlation Structure          Exchangeable
Subject Effect                 id (100 levels)
Number of Clusters             100
Correlation Matrix Dimension   3
Maximum Cluster Size           3
Minimum Cluster Size           3
Algorithm converged.
Working Correlation Matrix
        Col1     Col2     Col3
Row1    1.0000   0.4622   0.4622
Row2    0.4622   1.0000   0.4622
Row3    0.4622   0.4622   1.0000
Exchangeable Working Correlation
Correlation   0.4621656646
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter    Estimate   Standard Error   95% Confidence Limits   Z       Pr > |Z|
Intercept    -1.0402    0.2839           -1.5966   -0.4838       -3.66   0.0002
trt A         2.0654    0.3706            1.3391    2.7918        5.57   <.0001
trt P         0.0000    0.0000            0.0000    0.0000        .      .
week         -0.0613    0.0522           -0.1635    0.0409       -1.18   0.2399
Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates
Parameter    Estimate   Standard Error   95% Confidence Limits   Z       Pr > |Z|
Intercept    -1.0402    0.3150           -1.6575   -0.4229       -3.30   0.0010
trt A         2.0654    0.3677            1.3447    2.7862        5.62   <.0001
trt P         0.0000    0.0000            0.0000    0.0000        .      .
week         -0.0613    0.0471           -0.1536    0.0310       -1.30   0.1930
Scale         1.0000    .                 .         .             .      .
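The two tables share the same estimates and differ only in the standard errors, so the Z statistics, p-values, and confidence limits change while the coefficients do not. As a rough check, the sketch below recomputes the Wald quantities for `week` from the printed estimate and the two printed standard errors. It is written in Python rather than SAS, the helper name `wald` is made up, and small differences from the SAS output are expected because the inputs are rounded to four decimals.

```python
import math


def wald(estimate, se, z_crit=1.96):
    """Wald z statistic, two-sided normal p-value, and 95% confidence limits."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p, (estimate - z_crit * se, estimate + z_crit * se)


week_estimate = -0.0613  # same point estimate in both tables
results = {}
for label, se in (("empirical", 0.0522), ("model-based", 0.0471)):
    results[label] = wald(week_estimate, se)
    z, p, (lower, upper) = results[label]
    print(f"{label:12s} z={z:.2f}  p={p:.4f}  95% CI=({lower:.4f}, {upper:.4f})")
```

In this example the empirical standard error for `week` is slightly larger than the model-based one, so its interval is slightly wider; neither is significant at the 0.05 level.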
TLC Data
Observed and predicted proportions of normal lead level in the two groups (predicted in parentheses)
Days   Group A        Group P
 7     0.78 (0.72)    0.16 (0.25)
28     0.76 (0.69)    0.26 (0.22)
42     0.54 (0.66)    0.26 (0.20)
Note the differences between observed and predicted proportions in the treatment group. This is because the model we fit was "main effect" only, which assumes the treatment effect is the same at every time point (on the logit scale).
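The predicted proportions in parentheses follow from the main-effects coefficients by applying the inverse logit. A minimal sketch, in Python rather than SAS, using only the printed estimates (the function names `inv_logit` and `predict_main` are illustrative):

```python
import math

# Estimates from the main-effects GEE fit (reference group is P).
INTERCEPT, TRT_A, WEEK = -1.0402, 2.0654, -0.0613


def inv_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predict_main(trt, week):
    """Predicted Pr(lead_normal = 1) under the main-effects model."""
    eta = INTERCEPT + (TRT_A if trt == "A" else 0.0) + WEEK * week
    return inv_logit(eta)


for week in (1, 4, 6):
    print(f"day {7 * week:2d}: A={predict_main('A', week):.2f}  "
          f"P={predict_main('P', week):.2f}")
```

Because the gap between the two groups is the constant `TRT_A` on the logit scale, the model cannot reproduce the late drop in Group A together with the rise in Group P.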
TLC Data: Adding an Interaction
proc genmod data=tlc descending ;
class id trt ;
model lead_normal =trt week trt*week / d=bin link=logit ;
repeated subject=id / type=exch corrw ;
output out=pprobs p=pred xbeta=xbeta ;
run ;
GEE Model Information
Correlation Structure          Exchangeable
Subject Effect                 id (100 levels)
Number of Clusters             100
Correlation Matrix Dimension   3
Maximum Cluster Size           3
Minimum Cluster Size           3
Algorithm converged.
Working Correlation Matrix
        Col1     Col2     Col3
Row1    1.0000   0.4784   0.4784
Row2    0.4784   1.0000   0.4784
Row3    0.4784   0.4784   1.0000
Exchangeable Working Correlation
Correlation   0.4783943345
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter     Estimate   Standard Error   95% Confidence Limits   Z       Pr > |Z|
Intercept     -1.6952    0.3935           -2.4665   -0.9239       -4.31   <.0001
week           0.1233    0.0770           -0.0276    0.2742        1.60   0.1091
trt A          3.3776    0.5711            2.2583    4.4970        5.91   <.0001
trt P          0.0000    0.0000            0.0000    0.0000        .      .
week*trt A    -0.3452    0.1045           -0.5500   -0.1404       -3.30   0.0010
week*trt P     0.0000    0.0000            0.0000    0.0000        .      .
Group P:
$$\operatorname{logit}(\mu_{ij}) = -1.6952 + 0.1233 \cdot \text{week}$$
Group A:
$$\operatorname{logit}(\mu_{ij}) = -1.6952 + 3.3776 + 0.1233 \cdot \text{week} - 0.3452 \cdot \text{week} = 1.6824 - 0.2219 \cdot \text{week}$$
Here the odds of a normal lead level are
$$\text{odds} = \frac{\Pr(\text{blood lead} < 20)}{\Pr(\text{blood lead} \ge 20)}$$
and the OR per week is the ratio of these odds between two consecutive weeks.
Thus, in the placebo group (group P), the odds of having a normal lead level go up over time (although the trend is not significant at the 0.05 level, p = 0.1091):
OR per week = exp(0.1233) = 1.13
But in the treatment group (group A), the odds of having a normal lead level go down over time:
OR per week = exp(-0.2219) = 0.80
The per-week change in the odds differs significantly between the two groups (interaction p = 0.0010).
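The per-week odds ratios come straight from the slopes of the interaction model. The sketch below (Python rather than SAS, variable names illustrative) recomputes them from the printed coefficients and also exponentiates the printed 95% limits for `week` to get an interval for the placebo odds ratio. An interval for the Group A slope would need the covariance between `week` and `week*trt A`, which the output shown does not include.

```python
import math

# Estimates from the interaction model (reference group is P).
WEEK, WEEK_BY_TRT_A = 0.1233, -0.3452
WEEK_CI = (-0.0276, 0.2742)  # printed 95% limits for the week coefficient

slope_p = WEEK                  # logit slope per week in Group P
slope_a = WEEK + WEEK_BY_TRT_A  # logit slope per week in Group A

or_per_week_p = math.exp(slope_p)
or_per_week_a = math.exp(slope_a)
or_ci_p = tuple(math.exp(limit) for limit in WEEK_CI)

# exp(interaction) is the ratio of the two per-week odds ratios (A relative to P).
ratio_of_ors = math.exp(WEEK_BY_TRT_A)

print(f"Group P OR per week: {or_per_week_p:.2f} "
      f"(95% CI {or_ci_p[0]:.2f} to {or_ci_p[1]:.2f})")
print(f"Group A OR per week: {or_per_week_a:.2f}")
print(f"Ratio of per-week ORs, A vs P: {ratio_of_ors:.2f}")
```

The placebo interval includes 1, which matches the non-significant p-value for `week`.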
TLC Data
Comparisons of observed and predicted probabilities (predicted in parentheses).
GEE model with trt and week as main effects plus the trt-by-week interaction:
Days   Group A        Group P
 7     0.78 (0.81)    0.16 (0.17)
28     0.76 (0.69)    0.26 (0.23)
42     0.54 (0.59)    0.26 (0.28)
GEE model with main effects only:
Days   Group A        Group P
 7     0.78 (0.72)    0.16 (0.25)
28     0.76 (0.69)    0.26 (0.22)
42     0.54 (0.66)    0.26 (0.20)
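One way to summarize the comparison is the largest absolute gap between observed and predicted proportions under each model. The sketch below is in Python rather than SAS; the observed proportions and coefficients are the printed ones, while the summary statistic itself is only an illustrative choice and not something the SAS output reports.

```python
import math

observed = {("A", 1): 0.78, ("A", 4): 0.76, ("A", 6): 0.54,
            ("P", 1): 0.16, ("P", 4): 0.26, ("P", 6): 0.26}


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def main_effects(trt, week):
    """Prediction from the model with trt and week only."""
    return inv_logit(-1.0402 + (2.0654 if trt == "A" else 0.0) - 0.0613 * week)


def interaction(trt, week):
    """Prediction from the model with trt, week, and trt*week."""
    is_a = 1.0 if trt == "A" else 0.0
    return inv_logit(-1.6952 + 3.3776 * is_a + (0.1233 - 0.3452 * is_a) * week)


def max_abs_gap(model):
    """Largest |observed - predicted| over the six group-by-week cells."""
    return max(abs(model(trt, week) - obs) for (trt, week), obs in observed.items())


gap_main = max_abs_gap(main_effects)
gap_interaction = max_abs_gap(interaction)
print(f"main effects only: {gap_main:.3f}")
print(f"with interaction:  {gap_interaction:.3f}")
```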
GEE2
R(α) is the working correlation matrix containing unknown
parameter α. If we can write V=Wα, then we can include a second
set of estimating equations for α.
Second-order generalized estimating equation (GEE2)
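Using correlation coefficient for binary data
$$\text{Let } \bar{Y}_1 = \frac{\sum_{i=1}^{n} Y_{i1}}{n} \quad \text{and} \quad \bar{Y}_2 = \frac{\sum_{i=1}^{n} Y_{i2}}{n}$$
$$\operatorname{Corr}(Y_{i1}, Y_{i2}) = \frac{\sum_{i=1}^{n} (Y_{i1} - \bar{Y}_1)(Y_{i2} - \bar{Y}_2)}{n S_1 S_2} = \frac{\sum_{i=1}^{n} Y_{i1} Y_{i2} - n \bar{Y}_1 \bar{Y}_2}{n \sqrt{\bar{Y}_1 (1 - \bar{Y}_1)\, \bar{Y}_2 (1 - \bar{Y}_2)}} \le \frac{\min(\bar{Y}_1, \bar{Y}_2) - \bar{Y}_1 \bar{Y}_2}{\sqrt{\bar{Y}_1 (1 - \bar{Y}_1)\, \bar{Y}_2 (1 - \bar{Y}_2)}}$$
When $\bar{Y}_1 = 0.2$ and $\bar{Y}_2 = 0.8$, corr $\le$ 0.25, so the correlation between two binary outcomes is constrained by their means.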
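The bound is easy to evaluate for any pair of marginal proportions. A minimal sketch in Python (the function name `max_binary_corr` is illustrative), assuming the upper bound has min(p1, p2) - p1*p2 in the numerator and the product of the two binary standard deviations in the denominator:

```python
import math


def max_binary_corr(p1, p2):
    """Upper bound on the correlation of two binary variables with means p1 and p2."""
    numerator = min(p1, p2) - p1 * p2
    denominator = math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))
    return numerator / denominator


for p1, p2 in ((0.2, 0.8), (0.5, 0.5), (0.1, 0.5)):
    print(p1, p2, round(max_binary_corr(p1, p2), 4))
```

The bound reaches 1 only when the two means are equal, which is one motivation for describing the association between binary outcomes with an odds ratio instead.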
$$OR(Y_j, Y_k) = \frac{\Pr(Y_j = 1, Y_k = 1)}{\Pr(Y_j = 0, Y_k = 1)} \bigg/ \frac{\Pr(Y_j = 1, Y_k = 0)}{\Pr(Y_j = 0, Y_k = 0)}$$
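As a small worked example of this definition, the sketch below computes the pairwise odds ratio from a joint distribution of two binary outcomes. It is in Python, and the joint probabilities are hypothetical values chosen for illustration, not estimates from any of the data sets here.

```python
def pairwise_odds_ratio(p11, p10, p01, p00):
    """Odds ratio between Y_j and Y_k, where p_ab = Pr(Y_j = a, Y_k = b)."""
    return (p11 / p01) / (p10 / p00)


# Hypothetical joint probabilities with strong positive association.
associated = pairwise_odds_ratio(p11=0.4, p10=0.1, p01=0.1, p00=0.4)

# Independent outcomes with Pr(Y = 1) = 0.5 each give an odds ratio of 1.
independent = pairwise_odds_ratio(p11=0.25, p10=0.25, p01=0.25, p00=0.25)

print(associated, independent)
```

Unlike the correlation, the odds ratio is not bounded by the marginal means: any positive value is compatible with any pair of marginal probabilities strictly between 0 and 1.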
Alternating Logistic Regressions (ALR) using GEE2
Let $\gamma_{ijk}$ be the log OR between the pair of binary outcomes $(Y_{ij}, Y_{ik})$ measured on the same subject $i$.
The ALR algorithm models the log OR with:
$$\gamma_{ijk} = Z_{ijk}' \alpha$$
The vector α is now also estimated in the GEE iterative algorithm, in addition to the regression parameter β.
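To make the role of the design vector concrete, the sketch below builds one possible choice of Z in which every pair of visits gets its own indicator, so each pair has its own log odds ratio. It is written in Python; the function names and the α values are hypothetical and only illustrate the linear form, not the ALR estimation itself.

```python
from itertools import combinations


def pair_design(n_visits):
    """One indicator row per visit pair (j, k), j < k, in lexicographic order."""
    pairs = list(combinations(range(1, n_visits + 1), 2))
    rows = {pair: [1.0 if col == pos else 0.0 for col in range(len(pairs))]
            for pos, pair in enumerate(pairs)}
    return pairs, rows


def log_odds_ratio(z_row, alpha):
    """gamma = z' alpha for a single pair of visits."""
    return sum(z * a for z, a in zip(z_row, alpha))


pairs, Z = pair_design(3)
alpha = [0.5, 1.0, 1.5]  # hypothetical values, one per pair
gamma = {pair: log_odds_ratio(Z[pair], alpha) for pair in pairs}
print(gamma)
```

Setting every row of Z to the single entry `[1.0]` would instead force one common log odds ratio for all pairs, the odds-ratio analogue of an exchangeable structure.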
Respiratory Disease Example
• Clinical trial data comparing two treatments for a respiratory disorder.
• Patients in each of two centers are randomly assigned to groups receiving the active treatment or a placebo.
• The same ID values are reused in both centers, so an ID identifies a patient only within a center.
• During treatment, respiratory status, represented by the variable outcome (coded as 0=poor, 1=good), is determined at each of four visits.
Respiratory Disease Data
SAS file name: SAS demo GEE binary
center  id  treatment  sex  age  baseline  visit  outcome
1       1   P          M    46   0         1      0
1       1   P          M    46   0         2      0
1       1   P          M    46   0         3      0
1       1   P          M    46   0         4      0
1       2   P          M    28   0         1      0
1       2   P          M    28   0         2      0
1       2   P          M    28   0         3      0
1       2   P          M    28   0         4      0
1       3   A          M    23   1         1      1
1       3   A          M    23   1         2      1
1       3   A          M    23   1         3      1
1       3   A          M    23   1         4      1
SAS Codes
/* GEE with an unstructured working correlation */
proc genmod data=resp descend;
  class id treatment(ref="P") center(ref="1") sex(ref="M")
        baseline(ref="0") / param=ref;
  model outcome=treatment center sex age baseline / dist=bin;
  repeated subject=id(center) / corr=unstr corrw;
run;
/* GEE2: alternating logistic regressions with a separate log OR for each pair of visits */
proc genmod data=resp descend;
  class id treatment(ref="P") center(ref="1") sex(ref="M")
        baseline(ref="0") / param=ref;
  model outcome=treatment center sex age baseline / dist=bin;
  repeated subject=id(center) / logor=fullclust;
run;
In this study, the same IDs are reused in each of the two centers.
Specifying subject=id(center) tells SAS that observations with the same ID but different centers belong to different subjects. This saves us from creating new unique IDs.
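The effect of nesting the ID within center can be seen by counting clusters both ways. A minimal sketch in Python, using a few hypothetical rows in which both centers number their patients from 1:

```python
# Hypothetical rows: each center numbers its own patients starting at 1.
rows = [
    {"center": 1, "id": 1, "visit": 1}, {"center": 1, "id": 1, "visit": 2},
    {"center": 1, "id": 2, "visit": 1}, {"center": 1, "id": 2, "visit": 2},
    {"center": 2, "id": 1, "visit": 1}, {"center": 2, "id": 1, "visit": 2},
    {"center": 2, "id": 2, "visit": 1}, {"center": 2, "id": 2, "visit": 2},
]

# Clustering on id alone merges patients from different centers.
clusters_by_id = {row["id"] for row in rows}

# Clustering on (center, id) keeps them apart, like subject=id(center).
clusters_nested = {(row["center"], row["id"]) for row in rows}

print(len(clusters_by_id), len(clusters_nested))
```

With `id` alone, the four hypothetical patients would collapse into two clusters of four observations each, and observations from different people would be treated as correlated.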
SAS demo
• GEE with unstructured correlation
• GEE2 with alternating logistic regressions
The GENMOD Procedure
Model Information
Data Set                       WORK.RESP
Distribution                   Binomial
Link Function                  Logit
Dependent Variable             outcome
Number of Observations Read    444
Number of Observations Used    444
Number of Events               248
Number of Trials               444
Class Level Information
Class       Value   Design Variables
treatment   A       1
            P       0
center      1       0
            2       1
sex         F       1
            M       0
baseline    0       0
            1       1
Response Profile
Ordered Value   outcome   Total Frequency
1               1         248
2               0         196
PROC GENMOD is modeling the probability that outcome='1'.
Parameter Information
Parameter   Effect      treatment   center   sex   baseline
Prm1        Intercept
Prm2        treatment   A
Prm3        center                  2
Prm4        sex                              F
Prm5        age
Prm6        baseline                               1
Algorithm converged.
GEE Model Information
Correlation Structure          Unstructured
Subject Effect                 id(center) (111 levels)
Number of Clusters             111
Correlation Matrix Dimension   4
Maximum Cluster Size           4
Minimum Cluster Size           4
Algorithm converged.
Working Correlation Matrix
        Col1     Col2     Col3     Col4
Row1    1.0000   0.3351   0.2140   0.2953
Row2    0.3351   1.0000   0.4429   0.3581
Row3    0.2140   0.4429   1.0000   0.3964
Row4    0.2953   0.3581   0.3964   1.0000
GEE Fit Criteria
QIC     512.3416
QICu    499.6081
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter      Estimate   Standard Error   95% Confidence Limits   Z       Pr > |Z|
Intercept      -0.8882    0.4568           -1.7835    0.0071       -1.94   0.0519
treatment A     1.2442    0.3455            0.5669    1.9214        3.60   0.0003
center 2        0.6558    0.3512           -0.0326    1.3442        1.87   0.0619
sex F           0.1128    0.4408           -0.7512    0.9768        0.26   0.7981
age            -0.0175    0.0129           -0.0427    0.0077       -1.36   0.1728
baseline 1      1.8981    0.3441            1.2237    2.5725        5.52   <.0001
QIC: quasi-likelihood under the independence model criterion.
Smaller is better.
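Because the model uses a logit link, exponentiating a coefficient and its confidence limits gives an odds ratio with its 95% interval. The sketch below (Python rather than SAS, names illustrative) does this for two rows of the unstructured-correlation GEE table above, using the printed estimates and limits.

```python
import math

# (estimate, lower 95% limit, upper 95% limit) from the unstructured GEE fit.
logit_scale = {
    "treatment A vs P": (1.2442, 0.5669, 1.9214),
    "baseline 1 vs 0": (1.8981, 1.2237, 2.5725),
}

odds_ratios = {name: tuple(math.exp(value) for value in values)
               for name, values in logit_scale.items()}

for name, (estimate, lower, upper) in odds_ratios.items():
    print(f"{name}: OR={estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Both intervals exclude 1, in line with the small p-values for treatment and baseline status.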
GEE Model Information
Log Odds Ratio Structure       Fully Parameterized Clusters
Subject Effect                 id(center) (111 levels)
Number of Clusters             111
Correlation Matrix Dimension   4
Maximum Cluster Size           4
Minimum Cluster Size           4
Log Odds Ratio Parameter Information
Parameter   Group
Alpha1      (1, 2)
Alpha2      (1, 3)
Alpha3      (1, 4)
Alpha4      (2, 3)
Alpha5      (2, 4)
Alpha6      (3, 4)
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter      Estimate   Standard Error   95% Confidence Limits   Z       Pr > |Z|
Intercept      -0.9266    0.4513           -1.8111   -0.0421       -2.05   0.0400
treatment A     1.2611    0.3406            0.5934    1.9287        3.70   0.0002
center 2        0.6287    0.3486           -0.0545    1.3119        1.80   0.0713
sex F           0.1024    0.4362           -0.7526    0.9575        0.23   0.8144
age            -0.0162    0.0125           -0.0407    0.0084       -1.29   0.1977
baseline 1      1.8980    0.3404            1.2308    2.5652        5.58   <.0001
Alpha1          1.6109    0.4892            0.6522    2.5696        3.29   0.0010
Alpha2          1.0771    0.4834            0.1297    2.0246        2.23   0.0259
Alpha3          1.5875    0.4735            0.6594    2.5155        3.35   0.0008
Alpha4          2.1224    0.5022            1.1381    3.1068        4.23   <.0001
Alpha5          1.8818    0.4686            0.9634    2.8001        4.02   <.0001
Alpha6          2.1046    0.4949            1.1347    3.0745        4.25   <.0001
Results depend on the correlation structure assumed
               Unstructured (corr=unstr)              ALR (logor=fullclust)
Parameter      Estimate   Std Error   Pr > |Z|      Estimate   Std Error   Pr > |Z|
Intercept      -0.8882    0.4568      0.0519        -0.9266    0.4513      0.0400
treatment A     1.2442    0.3455      0.0003         1.2611    0.3406      0.0002
center 2        0.6558    0.3512      0.0619         0.6287    0.3486      0.0713
sex F           0.1128    0.4408      0.7981         0.1024    0.4362      0.8144
age            -0.0175    0.0129      0.1728        -0.0162    0.0125      0.1977
baseline 1      1.8981    0.3441      <.0001         1.8980    0.3404      <.0001
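Log odds ratio structure
$$OR(Y_j, Y_k) = \frac{\Pr(Y_j = 1, Y_k = 1)}{\Pr(Y_j = 0, Y_k = 1)} \bigg/ \frac{\Pr(Y_j = 1, Y_k = 0)}{\Pr(Y_j = 0, Y_k = 0)}$$
$$\text{Alpha1} = \log OR(Y_1, Y_2) = 1.6109, \qquad OR(Y_1, Y_2) = e^{1.6109} \approx 5.01$$
=> having a good outcome at visit 1 (Y_1 = 1) is positively associated with having a good outcome at visit 2.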
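The Alpha parameters in the ALR output are on the log odds ratio scale, so exponentiating them gives the pairwise odds ratios between visits. A minimal sketch in Python using the six printed estimates (the dictionary layout is only for illustration):

```python
import math

# Alpha1..Alpha6 from the ALR fit, keyed by the pair of visits they describe.
alpha = {(1, 2): 1.6109, (1, 3): 1.0771, (1, 4): 1.5875,
         (2, 3): 2.1224, (2, 4): 1.8818, (3, 4): 2.1046}

odds_ratio = {pair: math.exp(value) for pair, value in alpha.items()}

for (j, k), value in odds_ratio.items():
    print(f"visits {j} and {k}: OR = {value:.2f}")

strongest = max(odds_ratio, key=odds_ratio.get)
weakest = min(odds_ratio, key=odds_ratio.get)
print("strongest association:", strongest, " weakest:", weakest)
```

All six odds ratios are above 1, so a good outcome at any visit goes with higher odds of a good outcome at every other visit.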
Log linear model for epileptic seizure episodes
• The data consist of the number of epileptic seizures in an eight-week baseline period, before any treatment,
• and in each of four two-week treatment periods, in which patients received either a placebo or the drug Progabide in addition to other therapy.
Trt=0: placebo
Trt=1: Progabide
SAS file name: SAS demo GEE Poisson
Obs  ID   Count  Visit  Trt  Age  Weeks
1    104  11     0      0    31   8
2    104  5      1      0    31   2
3    104  3      2      0    31   2
4    104  3      3      0    31   2
5    104  3      4      0    31   2
6    106  11     0      0    30   8
7    106  3      1      0    30   2
8    106  5      2      0    30   2
9    106  3      3      0    30   2
10   106  3      4      0    30   2
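/*** Exclude an outlier (ID 207) and create the offset variable ***/
data Seizure;
  set Seizure;
  if ID ne 207;
  if Visit = 0 then do;
    X1=0;            /* baseline period */
    Ltime = log(8);  /* 8 weeks of observation */
  end;
  else do;
    X1=1;            /* treatment period */
    Ltime=log(2);    /* 2 weeks of observation */
  end;
run;
proc print ; run ;
/* Poisson GEE with a log(weeks) offset and an exchangeable working correlation */
proc genmod data=Seizure;
  class id;
  model count=x1 | trt / d=poisson offset=ltime;
  repeated subject=id / corrw covb type=exch;
run;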
Obs  ID   Count  Visit  Trt  Age  Weeks  X1  Ltime
1    104  11     0      0    31   8      0   2.07944
2    104  5      1      0    31   2      1   0.69315
3    104  3      2      0    31   2      1   0.69315
4    104  3      3      0    31   2      1   0.69315
5    104  3      4      0    31   2      1   0.69315
6    106  11     0      0    30   8      0   2.07944
7    106  3      1      0    30   2      1   0.69315
8    106  5      2      0    30   2      1   0.69315
9    106  3      3      0    30   2      1   0.69315
10   106  3      4      0    30   2      1   0.69315
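The offset is what turns a model for counts into a model for rates: with a log link, adding log(weeks) to the linear predictor multiplies the expected count by the number of weeks observed. The sketch below (Python rather than SAS, with a hypothetical weekly rate and illustrative function names) reproduces the two Ltime values and shows the scaling.

```python
import math


def offset(visit):
    """log(weeks observed): 8 weeks at baseline (visit 0), 2 weeks otherwise."""
    return math.log(8) if visit == 0 else math.log(2)


def expected_count(rate_per_week, visit):
    """E[count] = exp(log(rate) + offset) = rate * weeks under the log link."""
    return math.exp(math.log(rate_per_week) + offset(visit))


print(round(offset(0), 5), round(offset(1), 5))

# Hypothetical rate of 1.5 seizures per week, held constant across periods.
print(expected_count(1.5, visit=0), expected_count(1.5, visit=1))
```

Without the offset, the eight-week baseline count and the two-week treatment counts would not be comparable, and X1 would mostly reflect the difference in period length.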
The GENMOD Procedure
Model Information
Data Set                       WORK.SEIZURE
Distribution                   Poisson
Link Function                  Log
Dependent Variable             Count
Offset Variable                Ltime
Number of Observations Read    290
Number of Observations Used    290
Class Level Information
Class   Levels   Values
ID      58       101 102 103 104 106 107 108 110 111 112 113 114 116 117 118 121 122 123
                 124 126 128 129 130 135 137 139 141 143 145 147 201 202 203 204 205 206
                 208 209 210 211 213 214 215 217 218 219 220 221 222 225 226 227 228 230
                 232 234 236 238
Parameter Information
Parameter   Effect
Prm1        Intercept
Prm2        X1
Prm3        Trt
Prm4        X1*Trt
These are covariance matrices for the beta parameters.
Covariance Matrix (Model-Based)
        Prm1        Prm2        Prm3        Prm4
Prm1     0.01223     0.001520   -0.01223    -0.001520
Prm2     0.001520    0.01519    -0.001520   -0.01519
Prm3    -0.01223    -0.001520    0.02495     0.005427
Prm4    -0.001520   -0.01519     0.005427    0.03748
Covariance Matrix (Empirical)
        Prm1        Prm2        Prm3        Prm4
Prm1     0.02476    -0.001152   -0.02476     0.001152
Prm2    -0.001152    0.01348     0.001152   -0.01348
Prm3    -0.02476     0.001152    0.03751    -0.002999
Prm4     0.001152   -0.01348    -0.002999    0.02931
Algorithm converged.
Working Correlation Matrix
        Col1     Col2     Col3     Col4     Col5
Row1    1.0000   0.5941   0.5941   0.5941   0.5941
Row2    0.5941   1.0000   0.5941   0.5941   0.5941
Row3    0.5941   0.5941   1.0000   0.5941   0.5941
Row4    0.5941   0.5941   0.5941   1.0000   0.5941
Row5    0.5941   0.5941   0.5941   0.5941   1.0000
Exchangeable Working Correlation
Correlation   0.5941485833
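The covariance matrices for the beta parameters are useful beyond the standard errors on their diagonals: they also give the standard error of any linear combination of coefficients. The sketch below (Python rather than SAS, helper names illustrative) takes the printed empirical covariance matrix, recovers the empirical standard errors, and computes the standard error of X1 + X1*Trt, the log rate ratio for treatment period versus baseline in the Progabide group.

```python
import math

names = ["Intercept", "X1", "Trt", "X1*Trt"]  # Prm1..Prm4
cov_empirical = [
    [0.02476, -0.001152, -0.02476, 0.001152],
    [-0.001152, 0.01348, 0.001152, -0.01348],
    [-0.02476, 0.001152, 0.03751, -0.002999],
    [0.001152, -0.01348, -0.002999, 0.02931],
]

# Standard errors are the square roots of the diagonal entries.
se = {name: math.sqrt(cov_empirical[i][i]) for i, name in enumerate(names)}


def linear_combination_se(weights, cov):
    """sqrt(w' V w) for weights w and covariance matrix V."""
    variance = sum(weights[i] * weights[j] * cov[i][j]
                   for i in range(len(weights)) for j in range(len(weights)))
    return math.sqrt(variance)


# Weights pick out X1 + X1*Trt.
se_change_progabide = linear_combination_se([0.0, 1.0, 0.0, 1.0], cov_empirical)

print({name: round(value, 4) for name, value in se.items()})
print(round(se_change_progabide, 4))
```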
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter    Estimate   Standard Error   95% Confidence Limits   Z       Pr > |Z|
Intercept     1.3476    0.1574            1.0392    1.6560        8.56   <.0001
X1            0.1108    0.1161           -0.1168    0.3383        0.95   0.3399
Trt          -0.1080    0.1937           -0.4876    0.2716       -0.56   0.5770
X1*Trt       -0.3016    0.1712           -0.6371    0.0339       -1.76   0.0781
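With a log link and a log(weeks) offset, exponentiated coefficients are rates per week and rate ratios. The sketch below (Python rather than SAS, variable names illustrative) converts the printed estimates; the interpretation assumes the coding used in the data step, with X1=0 for the baseline period and Trt=0 for placebo.

```python
import math

INTERCEPT, X1, TRT, X1_TRT = 1.3476, 0.1108, -0.1080, -0.3016

# Baseline seizure rate per week in the placebo group.
baseline_rate_placebo = math.exp(INTERCEPT)

# Rate ratio, treatment period vs baseline, within each group.
rr_placebo = math.exp(X1)
rr_progabide = math.exp(X1 + X1_TRT)

# The interaction is the ratio of those two rate ratios (Progabide vs placebo).
rr_ratio = math.exp(X1_TRT)

print(f"placebo baseline rate per week: {baseline_rate_placebo:.2f}")
print(f"placebo   treatment/baseline rate ratio: {rr_placebo:.3f}")
print(f"progabide treatment/baseline rate ratio: {rr_progabide:.3f}")
print(f"ratio of rate ratios: {rr_ratio:.3f}")
```

The ratio of rate ratios is below 1, but its p-value of 0.0781 means the difference in change between the two groups is not significant at the 0.05 level in this fit.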
Genmod working correlation matrix
User defined correlation matrix (not in proc mixed):
New Homework Assignment
• Problem 13.1, due next Wednesday, April 15.
• Submit to canvas

Week 10 GEE Data Examples v2.pptx

  • 3. SAS Codes to Create a Binary Outcome SAS file name: SAS demo TLC genmod data tlc ; set ala.tlc ; y=y0 ; time=0 ; week=0 ; output ; y=y1 ; time=1 ; week=1 ; output ; y=y4 ; time=2 ; week=4 ; output ; y=y6 ; time=3 ; week=6 ; output ; run ; data tlc ; set tlc ; if week=0 then delete ; if y>=20 then lead_normal=0 ; if y ne . and y < 20 then lead_normal=1 ; run ; proc print ; run ; Note: the event/success is normal blood lead level
  • 4. id trt y0 y1 y4 y6 y time week lead_normal 1 P 30.8 26.9 25.8 23.8 26.9 1 1 0 1 P 30.8 26.9 25.8 23.8 25.8 2 4 0 1 P 30.8 26.9 25.8 23.8 23.8 3 6 0 2 A 26.5 14.8 19.5 21.0 14.8 1 1 1 2 A 26.5 14.8 19.5 21.0 19.5 2 4 1 2 A 26.5 14.8 19.5 21.0 21.0 3 6 0 3 A 25.8 23.0 19.1 23.2 23.0 1 1 0 3 A 25.8 23.0 19.1 23.2 19.1 2 4 1 3 A 25.8 23.0 19.1 23.2 23.2 3 6 0
  • 5. TLC Data Days Group A Group P 7 0.78 0.16 28 0.76 0.26 42 0.54 0.26 Blood lead levels were repeatedly measured in the TLC trial data. Binary outcome: blood lead level < 20 μg/dL (no lead poisoning) Percent of no lead poisoning in the two groups:
  • 6. TLC Data (Continuous Lead Level)
  • 7. TLC Data (continuous lead level by group)
  • 8. TLC Data (binary: normal lead by group)
  • 9.
  • 10. data tlc ; set ala.tlc ; y=y0 ; time=0 ; week=0 ; output ; y=y1 ; time=1 ; week=1 ; output ; y=y4 ; time=2 ; week=4 ; output ; y=y6 ; time=3 ; week=6 ; output ; run ; data tlc ; set tlc ; if week=0 then delete ; if y>=20 then lead_normal=0 ; if y ne . and y < 20 then lead_normal=1 ; run ; proc genmod data=tlc descending ; class id trt ; model lead_normal =trt week / d=bin link=logit ; repeated subject=id / type=exch corrw modelse ; output out=pprobs p=pred xbeta=xbeta ; run ; Note: Genmod default is to use empirical (i.e. robust) standard error estimates. I used the “modelse” option to show the difference between empirical and model-based results.
  • 11. GEE Model Information Correlation Structure Exchangeable Subject Effect id (100 levels) Number of Clusters 100 Correlation Matrix Dimension 3 Maximum Cluster Size 3 Minimum Cluster Size 3 Algorithm converged. Working Correlation Matrix Col1 Col2 Col3 Row1 1.0000 0.4622 0.4622 Row2 0.4622 1.0000 0.4622 Row3 0.4622 0.4622 1.0000 Exchangeable Working Correlation Correlation 0.4621656646 Analysis Of GEE Parameter Estimates Empirical Standard Error Estimates Parameter Estimate Standard Error 95% Confidence Limits Z Pr > |Z| Intercept -1.0402 0.2839 -1.5966 -0.4838 -3.66 0.0002 trt A 2.0654 0.3706 1.3391 2.7918 5.57 <.0001 trt P 0.0000 0.0000 0.0000 0.0000 . . week -0.0613 0.0522 -0.1635 0.0409 -1.18 0.2399
  • 12. Analysis Of GEE Parameter Estimates (Empirical Standard Error Estimates)
          Parameter  Estimate  Std Error  95% Confidence Limits  Z      Pr > |Z|
          Intercept  -1.0402   0.2839     -1.5966  -0.4838       -3.66  0.0002
          trt A       2.0654   0.3706      1.3391   2.7918        5.57  <.0001
          trt P       0.0000   0.0000      0.0000   0.0000        .     .
          week       -0.0613   0.0522     -0.1635   0.0409       -1.18  0.2399
        Analysis Of GEE Parameter Estimates (Model-Based Standard Error Estimates)
          Parameter  Estimate  Std Error  95% Confidence Limits  Z      Pr > |Z|
          Intercept  -1.0402   0.3150     -1.6575  -0.4229       -3.30  0.0010
          trt A       2.0654   0.3677      1.3447   2.7862        5.62  <.0001
          trt P       0.0000   0.0000      0.0000   0.0000        .     .
          week       -0.0613   0.0471     -0.1536   0.0310       -1.30  0.1930
          Scale       1.0000   .           .        .             .     .
  • 13. TLC Data: observed and predicted proportions of normal lead level in the two groups (predicted in parentheses)
       Days  Group A      Group P
       7     0.78 (0.72)  0.16 (0.25)
       28    0.76 (0.69)  0.26 (0.22)
       42    0.54 (0.66)  0.26 (0.20)
       Note the differences between observed and predicted proportions in the treatment group. This is because the model we fit was "main effect" only, which assumes that the treatment effect (on the logit scale) is the same at every time point.
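The predicted proportions in parentheses can be reproduced by hand from the fitted coefficients by applying the inverse logit to the linear predictor. The following is a sketch only: the coefficients are copied from the empirical-SE table of the main-effects model, and the variable names (`b0`, `b_trtA`, `b_week`, `eta`, `p`) are illustrative, not taken from the slides.

```sas
/* Sketch: predicted probabilities from the main-effects GEE fit.     */
/* Coefficients copied from the parameter estimates table; variable   */
/* names are hypothetical.                                            */
data pred_main;
  b0 = -1.0402; b_trtA = 2.0654; b_week = -0.0613;
  do trtA = 1, 0;                    /* 1 = group A, 0 = group P */
    do week = 1, 4, 6;               /* days 7, 28, 42 */
      eta = b0 + b_trtA*trtA + b_week*week;   /* linear predictor (logit) */
      p   = 1 / (1 + exp(-eta));              /* inverse logit */
      output;
    end;
  end;
run;

proc print data=pred_main noobs;
  var trtA week eta p;
  format eta p 6.3;
run;
```

For group A at week 1 the linear predictor is -1.0402 + 2.0654 - 0.0613 = 0.9639, which gives a probability of about 0.72, in line with the table.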
  • 14. TLC Data: Adding an Interaction
        proc genmod data=tlc descending ;
          class id trt ;
          model lead_normal = trt week trt*week / d=bin link=logit ;
          repeated subject=id / type=exch corrw ;
          output out=pprobs p=pred xbeta=xbeta ;
        run ;
  • 15. GEE Model Information
          Correlation Structure         Exchangeable
          Subject Effect                id (100 levels)
          Number of Clusters            100
          Correlation Matrix Dimension  3
          Maximum Cluster Size          3
          Minimum Cluster Size          3
        Algorithm converged.
        Working Correlation Matrix
                Col1    Col2    Col3
          Row1  1.0000  0.4784  0.4784
          Row2  0.4784  1.0000  0.4784
          Row3  0.4784  0.4784  1.0000
        Exchangeable Working Correlation
          Correlation  0.4783943345
        Analysis Of GEE Parameter Estimates (Empirical Standard Error Estimates)
          Parameter   Estimate  Std Error  95% Confidence Limits  Z      Pr > |Z|
          Intercept   -1.6952   0.3935     -2.4665  -0.9239       -4.31  <.0001
          week         0.1233   0.0770     -0.0276   0.2742        1.60  0.1091
          trt A        3.3776   0.5711      2.2583   4.4970        5.91  <.0001
          trt P        0.0000   0.0000      0.0000   0.0000        .     .
          week*trt A  -0.3452   0.1045     -0.5500  -0.1404       -3.30  0.0010
          week*trt P   0.0000   0.0000      0.0000   0.0000        .     .
  • 16. Group P: $\operatorname{logit}(\mu_{ij}) = -1.6952 + 0.1233 \cdot week$
        Group A: $\operatorname{logit}(\mu_{ij}) = -1.6952 + 3.3776 + 0.1233 \cdot week - 0.3452 \cdot week = 1.6824 - 0.2219 \cdot week$
        Thus, in the placebo group (group P), the odds of having a normal lead level go up over time (although not reaching significance at the 0.05 level): OR per week = exp(0.1233) = 1.13.
        But in the treatment group (group A), the odds of having a normal lead level go down over time: OR per week = exp(-0.2219) = 0.80.
        The change in odds over time differs significantly between the two groups (p=0.0010).
        Here the odds are defined as $\text{odds} = \dfrac{\Pr(\text{blood lead} < 20)}{\Pr(\text{blood lead} \ge 20)}$.
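The per-week odds ratios follow from exponentiating the week slope in each group: the `week` coefficient alone for group P, and the sum of the `week` and `week*trt A` coefficients for group A. A small sketch of that arithmetic, with illustrative variable names and the coefficients copied from the interaction model output:

```sas
/* Sketch: per-week odds ratios implied by the interaction model.    */
/* b_week and b_int are copied from the GEE parameter estimates;     */
/* the variable names are hypothetical.                              */
data _null_;
  b_week = 0.1233;                   /* week slope in group P (reference) */
  b_int  = -0.3452;                  /* week*trt A interaction */
  or_week_P = exp(b_week);           /* about 1.13 */
  or_week_A = exp(b_week + b_int);   /* about 0.80 */
  put or_week_P= 6.2 or_week_A= 6.2;
run;
```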
  • 17. TLC Data: comparison of observed and predicted probabilities (in parentheses)
        Predictions from the GEE model with trt and week as main effects plus the trt by week interaction:
          Days  Group A      Group P
          7     0.78 (0.81)  0.16 (0.17)
          28    0.76 (0.69)  0.26 (0.23)
          42    0.54 (0.59)  0.26 (0.28)
        Predictions from the main-effects-only model:
          Days  Group A      Group P
          7     0.78 (0.72)  0.16 (0.25)
          28    0.76 (0.69)  0.26 (0.22)
          42    0.54 (0.66)  0.26 (0.20)
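One plausible way to build such a comparison is to average the observed indicator and the predicted probability by group and week in the data set written by the OUTPUT statement of the GENMOD call. This is a sketch under that assumption; note that both GENMOD calls on the slides write to the same data set `pprobs`, so it holds the predictions of whichever model was run last.

```sas
/* Sketch: observed vs. predicted proportions by group and week.      */
/* Assumes pprobs was created by "output out=pprobs p=pred" and so    */
/* contains trt, week, lead_normal and the predicted mean pred.       */
proc means data=pprobs mean maxdec=2;
  class trt week;
  var lead_normal pred;
run;
```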
  • 18. Second-order generalized estimating equations (GEE2)
        R(α) is the working correlation matrix containing the unknown parameter α. If we can write V=Wα, then we can include a second set of estimating equations for α.
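  • 20. Using the correlation coefficient for binary data
        Let $\bar{Y}_1 = \frac{1}{n}\sum_{i=1}^{n} Y_{i1}$ and $\bar{Y}_2 = \frac{1}{n}\sum_{i=1}^{n} Y_{i2}$. Then
        $$\mathrm{Corr}(Y_{i1}, Y_{i2}) = \frac{\sum_{i=1}^{n}(Y_{i1}-\bar{Y}_1)(Y_{i2}-\bar{Y}_2)}{n S_1 S_2} = \frac{\sum_{i=1}^{n} Y_{i1}Y_{i2} - n\bar{Y}_1\bar{Y}_2}{n\sqrt{\bar{Y}_1(1-\bar{Y}_1)\,\bar{Y}_2(1-\bar{Y}_2)}} \le \frac{\min(\bar{Y}_1, \bar{Y}_2) - \bar{Y}_1\bar{Y}_2}{\sqrt{\bar{Y}_1(1-\bar{Y}_1)\,\bar{Y}_2(1-\bar{Y}_2)}}$$
        When $\bar{Y}_1 = 0.2$ and $\bar{Y}_2 = 0.8$, the bound is 0.04/0.16 = 0.25, so the correlation cannot exceed 0.25.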
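The upper bound is easy to evaluate for any pair of marginal proportions. The sketch below plugs in the slide's values of 0.2 and 0.8; the variable names `p1`, `p2` and `upper` are illustrative.

```sas
/* Sketch: upper bound on the correlation of two binary outcomes,    */
/* given only their marginal proportions p1 and p2.                  */
data _null_;
  p1 = 0.2;
  p2 = 0.8;
  upper = (min(p1, p2) - p1*p2) / sqrt(p1*(1 - p1) * p2*(1 - p2));
  put upper= 6.4;                    /* about 0.25 for these margins */
run;
```

Because the attainable range of the correlation depends on the margins in this way, the correlation is an awkward association measure for binary outcomes, which motivates the odds ratio on the next slide.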
  • 21. $$\mathrm{OR}(Y_j, Y_k) = \frac{\Pr(Y_j=1, Y_k=1)\,/\,\Pr(Y_j=0, Y_k=1)}{\Pr(Y_j=1, Y_k=0)\,/\,\Pr(Y_j=0, Y_k=0)}$$
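As a small numeric illustration of this definition, the sketch below computes the odds ratio and its log from four joint cell probabilities. The probabilities are hypothetical values chosen only for illustration; they do not come from any data set on these slides.

```sas
/* Sketch: pairwise odds ratio from joint probabilities.             */
/* The cell probabilities are hypothetical, for illustration only.   */
data _null_;
  p11 = 0.4;   /* Pr(Yj=1, Yk=1) */
  p01 = 0.1;   /* Pr(Yj=0, Yk=1) */
  p10 = 0.1;   /* Pr(Yj=1, Yk=0) */
  p00 = 0.4;   /* Pr(Yj=0, Yk=0) */
  odds_ratio = (p11 / p01) / (p10 / p00);
  log_or     = log(odds_ratio);
  put odds_ratio= 8.3 log_or= 8.3;
run;
```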
  • 23. Alternating Logistic Regression (ALR) using GEE2
        Let $\gamma_{ijk}$ be the log OR between the pair of binary outcomes $Y_{ij}$ and $Y_{ik}$ within subject $i$. The ALR algorithm models the log OR with: $\gamma_{ijk} = Z_{ijk}'\alpha$
        The vector α is now also estimated in the GEE iterative algorithm, in addition to the regression parameter β.
  • 24. Respiratory Disease Example
       • Clinical trial data comparing two treatments for a respiratory disorder.
       • Patients in each of two centers are randomly assigned to groups receiving the active treatment or a placebo.
       • The same ID values are reused in each center.
       • During treatment, respiratory status, represented by the variable outcome (coded as 0=poor, 1=good), is determined at each of four visits.
  • 25. Respiratory Disease Data
        SAS file name: SAS demo GEE binary
        center  id  treatment  sex  age  baseline  visit  outcome
        1       1   P          M    46   0         1      0
        1       1   P          M    46   0         2      0
        1       1   P          M    46   0         3      0
        1       1   P          M    46   0         4      0
        1       2   P          M    28   0         1      0
        1       2   P          M    28   0         2      0
        1       2   P          M    28   0         3      0
        1       2   P          M    28   0         4      0
        1       3   A          M    23   1         1      1
        1       3   A          M    23   1         2      1
        1       3   A          M    23   1         3      1
        1       3   A          M    23   1         4      1
  • 26. SAS Codes
        /* GEE with unstructured working correlation */
        proc genmod data=resp descend;
          class id treatment(ref="P") center(ref="1") sex(ref="M") baseline(ref="0") / param=ref;
          model outcome=treatment center sex age baseline / dist=bin;
          repeated subject=id(center) / corr=unstr corrw;
        run;
        /* GEE2 with alternating logistic regression */
        proc genmod data=resp descend;
          class id treatment(ref="P") center(ref="1") sex(ref="M") baseline(ref="0") / param=ref;
          model outcome=treatment center sex age baseline / dist=bin;
          repeated subject=id(center) / logor=fullclust;
        run;
        In this study, the same ID values are reused in each of the two centers. So the code subject=id(center) tells SAS that subjects with the same ID but different centers are still different subjects. This saves us from creating new unique IDs.
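For comparison, the alternative that the nested `subject=id(center)` syntax avoids would be to build a unique subject identifier first. The following is only a sketch of that route; the data set name `resp_uid` and the variable `uid` are hypothetical.

```sas
/* Sketch: the alternative to subject=id(center).                    */
/* Build a unique subject key by combining center and id, then use   */
/* it as the subject effect. resp_uid and uid are hypothetical names.*/
data resp_uid;
  set resp;
  length uid $ 12;
  uid = catx('-', center, id);       /* e.g. "1-3" and "2-3" are distinct */
run;

proc genmod data=resp_uid descend;
  class uid treatment(ref="P") center(ref="1") sex(ref="M") baseline(ref="0") / param=ref;
  model outcome=treatment center sex age baseline / dist=bin;
  repeated subject=uid / corr=unstr corrw;
run;
```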
  • 27. SAS demo
       • GEE with unstructured correlation
       • GEE2 with alternating logistic regression
  • 28. The GENMOD Procedure
        Model Information
          Data Set            WORK.RESP
          Distribution        Binomial
          Link Function       Logit
          Dependent Variable  outcome
        Number of Observations Read  444
        Number of Observations Used  444
        Number of Events             248
        Number of Trials             444
        Class Level Information
          Class      Value  Design Variables
          treatment  A      1
                     P      0
          center     1      0
                     2      1
          sex        F      1
                     M      0
          baseline   0      0
                     1      1
        Response Profile
          Ordered Value  outcome  Total Frequency
          1              1        248
          2              0        196
        PROC GENMOD is modeling the probability that outcome='1'.
  • 29. Parameter Information
          Parameter  Effect
          Prm1       Intercept
          Prm2       treatment A
          Prm3       center 2
          Prm4       sex F
          Prm5       age
          Prm6       baseline 1
        Algorithm converged.
        GEE Model Information
          Correlation Structure         Unstructured
          Subject Effect                id(center) (111 levels)
          Number of Clusters            111
          Correlation Matrix Dimension  4
          Maximum Cluster Size          4
          Minimum Cluster Size          4
        Algorithm converged.
  • 30. Working Correlation Matrix
                Col1    Col2    Col3    Col4
          Row1  1.0000  0.3351  0.2140  0.2953
          Row2  0.3351  1.0000  0.4429  0.3581
          Row3  0.2140  0.4429  1.0000  0.3964
          Row4  0.2953  0.3581  0.3964  1.0000
        GEE Fit Criteria
          QIC   512.3416
          QICu  499.6081
        Analysis Of GEE Parameter Estimates (Empirical Standard Error Estimates)
          Parameter    Estimate  Std Error  95% Confidence Limits  Z      Pr > |Z|
          Intercept    -0.8882   0.4568     -1.7835   0.0071       -1.94  0.0519
          treatment A   1.2442   0.3455      0.5669   1.9214        3.60  0.0003
          center 2      0.6558   0.3512     -0.0326   1.3442        1.87  0.0619
          sex F         0.1128   0.4408     -0.7512   0.9768        0.26  0.7981
          age          -0.0175   0.0129     -0.0427   0.0077       -1.36  0.1728
          baseline 1    1.8981   0.3441      1.2237   2.5725        5.52  <.0001
        QIC: Quasi-likelihood Criterion. Smaller is better.
  • 31. GEE Model Information
          Log Odds Ratio Structure      Fully Parameterized Clusters
          Subject Effect                id(center) (111 levels)
          Number of Clusters            111
          Correlation Matrix Dimension  4
          Maximum Cluster Size          4
          Minimum Cluster Size          4
        Log Odds Ratio Parameter Information
          Parameter  Group
          Alpha1     (1, 2)
          Alpha2     (1, 3)
          Alpha3     (1, 4)
          Alpha4     (2, 3)
          Alpha5     (2, 4)
          Alpha6     (3, 4)
  • 32. Analysis Of GEE Parameter Estimates (Empirical Standard Error Estimates)
          Parameter    Estimate  Std Error  95% Confidence Limits  Z      Pr > |Z|
          Intercept    -0.9266   0.4513     -1.8111  -0.0421       -2.05  0.0400
          treatment A   1.2611   0.3406      0.5934   1.9287        3.70  0.0002
          center 2      0.6287   0.3486     -0.0545   1.3119        1.80  0.0713
          sex F         0.1024   0.4362     -0.7526   0.9575        0.23  0.8144
          age          -0.0162   0.0125     -0.0407   0.0084       -1.29  0.1977
          baseline 1    1.8980   0.3404      1.2308   2.5652        5.58  <.0001
          Alpha1        1.6109   0.4892      0.6522   2.5696        3.29  0.0010
          Alpha2        1.0771   0.4834      0.1297   2.0246        2.23  0.0259
          Alpha3        1.5875   0.4735      0.6594   2.5155        3.35  0.0008
          Alpha4        2.1224   0.5022      1.1381   3.1068        4.23  <.0001
          Alpha5        1.8818   0.4686      0.9634   2.8001        4.02  <.0001
          Alpha6        2.1046   0.4949      1.1347   3.0745        4.25  <.0001
  • 33. Results depend on the correlation structure assumed
                       Un (unstructured correlation)     logor (ALR)
          Parameter    Estimate  Std Error  Pr > |Z|     Estimate  Std Error  Pr > |Z|
          Intercept    -0.8882   0.4568     0.0519       -0.9266   0.4513     0.0400
          treatment A   1.2442   0.3455     0.0003        1.2611   0.3406     0.0002
          center 2      0.6558   0.3512     0.0619        0.6287   0.3486     0.0713
          sex F         0.1128   0.4408     0.7981        0.1024   0.4362     0.8144
          age          -0.0175   0.0129     0.1728       -0.0162   0.0125     0.1977
          baseline 1    1.8981   0.3441     <.0001        1.8980   0.3404     <.0001
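  • 34. Log odds ratio structure
        $$\mathrm{OR}(Y_j, Y_k) = \frac{\Pr(Y_j=1, Y_k=1)\,/\,\Pr(Y_j=0, Y_k=1)}{\Pr(Y_j=1, Y_k=0)\,/\,\Pr(Y_j=0, Y_k=0)}$$
        Alpha1 = $\log \mathrm{OR}(Y_1, Y_2)$ = 1.6109, so $\mathrm{OR}(Y_1, Y_2) = \exp(1.6109) \approx 5.01$ => having a good outcome at visit 1 ($Y_1=1$) is positively associated with having a good outcome at visit 2.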
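Since the Alpha parameters are reported on the log scale, exponentiating them gives the pairwise odds ratios between visits. A minimal sketch, with the estimates copied from the ALR output and hypothetical data set and variable names:

```sas
/* Sketch: convert the ALR log odds ratio estimates to odds ratios.  */
/* Estimates are copied from the GEE parameter estimates table;      */
/* alr_or, pair, alpha and odds_ratio are hypothetical names.        */
data alr_or;
  input pair $ alpha;
  odds_ratio = exp(alpha);
  datalines;
(1,2) 1.6109
(1,3) 1.0771
(1,4) 1.5875
(2,3) 2.1224
(2,4) 1.8818
(3,4) 2.1046
;
run;

proc print data=alr_or noobs;
  format odds_ratio 6.2;
run;
```

All six estimates are positive, so every pair of visits has an odds ratio above 1.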
  • 35. Log linear model for epileptic seizure episodes
       • The data consist of the number of epileptic seizures in an eight-week baseline period, before any treatment;
       • and in each of four two-week treatment periods, in which patients received either a placebo or the drug Progabide in addition to other therapy.
       Trt=0: placebo; Trt=1: Progabide
       SAS file name: SAS demo GEE Poisson
  • 36. Obs  ID   Count  Visit  Trt  Age  Weeks
        1    104  11     0      0    31   8
        2    104  5      1      0    31   2
        3    104  3      2      0    31   2
        4    104  3      3      0    31   2
        5    104  3      4      0    31   2
        6    106  11     0      0    30   8
        7    106  3      1      0    30   2
        8    106  5      2      0    30   2
        9    106  3      3      0    30   2
        10   106  3      4      0    30   2
  • 37. /*** exclude the outlier ID 207 and create the offset variable ***/
        data Seizure;
          set Seizure;
          if ID ne 207;
          if Visit = 0 then do; X1=0; Ltime=log(8); end;
          else do; X1=1; Ltime=log(2); end;
        run;
        proc print ; run ;
        proc genmod data=Seizure;
          class id;
          model count=x1 | trt / d=poisson offset=ltime;
          repeated subject=id / corrw covb type=exch;
        run;
  • 38. Obs  ID   Count  Visit  Trt  Age  Weeks  X1  Ltime
        1    104  11     0      0    31   8      0   2.07944
        2    104  5      1      0    31   2      1   0.69315
        3    104  3      2      0    31   2      1   0.69315
        4    104  3      3      0    31   2      1   0.69315
        5    104  3      4      0    31   2      1   0.69315
        6    106  11     0      0    30   8      0   2.07944
        7    106  3      1      0    30   2      1   0.69315
        8    106  5      2      0    30   2      1   0.69315
        9    106  3      3      0    30   2      1   0.69315
        10   106  3      4      0    30   2      1   0.69315
  • 39. The GENMOD Procedure
        Model Information
          Data Set            WORK.SEIZURE
          Distribution        Poisson
          Link Function       Log
          Dependent Variable  Count
          Offset Variable     Ltime
        Number of Observations Read  290
        Number of Observations Used  290
        Class Level Information
          Class  Levels  Values
          ID     58      101 102 103 104 106 107 108 110 111 112 113 114 116 117 118 121 122 123 124 126 128 129 130 135 137 139 141 143 145 147 201 202 203 204 205 206 208 209 210 211 213 214 215 217 218 219 220 221 222 225 226 227 228 230 232 234 236 238
        Parameter Information
          Parameter  Effect
          Prm1       Intercept
          Prm2       X1
          Prm3       Trt
          Prm4       X1*Trt
  • 40. Covariance Matrix (Model-Based)
                Prm1       Prm2       Prm3       Prm4
          Prm1   0.01223    0.001520  -0.01223   -0.001520
          Prm2   0.001520   0.01519   -0.001520  -0.01519
          Prm3  -0.01223   -0.001520   0.02495    0.005427
          Prm4  -0.001520  -0.01519    0.005427   0.03748
        Covariance Matrix (Empirical)
                Prm1       Prm2       Prm3       Prm4
          Prm1   0.02476   -0.001152  -0.02476    0.001152
          Prm2  -0.001152   0.01348    0.001152  -0.01348
          Prm3  -0.02476    0.001152   0.03751   -0.002999
          Prm4   0.001152  -0.01348   -0.002999   0.02931
        Algorithm converged.
        Working Correlation Matrix
                Col1    Col2    Col3    Col4    Col5
          Row1  1.0000  0.5941  0.5941  0.5941  0.5941
          Row2  0.5941  1.0000  0.5941  0.5941  0.5941
          Row3  0.5941  0.5941  1.0000  0.5941  0.5941
          Row4  0.5941  0.5941  0.5941  1.0000  0.5941
          Row5  0.5941  0.5941  0.5941  0.5941  1.0000
        Exchangeable Working Correlation
          Correlation  0.5941485833
        Note: the two covariance matrices are for the beta parameters.
  • 41. Analysis Of GEE Parameter Estimates (Empirical Standard Error Estimates)
          Parameter  Estimate  Std Error  95% Confidence Limits  Z      Pr > |Z|
          Intercept   1.3476   0.1574      1.0392   1.6560        8.56  <.0001
          X1          0.1108   0.1161     -0.1168   0.3383        0.95  0.3399
          Trt        -0.1080   0.1937     -0.4876   0.2716       -0.56  0.5770
          X1*Trt     -0.3016   0.1712     -0.6371   0.0339       -1.76  0.0781
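Because the model uses a log link with log(weeks) as the offset, the coefficients are log rate ratios for seizures per week, and exponentiating them gives rate ratios. The sketch below shows that arithmetic with the estimates copied from the table; the variable names are illustrative.

```sas
/* Sketch: rate ratios implied by the Poisson GEE estimates.         */
/* X1 = 1 marks the treatment period (vs. baseline); Trt = 1 is      */
/* Progabide. Variable names are hypothetical.                       */
data _null_;
  b_x1  = 0.1108;                    /* X1 */
  b_int = -0.3016;                   /* X1*Trt */
  rr_placebo   = exp(b_x1);          /* treatment period vs. baseline, placebo: about 1.12 */
  rr_progabide = exp(b_x1 + b_int);  /* treatment period vs. baseline, Progabide: about 0.83 */
  rr_ratio     = exp(b_int);         /* ratio of the two rate ratios: about 0.74 */
  put rr_placebo= 6.2 rr_progabide= 6.2 rr_ratio= 6.2;
run;
```

The interaction has p = 0.0781 in the table, so this difference between groups does not reach significance at the 0.05 level.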
  • 43. Genmod working correlation matrix. User-defined correlation matrix (not in proc mixed):
  • 44. New Homework Assignment
       • Problem 13.1, due next Wednesday, April 15.
       • Submit to Canvas.