Statistical Analysis of Clustered Binary Response in
                Oral Health Research
Ronen Ofec*, DMD; David M. Steinberg, PhD; Devorah Schwartz-Arad, DMD, PhD

* M.Sc. program in Biostatistics, School of Mathematical Sciences, Tel-Aviv University
* Private dental practice, Tel-Aviv, Israel
The 4th International Meeting on Methodological Issues in Oral Health Research,
Istanbul, Turkey
The durability of dental implant treatment

• Failures (removal of an implant) do occur
• Marginal bone loss (MBL) can be an early sign of failure
• MBL: the amount of bone lost around an implant during function time
Marginal bone loss (MBL)

[Figure: Fransson et al. (2005)]

• What are the risk factors for MBL?
• Are some patients more prone to MBL?
• Are implants within the same patient correlated with each other?
Main question of interest and objectives of the study

What will be the consequences of a naïve analysis
that doesn't recognize correlation within a patient?

1   To identify risk factors for MBL in a long-term follow-up study
2   To estimate the intra-patient correlation with regard to MBL
3   To compare results from a naïve analysis to one that
    includes intra-patient correlation
Study design and participants

• Historical prospective cohort study design
• Implants placed at the Schwartz-Arad Surgical Center between January 1996
  and July 1998 by a single surgeon (DSA)
• Follow-up time was up to 147 months, with a mean of 70 months
The exposures, binary response and data set

• The exposures: patient-specific and implant-specific
• Clustered response: MBL measured at the implant level
• Binary response: acceptable vs. advanced bone loss
  - Cut point at MBL = 0.2 mm/year
• The data set: multilevel
  - 195 patients as the primary sampling units (clusters)
  - 721 implants as the elementary units
  - Number of implants per patient: range 1 to 16, mode = 3
The Intra Patient Correlation



One-way random effects ANOVA

• Patient effect
• Implant effect
• Patient and implant effects are independent
The Intra Class Correlation (ICC)
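
For reference, the standard one-way random effects formulation behind these terms can be written as follows. The symbols (Y for the response of implant j in patient i, a for the patient effect, e for the implant effect) are assumptions chosen for illustration, not the presenters' own notation.

```latex
Y_{ij} = \mu + a_i + e_{ij}, \qquad
a_i \sim (0, \sigma_a^2), \quad e_{ij} \sim (0, \sigma_e^2), \quad a_i \perp e_{ij}

\rho = \operatorname{Corr}(Y_{ij}, Y_{ik})
     = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}, \qquad j \neq k
```

Under this model the ICC is the share of the total variance attributable to differences between patients.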
The estimator and the estimate for ICC

• Kappa-type estimator proposed by Fleiss and Cuzick (1979)
• Confidence intervals for the estimator formulated
  by Zou and Donner (2004)
• Simulation results: the empirical coverage of the CI for the
  kappa-type estimator is close to nominal
• In our study:
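
A minimal sketch of the Fleiss and Cuzick kappa-type estimator for clustered binary data, in the form usually quoted in the literature. The function name and the toy data are hypothetical; they are not the study data, and the confidence interval of Zou and Donner (2004) is not implemented here.

```python
def fleiss_cuzick_icc(clusters):
    """Kappa-type ICC estimate for clustered binary data.

    clusters: list of (y_i, n_i) pairs, where y_i is the number of
    positive responses among the n_i units (implants) of cluster
    (patient) i.
    """
    k = len(clusters)                                # number of clusters
    total_n = sum(n for _, n in clusters)            # total number of units
    pi_hat = sum(y for y, _ in clusters) / total_n   # pooled prevalence
    if total_n == k or pi_hat in (0.0, 1.0):
        raise ValueError("ICC is undefined: no within-cluster pairs or no outcome variation")
    # Within-cluster disagreement, summed over clusters
    within = sum(y * (n - y) / n for y, n in clusters)
    return 1.0 - within / ((total_n - k) * pi_hat * (1.0 - pi_hat))


# Hypothetical toy data: (implants with advanced MBL, implants placed)
toy = [(0, 3), (3, 3), (1, 4), (0, 2), (2, 2)]
rho_hat = fleiss_cuzick_icc(toy)
```

Clusters whose implants all share the same outcome contribute nothing to the disagreement term, which pushes the estimate toward 1.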
The Generalized Estimating Equations (GEE)

• Population-averaged (marginal) model, Liang and Zeger (1986)
  1. The mean model:
  2. Working variance structure:
  3. Working correlation structure:
• The empirical (sandwich) estimator for the precision of estimates
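
One common specification of these three components for a binary response is sketched below. The logit link is consistent with the odds interpretation used later in the talk; the exchangeable working correlation and all symbols are assumptions for illustration, since the structure actually used in the study is not stated here.

```latex
\text{1. Mean model: } \operatorname{logit}(\mu_{ij}) = x_{ij}^{\top}\beta,
  \qquad \mu_{ij} = \Pr(Y_{ij} = 1 \mid x_{ij})

\text{2. Working variance: } \operatorname{Var}(Y_{ij}) = \mu_{ij}(1 - \mu_{ij})

\text{3. Working correlation (exchangeable): }
  \operatorname{Corr}(Y_{ij}, Y_{ik}) = \alpha, \qquad j \neq k

\text{Sandwich estimator: } \widehat{\operatorname{Cov}}(\hat\beta) = B^{-1} M B^{-1},
\quad B = \sum_i D_i^{\top} V_i^{-1} D_i,
\quad M = \sum_i D_i^{\top} V_i^{-1} (y_i - \hat\mu_i)(y_i - \hat\mu_i)^{\top} V_i^{-1} D_i
```

Here D_i is the derivative of the mean vector of patient i with respect to β, and V_i is the working covariance matrix built from components 2 and 3.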
Robustness of the sandwich estimator

• The estimator is robust to misspecification of the variance and
  correlation structures
• The estimates remain valid (consistent) even if the working structure
  does not reflect reality, provided the mean model is correct
• Mancl and Leroux (1996): gain in precision with the "right"
  correlation structure
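
The idea can be sketched for the simplest case: an intercept-only marginal model (a prevalence) fitted under a working independence structure. The function name and data below are hypothetical. The naïve standard error treats all implants as independent, while the sandwich standard error sums residuals within each patient before squaring.

```python
import math


def prevalence_with_ses(clusters):
    """clusters: list of lists of 0/1 outcomes, one inner list per patient.

    Returns (prevalence, naive SE, cluster-robust sandwich SE).
    """
    total_n = sum(len(c) for c in clusters)
    p_hat = sum(sum(c) for c in clusters) / total_n
    # Naive SE: binomial variance, ignoring clustering
    naive_se = math.sqrt(p_hat * (1.0 - p_hat) / total_n)
    # Sandwich "meat": squared sum of residuals within each cluster
    meat = sum((sum(c) - len(c) * p_hat) ** 2 for c in clusters)
    robust_se = math.sqrt(meat) / total_n
    return p_hat, naive_se, robust_se


# Hypothetical data: outcomes agree within patients (positive correlation)
p_pos, naive_pos, robust_pos = prevalence_with_ses([[1, 1], [0, 0]])
# Hypothetical data: outcomes disagree within patients (negative correlation)
p_neg, naive_neg, robust_neg = prevalence_with_ses([[1, 0], [1, 0]])
```

In the first toy data set the robust standard error exceeds the naïve one; in the second it is smaller, even though the point estimate is identical.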
The prevalence of advanced
bone loss by GEE
Risk factors for MBL by GEE

Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05

                              Function time < 3 years     Function time ≥ 3 years
Exposure                      Beta     S.E.    P          Beta     S.E.    P
Smoker                                                    1.44     0.41    ***
Coating (HA & TPS)            -2.22    0.76    **         1.34     0.39    ***
Early spontaneous exposure                                0.85     0.29    **
Diameter                                                  -1.39    0.40    ***

• Interaction between function time and risk factors
  - For smokers at function time ≥ 3 years: the odds of advanced MBL are
    4.22 times those of non-smokers (exp(1.44) = 4.22)
  - The effect of HA & TPS coating turns from protective to risk
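
The odds ratio quoted for smokers follows from exponentiating the coefficient in the table. The sketch below also computes a Wald interval from the reported standard error; the interval rests on a normal approximation and is an illustrative addition, not a result reported in the talk.

```python
import math


def odds_ratio(beta, se, z=1.96):
    """Odds ratio and approximate 95% Wald interval from a logit-scale coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Coefficients and standard errors taken from the GEE table
or_smoker, smoker_lo, smoker_hi = odds_ratio(1.44, 0.41)   # function time >= 3 years
or_coat_early, _, _ = odds_ratio(-2.22, 0.76)              # coating, < 3 years
or_coat_late, _, _ = odds_ratio(1.34, 0.39)                # coating, >= 3 years
```

An odds ratio below 1 for coating in the first three years and above 1 afterwards is what "turns from protective to risk" refers to.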
The naïve estimation and GEE

Function time ≥ 3 years

                        Naïve                       GEE
Exposure                Beta     S.E.    P          Beta     S.E.    P
Smoker                  1.50     0.29    ***        1.44     0.41    ***
Coating (HA & TPS)      1.31     0.27    ***        1.34     0.39    ***
Early exposure          0.85     0.26    **         0.85     0.29    **
Diameter                -1.57    0.35    ***        -1.39    0.40    ***

• Estimates of exposure effects: it is not bad to be naïve
  - Correlation doesn't induce bias in an unbiased estimator
• Standard errors of estimates: a naïve analysis leads to bias
  - Underestimation or overestimation of standard errors
  - Risk of invalid inference concerning the estimated effect
The naïve estimation and GEE

Function time < 3 years

                        Naïve                       GEE
Exposure                Beta     S.E.    P          Beta     S.E.    P
Smoker                  -0.41    0.41    0.32       -0.43    0.43    0.30
Coating (HA & TPS)      -2.2     1.09    0.04       -2.2     0.76    **
Early exposure          0.35     0.42    0.40       0.35     0.39    0.35
Diameter                -0.63    0.42    0.13       -0.60    0.47    0.21


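
One way to read the two tables is through the squared ratio of the GEE standard error to the naïve standard error, a rough empirical analogue of a design effect. The sketch below uses only the standard errors reported in the tables; the helper name is hypothetical.

```python
def variance_ratio(se_naive, se_gee):
    """Squared ratio of GEE to naive standard error (above 1: naive SE too small)."""
    return (se_gee / se_naive) ** 2


# (naive S.E., GEE S.E.) as reported in the two tables
reported = {
    ("Smoker", ">=3y"): (0.29, 0.41),
    ("Coating", ">=3y"): (0.27, 0.39),
    ("Early exposure", ">=3y"): (0.26, 0.29),
    ("Diameter", ">=3y"): (0.35, 0.40),
    ("Smoker", "<3y"): (0.41, 0.43),
    ("Coating", "<3y"): (1.09, 0.76),
    ("Early exposure", "<3y"): (0.42, 0.39),
    ("Diameter", "<3y"): (0.42, 0.47),
}

ratios = {key: variance_ratio(*ses) for key, ses in reported.items()}
underestimated = sorted(key for key, r in ratios.items() if r > 1.0)
overestimated = sorted(key for key, r in ratios.items() if r < 1.0)
```

Ratios above 1 mark exposures where the naïve analysis understates the uncertainty; ratios below 1 mark the opposite, as for coating in the first three years.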
The source of exposure variation

Source of exposure/treatment variation: between patient/cluster or within patient/cluster

• Patient-specific exposure: varies between patients only
  - Similar to a treatment effect in a between-cluster design
• Implant-specific exposure: varies within and between patients
  - Might be similar to a treatment effect in a within-cluster or
    between-cluster design
  - Depends on the source of variation of the implant-specific exposure
The Design Effect (Deff)

Between-cluster design                   Within-cluster design
• Deff > 1                               • Deff < 1
• Variance inflation factor (VIF)        • Variance attenuation factor (VAF)
• Therefore, a naïve analysis is         • Therefore, a naïve analysis is
  anti-conservative (underestimates        conservative (overestimates
  standard errors)                         standard errors)
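
The two cases can be illustrated with the textbook design-effect formulas for equal cluster sizes: 1 + (m - 1)ρ for an exposure that is constant within a cluster, and 1 - ρ for an exposure balanced within every cluster. The ICC value used below is hypothetical, chosen only for illustration; the cluster size of 3 is the modal number of implants per patient reported earlier.

```python
import math


def deff_between(m, rho):
    """Variance inflation for a cluster-level exposure with equal cluster size m."""
    return 1.0 + (m - 1) * rho


def deff_within(rho):
    """Variance attenuation for an exposure balanced within every cluster."""
    return 1.0 - rho


rho = 0.2   # hypothetical ICC, not the study estimate
m = 3       # modal number of implants per patient

vif = deff_between(m, rho)
vaf = deff_within(rho)
# Ratio of the naive SE to the SE that accounts for clustering
naive_to_true_between = 1.0 / math.sqrt(vif)   # below 1: naive SE too small
naive_to_true_within = 1.0 / math.sqrt(vaf)    # above 1: naive SE too large
```

With real data the cluster sizes are unequal and implant-specific exposures are rarely perfectly balanced, so these formulas bound the behaviour rather than predict it exactly.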
The answer to the main question of interest

What will be the consequences of a naïve analysis that doesn't recognize
correlation within a patient?

• No problem with the estimated effect
• For a patient-specific exposure: underestimation of standard errors
• For an implant-specific exposure: underestimation if the exposure
  varies mainly between patients
  - But overestimation if the exposure varies mainly within patients
• Mancl, Leroux and DeRouen (2000) recommended separating the effect of a
  site-specific exposure into within and between effects
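
A minimal sketch of such a separation, assuming the usual approach of replacing an implant-level exposure by its patient mean (between component) and the deviation from that mean (within component). The function name and the diameter values are hypothetical.

```python
def split_between_within(exposure_by_patient):
    """Split an implant-level exposure into between- and within-patient parts.

    exposure_by_patient: list of lists, one inner list of exposure values
    per patient. Returns two structures of the same shape.
    """
    between, within = [], []
    for values in exposure_by_patient:
        mean = sum(values) / len(values)
        between.append([mean] * len(values))        # patient mean, repeated
        within.append([v - mean for v in values])   # deviation from patient mean
    return between, within


# Hypothetical implant diameters (mm) for two patients
diameters = [[3.0, 5.0], [4.0, 4.0]]
between_part, within_part = split_between_within(diameters)
```

Both components would then enter the marginal model as separate covariates, so that each gets its own coefficient and standard error.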
Conclusions

1   Intra-patient correlation for advanced MBL exists
2   The effect of some exposures isn't constant during function time
3   Ignoring the ICC might bias the precision of the estimated effect.
    Simulation studies should confirm the direction of the bias
Thanks!
ofec@post.tau.ac.il
