WELCOME
Reliability of Scales
Dhasarathy Kumar
OUTLINE:
• What is reliability of a scale?
• What are the broad approaches to reliability? – repeatability and internal consistency
• How is repeatability or reproducibility assessed? – for categorical variables and continuous variables
Definition of Reliability: Reliability usually "refers to the consistency of scores obtained by the same persons when they are reexamined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions" (Anastasi & Urbina, 1997).
Reliability
• Reliability is the extent to which a scale provides the same numerical score each time it is administered, provided no true change has actually occurred.
• Reliability is a necessary but not sufficient consideration in scale development.
Approaches to reliability testing
• Internal consistency
  – Cronbach's Alpha
  – Split-half reliability
• Repeatability
  – Test-retest reliability
  – Inter-observer reliability
  – Alternative-forms reliability
Reliability
• Internal consistency reliability is a measure of how well the items on a test measure the same construct or idea (a computational sketch of Cronbach's alpha follows below).
  – Cronbach's Alpha
  – Split-half reliability
• Repeatability measures the variation in measurements taken by a single instrument or person under the same conditions. (Reproducibility measures whether an entire study or experiment can be reproduced in its entirety.)
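As an illustration (not part of the original slides), the sketch below shows how Cronbach's alpha can be computed from a respondents-by-items score matrix using only NumPy; the function name and example data are assumptions made for this example.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical example: 5 respondents answering a 4-item scale
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```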
Statistical Assessment of Repeatability
• Continuous variables
  – Intraclass Correlation Coefficient (ICC)
  – Bland-Altman plot for agreement
• Categorical variables
  – Kappa statistic for agreement
Intraclass Correlation (ICC)
• When one is interested in the relationship between variables of a common class, one uses an intraclass correlation coefficient.
• It is, as a general matter, a ratio of variances:

ICC = Variance due to rated subjects (patients) / (Variance due to subjects + Variance due to judges + Residual variance)

• ICC < 0.7 – low reliability
• ICC 0.7 – 0.89 – moderate to good reliability
• ICC > 0.89 – excellent reliability
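As a minimal sketch (added here for illustration, not from the slides), the ICC form described above can be computed from a subjects-by-raters matrix via the two-way ANOVA mean squares; this corresponds to ICC(2,1) in the Shrout and Fleiss (1979) scheme, and the ratings data are made up.

```python
import numpy as np

def icc2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: (n subjects x k raters) matrix of scores.
    """
    n, k = ratings.shape
    grand_mean = ratings.mean()
    subj_means = ratings.mean(axis=1)    # per-subject means
    rater_means = ratings.mean(axis=0)   # per-rater (judge) means

    # Sums of squares from the two-way ANOVA decomposition
    ss_subjects = k * ((subj_means - grand_mean) ** 2).sum()
    ss_raters = n * ((rater_means - grand_mean) ** 2).sum()
    ss_total = ((ratings - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Subject variance relative to (subject + judge + residual) variance,
    # expressed through the mean squares
    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )

# Hypothetical example: 5 patients each rated by 3 judges
ratings = np.array([
    [9.0, 8.5, 9.0],
    [6.0, 6.5, 6.0],
    [8.0, 7.5, 8.0],
    [4.0, 4.0, 4.5],
    [7.0, 7.0, 7.5],
])
print(f"ICC(2,1) = {icc2_1(ratings):.2f}")
```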
Bland-Altman Plot
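This slide appears to present the plot itself as a figure. As an illustrative sketch (assumed, not taken from the slides), a Bland-Altman plot is drawn by plotting the difference between two measurements against their mean, together with the mean difference (bias) and limits of agreement at ±1.96 standard deviations of the differences; the paired measurements below are invented.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired measurements of the same subjects by two methods
method_a = np.array([10.2, 12.5, 9.8, 14.1, 11.3, 13.0, 10.9, 12.2])
method_b = np.array([10.0, 12.9, 9.5, 14.4, 11.0, 13.4, 11.2, 12.0])

mean_vals = (method_a + method_b) / 2   # x-axis: mean of the two measurements
diffs = method_a - method_b             # y-axis: difference between measurements
bias = diffs.mean()                     # mean difference (systematic bias)
loa = 1.96 * diffs.std(ddof=1)          # half-width of the limits of agreement

plt.scatter(mean_vals, diffs)
plt.axhline(bias, color="blue", label="Mean difference")
plt.axhline(bias + loa, color="red", linestyle="--", label="Upper limit of agreement")
plt.axhline(bias - loa, color="red", linestyle="--", label="Lower limit of agreement")
plt.xlabel("Mean of the two measurements")
plt.ylabel("Difference between measurements")
plt.title("Bland-Altman plot")
plt.legend()
plt.show()
```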
Kappa Statistics
• The kappa statistic (or value) is a metric that compares an observed accuracy with an expected accuracy (random chance).
• The kappa coefficient (κ) is a statistic which measures inter-rater agreement for qualitative (categorical) items.
Kappa = (observed agreement – chance agreement) / (1 – chance agreement)

Observed agreement Pr(a) = (a + d) / (a + b + c + d)
Expected agreement Pr(e) = [(a + b)(a + c) + (c + d)(b + d)] / (a + b + c + d)²
κ = [Pr(a) – Pr(e)] / [1 – Pr(e)]

• Kappa < 0.4 – poor reliability
• Kappa 0.41 – 0.74 – moderate to good reliability
• Kappa > 0.74 – excellent reliability
                      Rater 1 – positive    Rater 1 – negative
Rater 2 – positive    a                     b
Rater 2 – negative    c                     d
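A minimal sketch (added here for illustration) of Cohen's kappa computed from the 2×2 table above, following the formulas on this slide; the cell counts in the example are made up.

```python
def cohens_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 agreement table.

    a = both raters positive, d = both raters negative,
    b, c = the two kinds of disagreement.
    """
    n = a + b + c + d
    p_observed = (a + d) / n                                      # Pr(a)
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # Pr(e)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical example: 100 cases rated positive/negative by two raters
print(f"kappa = {cohens_kappa(a=40, b=10, c=5, d=45):.2f}")
```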