Diagnostic and Screening Tests
Second session PowerPoint from the epidemiology and biostatistics module.
  1. Diagnostic & Screening Tests: Evaluating Clinical Tests
  2. Herald-Leader, 13 October 2011
  3. Science, 14 October 2011
  4. Herald-Leader, 18 October 2011
  5. Herald-Leader, 26 March 2012
  6. NY Times, 30 October 2011
  7. Herald-Leader, 4 April 2012
  8. www.ChoosingWisely.org
  9. Diagnostic & Screening Tests
[Timeline figure: Biological Onset → Symptoms Appear → Clinical Diagnosis → Clinical Outcome; screening falls between biological onset and symptom appearance.]
Mausner & Kramer, Epidemiology—An Introductory Text, 1985
  10. Diagnostic & Screening Tests
Diagnostic and screening tests attempt to reveal an otherwise hidden truth about patients (i.e., their health status: diseased or disease-free).
•Physical examination
•Radiographs/Computed Tomography (CT)
•Blood and urine assays
•Cytology (Pap smear, oral brush biopsy)
•Saliva (HIV testing)
  11. Discrimination & Classification
"The fundamental principle of diagnostic testing [and screening] rests on the belief that individuals with disease are different from individuals without disease and that diagnostic [and screening] tests can distinguish between these two groups." Riegelman, Studying a Study and Testing a Test, 2000
•Valid (i.e., accurate): sensitivity, specificity, ROC; predictive values; multiple tests
•Reliable (i.e., precise or repeatable): percent agreement; kappa
  12. Discrimination & Classification
Disease status comes from an external source of "truth" regarding the patients in the population:
•Gold standard or reference standard: adequate, independent, unbiased, representative
  13. Interlude: The Gold Standard
Unbiased
•The procedure used to establish the truth should not bias the truth.
•Surgery or histology → the "truth" will consist of the more advanced cases
Representative
•Cadaver studies of TMJ (older); patients are younger.
•Caries simulations (drilled holes in teeth) versus natural lesions
  14. Interlude: The Gold Standard
Adequate
•Surgery or autopsy (common in imaging studies)
•Time between imaging and surgery/biopsy
•Applies to positive cases
•Negative cases: clinical follow-up
Independent
•Histology provides an independent truth.
•Occasionally all of the available information, including the test being tested, is used to establish the gold standard (a bone lesion, for example; BFO). This creates a bias in favor of the test.
  15. Discrimination & Classification
"Appearances to the mind are of four kinds. Things either are what they appear to be [ ]; or they neither are, nor appear to be [ ]; or they are, and do not appear to be [ ]; or they are not, and yet appear to be [ ]. Rightly to aim in all these cases is the wise man's task." Epictetus (c. 50-120), Discourses, Bk I, Ch 27
  16. Validity: Sensitivity & Specificity
Sensitivity
= Ability of the test to correctly identify those with disease
= Probability of testing positive given the presence of disease
= TP / (TP + FN) = a / (a + c)
  17. Validity: Sensitivity & Specificity
Specificity
= Ability of the test to correctly identify those without disease
= Probability of testing negative given the absence of disease
= TN / (FP + TN) = d / (b + d)
  18. Validity: Sensitivity & Specificity
Assume a population of 1000 people of whom 100 have a disease. Of these 100 people, the test correctly identifies 80. Of the 900 disease-free people, the test correctly identifies 800.
Sensitivity = a / (a + c) = 80 / 100 = 80%
Specificity = d / (b + d) = 800 / 900 = 89%
Gordis, 2009, Table 5-1
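The 2×2 arithmetic above is mechanical enough to script. A minimal Python sketch (not part of the original deck) using the slide's cell convention (a = TP, b = FP, c = FN, d = TN):

```python
# Standard 2x2 table convention used in the slides:
#                 Disease present   Disease absent
# Test positive        a (TP)            b (FP)
# Test negative        c (FN)            d (TN)

def sensitivity(a, c):
    """Probability of testing positive given disease: TP / (TP + FN)."""
    return a / (a + c)

def specificity(b, d):
    """Probability of testing negative given no disease: TN / (FP + TN)."""
    return d / (b + d)

# Worked example from the slide: 1000 people, 100 diseased.
a, b, c, d = 80, 100, 20, 800   # b = 900 - 800 = 100 false positives
print(sensitivity(a, c))  # 0.8    -> 80%
print(specificity(b, d))  # 0.888… -> 89%
```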
  19. Validity: Sensitivity & Specificity
Sensitivity and Specificity
•Inherent characteristics of the test
•Stable over different populations with different disease prevalence
•Useful for comparing the performance of two tests (e.g., digital versus film mammography; Pisano, NEJM 2005)
•Have a reciprocal relationship with one another
  20. Validity: Sensitivity & Specificity
Low cutoff → high sensitivity → low specificity → false positives
Moderate cutoff → balance
High cutoff → low sensitivity → high specificity → false negatives
Courtesy, S. Fleming, 2011
  21. Validity: Receiver Operating Characteristic Curve
X-axis: false positive ratio (1 − specificity)
Y-axis: true positive ratio (sensitivity)
  22. Validity: Receiver Operating Characteristic Curve
  23. Validity: Receiver Operating Characteristic Curve
  24. Validity: Receiver Operating Characteristic Curve
5: sensitivity = 1 and specificity = 0
1: sensitivity = 0 and specificity = 1
  25. Validity: Receiver Operating Characteristic Curve
ROC can be used for a binary outcome (cancer/no cancer) by creating a multipoint scoring scale.
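As a sketch of how a multipoint scale produces the curve: sweep the cutoff over the scale, take each resulting (1 − specificity, sensitivity) pair as one ROC point, and approximate the area under the curve with trapezoids. The 5-point scale and counts below are made up for illustration:

```python
# Hypothetical 5-point confidence scale (1 = definitely normal,
# 5 = definitely abnormal); counts of diseased / non-diseased
# patients assigned each score by a reader.
diseased     = {1: 5, 2: 10, 3: 15, 4: 30, 5: 40}
non_diseased = {1: 40, 2: 30, 3: 15, 4: 10, 5: 5}

n_dis = sum(diseased.values())
n_non = sum(non_diseased.values())

# Sweep the cutoff: "positive" means score >= cutoff.
points = [(0.0, 0.0)]  # strictest cutoff: call nothing positive
for cutoff in sorted(diseased, reverse=True):
    tpr = sum(v for k, v in diseased.items() if k >= cutoff) / n_dis
    fpr = sum(v for k, v in non_diseased.items() if k >= cutoff) / n_non
    points.append((fpr, tpr))

# Trapezoidal area under the ROC curve.
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(points)  # one (1-specificity, sensitivity) pair per cutoff
print(auc)     # ~0.84 for these made-up counts
```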
  26. Validity: Receiver Operating Characteristic Curve
  27. Validity: Performance / Predictive Value
Sensitivity and specificity are useful, but
•May be numerically different if obtained on a group of people with early stages of disease compared with a group with more advanced disease.
•We do not know ahead of time who has the disease and who does not. Rather, we get the test results and need to interpret the findings.
  28. Validity: Performance / Predictive Value
Positive Predictive Value
= Ability of the test to correctly identify those who test positive
= Probability of having the disease given a positive test result
= TP / (TP + FP) = a / (a + b)
  29. Validity: Performance / Predictive Value
Negative Predictive Value
= Ability of the test to correctly identify those who test negative
= Probability of not having the disease (i.e., being disease-free) given a negative test result
= TN / (FN + TN) = d / (c + d)
  30. Validity: Positive & Negative Predictive Values
Assume a population of 1000 people of whom 100 have a disease. Of these 100 people, the test correctly identifies 80. Of the 900 disease-free people, the test correctly identifies 800.
Positive PV = a / (a + b) = 80 / 180 = 44%
Negative PV = d / (c + d) = 800 / 820 = 98%
Gordis, 2009, Table 5-7
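The same style of sketch for the predictive values, reusing the cell counts from the example above:

```python
def ppv(a, b):
    """Probability of disease given a positive test: TP / (TP + FP)."""
    return a / (a + b)

def npv(c, d):
    """Probability of no disease given a negative test: TN / (FN + TN)."""
    return d / (c + d)

# Same population as the sensitivity/specificity example above.
a, b, c, d = 80, 100, 20, 800
print(ppv(a, b))  # 0.444… -> 44%
print(npv(c, d))  # 0.9756 -> 98%
```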
  31. Validity: Predictive Values & Prevalence
Assume a test with a sensitivity of 80% and specificity of 90%. What happens to the predictive values when the prevalence of the disease varies? To fill in the cells, assume a convenient total population, in this case 1000. At 10% prevalence (100 diseased), the cells are a = 80, b = 90, c = 20, d = 810.
Positive PV = a / (a + b) = 80 / 170 = 0.4706 = 47.1%
Negative PV = d / (c + d) = 810 / 830 = 0.9759 = 97.6%
After Kramer, Clinical Epidemiology and Biostatistics, 1988
  32. Validity: Predictive Values & Prevalence
Assume a test with a sensitivity of 80% and specificity of 90%. At 50% prevalence (500 diseased), a = 400, b = 50, c = 100, d = 450.
Positive PV = a / (a + b) = 400 / 450 = 0.8889 = 88.9%
Negative PV = d / (c + d) = 450 / 550 = 0.8182 = 81.8%
After Kramer, Clinical Epidemiology and Biostatistics, 1988
  33. Validity: Predictive Values & Prevalence
Assume a test with a sensitivity of 80% and specificity of 90%. At 90% prevalence (900 diseased), a = 720, b = 10, c = 180, d = 90.
Positive PV = a / (a + b) = 720 / 730 = 0.9863 = 98.6%
Negative PV = d / (c + d) = 90 / 270 = 0.3333 = 33.3%
After Kramer, Clinical Epidemiology and Biostatistics, 1988
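A short sketch tying the three examples together: hold sensitivity and specificity fixed and sweep prevalence, reproducing the slide values (the population size of 1000 is just a convenience, as on the slides):

```python
# Fix sensitivity = 0.80 and specificity = 0.90, vary prevalence,
# and watch PPV rise while NPV falls.
sens, spec, n = 0.80, 0.90, 1000

for prev in (0.10, 0.50, 0.90):
    diseased = n * prev
    healthy = n - diseased
    a = sens * diseased          # true positives
    b = (1 - spec) * healthy     # false positives
    c = diseased - a             # false negatives
    d = spec * healthy           # true negatives
    print(f"prev={prev:.0%}  PPV={a/(a+b):.1%}  NPV={d/(c+d):.1%}")

# prev=10%  PPV=47.1%  NPV=97.6%
# prev=50%  PPV=88.9%  NPV=81.8%
# prev=90%  PPV=98.6%  NPV=33.3%
```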
  34. Validity: Predictive Values & Prevalence
Assume a test with a sensitivity of 80% and specificity of 90%. Some additional terms:
•Pretest probability = prior probability = prevalence
•Post-test probability = posterior probability = positive/negative predictive value
•Bayes Theorem (Thomas Bayes, 1702-61)
  35. Interlude: Bayes Theorem
  36. Interlude: Bayes Theorem
  37. Interlude: Bayes Theorem
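Slides 35-37 are figure-only in the original deck. As a reconstruction (not taken from the slides themselves), the standard result they illustrate: the positive predictive value is Bayes' theorem applied to a positive test result,

$$\mathrm{PPV} = P(D \mid T^{+}) = \frac{P(T^{+} \mid D)\,P(D)}{P(T^{+} \mid D)\,P(D) + P(T^{+} \mid \bar{D})\,P(\bar{D})} = \frac{\mathrm{sens} \times \mathrm{prev}}{\mathrm{sens} \times \mathrm{prev} + (1 - \mathrm{spec})(1 - \mathrm{prev})}$$

With the running example (sens = 0.80, spec = 0.90, prev = 0.10), this gives 0.08 / (0.08 + 0.09) = 47.1%, matching slide 31.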
  38. Validity: Predictive Values & Prevalence (Gordis, 2009, Figure 5-12)
  39. Validity: Predictive Values & Prevalence (Sackett, Clinical Epidemiology, 1985)
  40. Validity: Predictive Values & Prevalence
  41. Multiple Tests: Sequential versus Simultaneous
Screening test: less expensive, less invasive, less uncomfortable
Diagnostic test: more expensive, more invasive, more uncomfortable, more accurate
Mausner & Kramer, Epidemiology—An Introductory Text, 1985
  42. Multiple Tests: Sequential
  43. Multiple Tests: Sequential
Net Sensitivity = 161 / 200 = 80.5%
Net Specificity = (8740 + 158) / 9,800 = 90.8%
  44. Multiple Tests: Sequential
Net Sensitivity = 315 / 500 = 63.0%
Net Specificity = (7600 + 1710) / 9,500 = 98.0%
Gordis, 2009, Figure 5-8
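In sequential testing, only those positive on test A go on to test B, so a net positive requires passing both tests: net sensitivity is the product of the two sensitivities, while net specificity rises because a subject can be cleared at either stage. A sketch, using the parameter values implied by the Gordis counts above (test A: sensitivity 0.70, specificity 0.80; test B: 0.90/0.90):

```python
# Sequential (two-stage) testing: test B is given only to those
# positive on test A. Net sensitivity falls, net specificity rises.
def sequential(sens_a, spec_a, sens_b, spec_b):
    net_sens = sens_a * sens_b                 # must test positive on both
    net_spec = spec_a + (1 - spec_a) * spec_b  # cleared at stage A or at B
    return net_sens, net_spec

# Parameters back-calculated from the Gordis example above.
print(sequential(0.70, 0.80, 0.90, 0.90))  # (0.63, 0.98)
```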
  45. Multiple Tests: Simultaneous
Suppose in a population of 1000 people, 200 have the disease, and Test A sensitivity = 80%, Test B sensitivity = 90%. Net sensitivity = A+, B+, or both.
Step 1: 0.8 × 200 = 160 who are A+
Step 2: 0.9 × 200 = 180 who are B+
Step 3: 0.9 × 160 = 144 who are both A+ and B+
Step 4: 160 − 144 = 16 who are A+ only
Step 5: 180 − 144 = 36 who are B+ only
Step 6: 144 + 16 + 36 = 196 who are A+, B+, or both
Step 7: 196 / 200 = 98%
Courtesy, S. Fleming, 2011
  46. Multiple Tests: Simultaneous
Suppose in a population of 1000 people, 800 don't have the disease, and Test A specificity = 60%, Test B specificity = 90%. Net specificity = A− and B−.
Step 1: 0.6 × 800 = 480 who are A−
Step 2: 0.9 × 800 = 720 who are B−
Step 3: 0.9 × 480 = 432 who are both A− and B−
Step 4: 432 / 800 = 54%
Courtesy, S. Fleming, 2011
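Assuming the two tests err independently, as the step-by-step arithmetic above does, the simultaneous-testing results collapse to two formulas; a sketch:

```python
# Simultaneous testing: net positive = positive on A or B (or both),
# net negative = negative on both. Assumes independent test errors.
def simultaneous(sens_a, spec_a, sens_b, spec_b):
    net_sens = sens_a + sens_b - sens_a * sens_b  # P(A+ or B+)
    net_spec = spec_a * spec_b                    # P(A- and B-)
    return net_sens, net_spec

# Values from the two worked slides above.
print(simultaneous(0.80, 0.60, 0.90, 0.90))  # (0.98, 0.54)
```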
  47. Multiple Tests: Simultaneous
  48. Multiple Tests: Simultaneous
  49. Reliability
Reliability (aka repeatability or precision) is the ability of the test to give consistent results when performed more than once on the same individual under the same conditions, even if conducted by different examiners.
Sources of variability (the antithesis of repeatability):
•Subjects: BP reading (throughout day, sitting/standing, R/L arm); serum glucose (throughout day, day of the week)
•Instrumentation: PSA assay (5% variability even when measuring an identical blood sample)
•Observer: intra-observer; inter-observer
  50. Reliability: Percent Agreement
Percent agreement
= number of tests that agree / total number of tests
= (a + d) / (a + b + c + d)
= 35 / 40 = 0.875 = 87.5%
  51. Reliability: Kappa
Measure agreement beyond that expected from chance alone:
Kappa = (percent agreement − chance agreement) / (1 − chance agreement)
Kappa varies between 0 (no agreement) and 1 (perfect agreement):
< 0.40: poor agreement
0.40-0.75: fair to good agreement
> 0.75: excellent agreement
In the example, chance agreement = 0.695:
Kappa = (0.875 − 0.695) / (1 − 0.695) = 0.180 / 0.305 = 0.590
  52. Reliability: Kappa (Kundel and Polansky, Radiology, 2003)
  53. Reliability: Calculating Kappa
Two pathologists independently read and score 75 histopathology slides using their own criteria to subtype the lesion as Grade II or Grade III.
Gordis, 2009, Figure 5-17
  54. Reliability: Calculating Kappa
[Observed agreement table: Gordis, 2009, Figure 5-17]
  55. Reliability: Calculating Kappa
[Expected agreement table: Gordis, 2009, Figure 5-17]
Kappa = (90.7% − 51.7%) / (100% − 51.7%) = 39.0% / 48.3% = 0.81
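A sketch of the full kappa computation. The cell counts below are a hypothetical reconstruction chosen to be consistent with the quoted percentages (observed agreement 90.7%, chance-expected agreement 51.7%); the actual Gordis figure is not reproduced here:

```python
# Agreement table for two readers; rows = reader 1, cols = reader 2
# (Grade II, Grade III). Counts are a hypothetical reconstruction
# consistent with the slide's percentages.
table = [[41, 3],
         [4, 27]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Observed agreement: the diagonal cells.
observed = sum(table[i][i] for i in range(len(table))) / n
# Chance-expected agreement: from the marginal totals.
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2

kappa = (observed - expected) / (1 - expected)
print(observed, expected, kappa)  # ~0.907, ~0.517, ~0.81
```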
  56. Reliability
  57. Validity and Reliability
  58. Screening
"Screening is defined as the presumptive identification of unrecognized diseases or defects by the application of tests, examinations, or other procedures that can be applied rapidly." Friis and Sellers, 2009
"For screening to be of benefit, treatment given during the detectable preclinical phase must result in a better prognosis than therapy given after symptoms develop." Hennekens and Buring, 1987
  59. Screening
Nature of the Disease
•Important health problem: morbidity/mortality
•Treatable: unethical to screen if untreatable, except to prevent transmission (e.g., early cases of AIDS versus protecting the blood supply)
•Relatively high prevalence: rare disease → PPV is low and cost per case detected is high. Exceptions: phenylketonuria (PKU), 1 in 15,000 births, but consequences are severe (mental retardation), treatment is simple (dietary restriction), and screening tests are simple.
•Detectable preclinical phase (long latency period)
[Timeline figure: Biological Onset → Symptoms Appear → Clinical Diagnosis → Clinical Outcome; screening falls in the preclinical interval.]
  60. Screening
Nature of the Test
•Simple: easy to learn and perform; no complicated patient preparation
•Rapid: to administer; to yield results
•Safe: screened populations are overwhelmingly healthy; keep them that way
•Valid and reliable: high sensitivity; relatively high specificity. Accept some FP as there will be follow-up confirmatory tests, but consider the cost and morbidity of the follow-up, the cost of mislabeling someone, etc.
  61. Screening
Societal Factors
•Cost: relatively inexpensive; benefit/cost ratio favorable versus other health care expenditures
•Acceptable: unpalatable or difficult tests → refusal to participate
  62. Resources
Langlotz, Radiology, 2003: supplement to Gordis, especially for ROC curves.
Pisano et al., NEJM, 2005: example of an application of concepts.
Linker, AJPH, 2012: an interesting historical perspective on screening, specifically for scoliosis.
US Preventive Services Task Force (USPSTF): the source of many guidelines (and some controversy) regarding screening: <http://www.uspreventiveservicestaskforce.org/>.
