Your SlideShare is downloading. ×
  • Like
GRADE: Implications for Teaching EBM
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Now you can save presentations on your phone or tablet

Available for both IPhone and Android

Text the download link to your phone

Standard text messaging rates apply

GRADE: Implications for Teaching EBM

  • 422 views
Published

 

Published in Health & Medicine , Education
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
422
On SlideShare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
6
Comments
0
Likes
1

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide
  • Reason for clear recommendations in first example is that benefits moderate to large and risk or costs are minimal Reson that equivocal in second is that benefits slightly smaller, risks greater, and costs greater; means that close call (or at least some might think so)

Transcript

  • 1.  
  • 2. Plan
    • GRADE background
    • two steps
      • quality of evidence
      • strength of recommendation
    • quality and strength can differ
    • profiles and summary of findings
    • importance of values/preferences
  • 3. Plan
    • GRADE background
    • two steps
      • quality of evidence
      • strength of recommendation
    • evidence profiles
    • an exercise in applying GRADE
  • 4. Plan
    • GRADE background
    • two steps
      • quality of evidence
      • strength of recommendation
    • quality and strength can differ
    • profiles and summary of findings
    • importance of values/preferences
    • an exercise in applying GRADE
  • 5. Plan
    • GRADE background
    • two steps
      • quality of evidence
      • strength of recommendation
    • importance of values/preferences
    • an exercise in applying GRADE
  • 6. Plan
    • GRADE background
    • two steps
      • quality of evidence
      • strength of recommendation
    • application to breast cancer screening
      • contrast with USPSTF
  • 7. Summarizing recommendations
    • clinicians need succinct summaries
    • should include
      • quality of evidence
      • summaries of best estimates of effect
        • all patient-important outcomes
      • strength of recommendations
    • GRADE working group
      • BMJ 2004 and 2008
  • 8.
    • Is grading recommendations a good idea?
    • Why?
    • experience with grading
      • systems used?
  • 9. Why Grade Recommendations?
    • strong recommendations
      • strong methods
      • large precise effect
      • few down sides of therapy
    • weak recommendations
      • weak methods
      • imprecise estimate
      • small effect
      • substantial down sides
  • 10. Which grading system to use?
    • many available
      • Australian National and MRC
      • Oxford Center for Evidence-based Medicine
      • Scottish Intercollegiate Guidelines (SIGN)
      • US Preventative Services Task Force
      • American professional organizations
        • AHA/ACC, ACCP, AAP, Endocrine society, etc....
    • cause of confusion, dismay
  • 11.  
  • 12.  
  • 13. A common international grading system?
    • GRADE ( G rades of r ecommendation, a ssessment, d evelopment and e valuation)
    • international group
      • Australian NMRC, SIGN, USPSTF, WHO, NICE, Oxford CEBM, CDC, CC
    • ~ 25 meetings over last ten years
        • (~10 – 50 attendants)
  • 14. GRADE Uptake Agencia sanitaria regionale, Bologna, Italia Agency for Health Care Research and Quality (AHRQ) Allergic Rhinitis and Group - Independent Expert Panel American Association for the study of liver diseases American College of Cardiology Foundation American College of Chest Physicians American College of Emergency Physicians American College of Physicians American Endocrine Society American Society of Gastrointestinal Endoscopy American society of Interventional Pain Physicians American Thoracic Society (ATS) BMJ Clinical Evidence British Medical Journal Canadian Agency for Drugs and Technology in Health Canadian Cardiovascular Society Canadian Task Force on Preventive Health Care Centers for Disease Control Cochrane Collaboration EBM Guidelines Finland Emergency Medical Services for Children National Resource Center European Association for the Study of the Liver European Respiratory Society European Society of Thoracic Surgeons Evidence-based Nursing Sudtirol, Alta Adiga, Italy Finnish Office of Health Technology Assessment German Agency for Quality in Medicine Heelth Inspectorate for Scotland Infectious Disease Society of America Japanese Society of Oral and Maxillofacial Radiology Joslin Diabetes Center Journal of Infection in Developing Countries Kaiser Permanente Kidney Disease International Guidelines Organization National and Gulf Centre for Evidence-based Medicine National Institute for Clinical Excellence (NICE) National Kidney Foundation Norwegian Knowledge Centre for the Health Services Ontario MOH Medical Advisory Secretariat Panama and Costa Rica National Clinical Guidelines Program Polish Institute for EBM Scottish Intercollegiate Guideline Network (SIGN) Society of Critical Care Medicine Society of Pediatric Endocrinology Society of Vascular Surgery Spanish Society of Family Practice (SEMFYC) Stop TB Diagnostic Working Group Surviving sepsis campaign Swedish Council on Technology Assessment in Health Care Swedish National Board of Health and Welfare University of Pennsylvania Health System for EB Practice UpToDate WINFOCUS World Allergy Organization World Health Organization (WHO)
  • 15. What are we grading?
    • two components
    • quality of body of evidence
      • extent to which confidence in estimate of effect adequate to support decision
        • high, moderate, low, very low
    • strength of recommendation
        • strong and weak
  • 16. What are we grading?
    • two components
    • quality of body of evidence
      • confidence in estimate of effect
        • high, moderate, low, very low
    • strength of recommendation
        • strong and weak
  • 17. Interpretation of quality
    • High quality— Further research is very unlikely to change our confidence in the estimate of effect
    • Moderate quality— Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
    • Low quality— Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
    • Very low quality— Any estimate of effect is very uncertain
  • 18. Interpretation of quality
    • High: We are very confident that t he true effect lies close to that of the estimate of the effect.
    • Moderate: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
    • Low: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect.
    • Very low: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
  • 19. Health Care Question (PICO) Systematic reviews Studies Outcomes Important outcomes Rate the quality of evidence for each outcome , across studies RCTs start high, observational studies start low (-) Study limitations Imprecision Inconsistency of results Indirectness of evidence Publication bias likely Final rating of quality for each outcome: high, moderate, low, or very low (+) Large magnitude of effect Dose response Plausible confounders would ↓ effect when an effect is present or ↑ effect if effect is absent Decide on the direction (for/against) and grade strength (strong/weak*) of the recommendation considering: Quality of the evidence Balance of desirable/undesirable outcomes Values and preferences Decide if any revision of direction or strength is necessary considering: Resource use *also labeled “conditional” or “discretionary” Rate overall quality of evidence (lowest quality among critical outcomes) S1 S2 S3 S4 OC1 OC2 OC3 OC4 OC1 OC3 Critical outcomes OC4 Generate an estimate of effect for each outcome OC2 S5
  • 20. Structured question
    • patients: lymphoma patients at risk of developing chemotherapy-induced febrile neutropenia
    • granulocyte colony-stimulating (G-CSF)
    • alternative not using G-CSF
  • 21. Structured question
    • patients:
      • women considering breast cancer screening
      • age 40-9; 50 to 74; > 75
      • no  risk genetic mutation chest radiation
    • intervention
      • film mammography
    • alternative
      • no screening
  • 22. Need to define all patient-important outcomes and evaluate their importance
    • desirable consequences
      • reduction in breast cancer mortality
    • undesirable consequences
      • false positive screening results
      • invasive procedures from positive results
      • complications of invasive procedures
      • unnecessary diagnosis and treatment
  • 23. Determinants of quality
    • RCTs start high
    • observational studies start low
    • what can lower quality?
      • detailed design and execution
      • inconsistency
      • indirectness
      • imprecision
      • reporting bias
  • 24. Determinants of quality
    • RCTs start high
    • observational studies start low
    • 5 limitations can lower quality
    • detailed design and execution
      • concealment, blinding, loss to follow-up
    • inconsistency
      • variability in results (heterogeneity)
    • publication bias
  • 25. Determinants of quality
    • RCTs start high
    • observational studies start low
    • 5 limitations can lower quality
    • Bias
      • detailed design and execution
        • concealment, blinding, loss to follow-up
      • publication bias
    • Imprecision
      • wide confidence intervals
  • 26. Determinants of quality
    • RCTs start high
    • observational studies start low
    • limitations can lower quality?
    • Bias
      • detailed design and execution
        • concealment, blinding, loss to follow-up
    • Imprecision
      • wide confidence intervals
  • 27. Determinants of quality
    • RCTs start high
    • observational studies start low
    • limitations can lower quality?
  • 28. Determinants of quality
    • 5 limitations can lower quality
    • risk of bias
      • concealment, blinding, loss to follow-up
    • imprecision
    • inconsistency
      • variability in results (heterogeneity)
    • Indirectness
    • publication bias
  • 29. Determinants of quality
    • 5 limitations can lower quality
    • risk of bias
      • concealment, blinding, loss to follow-up
    • imprecision
    • inconsistency
      • variability in results (heterogeneity)
    • publication bias
  • 30. Risk of Bias
    • well established
      • concealment
      • intention to treat principle observed
      • blinding
      • completeness of follow-up
    • more recent
      • early stopping for benefit
      • selective outcome reporting bias
  • 31. Breast cancer risk of bias
    • most trials not concealed
    • blinding
      • ? adjudication of outome
      • no other blinding
    • ? loss to follow-up
    • all trials rated as “fair” by USPSTF
  • 32. Consistency of results
    • if inconsistency, look for explanation
      • patients, intervention, outcome, methods
    • judgment of consistency
      • variation in size of effect
      • overlap in confidence intervals
      • statistical significance of heterogeneity
      • I 2
  • 33. Relative Risk with 95% CI for Vitamin D Non-vertebral Fractures
    • Chapuy et al, (2002) 0.85 (0.64, 1.13)
    Pooled Random Effect Model 0.82 (0.69 to 0.98) p= 0.05 for heterogeneity, I 2 =53% Chapuy et al, (1994) 0.79 (0.69, 0.92) Lips et al, (1996) 1.10 (0.87, 1.39) Dawson-Hughes et al, (1997) 0.46 (0.24, 0.88) Pfeifer et al, (2000) 0.48 (0.13, 1.78) Meyer et al, (2002) 0.92 (0.68, 1.24) Trivedi et al, (2003) 0.67 (0.46, 0.99) Favours Vitamin D Favours Control Relative Risk 95% CI
  • 34. Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose >400) Chapuy et al, (1994) 0.70 (0.69, 0.92) Dawson-Hughes et al, (1997) 0.46 (0.24, 0.88) Pfeifer et al, (2000) .48 (0.13, 1.78) Chapuy et al, (2002) 0.85 (0.64, 1.13) Trivedi et al, (2003) 0.67 (0.46, 0.99) Pooled Random Effect Mode 0.75 (0.63 to 0.89) p= 0.26 for heterogeneity, I 2 =24% Favours Vitamin D Favours Control Relative Risk 95% CI
  • 35. Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose = 400) Lips et al (1996) 1.10 (0.87, 1.39) Meyer et al (2002) 0.92 (0.68, 1.24) Pooled Random Effect Mode 1.03 (0.86 to 1.24) p = 0.35 heterogeneity, I 2 =0% Favours Vitamin D Favours Control Relative Risk 95% CI
  • 36. Meta-analysis of women 39 to 49 years of age
  • 37. Should we believe sub-group analysis?
    • within-study comparison? No
    • large difference in effect Borderline
    • unlikely chance Yes, p = 0.006
    • consistent across studies Yes
    • a priori hypothesis Yes
    • one of small number hypotheses Yes
    • biologically compelling Yes
    • shall we believe sub-group analysis?
  • 38. Directness of Evidence
    • differences in
      • patients
      • interventions
      • comparators
    • differences in outcomes
      • surrogates
  • 39. Quality judgments: Directness
    • populations
      • older, sicker or more co-morbidity
    • interventions
      • new statins versus old
    • outcomes
      • important versus surrogate outcomes
      • glucose control versus CV events
  • 40. Figure 6: Hierarchy of outcomes according to their patient-importance to assess the effect of phosphate lowering drugs in patients with renal failure and hyperphophatemia Flatulence Importance of endpoints 2 5 Pain due to soft tissue Calcification / function 6 Fractures 7 Myocardial infarction 8 Mortality 9 3 4 1 Coronary calcification Ca 2+ /P- Product Surrogates of declining importance Bone density Ca 2+ /P- Product Soft tissue calcification Ca 2+ /P- Product Lower by one level for indirectness Lower by two levels for indirectness Critical for decision making Important, but not critical for decision making Of low patient- importance
  • 41. Alendronate Risedronate Placebo Directness interested in A versus B available data A vs C, B vs C
  • 42. Issues in directness breast cancer screening
    • estimated screening effects in over 75
      • could also use observational studies
    • no direct comparisons of screening intervals
  • 43. Imprecision
    • small sample size
      • small number of events
    • wide confidence intervals
      • uncertainty about magnitude of effect
    • primary criterion:
      • would decisions differ at ends of CI
  • 44. Meta-analysis of women 39 to 49 years of age 25% RRR, reduction in breast cancer deaths 1/1,000 4% RRR, reduction in breast cancer deaths 1/6,000
  • 45. Imprecision
    • small sample size
      • small number of events
    • wide confidence intervals
      • uncertainty about magnitude of effect
    • problems
      • analogy to stopping early
      • lack of prognostic balance
    • solution: optimal information size
      • # of pts from conventional sample size calculation
      • specify CER, effect, α , β , Δ
  • 46. Fluoroquinolone prophylaxis in neutropenia: infection-related mortality Total number of events: 47
  • 47. Fluoroquinolone prophylaxis in neutropenia: infection-related mortality sample size 1,002 α 0.05, β 0.20, Δ 0.25, N = 6,000
  • 48. Stroke – Fixed Effects
  • 49. Downgrading for precision
    • if OIS not met rate down for imprecision
      • unless very large ss (? > 1,000 per group)
    • if OIS met and CI exclude RR = 1, don’t downgrade
    • if OIS met and CI includes, RR = 1, downgrade only if RR < 0.75 or > 1.25
  • 50. Precision
    • atrial fib at risk of stroke
    • warfarin increases serious gi bleeding
      • 3% per year
    • 1,000 patients 1 less stroke
      • 30 more bleeds for each stroke prevented
    • 1,000 patients 100 less strokes
      • 3 strokes prevented for each bleed
    • where is your threshold?
      • how many strokes in 100 with 3% bleeding?
  • 51. 0 1.0%
  • 52. 0 1.0%
  • 53. 0 1.0%
  • 54. 0 1.0%
  • 55. 0 0.5% 1.0%
  • 56. Example: clopidogrel or ASA?
    • pts with threatened stroke
    • RCT of clopidogrel vs ASA
      • 19,185 patients
    • ischaemic stroke, MI, or vascular death compared
      • 939 events (5·32%) clopidogrel
      • 1021 events (5·83%) with aspirin
    • RR 0.91 (95% CI 0.83 – 0.99) (p=0·043)
    • downgrade for precision?
  • 57. 0 1.0% Clopidogrel or ASA for threatened vascular events RCT 19,185 patients 1.7% - 0.9 – 0.1% RR 0.91 (95% CI 0.83 – 0.99)
  • 58. 0 Non-inferiority
  • 59. 0 Non-inferiority
  • 60. 0 Non-inferiority
  • 61. Avoiding misleading conclusions
    • analogy
      • SR accumulating data just as single trial
    • risk of spurious effect by stopping early
    • how to avoid
      • insist on sufficient data
      • optimal information size
  • 62. Avoiding misleading conclusions
    • optimal information size
      • # of pts from conventional sample size calculation
      • specify CER, effect, α , β
    • alternative, number of events
      • ? 300 ?
  • 63. Criteria for NOT downgrading
    • CI narrow enough to permit confident recommendation for or against
    • if positive benefit outcome, safeguard against false positive
      • OIS
      • or
      • number threshold met (300)
  • 64. Publication bias
    • high likelihood could lower quality
    • when to suspect
        • number of small studies
        • industry sponsored
  • 65.  
  • 66.  
  • 67. What can raise quality?
    • large magnitude can upgrade one level
      • very large two levels
    • common criteria
      • everyone used to do badly
      • almost everyone does well
      • quick action
    • hip replacement for hip osteoarthritis
    • mechanical ventilation in respiratory failure
  • 68. Dose-response gradient
    • childhood lymphoblastic leukemia
    • risk for CNS malignancies 15 years after cranial irradiation
    • no radiation: 1% (95% CI 0% to 2.1%)
    • 12 Gy: 1.6% (95% CI 0% to 3.4%)
    • 18 Gy: 3.3% (95% CI 0.9% to 5.6%).
  • 69. Observational study starts low What can move up?
    • plausible confounders strengthen association
    • FP hospitals higher mortality than NFP hospitals
      • NFP treat sicker patients
      • FP treat wealthier patients
  • 70. Quality assessment criteria
  • 71. Overall level of evidence
    • most systems just use evidence about primary benefit outcome
    • but what about others (risk)?
    • what to do?
    • options
      • ignore all but primary
      • weakest of any outcome
      • some blended approach
      • weakest of critical outcomes
  • 72. Beta blockers in non-cardiac surgery Quality Assessment Summary of Findings Quality Relative Effect (95% CI) Absolute risk difference Outcome Number of participants (studies) Risk of Bias Consistency Directness Precision Reporting Bias Myocardial infarction 10,125 (9) No serious limitations No serious imitations No serious limitations No serious limitations Not detected High 0.71 (0.57 to 0.86) 1.5% fewer (0.7% fewer to 2.1% fewer) Mortality 10,205 (7) No serious limitations Possiblly inconsistent No serious limitations Imprecise Not detected Moderate or low 1.23 (0.98 – 1.55) 0.5% more (0.1% fewer to 1.3% more) Stroke 10,889 (5) No serious limitaions No serious limitations No serious limitations Possible imprecision Not detected Moderate or High 2.21 (1.37 – 3.55) 0.5% more (0.2% more to 1.3% more0
  • 73.  
  • 74. Pressure limited ventilation Quality Assessment Summary of Findings Quality Relative Risk (95% CI) p-value Illustrative risks Outcome No. of patients (studies) Risk of Bias Inconsistency Indirectness Imprecision Publication Bias Example control rate Associated risk with PVL Hospital mortality 1,664 (9) Inability to blind. 2 trials stopped early with few events and large effects; were also confounded by ‘open lung’ strategies. p = 0.07 I 2 = 45.6% Varied populations, interventions. Not robust in sensitivity analyses Direct Precise Undetected Moderate (due to inconsistency) 0.82 (0.68 – 0.99) p = 0.04 40% 32.8% (27.2 – 39.6) Barotrauma 1,497 (7) Inability to blind. p = 0.24 I 2 = 25.3% Varied populations, interventions Direct Imprecise Undetected Moderate (due to imprecision) 0.90 (0.66 – 1.24) p = 0.53 NS NS Paralysis 1,202 (5) Inability to blind. p = 0.004 I 2 = 59% Varied populations, interventions, measurements Direct Precise Not assessed Moderate (due to inconsistency) 1.37 (1.04 – 1.82) p = 0.03 30% 41.1% (31.2 – 54.6) Dialysis 173 (2) Inability to blind. p = 0.26 I 2 = 22.8% Varied populations, interventions Direct Imprecise Not assessed Moderate (due to imprecision) 1.76 (0.79 – 3.90) p = 0.16 NS NS
  • 75. Zoster vaccine Quality Assessment Summary of Findings Quality Relative Risk (95% CI) p-value Illustrative risks Outcome No. of patients (studies) Risk of Bias Inconsistency Indirectness Imprecision Publication Bias control rate vaccinated rate Zoster episodes 38,546 (1) No serious risk only one study Direct Precise Undetected High not reported 11.12 per 1,000 patient-years 5.42 (difference 5.7 per 1,000 pt-years (p< 0.001) Post-herpetic neuralgia 38,546 (1) No serious risk only one study Direct Precise Undetected High not reported 1.38 per 1,000 patient-years 0.46 (difference 0.92 per 1,000 pt-years (p< 0.001) Serious adverse events 38,546 (1) No serious risk only one study Direct Precise Undetected High Not reported 13 per 1,000 19 (difference 6 per 1,000)
  • 76. High versus low PEEP in ALI and ARDS Population No. of participants (trials) † Higher PEEP Lower PEEP Adjusted Relative Risk (95% CI; P -value) ‡ Adjusted Absolute Risk Difference (95% CI) Quality Patients with ARDS 1892 (3) 324/951 (34.1%) 368/941 (39.1%) 0.90 (0.81 to 1.00; 0.049) -3.9% (-7.4% to -0.04%) High Patients without ARDS 404 (3) 50/184 (27.2%) 41/220 (18.6%) 1.37 (0.98 to 1.92; 0.065) 6.9% (-0.4% to 17.1%) Moderate (imprecision)
  • 77. Whipples procedure pancreatic cancer with or without duodenectomy
  • 78.  
  • 79. Patients or population: Anyone taking a long flight (lasting more than 6 hours) Settings: International air travel Intervention: Compression stockings 1 Comparison: Without stockings Outcomes Illustrative comparative risks * (RANGE OF UNCERTAINTY) Relative effect (95% CI) Number of participants (studies) Quality of the evidence (GRADE) Comments Assumed risk Corresponding risk Without stockings With stockings (95% CI) Symptomatic deep vein thrombosis (DVT) See comment See comment Not estimable 2637 (9 studies) See comment 0 participants developed symptomatic DVT in these studies. Symptomatic deep vein thrombosis –surrogate symptomless deep vein thrombosis Low risk population 2 RR 0.10 (0.04 to 0.25) 2637 (9 studies)   Moderate 3 Estimates of control group asymptomatic thrombosis from the primary studies range from 15 per 1,000 in low risk patients to 25 per 1,000 in high risk patients 5 per 10,000 0.5 per 10,000 (0 to 1) High risk population 2 18 per 10,000 1.8 per 10,000 (0.5 to 4) Superficial vein thrombosis 13 per 1000 6 per 1000 (2 to 15) RR 0.45 (0.18 to 1.13) 1804 (8 studies)   Moderate 4
  • 80. Diagnostic tests
    • same logic as for treatment
    • judges quality of evidence NOT for accuracy, but for change in patient-important outcome
    • ideally establish through RCT
      • focus on patient-important outcome
      • screening RCTs (breast cancer, colon cancer)
    • most of time not available
      • new complexities, process in evolution
  • 81. Study designs
  • 82. Study designs II
  • 83. Test accuracy is a surrogate for patient important outcomes
    • When clinicians think about diagnostic tests, they focus on their accuracy
    • Underlying assumption: obtaining a better idea of whether a target condition is present or absent will result in superior patient management and improved outcome.
  • 84. Balance between presumed patient outcomes, complications and cost: Less complications and downsides compared to IVP would support the new test’s usefulness, but the balance between desirable and undesirable effect not clear in view of the uncertain consequences of identifying smaller stones.
  • 85. Strength of recommendations
    • degree of confidence that desirable effects of adhering to recommendation outweigh undesirable effects.
    • strong recommendation
      • benefits clearly outweigh risks/hassle/cost
      • risk/hassle/cost clearly outweighs benefit
  • 86. Strength of recommendations
    • degree of confidence that desirable effects of adhering to recommendation outweigh undesirable effects.
    • strong recommendation
      • benefits clearly outweigh risks/hassle/cost
      • risk/hassle/cost clearly outweighs benefit
    • what can downgrade strength?
  • 87.  
  • 88. Strength of Recommendation
    • strong recommendation
      • benefits clearly outweigh risks/hassle/cost
      • risk/hassle/cost clearly outweighs benefit
    • what can downgrade strength?
    • low quality evidence
    • close balance between up and downsides
  • 89. Grades Translations
    • Strong recommendations
    • Just do it
    • Virtually all well informed individuals would want the intervention and only a small proportion would not
    • Most individuals should receive the intervention
    • Use of the intervention according to the guideline could be used as a quality criterion or performance indicator
  • 90. Grades Translations
    • Weak recommendation
    • Examine the evidence yourself
    • The majority of well informed individuals would want the intervention, but a substantial proportion would not
    • Many but not all individuals should receive the intervention
    • The intervention is not a candidate for a quality criterion or performance indicator.
  • 91. Risk/Benefit tradeoff
    • aspirin after myocardial infarction
      • 25% reduction in relative risk
      • side effects minimal, cost minimal
      • benefit obviously much greater than risk/cost
    • warfarin in low risk atrial fibrillation
      • warfarin reduces stroke vs ASA by 50%
      • but if risk only 1% per year, ARR 0.5%
      • increased bleeds by 1% per year
  • 92. Strength of Recommendations
    • Resuscitate fast in septic patient
      • - do it!
    • Prone ventilation in failing patient with ARDS
      • Probably do it
      • Probably do not do it
  • 93. Strength of Recommendations
    • Aspirin after MI – do it
    • Warfarin rather than ASA in Afib
    • -- probably do it
    • -- probably don’t do it
  • 94.  
  • 95. Significance of strong vs weak
    • variability in patient preference
      • strong, almost all same choice (> 90%)
      • weak, choice varies appreciably
    • interaction with patient
      • strong, just inform patient
      • weak, ensure choice reflects values
    • use of decision aid
      • strong, don’t bother
      • weak, use the aid
    • quality of care criterion
      • strong, consider
      • weak, don’t consider
  • 96. USPSTF documents - clinical summary - supporting article (decision analysis) - evidence summary - recommendations
  • 97.  
  • 98. For women age 40 – 49 we suggest NOT screening. (Weak recommendation based, moderate quality evidence. 2B) Values and Preferences: This recommendation places a relatively low value on a very small, uncertain mortality decrease and reflects concerns with false positive results, unnecessary biopsies, and unnecessary dx of breast cancer Women who place a higher value on a small reduction in breast cancer mortality and are less concerned about the undesirable consequences will choose screening
  • 99. For women age 50 - 74 we suggest screening. (Weak recommendation, moderate quality evidence. 2B) Women who do not place a high value on a small reduction in breast cancer mortality and are concerned about false positive results, unnecessary biopsies, and unnecessary diagnosis of breast cancer will decline screening
  • 100.
    • Using GRADE no insufficient
      • - no recommendation or recommend for or against
    Breast self-examination: We recommend against breast self-examination. (strong recommendation, high/moderate quality evidence)
  • 101. Weak recommendation
    • practice will vary
      • according to what?
    • interpretation of evidence
      • clopidogrel in stroke
    • patients’ values and preferences
      • atrial fibrillation
    • inclination to gamble (risk aversion)
      • HRT
  • 102. When evidence is low quality
    • choice more preference dependent
    • risk aversion
    • steroids for pulmonary fibrosis
      • low quality evidence in support of benefit
      • high quality evidence of toxicity
  • 103. When evidence is low quality
    • recommendation to the hopeful patient
      • I’m likely to deteriorate
      • if something might work, let’s try it
      • damn the torpedoes
    • recommendation to the fearful patient
      • doctor, you mean you know it’s toxic
        • diabetes, skin changes, body habitus, infection, osteoporosis
      • you don’t know for sure it works?
      • are you crazy?
    • weak recommendation mandated
  • 104. Strong recommendation when evidence is low quality?
    • known benefit, strong recommendation for one of two alternatives
      • antipyretics in children with chickenpox
      • but which one: ASA or acetaminophen
    • benefit: high quality evidence of equivalence
    • harm: low quality evidence that harm differs appreciably
      • Reye syndrome from ASA
    • strong recommendation for acetaminophen
  • 105. Strong recommendation when evidence is low quality?
    • Blastomycosis
      • low quality evidence amphotericin more effective than itraconazole
      • high quality evidence more toxic
    • patients with life threatening blasto
      • life and death situation
      • strong recommendation for ampho
  • 106. Strong recommendation when evidence is low quality?
    • head to toe CT scanning
      • prevent cancer deaths
    • very low quality evidence of benefits
    • moderate quality evidence re risks, high re costs
    • strong recommendation against
  • 107. Presentation
    • strong and weak
      • discomfort with “weak”
      • alternative wording: discretionary, conditional
    • strong
      • “ we recommend”…
    • discretionary
      • “ we suggest…”
    • never
      • we recommend (or suggest) you consider…
    • always: quality of evidence and grade
  • 108. When (not to) GRADE “good to remind/alert”
    • if no systematic review undertaken
    • no sensible person would consider contrary
      • We recommend that the patient, and the clinician responsible for the patient’s care, should be made aware of any change in a prescribed medication, including change to a generic drug
    • very general (not sufficiently specific)
      • We suggest that long-term maintenance immunosuppression be tailored to individual patient’s adverse events or risk of adverse events
  • 109. Explicit comparator
    • we recommend hourly urine volume measurement for at least 24 hours
      • in contrast to every 2 hours, every 3…?
    • we suggest measuring serum creatinine in all KTRs at least
      • daily for 7 days
      • 2 to 3 X per week for weeks 2 to 4
      • every 2 weeks for months 4 to 6
  • 110. Value and preference statements
    • underlying values and preferences always present
    • sometimes crucial
    • important to make explicit
  • 111. Values and preferences
    • Stroke guideline: patients with TIA clopidogrel over aspirin (Grade 2B).
    • Underlying values and preferences : This recommendation to use clopidogrel over aspirin places a relatively high value on a small absolute risk reduction in stroke rates, and a relatively low value on minimizing drug expenditures.
  • 112. Values and preferences
    • peripheral vascular disease: aspirin be used instead of clopidogrel (Grade 2A).
    • Underlying values and preferences : This recommendation places a relatively high value on avoiding large expenditures to achieve small reductions in vascular events.
  • 113. Flavanoids for Hemorrhoids
    • venotonic agents
      • mechanism unclear, increase venous return
    • popularity
      • 90 venotonics commercialized in France
      • none in Sweden and Norway
      • France 70% of world market
    • possibilities
      • French misguided
      • rest of world missing out
  • 114. Systematic Review
    • 14 trials, 1432 patients
    • key outcome
      • risk not improving/persistent symptoms
      • 11 studies, 1002 patients, 375 events
      • RR 0.4, 95% CI 0.29 to 0.57
    • minimal side effects
    • is France right?
    • what is the quality of evidence?
  • 115. What can lower quality?
    • risk of bias
      • lack of detail re concealment
      • questionnaires not validated
    • indirectness – no problem
    • inconsistency, need to look at the results
  • 116.  
  • 117. Publication bias?
    • size of studies
      • 40 to 234 patients, most around 100
    • all industry sponsored
  • 118.  
  • 119. What can lower quality?
    • detailed design and execution
      • lack of detail re concealment
      • questionnaires not validated
    • inconsistency
      • almost all show positive effect, trend
      • heterogeneity p < 0.001; I 2 65.1%
    • indirectness
    • imprecision
      • RR 0.4, 95% CI 0.29 to 0.57
    • reporting bias
      • 40 to 234 patients, most around 100
  • 120. Recommendation
    • for clinician
      • offer to patient
      • don’t offer to patient?
    • strength of recommendation
      • strong or weak
    • for the funding body
      • publicly funded
      • not publicly funded
      • strong or weak?
  • 121. Is France right?
    • recommendation
      • yes
      • no against use
    • strength
      • strong
      • weak
  • 122. Resource Use
    • why not cost?
      • may lead to focus on cost of intervention rather than downstream resource use
      • resource use emphasizes alternative uses of resources (opportunity cost)
  • 123. Resource Use just another outcome?
    • yes and no
    • who benefits?
      • other outcomes usually clear
      • costs borne by different payers
        • across societies and within (age)
    • some argue costs aren’t relevant to clinicians when third party payer
  • 124. Why resource use different
    • costs vary much more than other outcomes
      • across jurisdictions
      • within jurisdictions
      • over time
    • even when resource use the same, implications may differ
      • year’s supply of expensive drug
      • nurses’ salary in U.S., 6 in Poland, 30 in China
  • 125. Why resource use different
    • opportunity cost differs by perspective
    • hospital pharmacy, fixed budget
      • new expensive drug, clear what give up
    • envelope public spending
      • more on health, less on education, social services
      • will refraining from spending on drugs really mean more for other services?
      • should envelope include military spending?
  • 126. Implications
    • unbearable lightness of resource use
    • consider balance of desirable and undesirable before considering resource use
    • may decide not to consider resource use at all
      • intervention not useful
      • desirable consequences >>>> undesirable
      • relevant only when difference small
  • 127. Similarities with other outcomes
    • only consider important resource use
    • need estimate of difference between trt and control
    • explicit judgments about the quality of the evidence, special judgments
      • perspective
      • how to judge quality of evidence
      • ? use of economic model
  • 128. Evidence summary
    • includes quality of evidence, summary of findings
      • “ balance sheet”, special form of grade profile
    • resource use and not just costs
      • can judge whether resource use applicable to local setting
      • focus on coss relevant to them (pharmacy)
      • apply unit costs to local setting
  • 129. Example question
    • patients
      • women with pre-eclampsia
    • intervention
      • intravenous magnesium
    • RCT done in 33 countries
      • over 9,000 patients
    • for presentation of resource use evidence need to specify perspective
      • health system
  • 130.   Quality assessment   Studies Design Limitations Inconsistency Indirectness Imprecision No of patients Relative effect (95% CI) Quality  Eclampsia                 Duley 2003 RCT No one trial only No No 9,992 RR 0.41 (0.29-0,58) High Maternal death   Duley 2003 RCT No one trial only No Imprecision 9,992 RR 0.54 (0.26-1,10) Moderate
  • 131. Quality assessment       Design Limita-tions Inconsis-tency Indirect- ness Impre-cision Resources Costs per patient Studies   per patient (US $; year 2001)     Placebo MgSO4 Placebo MgSO4 Magnesium sulphate                   High GNI         0 6 0 20   Simon 2005 Middle GNI RCT No one trial only No No 0 6 0 3 High   Low GNI           0 6 0 5   Administration of the drug                       High GNI         0 1 0 66   Simon 2005 Middle GNI RCT No one trial only No No 0 1 0 14 High   Low GNI           0 1 0 8   Other hospital resources a, b                     High GNI       Large variationresources c NA NA 12,839 12,818   Simon 2005 Middle GNI RCT No one trial only No NA NA 1,412 1,416 Moderate   Low GNI         NA NA 155 157  
  • 132. Outcomes Typical control group risk Typical absolute effect (95% CI) Relative effect (95% CI) Nr. of participants (studies) Quality of the evidence Comments Clinical outcomes Eclampsia Severe RR 0.41 (0.29 - 0.58) 11,444  High 27 per 1,000 16 fewer per 1,000 (11 to 19) Not severe 15 per 1,000 9 fewer per 1,000 (6 to 11) Maternal death Severe RR 0.54 (0.26 - 1.10) 10,795   Moderate 2 6 per 1,000 3 fewer per 1,000 (0.6 more to 4 fewer) Not severe 3 per 1,000 1 fewer per 1,000 (0.3 more to 2 fewer) Side effects 46 per 1,000 3 196 more per 1,000 (165 to 231) RR 5.26 (4.59 - 6.03) 9.992  High Mostly flushing. Other side effects include nausea, vomiting, slurred speech, muscle weakness, dizziness, drowsiness, confusion and headache.
  • 133. Resource use from the perspective of the health system control grop difference trt vs control Magnesium sulphate ampoules 0 6 10 ml. ampoules per woman 9.996  High Cost High GNI Middle GNI Low GNI $20 more per patient $ 3 more per patient $ 5 more per patient Administration of magnesium sulphate 0 1 per woman 9.996  High Cost High GNI Middle GNI Low GNI $66 per patient $14 per patient $ 8 per patient Resources for administering magnesium sulphate included midwife time (main cost), intravenous cannula/needle, syringe, IV fluids, drug. Other hospital resources Varied widely 9.996   Moderate 5 There was large variation in the use of other hospital resources in both intervention and control groups. Cost High GNI Middle GNI Low GNI $12,839 $ 1,416 $ 157 $20 less per woman (0 to 60) $ 4, less per woman (0 to 10) $ 2 less per woman (1 to 3) Other hospital costs have been adjusted based on the influence of eclampsia to control for the many other factors that influenced these costs.
  • 134. Issues in resource use
    • broad perspective desirable
      • narrow perspectives ignore much resource use
      • users can pick costs relevant to them
      • either health care system or societal
        • indirect costs controversial
    • indirect evidence of resources use
      • costs only reported
      • RCT but doesn’t reflect practice
        • ulcer prevention everyone gets repeat endoscopy
  • 135. Quality of evidence for resource use
    • rules basically the same
      • RCTs start high, observational low
    • may need multiple sources of evidence
      • RCTs may not fully report resource use
        • variation across settings
      • RCT may not reflect real world
      • time frame may extend beyond trial
    • different quality for different resources
      • mag sulphate versus hospital resources
  • 136. Formal economic models
    • limitations
      • supported by industry, biased
      • setting specific
      • reduces transparency
      • if evidence low quality, speculative
      • often many assumptions
    • solution: develop own model
      • OK if you are NICE with lots of resources
    • even so, don’t include in profile
  • 137. Costs versus affordability
    • intervention may be “cost-effective”
      • $10,000 per qaly gained
    • but if applicable to huge proportion of population, may still be unaffordable
  • 138. healthy asymptomatic postmenopausal qomwn: HRT in 1992?
    • Possible benefits
      • CHD, Hip fracture, Colorectal cancer
    • Possible harms
      • Breast cancer
      • Stroke
      • Thrombosis
      • Gall bladder disease
    Can GRADE lead to change?
  • 139. Evidence profile: Quality assessment Oestrogen + progestin for prevention in 1992 (before WHI and HERS) Oestrogen + progestin versus usual care
  • 140. Oestrogen + progestin for prevention after WHI and HERS
  • 141. Postulate
    • major work in preparing guideline/HTA assessment is systematic review
    • If already doing this, GRADE framework should add little
    • history: Rolls-Royce and Volkswagen
  • 142. VW and RR appraoches
    • Rolls Royce (NICE)
      • systematic review for every outcome
      • production of evidence profiles
      • involvement of multiple constituencies
        • including patients
      • inclusion of economic analysis
    • cost $1 million per guideline
  • 143. MOPED GRADE
    • UpToDate
      • 5,000 graded recommendations
    • generate PICO (informal)
      • no formal rating of outcome importance
    • use of existing reviews, primary studies
      • no new evidence syntheses
    • quality for key outcomes
      • 5 reasons rating down, 3 up
      • no new evidence profiles, SoF tables
    • recommendations
      • strong or weak, consider 3 factors
      • value and preference statements
  • 144. ACCP
    • formal structured questions
    • no formal rating of outcome importance
      • trying to change
    • hit-and miss systematic reviews
      • largely only available ones
    • hit-and-miss individual study evidence summaries
    • rare evidence profiles
      • trying to change
  • 145. VW approach
    • take systematic reviews if available
    • if not, review key, accessible evidence
    • no meta-analysis if not done
    • no evidence profiles
    • small group make expert judgement
  • 146. Conclusion
    • clinicians, policy makers need summaries
      • quality of evidence
      • strength of recommendations
    • explicit rules
      • transparent, informative
    • GRADE
      • simple, transparent, systematic
      • increasing wide adoption
  • 147.