Understanding applicability (also referred to as relevance): the extent to which published results are likely to reflect expected outcomes when an intervention is applied broadly across populations.

  • Assessing Applicability
  • Systematic Review Process Overview This slide illustrates the steps in the systematic review process. This module focuses on the assessment of applicability.
  • Learning Objectives There are three main objectives for this module. At the conclusion of the module, the successful learner will be able to describe applicability and substantiate its importance, delineate a systematic approach to assessing applicability by using PICOS domains (population, intervention, comparators, outcomes, and setting), and recognize common features that may impact the applicability of studies.
  • Defining Applicability Applicability is defined as the extent to which the results observed in published studies are likely to reflect the expected outcomes when an intervention is applied to broader populations under real-world conditions. Applicable results likely reflect the expected outcomes when the intervention is employed in clinical practice. Terms such as external validity, generalizability, directness, and relevance are similar to applicability. Keep in mind that applicability depends on the user of the information. Studies conducted in the United States may be more applicable to clinicians in the United States but less applicable in the developing world. Studies conducted in large academic medical centers in major cities might not be as applicable to rural settings. In general, we are interested in the applicability of evidence to a primary care population in the United States. Reference: Shadish WR, Cook TD, Campbell DT. Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin; 2002.
  • The Importance of Applicability in Systematic Reviews Unless the evidence synthesized within a systematic review is applicable to relevant populations under realistic practice conditions, the conclusions may be inappropriate for guiding clinical or policy decisions.
  • Framing Applicability Issues When setting up a systematic review, researchers need to understand contemporary practice. Frame applicability to specific clinical or policy questions the systematic review is intended to inform. Is the systematic review intended to inform Medicare? If so, then studies in older patients or in settings where older patients reside, such as assisted-living facilities, will have greater applicability. Consider applicability when determining the scope of the review or when identifying key questions to be answered. However, this module focuses on handling applicability issues at the analyze-and-synthesize-data stage.
  • Applicability Resources In the Effective Health Care Program, clinical experts and stakeholders help ensure that the comparative effectiveness reviews are relevant to their constituencies. As such, they can provide valuable insight when posed questions such as: What does the population of interest in the community look like? Are they mostly female, mostly elderly, mostly ethnic minorities? What types of care or procedures are routine or represent the standard of care? Are certain subpopulations biologically or clinically different from others? In heart failure trials, two-thirds of white patients have heart failure as a result of ischemic heart disease. However, two-thirds of African Americans have heart failure as a result of long-standing hypertension. The left ventricular mass, average age of onset, and rate of progression are different based on the ethnicity of the population. Thus, in subsequent clinical trials, it appears that certain therapies, such as hydralazine with isosorbide dinitrate, work better in African Americans, but β-blockers, such as metoprolol succinate and bucindolol, work better in whites.
  • Other Applicability Resources Many readily available sources of information can help determine applicability, including registry or epidemiologic information. Registries may provide information on how the populations in trials differ from the populations that would receive the intervention in actual practice. Comparing the demographics of patients in a disease registry (age, gender, comorbidities, baseline medications) to those of patients in a clinical trial can be enlightening when assessing applicability. Unlike benefits and harms data, applicability data do not have to be reviewed systematically. Applicability data should be used to place the available benefits and harms literature in context. All other things being equal, a literature base drawn from patients much younger and healthier than the target population, or using techniques that cannot readily be employed in contemporary practice, will not be as useful as a representative literature base.
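The registry-versus-trial comparison described above can be sketched as a simple screening routine. This is an illustrative sketch only: the field names, example numbers, and the 20% relative-difference threshold are invented assumptions, not part of any AHRQ method.

```python
# Illustrative sketch: flag demographic fields where a trial population differs
# from a disease-registry population by more than a relative tolerance.
# Field names, numbers, and the 20% threshold are invented assumptions.

def flag_differences(trial, registry, rel_tol=0.2):
    """Return messages for fields differing from the registry by more than rel_tol."""
    flags = []
    for field in sorted(trial.keys() & registry.keys()):
        t, r = trial[field], registry[field]
        if r != 0 and abs(t - r) / abs(r) > rel_tol:
            flags.append(f"{field}: trial={t}, registry={r}")
    return flags

trial_demographics = {"mean_age": 58, "pct_female": 0.30, "pct_diabetes": 0.10}
registry_demographics = {"mean_age": 74, "pct_female": 0.52, "pct_diabetes": 0.35}

for flag in flag_differences(trial_demographics, registry_demographics):
    print(flag)
```

In this invented example all three fields are flagged, suggesting the trial population is younger, more male, and healthier than the registry population that would actually receive the intervention.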
  • How Individual Studies Consider Applicability When judging applicability, there are several general considerations. First, studies often do not report on factors needed to judge applicability. Increased use of CONSORT (Consolidated Standards of Reporting Trials) diagrams helps to increase transparency in clinical trials, but deficiencies remain. Reviewers must decide how to handle important missing information needed to judge applicability. Only a small number of studies are designed with applicability in mind. Most trials are efficacy trials, which maximize internal validity at the expense of applicability. Whereas internal validity—or the ability to say “all other things being equal, the intervention is causing these effects”—is very important, careful patient selection, tightly controlled interventions, and measurement of only a few specific outcomes (including the omission of important potential adverse events) negatively impact applicability.
  • Study Parameters Affect Applicability Study parameters may affect applicability through different mechanisms. Effect modifiers are cofactors that determine the extent to which an intervention results in an outcome. An example is estrogen receptor-positive breast cancer for which tamoxifen is effective versus other forms of breast cancer for which it is not. Some study-related factors alter the perceived balance of benefits to harms by selecting a population at low risk of experiencing harms or by using a run-in period to eliminate patients who would experience adverse events or become noncompliant before randomization. Intervention features may also affect the ability to generalize the effectiveness or safety of the intervention to use in everyday practice. For example, trials conducted in tertiary medical centers with experienced surgeons performing a procedure after years of practice may not reflect the effects that would occur if this procedure were performed in smaller community hospitals.
  • Applicability Judged for Each Question Rather than judging the applicability generally for a project, applicability should be judged separately for each important question. Factors influencing benefits and harms often depend on distinct physiologic processes, and studies available to help answer one question may not be applicable for evaluating other outcomes of interest. In addition, the most applicable evidence may differ when considering a benefit or a harm. Clinical trials may provide evidence to answer a question about the applicability of a benefit but exclude patients with baseline risks of adverse effects. In this case, a cohort study might be more applicable for that harm.
  • Applicability of a Body of Evidence Applicability of a body of evidence is not the same as applicability of individual studies. A collection of studies may provide broad applicable evidence even if individual studies do not. Consistency of study results across different populations and settings increases confidence in the applicability of the results. Consider one study conducted entirely in a population of Caucasian males, one mostly with females, and another with a predominantly African American and Latino population, all showing the same direction of effect when an intervention is applied. Individually, the studies might have limited applicability, but together the body of evidence is applicable. Applicability depends on context and cannot be assessed with a simple rating scale. No empiric data exist that validate a scoring system for applicability.
  • Judge Applicability and Strength of Evidence Separately Applicability is best reported separately from quality or strength of evidence. Internal validity concerns are not altered by changes in perspective, but applicability is. Studies conducted only with Latinos may not be applicable to Asians, but they are very applicable to clinicians treating Latinos. Some evidence hierarchies, such as GRADE, consider applicability (or “directness”) when assessing the quality of evidence. In the Effective Health Care Program, the perspectives of clinicians, patients, and policymakers are usually considered. This suggests that applicability should be assessed separately from quality. GRADE = Grading of Recommendations Assessment, Development and Evaluation.
  • Three Steps To Assess Applicability This is a systematic but practical approach to judge applicability. First, identify beforehand a limited number of factors that are most likely to influence applicability of evidence for the specific issue being reviewed and report the relevant factors from individual studies. Start with a broad list in PICOS format (population, intervention, comparators, outcomes, and setting) and then select those that are really important and likely to affect applicability. Keep this list reasonable to avoid being too nihilistic. Once you have the list, report these key factors and why they are important in the methods section. Place extracted applicability data from individual studies into evidence tables. Second, judge limitations to the applicability of individual studies based on those factors. Note major applicability limitations of studies in the last column of the evidence tables and highlight effectiveness studies. Finally, summarize the applicability of the body of evidence from multiple studies with respect to specific populations, interventions, comparisons, and outcomes. In summary statements, be sure to describe to whom the evidence applies and highlight specific limitations to applicability of the evidence. Using an applicability summary table for major questions may be helpful.
  • Step 1. Determine and Report the Most Important Factors in PICOS Format In Step 1, first identify conditions that affect applicability for the topic at hand. This includes understanding the real-world context and how studies may differ from this context. Applicability issues may be organized by using the PICOS format (i.e., population, intervention, comparators, outcomes, and setting). We explore five areas so that we know what the population in the real world looks like, what interventions are currently used, what comparators are most relevant, what outcomes are of greatest interest, and in what settings care is delivered. Once this is done, features from individual studies that may affect applicability should be presented in evidence tables.
  • Population and Applicability: Examples Here are some examples where the populations used in trials limited applicability. In the first example, to maximize internal validity, the Fracture Intervention Trial (FIT) used very stringent inclusion and exclusion criteria. As such, investigators had to screen 54,000 patients to enroll 4,000 eligible patients. The eligible patients were younger, healthier, and more adherent to therapy than is typical of patients with osteoporosis, which would reduce applicability of the trial results to typical osteoporosis patients. In the second example, patients meeting inclusion and exclusion criteria were put through an active run-in period. They were placed on etanercept. If they exhibited adverse effects, they were excluded from being randomized into the trial. Since only patients tolerating therapy were randomized, the trial found a very low incidence of adverse events. The general population will exhibit many more adverse effects than those elucidated in this trial, reducing the applicability of the harms information derived from the trial. In the third example — a general assessment of clinical trials used to inform Medicare decisions — patients enrolled in the clinical trials were younger and more often male than is typical of the Medicare population, thus reducing the applicability of the evidence gathered from these clinical trials.
  • Population and Applicability When most people think of applicability, they focus only on the population. However, as we will see in subsequent slides, many other factors should be assessed as well. Determining how well the study population approximates the target population is critical to determining applicability. First, review the eligibility criteria and ascertain the proportion of screened individuals enrolled in the study. Use of narrow eligibility criteria or a high exclusion rate reduces applicability. Next, the demographics need to be reviewed. Differences in age, gender, race, or other important factors between the study population and patients in the community reduce applicability. Next, the severity or stage of illness needs to be evaluated. If the population is a referral or specialty population, the applicability to the general population with the disease in the primary care arena is reduced. Next, ascertain whether there was a run-in period and, if so, the reasons for attrition before randomization. If high levels of nonadherence, adverse effects, or lack of response are found, the applicability of the study results is reduced because the study will show better safety or superior efficacy than would be seen in the general population with the disease. Next, compare the event rates in the treatment and control groups to community averages. If the number of events in the control group, for example, is markedly higher than in the community population, then applicability is reduced because the study population is sicker than the general population. Finally, evaluate the prevalence of the disease in diagnostic test studies. If the disease prevalence is markedly different from that in the community, then applicability is reduced. Reference: Gartlehner G, Hansen RA, Nissman D, et al. A simple and valid tool distinguished efficacy from effectiveness studies. J Clin Epidemiol 2006;59:1040-8. http://www.ncbi.nlm.nih.gov/pubmed/16980143
  • Intervention and Applicability: Examples Here are some examples in which the intervention negatively affected applicability. On the television series, “The Biggest Loser,” obese subjects are removed from their jobs and families and sent to a resort where physicians, dietitians, and personal trainers work with them on losing weight. However, most Americans cannot afford to leave their jobs and families for several months and devote all of their time and effort to lifestyle modification. So although the show can demonstrate the dramatic effect that lifestyle modification can have under optimal practice conditions, the applicability of this intervention for the average American is low. In the second example, selecting patients who are more compliant and then instituting techniques such as pill counts, telephone reminders, and compliance incentives can overestimate the degree of benefit that a therapy can provide in clinical practice. This is especially true in diseases that afflict patients with lower compliance rates such as abusers of alcohol and drugs. In the third example, the regimen calls for iron to be given with zinc, which attenuates the ability of iron to raise hemoglobin levels. As such, these data are not as applicable to clinical practice where patients with microcytic anemia are given iron supplementation without zinc. In the fourth example, trials are conducted to evaluate the benefits and harms of carotid endarterectomy in patients at risk for stroke. However, several trials allow only surgeons who have extensive experience in performing the procedure and demonstrably low complication rates. These trials are not applicable to most patients who will be receiving this procedure in clinical practice.
  • Intervention and Applicability There are several factors that can limit the applicability of the intervention being employed in studies and that need to be assessed. First, assess the medication dose, schedule, and duration. The researcher should determine if this regimen is reflective of current practice. If the dose is much higher than previously used or requires multiple daily doses known to reduce compliance, then applicability might be reduced. Second, if the intensity of the intervention is not feasible for routine care, applicability is reduced. Third, if monitoring practices or visit frequency is more intensive than that used in practice, applicability is reduced. Fourth, employing versions of technology that are not currently in use, either because they are too new or because they are outdated, can reduce applicability. Fifth, if cointerventions would likely modify the effectiveness of therapy, applicability might be reduced. Finally, if the training and skill level of an intervention team exceed those of the general practitioners who would be performing the intervention in the community, then applicability is reduced.
  • Comparator and Applicability: Examples Here are three examples where the choice of comparator negatively affected applicability. In the first example, a study of a new antidepressant compares its higher dose regimen to a low dose of its comparator. In clinical practice, most patients are on higher doses of the comparator. In the last two examples, the problem is not with the stated comparator per se, but rather with the background therapies being employed. Although some would think of these issues as applicability issues associated with the population, they are placed here for added emphasis. In the second example, trials comparing magnesium to placebo in patients with acute myocardial infarction are included. However, these trials were undertaken before therapies that have been shown to reduce morbidity and mortality were being employed. Thus, the important question is not whether magnesium would benefit patients with a myocardial infarction but whether it would still provide benefit in addition to the standard medical therapy currently being used for acute myocardial infarction. In the last example, bypass surgery and angioplasty are being compared. Unfortunately, these clinical trials, by and large, compared bypass surgery versus angioplasty with or without a bare metal stent. In contemporary practice, however, a large subset of patients undergo angioplasty with a drug-eluting stent.
  • Outcomes and Applicability: Examples Here are two examples of how the outcomes used negatively affected applicability. Although the use of objective data in these two examples is good for internal validity, the applicability of these trials is not as strong. Demonstrating an improvement in health-related quality of life, reduction in hospitalizations, or development of ulcers requiring intervention would be more clinically meaningful.
  • Comparator, Outcomes, and Applicability On this slide, we discuss how the comparator and the outcomes being assessed can affect applicability. For the comparator, the researcher first needs to evaluate the medication dose, schedule, and duration. If this regimen is not reflective of clinical practice, then the applicability of the study is reduced. Similarly, if there are several potential comparators, the researcher needs to determine whether a representative one or a substandard one (i.e., known either to have poorer efficacy or greater adverse events) was chosen. For outcomes, the researcher needs to evaluate the outcomes and how they were defined. If the definition is markedly different from those used in clinical practice, then that negatively affects applicability. Similarly, the use of surrogate end points such as blood pressure or cholesterol, instead of health outcomes such as myocardial infarctions and strokes, can weaken applicability. The use of composite end points can also lower applicability because it is not always clear which components of the composite end point are being affected and to what extent.
  • Setting and Applicability: Examples Here are two examples showing the relationship between setting and applicability. In the first example, the benefits of breast self-examinations are being evaluated in Shanghai and St. Petersburg. However, in China and Russia, the use of other more-sensitive breast cancer-detection methods such as mammography is minimal. Without another method to detect breast cancer, these studies may overemphasize the value of breast self-examinations. In the second example, it is known that hospitals performing large numbers of abdominal aortic aneurysm repair surgeries have lower short-term mortality rates. Thus, when evaluating studies comparing complication rates between surgical repair and other therapies, the setting where the studies take place is very important in determining applicability.
  • Setting and Applicability Setting refers to either the geographic setting or the clinical setting. If a study is conducted in a country or region where standards of care differ markedly from the setting of interest, applicability may be reduced. Similarly, if the clinical setting differs markedly from that seen in the community (e.g., a specialty care setting), applicability is reduced.
  • Step 2. Make and Report Judgments About Major Limitations to Applicability of Individual Studies In Step 2, make and report judgments about major limitations to the applicability of individual studies. First, judge whether a study is an effectiveness trial or an efficacy trial. Effectiveness trials are specifically designed to produce applicable results and should be highlighted. If both efficacy and effectiveness trials are available, they should be compared and contrasted. Finally, clearly report and highlight major applicability issues of efficacy trials and what impact they may have on the interpretation of the study.
  • Effectiveness Trials Here we provide a tool for differentiating effectiveness trials from efficacy trials. Effectiveness trials have greater applicability than efficacy trials and, in this method, meet at least five of the following seven criteria:
    • Enroll a primary-care population
    • Have less-stringent eligibility criteria
    • Assess health-related outcomes instead of only intermediate ones
    • Have a long study duration and clinically relevant treatment modalities
    • Assess for adverse events
    • Have an adequate sample size to assess minimally important differences from a patient perspective
    • Use intention-to-treat analysis
    This is only for illustrative purposes; other authors, such as Thorpe et al. (2009), have their own proposed criteria differentiating effectiveness from efficacy trials. References: Gartlehner G, Hansen RA, Nissman D, et al. A simple and valid tool distinguished efficacy from effectiveness studies. J Clin Epidemiol 2006;59:1040-8. http://www.ncbi.nlm.nih.gov/pubmed/16980143 Gartlehner G, Thieda P, Hansen RA, et al. Inadequate reporting of trials compromises the applicability of systematic reviews. Int J Technol Assess Health Care 2009;25:323-30. http://www.ncbi.nlm.nih.gov/pubmed/19619351 Thorpe KE, Zwarenstein M, Oxman AD, et al. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J Clin Epidemiol 2009;62:464-75. http://www.ncbi.nlm.nih.gov/pubmed/19348971
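As an illustration, the five-of-seven criteria count can be expressed as a small checklist routine. The criterion names and the example trial below are invented for illustration; the only element taken from the source is the Gartlehner threshold of meeting at least five of seven criteria.

```python
# Illustrative sketch: classify a trial as an "effectiveness" or "efficacy"
# trial by counting how many of the seven Gartlehner criteria it meets.
# Criterion names and the example trial are invented for illustration.

GARTLEHNER_CRITERIA = [
    "primary_care_population",
    "less_stringent_eligibility",
    "health_related_outcomes",
    "long_duration_relevant_modalities",
    "adverse_events_assessed",
    "adequate_sample_for_patient_mid",  # minimally important differences
    "intention_to_treat_analysis",
]

def classify_trial(criteria_met, threshold=5):
    """Return "effectiveness" if the trial meets at least `threshold` criteria."""
    score = sum(bool(criteria_met.get(c)) for c in GARTLEHNER_CRITERIA)
    return "effectiveness" if score >= threshold else "efficacy"

example = {c: True for c in GARTLEHNER_CRITERIA[:4]}  # meets only 4 of 7
print(classify_trial(example))  # efficacy
```

A trial meeting only four criteria is classified as an efficacy trial; one meeting five or more would be highlighted as an effectiveness trial.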
  • Make Judgments About the Applicability of Individual Studies For each efficacy trial, clearly report characteristics that may limit applicability. These limitations should be highlighted in the text or evidence tables under the heading “comments” or “limitations.” When describing these limitations, describe how important factors would affect applicability and the expected direction and magnitude of bias. For example, trials with run-in periods might have more adherent participants with less risk of adverse events. In some instances, subgroup analyses or meta-regression can be used to explore the impact of these factors on the results. For example, if several trials enroll a higher-than-average-risk population but other trials do not, subgroup analyses can be conducted. Similarly, if there is a continuum of baseline risk, meta-regression can be conducted. This step is not about reporting applicability elements (that was Step 1), but rather about summarizing and making judgments about the applicability of the individual study.
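As a sketch of the subgroup-analysis idea above, the following pools log odds ratios with fixed-effect inverse-variance weights within each subgroup of trials. All effect sizes and standard errors are invented; a real review would also consider random-effects models and a formal test for subgroup differences.

```python
import math

# Illustrative sketch: fixed-effect inverse-variance pooling within subgroups,
# one way to explore whether an applicability factor (here, baseline risk)
# modifies the treatment effect. The log odds ratios and standard errors
# are invented for illustration.

def pooled_effect(effects, std_errs):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errs]
    estimate = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return estimate, pooled_se

# Trials split by an applicability factor identified a priori.
high_risk_trials = ([-0.45, -0.50, -0.38], [0.10, 0.12, 0.15])
low_risk_trials = ([-0.10, -0.05], [0.11, 0.14])

for label, (effects, ses) in [("high-risk", high_risk_trials),
                              ("low-risk", low_risk_trials)]:
    est, se = pooled_effect(effects, ses)
    print(f"{label} subgroup: pooled log OR = {est:.2f} (SE = {se:.2f})")
```

In this invented example the pooled effect is much larger in the high-risk subgroup, which would prompt a judgment about how applicable the overall estimate is to lower-risk patients.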
  • Evidence Table Template for Applicability Here is a template for an applicability evidence table. For each specific key question being asked, an applicability evidence table is generated. In Step 1 of the process, data on factors identified a priori as being most likely to impact applicability are extracted for each study and entered in the table. The PICOS format (i.e., population, intervention, comparators, outcomes, and setting) is used to organize the data. In Step 2, judgments about whether a trial is an effectiveness or an efficacy trial are made, and then components that may negatively impact applicability are summarized in the “comments” column. In this example, a trial is identified. In the population column, the population of interest, mean age, and New York Heart Association (NYHA) classification of disease severity are given. Although there are numerous demographic factors that could have impacted applicability, these were selected a priori as being the most important. In the intervention column, the type of surgical procedure is described. In the comparator column, the comparator is described, and then the extent of use of other medications already known to impact the disease process is identified. The main outcomes of interest in the key questions being asked, median followup, and setting are identified. Finally, given the information on the PICOS domains, the judgment that this is an efficacy trial is made and limitations to applicability are delineated. For our purposes, assume that four other studies were also available with similar PICOS domains. Please read through the data extracted for this study, because we will revisit this example throughout the rest of the module. Reference: Slutsky J, Atkins D, Chang S, et al. Assessing applicability when comparing medical interventions: AHRQ and the Effective Health Care Program. In: Methods guide for comparative effectiveness reviews. Rockville, MD: Agency for Healthcare Research and Quality, Posted November 2008. Available at: http://www.effectivehealthcare.ahrq.gov/ehc/products/118/324/2008_1118CompareInterventions.pdf.
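One minimal way to hold the extracted PICOS data for each study is a simple record type. This sketch is an assumed data-handling convenience, not part of the AHRQ template; the example entries are hypothetical and only loosely follow this module's heart-failure case study.

```python
from dataclasses import dataclass, asdict

# Illustrative sketch of one row in a PICOS applicability evidence table.
# All example entries below are hypothetical.

@dataclass
class PicosRecord:
    study: str
    population: str
    intervention: str
    comparator: str
    outcomes: str
    setting: str
    comments: str = ""  # major applicability limitations noted in Step 2

row = PicosRecord(
    study="Hypothetical heart failure trial",
    population="NYHA class II-III",
    intervention="Myocardial debulking",
    comparator="Watchful waiting",
    outcomes="Hospitalizations, survival",
    setting="Large tertiary medical centers",
    comments="Efficacy trial; limited use of standard background therapy",
)
print(asdict(row))
```

Collecting one such record per study, per key question, mirrors the evidence-table structure described above.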
  • Step 3. Consider and Summarize the Applicability of a Body of Evidence From Individual Studies In contrast to Step 2, which is concerned with evaluating the applicability of individual studies, Step 3 involves assessing the aggregate applicability of the evidence for each component of the PICOS framework (i.e., population, intervention, comparators, outcomes, and setting) and making an overall assessment of applicability. A summary table with a PICOS structure should be generated. The applicability of a body of evidence is not simply the sum of the applicability of the individual studies. As we will show in the next example, each individual study can have relatively low applicability in at least one aspect of PICOS, yet the body of evidence can have high applicability.
  • Which Series of Studies Has an Applicability Issue? In this example, we focus on the age of patients with a disease. In clinical practice, the age of patients with this disease is dispersed across a continuum from low to high with most patients in the low-to-moderate age range. In both scenarios, there is a significant reduction in the odds of the event with therapy and no statistical heterogeneity. In scenario 1, the studies are clustered around the higher age range. However, we cannot be truly confident that the benefits seen in higher aged populations can be extrapolated to those in the low-to-moderate age range. Thus, applicability is reduced. In contrast, the studies in scenario 2 are dispersed across the continuum of age, which provides greater confidence in the ability of the results to be extrapolated.
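The contrast between the two scenarios above can be illustrated with a crude coverage check on trial mean ages. The 60% span threshold and all ages below are invented assumptions; real judgments about extrapolation remain qualitative.

```python
# Illustrative sketch: check whether trial mean ages span the clinical age
# continuum (scenario 2) or cluster at one end (scenario 1). The threshold
# and all numbers are invented for illustration.

def covers_continuum(study_mean_ages, target_low, target_high, min_span=0.6):
    """True if the study ages span at least min_span of the target age range."""
    span = (max(study_mean_ages) - min(study_mean_ages)) / (target_high - target_low)
    return span >= min_span

scenario_1 = [72, 75, 78, 80]      # clustered at the higher age range
scenario_2 = [45, 55, 62, 70, 78]  # dispersed across the continuum

print(covers_continuum(scenario_1, 40, 85))  # False: applicability concern
print(covers_continuum(scenario_2, 40, 85))  # True
```

Scenario 1 fails the check, matching the intuition that results clustered in older patients cannot confidently be extrapolated to the low-to-moderate age range.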
  • Step 3. Applicability Summary Table Template In this slide, we provide a template for summarizing the applicability for aggregated studies for each component of the PICOS framework (i.e., population, intervention, comparators, outcomes, and setting). We need to emphasize that this is a tool that can help the reviewer look across studies so that they can synthesize and describe applicability concerns or limitations accurately in the conclusions. For the population, describe the general characteristics of enrolled populations and how they may differ from the target population. Where possible, describe proportions (percentage older than 65 years of age) rather than an age range or average age. For the intervention, describe the general characteristics of the interventions, how they compare to routine practice, and how this might affect benefits or harms. For comparators, describe each, whether they reflect the best alternative therapy, and how this may impact treatment effect size. For outcomes, describe those most frequently reported and over what period of time. Describe whether these outcomes reflect the most important benefits or harms. For setting, describe the settings in which the studies were conducted and the potential implications of using the intervention elsewhere. With the use of a summary applicability table, it becomes easier for a reviewer to describe in the text how aspects of studies impact the interpretations of study results in answering a question.
  • Step 3. Completed Applicability Summary Table In this example, we enter data from our case study into the fields of the summary table. We state that the population and disease stage are representative of the United States population with heart failure and that the intervention is plausible. These factors strengthen applicability. Watchful waiting would be reasonable if the baseline treatment in both groups was standard medical therapy; unfortunately, this was not the case with our literature base, which limited applicability. Subgroup analyses suggest that benefits occur predominantly in those patients not receiving standard therapy, thereby exacerbating the limitations this places on applicability. In the outcomes field, we identify that hospitalizations and survival are being evaluated but that other outcomes, including harms, are not. Finally, the settings for the studies are large tertiary medical centers, which may overestimate the benefits of therapy in actual practice and accentuate the harms.
  • Summary Table for Body of Evidence Applicability In this slide, we use the template for a summary table to assess the applicability of a body of evidence. For the comparison of myocardial debulking therapy versus watchful waiting, the strength of evidence is determined to be low. We then give the conclusion: “When compared with watchful waiting, the use of myocardial debulking reduced hospitalizations without impacting survival.” Finally, we describe the applicability of this evidence: “These results are predominantly limited to patients 65 years or older with NYHA [New York Heart Association] class II and III disease.” This is similar to the population with heart failure in the United States. In a subgroup analysis, benefits were accentuated in patients not receiving standard medical therapy with ACE [angiotensin-converting enzyme] inhibitors and β-blockers, thus limiting applicability. No harms were evaluated. These trials were conducted in select, large, tertiary care medical centers and may not be applicable to other settings.
  • Key Messages (I) As shown in this module, the PICOS (population, intervention, comparators, outcomes, and setting) framework is a useful way of organizing the contextual factors of interest. Input from clinical experts and stakeholders and a review of current practices can help identify specific applicability issues. Population-based surveys, pharmacoepidemiologic studies, and large case series or registries of devices or surgical procedures can help determine how trial populations and circumstances differ from typical circumstances.
  • Key Messages (II) In this module, we also learned that: Reviewers should routinely select the most important characteristics that may limit the applicability of the evidence and assess if there is any evidence that variation within the factors identified truly modifies the outcome effects. Reports should clearly highlight issues that raise important questions about applicability of individual studies in a “comments” or “limitations” section in evidence tables or in the text. Meta-regression and/or separate applicability summary tables may help reviewers and those using the reports to see how well the body of evidence applies to the question at hand. Judgments about applicability of the evidence should consider the entire body of studies. Finally, we learned that important limitations of the applicability of the evidence should be described within each summary conclusion.
  • References
  • Author

Transcript

  • 1. Assessing Applicability Prepared for: The Agency for Healthcare Research and Quality (AHRQ) Training Modules for Systematic Reviews Methods Guide www.ahrq.gov
  • 2. Systematic Review Process Overview
  • 3.
    • To describe applicability and substantiate its importance
    • To delineate a systematic approach to assessing applicability by using PICOS domains
    • To recognize common features that may impact the applicability of studies
    Learning Objectives PICOS domains = population, intervention, comparators, outcomes, and setting
  • 4.
    • Applicability is defined as the extent to which the results observed in published studies are likely to reflect the expected outcomes when an intervention is applied to broader populations under real-world conditions.
    • Similar terms include external validity, generalizability, directness, and relevance.
    Defining Applicability
  • 5.
    • Unless the evidence synthesized within a systematic review is applicable to relevant populations under real-world practice conditions, the conclusions may be inappropriate for guiding clinical or policy decisions.
    The Importance of Applicability in Systematic Reviews
  • 6.
    • Frame issues of applicability with reference to specific clinical or policy questions the review is intended to inform.
    • Applicability needs to be considered at the outset:
      • when the scope of the review is determined, and
      • when the key questions are identified.
    • This module focuses on handling applicability issues at the analyze-and-synthesize-data stage.
    Framing Applicability Issues
  • 7.
    • Clinical experts and stakeholders can provide general information important to framing applicability issues.
      • What does the population of interest look like?
      • What types of care or procedures are routine or represent standard of care?
      • Are certain subpopulations characteristically different from others?
    Applicability Resources
  • 8.
    • Registry or epidemiological information, practice guidelines, consensus papers, book chapters, and general reviews can provide useful applicability information.
      • Applicability issues do not have to be reviewed for each study.
      • They are used to place the available literature in context.
    Other Applicability Resources
  • 9.
    • Studies often do not report on factors needed to judge applicability.
      • Increased use of CONSORT diagrams helps, but deficiencies remain.
    • Only a small number of studies are designed with applicability in mind.
      • Most trials are efficacy trials that maximize internal validity at the expense of applicability.
    How Individual Studies Consider Applicability CONSORT = Consolidated Standards of Reporting Trials
  • 10.
    • Study parameters may affect applicability through these different mechanisms:
      • Effect modifiers
      • Modifiers of baseline rates
      • Factors that reduce ability to generalize results to everyday practice
    Study Parameters Affect Applicability
  • 11.
    • Applicability should be judged separately for each important question.
      • Factors influencing benefits and harms often depend on distinct physiologic processes.
      • The most applicable evidence may differ when considering a benefit or a harm.
    Applicability Judged for Each Question
  • 12.
    • Applicability of a body of evidence is not the same as applicability of individual studies.
      • A collection of studies may provide broadly applicable evidence even if individual studies do not.
      • Consistency of study results across different populations and settings increases confidence in the applicability of the results.
    • Applicability depends on context and cannot be assessed with a simple rating scale.
      • There are no empiric data validating any scoring system for applicability.
    Applicability of a Body of Evidence
  • 13.
    • Applicability is best reported separately from quality or strength of evidence.
      • Internal validity concerns are not altered by changes in perspective, but applicability is.
      • Some evidence hierarchies (e.g., GRADE) consider applicability, or “directness,” when assessing the quality of evidence.
      • For systematic reviews in which multiple perspectives (clinician, patient, policymaker) are considered, applicability should be assessed separately.
    Judge Applicability and Strength of Evidence Separately
  • 14.
    • Stepwise approach to applicability:
      • Identify beforehand a limited number of factors that are most likely to influence applicability of evidence for the specific issue being reviewed and report the relevant elements from individual studies.
      • Judge limitations to the applicability of individual studies based on those factors.
      • Summarize the applicability of the body of evidence from multiple studies with respect to specific populations, interventions, comparisons, and outcomes.
    Three Steps To Assess Applicability
  • 15.
    • Identify most important applicability issues
      • Understand the real-world context
      • Understand how studies differ from the real-world context
      • Use PICOS format to organize factors
    • Report features from individual studies that impact applicability in evidence tables
    Step 1. Determine and Report the Most Important Factors in PICOS Format PICOS = population, intervention, comparators, outcomes, and setting
  • 16.
    • In a trial of women with osteoporosis, only 4,000 of the 54,000 women screened were enrolled; the enrollees were younger, healthier, and more adherent to therapy than is typical of women with osteoporosis.
    • A trial of etanercept for juvenile rheumatoid arthritis excluded patients who had side effects during an active run-in period; the trial found a low incidence of adverse events.
    • Clinical trials used to inform Medicare decisions enrolled patients who were younger (60 vs. 75 years of age) and more often male (75% vs. 42%) than is typical of Medicare patients with cardiovascular disease.
    Population and Applicability: Examples
  • 17. Population and Applicability
    Conditions that limit applicability → features that should be extracted into evidence tables:
      • Narrow eligibility criteria, high exclusion rate, low enrollment → eligibility criteria; proportion of screened individuals enrolled
      • Differences between patients in the study and the community → demographics (range and mean): age, gender, race, ethnicity
      • Narrow or unrepresentative severity or stage of illness → severity or stage of illness (referral or primary care population)
      • Run-in periods with high exclusion rates → run-in period: attrition rate before randomization and reasons (e.g., nonadherence, adverse drug events, no response)
      • Event rates markedly different from those in the community → event rates in treatment and control groups
      • Disease prevalence in the study population different from the community → prevalence of disease (for diagnostic studies)
  • 18.
    • Studies of behavioral modification to promote healthy diet employ a larger number and longer duration of visits than those available to most community patients.
    • The use of pill counts in antiretroviral trials does not always translate into effectiveness in real-world practice.
    • Combining iron and zinc attenuates the ability of iron to raise hemoglobin levels.
    • Trials of carotid endarterectomy select surgeons with extensive experience and low complication rates and are not representative of average vascular surgeons.
    Intervention and Applicability: Examples
  • 19. Intervention and Applicability
    Conditions that limit applicability → features that should be extracted into evidence tables:
      • Regimen not reflective of current practice → medication dose, schedule, duration
      • Intensity of intervention not feasible for routine use → intensity of behavioral interventions
      • Monitoring practices or visit frequency not used in practice → adherence to interventions
      • Versions not in common use → version of rapidly changing technology
      • Cointerventions that likely modify effectiveness of therapy → cointerventions
      • Level of training not widely available → training/skill level of intervention team (surgery/diagnostics)
  • 20.
    • A fixed-dose study that compared high-dose duloxetine to low-dose paroxetine
    • Many trials evaluating magnesium as a treatment for acute myocardial infarction that were conducted before thrombolytic drugs, antiplatelet drugs, ß-blockers, and primary percutaneous coronary intervention (PCI) were used
    • Only 1 of 23 trials that compared bypass surgery to PCI used drug-eluting stents
    Comparator and Applicability: Examples
  • 21.
    • Trials of biologics for rheumatoid arthritis that used radiographic progression rather than symptom evaluations as an outcome measure
    • Trials comparing cyclooxygenase-2 inhibitors and nonsteroidal antiinflammatory drugs that used endoscopy-evaluated ulceration rather than symptomatic ulcers as an outcome measure
    Outcomes and Applicability: Examples
  • 22. Comparator, Outcomes, and Applicability
    Conditions that limit applicability → features that should be extracted into evidence tables:
    Comparator
      • Regimen not reflective of current practice → medication dose, schedule, duration (if applicable)
      • Use of substandard alternative therapy → comparator chosen vs. others available (if applicable)
    Outcomes
      • Surrogate endpoints, brief follow-up periods, improper definitions for outcomes, composite endpoints → outcomes (benefits and harms) and how they were defined
  • 23.
    • Studies evaluating the benefits of breast self-examinations conducted in China and Russia, countries that do not employ routine mammography screening as in the United States
    • Studies of open surgical abdominal aortic aneurysm repair showing an inverse relationship between hospital volume and short-term mortality
    Setting and Applicability: Examples
  • 24. Setting and Applicability
    Conditions that limit applicability → features that should be extracted into evidence tables:
      • Settings in which standards of care differ markedly from the setting of interest → geographic setting
      • Specialty population or level of care that differs from the community → clinical setting
  • 25.
    • Make judgments about applicability of individual studies
    • Look for effectiveness trials
      • Specifically designed to produce applicable results
    • Report on applicability
      • Highlight effectiveness trials
      • Compare and contrast effectiveness and efficacy trials
      • Clearly report and highlight major applicability issues of efficacy trials and what impact they may have on the interpretation of the study
    Step 2. Make and Report Judgments About Major Limitations to the Applicability of Individual Studies
  • 26.
    • Seven criteria:
      • Enroll a primary-care population
      • Have less-stringent eligibility criteria
      • Assess health-related outcomes
      • Have a long study duration and clinically relevant treatment modalities
      • Assess for adverse events
      • Have an adequate sample size to assess minimally important differences from a patient perspective
      • Use intention-to-treat analysis
    Effectiveness Trials Gartlehner G, et al. J Clin Epidemiol 2006;59:1040-8; Gartlehner G, et al. Int J Technol Assess Health Care 2009;25:323-30.
  • 27.
    • For efficacy trials, clearly report characteristics that may limit applicability.
      • Describe those characteristics in the text or in evidence tables under the heading “comments” or “limitations.”
      • Describe how important factors would affect applicability and the expected direction and magnitude of bias.
    Make Judgments About the Applicability of Individual Studies
  • 28. Evidence Table Template for Applicability After reviewing this sample evidence table, assume that four other trials are available with similar PICOS domains. Slutsky J, et al. In: Methods guide for comparative effectiveness reviews . Available at: http://www.effectivehealthcare.ahrq.gov/ehc/products/118/324/2008_1118CompareInterventions.pdf. ACE = angiotensin-converting enzyme; NYHA = New York Heart Association; PICOS = population, intervention, comparators, outcomes, and setting
    Sample row:
      • Trial: Smith et al.24
      • Population: heart failure population; mean age, 65 years; NYHA class II or III, 83%
      • Intervention: surgical debulking of myocardium
      • Comparator: watchful waiting (ACE inhibitor use, 34%; ß-blocker use, 40%)
      • Outcomes, Setting: hospitalizations and survival; median followup at 1 year; single, large, tertiary care hospital
      • Comments: an efficacy trial; limited standardization of intervention; comparator did not include optimal medical therapy; unclear how the benefits and harms would compare in a smaller community hospital
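    A convenient way to hold these extracted features during a review is one record per trial, with a field per PICOS domain plus a comments field for applicability limitations. The sketch below is a minimal illustration in Python; the field names and values are our own, not part of the AHRQ template:

    ```python
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class EvidenceRow:
        """One trial's entry in a PICOS-structured evidence table."""
        trial: str
        population: str
        intervention: str
        comparator: str
        outcomes: str
        setting: str
        comments: List[str] = field(default_factory=list)  # applicability limitations

    # Populate a row from the sample trial above.
    row = EvidenceRow(
        trial="Smith et al.",
        population="Heart failure; mean age 65 years; NYHA class II-III, 83%",
        intervention="Surgical debulking of myocardium",
        comparator="Watchful waiting (ACE inhibitor 34%; beta-blocker 40%)",
        outcomes="Hospitalizations and survival; median followup 1 year",
        setting="Single, large, tertiary care hospital",
        comments=["Efficacy trial", "Comparator lacked optimal medical therapy"],
    )
    ```

    Keeping the applicability limitations as a separate list makes it simple to collect them later into a “comments” or “limitations” column of the evidence table.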
  • 29.
    • Describe the limitations of aggregate evidence in the conclusions.
    • Generate a summary applicability table with a PICOS structure.
      • Aggregate applicability is not the sum of applicability of individual studies.
      • Each individual study can have relatively low applicability in at least one PICOS domain, but the body of evidence can have high applicability.
    Step 3. Consider and Summarize the Applicability of a Body of Evidence From Individual Studies PICOS = population, intervention, comparators, outcomes, and setting
  • 30. Which Series of Studies Has an Applicability Issue? [Figure: two scenarios (Scenario 1 and Scenario 2) position the age ranges of Studies 1–4 along an axis running from lower to higher age; the scenarios differ in how the study ranges are distributed across that axis.]
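    The point of this comparison can be made concrete with a small coverage check: a body of evidence whose study age ranges jointly span the target population's age range supports broader applicability than one whose studies cluster at one end of the axis. A minimal sketch (all ages invented for illustration):

    ```python
    # Does the union of study (low, high) age ranges span the target age range?
    def covers(target, studies):
        lo, hi = target
        reach = lo
        for s_lo, s_hi in sorted(studies):
            if s_lo > reach:          # gap before this study's range begins
                return False
            reach = max(reach, s_hi)  # extend continuous coverage
        return reach >= hi

    # Scenario A: study ranges spread across the target range (45-85 years).
    print(covers((45, 85), [(45, 60), (55, 70), (65, 80), (75, 85)]))  # True
    # Scenario B: studies cluster in younger patients, leaving older ages uncovered.
    print(covers((45, 85), [(45, 55), (48, 58), (50, 62), (52, 65)]))  # False
    ```

    In the second case the individual studies may each be internally valid, yet the body of evidence still has an applicability gap for older patients.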
  • 31. Step 3. Applicability Summary Table Template
    Domain → description of applicability evidence for a key question:
      • Population: Describe the general characteristics of enrolled populations and how they may differ from the target population. Where possible, describe proportions (e.g., percentage older than 65 years) rather than an age range or average age.
      • Intervention: Describe the general characteristics of the interventions, how they compare to routine practice, and how this may affect benefits or harms.
      • Comparators: Describe comparators, whether they reflect best alternative therapy, and how this may impact treatment effect size.
      • Outcomes: Describe the outcomes most frequently reported and over what period of time. Describe whether or not these outcomes reflect the most important benefits or harms.
      • Setting: Describe the settings in which the studies were conducted and the potential implications of using the intervention elsewhere.
  • 32. Step 3. Completed Applicability Summary Table
      • Population: The population and disease stage are representative of the United States population with heart failure.
      • Intervention: The intervention is plausible.
      • Comparators: Watchful waiting is reasonable if the baseline treatment in both groups was standard medical therapy. Standard medical therapy is not being used in most patients. Subgroup analyses suggest that benefits are predominantly in those patients not receiving standard therapy.
      • Outcomes: Although hospitalizations and survival are being evaluated, other outcomes, including harms, are not.
      • Setting: The settings for the studies are large tertiary medical centers, which may overestimate the benefits of therapy in actual practice and accentuate the harms.
  • 33. Summary Table for Body of Evidence Applicability ACE = angiotensin-converting enzyme; NYHA = New York Heart Association
      • Comparison: myocardial debulking versus watchful waiting
      • Strength of evidence: low
      • Conclusions with description of applicability: When compared with watchful waiting, the use of myocardial debulking reduced hospitalizations without affecting survival. These results are predominantly limited to patients 65 years of age and older with NYHA class II and III disease. This is similar to the population with heart failure in the United States. In a subgroup analysis, benefits were accentuated in patients not receiving standard medical therapy with ACE inhibitors and ß-blockers, thus limiting applicability. No harms were evaluated. These trials were conducted in select, large, tertiary medical centers and may not be applicable to other settings.
  • 34.
    • The PICOS framework is useful for organizing contextual factors of interest.
    • Input from clinical experts/stakeholders and a review of current practices can help identify specific applicability issues.
    • Population-based surveys, pharmacoepidemiologic studies, and large case series or registries can help determine how trial populations and circumstances differ from normal practice.
    Key Messages (I) PICOS = population, intervention, comparators, outcomes, and setting
  • 35.
    • Reviewers should routinely select the most important characteristics that may limit applicability and assess whether variation within the factors may modify outcome effects.
    • Reports should clearly highlight issues about applicability of individual studies in a “comments” or “limitations” section in evidence tables or in the text.
    • Meta-regression and/or separate applicability summary tables may help reviewers, and those using the reports, to see how the body of evidence applies to the question at hand.
    • Judgments about applicability of the evidence should consider the entire body of studies.
    • Important limitations of the applicability of the evidence should be described within each summary conclusion.
    Key Messages (II)
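    Meta-regression, mentioned above as one way to examine how well a body of evidence applies to the question at hand, regresses study-level effect sizes on a study-level characteristic. A minimal fixed-effect sketch using inverse-variance weighted least squares with NumPy (all numbers are invented for illustration; a real review would use dedicated meta-analysis software and typically a random-effects model):

    ```python
    import numpy as np

    # Invented study-level data: log odds ratios, their variances, mean patient age.
    effects = np.array([-0.40, -0.30, -0.10, 0.05])
    variances = np.array([0.04, 0.05, 0.03, 0.06])
    mean_age = np.array([55.0, 60.0, 70.0, 78.0])

    # Weight each study by the inverse of its variance and fit
    # effect ~ intercept + slope * mean_age by weighted least squares.
    w = 1.0 / variances
    X = np.column_stack([np.ones_like(mean_age), mean_age])
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * effects))
    intercept, slope = beta

    # A nonzero slope suggests mean age modifies the treatment effect, which
    # bears directly on applicability to older or younger target populations.
    print(slope > 0)  # True: the effect moves toward the null as age rises
    ```

    Here the positive slope would flag that evidence drawn mostly from younger trial populations may overstate benefit for an older target population.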
  • 36.
    • Gartlehner G, Hansen RA, Nissman D, et al. A simple and valid tool distinguished efficacy from effectiveness studies. J Clin Epidemiol 2006;59:1040-8.
    • Gartlehner G, Thieda P, Hansen RA, et al. Inadequate reporting of trials compromises the applicability of systematic reviews. Int J Technol Assess Health Care 2009;25:323-30.
    • Shadish W, Cook T. Experimental and quasi-experimental design for generalized causal inference. Boston: Houghton Mifflin; 2002.
    • Thorpe KE, Zwarenstein M, Oxman AD, et al. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J Clin Epidemiol 2009;62:464-75.
    References
  • 37.
    • This presentation was prepared by C. Michael White, Pharm.D., FCP, FCCP, a member of the University of Connecticut/Hartford Hospital Evidence-based Practice Center.
    • This module is based on chapter 6 in version 1.0 of the Methods Reference Guide for Effectiveness and Comparative Effectiveness Reviews (available at: http://effectivehealthcare.ahrq.gov/repFiles/2007_10DraftMethodsGuide.pdf).
    Author