Acs0002 Performance Measures In Surgical Practice



© 2008 WebMD, Inc. All rights reserved.
ACS Surgery: Principles and Practice
ELEMENTS OF CONTEMPORARY PRACTICE

2 PERFORMANCE MEASURES IN SURGICAL PRACTICE

John D. Birkmeyer, M.D., F.A.C.S., and Justin B. Dimick, M.D., M.P.H.

With the growing recognition that the quality of surgical care varies widely, there is a rising demand for good measures of surgical performance. Patients and their families need to be able to make better-informed decisions about where to get their surgical care, and from whom.1 Employers and payers need data on which to base their contracting decisions and pay-for-performance initiatives.2 Finally, clinical leaders need tools that can help them identify “best practices” and guide their quality-improvement efforts. To meet these different needs, an ever-broadening array of performance measures is being developed.

The consensus about the general desirability of surgical performance measurement notwithstanding, there remains considerable uncertainty about which specific measures are most effective in measuring surgical quality. The measures currently in use are remarkably heterogeneous, encompassing a range of different elements. In broad terms, they can be grouped into three main categories: measures of health care structure, process-of-care measures, and measures reflecting patient outcomes. Although each of these three types of performance measure has its unique strengths, each is also associated with conceptual, methodological, or practical problems [see Table 1]. Obviously, the baseline risk and frequency of the procedure are important considerations in weighing the strengths and weaknesses of different measures.3 So too is the underlying purpose of performance measurement; for example, measures that work well when the primary intent is to steer patients to the best hospitals or surgeons (selective referral) may not be optimal for quality-improvement purposes.

Several reviews of performance measurement have been published in the past few years.3-5 In what follows, we expand on these reviews, providing an overview of the measures commonly used to assess surgical quality, considering their main strengths and limitations, and offering recommendations for selecting the optimal quality measure.

Table 1 Primary Strengths and Limitations of Structural, Process, and Outcome Measures

Type of Measure: Structural
  Examples: Procedure volume; intensivist-managed ICU
  Strengths:
    Measures are expedient and inexpensive
    Measures are efficient: a single one may relate to several outcomes
    For some procedures, measures predict subsequent performance better than process or outcome measures do
  Limitations:
    Number of measures is limited
    Measures are generally not actionable
    Measures do not reflect individual performance and are considered unfair by providers

Type of Measure: Process of care
  Examples: Appropriate use of prophylactic antibiotics
  Strengths:
    Measures reflect care that patients actually receive; hence, greater buy-in from providers
    Measures are directly actionable for quality-improvement activities
    For many measures, risk adjustment is unnecessary
  Limitations:
    Many measures are hard to define with existing databases
    Extent of linkage between measures and important patient outcomes is variable
    High-leverage, procedure-specific measures are lacking

Type of Measure: Direct outcome
  Examples: Risk-adjusted mortalities for CABG from state or national registries
  Strengths:
    Face validity
    Measurement may improve outcomes in and of itself (Hawthorne effect)
  Limitations:
    Sample sizes are limited
    Clinical data collection is expensive
    Concerns exist about risk adjustment with administrative data

CABG: coronary artery bypass grafting

Overview of Current Performance Measures

The number of performance measures that have been developed for the assessment of surgical quality is already large and continues to grow. For present purposes, it should be sufficient to consider a representative list of commonly used quality indicators that have been endorsed by leading quality-measurement organizations or have already been applied in hospital accreditation, pay-for-performance, or public reporting efforts [see Tables 2 and 3]. A more exhaustive list of performance measures is available on the National Quality Measures Clearinghouse (NQMC) Web site, sponsored by the Agency for Healthcare Research and Quality (AHRQ).

Over the past few years, the National Quality Forum (NQF) has emerged as the leading organization endorsing quality measures. Many other organizations, including the Joint Commission on Accreditation of Healthcare Organizations (JCAHO) and the Centers for Medicare and Medicaid Services (CMS), rely on the endorsement of the NQF before applying a measure to practice. The number of measures relevant to surgery that have been endorsed by the NQF has grown rapidly [see Table 2]. Many of these new measures were vetted as part of CMS’s Surgical Care Improvement Project (SCIP), which includes process measures related to prevention of surgical site infections (SSIs), postoperative cardiac events, venous thromboembolism (VTE), and respiratory complications.

Table 2 Clinical Performance Measures Relevant to Surgery That Have Been Endorsed by the National Quality Forum*

Diagnosis or Procedure: Coronary artery bypass grafting
  Performance Measures:
    Use of internal mammary artery
    Preoperative beta blocker
    Deep sternal wound infection rate
    Prolonged intubation
    Renal insufficiency
    Surgical reexploration
    Hospital volume

Diagnosis or Procedure: Aortic valve replacement
  Performance Measures:
    Risk-adjusted mortality
    Hospital volume

Diagnosis or Procedure: Mitral valve replacement
  Performance Measures:
    Risk-adjusted mortality
    Hospital volume

Diagnosis or Procedure: Any cardiac surgery
  Performance Measures:
    Use of antiplatelet agents, antilipid drugs, and beta blockers on discharge
    Participation in a cardiac surgery registry
    Preoperative beta blocker
    Renal insufficiency
    Prolonged intubation
    Stroke

Diagnosis or Procedure: Surgery for breast cancer
  Performance Measures:
    Radiation therapy after breast conservation surgery
    Adjuvant chemotherapy for appropriate candidates

Diagnosis or Procedure: Surgery for colon cancer
  Performance Measures:
    Adjuvant chemotherapy for appropriate candidates
    At least 12 lymph nodes identified in surgical specimen

Diagnosis or Procedure: Surgery for rectal cancer
  Performance Measures:
    Adjuvant radiation therapy for patients with rectal cancer

Diagnosis or Procedure: Any surgical procedure
  Performance Measures:
    VTE prophylaxis
    Appropriate timing, selection, and discontinuance of prophylactic antibiotics

Diagnosis or Procedure: Any hospitalized patient, including all postoperative patients
  Performance Measures:
    Central venous catheter infection rate
    Urinary catheter–associated infection rate
    Ventilator-associated pneumonia rate

*As of August 2007.

Although the NQF is the central organization for evaluating candidate quality measures, many other organizations continue to create their own quality indicators [see Table 3]. The AHRQ has focused primarily on quality measures that take advantage of readily available administrative data. Because little information on process of care is available in these datasets, these measures are mainly structural (e.g., hospital procedure volume) or outcome-based (e.g., risk-adjusted mortality). The Leapfrog Group (http://www.leapfrog-), a coalition of large employers and purchasers, has developed perhaps the most visible set of surgical quality indicators for its value-based purchasing initiative. The organization’s original (2000) standards focused exclusively on procedure volume, but its current (2006) standards include selected process variables (e.g., the use of beta blockers in patients undergoing abdominal aortic aneurysm repair) and outcome measures. In the near future, the Leapfrog Group may begin using a composite of operative mortality and hospital volume as the primary measure for its evidence-based hospital referral initiative. Such composite measures are discussed further elsewhere (see below).

Table 3 Other Performance Measures Currently Used in Surgical Practice

Diagnosis or Procedure: Critical illness
  Performance Measures (Developer/Endorser):
    Staffing with board-certified intensivists (LF)

Diagnosis or Procedure: Abdominal aneurysm repair
  Performance Measures (Developer/Endorser):
    Hospital volume (AHRQ, LF)
    Risk-adjusted mortality (AHRQ)
    Prophylactic beta blockers (LF)

Diagnosis or Procedure: Carotid endarterectomy
  Performance Measures (Developer/Endorser):
    Hospital volume (AHRQ)

Diagnosis or Procedure: Esophageal resection for cancer
  Performance Measures (Developer/Endorser):
    Hospital volume (AHRQ)

Diagnosis or Procedure: Pancreatic resection
  Performance Measures (Developer/Endorser):
    Hospital volume (AHRQ, LF)
    Risk-adjusted mortality (AHRQ)

Diagnosis or Procedure: Pediatric cardiac surgery
  Performance Measures (Developer/Endorser):
    Hospital volume (AHRQ)
    Risk-adjusted mortality (AHRQ)

Diagnosis or Procedure: Hip replacement
  Performance Measures (Developer/Endorser):
    Risk-adjusted mortality (AHRQ)

Diagnosis or Procedure: Craniotomy
  Performance Measures (Developer/Endorser):
    Risk-adjusted mortality (AHRQ)

Diagnosis or Procedure: Cholecystectomy
  Performance Measures (Developer/Endorser):
    Laparoscopic approach (AHRQ)

Diagnosis or Procedure: Appendectomy
  Performance Measures (Developer/Endorser):
    Avoidance of incidental appendectomy (AHRQ)

AHRQ: Agency for Healthcare Research and Quality; LF: Leapfrog Group

Structural Measures

The term health care structure refers to the setting or system in which care is delivered. Many structural performance measures reflect hospital-level attributes, such as the physical plant and resources or the coordination and organization of the staff (e.g., the registered nurse–bed ratio and the designation of a hospital as a level I trauma center). Other structural measures reflect physician-level attributes (e.g., board certification, subspecialty training, and procedure volume).

STRENGTHS

Structural performance measures have several attractive features. One strength of such measures is that many of them are strongly related to outcomes. For example, with esophagectomy and pancreatic resection for cancer, operative mortality is as much as 10% lower, in absolute terms, at very high volume hospitals than at lower-volume centers.6,7 In some instances, structural measures (e.g., procedure volume) are better predictors of subsequent hospital performance than any known process or outcome measures are [see Figure 1].8

A second strength is efficiency. A single structural measure may be associated with numerous outcomes. For example, with some types of cancer surgery, higher hospital or surgeon procedure volume is associated not only with lower operative mortality but also with lower perioperative morbidity and improved late survival.9-11 Intensivist-staffed intensive care units are linked to shorter lengths of stay and reduced use of resources, as well as to lower mortality.12,13

The third, and perhaps most important, strength of structural measures is expediency. Many such measures can easily be assessed with readily available administrative data. Although some structural measures require surveying of hospitals or providers, such data are much less expensive to collect than data obtained through review of individual patients’ medical records.
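To make the size of an absolute mortality difference concrete, the short Python sketch below converts the upper-bound figure cited for esophagectomy and pancreatic resection (as much as 10% lower operative mortality, in absolute terms, at very high volume hospitals) into the number of patients who would have to be treated at a very high volume hospital instead of a lower-volume center to avert one operative death. The function names and the cohort of 200 patients are hypothetical and chosen only for illustration. The calculation also assumes that the volume-mortality association is causal and applies uniformly to every referred patient, which the chapter does not claim.

```python
def patients_per_death_averted(absolute_risk_reduction: float) -> float:
    """Average number of patients who must be treated at the lower-mortality
    setting to avert one death (the reciprocal of the absolute difference)."""
    if not 0 < absolute_risk_reduction <= 1:
        raise ValueError("absolute risk reduction must be in (0, 1]")
    return 1.0 / absolute_risk_reduction


def expected_deaths_averted(n_patients: int, absolute_risk_reduction: float) -> float:
    """Expected number of deaths averted when n_patients are treated at the
    lower-mortality setting instead of the higher-mortality one."""
    if n_patients < 0:
        raise ValueError("n_patients must be non-negative")
    return n_patients * absolute_risk_reduction


# Upper bound quoted in the text: mortality up to 10 percentage points lower.
ARR_UPPER_BOUND = 0.10
# Hypothetical cohort size, not taken from the chapter.
HYPOTHETICAL_COHORT = 200

if __name__ == "__main__":
    per_death = patients_per_death_averted(ARR_UPPER_BOUND)
    averted = expected_deaths_averted(HYPOTHETICAL_COHORT, ARR_UPPER_BOUND)
    print(f"Patients referred per death averted: {per_death:.1f}")
    print(f"Expected deaths averted among {HYPOTHETICAL_COHORT} patients: {averted:.1f}")
```

Because the 10% figure is an upper bound, smaller differences between hospitals would require proportionally more referrals per death averted.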
[Figure 1. Panel a: esophagectomy; panel b: pancreatic resection. Each panel shows observed mortality (%), 2002–2003, by quartiles of two performance measures from 1998–2001: hospital volume and historical mortality.]

Figure 1 Illustrated is the relative ability of historical (1998–2001) measures of hospital volume and risk-adjusted mortality to predict subsequent (2002–2003) risk-adjusted mortality in U.S. Medicare patients.

LIMITATIONS

One limitation of structural performance measures is that relatively few of them are strongly linked to patient outcomes and thus potentially useful as quality indicators. A second limitation is that most structural measures, unlike most process measures, are not readily actionable. For example, a small hospital can increase the percentage of its surgical patients who receive antibiotic prophylaxis, but it cannot easily make itself a high-volume center. Thus, although some structural measures may be useful for selective referral initiatives, they are of limited value for quality improvement.

A third limitation is that whereas some structural measures can identify groups of hospitals or providers that perform better on average, they are not adequate discriminators of performance among individuals. For example, in the aggregate, high-volume hospitals have a much lower operative mortality for pancreatic resection than lower-volume centers do. Nevertheless, some individual high-volume hospitals may have a high mortality, and some individual low-volume centers may have a low mortality (though the latter possibility may be difficult to confirm because of the smaller sample sizes involved).14 For this reason, many providers view structural performance measures as unfair.

Process Measures

Processes of care are the clinical interventions and services provided to patients. Process measures have long been the predominant quality indicators for both inpatient and outpatient medical care, and their popularity as quality measures for surgical care is growing rapidly. Perhaps the best example of the trend toward using process measures is SCIP, which, as noted (see above), focuses exclusively on processes related to prevention of SSIs, postoperative cardiac events, VTE, and respiratory complications.

STRENGTHS

A strength of process measures is their direct connection to patient management. Because they reflect the care that physicians actually deliver, they have substantial face validity and hence greater “buy-in” from providers. Such measures are usually directly actionable and thus are a good substrate for quality-improvement activities.

A second strength is that risk adjustment, though important for outcome measures, is not required for many process measures. For example, appropriate prophylaxis against postoperative VTE is one performance measure in CMS’s expanding pay-for-performance initiative and is part of SCIP. Because it is widely agreed that virtually all patients undergoing open abdominal procedures should be offered some form of prophylaxis, there is little need to collect detailed clinical data about illness severity for the purposes of risk adjustment.

Another strength is that process measures are generally less constrained by sample-size problems than outcome measures are. Important outcomes (e.g., perioperative death) are relatively rare, but most targeted process measures are relevant to a much larger proportion of patients. Moreover, because process measures generally target aspects of general perioperative care, they can often be applied to patients who are undergoing numerous different procedures, thereby increasing sample sizes and, ultimately, improving the precision of the measurements.

LIMITATIONS

One practical limitation of process measures is the lack of a reliable infrastructure for collecting the necessary data. Administrative datasets do not have the clinical detail and specificity required for this task. Measurement systems based on clinical data, including that of the National Surgical Quality Improvement Program (NSQIP) of the Department of Veterans Affairs (VA),15 focus on patient characteristics and outcomes and do not collect information on processes of care. Currently, most pay-for-performance programs rely on self-reported information from hospitals, but the reliability of such data is uncertain (particularly when reimbursement is at stake).

Even if this first limitation were overcome, a second would remain: process variables are limited in their ability to explain observed variations in mortality. There is a growing body of empirical data supporting this statement. Most of the data come from the literature on medical diagnoses (e.g., acute myocardial infarction), where the link between process and outcome is much stronger than it is in surgery.16,17 For example, the JCAHO/CMS process measures for acute myocardial infarction explained only 6% of the observed variation in risk-adjusted mortality for this condition.17

Although to date, no analogous study has been done in surgery,
there is some reason to believe that existing process measures explain very little of the variation in important surgical outcomes. First, most process measures currently used in surgery relate to secondary rather than primary outcomes. For example, although the value of antibiotic prophylaxis in reducing the risk of superficial SSI should not be underestimated, superficial SSI is not among the most important adverse events of major surgery (which include death). Second, process measures in surgery often relate to complications that are very rare. For example, there is a consensus that prophylaxis for VTE is necessary and important. Accordingly, the SCIP measures, endorsed by the NQF, include the use of appropriate prophylaxis. However, pulmonary embolism is very uncommon, and therefore, improving adherence to these measures will not avert many deaths. Until a better understanding is achieved regarding which details account for variations in the most important complications, especially those adverse events leading to death, process measures will continue to be of limited usefulness in surgical quality improvement.

Outcome Measures

Direct outcome measures reflect the end result of care, either from a clinical perspective or from the patient’s viewpoint. Mortality is by far the most commonly used surgical outcome measure, but there are other outcomes that could also be used as quality indicators, including complications, hospital readmission, and various patient-centered measures of satisfaction or health status.

Several large-scale initiatives involving direct outcome assessment in surgery are currently under way. For example, proprietary health care rating firms (e.g., Healthgrades) and state agencies are assessing risk-adjusted mortalities by using Medicare or state-level administrative datasets. Most of the current outcome-measurement initiatives, however, involve the use of large clinical registries, of which the cardiac surgery registries in New York, Pennsylvania, and a growing number of other states are perhaps the most visible examples. At the national level, the Society of Thoracic Surgeons and the American College of Cardiology have implemented systems for tracking the morbidity and mortality associated with cardiac surgery and percutaneous coronary interventions, respectively.

Although the majority of the outcome-measurement efforts to date have been procedure-specific (and largely limited to cardiac procedures), NSQIP has assessed hospital-specific morbidities and mortalities aggregated across surgical specialties and procedures. NSQIP is now working in conjunction with the American College of Surgeons (ACS) in an effort to apply the same measurement approach outside the VA.18 Currently, the ACS-NSQIP is being used in more than 170 private hospitals in the United States, of many different types and from all geographic regions.

STRENGTHS

Direct outcome measures have at least two major strengths. First, they have obvious face validity and thus are likely to garner a high degree of support from hospitals and surgeons. Second, outcome measurement, in and of itself, may improve performance (the so-called Hawthorne effect). For example, surgical morbidity and mortality in VA hospitals have fallen dramatically since the implementation of NSQIP in 1991.15 Undoubtedly, many surgical leaders at individual hospitals made specific organizational or process improvements after they began receiving feedback on their hospitals’ performance. However, it is very unlikely that even a full inventory of these specific changes would explain such broad-based and substantial improvements in morbidity and mortality.

LIMITATIONS

One limitation of hospital- or surgeon-specific outcome measures is that they are severely constrained by small sample sizes. For the large majority of surgical procedures, very few hospitals (or surgeons) have sufficient adverse events (numerators) and cases (denominators) to be able to generate meaningful, procedure-specific measures of morbidity or mortality. For example, a 2004 study used data from the Nationwide Inpatient Sample to study seven procedures for which mortality was advocated as a quality indicator by the AHRQ.19 For six of the seven procedures, only a very small proportion of hospitals in the United States had large enough caseloads to rule out a mortality that was twice the national average. Although identifying poor-quality outliers is an important function of outcome measurement, to focus on this goal alone is to underestimate the problems associated with small sample sizes. Distinguishing among individual hospitals with intermediate levels of performance is even more difficult.

Other limitations of direct outcome assessment depend on the measurement platform being used. The two most prevalent measurement platforms are the use of existing data, usually generated for administrative purposes, and the creation of a clinical registry specifically for quality improvement. For outcome measures based on clinical data, the major problem is expense. For example, it costs more than $100,000 annually for a private-sector hospital to participate in NSQIP. Because of the expense of data collection, the ACS-NSQIP currently collects data on only a sample of patients undergoing surgery at each hospital. Although this sampling strategy reduces the cost of data collection, it exacerbates the problem of small sample size with individual procedures.

With measurement systems that use administrative data, a major concern is the adequacy of risk adjustment. For outcome measures to have face validity with providers, high-quality risk adjustment may be essential. It may also be useful for discouraging gaming of the system (e.g., hospitals or providers avoiding high-risk patients to optimize their performance measures). It is unclear, however, to what extent the scientific validity of outcome measures is threatened by imperfect risk adjustment with administrative data. Although administrative data lack clinical detail on many variables related to baseline risk,20-23 the degree to which case mix varies systematically across hospitals or surgeons has not been determined. Among patients who are undergoing the same surgical procedure, there is often surprisingly little variation. For example, among patients undergoing CABG in New York State, unadjusted hospital mortality and adjusted hospital mortality (as derived from clinical registries) were nearly identical in most years (with correlations exceeding 0.90) [see Figure 2]. Moreover, hospital rankings based on unadjusted mortality and those based on adjusted mortality were equally useful in predicting subsequent hospital performance.

Matching Performance Measures to Underlying Goals

Performance measures will never be perfect. Certainly, over time, better analytic methods will be developed, and better access to higher-quality data may be gained with the addition of clinical elements to administrative datasets or the broader adoption of electronic medical records. There are, however, some problems with performance measurement (e.g., sample-size limitations) that are inherent and thus not fully correctable. Consequently, clinical leaders, patient advocates, payers, and policy makers will all have to make decisions about when imperfect measures are nonetheless good enough to act on.

A measure should be implemented only with the expectation
that acting on it will yield a net improvement in health quality. In other words, the direct benefits of implementing a particular measure must not be outweighed by the indirect harm. Unfortunately, benefits and harm are often difficult to measure. Moreover, measurement is heavily influenced by the specific context and by who (patients, payers, or providers) is doing the accounting. For this reason, the question of where to set the bar, so to speak, has no simple answer.

[Figure 2. Panel a: risk-adjusted mortality (%) plotted against observed mortality (%); correlation = 0.95. Panel b: mortality (%), 2002, for hospitals rated best, middle, and worst by unadjusted mortality ratings and by risk-adjusted mortality ratings, New York State hospitals, 2001.]

Figure 2 Shown are mortality figures from coronary artery bypass surgery in New York State hospitals, based on data from the state’s clinical outcomes registry. (a) Depicted is the correlation between adjusted and unadjusted mortality rates for all state hospitals in 2001. (b) Illustrated is the relative ability of adjusted mortality and unadjusted mortality to predict performance in the subsequent year.

It is important to ensure a good match between the performance measure and the primary goal of measurement. It is particularly important to be clear about whether the underlying goal is (1) quality improvement or (2) selective referral (i.e., directing patients to higher-quality hospitals or providers). Although some pay-for-performance initiatives may have both goals, one usually predominates. For example, the ultimate objective of CMS’s pay-for-performance initiative with prophylactic antibiotics is to improve quality at all hospitals, not to direct patients to centers with high compliance rates. Conversely, the Leapfrog Group’s efforts in surgery are primarily aimed at selective referral, though they may indirectly provide incentives for quality improvement.

For the purposes of quality improvement, a good performance measure (most often, a process-of-care variable) must be actionable. Measurable improvements in the given process should translate into clinically meaningful improvements in patient outcomes. Although quality-improvement activities are rarely actually harmful, they do have potential downsides, mainly related to their opportunity cost. Initiatives that hinge on bad performance measures siphon away resources (e.g., time and focus) from more productive activities.

For the purposes of selective referral, a good performance measure is one that steers patients toward better hospitals or physicians (or away from worse ones). For example, a measure based on previous performance should reliably identify providers who are likely to have superior performance now and in the future. At the same time, a good performance measure should not provide incentives for perverse behaviors (e.g., carrying out unnecessary procedures to meet a specific volume standard) or negatively affect other domains of quality (e.g., patient autonomy, access, and satisfaction).

Measures that work well for quality improvement may not be particularly useful for selective referral; the converse is also true. For example, appropriate use of perioperative antibiotics in surgical patients is a good quality-improvement measure: it is clinically meaningful, linked to lower SSI rates, and directly actionable. This process of care would not, however, be particularly useful for selective referral purposes. In the first place, patients are unlikely to base their decision about where to undergo surgery on patterns of perioperative antibiotic use. Moreover, surgeons with high rates of appropriate antibiotic use do not necessarily do better with respect to more important outcomes (e.g., mortality). A physician’s performance on one quality indicator often correlates poorly with his or her performance on other indicators for the same or other clinical conditions.24

As a counterexample, the two main performance measures for pancreatic cancer surgery (hospital volume and operative mortality) are very informative in the context of selective referral: patients can markedly improve their chances of surviving surgery by selecting hospitals highly ranked on either measure [see Figure 1]. Neither of these measures, however, is particularly useful for quality-improvement purposes. Volume is not readily actionable, and mortality is too unstable at the level of individual hospitals (again, because of the small sample sizes) to serve as a means of identifying top performers, determining best practices, or evaluating the effects of improvement activities.

Many believe that a good performance measure must be capable of distinguishing levels of performance on an individual basis. From the perspective of providers in particular, a measure cannot be considered fair unless it reliably reflects the performance of individual hospitals or physicians. Unfortunately, as noted (see above), small caseloads (and, sometimes, variations in the case mix) make this degree of discrimination difficult or impossible to achieve with most procedures. Even so, information that at least improves the chances of a good outcome on average is still of real value to patients. Many performance measures can achieve this less demanding objective even if they do not reliably reflect individual performance.

For example, a 2002 study used clinical data from the Cooperative Cardiovascular Project to assess the usefulness of the Healthgrades hospital ratings for acute myocardial infarction (based primarily on risk-adjusted mortality from Medicare data).25
Compared with the one-star (worst) hospitals, the five-star (best) hospitals had a significantly lower mortality (16% versus 22%) after risk adjustment with clinical data; they also discharged significantly more patients on appropriate aspirin, beta-blocker, and angiotensin-converting enzyme inhibitor regimens. However, the Healthgrades ratings proved not to be useful for discriminating between any two individual hospitals. In only 3% of the head-to-head comparisons did five-star hospitals have a statistically significantly lower mortality than one-star hospitals.

Thus, some performance measures that clearly identify groups of hospitals or providers exhibiting superior performance may be limited in their ability to differentiate individual hospitals from one another. There may be no simple way of resolving the basic tension implied by performance measures that are unfair to providers yet informative for patients. This tension does, however, underscore the importance of being clear about (1) what the primary purpose of performance measurement is (quality improvement or selective referral) and (2) whose interests are receiving top priority (the provider or the patient).

Future of Performance Measurement

Although great progress has been made, the science of surgical quality improvement is still in its infancy. There are several barriers to improving the quality of surgical care. Perhaps the biggest barrier is the lack of an accurate and affordable measurement infrastructure. One practical solution that may reduce the expense of detailed data collection with clinical registries is to create hybrid systems that join data elements from administrative and clinical datasets. Although administrative data are criticized for their lack of accuracy in identifying coexisting diseases, they can reliably identify the type of procedure performed, certain demographic variables (e.g., age, gender, and race), and some outcome variables (e.g., vital status, discharge to a skilled nursing facility, and length of stay). This set of variables could then be linked to a limited set of clinical risk factors that would allow robust risk adjustment. This solution will be even more attractive as administrative data come to contain more accurate information (e.g., present-on-admission codes to distinguish complications from coexisting problems).26

In addition to improving the efficiency of data collection, it would be worthwhile to rethink how existing registries are designed so as to make them less expensive and more useful. For example, although the ACS-NSQIP is in a key position to become the leading measurement platform for surgical quality improvement, there are several changes that could be made to ensure its success. First, the burden of data collection could be reduced; this would substantially decrease the costs of participating. The number of data elements could be reduced by creating more parsimonious risk-adjustment models. Second, the sampling strategy could be changed to sample 100% of the most important operations; this change would allow assessment of procedure-specific outcomes. Ultimately, participating hospitals would need procedure-specific outcome data to target specific operations for improvement. Third, clinical processes of care could be added to the data collection process; this would allow hospitals to respond to national pay-for-performance mandates, as well as to provide more actionable quality measures. This last change would require the ACS-NSQIP to manifest a level of flexibility that it has not exhibited to date. With the flexibility to change data measurement periodically, the ACS-NSQIP would be able not only to add other measures that are used in national mandates (e.g., SCIP) but also to evaluate their importance.

Another barrier to improving surgical performance is the lack of good global measures of performance. With the proliferation of pay-for-performance pilot programs, various stakeholders have been confronted with the problem of how to make sense of multiple competing measures of quality. Most have responded by combining multiple domains to create a composite measure of performance. The Premier/Centers for Medicare and Medicaid Services Hospital Quality Incentive Demonstration uses a composite of process and outcome as a quality measure for coronary artery bypass surgery. The Society of Thoracic Surgeons’ Task Force on Quality Measurement advocates a composite score based on a set of outcome and process measures endorsed by the NQF. In these composite approaches, the different measures are essentially weighted equally, with no empiric determination of which ones are the most important. There are, however, emerging techniques that use empirically derived weighting to create a composite score that optimally predicts future mortality for high-risk surgery. As such methods become more fully developed, composite measures will no doubt continue to gain popularity.

Given that most existing quality improvement efforts focus on optimizing measurement of technical quality, it is important not to lose sight of the fact that many quality concerns arise upstream from the operation itself, that is, with the decision to operate in the first place. Wide variations in the use of surgery have long been recognized. Some of these variations are attributable to differences in disease prevalence and physician practice style. Some, however, arise from overuse, underuse, or misuse of surgical management. For a full accounting of surgical quality, it will be necessary to develop reliable means of measuring the appropriateness of surgical treatment and the extent to which patient preferences are incorporated into clinical decisions, in addition to measures assessing how well patients do after surgery.

References

1. Lee TH, Meyer GS, Brennan TA: A middle ground on public accountability. N Engl J Med 350:2409, 2004
2. Galvin R, Milstein A: Large employers’ new strategies in health care. N Engl J Med 347:939, 2002
3. Birkmeyer JD, Birkmeyer NJ, Dimick JB: Measuring the quality of surgical care: structure, process, or outcomes? J Am Coll Surg 198:626, 2004
4. Landon BE, Normand SL, Blumenthal D, et al: Physician clinical performance assessment:
5. Bird SM, Cox D, Farewell VT, et al: Performance indicators: good, bad, and ugly. J R Statist Soc 168:1, 2005
6. Halm EA, Lee C, Chassin MR: Is volume related to outcome in health care? A systematic review and methodologic critique of the literature. Ann Intern Med 137:511, 2002
7. Dudley RA, Johansen KL, Brand R, et al: Selective referral to high volume hospitals: estimating potentially avoidable deaths. JAMA 283:1159, 2000
8. Birkmeyer JD, Dimick JB, Staiger DO: Operative [...] subsequent hospital performance. Ann Surg 243:411, 2006
9. Bach PB, Cramer LD, Schrag D, et al: The influence of hospital volume on survival after resection for lung cancer. N Engl J Med 345:181, 2001
10. Begg CB, Reidel ER, Bach PB, et al: Variations in morbidity after radical prostatectomy. N Engl J Med 346:1138, 2002
11. Finlayson EVA, Birkmeyer JD: Effects of hospital volume on life expectancy after selected cancer operations in older adults: a decision analysis.
J prospects and barriers. JAMA 290:1183, 2003 mortality and hospital volume as predictors of Am Coll Surg 196:410, 2002
12. Pronovost PJ, Angus DC, Dorman T, et al: Physician staffing patterns and clinical outcomes in critically ill patients: a systematic review. JAMA 288:2151, 2002
13. Pronovost PJ, Needham DM, Waters H, et al: Intensive care unit physician staffing: financial modeling of the Leapfrog standard. Crit Care Med 32:1247, 2004
14. Shahian DM, Normand SL: The volume-outcome relationship: from Luft to Leapfrog. Ann Thorac Surg 75:1048, 2003
15. Khuri SF, Daley J, Henderson WG: The comparative assessment and improvement of quality of surgical care in the Department of Veterans Affairs. Arch Surg 137:20, 2002
16. Fonarow GC, Abraham WT, Albert NM, et al: Association between performance measures and clinical outcomes for patients hospitalized with heart failure. JAMA 297:61, 2007
17. Bradley EH, Herrin J, Elbel B, et al: Hospital quality for acute myocardial infarction: correlation among process measures and relationship with short-term mortality. JAMA 296:72, 2006
18. Fink A, Campbell DJ, Mentzer RJ, et al: The National Surgical Quality Improvement Program in non–Veterans Administration hospitals: initial demonstration of feasibility. Ann Surg 236:344, 2002
19. Dimick JB, Welch HG, Birkmeyer JD: Surgical mortality as an indicator of hospital quality: the problem with small sample size. JAMA 292:847, 2004
20. Finlayson EV, Birkmeyer JD, Stukel TA, et al: Adjusting surgical mortality rates for patient comorbidities: more harm than good? Surg 132:787, 2002
21. Fisher ES, Whaley FS, Krushat WM, et al: The accuracy of Medicare’s hospital claims data: progress, but problems remain. Am J Public Health 82:243, 1992
22. Iezzoni LI, Foley SM, Daley J, et al: Comorbidities, complications, and coding bias. Does the number of diagnosis codes matter in predicting in-hospital mortality? JAMA 267:2197, 1992
23. Iezzoni LI: The risks of risk adjustment. JAMA 278:1600, 1997
24. Palmer RH, Wright EA, Orav EJ, et al: Consistency in performance among primary care practitioners. Med Care 34(9 suppl):SS52, 1996
25. Krumholz HM, Rathore SS, Chen J, et al: Evaluation of a consumer-oriented internet health care report card: the risk of quality ratings based on mortality data. JAMA 287:1277, 2002
26. Fry DE, Pine M, Jordan HS, et al: Combining administrative and clinical data to stratify surgical risk. Ann Surg 246:875, 2007
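
Supplementary illustration. The chapter notes that a mortality gap of 16% versus 22% separates groups of hospitals clearly, yet rarely separates two individual hospitals. One plausible reason is caseload size, and the sketch below illustrates it with a simple pooled two-proportion z-test. It is not the method used in the Healthgrades evaluation: it ignores risk adjustment, and the caseloads (100 patients per hospital, 5,000 per pooled group) are hypothetical numbers chosen only for illustration. Only the two mortality rates come from the text.

```python
import math


def two_proportion_p_value(deaths_a, n_a, deaths_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    rate_a = deaths_a / n_a
    rate_b = deaths_b / n_b
    # Pooled mortality under the null hypothesis of equal rates.
    pooled = (deaths_a + deaths_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical caseloads; only the 16% and 22% rates come from the text.
# One five-star hospital versus one one-star hospital, 100 patients each.
single_pair = two_proportion_p_value(16, 100, 22, 100)

# All five-star hospitals pooled versus all one-star hospitals pooled,
# 5,000 patients in each group.
pooled_groups = two_proportion_p_value(800, 5000, 1100, 5000)

print(f"single pair p-value:   {single_pair:.3f}")
print(f"pooled groups p-value: {pooled_groups:.2e}")
```

Under these assumed caseloads, the same six-point difference is far from conventional statistical significance for a single pair of hospitals and highly significant for the pooled groups, which is consistent with the small-sample problem raised in reference 19.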