MEDICAL DECISION MAKING, VOL 18/NO 2, APR-JUN 1998

Predicting Mortality after Coronary Artery Bypass Surgery: What Do Artificial Neural Networks Learn?

JACK V. TU, MD, PhD, MILTON C. WEINSTEIN, PhD, BARBARA J. MCNEIL, MD, PhD, C. DAVID NAYLOR, MD, DPhil, and THE STEERING COMMITTEE OF THE CARDIAC CARE NETWORK OF ONTARIO*

Objective. To compare the abilities of artificial neural network and logistic regression models to predict the risk of in-hospital mortality after coronary artery bypass graft (CABG) surgery. Methods. Neural network and logistic regression models were developed using a training set of 4,782 patients undergoing CABG surgery in Ontario, Canada, in 1991, and they were validated in two test sets of 5,309 and 5,517 patients having CABG surgery in 1992 and 1993, respectively. Results. The probabilities predicted from a fully trained neural network were similar to those of a “saturated” regression model, with both models detecting all possible interactions in the training set and validating poorly in the two test sets. A second neural network was developed by cross-validating a network against a new set of data and terminating network training early to create a more generalizable model. A simple “main effects” regression model without any interaction terms was also developed. Both of these models validated well, with areas under the receiver operating characteristic curves of 0.78 and 0.77 (p > 0.10) in the 1993 test set. The predictions from the two models were very highly correlated (r = 0.95). Conclusions. Artificial neural networks and logistic regression models learn similar relationships between patient characteristics and mortality after CABG surgery. Key words: cardiac surgery; mortality; neural networks; logistic regression; ROC curves. (Med Decis Making 1998;18:229-235)

Received December 16, 1996, from the Institute for Clinical Evaluative Sciences in Ontario (JVT, CDN) and the Clinical Epidemiology Unit and Division of General Internal Medicine, Department of Medicine, Sunnybrook Health Science Centre, University of Toronto, Toronto, Ontario, Canada (JVT, CDN); and the Department of Health Policy and Management, Harvard School of Public Health, Boston, Massachusetts (MCW) and Department of Health Care Policy, Harvard Medical School, Boston, Massachusetts (JVT, BJM). Revision accepted for publication June 12, 1997. Supported in part by a seed grant from the Harvard Center for Risk Analysis and by grants number HS08071 and HS08464 from the Agency for Health Care Policy and Research, Rockville, Maryland. Dr. Tu was supported by a Health Research Personnel Development Program Fellowship (04544) from the Ontario Ministry of Health. Dr. Naylor was supported by a Career Scientist Award (02377) from the Ontario Ministry of Health. The results and conclusions are those of the authors, and no official endorsement by the Ministry of Health is intended or should be inferred.

Address correspondence and reprint requests to Dr. Tu: Institute for Clinical Evaluative Sciences, G-106, 2075 Bayview Avenue, Toronto, Ontario, Canada, M4N 3M5.

*Steering Committee of the Cardiac Care Network of Ontario: Donald S. Beanlands, MD, University of Ottawa Heart Institute; Lorna Bickerton, BScN, University of Ottawa Heart Institute; Robert Chisholm, MD, St. Michael’s Hospital, Toronto; Martin Goldbach, MD, Victoria Hospital, London; Vicki Kaminski, BScN, Sudbury Memorial Hospital; Jeffrey Lozon, MHA, St. Michael’s Hospital, Toronto; Neil McKenzie, MB ChB, University Hospital, London; Barry J. Monaghan, BComm DHA, West Park Hospital, Toronto; Christopher D. Morgan, MD, Sunnybrook Health Science Centre, Toronto; John Pym, MB, BCh, Kingston General Hospital; Hugh Scully, MD, Toronto Hospital; B. William Shragge, MD, Hamilton General Hospital; James Swan, MD, Scarborough Centenary Health Centre.

Recently, there has been widespread interest in using artificial neural networks (ANNs) for predicting clinical outcomes. Neural networks are pattern-recognition algorithms that are modeled after the biological structure of the human brain, and it has been suggested that they may offer some advantages over classic statistical approaches as predictive models for certain clinical problems.1-7 ANNs are developed using an iterative training process in which training examples (e.g., previous patients who have undergone cardiac surgery) are repeatedly presented to a neural network. The weights of a network are gradually adjusted until the network “learns” the mathematical relationship between the predictor variables and the outcome of interest. Neural networks may theoretically offer greater predictive accuracy when important nonlinear or higher-order relationships that have not been detected by other forms of analysis exist in data sets.1-4 However, relatively few head-to-head comparisons of logistic regression and neural network modeling techniques have been conducted by researchers in the field.

In this study, we developed ANN and logistic regression models to predict the risk of patient mortality following isolated coronary artery bypass graft (CABG) surgery (without concomitant valve surgery) in Ontario, Canada. Accurately identifying the risks of CABG surgery is an important clinical problem, and regression modeling techniques are frequently used for assessing surgical risk.8-12 Risk-stratification models are useful both to clinicians counseling individual patients and to outcomes researchers comparing the risk-adjusted mortality rates of different hospitals and cardiac surgeons.8-12 Since ANNs are often considered “black-box” prediction models, we sought to determine what neural networks actually learn about surgical risks during the training process.13 In particular, we compared the predictions from ANN and logistic regression models, since greater predictive power has been cited as the major reason for using the more complex ANN modeling technique.

Methods

DATA SOURCES

The data in this study come from the Cardiac Care Network (formerly the Provincial Adult Cardiac Care Network) of Ontario database, a large population-based cardiac surgery registry used to collect clinical data for all patients waiting for CABG surgery in Ontario. This clinical database was linked using unique patient identifiers to mortality and non-cardiac comorbidity information contained in the Canadian Institute for Health Information administrative database, as described elsewhere.8,9 Eleven patient characteristics were examined as potential predictors of in-hospital mortality following CABG surgery (table 1), after identifying the most common surgical risk factors found in other studies.8-12 Patient ages were initially grouped into five-year intervals and then these intervals were combined into three categories, < 65 years, 65-74 years, and ≥ 75 years, based on similar mortality rates within these age ranges.

Table 1. Patient Characteristics Used to Develop Predictive Models of In-hospital Mortality following Coronary Artery Bypass Graft (CABG) Surgery*

  Age 65-74 years
  Age ≥ 75 years
  Female gender
  Grade 2 left ventricular function (EF 35-50%)
  Grade 3 left ventricular function (EF 20-34%)
  Grade 4 left ventricular function (EF < 20%)
  Unknown left ventricular function
  Urgent surgery
  Emergency surgery
  Previous CABG surgery
  Left main disease (≥ 50% stenosis)
  CCS Class 3 angina
  CCS Class 4 angina
  Recent myocardial infarction (< 1 week)
  Diabetes
  Chronic obstructive lung disease
  Peripheral vascular disease

*EF denotes ejection fraction; CCS, Canadian Cardiovascular Society angina class.

In the current study, the database was divided into three data sets based on the fiscal years in which the data were collected (table 2). The training set included all patients who underwent isolated CABG surgery in Ontario in fiscal 1991 (April 1, 1991, to March 31, 1992), while the two test sets included all patients who had CABG surgery in fiscal 1992 (April 1, 1992, to March 31, 1993) and fiscal 1993 (April 1, 1993, to March 31, 1994). Since neural networks learn by recognizing patterns, the number of unique covariate patterns (i.e., unique combinations of patient characteristics) in each data set was determined, including the number of patterns associated with deaths each fiscal year.

Table 2. Summary of the Coronary Artery Bypass Graft (CABG) Surgery Data Sets*

                                         1991           1992       1993
                                         Training Set   Test Set   Test Set
  Patients                               4,782          5,309      5,517
  Unique covariate patterns              774            639        633
  Patients who died                      144            153        173
  Unique covariate patterns with deaths  119            127        139
  In-hospital mortality rate             3.01%          2.66%      3.14%

*The data sets were collected based on a fiscal year (April 1 of the calendar year to March 31 of the subsequent year).

ARTIFICIAL NEURAL NETWORK MODELS

Artificial neural network models were developed using the 1991 training set and were evaluated based on the two test sets using NeuroShell 2, a commercially available neural network simulator.14 All models were developed using the back-propagation learning algorithm with a least-mean-squares-error objective function.5,7,15 A logistic function was used in the output layer so that the network output would be similar to a probability. Various neural network architectures were tried, and those yielding the best predictive performances were used for further comparisons with logistic regression models. The number of hidden nodes was varied between a lower limit of 1 and an upper limit of 100, and different types of activation functions in the hidden layer were tried (logistic, hyperbolic tangent, etc.).14 An initial learning rate of 0.1 and a momentum term of 0.1 were chosen for developing these models.5,15 Sensitivity analyses of these parameters were conducted to see whether they improved ANN performance. A detailed summary of all the models tried is available elsewhere.

In this study, two different neural network models were developed. The first to be developed was a model designed for maximal predictive performance in the training set (i.e., the smallest error in the training set), hereafter referred to as ANN model 1. An analysis was conducted to determine the relationship between the predictions from this model and the actual probability of death for patients with each unique covariate pattern. Since this analysis showed that ANN model 1 overfit the data in the training set by memorizing each unique covariate pattern to the extent that the model was not generalizable, a second model, hereafter referred to as ANN model 2, was developed. This network was developed by periodically cross-validating the 1991 training set network on the 1992 test set during training and saving the network’s weight configuration that performed best in the 1992 test set. The performance of ANN model 2 in the 1993 test set should be considered the most valid test of this network’s predictive performance because the 1992 test set was used in its development.

LOGISTIC REGRESSION MODELS

For comparison with ANN model 1, the equivalent of a “saturated” logistic regression model was created, with indicator variables used to represent every unique covariate pattern in the 1991 training set.18 A saturated statistical model includes all possible interactions between the covariates. A main-effects logistic regression model was also developed for comparison with ANN model 2 by using indicator variables for the 11 covariates shown in table 1 (regression coefficients are not shown but are available from the first author).17 A main-effects model includes the individual covariates but does not include any interaction terms. Predicted probabilities of patient deaths after CABG surgery were calculated using both the saturated model and the main-effects logistic regression model for all patients in the three data sets. The Stata statistical package was used for statistical analysis.19

COMPARISON OF NEURAL NETWORK AND LOGISTIC REGRESSION MODELS

The predicted probabilities of patient deaths after CABG surgery from ANN model 1 and the saturated logistic regression model were compared in the 1991 training set. Similarly, the predicted probabilities from the main-effects logistic regression model were compared with those from ANN model 2 in the 1993 test set, and Pearson’s correlation coefficient between the predictions was determined. The areas under the receiver operating characteristic (ROC) curves were calculated for all four models in both the training set and the two test data sets.20,21 The area under the ROC curve is a measure of the discriminating ability of a model, with larger areas indicating better predictive ability. An ROC curve area of 1.00 indicates perfect predictive ability, while an area of 0.50 indicates predictability no better than chance.

Results

The overall predictive performances of the four CABG mortality predictive models are summarized in table 3. ANN model 1 had a very high area under the receiver operating characteristic curve of 0.88 in the 1991 training set but validated poorly in the 1992 and 1993 test sets, with areas under the ROC curves of 0.52 and 0.57, respectively. ANN model 1 was a network with an architecture containing 78 hidden nodes in one hidden layer (model not shown). Analysis of the predicted probabilities from this model revealed that they were directly proportional to the mean probabilities of death for patients with the individual covariate patterns in the 1991 training set, a property associated with the predictions of saturated statistical models (see table 4 below). In essence, this overfitted model memorized each specific pattern of clinical characteristics associated with deaths, reducing its generalizability to other years of data. The fully saturated logistic model had an even higher area under the ROC curve of 0.94 in the 1991 training set but also validated poorly in the 1992 and 1993 test sets. The predicted probabilities from ANN model 1 and the saturated logistic model are shown for all patients in the 1991 training set in figure 1.

Table 3. Performances of Four Models for Predicting In-hospital Mortality after CABG Surgery

  Model                                Data Set            Area under the ROC Curve
  Artificial neural network model 1*   1991 training set   0.88
                                       1992 test set       0.52
                                       1993 test set       0.57
  Saturated logistic model             1991 training set   0.94
                                       1992 test set       0.48
                                       1993 test set       0.49
  Artificial neural network model 2†   1991 training set   0.79
                                       1992 test set       0.77
                                       1993 test set       0.78
  Main-effects logistic model          1991 training set   0.79
                                       1992 test set       0.78
                                       1993 test set       0.77

*1991 training-set error minimized.
†1992 test-set error minimized.

FIGURE 1. Predicted probabilities of patients dying after CABG surgery: ANN model 1 versus saturated logistic regression model in the 1991 training set. (Scatter plot; horizontal axis: predicted probability of death, saturated logistic model, 0 to 1.)

ANN model 2 was the neural network model developed by intermittently cross-validating the 1991 training set network against the 1992 test set. This network was a model with one hidden layer containing ten hidden nodes, as shown in figure 2. In ANN model 2, each input variable to the network undergoes a nonlinear logistic transformation within each node in the hidden layer of the network before the predicted probability emerges from the output layer. Both ANN model 2 and the main-effects logistic regression model predicted mortality well in the 1993 test set, with areas under the ROC curve of 0.78 and 0.77 (p > 0.10).20,21 Figure 3 shows a linear relationship between the predicted probabilities of death from these two models in the 1993 test set. The correlation between the predicted probabilities was very high (r = 0.95), suggesting that the two models learned similar relationships between patient characteristics and mortality after CABG surgery, although the predicted probabilities were spread over a wider range with the main-effects logistic regression model.

FIGURE 2. Diagram of a (17 × 10 × 1) artificial neural network model (ANN model 2) for predicting mortality after CABG surgery. Each circle represents a node, while each line represents a weight. Every node in each layer is connected to every node in the preceding and/or succeeding layers by a weight. Only the weights to the first and last nodes in the hidden layer are shown. A nonlinear logistic transformation is applied to the input variables at each node in the hidden layer and the output layer. (Diagram with an input layer, a hidden layer, and an output layer. Input nodes: age 65-74; age ≥ 75; female gender; grade 2, grade 3, grade 4, and unknown left ventricular function; urgent surgery; emergency surgery; previous CABG surgery; left main disease; CCS Class 3 angina; CCS Class 4 angina; recent myocardial infarction; diabetes; chronic obstructive lung disease; peripheral vascular disease.)

FIGURE 3. Predicted probabilities of patients dying after CABG surgery: ANN model 2 versus main-effects logistic regression model in the 1993 test set. (Scatter plot; horizontal axis: predicted probability of death, main-effects logistic model, 0 to 1.)

Predicted probabilities of patients dying after CABG surgery from the four prediction models are shown for some sample covariate patterns in table 4. This table demonstrates in detail that the predictions from ANN model 1 were directly proportional to the mean probabilities of death for patients with individual covariate patterns (i.e., the predictions of a saturated regression model). This table also shows that the predictions from ANN model 2 were comparable in magnitude to the predictions of the main-effects logistic regression model. These results provide further evidence of the comparability of the predictions from these two techniques, and show that the two methods identify the same types of patients as being at the highest risk of death after CABG surgery.

Table 4. Predicted Probabilities of Patients Dying after CABG Surgery from the Four Prediction Models for Some Sample Covariate Patterns in the 1991 Training Set

  Example                           1      2      3      4      5      6      7      8
  n                                 27     218    8      4      2      2      1      1
  Mean probability of death         0      0      0.25   0.25   0.50   0.50   1.00   1.00
  Predicted probability of death
    ANN* model 1                    0      0.02   0.18   0.22   0.39   0.52   0.93   0.99
    Saturated logistic model        0      0      0.25   0.25   0.50   0.50   1.00   1.00
    ANN* model 2                    0.02   0.01   0.07   0.09   0.04   0.06   0.13   0.18
    Main-effects logistic model     0.01   0.01   0.10   0.13   0.05   0.07   0.13   0.28

  Covariates in pattern (number of the eight example patterns that include the covariate)
    Age 65-74 years                 3
    Age ≥ 75 years                  0
    Female gender                   3
    Grade 2 LVF†                    2
    Grade 3 LVF†                    3
    Grade 4 LVF†                    0
    Unknown LVF†                    0
    Urgent surgery                  1
    Emergency surgery               3
    Previous CABG‡ surgery          2
    Left main disease               2
    CCS§ Class 3 angina             3
    CCS§ Class 4 angina             5
    Recent myocardial infarction    0
    Diabetes                        1
    COPD‖                           0
    Peripheral vascular disease     2

*Artificial neural network. †Left ventricular function. ‡Coronary artery bypass graft. §Canadian Cardiovascular Society. ‖Chronic obstructive pulmonary disease.
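The relationship between a saturated model and the per-pattern mean mortality shown in table 4 can be illustrated with a minimal Python sketch. It assumes a small hypothetical data set with two binary covariates (not the study data), treats the saturated model as a lookup of the observed death rate for each covariate pattern, and computes the ROC area as the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function names and the fallback used for covariate patterns absent from the training set are assumptions made for illustration; the original analysis was carried out in Stata.

```python
from collections import defaultdict


def roc_area(scores, died):
    """ROC area as P(score of a death > score of a survivor); ties count 1/2."""
    deaths = [s for s, d in zip(scores, died) if d == 1]
    survivors = [s for s, d in zip(scores, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in deaths:
        for q in survivors:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))


def fit_saturated(patterns, died):
    """Map each covariate pattern to its observed death rate in the training data."""
    counts = defaultdict(lambda: [0, 0])  # pattern -> [deaths, patients]
    for pattern, d in zip(patterns, died):
        counts[pattern][0] += d
        counts[pattern][1] += 1
    return {pattern: deaths / n for pattern, (deaths, n) in counts.items()}


def predict_saturated(model, patterns, fallback):
    """Look up each pattern; unseen patterns get the fallback (an assumption)."""
    return [model.get(pattern, fallback) for pattern in patterns]


# Hypothetical toy training set: each pattern is a tuple of two binary covariates.
train_patterns = [(0, 0)] * 4 + [(1, 0)] * 4 + [(0, 1)] * 2 + [(1, 1)]
train_died = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1]

saturated = fit_saturated(train_patterns, train_died)
overall_rate = sum(train_died) / len(train_died)
train_scores = predict_saturated(saturated, train_patterns, overall_rate)
train_area = roc_area(train_scores, train_died)

# A pattern never seen in training falls back to the overall training death rate.
unseen_score = predict_saturated(saturated, [(2, 2)], overall_rate)[0]

print(saturated)
print(train_area)
```

In this toy example the fitted death rates are 0, 0.25, 0.50, and 1.00, echoing the mean probabilities in table 4, and the training-set ROC area is 0.875. A pattern that did not occur in the training data carries no information in such a model, which is one way to see why a saturated fit may validate poorly on another year's patients.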
Discussion

In this study, we developed artificial neural network and logistic regression models for predicting the risk of patient death following isolated CABG surgery. We demonstrated that fully training a neural network (ANN model 1) leads a network to the equivalent of a “saturated” statistical state, with the network memorizing each unique cluster of clinical characteristics in the training set and its average mortality rate.18 Differences between the training and test sets in the covariate patterns of patients who died caused this network to validate quite poorly in two test data sets. However, by cross-validating a neural network against a new set of data and terminating network training early in the training process, we were also able to develop a more generalizable model (ANN model 2) that predicted mortality as well as a main-effects logistic regression model in an independent test data set. Our study provides some novel insights into the statistical behavior of artificial neural networks, and demonstrates that neural network and logistic regression models learn similar relationships between patient characteristics and mortality after CABG surgery.

We embarked on this study with the hope that the neural network might uncover some previously unrecognized relationships between patient characteristics and mortality after CABG surgery, in light of some positive reports by other investigators.1-4,22 However, the results of the study showed that the predicted probabilities from a trained network (ANN model 2) were very highly correlated with those of standard statistical models. Unlike regression models, which can clearly identify the patient characteristics that have the greatest influences on surgical risk, the “knowledge” acquired by the neural network during the training process is less obvious and is contained within the weights of the neural network. Our finding that the same types of patients were identified as high surgical risks by both the neural network and the regression model provides strong, albeit indirect, evidence that the knowledge about patient characteristics and surgical mortality acquired by the neural network is similar to that which we identified using regression analysis.

Our finding of similar levels of predictive performance with the neural network and the logistic regression models is not entirely unexpected. While some investigators have obtained impressive results with neural networks,1-4 other studies suggest that neural networks offer small or no improvement over existing regression modeling techniques when head-to-head comparison studies have been conducted. Neural networks may be expected to perform better when data sets contain very complex nonlinear relationships or many higher-order interactions between dependent and independent variables, but these types of data sets may be relatively infrequent in clinical medicine. Previous analyses of ours and other cardiac surgery data sets have shown that two-way interactions are usually not of major importance in assessing the risks of CABG surgery.9-12 Furthermore, most of the variables that are risk factors for CABG mortality are binary variables (i.e., disease present or absent), so identifying nonlinear relationships is less likely to be important statistically. The only continuous variable in our study, age, was treated as a categorical variable so that we could perform analyses of individual covariate patterns.

Our study demonstrates that it is possible to mimic the pattern-recognition capabilities of neural networks by fitting a regression model with separate coefficients for each unique covariate pattern. However, this led to overfitting and the equivalent of a “saturated” statistical model. To the best of our knowledge, the link between the neural network phenomenon of “overfitting” or memorizing all patterns in the training set and statistical “saturation” has not been previously recognized. For example, Doig and colleagues compared a fully-trained neural network (similar to our ANN model 1) and a main-effects logistic regression model in developing models to predict mortality in their intensive care unit.23 They erroneously concluded that because the area under the ROC curve of the fully-trained neural network (0.9993) was superior to that of their logistic regression model (0.9259) in their training set, neural networks were superior prediction models. However, the proper comparison should have been with a saturated regression model. A saturated regression model would have had the same or an even higher ROC curve area, since it represents the upper limit on any model’s predictive performance in the training set.18

The relatively low prevalence (~3%) of deaths after CABG surgery meant that there were relatively few examples of each type of patient who died that the neural network could learn to recognize, even though there were thousands of patients in the training set. Since the covariate patterns of patients who die from CABG surgery vary from year to year, the neural network could only improve its recognition of these patients in the 1991 training set at the expense of its performance in the 1992 and 1993 test sets. These tradeoffs between predictive accuracy in the training set and the generalizability of a model explain why neural networks may not validate any better than regression models in certain data sets in spite of their ability to recognize individual covariate patterns. Neural networks may ultimately prove to be most useful in clinical scenarios requiring the recognition of small numbers of high-frequency covariate patterns (e.g., reading cervical smears, electrocardiograms).24,25

Our study has certain limitations. First, we restricted the study to using “back-propagation,” the most popular neural-net-training algorithm in current use.15 Newer algorithms are being developed by researchers in the field, and it is possible that some other training algorithm might have led to superior results for the neural network. Second, our study represents a comparison of neural networks and logistic regression in one data set only, and further comparisons of the two methods should be conducted by researchers using other data sets. In particular, studies using data sets with more continuous variables or more frequent adverse outcomes could yield different results. Third, we restricted the input variables in our study to those that have previously been shown to be important predictors of CABG mortality. Some investigators have suggested that neural networks can identify variables that are not important outcome predictors by traditional statistical methods.

In conclusion, we have demonstrated a remarkable similarity in the abilities of artificial neural network and logistic regression models to predict in-hospital mortality after CABG surgery. Neural networks and logistic regression models appear to learn similar relationships between patient characteristics and mortality after CABG surgery, but this knowledge is modeled in a less transparent manner in a neural network. Although artificial neural networks are an interesting alternative to classic statistical methods for predicting surgical outcomes, we did not find any significant predictive advantage to using them for predicting mortality after CABG surgery. For our ongoing analyses of cardiac surgical outcomes in Ontario, we plan to rely on logistic regression and other more conventional statistical methods to assess the outcomes of CABG surgery.8,9

The authors thank Carl N. Morris, PhD, for his helpful advice on this project, and Keyi Wu, MSc, for assistance with computer programming. They also thank the cardiovascular practitioners, nurses, and registry personnel who make up the Cardiac Care Network of Ontario for participating in the study.

References

1. Baxt WG. Use of an artificial neural network for the diagnosis of myocardial infarction. Ann Intern Med. 1991;115:843-8.
2. Baxt WG, Skora J. Prospective validation of artificial neural network trained to identify acute myocardial infarction. Lancet. 1996;347:12-5.
3. Buchman TG, Kubos KL, Seidler AJ, Siegforth MJ. A comparison of statistical and connectionist models for the prediction of chronicity in a surgical intensive care unit. Crit Care Med. 1994;22:750-62.
4. Dybowski R, Weller P, Chang R, Gant V. Prediction of outcome in critically ill patients using artificial neural network synthesized by genetic algorithm. Lancet. 1996;347:1146-50.
5. Tu JV. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J Clin Epidemiol. 1996;49:1225-31.
6. Cross SS, Harrison RF, Kennedy RL. Introduction to neural networks. Lancet. 1995;346:1075-9.
7. Penny W, Frost D. Neural networks in clinical medicine. Med Decis Making. 1996;16:386-98.
8. Tu JV, Naylor CD, Steering Committee of the Provincial Adult Cardiac Care Network of Ontario. Coronary artery bypass mortality rates in Ontario: a Canadian approach to quality assurance in cardiac surgery. Circulation. 1996;94:2429-33.
9. Tu JV, Jaglal SB, Naylor CD, Steering Committee of the Provincial Adult Cardiac Care Network of Ontario. Multicenter validation of a risk index for mortality, intensive care unit stay, and overall hospital length of stay after cardiac surgery. Circulation. 1995;91:677-84.
10. Higgins TL, Estafanous FG, Loop FD, Beck GJ, Blum JM, Paranandi L. Stratification of morbidity and mortality outcome by preoperative risk factors in coronary artery bypass patients: a clinical severity score. JAMA. 1992;267:2344-8.
11. O’Connor GT, Plume SK, Olmstead EM, et al. Northern New England Cardiovascular Disease Study Group. Multivariate prediction of in-hospital mortality associated with coronary artery bypass graft surgery. Circulation. 1992;85:2110-8.
12. Hannan EL, Kilburn H Jr, Racz M, Shields E, Chassin MR. Improving the outcomes of coronary artery bypass surgery in New York State. JAMA. 1994;271:761-6.
13. Hart A, Wyatt J. Evaluating black-boxes as medical decision aids: issues arising from a study of neural networks. Med Inform. 1990;15:229-36.
14. NeuroShell 2. Ward Systems Group, Inc., Frederick, MD.
15. Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature. 1986;323:533-6.
16. Richard MD, Lippman RP. Neural network classifiers estimate Bayesian a posteriori probabilities. Neural Computation. 1991;3:461-3.
17. Tu JV. Quality of cardiac surgical care in Ontario, Canada [dissertation]. Cambridge, MA: Harvard University, 1996.
18. Shwartz M, Ash AS. Evaluating the performance of risk-adjustment methods: continuous measures. In: Iezzoni LI (ed). Risk Adjustment for Measuring Health Care Outcomes. Ann Arbor, MI: Health Administration Press, 1994:287-311.
19. Stata Release 4.0. College Station, TX, 1995.
20. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29-36.
21. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology. 1983;148:839-43.
22. Baxt WG. Analysis of the clinical variables driving decision in an artificial neural network trained to identify the presence of myocardial infarction. Ann Emerg Med. 1992;21:1439-44.
23. Doig GS, Inman KJ, Sibbald WJ, Martin CM, Robertson JM. Modeling mortality in the intensive care unit: comparing the performance of a back-propagation, associative-learning neural network with multivariate logistic regression. In: Proceedings of the Seventeenth Annual Symposium on Computer Applications in Medical Care. Washington, DC: McGraw-Hill, 1993; pp 361-5.
24. Koss LG, Lin E, Schreiber K, Elgert P, Mango L. Evaluation of the PAPNET cytologic screening system for quality control of cervical smears. Am J Clin Pathol. 1994;101:220-9.
25. Heden B, Ohlsson M, Rittner R, et al. Agreement between artificial neural networks and experienced electrocardiographer on electrocardiographic diagnosis of healed myocardial infarction. J Am Coll Cardiol. 1996;28:1012-6.
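
The procedure used to develop ANN model 2, in which a network is periodically checked against a second data set during training and the weight configuration that performs best there is kept, can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the NeuroShell 2 implementation: the patients are randomly generated toy data, the network has three hidden nodes, momentum is omitted, and every function name and parameter value other than the learning rate of 0.1, the logistic output, and the squared-error objective is hypothetical.

```python
import copy
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_net(n_inputs, n_hidden, rng):
    """Small random weights; the last entry of each weight list is the bias."""
    return {
        "hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                   for _ in range(n_hidden)],
        "output": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)],
    }


def forward(net, x):
    """Return the hidden activations and the output (a value between 0 and 1)."""
    hidden = [sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))
              for w in net["hidden"]]
    out = net["output"]
    y = sigmoid(out[-1] + sum(wi * hi for wi, hi in zip(out, hidden)))
    return hidden, y


def mean_squared_error(net, xs, ys):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in zip(xs, ys)) / len(xs)


def train_one_epoch(net, xs, ys, lr):
    """One pass of back-propagation on squared error (no momentum)."""
    for x, t in zip(xs, ys):
        hidden, y = forward(net, x)
        delta_out = (y - t) * y * (1.0 - y)
        out = net["output"]
        # Hidden-layer error terms use the output weights before they change.
        delta_hid = [delta_out * out[j] * h * (1.0 - h)
                     for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            out[j] -= lr * delta_out * h
        out[-1] -= lr * delta_out
        for w, d in zip(net["hidden"], delta_hid):
            for i, xi in enumerate(x):
                w[i] -= lr * d * xi
            w[-1] -= lr * d


def train_with_early_stopping(train, check, n_hidden=3, epochs=100,
                              lr=0.1, check_every=5, seed=0):
    """Train on `train`, keeping the weights with the lowest error on `check`."""
    rng = random.Random(seed)
    (xs, ys), (cx, cy) = train, check
    net = init_net(len(xs[0]), n_hidden, rng)
    best_net = copy.deepcopy(net)
    best_err = mean_squared_error(net, cx, cy)
    history = []
    for epoch in range(1, epochs + 1):
        train_one_epoch(net, xs, ys, lr)
        if epoch % check_every == 0:
            err = mean_squared_error(net, cx, cy)
            history.append((epoch, err))
            if err < best_err:
                best_net, best_err = copy.deepcopy(net), err
    return best_net, best_err, history


def toy_patients(n, rng):
    """Hypothetical data: two binary risk factors that each raise the death risk."""
    xs = [[rng.randint(0, 1), rng.randint(0, 1)] for _ in range(n)]
    ys = [1 if rng.random() < 0.05 + 0.2 * x[0] + 0.2 * x[1] else 0 for x in xs]
    return xs, ys


data_rng = random.Random(1)
training_set = toy_patients(200, data_rng)
checking_set = toy_patients(200, data_rng)
best_net, best_err, history = train_with_early_stopping(training_set, checking_set)
```

Because the checking set influences which weights are kept, its error is an optimistic estimate of performance. This mirrors the reason the 1993 test set, which played no part in developing ANN model 2, is treated in the Methods as the most valid test of that network.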