G Model CANEP-624; No. of Pages 7
Cancer Epidemiology xxx (2013) xxx–xxx
Contents lists available at ScienceDirect
Cancer Epidemiology: The International Journal of Cancer Epidemiology, Detection, and Prevention
Journal homepage: www.cancerepidemiology.net

Estimating the proportion cured of cancer: Some practical advice for users

X.Q. Yu a,b,*, R. De Angelis c, T.M.L. Andersson d, P.C. Lambert d,e, D.L. O’Connell a,b,f,g, P.W. Dickman d

a Cancer Council New South Wales, Sydney, Australia
b Sydney School of Public Health, Sydney, Australia
c National Centre of Epidemiology, Italian National Institute of Health, Rome, Italy
d Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
e University of Leicester, Department of Health Sciences, Leicester, UK
f School of Medicine and Public Health, University of Newcastle, Newcastle, Australia
g School of Public Health and Community Medicine, University of New South Wales, Sydney, Australia

* Corresponding author at: Cancer Research Division, Cancer Council New South Wales, PO Box 572, Kings Cross, NSW 1340, Australia. Tel.: +61 2 93341851; fax: +61 2 8302 3550.
E-mail addresses: xueqiny@nswcc.org.au, xue.yu@sydney.edu.au (X.Q. Yu).

ARTICLE INFO

Article history: Accepted 24 August 2013; available online xxx
Keywords: Statistical cure; Cure models; Relative survival; Population-based

ABSTRACT

Background: Cure models can provide improved possibilities for inference if used appropriately, but there is potential for misleading results if care is not taken. In this study, we compared five commonly used approaches for modelling cure in a relative survival framework and provide some practical advice on the use of these approaches.

Patients and methods: Data for colon, female breast, and ovarian cancers were used to illustrate these approaches. The proportion cured was estimated for each of these three cancers within each of three age groups. We then graphically assessed the assumption of cure and the model fit, by comparing the predicted relative survival from the cure models to empirical life table estimates.

Results: Where both cure and distributional assumptions are appropriate (e.g., for colon or ovarian cancer patients aged <75 years), all five approaches led to similar estimates of the proportion cured. The estimates varied slightly when cure was a reasonable assumption but the distributional assumption was not (e.g., for colon cancer patients aged ≥75 years). Greater variability in the estimates was observed when the cure assumption was not supported by the data (breast cancer).

Conclusions: If the data suggest cure is not a reasonable assumption then we advise against fitting cure models. In the scenarios where cure was reasonable, we found that flexible parametric cure models performed at least as well as, or better than, the other modelling approaches. We recommend that, regardless of the model used, the underlying assumptions for cure and model fit should always be graphically assessed.

Crown Copyright © 2013 Published by Elsevier Ltd. All rights reserved.

1877-7821/$ – see front matter. http://dx.doi.org/10.1016/j.canep.2013.08.014
Please cite this article in press as: Yu XQ, et al. Estimating the proportion cured of cancer: Some practical advice for users. Cancer Epidemiology (2013), http://dx.doi.org/10.1016/j.canep.2013.08.014

1. Introduction

Advances in the diagnosis and treatment of cancer have meant that an increasing number of cancer patients are now cured of their cancer. For those cancers where cure occurs, the cumulative survival curves level off at the point of cure. The definition of ‘cure’, as used in this context, is that ‘cured’ patients have a mortality rate equal to that of subjects of the same age and sex in the general population; it differs conceptually from ‘clinical cure’ for individual patients.

Traditional approaches to survival analysis assume that a single survival distribution can be used to describe the survival of all individuals with a given set of covariates. Cure models in a relative survival framework (the most commonly used approach for population-based data) [1], on the other hand, assume that the patients can be partitioned into two groups, those who are cured and those who are not, with separate survival distributions for each. When applying cure models, one typically reports an estimate of the proportion cured along with a summary measure (e.g., mean or median survival time) of the uncured. The proportion cured, in particular, is felt to be more directly relevant to patients and clinicians and easier to interpret than the measures reported from traditional approaches.

An additional advantage of cure models is that they potentially provide greater possibilities for studying the mechanisms underlying temporal trends. Traditional approaches can, for example, demonstrate that survival is improving over time, but cure models can elucidate whether this is because we are curing more patients, because we are prolonging the survival time of those patients who will nevertheless succumb to the disease or, more usually, because of some combination of both.

Due to these advantages, several statistical models for fitting survival data with a cure proportion in a relative survival framework have been developed over the last two decades [2–6]. Cure models can also be fitted in a cause-specific survival framework, but such models are not considered here. These cure models have been applied to data from various populations to investigate either the improvement in survival over time, or geographical variation in cancer survival [2–11].

Using these models and interpreting the results they produce, however, must be undertaken carefully and with an awareness of the potential limitations and inaccuracies resulting from the assumptions on which the models rely. It is therefore important to comprehensively compare these models and assess their merits for use in different circumstances, so that appropriate recommendations for their use may be made. Existing studies on these issues are rather limited, and we hope that this study will begin to address some of these questions. First, we compare five approaches [2–6] for cure modelling in a relative survival framework by applying them to population-based survival data for three cancers with different cure profiles. We then discuss the relative merits of the approaches and provide practical advice for users.

2. Methods

All five approaches [2–6] are variations on two types of cure model: the mixture cure model and the non-mixture cure model. Both models assume that a proportion of patients will be cured by defining an asymptote for the relative survival, but the models are parameterised in different ways. Three of the selected approaches are mixture cure models [2–4] and two of them are non-mixture cure models [5,6]. A brief summary of these approaches is provided in Table 1 and the key features of these approaches are described in Appendix 1 (Model specifications).
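The two model types can be written compactly. The block below is a sketch in the notation commonly used in the cure-model literature, not a transcription of Appendix 1, which is not reproduced in this transcript. All symbols are assumptions introduced here: $S^{*}(t)$ is the expected survival in the general population, $R(t)$ the relative survival, $\pi$ the proportion cured, $S_u(t)$ the survival function of the uncured, $F_z(t)$ a distribution function, and $\lambda$, $\gamma$ the Weibull parameters.

```latex
\begin{align*}
% Mixture cure model: a cured fraction \pi plus an uncured fraction with survival S_u(t)
S(t) &= S^{*}(t)\,\bigl[\pi + (1-\pi)\,S_u(t)\bigr],
  & R(t) &= \pi + (1-\pi)\,S_u(t) \\
% Non-mixture cure model: the asymptote \pi is approached through a distribution function F_z(t)
S(t) &= S^{*}(t)\,\pi^{F_z(t)},
  & R(t) &= \pi^{F_z(t)} \\
% Weibull choice for the parametric part
S_u(t) &= \exp\bigl(-\lambda t^{\gamma}\bigr),
  & F_z(t) &= 1 - \exp\bigl(-\lambda t^{\gamma}\bigr) \\
% In both parameterisations the relative survival tends to the proportion cured
\lim_{t\to\infty} R(t) &= \pi
\end{align*}
```

Under either parameterisation the asymptote of the relative survival curve is the proportion cured, which is why both models can only be as trustworthy as the evidence that the empirical curve actually reaches a plateau.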
The application of these approaches was illustrated using SEER9 data [12] from 1981 to 1988, with follow-up through to 2007 (with only the first 15 years of follow-up data being used in the estimations), for patients diagnosed with cancers of the colon, female breast and ovary. We chose these three cancers because they are typical examples that illustrate several scenarios regarding the two key assumptions of cure models: that a proportion of cancer patients do get cured, and that the distribution of the survival times of the uncured cases can be described by the chosen parametric distribution.

Survival data were extracted from the SEER database using the SEER*Stat software in two formats with identical selection criteria: grouped relative survival data and individual survival data. The grouped data (used in the CANSURV software [13] and Verdecchia’s approach [3]) included the following variables: number of patients alive at the start of the follow-up interval (annual, 1–15), number of patients who died, number of patients lost to follow-up, and also observed, expected, and relative survival, and the standard error for relative survival. Individual records were used in the other three approaches [4–6]. As survival differs greatly by age at diagnosis, we estimated the proportion cured separately by age group (<60, 60–74, and 75–84 years). To examine the adequacy of the cure models, we visually compared the predicted relative survival from the cure models with empirical life table estimates derived using the Ederer II method [14].

Table 1
Comparison of five approaches for estimating the proportion cured of cancer.

                               Yu et al. [2]           Verdecchia et al. [3]      De Angelis et al. [4]      Lambert et al. [5]         Andersson et al. [6]
Type of model                  Mixture                 Mixture                    Mixture                    Non-mixture                Non-mixture
Structure of input data        Grouped survival data   Grouped survival data      Individual survival data   Individual survival data   Individual survival data
Parameter estimation           Maximum likelihood      Non-linear least squares   Maximum likelihood         Maximum likelihood         Maximum likelihood
Software used                  CANSURV                 SAS                        Stata                      Stata                      Stata
Assumed survival distribution  Weibull                 Weibull                    Weibull (a)                Weibull (a)                Splines (b)

(a) We use the Weibull distribution, but other distributions are available in the Stata implementation by Lambert (The Stata Journal 2007).
(b) The baseline cumulative hazard is estimated using restricted cubic splines, so the survival distribution is a parametric distribution that is a function of the spline parameters.

3. Results

A total of 148,963 cases were included (colon cancer: 52,203; breast cancer: 84,595; ovarian cancer: 12,165). For colon and ovarian cancer in the two younger age groups (<60 and 60–74 years) all approaches produced similar estimates of the proportion cured (Table 2). For the oldest age group (75–84 years) there was considerable variation, with results ranging from 50.0% to 56.5% for colon cancer and from 15.8% to 21.1% for ovarian cancer. As mentioned previously, the results for the approaches with grouped data were based on commonly used annual follow-up intervals. For colon and ovarian cancers, to examine the sensitivity to the width of the follow-up interval, we also present estimates in Table 2 using monthly intervals.

Table 2
Estimated proportion cured (%) and (95% confidence intervals) from different cure model approaches by cancer type and age group.

Approach                                         <60 years            60–74 years          75–84 years
Colon cancer
  Yu et al. [2] (yearly interval data)           52.9 (51.7–54.0)     52.9 (51.8–53.9)     50.8 (48.5–53.0)
  Yu et al. [2] (monthly interval data)          52.8 (51.7–53.9)     52.8 (51.8–53.8)     53.1 (51.5–54.7)
  Verdecchia et al. [3] (yearly interval data)   52.8 (52.3–53.3)     52.5 (52.2–52.9)     50.9 (49.8–52.0)
  Verdecchia et al. [3] (monthly interval data)  53.1 (53.0–53.3)     52.7 (52.6–52.9)     51.3 (51.0–51.6)
  De Angelis et al. [4]                          52.2 (51.1–53.3)     54.0 (53.1–54.9)     56.5 (55.3–57.8)
  Lambert et al. [5]                             51.9 (50.8–53.0)     53.5 (52.6–54.5)     56.0 (54.8–57.3)
  Andersson et al. [6]                           52.4 (51.4–53.4)     52.9 (52.1–53.8)     50.0 (48.7–51.2)
Breast (female) cancer
  Yu et al. [2]                                  62.4 (61.7–63.1)     58.9 (56.8–60.9)     0 (c)
  Verdecchia et al. [3]                          63.8 (62.1–65.6)     59.2 (58.0–60.4)     0 (c)
  De Angelis et al. [4]                          62.1 (61.5–62.8)     57.8 (55.7–59.9)     4.7 (c)
  Lambert et al. [5]                             61.8 (61.1–62.5)     55.9 (53.3–58.5)     0.06 (c)
  Andersson et al. [6]                           64.1 (63.6–64.7)     66.2 (65.5–66.9)     65.0 (63.3–66.7)
Ovarian cancer
  Yu et al. [2] (yearly interval data)           46.7 (45.2–48.3)     23.1 (21.6–24.7)     17.3 (14.6–20.4)
  Yu et al. [2] (monthly interval data)          Failed to converge   22.6 (21.2–24.2)     18.8 (16.4–21.6)
  Verdecchia et al. [3] (yearly interval data)   47.5 (46.3–48.7)     22.3 (21.1–23.4)     16.3 (15.4–17.2)
  Verdecchia et al. [3] (monthly interval data)  47.8 (47.5–48.0)     22.4 (22.2–22.7)     16.7 (16.4–17.0)
  De Angelis et al. [4]                          47.8 (46.3–49.3)     23.2 (21.7–24.7)     20.2 (17.8–22.8)
  Lambert et al. [5]                             48.0 (46.4–49.6)     23.2 (21.7–24.9)     21.1 (18.5–24.0)
  Andersson et al. [6]                           48.7 (47.2–50.1)     22.5 (21.2–23.8)     15.8 (13.9–17.9)

(c) For these four cells the transcript lists only two intervals, both (0–100), and their assignment to rows is not recoverable.

Breast cancer was chosen to show that it is not sensible to fit cure models for this cancer. As expected, all five modelling approaches produced an estimated proportion cured for breast cancer patients in the two younger age groups (Table 2). For the youngest age group (<60 years) all approaches resulted in quite similar estimates (ranging from 61.8% to 64.1%), while estimates for the middle age group (60–74 years) were less consistent. For the oldest age group (75–84 years), however, all methods based on the Weibull model indicated no cure or negligible proportions cured (values ranging from 0 to 4.7%), while the flexible parametric approach gave an estimate of 65%.

The predicted relative survival estimates, stratified by age group, for each of the three cancers from the five approaches were plotted against the life table estimates to evaluate model fit (Fig. 1). For breast cancer (Fig. 1B), the graphical assessment indicated that there was no evidence of statistical cure for any age group because none of the survival curves levelled off within 15 years of follow-up, while cure appears to be a reasonable assumption for colon and ovarian cancer (Fig. 1A and C).

Fig. 1. Comparing predicted relative survival from different modelling approaches with life table estimates – for (A) colon cancer, (B) breast cancer and (C) ovarian cancer.

We also applied these approaches to data for localised colon cancer to evaluate the models in a relatively high survival situation (Table 3).

Table 3
Estimated proportion cured (%) from different modelling approaches for localised colon cancer.

Approach                 <60 years            60–74 years          75–84 years
Yu et al. [2]            Failed to converge   Failed to converge   Failed to converge
Verdecchia et al. [3]    87.1                 82.2                 Failed to converge
De Angelis et al. [4]    Failed to converge   Failed to converge   Failed to converge
Lambert et al. [5]       Failed to converge   Failed to converge   Failed to converge
Andersson et al. [6]     86.5                 82.9                 81.8

4. Discussion

Cure models have been increasingly used for modelling time-to-event data incorporating a proportion cured. We wish to emphasise that their application and interpretation are dependent on assumptions, and violation of these assumptions may lead to biased estimates. While some statistical procedures are relatively insensitive to underlying assumptions, this is not the case with cure models. The two key assumptions are that cure occurs and that the distribution of survival times of the uncured cases can be described by the chosen parametric distribution. If the first assumption is not reasonable then we would advise against using cure models, and even when cure is a reasonable assumption, the distributional assumptions must be carefully assessed to avoid biased estimates. From our experience in applying cure models we have found that cure is not always a reasonable assumption, with breast cancer and prostate cancer being typical examples. We have also found that even when there is evidence of cure, the models do not converge, or only fit poorly, when survival is either relatively good or relatively poor.

The developers of cure models often discuss the limitations of the models, and urge caution in their use, but typically present illustrative examples where cure models work well, colon cancer being a particular favourite [2–6,9,10,15]. In this study, we explored the practicalities of applying cure models in a broader context, using data for three different cancer types that illustrate several scenarios regarding these two central assumptions. First is the situation where cure is not a reasonable assumption, illustrated using data for female breast cancer. Second is the situation where both the cure and distributional assumptions are reasonable (in this study we used the Weibull distribution for all approaches requiring a distributional assumption). Third is the situation where cure is a reasonable assumption but the distributional assumption is not.

Within the considered 15-year follow-up time there was no evidence of statistical cure, for any age group, for women diagnosed with breast cancer (Fig. 1B), but this does not preclude the possibility that many of the women are medically cured. We believe that cure models should not be used for such data, but we nonetheless fitted cure models to illustrate this scenario. All approaches produced an estimate of the proportion cured for the two younger age groups, and one of the approaches also produced a large positive estimate in the oldest age group (Table 2). We would emphasise here that just producing an estimate does not mean the approach is sensible.

It would be desirable to have a formal test for determining whether population cure exists, to assist researchers in deciding whether it is appropriate to apply cure models to the population of interest. There has previously been some work in this area [16,17], but the application of these methods is rather limited, largely due to a lack of software to implement the proposed approaches [18].
Thus, a simple and easy-to-implement test is needed. In the absence of such a test, we suggest visually examining life table estimates of relative survival by key prognostic factors to determine if the survival curves tend to level off after a certain period of follow-up; if not, a cure model should not be applied to such data [19,20]. This raises the interesting question of how much levelling is sufficient to allow the application of cure models, a question that still requires much discussion.

For the second scenario, where there is clear graphical evidence that the statistical cure assumption is appropriate, our analyses showed that all five approaches, no matter what the structure of the input file, the estimation methods or the software used, produced very similar estimates for the proportion cured. This is the case for the two younger age groups for colon and ovarian cancer (Table 2), and is supported by graphical evidence that the survival curves from the different approaches are in close agreement (Fig. 1A and C). Thus, when both the cure and distributional assumptions are met, most cure model approaches are likely to give reliable estimates of the proportion cured.

The third scenario, where cure is reasonable but the distributional assumption is not, was illustrated using the oldest age group for both colon and ovarian cancer. Here, the Weibull distribution cannot capture the survival shapes appropriately since mortality is quite high within the first year and then rapidly decreases. In this situation, the estimates of the proportion cured from different approaches varied from 50.0% to 56.5% for colon cancer, and from 15.8% to 21.1% for ovarian cancer (Table 2).
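To make the role of the Weibull assumption concrete, the following minimal sketch evaluates relative survival under a Weibull mixture and a Weibull non-mixture cure model, plus the excess hazard implied by the mixture form. It is an illustration under stated assumptions, not any of the five implementations compared here: the function names and the parameter values (`pi`, `lam`, `gamma`) are hypothetical and are not fitted to the SEER data.

```python
import math


def weibull_survival(t: float, lam: float, gamma: float) -> float:
    """Weibull survival function exp(-lam * t**gamma)."""
    return math.exp(-lam * t ** gamma)


def mixture_cure_rs(t: float, pi: float, lam: float, gamma: float) -> float:
    """Relative survival under a mixture cure model: pi + (1 - pi) * S_u(t)."""
    return pi + (1.0 - pi) * weibull_survival(t, lam, gamma)


def non_mixture_cure_rs(t: float, pi: float, lam: float, gamma: float) -> float:
    """Relative survival under a non-mixture cure model: pi ** F_z(t)."""
    return pi ** (1.0 - weibull_survival(t, lam, gamma))


def mixture_excess_hazard(t: float, pi: float, lam: float, gamma: float) -> float:
    """Excess hazard implied by the mixture model, (1 - pi) * f_u(t) / R(t), for t > 0."""
    density = lam * gamma * t ** (gamma - 1.0) * weibull_survival(t, lam, gamma)
    return (1.0 - pi) * density / mixture_cure_rs(t, pi, lam, gamma)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    pi, lam, gamma = 0.5, 0.6, 0.8
    for year in (1, 2, 5, 10, 15):
        print(
            year,
            round(mixture_cure_rs(year, pi, lam, gamma), 4),
            round(non_mixture_cure_rs(year, pi, lam, gamma), 4),
            round(mixture_excess_hazard(year, pi, lam, gamma), 4),
        )
```

Only two parameters (`lam` and `gamma`) govern the entire shape of the uncured part, so a curve with very high early mortality followed by a rapid decline has to be matched by the same two numbers that also determine how the plateau is approached. That constraint is one way to read the variation reported for the oldest age group.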
The two approaches that use a dataset comprising individual records and a Weibull distribution [4,5] yielded very similar results and overestimated the proportion cured, which confirms the previous finding that cure models do not perform well when survival drops rapidly soon after diagnosis [5,15]. However, the two Weibull models using grouped data [2,3] gave similar estimates that were very close to the life table estimates for colon cancer in the oldest age group. We suspect that the use of grouped data with an annual follow-up interval effectively averaged out the high excess mortality in the first few months after diagnosis, resulting in lower estimates that coincided with the estimate from the flexible parametric model. To test this hypothesis, we repeated the initial analysis with monthly follow-up intervals using CANSURV and found that the updated estimate of the proportion cured moved towards that (56%) obtained using the individual records, from 50.8% to 53.1%. This suggested that when the assumption of a Weibull distribution is not appropriate, the models using grouped data may be sensitive to the choice of the width of the follow-up interval. Although the magnitude of the change (from 50.8% to 53.1%) does not constitute evidence against using Weibull models, we believe that the flexible parametric cure model may be preferable in such situations.

Besides the two central assumptions of cure models being satisfied, the accuracy of estimates of the proportion cured depends on the size of the study population and the length of patient follow-up, two important prerequisites for the application of cure models. Thus, the strengths of this study include the large sample, which increased the probability of obtaining a stable estimate, and the long follow-up, which allowed the survival curves to level off. For both colon and ovarian cancers, 15 years of follow-up is considered to be beyond the minimum threshold required [21].

For colon cancer, and to a lesser extent ovarian cancer, the cure models used here are well justified, because there is strong empirical evidence of the existence of a proportion cured. For colon cancer, advances in diagnostic and surgical techniques along with adjuvant chemotherapy and radiotherapy have led to impressive improvements in outcomes: a substantial proportion of patients with early disease stage [22] or regional disease [23] may be cured of the disease. Cure is also possible for selected advanced colon cancer patients through a multimodal approach combining surgical treatments and systemic therapies [24,25]. For ovarian cancer the evidence was not as strong, but the available data [26,27] clearly pointed in the same direction. In addition, our results are consistent with many population-based studies which indicate that cure is possible for some patients with colon or ovarian cancer [2,3,5,28].

However, cure models in general have several potential limitations which need to be considered. First, cure models with right-censored data suffer from an inherent identifiability problem. To try to minimise this problem we examined whether the survival curve levelled off after a certain period of follow-up to make sure most events have been observed, as advised by Yu [2]. In the case of colon and ovarian cancers the survival curves appeared to level off at around 10 years of follow-up.
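The levelling-off check described above is visual, but a rough numeric companion can be sketched as follows. The idea is to convert cumulative relative survival at the end of each annual interval into interval-specific relative survival and see whether the late intervals stay close to 1. This is a sketch under stated assumptions, not a formal test of cure: the function names, the `tolerance` threshold and the two example series are all hypothetical, and the series are invented for illustration rather than taken from the SEER data.

```python
from typing import List, Sequence


def interval_relative_survival(cumulative: Sequence[float]) -> List[float]:
    """Convert cumulative relative survival at the end of each interval into
    interval-specific relative survival, taking survival at time zero as 1.0."""
    ratios = []
    previous = 1.0
    for value in cumulative:
        if value <= 0.0:
            raise ValueError("cumulative relative survival must be positive")
        ratios.append(value / previous)
        previous = value
    return ratios


def levels_off(cumulative: Sequence[float], from_interval: int,
               tolerance: float = 0.01) -> bool:
    """Return True if every interval-specific relative survival from
    `from_interval` (1-based) onwards lies within `tolerance` of 1."""
    late = interval_relative_survival(cumulative)[from_interval - 1:]
    if not late:
        raise ValueError("no intervals at or after from_interval")
    return all(abs(ratio - 1.0) <= tolerance for ratio in late)


if __name__ == "__main__":
    # Hypothetical 15-year series: one reaches a plateau, one keeps declining.
    plateau = [0.80, 0.68, 0.61, 0.57, 0.55, 0.54, 0.535, 0.533,
               0.532, 0.531, 0.531, 0.530, 0.530, 0.530, 0.530]
    declining = [0.97 ** year for year in range(1, 16)]
    print(levels_off(plateau, from_interval=11))
    print(levels_off(declining, from_interval=11))
```

The choice of `from_interval` and `tolerance` is exactly the open question raised in this discussion (how much levelling is sufficient), so the output is best read alongside the plotted curves rather than in place of them.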
Second, most cure models do not converge (e.g., localised colon cancer), or only fit poorly, when survival is either relatively good or relatively poor, as we showed earlier in this study. To account for the latter problem, Lambert et al. [15] proposed a mixture Weibull distribution approach, which assumes that the distribution of survival times for the uncured cases is a mixture of two Weibull distributions, and reported two advantages over a single Weibull model [15]: a lower Akaike Information Criterion (AIC) value, indicating better fit of the model; and predicted survival estimates closer to the empirical survival estimates. In situations like this, the flexible parametric model offers more flexibility in the shape of the parametric distribution than the mixture Weibull.

Third, there is currently a lack of diagnostic tools for all approaches. Although AIC has been used to select a better model, using such measures for model selection can be dangerous if interest lies in estimation of the proportion cured [5]. The difficulty in assessing cure models is that when the chosen distribution, e.g., Weibull, is not appropriate, the estimation algorithm will favour models that provide a better fit where there is most information, typically early in the follow-up where more deaths occur, rather than later in the follow-up where cure occurs. As such, the usual tests of goodness-of-fit are not especially informative for cure models since they may favour a model that fits well in the first year following diagnosis but for which the proportion cured is estimated poorly. Thus, we believe that the use of graphs for assessing goodness of fit is extremely important, although methods for model diagnostics are still needed and remain a topic for future methodological research [29].

Both grouped relative survival data and data comprising individual survival records can be used for estimating the proportion cured. There are several benefits to using grouped survival data in cure models.
The tabulated data are readily available in published reports [3], the approaches are implemented with the use of readily available software (either SAS or CANSURV), and the model is easy to run and takes less time to converge than approaches that use individual data records. However, there are some concerns regarding loss of information due to collapsing data into groups, such as requiring an annual follow-up interval, which is not ideal for older patients with high excess mortality in the first few months after diagnosis. The models fitted to individual data have an advantage over those fitted to grouped data in that one is not required to categorise continuous covariates such as age. Modelling age as a smooth function, e.g., using splines, is not only biologically more plausible than modelling age as a step function, but also makes it possible to make predictions for individual ages rather than for age groups.

The flexible parametric cure model offers some additional advantages, including greater modelling flexibility with respect to the shapes of the survival distributions, greater sensitivity to small excess risk, and easy implementation, as readily available software can be used. However, it is also potentially sensitive to the choice of the number and location of knots. In this study, we have fitted a model with seven knots with default locations. But how sensitive are the results to different numbers or locations of knots? We performed a sensitivity analysis by fitting models with varying numbers and locations of knots using the colon cancer data for the oldest age group, for which relatively larger variation was observed. As found in previous studies [6,30,31], the estimated proportion cured was insensitive both to ‘sensible’ choices of the number of knots, with a difference of only 0.2% (50.0% vs 50.2%), and to the locations of the knots, with a difference of 0.4% (49.9% vs 50.3%) (Fig. 2).
This further confirms that the flexible parametric models are generally insensitive to the number and position of the knots “as long as they are placed over the whole follow-up period and the last knot is positioned at the last observed death time or possibly later” [30]. The unique advantage of the flexible parametric cure model is that it allows modelling for older age groups (Fig. 1) and cases with early cancer stage (Table 3), which is not always possible using other approaches. More detailed methods and results for this sensitivity analysis are described in Appendix 2.

However, in the case where cure is not a reasonable assumption, such as for breast cancer, the flexible parametric cure model gives a worse fit to the data (Fig. 1B) than the other models. This is because the flexible parametric cure model assumes the point of statistical cure occurs at a finite point during follow-up, whereas Weibull models assume that cure may occur even at time infinity, with a null cured proportion, and hence provide a better fit when cure does not occur. Specifically, the flexible parametric cure model assumes cure occurs at the last knot, which in our example was at 15 years, thus estimating that 65% were cured. We do not feel this should be seen as a disadvantage of the flexible parametric cure model, since we do not believe cure models should be applied in situations where cure is not a reasonable assumption.

In summary, in choosing an approach we feel practitioners should take into account both theoretical and practical considerations. If visual inspection of cumulative relative survival curves suggests cure is not a reasonable assumption we would, in general, discourage the application of cure models. If cure is a reasonable assumption and patient survival is neither extremely high nor extremely low, then there is little difference between the implementations and one can choose whichever software suits.
If one is interested in analysing SEER data or has imported data into SEER*Stat, for example, then CANSURV becomes particularly attractive. One does not have to work long with cure models, however, before encountering scenarios where the distributional assumptions are not appropriate and the models fail to converge. Therefore, unless one has a particularly strong preference for another software package, we recommend the implementation by Andersson [6], since the Weibull distribution is not always sufficiently flexible to provide a good fit when, for example, early mortality is high, whereas the flexible parametric model does not suffer from this problem.

Fig. 2. Sensitivity analysis varying numbers and locations of knots, SEER data for colon cancer (aged 75–84 years) diagnosed in 1981–1988. *Proportion cured overestimated and this choice not recommended.

In conclusion, if the data suggest cure is not a reasonable assumption then we advise against fitting cure models. In the scenarios where cure was reasonable, we found that flexible parametric cure models performed at least as well as, or better than, the other modelling approaches. We recommend that, regardless of the model used, the underlying assumptions for cure and model fit should always be graphically assessed.

Conflict of interest statement

No conflict of interests identified.

Acknowledgements

We thank Mark Clements for his comments on the earlier draft of this manuscript, Qingwei Luo for assisting with producing the graphs and Clare Kahn for editorial assistance. Xue Qin Yu is supported by an Australian NHMRC Training Fellowship (550002) and he thanks the Sydney Medical School for their support in the form of an International Travelling Fellowship in 2012, which enabled him to collaborate with Paul Dickman at the Karolinska Institute in Sweden. Part of this work was carried out while Paul Lambert was granted study leave by the University of Leicester.

Appendix. Supplementary files

Supplementary files associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.canep.2013.08.014.

References

[1] Dickman PW, Adami HO. Interpreting trends in cancer patient survival. J Intern Med 2006;260:103–17.
[2] Yu B, Tiwari RC, Cronin KA, McDonald C, Feuer EJ. CANSURV: A Windows program for population-based cancer survival analysis. Comput Methods Programs Biomed 2005;80:195–203.
[3] Verdecchia A, De Angelis R, Capocaccia R, et al. The cure for colon cancer: results from the EUROCARE study. Int J Cancer 1998;77:322–9.
[4] De Angelis R, Capocaccia R, Hakulinen T, Soderman B, Verdecchia A. Mixture models for cancer survival analysis: application to population-based data with covariates. Stat Med 1999;18:441–54.
[5] Lambert PC, Thompson JR, Weston CL, Dickman PW. Estimating and modeling the cure fraction in population-based cancer survival analysis. Biostatistics 2007;8:576–94.
[6] Andersson TM, Dickman PW, Eloranta S, Lambert PC. Estimating and modelling cure in population-based cancer studies within the framework of flexible parametric survival models. BMC Med Res Methodol 2011;11:96.
[7] Andersson TM, Lambert PC, Derolf AR, Kristinsson SY, Eloranta S, Landgren O. Temporal trends in the proportion cured among adults diagnosed with acute myeloid leukaemia in Sweden 1973–2001, a population-based study. Br J Haematol 2010;148:918–24.
[8] Clements MS, Roder DM, Yu XQ, Egger S, O’Connell DL. Estimating prevalence of distant metastatic breast cancer: a means of filling a data gap. Cancer Causes Control 2012;23:1625–34.
[9] Eloranta S, Lambert PC, Cavalli-Bjorkman N, Andersson TM, Glimelius B, Dickman PW. Does socioeconomic status influence the prospect of cure from colon cancer – a population-based study in Sweden 1965–2000. Eur J Cancer 2010;46:2965–72.
[10] Lambert PC, Dickman PW, Osterlund P, Andersson T, Sankila R, Glimelius B. Temporal trends in the proportion cured for cancer of the colon and rectum: a population-based study using data from the Finnish Cancer Registry. Int J Cancer 2007;121:2052–9.
[11] Woods LM, Rachet B, Lambert PC, Coleman MP. ‘Cure’ from breast cancer among two populations of women followed for 23 years after diagnosis. Ann Oncol 2009;20:1331–6.
[12] SEER. Surveillance Epidemiology and End Results (SEER) Program Research Data (1973–2007). National Cancer Institute, DCCPS, Surveillance Research Program, Cancer Statistics Branch. National Cancer Institute; 2010. Released April 2010, based on the November 2009 submission.
[13] Data Modeling Branch, ed. Cansurv. Statistical Methodology Applications Branch. National Cancer Institute; 2005. Version 1.0.
[14] Ederer F, Heise H. Instructions to IBM 650 Programmers in Processing Survival Computations. Methodological note No. 10. End Results Evaluation Section. Bethesda: National Cancer Institute, 1959.
[15] Lambert PC, Dickman PW, Weston CL, Thompson JR. Estimating the cure fraction in population-based cancer studies using finite mixture models. J Roy Statist Soc 2010;59:35–55.
[16] Maller RA, Zhou S. Testing for the presence of immune or cured individuals in censored survival data. Biometrics 1995;51:1197–205.
[17] Peng Y, Dear KB, Carriere KC. Testing for the presence of cured patients: a simulation study. Stat Med 2001;20:1783–96.
[18] Othus M, Barlogie B, Leblanc ML, Crowley JJ. Cure models as a useful statistical tool for analyzing survival. Clin Cancer Res 2012;18:3731–6.
[19] Othus M, Li Y, Tiwari R. Change point-cure models with application to estimating the change-point effect of age of diagnosis among prostate cancer patients. J Appl Stat 2012;39:901–11.
[20] Rondeau V, Schaffner E, Corbiere F, Gonzalez JR, Mathoulin-Pelissier S. Cure frailty models for survival data: application to recurrences for breast cancer and to hospital readmissions for colorectal cancer. Stat Methods Med Res 2013;22:243–60.
[21] Tai P, Yu E, Cserni G, et al. Minimum follow-up time required for the estimation of statistical cure of cancer patients: verification using data from 42 cancer sites in the SEER database. BMC Cancer 2005;5:48.
[22] Wichmann MW, Muller C, Hornung HM, Lau-Werner U, Schildberg FW. Results of long-term follow-up after curative resection of Dukes A colorectal cancer. World J Surg 2002;26:732–6.
[23] Wilkinson NW, Yothers G, Lopa S, Costantino JP, Petrelli NJ, Wolmark N. Long-term survival results of surgery alone versus surgery plus 5-fluorouracil and leucovorin for stage II and stage III colon cancer: pooled analysis of NSABP C-01 through C-05. A baseline from which to compare modern adjuvant trials. Ann Surg Oncol 2010;17:959–66.
[24] Gallagher DJ, Kemeny N. Metastatic colorectal cancer: from improved survival to potential cure. Oncology 2010;78:237–48.
[25] Tomlinson JS, Jarnagin WR, DeMatteo RP, et al. Actual 10-year survival after resection of colorectal liver metastases defines cure. J Clin Oncol 2007;25:4575–80.
[26] Jelovac D, Armstrong DK. Recent progress in the diagnosis and treatment of ovarian cancer. CA Cancer J Clin 2011;61:183–203.
[27] Swenerton KD, Santos JL, Gilks CB, et al. Histotype predicts the curative potential of radiotherapy: the example of ovarian cancers. Ann Oncol 2011;22:341–7.
[28] Cvancarova M, Aagnes B, Fossa SD, Lambert PC, Moller B, Bray F. Proportion cured models applied to 23 cancer sites in Norway. Int J Cancer 2013;132:1700–10.
[29] Mallett S, Royston P, Waters R, Dutton S, Altman DG. Reporting performance of prognostic models in cancer: a review. BMC Med 2010;8:21.
[30] Andersson TML, Lambert PC. Fitting and modeling cure in population-based cancer studies within the framework of flexible parametric survival models. Stata J 2012;12:623–38.
[31] Lambert PC, Royston P. Further development of flexible parametric models for survival analysis. Stata J 2009;9:265–90.
Please cite this article in press as: Yu XQ, et al. Estimating the proportion cured of cancer: Some practical advice for users. Cancer Epidemiology (2013), http://dx.doi.org/10.1016/j.canep.2013.08.014