Factors Influencing Decisions by NICE


OHE presents a series of lunchtime seminars throughout the year. The most recent seminar, held in late April, considered the influence of cost-effectiveness and other factors on NICE decisions.

Published in: Health & Medicine, Business

    1. 1. The Influence of Cost-Effectiveness and Other Factors on NICE Decisions. OHE Lunchtime Seminar, London • 23 April 2013
    2. 2. The influence of cost-effectiveness and other factors on NICE decisions. Helen Dakin¹ and Nancy Devlin², in collaboration with Yan Feng², Nigel Rice³, Phill O’Neill² and David Parkin². ¹University of Oxford, ²OHE, ³University of York
    4. 4. NICE Established 1999 Issues guidance to ensure quality and value for money For technology appraisals NHS is required to providefunding and resources for medicines and treatmentsrecommended by NICE Important implications for patients, the NHS, industry,decision making in other countries What factors affect NICE decisions? How important iscost effectiveness compared to ‘other factors’?
    5. 5. NICE’s stated threshold
        • Pre-2004: various statements suggest a threshold of around £30,000
        • 2004 methods guide & 2005 social value judgements
            • [NICE] should, generally, accept as cost effective those interventions with an incremental cost-effectiveness ratio of less than £20,000 per QALY and that there should be increasingly strong reasons for accepting as cost effective interventions with an incremental cost-effectiveness ratio of over £30,000 per QALY
        • 2008 social value judgements
            • NICE should explain its reasons when it decides that an intervention with an ICER below £20,000 per QALY gained is not cost effective; and when an intervention with an ICER of more than £20,000 to £30,000 per QALY gained is cost effective
    6. 6. Previous studies on determinants of NICE decisions
        Study | No. appraisals | Model | Findings
        Devlin & Parkin (2004) | 39, to May 2002 | Logistic: yes/no | Threshold ≈ £40,000/QALY; uncertainty & prevalence matter
        Dakin et al (2006) | 73, to Dec 2003 | mlogit: yes / no / yes, but | ICER, number of trials or SRs, date, patient group submissions and technology type matter; ‘yes, but’ and ‘no’ driven by different factors
        Jena (2009) | 86, to 2005 | Linear probability model: yes/no | £1000 increase in ICER decreases probability of ‘yes’ by 0.009; infectious disease and mental health decisions more likely to be ‘no’
        Mason & Drummond (2009) | 38 cancer appraisals, Oct 2008 | Tabulation: yes / no / yes, but | ‘No’ more likely after 2006: partly due to STA process?; restrictions attributed to ICER, insufficient evidence, uncertainty or methodology
    7. 7. Aims
        • Estimate NICE’s cost-effectiveness threshold
        • Identify the factors that affect or explain NICE’s decisions
        • Evaluate whether NICE’s threshold or decision-making has changed over time [in progress]
    9. 9. The basic model
        • Incremental cost-effectiveness ratio (ICER)
            • Hypothesised to be main driver of decisions
            • ICER in £000s
        • Clinical evidence
            • ‘NICE should not recommend an intervention […] if there is […] not enough evidence’ (SVJ 2008)
            • Hypothesis: NICE will reject if insufficient evidence
            • Total number of patients in randomised trials = number of RCTs × mean patients/trial
        • Insights provided by stakeholders (Rawlins 2010)
            • E.g. on whether QoL assessment adequately captures benefits (SVJ 2008)
            • Hypothesis: increases odds of ‘yes’
            • Patient group submission =1 if patient group submitted evidence or opinion (proxy for stakeholder involvement/persuasion)
    10. 10. The basic model (cont.)
        • Only treatment
            • Hypothesis: NICE is more likely to recommend if no alternatives
            • =1 if there were no alternative treatments available for this patient group
        • Children
            • Give ‘the benefit of the doubt’ given methodological challenges (Rawlins 2010)
            • Hypothesis: paediatric treatments more likely to be recommended
            • =1 if the decision specifically concerns children or adolescents
        • Publication date
            • Evaluates whether NICE decisions are changing
            • No prior hypothesis
            • = Years since first NICE appraisal was published
        • Severity of underlying illness
            • NICE states that it accepts higher ICERs for serious conditions (Rawlins 2010)
            • = Mean WHO DALY weight across conditions for this disease category
    11. 11. Additional variables explored
        • Pharmaceutical
            • May reflect greater stakeholder involvement
            • =1 for all drugs
        • Disease
            • Interim analysis suggests NICE gives extra weight to cancer treatments
            • 8 dummies =1 if the decision concerns that disease
            • Diseases with <20 decisions with ICERs omitted
        • Probabilistic sensitivity analysis (PSA)
            • Significant predictor of AWMSG decisions (Linley & Hughes 2012)
            • =1 if the model has PSA
        • Broader perspective
            • Reflects consideration of additional savings not captured in ICER
            • =1 if non-NHS/PSS costs were analysed or discussed
    12. 12. Additional variables explored (cont.)
        • Appraisal committee
            • Because committees weigh both quantitative evidence and value judgements, the way they do this might differ systematically across committees.
            • Committees characterised by their Chairs and included as dummies
        • Innovation
            • NICE says that it takes into account the innovative nature of the technology
            • Defined by us as: any molecule launched within 2 years of appraisal AND in an ATC4 class that was created 5 years prior to appraisal. This picks up new medicines, but also avoids limiting to first in class, as NICE does assess groups of similar medicines that are new, and would capture the spirit of viewing a medicine as innovative, e.g. new diabetes medicines, TNFs, etc.
    13. 13. Additional variables explored (cont.)
        • Single technology appraisal (STA)
            • Mason & Drummond found cancer STAs more likely to be ‘no’
            • =1 if the STA process was used
        • Orphan
            • Evaluate […] ‘orphan drugs’, in the same way as any other treatment (SVJ, 2008; Littlejohns and Rawlins, 2009)
            • Hypothesised to have no impact based on NICE statements
            • =1 if the treatment has EMEA orphan status
        • Uncertainty
            • Difference between the highest and lowest NE quadrant ICERs
            • 2 dummies indicating whether the plausible ICER could be dominant, or be dominated, were explored, but dropped out of regression
            • Other measures of uncertainty are problematic
    14. 14. Additional variables explored on a subset of decisions
        • End of life
            • Place special value on treatments prolonging life at the end of life, providing that life is of reasonable quality (Rawlins, 2010; NICE 2009)
            • =1 if met EoL criteria
            • Only evaluated for decisions with preliminary guidance (FAD) published after 5th Jan 2009
    15. 15. DATA
    16. 16. HTAinSite
        • Most data derived from HTAinSite: www.htainsite.com
        • Commercial database developed by OHE, Abacus and City University
        • Provides extensive data on all NICE and SMC appraisals
            • Regularly updated
            • Extracted and validated based on established protocol
        • Access to web interface available to subscribers
        • Academics can request data for research by email
    17. 17. Appraisals versus decisions
        • We analyse binary yes/no choices (not yes/no/yes, but)
            • Evidence, ICERs and other considerations often differ between the subgroups for which the technology is recommended and those for which it is rejected
            • Levels of restrictions differ enormously, from 5% of patients to 80% (O’Neill and Devlin 2010)
            • Reflects HTAinSite protocol
        • NICE appraisals are divided into yes/no decisions concerning whether or not to use one technology in one patient subgroup with a certain condition
            • Methods for subdividing appraisals governed by HTAinSite protocol
    18. 18. Collection of ICER data
        • Most guidance documents give multiple ICERs
            • From manufacturer, assessment group and decision support unit
            • Base case, subgroup analyses and sensitivity analyses
            • Several comparators
        • HTAinSite records all ICERs mentioned in documentation
        • We developed a protocol to identify the ICERs informing each decision
            • Included only cost/QALY ICERs for the subgroup(s) in that decision
            • DSU or ERG/TAG ICERs used in preference to manufacturer
            • Vs NICE’s preferred comparator or next most effective treatment on the frontier
            • Exclude ICERs that NICE did not believe (based on considerations section)
    19. 19. Decisions with multiple ICERs
        • A decision may have >1 relevant ICER if
            • It covers >1 subgroup with different ICERs
            • Results of several analyses or comparators are given equal prominence in guidance document
            • NICE concluded the ICER was ‘between X and Y’, <A or >B
        • Taking the mean or midpoint would ignore uncertainty & make assumptions about how NICE uses ICER data
        • We therefore randomly sampled from the list of ICERs
            • Each ICER was given equal weight
            • For ranges, we sampled from the full list of ICERs from other decisions
            • Drew 100 iterations, each with different ICERs for each decision
            • Analyses repeated on each iteration; results combined with Rubin’s Rules
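The sampling-and-pooling procedure on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' Stata code: the function names `draw_datasets` and `pool_rubin` and the example ICERs are hypothetical, ICER ranges are not handled, and it assumes (as is standard practice) that coefficients are pooled on the log-odds scale before being converted to odds ratios.

```python
import math
import random
from statistics import mean, variance


def draw_datasets(candidate_icers, n_datasets=100, seed=0):
    """Build n_datasets lists of ICERs, one randomly chosen ICER per decision.

    candidate_icers: list of lists; each inner list holds the equally
    weighted candidate ICERs (in £000s) recorded for one decision.
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    return [
        [rng.choice(options) for options in candidate_icers]
        for _ in range(n_datasets)
    ]


def pool_rubin(estimates, std_errors):
    """Combine one coefficient estimated on m datasets using Rubin's Rules."""
    m = len(estimates)
    pooled = mean(estimates)                     # pooled point estimate
    within = mean(se ** 2 for se in std_errors)  # mean within-dataset variance
    between = variance(estimates)                # between-dataset variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between       # total variance
    return pooled, math.sqrt(total)


# Hypothetical decisions: the first has a single ICER, the others have several.
candidates = [[18.0], [25.0, 32.0, 41.0], [12.5, 60.0]]
datasets = draw_datasets(candidates, n_datasets=100)

# Hypothetical log-odds coefficients and standard errors from three datasets.
pooled, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In a full analysis the regression would be re-estimated on each of the 100 datasets and `pool_rubin` applied to each coefficient in turn.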
    21. 21. Outline of econometric methods
        • Used logistic regression to predict the effect of ICER and other variables on the log-odds of NICE saying ‘yes’
        • Adjusted standard errors to allow for clustering of decisions within appraisals
        • Analysed in Stata version 12
    22. 22. Modelling strategy
        • Stage 0: Estimation of ICER-only model where ICER alone predicts recommendations
        • Stage 1: Evaluate a basic model with the variables expected to have most impact on NICE decisions
        • Stage 2: Remove non-significant variables from basic model one at a time
        • Stage 3: Evaluate impact of adding additional variables
        • Stage 4: Alternative specifications of basic model parameters [to come]
        • Stage 5: Sensitivity and subgroup analyses [to come]
    23. 23. Model selection
        • Our methods for dealing with decisions with ≥2 ICERs involve generating 100 datasets with different ICER values
        • We combine coefficients across datasets using Rubin’s Rules
        • AIC and pseudo-R2 cannot be pooled across datasets
        • We therefore choose between models based on prediction accuracy
            • We assume that the model predicts a ‘yes’ if predicted log-odds ≥0
            • Categorise decisions into true/false positives and true/false negatives to get the % of decisions correctly predicted
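The prediction-accuracy criterion described above can be sketched as a small function. This is an illustration under assumptions: `classification_summary` is a hypothetical name, and the log-odds and outcomes below are made-up values rather than study data.

```python
def classification_summary(log_odds, recommended):
    """Classify each decision and report the % correctly predicted.

    log_odds: predicted log-odds of a 'yes' for each decision.
    recommended: True where NICE actually said 'yes'.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for value, actual_yes in zip(log_odds, recommended):
        predicted_yes = value >= 0  # log-odds >= 0 means P(yes) >= 0.5
        if predicted_yes and actual_yes:
            counts["tp"] += 1
        elif predicted_yes and not actual_yes:
            counts["fp"] += 1
        elif not predicted_yes and not actual_yes:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    total = sum(counts.values())
    counts["pct_correct"] = 100 * (counts["tp"] + counts["tn"]) / total
    return counts


# Made-up example: six decisions, three actual 'yes' and three actual 'no'.
summary = classification_summary(
    [2.1, 0.0, -0.3, 1.2, -1.5, 0.4],
    [True, True, True, False, False, False],
)
```

Because this measure is computed per dataset, it can be averaged across the 100 sampled datasets in a way that AIC and pseudo-R2 cannot.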
    26. 26. Numbers of decisions and appraisals
        • 240 appraisals published by 31st December 2011
        • Excluded (E1 & E2): 11 terminated appraisals & 12 decisions without other restrictions
        • 229 appraisals comprising 763 decisions
        • Excluded (E3a): 162 decisions based on grounds other than cost-effectiveness
        • Excluded (E3b-c): 92 decisions based on cost-effectiveness but without available, quantified cost/QALY
        • 510 decisions included in models with ICERs
    27. 27. Grounds for decision
    28. 28. Number of ICERs
        • 510 decisions have available quantified cost/QALY ICERs, of which:
            • 198 have 2 to 40 ICERs
            • 31 have ICER range
        • ICERs for these appraisals are randomly drawn in 100 datasets
    29. 29. ICER data
        Quadrant | NE: more costly, more effective | SE: less costly, more effective | NW: more costly, less effective | SW: less costly, less effective | Quadrants vary across decisions
        Number of decisions: All | 418 | 33 | 31 | 6 | 22
        Number of decisions: Yes (%) | 282 (67%) | 33 (100%) | 0 (0%) | 5 (83%) | 13 (59%)
        Number of decisions: No (%) | 136 (33%) | 0 (0%) | 31 (100%) | 1 (17%) | 9 (41%)
        Mean ICER: All | £34,207 | Dominant | Dominated | £5,760 | -
        Mean ICER: Yes | £17,450 | Dominant | - | £5,544 | -
        Mean ICER: No | £68,952 | - | Dominated | £6,839 | -
        • Average ICER is >3 times higher for ‘no’ decisions than ‘yes’
        • Dominance perfectly predicts NICE recommendations
        • Subsequent analyses focus on the NE quadrant decisions
    30. 30. Impact of ICER ranking on recommendations
        • Decisions with high ICERs are more likely to be rejected, but there are many exceptions
        [Figure: decisions ranked by ICER, axis running from £0 to £500,000. Blue = recommended; red = rejected]
    31. 31. Proportion of decisions below different thresholds that are rejected
        • 50% of decisions with ICERs >£20,000 are rejected
    32. 32. Sensitivity and specificity at different thresholds
        • ROC analysis suggests ICER strongly predicts decisions (AUC 0.85)
        • Sensitivity and specificity both equal 77% if Rc ~£30,000/QALY
        • % correctly classified plateaus at 81-82% between Rc £36k and £54k, peaking at £47,743/QALY
        Specificity, sensitivity and classification calculated using roctab
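The threshold analysis on this slide was run with Stata's roctab; the sketch below only illustrates the underlying idea in Python. It assumes the decision rule "predict ‘yes’ when the ICER is below the candidate threshold Rc" and uses the slides' convention that sensitivity is the share of ‘yes’ decisions correctly predicted. The function names and the ten example decisions are hypothetical.

```python
def confusion_counts(threshold, icers, recommended):
    """Return (tp, tn, n_yes, n_no) for the rule: predict 'yes' if ICER < threshold."""
    tp = sum(1 for icer, yes in zip(icers, recommended) if yes and icer < threshold)
    tn = sum(1 for icer, yes in zip(icers, recommended) if not yes and icer >= threshold)
    n_yes = sum(1 for yes in recommended if yes)
    n_no = len(recommended) - n_yes
    return tp, tn, n_yes, n_no


def sensitivity_specificity(threshold, icers, recommended):
    """Share of 'yes' and of 'no' decisions correctly predicted at this threshold."""
    tp, tn, n_yes, n_no = confusion_counts(threshold, icers, recommended)
    return tp / n_yes, tn / n_no


def best_accuracy_threshold(thresholds, icers, recommended):
    """Candidate threshold that classifies the most decisions correctly."""
    def n_correct(threshold):
        tp, tn, _, _ = confusion_counts(threshold, icers, recommended)
        return tp + tn
    return max(thresholds, key=n_correct)


# Made-up ICERs (in £000s) and outcomes for ten decisions.
icers = [5, 12, 18, 25, 28, 35, 40, 55, 70, 90]
recommended = [True, True, True, True, False, True, False, False, True, False]

sens, spec = sensitivity_specificity(30, icers, recommended)
best = best_accuracy_threshold([20, 30, 40, 50, 60], icers, recommended)
```

Sweeping a finer grid of thresholds and plotting sensitivity against one minus specificity would trace out the ROC curve summarised by the AUC.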
    33. 33. Plotting probability of rejection against ICER
        • Inflection points at ~£20,000 and ~£50,000/QALY?
        • Rawlins & Culyer estimated that the relationship was of this shape and that inflexion A occurs at around £5,000-£15,000/QALY and inflexion B at around £25,000-£35,000/QALY
        • Decisions are grouped into categories with similar ICER; proportion of decisions in each category that were recommended plotted against mid-point of each category
        • Curved line shows approximate best fit by eye
    35. 35. Results of the basic model (1)
        Variable | Definition | Odds ratio (SE)
        ICER | Cost-effectiveness ratio (£’000s) | 0.938 (0.915, 0.962)*
        Total Pts in RCTs | Total number of patients randomised in all RCTs for this decision | 0.99999 (0.99996, 1.00002)
        Only treatment | =1 if there are no alternative treatments | 2.263 (0.448, 11.448)
        Children | =1 if concerns treatments for children | 3.774 (0.274, 52.026)
        Patient group submission | =1 if ≥1 patient group(s) made a submission | 0.929 (0.097, 8.912)
        Publication date | Years since first NICE appraisal was published | 1.061 (0.947, 1.188)
        Severity | Mean DALY weight for conditions in this disease category | 0.435 (0.031, 6.022)
        * p<0.05
        • Every £1000 increase in ICER reduces odds of yes by 6.2%
        • No other variables are statistically significant
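The 6.2% figure follows directly from the ICER odds ratio in the table. As a short worked calculation, assuming the effect is constant on the log-odds scale (as in a standard logistic model; the symbol \beta_{\text{ICER}} is notation introduced here), the same odds ratio also shows what a £10,000 difference would imply:

```latex
\[
\mathrm{OR}_{\text{ICER}} = e^{\beta_{\text{ICER}}} = 0.938
\quad\Rightarrow\quad
(1 - 0.938) \times 100\% = 6.2\%\ \text{lower odds of `yes' per \pounds 1{,}000}
\]
\[
\mathrm{OR}\ \text{for a \pounds 10{,}000 increase} = 0.938^{10} \approx 0.527
\quad\Rightarrow\quad
\text{odds of `yes' roughly 47\% lower}
\]
```

These are changes in odds, not in probability; the change in probability depends on where on the curve a decision starts.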
    36. 36. Results of the basic model (2) At average levels for all covariates, a decision would have a50% chance of rejection if its ICER were £45,118/QALY
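The 50% point can be read off a logistic model algebraically. The notation below is introduced here for illustration and is not taken from the slides: \beta_0 is the intercept, \beta_{\text{ICER}} the ICER coefficient, and \beta_k, \bar{x}_k the coefficient and sample mean of each other covariate.

```latex
\[
\ln\frac{P(\text{yes})}{1 - P(\text{yes})}
  = \beta_0 + \beta_{\text{ICER}}\,\text{ICER} + \sum_k \beta_k \bar{x}_k
\]
\[
P(\text{yes}) = 0.5
\;\Longleftrightarrow\;
\beta_0 + \beta_{\text{ICER}}\,\text{ICER}^{*} + \sum_k \beta_k \bar{x}_k = 0
\;\Longleftrightarrow\;
\text{ICER}^{*} = -\,\frac{\beta_0 + \sum_k \beta_k \bar{x}_k}{\beta_{\text{ICER}}}
\]
```

The slides do not report the intercept or the covariate means, so the £45,118 figure cannot be reproduced from the odds ratios alone.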
    37. 37. Results of the basic model (3)
        • Based on a 50% cut-off, 81.63% of decisions are correctly classified
            • However, at this cut-off, sensitivity (94% of ‘yes’ decisions correctly predicted) is higher than specificity (56% of ‘no’ decisions correct)
        Predicted recommendation | Actual: No | Actual: Yes | Total
        No | 79 (56%) | 17 (6%) | 96
        Yes | 61 (44%) | 271 (94%) | 332
        Total | 140 | 288 | 428
    38. 38. Impact of removing variables from basic model
        • Removing any variable except severity worsens prediction accuracy compared with basic model (81.63%)
        Variable removed | % correctly classified after removing variable
        Total Patients in RCTs | 81.53%
        Only treatment | 81.25%
        Children | 81.61%
        Patient group submission | 81.63%
        Publication date | 81.62%
        Severity | 81.84%
        All variables except ICER | 81.44%
    39. 39. Effect of adding variables to basic model
        Variable | Definition | % correct | Odds ratio
        Pharmaceutical | = 1 for drugs | 81.62% | 0.903 (p=0.81)
        PSA | = 1 if the model has PSA | 81.78% | 0.672 (p=0.45)
        Broader perspective | = 1 if non-NHS/PSS costs were analysed or discussed | 81.57% | 0.666 (p=0.46)
        STA | = 1 if the STA process was used | 81.74% | 0.659 (p=0.23)
        Orphan | = 1 if Tx has EMEA orphan status | 81.94% | 1.415 (p=0.55)
        Range of ICERs | Difference between the lowest and highest NE quadrant ICER for this decision in £’000s | 82.56% | 0.987 (p=0.15)
        Cancer | Dummy =1 if decision is for this disease | 82.29% | 2.025 (p=0.10)
        Cardiovascular | Dummy =1 if decision is for this disease | 81.70% | 0.869 (p=0.75)
        Central nervous system | Dummy =1 if decision is for this disease | 81.41% | 0.433 (p=0.30)
        Endocrine | Dummy =1 if decision is for this disease | 81.58% | 0.648 (p=0.45)
        Infectious | Dummy =1 if decision is for this disease | 81.69% | 1.122 (p=0.89)
        Mental health | Dummy =1 if decision is for this disease | 81.50% | 0.411 (p=0.49)
        Musculoskeletal | Dummy =1 if decision is for this disease | 81.99% | 3.317 (p=0.02)
        Respiratory | Dummy =1 if decision is for this disease | 82.55% | 0.855 (p=0.002)
    40. 40. Model including all variables increasing prediction accuracy
        • Omitted severity and added STA, PSA, orphan, ICER range, cancer, cardiovascular disease, infectious disease, musculoskeletal & respiratory to basic model as these improved model fit
        • Correctly classified 84.20% of NICE decisions
            • Specificity was higher than basic model, but sensitivity was lower
        Predicted recommendation | Actual: No | Actual: Yes | Total
        No | 25 (66%) | 6 (8%) | 32
        Yes | 13 (34%) | 80 (92%) | 93
        Total | 39 | 86 | 125
    41. 41. Comparison of model predictions
        • Allowing for other factors has little impact on curve or threshold
        • ICER only: £45,449
        • Basic model: £45,118
        • Basic model minus severity, plus STA, PSA, orphan, ICER range, cancer, CVD, infection, musculoskeletal & respiratory: £41,808
    42. 42. Comparing thresholds across diseases
        • NICE decisions and thresholds appear to vary substantially across diseases
    43. 43. End of life
        • The impact of end of life was evaluated for the 133 decisions with draft guidance since Jan 2009
        • Adding end-of-life variable to basic model improves prediction accuracy for these decisions from 84.23% to 85.12%
            • More than any other variable explored except ICER range and respiratory
        • Odds of NICE saying ‘yes’ are 3.37 (95% CI: 0.64, 17.86) times higher if meets end-of-life criteria (p=0.153)
    44. 44. End of life (2)
        • Threshold appears to be higher (>£50,000) post-2009
        • End-of-life treatments have higher ICER
            • Not end of life: £53,534
            • End of life: £67,646
    45. 45. DISCUSSION
    46. 46. Conclusions (1)
        • ICER is by far the strongest predictor of NICE decisions
            • Excluding those decisions based on clinical grounds or lack of evidence
        • ICER alone explains 82% of NICE decisions
        • Other variables significantly affecting NICE decisions include
            • Whether for respiratory disorder (less chance of ‘yes’)
            • Whether for musculoskeletal disorder (more chance of ‘yes’)
        • Variables improving predictions, but not statistically significant
            • End of life (matches EoL guidance)
            • PSA; orphan; uncertainty
            • Cardiovascular disease, cancer, infection
            • Committee; innovation
    47. 47. Conclusions (2)
        • Odds of a ‘yes’ decrease by ~6% for every £1000 increase in ICER
        • 50% of decisions with ICERs >£20,000/QALY are ‘no’
        • Specificity and sensitivity are equalised at £30,000/QALY threshold
        • Our ‘best’ model suggests that the average decision with an ICER of £42,000 has a 50% chance of being rejected
    48. 48. Next steps
        • Conduct sensitivity and subgroup analyses to explore how results vary with alternative specification of models and variables
            • Further exploration of variations over time, e.g. subgrouping appraisals by time periods
            • Cross-validation to be conducted on the best model
        • Step function model, assuming that NICE rejects all treatments above a certain threshold ICER?
            • Additional variables increase/decrease the threshold, not the log-odds
        • Multi-part models of decision-making
    49. 49. Multi-part models of decision-making
        • Current analyses exclude decisions not based on cost-effectiveness grounds
        • Some of the variables (e.g. clinical evidence or only treatment) may predict these decisions
        • Decisions to reject/recommend on other grounds may occur before ICER evidence is considered
        • Could explore these earlier steps in 2- or 3-part models
        [Flowchart boxes: Rejected on clinical grounds; Recommended on clinical grounds; Consider cost-effectiveness; Rejected based on lack of clinical evidence]
    50. 50. Discussion points
        • Which threshold is correct?
            • ICER at which there is a 50% chance of rejection?
            • ICER that maximises specificity or sensitivity?
            • ICER above which there is a 50% chance of rejection?
        • Are the multi-part models realistic? Which is best?
        • Is there any way that we could better measure innovation, severity and/or uncertainty?
        • Is the % of decisions correctly classified the best way to select models?
            • Can we pool AIC across datasets?
        • Is it reasonable to select models on prediction accuracy without a validation sample?
        • How should we present the data?
    51. 51. Acknowledgments
        We would like to thank:
        • A consortium of 12 companies that provided a research grant to facilitate the initial data collection and modelling
        • HTAinSite and (in particular) Carmel Guarnieri and Zoe Philips for providing the data used in this analysis
        • Members of HERC, ScHARR and HESG for their comments on our earlier work
    52. 52. To enquire about additional information and analyses, please contact Nancy Devlin (ndevlin@ohe.org) or Helen Dakin (helen.dakin@dph.ox.ac.uk).
        To keep up with the latest news and research, subscribe to our blog, OHE News. Follow us on Twitter @OHENews, LinkedIn and SlideShare.
        Office of Health Economics (OHE), Southside, 7th Floor, 105 Victoria Street, London SW1E 6QT, United Kingdom. +44 20 7747 8850. www.ohe.org
        OHE’s publications may be downloaded free of charge for registered users of its website. ©2013 OHE