Estimating long-run benefits of immuno-oncology treatments: how early data cuts may exaggerate benefits
The typical CEA model of a new cancer drug extrapolates survival curves from clinical trial data. How much of the modeled long-term benefit depends on which data cutoff is used?
The typical cost-effectiveness analysis (CEA) model in health technology assessments of new cancer drugs is based on extrapolation of survival curves (progression-free survival, PFS, and overall survival, OS). The active literature in this area often focuses on which statistical assumptions to make for the extrapolation (choosing between different parametric distributions, cure models, and so on).
To illustrate the problem, assume that we have trial data with 5 years of follow-up. How do you extrapolate survival beyond that point? Different assumptions may imply substantial differences in the model outcomes and, thus, in the modeled cost-effectiveness of the treatment being assessed.
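To make this concrete, here is a minimal sketch in Python (with purely illustrative numbers, not data from the study) of two parametric curves that agree with the observed data at the 5-year follow-up but diverge sharply once extrapolated:

```python
import numpy as np
from scipy import stats

# Two parametric survival curves calibrated to give the same ~40% survival
# at the 5-year follow-up, but with different tail behaviour.
# (Illustrative numbers only, not data from the study.)
t = np.array([5.0, 10.0, 15.0, 20.0])   # years

# Exponential: S(t) = exp(-lambda * t), anchored so that S(5) = 0.40
lam = -np.log(0.40) / 5.0
S_exp = np.exp(-lam * t)

# Lognormal: S(t) = 1 - Phi((ln t - mu) / sigma), also anchored so that S(5) = 0.40
sigma = 1.5
mu = np.log(5.0) - sigma * stats.norm.ppf(0.60)
S_lnorm = stats.norm.sf((np.log(t) - mu) / sigma)

for ti, se, sl in zip(t, S_exp, S_lnorm):
    print(f"t = {ti:>4.0f} years   exponential = {se:5.1%}   lognormal = {sl:5.1%}")
```

Both curves are equally consistent with the observed 5-year data point, yet at 15 years the lognormal predicts roughly two and a half times the survival of the exponential (about 16% versus 6%), a difference that the trial data alone cannot settle.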
In a new study forthcoming in Value in Health (“Slipping Away: Slippage in hazard ratios over datacuts and its impact on immuno-oncology combination economic evaluations” by Dawn Lee and co-authors), the focus is on how using data from various time points impacts the extrapolation of long-term benefits. The authors give a number of clinical arguments for why the hazard ratio may become less impressive at later datacuts; for example, patients may develop resistance to the treatment. If the hazard ratio differs substantially over the trial follow-up (at 12 months, 18 months, 24 months, etc.), the long-term predictions can be sensitive to which datacut is used to inform the CEA model.
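A simple way to see the mechanism is to simulate a trial in which the treatment effect wanes after roughly a year and then re-estimate the hazard ratio at successive datacuts. The sketch below uses Python with the lifelines package and synthetic data; it is not the authors' analysis, only an illustration of the "slippage" pattern:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 500  # patients per arm (synthetic illustration, not study data)

# Control arm: constant hazard of 0.05 events per month.
t_control = rng.exponential(scale=1 / 0.05, size=n)

# Treatment arm: lower hazard (0.02) for the first 12 months, after which
# resistance develops and the hazard reverts to the control level (0.05).
t_early = rng.exponential(scale=1 / 0.02, size=n)
t_late = 12 + rng.exponential(scale=1 / 0.05, size=n)
t_treat = np.where(t_early < 12, t_early, t_late)

df = pd.DataFrame({
    "time": np.concatenate([t_control, t_treat]),
    "arm": np.repeat([0, 1], n),
})

# Re-estimate the Cox hazard ratio at successively later datacuts.
for cut in (12, 18, 24, 36, 60):
    d = df.copy()
    d["event"] = (d["time"] <= cut).astype(int)   # administrative censoring at the datacut
    d["time"] = d["time"].clip(upper=cut)
    hr = CoxPHFitter().fit(d, duration_col="time", event_col="event").hazard_ratios_["arm"]
    print(f"datacut {cut:>2} months: HR = {hr:.2f}")
```

The early datacuts capture mostly the period before resistance develops and show a very favourable hazard ratio; later cuts "slip" towards 1 as events accrue after the treatment effect has waned.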
The case study
The paper used renal cell carcinoma (RCC) as a case study, comparing a combination of an immune checkpoint inhibitor and a tyrosine kinase inhibitor (TKI) with a TKI alone. The primary outcome of interest was how the predicted survival outcomes (and the resulting difference in life-years gained between the two treatment alternatives) depended on which datacut was used for the extrapolation. Based on a review of published trial results, Kaplan-Meier curves for OS were digitized for the various data cutoffs reported in the literature. Extrapolation was conducted using exponential, Weibull, log-logistic, lognormal, Gompertz, gamma, and generalized gamma distributions.
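For readers who want to experiment with this kind of workflow, a stripped-down version might look like the sketch below (Python with the lifelines package; the data are synthetic stand-ins for patient-level data reconstructed from a digitized Kaplan-Meier curve, and lifelines does not cover every distribution used in the paper, e.g. Gompertz):

```python
import numpy as np
from lifelines import ExponentialFitter, WeibullFitter, LogLogisticFitter, LogNormalFitter

# Pseudo patient-level data, as would be reconstructed from a digitized
# Kaplan-Meier curve. Synthetic here, censored at a 36-month datacut.
rng = np.random.default_rng(7)
durations = rng.weibull(1.3, size=300) * 30          # event times in months
observed = durations < 36                            # events occurring within follow-up
durations = np.minimum(durations, 36)                # administrative censoring at 36 months

fitters = {
    "exponential": ExponentialFitter(),
    "Weibull": WeibullFitter(),
    "log-logistic": LogLogisticFitter(),
    "lognormal": LogNormalFitter(),
}

for name, f in fitters.items():
    f.fit(durations, event_observed=observed)
    s10y = f.survival_function_at_times([120]).iloc[0]   # extrapolate to 10 years
    print(f"{name:<13} AIC = {f.AIC_:7.1f}   S(120 months) = {s10y:.1%}")
```

Within-sample fit statistics are often very close across distributions, while the extrapolated 10-year survival can differ severalfold, which is exactly why the choice of curve matters so much for the modeled life-years.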
Study results
The authors found that the hazard ratios identified in the literature were generally the most impressive (lowest) at the earlier data cutoffs. The results also indicated that the prediction errors in the extrapolations were larger when earlier data cutoffs were used, and that some survival curve assumptions tended to consistently predict too high survival (lognormal), whereas others led to overly conservative predictions (e.g., Gompertz). Quite worryingly, the authors also showed that statistical goodness-of-fit measures calculated on the earlier data did not provide reliable guidance on how well each curve would predict survival observed with longer follow-up. This latter result may put a question mark over the standard practice of relying on goodness-of-fit measures to choose parametric assumptions when only early, immature data are available.
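The kind of check behind this finding can be sketched as follows: fit the candidate curves to an early datacut, rank them by AIC, and then compare each curve's prediction at a later landmark with the Kaplan-Meier estimate from a later datacut. Again, this is Python with lifelines and synthetic data, purely to illustrate the logic rather than to reproduce the paper's analysis:

```python
import numpy as np
from lifelines import (KaplanMeierFitter, ExponentialFitter, WeibullFitter,
                       LogLogisticFitter, LogNormalFitter)

# Synthetic population with a waning hazard (a mix of faster- and slower-progressing patients).
rng = np.random.default_rng(3)
true_times = np.concatenate([rng.exponential(10, 400), rng.exponential(40, 200)])

def censor(times, cut):
    return np.minimum(times, cut), (times <= cut)

t18, e18 = censor(true_times, 18)   # early datacut used for fitting
t60, e60 = censor(true_times, 60)   # later datacut used as the reference
obs_48 = KaplanMeierFitter().fit(t60, e60).predict(48)

for name, f in [("exponential", ExponentialFitter()), ("Weibull", WeibullFitter()),
                ("log-logistic", LogLogisticFitter()), ("lognormal", LogNormalFitter())]:
    f.fit(t18, event_observed=e18)
    pred_48 = f.survival_function_at_times([48]).iloc[0]
    print(f"{name:<13} AIC(18m) = {f.AIC_:7.1f}   S(48m): predicted = {pred_48:.1%}, observed = {obs_48:.1%}")
```

The curve with the best AIC on the 18-month cut need not be the one that comes closest to the survival actually observed at 48 months, which is the pattern the authors report.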
Implications for CEA modeling and coverage decisions?
The results on which parametric curves provided the most accurate fits should, of course, be generalized with caution. As the authors point out, the choice of appropriate parametric curves is likely affected by the disease context, the mechanism of action, and so on. In the concluding discussion, it is noted that the results may be most generalizable to settings where it is plausible that patients develop resistance to treatment.
The study showed how difficult it is, when forced to rely on “immature” OS and PFS data and with uncertain subsequent treatment patterns, to design CEA models that accurately predict the long-term benefits of cancer treatments (and the incremental gains in life-years, QALYs, etc.). For coverage decisions that use cost-effectiveness evidence as input, one implication is that the “reasonable price” will depend on the chosen parametric curves, which yet again reinforces the need for extensive robustness checks of how different parametric assumptions affect the modeling results (and the justified prices); a simple version of such a check is sketched below. If there is substantial uncertainty around long-term effectiveness and the data are “very immature”, it may make sense to primarily look at parametric curves that give conservative survival extrapolations (at least in contexts where there is clinical plausibility for the development of treatment resistance, for example).
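As an illustration of what such a robustness check could look like in code (again Python with lifelines, hypothetical two-arm data, and a simple undiscounted life-year calculation rather than a full CEA model):

```python
import numpy as np
from lifelines import ExponentialFitter, WeibullFitter, LogLogisticFitter, LogNormalFitter

# Hypothetical reconstructed patient-level data for the two arms, censored at
# a 36-month datacut, with a 20-year model horizon. Illustrative only.
rng = np.random.default_rng(11)
raw = {"TKI alone": rng.exponential(18, 400), "IO + TKI": rng.exponential(30, 400)}
times = {arm: np.minimum(t, 36) for arm, t in raw.items()}
events = {arm: (t < 36) for arm, t in raw.items()}

grid = np.arange(0, 240.5, 0.5)   # months over the 20-year horizon

for name, Fitter in [("exponential", ExponentialFitter), ("Weibull", WeibullFitter),
                     ("log-logistic", LogLogisticFitter), ("lognormal", LogNormalFitter)]:
    ly = {}
    for arm in times:
        f = Fitter().fit(times[arm], event_observed=events[arm])
        S = f.survival_function_at_times(grid).to_numpy()
        ly[arm] = S.sum() * 0.5 / 12   # area under the survival curve, in life-years
    print(f"{name:<13} incremental life-years = {ly['IO + TKI'] - ly['TKI alone']:.2f}")
```

The spread of incremental life-years across the distributions gives a sense of how much the modeled benefit (and any value-based price derived from it) hinges on the choice of curve, and reporting that spread is the point of the robustness check.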