SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis
March 5, 2026
Authors: Shahriar Noroozizadeh, Xiaobin Shen, Jeremy C. Weiss, George H. Chen
cs.AI
Abstract
Estimating heterogeneous treatment effects (HTEs) from right-censored survival data is critical in high-stakes applications such as precision medicine and individualized policy-making. Yet, the survival analysis setting poses unique challenges for HTE estimation due to censoring, unobserved counterfactuals, and complex identification assumptions. Despite recent advances, from Causal Survival Forests to survival meta-learners and outcome imputation approaches, evaluation practices remain fragmented and inconsistent. We introduce SurvHTE-Bench, the first comprehensive benchmark for HTE estimation with censored outcomes. The benchmark spans (i) a modular suite of synthetic datasets with known ground truth, systematically varying causal assumptions and survival dynamics, (ii) semi-synthetic datasets that pair real-world covariates with simulated treatments and outcomes, and (iii) real-world datasets from a twin study (with known ground truth) and from an HIV clinical trial. Across synthetic, semi-synthetic, and real-world settings, we provide the first rigorous comparison of survival HTE methods under diverse conditions and realistic assumption violations. SurvHTE-Bench establishes a foundation for fair, reproducible, and extensible evaluation of causal survival methods. The data and code of our benchmark are available at: https://github.com/Shahriarnz14/SurvHTE-Bench .
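To make the idea of a synthetic dataset "with known ground truth" concrete, the following is a minimal sketch of a right-censored data generator in which the true conditional treatment effect is available for every unit. It is not the benchmark's actual generator: the function name `simulate_censored_hte`, the exponential event-time model, the coefficients, the randomized treatment assignment, and the independent exponential censoring are all assumptions chosen for illustration.

```python
import math
import random


def simulate_censored_hte(n=1000, censor_mean=3.0, seed=0):
    """Toy right-censored dataset with a known conditional treatment effect.

    Hypothetical illustration only; the model and all constants are assumed.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Two standard-normal covariates.
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Randomized treatment with probability 0.5 (no confounding).
        w = 1 if rng.random() < 0.5 else 0
        # Mean event time under control and under treatment.
        mean0 = math.exp(0.5 * x1)
        mean1 = math.exp(0.5 * x1 + 0.3 + 0.4 * x2)
        # Both potential event times are drawn; only one is ever observed.
        t0 = rng.expovariate(1.0 / mean0)
        t1 = rng.expovariate(1.0 / mean1)
        event_time = t1 if w == 1 else t0
        # Censoring time drawn independently of covariates and treatment.
        censor_time = rng.expovariate(1.0 / censor_mean)
        rows.append({
            "x": (x1, x2),
            "w": w,
            "time": min(event_time, censor_time),       # observed follow-up
            "event": int(event_time <= censor_time),    # 1 = event, 0 = censored
            # Ground truth: E[T(1) - T(0) | X] for exponential event times.
            "true_cate": mean1 - mean0,
        })
    return rows


data = simulate_censored_hte(n=500, seed=1)
censored_fraction = 1 - sum(r["event"] for r in data) / len(data)
```

An estimator only ever sees `x`, `w`, `time`, and `event`; the `true_cate` column is held out and used to score the estimates. Varying `censor_mean`, or making treatment or censoring depend on the covariates, is one simple way to mimic the kind of assumption violations the abstract describes.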