SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis
March 5, 2026
Authors: Shahriar Noroozizadeh, Xiaobin Shen, Jeremy C. Weiss, George H. Chen
cs.AI
Abstract
Estimating heterogeneous treatment effects (HTEs) from right-censored survival data is critical in high-stakes applications such as precision medicine and individualized policy-making. Yet, the survival analysis setting poses unique challenges for HTE estimation due to censoring, unobserved counterfactuals, and complex identification assumptions. Despite recent advances, from Causal Survival Forests to survival meta-learners and outcome imputation approaches, evaluation practices remain fragmented and inconsistent. We introduce SurvHTE-Bench, the first comprehensive benchmark for HTE estimation with censored outcomes. The benchmark spans (i) a modular suite of synthetic datasets with known ground truth, systematically varying causal assumptions and survival dynamics, (ii) semi-synthetic datasets that pair real-world covariates with simulated treatments and outcomes, and (iii) real-world datasets from a twin study (with known ground truth) and from an HIV clinical trial. Across synthetic, semi-synthetic, and real-world settings, we provide the first rigorous comparison of survival HTE methods under diverse conditions and realistic assumption violations. SurvHTE-Bench establishes a foundation for fair, reproducible, and extensible evaluation of causal survival methods. The data and code of our benchmark are available at: https://github.com/Shahriarnz14/SurvHTE-Bench .
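To make the idea of synthetic right-censored data with known ground truth concrete, here is a minimal sketch. It is not the benchmark's own generator: the function name `simulate_censored_hte`, the exponential event-time model, the coefficients, and the censoring rate are all assumptions chosen for illustration. Because both potential event times are exponential, their means equal their scale parameters, so the true conditional effect on mean survival time is available in closed form.

```python
import numpy as np


def simulate_censored_hte(n=1000, seed=0):
    """Toy right-censored survival data with a known treatment effect.

    Hypothetical illustration only; not the SurvHTE-Bench data generator.
    """
    rng = np.random.default_rng(seed)

    x = rng.normal(size=n)            # one baseline covariate
    w = rng.integers(0, 2, size=n)    # randomized binary treatment (0 or 1)

    # Potential event times are exponential; the scale equals the mean.
    scale0 = np.exp(0.5 * x)                  # mean survival under control
    scale1 = scale0 * np.exp(0.3 + 0.4 * x)   # mean survival under treatment
    t0 = rng.exponential(scale0)
    t1 = rng.exponential(scale1)
    t = np.where(w == 1, t1, t0)      # only the factual event time is realized

    # Independent censoring: we observe the earlier of event and censoring.
    c = rng.exponential(2.0, size=n)
    y = np.minimum(t, c)              # observed follow-up time
    delta = (t <= c).astype(int)      # 1 = event observed, 0 = censored

    # Ground truth: E[T(1) - T(0) | X = x], known only because we simulated it.
    true_cate = scale1 - scale0

    return {"x": x, "w": w, "y": y, "delta": delta, "true_cate": true_cate}


if __name__ == "__main__":
    data = simulate_censored_hte(n=1000, seed=0)
    print("fraction of events observed:", data["delta"].mean())
    print("average true effect:", data["true_cate"].mean())
```

An estimator would see only `x`, `w`, `y`, and `delta`; the `true_cate` column is what makes it possible to score estimated effects against the truth, which is not possible with ordinary observational survival data.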