Training Large Language Models to Predict Clinical Events
May 12, 2026
Authors: Benjamin Turtel, Paul Wilczewski, Kris Skotheim
cs.AI
Abstract
Longitudinal clinical notes contain rich evidence of how patients evolve over time, but converting this signal into training supervision for clinical prediction remains challenging. We extend Foresight Learning to clinical prediction by converting time-ordered MIMIC-III notes into examples consisting of past patient context, a natural-language question about a possible future event, and a label resolved from later documentation. This process yields 6,900 prediction examples from 702 admissions across medications, procedures, organ support, microbiology, and mortality. A small LoRA adapter trained on these examples improves over the prompted base model, reducing expected calibration error from 0.1269 to 0.0398 and Brier score from 0.199 to 0.145, while slightly outperforming GPT-5 point estimates on held-out questions. The approach enables reusable clinical prediction supervision from longitudinal notes without hand-engineered structured features or endpoint-specific classifiers.
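For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how Brier score and expected calibration error (ECE) are commonly computed for binary event forecasts. It is not the authors' evaluation code: the function names are hypothetical, and the choice of 10 equal-width probability bins for ECE is an assumption, since the abstract does not state the binning scheme.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Binned ECE: weighted average over bins of |mean forecast - observed event rate|.

    Assumption: equal-width bins over [0, 1]; the paper may bin differently.
    """
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and of equal length")

    # Group (probability, label) pairs by the bin the probability falls into.
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        event_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - event_rate)
    return ece
```

Both metrics are lower-is-better. Brier score rewards forecasts that are both calibrated and discriminative, while ECE isolates calibration alone, which is why a model can improve substantially on one and only modestly on the other.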