Configurable Clinical Information Extraction with Agentic RAG: What Works, What Fails, and Why

Abstract

Patient contexts span hundreds of heterogeneous documents and thousands of structured data points, yet the document-level metadata that AI systems need for retrieval and triage is absent or incomplete. Standard retrieval-augmented generation fails on this data, mishandling temporal reasoning, cross-document dependencies, and missing metadata. We deploy ACIE (Agentic Clinical Information Extraction) at University Medicine Essen: an on-premise agentic RAG pipeline that reasons over complete patient contexts and grounds every answer in source passages for clinician verification. We quantify the metadata gap, trace the architectural decisions it shaped, and evaluate extraction alongside an independent retrospective lymphoma registry study, in which nuclear-medicine physicians verify every extracted value against its cited sources. Across 7,326 judgments, clinicians accepted 96.5% of extractions, with per-type acceptance ranging from 80% to 99%.