Composable Clinical Information Extraction with Agentic RAG: What Works, What Doesn't, and Why

Abstract

Patient contexts span hundreds of heterogeneous documents and thousands of structured data points, yet the document-level metadata that AI systems need for retrieval and triage is absent or incomplete. Standard retrieval-augmented generation (RAG) fails on this data: it mishandles temporal reasoning, cross-document dependencies, and missing metadata. We deploy ACIE (Agentic Clinical Information Extraction) at University Medicine Essen, an on-premise agentic RAG pipeline that reasons over complete patient contexts and grounds every answer in source passages for clinician verification. We quantify the metadata gap, trace the architectural decisions it shaped, and evaluate extraction alongside an independent retrospective lymphoma registry study, in which nuclear-medicine physicians verify every extracted value against its cited sources. Across 7,326 judgments, clinicians accepted 96.5% of extractions, with per-type acceptance ranging from 80% to 99%.