CCD: Mitigating Hallucinations in Radiology MLLMs via Clinical Contrastive Decoding

Abstract

Multimodal large language models (MLLMs) have recently achieved remarkable progress in radiology by integrating visual perception with natural language understanding. However, they often generate clinically unsupported descriptions, known as medical hallucinations, which pose serious risks in medical applications that demand accuracy and image-grounded outputs. Through empirical analysis, we find that prompt-induced hallucinations remain prevalent in radiology MLLMs, largely due to over-sensitivity to clinical sections. To address this, we introduce Clinical Contrastive Decoding (CCD), a training-free and retrieval-free inference framework that integrates structured clinical signals from task-specific radiology expert models. CCD introduces a dual-stage contrastive mechanism to refine token-level logits during generation, thereby enhancing clinical fidelity without modifying the base MLLM. Experiments on three datasets and multiple models demonstrate that CCD consistently improves overall performance on radiology report generation (RRG). On the MIMIC-CXR dataset, it yields up to a 17% improvement in RadGraph-F1 when applied to state-of-the-art RRG models. Our approach provides a lightweight and generalisable solution for mitigating medical hallucinations, effectively bridging expert models and MLLMs in radiology.
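The abstract does not spell out the dual-stage mechanism, so the following is only a minimal sketch of a generic single-step contrastive logit adjustment, not the CCD algorithm itself. It assumes two forward passes per decoding step: a "plain" pass of the base MLLM and a "guided" pass that is additionally conditioned on an expert-derived signal. The function names, the weight `alpha`, the toy vocabulary, and all logit values are hypothetical and chosen for illustration.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_scores(guided: List[float], plain: List[float], alpha: float) -> List[float]:
    """Generic contrastive adjustment (an assumption, not the paper's formula).

    Amplifies whatever the guided pass adds relative to the plain pass:
        score = (1 + alpha) * log p_guided - alpha * log p_plain
    With alpha = 0 the guided log-probabilities are returned unchanged.
    """
    lg, lp = log_softmax(guided), log_softmax(plain)
    return [(1.0 + alpha) * g - alpha * p for g, p in zip(lg, lp)]


def pick_token(vocab: List[str], scores: List[float]) -> str:
    """Greedy choice: return the token with the highest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return vocab[best]


# Toy decoding step with hypothetical values.
vocab = ["effusion", "normal", "pneumothorax"]
plain_logits = [2.0, 1.0, 0.5]   # base MLLM alone leans towards "effusion"
guided_logits = [2.0, 1.8, 0.2]  # expert-conditioned pass raises "normal"

print(pick_token(vocab, contrastive_scores(guided_logits, plain_logits, alpha=0.0)))
print(pick_token(vocab, contrastive_scores(guided_logits, plain_logits, alpha=1.0)))
```

In this toy step, greedy decoding on the guided pass alone (`alpha=0.0`) still selects "effusion", while the contrastive adjustment with `alpha=1.0` selects "normal", because the adjustment rewards the token whose probability rose most under the expert signal. The sketch leaves the base model untouched and only rescores logits at inference time, which matches the training-free framing above; how CCD actually derives and applies its expert signals is not specified here.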
