MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning
February 26, 2025
Authors: Jiazhen Pan, Che Liu, Junde Wu, Fenglin Liu, Jiayuan Zhu, Hongwei Bran Li, Chen Chen, Cheng Ouyang, Daniel Rueckert
cs.AI
Abstract
Reasoning is a critical frontier for advancing medical image analysis, where
transparency and trustworthiness play a central role in both clinician trust
and regulatory approval. Although Medical Visual Language Models (VLMs) show
promise for radiological tasks, most existing VLMs merely produce final answers
without revealing the underlying reasoning. To address this gap, we introduce
MedVLM-R1, a medical VLM that explicitly generates natural language reasoning
to enhance transparency and trustworthiness. Instead of relying on supervised
fine-tuning (SFT), which often suffers from overfitting to training
distributions and fails to foster genuine reasoning, MedVLM-R1 employs a
reinforcement learning framework that incentivizes the model to discover
human-interpretable reasoning paths without using any reasoning references.
Despite limited training data (600 visual question answering samples) and model
parameters (2B), MedVLM-R1 boosts accuracy from 55.11% to 78.22% across MRI,
CT, and X-ray benchmarks, outperforming larger models trained on over a million
samples. It also demonstrates robust domain generalization on
out-of-distribution tasks. By unifying medical image analysis with explicit
reasoning, MedVLM-R1 marks a pivotal step toward trustworthy and interpretable
AI in clinical practice.