
MedSAM3: Delving into Segment Anything with Medical Concepts

November 24, 2025
Authors: Anglin Liu, Rundong Xue, Xu R. Cao, Yifan Shen, Yi Lu, Xiang Li, Qianqian Chen, Jintai Chen
cs.AI

Abstract

Medical image segmentation is fundamental to biomedical discovery, yet existing methods lack generalizability and demand extensive, time-consuming manual annotation for each new clinical application. Here, we propose MedSAM-3, a text-promptable model for medical image and video segmentation. By fine-tuning the Segment Anything Model (SAM) 3 architecture on medical images paired with semantic concept labels, MedSAM-3 enables medical Promptable Concept Segmentation (PCS): anatomical structures can be targeted precisely via open-vocabulary text descriptions rather than geometric prompts alone. We further introduce the MedSAM-3 Agent, a framework that integrates Multimodal Large Language Models (MLLMs) to perform complex reasoning and iterative refinement in an agent-in-the-loop workflow. Comprehensive experiments across diverse medical imaging modalities, including X-ray, MRI, ultrasound, CT, and video, demonstrate that our approach significantly outperforms existing specialist and foundation models. We will release our code and model at https://github.com/Joey-S-Liu/MedSAM3.
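The sketch below illustrates the two ideas the abstract describes: (1) Promptable Concept Segmentation, where the model is queried with an open-vocabulary text prompt instead of points or boxes, and (2) the agent-in-the-loop workflow, where an MLLM critic iteratively refines the prompt. The MedSAM-3 code has not been released yet, so every class, method, and parameter name here is an illustrative assumption, not the authors' API.

```python
# Hypothetical sketch only: MedSAM-3's code is unreleased, so the stubs
# below stand in for the fine-tuned SAM 3 model and the MLLM critic.
import numpy as np


class MedSAM3Stub:
    """Stand-in for a SAM 3 checkpoint fine-tuned on medical concept labels."""

    def segment(self, image: np.ndarray, text_prompt: str) -> np.ndarray:
        # A real model would ground `text_prompt` (e.g. "left kidney") in
        # the image and return a binary mask; we return an empty one.
        return np.zeros(image.shape[:2], dtype=bool)


class MLLMCriticStub:
    """Stand-in for an MLLM that inspects a candidate mask and suggests a refinement."""

    def critique(self, image: np.ndarray, mask: np.ndarray,
                 prompt: str) -> tuple[float, str]:
        # A real critic would overlay the mask on the image, score it, and
        # propose a refined prompt; here we accept immediately.
        return 1.0, prompt


def agent_in_the_loop(image: np.ndarray, concept: str,
                      model: MedSAM3Stub, critic: MLLMCriticStub,
                      max_rounds: int = 3, accept_score: float = 0.9) -> np.ndarray:
    """Re-prompt the segmenter until the MLLM critic accepts the mask."""
    prompt = concept
    mask = model.segment(image, prompt)  # PCS: text prompt, no geometry
    for _ in range(max_rounds):
        score, refined_prompt = critic.critique(image, mask, prompt)
        if score >= accept_score:
            break
        prompt = refined_prompt
        mask = model.segment(image, prompt)
    return mask


ct_slice = np.zeros((512, 512), dtype=np.float32)  # dummy CT slice
mask = agent_in_the_loop(ct_slice, "left kidney", MedSAM3Stub(), MLLMCriticStub())
print(mask.shape, mask.dtype)  # (512, 512) bool
```

The loop structure (segment, critique, re-prompt) is our reading of "iterative refinement in an agent-in-the-loop workflow"; the real system may refine masks or prompts differently.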