Instruction-Guided Lesion Segmentation for Chest X-rays with Automatically Generated Large-Scale Dataset

November 19, 2025
Authors: Geon Choi, Hangyul Yoon, Hyunju Shin, Hyunki Park, Sang Hoon Seo, Eunho Yang, Edward Choi
cs.AI

Abstract

The applicability of current lesion segmentation models for chest X-rays (CXRs) has been limited both by the small number of target labels and by the reliance on long, detailed expert-level text inputs, creating a barrier to practical use. To address these limitations, we introduce a new paradigm: instruction-guided lesion segmentation (ILS), which is designed to segment diverse lesion types based on simple, user-friendly instructions. Under this paradigm, we construct MIMIC-ILS, the first large-scale instruction-answer dataset for CXR lesion segmentation, using our fully automated multimodal pipeline that generates annotations from chest X-ray images and their corresponding reports. MIMIC-ILS contains 1.1M instruction-answer pairs derived from 192K images and 91K unique segmentation masks, covering seven major lesion types. To empirically demonstrate its utility, we introduce ROSALIA, a vision-language model fine-tuned on MIMIC-ILS. ROSALIA can segment diverse lesions and provide textual explanations in response to user instructions. The model achieves high segmentation and textual accuracy in our newly proposed task, highlighting the effectiveness of our pipeline and the value of MIMIC-ILS as a foundational resource for pixel-level CXR lesion grounding.
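To make the instruction-answer setting more concrete, here is a minimal Python sketch of what one record and one segmentation score might look like. Everything in it is an assumption for illustration: the `ILSExample` fields, the sample instruction, and the identifiers are hypothetical and are not the MIMIC-ILS schema. The abstract also does not say which segmentation metric is reported; the Dice coefficient is used here only because it is a common overlap measure for binary masks.

```python
from dataclasses import dataclass
from typing import List

Mask = List[List[int]]  # binary mask: 1 = lesion pixel, 0 = background


@dataclass
class ILSExample:
    """Hypothetical instruction-answer record (not the actual MIMIC-ILS schema)."""
    image_id: str      # placeholder identifier for the chest X-ray
    instruction: str   # short user instruction
    answer_text: str   # textual explanation returned with the mask
    mask: Mask         # reference segmentation mask


def dice_score(pred: Mask, target: Mask) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two same-sized binary masks."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")
    intersection = sum(
        p & t
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Two empty masks count as perfect agreement (no lesion, none predicted).
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical record and prediction, for illustration only.
example = ILSExample(
    image_id="hypothetical-0001",
    instruction="Segment the lesion in the left lung.",
    answer_text="placeholder explanation",
    mask=[[0, 1, 1], [0, 1, 0], [0, 0, 0]],
)
predicted = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
score = dice_score(predicted, example.mask)
```

In this toy case the two masks share 2 lesion pixels and contain 3 each, so the score is 2·2 / (3 + 3) ≈ 0.67. A real evaluation would operate on full-resolution masks and would follow whatever metric and protocol the paper itself defines.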