

OpenUS: A Fully Open-Source Foundation Model for Ultrasound Image Analysis via Self-Adaptive Masked Contrastive Learning

November 14, 2025
作者: Xiaoyu Zheng, Xu Chen, Awais Rauf, Qifan Fu, Benedetta Monosi, Felice Rivellese, Myles J. Lewis, Shaogang Gong, Gregory Slabaugh
cs.AI

Abstract

Ultrasound (US) is one of the most widely used medical imaging modalities, thanks to its low cost, portability, real-time feedback, and absence of ionizing radiation. However, US image interpretation remains highly operator-dependent and varies significantly across anatomical regions, acquisition protocols, and device types. These variations, along with unique challenges such as speckle noise, low contrast, and limited standardized annotations, hinder the development of generalizable, label-efficient ultrasound AI models. In this paper, we propose OpenUS, the first reproducible, open-source ultrasound foundation model built on a large collection of public data. OpenUS employs a vision Mamba backbone, capturing both local features and global long-range dependencies across the image. To extract rich features during pre-training, we introduce a novel self-adaptive masking framework that combines contrastive learning with masked image modeling. This strategy integrates the teacher's attention map with the student's reconstruction loss, adaptively refining clinically relevant masking to enhance pre-training effectiveness. OpenUS also applies a dynamic learning schedule to progressively adjust the difficulty of the pre-training process. To develop the foundation model, we compile the largest public ultrasound dataset to date, comprising over 308K images from 42 publicly available datasets and covering diverse anatomical regions, institutions, imaging devices, and disease types. Our pre-trained OpenUS model can be easily adapted to specific downstream tasks by serving as a backbone for label-efficient fine-tuning. Code is available at https://github.com/XZheng0427/OpenUS.
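The two pre-training ideas sketched in the abstract, attention-and-loss-guided masking and a difficulty schedule, can be illustrated roughly as follows. This is an illustrative reconstruction, not the authors' released code: the blending weight `alpha`, the linear mask-ratio ramp, and all function names are assumptions.

```python
import numpy as np

def mask_ratio_schedule(step, total_steps, start=0.3, end=0.7):
    """Hypothetical linear ramp: raise the mask ratio over training so the
    pre-training task gets progressively harder (illustrative values)."""
    t = step / max(total_steps, 1)
    return start + t * (end - start)

def self_adaptive_mask(teacher_attn, recon_loss, mask_ratio=0.6, alpha=0.5):
    """Score each image patch by blending the teacher's attention with the
    student's per-patch reconstruction loss, then mask the top-scoring
    patches. `alpha` is an assumed blending hyperparameter."""
    # Normalize both signals to [0, 1] so they are comparable.
    attn = (teacher_attn - teacher_attn.min()) / (np.ptp(teacher_attn) + 1e-8)
    loss = (recon_loss - recon_loss.min()) / (np.ptp(recon_loss) + 1e-8)
    score = alpha * attn + (1.0 - alpha) * loss
    n_mask = int(mask_ratio * score.size)
    mask = np.zeros(score.size, dtype=bool)
    # Hide the patches the teacher attends to most / the student finds hardest.
    mask[np.argsort(score)[::-1][:n_mask]] = True
    return mask

# Example: a 14x14 ViT-style patch grid (196 patches) halfway through training.
rng = np.random.default_rng(0)
mask = self_adaptive_mask(rng.random(196), rng.random(196),
                          mask_ratio=mask_ratio_schedule(500, 1000))
```

The returned boolean mask would then drive the masked-image-modeling branch, while the contrastive branch compares teacher and student views of the same image.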