Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline

Abstract

Interactive Medical Image Segmentation (IMIS) has long been constrained by the limited availability of large-scale, diverse, and densely annotated datasets, which hinders model generalization and consistent evaluation across different models. In this paper, we introduce the IMed-361M benchmark dataset, a significant advancement in general IMIS research. First, we collect and standardize over 6.4 million medical images and their corresponding ground truth masks from multiple data sources. Then, leveraging the strong object recognition capabilities of a vision foundation model, we automatically generate dense interactive masks for each image and ensure their quality through rigorous quality control and granularity management. Unlike previous datasets, which are limited to specific modalities or constrained by sparse annotations, IMed-361M spans 14 modalities and 204 segmentation targets, totaling 361 million masks, an average of 56 masks per image. Finally, we develop an IMIS baseline network on this dataset that supports high-quality mask generation through interactive inputs, including clicks, bounding boxes, text prompts, and their combinations. We evaluate its performance on medical image segmentation tasks from multiple perspectives and compare it with existing interactive segmentation models, demonstrating superior accuracy and scalability. To facilitate research on foundation models in medical computer vision, we release IMed-361M and the model at https://github.com/uni-medical/IMIS-Bench.
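The abstract names clicks and bounding boxes as interactive inputs but does not say how such prompts are obtained. As a rough illustration only, the sketch below derives a box prompt and a single click prompt from a binary ground truth mask, a common way to simulate user interaction when evaluating interactive segmentation models. The function names `box_prompt` and `click_prompt`, the list-of-rows mask format, and the centroid-based click rule are all assumptions made for this example and are not taken from the paper or the IMIS-Bench repository.

```python
from typing import List, Tuple

# Hypothetical mask format for this sketch: rows of 0/1 values.
Mask = List[List[int]]


def foreground_points(mask: Mask) -> List[Tuple[int, int]]:
    """Collect the (x, y) coordinates of all foreground pixels."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask has no foreground pixels")
    return points


def box_prompt(mask: Mask) -> Tuple[int, int, int, int]:
    """Return the tight bounding box (x_min, y_min, x_max, y_max) of the foreground."""
    points = foreground_points(mask)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def click_prompt(mask: Mask) -> Tuple[int, int]:
    """Return the foreground pixel closest to the foreground centroid as (x, y).

    Snapping to a foreground pixel guarantees that the simulated click lands
    on the target even when the centroid itself falls outside the mask.
    """
    points = foreground_points(mask)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return min(points, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


# Toy 4x5 mask with a 2x3 foreground rectangle.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

box = box_prompt(mask)
click = click_prompt(mask)
print("box prompt:", box)
print("click prompt:", click)
```

Real pipelines often add random jitter to the box and sample follow-up clicks from the error region between prediction and ground truth; those refinements are left out here to keep the sketch minimal.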
