

Disentangling Sampling from Training Budget in Class-Imbalanced CT Body Composition Segmentation

May 19, 2026
Authors: Iason Skylitsis, Dimitrios Karkalousos, Ivana Išgum
cs.AI

Abstract

Class imbalance is a fundamental challenge in medical image segmentation, where frequent classes typically dominate training at the expense of rare classes. Loss-based approaches mitigate imbalance by reweighting the per-pixel loss within the batch, while sampling strategies control which images enter the batch. Yet neither explicitly controls which classes appear within the batch, leaving rare-class exposure only partially rebalanced. In this work, we adopt episodic sampling from few-shot learning to promote class-balanced batch construction in a fully supervised setting. We decouple episodic sampling from its conventional metric-learning context and evaluate it on body composition segmentation in CT. We compare episodic sampling against random and weighted sampling on nine muscle and adipose tissues, derived from 210 scans of the public SAROS dataset. Training is performed under full- and low-data regimes, with additional comparisons under matched training iteration budgets. Under full-data training, all three strategies performed comparably (mean Dice 0.882 for episodic, 0.878 for random and weighted). Under low-data training, episodic sampling outperformed random and weighted sampling (0.787 vs. 0.758 and 0.762), a gap driven by a 12-fold difference in training iterations. Under matched training budgets, random and weighted sampling overfit earlier, while episodic sampling kept improving for approximately three times as many iterations before plateauing. Our findings identify the training iteration budget as an under-recognized confound when comparing sampling strategies, motivating iteration-aware evaluation protocols for small datasets. Furthermore, the residual advantage of episodic sampling is consistent with an implicit regularization effect of class-balanced batches, offering a low-cost, model-agnostic strategy for class-imbalanced medical image segmentation. Code is available at https://github.com/iasonsky/episodic-sampling.
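
To make the idea of class-first batch construction concrete, here is a minimal sketch in Python. It is not the authors' implementation (the linked repository is the reference for that), and every name in it (`build_class_index`, `episodic_batches`, `n_way`, `k_shot`, `n_episodes`, the toy scan ids) is a hypothetical choice made for illustration. It assumes each image is annotated with the set of classes it contains, and that an episode first picks classes and only then picks images showing them.

```python
import random
from collections import defaultdict


def build_class_index(image_classes):
    """Map each class id to the list of image ids that contain that class."""
    index = defaultdict(list)
    for image_id, classes in image_classes.items():
        for c in classes:
            index[c].append(image_id)
    return dict(index)


def episodic_batches(image_classes, n_way, k_shot, n_episodes, seed=0):
    """Yield batches built class-first (hypothetical helper, for illustration).

    Each batch holds n_way distinct classes with k_shot (image_id, class) pairs
    per class, so rare classes appear as often as frequent ones.
    """
    rng = random.Random(seed)
    index = build_class_index(image_classes)
    classes = sorted(index)
    for _ in range(n_episodes):
        # Step 1: choose which classes this batch will cover.
        episode_classes = rng.sample(classes, n_way)
        batch = []
        for c in episode_classes:
            # Step 2: choose images containing that class. Sampling with
            # replacement lets a class with fewer than k_shot images fill its slots.
            picks = rng.choices(index[c], k=k_shot)
            batch.extend((image_id, c) for image_id in picks)
        yield batch


# Toy annotation table: class 1 is frequent, class 3 appears in a single scan.
toy = {
    "scan_a": {1, 2},
    "scan_b": {1},
    "scan_c": {1, 2},
    "scan_d": {1, 3},
}
batches = list(episodic_batches(toy, n_way=2, k_shot=2, n_episodes=3))
for batch in batches:
    print(batch)
```

Note that the number of batches here is set by `n_episodes` and not by the dataset size. A sampler built this way can therefore run for many more iterations than an epoch-based random or weighted sampler on the same small dataset, which is one plausible route to the iteration-budget confound the abstract describes.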