SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation

August 8, 2025
Authors: Guido Manni, Clemente Lauretti, Loredana Zollo, Paolo Soda
cs.AI

Abstract

Deep learning has revolutionized medical imaging, but its effectiveness is severely limited by insufficient labeled training data. This paper introduces a novel GAN-based semi-supervised learning framework specifically designed for low labeled-data regimes, evaluated across settings with 5 to 50 labeled samples per class. Our approach integrates three specialized neural networks -- a generator for class-conditioned image translation, a discriminator for authenticity assessment and classification, and a dedicated classifier -- within a three-phase training framework. The method alternates between supervised training on limited labeled data and unsupervised learning that leverages abundant unlabeled images through image-to-image translation rather than generation from noise. We employ ensemble-based pseudo-labeling that combines confidence-weighted predictions from the discriminator and classifier with temporal consistency through exponential moving averaging, enabling reliable label estimation for unlabeled data. Comprehensive evaluation across eleven MedMNIST datasets demonstrates that our approach achieves statistically significant improvements over six state-of-the-art GAN-based semi-supervised methods, with particularly strong performance in the extreme 5-shot setting where the scarcity of labeled data is most challenging. The framework maintains its superiority across all evaluated settings (5, 10, 20, and 50 shots per class). Our approach offers a practical solution for medical imaging applications where annotation costs are prohibitive, enabling robust classification performance even with minimal labeled data. Code is available at https://github.com/GuidoManni/SPARSE.
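
The ensemble pseudo-labeling step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the SPARSE repository: the function names, the choice of each model's maximum probability as its confidence weight, the smoothing factor `alpha`, and the acceptance `threshold` are all assumptions made for illustration, since the abstract does not give the exact formulation.

```python
# Hypothetical sketch of ensemble pseudo-labeling with EMA smoothing.
# All names and constants are illustrative assumptions, not the authors' code.

def ensemble_probs(p_disc, p_clf):
    """Confidence-weighted average of two class-probability vectors.

    Assumption: each model's weight is its own maximum class probability.
    """
    w_d, w_c = max(p_disc), max(p_clf)
    return [(w_d * d + w_c * c) / (w_d + w_c) for d, c in zip(p_disc, p_clf)]


def ema_update(prev, current, alpha=0.9):
    """Exponential moving average of predictions across training steps.

    `prev` is None the first time a sample is seen.
    """
    if prev is None:
        return list(current)
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(prev, current)]


def pseudo_label(probs, threshold=0.7):
    """Return (label, confidence); label is None below the threshold."""
    conf = max(probs)
    label = probs.index(conf)
    return (label if conf >= threshold else None, conf)


if __name__ == "__main__":
    # Two steps of predictions for one unlabeled image (made-up numbers).
    steps = [
        ([0.9, 0.1], [0.6, 0.4]),  # (discriminator probs, classifier probs)
        ([0.8, 0.2], [0.7, 0.3]),
    ]
    smoothed = None
    for p_disc, p_clf in steps:
        smoothed = ema_update(smoothed, ensemble_probs(p_disc, p_clf))
        print(pseudo_label(smoothed))
```

In this sketch the moving average damps step-to-step fluctuations, so a sample receives a pseudo-label only when the combined prediction stays confident over time. The actual weighting and thresholding used by the framework should be checked against the released code.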