SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation

August 8, 2025
Authors: Guido Manni, Clemente Lauretti, Loredana Zollo, Paolo Soda
cs.AI

Abstract

Deep learning has revolutionized medical imaging, but its effectiveness is severely limited by insufficient labeled training data. This paper introduces a novel GAN-based semi-supervised learning framework specifically designed for low labeled-data regimes, evaluated across settings with 5 to 50 labeled samples per class. Our approach integrates three specialized neural networks -- a generator for class-conditioned image translation, a discriminator for authenticity assessment and classification, and a dedicated classifier -- within a three-phase training framework. The method alternates between supervised training on limited labeled data and unsupervised learning that leverages abundant unlabeled images through image-to-image translation rather than generation from noise. We employ ensemble-based pseudo-labeling that combines confidence-weighted predictions from the discriminator and classifier with temporal consistency through exponential moving averaging, enabling reliable label estimation for unlabeled data. Comprehensive evaluation across eleven MedMNIST datasets demonstrates that our approach achieves statistically significant improvements over six state-of-the-art GAN-based semi-supervised methods, with particularly strong performance in the extreme 5-shot setting where the scarcity of labeled data is most challenging. The framework maintains its superiority across all evaluated settings (5, 10, 20, and 50 shots per class). Our approach offers a practical solution for medical imaging applications where annotation costs are prohibitive, enabling robust classification performance even with minimal labeled data. Code is available at https://github.com/GuidoManni/SPARSE.
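
The abstract describes ensemble-based pseudo-labeling only at a high level: confidence-weighted predictions from the discriminator and the classifier are combined, and an exponential moving average keeps the estimates consistent over time. The sketch below is one possible reading of that description, not the authors' implementation (which is in the linked repository). The function names, the use of each model's peak probability as its confidence weight, the decay of 0.9, and the acceptance threshold of 0.8 are all assumptions made for illustration.

```python
from typing import List, Optional

Probs = List[float]  # one probability per class, summing to 1


def confidence_weighted_ensemble(p_disc: Probs, p_clf: Probs) -> Probs:
    """Blend two class distributions, weighting each by its own peak probability.

    Assumption: "confidence" is taken to be the maximum class probability.
    """
    w_disc, w_clf = max(p_disc), max(p_clf)
    total = w_disc + w_clf
    return [(w_disc * d + w_clf * c) / total for d, c in zip(p_disc, p_clf)]


def ema_update(previous: Optional[Probs], current: Probs, decay: float = 0.9) -> Probs:
    """Smooth the per-sample estimate across training steps (hypothetical decay)."""
    if previous is None:  # first time this unlabeled sample is seen
        return list(current)
    return [decay * p + (1.0 - decay) * c for p, c in zip(previous, current)]


def pseudo_label(smoothed: Probs, threshold: float = 0.8) -> Optional[int]:
    """Return the predicted class index, or None if the estimate is not confident."""
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    return best if smoothed[best] >= threshold else None


if __name__ == "__main__":
    history: Optional[Probs] = None
    # Hypothetical discriminator/classifier outputs for one unlabeled image
    # over three training steps.
    steps = [([0.9, 0.1], [0.6, 0.4]),
             ([0.95, 0.05], [0.9, 0.1]),
             ([0.97, 0.03], [0.95, 0.05])]
    for p_disc, p_clf in steps:
        blended = confidence_weighted_ensemble(p_disc, p_clf)
        history = ema_update(history, blended)
        print(history, pseudo_label(history))
```

Under these assumptions, a sample receives a pseudo-label only once its smoothed estimate clears the threshold, so a single confident but unstable prediction does not immediately become a training target.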