LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios
September 12, 2025
Authors: Jiahao Chen, Zhiyuan Huang, Yurou Liu, Bing Su
cs.AI
Abstract
Long-tailed learning has garnered increasing attention due to its wide
applicability in real-world scenarios. Among existing approaches, Long-Tailed
Semi-Supervised Learning (LTSSL) has emerged as an effective solution by
incorporating a large amount of unlabeled data into the imbalanced labeled
dataset. However, most prior LTSSL methods are designed to train models from
scratch, which often leads to issues such as overconfidence and low-quality
pseudo-labels. To address these challenges, we extend LTSSL into the foundation
model fine-tuning paradigm and propose a novel framework: LoFT (Long-tailed
semi-supervised learning via parameter-efficient Fine-Tuning). We demonstrate
that fine-tuned foundation models can generate more reliable pseudo-labels,
thereby benefiting imbalanced learning. Furthermore, we explore a more
practical setting by investigating semi-supervised learning under open-world
conditions, where the unlabeled data may include out-of-distribution (OOD)
samples. To handle this problem, we propose LoFT-OW (LoFT under Open-World
scenarios) to improve the discriminative ability. Experimental results on
multiple benchmarks demonstrate that our method achieves superior performance
compared to previous approaches, even when using only 1% of the unlabeled
data employed by prior works.
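
To make the core idea concrete, the following is a minimal, self-contained sketch (not the authors' LoFT implementation) of confidence-thresholded pseudo-labeling on top of a frozen backbone with a LoRA-style low-rank adapter. All names, dimensions, and the threshold `tau` are hypothetical choices for illustration; the actual method may differ.

```python
# Minimal sketch, assuming a frozen linear classification head and a
# LoRA-style low-rank trainable update; NOT the official LoFT code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (B A) x."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # backbone weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + F.linear(x, self.B @ self.A)

# Hypothetical dimensions and confidence threshold for illustration only.
feat_dim, num_classes, tau = 512, 10, 0.95
head = LoRALinear(nn.Linear(feat_dim, num_classes), rank=4)

def pseudo_label(unlabeled_feats: torch.Tensor):
    """Keep only unlabeled samples whose predicted confidence exceeds tau."""
    with torch.no_grad():
        probs = head(unlabeled_feats).softmax(dim=-1)
        conf, labels = probs.max(dim=-1)
    mask = conf >= tau                              # drop low-confidence samples
    return unlabeled_feats[mask], labels[mask]

# Toy usage with random features standing in for foundation-model embeddings.
feats = torch.randn(32, feat_dim)
kept_x, kept_y = pseudo_label(feats)
if len(kept_y) > 0:
    loss = F.cross_entropy(head(kept_x), kept_y)    # train only the adapter params
```

In this sketch only the low-rank matrices A and B receive gradients, which is the parameter-efficient aspect; the confidence mask illustrates why stronger pretrained representations can yield more reliable pseudo-labels, and an open-world variant would additionally need to reject OOD samples rather than only low-confidence ones.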