LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios
September 12, 2025
Authors: Jiahao Chen, Zhiyuan Huang, Yurou Liu, Bing Su
cs.AI
Abstract
Long-tailed learning has garnered increasing attention due to its wide applicability in real-world scenarios. Among existing approaches, Long-Tailed Semi-Supervised Learning (LTSSL) has emerged as an effective solution by incorporating a large amount of unlabeled data into the imbalanced labeled dataset. However, most prior LTSSL methods are designed to train models from scratch, which often leads to issues such as overconfidence and low-quality pseudo-labels. To address these challenges, we extend LTSSL to the foundation-model fine-tuning paradigm and propose a novel framework: LoFT (Long-tailed semi-supervised learning via parameter-efficient Fine-Tuning). We demonstrate that fine-tuned foundation models can generate more reliable pseudo-labels, thereby benefiting imbalanced learning. Furthermore, we explore a more practical setting by investigating semi-supervised learning under open-world conditions, where the unlabeled data may include out-of-distribution (OOD) samples. To handle this problem, we propose LoFT-OW (LoFT under Open-World scenarios) to improve discriminative ability. Experimental results on multiple benchmarks demonstrate that our method achieves superior performance compared to previous approaches, even when utilizing only 1% of the unlabeled data used by prior works.
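The abstract does not detail the training procedure, but the ingredients it names (parameter-efficient fine-tuning of a frozen foundation model, confidence-based pseudo-labeling, and filtering of out-of-distribution unlabeled samples in the open-world setting) can be illustrated with a generic sketch. The Python snippet below is not the LoFT or LoFT-OW algorithm itself: the LoRA-style adapter, the FixMatch-style confidence threshold, and the entropy-based OOD filter are standard stand-ins, and every module name, dimension, and threshold value here is an illustrative assumption.

```python
# Minimal sketch (not the authors' LoFT/LoFT-OW implementation): a frozen
# backbone with a trainable LoRA-style low-rank adapter, FixMatch-style
# confidence-thresholded pseudo-labeling, and a simple entropy-based filter
# for out-of-distribution unlabeled samples. All names, dimensions, and
# threshold values are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # pretrained weights stay frozen
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + (alpha / rank) * B A x
        return self.base(x) + self.scale * F.linear(F.linear(x, self.lora_a), self.lora_b)


def select_pseudo_labels(logits, conf_threshold=0.95, max_entropy=1.5):
    """Keep unlabeled samples that look in-distribution (low predictive
    entropy) and are confidently predicted; return a boolean mask and the
    corresponding hard pseudo-labels."""
    probs = logits.softmax(dim=-1)
    conf, pseudo = probs.max(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    keep = (entropy <= max_entropy) & (conf >= conf_threshold)
    return keep, pseudo


# Toy usage on random features standing in for foundation-model embeddings.
torch.manual_seed(0)
num_classes, feat_dim = 10, 512
encoder = LoRALinear(nn.Linear(feat_dim, feat_dim))   # "foundation model" stand-in
head = nn.Linear(feat_dim, num_classes)               # task-specific classifier

params = [p for p in list(encoder.parameters()) + list(head.parameters()) if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=1e-2)

unlabeled = torch.randn(64, feat_dim)                 # a batch of unlabeled features
logits = head(F.relu(encoder(unlabeled)))
keep, pseudo = select_pseudo_labels(logits)

if keep.any():                                        # train only on retained pseudo-labels
    loss = F.cross_entropy(logits[keep], pseudo[keep])
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In a full pipeline the retained pseudo-labels would typically be combined with the supervised loss on the imbalanced labeled set; the sketch above only shows the unlabeled branch.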