Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation
March 27, 2025
Authors: Reza Qorbani, Gianluca Villani, Theodoros Panagiotakopoulos, Marc Botet Colomer, Linus Härenstam-Nielsen, Mattia Segu, Pier Luigi Dovesi, Jussi Karlgren, Daniel Cremers, Federico Tombari, Matteo Poggi
cs.AI
Abstract
Open-vocabulary semantic segmentation models associate vision and text to
label pixels from an undefined set of classes using textual queries, providing
versatile performance on novel datasets. However, large shifts between training
and test domains degrade their performance, requiring fine-tuning for effective
real-world applications. We introduce Semantic Library Adaptation (SemLA), a
novel framework for training-free, test-time domain adaptation. SemLA leverages
a library of LoRA-based adapters indexed with CLIP embeddings, dynamically
merging the most relevant adapters based on proximity to the target domain in
the embedding space. This approach constructs an ad-hoc model tailored to each
specific input without additional training. Our method scales efficiently,
enhances explainability by tracking adapter contributions, and inherently
protects data privacy, making it ideal for sensitive applications.
Comprehensive experiments on a 20-domain benchmark built over 10 standard
datasets demonstrate SemLA's superior adaptability and performance across
diverse settings, establishing a new standard in domain adaptation for
open-vocabulary semantic segmentation.
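The retrieval-and-fusion idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the cosine-similarity retrieval, the softmax fusion weights, and all shapes are assumptions made for the example; the paper's exact similarity measure and merge rule may differ.

```python
import numpy as np

def retrieve_and_fuse(target_emb, library_embs, library_loras, k=3, temp=0.1):
    """Hypothetical SemLA-style step: pick the k adapters whose index
    embeddings are closest to the target-domain embedding, then merge
    their LoRA weight deltas with softmax weights over the similarities."""
    # cosine similarity between the target embedding and every index entry
    sims = library_embs @ target_emb / (
        np.linalg.norm(library_embs, axis=1) * np.linalg.norm(target_emb) + 1e-8
    )
    top = np.argsort(sims)[-k:]          # indices of the k nearest adapters
    w = np.exp(sims[top] / temp)
    w /= w.sum()                         # fusion weights, sum to 1
    # weighted average of the selected LoRA deltas (delta_W = B @ A per adapter)
    fused = sum(wi * (library_loras[i]["B"] @ library_loras[i]["A"])
                for wi, i in zip(w, top))
    # returning the per-adapter weights also gives the "adapter contribution"
    # tracking that the abstract mentions for explainability
    return fused, dict(zip(top.tolist(), w.tolist()))

# toy library: 5 adapters, each a rank-2 LoRA factorization of a 4x4 weight
rng = np.random.default_rng(0)
embs = rng.normal(size=(5, 8))
loras = [{"A": rng.normal(size=(2, 4)), "B": rng.normal(size=(4, 2))}
         for _ in range(5)]
delta_w, contributions = retrieve_and_fuse(rng.normal(size=8), embs, loras)
print(delta_w.shape)  # (4, 4)
```

Because the fused delta is built at test time from stored adapters only, no source-domain images are ever needed, which is where the training-free and privacy-preserving properties come from.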