
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation

March 27, 2025
Authors: Reza Qorbani, Gianluca Villani, Theodoros Panagiotakopoulos, Marc Botet Colomer, Linus Härenstam-Nielsen, Mattia Segu, Pier Luigi Dovesi, Jussi Karlgren, Daniel Cremers, Federico Tombari, Matteo Poggi
cs.AI

Abstract

Open-vocabulary semantic segmentation models associate vision and text to label pixels from an undefined set of classes using textual queries, providing versatile performance on novel datasets. However, large shifts between training and test domains degrade their performance, requiring fine-tuning for effective real-world applications. We introduce Semantic Library Adaptation (SemLA), a novel framework for training-free, test-time domain adaptation. SemLA leverages a library of LoRA-based adapters indexed with CLIP embeddings, dynamically merging the most relevant adapters based on proximity to the target domain in the embedding space. This approach constructs an ad-hoc model tailored to each specific input without additional training. Our method scales efficiently, enhances explainability by tracking adapter contributions, and inherently protects data privacy, making it ideal for sensitive applications. Comprehensive experiments on a 20-domain benchmark built over 10 standard datasets demonstrate SemLA's superior adaptability and performance across diverse settings, establishing a new standard in domain adaptation for open-vocabulary semantic segmentation.
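The core mechanism described above — retrieving the most relevant LoRA adapters by CLIP-embedding proximity and merging them into an ad-hoc model — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `retrieve_and_fuse`, the softmax merging coefficients, the temperature, and the single-matrix adapter deltas are all assumptions for clarity.

```python
import numpy as np

def retrieve_and_fuse(target_emb, index_embs, adapter_deltas, k=3, temperature=0.1):
    """Hypothetical sketch of SemLA-style retrieval and fusion.

    target_emb:     (d,) CLIP embedding of the target-domain input
    index_embs:     (n, d) CLIP embeddings indexing the adapter library
    adapter_deltas: list of n LoRA weight deltas (one matrix each, for simplicity)
    Returns the fused weight delta and the per-adapter contributions.
    """
    # Cosine similarity between the target and every library entry.
    t = target_emb / np.linalg.norm(target_emb)
    idx = index_embs / np.linalg.norm(index_embs, axis=1, keepdims=True)
    sims = idx @ t

    # Keep only the k most relevant adapters.
    top = np.argsort(sims)[-k:]

    # Softmax over similarities -> merge coefficients; tracking these
    # per-adapter contributions is what gives the explainability.
    logits = sims[top] / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Fuse: weighted sum of the selected LoRA deltas, no extra training.
    fused = sum(w * adapter_deltas[i] for w, i in zip(weights, top))
    return fused, dict(zip(top.tolist(), weights.tolist()))
```

No training step appears anywhere: the fused delta is assembled at test time per input, which matches the training-free, test-time adaptation claim in the abstract.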
