
CLARE: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and Expansion

January 14, 2026
Authors: Ralf Römer, Yi Zhang, Angela P. Schoellig
cs.AI

Abstract

To teach robots complex manipulation tasks, it is now a common practice to fine-tune a pre-trained vision-language-action model (VLA) on task-specific data. However, since this recipe updates existing representations, it is unsuitable for long-term operation in the real world, where robots must continually adapt to new tasks and environments while retaining the knowledge they have already acquired. Existing continual learning methods for robotics commonly require storing previous data (exemplars), struggle with long task sequences, or rely on task identifiers for deployment. To address these limitations, we propose CLARE, a general, parameter-efficient framework for exemplar-free continual learning with VLAs. CLARE introduces lightweight modular adapters into selected feedforward layers and autonomously expands the model only where necessary when learning a new task, guided by layer-wise feature similarity. During deployment, an autoencoder-based routing mechanism dynamically activates the most relevant adapters without requiring task labels. Through extensive experiments on the LIBERO benchmark, we show that CLARE achieves high performance on new tasks without catastrophic forgetting of earlier tasks, significantly outperforming even exemplar-based methods. Code and data are available at https://tum-lsy.github.io/clare.
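
The abstract describes the deployment-time routing only at a high level: one autoencoder is kept per learned task, and the adapter whose autoencoder best reconstructs the current features is activated, so no task label is needed. The PyTorch sketch below illustrates one plausible reading of that mechanism; the module names, dimensions, and per-task autoencoder design are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of reconstruction-error-based adapter routing, assuming one
# small autoencoder per previously learned task. All names and sizes here are
# hypothetical; they do not reproduce CLARE's actual architecture.
import torch
import torch.nn as nn

class TaskAutoencoder(nn.Module):
    """Small autoencoder trained on one task's layer features; a low
    reconstruction error on new inputs signals that the input likely
    belongs to that task."""
    def __init__(self, feat_dim: int, bottleneck: int = 32):
        super().__init__()
        self.encoder = nn.Linear(feat_dim, bottleneck)
        self.decoder = nn.Linear(bottleneck, feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(torch.relu(self.encoder(x)))

def route(features: torch.Tensor, autoencoders: list[TaskAutoencoder]) -> int:
    """Return the index of the adapter whose autoencoder reconstructs the
    current features best (no task identifier required at deployment)."""
    errors = []
    with torch.no_grad():
        for ae in autoencoders:
            recon = ae(features)
            errors.append(torch.mean((recon - features) ** 2).item())
    return min(range(len(errors)), key=errors.__getitem__)

# Usage: given features from a selected feedforward layer of the VLA,
# activate the adapter of the task with the lowest reconstruction error.
feat = torch.randn(1, 512)                      # stand-in for layer features
aes = [TaskAutoencoder(512) for _ in range(3)]  # one per learned task
active_adapter = route(feat, aes)
```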