Skill Expansion and Composition in Parameter Space

February 9, 2025
Authors: Tenglong Liu, Jianxiong Li, Yinan Zheng, Haoyi Niu, Yixing Lan, Xin Xu, Xianyuan Zhan
cs.AI

Abstract

Humans excel at reusing prior knowledge to address new challenges and at developing skills while solving problems. This paradigm is becoming increasingly popular in the development of autonomous agents, as it yields systems that can self-evolve in response to new challenges, as humans do. However, previous methods suffer from limited training efficiency when expanding to new skills and fail to fully leverage prior knowledge to facilitate new task learning. In this paper, we propose Parametric Skill Expansion and Composition (PSEC), a new framework designed to iteratively evolve the agents' capabilities and efficiently address new challenges by maintaining a manageable skill library. This library progressively integrates skill primitives as plug-and-play Low-Rank Adaptation (LoRA) modules through parameter-efficient finetuning, facilitating efficient and flexible skill expansion. This structure also enables direct skill composition in parameter space by merging LoRA modules that encode different skills, leveraging shared information across skills to effectively program new skills. Building on this, we propose a context-aware module that dynamically activates different skills to collaboratively handle new tasks. Across diverse applications, including multi-objective composition, dynamics shift, and continual policy shift, results on the D4RL and DSRL benchmarks and the DeepMind Control Suite show that PSEC exhibits a superior capacity to leverage prior knowledge to efficiently tackle new challenges and to expand its skill library to evolve its capabilities. Project website: https://ltlhuuu.github.io/PSEC/.
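As a rough illustration of what merging LoRA modules in parameter space can look like, the sketch below forms a composed weight matrix W = W0 + sum_i alpha_i * (B_i A_i) from a frozen base matrix and a small library of low-rank updates. It is not the authors' implementation: the class `LoRASkill`, the function `compose`, the matrix sizes, the random initialization (standing in for already-trained skills), and the softmax over fixed scores (standing in for the context-aware module) are all assumptions made for this example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols, rng, std=0.1):
    """Gaussian matrix; stands in for trained weights in this sketch."""
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


class LoRASkill:
    """Hypothetical skill primitive stored as a low-rank update B @ A."""

    def __init__(self, d_out, d_in, rank, rng):
        self.B = random_matrix(d_out, rank, rng)  # shape (d_out, rank)
        self.A = random_matrix(rank, d_in, rng)   # shape (rank, d_in)

    def delta(self):
        """Full-size weight update contributed by this skill."""
        return matmul(self.B, self.A)


def softmax(scores):
    """Turn arbitrary scores into non-negative weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose(w0, skills, weights):
    """Return W0 + sum_i weights[i] * (B_i @ A_i) without modifying W0."""
    w = [row[:] for row in w0]
    for skill, alpha in zip(skills, weights):
        for w_row, d_row in zip(w, skill.delta()):
            for j, d in enumerate(d_row):
                w_row[j] += alpha * d
    return w


rng = random.Random(0)
d_out, d_in, rank = 4, 6, 2

w0 = random_matrix(d_out, d_in, rng)  # frozen base weights
library = [LoRASkill(d_out, d_in, rank, rng) for _ in range(3)]

# Assumed placeholder: fixed scores instead of a learned, context-dependent module.
alphas = softmax([2.0, 0.5, -1.0])
w = compose(w0, library, alphas)
```

Under these assumptions, adding a skill only appends another (B, A) pair to `library`, and the base matrix `w0` is never modified, which is the plug-and-play property the abstract describes.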