Skill Expansion and Composition in Parameter Space

Abstract

Humans excel at reusing prior knowledge to address new challenges and at developing skills while solving problems. This paradigm is becoming increasingly popular in the development of autonomous agents, as it yields systems that can self-evolve in response to new challenges as humans do. However, previous methods suffer from limited training efficiency when adding new skills and fail to fully leverage prior knowledge to facilitate learning of new tasks. In this paper, we propose Parametric Skill Expansion and Composition (PSEC), a new framework designed to iteratively evolve an agent's capabilities and efficiently address new challenges by maintaining a manageable skill library. This library progressively integrates skill primitives as plug-and-play Low-Rank Adaptation (LoRA) modules through parameter-efficient finetuning, facilitating efficient and flexible skill expansion. This structure also enables direct skill composition in parameter space by merging LoRA modules that encode different skills, leveraging information shared across skills to effectively program new skills. Building on this, we propose a context-aware module that dynamically activates different skills to collaboratively handle new tasks. Across diverse applications, including multi-objective composition, dynamics shift, and continual policy shift, results on the D4RL and DSRL benchmarks and the DeepMind Control Suite show that PSEC exhibits a superior capacity to leverage prior knowledge to efficiently tackle new challenges, as well as to expand its skill library and evolve its capabilities. Project website: https://ltlhuuu.github.io/PSEC/.
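To make the idea of composing skills in parameter space concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each skill is stored as a pair of low-rank factors (B, A) on top of a frozen base weight W0, and that a composed weight takes the form W = W0 + sum_i alpha_i * B_i A_i. The names `SkillLibrary`, `add_skill`, `compose`, and `context_weights`, as well as the softmax gating used to produce the alpha values, are hypothetical choices made for illustration only.

```python
import numpy as np


class SkillLibrary:
    """Frozen base weight plus a growing set of low-rank (LoRA-style) skill updates.

    Hypothetical illustration; not the PSEC codebase.
    """

    def __init__(self, base_weight):
        # W0 with shape (d_out, d_in); it is never modified after construction.
        self.base_weight = np.asarray(base_weight, dtype=float)
        self.skills = {}  # skill name -> (B, A)

    def add_skill(self, name, B, A):
        # Expansion: only the low-rank factors are stored; W0 stays untouched.
        d_out, d_in = self.base_weight.shape
        if B.shape[0] != d_out or A.shape[1] != d_in or B.shape[1] != A.shape[0]:
            raise ValueError("LoRA factor shapes do not match the base weight")
        self.skills[name] = (np.asarray(B, dtype=float), np.asarray(A, dtype=float))

    def compose(self, alphas):
        # Composition in parameter space: W = W0 + sum_i alpha_i * (B_i @ A_i).
        W = self.base_weight.copy()
        for name, alpha in alphas.items():
            B, A = self.skills[name]
            W += alpha * (B @ A)
        return W


def context_weights(state, gate, names):
    # Assumed stand-in for a context-aware module: a softmax over linear scores
    # of the current state gives one composition weight per skill.
    scores = gate @ state
    exp_scores = np.exp(scores - scores.max())
    probs = exp_scores / exp_scores.sum()
    return dict(zip(names, probs))


# Usage example with random placeholder values (no trained skills involved).
rng = np.random.default_rng(0)
d_out, d_in, rank = 4, 6, 2
library = SkillLibrary(rng.normal(size=(d_out, d_in)))
library.add_skill("skill_a", rng.normal(size=(d_out, rank)), rng.normal(size=(rank, d_in)))
library.add_skill("skill_b", rng.normal(size=(d_out, rank)), rng.normal(size=(rank, d_in)))

state = rng.normal(size=3)
gate = rng.normal(size=(2, 3))  # hypothetical gating parameters
alphas = context_weights(state, gate, ["skill_a", "skill_b"])
W = library.compose(alphas)
print(W.shape)
```

Because each skill only contributes a rank-limited update, adding a skill in this sketch stores (d_out + d_in) * rank numbers rather than a full d_out * d_in matrix, and the base weight can be shared by every composition.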
