Think Before You Act: Decision Transformers with Internal Working Memory

Abstract

Large language model (LLM)-based decision-making agents have shown the ability to generalize across multiple tasks. However, their performance relies on massive data and compute. We argue that this inefficiency stems from the forgetting phenomenon, in which a model stores its behaviors implicitly in its parameters during training, so that training on a new task may degrade its performance on previous tasks. In contrast to this implicit memory mechanism, the human brain uses distributed memory storage, which helps it manage and organize multiple skills efficiently and mitigates forgetting. Inspired by this, we propose an internal working memory module that stores, blends, and retrieves information for different downstream tasks. Evaluation results show that the proposed method improves training efficiency and generalization on both Atari games and Meta-World object manipulation tasks. Moreover, we demonstrate that fine-tuning the memory further enhances the adaptability of the proposed architecture.
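The abstract does not specify how the memory module is implemented. As a rough illustration of what storing, blending, and retrieving could look like, the sketch below uses a small slot memory with content-based (softmax) addressing. The class name `WorkingMemory`, the `blend_rate` parameter, and the update rule are all hypothetical choices made for this example, not the paper's actual design.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class WorkingMemory:
    """Hypothetical slot memory (an assumption, not the paper's module)."""

    def __init__(self, slots: List[Vector], blend_rate: float = 0.5) -> None:
        self.slots = [list(s) for s in slots]  # copy so callers keep their data
        self.blend_rate = blend_rate           # assumed scalar write strength

    def address(self, query: Vector) -> List[float]:
        """Content-based addressing: softmax over query-slot dot products."""
        return softmax([dot(query, slot) for slot in self.slots])

    def retrieve(self, query: Vector) -> Vector:
        """Read an address-weighted mixture of all slots."""
        weights = self.address(query)
        dim = len(self.slots[0])
        return [
            sum(w * slot[i] for w, slot in zip(weights, self.slots))
            for i in range(dim)
        ]

    def store(self, item: Vector) -> None:
        """Blend `item` into each slot in proportion to its address weight."""
        weights = self.address(item)
        for w, slot in zip(weights, self.slots):
            gate = self.blend_rate * w
            for i, x in enumerate(item):
                slot[i] = (1.0 - gate) * slot[i] + gate * x


memory = WorkingMemory(slots=[[1.0, 0.0], [0.0, 1.0]])
memory.store([4.0, 0.0])               # mostly updates the first slot
readout = memory.retrieve([4.0, 0.0])  # read back with the same query
```

In this toy setup, a write mostly changes the slot that best matches the incoming item and leaves dissimilar slots nearly untouched, which is one simple way to picture how separating memory from the main parameters could limit interference between tasks.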
