Think Before You Act: Decision Transformers with Internal Working Memory

May 24, 2023
Authors: Jikun Kang, Romain Laroche, Xindi Yuan, Adam Trischler, Xue Liu, Jie Fu
cs.AI

Abstract

Large language model (LLM)-based decision-making agents have shown the ability to generalize across multiple tasks. However, their performance relies on massive amounts of data and compute. We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in its parameters throughout training. As a result, training on a new task may deteriorate the model's performance on previous tasks. In contrast to LLMs' implicit memory mechanism, the human brain uses distributed memory storage, which helps it manage and organize multiple skills efficiently and mitigates forgetting. Inspired by this, we propose an internal working memory module that stores, blends, and retrieves information for different downstream tasks. Evaluation results show that the proposed method improves training efficiency and generalization on both Atari games and Meta-World object manipulation tasks. Moreover, we demonstrate that memory fine-tuning further enhances the adaptability of the proposed architecture.
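The abstract describes a working-memory module that stores, blends, and retrieves task information alongside a Decision Transformer backbone, but this page does not include the paper's implementation details. The following is only a minimal sketch of what such a module could look like, assuming attention-based reads over a set of learnable memory slots and a gated write; the class name `WorkingMemory`, the slot count, and the `read`/`write` methods are illustrative assumptions, not the authors' architecture.

```python
# Illustrative sketch (assumed, not from the paper): a slot-based working
# memory with an attention read ("retrieve"/"blend") and a gated write
# ("store"), intended to sit alongside a Decision-Transformer-style backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F


class WorkingMemory(nn.Module):
    def __init__(self, num_slots: int = 16, dim: int = 128):
        super().__init__()
        # Learnable initial memory slots shared across tasks.
        self.init_memory = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        self.query_proj = nn.Linear(dim, dim)
        self.write_gate = nn.Linear(2 * dim, dim)

    def initial_state(self, batch_size: int) -> torch.Tensor:
        # (batch, num_slots, dim)
        return self.init_memory.unsqueeze(0).expand(batch_size, -1, -1)

    def read(self, memory: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # Retrieve: attend over slots with the sequence model's hidden state h
        # and blend them into a single context vector.
        q = self.query_proj(h).unsqueeze(1)                       # (batch, 1, dim)
        attn = F.softmax(q @ memory.transpose(1, 2)
                         / memory.size(-1) ** 0.5, dim=-1)        # (batch, 1, slots)
        return (attn @ memory).squeeze(1)                         # (batch, dim)

    def write(self, memory: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # Store: a per-slot gate decides how much of h overwrites each slot.
        h_exp = h.unsqueeze(1).expand_as(memory)                  # (batch, slots, dim)
        gate = torch.sigmoid(self.write_gate(torch.cat([memory, h_exp], dim=-1)))
        return (1.0 - gate) * memory + gate * h_exp


if __name__ == "__main__":
    wm = WorkingMemory(num_slots=8, dim=32)
    memory = wm.initial_state(batch_size=4)
    h = torch.randn(4, 32)               # hidden state from the sequence model
    context = wm.read(memory, h)         # used when predicting the next action
    memory = wm.write(memory, h)         # updated memory after the step
    print(context.shape, memory.shape)   # torch.Size([4, 32]) torch.Size([4, 8, 32])
```

The gated write here is just one plausible update rule; the paper's actual store/blend/retrieve mechanism and how it interacts with memory fine-tuning would need to be taken from the paper itself.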