Think Before You Act: Decision Transformers with Internal Working Memory

Abstract

Large language model (LLM)-based decision-making agents have shown the ability to generalize across multiple tasks. However, their performance relies on massive amounts of data and compute. We argue that this inefficiency stems from the forgetting phenomenon: a model stores its learned behaviors implicitly in its parameters throughout training, so training on a new task may degrade its performance on previous tasks. In contrast to the implicit memory mechanism of LLMs, the human brain uses distributed memory storage, which helps it manage and organize multiple skills efficiently and mitigates forgetting. Inspired by this, we propose an internal working memory module that stores, blends, and retrieves information for different downstream tasks. Evaluation results show that the proposed method improves training efficiency and generalization on both Atari games and Meta-World object manipulation tasks. Moreover, we demonstrate that memory fine-tuning further enhances the adaptability of the proposed architecture.
