SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Abstract

Multi-agent reinforcement learning (MARL) has demonstrated significant progress in solving cooperative and competitive multi-agent problems in various environments. One of the principal challenges in MARL is the need for explicit prediction of the agents' behavior to achieve cooperation. To resolve this issue, we propose the Shared Recurrent Memory Transformer (SRMT), which extends memory transformers to multi-agent settings by pooling and globally broadcasting individual working memories, enabling agents to exchange information implicitly and coordinate their actions. We evaluate SRMT on the Partially Observable Multi-Agent Pathfinding problem in a toy Bottleneck navigation task that requires agents to pass through a narrow corridor and on the POGEMA benchmark set of tasks. In the Bottleneck task, SRMT consistently outperforms a variety of reinforcement learning baselines, especially under sparse rewards, and generalizes effectively to longer corridors than those seen during training. On POGEMA maps, including Mazes, Random, and MovingAI, SRMT is competitive with recent MARL, hybrid, and planning-based algorithms. These results suggest that incorporating shared recurrent memory into transformer-based architectures can enhance coordination in decentralized multi-agent systems. The source code for training and evaluation is available on GitHub: https://github.com/Aloriosa/srmt.
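The pooling-and-broadcasting idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each agent holds a single memory vector and reads the pooled memories through scaled dot-product attention with identity projections. The function name read_shared_memory and the toy vectors are hypothetical, and a trained model would use learned projections and a recurrent memory update.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def read_shared_memory(memories):
    """Pool all agents' memory vectors, then let each agent attend over the pool.

    memories: list of per-agent memory vectors (lists of floats, equal length).
    Returns one mixed memory vector per agent.
    Assumption: identity query/key/value projections, for illustration only.
    """
    dim = len(memories[0])
    pool = [list(m) for m in memories]  # the pooled memories every agent can see
    updated = []
    for query in memories:
        # Attention weights of this agent's memory over every pooled memory.
        weights = softmax([dot(query, key) / math.sqrt(dim) for key in pool])
        # Weighted average of the pooled memories (a convex combination).
        mixed = [sum(w * value[i] for w, value in zip(weights, pool))
                 for i in range(dim)]
        updated.append(mixed)
    return updated


# Three hypothetical agents with two-dimensional working memories.
agent_memories = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for agent_id, vector in enumerate(read_shared_memory(agent_memories)):
    print(agent_id, [round(x, 3) for x in vector])
```

Because every agent reads from the same pool, information written by one agent becomes available to all others without an explicit message-passing protocol, which is the sense in which the exchange is implicit.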
