POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation

Abstract

Multi-agent reinforcement learning (MARL) has recently excelled at solving challenging cooperative and competitive multi-agent problems in a variety of environments, mostly with few agents and full observability. Moreover, a range of crucial robotics-related tasks, such as multi-robot navigation and obstacle avoidance, that have conventionally been approached with classical non-learnable methods (e.g., heuristic search) are now suggested to be solved with learning-based or hybrid methods. Still, in this domain it is hard, if not impossible, to conduct a fair comparison between classical, learning-based, and hybrid approaches, because there is no unified framework that supports both learning and evaluation. To this end, we introduce POGEMA, a comprehensive set of tools that includes a fast environment for learning, a generator of problem instances, a collection of predefined instances, a visualization toolkit, and a benchmarking tool that allows automated evaluation. We introduce and specify an evaluation protocol that defines a range of domain-related metrics computed on the basis of the primary evaluation indicators (such as success rate and path length), allowing a fair multi-fold comparison. We present the results of such a comparison, which involves a variety of state-of-the-art MARL, search-based, and hybrid methods.
