MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning
July 23, 2024
Authors: Florian Felten, Umut Ucak, Hicham Azmani, Gao Peng, Willem Röpke, Hendrik Baier, Patrick Mannion, Diederik M. Roijers, Jordan K. Terry, El-Ghazali Talbi, Grégoire Danoy, Ann Nowé, Roxana Rădulescu
cs.AI
Abstract
Many challenging tasks such as managing traffic systems, electricity grids,
or supply chains involve complex decision-making processes that must balance
multiple conflicting objectives and coordinate the actions of various
independent decision-makers (DMs). One perspective for formalising and
addressing such tasks is multi-objective multi-agent reinforcement learning
(MOMARL). MOMARL broadens reinforcement learning (RL) to problems with multiple
agents each needing to consider multiple objectives in their learning process.
In reinforcement learning research, benchmarks are crucial in facilitating
progress, evaluation, and reproducibility. The significance of benchmarks is
underscored by the existence of numerous benchmark frameworks developed for
various RL paradigms, including single-agent RL (e.g., Gymnasium), multi-agent
RL (e.g., PettingZoo), and single-agent multi-objective RL (e.g.,
MO-Gymnasium). To support the advancement of the MOMARL field, we introduce
MOMAland, the first collection of standardised environments for multi-objective
multi-agent reinforcement learning. MOMAland addresses the need for
comprehensive benchmarking in this emerging field, offering over 10 diverse
environments that vary in the number of agents, state representations, reward
structures, and utility considerations. To provide strong baselines for future
research, MOMAland also includes algorithms capable of learning policies in
such settings.
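
To make the setting concrete, here is a minimal, self-contained sketch of what a per-agent vector reward and a utility function over it might look like. It does not use MOMAland's API. The agent names, reward values, weights, and the `linear_utility` helper are all hypothetical and chosen only for illustration, and linear scalarisation is just one of many possible utility functions.

```python
# Illustrative only: this is NOT MOMAland's API.
# All names and numbers below are hypothetical.
from typing import Dict, Sequence


def linear_utility(reward: Sequence[float], weights: Sequence[float]) -> float:
    """Collapse a vector reward into a scalar using a weighted sum."""
    if len(reward) != len(weights):
        raise ValueError("reward and weights must have the same length")
    return sum(r * w for r, w in zip(reward, weights))


# One step of a hypothetical two-agent, two-objective task:
# every agent receives a reward vector with one entry per objective.
step_rewards: Dict[str, Sequence[float]] = {
    "agent_0": (1.0, -2.0),
    "agent_1": (0.5, -0.5),
}

# Each agent may weigh the objectives differently.
agent_weights: Dict[str, Sequence[float]] = {
    "agent_0": (0.5, 0.5),
    "agent_1": (1.0, 0.0),
}

# Apply each agent's own utility function to its own reward vector.
scalarised = {
    agent: linear_utility(step_rewards[agent], agent_weights[agent])
    for agent in step_rewards
}
print(scalarised)
```

In this sketch the same step yields different scalar utilities for the two agents because their weights differ, which is the kind of variation in "utility considerations" the abstract refers to.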