CAMAR: Continuous Actions Multi-Agent Routing

Abstract

Multi-agent reinforcement learning (MARL) is a powerful paradigm for solving cooperative and competitive decision-making problems. While many MARL benchmarks have been proposed, few combine continuous state and action spaces with challenging coordination and planning tasks. We introduce CAMAR, a new MARL benchmark designed explicitly for multi-agent pathfinding in environments with continuous actions. CAMAR supports both cooperative and competitive interactions between agents and runs efficiently, at up to 100,000 environment steps per second. We also propose a three-tier evaluation protocol to better track algorithmic progress and enable deeper analysis of performance. In addition, CAMAR allows classical planning methods such as RRT and RRT* to be integrated into MARL pipelines. We use these planners as standalone baselines and combine RRT* with popular MARL algorithms to create hybrid approaches. We provide a suite of test scenarios and benchmarking tools to ensure reproducibility and fair comparison. Experiments show that CAMAR presents a challenging and realistic testbed for the MARL community.
