GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS

Abstract

Multi-agent learning algorithms have been successful at generating superhuman planning in a wide variety of games but have had little impact on the design of deployed multi-agent planners. A key bottleneck in applying these techniques to multi-agent planning is that they require billions of steps of experience. To enable the study of multi-agent planning at this scale, we present GPUDrive, a GPU-accelerated, multi-agent simulator built on top of the Madrona Game Engine that can generate over a million steps of experience per second. Observation, reward, and dynamics functions are written directly in C++, allowing users to define complex, heterogeneous agent behaviors that are lowered to high-performance CUDA. We show that using GPUDrive we are able to effectively train reinforcement learning agents over many scenes in the Waymo Motion dataset, yielding highly effective goal-reaching agents in minutes for individual scenes and generally capable agents in a few hours. We ship these trained agents as part of the code base at https://github.com/Emerge-Lab/gpudrive.
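To put the throughput figure in perspective, the following back-of-the-envelope sketch converts a step budget into wall-clock time. Only the rate of one million steps per second comes from the abstract. The function name, the one-billion-step budget, and the slower comparison rate of 10,000 steps per second are illustrative assumptions, not figures from the paper.

```python
def wall_clock_hours(total_steps: float, steps_per_second: float) -> float:
    """Hours needed to collect total_steps at a constant steps_per_second."""
    if steps_per_second <= 0:
        raise ValueError("steps_per_second must be positive")
    return total_steps / steps_per_second / 3600.0


# Assumed experience budget: one billion steps ("billions of steps" in the abstract).
BUDGET = 1e9

# Rate stated in the abstract: over one million steps per second.
gpu_hours = wall_clock_hours(BUDGET, 1e6)

# Hypothetical slower simulator, chosen only for comparison.
slow_hours = wall_clock_hours(BUDGET, 1e4)

print(f"{gpu_hours:.2f} h at 1e6 steps/s, {slow_hours:.1f} h at 1e4 steps/s")
```

Under these assumptions, a billion steps take roughly 17 minutes at the stated rate, versus more than a day at the hypothetical slower rate.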
