MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

Abstract

We present MotionDiffuser, a diffusion-based representation for the joint distribution of future trajectories over multiple agents. This representation has several key advantages. First, our model learns a highly multimodal distribution that captures diverse future outcomes. Second, the simple predictor design requires only a single L2 loss training objective and does not depend on trajectory anchors. Third, our model is capable of learning the joint distribution for the motion of multiple agents in a permutation-invariant manner. Furthermore, we utilize a compressed trajectory representation via PCA, which improves model performance and allows for efficient computation of the exact sample log probability. Subsequently, we propose a general constrained sampling framework that enables controlled trajectory sampling based on differentiable cost functions. This strategy enables a host of applications, such as enforcing rules and physical priors or creating tailored simulation scenarios. MotionDiffuser can be combined with existing backbone architectures to achieve top motion forecasting results. We obtain state-of-the-art results for multi-agent motion prediction on the Waymo Open Motion Dataset.
