MotionLM: Multi-Agent Motion Forecasting as Language Modeling
September 28, 2023
Authors: Ari Seff, Brian Cera, Dian Chen, Mason Ng, Aurick Zhou, Nigamaa Nayakanti, Khaled S. Refaat, Rami Al-Rfou, Benjamin Sapp
cs.AI
Abstract
Reliable forecasting of the future behavior of road agents is a critical
component of safe planning in autonomous vehicles. Here, we represent
continuous trajectories as sequences of discrete motion tokens and cast
multi-agent motion prediction as a language modeling task over this domain. Our
model, MotionLM, provides several advantages: First, it does not require
anchors or explicit latent variable optimization to learn multimodal
distributions. Instead, we leverage a single standard language modeling
objective, maximizing the average log probability over sequence tokens. Second,
our approach bypasses post-hoc interaction heuristics where individual agent
trajectory generation is conducted prior to interactive scoring. Instead,
MotionLM produces joint distributions over interactive agent futures in a
single autoregressive decoding process. In addition, the model's sequential
factorization enables temporally causal conditional rollouts. The proposed
approach establishes new state-of-the-art performance for multi-agent motion
prediction on the Waymo Open Motion Dataset, ranking 1st on the interactive
challenge leaderboard.
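
To make two ideas in the abstract concrete (trajectories as discrete motion tokens, and an objective that maximizes the average log probability over sequence tokens), here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the token vocabulary, so the uniform per-axis displacement grid, the constants `NUM_BINS` and `MAX_DELTA`, and the helper names `quantize`, `tokenize`, `detokenize`, and `average_log_prob` are all assumptions made for illustration.

```python
import math

# Hypothetical settings; the abstract does not give the real vocabulary.
NUM_BINS = 13     # bins per axis for a per-step displacement
MAX_DELTA = 3.0   # displacements are clipped to [-MAX_DELTA, MAX_DELTA]
STEP = 2 * MAX_DELTA / (NUM_BINS - 1)  # width of one bin


def quantize(value):
    """Map a displacement to a bin index in [0, NUM_BINS - 1]."""
    clipped = max(-MAX_DELTA, min(MAX_DELTA, value))
    return round((clipped + MAX_DELTA) / STEP)


def tokenize(trajectory):
    """Turn a list of (x, y) waypoints into one discrete token per step.

    Each token encodes the quantized (dx, dy) between consecutive waypoints.
    """
    tokens = []
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        tokens.append(quantize(x1 - x0) * NUM_BINS + quantize(y1 - y0))
    return tokens


def detokenize(tokens, start):
    """Rebuild waypoints by accumulating the displacement of each token."""
    x, y = start
    waypoints = [start]
    for token in tokens:
        bin_x, bin_y = divmod(token, NUM_BINS)
        x += bin_x * STEP - MAX_DELTA
        y += bin_y * STEP - MAX_DELTA
        waypoints.append((x, y))
    return waypoints


def average_log_prob(logits_per_step, tokens):
    """Average log probability of the target tokens under softmax(logits).

    Maximizing this quantity is the standard language modeling objective.
    """
    total = 0.0
    for logits, target in zip(logits_per_step, tokens):
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += logits[target] - log_z
    return total / len(tokens)


trajectory = [(0, 0), (1, 0), (2, 0.5), (3, 1.5)]
tokens = tokenize(trajectory)
uniform_logits = [[0.0] * (NUM_BINS * NUM_BINS) for _ in tokens]
score = average_log_prob(uniform_logits, tokens)
```

In this toy example the waypoints happen to fall exactly on the grid, so `detokenize` recovers them exactly; in general, quantizing displacements loses precision and the error accumulates along the trajectory. The uniform logits stand in for the output of a model, which would instead produce the logits for each step conditioned on the tokens decoded so far.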