OmniNWM: Omniscient Driving Navigation World Models

October 21, 2025
Authors: Bohan Li, Zhuang Ma, Dalong Du, Baorui Peng, Zhujin Liang, Zhenqiang Liu, Chao Ma, Yueming Jin, Hao Zhao, Wenjun Zeng, Xin Jin
cs.AI

Abstract

Autonomous driving world models are expected to work effectively across three core dimensions: state, action, and reward. Existing models, however, are typically restricted to limited state modalities, short video sequences, imprecise action control, and a lack of reward awareness. In this paper, we introduce OmniNWM, an omniscient panoramic navigation world model that addresses all three dimensions within a unified framework. For state, OmniNWM jointly generates panoramic videos of RGB, semantics, metric depth, and 3D occupancy. A flexible forcing strategy enables high-quality long-horizon auto-regressive generation. For action, we introduce a normalized panoramic Plücker ray-map representation that encodes input trajectories into pixel-level signals, enabling highly precise and generalizable control over panoramic video generation. For reward, we move beyond learning reward functions with external image-based models: instead, we use the generated 3D occupancy to directly define rule-based dense rewards for driving compliance and safety. Extensive experiments demonstrate that OmniNWM achieves state-of-the-art performance in video generation, control accuracy, and long-horizon stability, while providing a reliable closed-loop evaluation framework through occupancy-grounded rewards. The project page is available at https://github.com/Arlo0o/OmniNWM.
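
To make the idea of encoding a camera pose as pixel-level signals more concrete, the following is a minimal NumPy sketch of a generic Plücker ray map for a single pinhole camera. It is not taken from the OmniNWM code: the function name `plucker_ray_map`, the intrinsics `K`, and the camera-to-world pose are illustrative assumptions, and the only normalization applied here is making ray directions unit length, which may differ from the normalization the paper uses. Each pixel gets six values: the ray direction `d` and the moment `o × d`, where `o` is the camera center.

```python
import numpy as np

def plucker_ray_map(K, cam_to_world, height, width):
    """Return a (height, width, 6) map of Plücker coordinates (d, o x d).

    Hypothetical helper for illustration; assumes a pinhole camera with
    3x3 intrinsics K and a 4x4 camera-to-world pose matrix.
    """
    # Pixel centers in homogeneous image coordinates, shape (H, W, 3).
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1)

    # Back-project to camera-frame directions, then rotate into the world frame.
    dirs_cam = pixels @ np.linalg.inv(K).T
    R, origin = cam_to_world[:3, :3], cam_to_world[:3, 3]
    dirs = dirs_cam @ R.T
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)  # unit directions

    # Moment m = o x d; it does not depend on which point of the ray is used.
    moments = np.cross(origin, dirs)
    return np.concatenate([dirs, moments], axis=-1)

# Example with made-up intrinsics and a camera translated to (1, 2, 0).
K = np.array([[500.0, 0.0, 3.0],
              [0.0, 500.0, 2.0],
              [0.0, 0.0, 1.0]])
pose = np.eye(4)
pose[:3, 3] = [1.0, 2.0, 0.0]
rays = plucker_ray_map(K, pose, height=4, width=6)
print(rays.shape)  # (4, 6, 6)
```

Under these assumptions, a trajectory would yield one such map per camera per frame, so that a change in the ego pose shows up as a dense change in every pixel's ray rather than as a single low-dimensional vector.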