
Drive&Gen: Co-Evaluating End-to-End Driving and Video Generation Models

October 7, 2025
Authors: Jiahao Wang, Zhenpei Yang, Yijing Bai, Yingwei Li, Yuliang Zou, Bo Sun, Abhijit Kundu, Jose Lezama, Luna Yue Huang, Zehao Zhu, Jyh-Jing Hwang, Dragomir Anguelov, Mingxing Tan, Chiyu Max Jiang
cs.AI

Abstract

Recent advances in generative models have sparked exciting new possibilities in the field of autonomous vehicles. Specifically, video generation models are now being explored as controllable virtual testing environments. Simultaneously, end-to-end (E2E) driving models have emerged as a streamlined alternative to conventional modular autonomous driving systems, gaining popularity for their simplicity and scalability. However, the application of these techniques to simulation and planning raises important questions. First, while video generation models can generate increasingly realistic videos, can these videos faithfully adhere to the specified conditions and be realistic enough for E2E autonomous planner evaluation? Second, given that data is crucial for understanding and controlling E2E planners, how can we gain deeper insights into their biases and improve their ability to generalize to out-of-distribution scenarios? In this work, we bridge the gap between the driving models and generative world models (Drive&Gen) to address these questions. We propose novel statistical measures leveraging E2E drivers to evaluate the realism of generated videos. By exploiting the controllability of the video generation model, we conduct targeted experiments to investigate distribution gaps affecting E2E planner performance. Finally, we show that synthetic data produced by the video generation model offers a cost-effective alternative to real-world data collection. This synthetic data effectively improves E2E model generalization beyond existing Operational Design Domains, facilitating the expansion of autonomous vehicle services into new operational contexts.