Drive&Gen: Co-Evaluating End-to-End Driving and Video Generation Models

October 7, 2025
Authors: Jiahao Wang, Zhenpei Yang, Yijing Bai, Yingwei Li, Yuliang Zou, Bo Sun, Abhijit Kundu, Jose Lezama, Luna Yue Huang, Zehao Zhu, Jyh-Jing Hwang, Dragomir Anguelov, Mingxing Tan, Chiyu Max Jiang
cs.AI

Abstract

Recent advances in generative models have sparked exciting new possibilities in the field of autonomous vehicles. Specifically, video generation models are now being explored as controllable virtual testing environments. Simultaneously, end-to-end (E2E) driving models have emerged as a streamlined alternative to conventional modular autonomous driving systems, gaining popularity for their simplicity and scalability. However, the application of these techniques to simulation and planning raises important questions. First, while video generation models can generate increasingly realistic videos, can these videos faithfully adhere to the specified conditions and be realistic enough for E2E autonomous planner evaluation? Second, given that data is crucial for understanding and controlling E2E planners, how can we gain deeper insights into their biases and improve their ability to generalize to out-of-distribution scenarios? In this work, we bridge the gap between the driving models and generative world models (Drive&Gen) to address these questions. We propose novel statistical measures leveraging E2E drivers to evaluate the realism of generated videos. By exploiting the controllability of the video generation model, we conduct targeted experiments to investigate distribution gaps affecting E2E planner performance. Finally, we show that synthetic data produced by the video generation model offers a cost-effective alternative to real-world data collection. This synthetic data effectively improves E2E model generalization beyond existing Operational Design Domains, facilitating the expansion of autonomous vehicle services into new operational contexts.