NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Abstract

Benchmarking vision-based driving policies is challenging. On one hand, open-loop evaluation with real data is easy, but these results do not reflect closed-loop performance. On the other hand, closed-loop evaluation is possible in simulation, but is hard to scale due to its significant computational demands. Further, the simulators available today exhibit a large domain gap to real data. This has resulted in an inability to draw clear conclusions from the rapidly growing body of research on end-to-end autonomous driving. In this paper, we present NAVSIM, a middle ground between open-loop and closed-loop evaluation, where we use large datasets in combination with a non-reactive simulator to enable large-scale real-world benchmarking. Specifically, we gather simulation-based metrics, such as progress and time to collision, by unrolling bird's-eye-view abstractions of the test scenes for a short simulation horizon. Our simulation is non-reactive, i.e., the evaluated policy and environment do not influence each other. As we demonstrate empirically, this decoupling allows open-loop metric computation while being better aligned with closed-loop evaluations than traditional displacement errors. NAVSIM enabled a new competition held at CVPR 2024, where 143 teams submitted 463 entries, resulting in several new insights. On a large set of challenging scenarios, we observe that simple methods with moderate compute requirements such as TransFuser can match recent large-scale end-to-end driving architectures such as UniAD. Our modular framework can potentially be extended with new datasets, data curation strategies, and metrics, and will be continually maintained to host future challenges. Our code is available at https://github.com/autonomousvision/navsim.
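
To make the idea of a non-reactive unroll concrete, the following is a minimal sketch and not the NAVSIM implementation. It assumes the ego plan and the logged agent tracks are given as lists of 2D positions sampled at a fixed time step, and it approximates every vehicle as a circle. The function name `unroll_non_reactive`, the parameters `radius` and `dt`, and the returned keys are all hypothetical; the actual benchmark's metrics, including time to collision, are defined differently and in more detail.

```python
import math


def unroll_non_reactive(ego_plan, logged_agents, radius=1.5, dt=0.5):
    """Replay a planned ego trajectory against logged agent tracks.

    ego_plan: list of (x, y) ego positions, one per time step.
    logged_agents: list of tracks, each a list of (x, y) of the same length.
    The agents follow their recorded tracks regardless of what the ego does,
    which is what makes the replay non-reactive.
    radius and dt are illustrative assumptions (circle footprint, step size).
    """
    progress = 0.0
    first_overlap_time = None
    for t, (ex, ey) in enumerate(ego_plan):
        # Accumulate the distance travelled along the planned path.
        if t > 0:
            px, py = ego_plan[t - 1]
            progress += math.hypot(ex - px, ey - py)
        # Record the first time step at which two circles overlap.
        if first_overlap_time is None:
            for track in logged_agents:
                ax, ay = track[t]
                if math.hypot(ex - ax, ey - ay) < 2 * radius:
                    first_overlap_time = t * dt
                    break
    return {"progress_m": progress, "first_overlap_s": first_overlap_time}


# Hypothetical scene: the ego drives straight while one logged agent cuts in.
ego = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (6.0, 0.0)]
agent = [(10.0, 5.0), (8.0, 3.0), (6.0, 1.0), (6.0, 0.0)]
result = unroll_non_reactive(ego, [agent])
```

Because the agents never respond to the ego, the whole unroll depends only on the planned trajectory and the recorded scene, so it can be computed offline in the same way as an open-loop metric.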
