NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking
June 21, 2024
Authors: Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, Kashyap Chitta
cs.AI
Abstract
Benchmarking vision-based driving policies is challenging. On one hand,
open-loop evaluation with real data is easy, but these results do not reflect
closed-loop performance. On the other, closed-loop evaluation is possible in
simulation, but is hard to scale due to its significant computational demands.
Further, the simulators available today exhibit a large domain gap to real
data. This has resulted in an inability to draw clear conclusions from the
rapidly growing body of research on end-to-end autonomous driving. In this
paper, we present NAVSIM, a middle ground between these evaluation paradigms,
where we use large datasets in combination with a non-reactive simulator to
enable large-scale real-world benchmarking. Specifically, we gather
simulation-based metrics, such as progress and time to collision, by unrolling
bird's eye view abstractions of the test scenes for a short simulation horizon.
Our simulation is non-reactive, i.e., the evaluated policy and environment do
not influence each other. As we demonstrate empirically, this decoupling allows
open-loop metric computation while being better aligned with closed-loop
evaluations than traditional displacement errors. NAVSIM enabled a new
competition held at CVPR 2024, where 143 teams submitted 463 entries, resulting
in several new insights. On a large set of challenging scenarios, we observe
that simple methods with moderate compute requirements such as TransFuser can
match recent large-scale end-to-end driving architectures such as UniAD. Our
modular framework can potentially be extended with new datasets, data curation
strategies, and metrics, and will be continually maintained to host future
challenges. Our code is available at
https://github.com/autonomousvision/navsim.
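To make the non-reactive evaluation idea concrete, the sketch below shows how metrics like time to collision and progress could be computed by unrolling an ego plan against *recorded* agent trajectories that never react to the ego. This is a minimal illustration of the paradigm described in the abstract, not NAVSIM's actual implementation; all function names, the timestep, and the collision radius are assumptions for this example.

```python
"""Hedged sketch of non-reactive metric computation: the evaluated
policy's planned trajectory is checked against recorded (non-reactive)
agent trajectories over a short horizon. Names and thresholds are
illustrative, not NAVSIM's API."""
import math


def time_to_collision(ego_traj, agent_trajs, dt=0.5, radius=2.0):
    """Return the first time (seconds) at which the ego plan comes within
    `radius` meters of any recorded agent, or None if it never does.

    ego_traj:    list of (x, y) ego positions, one per timestep
    agent_trajs: list of agent trajectories, each a list of (x, y)
    """
    for t, (ex, ey) in enumerate(ego_traj):
        for traj in agent_trajs:
            if t < len(traj):
                ax, ay = traj[t]
                # Agents follow their recorded motion; they never react
                # to the ego, so policy and environment stay decoupled.
                if math.hypot(ex - ax, ey - ay) < radius:
                    return t * dt
    return None


def progress(ego_traj):
    """Distance traveled along the planned trajectory, in meters."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(ego_traj, ego_traj[1:])
    )
```

Because the agents are replayed rather than simulated, both metrics can be computed in a single open-loop pass over the dataset, which is what makes this style of benchmarking cheap to scale.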