DiffSemanticFusion: Semantic Raster BEV Fusion for Autonomous Driving via Online HD Map Diffusion

Abstract

Autonomous driving requires accurate scene understanding, including road geometry, traffic agents, and their semantic relationships. In online HD map generation scenarios, raster-based representations are well suited to vision models but lack geometric precision, while graph-based representations retain structural detail but become unstable without precise maps. To harness the complementary strengths of both, we propose DiffSemanticFusion, a fusion framework for multimodal trajectory prediction and planning. Our approach reasons over a semantic raster-fused bird's-eye-view (BEV) space, enhanced by a map diffusion module that improves both the stability and the expressiveness of online HD map representations. We validate our framework on two downstream tasks: trajectory prediction and planning-oriented end-to-end autonomous driving. Experiments on the real-world autonomous driving benchmarks nuScenes and NAVSIM demonstrate improved performance over several state-of-the-art methods. For the prediction task on nuScenes, we integrate DiffSemanticFusion with the online-HD-map-informed QCNet, achieving a 5.1% performance improvement. For end-to-end autonomous driving in NAVSIM, DiffSemanticFusion achieves state-of-the-art results, with a 15% performance gain in NavHard scenarios. In addition, extensive ablation and sensitivity studies show that our map diffusion module can be seamlessly integrated into other vector-based approaches to enhance performance. All artifacts are available at https://github.com/SunZhigang7/DiffSemanticFusion.
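To make the raster-versus-vector trade-off mentioned in the abstract concrete, the following is a minimal sketch of how a vectorized map element (a polyline in ego-centred metres) might be rasterized into a BEV grid. It is not the paper's implementation: the function name `rasterize_polyline`, the 200 x 200 grid, and the 0.5 m cell size are all assumptions chosen for illustration. The example shows why rasterization costs geometric precision: any two points closer together than one cell can land in the same pixel.

```python
import numpy as np

def rasterize_polyline(points, grid_size=200, cell_m=0.5, samples_per_segment=50):
    """Mark the BEV cells touched by a polyline given in ego-centred metres.

    Hypothetical helper for illustration; grid_size and cell_m are assumed values.
    """
    bev = np.zeros((grid_size, grid_size), dtype=np.uint8)
    half_extent = grid_size * cell_m / 2.0  # grid covers [-half_extent, +half_extent)
    pts = np.asarray(points, dtype=float)
    for start, end in zip(pts[:-1], pts[1:]):
        # Densely sample the segment so that no crossed cell is skipped.
        t = np.linspace(0.0, 1.0, samples_per_segment)[:, None]
        seg = start + t * (end - start)
        # Quantize metric coordinates to integer cell indices.
        cols = np.floor((seg[:, 0] + half_extent) / cell_m).astype(int)
        rows = np.floor((seg[:, 1] + half_extent) / cell_m).astype(int)
        inside = (cols >= 0) & (cols < grid_size) & (rows >= 0) & (rows < grid_size)
        bev[rows[inside], cols[inside]] = 1
    return bev

# A lane boundary as a vector polyline (x, y) in metres.
lane = [(-10.0, 0.2), (0.0, 0.3), (10.0, 1.1)]
bev = rasterize_polyline(lane)
print(bev.shape, int(bev.sum()))
```

In this sketch the first two vertices differ by 0.1 m in y, yet both fall in the same grid row, so the raster cannot distinguish them. A vector or graph representation keeps the exact coordinates, but it depends on the online map estimate being stable.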
