DiffSemanticFusion: Semantic Raster BEV Fusion for Autonomous Driving via Online HD Map Diffusion

Abstract

Autonomous driving requires accurate scene understanding, including road geometry, traffic agents, and their semantic relationships. In online HD map generation scenarios, raster-based representations are well suited to vision models but lack geometric precision, while graph-based representations retain structural detail but become unstable without precise maps. To harness the complementary strengths of both, we propose DiffSemanticFusion, a fusion framework for multimodal trajectory prediction and planning. Our approach reasons over a semantic raster-fused BEV space, enhanced by a map diffusion module that improves both the stability and expressiveness of online HD map representations. We validate our framework on two downstream tasks: trajectory prediction and planning-oriented end-to-end autonomous driving. Experiments on two real-world autonomous driving benchmarks, nuScenes and NAVSIM, demonstrate improved performance over several state-of-the-art methods. For the prediction task on nuScenes, we integrate DiffSemanticFusion with QCNet informed by online HD maps, achieving a 5.1% performance improvement. For end-to-end autonomous driving in NAVSIM, DiffSemanticFusion achieves state-of-the-art results, with a 15% performance gain in NavHard scenarios. In addition, extensive ablation and sensitivity studies show that our map diffusion module can be seamlessly integrated into other vector-based approaches to enhance performance. All artifacts are available at https://github.com/SunZhigang7/DiffSemanticFusion.
