Diffusion-Based Generative Models for 3D Occupancy Prediction in Autonomous Driving

Abstract

Accurately predicting 3D occupancy grids from visual inputs is critical for autonomous driving, but current discriminative methods struggle with noisy data, incomplete observations, and the complex structures inherent in 3D scenes. In this work, we reframe 3D occupancy prediction as a generative modeling task using diffusion models, which learn the underlying data distribution and incorporate 3D scene priors. This approach improves prediction consistency and noise robustness, and it better handles the intricacies of 3D spatial structures. Our extensive experiments show that diffusion-based generative models outperform state-of-the-art discriminative approaches, delivering more realistic and accurate occupancy predictions, especially in occluded or low-visibility regions. Moreover, the improved predictions significantly benefit downstream planning tasks, highlighting the practical advantages of our method for real-world autonomous driving applications.
