Diffusion-Based Generative Models for 3D Occupancy Prediction in Autonomous Driving

Abstract

Accurately predicting 3D occupancy grids from visual inputs is critical for autonomous driving, but current discriminative methods struggle with noisy data, incomplete observations, and the complex structures inherent in 3D scenes. In this work, we reframe 3D occupancy prediction as a generative modeling task using diffusion models, which learn the underlying data distribution and incorporate 3D scene priors. This approach improves prediction consistency and noise robustness, and it better handles the intricacies of 3D spatial structures. Our extensive experiments show that diffusion-based generative models outperform state-of-the-art discriminative approaches, delivering more realistic and accurate occupancy predictions, especially in occluded or low-visibility regions. Moreover, the improved predictions significantly benefit downstream planning tasks, highlighting the practical advantages of our method for real-world autonomous driving applications.
