
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator

February 26, 2025
Authors: Xiankang He, Dongyan Guo, Hongji Li, Ruibo Li, Ying Cui, Chi Zhang
cs.AI

Abstract

Monocular depth estimation (MDE) aims to predict scene depth from a single RGB image and plays a crucial role in 3D scene understanding. Recent advances in zero-shot MDE leverage normalized depth representations and distillation-based learning to improve generalization across diverse scenes. However, current depth normalization methods for distillation, relying on global normalization, can amplify noisy pseudo-labels, reducing distillation effectiveness. In this paper, we systematically analyze the impact of different depth normalization strategies on pseudo-label distillation. Based on our findings, we propose Cross-Context Distillation, which integrates global and local depth cues to enhance pseudo-label quality. Additionally, we introduce a multi-teacher distillation framework that leverages complementary strengths of different depth estimation models, leading to more robust and accurate depth predictions. Extensive experiments on benchmark datasets demonstrate that our approach significantly outperforms state-of-the-art methods, both quantitatively and qualitatively.
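The abstract describes the approach only at a high level. As a rough illustration of the ideas it mentions, the sketch below shows an affine-invariant depth normalization (median/mean-absolute-deviation) applied both globally and over random local crops, combined into a simple pseudo-label distillation loss. This is a minimal sketch under those assumptions: the function names, the normalization statistics, and the crop-based loss are illustrative and are not taken from the paper's actual implementation.

```python
import torch


def normalize_depth(d, eps=1e-6):
    """Affine-invariant normalization (illustrative): subtract the per-sample
    median and divide by the mean absolute deviation over all pixels."""
    b = d.shape[0]
    flat = d.reshape(b, -1)
    med = flat.median(dim=1, keepdim=True).values
    mad = (flat - med).abs().mean(dim=1, keepdim=True).clamp(min=eps)
    return ((flat - med) / mad).reshape(d.shape)


def cross_context_distillation_loss(student_depth, teacher_depth,
                                    num_crops=4, crop_frac=0.5):
    """Hypothetical loss combining a global distillation term with terms
    computed on shared random local crops, so that pseudo-labels are
    constrained by local as well as global depth statistics.

    Both inputs are (B, 1, H, W) depth maps; teacher_depth is the pseudo-label.
    """
    # Global term: normalize both maps over the full image before comparing.
    loss = (normalize_depth(student_depth) - normalize_depth(teacher_depth)).abs().mean()

    b, _, h, w = student_depth.shape
    ch, cw = int(h * crop_frac), int(w * crop_frac)
    for _ in range(num_crops):
        top = torch.randint(0, h - ch + 1, (1,)).item()
        left = torch.randint(0, w - cw + 1, (1,)).item()
        s_crop = student_depth[:, :, top:top + ch, left:left + cw]
        t_crop = teacher_depth[:, :, top:top + ch, left:left + cw]
        # Local term: re-normalize within the shared crop before comparing,
        # so the comparison uses local rather than global statistics.
        loss = loss + (normalize_depth(s_crop) - normalize_depth(t_crop)).abs().mean()
    return loss / (1 + num_crops)
```

A multi-teacher variant could, for example, average the normalized pseudo-labels of several teacher models or sample one teacher per batch before applying the same loss; the paper should be consulted for the exact scheme it uses.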

