
Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation

June 18, 2024
Authors: Ning-Hsu Wang, Yu-Lun Liu
cs.AI

Abstract

Accurately estimating depth in 360-degree imagery is crucial for virtual reality, autonomous navigation, and immersive media applications. Existing depth estimation methods designed for perspective-view imagery fail when applied to 360-degree images due to different camera projections and distortions, whereas 360-degree methods underperform due to the lack of labeled data pairs. We propose a new depth estimation framework that effectively utilizes unlabeled 360-degree data. Our approach uses state-of-the-art perspective depth estimation models as teacher models to generate pseudo labels through a six-face cube projection technique, enabling efficient labeling of depth in 360-degree images. This method leverages the increasing availability of large datasets. Our approach consists of two main stages: offline mask generation for invalid regions and an online semi-supervised joint training regime. We tested our approach on benchmark datasets such as Matterport3D and Stanford2D3D, showing significant improvements in depth estimation accuracy, particularly in zero-shot scenarios. Our proposed training pipeline can enhance any 360 monocular depth estimator and demonstrates effective knowledge transfer across different camera projections and data types. See our project page for results: https://albert100121.github.io/Depth-Anywhere/
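
The six-face cube projection at the core of the pseudo-labeling stage can be illustrated with a short sketch. The helper below is a minimal NumPy implementation under our own assumptions (the function name, face layout, and nearest-neighbor sampling are illustrative, not the authors' code): it remaps an equirectangular panorama into six 90-degree perspective faces, each of which could then be fed to a perspective teacher model to produce pseudo depth labels for the corresponding region of the panorama.

```python
# Minimal sketch of a six-face cube projection for 360-degree images.
# Assumptions: equirectangular input with longitude spanning [-pi, pi]
# left-to-right and latitude [pi/2, -pi/2] top-to-bottom.
import numpy as np

# Each cube face is defined by (forward, right, up) unit vectors.
FACES = {
    "front": (np.array([0., 0., 1.]),  np.array([1., 0., 0.]),  np.array([0., -1., 0.])),
    "back":  (np.array([0., 0., -1.]), np.array([-1., 0., 0.]), np.array([0., -1., 0.])),
    "right": (np.array([1., 0., 0.]),  np.array([0., 0., -1.]), np.array([0., -1., 0.])),
    "left":  (np.array([-1., 0., 0.]), np.array([0., 0., 1.]),  np.array([0., -1., 0.])),
    "up":    (np.array([0., 1., 0.]),  np.array([1., 0., 0.]),  np.array([0., 0., 1.])),
    "down":  (np.array([0., -1., 0.]), np.array([1., 0., 0.]),  np.array([0., 0., -1.])),
}

def equirect_to_cube_faces(equi: np.ndarray, face_size: int = 256) -> dict:
    """Split an equirectangular image (H x W x C) into six 90-degree
    perspective faces suitable for a perspective-view depth model."""
    h, w = equi.shape[:2]
    # Pixel-center grid in [-1, 1] on the face plane (90-degree FOV).
    t = (np.arange(face_size) + 0.5) / face_size * 2 - 1
    xs, ys = np.meshgrid(t, t)
    faces = {}
    for name, (fwd, right, up) in FACES.items():
        # Ray direction for every pixel of this face.
        d = fwd[None, None] + xs[..., None] * right[None, None] + ys[..., None] * up[None, None]
        d /= np.linalg.norm(d, axis=-1, keepdims=True)
        # Direction -> spherical coordinates -> equirectangular pixel.
        lon = np.arctan2(d[..., 0], d[..., 2])       # [-pi, pi]
        lat = np.arcsin(np.clip(d[..., 1], -1, 1))   # [-pi/2, pi/2]
        u = ((lon / np.pi + 1) / 2 * (w - 1)).astype(int)
        v = ((0.5 - lat / np.pi) * (h - 1)).astype(int)
        faces[name] = equi[v, u]                     # nearest-neighbor sampling
    return faces
```

In this sketch, running the teacher on each of the six faces yields per-face depth maps; mapping those predictions back through the same ray geometry gives pseudo labels on the panorama, which (per the paper) are combined with offline invalid-region masks during the semi-supervised joint training stage.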
