Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation

Abstract

Accurately estimating depth in 360-degree imagery is crucial for virtual reality, autonomous navigation, and immersive media applications. Existing depth estimation methods designed for perspective-view imagery fail when applied to 360-degree images because of the different camera projection and distortion, whereas methods designed for 360-degree images underperform because labeled data pairs are scarce. We propose a new depth estimation framework that makes effective use of unlabeled 360-degree data. Our approach uses state-of-the-art perspective depth estimation models as teacher models to generate pseudo labels through a six-face cube projection technique, enabling efficient labeling of depth in 360-degree images. This method leverages the increasing availability of large datasets. Our approach includes two main stages: offline mask generation for invalid regions and an online semi-supervised joint training regime. We tested our approach on benchmark datasets such as Matterport3D and Stanford2D3D, showing significant improvements in depth estimation accuracy, particularly in zero-shot scenarios. Our proposed training pipeline can enhance any 360-degree monocular depth estimator and demonstrates effective knowledge transfer across different camera projections and data types. See our project page for results: https://albert100121.github.io/Depth-Anywhere/
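
To make the six-face cube projection step concrete, the following is a minimal sketch of how one 90-degree perspective face might be sampled from an equirectangular image, so that a perspective teacher model could then be run on it. This is not the authors' implementation: the function name `equirect_to_cube_face`, the `FACES` table, the axis convention (x right, y up, z forward), and the nearest-neighbor sampling are all assumptions chosen for brevity.

```python
import numpy as np

# Hypothetical face table: name -> (forward, right, up) axes,
# assuming x points right, y points up, and z points forward.
FACES = {
    "front": ((0, 0, 1), (1, 0, 0), (0, 1, 0)),
    "right": ((1, 0, 0), (0, 0, -1), (0, 1, 0)),
    "back": ((0, 0, -1), (-1, 0, 0), (0, 1, 0)),
    "left": ((-1, 0, 0), (0, 0, 1), (0, 1, 0)),
    "up": ((0, 1, 0), (1, 0, 0), (0, 0, -1)),
    "down": ((0, -1, 0), (1, 0, 0), (0, 0, 1)),
}


def equirect_to_cube_face(equi, face, size):
    """Sample one 90-degree cube face (size x size) by nearest neighbor."""
    height, width = equi.shape[:2]
    forward, right, up = (np.asarray(a, dtype=np.float64) for a in FACES[face])

    # Pixel centers on the face plane, spanning (-1, 1), i.e. tan(+/-45 deg).
    ticks = (np.arange(size) + 0.5) / size * 2.0 - 1.0
    u, v = np.meshgrid(ticks, -ticks)  # v flipped: row 0 is the top of the face

    # Viewing ray through every face pixel (not normalized; only angles matter).
    dirs = forward + u[..., None] * right + v[..., None] * up
    x, y, z = dirs[..., 0], dirs[..., 1], dirs[..., 2]

    lon = np.arctan2(x, z)               # [-pi, pi], 0 at the image center
    lat = np.arctan2(y, np.hypot(x, z))  # [-pi/2, pi/2], positive is up

    # Map angles to equirectangular pixel indices.
    cols = ((lon / (2 * np.pi) + 0.5) * width).astype(int) % width
    rows = np.clip(((0.5 - lat / np.pi) * height).astype(int), 0, height - 1)
    return equi[rows, cols]


if __name__ == "__main__":
    panorama = np.zeros((512, 1024, 3), dtype=np.uint8)  # placeholder image
    faces = {name: equirect_to_cube_face(panorama, name, 256) for name in FACES}
    print({name: img.shape for name, img in faces.items()})
```

In a pipeline like the one described, each face would be passed to the teacher model and the per-face predictions projected back to the equirectangular layout to form pseudo labels. That back-projection, and any handling of scale differences between faces, is omitted here.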
