Depth Anything at Any Condition

Abstract

We present Depth Anything at Any Condition (DepthAnything-AC), a foundation monocular depth estimation (MDE) model capable of handling diverse environmental conditions. Previous foundation MDE models achieve impressive performance on general scenes but do not perform well in complex open-world environments that involve challenging conditions, such as illumination variations, adverse weather, and sensor-induced distortions. To overcome data scarcity and the inability to generate high-quality pseudo-labels from corrupted images, we propose an unsupervised consistency regularization finetuning paradigm that requires only a relatively small amount of unlabeled data. Furthermore, we propose the Spatial Distance Constraint, which explicitly pushes the model to learn patch-level relative relationships, resulting in clearer semantic boundaries and more accurate details. Experimental results demonstrate the zero-shot capabilities of DepthAnything-AC across diverse benchmarks, including real-world adverse weather benchmarks, synthetic corruption benchmarks, and general benchmarks.

Project Page: https://ghost233lism.github.io/depthanything-AC-page
Code: https://github.com/HVision-NKU/DepthAnythingAC
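
To make the two ideas named in the abstract more concrete, the following is a minimal, generic sketch of a consistency objective on unlabeled data combined with a patch-level relation term. It is not the authors' implementation: the function names (`patch_means`, `relation_matrix`, `consistency_loss`), the choice of an L1 difference, the patch size, and the way patch position and mean depth are combined into one distance are all assumptions made for illustration. The inputs are assumed to be two depth predictions for the same scene, one from the clean image and one from a perturbed copy (for example, darkened or blurred), so no ground-truth labels are needed.

```python
import math


def patch_means(depth, patch):
    """Average depth inside each non-overlapping patch x patch block.

    Returns a list of ((patch_row, patch_col), mean_depth) tuples.
    """
    h, w = len(depth), len(depth[0])
    out = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            block = [depth[y][x]
                     for y in range(i, min(i + patch, h))
                     for x in range(j, min(j + patch, w))]
            out.append(((i // patch, j // patch), sum(block) / len(block)))
    return out


def relation_matrix(depth, patch):
    """Pairwise distance between patches in (row, col, mean depth) space.

    Assumption: this Euclidean form is a stand-in for a patch-level
    relative relationship, not the paper's exact definition.
    """
    pts = patch_means(depth, patch)
    return [[math.sqrt((ra - rb) ** 2 + (ca - cb) ** 2 + (da - db) ** 2)
             for (rb, cb), db in pts]
            for (ra, ca), da in pts]


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally shaped 2D lists."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum(abs(x - y) for x, y in zip(flat_a, flat_b)) / len(flat_a)


def consistency_loss(pred_clean, pred_perturbed, patch=2, weight=1.0):
    """Pixel-level consistency plus a weighted patch-relation consistency."""
    pixel_term = mean_abs_diff(pred_clean, pred_perturbed)
    relation_term = mean_abs_diff(relation_matrix(pred_clean, patch),
                                  relation_matrix(pred_perturbed, patch))
    return pixel_term + weight * relation_term


if __name__ == "__main__":
    # Toy 4x4 depth maps standing in for network outputs.
    clean = [[1.0, 1.0, 2.0, 2.0],
             [1.0, 1.0, 2.0, 2.0],
             [3.0, 3.0, 4.0, 4.0],
             [3.0, 3.0, 4.0, 4.0]]
    perturbed = [row[:] for row in clean]
    perturbed[0][0] += 0.5  # prediction drifts under the perturbation
    print(consistency_loss(clean, clean))
    print(consistency_loss(clean, perturbed))
```

In a real training loop the two predictions would come from a network evaluated on the clean and perturbed images, and the loss would be minimized by gradient descent in an autodiff framework. The pure-Python version here only shows how the two terms are formed.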