Depth Anything at Any Condition

Abstract

We present Depth Anything at Any Condition (DepthAnything-AC), a foundation monocular depth estimation (MDE) model capable of handling diverse environmental conditions. Previous foundation MDE models achieve impressive performance on general scenes but do not perform well in complex open-world environments that involve challenging conditions such as illumination variations, adverse weather, and sensor-induced distortions. To overcome data scarcity and the inability to generate high-quality pseudo-labels from corrupted images, we propose an unsupervised consistency regularization fine-tuning paradigm that requires only a relatively small amount of unlabeled data. Furthermore, we propose the Spatial Distance Constraint, which explicitly forces the model to learn patch-level relative relationships, resulting in clearer semantic boundaries and more accurate details. Experimental results demonstrate the zero-shot capabilities of DepthAnything-AC across diverse benchmarks, including real-world adverse weather benchmarks, synthetic corruption benchmarks, and general benchmarks.

Project Page: https://ghost233lism.github.io/depthanything-AC-page
Code: https://github.com/HVision-NKU/DepthAnythingAC
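The abstract names two ingredients without giving formulas: consistency between predictions on clean and corrupted views, and a constraint on patch-level relative relationships. The following is a minimal NumPy sketch of how such terms could look, not the authors' formulation. Everything in it is an assumption made for illustration: the `darken` perturbation, the `toy_depth_model` stand-in for a real network, the choice of an L1 penalty, and the way grid position and depth are combined into a pairwise distance.

```python
import numpy as np

def darken(image, factor=0.4):
    """Hypothetical perturbation: simulate low light by scaling intensities."""
    return np.clip(image * factor, 0.0, 1.0)

def toy_depth_model(image, weights):
    """Stand-in for a depth network: a per-pixel linear map over channels."""
    return image @ weights  # (H, W, 3) @ (3,) -> (H, W)

def consistency_loss(pred_clean, pred_perturbed):
    """Mean absolute difference between predictions on the two views."""
    return float(np.mean(np.abs(pred_clean - pred_perturbed)))

def patch_means(depth, patch):
    """Average depth inside non-overlapping patch x patch cells."""
    h, w = depth.shape
    gh, gw = h // patch, w // patch
    cells = depth[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch)
    return cells.mean(axis=(1, 3))  # (gh, gw)

def pairwise_patch_relations(depth, patch):
    """Pairwise patch distances mixing grid position and mean depth (assumed form)."""
    means = patch_means(depth, patch)
    gh, gw = means.shape
    ys, xs = np.meshgrid(np.arange(gh), np.arange(gw), indexing="ij")
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    d = means.ravel()
    pos_dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    depth_dist = np.abs(d[:, None] - d[None, :])
    return np.sqrt(pos_dist ** 2 + depth_dist ** 2)  # (N, N)

def relation_loss(pred_clean, pred_perturbed, patch=4):
    """Penalize changes in patch-to-patch relations between the two views."""
    r_clean = pairwise_patch_relations(pred_clean, patch)
    r_pert = pairwise_patch_relations(pred_perturbed, patch)
    return float(np.mean(np.abs(r_clean - r_pert)))

# Toy usage: one random "image", one clean view and one darkened view.
rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))
weights = np.array([0.2, 0.5, 0.3])
pred_clean = toy_depth_model(image, weights)
pred_dark = toy_depth_model(darken(image), weights)
total = consistency_loss(pred_clean, pred_dark) + relation_loss(pred_clean, pred_dark)
```

In an actual fine-tuning setup the clean-view prediction would typically be treated as a fixed target and only the perturbed branch would receive gradients; the sketch leaves that out because it has no trainable parameters.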