
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation

July 25, 2024
Authors: Xiang Zhang, Bingxin Ke, Hayko Riemenschneider, Nando Metzger, Anton Obukhov, Markus Gross, Konrad Schindler, Christopher Schroers
cs.AI

Abstract

By training over large-scale datasets, zero-shot monocular depth estimation (MDE) methods show robust performance in the wild but often suffer from insufficiently precise details. Although recent diffusion-based MDE approaches exhibit appealing detail extraction ability, they still struggle in geometrically challenging scenes due to the difficulty of gaining robust geometric priors from diverse datasets. To leverage the complementary merits of both worlds, we propose BetterDepth to efficiently achieve geometrically correct affine-invariant MDE performance while capturing fine-grained details. Specifically, BetterDepth is a conditional diffusion-based refiner that takes the prediction from pre-trained MDE models as depth conditioning, in which the global depth context is well-captured, and iteratively refines details based on the input image. For the training of such a refiner, we propose global pre-alignment and local patch masking methods to ensure the faithfulness of BetterDepth to depth conditioning while learning to capture fine-grained scene details. By efficient training on small-scale synthetic datasets, BetterDepth achieves state-of-the-art zero-shot MDE performance on diverse public datasets and in-the-wild scenes. Moreover, BetterDepth can improve the performance of other MDE models in a plug-and-play manner without additional re-training.
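
The abstract names global pre-alignment and local patch masking but does not spell out how either is computed. The following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that pre-alignment is a least-squares scale-and-shift fit of the depth conditioning to the training label (a common choice for affine-invariant depth), and that patch masking keeps only those patches where the aligned conditioning already agrees with the label. The function names, the patch size, and the tolerance are all hypothetical.

```python
import numpy as np


def align_scale_shift(pred, target):
    """Fit s, t minimizing ||s * pred + t - target||^2 and return s * pred + t.

    Assumption: "global pre-alignment" is read here as a least-squares
    affine fit; the abstract does not specify the procedure.
    """
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, target.ravel(), rcond=None)
    return s * pred + t


def patch_mask(aligned, target, patch=8, tol=0.05):
    """Return a boolean mask that is True on patches where the aligned
    conditioning stays within `tol` mean absolute error of the target.

    Assumption: `patch` and `tol` are illustrative values, not taken
    from the paper.
    """
    h, w = target.shape
    mask = np.zeros((h, w), dtype=bool)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            diff = np.abs(aligned[i:i + patch, j:j + patch]
                          - target[i:i + patch, j:j + patch])
            mask[i:i + patch, j:j + patch] = diff.mean() <= tol
    return mask


# Usage: a conditioning map that differs from the label only by an
# affine transform is recovered by the alignment step.
rng = np.random.default_rng(0)
label = rng.random((32, 32))
conditioning = (label - 0.3) / 2.0
aligned = align_scale_shift(conditioning, label)
keep = patch_mask(aligned, label)
```

Under this reading, a training loss would be evaluated only where `keep` is true, so the refiner is not pushed to contradict its conditioning in regions where the conditioning and the label disagree.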
