Background Prompting for Improved Object Depth
June 8, 2023
Authors: Manel Baradad, Yuanzhen Li, Forrester Cole, Michael Rubinstein, Antonio Torralba, William T. Freeman, Varun Jampani
cs.AI
Abstract
Estimating the depth of objects from a single image is a valuable task for
many vision, robotics, and graphics applications. However, current methods
often fail to produce accurate depth for objects in diverse scenes. In this
work, we propose a simple yet effective Background Prompting strategy that
adapts the input object image with a learned background. We learn the
background prompts using only small-scale synthetic object datasets. To infer
object depth on a real image, we place the segmented object into the learned
background prompt and run off-the-shelf depth networks. Background Prompting
helps the depth networks focus on the foreground object, because it makes them
invariant to background variations. Moreover, Background Prompting reduces
the domain gap between synthetic and real object images, leading to better
sim2real generalization than simple finetuning. Results on multiple synthetic
and real datasets demonstrate consistent improvements in real object depths for
a variety of existing depth networks. Code and optimized background prompts can
be found at: https://mbaradad.github.io/depth_prompt.
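
The inference step described above (place the segmented object on the learned background, then run an off-the-shelf depth network) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of plain alpha compositing, and the `depth_fn` callable standing in for a depth network are all hypothetical, and `background` is just an array standing in for a learned prompt. The released code at the URL above is the authoritative reference.

```python
import numpy as np


def composite_on_background(image, mask, background):
    """Paste a segmented object onto a background of the same size.

    image:      (H, W, 3) float array, the input photo
    mask:       (H, W) float array in [0, 1], 1 where the object is
    background: (H, W, 3) float array, standing in for a learned prompt
    """
    if image.shape != background.shape or image.shape[:2] != mask.shape:
        raise ValueError("image, mask and background must share H and W")
    alpha = mask[..., None]  # add a channel axis so it broadcasts over RGB
    return alpha * image + (1.0 - alpha) * background


def object_depth(image, mask, background, depth_fn):
    """Run a depth predictor on the composited image.

    depth_fn is a hypothetical callable mapping an (H, W, 3) image to an
    (H, W) depth map; any off-the-shelf network could be wrapped this way.
    Depth outside the object mask is set to NaN, since only the object's
    depth is of interest.
    """
    composed = composite_on_background(image, mask, background)
    depth = depth_fn(composed)
    return np.where(mask > 0.5, depth, np.nan)
```

A toy call, with a dummy function in place of a real depth network:

```python
image = np.random.rand(64, 64, 3)
mask = np.zeros((64, 64))
mask[16:48, 16:48] = 1.0
background = np.zeros((64, 64, 3))  # placeholder, not a learned prompt
depth = object_depth(image, mask, background, lambda x: x.mean(axis=-1))
```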