Background Prompting for Improved Object Depth
June 8, 2023
Authors: Manel Baradad, Yuanzhen Li, Forrester Cole, Michael Rubinstein, Antonio Torralba, William T. Freeman, Varun Jampani
cs.AI
Abstract
Estimating the depth of objects from a single image is a valuable task for
many vision, robotics, and graphics applications. However, current methods
often fail to produce accurate depth for objects in diverse scenes. In this
work, we propose a simple yet effective Background Prompting strategy that
adapts the input object image with a learned background. We learn the
background prompts using only small-scale synthetic object datasets. To infer
object depth on a real image, we place the segmented object into the learned
background prompt and run off-the-shelf depth networks. Background Prompting
helps the depth networks focus on the foreground object, as they are made
invariant to background variations. Moreover, Background Prompting minimizes
the domain gap between synthetic and real object images, leading to better
sim2real generalization than simple finetuning. Results on multiple synthetic
and real datasets demonstrate consistent improvements in real object depths for
a variety of existing depth networks. Code and optimized background prompts can
be found at: https://mbaradad.github.io/depth_prompt.
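
The inference step described in the abstract (place the segmented object into a learned background, then run an off-the-shelf depth network) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the alpha-compositing formula, and the `depth_fn` callable are assumptions made for illustration, and the random `background` array only stands in for an optimized background prompt.

```python
import numpy as np

def composite_on_background(image, mask, background):
    """Paste a segmented object onto a background image.

    image, background: float arrays of shape (H, W, 3) in [0, 1].
    mask: float array of shape (H, W), 1 on the object and 0 elsewhere.
    """
    alpha = mask[..., None]  # add a channel axis so the mask broadcasts
    return alpha * image + (1.0 - alpha) * background

def object_depth(image, mask, background, depth_fn):
    """Run a depth predictor on the composite and keep object pixels only.

    depth_fn is a hypothetical callable mapping an (H, W, 3) image to an
    (H, W) depth map; any off-the-shelf network could be wrapped this way.
    """
    composite = composite_on_background(image, mask, background)
    depth = depth_fn(composite)
    # Pixels outside the object are set to NaN: only object depth is of interest.
    return np.where(mask > 0.5, depth, np.nan)

# Placeholder inputs (assumptions, not real data or a real network).
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))
background = rng.random((4, 4, 3))   # stands in for a learned background prompt
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                 # object occupies the central 2x2 block
dummy_depth_fn = lambda img: img.mean(axis=-1)  # stand-in for a depth network

depth = object_depth(image, mask, background, dummy_depth_fn)
```

In this sketch the background is a fixed input; in the approach the abstract describes, it would instead be optimized on synthetic object data while the depth network stays unchanged.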