LooseControl: Lifting ControlNet for Generalized Depth Conditioning

December 5, 2023
Authors: Shariq Farooq Bhat, Niloy J. Mitra, Peter Wonka
cs.AI

Abstract

We present LooseControl to allow generalized depth conditioning for diffusion-based image generation. ControlNet, the state of the art for depth-conditioned image generation, produces remarkable results but relies on having access to detailed depth maps for guidance. Creating such exact depth maps, in many scenarios, is challenging. This paper introduces a generalized version of depth conditioning that enables many new content-creation workflows. Specifically, we allow (C1) scene boundary control for loosely specifying scenes with only boundary conditions, and (C2) 3D box control for specifying layout locations of the target objects rather than the exact shape and appearance of the objects. Using LooseControl, along with text guidance, users can create complex environments (e.g., rooms, street views, etc.) by specifying only scene boundaries and locations of primary objects. Further, we provide two editing mechanisms to refine the results: (E1) 3D box editing enables the user to refine images by changing, adding, or removing boxes while freezing the style of the image. This yields minimal changes apart from changes induced by the edited boxes. (E2) Attribute editing proposes possible editing directions to change one particular aspect of the scene, such as the overall object density or a particular object. Extensive tests and comparisons with baselines demonstrate the generality of our method. We believe that LooseControl can become an important design tool for easily creating complex environments and be extended to other forms of guidance channels. Code and more information are available at https://shariqfarooq123.github.io/loose-control/.
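
To make the idea of a 3D box condition (C2) more concrete, here is a minimal sketch of how a handful of boxes could be turned into a coarse depth map. It is an illustration only, not the authors' implementation: the function name `render_box_depth`, the pinhole camera at the origin, the axis-aligned boxes, and the parameters `focal` and `far` are all assumptions introduced for this example. The real pipeline may use oriented boxes and a different depth encoding.

```python
import numpy as np


def render_box_depth(boxes, width=64, height=48, focal=50.0, far=10.0):
    """Render a coarse depth map of axis-aligned 3D boxes (hypothetical helper).

    Assumptions: a pinhole camera at the origin looking down +z, and each box
    given as a (min_corner, max_corner) pair in camera coordinates. Pixels
    whose ray misses every box keep the background value `far`.
    """
    # One ray per pixel. The z component is fixed to 1, so the ray
    # parameter t is equal to the z-depth of the point it reaches.
    u = np.arange(width) - (width - 1) / 2.0
    v = np.arange(height) - (height - 1) / 2.0
    uu, vv = np.meshgrid(u, v)
    dirs = np.stack([uu / focal, vv / focal, np.ones_like(uu)], axis=-1)
    # Avoid division by zero for rays parallel to an axis.
    dirs = np.where(np.abs(dirs) < 1e-12, 1e-12, dirs)

    depth = np.full((height, width), far, dtype=np.float64)
    for lo, hi in boxes:
        lo = np.asarray(lo, dtype=np.float64)
        hi = np.asarray(hi, dtype=np.float64)
        # Slab method: intersect each ray with the three pairs of planes.
        t0 = lo / dirs
        t1 = hi / dirs
        t_near = np.minimum(t0, t1).max(axis=-1)  # latest entry over the axes
        t_far = np.maximum(t0, t1).min(axis=-1)   # earliest exit over the axes
        hit = (t_near <= t_far) & (t_near > 0)
        # Keep the closest surface seen so far at each pixel.
        depth = np.where(hit, np.minimum(depth, t_near), depth)
    return depth


# Usage: one box two units in front of the camera.
box = ((-0.5, -0.5, 2.0), (0.5, 0.5, 3.0))
depth_map = render_box_depth([box])
print(depth_map.shape)
```

A map like this only records where objects sit and roughly how large they are, which is the kind of loose layout signal the abstract describes, as opposed to a detailed per-pixel depth map.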