LooseControl: Lifting ControlNet for Generalized Depth Conditioning
December 5, 2023
Authors: Shariq Farooq Bhat, Niloy J. Mitra, Peter Wonka
cs.AI
Abstract
We present LooseControl to allow generalized depth conditioning for
diffusion-based image generation. ControlNet, the state of the art for
depth-conditioned image generation, produces remarkable results but relies on
having access to detailed depth maps for guidance. In many scenarios, creating
such exact depth maps is challenging. This paper introduces a generalized version of depth
conditioning that enables many new content-creation workflows. Specifically, we
allow (C1) scene boundary control for loosely specifying scenes with only
boundary conditions, and (C2) 3D box control for specifying layout locations of
the target objects rather than the exact shape and appearance of the objects.
Using LooseControl, along with text guidance, users can create complex
environments (e.g., rooms, street views, etc.) by specifying only scene
boundaries and locations of primary objects. Further, we provide two editing
mechanisms to refine the results: (E1) 3D box editing enables the user to
refine images by changing, adding, or removing boxes while freezing the style
of the image. This yields minimal changes apart from changes induced by the
edited boxes. (E2) Attribute editing proposes possible editing directions to
change one particular aspect of the scene, such as the overall object density
or a particular object. Extensive tests and comparisons with baselines
demonstrate the generality of our method. We believe that LooseControl can
become an important design tool for easily creating complex environments and
can be extended to other forms of guidance channels. Code and more information
are available at https://shariqfarooq123.github.io/loose-control/.
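
The abstract does not say how a 3D box layout (C2) is encoded as a conditioning signal. As a rough illustration only, the sketch below shows one plausible way to turn a set of axis-aligned boxes into a coarse depth map by casting one ray per pixel from a pinhole camera. The function name `box_depth_map`, the camera parameters, and the use of a constant `far` depth as a stand-in for a scene boundary are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def box_depth_map(boxes, height=64, width=64, focal=64.0, far=10.0):
    """Render a coarse depth map from axis-aligned 3D boxes (hypothetical helper).

    boxes: iterable of (min_xyz, max_xyz) in camera coordinates, +z pointing forward.
    Pixels whose ray misses every box keep the depth `far`, which stands in
    for a flat back wall (an assumed, very simple scene boundary).
    """
    # One ray per pixel through a pinhole camera at the origin.
    u = np.arange(width) - (width - 1) / 2.0
    v = np.arange(height) - (height - 1) / 2.0
    uu, vv = np.meshgrid(u, v)
    # The z component is 1, so the ray parameter t equals the z depth.
    dirs = np.stack([uu / focal, vv / focal, np.ones_like(uu)], axis=-1)

    depth = np.full((height, width), far, dtype=np.float64)
    for lo, hi in boxes:
        lo = np.asarray(lo, dtype=np.float64)
        hi = np.asarray(hi, dtype=np.float64)
        # Slab test: the ray origin is 0, so each hit parameter is bound / direction.
        with np.errstate(divide="ignore", invalid="ignore"):
            t0 = lo / dirs
            t1 = hi / dirs
        t_near = np.minimum(t0, t1).max(axis=-1)  # latest entry over the three slabs
        t_far = np.maximum(t0, t1).min(axis=-1)   # earliest exit over the three slabs
        hit = (t_near <= t_far) & (t_near > 0)
        # Keep the closest surface seen so far at each pixel.
        depth = np.where(hit & (t_near < depth), t_near, depth)
    return depth

# Example: a single unit-sized box two units in front of the camera.
coarse = box_depth_map([((-0.5, -0.5, 2.0), (0.5, 0.5, 3.0))])
```

A map like `coarse` carries only layout information (where boxes sit and how far away they are), with no object shape or appearance, which is the kind of loose guidance the abstract describes.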