Repositioning the Subject within Image
January 30, 2024
Authors: Yikai Wang, Chenjie Cao, Qiaole Dong, Yifan Li, Yanwei Fu
cs.AI
Abstract
Current image manipulation primarily centers on static manipulation, such as
replacing specific regions within an image or altering its overall style. In
this paper, we introduce an innovative dynamic manipulation task, subject
repositioning. This task involves relocating a user-specified subject to a
desired position while preserving the image's fidelity. Our research reveals
that the fundamental sub-tasks of subject repositioning, which include filling
the void left by the repositioned subject, reconstructing obscured portions of
the subject and blending the subject to be consistent with surrounding areas,
can be effectively reformulated as a unified, prompt-guided inpainting task.
Consequently, we can employ a single diffusion generative model to address
these sub-tasks using various task prompts learned through our proposed task
inversion technique. Additionally, we integrate pre-processing and
post-processing techniques to further enhance the quality of subject
repositioning. These elements together form our SEgment-gEnerate-and-bLEnd
(SEELE) framework. To assess SEELE's effectiveness in subject repositioning, we
assemble a real-world subject repositioning dataset called ReS. Our results on
ReS demonstrate the quality of repositioned image generation.
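
As a rough illustration of the geometry behind the task (not the SEELE implementation), the sketch below moves a masked subject by a pixel offset and reports which pixels are left empty and would need to be filled by an inpainting model. The function name `reposition`, the nested-list image representation, and the `hole` placeholder are assumptions made for this example only; the generative steps described in the abstract (filling the void, completing occluded parts, blending) are not implemented here.

```python
def reposition(image, mask, dy, dx, hole=None):
    """Shift the masked subject by (dy, dx) on a 2D grid.

    Hypothetical helper for illustration. Returns:
      out   - copy of `image` with the subject pasted at its new location
              and vacated pixels set to `hole`
      moved - boolean grid marking the subject's new footprint
      void  - boolean grid marking vacated pixels that need inpainting
    """
    h, w = len(image), len(image[0])
    moved = [[False] * w for _ in range(h)]
    out = [row[:] for row in image]

    # Paste subject pixels at their shifted positions, dropping any that
    # would fall outside the image bounds.
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if mask[y][x] and 0 <= ny < h and 0 <= nx < w:
                moved[ny][nx] = True
                out[ny][nx] = image[y][x]

    # The void is the part of the old footprint not covered by the new one.
    void = [[bool(mask[y][x]) and not moved[y][x] for x in range(w)]
            for y in range(h)]
    for y in range(h):
        for x in range(w):
            if void[y][x]:
                out[y][x] = hole
    return out, moved, void


# Toy example: a two-pixel subject moved one column to the right.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
mask = [[1, 1, 0, 0],
        [0, 0, 0, 0]]
out, moved, void = reposition(image, mask, dy=0, dx=1)
```

In this toy case the single vacated pixel at the top-left is marked in `void`; in a full pipeline that region, together with the source image, would be passed to an inpainting model.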