Repositioning the Subject within Image

January 30, 2024
Authors: Yikai Wang, Chenjie Cao, Qiaole Dong, Yifan Li, Yanwei Fu
cs.AI

Abstract

Current image manipulation primarily centers on static manipulation, such as replacing specific regions within an image or altering its overall style. In this paper, we introduce an innovative dynamic manipulation task, subject repositioning. This task involves relocating a user-specified subject to a desired position while preserving the image's fidelity. Our research reveals that the fundamental sub-tasks of subject repositioning, which include filling the void left by the repositioned subject, reconstructing obscured portions of the subject and blending the subject to be consistent with surrounding areas, can be effectively reformulated as a unified, prompt-guided inpainting task. Consequently, we can employ a single diffusion generative model to address these sub-tasks using various task prompts learned through our proposed task inversion technique. Additionally, we integrate pre-processing and post-processing techniques to further enhance the quality of subject repositioning. These elements together form our SEgment-gEnerate-and-bLEnd (SEELE) framework. To assess SEELE's effectiveness in subject repositioning, we assemble a real-world subject repositioning dataset called ReS. Our results on ReS demonstrate the quality of repositioned image generation.
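
To make the sub-tasks named in the abstract concrete, the sketch below shows only the geometric part of subject repositioning: shifting a masked subject and deriving the mask of the void it leaves behind, which is the region an inpainting model would then have to fill. It is an illustrative assumption, not the SEELE implementation. The function name `reposition_subject`, the list-of-lists image format, and the `hole_value` placeholder are all hypothetical, and the diffusion-based inpainting, task prompts, and blending steps are not shown.

```python
from typing import List, Tuple

Grid = List[List[int]]  # hypothetical toy format: one integer per pixel


def reposition_subject(
    image: Grid, mask: Grid, dx: int, dy: int, hole_value: int = 0
) -> Tuple[Grid, Grid, Grid]:
    """Shift the masked subject by (dx, dy).

    Returns (composite, moved_mask, void_mask), where void_mask marks
    pixels the subject vacated and that still need to be inpainted.
    """
    h, w = len(image), len(image[0])
    composite = [row[:] for row in image]
    moved_mask = [[0] * w for _ in range(h)]

    # Mark where the subject lands; pixels shifted outside the frame are dropped.
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    moved_mask[ny][nx] = 1

    # The void is the old subject area not covered by the moved subject.
    void_mask = [
        [1 if mask[y][x] and not moved_mask[y][x] else 0 for x in range(w)]
        for y in range(h)
    ]

    for y in range(h):
        for x in range(w):
            if void_mask[y][x]:
                # Placeholder value; a generative model would fill this region.
                composite[y][x] = hole_value
            if moved_mask[y][x]:
                # Copy the subject pixel from its original location.
                composite[y][x] = image[y - dy][x - dx]

    return composite, moved_mask, void_mask


# Toy example: a 1x4 image whose two leftmost pixels are the subject.
image = [[10, 20, 30, 40]]
mask = [[1, 1, 0, 0]]
composite, moved_mask, void_mask = reposition_subject(image, mask, dx=1, dy=0)
print(composite)   # [[0, 10, 20, 40]]
print(void_mask)   # [[1, 0, 0, 0]]
```

In this toy case the subject moves one pixel to the right, so a single vacated pixel remains in `void_mask`. In the setting the abstract describes, that region, along with any occluded parts of the subject, would be handed to the prompt-guided inpainting model.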