
StableDrag: Stable Dragging for Point-based Image Editing

March 7, 2024
Authors: Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, Limin Wang
cs.AI

Abstract

Point-based image editing has attracted remarkable attention since the emergence of DragGAN. Recently, DragDiffusion further pushed generative quality forward by adapting this dragging technique to diffusion models. Despite this great success, the dragging scheme exhibits two major drawbacks, namely inaccurate point tracking and incomplete motion supervision, which may result in unsatisfactory dragging outcomes. To tackle these issues, we build a stable and precise drag-based editing framework, coined StableDrag, by designing a discriminative point tracking method and a confidence-based latent enhancement strategy for motion supervision. The former allows us to precisely locate the updated handle points, thereby boosting the stability of long-range manipulation, while the latter is responsible for keeping the optimized latent as high-quality as possible across all the manipulation steps. Thanks to these unique designs, we instantiate two types of image editing models, StableDrag-GAN and StableDrag-Diff, which attain more stable dragging performance, as shown through extensive qualitative experiments and quantitative assessment on DragBench.
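The abstract names two mechanisms without giving detail. The sketch below is a rough PyTorch illustration of one plausible reading of them, not the authors' implementation: a discriminative tracking step that re-locates each handle point by feature similarity to a template, and a confidence gate on motion supervision that falls back to the template when tracking confidence drops. All names (`track_point`, `drag_step`, `feat_fn`, `conf_thresh`) and the specific loss choices are assumptions made for illustration.

```python
# A minimal sketch of confidence-gated, discriminative point dragging.
# Not the StableDrag code; shapes and thresholds are assumed.
import torch
import torch.nn.functional as F


def track_point(feat, template, prev_xy, radius=4):
    """Re-locate a handle point by maximizing similarity to its template.

    feat:     (C, H, W) feature map extracted from the current latent
    template: (C,) feature vector sampled at the original handle point
    prev_xy:  (x, y) handle location from the previous step
    Returns the new (x, y) and the peak similarity as a confidence score.
    """
    C, H, W = feat.shape
    x0, y0 = prev_xy
    best_score, best_xy = -1.0, prev_xy
    for y in range(max(0, y0 - radius), min(H, y0 + radius + 1)):
        for x in range(max(0, x0 - radius), min(W, x0 + radius + 1)):
            score = F.cosine_similarity(feat[:, y, x], template, dim=0).item()
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy, best_score


def drag_step(latent, feat_fn, template, handle, target,
              lr=0.01, conf_thresh=0.85):
    """One confidence-gated motion-supervision step on the latent.

    feat_fn maps the latent to a (C, H, W) feature map (standing in for an
    intermediate GAN-generator or diffusion-UNet activation).
    """
    latent = latent.detach().requires_grad_(True)
    feat = feat_fn(latent)
    C, H, W = feat.shape
    (hx, hy), conf = track_point(feat.detach(), template, handle)

    # Step one pixel from the tracked handle toward the target point.
    d = torch.tensor([target[0] - hx, target[1] - hy], dtype=torch.float32)
    d = d / (d.norm() + 1e-6)
    tx = min(W - 1, max(0, int(round(hx + d[0].item()))))
    ty = min(H - 1, max(0, int(round(hy + d[1].item()))))

    if conf >= conf_thresh:
        # Confident tracking: pull the target-pixel feature toward the
        # (detached) current handle feature, as in point-based dragging.
        loss = F.l1_loss(feat[:, ty, tx], feat[:, hy, hx].detach())
    else:
        # Low confidence: supervise with the original template instead,
        # steering the optimized latent back toward a high-quality state.
        loss = F.l1_loss(feat[:, ty, tx], template)

    loss.backward()
    with torch.no_grad():
        latent -= lr * latent.grad
    return latent.detach(), (hx, hy), conf


# Toy usage with a random convolution standing in for the real feature
# extractor; in the actual systems this would be a generator/UNet layer.
net = torch.nn.Conv2d(4, 8, 3, padding=1)
feat_fn = lambda z: net(z.unsqueeze(0)).squeeze(0)
z = torch.randn(4, 16, 16)
template = feat_fn(z)[:, 5, 5].detach()
z, handle, conf = drag_step(z, feat_fn, template, handle=(5, 5), target=(12, 12))
```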