

StableDrag: Stable Dragging for Point-based Image Editing

March 7, 2024
Authors: Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, Limin Wang
cs.AI

Abstract

Point-based image editing has attracted remarkable attention since the emergence of DragGAN. Recently, DragDiffusion further improved generative quality by adapting this dragging technique to diffusion models. Despite this great success, the dragging scheme exhibits two major drawbacks, namely inaccurate point tracking and incomplete motion supervision, which may result in unsatisfactory dragging outcomes. To tackle these issues, we build a stable and precise drag-based editing framework, coined StableDrag, by designing a discriminative point tracking method and a confidence-based latent enhancement strategy for motion supervision. The former allows us to precisely locate the updated handle points, thereby boosting the stability of long-range manipulation, while the latter is responsible for keeping the optimized latent as high-quality as possible across all the manipulation steps. Building on these designs, we instantiate two image editing models, StableDrag-GAN and StableDrag-Diff, which attain more stable dragging performance, as demonstrated through extensive qualitative experiments and quantitative assessment on DragBench.
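To make the two mechanisms named in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of what similarity-based ("discriminative") point tracking with a confidence score, and a confidence-gated supervision term, could look like. It is an illustration only, not the authors' implementation: all names (track_handle_point, motion_supervision_loss, the threshold tau, the window radius) and the specific choices of cosine similarity and an L1 fallback term are assumptions.

```python
import torch
import torch.nn.functional as F

def track_handle_point(feat_map, template, prev_pt, radius=5):
    """Hypothetical discriminative point tracking: score every location in a
    small window around the previous handle point by cosine similarity to a
    per-point template feature, then move the point to the best match.

    feat_map: (C, H, W) feature map of the current optimized latent/image.
    template: (C,) feature of the original handle point.
    prev_pt:  (row, col) integer location from the previous step.
    Returns the new point and its matching score as a tracking confidence.
    """
    C, H, W = feat_map.shape
    r0, c0 = prev_pt
    # Clamp the search window to the feature-map bounds.
    rs, re = max(r0 - radius, 0), min(r0 + radius + 1, H)
    cs, ce = max(c0 - radius, 0), min(c0 + radius + 1, W)
    window = feat_map[:, rs:re, cs:ce]                    # (C, h, w)
    sim = F.cosine_similarity(window,                     # (h, w) score map
                              template[:, None, None], dim=0)
    idx = torch.argmax(sim)
    dr, dc = divmod(idx.item(), sim.shape[1])
    conf = sim.flatten()[idx].item()
    return (rs + dr, cs + dc), conf

def motion_supervision_loss(feat_map, template, new_pt, conf, tau=0.85):
    """Hypothetical confidence-gated enhancement: when tracking confidence
    falls below tau, add a template-consistency term so the optimized latent
    stays close to the high-quality initial state at the handle point."""
    cur = feat_map[:, new_pt[0], new_pt[1]]               # (C,) current feature
    loss = torch.tensor(0.0, device=feat_map.device)
    if conf < tau:
        loss = loss + F.l1_loss(cur, template)
    return loss
```

In this reading, the confidence produced by the tracker does double duty: it localizes the updated handle point, and it signals when the latent has drifted far enough from the original content that extra supervision is warranted, which is one plausible way to connect the two components the abstract describes.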