StableDrag: Stable Dragging for Point-based Image Editing
March 7, 2024
Authors: Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, Limin Wang
cs.AI
Abstract
Point-based image editing has attracted remarkable attention since the emergence of DragGAN. Recently, DragDiffusion has further pushed forward the generative quality by adapting this dragging technique to diffusion models. Despite this great success, the dragging scheme exhibits two major drawbacks, namely inaccurate point tracking and incomplete motion supervision, which may result in unsatisfactory dragging outcomes. To tackle these issues, we build a stable and precise drag-based editing framework, coined StableDrag, by designing a discriminative point tracking method and a confidence-based latent enhancement strategy for motion supervision. The former allows us to precisely locate the updated handle points, thereby boosting the stability of long-range manipulation, while the latter is responsible for keeping the optimized latent as high-quality as possible across all the manipulation steps. Thanks to these unique designs, we instantiate two types of image editing models, StableDrag-GAN and StableDrag-Diff, which attain more stable dragging performance, as demonstrated through extensive qualitative experiments and quantitative assessment on DragBench.
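
To make the first idea concrete, below is a minimal sketch, not the authors' released code, of what a discriminative point tracker could look like on top of GAN or diffusion feature maps. All names and hyperparameters here (`fit_tracker`, `win_r`, `sigma`, the learning rate and step count) are illustrative assumptions. The core idea it demonstrates: a small per-point score head is fit online so that its response peaks at the handle point, and the peak value doubles as a tracking confidence.

```python
# A minimal sketch (not the authors' code) of discriminative point tracking:
# a tiny 1x1-conv "classifier" is fit online so its response map peaks at the
# handle point, and the peak score serves as a tracking confidence.
import torch
import torch.nn.functional as F


def gaussian_label(h, w, center, sigma=2.0):
    """Soft target map that peaks at the current handle-point location."""
    ys = torch.arange(h).float().unsqueeze(1)
    xs = torch.arange(w).float().unsqueeze(0)
    cy, cx = center
    return torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))


def fit_tracker(feat_map, handle, steps=50, lr=0.1):
    """Fit a 1x1-conv score head so its response matches a Gaussian
    centered on the handle point (a simple discriminative objective)."""
    c, h, w = feat_map.shape
    weight = torch.zeros(1, c, 1, 1, requires_grad=True)
    target = gaussian_label(h, w, handle)
    opt = torch.optim.Adam([weight], lr=lr)
    for _ in range(steps):
        score = F.conv2d(feat_map.unsqueeze(0), weight).squeeze()
        loss = ((score - target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return weight.detach()


def track_point(feat_map, weight, prev, win_r=4):
    """Relocate the handle point as the argmax of the discriminative score
    inside a local window; the peak value is returned as a confidence."""
    score = F.conv2d(feat_map.unsqueeze(0), weight).squeeze()
    y0, x0 = prev
    h, w = score.shape
    ys = slice(max(0, y0 - win_r), min(h, y0 + win_r + 1))
    xs = slice(max(0, x0 - win_r), min(w, x0 + win_r + 1))
    local = score[ys, xs]
    idx = torch.argmax(local)
    dy, dx = divmod(idx.item(), local.shape[1])
    new_point = (ys.start + dy, xs.start + dx)
    confidence = local.flatten()[idx].item()
    return new_point, confidence
```

Compared with the nearest-neighbor feature matching used in DragGAN-style tracking, a learned score head of this kind can suppress visually similar distractor regions, which is what makes long-range drags more stable.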
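The confidence returned above also suggests one plausible reading of the confidence-based latent enhancement; the sketch below is an assumption about the mechanism, not the paper's exact formulation, and the threshold `tau` and function names are invented for illustration. The idea it shows: when tracking confidence drops, supervision is anchored to the initial template features rather than the current, possibly degraded latent.

```python
# One plausible reading (an assumption, not the paper's exact formulation)
# of confidence-based latent enhancement during motion supervision.
import torch
import torch.nn.functional as F


def motion_supervision_loss(handle_feat, shifted_feat, template_feat,
                            confidence, tau=0.85):
    """Drag-style motion supervision: pull the patch one step closer to the
    target toward a (detached) reference patch. When tracking confidence
    falls below tau, the reference is swapped for the initial template
    features, so supervision stays anchored to a high-quality signal even
    if the current latent has degraded."""
    ref = handle_feat if confidence >= tau else template_feat
    return F.l1_loss(shifted_feat, ref.detach())
```

The design choice being illustrated is the switch itself: a fixed reference keeps every optimization step supervised by reliable features, at the cost of being less adaptive when the edit legitimately changes appearance.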