

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

May 18, 2023
作者: Xingang Pan, Ayush Tewari, Thomas Leimkühler, Lingjie Liu, Abhimitra Meka, Christian Theobalt
cs.AI

Abstract

Synthesizing visual content that meets users' needs often requires flexible and precise controllability of the pose, shape, expression, and layout of the generated objects. Existing approaches gain controllability of generative adversarial networks (GANs) via manually annotated training data or a prior 3D model, which often lack flexibility, precision, and generality. In this work, we study a powerful yet much less explored way of controlling GANs, that is, to "drag" any points of the image to precisely reach target points in a user-interactive manner, as shown in Fig.1. To achieve this, we propose DragGAN, which consists of two main components: 1) a feature-based motion supervision that drives the handle point to move towards the target position, and 2) a new point tracking approach that leverages the discriminative generator features to keep localizing the position of the handle points. Through DragGAN, anyone can deform an image with precise control over where pixels go, thus manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, landscapes, etc. As these manipulations are performed on the learned generative image manifold of a GAN, they tend to produce realistic outputs even for challenging scenarios such as hallucinating occluded content and deforming shapes that consistently follow the object's rigidity. Both qualitative and quantitative comparisons demonstrate the advantage of DragGAN over prior approaches in the tasks of image manipulation and point tracking. We also showcase the manipulation of real images through GAN inversion.
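The abstract names two components: a feature-based motion supervision loss that nudges the content around each handle point one small step toward its target, and a point-tracking step that relocates the handle by nearest-neighbour search of its initial feature vector in the generator's feature map. Below is a minimal, hypothetical PyTorch sketch of those two ideas. The random tensor standing in for a StyleGAN feature map, and the helper names `bilinear_sample`, `motion_supervision_loss`, and `track_point`, are illustrative assumptions, not the authors' released implementation (which optimizes the latent code `w` rather than the feature map directly).

```python
# Sketch of DragGAN-style motion supervision and point tracking (assumed shapes/names).
import torch
import torch.nn.functional as F

def bilinear_sample(feat, xy):
    """Sample feature vectors at continuous (x, y) pixel locations of a (1, C, H, W) map."""
    _, _, H, W = feat.shape
    grid = xy.clone()
    grid[..., 0] = 2.0 * xy[..., 0] / (W - 1) - 1.0   # normalize x to [-1, 1]
    grid[..., 1] = 2.0 * xy[..., 1] / (H - 1) - 1.0   # normalize y to [-1, 1]
    out = F.grid_sample(feat, grid.view(1, -1, 1, 2), align_corners=True)  # (1, C, N, 1)
    return out.squeeze(-1).squeeze(0).t()             # (N, C)

def _patch_offsets(radius):
    r = torch.arange(-radius, radius + 1.0)
    return torch.stack(torch.meshgrid(r, r, indexing="ij"), -1).reshape(-1, 2)

def motion_supervision_loss(feat, handles, targets, radius=3):
    """Pull features in a small patch around each handle one unit step toward its target.
    The current features are detached so the gradient moves the image content, not the anchor."""
    loss = 0.0
    for p, t in zip(handles, targets):
        d = (t - p) / (torch.norm(t - p) + 1e-8)      # unit direction handle -> target
        q = p + _patch_offsets(radius)                # pixel positions around the handle
        f_now = bilinear_sample(feat, q).detach()
        f_shifted = bilinear_sample(feat, q + d)
        loss = loss + torch.abs(f_shifted - f_now).mean()
    return loss

def track_point(feat, f_init, p, radius=6):
    """Relocate a handle point by nearest-neighbour search of its initial feature
    vector f_init inside a window around the current position p."""
    cand = p + _patch_offsets(radius)
    f_cand = bilinear_sample(feat, cand)              # (N, C)
    idx = torch.argmin(torch.norm(f_cand - f_init, dim=1))
    return cand[idx]

# Toy usage: a random tensor stands in for an intermediate generator feature map.
feat = torch.randn(1, 128, 64, 64, requires_grad=True)
handle = torch.tensor([20.0, 30.0])
target = torch.tensor([28.0, 30.0])
f_init = bilinear_sample(feat, handle.unsqueeze(0)).detach()

loss = motion_supervision_loss(feat, [handle], [target])
loss.backward()                                       # in the real method this gradient updates the latent code
handle = track_point(feat.detach(), f_init, handle)   # re-localize the handle after the update
```

In the paper's setting these two steps alternate each iteration: optimize the latent code with the motion-supervision loss, regenerate the feature map, then re-track the handle points before the next step.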