Condition-Aware Neural Network for Controlled Image Generation

Abstract

We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. In parallel to prior conditional control methods, CAN controls the image generation process by dynamically manipulating the weights of the neural network. This is achieved by introducing a condition-aware weight generation module that generates conditional weights for convolution/linear layers based on the input condition. We test CAN on class-conditional image generation on ImageNet and text-to-image generation on COCO. CAN consistently delivers significant improvements for diffusion transformer models, including DiT and UViT. In particular, CAN combined with EfficientViT (CaT) achieves 2.78 FID on ImageNet 512x512, surpassing DiT-XL/2 while requiring 52x fewer MACs per sampling step.
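The core idea in the abstract, a layer whose weights are produced from the input condition rather than stored as fixed parameters, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not describe the internal design of the weight generation module, so the single linear map used as the generator here, the class name `ConditionAwareLinear`, and all dimensions are assumptions chosen only for illustration. The sketch uses plain Python so it runs without any dependencies.

```python
import random


class ConditionAwareLinear:
    """Illustrative linear layer whose weight matrix is generated from a condition.

    Assumption: the weight generator is a single linear map from the condition
    embedding to the flattened weight matrix. The source does not specify this.
    """

    def __init__(self, cond_dim, in_features, out_features, seed=0):
        rng = random.Random(seed)
        self.in_features = in_features
        self.out_features = out_features
        n_weights = in_features * out_features
        # Generator parameters: one row per entry of the generated weight matrix.
        self.gen = [
            [rng.gauss(0.0, 0.1) for _ in range(cond_dim)]
            for _ in range(n_weights)
        ]
        self.bias = [0.0] * out_features

    def generate_weight(self, cond):
        """Map a condition embedding to an (out_features x in_features) weight."""
        flat = [sum(g * c for g, c in zip(row, cond)) for row in self.gen]
        k = self.in_features
        return [flat[i * k:(i + 1) * k] for i in range(self.out_features)]

    def __call__(self, x, cond):
        """Apply the linear layer using weights generated for this condition."""
        weight = self.generate_weight(cond)
        return [
            sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, self.bias)
        ]


layer = ConditionAwareLinear(cond_dim=3, in_features=4, out_features=2)
x = [1.0, 2.0, 3.0, 4.0]

# The same input is transformed differently under two different conditions,
# because each condition yields its own weight matrix.
y_a = layer(x, cond=[1.0, 0.0, 0.0])
y_b = layer(x, cond=[0.0, 1.0, 0.0])
print(y_a)
print(y_b)
```

The contrast with more common conditioning schemes is that the condition here changes the layer's weights themselves, instead of being injected into the features that pass through a fixed layer.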
