Condition-Aware Neural Network for Controlled Image Generation

Abstract

We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models. In parallel to prior conditional control methods, CAN controls the image generation process by dynamically manipulating the weights of the neural network. This is achieved by introducing a condition-aware weight generation module that generates conditional weights for convolution/linear layers based on the input condition. We test CAN on class-conditional image generation on ImageNet and text-to-image generation on COCO. CAN consistently delivers significant improvements for diffusion transformer models, including DiT and UViT. In particular, CAN combined with EfficientViT (CaT) achieves an FID of 2.78 on ImageNet 512x512, surpassing DiT-XL/2 while requiring 52x fewer MACs per sampling step.
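The idea of generating layer weights from the input condition can be illustrated with a minimal sketch. The abstract does not specify the architecture of the weight generation module, so everything below is an assumption made for illustration: the class name `ConditionAwareLinear`, the choice of a single linear map as the generator, and the initialization scale are all hypothetical and are not taken from the paper.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class ConditionAwareLinear:
    """Linear layer whose weight matrix is generated from a condition embedding.

    Hypothetical illustration only: the generator is a single linear map,
    which is an assumption and not the design described by the authors.
    """

    def __init__(self, in_features, out_features, cond_features, seed=0):
        rng = random.Random(seed)
        self.in_features = in_features
        self.out_features = out_features
        n_weights = in_features * out_features
        # Generator parameters: map a condition embedding to a flat weight vector.
        self.gen_matrix = [
            [rng.gauss(0.0, 0.1) for _ in range(cond_features)]
            for _ in range(n_weights)
        ]
        self.gen_bias = [rng.gauss(0.0, 0.1) for _ in range(n_weights)]

    def generate_weight(self, cond):
        """Produce an (out_features x in_features) weight matrix for this condition."""
        flat = [g + b for g, b in zip(matvec(self.gen_matrix, cond), self.gen_bias)]
        # Reshape the flat vector into out_features rows of in_features entries.
        return [
            flat[r * self.in_features:(r + 1) * self.in_features]
            for r in range(self.out_features)
        ]

    def forward(self, x, cond):
        """Apply the condition-specific weight matrix to the input features."""
        return matvec(self.generate_weight(cond), x)


# The same input passes through different weights under different conditions.
layer = ConditionAwareLinear(in_features=4, out_features=3, cond_features=2)
x = [1.0, 2.0, 3.0, 4.0]
y_a = layer.forward(x, cond=[1.0, 0.0])
y_b = layer.forward(x, cond=[0.0, 1.0])
```

In this sketch the condition never touches the features directly. It only changes the weight matrix, which is the contrast with control methods that inject the condition into the features. A practical version would generate weights per sample in a batch and would apply the same idea to convolution kernels.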
