Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method

Abstract

Training-free guided sampling in diffusion models leverages off-the-shelf pre-trained networks, such as an aesthetic evaluation model, to guide the generation process. Current training-free guided sampling algorithms obtain the guidance energy function from a one-step estimate of the clean image. However, the off-the-shelf networks are trained on clean images, while the one-step estimate of the clean image can be inaccurate, especially in the early stages of the generation process. As a result, the guidance in the early time steps is inaccurate. To overcome this problem, we propose Symplectic Adjoint Guidance (SAG), which calculates the gradient guidance in two inner stages. First, SAG estimates the clean image via n function calls, where n is a flexible hyperparameter that can be tailored to specific image quality requirements. Second, SAG uses the symplectic adjoint method to obtain the gradients accurately and with low memory requirements. Extensive experiments demonstrate that SAG generates higher-quality images than the baselines in both guided image and video generation tasks.
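The two inner stages can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: the clean data is a 1-D Gaussian with mean M and standard deviation S, the noise level is sigma(t) = t, the clean image is estimated with n explicit Euler steps of the resulting probability-flow ODE, and the guidance energy is a quadratic distance to a hypothetical target. All function names are illustrative. A plain discrete adjoint (a reverse pass through the Euler steps) stands in for the symplectic adjoint method; it returns the exact gradient of the discretized estimate but stores the whole trajectory, so it does not show the memory savings.

```python
import math

# Assumed toy setting: clean data ~ N(M, S^2), noise level sigma(t) = t.
M, S = 0.0, 1.0


def drift(x, t):
    """Probability-flow ODE drift dx/dt for the Gaussian toy model."""
    return t * (x - M) / (S * S + t * t)


def drift_dx(x, t):
    """Partial derivative of the drift with respect to x."""
    return t / (S * S + t * t)


def estimate_clean(x_t, t, n):
    """Stage 1: integrate from time t down to 0 with n explicit Euler steps.

    Returns the clean estimate and the visited (state, time) pairs.
    With n = 1 this reduces to the usual one-step estimate.
    """
    h = t / n
    x, traj = x_t, []
    for k in range(n):
        t_k = t - k * h
        traj.append((x, t_k))
        x = x - h * drift(x, t_k)
    return x, traj


def energy(x0, target):
    """Hypothetical guidance energy evaluated on the clean estimate."""
    return 0.5 * (x0 - target) ** 2


def guidance_grad(x_t, t, n, target):
    """Stage 2: dE/dx_t via the discrete adjoint of the Euler steps."""
    x0, traj = estimate_clean(x_t, t, n)
    h = t / n
    a = x0 - target  # adjoint at time 0: dE/dx0
    for x_k, t_k in reversed(traj):
        # Each step x_{k+1} = x_k - h * drift(x_k, t_k) has Jacobian
        # 1 - h * drift_dx(x_k, t_k); the adjoint is multiplied by it.
        a = a * (1.0 - h * drift_dx(x_k, t_k))
    return a


if __name__ == "__main__":
    x_t, t = 5.0, 10.0
    # Closed-form ODE endpoint, available only because the toy is Gaussian.
    exact = M + (x_t - M) * S / math.sqrt(S * S + t * t)
    for n in (1, 4, 100):
        x0, _ = estimate_clean(x_t, t, n)
        print(n, abs(x0 - exact), guidance_grad(x_t, t, n, target=1.0))
```

In this toy setting the error of the clean estimate shrinks as n grows, which mirrors the role of n as an accuracy/cost knob. The gradient can be checked against a finite difference of `energy(estimate_clean(...)[0], target)`.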
