Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method
December 19, 2023
Authors: Jiachun Pan, Hanshu Yan, Jun Hao Liew, Jiashi Feng, Vincent Y. F. Tan
cs.AI
Abstract
Training-free guided sampling in diffusion models leverages off-the-shelf
pre-trained networks, such as an aesthetic evaluation model, to guide the
generation process. Current training-free guided sampling algorithms obtain the
guidance energy function based on a one-step estimate of the clean image.
However, since the off-the-shelf pre-trained networks are trained on clean
images, the one-step estimation procedure of the clean image may be inaccurate,
especially in the early stages of the generation process in diffusion models.
This causes the guidance in the early time steps to be inaccurate. To overcome
this problem, we propose Symplectic Adjoint Guidance (SAG), which calculates
the gradient guidance in two inner stages. Firstly, SAG estimates the clean
image via n function calls, where n serves as a flexible hyperparameter
that can be tailored to meet specific image quality requirements. Secondly, SAG
uses the symplectic adjoint method to obtain the gradients accurately and
memory-efficiently. Extensive experiments demonstrate that SAG generates images
of higher quality than the baselines in both guided image and video generation
tasks.
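
To make the two-stage guidance concrete, below is a minimal PyTorch sketch of the idea described in the abstract. All names (guided_gradient, denoiser, guidance_net, timesteps) are hypothetical, and ordinary autograd stands in for the symplectic adjoint solver, so the sketch illustrates the n-step clean-image estimate and the gradient of the guidance energy, but not the paper's constant-memory backward pass.

import torch

def guided_gradient(x_t, denoiser, guidance_net, timesteps, n_steps):
    """Sketch of two-stage gradient guidance (hypothetical names).

    Stage 1: estimate the clean image x_0 from the noisy sample x_t by
    running n_steps denoising function calls instead of a one-step estimate.
    Stage 2: differentiate the guidance energy, computed by an off-the-shelf
    network on the estimated clean image, with respect to x_t.
    Note: plain autograd is used here; the paper's SAG uses the symplectic
    adjoint method so that memory does not grow with n_steps.
    """
    x = x_t.detach().requires_grad_(True)

    with torch.enable_grad():
        # Stage 1: n-step estimate of the clean image (one call per step).
        x_est = x
        for s in timesteps[:n_steps]:
            x_est = denoiser(x_est, s)

        # Stage 2: guidance energy from an off-the-shelf network,
        # e.g. an aesthetic evaluation model.
        energy = guidance_net(x_est).sum()

        # Gradient of the energy w.r.t. the noisy sample x_t.
        grad = torch.autograd.grad(energy, x)[0]

    return grad

In a guided sampler, a gradient computed this way would typically be scaled and added to the model's update at each sampling step; the abstract's point is that estimating the clean image with several function calls, rather than one, makes this gradient more accurate early in generation.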