Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method

Abstract

Training-free guided sampling in diffusion models leverages off-the-shelf pre-trained networks, such as an aesthetic evaluation model, to guide the generation process. Current training-free guided sampling algorithms compute the guidance energy function from a one-step estimate of the clean image. However, the off-the-shelf pre-trained networks are trained on clean images, and the one-step estimate of the clean image may be inaccurate, especially in the early stages of the generation process. This makes the guidance in the early time steps inaccurate. To overcome this problem, we propose Symplectic Adjoint Guidance (SAG), which calculates the gradient guidance in two inner stages. First, SAG estimates the clean image via n function calls, where n is a flexible hyperparameter that can be tuned to meet specific image quality requirements. Second, SAG uses the symplectic adjoint method to obtain the gradients accurately and with low memory requirements. Extensive experiments demonstrate that SAG generates higher-quality images than the baselines in both guided image and video generation tasks.
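As an illustration of the two stages, the following is a minimal toy sketch, not the paper's implementation. It assumes a 1-D Gaussian data distribution with standard deviation `S`, a noise level sigma(t) = t, and a quadratic energy toward a hypothetical `target`; all names here are illustrative. In this toy setting a single Euler step to t = 0 coincides with the one-step estimate, and larger n moves the estimate toward the exact ODE endpoint. The gradient is obtained by plain reverse accumulation through the Euler steps, which stands in for (and is much simpler than) the symplectic adjoint method.

```python
import math

S = 1.0  # std of the toy data distribution N(0, S^2) (assumption)


def drift(x, t):
    """Drift dx/dt of the toy probability-flow ODE with noise level sigma(t) = t."""
    return t * x / (S ** 2 + t ** 2)


def drift_dx(t):
    """Partial derivative of the drift with respect to x."""
    return t / (S ** 2 + t ** 2)


def estimate_clean(x_t, t, n):
    """Estimate the clean sample x_0 with n Euler steps from time t down to 0."""
    h = t / n
    times = [t - k * h for k in range(n)]  # t, t - h, ..., h
    x = x_t
    for tk in times:
        x = x - h * drift(x, tk)
    return x, times


def guidance_gradient(x_t, t, n, target):
    """Gradient of E = 0.5 * (x_0 - target)^2 with respect to x_t.

    Uses reverse accumulation through the n Euler steps (a plain
    discrete adjoint, not the symplectic adjoint solver).
    """
    x0, times = estimate_clean(x_t, t, n)
    h = t / n
    adj = x0 - target  # dE/dx_0
    for tk in reversed(times):
        adj = adj * (1.0 - h * drift_dx(tk))  # chain rule through one step
    return x0, adj


def exact_clean(x_t, t):
    """Closed-form ODE endpoint for this toy model, used only as a reference."""
    return x_t * S / math.sqrt(S ** 2 + t ** 2)


if __name__ == "__main__":
    x_t, t, target = 2.0, 3.0, 1.0
    for n in (1, 4, 50):
        x0, grad = guidance_gradient(x_t, t, n, target)
        err = abs(x0 - exact_clean(x_t, t))
        print(f"n={n:3d}  x0={x0:.4f}  error={err:.4f}  grad={grad:.4f}")
```

Running the script prints the estimate, its error against the closed-form endpoint, and the guidance gradient for a few values of n, which makes the accuracy and cost trade-off controlled by n visible on a small scale.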
