Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Abstract

Diffusion models (DMs) have established themselves as the state-of-the-art generative modeling approach in the visual domain and beyond. A crucial drawback of DMs is their slow sampling speed, since sampling relies on many sequential function evaluations through large neural networks. Sampling from DMs can be seen as solving a differential equation through a discretized set of noise levels known as the sampling schedule. While past works primarily focused on deriving efficient solvers, little attention has been given to finding optimal sampling schedules, and the entire literature relies on hand-crafted heuristics. In this work, for the first time, we propose a general and principled approach to optimizing the sampling schedules of DMs for high-quality outputs, called Align Your Steps. We leverage methods from stochastic calculus and find optimal schedules specific to different solvers, trained DMs, and datasets. We evaluate our novel approach on several image, video, and 2D toy data synthesis benchmarks, using a variety of different samplers, and observe that our optimized schedules outperform previous hand-crafted schedules in almost all experiments. Our method demonstrates the untapped potential of sampling schedule optimization, especially in the few-step synthesis regime.
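To make the notion of a sampling schedule concrete, the following is a minimal sketch and not the method proposed in this work. It builds one widely used hand-crafted heuristic (polynomial spacing of noise levels, as popularized by the EDM framework) and runs a plain Euler solver over it. The function names `karras_schedule`, `toy_denoiser`, and `euler_sample`, the default values, and the Gaussian toy denoiser standing in for a trained network are all assumptions made for illustration.

```python
import numpy as np


def karras_schedule(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Hand-crafted schedule: noise levels from sigma_max down to sigma_min.

    Levels are spaced uniformly in sigma**(1/rho); the defaults are assumed
    illustrative values, not values taken from this abstract.
    """
    ramp = np.linspace(0.0, 1.0, n_steps)
    inv_rho_max = sigma_max ** (1.0 / rho)
    inv_rho_min = sigma_min ** (1.0 / rho)
    return (inv_rho_max + ramp * (inv_rho_min - inv_rho_max)) ** rho


def toy_denoiser(x, sigma, data_std=1.0):
    """Exact posterior mean E[x0 | x] for zero-mean Gaussian data.

    Hypothetical stand-in for the large neural network of a real DM.
    """
    return x * data_std**2 / (data_std**2 + sigma**2)


def euler_sample(x, sigmas, denoiser):
    """Euler steps on the ODE dx/dsigma = (x - D(x, sigma)) / sigma.

    Each step costs one denoiser evaluation, so the number of noise levels
    in the schedule fixes the sampling cost.
    """
    for sigma_cur, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        direction = (x - denoiser(x, sigma_cur)) / sigma_cur
        x = x + (sigma_next - sigma_cur) * direction
    return x


rng = np.random.default_rng(0)
sigmas = karras_schedule(10)
x_init = sigmas[0] * rng.standard_normal(10_000)  # start from pure noise
samples = euler_sample(x_init, sigmas, toy_denoiser)
```

With only ten noise levels, where those levels are placed determines how much discretization error the solver accumulates. The schedule array is the quantity that a schedule-optimization approach would tune in place of the fixed heuristic.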
