Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction
May 27, 2025
Authors: Yifei Wang, Weimin Bai, Colin Zhang, Debing Zhang, Weijian Luo, He Sun
cs.AI
Abstract
In this paper, we unify more than 10 existing one-step diffusion distillation
approaches, such as Diff-Instruct, DMD, SIM, SiD, and f-distill, inside a
theory-driven framework which we name \emph{Uni-Instruct}.
Uni-Instruct is motivated by our proposed diffusion expansion theory of the
f-divergence family. Then we introduce key theories that overcome the
intractability issue of the original expanded f-divergence, resulting in an
equivalent yet tractable loss that effectively trains one-step diffusion models
by minimizing the expanded f-divergence family. The novel unification
introduced by Uni-Instruct not only offers new theoretical contributions that
help understand existing approaches from a high-level perspective but also
leads to state-of-the-art one-step diffusion generation performances. On the
CIFAR10 generation benchmark, Uni-Instruct achieves record-breaking Frechet
Inception Distance (FID) values of \emph{1.46} for unconditional
generation and \emph{1.38} for conditional generation. On the
ImageNet-64×64 generation benchmark, Uni-Instruct achieves a new SoTA
one-step generation FID of \emph{1.02}, which outperforms its 79-step
teacher diffusion model by a significant margin of 1.33 (1.02 vs 2.35).
We also apply Uni-Instruct to broader tasks such as text-to-3D generation,
where it gives decent results, slightly outperforming previous methods such
as SDS and VSD in terms of both generation quality and diversity. Both the
solid theoretical and empirical contributions of Uni-Instruct will
potentially help future studies on one-step diffusion distillation and
knowledge transfer in diffusion models.
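For readers unfamiliar with the divergence family referenced above: the standard f-divergence between distributions p and q is defined as below, and, judging from the abstract's description, the "expanded" objective integrates such divergences along the forward diffusion process. The time weighting w(t) and the marginal notation p_t, q_t here are illustrative assumptions for exposition, not the paper's exact formulation:

```latex
% Standard f-divergence, with f convex and f(1) = 0;
% e.g. f(r) = r \log r recovers the (reverse) KL divergence.
D_f(p \,\|\, q) \;=\; \mathbb{E}_{x \sim q}\!\left[ f\!\left( \tfrac{p(x)}{q(x)} \right) \right]

% Illustrative "diffusion-expanded" form: p_t, q_t denote the marginals of
% p_0, q_0 under the forward diffusion at time t, and w(t) > 0 is an
% assumed time weighting over the diffusion horizon [0, T].
\mathcal{D}_f^{\mathrm{exp}}(p_0 \,\|\, q_0) \;=\; \int_0^T w(t)\, D_f(p_t \,\|\, q_t)\, \mathrm{d}t
```

Under this reading, particular choices of f and w(t) would recover individual distillation objectives (for instance, KL-type choices corresponding to Diff-Instruct- or DMD-style losses), which is the sense in which the framework unifies them.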