Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction
May 27, 2025
Authors: Yifei Wang, Weimin Bai, Colin Zhang, Debing Zhang, Weijian Luo, He Sun
cs.AI
Abstract
In this paper, we unify more than 10 existing one-step diffusion distillation
approaches, such as Diff-Instruct, DMD, SIM, SiD, and f-distill, inside a
theory-driven framework which we name \emph{Uni-Instruct}.
Uni-Instruct is motivated by our proposed diffusion expansion theory of the
f-divergence family. Then we introduce key theories that overcome the
intractability issue of the original expanded f-divergence, resulting in an
equivalent yet tractable loss that effectively trains one-step diffusion models
by minimizing the expanded f-divergence family. The novel unification
introduced by Uni-Instruct not only offers new theoretical contributions that
help understand existing approaches from a high-level perspective but also
leads to state-of-the-art one-step diffusion generation performances. On the
CIFAR10 generation benchmark, Uni-Instruct achieves record-breaking Fréchet
Inception Distance (FID) values of \emph{1.46} for unconditional
generation and \emph{1.38} for conditional generation. On the
ImageNet-$64\times 64$ generation benchmark, Uni-Instruct achieves a new SoTA
one-step generation FID of \emph{1.02}, which outperforms its 79-step
teacher diffusion model by a significant margin of 1.33 FID (1.02 vs. 2.35).
We also apply Uni-Instruct to broader tasks such as text-to-3D generation,
where it achieves decent results that slightly outperform previous methods,
such as SDS and VSD, in terms of both generation quality and diversity. Both
the theoretical and empirical contributions of Uni-Instruct will potentially
help future studies on one-step diffusion distillation and knowledge transfer
of diffusion models.
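
For reference, the f-divergence family that the abstract builds on is defined below, together with one plausible reading of its diffusion expansion: a time-integrated divergence between forward-diffused marginals, in the spirit of the Integral KL divergence used by Diff-Instruct. The weighting $w(t)$ and the Gaussian perturbation kernel shown here are illustrative assumptions for exposition, not necessarily the paper's exact construction.

% Standard f-divergence between the data distribution p and the one-step generator
% distribution q, where f is convex with f(1) = 0; choosing f(r) = -log r recovers the
% reverse KL divergence KL(q || p) minimized by Diff-Instruct and DMD.
\[
  D_f(p \,\|\, q) \;=\; \mathbb{E}_{x \sim q}\!\left[ f\!\left(\frac{p(x)}{q(x)}\right)\right].
\]
% Hypothetical diffusion-expanded f-divergence: integrate D_f between the diffused
% marginals p_t and q_t over diffusion time with a weighting w(t) > 0, where the marginals
% come from Gaussian perturbation as in the forward diffusion process.
\[
  \mathcal{D}_f^{\mathrm{exp}}(p \,\|\, q) \;=\; \int_0^T w(t)\, D_f\!\left(p_t \,\|\, q_t\right)\, \mathrm{d}t,
  \qquad p_t = p * \mathcal{N}(0,\sigma_t^2 I),\quad q_t = q * \mathcal{N}(0,\sigma_t^2 I).
\]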
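
As additional background on the distillation methods the abstract unifies (Diff-Instruct, DMD, SiD, and related work), the sketch below shows the generic two-network training step those methods share: a frozen teacher score network, an online ("fake") score network fit to the generator's samples, and a one-step generator updated with a score-difference gradient. It uses the plain reverse-KL gradient rather than Uni-Instruct's expanded f-divergence loss, and every module name, signature, and the flat data shape are illustrative assumptions, not the paper's implementation.

import math
import torch

def distillation_step(generator, teacher_score, fake_score, opt_g, opt_fake,
                      batch_size=64, z_dim=128, sigma_min=0.002, sigma_max=80.0):
    """One alternating update of the fake score network and the one-step generator.

    Assumes generator(z) returns flat samples of shape (B, D) and that each score
    network takes (x_t, sigma) and estimates the score of the corresponding noisy
    marginal. All of these are placeholder interfaces for illustration.
    """
    # --- 1) Fit the online ("fake") score network to generator samples via
    #        denoising score matching.
    z = torch.randn(batch_size, z_dim)
    x_fake = generator(z).detach()
    log_sig = torch.rand(batch_size, 1) * (math.log(sigma_max) - math.log(sigma_min)) + math.log(sigma_min)
    sigma = log_sig.exp()
    noise = torch.randn_like(x_fake)
    x_t = x_fake + sigma * noise
    # The score of the Gaussian-perturbed sample is -noise / sigma, so regress onto it.
    dsm_loss = ((fake_score(x_t, sigma) + noise / sigma) ** 2).mean()
    opt_fake.zero_grad()
    dsm_loss.backward()
    opt_fake.step()

    # --- 2) Update the one-step generator with the reverse-KL score-difference
    #        gradient: grad_theta KL(q_t || p_t) ~ E[(s_fake - s_teacher) * dx_t/dtheta].
    z = torch.randn(batch_size, z_dim)
    x_g = generator(z)
    noise = torch.randn_like(x_g)
    x_t = x_g + sigma * noise
    with torch.no_grad():
        score_gap = fake_score(x_t, sigma) - teacher_score(x_t, sigma)
    # Surrogate loss whose gradient w.r.t. the generator matches the expression above.
    gen_loss = (x_t * score_gap).sum() / batch_size
    opt_g.zero_grad()
    gen_loss.backward()
    opt_g.step()
    return dsm_loss.item(), gen_loss.item()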