Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

May 27, 2025
Authors: Yifei Wang, Weimin Bai, Colin Zhang, Debing Zhang, Weijian Luo, He Sun
cs.AI

Abstract

In this paper, we unify more than 10 existing one-step diffusion distillation approaches, such as Diff-Instruct, DMD, SIM, SiD, and f-distill, inside a theory-driven framework that we name Uni-Instruct. Uni-Instruct is motivated by our proposed diffusion expansion theory of the f-divergence family. We then introduce key theories that overcome the intractability of the original expanded f-divergence, resulting in an equivalent yet tractable loss that effectively trains one-step diffusion models by minimizing the expanded f-divergence family. The unification introduced by Uni-Instruct not only offers new theoretical contributions that help explain existing approaches from a high-level perspective but also leads to state-of-the-art one-step diffusion generation performance. On the CIFAR10 generation benchmark, Uni-Instruct achieves record-breaking Fréchet Inception Distance (FID) values of 1.46 for unconditional generation and 1.38 for conditional generation. On the ImageNet-64×64 generation benchmark, Uni-Instruct achieves a new SoTA one-step generation FID of 1.02, outperforming its 79-step teacher diffusion model by a significant margin of 1.33 (1.02 vs 2.35). We also apply Uni-Instruct to broader tasks such as text-to-3D generation, where it gives decent results that slightly outperform previous methods, such as SDS and VSD, in both generation quality and diversity. Both the theoretical and the empirical contributions of Uni-Instruct will potentially help future studies on one-step diffusion distillation and knowledge transfer of diffusion models.
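
For readers unfamiliar with the f-divergence family mentioned in the abstract, the following is a minimal sketch of the standard textbook definition for discrete distributions, D_f(p || q) = sum_i q_i f(p_i / q_i), with f convex and f(1) = 0. It is not the paper's expanded diffusion divergence or its tractable loss; the function names and the toy distributions are assumptions made purely for illustration, and the paper's own conventions may differ.

```python
import math


def f_divergence(p, q, f):
    """Discrete f-divergence D_f(p || q) = sum_i q_i * f(p_i / q_i).

    Assumes p and q are probability vectors of equal length with q_i > 0.
    """
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def kl_generator(r):
    # f(r) = r log r yields the forward KL divergence KL(p || q);
    # f(0) is taken as 0 by continuity.
    return r * math.log(r) if r > 0 else 0.0


def reverse_kl_generator(r):
    # f(r) = -log r yields the reverse KL divergence KL(q || p); requires r > 0.
    return -math.log(r)


# Toy distributions (illustrative values only, not from the paper).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

forward_kl = f_divergence(p, q, kl_generator)
reverse_kl = f_divergence(p, q, reverse_kl_generator)
print(forward_kl, reverse_kl)
```

Different choices of the generator f recover different members of the family, which is the general sense in which a single f-divergence framework can cover several distillation objectives.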
