Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD
March 20, 2026
作者: Emiel Hoogeboom, David Ruhe, Jonathan Heek, Thomas Mensink, Tim Salimans
cs.AI
Abstract
It is currently difficult to distill discrete diffusion models. In contrast, the continuous diffusion literature has many distillation methods that can reduce sampling steps to a handful.
Our method, Discrete Moment Matching Distillation (D-MMD), leverages ideas that have been highly successful in the continuous domain. Whereas previous discrete distillation methods collapse, D-MMD maintains high quality and diversity (given sufficient sampling steps). This is demonstrated on both text and image datasets. Moreover, the newly distilled generators can outperform their teachers.