

Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD

March 20, 2026
作者: Emiel Hoogeboom, David Ruhe, Jonathan Heek, Thomas Mensink, Tim Salimans
cs.AI

Abstract

It is currently difficult to distill discrete diffusion models. In contrast, the continuous diffusion literature offers many distillation methods that can reduce sampling steps to a handful. Our method, Discrete Moment Matching Distillation (D-MMD), leverages ideas that have been highly successful in the continuous domain. Whereas previous discrete distillation methods collapse, D-MMD maintains high quality and diversity (given sufficient sampling steps). This is demonstrated on both text and image datasets. Moreover, the newly distilled generators can outperform their teachers.
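The abstract does not spell out the distillation objective, but the title references the maximum mean discrepancy (MMD). As a rough, hypothetical illustration of what an MMD statistic over discrete samples looks like (not the paper's actual training loss), a biased empirical estimator with a delta kernel might be sketched as:

```python
import numpy as np

def mmd_squared(xs, ys):
    """Biased empirical MMD^2 between two sets of discrete samples.

    Uses the delta kernel k(x, y) = 1[x == y], a natural choice for
    tokens. With this kernel, MMD^2 reduces to the squared L2 distance
    between the two empirical distributions. The function name and
    kernel choice are illustrative assumptions, not from the paper.
    """
    xs, ys = np.asarray(xs), np.asarray(ys)
    k_xx = (xs[:, None] == xs[None, :]).mean()  # E[k(x, x')]
    k_yy = (ys[:, None] == ys[None, :]).mean()  # E[k(y, y')]
    k_xy = (xs[:, None] == ys[None, :]).mean()  # E[k(x, y)]
    return k_xx + k_yy - 2.0 * k_xy
```

Identical sample sets give an MMD of zero, and disjoint ones give a positive value, so minimizing such a statistic pushes a student's sample distribution toward the teacher's.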