Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
May 7, 2026
Authors: Tao Liu, Hao Yan, Mengting Chen, Taihang Hu, Zhengrong Yue, Zihao Pan, Jinsong Lan, Xiaoyong Zhu, Ming-Ming Cheng, Bo Zheng, Yaxing Wang
cs.AI
Abstract
Step distillation has become a leading technique for accelerating diffusion models, among which Distribution Matching Distillation (DMD) and Consistency Distillation are two representative paradigms. While consistency methods enforce self-consistency along the full PF-ODE trajectory to steer it toward the clean data manifold, vanilla DMD relies on sparse supervision at a few predefined discrete timesteps. This restricted discrete-time formulation, together with the mode-seeking nature of the reverse KL divergence, tends to produce visual artifacts and over-smoothed outputs, often necessitating complex auxiliary modules -- such as GANs or reward models -- to restore visual fidelity. In this work, we introduce Continuous-Time Distribution Matching (CDM), migrating the DMD framework from discrete anchoring to continuous optimization for the first time. CDM achieves this through two continuous-time designs. First, we replace the fixed discrete schedule with a dynamic continuous schedule of random length, so that distribution matching is enforced at arbitrary points along sampling trajectories rather than only at a few fixed anchors. Second, we propose a continuous-time alignment objective that performs active off-trajectory matching on latents extrapolated via the student's velocity field, improving generalization and preserving fine visual details. Extensive experiments on different architectures, including SD3-Medium and Longcat-Image, demonstrate that CDM provides highly competitive visual fidelity for few-step image generation without relying on complex auxiliary objectives. Code is available at https://github.com/byliutao/cdm.
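The two continuous-time designs can be sketched in code. The snippet below is a minimal illustration, not the paper's implementation: `student_velocity` is a hypothetical stand-in for the learned student network, the random-length schedule draws continuous timesteps uniformly from (0, 1), and the off-trajectory latent is obtained by a rectified-flow-style Euler extrapolation along the student's velocity field (an assumed convention; the paper's exact parameterization may differ).

```python
import numpy as np

rng = np.random.default_rng(0)


def student_velocity(x, t):
    # Hypothetical stand-in for the student's learned velocity field v_theta(x, t).
    return -x * (1.0 - t)


def sample_continuous_schedule(rng, max_steps=4):
    # Dynamic continuous schedule of random length: draw a random number of
    # steps, then random continuous timesteps in (0, 1), sorted from high
    # noise to low noise. Distribution matching can then be enforced at any
    # of these points rather than at a few fixed discrete anchors.
    n = rng.integers(1, max_steps + 1)
    ts = np.sort(rng.uniform(0.0, 1.0, size=n))[::-1]
    return ts


def extrapolate_off_trajectory(x_t, t, s):
    # Off-trajectory latent at time s, extrapolated from x_t with one Euler
    # step along the student's velocity field: x_s = x_t + (s - t) * v(x_t, t).
    # This extrapolated latent is what the continuous-time alignment
    # objective would match against the teacher's distribution.
    return x_t + (s - t) * student_velocity(x_t, t)
```

In a training loop, one would sample a schedule, roll the student sampler along it, pick a random continuous target time `s`, and apply the DMD-style matching loss at the extrapolated latent instead of only at the schedule's endpoints.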