
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation

May 7, 2026
作者: Tao Liu, Hao Yan, Mengting Chen, Taihang Hu, Zhengrong Yue, Zihao Pan, Jinsong Lan, Xiaoyong Zhu, Ming-Ming Cheng, Bo Zheng, Yaxing Wang
cs.AI

Abstract

Step distillation has become a leading technique for accelerating diffusion models, among which Distribution Matching Distillation (DMD) and Consistency Distillation are two representative paradigms. While consistency methods enforce self-consistency along the full PF-ODE trajectory to steer the model toward the clean data manifold, vanilla DMD relies on sparse supervision at a few predefined discrete timesteps. This restricted discrete-time formulation, combined with the mode-seeking nature of the reverse KL divergence, tends to produce visual artifacts and over-smoothed outputs, often necessitating complex auxiliary modules -- such as GANs or reward models -- to restore visual fidelity. In this work, we introduce Continuous-Time Distribution Matching (CDM), which migrates the DMD framework from discrete anchoring to continuous optimization for the first time. CDM achieves this through two continuous-time designs. First, we replace the fixed discrete schedule with a dynamic continuous schedule of random length, so that distribution matching is enforced at arbitrary points along sampling trajectories rather than only at a few fixed anchors. Second, we propose a continuous-time alignment objective that performs active off-trajectory matching on latents extrapolated via the student's velocity field, improving generalization and preserving fine visual details. Extensive experiments on different architectures, including SD3-Medium and Longcat-Image, demonstrate that CDM provides highly competitive visual fidelity for few-step image generation without relying on complex auxiliary objectives. Code is available at https://github.com/byliutao/cdm.
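The two ingredients named in the abstract can be made concrete with a minimal numpy sketch. This is purely illustrative and not the authors' implementation: the function names are hypothetical, and the extrapolation step assumes the standard flow-matching convention in which a velocity field transports a latent between continuous times via an Euler step.

```python
import numpy as np

def sample_continuous_schedule(rng, max_steps=8):
    """Draw a schedule of random length whose timesteps are sampled
    continuously in (0, 1) and sorted from noise (t near 1) toward
    data (t near 0), instead of a few fixed discrete anchors."""
    n = rng.integers(1, max_steps + 1)                  # random length
    return np.sort(rng.uniform(0.0, 1.0, size=n))[::-1]

def extrapolate(x_t, t, s, velocity_fn):
    """One Euler step of the student's velocity field from time t to s:
    x_s = x_t + (s - t) * v(x_t, t)."""
    return x_t + (s - t) * velocity_fn(x_t, t)

rng = np.random.default_rng(0)
schedule = sample_continuous_schedule(rng)
x = rng.standard_normal(4)           # toy latent
v = lambda x, t: -x                  # placeholder velocity field
for t, s in zip(schedule[:-1], schedule[1:]):
    x = extrapolate(x, t, s, v)      # latents off the teacher trajectory
```

In a real distillation loop, the latents produced by `extrapolate` would be the points at which the distribution-matching loss is evaluated; here the velocity field is a stand-in so the control flow can be run end to end.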