Image-Free Timestep Distillation via Continuous-Time Consistency with Trajectory-Sampled Pairs
November 25, 2025
Authors: Bao Tang, Shuai Zhang, Yueting Zhu, Jijun Xiang, Xin Yang, Li Yu, Wenyu Liu, Xinggang Wang
cs.AI
Abstract
Timestep distillation is an effective approach for improving the generation efficiency of diffusion models. The Consistency Model (CM), as a trajectory-based framework, demonstrates significant potential due to its strong theoretical foundation and high-quality few-step generation. Nevertheless, current continuous-time consistency distillation methods still rely heavily on training data and computational resources, which hinders their deployment in resource-constrained scenarios and limits their scalability to diverse domains. To address this issue, we propose the Trajectory-Backward Consistency Model (TBCM), which eliminates the dependence on external training data by extracting latent representations directly from the teacher model's generation trajectory. Unlike conventional methods that require VAE encoding and large-scale datasets, our self-contained distillation paradigm significantly improves both efficiency and simplicity. Moreover, the trajectory-extracted samples naturally bridge the distribution gap between training and inference, thereby enabling more effective knowledge transfer. Empirically, TBCM achieves an FID of 6.52 and a CLIP score of 28.08 on MJHQ-30k under one-step generation, while reducing training time by approximately 40% compared to Sana-Sprint and saving a substantial amount of GPU memory, demonstrating superior efficiency without sacrificing quality. We further reveal the diffusion-generation space discrepancy in continuous-time consistency distillation and analyze how sampling strategies affect distillation performance, offering insights for future distillation research. GitHub Link: https://github.com/hustvl/TBCM.
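To make the image-free idea concrete, below is a minimal, self-contained PyTorch sketch: training pairs are drawn from the teacher's own probability-flow ODE trajectory instead of from VAE-encoded real images. Everything here is an illustrative assumption, not the paper's actual TBCM implementation: ToyDenoiser, the Euler rollout, and the adjacent-pair stop-gradient loss (a discrete surrogate for a continuous-time consistency objective) are hypothetical placeholders.

```python
# Hypothetical sketch of trajectory-extracted consistency distillation.
# Model classes, shapes, schedules, and losses are toy placeholders.
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Stand-in for a diffusion backbone; predicts a velocity v(x_t, t)."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(),
                                 nn.Linear(64, dim))

    def forward(self, x, t):
        t = t.expand(x.shape[0], 1)               # broadcast time to batch
        return self.net(torch.cat([x, t], dim=-1))

@torch.no_grad()
def teacher_trajectory(teacher, x_T, steps=8):
    """Integrate the teacher's ODE with Euler steps from t=1 to t=0 and
    keep every intermediate latent. No real images or VAE encoding are
    needed: training samples come directly from this trajectory."""
    ts = torch.linspace(1.0, 0.0, steps + 1)
    traj, x = [x_T], x_T
    for i in range(steps):
        t, t_next = ts[i], ts[i + 1]
        v = teacher(x, t.view(1, 1))
        x = x + (t_next - t) * v                  # Euler update along the ODE
        traj.append(x)
    return ts, traj

def consistency_pair_loss(student, ts, traj, i):
    """Discrete surrogate of a consistency objective: the student's outputs
    at two adjacent trajectory points should agree; the point closer to
    data provides a stop-gradient target."""
    x_t, x_s = traj[i], traj[i + 1]
    t, s = ts[i].view(1, 1), ts[i + 1].view(1, 1)
    with torch.no_grad():
        target = student(x_s, s)                  # stop-gradient target
    return ((student(x_t, t) - target) ** 2).mean()

# Usage: draw pure noise, roll out the teacher once, then train the student
# on trajectory-extracted pairs -- a self-contained, image-free loop.
teacher, student = ToyDenoiser(), ToyDenoiser()
opt = torch.optim.Adam(student.parameters(), lr=1e-4)
x_T = torch.randn(4, 16)                          # noise in latent space
ts, traj = teacher_trajectory(teacher, x_T)
i = torch.randint(len(traj) - 1, (1,)).item()     # random trajectory segment
loss = consistency_pair_loss(student, ts, traj, i)
opt.zero_grad(); loss.backward(); opt.step()
```

Because the trajectory latents are produced under no_grad and replace both the image dataset and the VAE encoder in the loop, this setup illustrates where the memory and training-time savings described in the abstract would come from.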