Glance: Accelerating Diffusion Models with 1 Sample
December 2, 2025
Authors: Zhuobai Dong, Rui Zhao, Songjie Wu, Junchao Yi, Linjie Li, Zhengyuan Yang, Lijuan Wang, Alex Jinpeng Wang
cs.AI
Abstract
Diffusion models have achieved remarkable success in image generation, yet their deployment remains constrained by heavy computational cost and the need for numerous inference steps. Previous efforts on few-step distillation attempt to skip redundant steps by training compact student models, yet they often suffer from heavy retraining costs and degraded generalization. In this work, we take a different perspective: we accelerate smartly, not evenly, applying smaller speedups to the early semantic stages and larger ones to the later redundant phases. We instantiate this phase-aware strategy with two experts that specialize in the slow and fast denoising phases, respectively. Surprisingly, instead of investing massive effort in retraining student models, we find that simply equipping the base model with lightweight LoRA adapters achieves both efficient acceleration and strong generalization. We refer to these two adapters as Slow-LoRA and Fast-LoRA. In extensive experiments, our method achieves up to 5× acceleration over the base model while maintaining comparable visual quality across diverse benchmarks. Remarkably, the LoRA experts are trained with only 1% of the samples on a single V100 within one hour, yet the resulting models generalize strongly to unseen prompts.
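To make the phase-aware schedule concrete, below is a minimal sketch of how two LoRA experts might be swapped inside a denoising loop: the early semantic phase is sampled densely under Slow-LoRA, the later redundant phase sparsely under Fast-LoRA. The interface is hypothetical and chosen for illustration only; `glance_sample`, the `activated` context manager, and the `phase_split` and stride values are assumptions, not the paper's published API, and the `scheduler.step(...).prev_sample` convention is borrowed from diffusers-style samplers.

```python
import torch

@torch.no_grad()
def glance_sample(base_unet, scheduler, slow_lora, fast_lora,
                  latents, prompt_emb, phase_split=0.3,
                  slow_stride=2, fast_stride=8):
    """Phase-aware sampling sketch: small speedup early, large speedup late.

    All objects and values here are illustrative stand-ins, not the paper's
    actual interface.
    """
    timesteps = scheduler.timesteps          # e.g. descending 999 -> 0
    split = int(len(timesteps) * phase_split)  # boundary between the phases

    # Early semantic phase: Slow-LoRA expert, small skip factor.
    for t in timesteps[:split:slow_stride]:
        with slow_lora.activated(base_unet):   # hypothetical adapter switch
            noise_pred = base_unet(latents, t, prompt_emb)
        latents = scheduler.step(noise_pred, t, latents).prev_sample

    # Late redundant phase: Fast-LoRA expert, large skip factor.
    for t in timesteps[split::fast_stride]:
        with fast_lora.activated(base_unet):
            noise_pred = base_unet(latents, t, prompt_emb)
        latents = scheduler.step(noise_pred, t, latents).prev_sample

    return latents
```

A real implementation would fold the skipped timesteps into the scheduler's step sizes (e.g. by re-deriving the timestep schedule) rather than naively striding over a fixed grid, but the control flow above captures the essence of the slow/fast expert split described in the abstract.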