
WaDi: Weight Direction-aware Distillation for One-step Image Synthesis

March 9, 2026
Authors: Lei Wang, Yang Cheng, Senmao Li, Ge Wu, Yaxing Wang, Jian Yang
cs.AI

Abstract

Despite the impressive performance of diffusion models such as Stable Diffusion (SD) in image generation, their slow inference limits practical deployment. Recent works accelerate inference by distilling multi-step diffusion into one-step generators. To better understand the distillation mechanism, we analyze U-Net/DiT weight changes between one-step students and their multi-step teacher counterparts. Our analysis reveals that changes in weight direction significantly exceed those in weight norm, highlighting direction as the key factor during distillation. Motivated by this insight, we propose the Low-rank Rotation of weight Direction (LoRaD), a parameter-efficient adapter tailored to one-step diffusion distillation. LoRaD is designed to model these structured directional changes using learnable low-rank rotation matrices. We further integrate LoRaD into Variational Score Distillation (VSD), resulting in Weight Direction-aware Distillation (WaDi), a novel one-step distillation framework. WaDi achieves state-of-the-art FID scores on COCO 2014 and COCO 2017 while using only approximately 10% of the trainable parameters of the U-Net/DiT. Furthermore, the distilled one-step model demonstrates strong versatility and scalability, generalizing well to various downstream tasks such as controllable generation, relation inversion, and high-resolution synthesis.
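The direction-versus-norm analysis described in the abstract can be sketched as follows. The concrete metrics here (angle between flattened weight vectors, relative Frobenius-norm change) are assumptions for illustration, since the abstract does not specify how the comparison is computed:

```python
import numpy as np

def weight_change_stats(w_teacher, w_student):
    """Angle (degrees) between flattened weights, and relative norm change."""
    t, s = w_teacher.ravel(), w_student.ravel()
    nt, ns = np.linalg.norm(t), np.linalg.norm(s)
    cos = np.clip(t @ s / (nt * ns), -1.0, 1.0)
    return np.degrees(np.arccos(cos)), abs(ns - nt) / nt

# Toy stand-in for a teacher layer and its distilled one-step student:
rng = np.random.default_rng(0)
w_t = rng.standard_normal((64, 64))
w_s = w_t + 0.1 * rng.standard_normal((64, 64))  # small perturbation

angle, rel_norm = weight_change_stats(w_t, w_s)
# In high dimensions a random perturbation mostly rotates the weight
# vector, so `angle` is clearly nonzero while `rel_norm` stays small.
```

Even this toy setup shows the asymmetry the paper reports: a generic update to a high-dimensional weight tensor is dominated by directional change, with the norm nearly unchanged.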
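The abstract does not give LoRaD's exact parameterization. One standard way to realize a learnable low-rank rotation, shown here purely as a hypothetical sketch, is the Cayley transform of a low-rank skew-symmetric matrix; all names and dimensions below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def low_rank_rotation(u, v):
    """Orthogonal matrix from rank-r factors via the Cayley transform.

    A = u v^T - v u^T is skew-symmetric (rank <= 2r), and for any
    skew-symmetric A, R = (I - A)(I + A)^{-1} is orthogonal.
    """
    a = u @ v.T - v @ u.T
    i = np.eye(u.shape[0])
    return (i - a) @ np.linalg.inv(i + a)

rng = np.random.default_rng(0)
d, r = 32, 4                           # hypothetical layer size and rank
u = 0.1 * rng.standard_normal((d, r))  # learnable low-rank factors
v = 0.1 * rng.standard_normal((d, r))
R = low_rank_rotation(u, v)

w = rng.standard_normal((d, d))        # frozen base weight
w_adapted = R @ w                      # rotated: direction changes,
                                       # Frobenius norm is preserved
```

Left-multiplying by an orthogonal `R` changes the weight's direction while leaving its Frobenius norm untouched, which matches the behavior the abstract's analysis motivates, and only 2·d·r parameters are trained instead of d².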