Distilling Diversity and Control in Diffusion Models
March 13, 2025
Authors: Rohit Gandikota, David Bau
cs.AI
Abstract
Distilled diffusion models suffer from a critical limitation: reduced sample
diversity compared to their base counterparts. In this work, we uncover that
despite this diversity loss, distilled models retain the fundamental concept
representations of base models. We demonstrate control distillation - where
control mechanisms like Concept Sliders and LoRAs trained on base models can be
seamlessly transferred to distilled models and vice versa, effectively
distilling control without any retraining. This preservation of
representational structure prompted our investigation into the mechanisms of
diversity collapse during distillation. To understand how distillation affects
diversity, we introduce Diffusion Target (DT) Visualization, an analysis and
debugging tool that reveals how models predict final outputs at intermediate
steps. Through DT-Visualization, we identify generation artifacts and
inconsistencies, and we demonstrate that the initial diffusion timesteps
disproportionately determine output diversity, while later steps primarily
refine details. Based on these insights, we introduce diversity distillation -
a hybrid inference approach that strategically employs the base model for only
the first critical timestep before transitioning to the efficient distilled
model. Our experiments demonstrate that this simple modification not only
restores the diversity of the base model in the distilled model but
surprisingly exceeds it, while maintaining nearly the same computational
efficiency as distilled inference, all without requiring additional training
or model modifications. Our code and data are available at
https://distillation.baulab.info
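
To make these two ideas concrete, here is a minimal sketch assuming hypothetical `base_model` and `distilled_model` denoisers that map (x_t, t, prompt_emb) to a noise estimate, plus a diffusers-style `scheduler`; these names are illustrative and are not the authors' released code. `predict_x0` shows the standard DDPM-style final-output prediction that DT-Visualization inspects at intermediate steps, and `diversity_distillation` shows the hybrid loop that routes only the first timestep to the base model.

```python
import torch

def predict_x0(x_t, eps, alpha_bar_t):
    """DT-Visualization-style preview: the final image implied by the noise
    estimate at an intermediate step, from the standard DDPM relation
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    return (x_t - (1.0 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5

@torch.no_grad()
def diversity_distillation(base_model, distilled_model, scheduler, prompt_emb,
                           shape=(1, 4, 64, 64), device="cuda"):
    """Hybrid inference sketch: the base model handles only the first,
    diversity-critical timestep; the distilled model refines the rest."""
    x = torch.randn(shape, device=device)
    for i, t in enumerate(scheduler.timesteps):  # few-step schedule assumed
        model = base_model if i == 0 else distilled_model
        eps = model(x, t, prompt_emb)            # assumed denoiser interface
        x = scheduler.step(eps, t, x).prev_sample  # diffusers-style step API
    return x
```

Because only a single step is routed to the base model, the overhead relative to pure distilled inference in this sketch is one extra denoiser call, which is consistent with the abstract's claim of nearly unchanged efficiency.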