Repulsive Score Distillation for Diverse Sampling of Diffusion Models
June 24, 2024
Authors: Nicolas Zilberstein, Morteza Mardani, Santiago Segarra
cs.AI
Abstract
Score distillation sampling has been pivotal for integrating diffusion models
into the generation of complex visuals. Despite impressive results, it suffers
from mode collapse and a lack of diversity. To address this challenge, we leverage
the gradient flow interpretation of score distillation to propose Repulsive
Score Distillation (RSD). In particular, we propose a variational framework
based on repulsion among an ensemble of particles, which promotes diversity. Using a
variational approximation that incorporates a coupling among particles, the
repulsion appears as a simple regularization that lets particles interact
based on their relative pairwise similarity, measured, e.g., via
radial basis kernels. We design RSD for both unconstrained and constrained
sampling scenarios. For constrained sampling, we focus on inverse problems in
the latent space, which leads to an augmented variational formulation that
strikes a good balance between compute, quality, and diversity. Our extensive
experiments on text-to-image generation and inverse problems demonstrate that
RSD achieves a superior trade-off between diversity and quality compared with
state-of-the-art alternatives.
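The abstract describes repulsion as a regularization driven by pairwise similarity between particles, measured for example with a radial basis kernel. The following is a minimal sketch of that idea only, not the authors' implementation: it uses an SVGD-style kernel-gradient repulsion as a stand-in for the paper's variational formulation, and every name in it (`rbf_kernel`, `repulsion_direction`, `repulsive_step`, `score_fn`, `bandwidth`, `repulsion_weight`) is a hypothetical choice made for illustration.

```python
import numpy as np


def rbf_kernel(particles, bandwidth):
    """Return the RBF kernel matrix and pairwise differences for (n, d) particles."""
    # diffs[i, j] = x_i - x_j, shape (n, n, d)
    diffs = particles[:, None, :] - particles[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return kernel, diffs


def repulsion_direction(particles, bandwidth):
    """Direction that pushes each particle away from similar (nearby) particles.

    For particle i this is sum_j k(x_i, x_j) * (x_i - x_j) / bandwidth**2,
    i.e. the negative gradient of sum_j k(x_i, x_j) with respect to x_i.
    """
    kernel, diffs = rbf_kernel(particles, bandwidth)
    return np.einsum("ij,ijd->id", kernel, diffs) / bandwidth ** 2


def repulsive_step(particles, score_fn, step_size, repulsion_weight, bandwidth):
    """One update: follow the score (assumed callable) plus a weighted repulsion."""
    drift = score_fn(particles)  # assumption: returns an array shaped like particles
    repulsion = repulsion_direction(particles, bandwidth)
    return particles + step_size * (drift + repulsion_weight * repulsion)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(scale=0.1, size=(4, 2))  # four nearly collapsed particles
    toy_score = lambda p: -p                # score of a standard Gaussian, for illustration
    for _ in range(50):
        x = repulsive_step(x, toy_score, step_size=0.05,
                           repulsion_weight=1.0, bandwidth=0.5)
    print(x)
```

Setting `repulsion_weight` to zero in this sketch recovers independent score-following particles, which is the regime where collapse onto a single mode would be expected; the weight and bandwidth then act as knobs for trading sample quality against spread.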