Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
April 1, 2024
Authors: Kangfu Mei, Zhengzhong Tu, Mauricio Delbracio, Hossein Talebi, Vishal M. Patel, Peyman Milanfar
cs.AI
Abstract
We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their sampling efficiency. While improved network architectures and inference algorithms have been shown to effectively boost the sampling efficiency of diffusion models, the role of model size -- a critical determinant of sampling efficiency -- has not been thoroughly examined. Through empirical analysis of established text-to-image diffusion models, we conduct an in-depth investigation into how model size influences sampling efficiency across varying sampling steps. Our findings unveil a surprising trend: when operating under a given inference budget, smaller models frequently outperform their larger equivalents in generating high-quality results. Moreover, we extend our study to demonstrate the generalizability of these findings by applying various diffusion samplers, exploring diverse downstream tasks, evaluating post-distilled models, as well as comparing performance relative to training compute. These findings open up new pathways for the development of LDM scaling strategies which can be employed to enhance generative capabilities within limited inference budgets.