Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
March 25, 2025
Authors: Jaihoon Kim, Taehoon Yoon, Jisung Hwang, Minhyuk Sung
cs.AI
Abstract
We propose an inference-time scaling approach for pretrained flow models.
Recently, inference-time scaling has gained significant attention in LLMs and
diffusion models, improving sample quality or better aligning outputs with user
preferences by leveraging additional computation. For diffusion models,
particle sampling has allowed more efficient scaling due to the stochasticity
at intermediate denoising steps. On the contrary, while flow models have gained
popularity as an alternative to diffusion models--offering faster generation
and high-quality outputs in state-of-the-art image and video generative
models--efficient inference-time scaling methods used for diffusion models
cannot be directly applied due to their deterministic generative process. To
enable efficient inference-time scaling for flow models, we propose three key
ideas: 1) SDE-based generation, enabling particle sampling in flow models, 2)
Interpolant conversion, broadening the search space and enhancing sample
diversity, and 3) Rollover Budget Forcing (RBF), an adaptive allocation of
computational resources across timesteps to maximize budget utilization. Our
experiments show that SDE-based generation, particularly variance-preserving
(VP) interpolant-based generation, improves the performance of particle
sampling methods for inference-time scaling in flow models. Additionally, we
demonstrate that RBF with VP-SDE achieves the best performance, outperforming
all previous inference-time scaling approaches.
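To make the first key idea more concrete, below is a minimal, hypothetical sketch (not the authors' code) of SDE-based generation from a pretrained flow model: the deterministic velocity field is re-expressed as an SDE with matching marginals so that noise is injected at intermediate steps, which is what enables particle sampling. The sketch assumes a rectified-flow style linear interpolant x_t = (1 - t) * x0 + t * x1 with x0 ~ N(0, I), under which the score can be recovered from the velocity as s_t(x) = (t * v_t(x) - x) / (1 - t); the `velocity_model` argument and the diffusion schedule `sigma` are placeholders. Note that the abstract's best-performing variant additionally converts to a variance-preserving (VP) interpolant, which is not shown here.

```python
# Hypothetical sketch of SDE-based generation from a pretrained flow model.
# Assumptions (not stated in the abstract): linear interpolant
#   x_t = (1 - t) * x0 + t * x1,  x0 ~ N(0, I) noise, x1 data,
# so the marginal score is s_t(x) = (t * v_t(x) - x) / (1 - t).
# `velocity_model(x, t)` stands in for any pretrained flow network.
import torch


def sde_sample(velocity_model, shape, num_steps=50, sigma=0.5, device="cpu"):
    """Euler-Maruyama integration of dx = [v + 0.5 * g_t^2 * s] dt + g_t dW."""
    x = torch.randn(shape, device=device)  # start from pure noise at t = 0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_model(x, t)  # pretrained flow velocity v_t(x)
        # Score recovered from the velocity under the assumed linear interpolant.
        score = (t * v - x) / max(1.0 - t, 1e-3)
        g = sigma * (1.0 - t)  # one possible diffusion schedule (assumption)
        drift = v + 0.5 * g ** 2 * score
        noise = torch.randn_like(x) if i < num_steps - 1 else torch.zeros_like(x)
        x = x + drift * dt + g * (dt ** 0.5) * noise
    return x


# Toy usage with a dummy velocity field (pulls samples toward the origin):
if __name__ == "__main__":
    dummy_v = lambda x, t: -x
    samples = sde_sample(dummy_v, shape=(4, 2), num_steps=100)
    print(samples)
```

Because each step now draws fresh Gaussian noise, several candidate trajectories can be propagated in parallel and periodically pruned or resampled against a reward; the paper's Rollover Budget Forcing then decides adaptively how many such evaluations to spend at each timestep, a mechanism whose details are not specified in this abstract.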