Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Abstract

We propose an inference-time scaling approach for pretrained flow models. Inference-time scaling has recently gained significant attention in large language models (LLMs) and diffusion models, where additional computation is used to improve sample quality or to better align outputs with user preferences. For diffusion models, particle sampling has enabled more efficient scaling because of the stochasticity at intermediate denoising steps. In contrast, flow models have gained popularity as an alternative to diffusion models, offering faster generation and high-quality outputs in state-of-the-art image and video generative models, yet the efficient inference-time scaling methods used for diffusion models cannot be applied to them directly because their generative process is deterministic. To enable efficient inference-time scaling for flow models, we propose three key ideas: 1) SDE-based generation, which enables particle sampling in flow models; 2) interpolant conversion, which broadens the search space and enhances sample diversity; and 3) Rollover Budget Forcing (RBF), which adaptively allocates computational resources across timesteps to maximize budget utilization. Our experiments show that SDE-based generation, particularly variance-preserving (VP) interpolant-based generation, improves the performance of particle sampling methods for inference-time scaling in flow models. Additionally, we demonstrate that RBF with VP-SDE achieves the best performance, outperforming all previous inference-time scaling approaches.
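
The abstract describes RBF only as an adaptive allocation of compute across timesteps, so the following is a toy sketch of one way a rollover scheme could work, not the authors' algorithm. It assumes each timestep gets a fixed quota of candidate draws, that drawing stops as soon as a candidate beats the current reward, and that unused draws carry over to the next timestep. The names `rollover_budget_search`, `propose`, `reward`, and `quota_per_step` are hypothetical, and the scalar state stands in for a real sample.

```python
import random


def rollover_budget_search(num_steps, quota_per_step, propose, reward, x0, seed=0):
    """Toy rollover allocation (assumed behaviour, not the paper's exact method).

    Each step may spend `quota_per_step` candidate draws plus whatever earlier
    steps left unused. Drawing stops early once a candidate improves the reward.
    """
    rng = random.Random(seed)
    x, best = x0, reward(x0)
    carry = 0  # unused draws rolled over from earlier steps
    used_per_step = []
    for step in range(num_steps):
        budget = quota_per_step + carry
        used = 0
        best_cand, best_cand_r = None, float("-inf")
        while used < budget:
            cand = propose(x, step, rng)  # one stochastic candidate for this step
            used += 1
            r = reward(cand)
            if r > best_cand_r:
                best_cand, best_cand_r = cand, r
            if r > best:  # improvement found: stop and save the remaining draws
                break
        # Move on with the best candidate seen, even if it did not improve.
        x, best = best_cand, best_cand_r
        carry = budget - used
        used_per_step.append(used)
    return x, best, used_per_step


if __name__ == "__main__":
    # Hypothetical 1-D stand-ins for a stochastic sampler step and a reward model.
    step_fn = lambda x, step, rng: x + rng.gauss(0.0, 1.0)
    score = lambda x: -(x - 3.0) ** 2
    x, best, used = rollover_budget_search(10, 4, step_fn, score, 0.0)
    print(x, best, used, sum(used))
```

By construction, the total number of draws never exceeds `num_steps * quota_per_step`. Steps that find an improvement quickly leave budget for later steps that need more attempts.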
