Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Abstract

We propose an inference-time scaling approach for pretrained flow models. Inference-time scaling has recently gained significant attention in large language models (LLMs) and diffusion models, where additional computation is used to improve sample quality or to better align outputs with user preferences. For diffusion models, particle sampling has enabled more efficient scaling because the intermediate denoising steps are stochastic. Flow models have gained popularity as an alternative to diffusion models, offering faster generation and high-quality outputs in state-of-the-art image and video generative models. However, the efficient inference-time scaling methods used for diffusion models cannot be applied to them directly, because their generative process is deterministic. To enable efficient inference-time scaling for flow models, we propose three key ideas: 1) SDE-based generation, which enables particle sampling in flow models; 2) interpolant conversion, which broadens the search space and enhances sample diversity; and 3) Rollover Budget Forcing (RBF), which adaptively allocates computational resources across timesteps to maximize budget utilization. Our experiments show that SDE-based generation, particularly variance-preserving (VP) interpolant-based generation, improves the performance of particle sampling methods for inference-time scaling in flow models. Additionally, we demonstrate that RBF with VP-SDE achieves the best performance, outperforming all previous inference-time scaling approaches.
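To make the idea of rolling unused budget over across timesteps concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the abstract only says that RBF allocates compute adaptively across timesteps, so the acceptance rule (advance as soon as a candidate beats the current reward), the scalar state, and the names `toy_reward`, `toy_propose`, and `rollover_budget_search` are all assumptions made for illustration.

```python
import random


def toy_reward(x):
    # Hypothetical reward: higher is better, maximal at x = 0.
    return -abs(x)


def toy_propose(x, t, rng):
    # Hypothetical stochastic step: perturb the state with Gaussian noise.
    return x + rng.gauss(0.0, 1.0)


def rollover_budget_search(propose, reward, x0, num_steps, quota, seed=0):
    """Toy rollover budget loop (illustrative assumption, not the paper's method).

    Each step may spend `quota` evaluations plus whatever earlier steps left
    unused. A step ends as soon as a candidate beats the current reward, and
    the unspent evaluations are carried over to the next step.
    """
    if quota < 1:
        raise ValueError("quota must be at least 1")
    rng = random.Random(seed)
    x, current = x0, reward(x0)
    rollover = 0
    used_per_step = []
    for t in range(num_steps):
        budget = quota + rollover
        used = 0
        best_cand, best_r = None, float("-inf")
        while used < budget:
            cand = propose(x, t, rng)
            used += 1
            r = reward(cand)
            if r > best_r:
                best_cand, best_r = cand, r
            if r > current:
                break  # improvement found: stop early and save the rest
        # Always advance, using the best candidate seen at this step.
        x, current = best_cand, best_r
        rollover = budget - used
        used_per_step.append(used)
    return x, current, used_per_step


if __name__ == "__main__":
    x, r, used = rollover_budget_search(
        toy_propose, toy_reward, x0=5.0, num_steps=10, quota=4
    )
    print("final state:", x, "reward:", r)
    print("evaluations per step:", used, "total:", sum(used))
```

In this sketch the total number of evaluations never exceeds `num_steps * quota`, while steps where an improvement is hard to find can spend more than `quota` by drawing on the budget saved earlier.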
