Stable Velocity: A Variance Perspective on Flow Matching

Abstract

While flow matching is elegant, its reliance on single-sample conditional velocities leads to high-variance training targets that destabilize optimization and slow convergence. By explicitly characterizing this variance, we identify 1) a high-variance regime near the prior, where optimization is challenging, and 2) a low-variance regime near the data distribution, where conditional and marginal velocities nearly coincide. Leveraging this insight, we propose Stable Velocity, a unified framework that improves both training and sampling. For training, we introduce Stable Velocity Matching (StableVM), an unbiased variance-reduction objective, along with Variance-Aware Representation Alignment (VA-REPA), which adaptively strengthens auxiliary supervision in the low-variance regime. For inference, we show that dynamics in the low-variance regime admit closed-form simplifications, enabling Stable Velocity Sampling (StableVS), a fine-tuning-free acceleration. Extensive experiments on ImageNet 256×256 and large pretrained text-to-image and text-to-video models, including SD3.5, Flux, Qwen-Image, and Wan2.2, demonstrate consistent improvements in training efficiency and more than 2× faster sampling within the low-variance regime without degrading sample quality. Our code is available at https://github.com/linYDTHU/StableVelocity.
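The two variance regimes can be illustrated with a minimal toy sketch. It is not taken from the paper or its repository: it assumes a linear interpolation path x_t = (1 - t) x_0 + t x_1 with x_0 drawn from a standard Gaussian prior at t = 0 and x_1 drawn uniformly from a small dataset at t = 1 (the paper's time convention may differ). The function names `toy_data`, `velocity_variance`, and `average_variance` are hypothetical. Under these assumptions, the conditional velocity for a candidate data point x_i is (x_i - x_t) / (1 - t), and its variance under the posterior over data points given x_t measures how noisy a single-sample training target is.

```python
import math
import random


def toy_data(n_points=8, radius=4.0):
    """Hypothetical toy dataset: points evenly spaced on a circle in 2D."""
    return [
        (radius * math.cos(2 * math.pi * k / n_points),
         radius * math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)
    ]


def velocity_variance(x_t, t, data):
    """Trace of Var[u | x_t] for the assumed path x_t = (1 - t) * x0 + t * x1."""
    # Posterior over data points: p(x_i | x_t) is proportional to
    # a Gaussian in x_t with mean t * x_i and standard deviation (1 - t).
    scale = 2.0 * (1.0 - t) ** 2
    log_w = [
        -((x_t[0] - t * x[0]) ** 2 + (x_t[1] - t * x[1]) ** 2) / scale
        for x in data
    ]
    shift = max(log_w)  # subtract the max for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]

    # Conditional velocity for candidate x_i: u_i = x_i - x0 = (x_i - x_t) / (1 - t).
    u = [((x[0] - x_t[0]) / (1.0 - t), (x[1] - x_t[1]) / (1.0 - t)) for x in data]

    # The posterior mean of u_i is the marginal velocity at x_t.
    mean = (
        sum(wi * ui[0] for wi, ui in zip(w, u)),
        sum(wi * ui[1] for wi, ui in zip(w, u)),
    )
    return sum(
        wi * ((ui[0] - mean[0]) ** 2 + (ui[1] - mean[1]) ** 2)
        for wi, ui in zip(w, u)
    )


def average_variance(t, data, n_samples=2000, seed=0):
    """Monte Carlo average of the conditional-velocity variance over x_t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x1 = rng.choice(data)
        x0 = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        x_t = ((1 - t) * x0[0] + t * x1[0], (1 - t) * x0[1] + t * x1[1])
        total += velocity_variance(x_t, t, data)
    return total / n_samples


if __name__ == "__main__":
    data = toy_data()
    for t in (0.05, 0.5, 0.95):
        print(f"t = {t:.2f}: average variance = {average_variance(t, data):.6f}")
```

In this toy setting the variance is large for t close to 0 (near the prior, where many data points remain plausible given x_t) and collapses toward zero for t close to 1 (near the data, where one data point dominates the posterior), which is the sense in which conditional and marginal velocities nearly coincide.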
