

Transition Models: Rethinking the Generative Learning Objective

September 4, 2025
Authors: Zidong Wang, Yiyuan Zhang, Xiaoyu Yue, Xiangyu Yue, Yangguang Li, Wanli Ouyang, Lei Bai
cs.AI

Abstract

A fundamental dilemma in generative modeling persists: iterative diffusion models achieve outstanding fidelity, but at a significant computational cost, while efficient few-step alternatives are constrained by a hard quality ceiling. This conflict between generation steps and output quality arises from restrictive training objectives that focus exclusively on either infinitesimal dynamics (PF-ODEs) or direct endpoint prediction. We address this challenge by introducing an exact, continuous-time dynamics equation that analytically defines state transitions across any finite time interval. This leads to a novel generative paradigm, Transition Models (TiM), which adapt to arbitrary-step transitions, seamlessly traversing the generative trajectory from single leaps to fine-grained refinement with more steps. Despite having only 865M parameters, TiM achieves state-of-the-art performance, surpassing leading models such as SD3.5 (8B parameters) and FLUX.1 (12B parameters) across all evaluated step counts. Importantly, unlike previous few-step generators, TiM demonstrates monotonic quality improvement as the sampling budget increases. Additionally, when employing our native-resolution strategy, TiM delivers exceptional fidelity at resolutions up to 4096x4096.
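The core idea of adapting to arbitrary-step transitions can be sketched as a sampler that repeatedly applies a learned finite-interval transition map. The following is a minimal illustration, not TiM's actual implementation: the `transition(x, t, s)` interface, the time schedule, and the toy linear transition are all hypothetical stand-ins for the paper's learned model.

```python
import numpy as np

def sample_with_transitions(transition, x_T, timesteps):
    """Generic arbitrary-step sampler: apply a finite-interval
    transition x_s = transition(x_t, t, s) along a chosen schedule.
    With two timesteps this is a single leap; with more, it becomes
    fine-grained multi-step refinement. The `transition` callable is
    a placeholder, not TiM's real API."""
    x = x_T
    for t, s in zip(timesteps[:-1], timesteps[1:]):
        x = transition(x, t, s)
    return x

# Toy stand-in transition: linearly shrink the state toward zero as
# time goes from t to s. Purely illustrative, not the learned model.
def toy_transition(x, t, s):
    return x * (s / t)

noise = np.ones(4)
one_step = sample_with_transitions(toy_transition, noise, [1.0, 0.0])
many_step = sample_with_transitions(toy_transition, noise, [1.0, 0.5, 0.25, 0.0])
```

In this toy setting the single leap and the multi-step schedule reach the same endpoint; the paper's claim is that a learned transition model trained on such finite intervals lets quality improve monotonically as more steps are used.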