Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers
March 11, 2026
Authors: Wenhao Sun, Ji Li, Zhaoqiang Liu
cs.AI
Abstract
Diffusion Transformers have established a new state of the art in image synthesis, but the high computational cost of iterative sampling severely hampers their practical deployment. Existing acceleration methods focus largely on the temporal domain and overlook the substantial spatial redundancy inherent in the generative process, where global structures emerge long before fine-grained details are formed. Treating all spatial regions with uniform computation is therefore a critical inefficiency. In this paper, we introduce Just-in-Time (JiT), a training-free framework that addresses this challenge through acceleration in the spatial domain. JiT formulates a spatially approximated generative ordinary differential equation (ODE) that drives the evolution of the full latent state from computations on a dynamically selected, sparse subset of anchor tokens. To ensure seamless transitions as new tokens are incorporated and the latent state expands, we propose a deterministic micro-flow, a simple and effective finite-time ODE that maintains both structural coherence and statistical correctness. Extensive experiments on the state-of-the-art FLUX.1-dev model demonstrate that JiT achieves up to a 7x speedup with nearly lossless performance, significantly outperforming existing acceleration methods and establishing a superior trade-off between inference speed and generation fidelity.
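To make the two mechanisms in the abstract concrete, here is a minimal PyTorch sketch of what a spatially sparse ODE step and a finite-time micro-flow could look like. Everything below is an illustrative assumption: the names (`select_anchors`, `jit_euler_step`, `micro_flow`), the norm-based anchor selection, the velocity-caching rule for non-anchor tokens, and the `model(tokens, t)` interface are ours, not the authors' implementation.

```python
import torch


def select_anchors(x, keep_ratio):
    """Hypothetical anchor selection: keep the top-k tokens by feature
    norm as a cheap saliency proxy. The abstract only says anchors are
    'dynamically selected'; the actual JiT selection rule may differ."""
    _, n_tokens, _ = x.shape
    k = max(1, int(keep_ratio * n_tokens))
    scores = x.norm(dim=-1)                     # (batch, n_tokens)
    return scores.topk(k, dim=1).indices        # (batch, k)


@torch.no_grad()
def jit_euler_step(model, x, v_prev, t, dt, keep_ratio=0.25):
    """One spatially approximated Euler step: the velocity field is
    recomputed only on a sparse set of anchor tokens, while the
    remaining tokens reuse the cached velocity from the previous step.
    Reusing stale velocities is one plausible way for sparse anchor
    computation to drive the full latent state; the paper's exact
    propagation rule is not stated in the abstract."""
    d = x.shape[-1]
    idx = select_anchors(x, keep_ratio)                   # (batch, k)
    gather_idx = idx.unsqueeze(-1).expand(-1, -1, d)      # (batch, k, d)
    anchors = torch.gather(x, 1, gather_idx)              # anchor tokens only
    v_anchor = model(anchors, t)                          # assumed model(tokens, t) -> velocity
    v_full = v_prev.clone()
    v_full.scatter_(1, gather_idx, v_anchor)              # splice fresh velocities in
    return x + dt * v_full, v_full


def micro_flow(x_new, x_ref, n_steps=4):
    """Hypothetical deterministic micro-flow: Euler integration of the
    finite-time linear ODE dx/ds = x_ref - x over s in [0, 1], carrying
    freshly added tokens from their initialization toward values
    consistent with the existing latent state."""
    for _ in range(n_steps):
        x_new = x_new + (1.0 / n_steps) * (x_ref - x_new)
    return x_new
```

In a sampler built on this sketch, `jit_euler_step` would replace the dense Euler update at each timestep, and `micro_flow` would run for a few sub-steps whenever new tokens are added to the latent state, so the expanded state stays coherent with the tokens already denoised.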