Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers
March 11, 2026
Authors: Wenhao Sun, Ji Li, Zhaoqiang Liu
cs.AI
Abstract
Diffusion Transformers have established a new state of the art in image synthesis, but the high computational cost of iterative sampling severely hampers their practical deployment. Existing acceleration methods focus largely on the temporal domain and overlook the substantial spatial redundancy inherent in the generative process, where global structures emerge long before fine-grained details are formed; treating all spatial regions with uniform computation is therefore a critical inefficiency. In this paper, we introduce Just-in-Time (JiT), a training-free framework that addresses this challenge by accelerating in the spatial domain. JiT formulates a spatially approximated generative ordinary differential equation (ODE) that drives the evolution of the full latent state using computations from a dynamically selected, sparse subset of anchor tokens. To ensure seamless transitions as new tokens are incorporated and the dimensions of the latent state expand, we propose a deterministic micro-flow, a simple and effective finite-time ODE that maintains both structural coherence and statistical correctness. Extensive experiments on the state-of-the-art FLUX.1-dev model demonstrate that JiT achieves up to a 7x speedup with nearly lossless quality, significantly outperforming existing acceleration methods and establishing a superior trade-off between inference speed and generation fidelity.
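To make the core idea concrete, below is a minimal sketch of one Euler step of a spatially sparse flow ODE in the spirit of JiT: the expensive velocity network is evaluated only on a dynamically selected subset of anchor tokens, while the remaining tokens are advanced with cached velocities, so the full latent state still evolves every step. This is an illustration under stated assumptions, not the paper's actual algorithm: the function name `jit_euler_step`, the velocity-magnitude saliency score, and the `keep_ratio` parameter are all hypothetical, and the sketch omits how a transformer attends over the reduced token set as well as the deterministic micro-flow used when new tokens are re-incorporated.

```python
import torch

def jit_euler_step(x, t, dt, velocity_fn, keep_ratio=0.15, v_prev=None):
    """One Euler step of a spatially sparse flow ODE (illustrative sketch).

    x: (N, D) latent tokens.
    velocity_fn(tokens, t) -> per-token velocities of the same shape.
    Anchors are chosen here by the magnitude of their cached velocity,
    a stand-in heuristic for the paper's dynamic anchor selection.
    """
    n_tokens = x.shape[0]
    n_anchor = max(1, int(keep_ratio * n_tokens))

    if v_prev is None:
        # First step: no cache yet, so evaluate the model on all tokens once.
        v = velocity_fn(x, t)
    else:
        scores = v_prev.norm(dim=-1)                # saliency heuristic (assumed)
        anchor_idx = scores.topk(n_anchor).indices  # sparse anchor subset
        v = v_prev.clone()                          # non-anchors reuse cached velocity
        v[anchor_idx] = velocity_fn(x[anchor_idx], t)  # refresh anchors only

    x_next = x + dt * v  # the *full* latent state is advanced every step
    return x_next, v


if __name__ == "__main__":
    # Toy demo with a linear layer standing in for the DiT velocity field.
    torch.manual_seed(0)
    D = 64
    x = torch.randn(256, D)  # 256 latent tokens
    dummy_model = torch.nn.Linear(D, D)

    def velocity_fn(tokens, t):
        return dummy_model(tokens)

    v_cache, steps = None, 8
    with torch.no_grad():
        for i in range(steps):
            x, v_cache = jit_euler_step(
                x, i / steps, 1.0 / steps, velocity_fn, v_prev=v_cache
            )
```

The design point the sketch captures is that sparsity lives in the model evaluation, not in the state: every token's latent is updated at every step, but only the anchor subset pays the transformer's cost, which is where the reported speedup would come from.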