WorldCache: Accelerating World Models for Free via Heterogeneous Token Caching
March 6, 2026
Authors: Weilun Feng, Guoxin Fan, Haotong Qin, Chuanguang Yang, Mingqiang Wu, Yuqi Li, Xiangqi Li, Zhulin An, Libo Huang, Dingrui Wang, Longlong Liao, Michele Magno, Yongjun Xu
cs.AI
Abstract
Diffusion-based world models have shown strong potential for unified world simulation, but iterative denoising remains too costly for interactive use and long-horizon rollouts. While feature caching can accelerate inference without training, we find that policies designed for single-modal diffusion transfer poorly to world models because of two world-model-specific obstacles: token heterogeneity arising from multi-modal coupling and spatial variation, and non-uniform temporal dynamics in which a small set of hard tokens drives error growth, making uniform skipping either unstable or overly conservative. We propose WorldCache, a caching framework tailored to diffusion world models. We introduce Curvature-guided Heterogeneous Token Prediction, which uses a physics-grounded curvature score to estimate token predictability and applies a Hermite-guided damped predictor to chaotic tokens with abrupt direction changes. We also design Chaotic-prioritized Adaptive Skipping, which accumulates a curvature-normalized, dimensionless drift signal and recomputes only when bottleneck tokens begin to drift. Experiments on diffusion world models show that WorldCache delivers up to 3.7× end-to-end speedup while maintaining 98% rollout quality, demonstrating its advantages and practicality in resource-constrained scenarios. Our code is released at https://github.com/FofGofx/WorldCache.
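The two ideas named in the abstract, a per-token curvature score and a drift-triggered recompute, can be illustrated with a minimal sketch. This is not the released WorldCache implementation: the function names, the finite-difference curvature proxy, the threshold, the `1 / (1 + kappa)` damping (a simple stand-in for the Hermite-guided predictor, which the abstract does not specify), and the `budget` and `top_frac` parameters are all assumptions made for illustration. Features are assumed to be arrays of shape `(num_tokens, dim)` cached from three consecutive denoising steps.

```python
import numpy as np


def curvature_score(f_prev2, f_prev1, f_curr, eps=1e-8):
    """Hypothetical curvature proxy per token.

    Ratio of the second finite difference (change in velocity) to the
    latest first difference (velocity). The ratio is dimensionless.
    """
    v_prev = f_prev1 - f_prev2          # older velocity
    v_curr = f_curr - f_prev1           # most recent velocity
    accel = v_curr - v_prev             # change in velocity
    return np.linalg.norm(accel, axis=-1) / (np.linalg.norm(v_curr, axis=-1) + eps)


def predict_tokens(f_prev1, f_curr, kappa, threshold=0.5):
    """Extrapolate cached features one step ahead.

    Tokens with kappa <= threshold get plain linear extrapolation.
    Tokens above it ("chaotic" in this sketch) get a damped step;
    the damping rule is an assumption, not the paper's predictor.
    """
    v = f_curr - f_prev1
    damping = np.where(kappa > threshold, 1.0 / (1.0 + kappa), 1.0)
    return f_curr + damping[:, None] * v


class DriftGate:
    """Accumulates drift from the highest-curvature tokens and signals
    a full recompute once the (assumed) budget is exceeded."""

    def __init__(self, budget=1.0, top_frac=0.1):
        self.budget = budget
        self.top_frac = top_frac
        self.acc = 0.0

    def should_recompute(self, kappa):
        k = max(1, int(len(kappa) * self.top_frac))
        # Only the k most chaotic tokens contribute to the drift signal.
        self.acc += float(np.sort(kappa)[-k:].mean())
        if self.acc >= self.budget:
            self.acc = 0.0              # reset after triggering a recompute
            return True
        return False
```

In a sampling loop, one would call `curvature_score` on the last three cached feature maps, ask `DriftGate.should_recompute` whether to run the full model, and otherwise reuse `predict_tokens` as the cached output for that step.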