An Empirical Study of World Model Quantization
February 2, 2026
Authors: Zhongqian Fu, Tianyi Zhao, Kai Han, Hang Zhou, Xinghao Chen, Yunhe Wang
cs.AI
Abstract
World models learn an internal representation of environment dynamics, enabling agents to simulate and reason about future states within a compact latent space for tasks such as planning, prediction, and inference. However, running world models incurs heavy computational cost and a large memory footprint, making model quantization essential for efficient deployment. To date, the effects of post-training quantization (PTQ) on world models remain largely unexamined. In this work, we present a systematic empirical study of world model quantization using DINO-WM as a representative case, evaluating diverse PTQ methods under both weight-only and joint weight-activation settings. We conduct extensive experiments on several visual planning tasks across a wide range of bit-widths, quantization granularities, and planning horizons of up to 50 iterations. Our results show that quantization effects in world models extend beyond the standard trade-off between accuracy and bit-width: group-wise weight quantization can stabilize low-bit rollouts, activation quantization granularity yields inconsistent benefits, and quantization sensitivity is highly asymmetric between the encoder and predictor modules. Moreover, aggressive low-bit quantization significantly degrades the alignment between the planning objective and task success, leading to failures that cannot be remedied by additional optimization. These findings reveal distinct quantization-induced failure modes in world model-based planning and provide practical guidance for deploying quantized world models under strict computational constraints. The code will be available at https://github.com/huawei-noah/noah-research/tree/master/QuantWM.
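To make the notion of quantization granularity concrete, the following is a minimal sketch of weight-only quantization at two granularities: one scale for an entire weight row versus one scale per contiguous group of weights. It is not taken from the paper or the QuantWM repository. It assumes plain symmetric round-to-nearest quantization with a max-abs scale, and the function names, the group size of 32, and the synthetic weight row with a single outlier are all hypothetical choices made for illustration.

```python
import random


def quantize_group(values, bits):
    """Symmetric round-to-nearest quantize-dequantize of one group sharing a single scale."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for signed 4-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Map to integers in [-qmax, qmax], then back to floats.
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def fake_quantize_row(row, bits, group_size):
    """Quantize-dequantize one weight row in contiguous groups of `group_size`."""
    out = []
    for start in range(0, len(row), group_size):
        out.extend(quantize_group(row[start:start + group_size], bits))
    return out


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic weight row (assumption): small Gaussian weights plus one large outlier.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(256)]
row[10] = 1.0

# One scale for the whole row vs. one scale per 32 weights.
per_row = fake_quantize_row(row, bits=4, group_size=len(row))
grouped = fake_quantize_row(row, bits=4, group_size=32)

err_per_row = mean_squared_error(row, per_row)
err_grouped = mean_squared_error(row, grouped)
print(f"per-row MSE:    {err_per_row:.6f}")
print(f"group-wise MSE: {err_grouped:.6f}")
```

In this toy setting the outlier inflates the scale only for its own group, so the remaining groups keep a finer quantization step. This shows how group-wise scaling can reduce weight error at low bit-widths; it says nothing by itself about rollout stability, which the study evaluates empirically.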