An Empirical Study of World Model Quantization

Abstract

World models learn an internal representation of environment dynamics, enabling agents to simulate and reason about future states within a compact latent space for tasks such as planning, prediction, and inference. However, running world models incurs heavy computational cost and a large memory footprint, making model quantization essential for efficient deployment. To date, the effects of post-training quantization (PTQ) on world models remain largely unexamined. In this work, we present a systematic empirical study of world model quantization using DINO-WM as a representative case, evaluating diverse PTQ methods under both weight-only and joint weight-activation settings. We conduct extensive experiments on several visual planning tasks across a wide range of bit-widths, quantization granularities, and planning horizons of up to 50 iterations. Our results show that quantization effects in world models extend beyond the standard trade-off between accuracy and bit-width: group-wise weight quantization can stabilize low-bit rollouts, finer activation quantization granularity yields inconsistent benefits, and quantization sensitivity is highly asymmetric between the encoder and predictor modules. Moreover, aggressive low-bit quantization significantly degrades the alignment between the planning objective and task success, leading to failures that cannot be remedied by additional optimization. These findings reveal distinct quantization-induced failure modes in world model-based planning and provide practical guidance for deploying quantized world models under strict computational constraints. The code will be available at https://github.com/huawei-noah/noah-research/tree/master/QuantWM.
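The abstract does not specify which quantizers were evaluated, so the following is only an illustrative sketch of what group-wise weight quantization generally means, not the QuantWM implementation. It assumes a plain symmetric round-to-nearest scheme in NumPy; the function name `quantize_rtn` and its parameters are hypothetical and chosen for this example.

```python
import numpy as np


def quantize_rtn(w, bits, group_size=None):
    """Symmetric round-to-nearest "fake" quantization of a 2-D weight matrix.

    Hypothetical helper for illustration only.
    group_size=None: one scale per row (per-channel quantization).
    group_size=g:    one scale per g consecutive weights within each row.
    Returns the dequantized weights, same shape as w.
    """
    rows, cols = w.shape
    g = cols if group_size is None else group_size
    if cols % g != 0:
        raise ValueError("number of columns must be divisible by group_size")

    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for signed 4-bit
    groups = w.reshape(rows, cols // g, g)

    # One scale per group, set by the largest magnitude in that group.
    scale = np.abs(groups).max(axis=-1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero groups

    q = np.clip(np.round(groups / scale), -qmax, qmax)  # integer grid
    return (q * scale).reshape(rows, cols)


def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))


# Toy weights with one large outlier column (an assumption for the demo).
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 128))
weights[:, 0] = 20.0

per_channel = quantize_rtn(weights, bits=4)
group_wise = quantize_rtn(weights, bits=4, group_size=32)

err_per_channel = mse(weights, per_channel)
err_group_wise = mse(weights, group_wise)
```

In this toy setup the outlier inflates the single per-row scale, so most small weights collapse to zero, while with groups of 32 only the group containing the outlier is affected. This illustrates why smaller groups can reduce low-bit weight error in general; it says nothing about the specific rollout behaviour reported in the paper.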
