An Empirical Study of World Model Quantization

Abstract

World models learn an internal representation of environment dynamics, enabling agents to simulate and reason about future states within a compact latent space for tasks such as planning, prediction, and inference. However, running world models incurs heavy computational cost and a large memory footprint, making model quantization essential for efficient deployment. To date, the effects of post-training quantization (PTQ) on world models remain largely unexamined. In this work, we present a systematic empirical study of world model quantization using DINO-WM as a representative case, evaluating diverse PTQ methods under both weight-only and joint weight-activation settings. We conduct extensive experiments on different visual planning tasks across a wide range of bit-widths, quantization granularities, and planning horizons of up to 50 iterations. Our results show that quantization effects in world models extend beyond the standard trade-off between accuracy and bit-width: group-wise weight quantization can stabilize low-bit rollouts, activation quantization granularity yields inconsistent benefits, and quantization sensitivity is highly asymmetric between the encoder and predictor modules. Moreover, aggressive low-bit quantization significantly degrades the alignment between the planning objective and task success, leading to failures that cannot be remedied by additional optimization. These findings reveal distinct quantization-induced failure modes in world model-based planning and provide practical guidance for deploying quantized world models under strict computational constraints. The code will be available at https://github.com/huawei-noah/noah-research/tree/master/QuantWM.
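To make the distinction between quantization granularities concrete, the following is a minimal sketch of per-tensor versus group-wise weight quantization using plain round-to-nearest. It is an illustration under stated assumptions, not the PTQ methods evaluated in the study: the function names, the symmetric quantizer, the group size of 16, and the synthetic weight matrix with a single outlier are all hypothetical choices made for this example.

```python
import numpy as np


def quantize_per_tensor(w, bits):
    """Symmetric round-to-nearest fake quantization with one scale for all of `w`."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale  # dequantized weights


def quantize_groupwise(w, bits, group_size):
    """Same quantizer, but with one scale per contiguous group along each row."""
    rows, cols = w.shape
    assert cols % group_size == 0, "cols must be divisible by group_size"
    groups = w.reshape(rows, cols // group_size, group_size)
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(groups), axis=-1, keepdims=True)
    scale = np.where(max_abs > 0, max_abs / qmax, 1.0)
    q = np.clip(np.round(groups / scale), -qmax, qmax)
    return (q * scale).reshape(rows, cols)


# Hypothetical weight matrix: Gaussian entries plus one large outlier.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 64))
w[0, 0] = 20.0

w_tensor = quantize_per_tensor(w, bits=4)
w_group = quantize_groupwise(w, bits=4, group_size=16)

err_tensor = float(np.mean((w - w_tensor) ** 2))
err_group = float(np.mean((w - w_group) ** 2))
print(f"per-tensor MSE: {err_tensor:.4f}")
print(f"group-wise MSE: {err_group:.4f}")
```

In this toy setting the outlier inflates the single per-tensor scale, so most weights collapse to zero at 4 bits, whereas group-wise scales confine the damage to the one group containing the outlier. Whether this mechanism explains the rollout stabilization reported above is not something the sketch can establish.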
