An Empirical Study of World Model Quantization

Abstract

World models learn an internal representation of environment dynamics, enabling agents to simulate and reason about future states within a compact latent space for tasks such as planning, prediction, and inference. However, running world models incurs heavy computational cost and a large memory footprint, making model quantization essential for efficient deployment. To date, the effects of post-training quantization (PTQ) on world models remain largely unexamined. In this work, we present a systematic empirical study of world model quantization using DINO-WM as a representative case, evaluating diverse PTQ methods under both weight-only and joint weight-activation settings. We conduct extensive experiments on different visual planning tasks across a wide range of bit-widths, quantization granularities, and planning horizons of up to 50 iterations. Our results show that quantization effects in world models extend beyond the standard trade-off between accuracy and bit-width: group-wise weight quantization can stabilize low-bit rollouts, activation quantization granularity yields inconsistent benefits, and quantization sensitivity is highly asymmetric between the encoder and predictor modules. Moreover, aggressive low-bit quantization significantly degrades the alignment between the planning objective and task success, leading to failures that cannot be remedied by additional optimization. These findings reveal distinct quantization-induced failure modes in world-model-based planning and provide practical guidance for deploying quantized world models under strict computational constraints. The code will be available at https://github.com/huawei-noah/noah-research/tree/master/QuantWM.
