

Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

March 5, 2026
作者: Dongwon Kim, Gawon Seo, Jinsung Lee, Minsu Cho, Suha Kwak
cs.AI

Abstract

World models provide a powerful framework for simulating environment dynamics conditioned on actions or instructions, enabling downstream tasks such as action planning or policy learning. Recent approaches leverage world models as learned simulators, but their application to decision-time planning remains computationally prohibitive for real-time control. A key bottleneck lies in latent representations: conventional tokenizers encode each observation into hundreds of tokens, making planning both slow and resource-intensive. To address this, we propose CompACT, a discrete tokenizer that compresses each observation into as few as 8 tokens, drastically reducing computational cost while preserving essential information for planning. An action-conditioned world model equipped with the CompACT tokenizer achieves competitive planning performance with orders-of-magnitude faster planning, offering a practical step toward real-world deployment of world models.
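To make the core idea concrete, the sketch below shows a toy discrete tokenizer that maps an observation to a fixed budget of 8 codebook indices via vector quantization. This is an illustration only, not the paper's architecture: the random-projection encoder, latent dimension, and codebook size are all placeholder assumptions, whereas CompACT's encoder and codebook would be learned.

```python
import numpy as np

class CompactVQTokenizer:
    """Illustrative sketch (NOT the CompACT architecture): encode a flat
    observation into 8 discrete tokens via a fixed random projection
    followed by nearest-neighbor vector quantization."""

    def __init__(self, obs_dim=64 * 64, num_tokens=8, latent_dim=16,
                 codebook_size=512, seed=0):
        rng = np.random.default_rng(seed)
        # "Encoder": one linear map per token slot (obs -> latent vector).
        # In a real model this would be a trained network.
        self.proj = rng.standard_normal(
            (num_tokens, latent_dim, obs_dim)) / np.sqrt(obs_dim)
        # Codebook of discrete latent vectors (randomly initialized here).
        self.codebook = rng.standard_normal((codebook_size, latent_dim))

    def encode(self, obs):
        """Map a flat observation to 8 codebook indices."""
        latents = self.proj @ obs  # (num_tokens, latent_dim)
        # Nearest codebook entry per latent, by squared Euclidean distance.
        d = ((latents[:, None, :] - self.codebook[None]) ** 2).sum(-1)
        return d.argmin(axis=1)  # (num_tokens,) integer token ids

    def decode(self, tokens):
        """Look up the quantized latent vectors for a token sequence."""
        return self.codebook[tokens]

tok = CompactVQTokenizer()
obs = np.random.default_rng(1).standard_normal(64 * 64)
tokens = tok.encode(obs)
print(tokens.shape)  # (8,)
```

With only 8 tokens per observation, a transformer-style world model rolling out H steps attends over 8H observation tokens instead of hundreds times H, which is the source of the claimed planning speedup.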