MWM: Mobile World Models for Action-Conditioned Consistent Prediction

Abstract

World models enable planning in an imagined space of predicted futures, offering a promising framework for embodied navigation. However, existing navigation world models often lack action-conditioned consistency: predictions that look visually plausible can still drift under multi-step rollout and degrade planning. Moreover, efficient deployment requires few-step diffusion inference, but existing distillation methods do not explicitly preserve rollout consistency, creating a training-inference mismatch. To address these challenges, we propose MWM, a mobile world model for planning-based image-goal navigation. Specifically, we introduce a two-stage training framework that combines structure pretraining with Action-Conditioned Consistency (ACC) post-training to improve action-conditioned rollout consistency. We further introduce Inference-Consistent State Distillation (ICSD) for few-step diffusion distillation with improved rollout consistency. Experiments on benchmark and real-world tasks demonstrate consistent gains in visual fidelity, trajectory accuracy, planning success, and inference efficiency.

Code: https://github.com/AIGeeksGroup/MWM
Website: https://aigeeksgroup.github.io/MWM
