MiMo-Embodied: X-Embodied Foundation Model Technical Report
November 20, 2025
Authors: Xiaoshuai Hao, Lei Zhou, Zhijian Huang, Zhiwen Hou, Yingbo Tang, Lingfeng Zhang, Guang Li, Zheng Lu, Shuhuai Ren, Xianhui Meng, Yuchen Zhang, Jing Wu, Jinghui Lu, Chenxu Dang, Jiayi Guan, Jianhua Wu, Zhiyi Hou, Hanbing Li, Shumeng Xia, Mingliang Zhou, Yinan Zheng, Zihao Yue, Shuhao Gu, Hao Tian, Yuannan Shen, Jianwei Cui, Wen Zhang, Shaoqing Xu, Bing Wang, Haiyang Sun, Zeyu Zhu, Yuncheng Jiang, Zibin Guo, Chuhong Gong, Chaofan Zhang, Wenbo Ding, Kun Ma, Guang Chen, Rui Cai, Diyun Xiang, Heng Qu, Fuli Luo, Hangjun Ye, Long Chen
cs.AI
Abstract
We open-source MiMo-Embodied, the first cross-embodied foundation model to successfully integrate Autonomous Driving and Embodied AI and achieve state-of-the-art performance in both. MiMo-Embodied sets new records across 17 embodied AI benchmarks in Task Planning, Affordance Prediction, and Spatial Understanding, while also excelling in 12 autonomous driving benchmarks across Environmental Perception, Status Prediction, and Driving Planning. Across these tasks, MiMo-Embodied significantly outperforms existing open-source, closed-source, and specialized baselines. Our results indicate that through multi-stage learning, curated data construction, and CoT/RL fine-tuning, these two domains exhibit strong positive transfer and mutually reinforce one another. We provide a detailed analysis of our model design and training methodologies to facilitate further research. Code and models are available at https://github.com/XiaomiMiMo/MiMo-Embodied.