MiMo-Embodied: X-Embodied Foundation Model Technical Report
November 20, 2025
Authors: Xiaoshuai Hao, Lei Zhou, Zhijian Huang, Zhiwen Hou, Yingbo Tang, Lingfeng Zhang, Guang Li, Zheng Lu, Shuhuai Ren, Xianhui Meng, Yuchen Zhang, Jing Wu, Jinghui Lu, Chenxu Dang, Jiayi Guan, Jianhua Wu, Zhiyi Hou, Hanbing Li, Shumeng Xia, Mingliang Zhou, Yinan Zheng, Zihao Yue, Shuhao Gu, Hao Tian, Yuannan Shen, Jianwei Cui, Wen Zhang, Shaoqing Xu, Bing Wang, Haiyang Sun, Zeyu Zhu, Yuncheng Jiang, Zibin Guo, Chuhong Gong, Chaofan Zhang, Wenbo Ding, Kun Ma, Guang Chen, Rui Cai, Diyun Xiang, Heng Qu, Fuli Luo, Hangjun Ye, Long Chen
cs.AI
Abstract
We open-source MiMo-Embodied, the first cross-embodied foundation model to successfully integrate Autonomous Driving and Embodied AI and achieve state-of-the-art performance in both. MiMo-Embodied sets new records across 17 embodied AI benchmarks in Task Planning, Affordance Prediction, and Spatial Understanding, while also excelling in 12 autonomous driving benchmarks across Environmental Perception, Status Prediction, and Driving Planning. Across these tasks, MiMo-Embodied significantly outperforms existing open-source, closed-source, and specialized baselines. Our results indicate that through multi-stage learning, curated data construction, and CoT/RL fine-tuning, these two domains exhibit strong positive transfer and mutually reinforce one another. We provide a detailed analysis of our model design and training methodologies to facilitate further research. Code and models are available at https://github.com/XiaomiMiMo/MiMo-Embodied.