
RynnBrain: Open Embodied Foundation Models

February 13, 2026
Authors: Ronghao Dang, Jiayan Guo, Bohan Hou, Sicong Leng, Kehan Li, Xin Li, Jiangpin Liu, Yunxuan Mao, Zhikai Wang, Yuqian Yuan, Minghao Zhu, Xiao Lin, Yang Bai, Qian Jiang, Yaxi Zhao, Minghua Zeng, Junlong Gao, Yuming Jiang, Jun Cen, Siteng Huang, Liuyi Wang, Wenqiao Zhang, Chengju Liu, Jianfei Yang, Shijian Lu, Deli Zhao
cs.AI

Abstract

Despite rapid progress in multimodal foundation models, the embodied intelligence community still lacks a unified, physically grounded foundation model that integrates perception, reasoning, and planning within real-world spatiotemporal dynamics. We introduce RynnBrain, an open-source spatiotemporal foundation model for embodied intelligence. RynnBrain strengthens four core capabilities in a unified framework: comprehensive egocentric understanding, diverse spatiotemporal localization, physically grounded reasoning, and physics-aware planning. The RynnBrain family comprises three foundation model scales (2B, 8B, and 30B-A3B MoE) and four post-trained variants: three tailored for downstream embodied tasks (RynnBrain-Nav, RynnBrain-Plan, and RynnBrain-VLA) and one for complex spatial reasoning tasks (RynnBrain-CoP). In extensive evaluations on 20 embodied benchmarks and 8 general vision understanding benchmarks, the RynnBrain foundation models outperform existing embodied foundation models by a significant margin. The post-trained model suite further substantiates two key potentials of the RynnBrain foundation model: (i) enabling physically grounded reasoning and planning, and (ii) serving as a strong pretrained backbone that can be efficiently adapted to diverse embodied tasks.