RynnBrain: Open Embodied Foundation Models

Abstract

Despite rapid progress in multimodal foundation models, the embodied intelligence community still lacks a unified, physically grounded foundation model that integrates perception, reasoning, and planning within real-world spatiotemporal dynamics. We introduce RynnBrain, an open-source spatiotemporal foundation model for embodied intelligence. RynnBrain strengthens four core capabilities in a unified framework: comprehensive egocentric understanding, diverse spatiotemporal localization, physically grounded reasoning, and physics-aware planning. The RynnBrain family comprises three foundation model scales (2B, 8B, and 30B-A3B MoE) and four post-trained variants: three tailored for downstream embodied tasks (RynnBrain-Nav, RynnBrain-Plan, and RynnBrain-VLA) and one for complex spatial reasoning tasks (RynnBrain-CoP). In extensive evaluations on 20 embodied benchmarks and 8 general vision understanding benchmarks, the RynnBrain foundation models outperform existing embodied foundation models by a significant margin. The post-trained model suite further substantiates two key strengths of the RynnBrain foundation model: (i) it enables physically grounded reasoning and planning, and (ii) it serves as a strong pretrained backbone that can be efficiently adapted to diverse embodied tasks.
