ChatPaper.aiChatPaper

OpenSpatial:赋能空间智能的规范化数据引擎

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

April 8, 2026
作者: Jianhui Liu, Haoze Sun, Wenbo Li, Yanbing Zhang, Rui Yang, Zhiliang Zhu, Yijun Yang, Shenghe Zheng, Nan Jiang, Jiaxiu Jiang, Haoyang Huang, Tien-Tsin Wong, Nan Duan, Xiaojuan Qi
cs.AI

摘要

空间理解是实现人类级别智能的基石。然而,当前研究主要聚焦于特定领域的数据生产,存在一个关键空白:缺乏能够充分释放高质量空间数据潜力的开源引擎。为弥补这一不足,我们阐述了稳健数据生成系统的设计原则,并推出OpenSpatial——一个具备高质量、强扩展性、广泛任务多样性及优化效率的开源数据引擎。该引擎以三维边界框为基础单元,构建了涵盖五大核心任务的数据层级体系:空间度量(SM)、空间关系(SR)、相机感知(CP)、多视角一致性(MC)和场景感知推理(SAR)。基于此可扩展架构,我们构建了包含300万高保真样本的大规模数据集OpenSpatial-3M。大量实验表明,基于本数据集训练的通用模型在各类空间推理基准测试中均达到最先进性能。值得注意的是,最佳模型的相对平均性能提升达19%。此外,我们系统分析了数据属性如何影响空间感知能力。通过开源引擎与300万规模数据集,我们为加速空间智能的未来研究奠定了坚实基础。
English
Spatial understanding is a fundamental cornerstone of human-level intelligence. Nonetheless, current research predominantly focuses on domain-specific data production, leaving a critical void: the absence of a principled, open-source engine capable of fully unleashing the potential of high-quality spatial data. To bridge this gap, we elucidate the design principles of a robust data generation system and introduce OpenSpatial -- an open-source data engine engineered for high quality, extensive scalability, broad task diversity, and optimized efficiency. OpenSpatial adopts 3D bounding boxes as the fundamental primitive to construct a comprehensive data hierarchy across five foundational tasks: Spatial Measurement (SM), Spatial Relationship (SR), Camera Perception (CP), Multi-view Consistency (MC), and Scene-Aware Reasoning (SAR). Leveraging this scalable infrastructure, we curate OpenSpatial-3M, a large-scale dataset comprising 3 million high-fidelity samples. Extensive evaluations demonstrate that versatile models trained on our dataset achieve state-of-the-art performance across a wide spectrum of spatial reasoning benchmarks. Notably, the best-performing model exhibits a substantial average improvement of 19 percent, relatively. Furthermore, we provide a systematic analysis of how data attributes influence spatial perception. By open-sourcing both the engine and the 3M-scale dataset, we provide a robust foundation to accelerate future research in spatial intelligence.
PDF261April 11, 2026