InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts

September 13, 2025
Authors: Weipeng Zhong, Peizhou Cao, Yichen Jin, Li Luo, Wenzhe Cai, Jingli Lin, Hanqing Wang, Zhaoyang Lyu, Tai Wang, Bo Dai, Xudong Xu, Jiangmiao Pang
cs.AI

Abstract

The advancement of Embodied AI relies heavily on large-scale, simulatable 3D scene datasets with diverse scenes and realistic layouts. However, existing datasets typically suffer from limited data scale or diversity, sanitized layouts that lack small items, and severe object collisions. To address these shortcomings, we introduce InternScenes, a novel large-scale simulatable indoor scene dataset comprising approximately 40,000 diverse scenes. It integrates three disparate scene sources (real-world scans, procedurally generated scenes, and designer-created scenes), contains 1.96M 3D objects, and covers 15 common scene types and 288 object classes. We deliberately preserve the large number of small items in the scenes, resulting in realistic and complex layouts with an average of 41.5 objects per region. Our comprehensive data processing pipeline ensures simulatability by creating real-to-sim replicas for real-world scans, enhances interactivity by incorporating interactive objects into these scenes, and resolves object collisions through physical simulation. We demonstrate the value of InternScenes with two benchmark applications: scene layout generation and point-goal navigation. Both reveal new challenges posed by complex and realistic layouts. More importantly, InternScenes paves the way for scaling up model training for both tasks, making generation and navigation in such complex scenes possible. We commit to open-sourcing the data, models, and benchmarks to benefit the whole community.