Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting

Abstract

Training embodied AI agents depends critically on the visual fidelity of simulation environments and on their ability to model dynamic humans. Current simulators rely on mesh-based rasterization, which offers limited visual realism, and their support for dynamic human avatars, where available, is restricted to mesh representations. Both limitations hinder agent generalization to human-populated real-world scenarios. We present Habitat-GS, a navigation-centric embodied AI simulator extended from Habitat-Sim that integrates 3D Gaussian Splatting (3DGS) scene rendering and drivable Gaussian avatars while maintaining full compatibility with the Habitat ecosystem. Our system implements a 3DGS renderer for real-time photorealistic rendering and supports scalable 3DGS asset import from diverse sources. For dynamic human modeling, we introduce a Gaussian avatar module in which each avatar serves simultaneously as a photorealistic visual entity and as an effective navigation obstacle, allowing agents to learn human-aware behaviors in realistic settings. Experiments on point-goal navigation show that agents trained on 3DGS scenes achieve stronger cross-domain generalization, with mixed-domain training being the most effective strategy. Evaluations on avatar-aware navigation further confirm that Gaussian avatars enable effective human-aware navigation. Finally, performance benchmarks validate the system's scalability across varying scene complexity and avatar counts.
