Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data

Abstract

Creating accurate physical simulations directly from real-world robot motion holds great value for safe, scalable, and affordable robot learning, yet remains exceptionally challenging. Real robot data suffers from occlusions, noisy camera poses, and dynamic scene elements, all of which hinder the creation of geometrically accurate and photorealistic digital twins of unseen objects. We introduce a novel real-to-sim framework that tackles all of these challenges at once. Our key insight is a hybrid scene representation that merges the photorealistic rendering of 3D Gaussian Splatting with explicit object meshes suitable for physics simulation, all within a single representation. We propose an end-to-end optimization pipeline that leverages differentiable rendering and differentiable physics within MuJoCo to jointly refine all scene components (object geometry and appearance, robot poses, and physical parameters) directly from raw and imprecise robot trajectories. This unified optimization allows us to simultaneously achieve high-fidelity object mesh reconstruction, generate photorealistic novel views, and perform annotation-free robot pose calibration. We demonstrate the effectiveness of our approach both in simulation and on challenging real-world sequences using an ALOHA 2 bimanual manipulator, enabling more practical and robust real-to-simulation pipelines.
