
EgoSim: Egocentric World Simulator for Embodied Interaction Generation

April 1, 2026
作者: Jinkun Hao, Mingda Jia, Ruiyan Wang, Xihui Liu, Ran Yi, Lizhuang Ma, Jiangmiao Pang, Xudong Xu
cs.AI

Abstract

We introduce EgoSim, a closed-loop egocentric world simulator that generates spatially consistent interaction videos and persistently updates the underlying 3D scene state for continuous simulation. Existing egocentric simulators either lack explicit 3D grounding, causing structural drift under viewpoint changes, or treat the scene as static, failing to update world states across multi-stage interactions. EgoSim addresses both limitations by modeling 3D scenes as updatable world states. We generate embodied interactions via a Geometry-action-aware Observation Simulation model, with spatial consistency ensured by an Interaction-aware State Updating module. To overcome the critical data bottleneck posed by the difficulty of acquiring densely aligned scene-interaction training pairs, we design a scalable pipeline that extracts static point clouds, camera trajectories, and embodiment actions from large-scale in-the-wild monocular egocentric videos. We further introduce EgoCap, a capture system that enables low-cost real-world data collection with uncalibrated smartphones. Extensive experiments demonstrate that EgoSim significantly outperforms existing methods in visual quality, spatial consistency, and generalization to complex scenes and in-the-wild dexterous interactions, while supporting cross-embodiment transfer to robotic manipulation tasks. Code and datasets will be open-sourced soon. The project page is at egosimulator.github.io.