EgoPush: Learning End-to-End Egocentric Multi-Object Rearrangement for Mobile Robots
February 20, 2026
Authors: Boyuan An, Zhexiong Wang, Yipeng Wang, Jiaqi Li, Sihang Li, Jing Zhang, Chen Feng
cs.AI
Abstract
Humans can rearrange objects in cluttered environments using egocentric perception, navigating occlusions without global coordinates. Inspired by this capability, we study long-horizon multi-object non-prehensile rearrangement for mobile robots using a single egocentric camera. We introduce EgoPush, a policy learning framework that enables egocentric, perception-driven rearrangement without relying on explicit global state estimation that often fails in dynamic scenes. EgoPush designs an object-centric latent space to encode relative spatial relations among objects, rather than absolute poses. This design enables a privileged reinforcement-learning (RL) teacher to jointly learn latent states and mobile actions from sparse keypoints, which is then distilled into a purely visual student policy. To reduce the supervision gap between the omniscient teacher and the partially observed student, we restrict the teacher's observations to visually accessible cues. This induces active perception behaviors that are recoverable from the student's viewpoint. To address long-horizon credit assignment, we decompose rearrangement into stage-level subproblems using temporally decayed, stage-local completion rewards. Extensive simulation experiments demonstrate that EgoPush significantly outperforms end-to-end RL baselines in success rate, with ablation studies validating each design choice. We further demonstrate zero-shot sim-to-real transfer on a mobile platform in the real world. Code and videos are available at https://ai4ce.github.io/EgoPush/.
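To make the temporally decayed, stage-local completion reward more concrete, here is a minimal sketch of one plausible form: the agent earns a completion bonus for finishing the current stage (e.g. pushing one object into its goal region), discounted linearly by how long that stage took. The function name, the linear decay schedule, and the constants `bonus` and `decay` are illustrative assumptions, not the paper's actual formulation.

```python
def stage_reward(steps_in_stage: int, completed: bool,
                 bonus: float = 1.0, decay: float = 0.01) -> float:
    """Hypothetical stage-local completion reward with temporal decay.

    Returns a bonus only when the current stage is completed, reduced
    by how many steps the stage took, and floored at zero so slow
    completions are never penalized below a neutral reward.
    """
    if not completed:
        return 0.0
    # Linearly decay the completion bonus with elapsed steps in this stage.
    return max(0.0, bonus - decay * steps_in_stage)

# Finishing a stage quickly earns close to the full bonus ...
fast = stage_reward(steps_in_stage=10, completed=True)   # ~0.9
# ... while a slow completion earns much less.
slow = stage_reward(steps_in_stage=80, completed=True)   # ~0.2
```

Because each stage's reward depends only on that stage's own progress, credit assignment stays local even over long horizons, which is the motivation the abstract gives for decomposing rearrangement into stage-level subproblems.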